Package edu.ucsb.nceas.metacat.storage
Class Storage
java.lang.Object
edu.ucsb.nceas.metacat.storage.Storage
The HashStore implementation of the Storage interface.
-
Method Summary
Modifier and Type, Method, Description
- void deleteIfInvalidObject(ObjectInfo objectInfo, String checksum, String checksumAlgorithm, long objSize): Confirms that an ObjectInfo's content is equal to the given values.
- void deleteMetadata(org.dataone.service.types.v1.Identifier pid): Deletes all metadata related to the given `pid` from HashStore.
- void deleteMetadata(org.dataone.service.types.v1.Identifier pid, String formatId): Deletes a metadata document (ex. `sysmeta`) permanently from HashStore using a given persistent identifier and its respective metadata namespace.
- void deleteObject(org.dataone.service.types.v1.Identifier pid): Deletes an object and all relevant associated files (ex. system metadata, reference files, etc.) based on a given pid.
- getDefaultNameSpace: Get the default namespace in the storage system.
- String getHexDigest(org.dataone.service.types.v1.Identifier pid, String algorithm): Calculates the hex digest of an object that exists in HashStore using a given persistent identifier and hash algorithm.
- static Storage getInstance(): Get the instance of the class through the singleton pattern.
- static Properties getStoreProperties: Get the store properties used to initialize the HashStore instance.
- InputStream retrieveMetadata(org.dataone.service.types.v1.Identifier pid)
- InputStream retrieveMetadata(org.dataone.service.types.v1.Identifier pid, String formatId): Returns an InputStream to the metadata content of a given pid and metadata namespace from HashStore.
- InputStream retrieveObject(org.dataone.service.types.v1.Identifier pid): Returns an InputStream to an object from HashStore using a given persistent identifier.
- String storeMetadata(InputStream metadata, org.dataone.service.types.v1.Identifier pid)
- String storeMetadata(InputStream metadata, org.dataone.service.types.v1.Identifier pid, String formatId): Adds/updates metadata (ex. `sysmeta`) to the HashStore by using a given InputStream, a persistent identifier (`pid`) and metadata format (`formatId`).
- ObjectInfo storeObject(InputStream object)
- ObjectInfo storeObject(InputStream object, org.dataone.service.types.v1.Identifier pid, String additionalAlgorithm, String checksum, String checksumAlgorithm, long objSize): The `storeObject` method is responsible for the atomic storage of objects to disk using a given InputStream.
- void tagObject(org.dataone.service.types.v1.Identifier pid, String cid): Creates references that allow objects stored in HashStore to be discoverable.
-
Method Details
-
getInstance
public static Storage getInstance() throws ServiceException, edu.ucsb.nceas.utilities.PropertyNotFoundException
Get the instance of the class through the singleton pattern.
- Returns:
- the instance of the class
- Throws:
ServiceException
edu.ucsb.nceas.utilities.PropertyNotFoundException
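For readers unfamiliar with the pattern, a lazily initialized singleton can be sketched as follows. This is an illustration only and not the Metacat implementation: the class name `StorageSingletonSketch` and the `store.path` property key are hypothetical, and the real `getInstance` may load its properties differently and throws the checked exceptions listed above.

```java
import java.util.Properties;

// Hypothetical stand-in for a singleton storage facade; not the Metacat class.
class StorageSingletonSketch {
    private static StorageSingletonSketch instance;
    // Stays null until the first getInstance() call, which is one way
    // a properties getter could legitimately return null.
    private static Properties storeProperties;

    private StorageSingletonSketch(Properties properties) {
        // A real implementation would open the underlying store here.
    }

    // synchronized guarantees that only one instance is ever created,
    // even when several threads call getInstance() at the same time.
    static synchronized StorageSingletonSketch getInstance() {
        if (instance == null) {
            Properties properties = new Properties();
            properties.setProperty("store.path", "/tmp/hashstore"); // assumed key and value
            storeProperties = properties;
            instance = new StorageSingletonSketch(properties);
        }
        return instance;
    }

    static synchronized Properties getStoreProperties() {
        return storeProperties;
    }

    public static void main(String[] args) {
        StorageSingletonSketch first = getInstance();
        StorageSingletonSketch second = getInstance();
        System.out.println("same instance: " + (first == second));
    }
}
```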
-
storeObject
public ObjectInfo storeObject(InputStream object, org.dataone.service.types.v1.Identifier pid, String additionalAlgorithm, String checksum, String checksumAlgorithm, long objSize) throws NoSuchAlgorithmException, IOException, org.dataone.service.exceptions.InvalidRequest, org.dataone.service.exceptions.InvalidSystemMetadata, RuntimeException, InterruptedException
The `storeObject` method is responsible for the atomic storage of objects to disk using a given InputStream. Upon successful storage, the method returns an ObjectInfo object containing relevant file information, such as the file's id (which can be used to locate the object on disk), the file's size, and a hex digest map of algorithms and checksums. Storing an object with `storeObject` also tags the object (creating references), which allows the object to be discoverable. `storeObject` also ensures that an object is stored only once by synchronizing multiple calls and rejecting calls to store duplicate objects.
Note: calling `storeObject` without a pid is possible, but doing so only stores the object without tagging it. It is then the caller's responsibility to finalize the process by calling `tagObject` after verifying that the correct object is stored.
The file's id is determined by calculating the object's content identifier based on the store's default algorithm, which is also used as the permanent address of the file. The file's identifier is then sharded using the store's configured depth and width, delimited by '/' and concatenated to produce the final permanent address, and the file is stored in the `./[storePath]/objects/` directory.
By default, the hex digest map includes the following hash algorithms: MD5, SHA-1, SHA-256, SHA-384 and SHA-512, which are the most commonly used algorithms in dataset submissions to DataONE and the Arctic Data Center.
If an additional algorithm is provided, the `storeObject` method checks if it is supported and adds it to the hex digest map along with its corresponding hex digest. An algorithm is considered "supported" if it is recognized as a valid hash algorithm by the `java.security.MessageDigest` class. Similarly, if a file size and/or checksum and checksumAlgorithm values are provided, `storeObject` validates the object to ensure it matches the given arguments before moving the file to its permanent address.
- Parameters:
object - Input stream to file
pid - Authority-based identifier
additionalAlgorithm - Additional hex digest to include in hexDigests
checksum - Value of checksum to validate against
checksumAlgorithm - Algorithm of checksum submitted
objSize - Expected size of object to validate after storing
- Returns:
- ObjectInfo object encapsulating file information
- Throws:
NoSuchAlgorithmException - When additionalAlgorithm or checksumAlgorithm is invalid
IOException - I/O error when writing file, generating checksums and/or moving file
org.dataone.service.exceptions.InvalidRequest - If a pid refs file already exists, meaning the pid is already referencing a file
org.dataone.service.exceptions.InvalidSystemMetadata - When the checksum or size does not match the calculated value
RuntimeException - Thrown when there is an issue with permissions, illegal arguments (ex. empty pid) or null pointers
InterruptedException - When tagging pid and cid process is interrupted
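The digest map and the sharded address described above can be illustrated with a small, self-contained sketch. It is not HashStore's code: the class and method names (`AddressSketch`, `hexDigests`, `shard`) are hypothetical, SHA-256 as the default algorithm and a depth of 3 with a width of 2 are assumed values, and the exact sharding rule (the first `depth` tokens of `width` characters, then the remainder) is an assumption about how depth and width are applied.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

class AddressSketch {
    // Default algorithm list as named in the description above.
    static final String[] DEFAULT_ALGORITHMS = {"MD5", "SHA-1", "SHA-256", "SHA-384", "SHA-512"};

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Reads the stream once and feeds every chunk to all digests.
    // MessageDigest.getInstance throws NoSuchAlgorithmException for an
    // unsupported additionalAlgorithm, matching the "supported" rule above.
    static Map<String, String> hexDigests(InputStream in, String additionalAlgorithm)
            throws IOException, NoSuchAlgorithmException {
        Map<String, MessageDigest> digests = new LinkedHashMap<>();
        for (String name : DEFAULT_ALGORITHMS) {
            digests.put(name, MessageDigest.getInstance(name));
        }
        if (additionalAlgorithm != null && !digests.containsKey(additionalAlgorithm)) {
            digests.put(additionalAlgorithm, MessageDigest.getInstance(additionalAlgorithm));
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            for (MessageDigest md : digests.values()) {
                md.update(buffer, 0, read);
            }
        }
        Map<String, String> hex = new LinkedHashMap<>();
        for (Map.Entry<String, MessageDigest> entry : digests.entrySet()) {
            hex.put(entry.getKey(), toHex(entry.getValue().digest()));
        }
        return hex;
    }

    // Assumed sharding rule: up to `depth` leading tokens of `width`
    // characters, joined with '/', followed by the rest of the digest.
    static String shard(String digest, int depth, int width) {
        if (depth < 0 || width <= 0) {
            throw new IllegalArgumentException("depth must be >= 0 and width must be > 0");
        }
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        for (int i = 0; i < depth && pos + width < digest.length(); i++) {
            sb.append(digest, pos, pos + width).append('/');
            pos += width;
        }
        sb.append(digest.substring(pos));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        Map<String, String> digests = hexDigests(in, null);
        String cid = digests.get("SHA-256"); // assumed default algorithm
        System.out.println("objects/" + shard(cid, 3, 2));
    }
}
```

Updating all digests in one pass avoids re-reading the stream once per algorithm, which matters because an InputStream can usually be consumed only once.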
-
storeObject
public ObjectInfo storeObject(InputStream object) throws NoSuchAlgorithmException, IOException, org.dataone.service.exceptions.InvalidRequest, RuntimeException, InterruptedException
- Throws:
NoSuchAlgorithmException
IOException
org.dataone.service.exceptions.InvalidRequest
RuntimeException
InterruptedException
-
tagObject
public void tagObject(org.dataone.service.types.v1.Identifier pid, String cid) throws IOException, org.dataone.service.exceptions.InvalidRequest, NoSuchAlgorithmException, FileNotFoundException, InterruptedException
Creates references that allow objects stored in HashStore to be discoverable. Retrieving, deleting or calculating a hex digest of an object is based on a pid argument; and to proceed, we must be able to find the object associated with the pid.
- Parameters:
pid - Authority-based identifier
cid - Content-identifier (hash identifier)
- Throws:
IOException - Failure to create tmp file
org.dataone.service.exceptions.InvalidRequest - When pid refs file already exists
NoSuchAlgorithmException - When algorithm used to calculate pid refs address does not exist
FileNotFoundException - If refs file is missing during verification
InterruptedException - When tagObject is waiting to execute but is interrupted
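The role of these references can be sketched with an in-memory model. This is a simplification under stated assumptions: HashStore keeps references in refs files on disk, whereas the hypothetical `RefsSketch` class below uses two maps, and `IllegalStateException` stands in for the `InvalidRequest` thrown when a pid is already tagged.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical in-memory stand-in for pid and cid reference files.
class RefsSketch {
    private final Map<String, String> pidToCid = new HashMap<>();
    private final Map<String, Set<String>> cidToPids = new HashMap<>();

    // synchronized so that concurrent tagging calls are serialized.
    synchronized void tag(String pid, String cid) {
        if (pid == null || pid.isEmpty() || cid == null || cid.isEmpty()) {
            throw new IllegalArgumentException("pid and cid must not be null or empty");
        }
        if (pidToCid.containsKey(pid)) {
            // Stand-in for InvalidRequest: the pid already references an object.
            throw new IllegalStateException("pid is already tagged: " + pid);
        }
        pidToCid.put(pid, cid);
        cidToPids.computeIfAbsent(cid, key -> new HashSet<>()).add(pid);
    }

    // Lookup by pid is what makes retrieve, delete and digest calls possible.
    synchronized String findCid(String pid) {
        String cid = pidToCid.get(pid);
        if (cid == null) {
            throw new NoSuchElementException("no object is tagged with pid: " + pid);
        }
        return cid;
    }

    public static void main(String[] args) {
        RefsSketch refs = new RefsSketch();
        refs.tag("example-pid", "example-cid");
        System.out.println(refs.findCid("example-pid"));
    }
}
```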
-
storeMetadata
public String storeMetadata(InputStream metadata, org.dataone.service.types.v1.Identifier pid, String formatId) throws IOException, IllegalArgumentException, FileNotFoundException, InterruptedException, NoSuchAlgorithmException
Adds/updates metadata (ex. `sysmeta`) to the HashStore by using a given InputStream, a persistent identifier (`pid`) and metadata format (`formatId`). All metadata documents for a given pid will be stored in the directory (under ../metadata) that is determined by calculating the hash of the given pid, with the document name being the hash of the metadata format (`formatId`).
Note: multiple calls to store the same metadata content will all be accepted, but they are not guaranteed to execute sequentially.
- Parameters:
metadata - Input stream to metadata document
pid - Authority-based identifier
formatId - Metadata namespace/format
- Returns:
- Path to metadata content identifier (string representing metadata address)
- Throws:
IOException - When there is an error writing the metadata document
IllegalArgumentException - Invalid values like null for metadata, or empty pids and formatIds
FileNotFoundException - When temp metadata file is not found
InterruptedException - metadataLockedIds synchronization issue
NoSuchAlgorithmException - Algorithm used to calculate permanent address is not supported
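The address rule described above (directory from the hash of the pid, document name from the hash of the formatId) can be sketched as follows. The names `MetadataPathSketch` and `metadataPath`, the use of SHA-256, and the unsharded `metadata/<dir>/<doc>` layout are assumptions made for illustration; the store's configured algorithm and any sharding of the directory are not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class MetadataPathSketch {
    static String sha256Hex(String value) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256"); // assumed store algorithm
        byte[] hash = md.digest(value.getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Directory name comes from the pid, document name from the formatId,
    // so every metadata document for one pid shares a directory.
    static String metadataPath(String pid, String formatId) throws NoSuchAlgorithmException {
        if (pid == null || pid.isEmpty() || formatId == null || formatId.isEmpty()) {
            throw new IllegalArgumentException("pid and formatId must not be null or empty");
        }
        return "metadata/" + sha256Hex(pid) + "/" + sha256Hex(formatId);
    }

    public static void main(String[] args) throws Exception {
        // Both values are made-up examples.
        System.out.println(metadataPath("example-pid", "https://example.org/format/v1"));
    }
}
```

Because the address depends only on the pid and the formatId, storing a document again for the same pair resolves to the same path, which is consistent with the "adds/updates" behavior described above.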
-
storeMetadata
public String storeMetadata(InputStream metadata, org.dataone.service.types.v1.Identifier pid) throws IOException, IllegalArgumentException, FileNotFoundException, InterruptedException, NoSuchAlgorithmException
-
retrieveObject
public InputStream retrieveObject(org.dataone.service.types.v1.Identifier pid) throws IllegalArgumentException, FileNotFoundException, IOException, NoSuchAlgorithmException
Returns an InputStream to an object from HashStore using a given persistent identifier.
- Parameters:
pid - Authority-based identifier
- Returns:
- Object InputStream
- Throws:
IllegalArgumentException - When pid is null or empty
FileNotFoundException - When requested pid has no associated object
IOException - I/O error when creating InputStream to object
NoSuchAlgorithmException - When algorithm used to calculate object address is not supported
-
retrieveMetadata
public InputStream retrieveMetadata(org.dataone.service.types.v1.Identifier pid, String formatId) throws IllegalArgumentException, FileNotFoundException, IOException, NoSuchAlgorithmException
Returns an InputStream to the metadata content of a given pid and metadata namespace from HashStore.
- Parameters:
pid - Authority-based identifier
formatId - Metadata namespace/format
- Returns:
- Metadata InputStream
- Throws:
IllegalArgumentException - When pid/formatId is null or empty
FileNotFoundException - When requested pid+formatId has no associated object
IOException - I/O error when creating InputStream to metadata
NoSuchAlgorithmException - When algorithm used to calculate metadata address is not supported
-
retrieveMetadata
public InputStream retrieveMetadata(org.dataone.service.types.v1.Identifier pid) throws IllegalArgumentException, FileNotFoundException, IOException, NoSuchAlgorithmException
-
deleteObject
public void deleteObject(org.dataone.service.types.v1.Identifier pid) throws IllegalArgumentException, IOException, NoSuchAlgorithmException, InterruptedException
Deletes an object and all relevant associated files (ex. system metadata, reference files, etc.) based on a given pid. If other pids still reference the pid's associated object, the object will not be deleted.
- Parameters:
pid - Authority-based identifier
- Throws:
IllegalArgumentException
IOException
NoSuchAlgorithmException
InterruptedException
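The rule that an object survives while other pids still reference it can be sketched with a simple reference count. The `DeleteSketch` class is hypothetical and keeps everything in memory; a cid being present in `cidToPids` stands in for the object's bytes existing on disk, and metadata deletion is not modeled.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of deletion with shared objects.
class DeleteSketch {
    private final Map<String, String> pidToCid = new HashMap<>();
    private final Map<String, Set<String>> cidToPids = new HashMap<>();

    synchronized void store(String pid, String cid) {
        pidToCid.put(pid, cid);
        cidToPids.computeIfAbsent(cid, key -> new HashSet<>()).add(pid);
    }

    synchronized boolean hasObject(String cid) {
        return cidToPids.containsKey(cid);
    }

    // Removes the pid's reference. The object itself is removed only when
    // no other pid references it. Returns true if the object was removed.
    synchronized boolean delete(String pid) {
        if (pid == null || pid.isEmpty()) {
            throw new IllegalArgumentException("pid must not be null or empty");
        }
        String cid = pidToCid.remove(pid);
        if (cid == null) {
            return false; // nothing is tagged with this pid
        }
        Set<String> pids = cidToPids.get(cid);
        pids.remove(pid);
        if (pids.isEmpty()) {
            cidToPids.remove(cid);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        DeleteSketch store = new DeleteSketch();
        store.store("pid-1", "cid-a");
        store.store("pid-2", "cid-a");
        System.out.println("object removed: " + store.delete("pid-1"));
        System.out.println("object removed: " + store.delete("pid-2"));
    }
}
```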
-
deleteMetadata
public void deleteMetadata(org.dataone.service.types.v1.Identifier pid, String formatId) throws IllegalArgumentException, IOException, NoSuchAlgorithmException, InterruptedException
Deletes a metadata document (ex. `sysmeta`) permanently from HashStore using a given persistent identifier and its respective metadata namespace.
- Parameters:
pid - Authority-based identifier
formatId - Metadata namespace/format
- Throws:
IllegalArgumentException - When pid or formatId is null or empty
IOException - I/O error when deleting metadata or empty directories
NoSuchAlgorithmException - When algorithm used to calculate object address is not supported
InterruptedException - Issue with synchronization on metadata doc
-
deleteMetadata
public void deleteMetadata(org.dataone.service.types.v1.Identifier pid) throws IllegalArgumentException, IOException, NoSuchAlgorithmException, InterruptedException
Deletes all metadata related to the given `pid` from HashStore.
- Parameters:
pid - Authority-based identifier
- Throws:
IllegalArgumentException - If pid is invalid
IOException - I/O error when deleting metadata or empty directories
NoSuchAlgorithmException - When algorithm used to calculate object address is not supported
InterruptedException - Issue with synchronization on metadata doc
-
deleteIfInvalidObject
public void deleteIfInvalidObject(ObjectInfo objectInfo, String checksum, String checksumAlgorithm, long objSize) throws org.dataone.service.exceptions.InvalidSystemMetadata, NoSuchAlgorithmException, InterruptedException, org.dataone.service.exceptions.ServiceFailure, IOException
Confirms that an ObjectInfo's content is equal to the given values. This method throws an exception if there are any issues, and attempts to remove the data object if it is determined to be invalid.
- Parameters:
objectInfo - ObjectInfo object with values
checksum - Value of checksum to validate against
checksumAlgorithm - Algorithm of checksum submitted
objSize - Expected size of object to validate after storing
- Throws:
org.dataone.service.exceptions.InvalidSystemMetadata - When the given size does not equal the ObjectInfo size value, or the given checksum does not equal the ObjectInfo checksum value
org.dataone.service.exceptions.ServiceFailure - Given algo is not found or supported
NoSuchAlgorithmException - When 'deleteInvalidObject' is true and an algo used to get a cid refs file is not supported
InterruptedException - When 'deleteInvalidObject' is true and an issue with coordinating deleting objects occurs
IOException - Issue with recalculating supported algo for checksum not found
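The validate-then-delete behavior can be sketched against a plain file. This is a simplified illustration, not the Metacat code: `ValidateSketch` is a hypothetical class, it recomputes the checksum from the file rather than reading it from an ObjectInfo, and `IllegalStateException` stands in for `InvalidSystemMetadata`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ValidateSketch {
    static String hexDigest(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Leaves a valid file untouched. Deletes an invalid file, then throws.
    static void deleteIfInvalid(Path file, String checksum, String checksumAlgorithm,
                                long objSize) throws IOException, NoSuchAlgorithmException {
        long actualSize = Files.size(file);
        String actualChecksum = hexDigest(file, checksumAlgorithm);
        boolean valid = actualSize == objSize && actualChecksum.equalsIgnoreCase(checksum);
        if (!valid) {
            Files.deleteIfExists(file);
            // Stand-in for InvalidSystemMetadata in this sketch.
            throw new IllegalStateException("size or checksum mismatch, object removed: " + file);
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("object", ".bin");
        Files.write(file, "abc".getBytes(StandardCharsets.UTF_8));
        String checksum = hexDigest(file, "SHA-256");
        deleteIfInvalid(file, checksum, "SHA-256", 3L); // passes, file is kept
        System.out.println("still exists: " + Files.exists(file));
        Files.deleteIfExists(file);
    }
}
```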
-
getHexDigest
public String getHexDigest(org.dataone.service.types.v1.Identifier pid, String algorithm) throws IllegalArgumentException, FileNotFoundException, IOException, NoSuchAlgorithmException
Calculates the hex digest of an object that exists in HashStore using a given persistent identifier and hash algorithm.
- Parameters:
pid - Authority-based identifier
algorithm - Algorithm of desired hex digest
- Returns:
- String hex digest of requested pid
- Throws:
IllegalArgumentException - When pid or algorithm is null or empty
FileNotFoundException - When requested pid object does not exist
IOException - I/O error when calculating hex digests
NoSuchAlgorithmException - When algorithm used to calculate object address is not supported
-
getDefaultNameSpace
Get the default namespace in the storage system.
- Returns:
- the default namespace
-
getStoreProperties
Get the store properties used to initialize the HashStore instance.
- Returns:
- the store properties. It can be null
-