feat: Support pre-calculated checksums in SIP generation #369
Open
JohannesKarlsen99 wants to merge 6 commits into keeps:master
used for data files during SIP generation. Metadata files (descriptive, preservation, technical, source, rights, and other metadata) had their checksums ignored, causing them to be recalculated during ZIP creation.

Changes:
- Add constructor to METSMdRefZipEntryInfo that accepts pre-calculated checksum and algorithm parameters
- Add overloaded addMdRefFileToZip() method in ZIPUtils that passes checksum parameters to METSMdRefZipEntryInfo
- Update all metadata adding methods in EARKUtils to pass the checksum from IPFileInterface to ZIPUtils:
  - addDescriptiveMetadataToZipAndMETS
  - addPreservationMetadataToZipAndMETS
  - addOtherMetadataToZipAndMETS
  - addTechnicalMetadataToZipAndMETS
  - addSourceMetadataToZipAndMETS
  - addRightsMetadataToZipAndMETS
This PR adds support for using pre-calculated checksums during E-ARK SIP generation, avoiding redundant checksum calculations for large files.
Fixes #368
Changes
Interface Changes
- IPFileInterface.java: Added getChecksum(), getChecksumAlgorithm(), and hasPreCalculatedChecksum(String algorithm) methods to the interface

Implementation Changes
- IPFileShallow.java: Implemented the new interface methods
- METSFileTypeZipEntryInfo.java: Added constructor that accepts a pre-calculated checksum
- ZIPUtils.java:
  - Added addFileTypeFileToZip() method accepting a pre-calculated checksum
  - Updated zip() to skip checksum calculation when a valid pre-calculated checksum exists
  - Added copyWithoutChecksum() helper method
  - Replaced DatatypeConverter.printHexBinary with HexFormat
- FolderWriteStrategy.java:
  - Updated writeFileToPath() to skip checksum calculation when a valid pre-calculated checksum exists
  - Replaced DatatypeConverter.printHexBinary with HexFormat
- EARKUtils.java: Updated all addFileTypeFileToZip() calls to pass the checksum from IPFile

Bug Fix
Fixed a bug where file.setChecksum(sip.getChecksum()) overwrote the pre-calculated checksum with the algorithm name before it could be used. The fix saves the pre-calculated checksum values before this call.

Usage Example
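A minimal, self-contained sketch of the skip-if-pre-calculated idea described above. It is not the library's actual code: `ChecksumSource`, `SimpleFile`, and `copy()` are hypothetical stand-ins that only mirror the three method names added to IPFileInterface, and the validity rule (non-blank checksum, matching algorithm) is an assumption about what "valid" means here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public class PreCalculatedChecksumSketch {

  /** Hypothetical stand-in for the checksum-related methods of IPFileInterface. */
  interface ChecksumSource {
    String getChecksum();

    String getChecksumAlgorithm();

    /** Assumed rule: usable only if a checksum is set and was made with the requested algorithm. */
    default boolean hasPreCalculatedChecksum(String algorithm) {
      String checksum = getChecksum();
      String own = getChecksumAlgorithm();
      return checksum != null && !checksum.isBlank()
          && own != null && own.equalsIgnoreCase(algorithm);
    }
  }

  /** Hypothetical file holder; the real IPFile carries much more state. */
  record SimpleFile(String checksum, String algorithm) implements ChecksumSource {
    public String getChecksum() { return checksum; }

    public String getChecksumAlgorithm() { return algorithm; }
  }

  /** Copies in to out and returns the checksum, hashing only when no usable value exists. */
  static String copy(InputStream in, OutputStream out, ChecksumSource file, String algorithm)
      throws IOException, NoSuchAlgorithmException {
    boolean reuse = file.hasPreCalculatedChecksum(algorithm);
    MessageDigest digest = reuse ? null : MessageDigest.getInstance(algorithm);
    byte[] buffer = new byte[8192];
    int read;
    while ((read = in.read(buffer)) != -1) {
      if (digest != null) {
        digest.update(buffer, 0, read); // hashing happens only on the fallback path
      }
      out.write(buffer, 0, read);
    }
    if (reuse) {
      return file.getChecksum();
    }
    // Upper-case hex, matching what DatatypeConverter.printHexBinary used to produce.
    return HexFormat.of().withUpperCase().formatHex(digest.digest());
  }

  public static void main(String[] args) throws Exception {
    byte[] data = "hello".getBytes(StandardCharsets.UTF_8);

    // Checksum supplied up front with a matching algorithm: returned as-is.
    ChecksumSource known = new SimpleFile("ABC123", "SHA-256");
    System.out.println(copy(new ByteArrayInputStream(data), new ByteArrayOutputStream(), known, "SHA-256"));

    // Algorithm mismatch: the supplied value is ignored and the checksum is computed.
    ChecksumSource mismatch = new SimpleFile("ABC123", "MD5");
    System.out.println(copy(new ByteArrayInputStream(data), new ByteArrayOutputStream(), mismatch, "SHA-256"));
  }
}
```

Checking the algorithm as well as the value means a checksum made with a different algorithm is never written out as if it matched the requested one; in that case the sketch falls back to hashing the stream.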
Testing
Added testPreCalculatedChecksumSupport() test that:
IPFile