Archivematica users rely on spreadsheets that must follow a specific format to perform tasks within or after Archivematica. Documentation and examples can be found here, but nothing currently validates these files automatically. As one example, the metadata.csv and rights.csv files are “special”: Archivematica uses them to add metadata or rights metadata to the AIP’s METS file. As another example, the Avalon Media System uses a specific Manifest.csv file to recreate hierarchical information and additional metadata after a DIP is created from a stored AIP. It would be beneficial if this manifest could be validated before it goes through the preservation process. Both of these examples would benefit from a validation service that a user (or automated system) could access prior to ingest into Archivematica.
Chosen option: “1. API endpoint for pre-ingest CSV validation”, because it is flexible and lays the groundwork for future work on CSV validation as a step performed by Archivematica itself. It avoids the long-term maintenance and testing burden of a GUI component, and it is easier to automate. The solution also allows custom or institution-specific CSV formats to be used or added.
Technically, this would live in the Archivematica codebase as a new endpoint in the Archivematica API.
Proposed endpoint below:
URL: /api/transfer/validate_csv
Verb: POST
Validates local CSV with validator service Python script
Parameters: CSV
    input: Path to the CSV
    validator: Name of service CSV should be checked against, e.g. "avalon" or "rights"
Response: JSON
    message: Approval or non-approval, depending on service output
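To make the proposal more concrete, the following is a minimal sketch of what a validator service behind this endpoint might look like. It is not the actual implementation. The function name `validate_csv`, the `REQUIRED_COLUMNS` registry, the placeholder column names, and the extra `valid` field in the response are all assumptions made for illustration; only the `input` and `validator` parameters and the `message` response field come from the proposal above.

```python
import csv

# Hypothetical registry mapping a validator name to the header columns it
# requires. The column names are placeholders for illustration only; they
# are NOT the real Avalon or Archivematica requirements.
REQUIRED_COLUMNS = {
    "avalon": ["Title", "File"],
    "rights": ["file", "basis"],
}


def validate_csv(input_path, validator):
    """Check a local CSV against a named validator.

    Returns a dict shaped like the proposed JSON response. The "valid"
    key is an assumed addition; the proposal only specifies "message".
    """
    if validator not in REQUIRED_COLUMNS:
        return {"valid": False, "message": f"Unknown validator: {validator}"}
    try:
        with open(input_path, newline="", encoding="utf-8") as handle:
            # Read only the header row; an empty file yields an empty header.
            header = next(csv.reader(handle), [])
    except OSError as err:
        return {"valid": False, "message": f"Cannot read CSV: {err}"}
    missing = [col for col in REQUIRED_COLUMNS[validator] if col not in header]
    if missing:
        return {
            "valid": False,
            "message": "Missing columns: " + ", ".join(missing),
        }
    return {"valid": True, "message": "CSV passed validation"}
```

Keeping the validators in a registry keyed by name reflects the stated goal of allowing custom or institution-specific CSV formats: under this sketch, adding a new format would mean adding one entry rather than changing the endpoint.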
Options are discussed with pros/cons outlined in the linked issue and pull request. Some are implicit in the above decision outcome and positive/negative consequences sections.