KOST-Val released by the TI/A Standard Initiative team

KOST-Val is an open source validator for several file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and for Submission Information Packages (SIP).

It has been developed by KOST-CECO, a Swiss coordination office that is a member of the TI/A Standard Initiative team, a group of experts focussing on the definition of a specification for an archival TIFF format.

For further information visit the KOST-Val page in the Community Owned digital Preservation Tool Registry (COPTR).

 

Functional Principle

KOST-Val complies with the following requirements.

  • TIFF validation: KOST-Val reads a TIFF file, uses JHOVE to validate the structure and the content, and uses ExifTool to validate key properties such as compression, colour space, and multipage. These properties can be configured.
  • SIARD validation: KOST-Val reads a SIARD (eCH-0165 v1) file and validates the structure and the content.
  • PDF/A validation: KOST-Val reads a PDF or PDF/A file (ISO 19005-1 and 19005-2) and uses either 3-Heights PDF/A Validator by PDF-Tools or PDF/A Manager by PDFTron to validate the structure and the content of the PDF file. KOST-Val organises the different error messages into main categories such as fonts, graphics, and metadata. KOST-Val includes only a limited version of 3-Heights PDF/A Validator. Module J extracts (with iText) and validates the JPEG and JP2 images contained in the PDF file, depending on the configuration. It is also possible to configure whether JBIG2 compression is accepted.
  • JP2 validation: KOST-Val reads a JP2 file (ISO 15444) and uses Jpylyzer to validate the structure and the content.
  • JPEG validation: KOST-Val reads a JPEG file (ISO 10918-1) and uses Bad Peggy to validate the structure and the content.
  • SIP validation: KOST-Val reads a SIP (eCH-0160 v1 as well as Swiss Federal Archives SFA v1 and v4) and validates the mandatory requirements of the SIP specification. The validated requirements are organised into groups such as folder structure, schema validation, and checksum validation. At the outset, a file format validation is performed.

The results (including information on inconsistencies and errors) are output for every step and written into a validation log. The validation steps are executed sequentially. Whenever possible the validation shall continue after an error has been detected in order to reduce the number of correction cycles.
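The "continue after an error" behaviour described above can be pictured with a small sketch. This is not KOST-Val's code: the names `run_steps`, `StepResult`, and the two example checks are hypothetical and only illustrate running steps in sequence while recording every failure in a log instead of stopping at the first one.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A check receives the file content and returns None on success,
# or a human-readable error message on failure. (Hypothetical interface.)
Check = Callable[[bytes], Optional[str]]


@dataclass
class StepResult:
    step: str
    ok: bool
    message: str = ""


def run_steps(data: bytes, steps: List[Tuple[str, Check]]) -> List[StepResult]:
    """Run all steps in order and log each outcome; never stop early."""
    log: List[StepResult] = []
    for name, check in steps:
        try:
            error = check(data)
        except Exception as exc:
            # A step that cannot be completed is logged like any other failure.
            error = f"step could not be completed: {exc}"
        log.append(StepResult(name, error is None, error or ""))
    return log


# Two placeholder checks, assumed purely for illustration.
def check_not_empty(data: bytes) -> Optional[str]:
    return None if data else "file is empty"


def check_min_length(data: bytes) -> Optional[str]:
    return None if len(data) >= 8 else "file is shorter than 8 bytes"


EXAMPLE_STEPS = [
    ("A: not empty", check_not_empty),
    ("B: minimum length", check_min_length),
]

if __name__ == "__main__":
    for result in run_steps(b"abc", EXAMPLE_STEPS):
        status = "valid" if result.ok else "invalid"
        print(f"{result.step}: {status} {result.message}".rstrip())
```

Because every step is attempted, a single run reports all problems it can find, which is what reduces the number of correction cycles.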

 

Third-party applications

KOST-Val uses unmodified components from other vendors by embedding them directly into the source code. Users of KOST-Val are requested to adhere to the licence terms of these components.

  • The TIFF validation module uses JHOVE and ExifTool and evaluates their output further.
  • The PDF/A validation module uses PDF/A Manager or 3-Heights PDF/A Validator.
  • The JP2 validation module uses Jpylyzer and translates the failed tests into appropriate error messages (DE/FR/EN).
  • The JPEG validation module uses Bad Peggy and evaluates the error message “Not a JPEG file” further.
  • To extract the JPEG and JP2 images from PDF/A the iText library is used.
  • For the file format identification DROID is used. For performance and granularity reasons a custom signature file is used instead of the official PRONOM registry.
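
As a rough illustration of signature-based format identification, the sketch below matches the leading bytes of a file against a small table of well-known byte sequences. It is neither DROID's API nor its signature file format, and it is far simpler than either; the `SIGNATURES` table and the `identify` function are hypothetical names chosen for this example.

```python
from typing import List, Optional, Tuple

# (label, offset, byte sequence). A hypothetical, deliberately tiny table;
# a real signature file describes many more formats and more complex patterns.
SIGNATURES: List[Tuple[str, int, bytes]] = [
    ("JP2", 0, bytes.fromhex("0000000C6A5020200D0A870A")),  # JP2 signature box
    ("TIFF", 0, b"II*\x00"),             # little-endian TIFF
    ("TIFF", 0, b"MM\x00*"),             # big-endian TIFF
    ("PDF", 0, b"%PDF-"),
    ("JPEG", 0, b"\xff\xd8\xff"),
    ("ZIP container", 0, b"PK\x03\x04"),  # SIARD files are ZIP-based
]


def identify(data: bytes) -> Optional[str]:
    """Return the label of the first matching signature, or None."""
    for label, offset, magic in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return None


if __name__ == "__main__":
    print(identify(b"%PDF-1.4\n"))
    print(identify(b"plain text"))
```

A smaller table restricted to the formats a tool actually validates means fewer comparisons per file, which is one way a custom signature file can help performance.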

 

 

About the TI/A Standard initiative

The TI/A Standard initiative is promoted by the Digital Humanities Lab of the University of Basel, the Agents Research Lab of the University of Girona and Easy Innova with the support of many interested memory institutions.

This standard will be created in parallel with DPF Manager, an open source TIFF format validator that, in addition to covering the current TIFF ISO standards, will be the first conformance checker for the new TI/A standard.

This initiative has been boosted by PREFORMA, a PCP project that aims to address the challenge of implementing good quality standardised file formats for preserving data content in the long term.

 

Visit the PREFORMA Blog

Visit the PREFORMA Website
