It’s time again for our series “RDM Explained”, in which we provide you with important terms and information for practical research data management.
Today, digital information is an integral part of our everyday lives. Whether messages in social networks, digital photos, documents, or videos – we want to access all this data for as long as possible. Permanent access to digital data is an important issue not only in the private sphere but also in research. The long-term usability of research data is a prerequisite for scientific progress. This article therefore offers a brief introduction to digital preservation.
Digital preservation – what’s behind it?
Digital data are bits that are stored on data carriers. Each file consists of a fixed sequence of bits (bitstream), each of which has the value 0 or 1. The ability to preserve this bitstream in the long term, across technological changes (bitstream preservation), is the basis of digital preservation. Because technology changes constantly, there is always the danger that data carriers, file formats, software and storage locations become inaccessible or unusable. Digital preservation must therefore adapt to these changes. It is a process that requires regular attention and is never finally completed.
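One common way to check that a bitstream has survived unchanged is to record a checksum when a file is stored and to recompute it later. The following is a minimal sketch of that idea, not a description of how any particular archive works. The function names `sha256_of` and `is_unchanged` are hypothetical, and SHA-256 is simply one widely used hash function chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file's bitstream (hypothetical helper)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded when the file was stored."""
    return sha256_of(path) == recorded_digest
```

If the digest computed today differs from the recorded one, at least one bit in the file has changed and the copy should be restored from another source. A matching digest only shows that the bits are intact; it says nothing about whether the file format can still be opened.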
File formats and preservation strategies
However, bitstream preservation alone is not enough for long-term digital archiving. To ensure that research data in particular can be reproduced and interpreted correctly and without loss in the long term, sufficient contextual information is required: the data collection methods used, the software and hardware, the encoding, and the metadata.
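As an illustration of what such contextual information might look like in practice, the sketch below writes it to a small JSON record that can be stored next to the data file. The class `ContextRecord`, its field names and the helper `to_sidecar_json` are assumptions made for this example; they do not follow any specific metadata standard, and the values are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ContextRecord:
    """Hypothetical context record; the field names are illustrative only."""
    file_name: str
    collection_method: str
    software: str
    hardware: str
    encoding: str
    extra_metadata: dict = field(default_factory=dict)


def to_sidecar_json(record: ContextRecord) -> str:
    """Serialise the record as human-readable JSON text."""
    return json.dumps(asdict(record), indent=2, sort_keys=True, ensure_ascii=False)


# Placeholder values, not taken from a real data set.
example = ContextRecord(
    file_name="measurements.csv",
    collection_method="placeholder description of the method",
    software="placeholder software name and version",
    hardware="placeholder instrument",
    encoding="UTF-8",
)
sidecar_text = to_sidecar_json(example)
```

Plain-text JSON is used here because it stays readable without special software. In a real project, an established metadata schema from the relevant discipline would usually be the better choice.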
Furthermore, digital data depend on hardware and software working together. Because technology changes constantly, there is a risk that certain formats will at some point no longer be supported by the available software or hardware. It may therefore be necessary to migrate the source files into new formats (migration) or to recreate the original software or hardware environment using emulators (emulation). Proprietary file formats in particular make these preservation strategies more difficult. Open formats are more suitable because their specifications are openly documented and comprehensible. They are independent of the manufacturer and can be opened and modified with different programmes.
How can you recognise a trustworthy long-term archive?
In August 2012, the Open Archival Information System (OAIS) reference model for a dynamic, extensible archive system was published as ISO standard 14721:2012. Many digital archives and repositories state that they implement this reference model. Various evaluation procedures check whether its basic functions are implemented and thus assess the trustworthiness of long-term archives.
For true long-term archiving of special research data, RWTH can draw on the services of the North Rhine-Westphalia University Library Center (hbz) and use the infrastructure operated there to ensure long-term digital availability.
For standard archiving of research data in line with good scientific practice, researchers at RWTH Aachen University use the Coscine integration platform. This can be used to store, organize and archive research data efficiently over the long term.
RWTH data sets that cannot be classified as research data can be stored long-term in the DigitalArchive by RWTH employees (with the appropriate status in identity management).
Learn more
More detailed information on the digital preservation of research data can be found in the nestor manuals or via the nestor wiki (only in German).
If you have any questions about research data management in general, feel free to write to the ServiceDesk. The RDM team looks forward to hearing from you.
For more information on RDM, please visit the RWTH websites.
______
Responsible for the content of this article is Sophia Nosthoff.