
Research Data – Latest News & Worth Knowing

Review Coscine – FAIRly in Love with Research Data

February 23rd, 2023
Coscine Logo & FAIR Principles in heart shape

Source: Own illustration

If you had not already fallen in love with Coscine, the research data management platform will have cast its spell on you by February 14, 2023 at the latest – fittingly on the day of love. In this follow-up report, we summarize the event for you and explain why Coscine makes your heart race. The event “Coscine – FAIRly in Love with Research Data” took place as part of Love Data Week.

Getting to know each other – Speed Dating

If you do not yet know what Coscine is and who is behind it, read on:

Coscine is an open source platform that enables researchers to run their research data management (RDM) at the highest level and according to good scientific practice. The platform was developed at the IT Center of RWTH Aachen University and is currently offered across universities.

But why fall in love with an RDM platform?
It’s quite simple! Thanks to its project structure, Coscine makes data management easy and gives all project members access to their project data and code. In addition, research data can be automatically linked to metadata, and individual, project-specific metadata profiles, known as application profiles, can be created. These application profiles in turn contain individual combinations of metadata from various metadata schemas and terms. Integrated IT services such as the Research Data Storage (RDS) and GitLab are offered as resources for storing the data. The integration of Nextcloud and Sciebo is also planned for the future. When the RDS is used as a storage resource, research data can be archived for 10 years after the end of the project. Last but not least, Coscine supports all FAIR principles.

Have we convinced you? 😉

FAIRly in Love

Coscine manages to be in a very special harmony with the FAIR principles. So what makes Coscine FAIR?

Login via SSO and ORCID removes institutional boundaries for access to a project. Individual accounts allow the owner and all contributors of a dataset to be authenticated and to be given user-specific rights through roles such as Owner, Member or Guest. This makes research data accessible and reusable.

Metadata is captured at the project, resource, and file levels and automatically linked to the research data. Optionally, the metadata can be made publicly accessible within Coscine. It is also searchable via Elasticsearch. Technical mapping and validation are done using the W3C standards RDF and SHACL. Additionally, a connection to the NFDI4Ing metadata hub via “FAIR Digital Object” interfaces is planned. All these aspects contribute to making research data findable, interoperable and reusable.

The application profile generator from the AIMS (Applying Interoperable Metadata Standards) project allows profiles to be created with individual and discipline-specific metadata, eliminating the need for technical knowledge of RDF and SHACL. This makes research data findable and interoperable.
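
To give a feel for what validating metadata against an application profile means, here is a minimal Python sketch. It is a simplified stand-in that uses plain Python data structures instead of RDF and SHACL, and it is not Coscine’s actual implementation. The profile, its field names, and the validate function are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FieldRule:
    """One constraint of a (hypothetical) application profile."""
    name: str
    datatype: type
    required: bool = True


# Hypothetical profile; the field names are illustrative only.
PROFILE = [
    FieldRule("title", str),
    FieldRule("creator", str),
    FieldRule("created", date),
    FieldRule("temperature_kelvin", float, required=False),
]


def validate(record: dict, profile: list) -> list:
    """Return a list of human-readable violations (empty means valid)."""
    problems = []
    for rule in profile:
        if rule.name not in record:
            # Optional fields may be absent; required ones may not.
            if rule.required:
                problems.append(f"missing required field: {rule.name}")
            continue
        value = record[rule.name]
        if not isinstance(value, rule.datatype):
            problems.append(
                f"{rule.name}: expected {rule.datatype.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems
```

A record that satisfies every rule yields an empty list, and each missing required field or mismatched datatype adds one message. Real SHACL shapes can express much richer constraints, for example value ranges or lists of allowed values.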

Various resource types are available for storing research data. Among others, the RDS can be accessed via web browser or S3 client. There, a retention and archiving period of ten years from the end of the project is guaranteed. Externally stored research data can be linked with the help of Linked Data, and project-related repositories can be linked via GitLab resources. As a result, the research data can be reused as much as possible.

Handle-based ePIC Persistent Identifiers (PIDs) make the storage locations of the research data uniquely and permanently identifiable as well as referenceable on a global level. Furthermore, extended handle URLs enable fragment identifiers for individual files. This makes the research data findable and accessible.
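
The following Python sketch illustrates how a handle-based PID and a reference to a single file could be combined into one URL. It uses the public Handle System resolver at hdl.handle.net. The prefix “21.EXAMPLE”, the suffix, and the use of the URL fragment to address a file are assumptions made for illustration; this article does not specify the exact form of Coscine’s extended handle URLs.

```python
from typing import Optional, Tuple
from urllib.parse import quote, unquote, urlsplit

HANDLE_PROXY = "https://hdl.handle.net"  # public Handle System resolver


def pid_url(prefix: str, suffix: str, file_path: Optional[str] = None) -> str:
    """Build a resolvable URL for a handle, optionally pointing at one file."""
    url = f"{HANDLE_PROXY}/{prefix}/{quote(suffix)}"
    if file_path is not None:
        # Assumption of this sketch: the file is addressed via the URL fragment.
        url += "#" + quote(file_path)
    return url


def split_pid_url(url: str) -> Tuple[str, str]:
    """Split a PID URL into the handle and the (decoded) file reference."""
    parts = urlsplit(url)
    return parts.path.lstrip("/"), unquote(parts.fragment)
```

Calling pid_url with only a prefix and a suffix yields the URL of the resource as a whole; passing a file path as well yields a URL that additionally names one file inside it.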

For RDS resources, depending on the resource type, interaction is possible via browser, S3 protocol or REST API. The REST interface makes data and metadata independently findable and accessible. The ease of entering data and metadata into the system facilitates subsequent use. In addition, the interface enables workflows to be automated. Thus, research data is made findable and accessible.

FAIR Principles in heart shape

Source: Own illustration

Discover why your research data still needs your love

Long-term archiving is beyond the scope of Coscine. Researchers must therefore transfer their data and metadata to an appropriate archive or repository for availability beyond ten years after the end of the project.

In addition, the richness of the description in the research metadata (e.g., the choice of application profile, workflows) as well as the specification of a license are in the hands of the researchers.

Since Coscine allows a high degree of flexibility in the creation of application profiles, researchers must ensure that they reference appropriate domain-specific controlled vocabularies and ontologies when creating individual application profiles, and thus use (meta-)data corresponding to domain-relevant community standards.

Future FAIR improvements

Access protocols should be open, free, and universally implementable. For metadata, work is therefore underway to implement a FAIR Data Point as a standardized interface.

Metadata should be accessible even if the data is no longer available. Therefore, a permanent placeholder with metadata of deleted resources will be implemented in the future.

(Meta)data should contain qualified references to other (meta)data. So far, such references can only be added manually. Therefore, a project has been applied for to add a predefined option for linking (meta)data records.

In addition, a technical versioning of (meta)data is currently being implemented.

Workshops benefit from the exchange with each other

The lively participation of around 40 attendees from all over Germany led to an interesting discussion and many insights into the functions of Coscine. Attendees were guided live through the platform during the workshop and were able to ask their questions at the end. Below, we list the most interesting questions together with their answers.

1st question:
Is Coscine used across all faculties/areas at RWTH Aachen University? Are there any departments that cannot use Coscine?

Answer:
Yes, Coscine is already in use at RWTH Aachen University regardless of department and is a generic RDM platform (i.e., without a subject-specific focus). Since RWTH has a strong focus on engineering, mainly the associated faculties are represented in Coscine so far. Initially, Coscine only asks for generic metadata fields at project and resource level; researchers can select the subject-specific application profiles at file level themselves. Thus, each discipline can tailor Coscine to its own needs.

2nd question:
Do you obtain the ePIC PIDs from GWDG or do you have your own infrastructure at RWTH?

Answer:
The ePIC PIDs are obtained from the GWDG. The GWDG provides us with two servers for this purpose, one for test PIDs and one for production PIDs.

3rd question:
If Coscine is not intended for long-term archiving, then the PIDs are not really persistent, are they?

Answer:
The PIDs issued by Coscine persistently reference the digital object behind them, which is always a resource in Coscine. However, the data stored in that resource can change. This is the only way PIDs can be issued for “warm” data: they refer only to the storage environment of the data (for RDS, this is the respective S3 bucket). Once the 10 years of archiving after the end of the project have elapsed, the data is deleted. The PID will continue to point to the location (e.g., the deleted S3 bucket). Therefore, a permanent placeholder with metadata for deleted resources will be implemented at this location in the future.
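
The interplay of retention period and planned placeholder can be modeled in a few lines of Python. This is only a sketch under stated assumptions, not Coscine’s implementation: the function names and the returned dictionary are hypothetical, the placeholder is a planned feature, and the sketch assumes that data counts as deleted from the day after the ten-year period ends.

```python
from datetime import date

RETENTION_YEARS = 10  # retention period stated in the article


def retention_end(project_end: date) -> date:
    """Return the date ten years after the end of the project."""
    target_year = project_end.year + RETENTION_YEARS
    try:
        return project_end.replace(year=target_year)
    except ValueError:
        # February 29 has no counterpart in a non-leap target year;
        # falling back to February 28 is an assumption of this sketch.
        return project_end.replace(year=target_year, day=28)


def resolve(metadata: dict, project_end: date, today: date) -> dict:
    """Model what a PID lookup could return before and after deletion."""
    if today > retention_end(project_end):
        # The data is gone; only a placeholder with the metadata remains.
        return {"status": "deleted", "metadata": metadata}
    return {"status": "available", "metadata": metadata}
```

In this model, the PID resolves in both cases. What changes after the retention period is only what it resolves to: the resource itself or a placeholder that still carries the metadata.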

4th question:
To what extent is the use of Coscine (already) mandatory or is it only recommended so far?

Answer:
The use of Coscine is not mandatory; it will continue to be offered and recommended on a voluntary basis. What makes it attractive for researchers is mainly the access to free storage on the RDS and the compliance with RDM guidelines according to application requirements.

5th question:
How long did your pilot phase run?

Answer:
Coscine’s pilot phase started in March 2020, and we aim to have Coscine in regular operation in the second quarter of 2023.

6th question:
What is the size of the team of developers working on Coscine?

Answer:
Coscine has about ten employees with different specialties and employment contracts – a total of about six full-time positions. The developers are employed through various third-party funded projects.

7th question:
Where do new application profiles created with the AIMS application profile generator end up?

Answer:
New application profiles are deposited in the associated and open Coscine-GitLab repository.

Conclusion

We would like to thank all participants for their interest in Coscine and the resulting successful exchange with each other. See you next time!

Learn more

You don’t want to miss any more news about Coscine? Then subscribe to our mailing list and visit us on our website.

Do you have any questions or feedback? Then send a message to the IT-ServiceDesk. We are looking forward to your message!

______

Responsible for the content of this article are Ilona Lang and Arlinda Ujkani.
