
On June 6, 2025, the OpenWebSearch.eu consortium, an alliance of 14 renowned European research institutions, including CERN, launched the pilot phase of the European Open Web Index (OWI). The aim is to strengthen Europe’s digital sovereignty by creating open and balanced access to web data that is based on European values, data protection standards, and legislation. This should also provide new impetus for research data management (RDM).
Why a European Web Index Is Necessary
Until now, web search has been dominated by a few global providers such as Google, Microsoft, Baidu, and Yandex. These gatekeepers control what can be found and the order in which search results are displayed. Europe currently lacks an infrastructure of its own that would make it digitally independent. The OWI aims to close this gap with a transparent, decentralized solution that promotes fairness and independence.
Development and Technical Progress
Since September 2022, the consortium has been developing the basic infrastructure for the OWI, funded with €8.5 million from the EU Horizon program [1]. CERN is providing crucial technical contributions: crawling and indexing pipelines that process around nine million URLs per hour, which corresponds to several terabytes of data per day. Over a petabyte of openly licensed web content has already been indexed. By the end of 2025, the index is expected to cover 30 to 50 percent of the text-based internet.
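To make the throughput figures easier to relate to each other, the following minimal sketch converts the stated crawl rate into URLs per day and an implied daily data volume. Only the rate of nine million URLs per hour comes from the article; the average sizes per URL are hypothetical values chosen for illustration, not figures published by the consortium.

```python
# Back-of-the-envelope check of the crawl throughput.
# URLS_PER_HOUR is taken from the article; the average sizes per URL
# below are illustrative assumptions, not published OWI figures.

URLS_PER_HOUR = 9_000_000
HOURS_PER_DAY = 24
BYTES_PER_TERABYTE = 1e12  # decimal terabyte


def urls_per_day(urls_per_hour: int) -> int:
    """Number of URLs processed in 24 hours at a constant hourly rate."""
    return urls_per_hour * HOURS_PER_DAY


def daily_volume_tb(urls_per_hour: int, avg_bytes_per_url: int) -> float:
    """Implied daily data volume in terabytes for an assumed average size."""
    return urls_per_day(urls_per_hour) * avg_bytes_per_url / BYTES_PER_TERABYTE


if __name__ == "__main__":
    print(f"URLs per day: {urls_per_day(URLS_PER_HOUR):,}")
    for avg_kb in (10, 25, 50):  # hypothetical average sizes in kilobytes
        volume = daily_volume_tb(URLS_PER_HOUR, avg_kb * 1_000)
        print(f"assumed {avg_kb} kB per URL: about {volume:.1f} TB per day")
```

Under these assumptions, nine million URLs per hour amounts to 216 million URLs per day, and an average of a few tens of kilobytes per URL would be consistent with the "several terabytes per day" mentioned above.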
Open by Design: Values, Ethics, Data Protection
The OWI has been interdisciplinary from the outset. Technical disciplines work closely with the fields of law, ethics, and social sciences to identify systematic biases and ensure data protection. As is customary at CERN, openness is a key principle. Ever since the World Wide Web was invented at CERN, the mission has been to advance open science, and that mission now extends to web search.
Possible Uses and Applications
The OWI is currently available as a pilot under a research license. Initial applications are already in the planning stage, including argument search, vertical search services, and access for large language models and chatbots. A prototype called “Nooon” is aimed at people with disabilities: it provides structured, accessible, and representative content without allowing any conclusions to be drawn about the individual user.
Outlook and Next Steps
With its latest call in May 2025, the consortium invites developers to actively test the European search engine in projects ranging from vertical search to retrieval-augmented generation applications.
At the same time, a European ecosystem is being established. Additional data centers and third-party partners are being added, and market studies and funding models are currently being developed to ensure long-term sustainability. [2]
Conclusion
The European Open Web Index is a pioneering project for Europe’s digital self-determination. It provides an open, ethically grounded infrastructure that not only benefits research but also serves as a foundation for innovative applications such as accessible search and AI model training. Thanks to CERN’s technical expertise, strict adherence to European values, and active community involvement, a promising building block for a digital Europe is emerging – free, open, and sovereign.
If you have any questions about research data management, please do not hesitate to contact us. The RDM team looks forward to hearing from you and will be happy to help!
Hania Eid is responsible for the content of this article.
The following sources served as the basis for this article:
- CERN
- [1] OpenWebSearch.eu
- [2] OpenWebSearch.eu