Janek Bevendorff, Benno Stein, Matthias Hagen, and Martin Potthast. Elastic chatnoir: search engine for the clueweb and the common crawl. In European Conference on Information Retrieval, 820–824. Springer, 2018.
Kai Erenli, Christian Geminn, and Leon Pfeiffer. Legal challenges of an open web index. International Cybersecurity Law Review, 2(1):183–194, 2021.
Sheikh Mastura Farzana and Tobias Hecking. Towards a scalable geoparsing approach for the web. In GeoExT@ECIR, 25–33. 2024.
Maik Fröbe, Matti Wiegmann, Nikolay Kolyada, Bastian Grahm, Theresa Elstner, Frank Loebe, Matthias Hagen, Benno Stein, and Martin Potthast. Continuous integration for reproducible shared tasks with tira.io. In European Conference on Information Retrieval, 236–241. Springer, 2023.
Michael Granitzer, Stefan Voigt, Noor Afshan Fathima, Martin Golasowski, Christian Guetl, Tobias Hecking, Gijs Hendriksen, Djoerd Hiemstra, Jan Martinovič, Jelena Mitrović, and others. Impact and development of an open web index for open web search. Journal of the Association for Information Science and Technology, 2023. doi:10.1002/asi.24818.
Gijs Hendriksen, Michael Dinzinger, Sheikh Mastura Farzana, Noor Afshan Fathima, Maik Fröbe, Sebastian Schmidt, Saber Zerhoudi, Michael Granitzer, Matthias Hagen, Djoerd Hiemstra, and others. The open web index: crawling and indexing the web for public use. In European Conference on Information Retrieval, 130–143. Springer, 2024.
Dirk Lewandowski. The web is missing an essential part of infrastructure: an open web index. Commun. ACM, 62(4):24, 2019. doi:10.1145/3312479.
Jimmy Lin, Joel Mackenzie, Chris Kamphuis, Craig Macdonald, Antonio Mallia, Michał Siedlaczek, Andrew Trotman, and Arjen de Vries. Supporting interoperability between open-source search engines with the common index file format. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2149–2152. 2020.
Benno Stein and Sven Meyer zu Eissen. Retrieval models for genre classification. Scandinavian Journal of Information Systems, 20(1):3, 2008.
Matti Wiegmann, Magdalena Wolska, Christopher Schröder, Ole Borchardt, Benno Stein, and Martin Potthast. Trigger warning assignment as a multi-label document classification problem. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 12113–12134. 2023.