1st Workshop on Creating Interoperable Corpora of Historical Newspapers (PressMint)
Second Call for Papers

Date: May 16, 2026, half-day workshop
Location: Palma de Mallorca, Spain
Website: https://www.clarin.eu/PressMint-LREC2026
Submission Deadline: 1 March 2026
Submission link: https://softconf.com/lrec2026/PressMint/

Advertisement/Tagline
Unlock pan-European history! Join the PressMint workshop to build & analyze multilingual, interoperable historical newspaper corpora!

Workshop description
Historical newspapers are of interest to historians and historical linguists, as well as to social and political scientists, ethnologists, anthropologists, media and communication scholars, and researchers in cultural studies. All of these are fields where contemporary digital resources, tools and methods (e.g. “distant reading”) are still underutilised. On the other hand, corpora of historical newspapers already exist to a large extent for a number of languages and countries, as they are out of copyright. The images, and often OCR, are also available through the national libraries. In recent years these data have attracted growing interest from researchers, since they preserve the historical, cultural, political and societal past. However, these corpora are not interoperable, which precludes methods for comparing them, as well as any translingual and transnational research. This is an especially important consideration, as statehood and nationhood are highly dynamic in Europe in the period to be covered by the project corpora.

An initial joint attempt towards the creation of a corpus of historical newspapers from the beginning of the 20th century onwards is the CLARIN flagship project PressMint (https://www.clarin.eu/pressmint). The project currently features data from 20 partners and aims to develop a standard for interoperable newspaper resources across diachronic timespans. The final goal is to provide structured, high-quality multilingual data in a common format, with the same type of linguistic annotation and covering (at least partially) the same time period.

Objective
The PressMint workshop aims to gather experts interested in creating, processing and analyzing interoperable corpora of historical data in general, with a particular focus on newspapers.
Another very important objective is to consider the perspective of the communities who use historical data: their purposes, requirements and feedback. We encourage interested colleagues to present work at both the national and the pan-European level, whether monolingual or multilingual, task-specific or multidisciplinary. We view this workshop as a venue to exchange research ideas and start collaborations on this topic.

The workshop will feature one invited speaker: Maud Ehrmann, EPFL, CH

We invite unpublished original work focusing on (but not limited to) the following topics:
* compilation, annotation, visualisation and utilisation of historical newspaper corpora of the period relevant to PressMint (ideally around the start of the 20th century but not constrained by this period)
* harmonisation of the existing multilingual historical newspaper corpora that contain either synchronic or diachronic data, or both
* linking or comparing historical newspaper corpora with other datasets, including sources of structured knowledge, such as formal ontologies and LOD datasets
* enrichment of historical newspaper corpora (with e.g. sentiment annotation, etc.)
* machine translation of historical newspaper corpora
* employment of LLMs as standalone tools or as parts of NLP architectures for historical data processing, maintenance and knowledge deployment
* various scenarios of usage of historical data

Submission & Publication
We accept submissions of long papers (6 to 8 pages), short papers (4 pages) and demo papers (4 pages), to be presented as long or short oral presentations or as posters at the workshop. To support double-blind reviewing, all submissions must be fully anonymized and should be formatted according to the LREC style guide; the stylesheet and formatting guidelines are available on the LREC 2026 website (https://lrec2026.info/authors-kit/). The workshop papers will be published in online proceedings.

At the time of submission, authors are also offered the opportunity to share related language resources with the community. All repository entries are linked to the LRE Map (https://lremap.elra.info/), which provides metadata for the resources.

Important Dates
* Paper submission deadline: 1 March 2026
* Notification of acceptance: 15 March 2026
* Camera-ready paper: 30 March 2026
* Workshop date: 16 May 2026
Organizing Committee
* Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences, PL
* Tanja Wissik, Austrian Academy of Sciences, AT
* Petya Osenova, Sofia University “St. Kl. Ohridski” & Bulgarian Academy of Sciences, BG

The workshop is supported by the CLARIN research infrastructure and the PressMint Project. To contact the organisers, please email maciej.ogrodniczuk@gmail.com