November 2024 - Corpora

Associate Professor in Artificial Intelligence (Research and Education)
by Mark Lee 27 Nov '24

Associate Professor in Artificial Intelligence (Research and Education)

Location: Birmingham
Salary: £56,921 to £65,814, Grade 9
Hours: Full Time
Contract Type: Permanent
Placed On: 26th November 2024
Closes: 8th January 2025
Job Ref: 101425

We are looking for applications from academics specialising in all areas of Artificial Intelligence, including Natural Language Processing.

Please see https://www.jobs.ac.uk/job/DKV585/associate-professor-in-artificial-intelli…

For applicants working in NLP, I am more than happy to answer any questions.

With best regards,
Mark Lee
Professor of Artificial Intelligence
School of Computer Science
University of Birmingham
http://www.cs.bham.ac.uk/~mgl

PhD Scholarships in Artificial Intelligence (NLP and e-Health) at Queen’s University Belfast, UK
by hasanuzzaman.im＠gmail.com 27 Nov '24

Dear Colleagues,

I am looking for PhD students in e-Health and Sign Language Translation at the School of Electronics, Electrical Engineering and Computer Science (EEECS), Queen's University Belfast, UK. The details of the projects are below:

1. Breaking Barriers: AI-Powered Sign Language Translation for Enhanced Healthcare Access for the Deaf (https://www.qub.ac.uk/courses/postgraduate-research/phd-opportunities/break…). The project is co-supervised by Prof. Asif Ekbal, Professor, IIT Jodhpur, India, and Dr. Anna Jurek-Loughrey, Senior Lecturer, EEECS, Queen's University Belfast.

2. Empowering Mothers: AI-Enhanced Personalized Management of Gestational Diabetes (https://www.qub.ac.uk/courses/postgraduate-research/phd-opportunities/empow…). The project is co-supervised by Prof. Pietro Liò, Professor, Department of Computer Science and Technology, University of Cambridge, UK.

Both projects offer an exciting opportunity for students to spend time at the prestigious University of Cambridge in the UK and IIT Jodhpur in India.

Requirements:
The successful candidate is expected to have a solid background in NLP, Machine Learning, Statistics, Computer Science or a related discipline. The minimum academic requirement for admission to a research degree programme is normally an Upper Second Class Honours degree from a UK or ROI HE provider, or an equivalent qualification acceptable to the University. Refer to the official application process for details: https://www.qub.ac.uk/courses/postgraduate-research/phd-opportunities/empow…

Application Process:
For enquiries, please send an email to Dr. Mohammed Hasanuzzaman at m.hasanuzzaman(a)qub.ac.uk with your CV and transcript as well as a brief description of your research interests. Unfortunately, due to a high volume of inquiries, I may not be able to respond to all emails.

Submit your formal application through the following link: https://dap.qub.ac.uk/portal/user/u_login.php
Your application should be clearly marked as EEECS/2025/MH1 and/or EEECS/2025/MH2 to ensure consideration for funding.

About Queen's:
Queen's University Belfast, founded almost two centuries ago, is one of the oldest universities in the United Kingdom. As a member of the prestigious Russell Group, Queen's is one of the UK's 24 leading research-intensive universities (ranked 13th in the UK for research intensity). For more details: https://www.qub.ac.uk/Study/Why-Study-at-Queens/rankings-and-reputation/

*International students are welcome to apply, but additional funding (https://www.qub.ac.uk/Study/international-students/international-scholarshi…) or personal finances will be required to cover the difference between home (UK) and overseas fees.

Best regards,
Mohammed

Research position (PhD/PostDoc) at Bielefeld University, Germany
by Sina Zarrieß 27 Nov '24

The Computational Linguistics Group at Bielefeld University is looking for a

** Researcher (full-time, PhD or PostDoc) **

to work in a project on measuring linguistic creativity in different genres. The project is part of a newly established Collaborative Research Centre (CRC 1646) on “Linguistic Creativity in Communication” [1].

The announced position will focus on computational aspects of measuring linguistic creativity. A central task will be to develop sentence embedding models that disentangle style and content, based on language models for different literary and non-literary genres, and to work on the analysis of linguistic creativity in large language models, in collaboration with other projects of the CRC.

The duration of the position is approx. 3 years (until the end of 2027).

Deadline for applications: 10.12.2024 (extended)
Contact: Sina Zarrieß (sina.zarriess(a)uni-bielefeld.de)
Apply here: https://uni-bielefeld.hr4you.org/job/view/3794/research-position?page_lang=…

[1] https://www.uni-bielefeld.de/fakultaeten/linguistik-literaturwissenschaft/f…

--
Prof. Dr. Sina Zarrieß
Computational Linguistics
https://clause-bielefeld.github.io/
University of Bielefeld
Universitätsstr. 25
33615 Bielefeld, Germany
+49 521 106-2534

NBME Summer Internships in Psychometrics and Data Science
by Christopher Runyon 26 Nov '24

NBME (National Board of Medical Examiners) Summer 2025 Internships in Psychometrics and Data Science
June 2 - July 25, 2025

NBME invites applications for multiple full-time internship positions, all fully remote, for the Summer of 2025. Over an 8-week period, interns will have the opportunity to collaborate with NBME staff and interact with fellow graduate students as they complete a research project. The expected deliverable is an internal research presentation. Specific projects for the summer of 2025 will be discussed with applicants as part of the interview process.

Compensation is $12,600, and all interns are eligible to receive up to $1,000 to support their attendance at a conference (not conditional on presenting).

The application deadline is Wednesday, January 31, 2025, at midnight PST. Interested students can learn more and apply here: https://nbme.applicantpro.com/jobs/3562778

If you have any questions or need additional information, please contact Chris Runyon: CRunyon(a)nbme.org

Call for participation: MultiLexNorm 2: Multilingual Lexical Normalization
by Rob van der Goot 26 Nov '24

Dear all,

After a successful first edition in 2021, we are glad to invite you to the second MultiLexNorm shared task! The shared task will be hosted at WNUT 2025.

As defined in the previous iteration, lexical normalization is: the task of transforming an utterance into its standard form, word by word, including both one-to-many (1-n) and many-to-one (n-1) replacements.

Building on the previous task, which focused on Indo-European languages written in the Latin script, we extended the benchmark to include languages written in other scripts. We now include data for Thai, Vietnamese, and Indonesian.

The data and more information about the task can be found on: https://noisy-text.github.io/2025/multi-lexnorm.html#

Dates:
Data available    Nov 15, 2024
Data freeze       Jan 07, 2025
Test data         Jan 25, 2025
Final Evaluation  Feb 07, 2025
Paper deadline    Feb 25, 2025
Paper reviewed    Mar 01, 2025
Camera ready      Mar 10, 2025
Workshop          May 03, 2025 (TBD)

Best,
The organizers:
Rob van der Goot
Weerayut Buaphet
Peerat Limkonchotiwat
Thanh-Nhi Nguyen
Thanh-Phong Le
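
The definition of lexical normalization quoted in the call above (word-by-word replacement, including 1-n and n-1 cases) can be illustrated with a minimal sketch. This is only a toy dictionary lookup under stated assumptions: the lexicon entries, the function name `normalize`, and the restriction of n-1 merges to two-token spans are all hypothetical and are not taken from the MultiLexNorm data or any shared-task baseline.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (an assumption for illustration only).
# One raw token maps to one or more standard tokens (1-1 and 1-n).
ONE_TO_MANY: Dict[str, List[str]] = {
    "u": ["you"],               # 1-1 replacement
    "gonna": ["going", "to"],   # 1-n replacement
}

# Several raw tokens map to a single standard token (n-1).
MANY_TO_ONE: Dict[Tuple[str, str], str] = {
    ("some", "one"): "someone",
}


def normalize(tokens: List[str]) -> List[str]:
    """Normalize a tokenized utterance word by word with a lookup table."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        # Try an n-1 merge of the current token with the next one first.
        pair = tuple(t.lower() for t in tokens[i:i + 2])
        if len(pair) == 2 and pair in MANY_TO_ONE:
            out.append(MANY_TO_ONE[pair])
            i += 2
            continue
        # Otherwise apply a 1-1 or 1-n replacement, or keep the token as is.
        tok = tokens[i]
        out.extend(ONE_TO_MANY.get(tok.lower(), [tok]))
        i += 1
    return out


if __name__ == "__main__":
    print(normalize("u gonna meet some one".split()))
```

Real systems for this task typically need context to decide whether a token should be changed at all; the lookup here only shows what the 1-n and n-1 alignments look like.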

Call for participation for MultiLexNorm 2: shared task on Multilingual Lexical Normalization
by Rob van der Goot 26 Nov '24

Dear all,

After a successful first edition in 2021, we are glad to invite you to the second MultiLexNorm shared task! The shared task will be hosted at WNUT 2025.

Building on the previous task, which focused on Indo-European languages written in the Latin script, we extended the benchmark to include languages written in other scripts. We now include data for Thai, Vietnamese, and Indonesian.

The data and more information about the task can be found on: https://noisy-text.github.io/2025/multi-lexnorm.html#

Dates:
Data available    Nov 15, 2024
Data freeze       Jan 07, 2025
Test data         Jan 25, 2025
Final Evaluation  Feb 07, 2025
Paper deadline    Feb 25, 2025
Paper reviewed    Mar 01, 2025
Camera ready      Mar 10, 2025
Workshop          May 03, 2025 (TBD)

Best,
The organizers:
Rob van der Goot
Weerayut Buaphet
Peerat Limkonchotiwat
Thanh-Nhi Nguyen
Thanh-Phong Le

Call for Application: Postdoctoral Positions at the Chilean National Center for Artificial Intelligence (CENIA)
by Felipe Bravo 26 Nov '24

As part of its development plan, the Chilean National Center for Artificial Intelligence (cenia.cl/en <https://cenia.cl/en/home/>) invites interested individuals to apply for 3 Postdoc positions in AI and related areas. Candidates must hold a PhD in AI or related fields, such as cognitive robotics, neuroscience, cognitive science, mathematics, or related scientific areas.

In line with our commitment to actively promote gender equity in the field of AI, the postdoctoral competition will include diversity criteria in its evaluation process.

The positions aim to conduct research activities in the following areas:

1. *Deep Learning for Vision and Language*: New theories and methods to further unlock the potential of Deep Learning and Generative AI to create advanced cognitive systems with a focus on vision, language, and multimodality.
2. *Neuro-symbolic AI*: Integration of deductive AI and machine learning-based AI, mutually invoking solutions from each part, injecting and utilizing semantics in current AI models.
3. *Brain-inspired AI*: Brings together scientists from neuroscience, cognitive psychology, and AI to develop AI models based on brain mechanisms and cognitive functions, while also developing AI tools to understand human cognition.
4. *Physics-based Machine Learning*: Gathers mathematicians, physicists, and AI scientists to develop hybrid machine learning models based on physical laws to solve complex problems in science and engineering, generate scientific discoveries, and explore causal relationships.
5. *Human-centered AI*: New technologies for the fair, safe, and transparent use of AI in society, as well as methodologies to assess its impact. Promotes new tools for interpretable and explainable AI.

*I. Eligibility Criteria:*

Applicants must:
- Hold a PhD degree.
- Have proven research experience, including a record of publications.
- Have experience in developing or using AI tools.
- Be available to work full-time in Santiago, Chile, joining Cenia preferably no later than April 2025 (exceptional cases will be considered).
- Not receive incentives or salaries from other sources of the Chilean Research and Development Agency (ANID).

*II. Roles/Responsibilities of the Cenia Postdoc:*

- Conduct research in Cenia’s areas of interest.
- Participate in Cenia’s applications for national and international funding.
- Apply for competitive grants as the lead researcher, including the Fondecyt Postdoctoral Competition 2025.

*III. Documents to Submit with Application:*

*Phase 1:* Submit the following documents to Cenia.
- An updated full CV, including a list of scientific publications.
- A statement of interest (maximum 2 pages), which should include:
  - Reasons for joining Cenia.
  - Main research topics of interest at Cenia.
  - Groups or individuals at Cenia you would like to collaborate with.
- Two letters of recommendation (free format).

*Phase 2:* Pre-selected candidates will be contacted to prepare a brief research proposal, in collaboration with Cenia’s researchers.

*IV. Important Dates:*

- Phase 1: October 18 to December 16 (late applications may be considered in exceptional cases).
- Phase 2: December 20 to January 20, 2025.
- Results announcement: Monday, January 27, 2025.

*V. Benefits:*

- Full-time contract (42 hours/week) for one year, renewable, with the possibility of obtaining a permanent contract as a Cenia Researcher.
- Monthly gross salary of CLP 2,500,000, plus financial incentives for securing competitive grants at national and international levels.
- Access to a stimulating research environment, excellent facilities, and cutting-edge computational resources in Latin America.
- Full-time dedication to research, with no teaching obligations.
- Support for grant applications.
- Opportunity to participate in Cenia’s scientific and technological transfer activities to the industrial and governmental sectors.
- Opportunity to participate in outreach and community engagement activities.

*VI. About Cenia:*

Cenia is one of the leading organizations dedicated to AI development in Chile and Latin America. Founded in 2021, its mission is to generate knowledge and technological solutions based on AI. Cenia works closely with universities, research centers, and companies both nationally and internationally, fostering interdisciplinary research and training specialized talent in AI. Additionally, Cenia stands out for its commitment to gender equity and inclusion, actively promoting diversity in all its projects and programs.

*All documents and inquiries should be sent to postdoc(a)cenia.cl, with the subject line: “Postdoc Application 2024”. For any questions about the application process, send an email to talento(a)cenia.cl.*

*Link:* https://cenia.cl/en/2024/10/18/call-for-application-postdoctoral-positions/

[CfP] - Call for Research & Innovation Papers at SEMANTiCS 2025
by Beyza Yaman 26 Nov '24

Call for Research & Innovation Papers

SEMANTiCS 2025 EU
21st International Conference on Semantic Systems
Vienna, Austria
September 3 - 5, 2025

Important Dates:
- Abstract Submission Deadline: April 25, 2025
- Paper Submission Deadline: May 2, 2025
- Notification of Acceptance: June 13, 2025
- Camera-Ready Paper Deadline: July 04, 2025

All deadlines are set for 11:59 pm, Anywhere On Earth time (UTC-12).

Submissions will be through Easychair and the submission link will be provided soon. Proceedings of SEMANTiCS 2025 EU will be made available open access.

Research and Innovation Track

The SEMANTiCS 2025 conference is excited to invite submissions for the Research and Innovation Track, welcoming groundbreaking research contributions, innovative solutions, and experimental studies relevant to the Semantic Web, Semantic Technologies, and AI-enabled semantics. We also encourage submissions at the intersections of these fields with other scientific and applied disciplines, fostering cross-disciplinary exchange and advancement.

Papers should present original work that has not been published or is not under consideration elsewhere. All submissions must adhere to the submission guidelines, including reference formatting and any additional documentation as required. Each submission will undergo a rigorous review process, with at least three independent reviews, evaluating the novelty, technical quality, reproducibility, and practical relevance of the work.

Topics of Interest

SEMANTiCS 2025 calls for submissions of high-quality research papers across a broad spectrum of topics in Semantic Web, Semantic Technologies, and AI. We are particularly interested in new and emerging trends, especially where semantic technologies intersect with evolving fields such as large language models, explainable AI, and trustworthy data infrastructures.

Topics of interest include, but are not limited to:
- Web Semantics & Linked (Open) Data
- Enterprise Knowledge Graphs, Graph Data Management
- Machine Learning Techniques for/using Knowledge Graphs (e.g. reinforcement learning, deep learning, data mining and knowledge discovery)
- Generative AI and Knowledge Graphs (e.g., Retrieval-Augmented Generation (RAG) with knowledge graph integration, generative model grounding)
- Reasoning, Rules, and Policies on RAG
- Knowledge Engineering and Management (e.g., knowledge acquisition, extraction, integration, and publication workflows)
- Terminology, Thesaurus & Ontology Management, Ontology engineering
- Web agents
- Natural Language Processing for/using Knowledge Graphs (e.g. entity linking and resolution using target knowledge such as Wikidata and DBpedia, foundation models)
- Crowdsourcing for/using Knowledge Graphs
- Data Quality Management and Assurance
- Mathematical and Logical Foundations of Knowledge-aware AI
- Multimodal Knowledge Graphs (e.g., text, image, audio fusion in graph structures)
- Semantic-Enhanced Data Science Pipelines and Processes
- Semantics in Blockchain environments (e.g., traceability, decentralized knowledge representation)
- Trust, Data Privacy, and Security with Semantic Technologies
- Internet of Things (IoT), Stream Processing, and Temporal Data Management (e.g., real-time semantic processing and predictive analytics)
- Conversational AI and Dialogue Systems powered by Knowledge Graphs
- Provenance and Data Change Tracking (e.g., semantic versioning, data updates in distributed settings)
- Semantic Interoperability (e.g., cross-domain standards, mapping frameworks, ontology alignment)
- Linked Data storage, triple stores, graph databases
- Robust, Scalable, and Fault-Tolerant Semantic Data Systems (e.g., distributed querying, optimization)
- User Interfaces and Usability of Semantic Technologies (e.g., visualizations, intelligent user interaction)
- Explainable and Interoperable AI
- Decentralised and Federated Knowledge Graphs (e.g., federated querying, link traversal)

Applied Semantic Technologies and AI in Real-World Scenarios, such as, but not limited to:
- Biomedicine and Health (e.g., Knowledge Graphs for biomedical applications, AI-driven diagnostics, personalized health)
- AI for Environmental and Climate Solutions (e.g., semantic modeling for environmental impact, biodiversity knowledge graphs)
- Scientific Knowledge Graphs and Open Science (e.g., FAIR data principles, enhanced scholarly communication)
- Semantic Technologies in GLAM (Galleries, Libraries, Archives, and Museums)
- Knowledge Graphs and Hybrid AI for Industry 4.0/5.0 and Predictive Maintenance
- Digital Humanities and Cultural Heritage Preservation
- Legal Technology, AI Ethics, and Regulatory Compliance (e.g., AI and legal frameworks, semantic-enabled compliance with the EU AI Act)
- Economics and Governance of Data Ecosystems (e.g., data marketplaces, semantic service interoperability, data policy)

Submissions will be through Easychair. Stay tuned for the submission link.

For Submission Guidelines and Review and Evaluation Criteria please head to the online call for papers: https://2025-eu.semantics.cc/page/cfp_rev_rep

We would highly appreciate it if you could disseminate this call within your network. We look forward to receiving your contributions!

Research and Innovation Track Chairs
Blerina Spahiu (University of Milano-Bicocca, IT)
Mehdi Ali (Lamarr Institute & Fraunhofer IAIS, Germany)

Kind Regards,
On behalf of the organising committee,
Beyza Yaman

Call for papers: Register and task variation in Learner Corpus Research (VAR4LCR), Louvain-la-Neuve, 7-8 July 2025
by Gaëtanelle Gilquin 25 Nov '24

Register and task variation in Learner Corpus Research (VAR4LCR), Louvain-la-Neuve, 7-8 July 2025

To mark the end of a Hoover Seedfund collaborative project between the Centre for English Corpus Linguistics at the University of Louvain (UCLouvain) and the Department of English at Northern Arizona University (NAU), a conference on Register and task variation in Learner Corpus Research will be organized in Louvain-la-Neuve, Belgium, on 7-8 July 2025 under the aegis of the Learner Corpus Association. The conference will be an on-site event only.

RATIONALE

Ample evidence has been provided to demonstrate that language varies according to register. Much of this evidence for register variation comes from corpora, which have provided insights into linguistic patterns associated with distinct registers (see, e.g., Biber 1988, Biber 2006, Biber & Egbert 2018). Register variation is an important aspect of any language and language variety, but it is particularly relevant in the case of learner language, because L2 learners may not show the same register awareness as native/expert writers/speakers (cf. Gilquin & Paquot 2008, Larsson 2019).

Learner Corpus Research (LCR) has taken register into account in the sense that studies have been carried out on the basis of learner corpora representing certain registers (e.g. essays in Ädel 2006 or interviews in Götz 2013). However, studies comparing learner language registers are less common. Yet, the LCR studies that have drawn such comparisons have highlighted the importance of register for learner language (e.g. Fuchs et al. 2016, Staples et al. 2018, Larsson et al. 2021).

Related to register is the notion of task, particularly relevant in LCR since learner corpora are often compiled from data produced as part of specific pedagogical tasks (e.g. writing a letter, describing a graph, retelling a movie scene). As with register, LCR studies comparing different tasks are not very common, but they have underlined the potential effect of this variable (e.g. Tracy-Ventura & Myles 2015, Alexopoulou et al. 2017, Gablasova et al. 2017, Goulart & Dixon 2025).

SUBMISSIONS

We welcome submissions which compare two or more registers or tasks in corpora of learner language, using the methods of corpus linguistics / LCR, and which analyse the possible effects of register/task on the linguistic features of learner language. The learner registers/tasks may, in addition, be compared against some reference corpus data such as native or expert language. Both quantitative and qualitative approaches are welcome, with a focus on any aspects of language (phraseology, grammatical complexity, fluency, etc.).

We are particularly interested in submissions that
- use data representing different registers/tasks produced by the same L2 learners;
- compare registers/tasks displaying different degrees of formality or involving different degrees of communicative control;
- combine quantitative and qualitative methods of analysis;
- discuss the methodological issues related to the comparison of registers/tasks in learner language;
- include under-researched registers/tasks/languages.

There will be three different categories of presentation:
- full paper
- work-in-progress report
- poster

ABSTRACTS

Abstracts should be about 500 words (not including references) and specify how the paper will contribute to the theme of the conference, in particular by highlighting the registers/tasks that will be compared and the corpora that will be used. They should also provide a clear outline of the aim(s) of the paper, including clearly articulated research questions, sufficient details about the methodology and (preliminary) results.

Abstracts should be uploaded to OpenReview (https://openreview.net/group?id=VAR4LCR/2025/Conference) no later than 20 January 2025 at 23:59 UTC.
If you are new to OpenReview, you will first have to create a profile. We recommend that you use an institutional email address to do so, as profiles created without an institutional email address will go through a moderation process that can take up to two weeks. Please note that most questions asked as part of the creation of a profile are optional. Providing your current institution (under ‘History’) and a link to a webpage that displays your name and email address (under ‘Personal Links’) will be sufficient.

IMPORTANT DATES

- Deadline for submission of abstracts: 20 January 2025
- Notification of acceptance/rejection: 15 March 2025
- Conference: 7-8 July 2025

KEYNOTE SPEAKERS

We are pleased to announce that the following speakers have agreed to give a keynote presentation at the conference:
- Prof. Douglas Biber (Northern Arizona University)
- Prof. Marije Michel (University of Groningen)
- Prof. Shelley Staples (University of Arizona)

LOCAL ORGANIZING COMMITTEE

Sylvie De Cock (UCLouvain)
Gaëtanelle Gilquin (UCLouvain)
Sylviane Granger (UCLouvain)
Pauline Jadoulle (UCLouvain)
Magali Paquot (UCLouvain)
Lieven Vandelanotte (UNamur)

SCIENTIFIC COMMITTEE

Douglas Biber (NAU)
Sylvie De Cock (UCLouvain)
Jesse Egbert (NAU)
Gaëtanelle Gilquin (UCLouvain)
Sylviane Granger (UCLouvain)
Francesca Grixoni (NAU)
A.J. Holmberg (NAU)
Pauline Jadoulle (UCLouvain)
Tove Larsson (NAU)
Magali Paquot (UCLouvain)
Randi Reppen (NAU)

CONFERENCE WEBSITE: https://uclouvain.be/en/research-institutes/ilc/cecl/register-and-task-vari…

CONTACT: var4lcr(a)uclouvain.be

REFERENCES

Ädel, Annelie. 2006. Metadiscourse in L1 and L2 English. Amsterdam: John Benjamins.
Alexopoulou, Theodora, Michel, Marije, Murakami, Akira & Meurers, Detmar. 2017. Task effects on linguistic complexity and accuracy: A large-scale learner corpus analysis employing natural language processing techniques. Language Learning 67(S1): 180-208.
Biber, Douglas. 1988. Variation across Speech and Writing. Cambridge: Cambridge University Press.
Biber, Douglas. 2006. University Language: A Corpus-based Study of Spoken and Written Registers. Amsterdam: John Benjamins.
Biber, Douglas & Egbert, Jesse. 2018. Register Variation Online. Cambridge: Cambridge University Press.
Fuchs, Robert, Götz, Sandra & Werner, Valentin. 2016. The present perfect in learner Englishes: A corpus-based case study on L1 German intermediate and advanced speech and writing. In Valentin Werner, Elena Seoane & Cristina Suárez-Gómez (eds) Re-assessing the Present Perfect (pp. 297-338). Berlin: De Gruyter.
Gablasova, Dana, Brezina, Vaclav, McEnery, Tony & Boyd, Elaine. 2017. Epistemic stance in spoken L2 English: The effect of task and speaker style. Applied Linguistics 38(5): 613-637.
Gilquin, Gaëtanelle & Paquot, Magali. 2008. Too chatty: Learner academic writing and register variation. English Text Construction 1(1): 41-61.
Götz, Sandra. 2013. Fluency in Native and Nonnative English Speech. Amsterdam: John Benjamins.
Goulart, Larissa & Dixon, Tülay. 2025. The relative influence of language backgrounds, communicative text types, and disciplines in undergraduate student writing. International Journal of Learner Corpus Research 11(1).
Larsson, Tove. 2019. Grammatical stance marking across registers: Revisiting the formal-informal dichotomy. Register Studies 1(2): 243-268.
Larsson, Tove, Paquot, Magali & Biber, Douglas. 2021. On the importance of register in learner writing: A multi-dimensional approach. In Elena Seoane & Douglas Biber (eds) Corpus-based Approaches to Register Variation (pp. 235-258). Amsterdam: John Benjamins.
Staples, Shelley, Biber, Douglas & Reppen, Randi. 2018. Using corpus-based register analysis to explore the authenticity of high-stakes language exams: A register comparison of TOEFL iBT and disciplinary writing tasks. The Modern Language Journal 102(2): 310-332.
Tracy-Ventura, Nicole & Myles, Florence. 2015. The importance of task variability in the design of learner corpora for SLA research. International Journal of Learner Corpus Research 1(1): 58-95.

Opera Graeca Adnotata (v0.2.0)
by Giuseppe G. A. Celano 25 Nov '24

Dear All,

I am delighted to announce the release of Opera Graeca Adnotata (v0.2.0). With its 1,999 texts and 40M+ tokens, Opera Graeca Adnotata is the largest multilayer corpus for Ancient Greek online.

The corpus can be queried through ANNIS 4 <https://annis.varro.informatik.uni-leipzig.de/?id=f49da342-4db7-4283-b6c1-3…> or be downloaded from Zenodo <https://zenodo.org/records/14206061>.

For more information, visit the GitHub repository <https://github.com/OperaGraecaAdnotata/OGA>, where you can also find instructions on how to query the corpus <https://github.com/OperaGraecaAdnotata/OGA/tree/main/query>.

Best regards,
Giuseppe

Dr. Giuseppe G. A. Celano
DFG-project leader
Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
04109 Leipzig
Deutschland
Tel: +4934132223
