- Corpora - ELRA lists

2 Fellow positions at PSL ML for the Sciences (5 years)
by A. Allauzen 10 Sep '24

If you are a young researcher (PhD holder) in core AI fields (mathematics or computer science) or in applied domains (natural language or speech processing, physics, astrophysics, chemistry, life sciences, cognitive sciences, etc.), join us! PSL is opening two Fellow positions. A Fellow position is an attractive postdoc position combined with teaching activities in AI and data science. PSL provides an exceptional environment for research in data science and artificial intelligence. Depending on your skills, you will be attached to one of the institutions of PSL. In your application you can state preferences that correspond to your research project, or we can guide you.

- Application deadline: October 31st, 2024
- Candidates selected for an audition will be invited for an in-person or video interview.
- Expected starting date: January 2025

If you need more information, feel free to contact me.

Best regards,
Alexandre Allauzen

Call for Shared Task - RIRAG: Regulatory Information Retrieval and Answer Generation at RegNLP 2025
by Tuba Gokhan 10 Sep '24

The Regulatory Information Retrieval and Answer Generation (RIRAG) Shared Task will take place as part of the RegNLP 2025 Workshop on January 20th, 2025, in conjunction with the COLING 2025 conference in Abu Dhabi, UAE.

Regulatory documents contain critical information that governs compliance and legal obligations across industries. However, retrieving relevant information and generating precise answers from these documents remains a challenging task due to their complex and diverse nature. The RIRAG Shared Task aims to address this challenge by encouraging the development of state-of-the-art systems for the retrieval and generation of answers from regulatory documents.

===Shared Task Objectives===
-Task 1: Regulatory Information Retrieval
Participants will build models that retrieve relevant passages from a collection of regulatory documents, given a set of questions.
-Task 2: Regulatory Answer Generation
Participants will generate concise and accurate answers to regulatory questions based on the retrieved passages.

===Participation Details===
Participants will be provided with a comprehensive dataset, including regulatory documents and a set of questions related to compliance and obligations. The shared task will be divided into three phases: the Development Phase, the Testing and Submission Phase, and the Evaluation Phase. Participants will also be invited to describe their system in a paper for the RegNLP workshop proceedings.

===Important Dates===
-Development Phase:
--Start: September 10, 2024
--End: November 10, 2024
Description: Participants can use all provided data for both Task 1 and Task 2. During this phase, test data will be made available for participants to evaluate their own models, but results will not determine final awards.
-Testing and Submission Phase:
--Start: November 10, 2024
--End: November 20, 2024
Description: Unseen questions will be released for participants to apply their models. Final submissions must be made during this period.
-Evaluation Phase:
--Start: November 20, 2024
--End: November 25, 2024
Description: Organizers will evaluate the submitted results, and winners will be determined based on their performance.
-System Paper Submission Phase:
--Start: November 25, 2024
--End: December 10, 2024
Description: Winning teams are expected to submit a detailed system paper, outlining their methodologies and findings.

===Join the Shared Task===
Interested participants can join the shared task and access more information by visiting our competition page on Codabench: https://www.codabench.org/competitions/3527/

===Baseline System===
For reference, participants can access the baseline system: https://arxiv.org/abs/2409.05677

===More Information===
For more information about RegNLP and RIRAG, please visit our website: https://regnlp.github.io/

===Contact Information===
For inquiries related to the shared task, please contact us via email at: regnlp2025(a)gmail.com

SEARCH SOLUTIONS 2024 - registration open
by Udo Kruschwitz 09 Sep '24

Search Solutions 2024
Wednesday 27 November 2024, London
Innovations in Search and Information Retrieval

Search Solutions is the BCS Information Retrieval Specialist Group’s (BCS IRSG) annual event focused on practitioner issues in the arena of search and information retrieval (IR). We bring together practitioners, researchers, analysts and end users to discuss the latest developments in the IR community and to share insights between research and practice. The event consists of a Tutorial day (26 November) and a Conference day (27 November), each of which has a separate registration. The conference day includes presentations, panels and keynote talks by influential industry leaders on novel and emerging applications in search and information retrieval.

09:30 - 09:50 Registration and coffee

SESSION 1: THE SEARCH EXPERIENCE: FOCUS ON THE USER
09:50 - 10:00 Introduction
10:00 - 10:20 Tanja Svarre (University of Aalborg) “People search in the enterprise”
10:20 - 10:40 Eugene Morozov (ISKO) “Search and Browse Use Cases in Data Governance”
10:40 - 10:55 Panel Q&A
10:55 - 11:10 BREAK

SESSION 2: BEYOND KEYWORD SEARCH: RETRIEVAL-AUGMENTED GENERATION
11:10 - 11:30 Dyaa Albakour (SIGNAL AI) “Mapping the landscape of narratives in Global Media”
11:30 - 11:50 Alessandro Benedetti (Sease) “What I don’t like about RAG: can we do better?”
11:50 - 12:10 Taketomo Isazawa (Microsoft Research) “Beyond RAG: Integrating Knowledge with LLMs”
12:10 - 12:25 Panel Q&A
12:25 - 13:30 LUNCH

SESSION 3: SEARCH WITH AN IMPACT: SYSTEMATIC REVIEWS
13:30 - 13:50 Maria-Inti Metzendorf (Cochrane Evidence Synthesis Unit Düsseldorf) “Searching for Living Systematic Reviews – overview and case report”
13:50 - 14:10 James Thomas (UCL) “The responsible use of AI in evidence synthesis: collaborative development of guidance and recommendations”
14:10 - 14:25 Panel Q&A
14:25 - 14:40 BREAK

SESSION 4: SEARCH IN INDUSTRIAL SETTINGS: GETTING IT RIGHT
14:40 - 15:00 Gabriella Kazai (Amazon) “What Matters in a Measure? A Perspective from Large-Scale Search Evaluation”
15:00 - 15:20 Charlie Hull (OpenSource Connections) “Measure and Tune your Search with User Behaviour Insights”
15:20 - 15:40 Daniel Tunkelang (Consultant) & Aritra Mandal (eBay) “Modeling Queries as Bags of Documents”
15:40 - 15:55 Panel Q&A
15:55 - 16:30 BREAK

16:30 - 17:15 PANEL SESSION
Panel topic: “Has generative AI removed the need for standalone search? A discussion”
Panel chair: Michael Upshall (The Search Network)
17:15 - 17:25 BCS SEARCH INDUSTRY AWARDS
17:25 - 17:30 Closing words
17:30 EVENING RECEPTION

LOCATION
Search Solutions is organised by the Information Retrieval Specialist Group of the BCS (The Chartered Institute for IT) and ISKO (International Society for Knowledge Organization), and is held at the BCS Central London Office:
BCS, The Chartered Institute for IT
Ground Floor
25 Copthall Avenue
London EC2R 7BP
https://www.bcs.org/about-us/our-london-office-and-event-venue/

REGISTRATION
Registration fees (including VAT at 20%) for Search Solutions are as follows:
* BCS / ISKO member rate: £92
* Non-member rate: £110
* Students: £80
Registration fees include lunch. Tea and coffee will also be available throughout the day, followed by a drinks reception in the evening.
Register here: https://www.bcs.org/events-calendar/2024/november/search-solutions-2024-inf…

TUTORIAL DAY
Search Solutions also includes a Tutorial Programme on Tuesday, 26 November. A detailed programme will be announced in due course. A Call for Tutorials can be found here: https://www.bcs.org/membership-and-registrations/member-communities/informa…
Tutorials are payable separately.

PAST EVENTS
Search Solutions has been held annually since 2007. For details of past events please see: https://www.bcs.org/membership-and-registrations/member-communities/informa…

CFP - Workshop on Challenges in Processing South Asian Languages (CHiPSAL) @ COLING2025
by Kengatharaiyar Sarveswaran 09 Sep '24

Hello everyone,

Don’t miss this unique opportunity to discuss key issues and contribute to the advancement of language processing in the South Asian region, home to 25% of the world’s population and rich in linguistic and cultural diversity. Submit your papers by October 30, 2024, and join us at the first Workshop on Challenges in Processing South Asian Languages (CHiPSAL), taking place at COLING 2025 on January 19, 2025.

Please submit your papers via https://softconf.com/coling2025/CHiPSAL25

----

*CHiPSAL 2025*, the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL), will be held as part of the 31st International Conference on Computational Linguistics (COLING 2025) in Abu Dhabi, UAE, on *January 19, 2025*. The workshop will be conducted in *virtual mode*.

CHiPSAL 2025 invites the submission of original research papers, review/opinion papers, and system demonstration papers, in short or long forms, on topics that highlight the challenges related to South Asian languages, including but not limited to the following areas:

- Encoding and Unicode Issues in South Asian Scripts
- Orthographic Complexities and Their Impact on Language Technology
- Morphological Analysis and Generation in South Asian Languages
- Dialectal Variations and Language Standardisation
- Code-Mixing and Multilingualism in South Asian Contexts
- Building Linguistic Resources for South Asian Languages
- Speech Recognition and Synthesis for South Asian Languages
- Preserving Linguistic Heritage through Technology
- Benchmarking Models for South Asian Languages

*Important Dates*
All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”).

- The First CFP: Monday, 15 July 2024
- Submission Deadline: October 30, 2024
- Notification of acceptance: November 29, 2024
- Camera-ready papers: December 13, 2024
- Pre-recorded video due: January 5, 2025
- Workshop (Virtual): January 19, 2025

*For more information: https://sites.google.com/view/chipsal*

Regards
Sarves
--
*Kengatharaiyer Sarveswaran (Sarves)*
Department of Computer Science
Faculty of Science
University of Jaffna
Sri Lanka
sarves.github.io

Celtic Language Technology Workshop 2025 (CFP)
by abigail.walsh＠adaptcentre.ie 09 Sep '24

Dear all,

Apologies for cross-posting.

***The Fifth Celtic Language Technology Workshop***
Co-located at COLING 2025 in Abu Dhabi, UAE.

Website: https://cltworkshop.github.io

Important Dates
* Call for Papers: 13th August 2024
* Paper Submission Deadline: 18th October 2024
* Notification of Paper Acceptance: 15th November 2024
* Camera-ready Paper Deadline: 2nd December 2024
* Workshop Date: 20th January 2025

Submission platform: https://softconf.com/coling2025/CLT25/

The CLTW community and workshop, inaugurated at COLING (Dublin) in 2014, has become a critical focus and forum for researchers working in natural language processing (NLP) and language technologies for Celtic languages. We are delighted to announce that the fifth edition in the Celtic Language Technology Workshop series will be co-located with COLING 2025 in Abu Dhabi, UAE.

We invite submissions of long and short papers featuring original contributions on resources, theories, systems, applications, and methods in Natural Language Processing for any of the Celtic languages.

Topics of interest include, but are not limited to, the following:
* Knowledge-based NLP/Neural NLP/Hybrid approaches to processing Celtic Languages
* Fine-Tuning of Pre-Trained Language Models (PLMs)
* Experiments and Evaluations (Prompting and Fine-Tuning) of Large Language Models (LLMs)
* Evaluating Celtic Language NLP systems
* Celtic Language Resources
* Corpus Development/Analysis
* Treebanking
* Parsing/Chunking
* Ontology-lexica
* Linked Data Resources
* Syntax, Semantics
* Lexicons
* Terminology and Knowledge Representation
* Linguistic Annotation of Celtic-language Texts
* Machine Translation
* Natural Language Generation
* Speech Processing/Generation
* Computer Assisted Language Learning (CALL)
* Celtic Digital Humanities
* Information Extraction
* Transfer Learning
* Cross-lingual Methods
* NLP/LT for historical Celtic languages

For more information, including submission instructions, please see the website (https://cltworkshop.github.io/). For queries, contact celticlanguagetechnology(a)gmail.com.

Kind regards,
Brian Davis, Theodorus Fransen, Elaine Uí Dhonnchadha, Abigail Walsh

2ndCFP - Coling-Rel 25: New Horizons in Computational Linguistics for Religious Texts
by Majdi Sawalha 09 Sep '24

====================================================================================
COLING 2025 Workshop: New Horizons in Computational Linguistics for Religious Texts (Coling-Rel 25)
====================================================================================
Part of the COLING 2025 Conference
Abu Dhabi, UAE
January 19-24, 2025
https://tinyurl.com/Coling-Rel25

We invite submissions for the COLING 2025 Workshop on New Horizons in Computational Linguistics for Religious Texts, which will be held with the 31st edition of COLING in Abu Dhabi, UAE (COLING 2025). The workshop invites researchers exploring the intersection of language technology and religious texts. It aims to foster discussion on cutting-edge applications of Natural Language Processing (NLP) to religious texts, including:
• Analyzing faith-defining canons and authoritative interpretations
• Extracting insights from sermons, liturgy, prayers, and poetry
• Leveraging Large Language Models for novel research avenues

The Coling-Rel25 workshop will explore the potential of NLP to unlock new understandings of religious traditions and chart the future of this exciting research area. The workshop welcomes researchers from computational linguistics, digital humanities, and related fields.

1. Workshop Topics and Contents
We welcome submissions on a range of topics, including but not limited to:
• computational morphology and syntax for religious texts;
• analysis of ceremonial, liturgical, and ritual speech; recitation styles; speech decorum; discourse analysis for religious texts;
• suitability of modal and other logics for knowledge representation and inference in religious texts;
• issues in, and evaluation of, machine translation in religious texts;
• text-mining, stylometry, and authorship attribution for religious texts;
• corpus query languages and tools for exploring religious corpora;
• dictionaries, thesauri, Wordnet, and ontologies for religious texts;
• measuring semantic relatedness between multiple religious texts;
• (new) corpora and rich and novel annotation schemes for religious texts;
• annotation and analysis of religious metaphor;
• genre analysis for religious texts;
• LLM adaptation for religious texts;
• ethical issues of LLMs for religious texts;
• application of computer-supported methods in the analysis of religious texts in other disciplines (e.g., theology, classics, philosophy, literature).

2. Organizing Committee
• Dr. Majdi Sawalha, Artificial Intelligence Department, The University of Jordan, Jordan.
• Prof. Sane Yagi, College of Arts, Humanities, and Social Sciences, University of Sharjah, UAE.
• Dr. Faisal Alshargi, Magdeburg University, Germany.
• Dr. Abdallah Al-Shdaifat, Arabic Language Department, Mohamad bin Zayed University, Abu Dhabi, UAE.
• Prof. Ashraf Elnagar, Computer Science Department, University of Sharjah, UAE.
• Dr. Bayan Abu Shawar, College of Engineering, Al-Ain University, Abu Dhabi, UAE.
• Dr. Noorhan Abbas, School of Computing, The University of Leeds, Leeds, UK.

3. Important Dates
- 15 July 2024: First Call for papers
- 09 September 2024: Second Call for papers
- 29 September 2024: Final Call for papers
- 11 November 2024: Paper Submission due
- 05 December 2024: Notification for Acceptance
- 13 December 2024: Camera-ready format (cannot be changed)
- 19 January 2025: COLING-2025 Workshops
- 21-24 January 2025: COLING-2025 conference

4. Submission Information
The Coling-Rel25 workshop invites the submission of long papers of up to eight pages (limits apply only to the main body of the paper). We are following the instructions for COLING 25 proceedings, which can be found on this page. At the end of the paper (after the conclusions but before the references), papers need to include a mandatory section discussing the limitations of the work and, optionally, a section discussing ethical considerations. Papers can include unlimited pages of references and an unlimited appendix.

Direct submission
Papers should be submitted through Softconf/START using the following link: https://softconf.com/coling2025/Coling-Rel25/

Each paper will receive a minimum of three reviews. Authors will have the opportunity to provide a short rebuttal to clarify any misunderstandings. The review process will be double-blind: reviewers will not see authors, and authors will not see reviewers. Reviews and submissions will not be made publicly visible.

5. Contact
Majdi Sawalha, sawalha.majdi(a)gmail.com or sawalha.majdi(a)ju.edu.jo
Workshop link: https://tinyurl.com/Coling-Rel25

--
===========================================================
Majdi Sawalha,
*Associate professor,*
Computer Information Systems Department,
King Abdullah II School of Information Technology,
The UNIVERSITY OF JORDAN,
Amman, Jordan.

free 8-week course on Corpus linguistics - starts 16 September
by Brezina, Vaclav 09 Sep '24

Dear all,

The ESRC Centre for Corpus Approaches to Social Science, Lancaster University offers a free 8-week course on Corpus linguistics: Corpus MOOC. We start in one week's time on Monday 16 September. You can register at https://www.futurelearn.com/courses/corpus-linguistics

Best,
Vaclav

Professor Vaclav Brezina
Professor in Corpus Linguistics
Department of Linguistics and English Language
ESRC Centre for Corpus Approaches to Social Science
Faculty of Arts and Social Sciences, Lancaster University
Lancaster, LA1 4YD
Office: County South, room C05
T: +44 (0)1524 510828
@vaclavbrezina
<http://www.lancaster.ac.uk/arts-and-social-sciences/about-us/people/vaclav-…>

Journal of Digital Islamicate Research
by Mai Zaki 08 Sep '24

Dear list members,

We are excited to announce the publication of the inaugural issue of the *Journal of Digital Islamicate Research* (JDIR <https://brill.com/view/journals/jdir/1/1-2/jdir.1.issue-1-2.xml>), a new peer-reviewed journal dedicated to the intersection of digital humanities and Islamicate studies, published by Brill.

The *Journal of Digital Islamicate Research* offers an academic platform for exploring the ways in which digital methodologies are transforming the study of the Islamicate world. Our scope includes, but is not limited to:

- Digital historical, religious and cultural studies of the Islamicate world
- Islamic and Islamicate arts from a digital humanities perspective
- Computational analysis of social, political, and religious dynamics within the Islamicate world
- Manuscript studies and digital editions
- Digital tools and methods for studying Islamicate heritage and languages
- Digital methods in media and communication studies
- Digital humanities explorations of literature
- The intersection between AI and digital humanities

Submissions are open on a rolling basis, and we encourage contributions from scholars across the globe. Manuscripts can be submitted in either English or Arabic through our editorial management system on the Journal website <https://brill.com/view/journals/jdir/jdir-overview.xml?contents=journaltoc>. For more information on how to submit, please visit our page: Submit Your Manuscript.

In addition to individual papers, we welcome proposals for special issues on any topic within the scope of the journal. Special issue editors will have the opportunity to bring together a collection of cutting-edge research centered around a cohesive theme related to the digital study of the Islamicate world.

We invite you to explore the first issue of JDIR and consider contributing your own work to this exciting and growing field. For inquiries or to discuss potential special issues, please feel free to reach out to the editors at the journal's email address, jdir(a)brill.com.

We look forward to your contributions!

Eid Mohamed and Mai Zaki
Editors-in-chief

COLING 2025 Workshop on Detecting AI Generated Content, 2nd Call for participation
by Firoj Alam 07 Sep '24

(apologies for cross-posting)

Dear colleague,

We seek submissions of long and short papers on original and unpublished work (same page limit as the COLING 2025 main conference). In addition, there will be three shared tasks. All accepted submissions will be presented as talks and/or posters at the workshop, prior to the COLING 2025 main conference.

Research Papers
We invite original research papers from a wide range of topics, including but not limited to:
- Detection methods for text, image, speech and other modalities
- Multilingual detection methods
- Detection methods for the image modality
- Detection methods for multimodal content
- Real-time detection systems: real-time systems for detecting AI-generated content in live scenarios
- Attacks on detection systems
- Datasets and resources
- Benchmarking for AI-generated content detection
- AI-generated fake news detection
- Deep fakes in audio, videos and images
- Ethical and legal implications of AI-generated content

Shared Tasks
We plan to run three shared tasks:
- Task 1: Binary Multilingual Machine-Generated Text Detection (Human vs. Machine)
- Task 2: AI vs. Human – Academic Essay Authenticity Challenge
- Task 3: Cross-domain Machine-Generated Text Detection

Important dates:
- Regular Submission (including shared tasks) Deadline: November 15, 2024 (dual-submission allowed)
- Resubmission (with a rebuttal) Deadline: December 2, 2024
- Acceptance Notification: December 7, 2024
- Camera-Ready Deadline: December 13, 2024
- Workshop Day: January 19-20, 2025
All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”).

Submission Details:
Papers must describe original, completed or in-progress, and unpublished work. All papers will be refereed through a double-blind peer review process by multiple reviewers, with final acceptance decisions made by the workshop organizers. Accepted papers will be given up to 9 pages (for full papers) or 5 pages (for short papers and posters) in the workshop proceedings, and will be presented as an oral paper or a poster.

We are seeking submissions under the following categories:
- Full/long papers (8 pages)
- Short papers (work in progress, innovative ideas/proposals: 4 pages)
- Shared task papers (4 pages)

Long, short, and shared task papers must follow the two-column format of *ACL conferences, using the official templates. The templates can be downloaded in style files and formatting. Please do not modify these style files or use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review.

Verification: to guarantee conformance to publication standards, we will be using the ACL pubcheck tool. The PDFs of camera-ready papers must be run through this tool prior to their final submission, and we recommend its use at submission time as well.

Submissions are open to all, and are to be submitted anonymously. For the anonymity, double-blind submission, and reproducibility criteria, please follow the COLING 2025 instructions.

Submission portal
Submissions must be made using the START portal: https://softconf.com/coling2025/DAIGenC25/

Website for further information: https://genai-content-detection.gitlab.io/

Best regards
The Organizers

PhD-Position in Educational NLP
by Horbach, Andrea 06 Sep '24

1 PhD-Position in Educational NLP

We invite applications for a fully funded PhD position (100%, TV-L E13 according to the German system) at the Leibniz Institute for Science and Mathematics Education in Kiel, Germany, in the newly established research group “Teaching and Learning in the Digital World”. The position is in the field of Educational NLP, on topics such as automatic free-text assessment, feedback and exercise generation.

The position starts in November; a later date is negotiable. It is initially funded for 3 years, and an extension is possible.

An ideal candidate has a master’s degree in computational linguistics, computer science, or a related discipline. Programming experience in Python and some experience in machine learning are expected.

For more details, including information on the application process, please refer to: https://www.leibniz-ipn.de/de/das-ipn/ueber-uns/karriere/stellenangebote/pa…

The application deadline is September 30. I am happy to answer questions (andrea.horbach(a)fernuni-hagen.de).

Best regards,
Andrea

Andrea Horbach
CATALPA - Center of Advanced Technology for Assisted Learning and Predictive Analytics <https://www.fernuni-hagen.de/forschung/schwerpunkte/catalpa/index.shtml>
Nachwuchsgruppenleitung / Junior Research Group Leader „EduNLP“
________________________________
FernUniversität in Hagen
Gebäude 5 (PRG)
www.fernuni-hagen.de <https://www.fernuni-hagen.de/>
