============================================
Interspeech 2025
17 - 21 August, Rotterdam, The Netherlands
https://www.interspeech2025.org/
============================================
Call for Papers
https://www.interspeech2025.org/call-for-papers
============================================
Important Dates
===============
Paper Submission Portal Open: 18 December 2024
Paper Submission Deadline: 12 February 2025
Paper Update Deadline: 19 February 2025
Paper Acceptance Notification: 21 May 2025
Interspeech is the world’s largest and most comprehensive conference
on the science and technology of spoken language processing.
Interspeech conferences emphasize interdisciplinary approaches
addressing all aspects of speech science and technology, ranging from
psycholinguistic theories of human speech processing to advanced
applications of speech technology.
The theme of Interspeech 2025 is “Fair and Inclusive Speech Science
and Technology” and aims to celebrate and incorporate the vast speech
diversity both within and between individuals, as well as within and
between languages. This theme emphasizes diversity as a source of
richness and information, contributing to the development of fairer,
more robust, and personalized speech technology applications and more
accurate theories of human speech processing. By embracing this
diversity, we can create theories and technologies that are inclusive
of everyone, ensuring that speech science and technology benefit all
individuals and communities.
Interspeech 2025 will feature the usual oral and poster sessions,
plenary talks by internationally renowned experts, tutorials, special
sessions and challenges, show & tell, exhibits, and satellite events.
Following Interspeech 2024, we will again feature a Blue Sky track. New this
year is a Speech Science Festival for the general public on the Sunday
prior to Interspeech. Watch our website for more information in the
coming months on how you can participate!
Paper Submission
Please note that the paper submission deadline is earlier than in
previous years because the conference is held in August. The paper
submission deadline is 12 February 2025.
Interspeech 2025 seeks original and innovative papers covering all
aspects of human and machine speech science and technology. The
language of the conference is English, so papers must be written in
English. The paper length is up to four pages in two columns with an
additional page for references and acknowledgments only. Submitted
papers must conform to the format defined in the paper kit provided on
the conference website (and as an Overleaf template) and may
optionally be accompanied by multimedia files. Interspeech 2025 will
follow a double-blind review process so papers submitted for review
should not reveal the identity or affiliation of the authors. Authors
must declare that their contributions are original and that they have
not submitted their papers elsewhere for publication. Papers must be
submitted electronically and will be evaluated through rigorous peer
review on the basis of novelty and originality, technical correctness,
clarity of presentation, key strengths, and quality of references. The
Technical Programme Committee will decide which papers to include in
the conference programme using peer review as the primary criterion,
with secondary criteria of addressing the conference theme, and
diversity across the programme as a whole. If accepted, each paper
needs to be presented by one of the authors during the in-person
conference.
Blue Sky Track
Interspeech 2024 introduced the new BLUE SKY track of highly
innovative papers in fields or directions that have not yet been
explored. Large-scale experimental evaluation will not be required for
papers in this track. If you think your work satisfies these
requirements, please consider submitting a paper to this challenging
and competitive track. Please note that to achieve the objectives of
this BLUE SKY track, we will ask the most experienced reviewers to
assess the proposals.
Venue
Interspeech 2025 will take place at Rotterdam Ahoy Convention Center
in the Netherlands.
For more information, you can visit the website https://www.interspeech2025.org
Scientific Areas and Topics
Interspeech 2025 embraces a broad range of science and technology in
speech, language and communication, including – but not limited to –
the following topics:
● Human Speech Perception, Production and Acquisition
● Speech Synthesis
● Phonetics, Phonology, and Prosody
● Spoken Language Generation
● Paralinguistics in Speech and Language
● Automatic Speech Recognition
● Analysis of Conversation
● Spoken Dialogue and Conversational AI Systems
● Speech, Voice, and Hearing Disorders
● Spoken Language Translation, Information Retrieval, Summarization
● Speaker and Language Identification
● Technologies and Systems for New Applications
● Speech and Audio Signal Analysis
● Resources and Evaluation
● Speech Coding and Enhancement
● Speech and Language Processing for Health
● Beyond traditional speech topics (not limited to the provided list)
Contact
For questions regarding the technical programme: TPC-chairs(a)interspeech2025.org
For general questions: pco(a)interspeech2025.org
====================================================================
CFP II Andaluz.IA Forum
December 20, 2024, Antigua Escuela de Magisterio (Universidad de Jaén)
*Deadline for submissions of papers extended to October 27, 2024*
====================================================================
Ten Andalusian universities, the Joint Research Center of the European
Commission, and several Andalusian researchers currently at other national
and international institutions are organizing the II Andaluz.IA Forum
<https://sites.google.com/view/andaluzia/home>, a meeting whose main
objective is to showcase the potential of the academic and research
community in Artificial Intelligence in our region and to give it
visibility. This forum seeks to highlight the work of Andalusian
scientists, both those currently working in Andalusia and those who have
spent part of their training or career in the region, regardless of their
current place of work.
The first edition of the Andaluz.IA forum
<https://sites.google.com/view/andaluzia2023/> was organized at Universidad
Pablo de Olavide in Seville, and this year it will take place at Universidad de
Jaén. With this second edition, we want to continue highlighting the great
potential for research and academic development in Artificial Intelligence
that Andalusia has, in areas such as machine learning, deep learning,
robotics and natural language processing.
The event will be held in person on December 20, 2024 at the Antigua
Escuela de Magisterio (Universidad de Jaén). Interested researchers can
participate by presenting their results in oral or poster format, provided
that they have been accepted in relevant conferences or journals in the
area. In addition, professionals and companies wishing to participate may
do so through sponsorship or direct participation by registering on the
event's website.
For more information on registration, submission of papers and forms of
sponsorship, please consult the following link:
https://sites.google.com/view/andaluzia/call-for-papers
IMPORTANT DATES
● *Deadline for submission of papers: October 27, 2024 (extended from October 15, 2024)*
● Notification of acceptance: November 4, 2024
● Deadline for registration at a reduced rate: November 15, 2024
CONTACT INFORMATION
maite(a)ujaen.es
sjzafra(a)ujaen.es
ORGANIZING COMMITTEE
https://sites.google.com/view/andaluzia/organisers
Salud María Jiménez Zafra
sjzafra(a)ujaen.es
Universidad de Jaén
Grupo de Investigación SINAI <http://sinai.ujaen.es/> | Departamento de
Informática
EPS Jaén, Edificio A3, Despacho 326
Campus Las Lagunillas s/n 23071 - Jaén | +34 953212992
The next meeting of the Edge Hill Corpus Research Group will take place online (via MS Teams) on Friday 15 November 2024, 2-4 pm (GMT)
Topic: Discourse-Oriented Corpus Studies
2-3 pm
Katia Adimora (Edge Hill University)
Mexican immigration/immigrants in American and Mexican newspapers
3-4 pm
Dan Malone (Edge Hill University)
When is the extreme also typical? Using prototypicality to investigate representations of the lone-wolf terrorist
Attendance is free. The abstracts and registration link are here: https://sites.edgehill.ac.uk/crg/next
Registration closes on Wednesday 13 November, 11 am (GMT).
This workshop proposes to provide a forum to discuss the structures and models of information resources in historical-comparative linguistic research outputs through the integration of informatic models from library science and archivy. We want to address pertinent issues impacting the indexing (for citation) and interoperability of datasets (for sustainability).
WS Title: First Workshop on Data Models, Citation, Access, and Re-usability Impacting Historical Linguistic Datasets, at ICHL27, Santiago de Chile, 18-22 August 2025
Workshop Type: in-person
Organizers: Hugh Paterson III & Oksana Zavalina
Abstract Deadline: November 7th (extended from October 18th).
Abstract Details: up to 800 words excluding references.
Submission: Email PDF of abstracts to both i(a)hp3.me and oksana.zavalina(a)unt.edu with [ICHL27 w8] in the subject line.
Note: Workshops are in most cases restricted to 6 papers; all other papers, if accepted, will be given as part of the ICHL general sessions. Should there be sufficient interest for an extended workshop (up to 12 papers), we will lobby the local organizers to permit this format.
Workshop Website: https://hughandbecky.us/Hugh-CV/project/2025-ichl27-historical-linguistic-d…
Conference Website: https://ichl27santiago.cl
PDF of Workshop abstract: https://hughandbecky.us/Hugh-CV/project/2025-ichl27-historical-linguistic-d…
Publication: We are pursuing publication via edited volume post-workshop.
==Goal & Questions==
The role of library models (e.g., IFLA-LRM: Riva, Le Bœuf, and Žumer 2017) and archival practice (e.g., lifecycle management: Higgins 2012) is under-explored in relation to the construction and reuse of Historical Linguistic Information Sources. This workshop proposes to provide a forum to discuss the structures and models of information resources in historical-comparative linguistic research outputs through the integration of informatic models from library science and archivy.
We invite papers describing the information models used for assembling large corpora (including wordlists) used in historical linguistics, highlighting assumptions for citation, referencing, segmentation, and reusability of the assembled collection of texts and their digital surrogates. We encourage papers which present typologies of use cases, categories of tracked information, provenance of data content, citability of aggregate content, and the identifiers-for and permanence-of user-generated datasets on research platforms.
What are the design patterns within datasets?
What are the categories used, and what are their scopes?
What are the kinds of objects subsumed into datasets?
==Background==
Significant advances have been made in historical linguistics through the use of large compiled datasets (e.g., Kamholz et al. 2024; Tresoldi 2023; Arora et al. 2023; Dellert et al. 2020; Greenhill 2015; Segerer and Flavier 2013; Mielke 2008; Greenhill, Blust, and Gray 2008). While not precluding the contributions of single historical manuscripts and traditional manuscript consultation methods, the creation and use of datasets (including corpora) has become the de facto way of generating new hypotheses (Wichmann and Saunders 2007; Steiner, Cysouw, and Stadler 2011; Segerer 2015). Datasets in historical linguistics generally do two things: (1) record critical researcher-created information such as reconstructed forms, cognacy judgments, and confidence levels, along with contextual notes; and (2) contain foundational content from sources not created by the dataset compiler. Such source material often includes historically published and unpublished resources, including maps (Hessle and Kirk 2020), language-specific lexicons and published reconstructions (Kamholz et al. 2024), wordlists (Forkel et al. 2024; Segerer and Flavier 2013), transcriptions of manuscripts and texts (Weber et al. 2023; Genee and Junker 2018; Kytö 2011), and even reconstructions by other scholars.
Interactive platform tools such as RefLex (Segerer and Flavier 2013) or OUTOFPAPUA (Kamholz et al. 2024) allow users to create custom datasets based on specific selected resources available to the platform. They do this without requiring users to interact with the complete set of underlying resources, and in some cases they also allow users to create new derivative aggregate collections (reconstructed forms and cognacy relations) independent of other platform users. Citing, referencing, and redistributing these custom datasets is challenging and impacts the verifiability of claims.
It is broadly accepted across linguistic research that scholarly work, including evidence, should be citable, accessible, and reusable (Bird and Simons 2003). Together these issues impact reproducibility, an important tenet in scholarship often overlooked in linguistics (Berez-Kroeker et al. 2018). However, it is also well acknowledged that the citation and reference of original source material for linguistic evidence is lacking across the field (Gawne et al. 2017). More specifically in historical-comparative linguistics, the context of citation and referencing of the evidentiary record along with current dataset assemblage and distribution practices generally do not support fine-grained or Work-oriented citation and referencing. This often means that specific and necessary details in comparative linguistics are not retrievable. Therefore, the data models embedded within historical-comparative datasets become all the more important for the reproducibility of work and the testing, verification, and refinement of hypotheses (Bakro-Nagy 2010).
With the exception of leading work on the use of Cross-Linguistic Data Formats (CLDF) with historical-comparative data (Forkel et al. 2018; Forkel, Swanson, and Moran 2024) and approaches using linked data in linguistics (Kesäniemi et al. 2018; Tittel, Gillis-Webber, and Nannini 2020), the literature has been silent about the storage formats for historical-comparative data. The information categories represented in historical-comparative linguistic datasets also remain undiscussed. The informatic arrangement and description of compiled datasets has generally been ad hoc and served the needs of individually funded projects. This has resulted in a proliferation of divergent data categories militating against ease of reuse.
We set out to ignite discussion around compilations of manuscripts, wordlists, and other derivative resources which have become mainstream tools in hypothesis generation related to language evolution. We explore the hitherto unexplored contribution that models such as Work-Expression-Manifestation-Item (WEMI), illustrated in figure 1, from library and information science (Coyle 2023; Riva, Le Bœuf, and Žumer 2017; IFLA 1998) can offer those who compile and cite or reference aggregate linguistic resources. Specifically, such models can clarify the linking relationships between the literature and datasets, including dataset portions.
Figure 1 is available on the workshop website and in the PDF version of the abstract.
==References==
Arora, Aryaman, Adam Farris, Samopriya Basu, and Suresh Kolichala. 2023. “Jambu: A Historical Linguistic Database for South Asian Languages.” arXiv. https://doi.org/10.48550/arXiv.2306.02514.
Bakro-Nagy, Marianne. 2010. “Data in Historical Linguistics: On Utterances, Sources, and Reliability.” Sprachtheorie und germanistische Linguistik 20 (2): 133–195. https://www.academia.edu/3629841/Data_in_historical_linguistics_On_utteranc….
Berez-Kroeker, Andrea L., Lauren Gawne, Susan Smythe Kung, Barbara F. Kelly, Tyler Heston, Gary Holton, Peter Pulsifer, et al. 2018. “Reproducible Research in Linguistics: A Position Statement on Data Citation and Attribution in Our Field.” Linguistics 56 (1): 1–18. https://doi.org/10.1515/ling-2017-0032.
Bird, Steven, and Gary F. Simons. 2003. “Seven Dimensions of Portability for Language Documentation and Description.” Language 79 (3): 557–82. https://doi.org/10.1353/lan.2003.0149.
Coyle, Karen. 2023. “openWEMI.” In Proceedings of the International Conference on Dublin Core and Metadata Applications. Dublin, Ohio: Dublin Core Metadata Initiative. https://doi.org/10.23106/DCMI.953115290.
Dellert, Johannes, Thora Daneyko, Alla Münch, Alina Ladygina, Armin Buch, Natalie Clarius, Ilja Grigorjew, et al. 2020. “NorthEuraLex: A Wide-Coverage Lexical Database of Northern Eurasia.” Language Resources and Evaluation 54 (1): 273–301. https://doi.org/10.1007/s10579-019-09480-6.
Forkel, Robert, Johann-Mattis List, Simon J. Greenhill, Christoph Rzymski, Sebastian Bank, Michael Cysouw, Harald Hammarström, Martin Haspelmath, Gereon A. Kaiping, and Russell D. Gray. 2018. “Cross-Linguistic Data Formats, Advancing Data Sharing and Re-Use in Comparative Linguistics.” Scientific Data 5 (1): 180205. https://doi.org/10.1038/sdata.2018.205.
Forkel, Robert, Johann-Mattis List, Christoph Rzymski, and Guillaume Segerer. 2024. “Linguistic Survey of India and Polyglotta Africana: Two Retrostandardized Digital Editions of Large Historical Collections of Multilingual Wordlists.” In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), edited by Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, 10578–83. Torino, Italia: ELRA and ICCL. https://aclanthology.org/2024.lrec-main.925.
Forkel, Robert, Daniel G. Swanson, and Steven Moran. 2024. “Converting Legacy Data to CLDF: A FAIR Exit Strategy for Linguistic Web Apps.” In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), edited by Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, 3978–82. Torino, Italia: ELRA and ICCL. https://aclanthology.org/2024.lrec-main.353.
Gawne, Lauren, Barbara F. Kelly, Andrea L. Berez-Kroeker, and Tyler Heston. 2017. “Putting Practice into Words: The State of Data and Methods Transparency in Grammatical Descriptions.” Language Documentation & Description 11:157–89. http://hdl.handle.net/10125/24731.
Genee, Inge, and Marie-Odile Junker. 2018. “The Blackfoot Language Resources and Digital Dictionary Project: Creating Integrated Web Resources for Language Documentation and Revitalization.” Language Documentation & Conservation 12:274–314. http://hdl.handle.net/10125/24770.
Greenhill, Simon J. 2015. “TransNewGuinea.Org: An Online Database of New Guinea Languages.” PLOS ONE 10 (10): e0141563. https://doi.org/10.1371/journal.pone.0141563.
Greenhill, Simon J., Robert Blust, and Russell D. Gray. 2008. “The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics.” Evolutionary Bioinformatics 4 (January):EBO.S893. https://doi.org/10.4137/EBO.S893.
Hessle, Christian, and John Kirk. 2020. “Digitising Collections of Historical Linguistic Data: The Example of The Linguistic Atlas of Scotland.” Journal of Data Mining & Digital Humanities Special issue on Visualisations in Historical Linguistics. https://doi.org/10.46298/jdmdh.5611.
Higgins, Sarah. 2012. “The Lifecycle of Data Management.” In Managing Research Data, edited by Graham Pryor, 17–46. London, UK: Facet Publishing.
IFLA Study Group on the Functional Requirements for Bibliographic Records, and Marie-France Plassard. 1998. “Functional Requirements for Bibliographic Records: Final Report.” 2nd ed. [UBCIM Publications, New Series] IFLA Series on Bibliographic Control 19. Munich, Germany: K.G. Saur. http://www.ifla.org/VII/s13/frbr.
Kamholz, David, Anne van Schie, Allahverdi Verdizade, Maria Zielenbach, and Antoinette Schapper. 2024. “OUTOFPAPUA.” Database. 2024. https://outofpapua.com.
Kesäniemi, Joonas, Turo Vartiainen, Tanja Säily, and Terttu Nevalainen. 2018. “Exploring Meta-Analysis for Historical Corpus Linguistics Based on Linked Data.” Journal of Research Design and Statistics in Linguistics and Communication Science 5 (1–2): 4–47. https://doi.org/10.1558/jrds.36709.
Kytö, Merja. 2011. “Corpora and Historical Linguistics.” Revista Brasileira de Linguística Aplicada 11 (2): 417–57. https://doi.org/10.1590/S1984-63982011000200007.
Mielke, Jeff. 2008. The Emergence of Distinctive Features. Oxford, England: Oxford University Press.
Riva, Pat, Patrick Le Bœuf, and Maja Žumer, eds. 2017. IFLA Library Reference Model: A Conceptual Model for Bibliographic Information. December 2017. Den Haag, Netherlands: International Federation of Library Associations and Institutions (IFLA). https://www.ifla.org/publications/node/11412.
Segerer, Guillaume. 2015. “How Databases Shape Research: Labial-Velars Distribution in Africa.” In 8th World Congress of African Linguistics (WOCAL8). Kyoto, Japan. https://inria.hal.science/halshs-01251122.
Segerer, Guillaume, and Sébastien Flavier. 2013. “The RefLex Project: Documenting and Exploring Lexical Resources in Africa.” Oral Presentation presented at the Research, records and responsibility: Ten years of the Pacific and Regional Archive for Digital Sources in Endangered Cultures, Sydney, Australia. http://hdl.handle.net/2123/9854.
Steiner, Lydia, Michael Cysouw, and Peter Stadler. 2011. “A Pipeline for Computational Historical Linguistics,” January. https://doi.org/10.1163/221058211X570358.
Tittel, Sabine, Frances Gillis-Webber, and Alessandro A. Nannini. 2020. “Towards an Ontology Based on Hallig-Wartburg’s Begriffssystem for Historical Linguistic Linked Data.” In Proceedings of the 7th Workshop on Linked Data in Linguistics (LDL-2020), edited by Maxim Ionov, John P. McCrae, Christian Chiarcos, Thierry Declerck, Julia Bosque-Gil, and Jorge Gracia, 1–10. Marseille, France: European Language Resources Association. https://aclanthology.org/2020.ldl-1.1.
Tresoldi, Tiago. 2023. “A Global Lexical Database (GLED) for Computational Historical Linguistics.” Journal of Open Humanities Data 9 (1): Article 2. https://doi.org/10.5334/johd.96.
Weber, Natalie, Tyler Brown, Joshua Celli, McKenzie Denham, Hailey Dykstra, Rodrigo Hernandez-Merlin, Evan Hochstein, et al. 2023. “Blackfoot Words: A Database of Blackfoot Lexical Forms.” Language Resources and Evaluation 57 (3): 1207–62. https://doi.org/10.1007/s10579-022-09631-2.
Wichmann, Søren, and Arpiar Saunders. 2007. “How to Use Typological Databases in Historical Linguistic Research.” Diachronica 24 (2): 373–404. https://doi.org/10.1075/dia.24.2.06wic.
With the usual apologies for any cross-posting: In the Department of Linguistics and English Language at Lancaster University, we’re celebrating our 50th anniversary in 2024. In a series of public lectures, we will showcase our recent research in different areas of linguistics. Everyone is welcome!
Our next talk should be of particular interest to corpus linguists, forensic linguists, and researchers broadly interested in the security and protection sciences:
The shared anti-science discourses, by Dr Isobelle Clarke<https://www.lancaster.ac.uk/linguistics/about/people/isobelle-clarke>
Anti-science discourse has been studied through the optic of particular governments (Carter et al., 2019) or specific topics, such as anti-vaccination (Davis, 2019), anti-genetically modified organisms (Cook et al. 2004), stem cell research (Marcon, Murdoch and Caulfield, 2017), and climate denial discourse (Park, 2015). This research often details the development and content of the anti-science position and discourses. Yet, little is known about how the discourses compare across topics. Are there anti-science discourses that are shared across topics or does the discourse vary with the topic? In this talk, I will present the results of the common discourses which are shared between texts from websites known to promote pseudoscience and conspiracy on the topics of stem cells, climate change, vaccination and genetically modified organisms.
For details of this talk and the upcoming programme, please visit our 50th anniversary website<https://www.lancaster.ac.uk/linguistics/50th-anniversary/>.
Register
Register here<https://www.trybooking.com/uk/events/landing/68278> to attend.
Date and time
6:00 PM - 8:00 PM (UTC+01), Thursday 24th October 2024
Location
Faraday Lecture Theatre<https://use.mazemap.com/#v=1&config=lancaster&campusid=341&zlevel=1&center=…>, Lancaster University, Lancaster, LA1 4YW
Getting to the University
More information on ways to get to the university can be found here<https://www.lancaster.ac.uk/about-us/maps-and-travel/>.
If you have any questions, please don’t hesitate to ask.
Regards
Claire
Professor Claire Hardaker (she/her)
Professor of Forensic Linguistics
Director of the MSc in Forensic Linguistics & Speech Science<https://www.lancaster.ac.uk/study/postgraduate/postgraduate-courses/forensi…>
Lancaster University, LA1 4YL
Call for papers for the COLING-2025 workshop: The First Workshop on Natural Language Processing for Indo-Aryan and Dravidian Languages (IndoNLP2025)
Date: 20th January 2025 (full day)
Venue: Abu Dhabi, UAE
Webpage: https://indonlp-workshop.github.io/IndoNLP-Workshop/
Submission Deadline: 5 November 2024
Submission Portal: https://softconf.com/coling2025/IndoNLP25/
Workshop description
The rapid advancement of Natural Language Processing (NLP) and Large Language Models (LLMs) has transformed the landscape of computational linguistics. However, Indo-Aryan and Dravidian Languages (IADL), which represent a significant portion of South Asia's linguistic heritage, remain under-resourced and under-researched in these technological developments. This workshop aims to bridge this gap by bringing together researchers, linguists, and technologists to focus on the unique challenges and opportunities. Participants will explore innovative methods for creating and annotating digital corpora, develop speech and language technologies suited to IADL, and promote interdisciplinary collaborations. By leveraging LLMs, we seek to address the complexities of syntax, morphology, and semantics in these languages to enhance the performance of NLP applications. Furthermore, the workshop will provide a platform for sharing best practices, tools, and resources, enhancing the digital infrastructure necessary for language preservation. Through collaborative efforts, we aim to build a research community to advance NLP for IADL, contributing to linguistic diversity and cultural preservation in the digital age.
The topics of the workshop include, but are not limited to:
- Large Language Models for Indo-Aryan Languages and Dravidian Languages.
- Developing a cleaned Indo-Aryan and Dravidian language corpus (UNICODE) and digital linguistic resources.
- Machine Translation and Cross-Lingual Systems
- Speech Technologies: Recognition and Synthesis
- Language Identification and Dialect Detection
- Information Extraction, OCR systems and Knowledge Modelling
- NLP Applications - Fake News, Spam, and Rumour Detection
- Hate speech and Offensive Language Detection
- Sentiment Analysis and Text Summarisation
- NLP applications: Misinformation, Conspiracy Theories, Rumours, Spam, Phishing, and similar applications
Submission & Publication
Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop. We welcome the following types of contributions:
- Standard research papers (up to 8 pages, plus more pages for references if needed)
- Short research papers (from 4 to 6 pages, plus more pages for references if needed)
At the end of the paper (after the conclusions but before the references), papers need to include a mandatory section discussing the limitations of the work and, optionally, a section discussing ethical considerations. Papers can include unlimited pages of references and an unlimited appendix.
To prepare your submission, please make sure to use the COLING 2025 style files available here:
- Latex - https://coling2025.org/downloads/coling-2025.zip
- Word - https://coling2025.org/downloads/coling-2025.docx
- Overleaf - https://www.overleaf.com/latex/templates/instructions-for-coling-2025-proce…
All papers should be submitted electronically in PDF format via the START conference platform<https://softconf.com/coling2025/IndoNLP25/>.
Important Dates
- Paper submission deadline: November 5, 2024
- Notification of acceptance: November 25, 2024
- Camera-ready paper: December 13, 2024
- Workshop date: January 20, 2025
Organising Committee
- Ruvan Weerasinghe, Informatics Institute of Technology, Sri Lanka
- Isuri Anuradha, Lancaster University, UK
- Deshan Sumanathilaka, Swansea University, UK
- Mo El-Haj, Lancaster University, UK
- Chamila Liyanage, University of Colombo School of Computing, Sri Lanka
- Fahad Khan, Istituto di Linguistica Computazionale in CNR, Italy
- Andrew Hardie, Lancaster University, UK
- Asim Abbas, Birmingham University, UK
- Ruslan Mitkov, Lancaster University, UK
- Julian Hough, Swansea University, UK
- Nicholas Micallef, Swansea University, UK
- Naomi Krishnarajah, Informatics Institute of Technology, Sri Lanka
Programme Committee
- Randil Pushpanandha, University of Colombo, Sri Lanka
- Dulip Herath, Queensland University, Australia
- Daisy Lal, Lancaster University, UK
- Damith Premasiri, Lancaster University, UK
- Venkatesh Raju, Stealth Mode AI Startup, India
- Gayanath Chandrasena, University of Helsinki, Finland
- Kaza Sri Sai Swaroop, IBM, India
- Asanka Wasala, Dell Technologies, Ireland
- Kengatharaiyer Sarveswaran, University of Jaffna, Sri Lanka
- Sinnathamby Mahesan, University of Jaffna, Sri Lanka
- Nishantha Medagoda, Auckland University of Technology, New Zealand
- Prasan Yapa, Kyoto University of Advanced Science, Japan
- Paul Rayson, Lancaster University, UK
- Lochandaka Ranathunga, University of Moratuwa, Sri Lanka
- Kaneeka Vidanage, General Sir John Kotelawala Defence University, Sri Lanka
- Achala Aponso, Edith Cowan University, Australia
- Rajitha Jayasinghe, University of Westminster, UK
- Arjumand Younus, University College Dublin, Ireland
- Abdul Nazeer, National Institute of Technology, Calicut, India
- Pabitra Mitra, Indian Institute of Technology, Kharagpur, India
- Tanmoy Chakraborty, Indian Institute of Technology, Delhi, India
- Tirthankar Dasgupta, Indian Institute of Technology, Kharagpur, India
- Girish Nath Jha, School for Sanskrit and Indic Studies, JNU, India
- Arka Majhi, Indian Institute of Technology, Bombay, India
- Anand Kumar, National Institute of Technology, Karnataka, India
- Kishorjit Nongmeikapam, Indian Institute of Information Technology, India
- Abdullah Alzahrani, Swansea University, Wales, UK
Gmail: indonlp2025@gmail.com
Twitter: https://x.com/indo_nlp
The University of Technology Nuremberg (UTN) is looking to fill a full-time
position in the Department of Engineering as soon as possible:
# Research Associate - Postdoc (m/f/d) in Natural Language Understanding (NLU)
UTN is dedicated to harnessing the knowledge and innovation of the humanities
to shape a sustainable future. The Department of Engineering seeks to establish
strong, dynamic collaboration across disciplines, connecting engineering with
the humanities, social sciences, and natural sciences. The position offers an
opportunity for scientific qualification, allowing you to build your research
profile and gain experience in line with your academic background and personal
aspirations. The research project should make a contribution to one of the NLU
Lab's key research areas listed below.
## Your tasks
* Active collaboration in research and teaching
* Participation in the lab's research areas "background knowledge in language
understanding and misunderstanding" and "implicit and underspecified language"
* Support in the conception and organization of scientific events and public
engagement projects (in coordination with the Communication Unit)
* Participation in research cooperations of the department
## Your profile
* Very good academic degree (Master's or comparable) in computational
linguistics or a related field
* Doctorate in the field of computational linguistics
* Proven research focus in natural language understanding, as demonstrated
by relevant publications
* Interest in interdisciplinary cooperation in research and teaching
## We offer
* An employment contract or position as a civil servant, initially for 3 years
* Active support in the development of your own research agenda and
corresponding applications for projects or an independent research group
* Salary corresponding to group A13 of the Bavarian Salary Act or pay group E13
TV-L (https://oeffentlicher-dienst.info/tv-l/allg/) if the personal and pay
scale requirements are met.
* Opportunity to actively participate in the development of the newly founded
University of Technology Nuremberg and to take on responsible tasks
* A dynamic and flexible working environment
* A modern workplace with all the attractive social benefits of the public sector
* Flexible working hours to reconcile family and career
* Mobile working opportunities
* Attractive training and development opportunities
The position is suitable for people with severe disabilities. Severely disabled
applicants will be given preference if they have the same suitability,
qualifications and professional performance.
Women are encouraged to apply in accordance with Art. 7 Para. 3 of the Bavarian
Equal Opportunities Act. The NLU Lab is committed to fostering a diverse and
inclusive work environment and we highly welcome applications from minority
groups and candidates of all backgrounds.
The position is open to part-time arrangements, provided that the
responsibilities can be fulfilled through job sharing.
## Are you interested?
Please send us your detailed application by 03.11.2024. Please only use our
application portal. Interviews for this vacancy are expected to take place in
the week commencing 10.11.2024.
Job portal: https://jobs.utn.de/en/jobposting/ead54f69ce980cfe6f7d04fba57812b88328f4a00…
## Do you have questions?
We are happy to receive general questions by e-mail to jobs(a)utn.de, quoting the
reference number ENG-2024-04, and we will call you back. If you have
content-related questions, please contact Prof. Dr. Michael Roth at michael.roth(a)utn.de
The official version of this announcement is available in German on www.utn.de
Dear colleagues,
We are pleased to invite you to the tutorial, “Countering Hateful and Offensive Speech Online—Open Challenges,” which will take place on Friday, November 15, from 9:00 to 12:30 at EMNLP 2024 in Miami.
Overview: In today's digital age, hate speech and offensive speech online pose a significant challenge to maintaining respectful and inclusive online environments. This tutorial aims to provide attendees with a comprehensive understanding of the field by delving into essential dimensions such as:
1. Data Creation and Multilingualism.
2. Counter-narrative generation.
3. A hands-on session with one of the most popular APIs for detecting hate speech.
4. Fairness and ethics in AI.
5. The use of recent advanced approaches.
For the full program and detailed agenda, please visit the tutorial website. https://nlp-for-countering-hate-speech-tutorial.github.io/
Tutorial Organizers
* Flor Miriam Plaza-del-Arco, MilaNLP, Bocconi University, Italy.
* Debora Nozza, MilaNLP, Bocconi University, Italy.
* Marco Guerini, LanD_FBK, Fondazione Bruno Kessler, Italy.
* Jeffrey Sorensen, Jigsaw, USA.
* Marcos Zampieri, Language Technology Group, George Mason University, USA.
We look forward to your participation and hope to see you there!
Best regards,
The Tutorial Organizing Team
------------------------------------------------
Flor Miriam Plaza del Arco, Ph.D.
Postdoctoral Researcher, MilaNLP <https://milanlproc.github.io/>, Computing Sciences Department
Bocconi University
Via Röntgen, 1-2, 20136 Milan, MI, Italy
Twitter: @florplaza22
Web: https://fmplaza.github.io/
We are looking for a PhD candidate for a fully-funded 4-year project focusing on multilingual NLP, including extremely low-resource languages.
The PhD project is part of the National Centre of Competence in Research (NCCR) Evolving Language (www.evolvinglanguage.ch), a Swiss consortium with the ambitious goal of creating a new discipline, Evolutionary Language Science, that targets the past and future of language. The consortium consists of leading scientists from traditionally separated academic domains, which allows us to harvest the diverse expertise from the humanities, social sciences, computational sciences, natural sciences and medicine towards a broad-scale interdisciplinary collaboration.
Within this framework, the successful candidate will be expected to investigate how new terms emerge in a given language and how they spread in different contexts, thereby comparing Western societies with hunter-gatherer societies. This task is led by computational linguists and evolutionary anthropologists from the USI Università della Svizzera italiana and the University of Zurich (UZH): Prof. Lonneke van der Plas, Prof. Lena Jäger, and Prof. Andrea Migliano.
The ideal candidate should satisfy the following requirements:
• A Master's degree (or equivalent) in Computational Linguistics, Natural Language Processing, or related disciplines, such as Computer Science or Computational Cognitive Science
• High personal interest in multilingual approaches to NLP, low-resource languages, and language change
• Expertise in machine learning and computational modelling of language (change)
• Good skills in oral and written English (official language of the Ph.D. program)
• Good oral and written skills in one (or more) of the Swiss national languages. Italian is particularly welcome.
• Ability to work independently and to plan and direct own work
• Motivation to write a PhD dissertation, ability to work in a team, and autonomy in scheduling research steps. Interest in teaching and tutoring students and willingness to collaborate with colleagues, especially those from the different disciplines of the NCCR Evolving Language (engaging in scientific dialogue, listening, and thinking critically), are required.
More information and a link to apply for the position can be found here: https://sites.google.com/site/lonnekenlp/phd-positions-available
The 9th International Workshop on Computational Linguistics for Uralic Languages (IWCLUL 2024) will be organized by ACL SIGUR. The proceedings of the event will be published in the ACL Anthology. The workshop will take place on November 28-29, 2024 in Helsinki, Finland, at Metropolia University of Applied Sciences.
https://acl-sigur.github.io/iwclul2024.html
Submission deadline: October 25, 2024 (extended)
Registration/publication fees: 0€!
We solicit original and unpublished work related to NLP methods for Uralic languages, including multilingual methods that cover at least one Uralic language (e.g. Finnish, Estonian, or Hungarian). Appropriate topics include (but are not limited to):
- Multilingual approaches in NLP presenting work on at least one Uralic language
- LLMs and their use in the context of (endangered) Uralic languages
- Position papers
- Parsers, analysers and processing pipelines of Uralic languages
- Lexical databases, electronic dictionaries
- Finished end-user applications aimed at Uralic languages, such as spelling or grammar checkers, machine translation or speech processing
- Evaluation methods and gold standards, tagged corpora, treebanks
- Reports on language-independent or unsupervised methods as applied to Uralic languages
- Surveys and review articles on subjects related to computational linguistics for one or more Uralic languages
- Any work that aims at combining efforts and reducing duplication of work
- How to elicit activity from the language community, agitation campaigns, games with a purpose
Short papers can be up to 4 pages in length (5 for camera-ready version). Short papers can report on work in progress or a more targeted contribution such as software or partial results.
Long papers can be up to 8 pages in length (9 for camera-ready version). Long papers should report on previously unpublished, completed, original work.
Lightning talks are submitted as 750-word abstracts. Lightning talks are suited for discussing ideas or presenting work in progress. The abstracts will be published in a lightning proceedings on Zenodo.
All submission formats can have an unlimited number of pages for references. All submissions must follow the ACL stylesheet.
Submissions must be anonymous, and they will be peer-reviewed by our program committee. The peer review is double-blind. Papers must be submitted using the conference submission system by the deadline. At least one of the authors of an accepted paper must attend the event and present their paper.
Accepted papers (short and long) will be published in the joint proceedings that will appear in the ACL Anthology. Accepted papers will also be given an additional page to address the reviewers’ comments. The length of a camera-ready submission can then be 5 pages for a short paper and 9 for a long paper with an unlimited number of pages for references.
Important dates:
- Paper submission (full and short): October 25, 2024 (extended)
- Notification of acceptance: November 3, 2024
- Camera ready deadline: November 10, 2024
- Registration deadline: November 10, 2024
- Workshop: November 28-29, 2024