LaTeCH-CLfL 2025:
The 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
to be held on May 3rd or 4th, 2025 in conjunction with NAACL 2025 <https://2025.naacl.org/> in Albuquerque, NM.
https://sighum.wordpress.com/latech-clfl-2025/
Second Call for Papers (with apologies for cross-posting)
Organisers: Diego Alves, Yuri Bizzoni, Stefania Degaetano-Ortlieb, Anna Kazantseva, Janis Pagel, Stan Szpakowicz
LaTeCH-CLfL 2025 is the ninth in a series of meetings for NLP researchers who work with data from the broadly understood arts, humanities and social sciences, and for specialists in those disciplines who apply NLP techniques in their work. The workshop continues a long tradition of annual meetings. The SIGHUM Workshops on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH) ran ten times in 2007-2016. The five Workshops on Computational Linguistics for Literature (CLfL) took place in 2012-2016. The first eight joint workshops (LaTeCH-CLfL) were held in 2017-2024.
Topics and content
In the Humanities, Social Sciences, Cultural Heritage and literary communities, there is increasing interest in, and demand for, NLP methods for semantic and structural annotation, intelligent linking, discovery, querying, cleaning and visualization of both primary and secondary data. This is even true of primarily non-textual collections, given that text is also the pervasive medium for metadata. Such applications pose new challenges for NLP research: noisy, non-standard textual or multi-modal input, historical languages, vague research concepts, multilingual parts within one document, and so on. Digital resources often have insufficient coverage; resource-intensive methods require (semi-)automatic processing tools and domain adaptation, or intense manual effort (e.g., annotation).
Literary texts bring their own problems, because navigating this form of creative expression requires more than the typical information-seeking tools. Examples of advanced tasks include the study of literature of a certain period, author or sub-genre, recognition of certain literary devices, or quantitative analysis of poetry.
NLP methods applied in this context not only need to achieve high performance, but are often applied as a first step in research or scholarly workflow. That is why it is crucial to interpret model results properly; model interpretability might be more important than raw performance scores, depending on the context.
More generally, there is a growing interest in computational models whose results can be used or interpreted in meaningful ways. It is, therefore, of mutual benefit that NLP experts, data specialists and Digital Humanities researchers who work in and across their domains get involved in the Computational Linguistics community and present their fundamental or applied research results. It has already been demonstrated how cross-disciplinary exchange not only supports work in the Humanities, Social Sciences, and Cultural Heritage communities but also promotes work in the Computational Linguistics community to build richer and more effective tools and models.
Topics of interest include, but are not limited to, the following:
• adaptation of NLP tools to Cultural Heritage, Social Sciences, Humanities and literature;
• automatic error detection and cleaning of textual data;
• complex annotation schemas, tools and interfaces;
• creation (fully- or semi-automatic) of semantic resources;
• creation and analysis of social networks of literary characters;
• discourse and narrative analysis/modelling, notably in literature;
• emotion analysis for the humanities and for literature;
• generation of literary narrative, dialogue or poetry;
• identification and analysis of literary genres;
• interpretability of large language models output for DH-related tasks (explainable AI);
• linking and retrieving information from different sources, media, and domains;
• low-resource and historical language processing;
• modelling dialogue literary style for generation;
• modelling of information and knowledge in the Humanities, Social Sciences, and Cultural Heritage;
• profiling and authorship attribution;
• search for scientific and/or scholarly literature;
• work with linguistic variation and non-standard or historical use of language.
Information for authors
We invite papers on original, unpublished work in the topic areas of the workshop. In addition to long papers, we will consider short papers and system descriptions (demos). We also welcome position papers.
• Long papers, presenting completed work, may consist of up to eight (8) pages of content plus additional pages of references (just two if possible -:). The final camera-ready versions of accepted long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account.
• A short paper / demo presents work in progress, or the description of a system, and may consist of up to four (4) pages of content plus additional pages of references (one if you can). Upon acceptance, short papers will be given five (5) content pages in the proceedings.
• A position paper — clearly marked as such — should not exceed eight (8) pages including references.
All submissions are to follow the *ACL paper styles (for LaTeX / Overleaf and MS Word) available at https://github.com/acl-org/acl-style-files. Papers should be submitted electronically, only in PDF, via the LaTeCH-CLfL 2025 submission website on the SoftConf pages (see the submission link below).
Reviewing will be double-blind. Please do not include the authors’ names and affiliations, or any references to Web sites, project names, acknowledgements and so on — anything that immediately reveals the authors’ identity. Self-references should be kept to a reasonable minimum, and anonymous citations cannot be used.
Submission link: https://softconf.com/naacl2025/LaTeCH-CLfL2025/
Important dates (tentative)
Workshop paper due: January 30, 2025
Notification of acceptance: March 1, 2025
Camera-ready papers due: March 10, 2025
Workshop date: May 3rd or 4th, 2025
More on the organizers
Diego Alves, Language Science and Technology, Saarland University
Yuri Bizzoni, Center for Humanities Computing / School for Communication and Culture, Århus University
Stefania Degaetano-Ortlieb, Language Science and Technology, Saarland University
Anna Kazantseva, National Research Council Canada
Janis Pagel, Department of Digital Humanities, University of Cologne
Stan Szpakowicz, School of Electrical Engineering and Computer Science, University of Ottawa
Contact
latech-clfl(a)googlegroups.com <mailto:latech-clfl@googlegroups.com>
*Call for Participation*
Shared Task: Detection and Classification of Persuasion Techniques in Parliamentary Debates and Social Media, for Slavic Languages
Co-located with Slav-NLP 2025 <http://bsnlp.cs.helsinki.fi/> Workshop, at ACL 2025
http://bsnlp.cs.helsinki.fi/shared-task.html
TASK DESCRIPTION:
The task focuses on detection and classification of Persuasion
Techniques in 5 Slavic languages — Bulgarian, Polish, Croatian, Slovene
and Russian — in two types of texts: (a) parliamentary debates on
hotly-contested topics, and (b) social media posts, related to the
spread of disinformation. The task has two subtasks:
1. Subtask 1: Detection. Given a text and a list of fragment offsets,
determine for each fragment whether it contains one or more
persuasion techniques from a given taxonomy of persuasion techniques.
2. Subtask 2: Classification. Given a text and a list of fragment
offsets, determine for each fragment which persuasion techniques are
employed therein.
We use a rich taxonomy with 25 persuasion techniques: Name-calling or
labelling, Guilt by association, Casting doubt, Appeal to hypocrisy,
Questioning the reputation, Flag waving, Appeal to authority, Appeal to
popularity, Appeal to fear and prejudice, Appeal to values, Strawman,
Whataboutism, Red herring, Appeal to pity, Causal oversimplification,
False dilemma or no choice, Consequential oversimplification, False
equivalence, Slogans, Conversation killer, Appeal to time, Loaded
language, Obfuscation-Intentional vagueness-confusion, Exaggeration or
minimization, Repetition.
Subtask 1 is a binary classification task, whereas Subtask 2 is a
multi-class multi-label classification task. The text fragments
correspond to paragraphs.
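The relationship between the two subtasks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the example text, the way offsets are built, and the keyword cues in `TOY_CUES` are all hypothetical. They are not the official data format, and the keyword lookup is a stand-in for a real model, not a baseline provided by the organizers.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical document made of two paragraphs (the fragments of the task).
paragraphs = [
    "They are traitors. Everyone knows it.",
    "We must act now or lose everything.",
]
text = "\n".join(paragraphs)

# Build (start, end) character offsets for each paragraph; in the real task
# the offsets are given, so this loop only fabricates a plausible input.
offsets: List[Tuple[int, int]] = []
pos = 0
for p in paragraphs:
    offsets.append((pos, pos + len(p)))
    pos += len(p) + 1  # +1 for the newline separator

# Toy cue -> technique mapping (assumption: stands in for a trained classifier).
TOY_CUES: Dict[str, str] = {
    "traitors": "Name-calling or labelling",
    "everyone knows": "Appeal to popularity",
    "or lose everything": "False dilemma or no choice",
}

def classify(fragment: str) -> Set[str]:
    """Subtask 2 style output: the set of techniques found in one fragment."""
    lowered = fragment.lower()
    return {label for cue, label in TOY_CUES.items() if cue in lowered}

# Subtask 2: multi-label prediction per fragment.
subtask2 = [classify(text[start:end]) for start, end in offsets]
# Subtask 1: binary decision, true if at least one technique is present.
subtask1 = [bool(labels) for labels in subtask2]

for (start, end), flag, labels in zip(offsets, subtask1, subtask2):
    print(start, end, flag, sorted(labels))
```

In this sketch the Subtask 1 answer is derived from the Subtask 2 labels; a participating system could equally train a separate binary detector.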
For information about training and test data, guidelines, and
participation, please see the Shared Task Home Page:
<http://bsnlp.cs.helsinki.fi/shared-task.html>
IMPORTANT: Participants may join both subtasks or only one. It is not
mandatory to submit responses for all languages. Up to 5 system
responses per language are allowed.
Important Dates
* Registration deadline: 20 April 2025
* Release of test data to registered participants: 22 April 2025
* Submission of system responses: 26 April 2025
* Results announced to participants: 29 April 2025
* Submission of shared task papers (optional): 11 May 2025
Questions and contact: bsnlp(a)cs.helsinki.fi <mailto:bsnlp@cs.helsinki.fi>
--
Roman Yangarber
Professor, University of Helsinki, Finland
Digital Humanities
INEQ: Helsinki Inequality Initiative
<https://helsinki.fi/en/ineq-helsinki-inequality-initiative> —
Linguistic Inequalities and Translation Technologies
------------------------------------------------------------------------
e-Learning & language learning
Language Learning Lab
Unioninkatu 40, Metsätalo A214
revitaAI.github.io <https://revitaai.github.io>
helsinki.fi/language-learning-lab
<https://www.helsinki.fi/language-learning-lab>
mobile: +358 50 41 51 71 3
------------------------------------------------------------------------
RЯ
In this newsletter:
Renew your LDC membership today
New publications:
Iraqi Arabic - English Lexical Database<https://catalog.ldc.upenn.edu/LDC2025L01>
LORELEI Hungarian Representative Language Pack<https://catalog.ldc.upenn.edu/LDC2025T01>
________________________________
Renew your LDC membership today
The importance of curated resources for language-related education, research, and technology development drives LDC's mission to create them, to accept data contributions from researchers across the globe, and to broadly share such resources through the LDC Catalog. LDC members enjoy no-cost access to new corpora released annually, as well as the ability to license legacy data sets from among our 960+ holdings at reduced fees. Ensure that your data needs continue to be met by renewing your LDC membership or by joining the Consortium today.
Now through March 3, 2025, 2024 members receive a 10% discount on 2025 membership, and new or returning organizations receive a 5% discount. Membership remains the most economical way to access current and past LDC releases. Consult Join LDC<https://www.ldc.upenn.edu/members/join-ldc> for more details on membership options and benefits.
________________________________
New publications:
Iraqi Arabic - English Lexical Database<https://catalog.ldc.upenn.edu/LDC2025L01> was developed by LDC. It has six interrelated tables presenting over 67,000 Iraqi Arabic words as orthographic forms in Arabic script and pronunciation forms in IPA format, along with more than 120,000 English tokens.
This release is the result of a collaboration with Georgetown University Press <https://press.georgetown.edu/> to enhance and update three dialectal Arabic dictionaries -- Iraqi, Moroccan, and Syrian -- originally published in the 1960s. The Georgetown Dictionary of Iraqi Arabic<https://press.georgetown.edu/Book/The-Georgetown-Dictionary-of-Iraqi-Arabic> was published in 2013. That work was based on, and expanded, two dictionaries, A Dictionary of Iraqi Arabic: English-Arabic (Clarity, Stowasser, and Wolfe, eds., 2003) and A Dictionary of Iraqi Arabic: Arabic-English (Woodhead and Beene, eds., 2003).
The several enhancements developed by LDC in the updated and enhanced dictionary and the lexical database included facilitating comparisons across Arabic dialects and Modern Standard Arabic by providing Arabic script spellings and IPA pronunciations to Iraqi words and phrases; promoting ease of use by language learners and researchers by developing reasonable orthographic conventions for applying the Arabic alphabet to the dialect; and facilitating a user's understanding of morphological and lexical relations by adding information on the linguistic structures of Iraqi Arabic.
The documentation accompanying this release includes instructions for combining into one database the tables in this corpus with the tables in Moroccan Arabic - English Lexical Database (LDC2023L01)<https://catalog.ldc.upenn.edu/LDC2023L01>.
2025 members can access this corpus through their LDC accounts provided they have submitted a completed copy of the special license agreement. Non-members may license this data for a fee.
LORELEI Hungarian Representative Language Pack<https://catalog.ldc.upenn.edu/LDC2025T01> comprises over 686 million words of Hungarian monolingual text, 165,000 words of which were translated into English, 2.3 million words of found Hungarian-English parallel text, and 87,000 Hungarian words translated from English data. Approximately 72,500 words were annotated for named entities and over 25,000 words were annotated for full entity (including nominals and pronouns), entity linking and situation frames (identifying entities, needs and issues); over 17,000 words have simple semantic annotation; and close to 10,000 words were annotated for noun phrase chunking. Data was collected from discussion forums, news, reference, social network, and weblogs.
The LORELEI (Low Resource Languages for Emergent Incidents) program was concerned with building human language technology for low resource languages in the context of emergent situations. Representative languages were selected to provide broad typological coverage.
The knowledge base for entity linking annotation is available separately as LORELEI Entity Detection and Linking Knowledge Base (LDC2020T10)<https://catalog.ldc.upenn.edu/LDC2020T10>.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
The call for papers for EUROCALL 2025 is out.
See: https://eurocall2025.com/call-for-papers/
EUROCALL is the European Association for Computer Assisted Language Learning.
The conference will be held in Milan at Università Cattolica on 27-30 August 2025.
IMPORTANT DATES
01 December 2024: first call for papers
mid December 2024: submission opens
03 February 2025: submission of abstracts closes
21 February 2025: deadline to sign up as reviewer of abstracts on OpenConf
w/c 24th February/ 3rd March 2025: reviews assigned
31 March 2025: deadline for completion of all reviews
14 April 2025: notification to authors
15 April - 15 June 2025: early bird registration
16 June 2025 - 16 July 2025: ordinary conference registration
27-30 August 2025: EUROCALL 2025
Best,
Marco
Prof. Marco C. Passarotti
Computational Linguistics
Index Thomisticus Treebank https://itreebank.marginalia.it/
ERC Grantee, P.I. LiLa https://lila-erc.eu/ (Grant Agreement No. 769994)
CIRCSE Research Centre https://centridiricerca.unicatt.it/circse_index.html
Università Cattolica del Sacro Cuore
Largo Gemelli, 1
20123 Milan, Italy
marco.passarotti(a)unicatt.it
tel. +39-02-72342380
The NLP group at Linköping University<https://liu-nlp.ai/>, Sweden, is looking for a
Postdoc in Natural Language Processing
within the EU-funded TrustLLM project on developing open, trustworthy, and factual large language models.
The position is full-time (100%) for a fixed term of two years, with the potential of an extension to a total of three years, and comes without teaching obligation. Starting date is by agreement, but ideally as soon as possible.
Research areas include language adaptation and modularisation of LLMs, tokenization for multilingual LLMs, as well as evaluation of relevant qualities (e.g. trustworthiness, factuality) in multilingual LLMs.
For more information about this position and how to apply, see:
https://liu-nlp.ai/postdoc-trustllm-2025/
The application deadline is 2025-02-05.
Please do not hesitate to contact me for details and discussion!
Best regards,
Marcel
--
Marcel Bollmann, Dr. phil.
Associate Professor in Natural Language Processing
Department of Computer and Information Science, Linköping University, Sweden
www: https://marcel.bollmann.me/
ACL 2025 Call for Papers
Main Conference
ACL 2025
Website: https://2025.aclweb.org/
Submission Deadline: February 15, 2025
Conference Dates: July 27 to August 1, 2025
Location: Vienna, Austria
Special Theme: “Generalization of NLP Models”
Contact:
Roberto Navigli (General Chair)
Wanxiang Che, Joyce Nabende, Mohammad Taher Pilehvar, Ekaterina Shutova
(Program Chairs)
Overview
ACL 2025 invites the submission of long and short papers featuring
substantial, original, and unpublished research in all aspects of
Computational Linguistics and Natural Language Processing.
ACL 2025 has a goal of a diverse technical program—in addition to
traditional research results, papers may contribute negative findings,
survey an area, announce the creation of a new resource, argue a
position, report novel linguistic insights derived using existing
computational techniques, and reproduce, or fail to reproduce, previous
results. As in recent years, some of the presentations at the conference
will be of papers accepted by the Transactions of the ACL (TACL) and by
the Computational Linguistics (CL) journals.
Papers submitted to ACL 2025, but not selected for the main conference,
will also automatically be considered for publication in the Findings of
the Association for Computational Linguistics.
Paper Submission Information
Papers may be submitted to the ARR 2025 February cycle. Papers that have
received reviews and a meta-review from ARR (whether from the ARR 2025
February cycle or an earlier ARR cycle) may be committed to ACL 2025 via
the conference commitment site (TBA).
Submission Topics
ACL 2025 aims to have a broad technical program. Relevant topics for the
conference include, but are not limited to, the following areas (in
alphabetical order):
Computational Social Science and Cultural Analytics
Dialogue and Interactive Systems
Discourse and Pragmatics
Efficient/Low-Resource Methods for NLP
Ethics, Bias, and Fairness
Generation
Human-centered NLP
Information Extraction
Information Retrieval and Text Mining
Interpretability and Analysis of Models for NLP
Language Modeling
Linguistic theories, Cognitive Modeling and Psycholinguistics
Machine Learning for NLP
Machine Translation
Multilinguality and Language Diversity
Multimodality and Language Grounding to Vision, Robotics and Beyond
NLP Applications
Phonology, Morphology and Word Segmentation
Question Answering
Resources and Evaluation
Semantics: Lexical and Sentence-Level
Sentiment Analysis, Stylistic Analysis, and Argument Mining
Speech recognition, text-to-speech and spoken language understanding
Summarization
Syntax: Tagging, Chunking and Parsing
Special Theme: Generalization of NLP Models
ACL 2025 Theme Track: Generalization of NLP Models
Following the success of the ACL 2020-2024 Theme tracks, we are happy to
announce that ACL 2025 will have a new theme with the goal of reflecting
and stimulating discussion about the current state of development of the
field of NLP.
Generalization is crucial for ensuring that models behave robustly,
reliably, and fairly when making predictions on data different from
their training data. Achieving good generalization is critically
important for models used in real-world applications, as they should
emulate human-like behavior. Humans are known for their ability to
generalize well, and models should aspire to this standard.
The theme track invites empirical and theoretical research and position
and survey papers reflecting on the Generalization of NLP Models. The
possible topics of discussion include (but are not limited to) the
following:
How can we enhance the generalization of NLP models across various
dimensions—compositional, structural, cross-task, cross-lingual,
cross-domain, and robustness?
What factors affect the generalization of NLP models?
What are the most effective methods for evaluating the
generalization capabilities of NLP models?
While Large Language Models (LLMs) significantly enhance the
generalization of NLP models, what are the key limitations of LLMs in
this regard?
The theme track submissions can be either long or short.
We anticipate having a special session for this theme at the conference
and a Thematic Paper Award in addition to other categories of awards.
Two-Stage Review: Submission to ARR, Commitment to ACL 2025
ACL 2025 will use ACL Rolling Review (ARR) as a reviewing system, but
final decisions will be made by the conference. Both submissions of
articles for review and commitment of reviewed articles to the
conference will be performed via the Open Review platform.
Specifically, authors will follow a two-step process:
Authors submit articles to ARR, where submissions receive reviews
and meta-reviews from ARR reviewers and area chairs;
Authors commit their reviewed articles to a publication venue (e.g.,
ACL 2025), where Senior Area Chairs and Program Chairs make acceptance
decisions from the ARR reviews and meta-reviews.
ACL 2025 has chosen this approach in coordination with *CL 2024
conferences, which are adopting the same procedure and a coordinated
submission plan to allow maximum flexibility during their submission
periods for the authors.
At each cycle, after a paper has been fully reviewed, authors have the
option to commit their paper to a conference or revise and resubmit for
another round of reviews.
The reviewing process will continue to be double-blind. Reviewers will
not see authors, nor will authors see reviewers, and reviews on ARR will
not be made publicly visible. However, authors will be given the option
through ARR to make their anonymized submitted articles publicly
visible.
Mandatory Reviewing Workload
As the pace of research in the field continues to increase, we need to
strengthen the commitment to reviewing for each paper submission. During
the ARR submission process, authors will be required to specify which
co-authors are committing to cover reviewing in this reviewing cycle.
Please see the new ARR policy regarding reviewing workload. As this is
an ARR-wide policy for all *CL conferences, questions or clarifications
should be addressed to ARR directly.
Important Dates:
Submission deadline (all papers are submitted to ARR): February 15,
2025
ARR reviews & meta-reviews available to authors of the February
cycle: April 15, 2025
Commitment deadline for ACL 2025: April 20, 2025
Notification of acceptance: May 15, 2025
Withdrawal deadline: May 30, 2025
Camera-ready papers due: May 30, 2025
Tutorials: July 27, 2025
Conference: July 28 - 30, 2025
Workshops: July 31 - August 1, 2025
Note: All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”).
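For authors outside the UTC-12:00 zone, the deadline arithmetic can be checked with a short Python sketch. The helper name `aoe_deadline_to_utc` is hypothetical, and the only inputs taken from the call are the 11:59PM cutoff, the UTC-12:00 offset, and the February 15, 2025 submission date.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" is a fixed UTC-12:00 offset.
AOE = timezone(timedelta(hours=-12), name="AoE")

def aoe_deadline_to_utc(year: int, month: int, day: int) -> datetime:
    """Return the 11:59PM AoE cutoff on the given date, expressed in UTC."""
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)

# Submission deadline listed above: February 15, 2025.
submission_utc = aoe_deadline_to_utc(2025, 2, 15)
print(submission_utc.isoformat())  # 2025-02-16T11:59:00+00:00
```

Calling `astimezone()` on the result with a local time zone gives the cutoff in the author's own wall-clock time.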
Paper Submission Details
Both long and short paper submissions should follow all of the ARR
submission requirements at https://aclrollingreview.org/cfp, including:
Long Papers (8 pages) and Short Papers (4 pages):
Instructions for Two-Way Anonymized Review:
Authorship
Citation and Comparison
Multiple Submission Policy, Resubmission Policy, and Withdrawal
Policy
Ethics Policy including the responsible NLP research checklist
Limitations
Paper Submission and Templates
Optional Supplementary Materials
Final versions of accepted papers will be given one additional page of
content (up to 9 pages for long papers, up to 5 pages for short papers)
to address reviewers’ comments.
Following the ACL and ARR policies, there is no anonymity period
requirement.
At the time of submission to ARR, authors will be asked to select a
preferred venue (e.g., ACL 2025). This is used only to calculate
acceptance rates. Authors who selected ACL 2025 as a preferred venue
when submitting to ARR may choose not to commit to ACL 2025 after
receiving their reviews, and authors who selected a preferred venue
other than ACL 2025 when submitting to ARR are still welcome to commit
to ACL 2025.
Presentation at the Conference
All accepted papers must be presented at the conference to appear in the
proceedings. The conference will include both in-person and virtual
presentation options. Papers without at least one presenting author
registered by the early registration deadline may be subject to desk
rejection.
Long and short papers will be presented orally or as posters as
determined by the program committee. While short papers will be
distinguished from long papers in the proceedings, there will be no
distinction in the proceedings between papers presented orally and
papers presented as posters.
Dear all,
Today, the data freeze of the MultiLexNorm 2 shared task is in effect.
As defined in the previous iteration of the task, lexical normalization is:
The task of transforming an utterance into its standard form, word by word,
including both one-to-many (1-n) and many-to-one (n-1) replacements.
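To make the definition concrete, here is a minimal lookup-based sketch in Python. It is only an illustration: the English tokens and both tiny lexicons are invented for readability (the shared task itself targets other languages), and a dictionary lookup stands in for whatever model a participant would actually use.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicons, invented for this example.
# One source token may map to several normalized tokens (1-n).
ONE_TO_MANY: Dict[str, List[str]] = {"gonna": ["going", "to"], "u": ["you"]}
# Several source tokens may merge into one normalized token (n-1).
MANY_TO_ONE: Dict[Tuple[str, ...], str] = {("in", "side"): "inside"}

def normalize(tokens: List[str]) -> List[str]:
    """Normalize word by word, handling 1-n splits and n-1 merges."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        # Try a two-token merge first (n-1 replacement).
        pair = tuple(t.lower() for t in tokens[i:i + 2])
        if pair in MANY_TO_ONE:
            out.append(MANY_TO_ONE[pair])
            i += 2
            continue
        # Otherwise replace the single token, possibly by several (1-n),
        # or keep it unchanged when it is not in the lexicon.
        out.extend(ONE_TO_MANY.get(tokens[i].lower(), [tokens[i]]))
        i += 1
    return out

result = normalize("u gonna wait in side".split())
print(result)  # ['you', 'going', 'to', 'wait', 'inside']
```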
This time, the focus is on non-Indo-European languages. We have managed
to obtain (new) datasets for: Thai, Vietnamese, Indonesian, Japanese, and
Korean.
More information can be found on: https://noisy-text.github.io/2025/multi-lexnorm.html#
Deadlines:
Data available: Nov 15, 2024
Data freeze: Jan 14, 2025
Test data: Jan 25, 2025
Final Evaluation: Feb 07, 2025
Paper deadline: Feb 25, 2025
Paper reviewed: Mar 01, 2025
Camera ready: Mar 10, 2025
Workshop: May 03, 2025 (TBD)
Best,
The organizers
*Call for Participation*
*First workshop on Challenges in Processing South Asian Languages (CHiPSAL
2025), co-located with the 31st International Conference on Computational
Linguistics (COLING 2025)*
*Virtual*
*January 19, 2025 8.30 AM - 3.00 PM (GMT +4)*
*Accepted papers -* https://sites.google.com/view/chipsal/accepted-papers
*Workshop program -* https://sites.google.com/view/chipsal/workshop-program
*Workshop Website -* https://sites.google.com/view/chipsal/
Please join us!
We are excited to engage with the research community in advancing NLP for
South Asian languages and fostering meaningful collaborations.
*Why CHiPSAL?*
South Asia, with over 1.97 billion people, is one of the most
linguistically diverse regions globally, home to 700+ languages and 25+
major scripts. This region is rich in cultural and linguistic heritage but
faces significant challenges in natural language processing (NLP). These
include encoding and orthographic issues, resource constraints, linguistic
complexities, dialectal diversity, and more. *CHiPSAL* addresses these
challenges and advances NLP research for South Asian languages while
fostering collaborations across linguistic, technical, and cultural domains.
*Organizing Chairs:*
Kengatharaiyer Sarveswaran, University of Jaffna, Jaffna, Sri Lanka
Ashwini Vaidya, Indian Institute of Technology, Delhi, India
Bal Krishna Bal, Kathmandu University, Kathmandu, Nepal
Sana Shams, University of Engineering and Technology, Lahore, Pakistan
Surendrabikram Thapa, Virginia Tech, USA
*Program Committee Members (alphabetical order):*
A M Abirami, Thiagarajar College of Engineering, India.
Abhai Pratap Singh, Amazon, USA.
Akaash Vishal Hazarika, Splunk, USA.
Aloka Fernando, University of Moratuwa, Sri Lanka.
Aman Shakya, Institute of Engineering, Pulchowk, Tribhuvan University, Nepal.
Anitha Dhakshina Moorthy, Thiagarajar College of Engineering, India.
Ann Sinthusha Anton Vijeevaraj, University of Vavuniya, Sri Lanka.
Annette Hautli-Janisz, University of Passau, Germany.
Ashwini Vaidya, IIT Delhi, India.
Bal Krishna Bal, Kathmandu University, Nepal.
Balaram Prasain, Tribhuvan University, Nepal.
Bareera Sadia, Al-Khawarizmi Institute of Computer Science, UET, Lahore, Pakistan.
Brinda Gurusamy, Cisco, USA.
Buddhika Karunarathne, University of Moratuwa, Sri Lanka.
Eugene Y A Charles, University of Jaffna, Sri Lanka.
Farah Adeeba, University of Engineering and Technology, KSK, Pakistan.
Farhan Jafri, Jamia Millia Islamia, India.
Gihan Dias, University of Moratuwa, Sri Lanka.
H N D Thilini, University of Colombo School of Computing, Sri Lanka.
Hariram Veeramani, UCLA, USA.
Hassan Sajjad, Dalhousie University, Canada.
Jayeeta Putatunda, Fitch Ratings, USA.
Kengatharaiyer Sarveswaran, University of Jaffna, Sri Lanka.
Krishna Chalise, Tribhuvan University, Nepal.
Kritesh Rauniyar, IIMS College, Nepal.
Lekhnath Pathak, Tribhuvan University, Nepal.
Lynnette Hui Xian Ng, CMU, USA.
Mahak Shah, Columbia University, USA.
Manjunath Chandrashekaraiah, Astera Labs, USA.
Menan Velayuthan, University of Moratuwa, Sri Lanka.
Munief Tahir, Al-Khawarizmi Institute of Computer Science, UET, Lahore, Pakistan.
Parameswari Krishnamurthy, IIIT Hyderabad, India.
Paritosh Katre, PayPal, USA.
Prakash Poudyal, Kathmandu University, Nepal.
Preetish Kakkar, Adobe, USA.
Qurat-ul-Ain Akram, University of Engineering and Technology, KSK, Pakistan.
Randil Pushpananda, University of Colombo School of Computing, Sri Lanka.
Sahar Rauf, Al-Khawarizmi Institute of Computer Science, UET, Lahore, Pakistan.
Sana Shams, Al-Khawarizmi Institute of Computer Science, UET, Lahore, Pakistan.
Shuvam Shiwakoti, Virginia Tech, USA.
Siddhant Bikram Shah, Northeastern University, USA.
Sinnathamby Mahesan, University of Jaffna, Sri Lanka.
Suganya Ramamoorthy, Vellore Institute of Technology University, India.
Surabhi Adhikari, Columbia University, USA.
Surangika Ranathunga, Massey University, New Zealand.
Surendrabikram Thapa, Virginia Tech, USA.
Tafseer Ahmed, Alexa Translations, Canada.
Toqeer Ehsan, Mohamed bin Zayed University of Artificial Intelligence,
United Arab Emirates.
Usman Naseem, Macquarie University, Australia.
Uthayasanker Thayasivam, University of Moratuwa, Sri Lanka.
Vijayrajsinh Gohil, New York University, USA.
*Volunteers (alphabetical order):*
Ahrane Mahaganapathy, University of Jaffna, Sri Lanka.
Menan Velayuthan, University of Moratuwa, Sri Lanka.
Suthakar Sivashanth, University of Jaffna, Sri Lanka.
Thank you
--
*Dr Kengatharaiyer Sarveswaran (Sarves)*
Senior Lecturer (Grade-I) in Computer Science
Department of Computer Science
Faculty of Science
University of Jaffna
Sri Lanka
sarves.github.io
The 4th Workshop on Arabic Corpus Linguistics (WACL-4) [1]
WACL4 AT COLING’2025
WITH FOCUS ON ARABIC DIALECTS
The field of Arabic language research using corpora and corpus methods
has experienced significant growth and development in recent years. What
once were isolated efforts have now transformed into a vibrant and
expansive area of study, advancing rapidly across multiple dimensions in
both corpus and computational linguistics. Building upon the success of
previous editions--WACL-1 in 2011, WACL-2 in 2013 in conjunction with
the Corpus Linguistics Conference at Lancaster University, and WACL-3 in
2019 at the Corpus Linguistics 2019 conference at Cardiff University--we
are excited to announce the fourth edition of the Workshop on Arabic
Corpus Linguistics (WACL-4).
The primary objectives of WACL-4 are to highlight the latest
developments in the creation, annotation, and application of Arabic
corpora, including the introduction of new corpora and advancements in
annotation techniques, while fostering collaboration among researchers
from diverse institutions and regions to stimulate joint research
projects and interdisciplinary initiatives. This edition will place a
special emphasis on the study of Arabic dialects, including non-standard
and regional varieties, to broaden the understanding of Arabic in its
various manifestations and support research on under-resourced
linguistic varieties. Additionally, WACL-4 aims to encourage the
development and refinement of Natural Language Processing (NLP) systems
and tools tailored for Arabic, integrating corpora into NLP workflows,
creating new computational tools, and evaluating existing systems to
improve their efficacy in processing Arabic text.
The workshop will be held online on January 20th, 2025 in conjunction
with the 31st edition of COLING in 2025 in Abu Dhabi (UAE).
We are pleased to share the programme of WACL4 2025 with you.
Please visit:
https://drive.google.com/file/d/1SSNC1r4dx023cb_FuQWhWa8d3Si4fvvp/view?usp=…
To register for the workshop, please visit
https://coling2025.org/registration/
We are looking forward to welcoming you at WACL4 at COLING'2025
Kind regards,
WACL4 Organising Committee
--
Amal Haddad Haddad (She/her)
Facultad de Traducción e Interpretación
Universidad de Granada |https://www.ugr.es/personal/amal-haddad-haddad
Lexicon Research Group |http://lexicon.ugr.es/haddad
Co-Convenor, BAAL SIG 'Humans, Machines,
Language'|https://r.jyu.fi/humala
Event Coordinator, BAAL SIG 'Language, Learning and Teaching'
Links:
------
[1] https://wp.lancs.ac.uk/wacl4
++ 1st reminder to participate in our web survey on data annotation bottlenecks and active learning; apologies for cross-posting ++
Dear list members,
We invite you to participate in our web survey exploring how recent advancements in NLP, such as LLMs, have changed the need for labeled data in Supervised Machine Learning.
Survey details:
* Topic: Web survey on Data Annotation and Active Learning
* Target group: Researchers and practitioners alike in the fields of NLP, Supervised Machine Learning, and Active Learning in particular (knowledge of Active Learning is not required)
* Duration: 5-15 minutes
* Deadline for participation: January 12, 2025
* Survey link: https://bildungsportal.sachsen.de/umfragen/limesurvey/index.php/538271
Why should I invest my time in this survey?
* Make an impact: Participate in a community effort and help to gain a better understanding of the current state of, and open issues with, methods that are used to overcome a lack of labeled data.
* Gain insights: Receive a report with key findings to incorporate these insights into research and development of new methods and technologies.
Thank you for considering participating in our survey!
If you have any questions or require additional information, please don't hesitate to contact us directly at activelearningsurvey2024(a)gmail.com.
If you know colleagues or peers who might be interested, we'd be grateful if you could forward this survey to them as well.
Best regards,
Julia Romberg (GESIS - Leibniz Institute for the Social Sciences, Germany)
Christopher Schröder (Institut für Angewandte Informatik e. V., Germany)
Julius Gonsior (TUD Dresden University of Technology)
------------------------------------------------------------------------
Leibniz Institute for the Social Sciences
Julia Romberg
Computational Social Science, Team Data Science Methods
+49(221)47694-742
Dear colleagues
Please forward this email to anyone you think might be interested in a PhD studentship focused on discovering drivers of child language development from videos/multimodal transcripts of early child-parent/educator interactions:
https://www.findaphd.com/phds/project/identifying-drivers-of-language-devel…
The student will be based at the University of Manchester in the UK, but will spend at least 12 months at the University of Melbourne, Australia.
Best,
Colin
Neural language models have revolutionised natural language processing (NLP) and have provided state-of-the-art results for many tasks. However, their effectiveness is largely dependent on the pre-training resources. Therefore, language models (LMs) often struggle with low-resource languages in both training and evaluation. Recently, there has been a growing trend in developing and adopting LMs for low-resource languages. LoResLM aims to provide a forum for researchers to share and discuss their ongoing work on LMs for low-resource languages.
LoResLM 2025 will be a physical workshop co-located with COLING 2025, Abu Dhabi on 20th January 2025.
We are pleased to share the programme of LoResLM 2025 with you. Please visit https://loreslm.github.io/program for the full programme.
To register for the workshop, please visit https://coling2025.org/registration/
We are looking forward to welcoming you at LoResLM 2025 in Abu Dhabi.
The workshop is supported in part by CLARIN-UK, funded by the Arts and Humanities Research Council as part of the Infrastructure for Digital Arts and Humanities programme.
>> Keynote Speaker
Jose Camacho-Collados, Cardiff University.
Title - "Multilinguality and Cultural Awareness in Language Models"
>> Organising Committee
Hansi Hettiarachchi, Lancaster University, UK
Tharindu Ranasinghe, Lancaster University, UK
Paul Rayson, Lancaster University, UK
Ruslan Mitkov, Lancaster University, UK
Mohamed Gaber, Birmingham City University, UK
Damith Premasiri, Lancaster University, UK
Fiona Anting Tan, National University of Singapore, Singapore
Lasitha Uyangodage, University of Münster, Germany
>> Programme Committee
Gábor Bella - IMT Atlantique, France
Samuel Cahyawijaya - The Hong Kong University of Science and Technology, Hong Kong
Burcu Can - University of Stirling, UK
Çağrı Çöltekin - University of Tübingen, Germany
Raj Dabre - National Institute of Information and Communications Technology, Japan
Vera Danilova - Uppsala University, Sweden
Debashish Das - Birmingham City University, UK
Ona de Gibert - University of Helsinki, Finland
Alphaeus Dmonte - George Mason University, USA
Bonaventure F. P. Dossou - McGill University, Canada
Daan van Esch - Google
Ignatius Ezeani - Lancaster University, UK
Anna Furtado - University of Galway, Ireland
Amal Htait - Aston University, UK
Ali Hürriyetoğlu - Wageningen University & Research, Netherlands
Danka Jokic - University of Belgrade, Serbia
Diptesh Kanojia - University of Surrey, UK
Daisy Lal - Lancaster University, UK
Colin Leong - University of Dayton, USA
Veronika Lipp - Hungarian Research Centre for Linguistics, Hungary
Muhidin Mohamed - Aston University, UK
Farhad Nooralahzadeh - University of Zurich, Switzerland
Rrubaa Panchendrarajan - Queen Mary University of London, UK
Nadeesha Pathirana - Aston University, UK
Alistair Plum - University of Luxembourg, Luxembourg
Nishat Raihan - George Mason University, USA
Omid Rohanian - University of Oxford, UK
Sandaru Seneviratne - Australian National University, Australia
Ravi Shekhar - University of Essex, UK
Archchana Sindhujan - University of Surrey, UK
Claytone Sikasote - University of Cape Town, South Africa
Marjana Prifti Skenduli - University of New York Tirana, Albania
Uthayasanker Thayasivam - University of Moratuwa, Sri Lanka
Taro Watanabe - Nara Institute of Science and Technology, Japan
Edlira Vakaj - Birmingham City University, UK
John Vidler - Lancaster University, UK
Phil Weber - Aston University, UK
Bryan Wilie - Hong Kong University of Science & Technology, Hong Kong
Artūrs Znotiņš - University of Latvia, Latvia
URL - https://loreslm.github.io/
Twitter - https://x.com/LoResLM2025
LinkedIn - https://www.linkedin.com/company/loreslm/
Hello,
We are hiring 2 PhD students to work on combining language models with
structured data, starting from September 2025, at Telecom Paris,
Institut Polytechnique de Paris.
Large Language Models are amazing, and with our research project, we aim
to make them even more amazing! Our project will connect large language
models to structured knowledge such as knowledge bases or databases.
With this,
1. language models will stop hallucinating
2. language models can be audited and updated reliably
3. language models will become smaller and thus more eco-friendly and
deployable
We work in the DIG team at Telecom Paris, one of the finest engineering
schools in France, and part of Institut Polytechnique de Paris, ranked
38th in the world by the QS ranking. The institute is 45 min away from
Paris by public transport, and located in the green surroundings of the
Plateau de Saclay.
Excited about joining us? Tick these boxes:
1. Have a good background in natural language processing, machine
learning, and knowledge representation
2. Have a master's degree (or equivalent)
3. Be of European nationality (imposed by our sponsor, the French
Ministry of Armed Forces)
Check out our Web site to apply:
https://suchanek.name/work/research/kb-lm/index.html
Fabian Suchanek & Nils Holzenberger
Dear all,
The end of 2024 has been very active at EURALEX. For example, you can now find the EURALEX 2024 proceedings on the website (they have already been indexed by SCOPUS), and the video recordings from the 2024 Congress (presentations and pre-conference workshops) have been made available on the Videolectures website (https://videolectures.net/events/euralex2024_cavtat).
We are now pleased to announce the launch of our new webinar series.
EURALEX Talks is a series of online webinars featuring invited experts in the field of lexicography. These sessions are free and open to everyone. They explore a wide variety of topics related to language and lexicography. Each talk lasts approximately 40 minutes, followed by questions and discussion. Join us on Tuesday 28 January 2025 at 16.00 (CET) for our first talk, which will be given by Pamela Faber. Zoom link: https://uni-lj-si.zoom.us/j/8569694820.
The Language of Love Fraud: Frames of Deception
The language of love fraud is a unique example of online linguistic deception. Using a fabricated identity, the fraudster creates the illusion of a romantic relationship between himself and the victim, solely through language. This deception is often successful because of the fraudster’s lexical choices (soulmate, cherish, adore, sacred vow, etc.), which override his flawed syntax and activate a frame of romantic love in the victim's mind.
Biodata
Pamela Faber is Professor Emeritus in Translation and Interpreting at the University of Granada (Spain). She is the founder of the LexiCon research group, with whom she has carried out various nationally-funded research projects on Frame-Based Terminology, the approach to terminology that she created and developed. One of the results of these projects is EcoLexicon (ecolexicon.ugr.es), a terminological knowledge base on environmental science. She has more than 150 articles, book chapters, and books, which have inspired researchers throughout the world to explore specialized knowledge from a frame-based perspective.
Looking forward to seeing you online. Please forward the announcement to other mailing lists and colleagues who might be interested.
Best wishes
Iztok Kosem
EURALEX President
Apologies for cross-posting.
----------------------------------------
*The International Conference on Spoken Language Translation*
*ACL – 22nd IWSLT 2025 – Second Call for Participation*
*31 July-1 August 2025 - Vienna, Austria*
http://iwslt.org
The International Conference on Spoken Language Translation (IWSLT)
<https://iwslt.org/> is the premier annual conference for all aspects of
Spoken Language Translation. Every year, the conference organises and
sponsors open evaluation campaigns around key challenges in simultaneous
and consecutive translation, under real-time/low latency or offline
conditions and under low-resource or multilingual constraints. System
descriptions and results from participants’ systems and scientific papers
related to key algorithmic advances and best practices are presented.
IWSLT is the venue of SIGSLT <https://iwslt.org/sigslt/>, the Special
Interest Group on Spoken Language Translation
of ACL <https://www.aclweb.org/portal/>, ISCA <https://www.isca-speech.org/>
and ELRA <https://www.elra.info/>. With a track record of 21 years, IWSLT
benchmarks and proceedings serve as a reference for all researchers and
practitioners working on speech translation and related fields.
The 22nd edition of IWSLT will be run as a hybrid ELRA
<https://www.elra.info/>/ACL <https://www.aclweb.org/portal/> event,
co-located with ACL 2025 <https://2025.aclweb.org/> from 31 July to 1
August 2025.
*Important Dates*
*January 1, 2025*: Release of shared task training and dev data
*March 15, 2025*: Scientific paper submission deadline
*Apr 1-15, 2025*: Evaluation period
*April 21, 2025*: System description paper submission deadline
*May 15, 2025*: Notification of acceptance
*June 1, 2025*: Camera-ready deadline (all papers)
*July 31-Aug 1, 2025*: IWSLT conference
Evaluation
IWSLT 2025 features shared tasks <https://iwslt.org/2025/#shared-tasks>
that address the following focus areas:
- High-resource ST: Offline track, Simultaneous track, Subtitling track
- Low-resource ST: Low-resource and Indic (multilingual) tracks
- Instruction-following Speech Processing track: Technical domain ST, ASR,
Summarization, and QA
Training and development data for each shared task will be prepared and
released by the respective organisers (for further information on this
initiative, please refer to the IWSLT website <https://iwslt.org/2025/>).
Participants will receive instructions about how to submit their runs. In
addition, participants have the opportunity to present their work
through a system paper that will be published in the ACL Proceedings.
Conference
IWSLT also invites submissions of scientific papers to be published in the
ACL Proceedings and presented either in oral or poster format. The
conference selects high-quality, original contributions on theoretical and
practical issues of spoken language translation research, technologies and
applications. Submissions will be accepted directly through the IWSLT
submission site (to be announced on the website <https://iwslt.org/2025/>).
We will also accept commitments of submissions with reviews from the ACL
Rolling Review.
Additionally, to foster cross-pollination of ideas, the conference also
invites the presentation of papers on speech translation recently published
elsewhere. Please note that this is for non-archival presentation of papers
relevant to speech translation already published in other venues (e.g.,
Findings for the *ACL, speech, NLP or MT conferences). Submissions for this
category will be accepted through a dedicated form (to be announced on the
website <https://iwslt.org/2025/>). Papers will be checked for relevance to
IWSLT, and assigned either oral or poster presentation slots if selected.
Contact
Please email iwslt-evaluation-campaign(a)googlegroups.com if you have any
questions related to the shared tasks.
Thanks,
Marine, Marcello, Alex, Jan, Sebastian, Elizabeth, Atul
(IWSLT organisers)
Apologies for cross-posting.
---------------------------------------------------------------------------
*The Eighth Workshop on Technologies for Machine Translation of
Low-Resource Languages (LoResMT 2025)*
*https://www.loresmt.org/*
*@ NAACL 2025 (May 3–4, 2025)*
*Albuquerque, New Mexico, U.S.A.*
*SUBMISSION*
*https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT*
*TIMELINE*
*Paper submission due:* January 30, 2025 (Anywhere on Earth)
*Pre-reviewed (ARR) submission deadline:* February 20, 2025
*Notification of acceptance:* March 1, 2025
*Camera-ready papers due:* March 10, 2025 (Anywhere on Earth)
*Pre-recorded video due (hard deadline):* April 8, 2025
*Workshop dates at NAACL 2025:* May 3–4, 2025
*SCOPE*
Based on the success of past low-resource machine translation (MT)
workshops at AMTA 2018, MT Summit 2019, AACL-IJCNLP 2020, AMTA 2021, COLING
2022, EACL 2023 and ACL 2024, we introduce the LoResMT 2025 workshop at NAACL
2025. The workshop provides a discussion panel for researchers working on
MT systems/methods for low-resource and under-represented languages in
general. We would like to help review/overview the state of MT for
low-resource languages and define the most important directions. We also
solicit papers dedicated to supplementary NLP tools that are used in any
language and especially in low-resource languages. Overview papers of these
NLP tools are very welcome. It will be beneficial if the evaluations of
these tools in research papers include their impact on the quality of MT
output.
*TOPICS*
We are highly interested in (1) original research papers, (2)
review/opinion papers, and (3) online systems on the topics below; however,
we welcome all novel ideas that cover research on low-resource languages.
- Neural machine translation (NMT) for low-resource languages
- Use of LLMs (large language models) for low-resource MT systems
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation
*SUBMISSION INFORMATION*
We are soliciting two types of submissions: (1) research, review, and
position papers and (2) system demonstration papers. For research, review
and position papers, the length of each paper should be at least four (4)
and not exceed eight (8) pages, plus unlimited pages for references. For
system demonstration papers, the limit is four (4) pages. Submissions
should be formatted according to the official ACL style templates
(Overleaf). Please refer to the NAACL submission guideline for further
information <https://2025.naacl.org/calls/papers/#paper-submission-details>.
Accepted papers will be published in the ACL Anthology as part of NAACL 2025
and will be presented at the conference.
Submissions must be anonymized and should be done using the provided
submission system. Scientific papers that have been or will be submitted to
other venues must be declared as such and must be withdrawn from the other
venues if accepted and published at LoResMT. The review will be
double-blind. Authors of an accepted paper should present their paper in
person at NAACL 2025. Papers should be submitted in PDF to the LoResMT
OpenReview page
<https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/LoResMT>.
We would like to encourage authors to cite papers written in ANY language
that are related to the topics, as long as both original bibliographic
items and their corresponding English translations are provided.
Registration is handled by the main conference (https://2025.naacl.org/).
*ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)*
Atul Kr. Ojha, University of Galway
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jade Abbott, Retro Rabbit
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Varvara Logacheva, Skolkovo Institute of Science and Technology
Xiaobing Zhao, Minzu University of China
*PROGRAM COMMITTEE (LISTED ALPHABETICALLY)*
Abigail Walsh, ADAPT Centre, Dublin City University, Ireland
Alberto Poncelas, Rakuten, Singapore
Ali Hatami, University of Galway
Alina Karakanta, Fondazione Bruno Kessler (FBK), University of Trento
Anna Currey, AWS AI Labs
Aswarth Abhilash Dara, Walmart Global Technology
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Bogdan Babych, Heidelberg University
Chao-Hong Liu, Potamu Research Ltd
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Dana Moukheiber, Massachusetts Institute of Technology
Ekaterina Vylomova, University of Melbourne, Australia
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Flammie Pirinen, UiT Norgga árktalaš universitehta
Gaurav Negi, University of Galway
Jinliang Lu, Institute of Automation, Chinese Academy of Sciences
John Philip McCrae, University of Galway
Jonathan Washington, Swarthmore College
Koel Dutta Chowdhury, Saarland University
Majid Latifi, UPC University
Maria Art Antonette Clariño, University of the Philippines Los Baños
Milind Agarwal, George Mason University
Mathias Müller, University of Zurich
Nathaniel Oco, De La Salle University
Pavel Rychlý, Masaryk University and Lexical Computing
Pengwei Li, Meta
Rashid Ahmad, International Institute of Information Technology, Hyderabad
Rico Sennrich, University of Zurich
Santanu Pal, Wipro
Sangjee Dondrub, Qinghai Normal University
Sardana Ivanova, University of Helsinki
Sourabrata Mukherjee, Charles University
Thepchai Supnithi, National Electronics and Computer Technology Center
Timothee Mickus, University of Helsinki
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Wen Lai, LMU Munich
Xuebo Liu, Harbin Institute of Technology, Shenzhen
Yalemisew Abgaz, Dublin City University
Yasmin Moslem, Bering Lab
Zhanibek Kozhirbayev, National Laboratory Astana, Nazarbayev University
*CONTACT*
Please email loresmt(a)googlegroups.com if you have any
questions/comments/suggestions.
Second Call for Workshop Proposals
Deadline: Jan 31
16th International Conference on Computational Semantics (IWCS)
Heinrich Heine University Düsseldorf, Germany
22-24 September 2025
https://iwcs2025.github.io/
IWCS is a biennial conference on computational semantics. This year's
edition is organized by Heinrich Heine University Düsseldorf. The
conference is endorsed by SIGSEM, the ACL Special Interest Group on
Computational Semantics.
The aim of IWCS is to bring together researchers interested in any aspects
of the computation, annotation, extraction, representation, and learning of
meaning in natural language, whether this is from a lexical or structural
semantic perspective. IWCS embraces both symbolic and machine learning
approaches to computational semantics, and everything in between. The
conference and workshops will take place 22-24 September 2025.
=== WORKSHOP PROPOSALS ===
We invite proposals for workshops to be held in conjunction with IWCS 2025.
Accepted workshops will have the option to publish their proceedings in the
ACL Anthology.
We solicit proposals in all areas of computational semantics, in other
words all computational aspects of meaning of natural language within
written, spoken, signed, or multi-modal communication. Workshops are
invited on these closely related areas, including the following:
* design of meaning representations
* syntax-semantics interface
* representing and resolving semantic ambiguity
* shallow and deep semantic processing and reasoning
* hybrid symbolic and statistical approaches to semantics
* distributional semantics
* alternative approaches to compositional semantics
* inference methods for computational semantics
* recognizing textual entailment
* learning by reading
* methodologies and practices for semantic annotation
* machine learning of semantic structures
* probabilistic computational semantics
* neural semantic parsing
* computing meaning with large language models
* computational aspects of lexical semantics
* semantics and ontologies
* semantic web and natural language processing
* semantic aspects of language generation
* generating from meaning representations
* semantic relations in discourse and dialogue
* semantics and pragmatics of dialogue acts
* multimodal and grounded approaches to computing meaning
* semantics-pragmatics interface
* applications of computational semantics
=== FINANCES ===
Workshops must cover their own costs for invited speakers as well as
organizers' traveling costs.
=== SUBMISSION INFORMATION ===
Proposals for workshops should contain:
* A title and brief (max two pages) description of the workshop topic and
content;
* The names, affiliations and email addresses of the organisers;
* An estimate of the expected audience size;
* If the workshop has been held before, a note specifying where previous
workshops were held, how many submissions the workshop received, how many
papers were accepted and how many attendees the workshop attracted;
* Whether you plan a half-day or full-day workshop;
* Whether or not the workshop proceedings should be published in the ACL
Anthology.
Proposals should be submitted on OpenReview:
https://openreview.net/group?id=aclweb.org/SIGSEM/IWCS/2025/Workshop_Propos…
The person submitting the proposal will need an OpenReview account. Please
note OpenReview's moderation policy, where newly created accounts with an
institutional email address are approved automatically, but other email
addresses can take up to two weeks to approve.
=== IMPORTANT DATES ===
31 January 2025 Workshop proposal submissions due
07 February 2025 Workshop proposal notification of acceptance
24 September 2025 Workshop date
=== CONTACT ===
For questions, contact: iwcs2025-program-chairs(a)uni-duesseldorf.de
Kilian Evang, Laura Kallmeyer, Sylvain Pogodalla (the IWCS 2025 program
chairs)
--
Dr. Kilian Evang · Institut für Linguistik · Heinrich-Heine-Universität
Düsseldorf
Universitätsstr. 1 · 40225 Düsseldorf, Germany · https://kilian.evang.name
Humans, Machines, Language
Annual conference
University of Granada, Spain
24-25 June 2025
https://sites.google.com/view/humans-machines-language/events/2025-conferen…
We welcome everyone interested in the impact of new and emerging
language technologies that integrate with human senses. Whether you are
a tech developer who wants to learn more about linguistics, or a
linguist who wants to know more about tech, we want to hear from you!
HuMaLa leads on from the COST Action 'Language In The Human-Machine Era'
(https://lithme.eu [1]); you can find out more about our core themes of
interest from the LITHME forecast report
(https://doi.org/10.17011/jyx/reports/20210518/1 [2]) and animations
(https://lithme.eu/animations [3]).
HuMaLa's inaugural conference will be held at the University of Granada,
Spain, on 24-25 June 2025. The conference theme is:
'Humanistic insights for human-machine language technologies: privacy,
security, and wellbeing'
This echoes the priorities of the EU's recently introduced AI Act:
"human oversight, safety, privacy, transparency, non-discrimination and
social and environmental wellbeing"
(https://www.europarl.europa.eu/news/en/press-room/20230609IPR96212/
[4]). We hope to explore these timely topics from a range of humanistic
perspectives, with a focus on human-machine language technologies. We
welcome researchers and developers from computer science, linguistics,
sociology, education, and more. To understand the more general scope of
the conference, again, see the LITHME forecast report [2] and animations
[3].
In addition to technical work (e.g., model description or dataset), we
also welcome theoretical and empirical studies on the ethical, legal,
cultural and social implications of language technology adoption across
these domains.
Presentation format: Talk (20 mins) or Poster, non-archival
Presentations can address any of the topics that fall within the
interests of HuMaLa. Selection for places will be made by the conference
scientific committee.
We encourage early career applicants to read a guide on abstract
writing, for example:
https://info.lse.ac.uk/current-students/student-futures/how-to-write-an-abs…
[5]. Senior colleagues are used to all this, and are therefore at a
somewhat unfair advantage. We hope the above guide (and others like it)
will help early career applicants to craft their abstract more
precisely.
Abstract submission deadline: Friday 31 January 2025, 12:00 (noon) GMT
Website:
https://sites.google.com/view/humans-machines-language/events/2025-conferen…
We are looking forward to seeing you in Granada.
Links:
------
[1] https://lithme.eu/
[2] https://doi.org/10.17011/jyx/reports/20210518/1
[3] https://lithme.eu/animations
[4] https://www.europarl.europa.eu/news/en/press-room/20230609IPR96212/
[5]
https://info.lse.ac.uk/current-students/student-futures/how-to-write-an-abs…
Apologies for cross-posting.
We are pleased to announce the *GermEval Shared Task on Candy Speech Detection („Flausch-Erkennung“)*
This is the first call to participate in the shared task on candy speech detection („Flausch-Erkennung“).
We invite everyone from academia and industry to participate in the shared task.
The workshop discussing the results of this shared task is planned to be held in conjunction with the Conference on Natural Language Processing (KONVENS) in September 2025.
*Introduction*
Numerous methods have been developed for detecting and censoring negative speech (e.g., hate speech or offensive or harmful language) on social media platforms. However, there is much less focus on identifying and promoting positive supportive discourse in online communities. Our shared task aims to address this gap and encourage researchers to focus on such positive expressions.
The task is to identify expressions of candy speech (Flausch) in online posts (YouTube comments). We define candy speech as an expression of positive attitudes in social media toward individuals or their output (videos, comments, etc.). The purpose of candy speech is to encourage, cheer up, support and empower others. It can be viewed as the counterpart to hate speech, as it also aims to influence the self-image of the target person or group, but in a positive way.
*Data*
We will provide participants with annotated training (and development) data and unlabeled test data containing complete written, German-language comment threads under YouTube videos posted by different content creators. The content creators and communities vary in topic, style, age group, etc. The test data and training data do not overlap with respect to the original content creator of the video; the communities commenting on the videos can therefore be expected to differ.
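For participants who want to mirror this setup when carving out their own validation set, a creator-disjoint split could be sketched roughly as follows. This is only an illustration: the field names (`creator`, `text`), the channel names and the toy comments are hypothetical and do not reflect the official data format.

```python
def split_by_creator(comments, held_out_creators):
    """Partition comments so that no content creator appears in both splits.

    `comments` is a list of dicts with a hypothetical "creator" field;
    `held_out_creators` is the set of creators reserved for evaluation.
    """
    train, held_out = [], []
    for comment in comments:
        if comment["creator"] in held_out_creators:
            held_out.append(comment)
        else:
            train.append(comment)
    return train, held_out


# Toy data, invented for illustration only.
comments = [
    {"creator": "channel_a", "text": "Tolles Video!"},
    {"creator": "channel_b", "text": "Danke für die Erklärung."},
    {"creator": "channel_a", "text": "Weiter so!"},
    {"creator": "channel_c", "text": "Super gemacht."},
]

train, held_out = split_by_creator(comments, held_out_creators={"channel_c"})

# The two splits share no creator, mirroring the train/test design above.
train_creators = {c["creator"] for c in train}
held_out_seen = {c["creator"] for c in held_out}
```

Holding out whole creators rather than random comments gives a validation score that is closer to what can be expected on unseen communities.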
*Task Details*
Candy speech detection is the task of identifying the presence of candy speech (at the span level) in a given YouTube comment thread and classifying each expression into one of the predefined categories. This shared task focuses on German-speaking YouTube communities. Participants will be provided with a dataset of YouTube comments manually annotated for different types of candy speech expressions.
We offer the following two subtasks. Participants in this year's shared task may choose to participate in either subtask:
Subtask 1: Coarse-Grained Classification
The goal of this subtask is to identify whether the given comment contains candy speech ("Flausch") or not. The dataset is manually annotated for the presence of candy speech.
Subtask 2: Fine-Grained Classification
The goal of this subtask is to identify the span of each candy speech expression in a given text and classify it into one of the predefined categories. The dataset is manually annotated for 10 different types of candy speech expressions, such as “positive feedback”, “compliment”, “group membership”, etc.
More details on the subtasks (including examples) can be found at the website of the shared task (see link below).
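As a rough illustration of how the two subtasks relate, the Python sketch below assumes a hypothetical in-memory representation of span annotations. The `Span` class, its field names, the character-offset convention and the example comment are all assumptions made for illustration; they are not the official data format, which is described on the shared task website. Under these assumptions, a Subtask 1 style binary label can be derived from Subtask 2 style spans.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Span:
    """Hypothetical span annotation (not the official format)."""
    start: int   # character offset where the expression begins (inclusive, assumed)
    end: int     # character offset where the expression ends (exclusive, assumed)
    label: str   # one of the predefined candy speech categories


def contains_candy_speech(spans: List[Span]) -> bool:
    """Coarse-grained view: a comment counts as candy speech if it has at least one span."""
    return len(spans) > 0


def span_texts(comment: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Fine-grained view: return (expression text, category) for each annotated span."""
    return [(comment[s.start:s.end], s.label) for s in spans]


# Invented example comment, for illustration only.
comment = "Tolles Video, danke dir!"
spans = [Span(start=0, end=12, label="positive feedback")]

print(contains_candy_speech(spans))
print(span_texts(comment, spans))
```

The sketch only shows the relationship between the two annotation levels; it says nothing about how spans should be predicted or how submissions are scored.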
*Important dates*
Trial data available: February 15, 2025
Training data available: March 3, 2025
Test data available: May 17, 2025
Evaluation start: June 16, 2025
Evaluation end: June 27, 2025
Paper submission due: July 11, 2025
Camera ready due: August 15, 2025
GermEval workshop: September 8 or 12, 2025 (co-located with KONVENS)
*Website*
https://yuliacl.github.io/GermEval2025-Flausch-Erkennung/
*GermEval*
GermEval is a series of shared task evaluation campaigns that focus on Natural Language Processing for the German language. GermEval has been conducted regularly since 2014 in co-location with KONVENS/GSCL conferences:
https://germeval.github.io/tasks/
*Contact email*
Please send any enquiry to the following email address:
germeval-2025-candy-speech(a)ruhr-uni-bochum.de
Best regards,
Yulia Clausen, Ruhr-Universität Bochum, Germany
Tatjana Scheffler, Ruhr-Universität Bochum, Germany
Michael Wiegand, Universität Wien, Austria
Second Workshop on Patient-Oriented Language Processing (CL4Health) @ NAACL 2025
https://bionlp.nlm.nih.gov/cl4health2025/
Albuquerque, New Mexico, USA
SCOPE
CL4Health fills the gap among the different biomedical language processing workshops by providing a general venue for a broad spectrum of patient-oriented language processing research. The second workshop on patient-oriented language processing follows the successful inaugural CL4Health workshop (co-located with LREC-COLING 2024), which clearly demonstrated the need for a computational linguistics venue that focuses on language related to health of the public.
CL4Health is concerned with the resources, computational approaches, and behavioral and socio-economic aspects of the public's interactions with digital resources in search of health-related information that satisfies their information needs and guides their actions. The workshop invites papers concerning all areas of language processing focused on patients' health and health-related issues concerning the public. The issues include, but are not limited to: accessibility and trustworthiness of health information provided to the public; explainable and evidence-supported answers to consumer-health questions; accurate summarization of patients' health records at their health-literacy level; understanding patients' non-informational needs through their language; and accurate and accessible interpretations of biomedical research. The topics of interest for the workshop include, but are not limited to, the following:
* Health-related information needs and online behaviors of the public;
* Quality assurance and ethics considerations in language technologies and approaches applied to text and other modalities for public consumption;
* Summarization of data from electronic health records for patients;
* Detection of misinformation in consumer health-related resources and mitigation of potential harms;
* Consumer health question answering (Community Question Answering, CQA);
* Biomedical text simplification/adaptation;
* Dialogue systems to support patients' interactions with clinicians, healthcare systems, and online resources;
* Linguistic resources, data and tools for language technologies focusing on consumer health;
* Infrastructures and pre-trained language models for consumer health
SHARED TASK
Perspective-aware Healthcare Answer Summarization (PerAnsSumm) will be co-located with the workshop.
In community / consumer health question answering, several aspects, such as question understanding and answer generation, have been studied for over a decade. A new and important question posed by this task concerns the different perspectives provided in the answers to questions posted to online forums. The responses to the questions offer different answer perspectives, e.g., personal experiences, factual information, and suggestions. Traditionally, the CQA answer summarization task has focused on a single best-voted answer as a reference summary. A single answer does not capture all the perspectives. Moreover, a structured presentation of the information in the form of perspective-specific summaries may be more useful for the end-users. To address these gaps, this challenge introduces a novel perspective-specific answer summarization task within a CQA setup. The task will use the Perspective-aware healthcare Answer SuMmarizAtion (PUMA) dataset, a corpus of medical question-answer pairs created by the task organizers. The PUMA dataset consists of 3,167 CQA threads with approximately 10K answers filtered from the Yahoo! L6 corpus. Each answer in PUMA is annotated with five perspective spans: ‘cause’, ‘suggestion’, ‘experience’, ‘question’, and ‘information’.
Further details about the shared task are available at: https://peranssumm.github.io/
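To make the idea of perspective-specific summaries more concrete, the Python sketch below groups annotated spans from all answers in one thread by perspective. This covers only the grouping step that could feed a per-perspective summarizer, not a summarization method. The input structure (a list of answers, each a list of `(perspective, text)` pairs), the function name and the example texts are assumptions for illustration and do not reflect the actual PUMA file format; only the five perspective names come from the task description.

```python
from typing import Dict, List, Tuple

# The five perspective labels named in the task description.
PERSPECTIVES = ("cause", "suggestion", "experience", "question", "information")

# Hypothetical representation: one answer = list of (perspective, span text) pairs.
Answer = List[Tuple[str, str]]


def group_spans_by_perspective(answers: List[Answer]) -> Dict[str, List[str]]:
    """Collect span texts from all answers in a thread, keyed by perspective."""
    grouped: Dict[str, List[str]] = {p: [] for p in PERSPECTIVES}
    for answer in answers:
        for perspective, text in answer:
            if perspective not in grouped:
                raise ValueError(f"unknown perspective: {perspective!r}")
            grouped[perspective].append(text)
    return grouped


# Invented example thread, for illustration only.
thread_answers = [
    [("experience", "I had the same headaches last year."),
     ("suggestion", "Try drinking more water.")],
    [("information", "Dehydration is a common headache trigger."),
     ("suggestion", "See a doctor if it persists.")],
]

grouped = group_spans_by_perspective(thread_answers)
for perspective in PERSPECTIVES:
    print(perspective, grouped[perspective])
```

Each non-empty group could then be passed to a summarizer of the participant's choice to produce one summary per perspective.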
IMPORTANT DATES
(Tentative)
January 30, 2025 - Workshop paper due date
March 1, 2025 - Notification of acceptance
March 10, 2025 - Camera-ready papers due
April 8, 2025 - Pre-recorded video due (hard deadline)
May 3 OR 4, 2025 - Workshop
SUBMISSIONS
Two types of submissions are invited:
- Full papers: should not exceed eight (8) pages of text, plus unlimited references. These are intended to be reports of original research.
- Short papers: may consist of up to four (4) pages of content, plus unlimited references. Appropriate short paper topics include preliminary results, application notes, descriptions of work in progress, etc.
Electronic Submission: Submissions must be electronic and in PDF format, using the Softconf START conference management system. Submissions need to be anonymous.
Submission site: https://softconf.com/naacl2025/cl4health2025
Dual submission policy: papers may NOT be submitted to the workshop if they are or will be concurrently submitted to another meeting or publication.
MEETING
The workshop will be hybrid. Virtual attendees must be registered for the workshop to access the online environment.
Accepted papers will be presented as posters or oral presentations based on the reviewers’ recommendations.
ORGANIZERS
- Dina Demner-Fushman, US National Library of Medicine
- Sophia Ananiadou, National Centre for Text Mining and University of Manchester, UK
- Paul Thompson, National Centre for Text Mining and University of Manchester, UK
- Deepak Gupta, US National Library of Medicine
--
Paul Thompson
Research Fellow
Department of Computer Science
National Centre for Text Mining
Manchester Institute of Biotechnology
University of Manchester
131 Princess Street
Manchester
M1 7DN
UK
http://personalpages.manchester.ac.uk/staff/Paul.Thompson/
Dear all,
The newly established research group on Natural Language Processing at the
University of Marburg is seeking applications for a position as Doctoral
Researcher in one of the research areas of the group, which include: Methods
and Applications of Natural Language Processing, Perspectivism and
Disagreement in NLP, AI for Social Good, Legal Tech and NLP Evaluation.
The position is offered for a period of 3 years. The starting date is as
soon as possible. The position is full-time with salary and benefits
commensurate with a public service position in the state of Hesse, Germany
(TV-H E 13).
Application deadline is the 19th of January. For more information and to
apply please visit:
https://stellenangebote.uni-marburg.de/jobposting/b26cbcb09d3e6c83dbdbab7def555c7ec1843b040
Regards
Daniel
Dear all,
HITS is looking for a two-year
Postdoctoral Researcher in Natural Language Processing (m/f/x) to perform research in multilingual coreference resolution.
Application deadline: January 15th, 2025. Starting date (negotiable): March 1st, 2025.
For details, please see
https://www.h-its.org/hits-job/postdoctoral-researcher-in-natural-language-…
If you have further questions please don't hesitate to contact Michael Strube at michael.strube(a)h-its.org.
With best regards,
Michael Strube
--
Michael Strube
NLP Group
HITS gGmbH
Schloss-Wolfsbrunnenweg 35
69118 Heidelberg, Germany
http://www.h-its.org/nlp
We are seeking applications for a fully-funded one-year Research Assistant position in Computational Linguistics, focusing on developing Argumentation Knowledge Graphs for advanced search engines. The project aims to create structured, multi-perspective knowledge graphs to enhance search engines with reliable, balanced, and credible content, addressing challenges like information overload and misinformation. Conducted in collaboration with OpenWebSearch.EU, the project provides access to high-quality open data and enables integration into search interfaces, delivering trustworthy, diverse perspectives to support well-informed decision-making.
https://www.rug.nl/about-ug/work-with-us/job-opportunities/?details=00347-0…
[Apologies for cross-posting]
********************************************************************
CALL FOR PAPERS
ACM TSWW 2025
Towards a Safer Web for Women - First International Workshop on Protecting Women Online
co-located with
The Web Conference 2025
Sydney, Australia
28 April - 2 May 2025
https://tsww25.github.io/
********************************************************************
EXTENDED DEADLINES (all deadlines are AoE)
********************************************************************
21st January 2025 (extended from 22nd December 2024): Workshop paper submission deadline
27th January 2025: Notification of acceptance
********************************************************************
SCOPE AND OVERVIEW
__________________
The workshop is dedicated to addressing the pressing issue of online violence against women by fostering dialogue and innovation. The workshop will explore global challenges and solutions for gender-based violence and the impact of online harms on women, among others. We aim to encourage the development of technological and interdisciplinary frameworks and innovations to ensure women's online safety.
The workshop aims to review progress in approaches combating online violence against women, identify persistent barriers, and propose solutions to emerging challenges. Topics of interest include, but are not limited to:
* Detection and prevention of gender-based online violence (e.g., harassment, stalking, cyberbullying)
* Sentiment and emotion analysis in abusive or harmful online interactions towards women
* Gender bias identification and mitigation in AI
* Human-centered approaches for online safety applications
* Approaches to preventing, understanding, identifying and mitigating online harms faced by women with multiple marginalised identities (e.g., misogynoir, LGBTQ+ women, or women from religious or cultural minorities)
* Analysis of tracking devices, surveillance tools, and hidden cameras misused against women
* Detection and mitigation of non-consensual deepfake generation and dissemination
* Interdisciplinary approaches to identifying and addressing online harm
* Legal and ethical frameworks for protecting women online
* Psychological, social, and legal impacts of online technology when used for gender-based abuse
PAPER FORMAT AND SUBMISSION INSTRUCTIONS
________________________________________
We welcome both new and recent research, including non-archival submissions to showcase work published elsewhere, if it is especially relevant to the workshop's theme. Accepted formats include:
* Long papers: Maximum 8 pages (excluding references)
* Short papers: Maximum 4 pages (excluding references)
* Position, idea, and emerging problem papers: Maximum 4 pages (excluding references)
* Non-archival submissions: Up to 2 pages (excluding references)
All papers should be submitted via Easychair: https://easychair.org/conferences/?conf=tsww25
For full details, visit our Call for Papers page.
At least one author of each accepted workshop paper has to register. Workshop attendance is granted only to registered participants. Accepted papers (except for non-archival submissions) will be included in the workshop proceedings, which will be published as companion proceedings of The Web Conference, and indexed according to the main conference policy.
ORGANISING COMMITTEE
____________________
Workshop chairs:
* Ángel Pavón Pérez, The Open University
* Miriam Fernandez, The Open University
* Tracie Farrell, The Open University
* Debora Nozza, Bocconi University
* Christine de Kock, University of Melbourne