The 4th International Conference on Natural Language Processing for Digital Humanities will co-locate with EMNLP in Miami, USA!
The proceedings will be published in the ACL Anthology. The event will take place on November 15-16, 2024.
https://www.nlp4dh.com/nlp4dh-2024
Submission deadline: September 1, 2024
The focus of NLP4DH is on applying natural language processing techniques to digital humanities research. Topics can be anything of digital humanities interest with a natural language processing or generation aspect. Suitable NLP4DH topics include, but are not limited to:
- Text analysis and processing related to humanities using computational methods
- Dataset creation and curation for NLP (e.g. digitization, digitalization, datafication, and data preservation)
- Research on cultural heritage collections such as national archives and libraries using NLP
- NLP for error detection, correction, normalization and denoising of data
- Generation and analysis of literary works such as poetry and novels
- Analysis and detection of text genres
Short papers can be up to 4 pages in length. Short papers can report on work in progress or a more targeted contribution such as software or partial results.
Long papers can be up to 8 pages in length. Long papers should report on previously unpublished, completed, original work.
Lightning talks can be submitted as 750-word abstracts. Lightning talks are suited for discussing ideas or presenting work in progress. Lightning talks will be published in lightning proceedings on Zenodo.
Accepted papers (short and long) will be published in the proceedings that will appear in the ACL Anthology. Accepted papers will also be given an additional page to address the reviewers’ comments. The length of a camera ready submission can then be 5 pages for a short paper and 9 for a long paper with an unlimited number of pages for references.
The authors of the accepted papers will be invited to submit an extended version of their paper to a special issue in the Journal of Data Mining & Digital Humanities <https://jdmdh.episciences.org/volume/view/id/593>.
Important dates
- Paper submission (full and short): September 1, 2024
- Notification of acceptance: September 22, 2024
- Camera ready deadline: October 4, 2024
- Conference: November 15-16, 2024
Job Opening for Data Scientist with a focus on natural language
processing
Application link: https://bit.ly/4fqI2Sl
Application deadline: 31 August 2024
The South African Centre for Digital Language Resources (SADiLaR) is
looking for a data scientist with a focus on natural language
processing (permanent position). As a Data Scientist at the South
African Centre for Digital Language Resources (SADiLaR) you will have
the opportunity to initiate and lead projects focusing on Human
Language Technology and Digital Humanities stemming from your own
research interests. You will work closely together with a team of
researchers as part of SADiLaR's extended network, both on your own and
commissioned projects. Dissemination of project results at national
and international conferences will be encouraged and supported. This
position is crucial for research and development in Human Language
Technology and Digital Humanities, fields that form the essence of
SADiLaR, which is a national Research Infrastructure supported by the
Department of Science and Innovation. Read more about SADiLaR at
https://www.sadilar.org.
Key responsibilities:
- Research: Research in the area of Human Language Technology and
Digital Humanities.
- Project work: Initiating and contributing to Human Language
Technology and Digital Humanities projects.
- Teaching: Teaching in the area of Human Language Technology and
Digital Humanities.
- Mentorship: Mentorship of researchers in the field of Human Language
Technology and Digital Humanities.
Minimum requirements:
- A PhD (NQF level 10) in one of the following fields: Computational
Linguistics, Natural Language Processing, Human Language Technology,
Digital Humanities, Data Science, Computer Science, Information
Technology, Artificial Intelligence, or related fields. The PhD should
have a focus on computational aspects of linguistics.
- A minimum of (five) 5 years' experience in the use of Python (other
programming languages used within the computational linguistics or
Digital Humanities domain can also be considered).
- Evidence of peer-reviewed academic publications.
- A minimum of (three) 3 years' experience as a supervisor/co-
supervisor of students or playing a mentorship/supervising role for
individuals.
- A minimum of (three) 3 years' experience with using and/or developing
computational tools.
- A minimum of (three) 3 years' experience related to research within
the domain of Language Technology or Digital Humanities.
- A minimum of (one) 1 year's experience related to teaching or training
within the domain of Language Technology or Digital Humanities.
More information can be found at the application link.
For informal inquiries please contact: Menno van Zaanen
<menno.vanzaanen(a)nwu.ac.za>
Application link: https://bit.ly/4fqI2Sl
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
________________________________
Background and Scope
---------------------
While interest in automatic approaches to counterspeech generation has been steadily growing, including studies on data curation (Chung et al., 2019a; Fanton et al., 2021), detection (Chung et al., 2021a; Mathew et al., 2018), and generation (Tekiroglu et al., 2020; Chung et al., 2021b; Zhu and Bhat, 2021; Tekiroglu et al., 2022), the large majority of the published experimental work on automatic counterspeech generation has been carried out for English. This is due both to the scarcity of manually curated non-English training data and to the overwhelming predominance of English in the generative Large Language Model (LLM) ecosystem. A workshop on Multilingual Counterspeech Generation is proposed to promote and encourage research on multilingual approaches to this challenging topic.
Thus, this workshop aims to test monolingual and multilingual LLMs in particular and Language Technology in general to automatically generate counterspeech not only in English but also in languages with fewer resources. In this sense, an important goal of the workshop will be to understand the impact of using LLMs, considering for example how to deal with pressing issues such as biases, hallucinated content, data scarcity or data contamination.
We seek to maximize the scientific and social impact of this workshop by promoting the creation of a community of researchers from diverse fields, such as computer and social sciences, as well as policy makers and other stakeholders interested in automatic counterspeech generation. By doing so we aim to gain a deeper understanding of how counterspeech is currently used to tackle abuse by individuals, activists, and organizations and how Natural Language Processing (NLP) and Generation (NLG) may be best applied to counteract it.
Call for Papers
---------------------
We welcome submissions on topics including, but not limited to:
- Models and methods for generating counterspeech in different languages.
- Automatic counterspeech generation for low-resource languages with scarce training data.
- Dialogue agents that use counterspeech to combat offensive messages that are directed to individuals or groups, targeted based on various aspects such as ideology, gender, sexual orientation and religion.
- Methods for human and automatic evaluation of counterspeech.
- Multidisciplinary studies providing different perspectives on the topic such as computer science, social science, psychology, etc.
- Development of taxonomies and quality datasets for counterspeech in multiple languages.
- Potentials and limitations (e.g., fairness, biases, hallucinated content) of applying different NLP methods, such as LLMs, to generate counterspeech.
- Social impact and empirical studies of counterspeech in social networks, including research on the effectiveness and consequences for users of using counterspeech to combat hate online.
Submission
---------------------
We welcome two types of papers: regular workshop papers and non-archival submissions. Regular workshop papers will be included in the workshop proceedings. All submissions must be in PDF format and made through START [https://softconf.com/coling2025/MCG25/].
- Regular workshop papers: Authors can submit papers up to 8 pages, with unlimited pages for references. Authors may separately submit up to 100 MB of supplementary materials, as well as their code for reproducibility. All submissions undergo a double-blind, single-track review. Accepted papers will be presented as posters with the possibility of oral presentations.
- Non-archival submissions: Cross-submissions are welcome. Accepted papers will be presented at the workshop, but will not be included in the workshop proceedings. Papers must be in PDF format and will be reviewed in a double-blind fashion by workshop reviewers. We also welcome extended abstracts (up to 2 pages) of papers that are work in progress, under review or to be submitted to other venues. Papers in this category need to follow the COLING format.
Important Dates
---------------------
- Submission: November 20th, 2024
- Notification of Acceptance: December 2nd, 2024
- Camera-Ready Papers Due: December 10th, 2024
-----------------------------------------------------
Shared Task on Multilingual Counterspeech Generation
-----------------------------------------------------
In addition to paper contributions, we are organizing a shared task on multilingual counterspeech generation with the aim of sharing current efforts in a central space, especially those for languages other than English.
We envisage that the shared task will allow the community to study how to improve counterspeech generation for lower-resource languages while also reinforcing the strong body of research that already exists for English.
The counterspeech generated by participants should be respectful, non-offensive, and contain information that is specific and truthful with respect to the following targets: Jews, LGBT+, immigrants, people of color, women.
Data
---------------------
We release new data consisting of 597 Hate Speech-Counter Narrative (HS-CN) pairs. In this dataset, the HS are taken from MTCONAN [https://github.com/marcoguerini/CONAN/tree/master/Multitarget-CONAN], while the CN are newly generated. Together with each HS-CN pair, we also provide 5 background knowledge sentences, some of which are relevant for obtaining the Counter Narratives. The dataset is available in 4 different languages (Basque, English, Italian and Spanish) and divided into the following splits:
- Development: 100 pairs. [AVAILABLE NOW!] [https://huggingface.co/datasets/LanD-FBK/ML_MTCONAN_KN]
- Train: 397 pairs [AVAILABLE in OCTOBER!]
- Test: 100 pairs [TBA]
In order to score the shared task participants, the CNs will be kept hidden during the shared task while the HS and the background knowledge will be released for participants to prepare their submissions.
The languages (Basque, English, Italian and Spanish) offer a varied spectrum of complexity, including an agglutinative language isolate (Basque), two Romance languages (Italian, Spanish) and a Germanic one (English). The choice of languages reflects the linguistic expertise available among the organizers, which is required to run the shared task successfully.
Additional Multilingual Data
---------------------
Participants also have available the English, Italian, Basque and Spanish CONAN manually curated data:
- CONAN (English and Italian) [https://github.com/marcoguerini/CONAN/tree/master/CONAN]
- CONAN (Basque and Spanish) [https://huggingface.co/datasets/HiTZ/CONAN-EUS]
- CONAN-MT-SP (Spanish) [https://github.com/sinai-uja/CONAN-MT-SP] or [https://huggingface.co/datasets/SINAI/CONAN-MT-SP]
- Multitarget-CONAN (English) [https://github.com/marcoguerini/CONAN/tree/master/Multitarget-CONAN]
Participation
---------------------
The aim of the shared task is, given an HS (and optionally any additional knowledge the participants may like to use), to generate a CN that counteracts the HS.
An example CN for each language appears below:
- English:
+ HS: We should ban all homosexuals.
+ CN: When will the love prosper and the hatred start to dissipate? I will not only respect my fellow LGBT+ people, I will promote their rights.
- Italian:
+ HS: Dovremmo espellere tutti gli omosessuali.
+ CN: Quand'è che l'amore prospererà e l'odio inizierà a dissiparsi? Non solo rispetterò i miei compagni LGBT+, ma promuoverò i loro diritti.
- Spanish:
+ HS: Deberíamos prohibir a todos los homosexuales.
+ CN: ¿Cuándo prosperará el amor y empezará a disiparse el odio? No sólo respetaré a mis compañeros LGBT+, sino que promoveré sus derechos.
- Basque:
+ HS: Homosexual guztiak debekatu beharko genituzke.
+ CN: Noiz hasiko da maitasuna irabazten eta gorrotoa desagertzen? LGBT+ pertsonak errespetatzeaz gain, haien eskubideak sustatuko ditut.
Participants will download the test HS for the 4 languages and generate at most three different CNs per HS for each language. The test window will last 5 days.
Participants are allowed to use any resource (language model, data, etc.) to generate the CN.
Evaluation
---------------------
The CNs submitted by the participants will be evaluated:
- Using traditional automatic metrics as in Tekiroğlu et al. (2022), which include BLEU, ROUGE, Novelty and Repetition Rate.
- Using an LLM as a judge, following the approach described in this paper: https://arxiv.org/abs/2406.15227
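For participants who want a rough local sanity check before submitting, the two less familiar metrics can be approximated in a few lines of Python. This is only a simplified sketch and not the organizers' official scorer. It assumes whitespace tokenization, takes Novelty as one minus the highest Jaccard similarity between a generated CN and any training CN, and takes Repetition Rate as the geometric mean, over n-gram orders 1 to 4, of the share of distinct n-grams that occur more than once across all generated CNs. The function names are hypothetical.

```python
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenization (an assumption; official scorers may differ)."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard similarity between two token lists, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def novelty(generated, training):
    """Mean over generated CNs of 1 - max Jaccard similarity to any training CN."""
    train_tokens = [tokens(t) for t in training]
    scores = []
    for g in generated:
        g_tok = tokens(g)
        best = max((jaccard(g_tok, t) for t in train_tokens), default=0.0)
        scores.append(1.0 - best)
    return sum(scores) / len(scores) if scores else 0.0


def repetition_rate(generated, max_n=4):
    """Geometric mean over n = 1..max_n of the share of distinct n-grams seen more than once."""
    product = 1.0
    for n in range(1, max_n + 1):
        counts = Counter()
        for g in generated:
            tok = tokens(g)
            counts.update(tuple(tok[i:i + n]) for i in range(len(tok) - n + 1))
        if not counts:
            # No n-grams of this order exist (texts too short): report 0.0.
            return 0.0
        repeated = sum(1 for c in counts.values() if c > 1)
        product *= repeated / len(counts)
    return product ** (1.0 / max_n)
```

Higher Novelty means the generated CNs overlap less with the training CNs, and lower Repetition Rate means the generated CNs reuse fewer n-grams among themselves. Scores from this sketch are not expected to match the official evaluation.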
Important Dates
---------------------
- Test dataset release: October 21st, 2024
- Results submission: October 25th, 2024
- Results notification: November 10th, 2024
- Working papers submission: November 20th, 2024
- Notification of Acceptance: December 3rd, 2024
- Camera-Ready Papers Due: December 10th, 2024
- Workshop: January 19th, 2025
For more information you can join the Google group [https://groups.google.com/g/multilingual-cs-generation-coling2025] or visit our website [https://sites.google.com/view/multilang-counterspeech-gen/home].
Best regards,
The Multilingual Counterspeech Generation Workshop Organizers.
[apologies for x-posting]
We are looking for a postdoctoral researcher in computational linguistics and/or natural language processing to work on the project "Beyond pixels and words: language technology generation and understanding of spatial language in interaction" (2023-01552) funded by the Swedish Research Council (VR).
Application deadline: August 15, 2024, 23:59 (CEST, UTC+2)
Project description: https://web103.reachmee.com/ext/I005/1035/job?site=7&lang=UK&validator=9b89… (English) and https://web103.reachmee.com/ext/I005/1035/job?site=6&lang=SE&validator=3038… (Swedish) and from my personal page (coming soon)
I am looking for candidates with a strong background in computational linguistics, natural language processing (or neighbouring fields such as computer vision and robotics) and machine learning, ideally with experience of computational semantics, language modelling and working with multi-modal representations.
Best regards,
Simon
—
Simon Dobnik
Professor of Computational Linguistics
CLASP & FLoV, University of Gothenburg
https://www.gu.se/en/about/find-staff/simondobnik
SEARCH SOLUTIONS 2024
Wednesday 27 November 2024, London
Innovations in Search and Information Retrieval
Search Solutions is the BCS Information Retrieval Specialist Group’s (BCS IRSG) annual event focused on practitioner issues in the arena of search and information retrieval (IR). In contrast to other major industry events, Search Solutions aims to be highly interactive, with attendance strictly limited. We bring together practitioners, researchers, analysts and end users to discuss the latest developments in the IR community and to share insights between research and practice. It is a unique opportunity to bring together academic research and practitioner experience.
The Search Solutions event consists of a Tutorial day (26 November) and a Conference day (27 November), each of which has a separate registration.
The main programme includes presentations, panels and keynote talks by influential industry leaders on novel and emerging applications in search and information retrieval.
CONFIRMED SPEAKERS (MORE TO COME)
* Dyaa Albakour, SIGNAL AI, “Mapping the landscape of narratives in Global Media”
* Alessandro Benedetti, Sease, “What I don’t like about RAG: can we do better?”
* Charlie Hull, OpenSource Connections, “Measure and Tune your Search with User Behaviour Insights”
* Maria-Inti Metzendorf, Heinrich Heine University Düsseldorf, “Searching for Living Systematic Reviews – overview and case report”
* Eugene Morozov, ISKO, "Search and Browse Use Cases in Data Governance"
* Daniel Tunkelang, Search Consultant & Aritra Mandal, eBay, "Modeling Queries as Bags of Documents"
We are also looking forward to a lively panel chaired by Michael Upshall entitled “Has Generative AI removed the need for conventional search?”
LOCATION
Search Solutions is organised by the Information Retrieval Specialist Group of the BCS (The Chartered Institute for IT) and ISKO (International Society for Knowledge Organization), and is held at the BCS Central London Office:
BCS, The Chartered Institute for IT
Ground Floor
25 Copthall Avenue
London
EC2R 7BP
https://www.bcs.org/about-us/our-london-office-and-event-venue/
REGISTRATION
Registration fees (including VAT at 20%) for Search Solutions are as follows:
* BCS / ISKO member rate: £92
* Non-member rate: £110
* Students: £80
Registration fees include lunch. Tea and coffee will also be available throughout the day followed by a drinks reception in the evening.
Register here: https://www.bcs.org/events-calendar/2024/november/search-solutions-2024-inf…
TUTORIAL DAY
Search Solutions also includes a Tutorial Programme on Tuesday, 26 November. A detailed programme will be announced in due course. A Call for Tutorials can be found here: https://www.bcs.org/membership-and-registrations/member-communities/informa…
Tutorials are payable separately.
PAST EVENTS
We have been running Search Solutions for more than 15 years now. For details on past events please visit the BCS IRSG Web site at:
https://www.bcs.org/membership-and-registrations/member-communities/informa…
***Apologies for possible cross-posting ***
The two major conferences in the Baltic and Nordic regions, NoDaLiDa, organized by the Northern European Association for Language Technology (NEALT), and Baltic HLT, are joining forces to organize NoDaLiDa/Baltic-HLT 2025 – The Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies, to be held in Tallinn, Estonia, on March 2–5, 2025.
We would like to invite proposals for workshops, to be held on Sunday, March 2, immediately before the main conference, or on Wednesday, March 5, immediately after the main conference. Workshops can be scheduled either for a full day (morning and afternoon) or for half a day. The main conference will be held on-site only, without an online option, in order to facilitate networking. Workshops are free to offer online presentations if they wish to do so.
NoDaLiDa/Baltic-HLT addresses all aspects of natural language processing, speech processing, and computational linguistics, including work in closely related neighboring disciplines (such as, for example, machine learning, linguistics, digital humanities, or psychology) that is sufficiently formalized or applied to bear relevance to speech and language technologies.
Workshop proposals can be submitted in free-text form as a PDF file, by email to ‘nodalida_baltichlt_2025-workshops(a)googlegroups.com’. Workshop proposals must include adequate information on at least the following aspects:
- proposed workshop title
- topic and goals of the workshop
- target group and estimated attendance
- workshop organizer(s) and contact(s)
- mode of organization and program design, including:
  - information on full versus half-day workshop
  - information on the preference of workshop day: March 2, March 5, or either
  - information on whether you plan an on-site only or hybrid event
SCHEDULE
* Monday, August 26, 2024: Submission of workshop proposals
* Tuesday, September 10, 2024: Notification of workshop selection
* Monday, December 16, 2024: Recommended workshop paper submission deadline
* Monday, February 3, 2025: Camera-ready workshop papers due
* Sunday, March 2, 2025: Pre-conference workshops
* Wednesday, March 5, 2025: Post-conference workshops
Organizers of accepted proposals will be responsible for publicizing and running the workshop, including sending out calls for papers, reviewing submissions, producing the camera-ready workshop proceedings, and organizing the meeting day.
SELECTION
The assessment and selection of workshop proposals will be made by the NoDaLiDa/Baltic-HLT 2025 Workshop Chairs:
* Normunds Grūzītis, University of Latvia, Latvia
* Samia Touileb, University of Bergen, Norway
To inquire about the workshop submission process or any practical aspect of the organization of workshops, please email ‘nodalida_baltichlt_2025-workshops(a)googlegroups.com’.
For any question about the conference in general, please email ‘nodalida_baltichlt_2025-pc(a)googlegroups.com’ and for any general practical inquiries, please email ‘nodalida_baltichlt_2025-loc(a)eki.ee’.
Looking forward to your workshop proposals which will help make NoDaLiDa/Baltic-HLT 2025 a success!
Sara Stymne, NoDaLiDa/Baltic-HLT 2025 general chair
PS: please also check our call for papers: https://www.nodalida-bhlt2025.eu/call-for-papers
Hi colleagues,
We are seeking candidates for a PhD position in Human-Computer Interaction and Natural Language Processing.
Location: Brisbane or Sydney, Australia
Deadline: We will review applications and conduct interviews on an ongoing basis until a suitable candidate for the role is found.
Project title: “Spot, what’s happening?” Explaining a robot’s state through language
Details about the project
Collaborative human-robot teams are increasingly being used for a variety of tasks. When working with robots, humans sometimes need to understand what the robot is currently doing, why and how it got to that state, and how it interprets its environment. This is important for the human operator to ensure the robot is working according to plans and particularly important if the human needs to intervene or react when the robot needs help.
This project aims to explore the use of language as an interaction modality to facilitate the collaboration between robots and humans. The project will investigate what information about a robot’s state and perception is relevant to human operators in a given interaction and task context and how to present that information through language, including dialogue or chat interfaces.
The project requires expertise in Human-Computer Interaction and/or Human-Robot Interaction. In addition, knowledge of, or a willingness to upskill in, natural language processing (including using large language models), Explainable AI, and robotics is highly desired.
This project addresses Dynamic Situation Awareness and Human-AI Communication, two of the foundational research questions addressed in the Collaborative Intelligence research programme at CSIRO CINTEL. The project also fits into a collaboration between CINTEL and the Australian Cobotic Centre’s research on Human-Robot Interaction.
More details about the post and instructions on how to apply are available at https://www.australiancobotics.org/project/project-2-5-spot-whats-happening….
Thanks,
Dai
*Last Call for Papers*: The seventh edition of the BlackboxNLP workshop
will be co-located with EMNLP 2024 in Miami on November 15, 2024.
*Important dates*
---------------------
*August 15, 2024* - Paper submission deadline (through OpenReview).
September 20, 2024 - Notification of acceptance.
October 4, 2024 - Camera-ready deadline.
*November 15, 2024* - Workshop date.
All deadlines are 11:59PM UTC-12:00 (“Anywhere on Earth”). We will accept
ARR submissions; corresponding deadlines will be announced at a later
date.
*Workshop description:*
-----------------
Many recent performance improvements in NLP have come at the cost of
understanding of the systems. How do we assess what representations and
computations models learn? How do we formalize desirable properties of
interpretable models, and measure the extent to which existing models
achieve them? How can we build models that better encode these properties?
What can new or existing tools tell us about these systems’ inductive
biases?
The goal of this workshop is to bring together researchers focused on
interpreting and explaining NLP models by taking inspiration from fields
such as machine learning, psychology, linguistics, and neuroscience. We
hope the workshop will serve as an interdisciplinary meetup that allows for
cross-collaboration.
Topics of interest include, but are not limited to:
* Applying analysis techniques from neuroscience to analyze
high-dimensional vector representations in artificial neural networks;
* Analyzing the network’s response to strategically chosen input in order
to infer the linguistic generalizations that the network has acquired;
* Examining network performance on simplified or formal languages;
* Mechanistic interpretability, reverse engineering approaches to
understanding particular properties of neural models;
* Proposing modifications to neural architectures that increase their
interpretability;
* Testing whether interpretable information can be decoded from
intermediate representations;
* Explaining specific model predictions made by neural networks;
* Generating and evaluating the quality of adversarial examples in NLP;
* Developing open-source tools for analyzing neural networks in NLP;
* Evaluating the analysis results: how do we know that the analysis is
valid?
*Submissions*
-----------------
We call for two types of papers:
1) *Archival papers*. These are papers reporting on completed, original and
unpublished research, with a maximum length of 8 pages + references (papers
shorter than this maximum are also welcome). Broader Impacts/Ethics and
Limitations sections are optional and can be included on a 9th page. An
optional appendix may appear after the references in the same pdf file.
Accepted papers are expected to be presented at the workshop and will be
published in the workshop proceedings of the ACL Anthology, meaning they
cannot be published elsewhere. They should report on obtained results
rather than intended work. These papers will undergo double-blind
peer-review, and should thus be anonymized.
2) *Extended abstracts*. These may report on work in progress or may be
cross-submissions that have already appeared (or are scheduled to appear)
in another venue in 2024-2025. The extended abstracts are of a maximum of 2
pages + references. These submissions are non-archival and will not be
included in the proceedings. The selection will not be based on a
double-blind review and thus submissions of this type need not be
anonymized.
Submissions should follow the official EMNLP 2024 style guidelines.
Accepted submissions for both tracks will be presented at the workshop:
most as posters, some as oral presentations (determined by the program
committee).
*The submission site is:*
https://openreview.net/group?id=EMNLP/2024/Workshop/BlackBoxNLP
*Organizers*
-----------------
Yonatan Belinkov, Technion
Najoung Kim, Boston University
Jaap Jumelet, University of Amsterdam
Hosein Mohebbi, Tilburg University
Aaron Mueller, Northeastern University & Technion
Hanjie Chen, Rice University
*Contact*
---------------------
Please contact the organizers via email (blackboxnlp(a)googlegroups.com) for
any questions.
Read more on the workshop's website:
https://blackboxnlp.github.io
Hello all, here is our Call for Papers: VarDial 2025 - Twelfth Workshop on NLP for Similar Languages, Varieties and Dialects
VarDial 2025: https://sites.google.com/view/vardial-2025/home
Co-located with COLING 2025, VarDial deals with computational methods and language resources for closely related languages, language varieties, and dialects.
We welcome papers dealing with one or more of the following topics:
- Corpora, resources, and tools for similar languages, varieties and dialects;
- Adaptation of tools (taggers, parsers) for similar languages, varieties and dialects;
- Evaluation of language resources and tools when applied to language varieties;
- Reusability of language resources in NLP applications (e.g., for machine translation, POS tagging, syntactic parsing, etc.);
- Corpus-driven studies in dialectology and language variation;
- Computational approaches to mutual intelligibility between dialects and similar languages;
- Automatic identification of lexical variation;
- Automatic classification of language varieties;
- Text similarity and adaptation between language varieties;
- Linguistic issues in the adaptation of language resources and tools (e.g., semantic discrepancies, lexical gaps, false friends);
- Machine translation between closely related languages, language varieties and dialects.
In addition to the topics listed above, we also welcome papers dealing with diachronic language variation (e.g. phylogenetic methods, historical dialects).
Timeline
Publication of call for papers: Tuesday, August 6th, 2024
Paper submission deadline: Tuesday, November 5th, 2024
Notification of acceptance: Monday, November 25th, 2024
Commitment deadline for pre-reviewed papers: TBD
Camera-ready papers due: Friday, December 13th, 2024
Workshop date: Sunday, January 19th, 2025
Submission information
We invite submissions of up to 8 pages of content, plus up to one page for ethical considerations and/or limitations, plus unlimited pages of references. We also welcome shorter contributions, but we do not make an explicit distinction between long and short papers. For shared task system description papers, we recommend a length of 4-5 pages of content.
Detailed submission guidelines are available on the COLING 2025 website. All submissions must use the official COLING templates. Contributions must be submitted to Softconf: https://softconf.com/coling2025/VarDial25/
It will also be possible to submit rejected COLING main conference papers to VarDial 2025. Instructions on committing such papers to VarDial will be provided here at a later date.
Organizers
Yves Scherrer - University of Oslo (Norway)
Tommi Jauhiainen - University of Helsinki (Finland)
Nikola Ljubešić - Jožef Stefan Institute (Slovenia) and University of Ljubljana (Slovenia)
Preslav Nakov - Mohamed bin Zayed University of Artificial Intelligence (UAE)
Jörg Tiedemann - University of Helsinki (Finland)
Marcos Zampieri - George Mason University (USA)
Cheers,
Tommi
—
Tommi Jauhiainen (PhD, Language Technology)
Projektisuunnittelija / Project Planning Officer
FIN-CLARIN & Kielipankki – The Language Bank of Finland
Digitaalisten ihmistieteiden osasto / Department of Digital Humanities
Helsingin yliopisto / University of Helsinki
https://www.kielipankki.fi
Member
Centre of Excellence in Ancient Near Eastern Empires, University of Helsinki
[With apologies for cross-posting]
We are excited to announce the 22nd International Workshop on Treebanks and Linguistic Theories (TLT 2024), which will bring together developers and users of linguistically annotated natural language corpora. The workshop is endorsed by ACL SIGPARSE and will be hosted by Universität Hamburg in Germany on December 5th-6th, 2024.
-----------------------------
VENUE
-----------------------------
TLT 2024 will take place at the guest house of Universität Hamburg. In order to support rich discussions and networking, TLT 2024 will primarily be an in-person event; we will, however, accommodate a limited number of live / synchronous remote presentations, prioritizing those with circumstances that prevent travel.
Universität Hamburg and its guest house are conveniently located near the Dammtor train station and the Stephansplatz metro station, which are well connected with many parts of the city and beyond, providing an easy commute for attendees.
Hamburg is a vibrant city known for its rich maritime history as one of the leading cities in the medieval Hanseatic League, as well as its modern cultural diversity, including events at the world-famous Elbphilharmonie Concert Hall. The city is easily accessible by train or plane: Hamburg Airport (HAM) serves the city directly, and Bremen Airport (BRE) and Hannover Airport (HAJ) are each about a 1 to 1.5 hour train ride away.
-----------------------------
SUBMISSION INFORMATION
-----------------------------
TLT addresses all aspects of treebank design, development, and use. As ‘treebanks’ we consider any pairing of natural language data (spoken, signed, or written) with annotations of linguistic structure at various levels of analysis, including, e.g., morpho-phonology, syntax, semantics, and discourse. Annotations can take any form (including trees or general graphs), but they should be encoded in a way that enables computational processing. Reflections on the design of linguistic annotations, methodology studies, resource announcements or updates, annotation or conversion tool development, or reports on treebank usage including probing the leakage of treebanks into large language models are but some examples of the types of papers we anticipate for TLT.
Papers should describe original work; they should emphasize completed work rather than intended work, and should indicate clearly the state of completion of the reported results. Submissions will be judged on correctness, originality, technical strength, significance and relevance to the conference, and interest to the attendees.
We invite paper submissions in two distinct tracks:
* regular papers on substantial and original research, including empirical evaluation results, where appropriate;
* short papers on smaller, focused contributions, work in progress, negative results, surveys, or opinion pieces.
Submissions (in both tracks) may either be archival—in case of unpublished work—or non-archival, based on the wish of the authors. All archival papers accepted for presentation at the workshop will be included in the TLT 2024 proceedings volume, which will be part of the ACL Anthology. Non-archival papers must have been published or accepted for publication at another CL conference.
Long papers may consist of up to 8 pages of content (excluding references and appendices). Short papers may consist of up to 4 pages of content (excluding references and appendices). Accepted papers will be given an additional page to address reviewer comments.
All submissions should follow the two-column format and the ACL style guidelines. We strongly recommend the use of the LaTeX style files, OpenDocument, or Microsoft Word templates created for ACL: https://github.com/acl-org/acl-style-files
Submissions will be reviewed double-blind, and all full and short papers must be anonymous, i.e. not reveal author(s) on the title page or through self-references. For example, phrasing such as “We previously showed (Smith, 2020) …” should be avoided; instead, use citations such as “Smith (2020) previously showed …”. Papers must be submitted digitally, in PDF, and uploaded through the on-line conference system (link forthcoming).
Submissions that violate these requirements will be rejected without review.
-----------------------------
IMPORTANT DATES
-----------------------------
* Long and short paper submission deadlines: August 15th, 2024
* Reviews Due: September 26th, 2024
* Notification of acceptance: October 6th, 2024
* Final version of papers due: November 6th, 2024
* TLT2024: December 5th-6th, 2024 in Hamburg
-----------------------------
TLT2024 WORKSHOP CHAIRS
-----------------------------
Daniel Dakota, Indiana University
Sandra Kübler, Indiana University
Heike Zinsmeister, Universität Hamburg
-----------------------------
TLT2024 COMMUNICATION CHAIR
-----------------------------
Sarah Jablotschkin, Universität Hamburg
Contact: tlt2024.gw(a)uni-hamburg.de
Website: https://www.korpuslab.uni-hamburg.de/en/tlt2024.html