Dear all,
University of Turku, Finland, invites applications for **fully funded doctoral researcher positions in digital linguistics and NLP, in the TurkuNLP research group**.
The positions are part of the Human-Centric Artificial Intelligence for a Sustainable Future (HAIF) doctoral training programme, co-funded by the European Union’s Horizon Europe research and innovation programme’s Marie Skłodowska-Curie Actions (MSCA) (grant agreement No 101177564).
Please find more information on the programme and application procedure below!
Best,
Veronika Laippala
HAIF is a doctoral programme for 25 doctoral researchers hosted by the University of Turku. HAIF unites research groups from the fields of computing, materials science, social sciences, law, humanities, and health sciences. HAIF doctoral researchers will conduct novel research under the guidance of scientifically distinguished and experienced supervisors. By connecting 11 research groups from 5 faculties, HAIF aims to create a cohesive, synergistic, and interdisciplinary research community.
The global network of associated partners, representing various industries and third-sector organisations in addition to universities and research institutes, is committed to supporting the HAIF doctoral training. This training features 4–6-month secondments during the years 2027–2029 and a variety of other educational activities.
HAIF’s objective is to provide training that promotes the safe, secure, and legally and ethically sustainable use of AI, with an emphasis on humans as developers, users, and decision-subjects affected by the technology. In addition to individual and societal perspectives, HAIF’s research themes delve into the technological properties that shape interactions between humans and AI systems (e.g., transparency, interpretability, reliability, and accountability).
HAIF doctoral researchers will be selected through a transparent, merit-based process with equity and inclusivity as cross-cutting priorities.
The call for applications for the HAIF doctoral researcher positions is open until Friday, 28 February, at 15:00 (Europe/Helsinki time).
An eligible applicant is an early-stage researcher (no doctoral degree) who holds a Master’s degree in a relevant discipline. Based on the EU mobility rule, the applicant must not have resided in Finland for more than 12 months in the last 36 months.
Read more about HAIF and find the guide for applicants on the website: https://haif.utu.fi/
Here is the direct link to the online application system: https://ats.talentadore.com/apply/25-doctoral-researcher-positions-in-msca-…
*THE 1ST EUROPEAN CL RESEARCH SUMMIT FOR JUNIOR RESEARCHERS*
We invite junior researchers in computational linguistics and NLP to
join us for an exciting week in the Swiss mountains!
This call invites participants to apply for the summit by completing a
non-binding registration form. Upon acceptance, participants will
receive a final binding registration form.
*Registration Form*
Please register via the following registration form:
https://forms.gle/kZpFnbGETwAXgxbWA
*About KnitTogether:*
Great research arises from amazing collaborations. However, establishing
collaborations is challenging, and it depends on the context of the
institution, the supervision, existing networks, and one’s personality.
With *KnitTogether25*, we invite NLP researchers, particularly junior
ones who work on *Bias in NLP and Its Societal Impacts,* to spend an
exciting week in the Swiss mountains.
In a growing community dealing with a topic that is gaining increasing
societal influence and relevance, we want to join forces and exchange
ideas. What can we learn from each other?
In particular, but not limited to, we would like to explore: a) Which
groups are under-/overrepresented or represented to their disadvantage?
b) How can we incorporate such factors into modeling? c) How can we
analyze them? We would like to hear the different perspectives of the
participants: How do you relate your work to the topic? Using innovative
and playful methods, we want to learn more about the work of our peers
and thus get to know each other better /scientifically/.
During a week full of hikes and other fun activities, the goal is to get
to know each other without hurdles, draft ideas, and discuss research.
We want to create a safe space where everyone can openly discuss their
work and ideas without academic pressure.
*Program*
* Workshops
* Peer-to-peer tutorials
* Keynotes
* Mentoring sessions
* Brainstorming sessions
* Social activities
*Important Dates*
* *February 28, 2025 (AoE)*: registration deadline
* *May 5-10, 2025:* Retreat Date
*Workshop Location*
* Wildhaus, Switzerland
*Workshop Fees*
* *Registration fee:* ~ 60 Euros (includes accommodation and food,
excludes travel costs).
* Application for *additional travel subsidies* is possible.
*Website*
* https://knittogether.github.io/kt25/
*Organizers*
* Esra Dönmez (University of Stuttgart)
* Neele Falk (University of Stuttgart)
* Florian Schneider (University of Hamburg)
* Andreas Waldis (TU Darmstadt, University of Lucerne)
*Contact*
For any questions, please contact the organizers at
/knittogethernlp(a)gmail.com/ <mailto:knittogethernlp@gmail.com>
Best Regards,
KnitTogether25 Organizers
*---------------------------------------------*
*Sponsors*
We would like to thank *IRIS* (Interchange Forum for Reflecting on
Intelligent Systems), *HSLU* (University of Lucerne) and *GSCL* (German
Society for Computational Linguistics and Language Technology) for
sponsoring this event.
<Apologies for cross-postings>
------------------------------------------------
Call for Participation
PROFE 2025: Language Proficiency Evaluation
IberLEF 2025 Shared Task
https://nlp.uned.es/question-answering/profe2025
PROFE 2025 reuses the exams for Spanish proficiency evaluation developed by Instituto Cervantes over many years to evaluate human students. Therefore, automatic systems will be evaluated under the same conditions as humans were. Systems will receive a set of exercises with their corresponding instructions, without specific training material. We therefore expect participants to rely on transfer learning approaches or on generative large language models.
Subtasks
PROFE 2025 has three subtasks, one per exercise type. Teams can participate in any combination of them. Each subtask contains several exercises of the same type. The subtasks are:
1. Multiple choice subtask: each exercise includes a text and a set of multiple-choice questions about the text where only one answer is correct. Given a multiple-choice question, systems must select the correct answer among the candidates.
2. Matching subtask: each exercise contains two sets of texts. Systems must find the text in the second set that best matches the first set. There is only one possible matching per text, but the first set can contain extra unnecessary texts.
3. Filling the gap subtask: each exercise contains a text with several gaps corresponding to textual fragments that have been removed and presented in shuffled order as options. Systems must determine the correct position for each fragment. There is only one correct fragment per gap, but there could be more candidates than gaps.
The different exercise types open up research on how to approach each of them, for example by adapting prompts when using generative models.
Dataset
We will use the IC-UNED-RC-ES dataset created from real examinations at Instituto Cervantes. These exams were created by human experts to assess language proficiency in Spanish. We have already collected the exams and converted them to a digital format, which is ready to be used in the task. The dataset contains exams at different levels (from A1 to C2).
The complete dataset contains 282 exams with 855 exercises. The total number of evaluation points is 6146 (among 16570 options), distributed by exercise type as follows:
multiple-choice: 3544 responses
matching: 2309 responses
fill-the-gap: 293 responses
In PROFE 2025 we plan to use around 50% of the exams, so that the other 50% remains hidden for the second edition of PROFE.
We intend not to distribute the gold standard to prevent overfitting in post-campaign experiments and data contamination in LLMs.
Evaluation measures and baseline
We will use traditional accuracy (proportion of correct answers) as the main evaluation measure. Systems will receive evaluation scores from two different perspectives:
* At the question level, where correct answers are counted individually without grouping them.
* At the exam level, where scores for each exam are considered. Each exam contains several exercises of different types. An exam is considered to be passed if an accuracy score (accounted as the proportion of correct answers) above 0.5 is reached. Then, the proportion of passed exams is given as a global score. This perspective will only apply to those teams participating in the three subtasks.
In more detail, the exact evaluation per subtask is as follows:
* Multiple choice subtask: we will measure accuracy as the proportion of questions correctly answered.
* Matching subtask: we will measure accuracy as the proportion of correct texts matched.
* Fill in the gap subtask: we will measure accuracy as the proportion of correctly filled gaps.
We will use accuracy as the evaluation measure because there is only one correct option among candidates and because it is the measure applied to humans doing the same exams. Thus, we can compare the performance of automatic systems and humans under the same conditions.
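As an illustration only, the two evaluation perspectives described above could be computed along the following lines. This is a minimal sketch, not the organizers' scoring script: the input format (pairs of exam identifier and correctness flag) and the function names `question_accuracy` and `exam_pass_rate` are assumptions made for this example. The only details taken from the call are that accuracy is the proportion of correct answers and that an exam counts as passed when its accuracy is above 0.5.

```python
from collections import defaultdict


def question_accuracy(results):
    """Question-level accuracy: proportion of correct answers overall.

    `results` is a hypothetical list of (exam_id, is_correct) pairs.
    """
    results = list(results)
    if not results:
        raise ValueError("no results to score")
    return sum(1 for _, ok in results if ok) / len(results)


def exam_pass_rate(results, threshold=0.5):
    """Exam-level score: proportion of exams whose accuracy is above threshold."""
    per_exam = defaultdict(list)
    for exam_id, ok in results:
        per_exam[exam_id].append(bool(ok))
    if not per_exam:
        raise ValueError("no results to score")
    # An exam is passed only if its accuracy is strictly above the threshold.
    passed = sum(
        1 for answers in per_exam.values()
        if sum(answers) / len(answers) > threshold
    )
    return passed / len(per_exam)


# Toy data (invented): exam A scores 2/3, exam B 1/2, exam C 3/5.
sample = (
    [("A", True), ("A", True), ("A", False)]
    + [("B", True), ("B", False)]
    + [("C", False), ("C", False), ("C", True), ("C", True), ("C", True)]
)

print(question_accuracy(sample))  # 6 correct answers out of 10
print(exam_pass_rate(sample))     # exams A and C pass, B (exactly 0.5) does not
```

Note that under this reading an exam with an accuracy of exactly 0.5 is not passed, since the call asks for a score above 0.5.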
A preliminary baseline using ChatGPT obtains the following results for each exercise type (note that different prompts can produce slightly different results):
* Multiple choice accuracy: 0.64
* Filling the gap accuracy: 0.43
* Matching accuracy: 0.51
Schedule
February 6, 2025 Registration opens
March 10, 2025 Training data released
April 28, 2025 Test set release
May 9, 2025 Deadline for submitting runs
May 14, 2025 Release of evaluation results
June 3, 2025 Paper submission deadline
Organizers
Alvaro Rodrigo<https://www.uned.es/universidad/docentes/informatica/alvaro-rodrigo-yuste.h…>, UNED NLP & IR Group (Universidad Nacional de Educación a Distancia)
Anselmo Peñas<https://www.uned.es/universidad/docentes/informatica/anselmo-penas-padilla.…>, UNED NLP & IR Group (Universidad Nacional de Educación a Distancia)
Alberto Pérez<https://www.uned.es/universidad/docentes/informatica/alberto-perez-garcia-p…>, UNED NLP & IR Group (Universidad Nacional de Educación a Distancia)
Sergio Moreno<https://www.uned.es/universidad/docentes/en/informatica/sergio-moreno-alvar…>, UNED NLP & IR Group (Universidad Nacional de Educación a Distancia)
Javier Fruns, Instituto Cervantes
Inés Soria, Instituto Cervantes
Rodrigo Agerri<https://ragerri.github.io/>, HiTz (Universidad del País Vasco, UPV/EHU)
🎓 CIRCE Seminar: Textbooks, ideologies and language education. An
overview from different countries' teaching materials
Dr. Francesca Gallina, University of Pisa, Italy
Date: February 24, 2025
Time: 16:30 – 17:30 (CET)
Venue: Online
Attendees: Secondary school teachers, researchers, language instructors
The CIRCE project <https://www.circe-project.eu> is pleased to announce
the online seminar ‘Textbooks, ideologies and language education. An
overview from different countries' teaching materials’ organized in
collaboration with DFCLAM University of Siena, H2IOSC project and CNR ILC.
Abstract:
Textbooks play a pivotal role in language education, even in the era of
internet and AI. For both learners and teachers, textbooks are something
they use daily in the processes of learning and teaching, in the
classroom or at home. According to Curdt-Christiansen and Weninger
(2015), textbooks are sociocultural materials: they have the power to
represent and position ideas, values and beliefs. That is, they can
legitimate or reflect a particular ideology, and from this
perspective they can shape the identity of learners. To explore textbooks
as sociocultural products, we will analyze a sample of textbooks from
different countries and focus on the following aspects: plurilingualism,
theoretical approach, language activities, metalinguistic analysis and
variation. Examples from textbooks will be presented and discussed.
About the speaker:
Francesca Gallina is Associate Professor of Educational Linguistics at
the University of Pisa. She completed her PhD in Linguistics and
Didactics of Italian as L2 in 2009 with a dissertation on the development
of the lexical competence of learners of Italian as a second language at
the University for Foreigners of Siena.
Her main research interests are educational linguistics, L2
vocabulary acquisition, language contact and multilingualism, and the impact of
language policies on the development of L2 teaching and learning processes.
Her most recent publications are “Italiano di contatto e didattica plurilingue”
(Cesati, 2021) and “Osservare e valutare la competenza lessicale in
italiano L2” (2022).
The seminar will be hosted on the H2IOSC Training Environment to allow
everyone interested to access a recording of the event.
The seminar is free of charge, but participants must register.
Further details and the registration link are available here:
https://www.circe-project.eu/circe-online-seminar-series/
--
Claudia Soria
CNR, ISTITUTO DI LINGUISTICA COMPUTAZIONALE "ANTONIO ZAMPOLLI"
claudia.soria(a)ilc.cnr.it
Tel. 0503153166
Via Giuseppe Moruzzi, 1, 56124 – Pisa
www.ilc.cnr.it
*www.cnr.it* <http://www.cnr.it/>
We are offering a postdoctoral fellowship with the Graduate Program in
Applied Linguistics and Language Studies, the Pontifical Catholic
University of São Paulo, Brazil, funded by the São Paulo Research
Foundation (FAPESP), Grant #2022/05848-7.
Successful applicants will contribute research to at least one of the
following topics: 1. Development and refinement of methodologies and
innovative resources for MD Analysis; 2. MD description of register
variation across different languages and domains; 3. MD identification of
discourse surrounding socially relevant issues in the contemporary world. A
full-time commitment of 40 hours per week is required. Applicants must have
obtained a doctoral degree less than seven years ago. The fellowship duration
is between 12 and 24 months. Interested applicants should complete the
online form at https://form.jotform.com/230804143618956 by May 31, 2025. The
fellowship provides a monthly stipend of 12,000 BRL (
https://fapesp.br/valores/bolsasnopais). Project proposals and corpus data
may be in either English or Portuguese.
For questions, please email tnnycorpuslg(a)gmail.com
*[Apologies for cross-posting]*
Dear Researchers,
We are pleased to invite you to participate in the *PolyHope-M Shared Task
at RANLP 2025*, an exciting challenge aimed at advancing the computational
understanding of *hope* across multiple languages. Hope is a crucial human
emotion that influences decision-making, resilience, and social
interactions. This shared task focuses on detecting and categorizing
hope-related expressions in *English, German, Spanish, and Urdu social
media texts*.
Unlike traditional binary sentiment classification, *PolyHope-M* introduces
a *nuanced multiclass classification approach* to distinguish between:
- *Generalized Hope:* A broad sense of optimism not tied to specific
outcomes.
- *Realistic Hope:* Expectations grounded in achievable goals.
- *Unrealistic Hope:* Desires for outcomes that are unlikely or
impossible.
- *Not Hope:* Texts that do not express hope.
*Task Description*
Participants will develop NLP models for classifying hope-related
expressions through two main subtasks:
1. *Binary Hope Speech Detection* (Subtask 1 - separate for each
language): Classifying texts as *Hope* or *Not Hope*.
2. *Multiclass Hope Speech Detection* (Subtask 2 - separate for each
language): Distinguishing between different types of hope expressions.
*Important Dates*
- *Training data release:* February 20, 2025
- *Evaluation data release & evaluation start:* February 25, 2025
- *Evaluation end:* March 25, 2025
- *Publication of official results:* March 26, 2025
- *Paper submission deadline:* April 26, 2025
- *Author notifications:* May 10, 2025
- *Camera-ready submission:* June 10, 2025
- *Shared task presentation at RANLP 2025:* September 11-12, 2025
*Organizing Committee*
- *Fazlourrahman Balouchzahi*, Independent Researcher, Mexico
- *Sabur Butt*, Institute for the Future of Education (IFE), Tecnológico
de Monterrey, Mexico
- *Maaz Amjad*, Texas Tech University, Lubbock, TX, USA
- *Luis Jose Gonzalez-Gomez*, IFE, Tecnológico de Monterrey, Mexico
- *Abdul Gafar Manuel Meque*, Centro de Investigación en Computación,
Instituto Politécnico Nacional, Mexico
- *Helena Gómez-Adorno*, IIMAS, UNAM, Mexico
- *Bharathi Raja Chakravarthi*, University of Galway, Ireland
- *Grigori Sidorov*, Centro de Investigación en Computación, Instituto
Politécnico Nacional, Mexico
- *Thomas Mandl*, University of Hildesheim, Germany
- *Hector G Ceballos*, IFE, Tecnológico de Monterrey, Mexico
- *Ruba Priyadharshini*, Gandhigram Rural Institute, India
- *Saranya Rajiakodi*, Central University of Tamilnadu, India
*Contact Information*
For any inquiries, please contact the organizing team:
- *Fazlourrahman Balouchzahi:* fbalouchzahi2021(a)cic.ipn.mx
- *Sabur Butt:* saburb(a)tec.mx
- *Helena Gómez-Adorno:* helena.gomez(a)iimas.unam.mx
For more details and participation guidelines, visit:
*PolyHope-M Competition Page*: https://www.codabench.org/competitions/5635/
*RANLP Website*: https://ranlp.org/ranlp2025/
We look forward to your participation in *PolyHope-M at RANLP 2025* and
your contributions toward enhancing hope speech detection across multiple
languages!
Best regards,
Sabur Butt
*Sabur Butt, Ph.D.* (He/Him)
Institute for the Future of Education (IFE)
*Tecnológico de Monterrey, Mexico*
Address: Av. Eugenio Garza Sada 2501 Sur Tecnológico, 64849 Monterrey, N.L.
LinkedIn <https://www.linkedin.com/in/saburb> - GitHub
<https://github.com/saburbutt> - Scholar
<https://scholar.google.com/citations?user=re7md-0AAAAJ&hl=en> - Website
<https://saburbutt.github.io/>
The content of this data transmission must not be considered an offer,
proposal, understanding or agreement unless it is confirmed in a document
signed by a legal representative of ITESM. The content of this data
transmission is confidential and is intended to be delivered only to the
addressees. Therefore, it shall not be distributed and/or disclosed through
any means without the authorization of the original sender. If you are not
the addressee, you are forbidden from using it, either totally or
partially, for any purpose.
*[Apologies for cross-posting]*
Dear Researchers,
We are excited to invite you to participate in the *PolyHope Shared Task at
IberLEF 2025*, a unique challenge focused on analyzing the expression of
hope in social media texts. Hope is a fundamental human emotion
influencing decision-making, yet its nuanced nature, especially when masked
by sarcasm, presents significant challenges for Natural Language Processing
(NLP) systems. This year, the task expands its scope to Spanish texts,
emphasizing hope as an expectation alongside its existing English counterpart.
*Task Description*
The goal of PolyHope is to classify social media texts based on their
expression of hope, with two subtasks available:
1. *Binary Hope Speech Detection (Subtask 1.a: English, Subtask 1.b:
Spanish)*
- *Hope:* Texts conveying hope, expectation, or desire.
- *Not Hope:* Texts that do not express any hopeful sentiment.
2. *Multiclass Hope Speech Detection (Subtask 2.a: English, Subtask 2.b:
Spanish)*
- *Generalized Hope:* Broad optimism without specific targets.
- *Realistic Hope:* Hope grounded in plausible outcomes.
- *Unrealistic Hope:* Expressions of hope for unlikely outcomes.
- *Not Hope:* Texts without hopeful sentiment.
- *Sarcasm:* Texts that mimic hope but are sarcastic in nature.
This task aims to promote research in *inclusive language technologies*,
enhance the *psychological and linguistic understanding* of hope, and
develop *multilingual NLP models* capable of detecting nuanced sentiment
and sarcasm.
*Important Dates*
- *Release of training data:* February 13, 2025
- *Release of test corpora & evaluation campaign start:* March 13, 2025
- *End of evaluation campaign (submission deadline):* March 28, 2025
- *Publication of official results:* March 30, 2025
- *Paper submission:* April 25, 2025
- *Review notification:* May 20, 2025
- *Camera-ready submission:* June 3, 2025
- *IberLEF Workshop:* September 2025
- *Publication of proceedings:* September 2025
*Organizing Committee*
- *Sabur Butt*, Institute for the Future of Education (IFE), Tecnológico
de Monterrey, Mexico
- *Fazlourrahman Balouchzahi*, Independent Researcher, Mexico
- *Maaz Amjad*, Texas Tech University, Lubbock, TX, USA
- *Salud María Jiménez-Zafra*, SINAI, Universidad de Jaén, Spain
- *Hector G Ceballos*, IFE, Tecnológico de Monterrey, Mexico
- *Grigori Sidorov*, Centro de Investigación en Computación, Instituto
Politécnico Nacional, Mexico
*Contact Information*
For inquiries, feel free to reach out to the organizing team:
- *General Inquiries:* polyhopeatiberlef(a)gmail.com
- *Sabur Butt:* saburb(a)tec.mx
- *Fazlourrahman Balouchzahi:* fbalouchzahi2021(a)cic.ipn.mx
- *Salud María Jiménez-Zafra:* sjzafra(a)ujaen.es
For more details and participation guidelines, visit:
PolyHope Competition Page: https://www.codabench.org/competitions/5509/
IberLEF Website: https://sites.google.com/view/iberlef-2025/home?authuser=0
We look forward to your participation in advancing research on hope speech
detection!
Best regards,
Sabur Butt
*Sabur Butt, Ph.D.* (He/Him)
Institute for the Future of Education (IFE)
*Tecnológico de Monterrey, Mexico*
Address: Av. Eugenio Garza Sada 2501 Sur Tecnológico, 64849 Monterrey, N.L.
LinkedIn <https://www.linkedin.com/in/saburb> - GitHub
<https://github.com/saburbutt> - Scholar
<https://scholar.google.com/citations?user=re7md-0AAAAJ&hl=en> - Website
<https://saburbutt.github.io/>
[Apologies if you receive multiple copies of this extended deadline CFP]
---------- LAST EXTENSION: Canadian AI 2025 - Final Deadline Extension ----------
---------- May 26-29, 2025, in Calgary, Alberta ----------
---------- FINAL Deadline Extension: Thursday, Feb 20, 2025 (11:59 p.m. AoE) ----------
Dear Colleagues,
We want to thank all those who have submitted papers so far to our conference. In response to ongoing requests, we are granting one FINAL extension. The new and final submission deadline is **Thursday, Feb 20, 2025, by 11:59 p.m. AoE**.
We invite submissions in all areas of Artificial Intelligence, either theoretical or applied, to the 38th Canadian Conference on Artificial Intelligence, taking place in Calgary on May 26-29. We also welcome position papers, which present evidence-based arguments for a particular point of view without necessarily introducing a new system.
Conference proceedings will be published in PubPub open-access online format and submitted to be indexed/abstracted in leading indexing services such as DBLP, ACM, and Google Scholar.
---------- Submission Details ----------
Canadian AI is accepting submissions of both long and short papers:
Long papers: Maximum 12 pages (including references)
Short papers: Maximum 6 pages (including references)
Formats: LaTeX and Word submissions are accepted
**NEW: Each submission may include an Appendix (PDF) as supplementary material.**
More information and submission templates are available under Submission Details here:
https://www.caiac.ca/en/conferences/canadianai-2025/call-papers
**The portal for submission can be found here:**
https://cmt3.research.microsoft.com/CANADIANAI2025/
All submissions must be original, anonymized for double-blind review, and not under review elsewhere (preprints are acceptable if the title differs).
---------- Topics of Interest Include ----------
- Agent Systems
- AI Applications
- Automated Reasoning
- Case‐based Reasoning
- Cognitive Models
- Constraint Satisfaction
- Data Mining
- Deep Learning and Neural Models
- E‐Commerce
- Ethics in AI, AI for social good
- Evolutionary Computation
- Explainable AI
- Fair, Secure, Private, and Trusted AI
- Games
- Information Retrieval and Search
- Knowledge Management
- Knowledge Representation
- Large Language Models
- Machine Learning
- Multimedia Processing
- Natural Language Processing
- Planning
- Robotics
- Uncertainty
- User Modeling
- Web Mining and Applications
Authors of accepted long papers will be allotted time for an oral presentation during the conference. Accepted short papers will also be allotted time for a 5-minute oral presentation, followed by a poster session presentation. It is mandatory for at least one author of each accepted paper to attend the conference in person to present their work. Authors are expected to agree to this requirement before submitting their paper for review.
Furthermore, the corresponding author of each paper must complete and sign a copyright form on behalf of all authors associated with the paper. It is important that the corresponding author who signs the copyright form matches the corresponding author listed on the paper.
---------- Important Dates ----------
- FINAL Submission Deadline: **Thursday, Feb 20, 2025 (11:59 p.m. AoE)**
- Author Notification: Tuesday, April 1, 2025
- Camera-Ready Copy Due: Tuesday, April 15, 2025
- Conference Dates: May 26-29, 2025
---------- Awards ----------
Best Paper Award and Best Student Paper Award will be given at the conference. To qualify for the student award, the first author must be a registered student at submission.
---------- Program Chairs ----------
Paula Branco
School of Electrical Engineering and Computer Science, University of Ottawa
pbranco(a)uottawa.ca
https://uniweb.uottawa.ca/view/profile/members/4218?lang=en
Amine Trabelsi
Département d'informatique, Université de Sherbrooke
Amine.Trabelsi(a)USherbrooke.ca
https://www.usherbrooke.ca/informatique/trabelsi
We look forward to your participation in Canadian AI 2025!
In this newsletter:
LDC at LT4All 2025
LDC membership discounts expire March 3
Spring 2025 data scholarship recipients
New publications:
AIDA Scenario 3 Practice Topic Source Data and Annotation<https://catalog.ldc.upenn.edu/LDC2025T02>
MATERIAL Georgian-English Language Pack<https://catalog.ldc.upenn.edu/LDC2025S01>
________________________________
LDC at LT4All 2025
LDC is pleased to be a sponsor of The 2nd International Conference on Language Technologies for All (LT4All 2025)<https://www.lt4all2025.eu/overview/>, February 24-26, 2025, organized by ELRA and SIGUL, the ELRA/ISCA Special Interest Group on Under-resourced Languages, and in partnership with UNESCO as part of the International Decade of Indigenous Languages (2022-2032). The conference theme, "Advancing Humanism through Language Technologies," focuses on community empowerment within the larger discussion on the many ways technology impacts language communities. The conference will also commemorate the Silver Jubilee of International Mother Language Day (February 21).
LDC membership discounts expire March 3
Time is running out to save on 2025 membership fees. Renew your LDC membership, rejoin the Consortium, or become a new member by March 3 to receive a discount of up to 10%. For more information on membership benefits and options, visit Join LDC<https://www.ldc.upenn.edu/members/join-ldc>.
Spring 2025 data scholarship recipients
Congratulations to the recipients of LDC's Spring 2025 data scholarships:
Sair Buckle, Charles Sturt University (Australia); PhD student, AI and Cyber Futures Institute. Sair is awarded a copy of Avocado Research Email Corpus LDC2015T03 for her work in behavioral science.
Le Phuoc Thinh Tien, Vietnam National University Ho Chi Minh City (Vietnam); Bachelor's student, Faculty of Information Technology. Le is awarded a copy of Penn Discourse Treebank Version 3.0 LDC2019T05 for his research in natural logical reasoning.
The next round of applications will be accepted in September 2025. For information about the program, visit the Data Scholarships page<https://www.ldc.upenn.edu/language-resources/data/data-scholarships>.
________________________________
New publications:
AIDA Scenario 3 Practice Topic Source Data and Annotation<https://catalog.ldc.upenn.edu/LDC2025T02> was developed by LDC and is comprised of English, Russian, and Spanish web documents (text, video, image) and annotations. Each phase of the AIDA program centered on a specific scenario, or broad topic area, with related subtopics designated as either practice subtopics or evaluation subtopics. The Phase 3 scenario focused on the COVID-19 global pandemic. This corpus contains source documents and annotations for the Scenario 3 practice topics.
The corpus contains 1417 root documents; 279 documents were annotated. Annotations include:
* Event, relation, and entity annotation (64 documents)
* Claim frame annotation: claims (true or not) relating to the COVID-19 pandemic (203 documents)
* Practice topic query claim frames: example claim frames intended to be used by systems as queries to extract similar claims from additional documents (30 documents)
The DARPA AIDA (Active Interpretation of Disparate Alternatives) program aimed to develop a multi-hypothesis semantic engine to generate explicit alternative interpretations of events, situations, and trends from a variety of unstructured sources. LDC supported AIDA by collecting, creating, and annotating multimodal linguistic resources in multiple languages.
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
MATERIAL Georgian-English Language Pack<https://catalog.ldc.upenn.edu/LDC2025S01> was developed by Appen<http://www.appen.com/> for the IARPA MATERIAL<https://www.iarpa.gov/index.php/research-programs/material> program and contains 79 hours of Georgian conversational telephone speech, transcripts, English translations, annotations, and queries. Calls were made using different telephones (e.g., mobile, landline) from a variety of environments. Transcripts cover approximately half of the speech files, and approximately 3% of the speech data was translated into English. This release also includes English queries and their relevance annotations.
The MATERIAL program focused on underserved languages, with the ultimate goal of building cross-language information retrieval systems that find speech and text content using English search queries.
2025 members can access this corpus through their LDC accounts provided they have submitted a completed copy of the special license agreement. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
Dear List,
The 12th International Conference on CMC and Social Media Corpora for the Humanities (CMC-Corpora) will be held at the University of Bayreuth, Germany, on the 4th and 5th of September 2025 (CfP below).
Keynotes
Gavin Brookes (Lancaster University)
Stephanie Evert (Friedrich-Alexander University Erlangen-Nuremberg)
Deadline paper/abstract submission: 15th May 2025, 23:59 CEST
The conference brings together language-centered research on CMC and social media in linguistics, philologies, communication sciences, media, and social sciences with research questions from the fields of corpus and computational linguistics, language technology, text technology, and machine learning.
We adhere to a wide definition of CMC and social media, covering various media of digital communication, including email, newsgroups, forums, chat and messenger applications (e.g. WhatsApp), social networks (e.g. Facebook, Instagram, X, TikTok), gaming platforms, as well as interactions in the communication areas of video portals (YouTube), learning platforms, gaming apps, online games and virtual worlds.
We invite submissions to the 12th conference on the following topics:
* Development of CMC corpora / social media corpora
  * Building CMC corpora: from data collection to publication
  * Open access data for CMC research: ethical and GDPR issues
  * Annotating CMC data: genres, linguistic aspects, metadata
  * Multimodal corpora
  * Big data corpora
* Analysis of CMC corpora / social media corpora
  * Sociolinguistic studies of CMC
  * Discourse analysis of CMC
  * Linguistic characteristics of CMC
  * Multimodal (incl. visual) aspects of CMC
  * Multilingualism and code-switching in CMC
  * CMC in language education
* Natural language processing (NLP) of CMC data / social media data
  * Normalization
  * PoS tagging
  * Lemmatization
  * Syntactic parsing
* CMC for the benefit of digital societies
  * Interdisciplinary research design and research methods in CMC for the benefit of digital societies
  * Exploration of Diversity and Inclusion in CMC
  * Intersection of CMC and Social Sciences
  * Intersection of CMC and Human-Centered Data Science
  * Intersection of CMC and Computational Social Science
* Contrastive CMC studies across different languages
The conference language is English. Submissions will consist of:
* Short papers (2-4 pages, with a maximum of 6 pages including the list of references, following the existing template) for oral presentations
* Abstracts (max. 300 words) for poster presentations
Submission and review
Authors of accepted papers are invited to present their work at the conference (30-minute timeslots: 20-minute talks, followed by 10 minutes of discussion). Authors of accepted abstracts can present their work in progress or early-stage research during the poster session. At the start of the conference, all accepted papers will be made available in online proceedings. After the conference, speakers with the best contributions will be invited to submit extended papers for one or more journal special issues or a volume publication.
Instructions for authors
All contributions will be collected via ConfTool.
Templates
Submission templates for MS Word and LaTeX are provided on the conference’s homepage:
https://www.cmc2025.uni-bayreuth.de/en/.
Important Dates
* Platform opening for short paper and abstract submission: 17th February 2025 (platform open)
* Deadline paper/abstract submission: 15th May 2025, 23:59 CEST
More information on the CMC2025 conference:
https://www.cmc2025.uni-bayreuth.de/en/
For all enquiries, please contact the organizers at cmc2025(a)uni-bayreuth.de.
More information on the “International Conference Series on CMC and Social Media Corpora (cmc-corpora)”:
https://cmc-corpora.org/series/#
Local organizing committee:
* Dr. Annamária Fábián (University of Bayreuth/Bavarian Research Institute for Digital Transformation at the Bavarian Academy of Science)
* Prof. Dr. Igor Trost (Alpen-Adria University Klagenfurt/University of Passau)
Scientific chairs:
* Dr. Steven Coats (University of Oulu)
* Dr. Annamária Fábián (University of Bayreuth)
* Prof. Dr. Julien Longhi (CY Cergy Paris University)
* Prof. Igor Trost (Alpen-Adria University Klagenfurt/U. of Passau)
* Prof. Reinhild Vandekerckhove (University of Antwerp)
* Dr. Lieke Verheijen (Radboud University)
Scientific committee (so far confirmed):
* Paul Baker (Lancaster University)
* Gavin Brookes (Lancaster University)
* Noah Bubenhofer (University of Zürich)
* Mario Cal Varela (Universidade de Santiago de Compostela)
* Louis Cotgrove (Leibniz-Institut für Deutsche Sprache Mannheim)
* Steven Coats (University of Oulu)
* Orphée DeClercq (Ghent University)
* Stephanie Evert (Friedrich-Alexander University Erlangen-Nuremberg)
* Francisco Javier Fernández Polo (University of Santiago de Compostela)
* Annamária Fábián (University of Bayreuth/Bavarian Research Institute for Digital Transformation – Bavarian Academy of Science)
* Jenny Frey (European Academy of Bozen)
* Aivars Glaznieks (Eurac Research Bolzano)
* Claire Hardaker (Lancaster University)
* Stefan Hartmann (Heinrich-Heine-University Düsseldorf)
* Iris Hendrickx (Radboud University Nijmegen)
* Axel Herold (Berlin-Brandenburgische Akademie der Wissenschaften)
* Besim Kabashi (Friedrich-Alexander University Erlangen-Nuremberg)
* Erik Tjong Kim Sang (Netherlands eScience Center)
* Alexander Koenig (CLARIN ERIC)
* Marc Kupietz (Leibniz-Institut für deutsche Sprache Mannheim)
* Mikko Laitinen (University of Eastern Finland)
* Els Lefever (Ghent University)
* Julien Longhi (Cergy-Pontoise Université)
* Harald Lüngen (Leibniz-Institut für deutsche Sprache Mannheim)
* Konstanze Marx-Wischnowski (University of Greifswald)
* Maja Miličević-Petrović (University of Bologna)
* Nelleke Oostdijk (Radboud University)
* Jan Oliver Rüdiger (Leibniz-Institut für deutsche Sprache Mannheim)
* Tatjana Scheffler (Ruhr-Universität Bochum)
* Steven Schoonjans (Alpen-Adria University Klagenfurt)
* Mirco Schönfeld (University of Bayreuth)
* Stefania Spina (Università per Stranieri di Perugia)
* Egon Stemle (Eurac Research)
* Caroline Tagg (The Open University)
* Stefanie Ullmann (University of Cambridge)
* Igor Trost (Alpen-Adria University Klagenfurt/Universität Passau)
* Reinhild Vandekerckhove (University of Antwerp)
* Lieke Verheijen (Radboud University)
* Stefanie Walter (Technical University Munich)
* Katrin Weller (GESIS Cologne)
All the best,
Annamaria Fabian (Bayreuth) and Igor Trost (Klagenfurt/Passau)
Dr. Annamaria Fabian
Principal Investigator (Post-Doc) of the project "Disability diversity and the communicative realization of inclusion on Social Media"
Department of German Linguistics
University of Bayreuth
https://www.gl.uni-bayreuth.de/de/team/A-Fabian/index.php
Member of the Bavarian Research Institute for Digital Transformation (Bavarian Academy of Science)
Head of the Research Group "Diversity and Inclusion in Digital Societies"
Newly published:
Annamaria Fabian, Igor Trost, Kevin Altmann & Mara Schwind (2024): The analysis of ‘inclusion’ and ‘accessibility’ in Computer-Mediated-Communication for an inclusive transformation in digital societies. pp. 16-20 (double-side print). In: Céline Poudat, Mathilde Guernut. Proceedings of the 11th Conference on CMC and Social Media Corpora for the Humanities. 11th Conference on CMC and Social Media Corpora for the Humanities (CMC 2024), CORLI; Université Côte d’Azur, 2024. halshs-04673776v1
available here: https://cmc-corpora-nice.sciencesconf.org/data/pages/CMC2024_2_.pdf