Dear Corpora Members,
🌟 Exciting Announcement: OSACT 2024 Workshop 🌟
Calling All Researchers in Computational Linguistics, NLP, and IR Specializing in Arabic Language!
Are you at the forefront of research in low-resource languages, particularly Arabic? Do you delve into the complexities of computational linguistics (CL), natural language processing (NLP), and information retrieval (IR) with a focus on Arabic?
We invite you to explore and contribute to groundbreaking advancements in machine translation, particularly in developing models that seamlessly translate dialectal Arabic text into Modern Standard Arabic (MSA).
Moreover, if your research aims to elevate the integrity and dependability of Arabic Large Language Models (LLMs) by innovating in hallucination detection and mitigation strategies, this workshop is a perfect platform for you.
Join us at the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT6), a hub of innovation and scholarly exchange.
Featured Shared Tasks: Tackling the Forefront Issues in Arabic LLMs and Modern Standard Arabic (MSA) Machine Translation
Task 1: Arabic LLMs Hallucination Challenge: Address the critical issue of hallucinated content in Arabic language models. Engage in this vital conversation and present your solutions. Read details: https://osact-lrec.github.io/
Task 2: Dialect to MSA Machine Translation Challenge: Engage in the pivotal task of transforming dialectal Arabic into MSA through innovative translation models. We invite you to utilize your expertise in driving significant advancements in language processing, fostering more effective and meaningful exchanges in the Arabic-speaking world. Read details: https://osact-lrec.github.io/
Event Details:
Date: May 25, 2024
Location: Torino, Italy
In conjunction with the esteemed LREC-COLING 2024 - The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation.
Don’t miss this opportunity to contribute to a pioneering field!
Key Dates:
Paper Submission Deadline: February 25, 2024
Acceptance Notification: March 25, 2024
Visit our website, OSACT 2024 (https://osact-lrec.github.io/), for more details and submission guidelines.
Please send any questions to OSACT.WORKSHOP(a)gmail.com
Looking forward to your participation and to seeing you at LREC-COLING in May 2024!
The OSACT 2024 Workshop Organizing Committee
--
Mona Ali MSc. PhD. Computer Science
Associate Professor
Northeastern University
Vancouver Campus
410 W Georgia, 14th Floor
Vancouver, BC | Land of: xʷməθkʷəy̓əm, Sḵwx̱wú7mesh, and səlilwətaɬ
*************************************************************************
CNLP4DH First Call for Papers:
Throughout 2024, the Journal of Data Mining and Digital Humanities (JDMDH)
is running a worldwide call for papers on the topic
Chinese Natural Language Processing for Digital Humanities (CNLP4DH).
As a reminder, JDMDH is an international journal managed by French
national research institutions and is green open access (no charge for readers
or authors).
This special issue is dedicated to natural language processing for digital
humanities involving documents written in Chinese, including Modern,
Ancient and dialectal Chinese. Work on Mandarin, the official national and
main common language, is accepted, and research on texts written in
other languages, such as those of Tibet and Inner Mongolia, is also welcome.
Suitable topics include, but are not limited to:
- Text analysis and processing related to humanities using computational
methods
- Dataset creation and curation for NLP (e.g. digitization, datafication,
and data preservation).
- Research on cultural heritage collections such as national archives and
libraries using NLP
- NLP for error detection, correction, normalization and denoising data
- Generation and analysis of literary works such as poetry and novels
- Analysis and detection of text genres
- Word segmentation and part-of-speech tagging of Ancient Chinese
- Large Language Models (LLMs) for Chinese in Digital Humanities
- Cross-modal models (text-speech-video-image) for Chinese in Digital
Humanities
- Visualization of text analytics
- Ontology models for natural language text
- Applications in Chinese literature, Traditional Chinese Medicine,
learning Chinese as a second language, sentiment analysis in Chinese
social media, Chinese cultural heritage, Chinese history, Ancient Chinese
language
Submission guidelines: https://jdmdh.episciences.org/page/submissions
Paper submission: https://jdmdh.episciences.org/submit
Website and more details:
https://jdmdh.episciences.org/page/chinese-natural-language-processing-for-…
Guest Editors:
Dr. Wenhe FENG (Guangdong University of Foreign Studies, Laboratory of
Language Engineering and Computing)
Dr. Bin LI (Nanjing Normal University, School of Chinese Language and
Literature, Center of Linguistic Big Data and Computational Humanities)
Dr. Nicolas TURENNE (Guangdong University of Foreign Studies, School of
Information Science and Technology)
Dr. Tong WEI (Beijing University, Digital Humanities Center)
*************************************************************************
***********************************************************************************
First Call for Papers:
The 5th workshop on: "Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from
people with various forms of cognitive/psychiatric/developmental impairments"
Workshop: co-located with LREC-COLING 2024 | Turin, Italy | May 21st, 2024
RaPID-5 serves as an interdisciplinary platform for researchers to exchange insights, methods, and experiences related to collecting and processing data from individuals with mental, cognitive, neuropsychiatric, or neurodegenerative impairments. The workshop focuses on creating, processing, and applying such data resources from individuals at different stages and severity levels of these impairments. The ultimate goal of RaPID-5 is to facilitate the study of relationships among linguistic, paralinguistic, and extra-linguistic observations, with applications ranging from aiding diagnosis to enhancing monitoring and predicting individuals at higher risk, ultimately promoting multidisciplinary collaboration across clinical, language technology, computational linguistics, and computer science communities.
Submission deadline: Sun., 31st of March, 2024 (anywhere on earth)
Paper submission: https://softconf.com/lrec-coling2024/rapid2024/
Website and more details: https://spraakbanken.gu.se/en/rapid-2024
Contact: Dimitrios Kokkinakis
Contact email: dimitrios.kokkinakis(a)gu.se
Organizing committee:
* Kathleen C. Fraser, National Research Council, Canada;
* Dimitrios Kokkinakis, University of Gothenburg, Sweden;
* Kristina Lundholm Fors, Lund University, Sweden;
* Charalambos K. Themistocleous, University of Oslo, Norway;
* Athanasios Tsanas, The University of Edinburgh, UK;
* Fredrik Öhman, University of Gothenburg and Sahlgrenska University Hospital, Sweden
************************************************************************************
[Apologies for multiple postings]
CALL FOR PAPERS
Second International Workshop Towards Digital Language Equality (TDLE):
Focusing on Sustainability
co-located with LREC-COLING 2024, May 2024, Turin (Italy)
See further details at https://european-language-equality.eu/tdle-2024/ [1]
1 Description and Aims of the Workshop
The key aim of this half-day workshop co-located with LREC-COLING 2024
(https://lrec-coling-2024.org/), to be held in Turin (Italy) in May
2024, is to discuss and promote the importance of sustainability in the
design, development, creation, use, distribution and sharing of language
data, resources, platforms, infrastructures, tools and technologies,
with the intention of achieving Digital Language Equality (DLE). While
some important work has recently addressed these crucial areas (e.g.
Fort and Couillault, 2016; Hessenthaler et al., 2022; Ramesh et al.,
2023; Castilho et al., forthcoming), the relevant contributions seem to
be as yet unsystematic and relatively isolated. The workshop intends to
provide an inclusive forum to encourage in-depth debate and facilitate
collaborations to promote the sustainability of resources and
technologies in any (combination of) languages, in support of
multilingualism and of the overarching goal of DLE.
The sustainability of language resources and technologies is key to
enabling multilingualism and digital language equality in the age of
Artificial Intelligence.
2 Topics of Interest
The second international Towards Digital Language Equality (TDLE)
workshop focuses on sustainability in relation to the design,
development, creation, use, distribution and sharing of language data,
resources, platforms, infrastructures, tools and technologies, with a
view to promoting the broader goal of Digital Language Equality (DLE).
The concept of DLE has been firmly established in relation to all
languages of Europe (Rehm and Way, 2023), and has the potential to also
benefit other languages throughout the world, to support the prosperity
of the respective communities at a time of impressive - but as yet very
unevenly distributed and severely imbalanced - progress in
language-centric Artificial Intelligence (AI), e.g. through large
language models (LLMs). The workshop places particular emphasis on
multilingualism and on leveling up digital support for languages,
domains and applications that have so far been underserved, and wishes
to explore ways to develop policies and funding streams to work towards
sustainability in connection with DLE, especially in support of
regional, minority and territorial languages.
To this end, recognizing that the sustainability of Language Resources
and Technologies (LRTs) is key to enabling multilingualism and DLE in
the age of AI, topics of particular interest for the workshop on which
we invite original contributions covering any (combination of) languages
include, but are not limited to, the following:
- research on the factors affecting DLE and the sustainability of LRTs;
- best practices, case studies and validated guidelines related to the
design, implementation and improvement of sustainability of written,
oral/spoken, signed and/or multimodal LRTs (including LLMs),
particularly in support of DLE;
- how multilingual LLM technology can support DLE;
- retrospectively assessing the sustainability of legacy LRTs, and
future-proofing new LRTs in the interest of DLE;
- analyzing the costs and benefits of foregrounding sustainability for
LRTs;
- the role of metadata, accompanying documentation and licenses in
showing and improving the sustainability of LRTs;
- sustainability, fairness and accessibility (e.g. for users with
physical or cognitive disabilities, limited computing resources and
connectivity) of platforms and infrastructures hosting, distributing and
sharing LRTs in the interest of DLE;
- how current data and computing access inequality is affecting DLE (in
particular regarding LLMs);
- ecological sustainability and environmental fairness of developing and
deploying state-of-the-art LRTs, e.g. LLMs with regard to energy
consumption, global warming and climate change;
- developing data and parameter efficient methods to train or adapt
language models to new languages;
- how to evaluate, measure, compare and improve the sustainability of
LRTs;
- establishing benchmarks and protocols to ensure the sustainability of
LRTs;
- how to avoid the potential dangers of developing and using unfair and
unsustainable LRTs, e.g. for malicious, ill-intentioned or harmful
purposes;
- ethical, legal, cultural and/or socio-economic implications of
(ignoring) fairness and sustainability of LRTs;
- developing and implementing forward-looking policies to promote
fairness and long-term sustainability of LRTs to achieve DLE;
- education and training needs and experiences in relation to promoting
fairness and sustainability of LRTs and ways to raise broad awareness of
DLE and related topics, e.g. among the general public, policy- and
decision-makers.
Given this wide-ranging and inclusive remit, the workshop intends to
bring together developers, creators, vendors, distributors, brokers,
users, evaluators and researchers of written, oral/spoken, signed and/or
multimodal LRTs in any (combination of) languages.
3 Background and First TDLE Workshop Held in 2022
The second edition of the workshop, in 2024, builds on the success of the
first Towards Digital Language Equality (TDLE) workshop, which was held
at LREC 2022 in Marseille (France) on 20 June 2022, and whose accepted
papers were published in a dedicated volume of proceedings, Aldabe et
al. (2022).
Following this well-received inaugural workshop held in June 2022, the
second event in the series will be co-located with LREC-COLING 2024 in
Turin (Italy) in May 2024, and will focus specifically on the highly
relevant topic of the sustainability of LRTs in connection with
multilingualism and DLE.
4 Submissions
Up-to-date information on the workshop, including materials for authors,
guidelines, templates, stylesheet and key dates can be found at the
dedicated website https://european-language-equality.eu/tdle-2024/ [1].
To contact the organizing committee of the workshop directly, you can
email tdle2024.hitz(a)ehu.eus.
Papers submitted to the workshop should be completely anonymous for
double-blind peer review, written in English, and prepared using the
official LREC-COLING 2024 author's kit and submission
stylesheet/template available at
https://lrec-coling-2024.org/authors-kit/ [2]. The submissions to the
workshop should not exceed 8 pages, excluding references, and be saved
in unprotected PDF format. Papers should be submitted no later than 23
February 2024 through the START submission management system available
via the workshop website at
https://european-language-equality.eu/tdle-2024/ [1].
The workshop seeks original papers, i.e. it does not accept submissions
that have been, or will be, published elsewhere. The workshop allows
simultaneous submissions, and in these cases the authors should clearly
indicate in the manuscript to which other conference, workshop or venue
they have submitted the paper for review. Each paper submitted to the
workshop will receive three double-blind peer reviews. Papers accepted
for presentation will be included in the proceedings of the workshop.
In light of the LREC-COLING 2024 Map and the "Share your LRs!"
initiative, when submitting their papers through the START system
authors will be asked to provide essential information about resources
(in a broad sense, i.e. also technologies, standards, evaluation kits,
etc.) that have been used for the work described in the paper or are a
new result of their research. Moreover, ELRA encourages all LREC-COLING
authors to share the described LRs (data, tools, services, etc.) to
enable their reuse and replicability of experiments (including
evaluation ones).
5 Key Dates
Paper submission deadline: 23 February 2024
Notification of acceptance: 19 March 2024
Camera-ready papers due: 8 April 2024
Half-day workshop date: 20, 21 or 25 May 2024 (TBC)
6 Workshop Organizers
- Itziar Aldabe (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
- Begoña Altuna (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
- Aritz Farwell (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
- Federico Gaspari (University of Naples "Federico II", Italy & ADAPT
Centre, Dublin City University, Ireland - co-chair)
- Joss Moorkens (School of Applied Language & Intercultural
Studies/ADAPT Centre, Dublin City University, Ireland - co-chair)
- Stelios Piperidis (Institute of Language and Speech Processing, Athena
Research and Innovation Center in Information, Communication and
Knowledge Technologies, Greece)
- Georg Rehm (Speech and Language Technology Lab, Deutsches
Forschungszentrum für Künstliche Intelligenz, Germany)
- German Rigau (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
7 Program Committee
- Antonios Anastasopoulos (GMU, USA)
- Anya Belz (ADAPT, DCU, Ireland)
- Steven Bird (CDU, Australia)
- Fred Blain (Uni. Tilburg, Netherlands)
- Franco Cutugno (Uni. Naples "Federico II", Italy)
- Bessie Dendrinos (NKUA, Greece & ECSPM, Denmark)
- Félix do Carmo (Uni. Surrey, UK)
- Annika Grützner-Zahn (DFKI, Germany)
- Ana Guerberof-Arenas (Uni. Groningen, Netherlands)
- Davyth Hicks (ELEN, Belgium)
- Monja Jannet (ADAPT, DCU, Ireland)
- John Judge (ADAPT, DCU, Ireland)
- Dorothy Kenny (SALIS/CTTS/ADAPT, DCU, Ireland)
- Sabine Kirchmeier (EFNIL, Luxembourg)
- Teresa Lynn (MBZUAI, United Arab Emirates)
- Maite Melero (BSC, Spain)
- Helena Moniz (Uni. Lisbon, Portugal & EAMT)
- Johanna Monti (UniOR, Italy)
- Rachele Raus (UniBO, Italy)
- Wessel Reijers (Uni. Paderborn, Germany)
- Celia Rico Pérez (Universidad Complutense de Madrid, Spain)
- Dimitar Shterionov (TU, Netherlands)
- Carlos S. C. Teixeira (IOTA Localisation Services & Uni. Rovira i
Virgili, Spain)
- Antonio Toral (Uni. Groningen, Netherlands)
- Vincent Vandeghinste (Instituut voor de Nederlandse Taal, Netherlands
& KU Leuven, Belgium)
References
Itziar Aldabe, Begoña Altuna, Aritz Farwell and German Rigau, editors.
2022. Proceedings of the Workshop Towards Digital Language Equality
(TDLE). European Language Resources Association, Marseille, France.
Sheila Castilho, Federico Gaspari, Joss Moorkens, Maja Popović and
Antonio Toral, editors. Forthcoming. Journal of Specialised Translation.
Special Issue n. 41 on "Translation Automation and Sustainability".
Karën Fort and Alain Couillault, 2016. "Yes, We Care! Results of the
Ethics and Natural Language Processing Surveys". Proceedings of the
Tenth International Conference on Language Resources and Evaluation
(LREC'16). European Language Resources Association, Portorož, Slovenia.
1593-1600.
Marius Hessenthaler, Emma Strubell, Dirk Hovy and Anne Lauscher, 2022.
"Bridging Fairness and Environmental Sustainability in Natural Language
Processing". Proceedings of the 2022 Conference on Empirical Methods in
Natural Language Processing, Abu Dhabi, United Arab Emirates. 7817-7836.
András Kornai, 2013. "Digital Language Death". PLoS ONE, 8(10):e77056.
Krithika Ramesh, Sunayana Sitaram and Monojit Choudhury, 2023. "Fairness
in Language Models Beyond English: Gaps and Challenges". Findings of the
Association for Computational Linguistics: EACL 2023. Association for
Computational Linguistics, Dubrovnik, Croatia. 2106-2119.
Georg Rehm and Andy Way, editors. 2023. European Language Equality: A
Strategic Agenda for Digital Language Equality. Berlin: Springer.
Links:
------
[1] https://european-language-equality.eu/tdle-2024/
[2] https://lrec-coling-2024.org/authors-kit/
The Robotics and Semantic Systems group at the Department of Computer Science, Lund University, has announced an Assistant Professor position (biträdande universitetslektor, BUL) in Computer Science with focus on semantic systems and natural language processing.
The group (https://rss.cs.lth.se) is doing research in cognitive robotics and AI, including Machine Learning, Natural Language Processing, Human-Robot Interaction and Advanced Robotics. The group is part of the RobotLab LTH (https://robotics.lth.se). It is involved in a number of research centers and programmes, including WASP (Wallenberg Autonomous Software and Systems Programme, https://wasp-sweden.org), ELLIIT (Excellence Center at Linköping – Lund in Information Technology, https://elliit.se), LTH profile area: Pillars of AI and Digitalization and LU profile area: Natural and Artificial Cognition.
The candidate is expected to have earned a PhD a couple of years ago, preferably followed by a postdoc, and to be planning to settle into an academic career at a vibrant university (with a fresh Nobel prize in house :-).
Detailed information about the position may be found in the announcement:
https://lu.varbi.com/what:job/jobID:654845/?lang=en
The deadline for applying has been postponed until January 9th, 2024.
You are welcome to contact me for more details.
Jacek Malec, head of the RSS group
--
Jacek Malec jacek.malec(a)cs.lth.se
Department of Computer Science tel. +46 46 2224950
LTH, Lund University cell +46 70 4950474
Box 118, 221 00 Lund, Sweden http://cs.lth.se/Jacek_Malec/
[apologies for cross-posting]
ISIR, in Paris, has an open position for a two-year non-permanent junior
researcher / postdoc in Machine Translation / Large Language Models.
Details available here:
https://emploi.cnrs.fr/Offres/CDD/UMR7222-FRAYVO-001/Default.aspx?lang=EN
Please apply before January 5th, 2024.
Best
F
--
François Yvon
ISIR/CNRS
4 Place Jussieu
F 75005 Paris
https://fyvo.github.io
The Finnish Center for Artificial Intelligence FCAI <https://fcai.fi/> and ELLIS Unit Helsinki <https://fcai.fi/ellis-unit-helsinki> are looking for postdocs, research fellows and PhD students in the following areas of research:
1) Reinforcement learning
2) Probabilistic methods
3) Simulation-based inference
4) Privacy-preserving machine learning
5) Collaborative AI and human modeling
6) Machine learning for science
Read more about the positions and apply by January 14 (postdocs/research fellows) or January 21 (PhD students), 2024 at https://fcai.fi/we-are-hiring
——————————————
Jörg Tiedemann
University of Helsinki
https://blogs.helsinki.fi/language-technology/
*** Apologies for Cross-Posting ***
The Second Arabic Natural Language Processing Conference (ArabicNLP 2024)
Co-located with ACL 2024 in Bangkok, Thailand, August 16, 2024. (Hybrid
Mode).
Conference URL: https://arabicnlp2024.sigarab.org/
ArabicNLP 2024 builds on eight previous conference and workshop editions,
which have been very successful, drawing large and active participation in
various capacities (see Scholar Page
<https://scholar.google.com/citations?user=LGzh8jYAAAAJ>). This conference
is timely given the continued rise in research projects focusing on Arabic
NLP. The conference is organized by the Special Interest Group on Arabic
NLP (SIGARAB <https://www.sigarab.org/>), an Association for Computational
Linguistics Special Interest Group on Arabic NLP. This announcement
combines two calls: (1) a call for shared task proposals, and (2) a call
for conference papers.
Call for Shared Task Proposals
We invite proposals for shared tasks related to Arabic NLP to be part of
the ArabicNLP 2024 conference.
The proposals should provide an overview of the proposed task, motivation,
data/resource collection and creation, task description, pilot run details
(if available), a tentative timeline that matches the submission dates
below, and task organizers (name, email, affiliation). Proposals in PDF
format can be up to 4 pages.
Shared Task Proposal Submission URL: https://shorturl.at/eCJOS
Important Dates for Shared Task Proposals
- January 23, 2024: Submission of shared task proposals due date
- February 6, 2024: Notification of acceptance of shared tasks
Proposals should target the following dates when planning their calls:
- April 29, 2024: Shared task papers due date
- June 4, 2024: Notification of acceptance
- June 24, 2024: Camera-ready papers due
- August 16, 2024: ArabicNLP conference
All deadlines are 11:59 pm UTC -12h
<https://www.timeanddate.com/time/zone/timezone/utc-12> (“Anywhere on
Earth”).
For any questions, please contact the Shared Task Chair:
arabicnlp-shared-task-chair(a)sigarab.org
Call for Papers
We invite long (up to 8 pages), short (up to 4 pages), and demo paper (up
to 4 pages) submissions. Long and short papers will be presented orally or
as posters as determined by the program committee; presentation mode does
not reflect the quality of the work.
Submissions are invited on topics that include, but are not limited to, the
following:
- Enabling technologies: (any size) language models, diacritization,
lemmatization, morphological analysis, disambiguation, tokenization, POS
tagging, named entity detection, chunking, parsing, semantic role labeling,
sentiment analysis, Arabic dialect modeling, etc.
- Applications: dialog modeling, machine translation, speech recognition,
speech synthesis, optical character recognition, pedagogy, assistive
technologies, social media analytics, etc.
- Resources: dictionaries, annotated data, corpora, etc.
Submissions may include work in progress as well as finished work.
Submissions must have a clear focus on specific issues pertaining to the
Arabic language whether it is standard Arabic, dialectal, classical, or
mixed. Papers on other languages sharing problems faced by Arabic NLP
researchers, such as Semitic languages or languages using Arabic script,
are welcome provided that they propose techniques or approaches that would
be of interest to Arabic NLP, and they explain why this is the case.
Additionally, papers on efforts using Arabic resources but targeting other
languages are also welcome. Descriptions of commercial systems are welcome,
but authors should be willing to discuss the details of their work. We also
welcome position papers and surveys about any of the above topics.
Conference Paper Submission URL: TBA
Important Dates for Conference Papers
- April 22, 2024: Abstract submission for conference papers due date
- April 29, 2024: Conference paper due date
- May 21, 2024: Reviews submission deadline
- June 4, 2024: Notification of acceptance
- June 24, 2024: Camera-ready papers due
- August 16, 2024: ArabicNLP conference
All deadlines are 11:59 pm UTC -12h
<https://www.timeanddate.com/time/zone/timezone/utc-12> (“Anywhere on
Earth”).
If you have any questions, please contact us at:
arabicnlp-pc-chairs(a)sigarab.org
The ArabicNLP 2024 Organizing Committee
--
Salam Khalifa
PhD Student at Stony Brook Linguistics
<https://www.linguistics.stonybrook.edu/>.
Call for Participation
We are announcing the first BEA (2024) shared task on automated prediction of Difficulty And Response Time for Multiple Choice Questions (DART-MCQ).
Motivation
For standardized exams to be fair and valid, test questions, otherwise known as items, must meet certain criteria. One important criterion is that the items should cover a wide range of difficulty levels to gather information about the abilities of test takers effectively. Additionally, it is essential to allocate an appropriate amount of time for each item: too little time can make the exam speeded, while too much time can make it inefficient.
There is growing interest in predicting item characteristics such as difficulty and response time based on the item text. However, due to difficulties with sharing exam data, efforts to advance the state-of-the-art in item parameter prediction have been fragmented and conducted in individual institutions, with no transparent evaluation on a publicly available dataset. In this Shared Task, we bridge this gap by sharing practice item content and characteristics from a high-stakes medical exam called the United States Medical Licensing Examination® (USMLE®) for the exploration of two topics: predicting item difficulty (Track 1) and item response time (Track 2) based on item text.
Participation
The shared task has two separate tracks:
• Track 1: Given the item text and metadata, predict the item difficulty variable.
• Track 2: Given the item text and metadata, predict the time intensity variable.
Important Dates
Training data release: January 15
Test data release: February 10
Results due: February 16
Announcement of winners: February 21
Paper submissions due: March 10
Camera-ready papers due: April 22
Links
For more information about the shared task, see: https://sig-edu.org/sharedtask/2024
Organizers
Victoria Yaneva, National Board of Medical Examiners
Peter Baldwin, National Board of Medical Examiners
Kai North, George Mason University
Brian Clauser, National Board of Medical Examiners
Saed Rezayi, National Board of Medical Examiners
Yiyun Zhou, National Board of Medical Examiners
Le An Ha, University of Wolverhampton
Polina Harik, National Board of Medical Examiners