*** Call for Participation for GenoVarDis at IberLEF 2024 ***
GenoVarDis: NER in Genomic Variants and related Diseases at IberLEF 2024
https://codalab.lisn.upsaclay.fr/competitions/17733
We look forward to your participation in advancing Spanish biomedical text
processing through the GenoVarDis challenge at IberLEF 2024.
This task addresses the shortage of Spanish-language resources for NER on
genomic variants, and it is the first of its kind. By releasing a unique
corpus covering a wide spectrum of mutations and variant-related entities
(including genes, diseases, and symptoms) in Spanish (mainly translated from
English and curated by human experts), we aim to provide valuable data for
training and evaluating NER models in this low-resource domain.
Description of the task:
* Given a text (sequence of tokens), identify the named entities as spans
in the text and classify them according to one of: Variant on DNA sequence,
RS number, Allele on DNA sequence, Wild type and mutant, Variant with
insufficient information, and transcript IDs. Submissions are evaluated with
precision, recall, and F1 score, counting only exact matches; F1 is the
primary metric.
Example of text:
Neurofibromatosis tipo I. Mutación de splicing detectada por MLPA y
secuenciación en la Argentina
La neurofibromatosis tipo 1 (NF1) es un desorden genético autosómico
dominante, con una prevalencia de 1 en 2500-3000 nacidos vivos. La
dificultad diagnóstica se debe al tamaño extenso del gen NF1 con pocos
sitios hot-spot, la ausencia de una clara relación genotipo-fenotipo y
rasgos clínicos con un espectro muy heterogéneo. Un caso sospechoso de NF1
procedente de la provincia de Jujuy fue analizado por MLPA (multiplex
ligation-dependent probe amplification) en nuestro laboratorio. Mujer,
adolescente mestiza (Amerindia/Europea), con un osteoma maxilar, lordosis
lumbar, neurofibromas cutáneos y manchas café con leche. Por MLPA se
detectó una alteración en el exón 13 del gen NF1. Por secuenciación del
exón 13 se identificó una mutación “missense” en la posición 1466 del ARNm
(NM_000267.3:c.1466A>G) que introduce un sitio de splicing aberrante.
#pmid     start  end  term                                     entity
25919870  0      24   Neurofibromatosis tipo I                 Disease
25919870  101    125  neurofibromatosis tipo 1                 Disease
25919870  127    130  NF1                                      Gene
25919870  291    294  NF1                                      Gene
25919870  447    450  NF1                                      Gene
25919870  640    655  Osteoma maxilar                          Disease
25919870  657    672  Lordosis lumbar                          Disease
25919870  674    696  Neurofibromas cutáneos                   Disease
25919870  699    721  Manchas café con leche                   Disease
25919870  747    771  Alteración en el exón 13                 OtherMutation
25919870  780    783  NF1                                      Gene
25919870  833    872  Mutación “missense” en la posición 1466  OtherMutation
25919870  883    894  NM_000267.3                              Transcript
25919870  895    904  c.1466A>G                                DNAMutation
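For teams who want a quick local sanity check before submitting, the exact-match evaluation described above can be sketched roughly as follows. This is only an illustration and not the official GenoVarDis scorer: the function names (`parse_annotations`, `exact_match_scores`) are hypothetical, and it assumes that a prediction counts as correct only when the document ID, start offset, end offset, and entity label all agree with a gold annotation. The organizers' scoring script may define matches differently.

```python
from typing import NamedTuple


class Span(NamedTuple):
    doc_id: str
    start: int
    end: int
    label: str


def parse_annotations(text: str) -> set[Span]:
    """Parse lines of the form: pmid start end term entity.

    The term may contain spaces, so only the first three fields and the
    last field are used (assumption: the entity label is a single token).
    """
    spans = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header row
        parts = line.split()
        spans.add(Span(parts[0], int(parts[1]), int(parts[2]), parts[-1]))
    return spans


def exact_match_scores(gold: set[Span], pred: set[Span]) -> tuple[float, float, float]:
    """Return (precision, recall, F1), counting only exact span+label matches."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Three gold annotations taken from the example above.
GOLD = """#pmid start end term entity
25919870 127 130 NF1 Gene
25919870 883 894 NM_000267.3 Transcript
25919870 895 904 c.1466A>G DNAMutation
"""

# Hypothetical system output: one correct span, one span with wrong boundaries.
PRED = """25919870 127 130 NF1 Gene
25919870 883 904 NM_000267.3:c.1466A>G DNAMutation
"""

p, r, f = exact_match_scores(parse_annotations(GOLD), parse_annotations(PRED))
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
# prints: precision=0.500 recall=0.333 f1=0.400
```

Under exact matching, the merged prediction covering both the transcript and the DNA mutation earns no credit, which is why boundary errors are penalized twice (once in precision, once in recall).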
How to participate:
If you want to participate in this task, please join our Codalab
competition: https://codalab.lisn.upsaclay.fr/competitions/17733
Important Dates:
* March 22, 2024: release training corpus.
* May 24, 2024: release test corpus.
* June 7, 2024: publication of results.
* June 17, 2024: paper submission.
* June 28, 2024: notification of acceptance.
* July 3, 2024: camera ready paper submission.
* September, 2024: IberLEF 2024 Workshop.
*** Apologies for cross-posting ***
Dear colleagues,
The NLP4Health Lab
<https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fnlp4healt…>
in the Department of Medical Informatics at the Amsterdam UMC
<https://www.amsterdamumc.org/en/research.htm> and University of Amsterdam
<https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.uva.n…>
is hiring *one postdoctoral researcher* and *two PhD students* in *Responsible
Natural Language Processing (NLP) and Machine Learning (ML) for Healthcare*.
All positions are fully funded. Do you have a strong background in NLP and
ML, and a keen interest in large language models and healthcare? Please
consider applying. We are accepting applications until March 15!
Please check details and apply via the links below:
- PhD positions:
https://werkenbij.amsterdamumc.org/en/vacatures/research/2-phd-positions-in…
- Postdoc researcher position:
https://werkenbij.amsterdamumc.org/en/vacatures/research/postdoctoral-resea…
The positions are funded by an NGF AiNed Fellowship Grant for the
"CaRe-NLP: Human-Centric and Responsible NLP methods for Dutch healthcare"
project. The overall goal of the project is to develop human-centric and
responsible NLP and ML methods for healthcare in the Netherlands, Europe,
and worldwide. We will design, build, and evaluate state-of-the-art large
language models (LLMs) for healthcare data that include (combinations of)
free-text clinical notes collected in primary/secondary/intensive care
settings, medical images, time series measurements, medical knowledge
graphs, and multi-modal electronic health records (EHRs). Our methods aim
to ensure privacy and fairness, prevent bias, cope with data scarcity,
and be interpretable and explainable. We collaborate with a network of
clinicians across multiple specialties, and you will tackle relevant
clinical problems with real-world impact.
The project team is led by dr. Iacer Calixto and prof. Ameen Abu-Hanna, and
we are housed at the Amsterdam UMC location University of Amsterdam in the
beautiful city of Amsterdam.
(Please feel free to share with your students/communities!)
Have a great week,
Iacer.
Dear Corpora list members,
As part of the EPSRC UK ReproHum project (https://reprohum.github.io), we are surveying NLP and ML researchers’ experience and views of reproducibility. We would like to hear from as many researchers as possible (NLP or ML), not just those who work on evaluation!
If you completed a similar survey in 2022, you can still complete this one; we are interested in how your experience and views have changed between then and now.
We would be most grateful if you could spend 5-10 minutes taking part in the survey, which can be accessed via the link below:
https://forms.gle/RshrHcvAXxAEEFj59
With thanks and apologies for cross-posting.
Craig Thomson
Research Fellow
Computing Science
University of Aberdeen
PAN 2024: Shared Tasks on Authorship Analysis, Computational Ethics, and Originality
Call for Participation
We'd like to invite you to participate in the following shared tasks at PAN 2024 held in conjunction with the CLEF conference in Grenoble, France.
1. Voight-Kampff Generative AI Authorship Verification.
Given two texts, one authored by a human, one by a machine: pick out the human.
https://pan.webis.de/clef24/pan24-web/generated-content-analysis.html
2. Oppositional Thinking Analysis.
Given an online message, is it a conspiracy theory or critical thinking?
https://pan.webis.de/clef24/pan24-web/oppositional-thinking-analysis.html
3. Multi-Author Writing Style Analysis.
Given a document, determine at which positions the author changes.
https://pan.webis.de/clef24/pan24-web/style-change-detection.html
4. Multilingual Text Detoxification.
Given a toxic piece of text, rewrite it in a non-toxic way while preserving as much of the main content as possible.
https://pan.webis.de/clef24/pan24-web/text-detoxification.html
Find out more at https://pan.webis.de/clef24/pan24-web
Important Dates
--------------------------
now              Training data released
May 05, 2024     Software submission
May 31, 2024     Participant paper submission
June 24, 2024    Peer review notification
July 08, 2024    Camera-ready participant papers submission
Sep 09-12, 2024  Conference
Links
--------------------------
PAN: https://pan.webis.de
Contact: pan(a)webis.de
We are looking forward to your submission!
The PAN team
Dear colleague,
We invite you to participate in the 2024 edition of the CheckThat! Lab at
CLEF 2024. This year, we feature six tasks (two follow-up and four new)
that correspond to important components within and around the full
fact-checking pipeline in multiple languages:
Task 1 Check-worthiness in tweets: identify claims that could be
important to verify on social and mainstream media (the only task that has
been organized in all editions of the lab). Available in Arabic,
English, Dutch, and Spanish.
Task 2 Subjectivity in news articles: spot text that should be processed
with specific strategies, benefiting the fact-checking pipeline. Available
in Arabic, English, German, Italian, and Multilingual.
Task 3 Persuasion Techniques: identify text spans in which a persuasion
technique is being used to influence the reader. Available in Arabic,
Bulgarian, English, Portuguese, and Slovene.
Task 4 Detecting hero, villain, and victim from memes: predict the role
of each entity (hero, villain, victim, or other) given a meme and a list
of entities. Available in Arabic, English, and Code-mixed.
Task 5 Rumor Verification using Evidence from Authorities: retrieve
evidence from trusted sources (authorities that have "real knowledge" on
the matter) and determine if the rumor is supported, refuted, or
unverifiable according to the evidence. Available in Arabic and English.
Task 6 Robustness of Credibility Assessment with Adversarial Examples:
discover small changes that could be applied to misinformation text,
causing the provided classifiers to make wrong predictions. Available for
news articles, tweets, propaganda techniques, and claims (including
claims regarding COVID-19) in English.
Further information: https://checkthat.gitlab.io/
Datasets: https://gitlab.com/checkthat_lab/clef2024-checkthat-lab
Register and participate:
https://clef2024-labs-registration.dei.unipd.it/registrationForm.php
Important Dates
---------------------
- November 2023: Lab registration opens
- January 2024: Release of the training materials
- 22 April 2024: Lab registration closes
- 2 May 2024: Beginning of the evaluation cycle (test sets release)
- 6 May 2024 (23:59 AOE): End of the evaluation cycle (run submission)
- 31 May 2024: Deadline for the submission of working notes
- 10 June 2024: Submission of Condensed Lab Overviews [LNCS]
- 21 June 2024: Camera Ready Copy of Condensed Lab Overviews [LNCS] due
- 24 June 2024: Notification of acceptance of working notes
- 8 July 2024: Deadline for submission of camera-ready working notes
- 22-26 July 2024: Preview of working notes
- 9-12 September 2024: CLEF 2024 Conference in Grenoble, France
Best regards,
The CLEF-2024 CheckThat! Lab Shared Task Organizers
The 16th International Conference on Advances in Social Networks Analysis
and Mining (ASONAM-2024)
September 02-05, 2024, Calabria, Italy
Conference Link: https://asonam.cpsc.ucalgary.ca/2024/
Workshop information:
https://asonam.cpsc.ucalgary.ca/2024/CFW.php#key_dates
CALL FOR WORKSHOP PROPOSALS
The 16th International Conference on Advances in Social Networks Analysis
and Mining (ASONAM-2024) invites proposals for workshops at its annual
conference. ASONAM 2024 will be held on September 02-05, 2024 in
Calabria, Italy.
ASONAM is an interdisciplinary venue that brings together practitioners and
researchers from a variety of Social Network Analysis and Mining fields to
promote collaborations and exchange of ideas and practices.
The ASONAM 2024 Committee invites proposals for workshops to be held on
September 02-05, 2024 in conjunction with the main ASONAM 2024 conference.
Workshops can be either scheduled for a full day (morning and afternoon) or
for half a day.
Proposals should include the following information:
- The name of the workshop.
- The names and addresses of the organizers, and a designated contact
person.
- Description of the workshop: abstract, objectives, relevance, and
expected outcome.
- The names of program committee members and, if applicable, other
potential applicants.
- A description of the plans for the workshop (e.g., program, keynotes,
highlights, etc.).
- The expected number of attendees and the planned length of the workshop.
- A description of past versions of the workshop, including dates,
organizers, submission and acceptance counts, attendance, sites, and any
other relevant information.
Important dates:
Workshop proposal deadline:             March 20, 2024, 11:59 PM AoE
Workshop acceptance notification:       April 10, 2024, 11:59 PM AoE
When planning paper submission, reviewing, and final revisions in your
proposal, please observe the following deadlines:
Workshop paper submission deadline:     June 10, 2024, 11:59 PM AoE
Workshop paper acceptance notification: July 10, 2024, 11:59 PM AoE
Workshop paper camera-ready deadline:   July 18, 2024, 11:59 PM AoE
Organizers of accepted proposals will be responsible for publicizing and
running the workshop, including sending out calls for papers, reviewing
submissions, producing the camera-ready workshop proceedings, and
organizing the meeting days.
Submission Link:
https://easychair.org/conferences/?conf=workshopsasonam2024
Looking forward to your workshop proposals which will help make ASONAM 2024
a success!
Kind Regards
Rajesh Sharma
Associate Professor,
Head, Computational Social Science Lab,
Institute of Computer Science,
University of Tartu, Estonia
https://rajeshsharma.cs.ut.ee/
WojoodNER 2024
The 2nd Arabic Named Entity Recognition Shared Task at ArabicNLP’24
https://dlnlp.ai/st/wojood/
We invite you to participate in the second shared task on named entity recognition in Arabic text. Participants will receive the new Wojood corpus (550K words, with fine-grained entity types). The shared task has three subtasks, and you may take part in any of them. One subtask concerns the War on Gaza, and participants may use external data for it.
Dataset: Wojood-Fine <https://aclanthology.org/2023.arabicnlp-1.25/> New version: Arabic Fine-Grained Entity Recognition (Wojood + Subtypes of entity types).
Subtask-1 (Closed-Track Flat Fine-Grain NER): We provide the Wojood-Fine Flat train (70%) and development (10%) datasets. The final evaluation will be on the test set (20%). External data is not allowed .... (read more <https://dlnlp.ai/st/wojood/>).
Subtask-2 (Closed-Track Nested Fine-Grain NER): This subtask is similar to Subtask-1. We provide the Wojood-Fine Nested train (70%) and development (10%) datasets. The final evaluation will be on the test set (20%) .... (read more <https://dlnlp.ai/st/wojood/>).
Subtask-3 (Open-Track NER - Gaza War): to allow participants to reflect on the utility of NER in the context of real-world events, allow them to use external resources, and encourage them to use generative models in different ways (fine-tuned, zero-shot learning, in-context learning, etc.). The goal of focusing on generative models in this particular subtask is to help the Arabic NLP research community better understand the capabilities and performance gaps of LLMs in information extraction, an area currently understudied.
We provide development and test data related to the current War on Gaza. This is motivated by the assumption that discourse about recent global events will involve mentions from a different data distribution. For this subtask, we include data from five different news domains related to the War on Gaza, but we keep the names of the domains hidden. Participants will be given a development dataset (10K tokens, 2K from each of the five domains) and a testing dataset (50K tokens, 10K from each domain). Both development and testing sets are manually annotated with fine-grain named entities using the same annotation guidelines used in Subtask 1 and Subtask 2 (also described in Liqreina et al., 2023). .... (read more <https://dlnlp.ai/st/wojood/>).
BASELINES
Two baseline models trained on WojoodFine (flat and nested) are provided (See Liqreina et al., 2023 <https://aclanthology.org/2023.arabicnlp-1.25/>). The code used to produce these baselines is available on GitHub <https://github.com/SinaLab/ArabicNER>.
Subtask                            Precision  Recall  Average Micro-F1
Flat Fine-Grain NER (Subtask 1)    0.8870     0.8966  0.8917
Nested Fine-Grain NER (Subtask 2)  0.9179     0.9279  0.9229
GOOGLE COLAB NOTEBOOKS
To allow you to experiment with the baseline, we authored four Google Colab notebooks that demonstrate how to train and evaluate our baseline models.
[1] Train Flat Fine-Grain NER <https://gist.github.com/mohammedkhalilia/72c3261734d7715094089bdf4de74b4a>: This notebook can be used to train our ArabicNER model on the flat Fine-grain NER task using the sample Wojood_Fine data.
[2] Evaluate Flat Fine-Grain NER <https://gist.github.com/mohammedkhalilia/c807eb1ccb15416b187c32a362001665>: This notebook uses the trained model saved by the notebook above to perform evaluation on an unseen dataset.
[3] Train Nested Fine-Grain NER <https://gist.github.com/mohammedkhalilia/a4d83d4e43682d1efcdf299d41beb3da>: This notebook can be used to train our ArabicNER model on the nested Fine-grain task using the sample Wojood data.
[4] Evaluate Nested Fine-Grain NER <https://gist.github.com/mohammedkhalilia/9134510aa2684464f57de7934c97138b>: This notebook uses the trained model saved by the notebook above to perform evaluation on an unseen dataset.
REGISTRATION
Participants need to register via this form (NERSharedTask 2024) <https://docs.google.com/forms/d/1ISMILgQYfUug3XuDpxFmuPASXkWaduYOUc3xOZuGwq…>. Participating teams will be provided with common training and development datasets. No external manually labelled datasets are allowed. A blind test set will be used to evaluate the output of the participating teams. Each team is allowed a maximum of 3 submissions. All teams are required to report on the development and test sets (after results are announced) in their write-ups.
FAQ
For any questions related to this task, please check our Frequently Asked Questions <https://docs.google.com/document/d/1W_13FRpP3NbDx_ALYJWA3-ESXPRVomOjNovUuYf…>
IMPORTANT DATES
- February 25, 2024: Shared task announcement.
- March 1, 2024: Release of training data, development sets, scoring script, and Codalab links.
- April 5, 2024: Registration deadline.
- April 26, 2024: Test set made available.
- May 3, 2024: Codalab Test system submission deadline.
- May 10, 2024: Shared task system paper submissions due.
- June 17, 2024: Notification of acceptance.
- July 1, 2024: Camera-ready version.
- August 16, 2024: ArabicNLP 2024 conference in Thailand.
CONTACT
For any questions related to this task, please contact the organizers directly using the following email address: NERSharedtask(a)gmail.com <mailto:NERSharedtask@gmail.com> .
ORGANIZERS
- Mustafa Jarrar, Birzeit University
- Muhammad Abdul-Mageed, University of British Columbia & MBZUAI
- Mohammed Khalilia, Birzeit University
- Bashar Talafha, University of British Columbia
- AbdelRahim Elmadany, University of British Columbia
- Nagham Hamad, Birzeit University
--Mustafa
__________________________
Mustafa Jarrar, PhD
Professor of Artificial Intelligence
Chair, PhD Program in Computer Science
Birzeit University, Palestine
Whatsapp:+972599662258 | mjarrar(a)birzeit.edu <mailto:mjarrar@birzeit.edu>
http://www.jarrar.info <http://www.jarrar.info/>
Apologies for cross-posting
Extended Deadline: March 10, 2024
The First Workshop on Visualization for Natural Language Processing
(Vis4NLP)
May 27th 2024, Odense, Denmark
The workshop will be co-located with EuroVis 2024
<https://www.eurovis.org/eurovis> in Odense, Denmark, and will take place in
person on May 27. The workshop aims to create a dedicated space for
interdisciplinary collaboration at the intersections of NLP and
visualization. Vis4NLP serves as a pivotal platform where researchers,
practitioners, and academics come together to collectively tackle the
ever-evolving challenges and opportunities in NLP visualization.
*Call for papers: http://vis4nlp.com/
Workshop date: May 27, 2024
Venue: Syddansk Universitet - University of Southern Denmark <http://sdu.dk/>*
Important Dates
- Workshop paper due: March 10, 2024 (extended from March 3, 2024)
- Notification of acceptance: April 10, 2024
- Camera-ready papers due: April 20, 2024
- Workshop date: May 27, 2024
All submission deadlines are at 23:59 GMT on the date indicated.
Best regards
*Tariq Yousef*
Assistant Professor of Data Science
Department of Mathematics and Computer Science
Faculty of Science
*University of Southern Denmark*
GAMES AND NLP 2024 @ LREC-COLING 2024
=====================================
Co-located with LREC-COLING in Turin, Italy
21st May 2024
https://gamesandnlp.com
*** Deadline extended: Mar 4th ***
Call for Papers
--------------------
The 10th Workshop on Games and Natural Language Processing (Games and NLP 2024), to be held at LREC-COLING 2024, will examine the use of games and gamification for Natural Language Processing (NLP) tasks, as well as how NLP research can advance player engagement and communication within games. The Games and NLP workshop aims to promote and explore the possibilities for research and practical applications of games and gamification that have a core NLP aspect, either to generate resources and perform language tasks or as a game mechanic itself. This workshop investigates computational and theoretical aspects of natural language research that would be beneficial for designing and building novel game experiences, or for processing texts to conduct formal game studies. NLP would benefit from games in obtaining language resources (e.g., construction of a thesaurus or a parser through a crowdsourcing game), or in learning the linguistic characteristics of game users as compared to those of other domains.
Topics (include, but are not limited to)
--------------------------------------------------
• Games for collecting data useful for NLP
• Gamification of NLP tasks
• Player motivation and experience
• Game design
• Novel uses of natural language processing or generation as a game mechanic
• Natural language in games as an alternative method of input for people with disabilities
• Processing NLP game data
• Analysis of large-scale game-related corpora
• Real-time sentiment analysis of player discourse or chat
• Evaluation of games for NLP
• Serious games for learning languages
• Player immersion in language-enabled mixed reality or physically embodied games
• Narrative plot or text generation of text-based interactive narrative systems
• Natural language understanding and generation of character dialogue
• Ethical and privacy concerns of ownership of text and audio chat in massively multiplayer online games
Submissions:
------------------
The papers should be submitted as a PDF document, conforming to the formatting guidelines provided in the call for papers of LREC-COLING conference (https://lrec-coling-2024.org/authors-kit/). Submissions are to be made via Softconf/START Conference Manager at https://softconf.com/lrec-coling2024/gamesandnlp2024/
Important Dates
---------------------
• Submission Deadline: Mar 4th (*** extended ***)
• Notification of Acceptance: Mar 26th
• Camera Ready Deadline: Apr 1st
• Workshop: May 21st
Organisation Committee
--------------------------------
• Chris Madge, chair (Queen Mary University of London)
• Jon Chamberlain (University of Essex, UK)
• Karën Fort (Sorbonne Université, France)
• Udo Kruschwitz (University of Regensburg, Germany)
• Stephanie Lukin (U.S. Army Research Laboratory)
Programme Committee
-------------------------------
• Alice Millour (Sorbonne Université)
• Brent Harrison (University of Kentucky, US)
• Ian Horswill (Northwestern University)
• Jonathan Lessard (Université Concordia)
• Luisa Coheur (INESC-ID & Instituto Superior Técnico, University of Lisbon)
• Mariët Theune (University of Twente)
• Massimo Poesio (Queen Mary University, UK)
• Mathieu Lafourcade (LIRMM, France)
• Morteza Behrooz (University of California, Santa Cruz, US)
• Pedro Santos (INESC-ID & Instituto Superior Técnico, University of Lisbon)
• Richard Bartle (University of Essex, UK)
• Seth Cooper (Northeastern University, US)
• Valerio Basile (University of Turin, Italy)
• Fatima Althani (Queen Mary University, UK)
**The 6th Workshop on Open-Source Arabic Corpora and Processing Tools (Hybrid) with shared tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation**
The workshop will be conducted in a *hybrid* format to ensure maximum participation, accommodating attendees both online and in-person.
Submission deadline: extended to *March 1*, 2024
*Workshop site* : https://osact-lrec.github.io/
*Shared tasks:*
Task 1: Arabic LLMs Hallucination (contact Hamdy Mubarak), Link: https://sites.google.com/view/arabic-llms-hallucination
Task 2: Dialect to MSA Machine Translation (contact Kareem Darwish), Link: https://codalab.lisn.upsaclay.fr/competitions/17118
*Co-located with LREC-COLING 2024*
https://lrec-coling-2024.org/
Turin, Italy, 20-25 May 2024
*Important Dates*
Submission deadline: extended to *March 1*, 2024
Notification of acceptance: March 25, 2024
Camera-ready papers due: March 30, 2024
Workshop date: May 25, 2024
*Workshop Description*
In the computational linguistics (CL), natural language processing (NLP), and information retrieval (IR) communities, Arabic is considered to be relatively resource-poor compared to English. This situation was thought to be the reason for the limited number of language-resource-based studies in Arabic. However, the past few years have witnessed the emergence of new, considerably large, and free classical and Modern Standard Arabic (MSA) corpora, as well as dialectal corpora and, to a lesser extent, Arabic processing tools.
This workshop follows the footsteps of previous editions of OSACT to provide a forum for researchers to share and discuss their ongoing work. This workshop is timely given the continued rise in research projects focusing on Arabic Language Resources. The sixth workshop comes to encourage researchers and practitioners of Arabic language technologies, including CL, NLP and IR to share and discuss their latest research efforts, corpora, and tools. The workshop will also give special attention to Large Language Models (LLMs) and Generative AI, which is a hot topic nowadays. In addition to the general topics of CL, NLP and IR, the workshop will give a special emphasis on two shared tasks, namely: Arabic LLMs Hallucination and Dialect to MSA Machine Translation.
*Submissions Topics*
Language Resources:
- Pre-trained Arabic language models and their applications.
- Surveying and evaluating the design of available Arabic corpora and their associated processing tools.
- Availing new annotated corpora for NLP and IR applications such as named entity recognition, machine translation, sentiment analysis, text classification, and language learning.
- Evaluating the use of crowdsourcing platforms for Arabic data annotation.
- Open source Arabic processing toolkits.
Tools and Technologies:
- Language education, e.g., L1 and L2.
- Language modeling and pre-trained models.
- Tokenization, normalization, word segmentation, morphological analysis, part-of-speech tagging, etc.
- Sentiment analysis, dialect identification, and text classification.
- Dialect translation.
- Fake news detection.
- Web and social media search and analytics.
- Issues in the design, construction, and use of Arabic LRs: text, speech, sign, gesture, image, in single or multimodal/multimedia data.
- Guidelines, standards, best practices, and models for LRs interoperability.
- Methodologies and tools for LRs construction and annotation.
- Methodologies and tools for extraction and acquisition of knowledge.
- Ontologies, terminology, and knowledge representation.
- LRs and Semantic Web (including Linked Data, Knowledge Graphs, etc.).
*Submissions*
- Submission Instructions: https://lrec-coling-2024.org/authors-kit/
- Submission Link: https://softconf.com/lrec-coling2024/osact2024/
*Workshop organizers*
- Hend Al-Khalifa (King Saud University, KSA)
- Hamdy Mubarak (Qatar Computing Research Institute, Qatar)
- Kareem Darwish (aiXplain Inc., US)
- Tamer Elsayed (Qatar University, Qatar)
- Mona Ali (Northeastern University, Canada)