Apologies for cross-posting.
---------------------------------------------------------------------------
The Seventh Workshop on Technologies for Machine Translation of Low-Resource
Languages (LoResMT 2024)
https://www.loresmt.org/
@ ACL 2024 (August 11–16, 2024)
Bangkok, Thailand
SUBMISSION
https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/LoResMT
TIMELINE
Paper submission due: May 17 (Friday), 2024, at 23:59 (Anywhere on Earth)
Notification of acceptance: June 17 (Monday), 2024
Camera-ready papers due: July 1 (Monday), 2024, at 23:59 (Anywhere on Earth)
Workshop date at ACL: August 15, 2024
SCOPE
Based on the success of past low-resource machine translation (MT)
workshops at AMTA 2018 (https://amtaweb.org/), MT Summit 2019
(https://www.mtsummit2019.com), AACL-IJCNLP 2020 (http://aacl2020.org/),
AMTA 2021, COLING 2022 and EACL 2023, we introduce the Seventh LoResMT
Workshop at ACL 2024. The workshop provides a discussion panel for
researchers working on MT systems/methods for low-resource and
under-represented languages in general. We would like to help
review/overview the state of MT for low-resource languages and define the
most important directions. We also solicit papers dedicated to
supplementary NLP tools that are used in any language and especially in
low-resource languages. Overview papers on these NLP tools are very
welcome. It will be beneficial if the evaluations of these tools in
research papers include their impact on the quality of MT output.
TOPICS
We are highly interested in (1) original research papers, (2)
review/opinion papers, and (3) online systems on the topics below; however,
we welcome all novel ideas that cover research on low-resource languages.
- Neural machine translation (NMT) for low-resource languages
- Use of LLMs (large language models) for low-resource MT systems
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation
SUBMISSION INFORMATION
We are soliciting two types of submissions: (1) research, review, and
position papers and (2) system demonstration papers. For research, review
and position papers, the length of each paper should be at least four (4)
and not exceed eight (8) pages, plus unlimited pages for references. For
system demonstration papers, the limit is four (4) pages. Submissions
should be formatted according to the official ACL 2024 style templates.
Accepted papers will be published online in the ACL 2024 proceedings and
will be presented at the conference.
Submissions must be anonymized and should be done using the provided
submission system. Scientific papers that have been or will be submitted to
other venues must be declared as such and must be withdrawn from the other
venues if accepted and published at LoResMT. The review will be
double-blind. Authors of an accepted paper should present their paper in
person at ACL 2024. Papers should be submitted in PDF via the LoResMT
OpenReview page.
We would like to encourage authors to cite papers written in ANY language
that are related to the topics, as long as both original bibliographic
items and their corresponding English translations are provided.
Registration is handled by the main conference (https://2024.aclweb.org/).
ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jade Abbott, Retro Rabbit
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Varvara Logacheva, Skolkovo Institute of Science and Technology
Xiaobing Zhao, Minzu University of China
PROGRAM COMMITTEE (LISTED ALPHABETICALLY)
Abigail Walsh, ADAPT Centre, Dublin City University, Ireland
Alberto Poncelas, Rakuten, Singapore
Alina Karakanta, Leiden University
Amirhossein Tebbifakhr, Fondazione Bruno Kessler
Anna Currey, Amazon Web Services
Aswarth Abhilash Dara, Amazon
Arturo Oncevay, University of Edinburgh
Atul Kr. Ojha, DSI, University of Galway & Panlingua Language Processing LLP
Barry Haddow, University of Edinburgh
Bogdan Babych, Heidelberg University
Chao-Hong Liu, Potamu Research Ltd
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Diptesh Kanojia, University of Surrey, UK
Duygu Ataman, University of Zurich
Ekaterina Vylomova, University of Melbourne, Australia
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Flammie Pirinen, UiT The Arctic University of Norway, Tromsø
Koel Dutta Chowdhury, Saarland University (Germany)
Jade Abbott, Retro Rabbit
Jasper Kyle Catapang, University of the Philippines
Jindřich Libovický, Charles University
John P. McCrae, DSI, University of Galway
Liangyou Li, Noah’s Ark Lab, Huawei Technologies
Majid Latifi, University of York, York, UK
Maria Art Antonette Clariño, University of the Philippines Los Baños
Mathias Müller, University of Zurich
Nathaniel Oco, De La Salle University (Philippines)
Rajdeep Sarkar, Yahoo
Rico Sennrich, University of Zurich
Saliha Muradoglu, The Australian National University
Sangjee Dondrub, Qinghai Normal University
Santanu Pal, WIPRO AI
Sardana Ivanova, University of Helsinki
Shantipriya Parida, Silo AI
Sunit Bhattacharya, Charles University
Surafel Melaku Lakew, Amazon AI
Wen Lai, Center for Information and Language Processing, LMU Munich
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
CONTACT
Please email loresmt(a)googlegroups.com if you have any
questions/comments/suggestions.
Apologies for cross-posting. Please circulate this call for wider
publicity.
---------------------------------------------------------------------------
7th Workshop on Indian Language Data: Resources and Evaluation (WILDRE)
Venue: Lingotto Conference Centre - Torino, Italy (Organized under LREC-COLING
2024 (20-25 May 2024) <https://lrec-coling-2024.org/>)
Website: http://sanskrit.jnu.ac.in/conf/wildre7
WILDRE-7, the 7th Workshop on Indian Language Data: Resources and
Evaluation is proposed to be organised in Lingotto Conference Centre -
Torino, Italy under the LREC-COLING platform. India has a huge linguistic
diversity and has seen concerted efforts from the Indian government and
industry to develop language resources. The European Language Resources
Association (ELRA) and its associate organizations have been very active
and successful in addressing the challenges and opportunities related to
language resource creation and evaluation. It is therefore a big
opportunity for resource creators of Indian languages to showcase their
work on this platform and also to interact and learn from those involved in
similar initiatives all over the world. The broader objectives of
WILDRE are:
- To map the status of Indian Language Resources
- To investigate challenges related to creating and sharing various levels
  of language resources
- To promote a dialogue between language resource developers and users
- To provide an opportunity for researchers from India to collaborate with
  researchers from other parts of the world
*Dates for Short/Long papers and Posters and Demos*
March 06, 2024: Paper submissions due [extended deadline; originally February 28, 2024]
March 28, 2024: Notification of acceptance
April 10, 2024: Camera-ready papers due
SUBMISSIONS
Papers must describe original, completed or in-progress, and unpublished work.
Each submission will be reviewed by three program committee members.
Accepted papers will be given up to 10 pages (for full papers) or 5 pages
(for short papers and posters) in the workshop proceedings, and will be
presented as an oral paper or poster.
Papers should be formatted according to the LREC-COLING style sheet, which
is provided on the LREC-COLING 2024 website
(https://lrec-coling-2024.org/authors-kit/). Papers should be submitted in
PDF format to the LREC-COLING website
(https://softconf.com/lrec-coling2024/wildre-7/).
We are seeking submissions in the following categories:
- Full papers (10 pages)
- Short papers (work in progress: 5 pages)
- Posters (innovative ideas/proposals, research proposals of students)
- Demos (of working online/standalone systems)
WILDRE-7 will have a special focus on Demos of Indian Language Technology.
In the past few years, as more resources have been developed and made
available, there has been increased activity in developing usable
technology using these. WILDRE-7 would like to encourage and widen the Demo
track to allow the community to showcase their demos and have mutually
beneficial interactions with each other as well as resource developers.
WILDRE-7 is seeking full papers, short papers, posters and demos on the
following topics related to Indian Language Resources:
- Digital Humanities, heritage computing
- Corpora - text, speech, multimodal, methodologies, annotation and tools
- Lexicons and machine-readable dictionaries
- Ontologies, grammars
- Language resources for NLP/IR/Speech tasks, tools and infrastructure
  for language resources
- Standards or specifications for language resources application
- Licensing and copyright issues
- Data mining
- Text summarization
Both submission and review processes will be handled electronically. The
review process will be double-blind. The workshop website will provide the
submission guidelines and the link for the electronic submission.
When submitting a paper from the START page, authors will be asked to
provide essential information about resources (in a broad sense, i.e.
technologies, standards, evaluation kits, etc.) that have been used for the
work described in the paper or are a new result of your research. Moreover,
ELRA encourages all LREC-COLING authors to share the described LRs (data,
tools, services, etc.), to enable their reuse, and replicability of
experiments, including evaluation ones, etc.
For further information on this initiative, please refer to
https://lrec-coling-2024.org/
Shared Task
Following the success of the five WILDRE workshops, WILDRE-7 will
include *Code-mixed Less-Resourced Sentiment Analysis (Code-mixed)* and
*Discourse Machine Translation (DiscoMT)* Shared Tasks. The organizers of
shared tasks will
provide datasets and evaluation platforms to evaluate systems developed by
the participants. For further information on this initiative, please refer
to http://sanskrit.jnu.ac.in/conf/wildre7
*Workshop Organisers*
- Girish Nath Jha, Jawaharlal Nehru University, India
- Kalika Bali, Microsoft Research India Lab, Bangalore, India
- Sobha L, AU-KBC, Anna University, Chennai, India
- Atul Kr. Ojha, University of Galway, Ireland & Panlingua Language
  Processing LLP, India
Workshop contact:
Atul Kr. Ojha, University of Galway, Ireland & Panlingua Language
Processing LLP, India, shashwatup9k(a)gmail.com
Identify, Describe and Share your LRs
Describing your LRs in the LRE Map is now a normal practice in the
submission procedure of LREC (introduced in 2010 and adopted by other
conferences). To continue the efforts initiated at LREC 2014 about “Sharing
LRs” (data, tools, web services, etc.), authors will have the possibility,
when submitting a paper, to upload LRs in a special LREC repository. This
effort of sharing LRs, linked to the LRE Map for their description, may
become a new “regular” feature for conferences in our field, thus
contributing to creating a common repository where everyone can deposit and
share data.
As scientific work requires accurate citations of referenced work to allow
the community to understand the whole context and also replicate the
experiments conducted by other researchers, LREC-COLING 2024 endorses the
need to uniquely identify LRs through the use of the International Standard
Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique
Identifier to be assigned to each Language Resource. The assignment of
ISLRNs to LRs cited in LREC-COLING papers will be offered at submission
time.
*CFP: The 3rd Annual Meeting of the ELRA-ISCA Special Interest Group on
Under-resourced Languages (SIGUL2024)*
* Workshop website: https://sigul-2024.ilc.cnr.it
* When: Monday and Tuesday, May 20th-21st, 2024
* Where: Torino, Italy (co-located with LREC-COLING 2024)
* Deadline for submissions: February 26th, 2024
* Paper submission link: https://softconf.com/lrec-coling2024/sigul2024/
* Deadline for camera-ready papers: April 5th, 2024
The 3rd Annual Meeting of the ELRA (http://www.elra.info/) / ISCA
(https://www.isca-speech.org/iscaweb/index.php) Special Interest Group on
Under-Resourced Languages (SIGUL2024, http://www.elra.info/en/sig/sigul/)
will provide a forum for
the presentation and discussion of cutting-edge research in language
processing for under-resourced languages by academic and industry
researchers.
SIGUL2024 is held over two days to allow for extended discussions and
interaction.
Far from being just a smaller version of a conference, SIGUL2024 aims to
create the conditions for an exchange of knowledge and a comparison of
needs and perspectives between research and practice in the field to
take place.
We invite contributions (regular long papers of 8 pages or short papers
of 4 pages) on any topic in the following non-exhaustive list:
* Processing any under-resourced languages (covering less-resourced,
  under-resourced, endangered, minority, and minoritized languages)
* Cognitive and linguistic studies of under-resourced languages
* Fast resources acquisition: text and speech corpora, parallel texts,
  dictionaries, grammars, and language models
* Zero and few-shot methodologies and self-supervised learning in
  language and speech technologies
* Cross-lingual and multilingual acoustic and lexical modeling
* Speech recognition and synthesis for under-resourced languages and
  dialects
* Machine translation and speech-to-speech translation
* Spoken dialogue systems
* Applications of language technologies for under-resourced languages
* Large language models and under-resourced languages
* Special topic:
    o Text and speech resources and technologies for the languages of
      Italy
Special Session on languages of Italy and language technologies
Italy is known for its linguistic diversity that reflects its long and
varied history. To celebrate it, SIGUL2024 will provide a special
session or forum for researchers interested in developing language
resources and technologies for the many languages of Italy (regional,
minority, or heritage languages, including those of the neighboring
countries).
Submissions
Authors can choose among three paper categories:
* Regular long papers – up to eight (8) pages maximum, presenting
  substantial, original, completed, and unpublished work.
* Short papers – up to four (4) pages, describing work-in-progress
  projects in the early stage of development, new resources, negative
  results, system demonstrations, and early-career/student work.
* Position papers – up to eight (8) pages, for reflective
  considerations of methodological, best practice, and institutional
  issues (e.g., ethics, data ownership, speakers’ community
  involvement, de-colonizing approaches).
The above page limits exclude any number of additional pages that may be
needed for references.
The form of the presentation may be oral or poster, whereas in the
proceedings there is no difference between the accepted papers.
Submission is NOT anonymous and the official LREC-COLING 2024 format
must be adopted. Each paper will be reviewed by three independent reviewers.
Invited speakers
Eddie Avila, GlobalVoices
Jean Maillard, FAIR, META
Important Dates
• 26 February 2024: submission due
• 18 March 2024: reviews due
• 22 March 2024: notifications to authors
• 5 April 2024: camera-ready (PDF) due
Diversity & Inclusion Subsidies
SIGUL2024 is providing funds for registration and travel or for
bandwidth/VPN. We encourage citizens of developing countries and members
of marginalised communities to apply for subsidies. Details on the
application procedure will be available on the workshop website. For
inquiries, please contact claudia.soria[AT]ilc.cnr.it.
Workshop Organizers
Maite Melero, Sakriani Sakti, Claudia Soria
Program Committee
* Mohammad A. M. Abushariah (The University of Jordan, Jordan)
* Manex Aguirrezabal (University of Copenhagen – Center for
  Sprogteknologi | Center for Language Technology, Denmark)
* Shyam S. Agrawal (KIIT, Gurugram, India)
* Begoña Altuna (HiTZ Center - Ixa, Euskal Herriko Unibertsitatea |
  University of the Basque Country, Spain)
* Antti Arppe (University of Alberta, Canada)
* Martin Benjamin (Kamusi Project International)
* Delphine Bernhard (Université de Strasbourg, LiLPa, France)
* Steven Bird (Charles Darwin University, Australia)
* Claudia Borg (University of Malta)
* Matt Coler (University of Groningen, Campus Fryslân, The Netherlands)
* Dan Cristea (Romanian Academy, Romania)
* Pradip Kumar Das (IIT Guwahati, India)
* A. Seza Doğruöz (Universiteit Gent, België | Ghent University, Belgium)
* Stefano Ghazzali (Language Technologies Unit Bangor University
  Prifysgol Bangor | Bangor University, Bangor, Gwynedd)
* Itziar Gonzalez-Dios (HiTZ Basque Center for Language Technologies -
  Ixa, University of the Basque Country UPV/EHU)
* Lars Hellan (Norwegian University of Science and Technology, Norway)
* Mélanie Jouitteau (IKER, CNRS, France)
* Ritesh Kumar (UnReaL-TecE LLP, India)
* Richard Littauer
* Teresa Lynn (Mohamed bin Zayed University of Artificial
  Intelligence, United Arab Emirates)
* Nina Markl (University of Essex, UK)
* Maite Melero (Barcelona Supercomputing Center, Espanya | Spain)
* Peter Mihajlik (Budapest University of Technology and Economics,
  Hungary)
* Win Pa Pa (UCS Yangon, Myanmar)
* Sandy Ritchie (Google Research)
* Sakriani Sakti (JAIST, Japan)
* Nay San (Stanford University, USA)
* Claudia Soria (CNR-ILC, Italia | Italy)
* Daan Van Esch (Google Research)
* Menno van Zaanen (South African Centre for Digital Language
  Resources, South Africa)
* Jenifer Vega Rodriguez (GIPSA-lab, Université Grenoble Alpes, France)
* Marcely Zanon Boito (NAVER Labs Europe, France)
Identify, Describe and Share your LRs!
When submitting a paper from the START page, authors will be asked to
provide essential information about resources (in a broad sense, i.e.
also technologies, standards, evaluation kits, etc.) that have been used
for the work described in the paper or are a new result of your
research. Moreover, ELRA encourages all LREC-COLING authors to share the
described LRs (data, tools, services, etc.) to enable their reuse and
replicability of experiments (including evaluation ones).
Contact
claudia.soria[AT]ilc.cnr.it
Please, write “SIGUL2024” in the subject of your e-mail.
--
Claudia Soria
CNR, ISTITUTO DI LINGUISTICA COMPUTAZIONALE "ANTONIO ZAMPOLLI"
claudia.soria(a)ilc.cnr.it
Tel. 0503153166
Via Giuseppe Moruzzi, 1, 56124 – Pisa
www.ilc.cnr.it
*www.cnr.it* <http://www.cnr.it/>
Devolvi il 5×1000 al CNR
CF 80054330586
Apologies for cross-posting
*2nd Workshop on Resources and Technologies for Indigenous, Endangered and
Lesser-resourced Languages in Eurasia (EURALI) @ LREC-COLING 2024*
Date: 20-25 May, 2024
Paper submissions due March 04th, 2024
Venue: Lingotto Conference Centre - Torino (Italia)
Main website: https://sites.google.com/view/eurali/
LREC-COLING 2024 website: https://lrec-coling-2024.org/
Submission website: https://softconf.com/lrec-coling2024/eurali2024/
——————————————————————————————————
*Workshop overview and objectives*
This workshop will focus on the development of language technology
resources and tools for indigenous, endangered and lesser-resourced
languages on the Eurasian continent.
In a media-centric world where language technology allows people to break
cultural and language barriers, it is important that speakers of endangered
and indigenous languages can be empowered to use this technology to
continue to share their knowledge and culture with the world. With the hope
of bridging this gap, the goal of this workshop is to increase visibility
and promote research for lesser-resourced and under-represented languages
in Europe and Asia. Through collaboration between NLP researchers, language
experts and linguists working for the benefit of endangered languages in
these communities, we aim to create language technology resources that will
help to preserve and revive these languages for future generations.
Furthermore, the workshop aims to promote the emergence of new methods that
benefit linguists (e.g. automating analysis and validation processes),
field linguists (facilitating data collection and analysis processes), and
computational linguists (developing new techniques necessary for linguistic
analysis, development of supervised or weakly supervised methods for the
analysis of poorly written or undocumented languages).
The main objective of the workshop is to create basic resources and develop
tools for Eurasiatic languages, including but not limited to the following
topics:
- identifying languages and variants spoken in these regions
- creation of language resources and applications, e.g. sentiment analysis,
named entity recognition, and syntactic parsing
- standardization for endangered languages
- automatic identification and classification of lexical variation and
language varieties
- adaptation of fundamental NLP tools for these languages, e.g.,
morphological analysis, taggers and parsers
- reusability of language resources in NLP applications, e.g. machine
translation, and POS tagging
- machine translation between closely related languages
- evaluation of language resources and tools when applied to
lesser-resourced languages in the same language families
- corpora, resources, and tools for closely related languages
- linguistic and textual similarities among languages in Eurasia
- digitalization of endangered languages
- challenges in the creation of language resources and tools from
linguistic perspectives (including any formal-theoretical perspective)
*Submissions*
We are seeking submissions in the following categories:
- Full papers: 8 pages+unlimited references
- Short papers (work in progress): 4 pages+unlimited references
- Posters (innovative ideas/proposals, a research idea of students): 4
pages+unlimited references
- Demo (of working online/standalone systems): 2 pages
Papers must describe original, completed or in progress, and unpublished
work. Accepted full papers, short papers and posters will be included
in the workshop proceedings and will be presented as an oral presentation
or poster.
Papers should be formatted according to the LREC-COLING style sheet (
https://lrec-coling-2024.org/authors-kit/), which is provided on the
LREC-COLING 2024 website (https://lrec-coling-2024.org/). Please submit
papers in PDF format to the START account (
https://softconf.com/lrec-coling2024/eurali2024/). For further information
on this initiative, please refer to https://sites.google.com/view/eurali/.
*Important Dates*
March 04, 2024: Paper submissions due [extended deadline]
March 22, 2024: Paper notification of acceptance
May 25, 2024: Workshop
*Workshop Chairs*
Atul Kr. Ojha, University of Galway, Galway (Ireland)
Sina Ahmadi, George Mason University, Fairfax VA (USA)
Chao-Hong Liu, Potamu Research Ltd, Dublin (Ireland)
John P. McCrae, University of Galway, Galway (Ireland)
Theodorus Fransen, Università Cattolica del Sacro Cuore, Milan (Italy)
Silvie Cinková, Charles University, Prague (Czech Republic)
*Programme Committee (to be updated)*
Abigail Walsh, Dublin City University, Dublin (Ireland)
Aiala Rosá, Universidad de la República - Uruguay, Montevideo (Uruguay)
Aryaman Arora, Stanford University, Stanford, California (USA)
A. Seza Doğruöz, Ghent University, Ghent (Belgium)
Alina Karakanta, University of Leiden, Leiden (Netherlands)
Alina Wróblewska, Institute of Computer Science, Jana Kazimierza, Warszawa
(Poland)
Akanksha Bansal, Panlingua, Delhi (India)
Atul Kr. Ojha, University of Galway, Galway (Ireland) & Panlingua, (India)
Bharathi Raja Chakravarthi, University of Galway, Galway (Ireland)
Bogdan Babych, Heidelberg University, Heidelberg (Germany)
Çağrı Çöltekin, University of Tübingen, Tübingen (Germany)
Chao-Hong Liu, Potamu Research Ltd, Dublin (Ireland)
Chihiro Taguchi, the University of Notre Dame, Notre Dame (USA)
Daan van Esch, Google, Amsterdam (Netherlands)
Daniel Zeman, Charles University, Prague (Czech Republic)
Deepak Alok, IIT-Delhi, Delhi (India)
Dorothee Beermann, Norwegian University of Science and Technology,
Trøndelag (Norway)
Esha Banerjee, J.P. Morgan, Bengaluru (India)
Ekaterina Vylomova, University of Melbourne, Melbourne (Australia)
George Rehm, GmbH, Berlin (Germany)
Hiwa Asadpour, Goethe University, Frankfurt (Germany)
Jamal Abdul Nasir, University of Galway, Galway (Ireland)
Joakim Nivre, Uppsala University, (Sweden)
John P. McCrae, University of Galway, (Ireland)
John E. Ortega, New York University (USA)
Jonathan Washington, Swarthmore College, Swarthmore (USA)
Joseph Mariani, LIMSI-CNRS, Paris (France)
Kaja Dobrovoljc, University of Ljubljana, Ljubljana (Slovenia)
Khalid Choukri, ELDA/ELRA, Paris (France)
Luke D. Gessler, University of Colorado at Boulder (USA)
Maitrey Mehta, University of Utah, Utah (USA)
Marie-Catherine de Marneffe, UCLouvain, Collège Léon Dupriez (Belgium)
Mayank Jobanputra, University of Tübingen, Tübingen (Germany)
Olesea Caftanatov, Vladimir Andrunachievici Institute of Mathematics and
Computer Science, Chişinău (Moldova)
Ranka Stanković, University of Belgrade, Belgrade (Serbia)
Rico Sennrich, University of Zurich, Zurich (Switzerland)
Ritesh Kumar, Agra University, Agra (India)
Rute Costa, the Universidade NOVA de Lisboa, Lisbon (Portugal)
Saliha Muradoglu, Australian National University, Canberra (Australia)
Sarah Moeller, University of Florida, Gainesville, FL (USA)
Silvie Cinková, Charles University, Prague (Czech Republic)
Sina Ahmadi, George Mason University, (USA)
Stella Markantonatou, Athena RC, Athens (Greece)
Sourabrata Mukherjee, Charles University, Prague (Czech Republic)
Theodorus Fransen, Università Cattolica del Sacro Cuore, Milan (Italy)
Valentin Malykh, MTS AI / ITMO University
Verginica Barbu Mititelu, Research Institute for Artificial Intelligence,
Bucharest (Romania)
Victoria Bobicev, University of Moldova, Chișinău (Moldova)
Voula Giouli, Institute for Language and Speech Processing, Athens (Greece)
EXTENDED DEADLINE (28 February 2024)
The fifth workshop on Resources for African Indigenous Languages (RAIL)
Co-located with LREC-COLING 2024
https://bit.ly/rail2024
New: extended deadline
Conference dates: 20-25 May 2024
Workshop date: 25 May 2024
Venue: Lingotto Conference Centre, Torino (Italy)
The fifth RAIL workshop website: https://bit.ly/rail2024
LREC-COLING 2024 website: https://lrec-coling-2024.org/
Submission website: https://softconf.com/lrec-coling2024/rail2024/
The fifth Resources for African Indigenous Languages (RAIL) workshop
will be co-located with LREC-COLING 2024 in Lingotto Conference Centre,
Torino, Italy on 25 May 2024. The RAIL workshop is an interdisciplinary
platform for researchers working on resources (data collections, tools,
etc.) specifically targeted towards African indigenous languages. In
particular, it aims to create the conditions for the emergence of a
scientific community of practice that focuses on data, as well as
computational linguistic tools specifically designed for or applied to
indigenous languages found in Africa.
Many African languages are under-resourced while only a few of them are
somewhat better resourced. These languages often share interesting
properties such as writing systems, or tone, making them different from
most high-resourced languages. From a computational perspective, these
languages lack enough corpora to undertake high level development of
Human Language Technologies (HLT) and Natural Language Processing (NLP)
tools, which in turn impedes the development of African languages in
these areas. During previous workshops, it has become clear that the
problems and solutions presented are not only applicable to African
languages but are also relevant to many other low-resource languages.
Because these languages share similar challenges, this workshop
provides researchers with opportunities to work collaboratively on
issues of language resource development and learn from each other.
The RAIL workshop has several aims. First, the workshop brings together
researchers who work on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Creating resources for less-resourced languages” as
its theme, but submissions on any topic related to properties of
African indigenous languages (including non-African languages) may be
accepted. Suggested topics include (but are not limited to) the
following:
- Digital representations of linguistic structures
- Descriptions of corpora or other data sets of African indigenous
  languages
- Building resources for (under-resourced) African indigenous languages
- Developing and using African indigenous languages in the digital age
- Effectiveness of digital technologies for the development of African
  indigenous languages
- Revealing unknown or unpublished existing resources for African
  indigenous languages
- Developing desired resources for African indigenous languages
- Improving quality, availability and accessibility of African indigenous
  language resources
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content for a long submission and up to four (4)
pages of content for a short submission plus additional pages of
references. The final camera-ready versions of accepted long papers are
allowed one additional page of content (up to 9 pages) so that
reviewers’ feedback can be incorporated. Papers should be formatted
according to the LREC-COLING style sheet
(https://lrec-coling-2024.org/authors-kit/), which is provided on the
LREC-COLING 2024 website (https://lrec-coling-2024.org/). Reviewing is
double-blind, so make sure to anonymise your submission (e.g., do not
provide author names, affiliations, project names, etc.). Limit the
number of self-citations (anonymised citations should not be used). The
RAIL workshop follows the LREC-COLING submission requirements.
Please submit papers in PDF format to the START account
(https://softconf.com/lrec-coling2024/rail2024/). Accepted papers will
be published in proceedings linked to the LREC-COLING conference.
Important dates:
Submission deadline: 28 February 2024 (AoE)
Date of notification: 15 March 2024
Camera-ready deadline: 29 March 2024
RAIL workshop: 25 May 2024
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
________________________________
The fifth workshop on Resources for African Indigenous Languages (RAIL)
Co-located with LREC-COLING 2024
https://bit.ly/rail2024
Conference dates: 20-25 May 2024
Workshop date: 25 May 2024
Venue: Lingotto Conference Centre, Torino (Italy)
The fifth RAIL workshop website: https://bit.ly/rail2024
LREC-COLING 2024 website: https://lrec-coling-2024.org/
Submission website: https://softconf.com/lrec-coling2024/rail2024/
The fifth Resources for African Indigenous Languages (RAIL) workshop
will be co-located with LREC-COLING 2024 at the Lingotto Conference Centre,
Torino, Italy on 25 May 2024. The RAIL workshop is an interdisciplinary
platform for researchers working on resources (data collections, tools,
etc.) specifically targeted towards African indigenous languages. In
particular, it aims to create the conditions for the emergence of a
scientific community of practice that focuses on data, as well as
computational linguistic tools specifically designed for or applied to
indigenous languages found in Africa.
Many African languages are under-resourced while only a few of them are
somewhat better resourced. These languages often share interesting
properties, such as writing systems or tone, that distinguish them from
most high-resourced languages. From a computational perspective, these
languages lack sufficient corpora to support high-level development of
Human Language Technologies (HLT) and Natural Language Processing (NLP)
tools, which in turn impedes the development of African languages in
these areas. During previous workshops, it has become clear that the
problems and solutions presented are not only applicable to African
languages but are also relevant to many other low-resource languages.
Because these languages share similar challenges, this workshop
provides researchers with opportunities to work collaboratively on
issues of language resource development and learn from each other.
The RAIL workshop has several aims. First, the workshop brings together
researchers who work on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state of the art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Creating resources for less-resourced languages” as
its theme, but submissions on any topic related to properties of
African indigenous languages (including non-African languages) may be
accepted. Suggested topics include (but are not limited to) the
following:
* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous
languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African
indigenous languages
* Revealing unknown or unpublished existing resources for African
indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African
indigenous language resources
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content for a long submission and up to four (4)
pages of content for a short submission plus additional pages of
references. The final camera-ready versions of accepted long papers are
allowed one additional page of content (up to 9 pages) so that
reviewers’ feedback can be incorporated. Papers should be formatted
according to the LREC-COLING style sheet
(https://lrec-coling-2024.org/authors-kit/), which is provided on the
LREC-COLING 2024 website (https://lrec-coling-2024.org/). Reviewing is
double-blind, so make sure to anonymise your submission (e.g., do not
provide author names, affiliations, project names, etc.). Limit the
number of self-citations (anonymised citations should not be used). The
RAIL workshop follows the LREC-COLING submission requirements.
Please submit papers in PDF format to the START account
(https://softconf.com/lrec-coling2024/rail2024/). Accepted papers will
be published in proceedings linked to the LREC-COLING conference.
Important dates:
Submission deadline: 23 February 2024
Date of notification: 15 March 2024
Camera-ready deadline: 29 March 2024
RAIL workshop: 25 May 2024
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
________________________________