SIGUL February 2024

sigul@list.elra.info

3 participants
6 discussions

First CFP: LoResMT 2024 at ACL 2024
by Atul K. Ojha 27 Feb '24

Apologies for cross-posting. --------------------------------------------------------------------------- The Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024) https://www.loresmt.org/ @ ACL 2024 (August 11–16, 2024) Bangkok, Thailand SUBMISSION https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/LoResMT TIMELINE Paper submission due: May 17 (Friday), 2024, at 23:59 (Anywhere on Earth) Notification of acceptance: June 17 (Monday), 2024 Camera-ready papers due: July 1 (Monday), 2024, at 23:59 (Anywhere on Earth) Workshop dates at ACL: August 15, 2024 SCOPE Based on the success of past low-resource machine translation (MT) workshops at AMTA 2018 (https://amtaweb.org/), MT Summit 2019 ( https://www.mtsummit2019.com), AACL-IJCNLP 2020 (http://aacl2020.org/), AMTA 2021, COLING 2022 and EACL 2023, we introduce the Seventh LoResMT Workshop at ACL 2024. The workshop provides a discussion panel for researchers working on MT systems/methods for low-resource and under-represented languages in general. We would like to help review/overview the state of MT for low-resource languages and define the most important directions. We also solicit papers dedicated to supplementary NLP tools that are used in any language and especially in low-resource languages. Overview papers on these NLP tools are very welcome. It will be beneficial if the evaluations of these tools in research papers include their impact on the quality of MT output. TOPICS We are highly interested in (1) original research papers, (2) review/opinion papers, and (3) online systems on the topics below; however, we welcome all novel ideas that cover research on low-resource languages. 
- Neural machine translation (NMT) for low-resource languages - Use of LLMs (large language models) for low-resource MT systems - COVID-related corpora, their translations and corresponding NLP/MT systems - Work that presents online systems for practical use by native speakers - Word tokenizers/de-tokenizers for specific languages - Word/morpheme segmenters for specific languages - Alignment/Re-ordering tools for specific language pairs - Use of morphology analyzers and/or morpheme segmenters in MT - Multilingual/cross-lingual NLP tools for MT - Corpora creation and curation technologies for low-resource languages - Review of available parallel corpora for low-resource languages - Research and review papers on MT methods for low-resource languages - MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages - Pivot MT for low-resource languages - Zero-shot MT for low-resource languages - Fast building of MT systems for low-resource languages - Re-usability of existing MT systems for low-resource languages - Machine translation for language preservation SUBMISSION INFORMATION We are soliciting two types of submissions: (1) research, review, and position papers and (2) system demonstration papers. For research, review and position papers, the length of each paper should be at least four (4) and not exceed eight (8) pages, plus unlimited pages for references. For system demonstration papers, the limit is four (4) pages. Submissions should be formatted according to the official ACL 2024 style templates. Accepted papers will be published online in the ACL 2024 proceedings and will be presented at the conference. Submissions must be anonymized and should be done using the provided submission system. Scientific papers that have been or will be submitted to other venues must be declared as such and must be withdrawn from the other venues if accepted and published at LoResMT. The review will be double-blind. 
Authors of an accepted paper should present their paper in person at ACL 2024. Papers should be submitted in PDF to the LoResMT Open Review. We would like to encourage authors to cite papers written in ANY language that are related to the topics, as long as both original bibliographic items and their corresponding English translations are provided. Registration is handled by the main conference (https://2024.aclweb.org/). ORGANIZING COMMITTEE (LISTED ALPHABETICALLY) Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP Chao-Hong Liu, Potamu Research Ltd Ekaterina Vylomova, University of Melbourne, Australia Jade Abbott, Retro Rabbit Jonathan Washington, Swarthmore College Nathaniel Oco, National University (Philippines) Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University Varvara Logacheva, Skolkovo Institute of Science and Technology Xiaobing Zhao, Minzu University of China PROGRAM COMMITTEE (LISTED ALPHABETICALLY) Abigail Walsh, ADAPT Centre, Dublin City University, Ireland Alberto Poncelas, Rakuten, Singapore Alina Karakanta, Leiden University Amirhossein Tebbifakhr, Fondazione Bruno Kessler Anna Currey, Amazon Web Services Aswarth Abhilash Dara, Amazon Arturo Oncevay, University of Edinburgh Atul Kr. Ojha, DSI, University of Galway & Panlingua Language Processing LLP Barry Haddow, University of Edinburgh Bogdan Babych, Heidelberg University Chao-Hong Liu, Potamu Research Ltd Constantine Lignos, Brandeis University, USA Daan van Esch, Google Diptesh Kanojia, University of Surrey, UK Duygu Ataman, University of Zurich Ekaterina Vylomova, University of Melbourne, Australia Eleni Metheniti, CLLE-CNRS and IRIT-CNRS Flammie Pirinen, UiT The Arctic University of Norway, Tromsø Koel Dutta Chowdhury, Saarland University (Germany) Jade Abbott, Retro Rabbit Jasper Kyle Catapang, University of the Philippines Jindřich Libovicky, Charles University John P. 
McCrae, DSI, University of Galway Liangyou Li, Noah’s Ark Lab, Huawei Technologies Majid Latifi, University of York, York, UK Maria Art Antonette Clariño, University of the Philippines Los Baños Mathias Müller, University of Zurich Nathaniel Oco, De La Salle University (Philippines) Rajdeep Sarkar, Yahoo Rico Sennrich, University of Zurich Saliha Muradoglu, The Australian National University Sangjee Dondrub, Qinghai Normal University Santanu Pal, WIPRO AI Sardana Ivanova, University of Helsinki Shantipriya Parida, Silo AI Sunit Bhattacharya, Charles University Surafel Melaku Lakew, Amazon AI Wen Lai, Center for Information and Language Processing, LMU Munich Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University CONTACT Please email loresmt(a)googlegroups.com if you have any questions/comments/suggestions.

Extended deadline [March 6th]: 7th Workshop on Indian Language Data: Resources and Evaluation (WILDRE) @ LREC-COLING 2024
by Atul K. Ojha 27 Feb '24

Apologies for cross-posting. Please circulate this call for wider publicity.

---------------------------------------------------------------------------

7th Workshop on Indian Language Data: Resources and Evaluation (WILDRE)
Venue: Lingotto Conference Centre - Torino, Italy (organized under LREC-COLING 2024, 20-25 May 2024, https://lrec-coling-2024.org/)
Website: http://sanskrit.jnu.ac.in/conf/wildre7

WILDRE-7, the 7th Workshop on Indian Language Data: Resources and Evaluation, is proposed to be organised in Lingotto Conference Centre - Torino, Italy under the LREC-COLING platform. India has huge linguistic diversity and has seen concerted efforts from the Indian government and industry to develop language resources. The European Language Resources Association (ELRA) and its associate organizations have been very active and successful in addressing the challenges and opportunities related to language resource creation and evaluation. It is therefore a big opportunity for resource creators of Indian languages to showcase their work on this platform and also to interact with and learn from those involved in similar initiatives all over the world.

The broader objectives of WILDRE will be
- To map the status of Indian Language Resources
- To investigate challenges related to creating and sharing various levels of language resources
- To promote a dialogue between language resource developers and users
- To provide an opportunity for researchers from India to collaborate with researchers from other parts of the world

*Dates for Short/Long papers and Posters and Demos*
March 06, 2024 (extended from February 28, 2024): Paper submissions due [extended deadline]
March 28, 2024: Notification of paper acceptance
April 10, 2024: Camera-ready papers due

SUBMISSIONS
Papers must describe original, completed or in-progress, and unpublished work. Each submission will be reviewed by three program committee members.
Accepted papers will be given up to 10 pages (for full papers) 5 pages (for short papers and posters) in the workshop proceedings, and will be presented as oral paper or poster. Papers should be formatted according to the LREC-COLING style sheet, which is provided on the LREC-COLING 2024 website ( https://lrec-coling-2024.org/authors-kit/). Papers should be submitted in PDF format to the LREC-COLING website ( https://softconf.com/lrec-coling2024/wildre-7/) We are seeking submissions under the following category - Full papers (10 pages) - Short papers (work in progress: 5 pages) - Posters (innovative ideas/proposals, research proposal of students) - Demo (of working online/standalone systems) WILDRE-7 will have a special focus on Demos of Indian Language Technology. In the past few years, as more resources have been developed and made available, there has been an increased activity in developing usable technology using these. WILDRE-7 would like to encourage and widen the Demo track to allow the community to showcase their demos and have mutually beneficial interactions with each other as well as resource developers. WILDRE-7 is seeking full, short papers, posters and demos on the following topics related to Indian Language Resources: - Digital Humanities, heritage computing - Corpora - text, speech, multimodal, methodologies, annotation and tools - Lexicons and Machine-readable dictionaries - Ontologies, Grammars - Language resources for NLP/ IR/Speech tasks, tools and Infrastructure for language resources - Standards or specifications for language resources application - Licensing and copyright issues - Data mining - Text summarization Both submission and review processes will be handled electronically. The review process will be double-blind. The workshop website will provide the submission guidelines and the link for the electronic submission. 
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.), to enable their reuse, and replicability of experiments, including evaluation ones, etc. For further information on this initiative, please refer to https://lrec-coling-2024.org/ Shared Task Following the success of the five WILDRE workshops, WILDRE-7 will include *Code-mixed Less-Resourced Sentiment Analysis (Code-mixed) *and *Discourse Machine Translation (DiscoMT)* Shared Tasks. The organizers of shared tasks will provide datasets and evaluation platforms to evaluate systems developed by the participants. For further information on this initiative, please refer to http://sanskrit.jnu.ac.in/conf/wildre7 Workshop *Organisers* - Girish Nath Jha, Jawaharlal Nehru University, India - Kalika Bali, Microsoft Research India Lab, Bangalore, India - Sobha L, AU-KBC, Anna University, Chennai, India - Atul Kr. Ojha, University of Galway, Ireland & Panlingua Language Processing LLP, India Workshop contact: Atul Kr. Ojha, University of Galway, Ireland & Panlingua Language Processing LLP, India, shashwatup9k(a)gmail.com Identify, Describe and Share your LRs Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. 
This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data. As scientific work requires accurate citations of referenced work to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC-COLING 2024 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC-COLING papers will be offered at submission time.

2nd CfP: SIGUL2024 - The 3rd Annual Meeting of the ELRA-ISCA Special Interest Group on Under-resourced Languages
by Claudia Soria 26 Feb '24

CFP: The 3rd Annual Meeting of the ELRA-ISCA Special Interest Group on Under-resourced Languages (SIGUL2024)

* Workshop website: https://sigul-2024.ilc.cnr.it
* When: Monday and Tuesday, May 20th-21st, 2024
* Where: Torino, Italy (co-located with LREC-COLING 2024)
* Deadline for submissions: February 26th, 2024
* Paper submission link: https://softconf.com/lrec-coling2024/sigul2024/
* Deadline for camera-ready papers: April 5th, 2024

The 3rd Annual Meeting of the ELRA <http://www.elra.info/> / ISCA <https://www.isca-speech.org/iscaweb/index.php> Special Interest Group on Under-Resourced Languages <http://www.elra.info/en/sig/sigul/> (SIGUL2024) will provide a forum for the presentation and discussion of cutting-edge research in language processing for under-resourced languages by academic and industry researchers. SIGUL2024 is held over two days to allow for extended discussions and interaction. Far from being just a smaller version of a conference, SIGUL2024 aims to create the conditions for an exchange of knowledge and a comparison of needs and perspectives between research and practice in the field to take place.
We invite contributions (regular long papers of 8 pages or short papers of 4 pages) targeting any of the following - non-exhaustive - list of topics: * Processing any under-resourced languages (covering less-resourced, under-resourced, endangered, minority, and minoritized languages) * Cognitive and linguistic studies of under-resourced languages * Fast resources acquisition: text and speech corpora, parallel texts, dictionaries, grammars, and language models * Zero and few-shot methodologies and self-supervised learning in language and speech technologies * Cross-lingual and multilingual acoustic and lexical modeling * Speech recognition and synthesis for under-resourced languages and dialects * Machine translation and speech-to-speech translation * Spoken dialogue systems * Applications of language technologies for under-resourced languages * Large language models and under-resourced languages * Special topic: o Text and speech resources and technologies for the languages of Italy Special Session on languages of Italy and language technologies Italy is known for its linguistic diversity that reflects its long and varied history. To celebrate it, SIGUL2024 will provide a special session or forum for researchers interested in developing language resources and technologies for the many languages of Italy (regional, minority, or heritage languages, including those of the neighboring countries). Submissions Authors can choose among three paper categories: * Regular long papers – up to eight (8) pages maximum*, presenting substantial, original, completed, and unpublished work. * Short papers – up to four (4) pages*, describing work-in-progress projects in the early stage of development, new resources, negative results, system demonstrations, and early-career/student work. 
* Position papers – up to eight (8) pages*, for reflective considerations of methodological, best practice, and institutional issues (e.g., ethics, data ownership, speakers’ community involvement, de-colonizing approaches). The above page limits exclude any number of additional pages that may be needed for references. The form of the presentation may be oral or poster, whereas in the proceedings there is no difference between the accepted papers. Submission is NOT anonymous and the official LREC-COLING 2024 format must be adopted. Each paper will be reviewed by three independent reviewers. Invited speakers Eddie Avila, GlobalVoices Jean Maillard, FAIR, META Important Dates • 26 February 2024: submission due • 18 March 2024: reviews due • 22 March 2024: notifications to authors • 5 April 2024: camera-ready (PDF) due Diversity & Inclusion Subsidies SIGUL2024 is providing funds for registration and travel or for bandwidth/VPN. We encourage citizens of developing countries and members of marginalised communities to apply for subsidies. Details on the application procedure will be available on the workshop website. For inquiries, please contact claudia.soria[AT]ilc.cnr.it. Workshop Organizers Maite Melero, Sakriani Sakti, Claudia Soria Program Committee * Mohammad A. M. Abushariah (The University of Jordan, Jordan) * Manex Aguirrezabal (University of Copenhagen – Center for Sprogteknologi | Center for Language Technology, Denmark) * Shyam S. 
Agrawal (KIIT, Gurugram ,India) * Begoña Altuna (HiTZ Center - Ixa, Euskal Herriko Unibertsitatea | University of the Basque Country, Spain) * Antti Arppe (University of Alberta, Canada) * Martin Benjamin (Kamusi Project International) * Delphine Bernhard (Université de Strasbourg, LiLPa, France) * Steven Bird (Charles Darwin University, Australia) * Claudia Borg (University of Malta) * Matt Coler (University of Groningen, Campus Fryslân, The Netherlands) * Dan Cristea (Romanian Academy, Romania) * Pradip Kumar Das (IIT Guwahati, India) * A. Seza Doğruöz (Universiteit Gent, België | Ghent University, Belgium) * Stefano Ghazzali (Language Technologies Unit Bangor University Prifysgol Bangor | Bangor University, Bangor, Gwynedd) * Itziar Gonzalez-Dios (HiTZ Basque Center for Language Technologies - Ixa, University of the Basque Country UPV/EHU) * Lars Hellan (Norwegian University of Science and Technology, Norway) * Mélanie Jouitteau (IKER, CNRS, France) * Ritesh Kumar (UnReaL-TecE LLP, India) * Richard Littauer * Teresa Lynn (Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates) * Nina Markl (University of Essex, UK) * Maite Melero (Barcelona Supercomputing Center, Espanya | Spain) * Peter Mihajlik (Budapest University of Technology and Economics, Hungary) * Win Pa Pa (UCS Yangon, Myanmar) * Sandy Ritchie (Google Research) * Sakriani Sakti (JAIST, Japan) * Nay San (Stanford University, USA) * Claudia Soria (CNR-ILC, Italia | Italy) * Daan Van Esch (Google Research) * Menno van Zaanen (South African Centre for Digital Language Resources, South Africa) * Jenifer Vega Rodriguez (GIPSA-lab, Université Grenoble Alpes, France) * Marcely Zanon Boito (NAVER Labs Europe, France) Identify, Describe and Share your LRs! When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) 
that have been used for the work described in the paper or are a new result of your research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones).

Contact: claudia.soria[AT]ilc.cnr.it
Please write “SIGUL2024” in the subject of your e-mail.

--
facebook <https://www.facebook.com/CNRsocialFB> twitter <https://twitter.com/CNRsocial_> instagram <https://www.instagram.com/cnrsocial/> linkedin <https://www.linkedin.com/company/283032>
Claudia Soria
CNR, ISTITUTO DI LINGUISTICA COMPUTAZIONALE "ANTONIO ZAMPOLLI"
claudia.soria(a)ilc.cnr.it
Tel. 0503153166
Via Giuseppe Moruzzi, 1, 56124 – Pisa
www.ilc.cnr.it
www.cnr.it <http://www.cnr.it/>
Donate your 5×1000 to CNR: tax code (CF) 80054330586

Extended Deadline [March 04, 2024]: 2nd Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia (EURALI)
by Atul K. Ojha 23 Feb '24

Apologies for cross-posting *2nd Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia (EURALI) @ LREC-COLING 2024* Date: 20-25 May, 2024 Paper submissions due March 04th, 2024 Venue: Lingotto Conference Centre - Torino (Italia) Main website: https://sites.google.com/view/eurali/ LREC-COLING 2024 website: https://lrec-coling-2024.org/ Submission website: https://softconf.com/lrec-coling2024/eurali2024/ —————————————————————————————————— *Workshop overview and objectives* This workshop will focus on the development of language technology resources and tools for indigenous, endangered and lesser-resourced languages on the Eurasian continent. In a media-centric world where language technology allows people to break cultural and language barriers, it is important that speakers of endangered and indigenous languages can be empowered to use this technology to continue to share their knowledge and culture with the world. With the hope of bridging this gap, the goal of this workshop is to increase visibility and promote research for lesser-resourced and under-represented languages in Europe and Asia. Through collaboration between NLP researchers, language experts and linguists working for the benefit of endangered languages in these communities, we aim to create language technology resources that will help to preserve and revive these languages for future generations. Furthermore, the workshop aims to promote the emergence of new methods that benefit linguists (e.g. automating analysis and validation processes), field linguists (facilitating data collection and analysis processes), and computational linguists (developing new techniques necessary for linguistic analysis, development of supervised or weakly supervised methods for the analysis of poorly written or undocumented languages). 
The main objective of the workshop is to create basic resources and develop tools for Eurasiatic languages, including but not limited to the following topics:

- identifying languages and variants spoken in these regions
- creation of language resources and applications, e.g. sentiment analysis, named entity recognition, and syntactic parsing
- standardization for endangered languages
- automatic identification and classification of lexical variation and language varieties
- adaptation of fundamental NLP tools for these languages, e.g., morphological analysis, taggers and parsers
- reusability of language resources in NLP applications, e.g. machine translation and POS tagging
- machine translation between closely related languages
- evaluation of language resources and tools when applied to lesser-resourced languages in the same language families
- corpora, resources, and tools for closely related languages
- linguistic and textual similarities among languages in Eurasia
- digitalization of endangered languages
- challenges in the creation of language resources and tools from linguistic perspectives (including the perspective of any formal theory)

*Submissions*

We are seeking submissions in the following categories:

- Full papers: 8 pages + unlimited references
- Short papers (work in progress): 4 pages + unlimited references
- Posters (innovative ideas/proposals, student research ideas): 4 pages + unlimited references
- Demo (of working online/standalone systems): 2 pages

Papers must describe original, completed or in-progress, and unpublished work. Accepted full papers, short papers and posters will be included in the workshop proceedings and will be presented as an oral presentation or poster. Papers should be formatted according to the LREC-COLING style sheet (https://lrec-coling-2024.org/authors-kit/), which is provided on the LREC-COLING 2024 website (https://lrec-coling-2024.org/).
Please submit papers in PDF format to the START account ( https://softconf.com/lrec-coling2024/eurali2024/). For further information on this initiative, please refer to https://sites.google.com/view/eurali/. *Important Dates* March 04, 2024: Paper submissions due [extended deadline] March 22, 2024: Paper notification of acceptance May 25, 2024: Workshop *Workshop Chairs* Atul Kr. Ojha, University of Galway, Galway (Ireland) Sina Ahmadi, George Mason University, Fairfax VA (USA) Chao-Hong Liu, Potamu Research Ltd, Dublin (Ireland) John P. McCrae, University of Galway, Galway (Ireland) Theodorus Fransen, Università Cattolica del Sacro Cuore, Milan (Italy) Silvie Cinková, Charles University, Prague (Czech Republic) *Programme Committee (to be updated)* Abigail Walsh, Dublin City University, Dublin (Ireland) Aiala Rosá, Universidad de la República - Uruguay, Montevideo (Uruguay) Aryaman Arora, Stanford University, Stanford, California (USA) A. Seza Doğruöz, Ghent University, Ghent (Belgium) Alina Karakanta, University of Leiden, Leiden (Netherlands) Alina Wróblewska, Institute of Computer Science, Jana Kazimierza, Warszawa (Poland) Akanksha Bansal, Panlingua, Delhi (India) Atul Kr. Ojha, University of Galway, Galway (Ireland) & Panlingua, (India) Bharathi Raja Chakravarthi, University of Galway, Galway (Ireland) Bogdan Babych, Heidelberg University, Heidelberg (Germany) Çağrı Çöltekin, University of Tübingen, Tübingen (Germany) Chao-Hong Liu, Potamu Research Ltd, Dublin (Ireland) Chihiro Taguchi, the University of Notre Dame, Notre Dame (USA) Daan van Esch, Google, Amsterdam (Netherlands) Daniel Zeman, Charles University, Prague (Czech Republic) Deepak Alok, IIT-Delhi, Delhi (India) Dorothee Beermann, Norwegian University of Science and Technology, Trøndelag (Norway) Esha Banerjee, J.P. 
Morgan, Bengaluru (India) Ekaterina Vylomova, University of Melbourne, Melbourne (Australia) George Rehm, GmbH, Berlin (Germany) Hiwa Asadpour, Goethe University, Frankfurt (Germany) Jamal Abdul Nasir, University of Galway, Galway (Ireland) Joakim Nivre, Uppsala University (Sweden) John P. McCrae, University of Galway (Ireland) John E. Ortega, New York University (USA) Jonathan Washington, Swarthmore College, Swarthmore (USA) Joseph Mariani, LIMSI-CNRS, Paris (France) Kaja Dobrovoljc, University of Ljubljana, Ljubljana (Slovenia) Khalid Choukri, ELDA/ELRA, Paris (France) Luke D. Gessler, University of Colorado at Boulder (USA) Maitrey Mehta, University of Utah, Utah (USA) Marie-Catherine de Marneffe, UCLouvain, Collège Léon Dupriez (Belgium) Mayank Jobanputra, University of Tübingen, Tübingen (Germany) Olesea Caftanatov, Vladimir Andrunachievici Institute of Mathematics and Computer Science, Chişinău (Moldova) Ranka Stanković, University of Belgrade, Belgrade (Serbia) Rico Sennrich, University of Zurich, Zurich (Switzerland) Ritesh Kumar, Agra University, Agra (India) Rute Costa, Universidade NOVA de Lisboa, Lisbon (Portugal) Saliha Muradoglu, Australian National University, Canberra (Australia) Sarah Moeller, University of Florida, Gainesville, FL (USA) Silvie Cinková, Charles University, Prague (Czech Republic) Sina Ahmadi, George Mason University (USA) Stella Markantonatou, Athena RC, Athens (Greece) Sourabrata Mukherjee, Charles University, Prague (Czech Republic) Theodorus Fransen, Università Cattolica del Sacro Cuore, Milan (Italy) Valentin Malykh, MTS AI / ITMO University Verginica Barbu Mititelu, Research Institute for Artificial Intelligence, Bucharest (Romania) Victoria Bobicev, University of Moldova, Chișinău (Moldova) Voula Giouli, Institute for Language and Speech Processing, Athens (Greece)

Final CfP: 5th Workshop on Resources for African Indigenous Languages (RAIL) @ LREC-COLING
by Menno Van Zaanen 22 Feb '24

EXTENDED DEADLINE (28 February 2024)

The fifth workshop on Resources for African Indigenous Languages (RAIL)
Co-located with LREC-COLING 2024
https://bit.ly/rail2024

New: extended deadline

Conference dates: 20-25 May 2024
Workshop date: 25 May 2024
Venue: Lingotto Conference Centre, Torino (Italy)

The fifth RAIL workshop website: https://bit.ly/rail2024
LREC-COLING 2024 website: https://lrec-coling-2024.org/
Submission website: https://softconf.com/lrec-coling2024/rail2024/

The fifth Resources for African Indigenous Languages (RAIL) workshop will be co-located with LREC-COLING 2024 in Lingotto Conference Centre, Torino, Italy on 25 May 2024. The RAIL workshop is an interdisciplinary platform for researchers working on resources (data collections, tools, etc.) specifically targeted towards African indigenous languages. In particular, it aims to create the conditions for the emergence of a scientific community of practice that focuses on data, as well as computational linguistic tools specifically designed for or applied to indigenous languages found in Africa.

Many African languages are under-resourced while only a few of them are somewhat better resourced. These languages often share interesting properties such as writing systems, or tone, making them different from most high-resourced languages. From a computational perspective, these languages lack enough corpora to undertake high-level development of Human Language Technologies (HLT) and Natural Language Processing (NLP) tools, which in turn impedes the development of African languages in these areas. During previous workshops, it has become clear that the problems and solutions presented are not only applicable to African languages but are also relevant to many other low-resource languages. Because these languages share similar challenges, this workshop provides researchers with opportunities to work collaboratively on issues of language resource development and learn from each other.
The RAIL workshop has several aims. First, the workshop brings together researchers who work on African indigenous languages, forming a community of practice for people working on indigenous languages. Second, the workshop aims to reveal currently unknown or unpublished existing resources (corpora, NLP tools, and applications), resulting in a better overview of the current state-of-the-art, and also allows for discussions on novel, desired resources for future research in this area. Third, it enhances sharing of knowledge on the development of low-resource languages. Finally, it enables discussions on how to improve the quality as well as availability of the resources. The workshop has “Creating resources for less-resourced languages” as its theme, but submissions on any topic related to properties of African indigenous languages (including non-African languages) may be accepted. Suggested topics include (but are not limited to) the following: Digital representations of linguistic structures Descriptions of corpora or other data sets of African indigenous languages Building resources for (under resourced) African indigenous languages Developing and using African indigenous languages in the digital age Effectiveness of digital technologies for the development of African indigenous languages Revealing unknown or unpublished existing resources for African indigenous languages Developing desired resources for African indigenous languages Improving quality, availability and accessibility of African indigenous language resources Submission requirements: We invite papers on original, unpublished work related to the topics of the workshop. Submissions, presenting completed work, may consist of up to eight (8) pages of content for a long submission and up to four (4) pages of content for a short submission plus additional pages of references. 
The final camera-ready version of accepted long papers are allowed one additional page of content (up to 9 pages) so that reviewers’ feedback can be incorporated. Papers should be formatted according to the LREC-COLING style sheet (https://lrec-coling-2024.org/authors-kit/), which is provided on the LREC-COLING 2024 website (https://lrec-coling-2024.org/). Reviewing is double-blind, so make sure to anonymise your submission (e.g., do not provide author names, affiliations, project names, etc.) Limit the amount of self citations (anonymised citations should not be used). The RAIL workshop follows the LREC-COLING submission requirements. Please submit papers in PDF format to the START account (https://softconf.com/lrec-coling2024/rail2024/). Accepted papers will be published in proceedings linked to the LREC-COLING conference. Important dates: Submission deadline: 28 February 2024 (AoE) Date of notification: 15 March 2024 Camera ready deadline: 29 March 2024 RAIL workshop: 25 May 2024 Organising Committee Rooweither Mabuya, South African Centre for Digital Language Resources (SADiLaR), South Africa Muzi Matfunjwa, South African Centre for Digital Language Resources (SADiLaR), South Africa Mmasibidi Setaka, South African Centre for Digital Language Resources (SADiLaR), South Africa Menno van Zaanen, South African Centre for Digital Language Resources (SADiLaR), South Africa -- Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za Professor in Digital Humanities South African Centre for Digital Language Resources https://www.sadilar.org ________________________________ NWU PRIVACY STATEMENT: http://www.nwu.ac.za/it/gov-man/disclaimer.html DISCLAIMER: This e-mail message and attachments thereto are intended solely for the recipient(s) and may contain confidential and privileged information. Any unauthorised review, use, disclosure, or distribution is prohibited. 
If you have received the e-mail by mistake, please contact the sender or reply e-mail and delete the e-mail and its attachments (where appropriate) from your system. ________________________________


Final CfP 5th workshop on Resources for African Indigenous Language (RAIL) @ LREC-COLING
by Menno Van Zaanen 13 Feb '24

13 Feb '24

The fifth workshop on Resources for African Indigenous Languages (RAIL)
Co-located with LREC-COLING 2024
https://bit.ly/rail2024

Conference dates: 20-25 May 2024
Workshop date: 25 May 2024
Venue: Lingotto Conference Centre, Torino (Italy)

The fifth RAIL workshop website: https://bit.ly/rail2024
LREC-COLING 2024 website: https://lrec-coling-2024.org/
Submission website: https://softconf.com/lrec-coling2024/rail2024/

The fifth Resources for African Indigenous Languages (RAIL) workshop will be co-located with LREC-COLING 2024 at the Lingotto Conference Centre, Torino, Italy on 25 May 2024. The RAIL workshop is an interdisciplinary platform for researchers working on resources (data collections, tools, etc.) specifically targeted towards African indigenous languages. In particular, it aims to create the conditions for the emergence of a scientific community of practice that focuses on data, as well as computational linguistic tools specifically designed for or applied to indigenous languages found in Africa.

Many African languages are under-resourced, while only a few of them are somewhat better resourced. These languages often share interesting properties, such as writing systems or tone, that make them different from most high-resourced languages. From a computational perspective, these languages lack enough corpora to undertake high-level development of Human Language Technologies (HLT) and Natural Language Processing (NLP) tools, which in turn impedes the development of African languages in these areas. During previous workshops, it has become clear that the problems and solutions presented are not only applicable to African languages but are also relevant to many other low-resource languages. Because these languages share similar challenges, this workshop provides researchers with opportunities to work collaboratively on issues of language resource development and learn from each other.

The RAIL workshop has several aims. First, the workshop brings together researchers who work on African indigenous languages, forming a community of practice for people working on indigenous languages. Second, the workshop aims to reveal currently unknown or unpublished existing resources (corpora, NLP tools, and applications), resulting in a better overview of the current state of the art, and also allows for discussions on novel, desired resources for future research in this area. Third, it enhances sharing of knowledge on the development of low-resource languages. Finally, it enables discussions on how to improve the quality as well as availability of the resources.

The workshop has “Creating resources for less-resourced languages” as its theme, but submissions on any topic related to properties of African indigenous languages (including non-African languages) may be accepted. Suggested topics include (but are not limited to) the following:

* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African indigenous languages
* Revealing unknown or unpublished existing resources for African indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African indigenous language resources

Submission requirements:

We invite papers on original, unpublished work related to the topics of the workshop. Submissions, presenting completed work, may consist of up to eight (8) pages of content for a long submission and up to four (4) pages of content for a short submission, plus additional pages of references. The final camera-ready versions of accepted long papers are allowed one additional page of content (up to 9 pages) so that reviewers’ feedback can be incorporated. Papers should be formatted according to the LREC-COLING style sheet (https://lrec-coling-2024.org/authors-kit/), which is provided on the LREC-COLING 2024 website (https://lrec-coling-2024.org/). Reviewing is double-blind, so make sure to anonymise your submission (e.g., do not provide author names, affiliations, project names, etc.). Limit the number of self-citations (anonymised citations should not be used). The RAIL workshop follows the LREC-COLING submission requirements.

Please submit papers in PDF format to the START account (https://softconf.com/lrec-coling2024/rail2024/). Accepted papers will be published in proceedings linked to the LREC-COLING conference.

Important dates:

Submission deadline: 23 February 2024
Date of notification: 15 March 2024
Camera ready deadline: 29 March 2024
RAIL workshop: 25 May 2024

Organising Committee

Rooweither Mabuya, South African Centre for Digital Language Resources (SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources (SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources (SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources (SADiLaR), South Africa

--
Prof Menno van Zaanen
menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org


