Dear SIGUL Members,

we are pleased to share with you the seventh SIGUL Newsletter.

The SIGUL Newsletter is a bi-weekly report on issues related to the topics of language resources and tools for less-resourced languages.

Your feedback will be welcome.

Claudia Soria

SIGUL Co-chair

SIGUL is the ELRA-ISCA Special Interest Group on Under-Resourced Languages

*Calls for papers*

ICALP2019 (ex CITALA) submission deadline is approaching (May 31, 2019).

The International Conference on Arabic Language Processing, ICALP2019 (ex CITALA), aims to be one of the leading conferences on Arabic NLP (https://icalp2019.loria.fr). ICALP2019 is the 7th edition (information about the previous editions is available at: ICALP2017). Submissions from all branches of NLP related to Arabic or Semitic languages are welcome. Arabic speech and text processing form the core of ICALP2019, and the conference will cover all approaches, from theoretical models to industrial applications.

ICALP2019 will be held at Loria (University of Lorraine), in Nancy, in eastern France near the borders with Luxembourg and Germany. We look forward to welcoming you to a scientifically inspiring event, and we hope that ICALP2019 will be a highly positive scientific experience for all participants. Two keynote speakers, Dr Mona Diab (Department of Computer Science, George Washington University) and Dr Albert Gatt (Institute of Linguistics and Language Technology, University of Malta), will share their expertise, with the aim of exposing participants to a wide spectrum of research and applications and fostering cross-pollination of research ideas and interests in Arabic Language Processing and related applications.

As with the previous edition (ICALP – CITALA 2017), we aim to have accepted papers published by Springer. All submitted papers will be reviewed by three experts in the domain. Only the best 30% of papers will be selected for publication by Springer. Participants are invited to submit original contributions in all areas of Arabic NLP. Some of the conference topics are listed below.

Thanks to sponsorship from Google, two scholarships (covering the flight ticket, accommodation and registration fees) are offered to PhD or Master's students to attend the conference. A short CV and a motivation letter are required to apply for a scholarship (please contact icalp2019@inria.fr).

List of Topics

Arabic dialect processing
Automatic speech recognition
Building ontologies
Code Switching phenomena
Comparative Linguistic Studies
Cross-Language Applications
Deep Learning for Arabic applications
Digital Humanities
End-to-End DNN
Generation and Analysis
Human-Machine Dialogue
Information retrieval on Social Networks
Linguistic Resources for Arabic NLP Applications
Machine Translation
Multi-Word Term Extraction
Named Entity Recognition
Opinion Mining and Sentiment Analysis
Optical Character Recognition
Paraphrasing and Textual Entailment
Question/Answering Systems
Semitic languages
Spell-check and automatic corrections
Study of Linguistic Phenomena
Text Categorization and Clustering
Text Summarization
Word Sense Disambiguation

Committees

Program Committee

Ahmed Ali (Qatar Computing Research Institute (QCRI))
Mohamed Afify (Microsoft, Cairo, Egypt)
Hassina Aliane (Centre de Recherche sur l’Information Scientifique et Technique, Algeria)
Frédéric Béchet (Aix Marseille Université - LIF, France)
Almoataz Bellahal-Said (Cairo University, Egypt)
Laurent Besacier (University of Grenoble, France)
Karim Bouzoubaa (EMI, UM5, Rabat, Morocco)
Violetta Cavalli-Sforza (Al Akhawayn University, Morocco)
Khalid Choukri (ELDA, European Language Resource Association, France)
Kareem Darwish (Qatar Computing Research Institute)
Joseph Di Martino (Loria - University of Lorraine, France)
Mona Diab (George Washington University, USA)
Abdelhamid El Jihad (IERA, UMS, Rabat, Morocco)
Mahmoud El-Haj (School of Computing and Communications Lancaster University, UK)
Yannick Estève (LIUM University of Maine, France)
Albert Gatt (Institute of Linguistics and Language Technology, University of Malta)
Nada Ghneim (Higher Institute for Applied Science and Technology, Damascus)
Ahmed Guessoum (University of Science and Technology Houari Boumediene, Algeria)
Hatem Haddad (Department of Computer & Decision Engineering, Université Libre de Bruxelles, Belgium)
Kais Haddar (Faculté des sciences de Sfax, Tunisia)
Abdelfetah Hamdani (IERA, UMS, Rabat, Morocco)
Yannis Haralambous (Institut Mines-Télécom & UMR CNRS 6285 Lab-STICC, Brest, France)
Salima Harrat (Ecole Normale Supérieure Bouzaréah, Algiers, Algeria)
Mark Hasegawa-Johnson (University of Illinois, USA)
Mustafa Jarrar (Birzeit University- Sina Institute, Palestine)
Denis Jouvet (Loria University of Lorraine, France)
Abdelmonaime Lachkar (ENSA, USMBA, Fez, Morocco)
David Langlois (Loria University of Lorraine, France)
Abdelhak Lakhouaja (UMP, Oujda, Morocco)
Azzedine Lazrek (UCA, Marrakech, Morocco)
Yves Lepage (University of Waseda, Japan)
Azzedine Mazroui (UMP, Oujda, Morocco)
Karima Meftouh (University of Annaba, Algeria)
Farid Meziane (University of Salford, Manchester, UK)
Vito Pirrelli (Institute for Computational Linguistics CNR, Pisa Italy)
Violaine Prince (LIRMM, Montpellier, France)
Allan Ramsay (School of Computer Science, University of Manchester, UK)
Horacio Rodríguez (Universitat Politècnica de Catalunya, Barcelona - Spain)
Paolo Rosso (Universidad Politécnica de Valencia, Spain)
Motaz Saad (Islamic University of Gaza, Palestine)
Fatiha Sadat (Université du Québec à Montréal, Canada)
Khaled Shaalan (The British University, United Arab Emirates)
Olivier Siohan (Google, USA)
Kamel Smaili (Loria University of Lorraine, France)
Adnan Yahya (Birzeit University, Palestine)
Abdella Yousfi (Mohammed V University, Morocco)
Wajdi Zaghouani (Carnegie Mellon University Qatar)
Imed Zitouni (Microsoft Research, USA)

Organizing committee

Karima Abidi (Student management)
Olivia Brenner (Communication)
Joseph Di Martino (Special Event)
David Langlois (Publication and Logistics management)
Mohamed-Amine Menacer (Website designer)

Invited Speakers

Dr Mona Diab (Department of Computer Science, George Washington University (GW))
Dr Albert Gatt (Institute of Linguistics and Language Technology, University of Malta)

Publication

The organizers aim to have the ICALP2019 (CITALA) proceedings published by Springer.

Important Dates

Submission portal opens April 30, 2019
Final paper submission deadline May 31, 2019
Acceptance/rejection notification July 15, 2019
Camera ready July 31, 2019
Registration open August 25, 2019
Conference October 16-17, 2019

Venue

The conference will be held in Nancy, in eastern France, near the borders with Germany, Luxembourg and Belgium. The venue is Loria (Laboratoire lorrain de recherche en informatique et ses applications).

Contact

smaili@loria.fr

--------------------------

----------------------------------------------------

The 2nd Workshop on Technologies for

MT of Low Resource Languages (LoResMT 2019)

The Helix, DCU, Dublin, August 20, 2019

https://sites.google.com/view/loresmt/

@ MT Summit XVII

----------------------------------------------------

BRIEF

1. Call for Papers:

https://easychair.org/cfp/LoResMT2019

Submission due on "May 24, 2019" (Abstract on "May 17"):

https://easychair.org/conferences/?conf=loresmt2019

2. Shared Tasks on MT for Bhojpuri, Magahi, Sindhi, and Latvian (<> English)

*Registration open. Training, Development sets available upon registration!*

Participants should register by sending an email to

loresmt@googlegroups.com

with the team name and member information (emails and affiliations).

Timeline:

May 03, 2019: Release of training data

June 04, 2019: Release of test data

June 11, 2019: Submission of the systems

June 16, 2019: Notification of results

June 25, 2019: Submission of shared task papers

3. Prof Xiaobing Zhao et al. will give an invited talk on

"Building Cross-Lingual Knowledge Base for Low Resource Languages in China"

4. Please find below the proceedings and slides of LoResMT 2018, which present work on several low resource languages, e.g. Filipino, Finnish, Irish, Latvian, Mongolian, Quechua, Tibetan, and Uyghur.

https://amtaweb.org/wp-content/uploads/2018/03/AMTA_2018_Workshop_Proceedings_LoResMT.pdf

https://sites.google.com/view/loresmt-2018/

SCOPE

Machine translation (MT) technologies have improved significantly in the last two decades, with developments in phrase-based statistical MT (SMT) and, more recently, neural MT (NMT). However, most of these methods rely on the availability of large parallel data (millions to tens of millions of sentence pairs) for training, and such resources do not exist for many language pairs.

In addition, MT methods still rely on a few natural language processing (NLP) tools to pre-process human-generated texts into the forms required as input for these methods, and/or to post-process the output into proper textual forms in the target languages. In many MT systems, the performance of these tools has a great impact on the quality of the resulting translation. These NLP tools include, but are not limited to, several kinds of word tokenizers/de-tokenizers, word segmenters, morphology analyzers, etc.

The workshop solicits papers on MT systems/methods for low resource languages in general. We also solicit papers dedicated to these supplementary NLP tools that are used in any language and especially in low resource languages. We would like to have an overview of research on MT for low resource languages and these NLP tools from our community.

TOPICS

We solicit original research papers, review papers, and position papers on MT research for low resource languages in the workshop. Multilingual and/or cross-lingual NLP tools for low resource languages are especially welcome. Topics of the workshop include but are not limited to:

- Research and review papers on pre-processing and/or post-processing NLP tools for MT

- Position papers on the development of pre-processing and/or post-processing tools for MT

- Word tokenizers/de-tokenizers for specific languages

- Word/morpheme segmenters for specific languages

- Alignment/Re-ordering tools for specific language pairs

- Use of morphology analyzers and/or morpheme segmenters in MT

- Multilingual/cross-lingual NLP tools for MT

- Re-usability of existing NLP tools for low resource languages

- Corpora creation and curation technologies for low resource languages

- Review of available parallel corpora for low resource languages

- Research and review papers on MT methods for low resource languages

- MT systems/methods (e.g. rule-based, SMT, NMT) for low resource languages

- Pivot MT for low resource languages

- Zero-shot MT for low resource languages

- Fast building of MT systems for low resource languages

- Re-usability of existing MT systems for low resource languages

- Machine translation for language preservation

SUBMISSION INFORMATION

Workshop papers should adhere to the MT Summit 2019 style guide (LaTeX, OpenOffice, Word).

https://www.mtsummit2019.com/submissions

Submission deadline: "May 24, 2019" (Abstract on "May 17").

https://easychair.org/conferences/?conf=loresmt2019

There are two types of submissions in the workshop. Research, review and position papers should be at least four (4) and at most eight (8) pages long, plus unlimited pages for references. Additional pages may be allowed if justified. These papers will undergo double-blind review. For non-archival system demonstration abstracts, the limit is four (4) pages, and review will be single-blind.

We would like to encourage authors to cite papers written in ANY language that are related to the topics, as long as both original bibliographic items and their corresponding English translations are provided.

IMPORTANT DATES

March 19, 2019: Call for papers

April 19, 2019: 2nd Call for papers

May 24, 2019: Submission deadline of workshop papers

June 21, 2019: Notification of acceptance

July 12, 2019: Camera-ready papers due

July 19, 2019: Workshop proceedings online

August 20, 2019: LoResMT workshop

ORGANIZERS (listed alphabetically)

Alina Karakanta FBK-Fondazione Bruno Kessler

Atul Kr. Ojha Panlingua Language Processing LLP/Jawaharlal Nehru University

Chao-Hong Liu ADAPT Centre, Dublin City University

Jonathan Washington Swarthmore College

Nathaniel Oco National University (Philippines)

Surafel Melaku Lakew FBK-Fondazione Bruno Kessler

Valentin Malykh Moscow Institute of Physics and Technology

Varvara Logacheva Moscow Institute of Physics and Technology

Xiaobing Zhao Minzu University of China

CONTACT

chaohong.liu@adaptcentre.ie

LoResMT @ MT Summit 2019

https://sites.google.com/view/loresmt/

LoResMT @ AMTA 2018

https://sites.google.com/view/loresmt-2018/

-- 
Claudia Soria
Researcher
Istituto di Linguistica Computazionale "A. Zampolli"
Consiglio Nazionale delle Ricerche
Via Moruzzi 1
56124 Pisa
Italy

Tel. +39 050 3153166
Skype clausor