-------- Forwarded Message --------
Subject: Call for Papers: IEEE ACM/TASLP Special Issue on Speech &
Language Technologies for Low-Resource Languages
Date: Wed, 29 Mar 2023 04:01:36 -0400
From: IEEE Signal Processing Society
<marketing(a)signalprocessingsociety.org>
Reply-To: marketing(a)signalprocessingsociety.org
To: mariani(a)limsi.fr
CALL FOR PAPERS
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
TASLP Special Issue on Speech & Language
Technologies for Low-Resource Languages
*Submit Your Manuscript*
<https://czqvL04.na1.hubspotlinks.com/Ctc/T8+113/czqvL04/VVqS0B3s16ytW55j9JS…>
Speech and language processing is a multi-disciplinary research area
that focuses on various aspects of natural language processing and
computational linguistics. Speech and language technologies deal with
the study of methods and tools to develop innovative paradigms for
processing human languages (speech and writing) that can be recognized
by machines. This progress has been made possible by advances in machine
learning and artificial intelligence techniques that effectively interpret
speech and textual sources.
In general, speech technologies include a series of artificial
intelligence algorithms that enable a computer system to produce,
analyze, modify, and respond to human speech and texts. They establish a
more natural interaction between humans and computers as well as the
translation between all human languages with effective analysis of text
and speech. These techniques have significant applications in
computational linguistics, natural language processing, computer
science, mathematics, speech processing, machine learning, and
acoustics. Another important application of this technology is the
machine translation of text and voice.
A large gap remains in speech and language processing for
low-resource languages, as they have fewer computational resources. With
the ability to access vast computational sources from various digital
sources, we can resolve numerous language processing problems in real
time with enhanced user experience and productivity measures. Speech and
language processing technologies for low-resource languages are still in
their infancy. Research in this stream will enhance the likelihood of
these languages becoming an active part of our life, as their importance
is paramount.
Furthermore, the societal shift towards digital media, along with
spectacular advances in processing power, computational storage, and
software capabilities, comes with a vision of transferring low-resource
language resources into efficient computing models.
This special issue aims to explore language and speech processing
technologies and novel computational models for processing speech, text,
and language. The novel and innovative solutions focus on content
production, knowledge management, and natural communication of
low-resource languages. We welcome researchers and practitioners working
in speech and language processing to present their novel and innovative
research contributions for this special section.
Topics of Interest
* Artificial intelligence-assisted speech & language technologies for
low-resource languages
* Pragmatics for low-resource languages
* Emerging trends in knowledge representation for low-resource languages
* Machine translation for low-resource language processing
* Automatic speech recognition & speech technology for low-resource
languages
* Sentiment & statistical analysis for low-resource languages
* Multimodal analysis for low-resource languages
* Argument mining for low-resource language processing
* Text summarization & speech synthesis
* Sentence-level semantics for speech recognition
* Information retrieval & extraction of low-resource languages
Submission Guidelines
Manuscripts should be submitted through the Manuscript Central system
<https://czqvL04.na1.hubspotlinks.com/Ctc/T8+113/czqvL04/VVqS0B3s16ytW55j9JS…>.
*Important Dates*
* *Submission deadline: 30 May 2023*
* Authors notification: 25 July 2023
* Revised version submission: 29 September 2023
* Final decision notification: 15 December 2023
Guest Editors
* Dr. Chi Lin <mailto:clindut@ieee.org>, Dalian University of
Technology, China
* Dr. Chang Wu Yu <mailto:cwyu@chu.edu.tw>, Chung Hua University, Taiwan
* Dr. Ning Wang <mailto:wangn@rowan.edu>, Rowan University, USA
* Dr. Qiang Lin <mailto:lqchina@dlust.edu.cn>, Dalian University of
Technology, China
IEEE Signal Processing Society, 445 Hoes Lane, Piscataway, NJ 08854, USA
--
Since January 1st 2021, LIMSI has merged with the LRI lab and become the LISN (Laboratoire Interdisciplinaire des Sciences du Numérique, Interdisciplinary Computer Science Laboratory)
-
Joseph MARIANI
Emeritus Research Director at CNRS
LISN
Rue John von Neumann
Université Paris-Saclay
Batiment 508
91405 ORSAY Cedex (France)
Tel: +33 1 69 15 78 56
Email:Joseph.Mariani@limsi.fr
Web:https://perso.limsi.fr/mariani/index
Web IMMI:http://immi.cnrs.fr/
Scope
The purpose of the Italian Information Retrieval Workshop (IIR) is to
provide a meeting forum for stimulating and disseminating research in
Information Retrieval, where Italian researchers (especially young ones)
and researchers affiliated with Italian institutions can network and
discuss their research results in an informal way.
IIR 2023 is the 13th edition of the Italian Information Retrieval Workshop.
It will take place on June 8th - 9th, 2023 and is organized by the National
Research Council of Italy (CNR) and the University of Pisa.
Participation in the IIR 2023 workshop will be free of charge. However,
advance registration will be strictly required.
Topics
IIR 2023 offers the opportunity to present and discuss theoretical and
empirical research. Relevant topics include, but are not restricted to:
- Search and Ranking. Research on core Information Retrieval (IR)
  algorithmic topics, including IR at scale, covering topics such as:
  - Theoretical models and foundations of IR and access
  - Retrieval models and ranking models, including diversity and
    aggregated search
  - Web search, including link analysis, sponsored search, search
    advertising, adversarial search and spam, and vertical search
  - Queries and query analysis
- Recommendation, Content Analysis, and Classification. Research focusing
  on recommender systems (RS), rich content representations and content
  analysis, covering topics such as:
  - Filtering and recommender systems
  - Document representation
  - Content analysis and information extraction, including summarization,
    text representation, readability, sentiment analysis, and opinion mining
  - Cross- and multilingual search
  - Clustering, classification, and topic models
- Artificial Intelligence, NLP, Semantics, and Dialog. Research bridging
  AI and IR, especially toward deep semantics and dialog with
  intelligent agents, covering topics such as:
  - Question Answering
  - Conversational systems and retrieval, including spoken language
    interfaces, dialog management systems, and intelligent chat systems
  - Semantics and knowledge graphs
  - Deep learning for IR, embeddings, Large Language Models, and agents
  - NLP techniques used to enhance search and recommendation
- Domain-Specific Applications. Research focusing on domain-specific
  challenges, covering topics such as:
  - Social search
  - Search in structured data, including email and entity search
  - Multimedia search
  - Search and recommendation for Educational, Legal, Health (including
    genomics and bioinformatics), and Academic domains
  - Other domains such as digital libraries, enterprise, news, app, and
    archival search
- Human Factors and Interfaces. Research into user-centric aspects of IR,
  including user interfaces, behavior modeling, privacy, and interactive
  systems, covering topics such as:
  - Mining and modeling search activity, including user and task models,
    click models, log analysis, behavioral analysis, and attention modeling
  - Interactive and personalized search and recommendation
  - Collaborative search, social tagging, and crowdsourcing
  - Information privacy and security
- Evaluation. Research that focuses on the measurement and evaluation of
  IR systems, covering topics such as:
  - User-centered evaluation methods, including measures of user
    experience and performance, user engagement, and search task design
  - Test collections and evaluation metrics, including the development of
    new test collections
  - Eye-tracking and physiological approaches, such as fMRI
  - Evaluation of novel information access tasks and systems such as
    multi-turn information access
  - Statistical methods and reproducibility issues in information
    retrieval evaluation
  - Efficiency and scalability
- Future Directions. Research with theoretical or empirical contributions
  on new technical or social aspects of IR, especially in more speculative
  directions or with emerging technologies, covering topics such as:
  - Novel approaches to IR
  - Ethics, economics, and politics
  - Applications of search to social good
  - IR and RS with new devices, including wearable computing,
    neuroinformatics, sensors, Internet-of-Things, and vehicles
Submissions
Papers may range from theoretical works to system descriptions. We
particularly encourage PhD students or Early-Stage Researchers to submit
their research. We also welcome contributions from industry and papers
describing ongoing funded projects which may prove useful to the IIR
community.
Authors are invited to submit one of the following types of contributions:
- Full original papers (10 pages, plus additional pages for references if
  needed)
- Short original papers (5 pages, plus additional pages for references if
  needed)
- Extended abstracts containing descriptions of ongoing projects or
  presenting already published results (up to 4 pages, plus additional pages
  for references if needed). If presenting already published results, the
  extended abstract should be single-blind and contain a reference to the
  original published paper.
Submissions of full research papers must be in English, in PDF format in
the CEUR-WS single-column conference format available at
https://drive.google.com/file/d/1_onwNQpVPD0ViZPrhGfardLrsP0sgmIp/view?usp=…
Submissions will be peer-reviewed and accepted papers will appear in the
CEUR workshop series (at the authors’ discretion).
Submission will be through CMT at
https://cmt3.research.microsoft.com/IIR2023/.
Important Dates
- Submission website opens: March 15, 2023
- Submission deadline: April 15, 2023
- Notification of acceptance: May 16, 2023
- Camera-ready deadline: May 30th, 2023
- IIR 2023: June 8th-9th, 2023
Deadlines refer to 23:59 (11:59pm) in the AoE (Anywhere on Earth) time zone.
For further information, visit the website http://iir2023.isti.cnr.it/ or
contact us at guglielmo.faggioli(a)unipd.it
Second International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL2023 @EAMT2023)
First Call For Papers
https://sites.google.com/tilburguniversity.edu/at4ssl2023/
****** Apologies for cross-posting ******
SCOPE
According to the World Federation of the Deaf (WFD) over 70 million people are deaf and communicate primarily via Sign Language (SL). Currently, human interpreters are the main medium for sign-to-spoken, spoken-to-sign and sign-to-sign language translation. The availability and cost of these professionals is often a limiting factor in communication between signers and non-signers. Machine Translation (MT) is a core technique for reducing language barriers for spoken languages. Although MT has come a long way since its inception in the 1950s, it still has a long way to go to successfully cater to all communication needs and users. When it comes to the deaf and hard of hearing communities, MT is in its infancy. The complexity of the task to automatically translate between SLs or sign and spoken languages, requires a multidisciplinary approach (Bragg et al., 2019)<https://dl.acm.org/doi/10.1145/3308561.3353774>.
The rapid technological and methodological advances in deep learning, and in AI in general, that we have seen in the last decade have not only improved MT, the recognition of image, video and audio signals, the understanding of language, the synthesis of life-like 3D avatars, etc., but have also led to the fusion of interdisciplinary research innovations that lays the foundation of automated translation services between sign and spoken languages.
This one-day workshop aims to be a venue for presenting and discussing (complete, ongoing or future) research on automatic translation between sign and spoken languages and bring together researchers, practitioners, interpreters and innovators working in related fields.
Theme of the workshop: Data is one of the key factors for the success of today’s AI, including language and translation models for sign and spoken languages. However, when it comes to SL, MT and Natural Language Processing, we face problems related to small volumes of (parallel) data, large variability in terms of origin of annotations (deaf or hearing interpreters), non-standardized annotations (e.g. glosses differ across corpora), video quality or recording setting, and others. The theme of this edition of the workshop is Sign language parallel data – challenges, solutions and resolutions.
The AT4SSL workshop aims to open a (guided) discussion between participants about current challenges, innovations and future developments related to the automatic translation between sign and spoken languages. To this end, AT4SSL will host a moderated round table around the following three topics: (i) quality of recognition and synthesis models and user expectations; (ii) co-creation -- deaf, hearing and hard-of-hearing people joining forces towards a common goal; and (iii) sign-to-spoken and spoken-to-sign translation technology in media.
TOPICS
This workshop aims to focus on the following topics. However, submissions related to the general topic of automatic translation between signed and spoken languages that deviate from these topics are also welcome:
* Data: resources, collection and curation, challenges, processing, data life cycle
* Use-cases, applications
* Ethics, privacy and policies
* Sign language linguistics
* Machine translation (with a focus on signed-to-signed, signed-to-spoken or spoken-to-signed language translation)
* Natural language processing
* Interpreting of sign and spoken languages
* Image and video recognition (for the purpose of sign language recognition)
* 3D avatar and virtual signers synthesis
* Usability and challenges of current methods and methodologies
* Sign language in the media
SUBMISSION FORMAT
Two types of submissions are going to be accepted for the AT4SSL workshop:
* Research, review, position and application papers
Unpublished papers that present original, completed work. The length of each paper should be at least four (4) and at most eight (8) pages, with unlimited pages for references.
* Extended abstracts
Extended abstracts should present original, ongoing work or innovative ideas. The length of each extended abstract is four (4) pages, with unlimited pages for references.
Both types of submission should be formatted according to the official EAMT 2023 style templates (LaTeX<https://events.tuni.fi/uploads/2022/12/ee35fd56-latex_template.zip>, Overleaf<https://www.overleaf.com/read/mkjbkppndvxw>, MS Word<https://events.tuni.fi/uploads/2022/12/edd598d2-eamt23.docx>, Libre/Open Office<https://events.tuni.fi/uploads/2022/12/ece98f81-eamt23.odt>, PDF<https://events.tuni.fi/uploads/2022/12/6e89772e-eamt23.pdf>).
Accepted papers and extended abstracts will be published in the EAMT 2023 proceedings and will be presented at the conference.
SUBMISSION POLICY
* Submissions must be anonymized.
* Papers and extended abstracts should be submitted using EasyChair<https://easychair.org/conferences/?conf=eamt2023>.
* Work that has been or is planned to be submitted to other venues must be declared as such. Upon acceptance at AT4SSL, it must be withdrawn from the other venues.
* The review will be double-blind.
IMPORTANT DATES:
* First call for papers: 13-March-2023
* Second call for papers: 31-March-2023
* Submission deadline: 14-April-2023
* Review process: between 17-April-2023 and 05-May-2023
* Acceptance notification: 12-May-2023
* Camera ready submission: 01-June-2023
* Submission of material for interpreters: 06-June-2023
* Programme will be finalised by: 01-June-2023
* Workshop date: 15-June-2023
ORGANISATION COMMITTEE:
Dimitar Shterionov (TiU)
Mirella De Sisto (TiU)
Mathias Muller (UZH)
Davy Van Landuyt (EUD)
Rehana Omardeen (EUD)
Shaun O’Boyle (DCU)
Annelies Braffort (Paris-Saclay University)
Floris Roelofsen (UvA)
Frédéric Blain (TiU)
Bram Vanroy (KU Leuven; UGent)
Eleftherios Avramidis (DFKI)
FOR CONTACTS:
Dimitar Shterionov, workshop chair: d.shterionov(a)tilburguniversity.edu
Registration will be handled by the EAMT2023 conference. (To be announced)
Dear colleagues,
This is our final reminder to invite you to participate in our survey entitled "Surveying the Landscape of Ethics Consideration Sections Grounded in Research Data Lifecycle". We would like to remind you that the survey is still available until 15.03.2023, should you wish to take part.
If you would like to participate in our survey, please click on this link: https://survey.iis.fraunhofer.de/index.php/765534?lang=en
We would like to express our sincerest gratitude to each and every one of you who took the time to participate in our survey; your input has been immensely helpful and we are truly grateful for your support.
Sincerely,
Zahra Kolagar and Hadiseh Yadollahi
P.S.
If, for any technical reason, you cannot access the link in this email, please report it to zahra.kolagar(a)iis.fraunhofer.de
Dear Colleague,
We hope our email finds you well.
We are a group of researchers at the Fraunhofer Institute for Integrated Circuits (IIS) in Erlangen, Germany.
Our research is focused on the application of natural language processing in various fields. Our current research is focused on adapting the existing ethical principles to ensure that they are still relevant and adequate to various stages of the research data lifecycle.
To this end, we have created a survey and would like to invite you to participate in our survey entitled: Surveying the Landscape of Ethics Consideration Sections Grounded in Research Data Lifecycle. We believe that your expertise will help identify relevant and important points in regard to ethical considerations that both researchers and reviewers need to bear in mind at various stages of research. You can find more information on this survey by clicking on the link below.
If you would like to participate in our survey, please click on this link: https://survey.iis.fraunhofer.de/index.php/765534?lang=en
This link will stay active until 15.03.2023.
Please also feel free to share the link to this survey with anyone who might be interested in participating.
Sincerely,
Zahra Kolagar and Hadiseh Yadollahi
---------------------------------------------------------------------------------------------
Research Associates at Fraunhofer Institute for Integrated Circuits (IIS)
Am Wolfsmantel 33, 91058 Erlangen, Germany
Check this out!
FYI, the 13th International Workshop on Spoken Dialogue Systems
Technology <https://sites.google.com/view/iwsds2023/home> (IWSDS2023) is
taking place this week, 21-24 February.
As this year’s conference theme is "Diversity in Dialogue Systems", many
contributions also address under-resourced languages.
Check the program for interesting papers here
<https://sites.google.com/view/iwsds2023/program/detailed-schedule>.
A special session on Dialogue Systems for Multilingual and
Under-resourced Language Speakers
<https://sites.google.com/view/iwsds2023/special-sessions-workshops> is
planned on Friday 24.
Best,
Claudia
--
Claudia Soria
Researcher
Cnr-Istituto di Linguistica Computazionale “Antonio Zampolli”
Via Moruzzi 1
56124 Pisa
Italy
Tel. +39 050 3153166
Skype clausor
Dear colleagues,
My team is heading to Bolivia again and this time we'll have internet
access!
We'd love to find a system (online app or code) that allows us to put audio
files in a secure server, and which has a simple and intuitive interface
where our informants can log in, listen to an audio clip, re-speak it,
transcribe it on a piece of paper, take a photo of the paper (perhaps with
their phone) & upload that photo to be stored together with the rest of the
information on that clip, and then move on to the next clip.
Does anything like that exist? What would you recommend? Feel free to email
just me, and if so, let me know if I can share back the suggestions with
the group.
Thank you in advance,
Alex
---------------------------------------------------------------
Alex (Alejandrina) Cristia
Researcher, CNRS
Laboratoire de Sciences Cognitives et Psycholinguistique
29, rue d'Ulm, 75005, Paris, FRANCE
My site: www.acristia.org
---------------------------------------------------------------
If you donate, ask me about effective charities
<https://effectivealtruism.us8.list-manage.com/track/click?u=52b028e7f799cca…>.
/ If you donate, ask me about effective giving (guide in French)
<https://www.altruismeefficacefrance.org/guide-don-efficace-1/>.
Third call for papers
Fourth workshop on Resources for African Indigenous Languages (RAIL)
https://bit.ly/rail2023
Note: deadline extension and submission system information
The 4th RAIL (Resources for African Indigenous* Languages) workshop
will be co-located with EACL 2023 in Dubrovnik, Croatia. The Resources
for African Indigenous Languages (RAIL) workshop is an
interdisciplinary platform for researchers working on resources (data
collections, tools, etc.) specifically targeted towards African
indigenous languages. In particular, it aims to create the conditions
for the emergence of a scientific community of practice that focuses on
data, as well as computational linguistic tools specifically designed
for or applied to indigenous languages found in Africa.
Previous workshops showed that the presented problems (and solutions)
are not only applicable to African languages. Many issues are also
relevant to other low-resource languages, such as different scripts and
properties like tone. As such, these languages share similar
challenges. This allows for researchers working on these languages with
such properties (including non-African languages) to learn from each
other, especially on issues pertaining to language resource
development.
The RAIL workshop has several aims. First, it brings together
researchers working on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Impact of impairments on language resources” as its
theme, but submissions on any topic related to properties of African
indigenous languages (including non-African languages) may be accepted.
Suggested topics include (but are not limited to) the following:
* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous
  languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African
  indigenous languages
* Revealing unknown or unpublished existing resources for African
  indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African indigenous
  language resources
*: The term indigenous languages used in the RAIL workshop is intended
to refer to non-colonial languages (in this case those used in Africa).
In no way is this term used to cause any harm or discomfort to anyone.
Many of these languages were or are still marginalised, and the aim of
the workshop is to bring attention to the creation, curation, and
development of resources for these languages in Africa.
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content plus additional pages of references. The
final camera-ready versions of accepted long papers are allowed one
additional page of content (so up to 9 pages) so that reviewers’
feedback can be incorporated.
Submissions need to use the EACL stylesheets. These can be found at
https://2023.eacl.org/calls/styles. Submission is electronic in PDF
through the START system which can be found at
https://softconf.com/eacl2023/RAIL2023. Reviewing is double-blind, so
make sure to anonymize your submission (e.g., do not provide author
names, affiliations, project names, etc.). Limit the number of self-
citations (anonymized citations should not be used). Accepted papers
will be published in the ACL workshop proceedings.
Please make sure you also go through the responsible NLP checklist
(https://aclrollingreview.org/responsibleNLPresearch/). Also,
submissions should have a section titled “Limitations” (as described in
the stylesheets). Authors are also encouraged to include an explicit
ethics statement.
Important dates:
Submission deadline 20 February 2023
Date of notification 13 March 2023
Camera ready deadline 27 March 2023
RAIL workshop 5 or 6 May 2023
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Don Mthobela, Cam Foundation
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
Second call for papers
Fourth workshop on Resources for African Indigenous Languages (RAIL)
https://bit.ly/rail2023
The 4th RAIL (Resources for African Indigenous* Languages) workshop
will be co-located with EACL 2023 in Dubrovnik, Croatia. The Resources
for African Indigenous Languages (RAIL) workshop is an
interdisciplinary platform for researchers working on resources (data
collections, tools, etc.) specifically targeted towards African
indigenous languages. In particular, it aims to create the conditions
for the emergence of a scientific community of practice that focuses on
data, as well as computational linguistic tools specifically designed
for or applied to indigenous languages found in Africa.
Previous workshops showed that the presented problems (and solutions)
are not only applicable to African languages. Many issues are also
relevant to other low-resource languages, such as different scripts and
properties like tone. As such, these languages share similar
challenges. This allows for researchers working on these languages with
such properties (including non-African languages) to learn from each
other, especially on issues pertaining to language resource
development.
The RAIL workshop has several aims. First, it brings together
researchers working on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Impact of impairments on language resources” as its
theme, but submissions on any topic related to properties of African
indigenous languages (including non-African languages) may be accepted.
Suggested topics include (but are not limited to) the following:
* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous
  languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African
  indigenous languages
* Revealing unknown or unpublished existing resources for African
  indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African indigenous
  language resources
*: The term indigenous languages used in the RAIL workshop is intended
to refer to non-colonial languages (in this case those used in Africa).
In no way is this term used to cause any harm or discomfort to anyone.
Many of these languages were or are still marginalised, and the aim of
the workshop is to bring attention to the creation, curation, and
development of resources for these languages in Africa.
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions, presenting completed work, may consist of up
to eight (8) pages of content plus additional pages of references. The
final camera-ready version of an accepted long paper is allowed one
additional page of content (so up to 9 pages) so that reviewers’
feedback can be incorporated.
Submissions need to use the EACL stylesheets. These can be found at
https://2023.eacl.org/calls/styles. Submission is electronic in PDF
through the START system (link will be provided once available).
Reviewing is double-blind, so make sure to anonymize your submission
(e.g., do not provide author names, affiliations, project names, etc.).
Limit the number of self-citations (anonymized citations should not be
used). Accepted papers will be published in the ACL workshop
proceedings.
Please make sure you also go through the responsible NLP checklist
(https://aclrollingreview.org/responsibleNLPresearch/). Also,
submissions should have a section titled “Limitations” (as described in
the stylesheets). Authors are also encouraged to include an explicit
ethics statement.
Important dates:
Submission deadline:    13 February 2023
Date of notification:   13 March 2023
Camera-ready deadline:  27 March 2023
RAIL workshop:          5 or 6 May 2023
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Don Mthobela, Cam Foundation
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org
________________________________
On behalf of our colleague, Laurent Besacier:
Hi everyone,
I thought this info could be useful for scientists working on NMT for
low-resource languages.
We recently released the SMaLL-100 model, a Shallow Multilingual MT Model
for Low-Resource Languages.
It is a distilled version of the large 12B M2M-100 model released by Meta.
You can find out more here: https://arxiv.org/abs/2210.11621
The reason I want to share this with the SIGUL & EAMT communities is
that we provide models (a good pre-trained starting point for developing
MT for low-resource language pairs) and also a demo platform to access MT
for those 10,000 language pairs!
Models: https://huggingface.co/alirezamsh/small100
Online MT demo: https://huggingface.co/spaces/alirezamsh/small100
(still a bit slow because it is currently running on 2 vCPUs with 16 GB RAM)
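
As a quick sanity check on the "10,000 language pairs" figure, here is a minimal sketch of the arithmetic. It assumes the model covers 100 languages (as the name suggests) and that each translation direction counts as its own pair; both are assumptions, not statements from the announcement. The function name and the four language codes are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def count_directed_pairs(num_languages: int) -> int:
    """Count ordered (source, target) pairs where source != target."""
    return num_languages * (num_languages - 1)


# Hypothetical language codes, used only to cross-check the formula
# against an explicit enumeration of directed pairs.
sample_langs = ["en", "fr", "sw", "zu"]
sample_pairs = list(permutations(sample_langs, 2))
assert len(sample_pairs) == count_directed_pairs(len(sample_langs))

# Assumption: 100 supported languages, each direction counted separately.
print(count_directed_pairs(100))  # 9900
```

Under these assumptions the count is 9,900 directed pairs, which rounds to the 10,000 quoted above.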
Best regards
Laurent Besacier
Naver Labs Europe
--
Claudia Soria
Researcher
Cnr-Istituto di Linguistica Computazionale “Antonio Zampolli”
Via Moruzzi 1
56124 Pisa
Italy
Tel. +39 050 3153166
Skype clausor