Dear colleagues,
My team and I are thinking of approaching a Bolivian community we have
collaborated with in the past about potentially building SLT tools and/or a
dataset with them. One of our research projects requires the creation of a
TTS system, so we think it would be important to couch this research goal
within a collaborative research project that takes into account the
community's own goals and needs.
This is the first time I have done anything like this, and I'm sorry if my
question is very naïve: Do you have materials you'd recommend for us to
read, such as:
- information often provided to aboriginal communities about this kind
of effort
- information about how other communities have set up a payment scheme
- information about variable terms in licensing; e.g., if the community
does not want commercial reuse, is that OK with the LDC? Are there any other
restrictions communities often ask for? Any other rights, such as royalties
in case of commercialization, or free access to the software?
Please reply to me alone. I'll compile all replies and share back the full
list of resources with the mailing list.
Thank you in advance,
Alex
---------------------------------------------------------------
Alex (Alejandrina) Cristia
Researcher, CNRS
Laboratoire de Sciences Cognitives et Psycholinguistique
29, rue d'Ulm, 75005, Paris, FRANCE
My site: www.acristia.org
---------------------------------------------------------------
If you donate, ask me about effective charities
<https://effectivealtruism.us8.list-manage.com/track/click?u=52b028e7f799cca…>.
/ Si vous faites des dons, demandez moi sur le don efficace
<https://www.altruismeefficacefrance.org/guide-don-efficace-1/>.
Dear colleagues,
A fascinating opportunity for those working on languages that are
inflectional! Read below & contact Ben Ambridge, in cc, for any questions.
-Alex
---------- Forwarded message ---------
From: Ben Ambridge <ben.ambridge(a)manchester.ac.uk>
Date: Thu, Apr 28, 2022 at 10:15 PM
Subject: Fwd: Crosslinguistic morphology experiments - call for collaborators
To: Alex CRISTIA <alecristia(a)gmail.com>
Hi Alex - I know you’ve worked on quite a few hard-to-reach languages -
would you be interested in this, or able to point me in the direction of
others who might be?
Thanks
Ben
===
Dear colleagues, we are seeking potential collaborators for a grant
application for a large crosslinguistic project investigating children’s
acquisition of inflectional morphology. We aim to include 100
typologically-diverse languages. Due to the size of the envisaged project,
it would not be feasible to apply for funding for full-time research
assistants to test children (or to fund a portion of each collaborator’s
salary). Our intention for the grant application is that each collaborator
will be able to claim up to €10,000 for expenses (e.g., travel, laptops,
participant payments, part-time/casual researchers), with the data
collected by a researcher who is already primarily sponsored/employed
(e.g., as PhD student, postdoc or research assistant) at your institution.
We will provide computerized elicitation tasks; your role (with the help of
full-time research and support staff employed at our end) would be to
translate the task into your language and inflectional system and to
supervise data collection (with children aged 3-6, and adults). At the
moment, our goal is simply to put together a list of *potential*
collaborators+languages for the grant application (NB: we can include only
languages with verb and/or noun person/case/number inflectional
morphology). To be included on this provisional list, please email
Ben.Ambridge(a)Manchester.ac.uk with your name, institution and language(s).
****Apologies for cross-postings****
Call for Papers
SIGUL 2022 Workshop <https://sigul-2022.ilc.cnr.it/>
a post-Conference Workshop of LREC 2022
Marseille (FR), 24-25 June 2022
*EXTENDED paper submission deadline: 19 April 2022*
The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022) will provide a forum for the presentation and discussion of cutting-edge research in text and speech processing for under-resourced languages by academic and industry researchers. SIGUL 2022 will carry on the tradition of the CCURL-SLTU (Collaboration and Computing for Under-Resourced Languages – Spoken Language Technologies for Under-resourced languages) Workshop Series, which has been organised since 2008 and, as LREC Workshops, since 2014. As usual, this Workshop spans the research interest areas of less-resourced, under-resourced, endangered, minority and minoritized languages. Since this year LREC includes a track dedicated specifically to endangered and less-resourced languages, the workshop aims to be a venue for networking and discussion as much as for scientific debate.
Over the last few years, research in NLP for less-resourced languages has gained momentum. The multiplication of research interest makes it even more necessary for the community that revolves around less-resourced languages to find opportunities for aggregation and discussion. Following the long-standing series of previous meetings, the SIGUL venue will provide a forum for the presentation of cutting-edge research in NLP, MT and Speech Technologies for under-resourced languages to both academic and industry researchers, and will also offer a venue where researchers in different disciplines and from varied backgrounds can fruitfully explore new areas of intellectual and practical development while honouring their common interest of sustaining less-resourced languages.
Topics include but are not limited to:
- General research on under-resourced languages.
- Transfer-learning techniques for under-resourced languages (use of multilingual, pretrained models, unsupervised, semi-supervised, zero-shot, few-shot training,...) in NLP, MT and Speech technologies.
- We also invite position papers on methodological, ethical, or institutional issues.
Instructions for submission can be found here <https://sigul-2022.ilc.cnr.it/submission/>
Important Dates
- Paper submission deadline: *19* April 2022
- Notification of acceptance: 3 May 2022
- Camera-ready paper: 23 May 2022
- Workshop date: 24-25 June 2022
Organizing Committee
Maite Melero - Barcelona Supercomputing Centre, Spain
Sakriani Sakti - NAIST, Japan
Claudia Soria - CNR-ILC, Italy
To contact the organisers, please mail sigul2022@ilc.cnr.it (Subject: [SIGUL2022]).
>
>
> Research internship position at NAVER LABS Europe (Grenoble, France) on Energy-Based Models for Controlled Text Generation
>
> Start date: June 2022
> Duration: 5-6 months
>
> DESCRIPTION
> Large language models can now be used to generate highly fluent texts. However, the synthesized utterances can be deficient in other important respects: semantic consistency, faithfulness to the facts, and the avoidance of toxic or socially biased content.
>
> Our team has developed several effective solutions on that front [1,2,3,4] exploiting the expressive power of Energy-Based Models in defining constraints over generative models. However, certain challenges remain: (1) How can we quickly adapt to changing control conditions without the need for model retraining? (2) Can we exploit these techniques to improve on hard-to-quantify features, such as safety, unbiasedness, textual coherence, or matching the human intention? (3) Can we improve training speed/robustness, for example, by leveraging techniques from RL?
>
> We are looking for a motivated intern to help us develop techniques and algorithms addressing these challenges. Experiments will be conducted on selected text generation tasks using state-of-the-art pre-trained language models.
>
> The successful candidate should be enrolled in a graduate program, at the Master or (preferably) PhD level.
>
> The intern will work in a team consisting of Hady Elsahar, Marc Dymetman, Germán Kruszewski, and Jos Rozen.
>
> Publication of this internship's results in major conferences/journals will be strongly encouraged.
>
> REQUIRED SKILLS
> - Strong programming skills
> - Relevant experience with training Deep Learning models for NLP
> - Strong mathematical skills
> - Ability to communicate research
>
> OPTIONAL SKILLS
> - Knowledge of MCMC sampling techniques and/or Reinforcement Learning
> - Publications at peer-reviewed AI conferences
>
> REFERENCES
> [1] Khalifa et al., A Distributional Approach to Controlled Text Generation, in ICLR 2021
> [2] Eikema et al., Sampling from Energy-Based Models with Quality/Efficiency Trade-offs, in CtrlGen at NeurIPS 2021
> [3] Korbak et al., Energy-Based Models for Code Generation under Compilability Constraints, in NLP4Prog at ACL 2021
> [4] Korbak et al., Controlling Conditional Language Models with Distributional Policy Gradients, in CtrlGen at NeurIPS 2021
>
> APPLICATION INSTRUCTIONS
> Please note that applicants must be registered students at a university or other academic institution and that this establishment will need to sign an 'Internship Convention' with NAVER LABS Europe before the student is accepted.
>
> You can apply for this position online at <https://europe.naverlabs.com/job/energy-based-models-for-controlled-text-ge…>. Don't forget to upload your CV and cover letter before you submit. Incomplete applications will not be accepted.
>
> ABOUT NAVER LABS
> NAVER is the #1 Internet portal in Korea with activities that span a wide range of businesses including search, commerce, content, financial and cloud platforms.
>
> NAVER LABS, co-located in Korea and France, is the organization dedicated to preparing NAVER’s future. NAVER LABS Europe is located in a spectacular setting in Grenoble, in the heart of the French Alps. Scientists at NAVER LABS Europe are empowered to pursue long-term research problems that, if successful, can have significant impact and transform NAVER. We take our ideas as far as research can to create the best technology of its kind. Active participation in the academic community and collaborations with world-class public research groups are, among others, important tools to achieve these goals. Teamwork, focus and persistence are important values for us.
>
> NAVER LABS Europe is an equal opportunity employer.
>
> For more information and to apply, see <https://europe.naverlabs.com/job/energy-based-models-for-controlled-text-ge…>
****Apologies for cross-postings****
***Please help disseminate****
1st Call for Papers
SIGUL 2022 Workshop <https://sigul-2022.ilc.cnr.it/>
a post-Conference Workshop of LREC 2022
Marseille (FR), 24-25 June 2022
*paper submission deadline: 11 April 2022*
The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on
Under-Resourced Languages (SIGUL 2022) will provide a forum for the
presentation and discussion of cutting-edge research in text and speech
processing for under-resourced languages by academic and industry
researchers. SIGUL 2022 will carry on the tradition of the CCURL-SLTU
(Collaboration and Computing for Under-Resourced Languages – Spoken
Language Technologies for Under-resourced languages) Workshop Series,
which has been organised since 2008 and, as LREC Workshops, since 2014.
As usual, this Workshop spans the research interest areas of
less-resourced, under-resourced, endangered, minority and minoritized
languages. Since this year LREC includes a track dedicated specifically
to endangered and less-resourced languages, the workshop aims to be a
venue for networking and discussion as much as for scientific debate.
Over the last few years, research in NLP for less-resourced languages has
gained momentum. The multiplication of research interest makes it even
more necessary for the community that revolves around less-resourced
languages to find opportunities for aggregation and discussion.
Following the long-standing series of previous meetings, the SIGUL venue
will provide a forum for the presentation of cutting-edge research in
NLP, MT and Speech Technologies for under-resourced languages to both
academic and industry researchers, and will also offer a venue where
researchers in different disciplines and from varied backgrounds can
fruitfully explore new areas of intellectual and practical development
while honouring their common interest of sustaining less-resourced
languages.
Topics include but are not limited to:
* General research on under-resourced languages.
* Transfer-learning techniques for under-resourced languages (use of
  multilingual, pretrained models, unsupervised, semi-supervised,
  zero-shot, few-shot training,...) in NLP, MT and Speech technologies.
* We also invite position papers on methodological, ethical, or
  institutional issues.
Instructions for submission can be found here
<https://sigul-2022.ilc.cnr.it/submission/>
Important Dates
- Paper submission deadline: 11 April 2022
- Notification of acceptance: 3 May 2022
- Camera-ready paper: 23 May 2022
- Workshop date: 24-25 June 2022
Organizing Committee
* Maite Melero - Barcelona Supercomputing Centre, Spain
* Sakriani Sakti - NAIST, Japan
* Claudia Soria - CNR-ILC, Italy
To contact the organisers, please mail sigul2022@ilc.cnr.it
(Subject: [SIGUL2022]).
--
Claudia Soria
Researcher
Istituto di Linguistica Computazionale "A. Zampolli"
Consiglio Nazionale delle Ricerche
Via Moruzzi 1
56124 Pisa
Italy
Management Committee member
COST Action CA19102 'Language In The Human-Machine Era' (LITHME)
www.lithme.eu
Tel. +39 050 3153166
Skype clausor
To kick off the International Decade of Indigenous Languages 2022-32, Linguapax will present the 2021 Linguapax Review special issue on Language Technologies and Language Diversity.
The event will take place online on 9 March 2022, at 6 pm CET, via Zoom.
During the presentation, authors of the 2021 Linguapax Review will participate in a live debate. We are proud to be joined by:
Andras Kornai <https://www.linkedin.com/in/ACoAAAAB31oBkjnx7uquXtV7tM7-w2lGSsKxbdw>, advisor at the Hungarian Academy of Sciences and author of "Digital Language Death"
Daniel Pimienta <https://www.linkedin.com/in/ACoAAABDoo8BjE6560HaeAmfgcrAOHIqHATliMg>, mathematician, head of the Observatory of Linguistic and Cultural Diversity on the Internet
Tunde Adegbola <https://www.linkedin.com/in/ACoAAABR2W0B-E35vkfUZzAJdeTL_awNeQq0oDM>, scientist, musician, engineer, linguist and culture activist, founder of African Languages Technology Initiative (Alt-i)
Roland Kuhn <https://nrc.canada.ca/en/corporate/contact-us/nrc-directory-science-profess…>, PRO at National Research Council Canada and leader of the Indigenous Languages Technology project
Eddie Avila <https://www.linkedin.com/in/ACoAAABGd_8BCQSXFxlgDkk4uf8jWhru9V_AbUA>, director of Rising Voices, an initiative to support peer networks of indigenous language digital activists in Latin America
Subhashish P. Panigrahi <https://www.linkedin.com/in/ACoAAANsOmkB2pSPSzrrS0zGo0pKhFCGXY8I80s>, National Geographic Explorer and documentary filmmaker, founder of OpenSpeaks, a project for documenting indigenous and endangered languages
The debate will be moderated by Maite Melero <https://www.linkedin.com/in/ACoAAAK3E2UBEPSieDYF2HzmjZpjuAL1A-7E5tQ>, coordinator of the special issue, and will address questions such as:
Why should we - all of us - care about linguistic diversity?
Are new technologies a threat or an opportunity for endangered languages?
What are the keys to effective language digital activism?
We will open this interesting debate to the audience.
Participation is free, but registration is required at https://lnkd.in/eZMf-QcV
fyi,
Joseph
-------- Forwarded message --------
Subject: Update for Event : Special Session on Low-Resource ASR Development
Date: Tue, 8 Feb 2022 10:21:32 +0000
From: ACL Member Portal <portal(a)aclweb.org>
To: joseph.mariani(a)limsi.fr
Greetings Mariani J Joseph,
Call for Papers:
Special Session on Low-Resource ASR Development at INTERSPEECH 2022
We invite submission of original results or studies on automatic speech
recognition (ASR) technologies for low-resource languages to the
preliminarily accepted Low-Resource ASR Special Session at INTERSPEECH 2022.
The special session aims to bring together researchers from all sectors
working on ASR (Automatic Speech Recognition) for low-resource languages
and dialects to discuss the state of the art and future directions. It
will allow for fruitful exchanges between participants in low-resource
ASR challenges and evaluations and other researchers working on
low-resource ASR development.
One such challenge is the OpenASR Challenge series conducted by NIST
(National Institute of Standards and Technology) in coordination with
IARPA’s (Intelligence Advanced Research Projects Activity) MATERIAL
(Machine Translation for English Retrieval of Information in Any
Language) program. The most recent challenge, OpenASR21, offered an ASR
test of 15 low-resource languages for conversational telephone speech,
with additional data genres and case-sensitive scoring for some of the
languages.
Another challenge is the Hindi ASR Challenge, which was recently opened to
evaluate regional variations of Hindi using spontaneous
telephone speech recordings made available by Gram Vaani, a social
technology enterprise. The regional variations of Hindi,
together with spontaneity of speech, natural background, and
transcriptions with varying degrees of accuracy due to crowdsourcing,
make it a unique corpus for automatic recognition of spontaneous
telephone speech in low-resource regional variations of Hindi. In
addition, 1000 hours of audio-only data (no transcription) have been
released with this challenge to explore self-supervised training for
such a low-resource framework.
We invite contributions from the OpenASR21 Challenge participants, the
MATERIAL performers, the Hindi ASR Challenge participants, and any other
researchers with relevant work in the low-resource ASR problem space.
Topics:
- Reports of results from tests of low-resource ASR, such as (but not
  limited to) the NIST/IARPA OpenASR21 Challenge, IARPA MATERIAL
  evaluations, and the Hindi ASR Challenge.
- Topics focused on aspects of challenges and solutions in low-resource
  settings, such as:
  - Zero- or few-shot learning methods
  - Transfer learning techniques
  - Cross-lingual training techniques
  - Use of pretrained models
  - Factors influencing ASR performance (such as dialect, gender, genre,
    variations in training data amount, or casing)
- Any other topics focused on low-resource ASR challenges and solutions
URL:
https://www.nist.gov/itl/iad/mig/low-resource-asr-development-special-se...
[1]
Organizers:
Peter Bell, University of Edinburgh
Jayadev Billa, University of Southern California Information Sciences
Institute
Prasanta Ghosh, Indian Institute of Science, Bangalore
William Hartmann, Raytheon BBN Technologies
Kay Peterson, National Institute of Standards and Technology
Aaditeshwar Seth, Indian Institute of Technology, Delhi
Important dates:
Initial paper submission deadline: March 21, 2022
Please see the Important Dates section of the INTERSPEECH 2022 Call for
Papers for the most up-to-date paper submission, acceptance, and other
relevant dates.
Read more:
https://www.aclweb.org/portal/content/special-session-low-resource-asr-deve…
[1]
https://www.nist.gov/itl/iad/mig/low-resource-asr-development-special-sessi…
--
On January 1st, 2021, LIMSI merged with the LRI lab and became the LISN (Laboratoire Interdisciplinaire des Sciences du Numérique, Interdisciplinary Computer Science Laboratory)
-
Joseph MARIANI
Emeritus Research Director at CNRS
LISN
Rue John von Neumann
Université Paris-Saclay
Batiment 508
91405 ORSAY Cedex (France)
Tel: +33 1 69 15 78 56
Email: Joseph.Mariani@limsi.fr
Web: https://perso.limsi.fr/mariani/index
Web IMMI: http://immi.cnrs.fr/
[Apologies if you have already received this message]
Among the highlights labeled «French Presidency of the Council of the
European Union 2022»,
the Ministry of Culture, through the General Delegation to the French
language and languages of France, is organizing the online Forum on
*Innovation, technologies and plurilingualism*, on February 7-9, 2022,
and would like to invite you to attend this 3-day event.
Today, digital transformation offers new possibilities for
plurilingualism, a challenge for social cohesion and citizenship in Europe.
Opened by Roselyne Bachelot-Narquin, the French Minister of Culture,
this forum will bring together many actors, French and European, in the
domains of Translation, Language Technologies, Digital Technology and
Artificial Intelligence, for the benefit of multilingualism in our
societies.
*Innovation, promotion, technologies and multilingualism*
Four themes will structure the programme
(https://www.culture.gouv.fr/Media/Medias-creation-rapide/PROGRAMME-EN_Forum…):
* Multilingualism and translation in a Europe of culture and
knowledge facing the digital challenge;
* Language learning and teaching in Europe, their promotion and
attractiveness through digital innovation;
* Natural Language Processing, “discoverability” of scientific
content, collection, evaluation and sharing of digital linguistic
resources, with stakeholders from the worlds of science, research
and business;
* Language technologies, at the service of the European citizen,
to promote EU values and a common sense of belonging.
The Forum is organized in partnership with the Hauts-de-France Region
and the ARTE TV channel.
To attend this event and receive the related links, you are invited to
register on the platform of the French Presidency of the Council of the
European Union as follows:
*Meeting: Forum « Innovation, Technologies and Plurilingualism »*
1- Click on the link:
https://delegues.accreditation-eu2022.fr/secured/login
2- Click on: /Create a new account/
3- Fill in the form
4- You will receive a message; click on the link to /Activate your account/
5- On the /Authentication/ page, fill in FORINOVPLURI.PARTICIDGLFLF
(Delegation field) / ParticipaDGLFLF2022! (access code) / your email /
your password (created at the first connection)
Contact: soraya.loukakou@culture.gouv.fr
Dear All,
We are happy to announce that SIGUL will have its own workshop during
LREC2022: the workshop will take place on June 24 and 25. The call for
papers and the list of the other accepted workshops are available here:
https://lrec2022.lrec-conf.org/en/workshops-and-tutorials/ws-tut-schedule/.
Please distribute!
Best wishes for the upcoming Festive Season,
Claudia