FYI.
> Begin forwarded message:
>
> From: Menno Van Zaanen <Menno.VanZaanen(a)nwu.ac.za>
> Subject: [Corpora-List] 2nd CfP Third workshop on Resources for African Indigenous Language (RAIL)
> Date: 19 July 2022 09:07:30 CEST
> To: "corpora(a)list.elra.info" <corpora(a)list.elra.info>
>
>
> Second call for papers
>
> Third workshop on Resources for African Indigenous Languages (RAIL)
> https://bit.ly/rail2022
>
>
> The South African Centre for Digital Language Resources (SADiLaR) is
> organising the 3rd RAIL workshop in the field of Resources for African
> Indigenous Languages. This workshop aims to bring together researchers
> who are interested in showcasing their research and thereby boosting
> the field of African indigenous languages. The workshop provides an
> overview of the current state of the art and emphasizes availability of African
> indigenous language resources, including both data and tools.
> Additionally, it will allow for information sharing among researchers
> interested in African indigenous languages and also start discussions
> on improving the quality and availability of the resources. Many
> African indigenous languages currently have no or very limited
> resources available and, additionally, they are often structurally
> quite different from more well-resourced languages, requiring the
> development and use of specialized techniques. By bringing together
> researchers from different fields (e.g., (computational) linguistics,
> sociolinguistics, language technology) to discuss the development of
> language resources for African indigenous languages, we hope to boost
> research in this field.
>
> The RAIL workshop is an interdisciplinary platform for researchers
> working on resources (data collections, tools, etc.) specifically
> targeted towards African indigenous languages. It aims to create the
> conditions for the emergence of a scientific community of practice that
> focuses on data, as well as tools, specifically designed for or applied
> to indigenous languages found in Africa.
>
> Suggested topics include the following:
> * Digital representations of linguistic structures
> * Descriptions of corpora or other data sets of African indigenous
> languages
> * Building resources for (under-resourced) African indigenous languages
> * Developing and using African indigenous languages in the digital age
> * Effectiveness of digital technologies for the development of African
> indigenous languages
> * Revealing unknown or unpublished existing resources for African
> indigenous languages
> * Developing desired resources for African indigenous languages
> * Improving quality, availability and accessibility of African
> indigenous language resources
>
>
> The 3rd RAIL workshop 2022 will be co-located with the 10th Southern
> African Microlinguistics Workshop (
> https://sites.google.com/nwulettere.co.za/samwop-10/home). This will be
> an in-person event located in Potchefstroom, South Africa. Registration
> will be free.
>
> RAIL 2022 submission requirements:
> * RAIL asks for full papers from 4 pages to 8 pages (plus more pages
> for references if needed), which must strictly follow the Journal of
> the Digital Humanities Association of Southern Africa style guide (
> https://upjournals.up.ac.za/index.php/dhasa/libraryFiles/downloadPublic/30
> ).
> * Accepted submissions will be published in JDHASA, the Journal of the
> Digital Humanities Association of Southern Africa (
> https://upjournals.up.ac.za/index.php/dhasa/).
> * Papers will be double-blind peer-reviewed and must be submitted
> through EasyChair (https://easychair.org/my/conference?conf=rail2022).
>
> Important dates
> Submission deadline: 28 August 2022
> Date of notification: 30 September 2022
> Camera ready copy deadline: 23 October 2022
> RAIL: 30 November 2022, North-West University - Potchefstroom
> SAMWOP: 1 – 3 December 2022, North-West University - Potchefstroom
>
>
> Organising Committee
> Jessica Mabaso
> Rooweither Mabuya
> Muzi Matfunjwa
> Mmasibidi Setaka
> Menno van Zaanen
>
> South African Centre for Digital Language Resources (SADiLaR), South
> Africa
>
> --
> Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
> Professor in Digital Humanities
> South African Centre for Digital Language Resources
> https://www.sadilar.org
Dear All,
A two-day speech and language technology hackathon will
take place during the IEEE Spoken Language Technology (SLT) Workshop in
Doha, Qatar, on January 7th and 8th, 2023. This year's Hackathon will be
inspiring, momentous, and fun. The goal is to build a diverse community
of people who want to explore and envision how machines understand the
world's spoken languages.
More details can be found here: https://slt2022.org/hackathon.php
Sincerely yours,
Sakriani Sakti
PhD Position: Naver Labs Europe (France) and FBK Trento (Italy), starting Nov 2022
Have you recently completed, or do you expect to complete very soon, an MSc or equivalent degree in computer science, artificial intelligence, computational linguistics, engineering, or a related area? Are you interested in carrying out research on Speech-to-Speech Translation during the next few years? Are you excited to spend a part of your life in two pleasant alpine cities in France (Grenoble) and Italy (Trento)?
WE ARE LOOKING FOR YOU!!!
The Machine Translation (MT) group at Fondazione Bruno Kessler (Trento, Italy) and Naver Labs Europe (Grenoble, France) are pleased to announce the availability of the following fully-funded PhD position in the Doctorate Program in Industrial Innovation of the University of Trento and Fondazione Bruno Kessler.
PhD topic: Unified Foundation models for Speech-to-Speech Translation
The deadline for application: August 23rd.
More details here: http://tinyurl.com/PhD-FBK-NLE
=====
Laurent Besacier
Dear SIGUL list members,
we are happy to inform you that the SIGUL2022 Workshop Proceedings are
available for download:
http://www.lrec-conf.org/proceedings/lrec2022/workshops/SIGUL/2022.sigul-1.…
The individual papers can be found as well on the workshop program page,
where we are also making available the slides and posters that were used
during the presentations: https://sigul-2022.ilc.cnr.it/programme/
SIGUL2022 was held on 24 and 25 June in Marseille,
co-located with LREC2022. It featured 27 papers addressing a vast array
of topics and covering 76 different languages from Africa, the Americas,
Asia, and Europe.
We are very thankful to all the authors, participants, invited speakers,
chairs, panelists, local organisers and program committee members for
contributing to a very successful event.
All the best,
Claudia, Maite, Sakti (SIGUL2022 Co-chairs)
--
Claudia Soria
Researcher
Istituto di Linguistica Computazionale "A. Zampolli"
Consiglio Nazionale delle Ricerche
Via Moruzzi 1
56124 Pisa
Italy
Management Committee member
COST Action CA19102 'Language In The Human-Machine Era' (LITHME)
www.lithme.eu
Tel. +39 050 3153166
Skype clausor
Dear colleagues,
My team and I are thinking of approaching a Bolivian community we have
collaborated with in the past about potentially building SLT tools and/or a
dataset with them. One of our research projects requires the creation of a
TTS system, so we think it would be important to couch this research goal
within a collaborative research project that takes into account the
communities' own goals and needs.
This is the first time I have done anything like this, and I'm sorry if my
question is very naïve: Do you have materials you'd recommend for us to
read, such as:
- information often provided to aboriginal communities about this kind
of effort
- information about how other communities have set up a payment scheme
- information about variable terms in licensing; e.g., if the community
does not want commercial reuse, is that OK with the LDC? Are there other
restrictions communities often ask for? Any other rights, such as royalties
in case of commercialization, or free access to the software?
Please reply to me alone. I'll compile all replies and share back the full
list of resources with the mailing list.
Thank you in advance,
Alex
---------------------------------------------------------------
Alex (Alejandrina) Cristia
Researcher, CNRS
Laboratoire de Sciences Cognitives et Psycholinguistique
29, rue d'Ulm, 75005, Paris, FRANCE
My site: www.acristia.org
---------------------------------------------------------------
If you donate, ask me about effective charities
<https://effectivealtruism.us8.list-manage.com/track/click?u=52b028e7f799cca…>.
/ Si vous faites des dons, demandez moi sur le don efficace
<https://www.altruismeefficacefrance.org/guide-don-efficace-1/>.
Dear colleagues,
A fascinating opportunity for those working on languages that are
inflectional! Read below & contact Ben Ambridge, in cc, for any questions.
-Alex
---------- Forwarded message ---------
From: Ben Ambridge <ben.ambridge(a)manchester.ac.uk>
Date: Thu, Apr 28, 2022 at 10:15 PM
Subject: Fwd: Crosslinguistic morphology experiments - call for collaborators
To: Alex CRISTIA <alecristia(a)gmail.com>
Hi Alex - I know you’ve worked on quite a few hard-to-reach languages -
would you be interested in this, or able to point me in the direction of
others who might be?
Thanks
Ben
===
Dear colleagues, we are seeking potential collaborators for a grant
application for a large crosslinguistic project investigating children’s
acquisition of inflectional morphology. We aim to include 100
typologically-diverse languages. Due to the size of the envisaged project,
it would not be feasible to apply for funding for full-time research
assistants to test children (or to fund a portion of each collaborator’s
salary). Our intention for the grant application is that each collaborator
will be able to claim up to €10,000 for expenses (e.g., travel, laptops,
participant payments, part-time/casual researchers), with the data
collected by a researcher who is already primarily sponsored/employed
(e.g., as PhD student, postdoc or research assistant) at your institution.
We will provide computerized elicitation tasks; your role (with the help of
full-time research and support staff employed at our end) would be to
translate the task into your language and inflectional system and to
supervise data collection (with children aged 3-6, and adults). At the
moment, our goal is simply to put together a list of *potential*
collaborators+languages for the grant application (NB: we can include only
languages with verb and/or noun person/case/number inflectional
morphology). To be included on this provisional list, please email
Ben.Ambridge(a)Manchester.ac.uk with your name, institution and language(s).
****Apologies for cross-postings****
Call for Papers
SIGUL 2022 Workshop <https://sigul-2022.ilc.cnr.it/>
a post-Conference Workshop of LREC 2022
Marseille (FR), 24-25 June 2022
*EXTENDED paper submission deadline: 19 April 2022*
The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022) will provide a forum for the presentation and discussion of cutting-edge research in text and speech processing for under-resourced languages by academic and industry researchers. SIGUL 2022 will carry on the tradition of the CCURL-SLTU (Collaboration and Computing for Under-Resourced Languages – Spoken Language Technologies for Under-resourced languages) Workshop Series, which has been organised since 2008 and, as LREC Workshops, since 2014. As usual, this Workshop spans the research interest areas of less-resourced, under-resourced, endangered, minority and minoritized languages. Since this year LREC includes a track dedicated specifically to endangered and less-resourced languages, the workshop aims to be a venue for networking and discussion as much as for scientific debate.
In recent years, research in NLP for less-resourced languages has gained momentum. The growth in research interest makes it even more necessary for the community that revolves around less-resourced languages to find opportunities to gather and hold discussions. Following the long-standing series of previous meetings, the SIGUL venue will provide a forum for presenting cutting-edge research in NLP, MT and Speech Technologies for under-resourced languages to both academic and industry researchers. It will also offer a venue where researchers in different disciplines and from varied backgrounds can fruitfully explore new areas of intellectual and practical development while honouring their common interest in sustaining less-resourced languages.
Topics include but are not limited to:
- General research on under-resourced languages.
- Transfer-learning techniques for under-resourced languages (use of multilingual, pretrained models, unsupervised, semi-supervised, zero-shot, few-shot training, ...) in NLP, MT and Speech technologies.
- We also invite position papers on methodological, ethical, or institutional issues.
Instructions for submission can be found here <https://sigul-2022.ilc.cnr.it/submission/>
Important Dates
- Paper submission deadline: *19* April 2022
- Notification of acceptance: 3 May 2022
- Camera-ready paper: 23 May 2022
- Workshop date: 24-25 June 2022
Organizing Committee
Maite Melero - Barcelona Supercomputing Centre, Spain
Sakriani Sakti - NAIST, Japan
Claudia Soria - CNR-ILC, Italy
To contact the organisers, please mail sigul2022(a)ilc.cnr.it (Subject: [SIGUL2022]).
>
>
> Research internship position at NAVER LABS Europe (Grenoble, France) on Energy-Based Models for Controlled Text Generation
>
> Start date: June 2022
> Duration: 5-6 months
>
> DESCRIPTION
> Large language models can now be used to generate highly fluent texts. However, the synthesized utterances can be deficient in other important respects: they may lack semantic consistency or faithfulness to the facts, or they may contain toxic or socially biased content.
>
> Our team has developed several effective solutions on that front [1,2,3,4] exploiting the expressive power of Energy-Based Models in defining constraints over generative models. However, certain challenges remain: (1) How can we quickly adapt to changing control conditions without the need for model retraining? (2) Can we exploit these techniques to improve on hard-to-quantify features, such as safety, unbiasedness, textual coherence, or matching the human intention? (3) Can we improve training speed/robustness, for example, by leveraging techniques from RL?
>
> We are looking for a motivated intern to help us develop techniques and algorithms addressing these challenges. Experiments will be conducted on selected text generation tasks using state-of-the-art pre-trained language models.
>
> The successful candidate should be enrolled in a graduate program, at the Master or (preferably) PhD level.
>
> The intern will work in a team consisting of Hady Elsahar, Marc Dymetman, Germán Kruszewski, and Jos Rozen.
>
> Publication of this internship's results in major conferences/journals will be strongly encouraged.
>
> REQUIRED SKILLS
> - Strong programming skills
> - Relevant experience with training Deep Learning models for NLP
> - Strong mathematical skills
> - Ability to communicate research
>
> OPTIONAL SKILLS
> - Knowledge of MCMC sampling techniques and/or Reinforcement Learning
> - Publications at peer-reviewed AI conferences
>
> REFERENCES
> [1] Khalifa et al., A Distributional Approach to Controlled Text Generation, in ICLR 2021
> [2] Eikema et al., Sampling from Energy-Based Models with Quality/Efficiency Trade-offs, in CtrlGen at NeurIPS 2021
> [3] Korbak et al., Energy-Based Models for Code Generation under Compilability Constraints, in NLP4Prog at ACL 2021
> [4] Korbak et al., Controlling Conditional Language Models with Distributional Policy Gradients, in CtrlGen at NeurIPS 2021
>
> APPLICATION INSTRUCTIONS
> Please note that applicants must be registered students at a university or other academic institution and that this establishment will need to sign an 'Internship Convention' with NAVER LABS Europe before the student is accepted.
>
> You can apply for this position online at <https://europe.naverlabs.com/job/energy-based-models-for-controlled-text-ge…>. Don't forget to upload your CV and cover letter before you submit. Incomplete applications will not be accepted.
>
> ABOUT NAVER LABS
> NAVER is the #1 Internet portal in Korea with activities that span a wide range of businesses including search, commerce, content, financial and cloud platforms.
>
> NAVER LABS, co-located in Korea and France, is the organization dedicated to preparing NAVER’s future. NAVER LABS Europe is located in a spectacular setting in Grenoble, in the heart of the French Alps. Scientists at NAVER LABS Europe are empowered to pursue long-term research problems that, if successful, can have significant impact and transform NAVER. We take our ideas as far as research can to create the best technology of its kind. Active participation in the academic community and collaborations with world-class public research groups are, among others, important tools to achieve these goals. Teamwork, focus and persistence are important values for us.
>
> NAVER LABS Europe is an equal opportunity employer.
>
> For more information and to apply, see <https://europe.naverlabs.com/job/energy-based-models-for-controlled-text-ge…>
****Apologies for cross-postings****
***Please help disseminate****
1st Call for Papers
SIGUL 2022 Workshop <https://sigul-2022.ilc.cnr.it/>
a post-Conference Workshop of LREC 2022
Marseille (FR), 24-25 June 2022
*paper submission deadline: 11 April 2022*
The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on
Under-Resourced Languages (SIGUL 2022) will provide a forum for the
presentation and discussion of cutting-edge research in text and speech
processing for under-resourced languages by academic and industry
researchers. SIGUL 2022 will carry on the tradition of the CCURL-SLTU
(Collaboration and Computing for Under-Resourced Languages – Spoken
Language Technologies for Under-resourced languages) Workshop Series,
which has been organised since 2008 and, as LREC Workshops, since 2014.
As usual, this Workshop spans the research interest areas of
less-resourced, under-resourced, endangered, minority and minoritized
languages. Since this year LREC includes a track dedicated specifically
to endangered and less-resourced languages, the workshop aims to be a
venue for networking and discussion as much as for scientific debate.
In recent years, research in NLP for less-resourced languages has
gained momentum. The growth in research interest makes it even
more necessary for the community that revolves around less-resourced
languages to find opportunities to gather and hold discussions.
Following the long-standing series of previous meetings, the SIGUL venue
will provide a forum for presenting cutting-edge research in
NLP, MT and Speech Technologies for under-resourced languages to both
academic and industry researchers. It will also offer a venue where
researchers in different disciplines and from varied backgrounds can
fruitfully explore new areas of intellectual and practical development
while honouring their common interest in sustaining less-resourced
languages.
Topics include but are not limited to:
* General research on under-resourced languages.
* Transfer-learning techniques for under-resourced languages (use of
  multilingual, pretrained models, unsupervised, semi-supervised,
  zero-shot, few-shot training, ...) in NLP, MT and Speech technologies.
* We also invite position papers on methodological, ethical, or
  institutional issues.
Instructions for submission can be found here
<https://sigul-2022.ilc.cnr.it/submission/>
Important Dates
- Paper submission deadline: 11 April 2022
- Notification of acceptance: 3 May 2022
- Camera-ready paper: 23 May 2022
- Workshop date: 24-25 June 2022
Organizing Committee
* Maite Melero - Barcelona Supercomputing Centre, Spain
* Sakriani Sakti - NAIST, Japan
* Claudia Soria - CNR-ILC, Italy
To contact the organisers, please mail sigul2022(a)ilc.cnr.it
(Subject: [SIGUL2022]).