SIGUL April 2022

sigul@list.elra.info

3 participants
4 discussions

Fwd: Crosslinguistic morphology experiments - call for collaborators
by Alex CRISTIA 29 Apr '22

Dear colleagues,

A fascinating opportunity for those working on languages that are inflectional! Read below and contact Ben Ambridge, in cc, with any questions.

-Alex

---------------------------------------------------------------
Alex (Alejandrina) Cristia
Researcher, CNRS
Laboratoire de Sciences Cognitives et Psycholinguistique
29, rue d'Ulm, 75005, Paris, FRANCE
My site: www.acristia.org
---------------------------------------------------------------
If you donate, ask me about effective charities <https://effectivealtruism.us8.list-manage.com/track/click?u=52b028e7f799cca…>. / Si vous faites des dons, demandez moi sur le don efficace <https://www.altruismeefficacefrance.org/guide-don-efficace-1/>.

---------- Forwarded message ---------
From: Ben Ambridge <ben.ambridge(a)manchester.ac.uk>
Date: Thu, Apr 28, 2022 at 10:15 PM
Subject: Fwd: Crosslinguistic morphology experiments - call for collaborators
To: Alex CRISTIA <alecristia(a)gmail.com>

Hi Alex - I know you’ve worked on quite a few hard-to-reach languages - would you be interested in this, or able to point me in the direction of others who might be?

Thanks
Ben

===

Dear colleagues,

We are seeking potential collaborators for a grant application for a large crosslinguistic project investigating children’s acquisition of inflectional morphology. We aim to include 100 typologically diverse languages.

Due to the size of the envisaged project, it would not be feasible to apply for funding for full-time research assistants to test children (or to fund a portion of each collaborator’s salary). Our intention for the grant application is that each collaborator will be able to claim up to €10,000 for expenses (e.g., travel, laptops, participant payments, part-time/casual researchers), with the data collected by a researcher who is already primarily sponsored/employed (e.g., as PhD student, postdoc or research assistant) at your institution.

We will provide computerized elicitation tasks; your role (with the help of full-time research and support staff employed at our end) would be to translate the task into your language and inflectional system and to supervise data collection (with children aged 3-6, and adults).

At the moment, our goal is simply to put together a list of *potential* collaborators+languages for the grant application (NB: we can include only languages with verb and/or noun person/case/number inflectional morphology). To be included on this provisional list, please email Ben.Ambridge(a)Manchester.ac.uk with your name, institution and language(s).

Extended submission deadline - SIGUL2022 Workshop on under-resourced languages (co-located with LREC2022)
by Claudia Soria 08 Apr '22

****Apologies for cross-postings****

Call for Papers

SIGUL 2022 Workshop <https://sigul-2022.ilc.cnr.it/>
a post-Conference Workshop of LREC 2022
Marseille (FR), 24-25 June 2022

*EXTENDED paper submission deadline: 19 April 2022*

The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022) will provide a forum for the presentation and discussion of cutting-edge research in text and speech processing for under-resourced languages by academic and industry researchers. SIGUL 2022 will carry on the tradition of the CCURL-SLTU (Collaboration and Computing for Under-Resourced Languages – Spoken Language Technologies for Under-resourced languages) Workshop Series, which has been organised since 2008 and, as LREC Workshops, since 2014. As usual, this Workshop spans the research interest areas of less-resourced, under-resourced, endangered, minority and minoritized languages.

Since this year LREC includes a track dedicated specifically to endangered and less-resourced languages, the workshop aims to be a venue for networking and discussion as much as for scientific debate.

In recent years, research in NLP for less-resourced languages has gained momentum. The multiplication of research interest makes it even more necessary for the community that revolves around less-resourced languages to find opportunities for aggregation and discussion. Following the long-standing series of previous meetings, the SIGUL venue will provide a forum for the presentation of cutting-edge research in NLP, MT and Speech Technologies for under-resourced languages to both academic and industry researchers, and will also offer a venue where researchers in different disciplines and from varied backgrounds can fruitfully explore new areas of intellectual and practical development while honouring their common interest of sustaining less-resourced languages.

Topics include but are not limited to:
* General research on under-resourced languages.
* Transfer-learning techniques for under-resourced languages (use of multilingual, pretrained models, unsupervised, semi-supervised, zero-shot, few-shot training,...) in NLP, MT and Speech technologies.
* We also invite position papers on methodological, ethical, or institutional issues.

Instructions for submission can be found here <https://sigul-2022.ilc.cnr.it/submission/>

Important Dates
- Paper submission deadline: *19* April 2022
- Notification of acceptance: 3 May 2022
- Camera-ready paper: 23 May 2022
- Workshop date: 24-25 June 2022

Organizing Committee
* Maite Melero - Barcelona Supercomputing Centre, Spain
* Sakriani Sakti - NAIST, Japan
* Claudia Soria - CNR-ILC, Italy

To contact the organisers, please mail sigul2022(a)ilc.cnr.it <mailto:sigul2022@ilc.cnr.it> (Subject: [SIGUL2022]).

Research Internship on Controlled Text Generation at Naver Labs Europe
by Laurent Besacier 05 Apr '22

Research internship position at NAVER LABS Europe (Grenoble, France) on Energy-Based Models for Controlled Text Generation

Start date: June 2022
Duration: 5-6 months

DESCRIPTION
Large language models can now be used to generate highly fluent texts. However, the synthesized utterances can be deficient on other important levels: semantic consistency, faithfulness to the facts, toxic or socially biased content.

Our team has developed several effective solutions on that front [1,2,3,4] exploiting the expressive power of Energy-Based Models in defining constraints over generative models. However, certain challenges remain: (1) How can we quickly adapt to changing control conditions without the need for model retraining? (2) Can we exploit these techniques to improve on hard-to-quantify features, such as safety, unbiasedness, textual coherence, or matching the human intention? (3) Can we improve training speed/robustness, for example, by leveraging techniques from RL?

We are looking for a motivated intern to help us develop techniques and algorithms addressing these challenges. Experiments will be conducted on selected text generation tasks using state-of-the-art pre-trained language models.

The successful candidate should be enrolled in a graduate program, at the Master or (preferably) PhD level.

The intern will work in a team composed of Hady Elsahar, Marc Dymetman, Germán Kruszewski, and Jos Rozen.

Publication of this internship's results in major conferences/journals will be strongly encouraged.

REQUIRED SKILLS
- Strong programming skills
- Relevant experience with training Deep Learning models for NLP
- Strong mathematical skills
- Ability to communicate research

OPTIONAL SKILLS
- Knowledge of MCMC sampling techniques and/or Reinforcement Learning
- Publications at peer-reviewed AI conferences

REFERENCES
[1] Khalifa et al., A Distributional Approach to Controlled Text Generation, in ICLR 2021
[2] Eikema et al., Sampling from Energy-Based Models with Quality/Efficiency Trade-offs, in CtrlGen at NeurIPS 2021
[3] Korbak et al., Energy-Based Models for Code Generation under Compilability Constraints, in NLP4Prog at ACL 2021
[4] Korbak et al., Controlling Conditional Language Models with Distributional Policy Gradients, in CtrlGen at NeurIPS 2021

APPLICATION INSTRUCTIONS
Please note that applicants must be registered students at a university or other academic institution and that this establishment will need to sign an 'Internship Convention' with NAVER LABS Europe before the student is accepted.

You can apply for this position online at https://europe.naverlabs.com/job/energy-based-models-for-controlled-text-ge…. Don't forget to upload your CV and cover letter before you submit. Incomplete applications will not be accepted.

ABOUT NAVER LABS
NAVER is the #1 Internet portal in Korea with activities that span a wide range of businesses including search, commerce, content, financial and cloud platforms.

NAVER LABS, co-located in Korea and France, is the organization dedicated to preparing NAVER’s future. NAVER LABS Europe is located in a spectacular setting in Grenoble, in the heart of the French Alps. Scientists at NAVER LABS Europe are empowered to pursue long-term research problems that, if successful, can have significant impact and transform NAVER. We take our ideas as far as research can to create the best technology of its kind.

Active participation in the academic community and collaborations with world-class public research groups are, among others, important tools to achieve these goals. Teamwork, focus and persistence are important values for us.

NAVER LABS Europe is an equal opportunity employer.

For more information and application see https://europe.naverlabs.com/job/energy-based-models-for-controlled-text-ge…

CfP: SIGUL2022 Workshop, co-located with LREC2022
by Claudia Soria 04 Apr '22

****Apologies for cross-postings****
***Please help disseminate****

1st Call for Papers

SIGUL 2022 Workshop <https://sigul-2022.ilc.cnr.it/>
a post-Conference Workshop of LREC 2022
Marseille (FR), 24-25 June 2022

*paper submission deadline: 11 April 2022*

The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022) will provide a forum for the presentation and discussion of cutting-edge research in text and speech processing for under-resourced languages by academic and industry researchers. SIGUL 2022 will carry on the tradition of the CCURL-SLTU (Collaboration and Computing for Under-Resourced Languages – Spoken Language Technologies for Under-resourced languages) Workshop Series, which has been organised since 2008 and, as LREC Workshops, since 2014. As usual, this Workshop spans the research interest areas of less-resourced, under-resourced, endangered, minority and minoritized languages.

Since this year LREC includes a track dedicated specifically to endangered and less-resourced languages, the workshop aims to be a venue for networking and discussion as much as for scientific debate.

In recent years, research in NLP for less-resourced languages has gained momentum. The multiplication of research interest makes it even more necessary for the community that revolves around less-resourced languages to find opportunities for aggregation and discussion. Following the long-standing series of previous meetings, the SIGUL venue will provide a forum for the presentation of cutting-edge research in NLP, MT and Speech Technologies for under-resourced languages to both academic and industry researchers, and will also offer a venue where researchers in different disciplines and from varied backgrounds can fruitfully explore new areas of intellectual and practical development while honouring their common interest of sustaining less-resourced languages.

Topics include but are not limited to:
* General research on under-resourced languages.
* Transfer-learning techniques for under-resourced languages (use of multilingual, pretrained models, unsupervised, semi-supervised, zero-shot, few-shot training,...) in NLP, MT and Speech technologies.
* We also invite position papers on methodological, ethical, or institutional issues.

Instructions for submission can be found here <https://sigul-2022.ilc.cnr.it/submission/>

Important Dates
- Paper submission deadline: 11 April 2022
- Notification of acceptance: 3 May 2022
- Camera-ready paper: 23 May 2022
- Workshop date: 24-25 June 2022

Organizing Committee
* Maite Melero - Barcelona Supercomputing Centre, Spain
* Sakriani Sakti - NAIST, Japan
* Claudia Soria - CNR-ILC, Italy

To contact the organisers, please mail sigul2022(a)ilc.cnr.it <mailto:sigul2022@ilc.cnr.it> (Subject: [SIGUL2022]).

--
Claudia Soria
Researcher
Istituto di Linguistica Computazionale "A. Zampolli"
Consiglio Nazionale delle Ricerche
Via Moruzzi 1
56124 Pisa
Italy

Management Committee member
COST Action CA19102 ‘Language In The Human-Machine Era' (LITHME)
www.lithme.eu

Tel. +39 050 3153166
Skype clausor
