*Job offer summary*
In the context of the BELSPO FED-tWIN
<https://www.belspo.be/belspo/research/FEDtWIN_en.stm> research program
(ARKEY funded proposal), the MiiL
<https://uclouvain.be/en/research-institutes/ilc/miil> (Media Innovation
and Intelligibility Lab from UCLouvain <https://uclouvain.be/en/index.html>,
Belgium) and the State Archives of Belgium
<https://www.arch.be/index.php?l=en> (SAB) are looking for a *post-doc
researcher in computer science or digital humanities*, with significant
experience in the automatic processing of digital documents involving both
text and images. The proposed contract is an *open-ended FTE contract* (50%
UCLouvain, 50% SAB). The funding is guaranteed over a period of 10 years,
with the intention of securing the position.
The main objective of this ARKEY research profile will be to improve the
*digital valorization of archive collections through long-term tools*. It involves
*(1)* the research and development of an enhanced access key to digitized
content (through text and layout recognition, and enriched digital archival
representation), and *(2)* the improvement of the navigation experience
within archive collections. It builds on the expertise of a *multidisciplinary
team* from SAB and from several research groups within UCLouvain (MiiL
<https://uclouvain.be/en/research-institutes/ilc/miil>, Cental
<https://uclouvain.be/fr/instituts-recherche/ilc/cental>, ARCH
<https://uclouvain.be/fr/decouvrir/archives>, and GEMCA
<https://uclouvain.be/fr/instituts-recherche/incal/gemca>). ARKEY aims to
bring added value for society and public service, by improving the
accessibility and intelligibility of archives: a priority for many
researchers and a foundation of democratic states.
The complete offer is available here
<https://uclouvain.be/en/research-institutes/ilc/miil/jobs.html>.
Applications should be sent to Antonin Descampe (
antonin.descampe(a)uclouvain.be) and Eddy Put (eddy.put(a)arch.be) as soon as
possible and *no later than May 1st, 2023*.
*Résumé de l’offre*
Dans le cadre du programme de recherche BELSPO FED-tWIN
<https://www.belspo.be/belspo/research/FEDtWIN_fr.stm> (proposition
financée « ARKEY »), le MiiL
<https://uclouvain.be/fr/instituts-recherche/ilc/miil> (Media Innovation
and Intelligibility Lab de l'UCLouvain <https://uclouvain.be/fr/index.html>,
Belgique) et les Archives de l’Etat en Belgique
<https://www.arch.be/index.php?l=fr> (AGR) recherchent un *post-doctorant
en informatique ou en humanités numériques*, avec une expérience importante
dans le traitement automatique de documents numériques impliquant à la
fois du texte et des images. Le contrat proposé est un *contrat à temps
plein à durée indéterminée* (50% UCLouvain, 50% AGR). Le financement est
garanti sur 10 ans, avec une intention de pérennisation du poste.
L'objectif principal de ce profil de recherche ARKEY sera d’*optimiser la
valorisation numérique des collections d'archives grâce à des outils
informatiques pérennes*. Il propose *(1)* la recherche et le développement
de moyens d'accès enrichis au contenu numérisé (grâce à la reconnaissance
de texte et de structure, ainsi qu’à une représentation enrichie des
archives), et *(2)* l'amélioration de l'expérience de navigation au sein
des collections d'archives. Il s'appuie sur l'expertise d'une *équipe
pluridisciplinaire* des AGR et de plusieurs groupes de recherche au sein de
l'UCLouvain (MiiL <https://uclouvain.be/fr/instituts-recherche/ilc/miil>,
Cental <https://uclouvain.be/fr/instituts-recherche/ilc/cental>, ARCH
<https://uclouvain.be/fr/decouvrir/archives> et GEMCA
<https://uclouvain.be/fr/instituts-recherche/incal/gemca>). ARKEY vise à
apporter une valeur ajoutée pour la société et le service public en
améliorant l'accessibilité et l'intelligibilité des archives : une priorité
pour de nombreux chercheurs et chercheuses, et un fondement des États
démocratiques.
L'offre complète est également disponible ici
<https://uclouvain.be/fr/instituts-recherche/ilc/miil/offres-d-emploi.html>.
Les candidatures doivent être envoyées à Antonin Descampe (
antonin.descampe(a)uclouvain.be) et Eddy Put (eddy.put(a)arch.be) dès que
possible et *au plus tard le* *1er mai 2023*.
*** With apologies for multiple postings ***
Tenure-track Assistant Professor or Associate Professor in computational cognitive modelling and natural language processing
The Department of Nordic Studies and Linguistics, Faculty of Humanities, University of Copenhagen (UCPH), Denmark, invites applications for a tenure-track assistant/associate professorship in computational cognitive modelling and natural language processing to be filled by August 1, 2023 or as soon as possible thereafter.
Job content
The successful candidate will engage in cutting-edge research in computational cognitive modelling in collaboration with researchers at the Centre for Language Technology (CST) and is expected to contribute actively to the Centre's research environment, see also: Research - University of Copenhagen (ku.dk)<https://cst.ku.dk/english/research/>
The ideal candidate will have expertise in working with neurocognitive methods such as eye-tracking or EEG; they will have worked with computational modelling, preferably of language phenomena; they will have active knowledge of machine learning and deep modelling techniques.
The candidate will also be expected to strengthen the Centre's project portfolio by applying for external funding, in particular to support projects at the interface between computational cognitive modelling and NLP.
The candidate will also contribute to teaching on the MSc in IT and Cognition, more specifically the Cognitive Science courses, and will supervise master's dissertations. More detail on the programme as a whole and on the individual study units is provided at: Master of Science (MSc) in IT and Cognition - University of Copenhagen (ku.dk)<https://studies.ku.dk/masters/it-and-cognition/>
The closing date for applications is 23:59 CET, 20 March 2023
Applications or supplementary material received thereafter will not be considered.
More information about the qualification requirements and the application procedure can be found on the following link:
https://jobportal.ku.dk/tenure-track/?show=158556
Costanza Navarretta
PhD, senior researcher/assoc.professor
Centre for Language Technology
Department of Nordic Studies and Linguistics
University of Copenhagen
DIR +45 35329079
costanza(a)hum.ku.dk<mailto:costanza@hum.ku.dk>
AmericasNLP 2023 Shared Task on Machine Translation into Indigenous
Languages
First Call for Participation
The AmericasNLP 2023 Shared Task on Machine Translation into Indigenous
Languages <https://turing.iimas.unam.mx/americasnlp/2023_st.html> is a
competition aimed at encouraging the development of machine translation
(MT) systems for Indigenous languages of the Americas. Participants will
build systems that translate between Spanish and an Indigenous language.
Systems submitted to the shared task will be presented at the Third
Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP) on
July 14, 2023, which will be co-located with the 61st Annual Meeting of the
Association for Computational Linguistics (ACL 2023), which will be held in
Toronto, Canada.
Why?
Many of the Indigenous languages of the Americas are so-called low-resource
languages: the parallel data needed to train MT systems is limited. This
means that many approaches designed for translating
between high-resource languages, such as English and Chinese, are not
directly applicable or perform poorly. Additionally, many Indigenous
languages exhibit linguistic properties uncommon among languages frequently
studied in natural language processing (NLP). For instance, many are
polysynthetic. This constitutes an additional difficulty. The goal of
AmericasNLP is to motivate researchers to take on the challenge of
developing MT systems for Indigenous languages.
How?
AmericasNLP invites the submission of MT results obtained by systems built
for Indigenous languages. Participants can use the training and development
data we provide, but there are no limits on what participants can use. If
participants want to translate additional data to improve their systems,
that's great! If they want to use pretrained models, that's great, too! The
only limitation is that we ask participants to not have the test input
translated by hand or train on the development or test sets.
The main metric of the shared task is ChrF++ (Popović, 2017). Participants
can enter the competition with as many language pairs as they like, and
systems for every language pair will be evaluated separately. We provide an
evaluation script and a baseline MT system to help participants get started
quickly. If you are interested in this shared task, please register here
<https://forms.gle/ZMVWCxoFunHF3bjNA>.
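For readers unfamiliar with the metric, the following is a minimal sketch of a ChrF++-style score, assuming the usual settings of character n-grams up to order 6, word n-grams up to order 2, and beta = 2. It is not the official evaluation script: the function names are made up for illustration, whitespace handling and averaging are simplified, and the numbers it produces will not match the organizers' script exactly.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams, ignoring spaces (a simplification)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def word_ngrams(text, n):
    """Count word n-grams over whitespace-separated tokens."""
    words = text.split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def f_beta(hyp_counts, ref_counts, beta):
    """F-beta of two n-gram Counters; None if either side has no n-grams."""
    if not hyp_counts or not ref_counts:
        return None
    overlap = sum((hyp_counts & ref_counts).values())
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    if precision + recall == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)


def simple_chrf_pp(hypothesis, reference, char_order=6, word_order=2, beta=2.0):
    """Average the per-order F-beta scores and scale to 0-100."""
    scores = []
    for n in range(1, char_order + 1):
        score = f_beta(char_ngrams(hypothesis, n), char_ngrams(reference, n), beta)
        if score is not None:
            scores.append(score)
    for n in range(1, word_order + 1):
        score = f_beta(word_ngrams(hypothesis, n), word_ngrams(reference, n), beta)
        if score is not None:
            scores.append(score)
    return 100 * sum(scores) / len(scores) if scores else 0.0
```

Because the score is built mostly from character n-grams, a hypothesis that gets a long word partly right still earns partial credit, which is one reason such metrics are often preferred for morphologically rich languages.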
Which languages?
The following language pairs are featured in the AmericasNLP 2023 shared
task:
- Hñähñu–Spanish
- Wixarika–Spanish
- Nahuatl–Spanish
- Guaraní–Spanish
- Bribri–Spanish
- Rarámuri–Spanish
- Quechua–Spanish
- Aymara–Spanish
- Shipibo-Konibo–Spanish
- Asháninka–Spanish
- 👻Surprise language👻–Spanish
Spanish is always the source language: systems are evaluated on translating
from Spanish into an Indigenous language.
Important Dates
- Release of initial languages and evaluation script: March 16, 2023
- Release of baseline system and baseline results: March 20, 2023
- Release of surprise language data: April 21, 2023
- Submission of translations (shared task deadline): May 07, 2023
- Announcements of results: May 09, 2023
- Submission of system description papers: May 16, 2023
- Notification of acceptance: May 20, 2023
- Camera-ready papers due: May 26, 2023
- Workshop: July 14, 2023
All deadlines are 11:59 pm UTC -12h (AoE).
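As a convenience for participants in other time zones, here is a small sketch of how an 11:59 pm AoE (UTC-12) deadline maps onto UTC. The helper name is illustrative only; the dates themselves are the ones listed above.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" is a fixed offset of UTC-12.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_in_utc(year, month, day):
    """Return 11:59 pm AoE on the given date as an aware UTC datetime."""
    local = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local.astimezone(timezone.utc)


# Shared task submission deadline (May 07, 2023, AoE) expressed in UTC.
print(aoe_deadline_in_utc(2023, 5, 7).isoformat())
```

In other words, an AoE deadline falls at 11:59 UTC on the following calendar day.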
Organizers
Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, Enora Rice, John Ortega,
Shruti Rijhwani, Ivan Vladimir Meza Ruiz, Alexis Palmer, Katharina Kann
Contact: americas.nlp.workshop(a)gmail.com
Website: https://turing.iimas.unam.mx/americasnlp/2023_st.html
--
Dr. Katharina Kann
Assistant Professor of Computer Science
University of Colorado Boulder
Personal page: https://kelina.github.io
Group page: https://nala-cub.github.io
We invite you to participate in the shared task on Multi-Label and
Multi-Class Emotion Classification on Code-Mixed Text Messages, organized
as part of WASSA 2023 <https://wassa-workshop.github.io/> at ACL 2023
<https://2023.aclweb.org/>. This task aims to develop models that can
predict emotion based on code-mixed (Roman Urdu and English) text messages.
*Task Description*
The shared task has two Tracks:
*Track 1 - Multi-Label Emotion Classification (MLEC):* Given a code-mixed
SMS message, classify it as 'neutral or no emotion' or as one, or more, of
eleven given emotions that best represent the mental state of the author.
*Track 2 - Multi-Class Emotion Classification (MCEC):* Given a code-mixed
SMS message, classify it as 'neutral or no emotion' or as one of eleven
given emotions that best represent the mental state of the author.
*Note:* You are free to participate in either or both tracks.
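To make the difference between the two tracks concrete, here is a minimal sketch of how targets might be encoded in each. The label names are placeholders: this call does not list the eleven emotions, so `LABELS` is a hypothetical, shortened inventory. Treating 'neutral' as mutually exclusive with the emotions is our reading of the track descriptions, not an official rule.

```python
# Hypothetical, shortened label inventory (the real task has eleven emotions
# plus 'neutral or no emotion'; their names are not given in this call).
LABELS = ["neutral", "emotion_a", "emotion_b", "emotion_c"]


def encode_multiclass(label):
    """Track 2 (MCEC) style: exactly one class index per message."""
    return LABELS.index(label)


def encode_multilabel(labels):
    """Track 1 (MLEC) style: one 0/1 flag per class; several may be set."""
    chosen = set(labels)
    unknown = chosen - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    # Assumption: 'neutral' cannot be combined with an emotion.
    if "neutral" in chosen and len(chosen) > 1:
        raise ValueError("'neutral' cannot be combined with other labels")
    return [1 if name in chosen else 0 for name in LABELS]


print(encode_multiclass("emotion_b"))
print(encode_multilabel(["emotion_a", "emotion_c"]))
```

A multi-class model would typically pick the single highest-scoring class, whereas a multi-label model makes an independent yes/no decision per emotion.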
*For participation, please check:*
https://codalab.lisn.upsaclay.fr/competitions/10864
*Important Dates*
- February 28th, 2023: Initial training data release
- February 28th, 2023: Codalab competition website goes online, and
development data released
- April 15th, 2023: Evaluation phase begins: development labels and test
data released
- April 18th, 2023: Deadline for submission of final results on Codalab
- April 24th, 2023: Deadline for system description papers (max. 4 pages)
- May 22nd, 2023: Notification of acceptance
- June 6th, 2023: Camera-ready papers due
*Task Organizers*
- Iqra Ameer - School of Biomedical Informatics, University of Texas,
Health Science Center Houston, USA
- Necva Bölücü - Commonwealth Scientific and Industrial Research
Organisation, Australia
- Ali Al Bataineh - Department of Electrical and Computer Engineering,
Norwich University, USA
- Hua Xu - Section of Biomedical Informatics and Data Science, School of
Medicine, Yale University, USA
*Contact*
wassa23codemixed [at] gmail [dot] com
*Join Google Group*
wassa23codemixed(a)googlegroups.com
--
*Regards,*
Dr. Iqra Ameer (Ph.D.)
Call for Papers: 'TwinTalks 4: Understanding and Facilitating Remote Collaboration in DH'
The workshop is a joint initiative by the European Social Sciences and Humanities Research Infrastructures CLARIN <http://www.clarin.eu> and DARIAH<https://www.dariah.eu/> and it will be organised as part of the DH 2023 Collaboration and Opportunity Conference<https://dh2023.adho.org/> that will take place on July 10-14 in Graz, Austria.
Dates and Location
Main conference: 10-14 July, Messe Congress Graz convention centre<http://www.mcg.at/messegraz.at/en/locations/messecongress-graz/veranstalter…>
TwinTalks workshop: 10 July, 9:00 - 12:30, University of Graz<https://www.uni-graz.at/en/>
________________________________
Important Dates
* 15 March 2023: Call for Papers
* 15 May 2023: Submission deadline
* 15 June 2023: Notification of acceptance
* 30 June 2023: Deadline for the final version of extended abstracts
* 10 July 2023 (9:00 - 12:30): Workshop
________________________________
Workshop Aims
The main objective of the workshop is to develop a better understanding of the dynamics on the Digital Humanities work floor when researchers, teachers and/or professionals with different – but often overlapping – areas of competence engage in remote collaboration to solve humanities research questions, and to explore how education and training of humanities scholars, cultural heritage professionals and technical experts can help to make remote collaboration across disciplines more efficient and effective, more creative and innovative, and more inclusive and rewarding for all participants.
To this end, we invite submissions reporting on all aspects and stages of engaging in remote collaborative research and teaching in DH, including the obstacles encountered and solutions found. We also welcome position papers on the role of research infrastructures in facilitating remote collaboration in DH.
The insights gained should help those involved in the education of humanities scholars, professionals and technical experts alike to develop better training programmes, tailored towards the needs of a diverse group of potential learners.
The workshop is a follow-up of three previous successful TwinTalks workshops that have taken place at various DH conferences from 2019 onwards (TwinTalks 1 proceedings<https://ceur-ws.org/Vol-2365/>; TwinTalks 1 blog<http://www.parthenos-project.eu/clarin-and-parthenos-twintalks>; TwinTalks 2+3 proceedings<https://ceur-ws.org/Vol-2717/>).
________________________________
Audience
Researchers, cultural heritage professionals, educators, scientific programmers, research infrastructure operators and policy-makers with a special interest in creating the conditions where people with humanities research skills and technical expertise (or both) can fruitfully collaborate in answering humanities research questions remotely.
________________________________
Workshop Format
The programme starts with an invited talk by a prominent speaker, which will set the scene for the rest of the day. The main component of the workshop programme consists of two types of (submitted) talks:
* Twin talks, i.e. talks presented by pairs or teams consisting of someone rooted primarily in humanities research (with a humanities research problem, i.e. not a technical problem or tool), someone with a background in a totally different discipline (e.g. technical) who has contributed their specific capabilities to arrive at the answers, and/or a cultural heritage professional whose collection knowledge has contributed to the development of the research corpus. Talks will usually consist of three parts, followed by questions from the audience: In the first part, the humanities research question is the point of focus, while in the second part, it is shown how the joint effort resulted in an answer to the respective question. In the third part, these perspectives come together, as the team describes how the remote collaboration went, including obstacles that were encountered, and how better training and education could help to make remote collaboration more efficient and effective.
* Teach talks by people with experience with or interesting ideas about how remote cross-discipline collaboration is or can be addressed in curricula or other training activities.
________________________________
Submissions
The language of the workshop is English.
What we expect from the submissions for the Twin Talks track:
* They are authored and presented by one or more humanities scholars and one or more digital experts
* They start from a humanities research question (i.e. not a technical question, a presentation of a tool, a platform or a data collection)
* They describe the remote research carried out jointly and its results
* They describe the technical aspects of the methods used and the results obtained
* They analyse the way the scholars and the technicians collaborated remotely, addressing issues such as (but not limited to):
    * What was easy and what was difficult, and why?
    * How did the researchers, technicians or cultural heritage professionals change each other’s way of looking at things?
    * Did they, for instance, make each other aware of blind spots they had?
    * Did the combination of thinking from a DH research question and thinking from a technical solution lead to new insights?
    * How could better training or education of scholars and digital experts make remote collaboration easier, more effective and more efficient?
With regard to the Teach Talks track, a single author and presenter is sufficient. Of course, multi-author papers are equally welcome.
Submission instructions
* Format: PDF. For format instructions, see http://ceur-ws.org/Vol-XXX/CEURART.zip
* Size: Extended abstracts of ca. 4-8 pages (between 2000 and 4000 words), covering research questions and answers, technical aspects and collaboration experience for Twin Talks, or relevant educational experience for Teach Talks.
* Publication: The workshop proceedings will be published at CEUR-WS<https://ceur-ws.org/>.
* Submission URL: https://easychair.org/conferences/?conf=twintalksdh2023
Workshop Programme Committee
* Bente Maegaard (CLARIN ERIC / University of Copenhagen, Denmark)
* Barbara McGillivray (King's College London & The Alan Turing Institute, UK)
* Benjamin Wiggins (University of Manchester, UK)
* Eleni Gouli (Academy of Athens, Greece)
* Francesca Frontini (CNR, Italy & CLARIN ERIC)
* Frank Uiterwaal (EHRI / NIOD / KNAW, Netherlands)
* Folgert Karsdorp (Meertens Institute, KNAW, Netherlands)
* Geoffrey Rockwell (University of Alberta, Canada)
* Hitoshi Isahara (Center for IT-Based Education, Japan)
* Jennifer Edmond (Trinity College Dublin, Ireland)
* Koenraad De Smedt (CLARINO, University of Bergen, Norway)
* Maria Gavrilidou (Institute for Language and Speech Processing, Athens, Greece)
* Menno Van Zaanen (South African Centre for Digital Language Resources, South Africa)
* Milena Dobreva (Sofia University “St. Kliment Ohridski”, Bulgaria)
* Mikko Tolonen (University of Helsinki, Finland)
* Radim Hladik (Academy of Sciences, Czech Republic)
* Ulrike Wuttke (University of Applied Sciences Potsdam, Germany)
* Vicky Garnett (Trinity College Dublin, Ireland)
________________________________
Chairs and Organisers
The workshop is a joint initiative by European SSH Research Infrastructures CLARIN (http://www.clarin.eu<http://www.clarin.eu/>) and DARIAH (https://www.dariah.eu/).
* Steven Krauwer (CLARIN ERIC / Utrecht University, Netherlands)
* Darja Fišer (CLARIN ERIC / Institute of Contemporary History, Slovenia)
* Iulianna van der Lek-Ciudin (CLARIN ERIC, Netherlands)
* Sally Chambers (DARIAH-EU / Ghent Centre for Digital Humanities, Belgium)
* Agiatis Benardou (DARIAH-EU / Digital Curation Unit, ATHENA R.C., Athens, Greece)
________________________________
Contact Information
For any questions, please contact Iulianna van der Lek at events(a)clarin.eu<mailto:events@clarin.eu>.
—
Elisa Gorgaini
CLARIN ERIC External Relation Officer
elisa(a)clarin.eu
+31648213015
www.clarin.eu
Training set released!
Please, consider participating and/or forwarding to colleagues and groups.
****We apologize for multiple postings of this e-mail****
-------------------------------------------------------------------------------
MentalRiskES at IberLEF 2023: Second Call for Participation
Website: https://sites.google.com/view/mentalriskes
-------------------------------------------------------------------------------
MentalRiskES is a novel task on early risk identification of mental disorders in Spanish comments from Telegram users. The task must be resolved as an online problem, that is, the participants must be able to detect a potential risk as early as possible in a continuous stream of data. Therefore, the performance not only depends on the accuracy of the systems but also on how fast the problem is detected.
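To illustrate what "as early as possible" means in an online setting, here is a minimal sketch of a streaming decision rule. It is purely illustrative and is not the task's evaluation procedure: the per-message risk scores are assumed to come from some hypothetical classifier, and the running-mean rule and threshold are arbitrary choices.

```python
def first_alert(scores, threshold=0.5):
    """Return the 1-based position of the message at which the running mean
    of the risk scores first reaches the threshold, or None if it never does.

    `scores` is assumed to hold one risk score in [0, 1] per message, in the
    order the messages arrive; messages are consumed one at a time.
    """
    total = 0.0
    for seen, score in enumerate(scores, start=1):
        total += score
        if total / seen >= threshold:
            return seen
    return None


# A user whose messages become riskier over time: the alert fires only
# once enough evidence has accumulated.
print(first_alert([0.1, 0.2, 0.9, 0.9]))
```

Two systems that both flag this user correctly would still differ in quality if one raises the alert after one message and the other after four, which is the trade-off between accuracy and speed described above.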
We would like to invite you to participate in the following tasks:
1. Eating disorders detection
2. Depression detection
3. Unknown disorder detection
Find out more at https://sites.google.com/view/mentalriskes.
MentalRiskES 2023 is part of the IberLEF Workshop and will be held in conjunction with the SEPLN 2023 conference in Jaén (Spain).
-------------------------------------------------------------------------------
Important Dates
-------------------------------------------------------------------------------
Now Registration open
Feb 16th Release of trial corpora
Mar 16th Release of training corpora
May 1st Registration closed
May 10th Release of test corpora and start of the evaluation campaign
May 18th End of evaluation campaign (deadline for submission of runs)
May 26th Publication of official results and release of test gold labels
Jun 8th Deadline for paper submission
Jun 23rd Acceptance notification
Jun 30th Camera-ready submission deadline
July 6th Final camera-ready submission deadline (to IberLEF organisers)
Please reach out to the organizers at MentalRiskES@IberLEF2023<https://groups.google.com/g/mentalriskes>.
The MentalRiskES 2023 organizing committee.
--
Flor Miriam Plaza del Arco
Postdoctoral Researcher at MilaNLP
Bocconi University
Via Roentgen 1-2, 20136 Milan, MI, Italy
Twitter: @florplaza22
Web: https://fmplaza.github.io/
Lab: https://milanlproc.github.io
=====================================
ICMI 2023 Call for tutorial proposals
https://icmi.acm.org/2023/call-for-tutorials/
25th ACM International Conference on Multimodal Interaction
9-13 October 2023, Paris, France
=====================================
ACM ICMI 2023 seeks half-day (3-4 hours) tutorial proposals addressing
current and emerging topics within the scope of "Science of Multimodal
Interactions". Tutorials are intended to provide a high-quality learning
experience to participants with a varied range of backgrounds. It is
expected that tutorials are self-contained.
Prospective organizers should submit a 4-page (maximum) proposal containing
the following information:
1. Title
2. Abstract appropriate for possible Web promotion of the Tutorial
3. A short list of the distinctive topics to be addressed
4. Learning objectives (specific and measurable objectives)
5. The targeted audience (student / early stage / advanced researchers,
prerequisite knowledge, field of study)
6. Detailed description of the Tutorial and its relevance to multimodal
interaction
7. Outline of the tutorial content with a tentative schedule and its
duration
8. Description of the presentation format (number of presenters, interactive
sessions, practicals)
9. Accompanying material (repository, references) and equipment, emphasizing
any required material from the organization committee (subject to approval)
10. Short biography of the organizers (preferably from multiple
institutions) together with their contact information and a list of 1-2 key
publications related to the tutorial topic
11. Previous editions: If the tutorial was given before, describe when and
where it was given, and if it will be modified for ACM ICMI 2023.
Proposals will be evaluated using the following criteria:
- Importance of the topic and the relevance to ACM ICMI 2023 and its main
theme: "Science of Multimodal Interactions"
- Presenters' experience
- Adequateness of the presentation format to the topic
- Targeted audience interest and impact
- Accessibility and quality of accompanying materials (open access)
Proposals that focus exclusively on the presenters' own work or commercial
presentations are not acceptable.
Unless explicitly stated and agreed to by the Tutorial chairs, the tutorial
organizers will take care of any specific requirements related to the
tutorial, such as handouts, mass storage, distribution rights (material,
handouts, etc.), and copyright.
Important Dates and Contact Details
-----------------------------------
Tutorial Proposal Deadline: May 15, 2023
Tutorial Acceptance Notification: May 29, 2023
Camera-ready version of the tutorial abstract: June 26, 2023
Tutorial date: TBD (either October 9 or October 13)
Proposals should be emailed to the ICMI 2023 Tutorial Chairs, Prof. Hatice
Gunes and Dr. Guillaume Chanel: icmi2023-tutorial-chairs(a)acm.org
<mailto:icmi2023-tutorial-chairs@acm.org>
Prospective organizers are also encouraged to contact the co-chairs if they
have any questions.
The Third Workshop on NLP for Indigenous Languages of the Americas
(AmericasNLP 2023)
Second Call for Papers
The Third Workshop on NLP for Indigenous Languages of the Americas
(AmericasNLP) will be co-located with the 61st Annual Meeting of the
Association for Computational Linguistics (ACL 2023
<https://2023.aclweb.org/>), which is scheduled to be held in Toronto,
Canada, from July 9 to 14, 2023.
The goal of the workshop is to encourage and increase the visibility of
work on the Indigenous languages of the Americas. It aims to encourage
research on NLP, computational linguistics, corpus linguistics and speech
for Indigenous languages, to connect researchers and professionals from
underrepresented communities and native speakers of endangered languages
with the ACL community, and, more generally, to promote machine learning
approaches suitable for low-resource languages.
We invite the submission of
- Long papers (8 pages) and short papers (4 pages) on substantial,
original, and unpublished research
- Non-archival extended abstracts (2 pages), technical reports (8 pages),
and work which has been presented at other venues (in the format of the
original publication)
Submissions do not need to describe work on native languages directly, as
long as it is clear why those can benefit from the described approaches.
Areas of interest include but are not limited to:
- Creation of datasets for NLP applications
- Incorporation of external knowledge into neural systems
- Linguistic typology and the use of typological features for NLP
- Transfer learning, meta-learning, and active learning
- Weakly supervised, semi-supervised, and unsupervised learning
- Machine translation of low-resource languages
- Morphology and phonology of low-resource languages
- NLP applications for Indigenous languages of the Americas
Important dates:
- Start of the anonymity period: March 15, 2023
- Submission deadline: April 15, 2023
- Notification of acceptance: May 15, 2023
- Camera-ready papers due: May 26, 2023
- Workshop: July 14, 2023
All deadlines are 11:59 pm UTC -12h (anywhere on earth).
Link to submission portal:
https://softconf.com/acl2023/AmericasNLP2023/
The workshop also includes:
- A machine translation shared task on truly low-resource languages
- A mentoring program to support students and newcomers from
underrepresented communities (application form:
https://forms.gle/afBWauDfDQijXHTy9)
We also have a diverse set of invited speakers, focused on bridging the gap
between linguists, NLP, and machine learning research!
- Steven Bird (linguistics; ethics)
- Angela Fan (NLP; machine translation)
- Kristine Stenzel (field linguistics; American Indigenous languages)
Organizing Committee
- Manuel Mager, AWS AI Labs
- Arturo Oncevay, University of Edinburgh
- Enora Rice, University of Colorado Boulder
- Abteen Ebrahimi, University of Colorado Boulder
- Shruti Rijhwani, Google Research
- Alexis Palmer, University of Colorado Boulder
- Katharina Kann, University of Colorado Boulder
More information and contact information can be found at
http://turing.iimas.unam.mx/americasnlp/.
--
Dr. Katharina Kann
Assistant Professor of Computer Science
University of Colorado Boulder
Personal page: https://kelina.github.io
Group page: https://nala-cub.github.io
*** APOLOGIES FOR CROSS-POSTINGS ***
Dear all,
The chairs of LITHME Working Group 1 - Computational Linguistics are pleased to invite you to our forthcoming online talk, 'With a little help from NLP: My Language Technology applications with impact on society (and my thoughts on the future of NLP)', by Ruslan Mitkov.
The talk will present original methodologies developed by the speaker, underpinning implemented Language Technology tools which are already having an impact on the following areas of society: e-learning, translation and interpreting, and care for people with language disabilities.
The first part of the presentation will introduce an original methodology and tool for generating multiple-choice tests from electronic textbooks. The application draws on a variety of Natural Language Processing (NLP) techniques which include term extraction, semantic computing and sentence transformation. The presentation will include an evaluation of the tool which demonstrates that generation of multiple-choice tests items with the help of this tool is almost four times faster than manual construction and the quality of the test items is not compromised. This application benefits e-learning users (both teachers and students) and is an example of how NLP can have a positive societal impact, in which the speaker passionately believes. The latest version of the system based on deep learning techniques will also be briefly introduced.
The talk will go on to discuss two other original recent projects which are also related to the application of NLP beyond academia. First, a project, whose objective is to develop next-generation translation memory tools for translators and, in the near future, for interpreters, will be briefly presented. Finally, a project will be outlined which focuses on helping users with autism to read and better understand texts. The speaker will put forward ideas as to what we can do next.
The presentation will finish with a brief outline of the latest and forthcoming research topics which the speaker plans to pursue and his vision of future NLP applications. In particular, he will share his views as to how NLP will develop and what should be done for NLP to be more successful, more inclusive and more ethical.
You can find a bio note of Ruslan Mitkov below.
Since this talk might be of interest to other LITHME members, we took the liberty of sharing it with the whole network.
The talk will take place on Friday, 17 March, at 13:00 (CET), via Zoom. Attendance, as usual, is free, but you will need to register in advance by clicking:
https://videoconf-colibri.zoom.us/meeting/register/tJwrcO2urz0qHt3aELxrDnw_…
We look forward to seeing you online on 17 March.
All best
Rui & Henrique
Ruslan Mitkov - Bionote
Prof Dr Ruslan Mitkov has been working in Natural Language Processing (NLP), Computational Linguistics, Corpus Linguistics, Machine Translation, Translation Technology and related areas since the early 1980s. Whereas Prof Mitkov is best known for his seminal contributions to the areas of anaphora resolution and automatic generation of multiple-choice tests, his extensively cited research (more than 270 publications including 20 books, 35 journal articles and 40 book chapters) also covers topics such as deep learning for NLP, machine translation, translation memory and translation technology in general, bilingual term extraction, automatic identification of cognates and false friends, natural language generation, automatic summarisation, computer-aided language processing, centering, evaluation, corpus annotation, NLP-driven corpus-based study of translation universals, text simplification, NLP for people with language disorders and computational phraseology. In addition, Ruslan Mitkov is well known for his vision in research based on innovative ideas and drive towards research output which seeks to enhance the work efficiency of different professions (e.g. for teachers, translators and interpreters) or seeks to improve the quality of life (e.g. for people with language disabilities) and which has significant impact beyond academia. Mitkov is author of the monograph Anaphora resolution (Longman) and Editor of the most successful Oxford University Press Handbook - The Oxford Handbook of Computational Linguistics whose second and substantially revised edition was published in June 2022. Current prestigious projects include his role as Executive Editor of the Journal of Natural Language Engineering published by Cambridge University Press and Editor-in-Chief of the Natural Language Processing book series of John Benjamins publishers. 
Dr Mitkov is also working on the forthcoming Oxford Dictionary of Computational Linguistics (Oxford University Press, co-authored with Patrick Hanks) and the Oxford Handbook of Phraseology Linguistics (Oxford University Press, co-authored with Gloria Corpas and Jean-Pierre Colson). Prof Mitkov has been invited as a keynote speaker at more than 200 international conferences. He has acted as Chair or Programme Chair of more than 65 international conferences on Natural Language Processing (NLP), Machine Translation, Translation Technology, Translation Studies, Corpus Linguistics and Anaphora Resolution. He is asked on a regular basis to review for leading international funding bodies and organisations and to act as a referee for applications for Professorships both in North America and Europe. Ruslan Mitkov is regularly asked to review for leading journals, publishers and conferences and serve as a member of Programme Committees or Editorial Boards. Prof Mitkov has been an external examiner of many doctoral theses and curricula in the UK and abroad, including Master’s programmes related to NLP, Translation and Translation Technology. Prof Mitkov is Coordinator (Director) of the first and only Erasmus Mundus Master’s Programme in Technology for Translation and Interpreting - an innovative and inspirational programme, with a strong research focus but an equally strong emphasis on business; leading companies in the global translation and language industry participate as associated partners. Dr Mitkov has considerable external funding to his credit (more than £20,000,000) and has been Principal Investigator of 25 projects funded by UK research councils, by the EC as well as by companies and users from the UK and USA. Ruslan Mitkov received his MSc from the Humboldt University in Berlin, his PhD from the Technical University in Dresden and worked as a Research Professor at the Institute of Mathematics, Bulgarian Academy of Sciences, Sofia.
Mitkov is Professor of Computational Linguistics and Language Engineering at the University of Wolverhampton which he joined in 1995 and where he set up the Research Group in Computational Linguistics. His Research Group has emerged as an internationally leading unit in applied Natural Language Processing and members of the group have won awards at different NLP/shared-task competitions and conferences. In addition to being Head of the Research Group in Computational Linguistics, Prof Mitkov is also Director of the Research Institute in Information and Language Processing and Director of the Responsible Digital Humanities Lab. The Research Institute consists of the Research Group in Computational Linguistics and the Research Group in Statistical Cybermetrics, which is another top performer internationally. Ruslan Mitkov is Vice President of ASLING, an international Association for promoting Language Technology. Dr Mitkov is a Fellow of the Alexander von Humboldt Foundation, Germany, was a Marie Curie Fellow, Distinguished Visiting Professor at the University of Franche-Comté in Besançon, France and Distinguished Visiting Researcher at the University of Malaga, Spain; he also serves/has served as Vice-Chair for the prestigious EC funding programmes ‘Future and Emerging Technologies’ and ‘EIC Pathfinder Open’. In September 2022 the renowned National Board of Medical Examiners (USA) presented Prof Mitkov with a certificate of distinguished collaboration which resulted in lasting impact on the strategic planning and decision making of the US organisation and their employment of NLP solutions to assessment for the last 17 years. In recognition of his outstanding professional/research achievements, Prof Mitkov was awarded the title of Doctor Honoris Causa at Plovdiv University in November 2011. 
At the end of October 2014 Dr Mitkov was also conferred Professor Honoris Causa at Veliko Tarnovo University and on 25 October 2022 Prof R Mitkov received the title ‘Doctor Honoris Causa’ for the third time, this time awarded by New Bulgarian University, Sofia.
Rui Sousa Silva
Professor Auxiliar | Assistant Professor
PhD
Faculdade de Letras da Universidade do Porto | Faculty of Arts and Humanities of the University of Porto
CLUP-Centro de Linguística da Universidade do Porto | Linguistics Centre of the University of Porto
www.linguisticaforense.pt<http://www.linguisticaforense.pt> | https://s.up.pt/qjur
www.letras.up.pt | www.clup.pt | www.linguisticaforense.pt
In this newsletter:
LDC's 30th anniversary year ends
LDC data and commercial technology development
New publications:
Mixer 3 Speech<https://catalog.ldc.upenn.edu/LDC2023S02>
LORELEI Tamil Representative Language Pack<https://catalog.ldc.upenn.edu/LDC2023T03>
________________________________
LDC's 30th anniversary year ends
We hope you enjoyed the monthly data spotlights in celebration of LDC's 30th anniversary year, April 2022-March 2023. We would not have achieved this milestone without the continued support and collaboration of our members, friends, and the community. We are grateful. As we enter our fourth decade, we pledge to continue to serve the community and our members by distributing high quality, diverse data and by providing top-notch member services and research program support.
LDC data and commercial technology development
For-profit organizations are reminded that an LDC membership is a pre-requisite for obtaining a commercial license to almost all LDC databases. Non-member organizations, including non-member for-profit organizations, cannot use LDC data to develop or test products for commercialization, nor can they use LDC data in any commercial product or for any commercial purpose. LDC data users should consult corpus-specific license agreements for limitations on the use of certain corpora. Visit the Licensing<https://www.ldc.upenn.edu/data-management/using/licensing> page for further information.
________________________________
New publications:
Mixer 3 Speech<https://catalog.ldc.upenn.edu/LDC2023S02> contains 3,200 hours of conversational telephone speech involving 3,875 speakers, 19,595 telephone recordings, and 26 distinct languages. This material was collected by LDC from 2005 to 2007 as part of the Mixer project, and recordings in this corpus were used in NIST Speaker Recognition Evaluation and NIST Language Recognition Evaluation corpora, including 2006 SRE and 2007 LRE.
Recordings were generated using LDC's computer telephony system. Recruited speakers were connected through a robot operator to carry on casual conversations lasting up to 10 minutes. Subjects fluent in languages other than English were asked to complete at least one non-English call. Metadata includes the number of calls per subject and language, as well as speaker demographic information.
2023 members can access this corpus through their LDC accounts. This corpus is a Members-Only release and is not available for non-member licensing. Contact ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu> for information about membership.
*
LORELEI Tamil Representative Language Pack<https://catalog.ldc.upenn.edu/LDC2023T03> comprises over 41 million words of Tamil monolingual text, 680,000 words of found Tamil-English parallel text, and 226,000 Tamil words translated from English data. Approximately 78,000 words were annotated for named entities, and over 24,000 words were annotated for entity discovery and linking and for situation frames (identifying entities, needs, and issues). Data was collected from discussion forums, news, reference material, social networks, and weblogs.
The LORELEI (Low Resource Languages for Emergent Incidents) program was concerned with building human language technology for low resource languages in the context of emergent situations. Representative languages were selected to provide broad typological coverage.
The knowledge base for entity linking annotation is available separately as LORELEI Entity Detection and Linking Knowledge Base (LDC2020T10)<https://catalog.ldc.upenn.edu/LDC2020T10>.
2023 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104