*ICNLSP 2022: LAST call for papers*
Dear all,
We are delighted to announce that ICNLSP 2022
<https://www.icnlsp.org/2022welcome/>, the 5*th* edition of the
International Conference on Natural Language and Speech Processing, hosted
by DataScientia (University of Trento)
<http://datascientia.disi.unitn.it/events/> for the third time, will be
held online, on 16-17 December 2022.
*Important dates*
*Submission deadline*: *30 August 2022*
*Notification of acceptance*: *31 October 2022*
*Camera-ready paper due*: *20 November 2022*
*Conference dates*: *16, 17 December 2022*
*Publication*
1- All accepted papers will be published in ACL Anthology, and indexed in
DBLP.
2- Selected papers will be published in Signals and Communication
Technology (Springer) (https://www.springer.com/series/4748), indexed by
Scopus and zbMATH.
*Keynote speakers*
1. *Eric Laporte*, *Gustave Eiffel University*, *France*.
2. *Jan Niehues*, *University of Maastricht*, *Netherlands*.
3. *Ahmed Ali*, *Qatar Computing Research Institute*, *Qatar*.
*Workshop: NSURL 2022*
The workshop on NLP Solutions for Under Resourced Languages (NSURL)
<http://nsurl.org> will be held with ICNLSP 2022
<https://www.icnlsp.org/2022welcome/>. The workshop aims to be a forum for
solving NLP tasks concerning Arabic and its dialects, as well as
under-resourced languages such as African languages, Persian, etc.
We look forward to welcoming you to ICNLSP 2022
<https://www.icnlsp.org/2022welcome/>, which will be an opportunity to get
acquainted with the latest research in the field of natural language and
speech processing. We hope that your active participation will make it a
success.
*Contact*
icnlsp2022(a)easychair.org
Dear List member,
The Institute of Translation Studies at the University of Innsbruck,
Austria, is looking to appoint a University Assistant (Postdoc) in
Multilingual (specialised) lexicography.
The deadline for application is August 26th.
All details (German and English) and the link for applying are at
https://lfuonline.uibk.ac.at/public/karriereportal.details?asg_id_in=12939
Informal enquiries can be sent to
Laura.Giacomini(a)uibk.ac.at
Best wishes,
Laura Giacomini
The Center for Information and Language Processing (CIS) at LMU Munich
has several fully-funded positions in Natural Language Processing and
Deep Learning available in the groups of Barbara Plank and Hinrich
Schütze.
Application deadline: September 8th, 2022
Details and application: https://www.cis.lmu.de/web/jobs2022.html
The Center for Information and Language Processing (CIS) at LMU Munich
(co-directed by Barbara Plank and Hinrich Schütze) has two open
tenure-track lecturer positions (Akademische/r Rat/Raetin auf
Lebenszeit) in computational linguistics / natural language
processing.
Application deadline: September 30th, 2022
Details and application: https://www.cis.lmu.de/web/arpositions2022.html
Hello! We are a team of researchers from MSR New England and New York. For a survey study, we are seeking participants (aged 18 or older) who have experience in machine learning or are interested in applying machine learning to developing computational models for signed languages.
The purpose of this project is to explore how machine learning practitioners can better build machine learning models for sign language computation (e.g., recognition/translation). We want to understand your general motivations in working with machine learning problems and expected challenges when newly working with sign language data and tasks. Please know that sign language knowledge or sign language computation experience is NOT required to participate in this project.
The survey can be found at https://forms.office.com/r/7LPnkdTFLN along with a consent form for further details. For every submission of the survey, $10 will be donated to LEAD-K (Language Equity and Acquisition for Deaf Kids), up to the first 50 submissions. The survey accepts one submission per person.
Once you agree to consent, you will be directed to the survey questions. It will take about 30 minutes to answer the questions, including your experience in machine learning and sign language computation (if any), understanding of sign language culture, and demographics such as your age or education level.
Your responses will be anonymous, unless you choose to provide your name and email address for future contact, in which case you will be invited to participate in a paid study to collaborate with American Sign Language experts. Your name and email address will never be shared outside of the research team.
Please complete the survey by Tuesday, 8/23 and feel free to forward this to other colleagues who may be interested!
Thank you so much for your consideration!
Rie Kamikubo, Danielle Bragg, Alex Lu, Hal Daumé III
In this newsletter:
Fall 2022 LDC Data Scholarship Program
30th Anniversary Highlight: The LDC Gigawords
________________________________
New publication:
HAVIC MED Novel 2 Test - Videos, Metadata and Annotation<https://catalog.ldc.upenn.edu/LDC2022V02>
Fall 2022 LDC Data Scholarship Program
Student applications for the Fall 2022 LDC Data Scholarship program are being accepted now through September 15, 2022. This program provides eligible students with no-cost access to LDC data. Students must complete an application consisting of a data use proposal and letter of support from their advisor. For application requirements and program rules, visit the LDC Data Scholarships page<https://www.ldc.upenn.edu/language-resources/data/data-scholarships>.
30th Anniversary Highlight: The LDC Gigawords
Giga: a combining form meaning "billion," used in the formation of compound words (Source: https://www.dictionary.com/browse/giga-)
LDC's Gigaword corpora are a natural outgrowth of its vast decades-long multi-language newswire collection. Newswire data was originally collected, annotated, and distributed for use in many sponsored projects and was also released through the LDC catalog in tailored data sets. Then came the idea of making LDC's entire newswire collection available by language with a simple, minimal markup to support a broad range of NLP/HLT tasks. The first Arabic<https://catalog.ldc.upenn.edu/LDC2011T11>, Chinese<https://catalog.ldc.upenn.edu/LDC2011T13>, and English<https://catalog.ldc.upenn.edu/LDC2011T07> Gigaword editions were released in 2003; subsequent cumulative releases through fifth editions in 2011 represent LDC's newswire collection spanning 1994-2010 in those languages. French<https://catalog.ldc.upenn.edu/LDC2011T10> and Spanish<https://catalog.ldc.upenn.edu/LDC2011T12> Gigawords were first published in 2006, culminating in the release of third editions in 2011, likewise covering newswire collected by LDC through 2010.
The community has used, and continues to use, these data sets in numerous ways. Automatic text summarization is a favorite, and current work in this area applies deep learning principles (see, e.g., Gao et al. 2020<https://link.springer.com/article/10.1007/s00521-018-3946-7>, English). Gigawords are also useful for text source classification (Huang et al. 2003<https://aclanthology.org/Y08-1042.pdf>, Chinese), information extraction (Lan et al. 2020<https://arxiv.org/pdf/2004.14519.pdf>, Arabic), knowledge extraction and distributional semantics (Napoles et al. 2012<https://aclanthology.org/W12-3018.pdf>, English), and natural language understanding (Ganitkevitch 2013<https://www.cs.jhu.edu/~juri/pdf/proposal-naacl-2013-srw.pdf>, English), among other fields. Recent variations like the annotated<https://catalog.ldc.upenn.edu/LDC2012T21> and concretely annotated<https://catalog.ldc.upenn.edu/LDC2018T20> English Gigawords add syntactic, semantic, and coreference annotations to this billion word text collection.
All Gigaword corpora are available for licensing by Consortium members and non-members. Visit Obtaining Data <https://www.ldc.upenn.edu/language-resources/data/obtaining> for more information.
________________________________
New publication:
HAVIC MED Novel 2 Test - Videos, Metadata and Annotation<https://catalog.ldc.upenn.edu/LDC2022V02> is comprised of 6,200 hours of user-generated videos with annotation and metadata developed by LDC for the 2015 NIST Multimedia Event Detection tasks. The data consists of videos of various events (event videos) and videos completely unrelated to events (background videos). Each event video was manually annotated with judgments describing its event properties and other salient features. Background videos were labeled with topic and genre categories.
HAVIC MED Novel 2 Test -- Videos, Metadata and Annotation is distributed via web download.
2022 Subscription Members will automatically receive copies of this corpus. 2022 Standard Members may request a copy as part of their 16 free membership corpora. This corpus is a members-only release and is not available for non-member licensing. Contact ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu> for information about membership.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
On Fri, 12 Aug 2022 at 14:00, <corpora-request(a)list.elra.info> wrote:
> Today's Topics:
>
> 1. [CfP] TREC Health Misinformation Track 2022 (Maria Maistro)
> 2. [CfP] ACM TOIS Efficiency in Neural IR (Maria Maistro)
> 3. Call for Badges - ACM SIGIR Artifact Badges Continuous Submission
> (Nicola Ferro)
> 4. Call for proposals: Natural Language Processing (John Benjamins)
> (Caro)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 12 Aug 2022 08:03:18 +0000
> From: Maria Maistro <mm(a)di.ku.dk>
> Subject: [Corpora-List] [CfP] TREC Health Misinformation Track 2022
> To: "corpora(a)list.elra.info" <corpora(a)list.elra.info>
>
> Call for Participation - TREC Health Misinformation Track 2022
> https://trec-health-misinfo.github.io
>
> Overview 🧐
> --------------------------
> Web search engines are frequently used to help people make decisions about
> health-related issues. Unfortunately, the web is filled with misinformation
> regarding the efficacy of treatments for health issues. Search users may
> not be able to discern correct from incorrect information, nor credible
> from non-credible sources. As a result of finding misinformation deemed by
> the user to be useful to their decision making task, they can make
> incorrect decisions that waste money and put their health at risk.
>
> The TREC Health Misinformation track fosters research on retrieval methods
> that promote reliable and correct information over misinformation for
> health-related decision making tasks.
>
> Tasks 💼
> --------------------------
> * Ad-hoc Retrieval Task: design a ranking model that promotes credible and
> correct information over incorrect information;
> * Answer Prediction Task: predict the answer to the topic’s stance.
>
> Guidelines 📋
> --------------------------
> * Corpus: noclean version of the C4 dataset (
> https://huggingface.co/datasets/allenai/c4);
> * Topics: about consumer health search (people seeking health advice
> online);
> * Runs: runs may be either automatic or manual with the standard TREC run
> format.
>
> Detailed guidelines: https://trec-health-misinfo.github.io
>
> Important Dates 🔥
> --------------------------
> * Runs due from participants: August 28, 2022
> * Evaluation results returned: End of September 2022
> * Notebook paper due: October 2022
> * TREC 2022 Conference: November 14-18, 2022
> * Final paper due: February 2023
>
> Organization 👔
> --------------------------
> * Charles Clarke, University of Waterloo
> * Maria Maistro, University of Copenhagen
> * Mark Smucker, University of Waterloo
>
>
> ———
>
> Maria Maistro, PhD
> Tenure-track Assistant Professor
> Department of Computer Science
> University of Copenhagen
> Universitetsparken 5, 2100 Copenhagen, Denmark
>
*** Apologies for Cross-Posting ***
The 7th Arabic Natural Language Processing Workshop (WANLP2022) will be a
full-day event taking place on December 8, 2022 (in a hybrid mode). This
year’s WANLP is co-located with EMNLP 2022 in Abu Dhabi, United Arab
Emirates.
Workshop URL: http://wanlp2022.arabic-nlp.net/
Submission URL: https://softconf.com/emnlp2022/WANLP2022
Important Dates
- September 5: Workshop Paper Due Date
- October 10: Notification of Acceptance
- October 21: Camera-ready papers due (strict!)
- December 7-8: Workshop Dates
We invite submissions on topics that include, but are not limited to, the
following:
- Enabling core technologies: morphological analysis, disambiguation,
  tokenization, POS tagging, named entity detection, chunking, parsing,
  semantic role labeling, sentiment analysis, Arabic dialect modeling, etc.
- Applications: machine translation, speech recognition, speech synthesis,
  optical character recognition, pedagogy, assistive technologies, social
  media, etc.
- Resources: dictionaries, annotated data, corpora, etc.
Submissions may include work in progress as well as finished work.
Submissions must have a clear focus on specific issues pertaining to the
Arabic language whether it is standard Arabic, dialectal, classical, or
mixed. Papers on other languages sharing problems faced by Arabic NLP
researchers, such as Semitic languages or languages using Arabic script,
are welcome provided that they propose techniques or approaches that would
be of interest to Arabic NLP, and they explain why this is the case.
Additionally, papers on efforts using Arabic resources but targeting other
languages are also welcome. Descriptions of commercial systems are welcome,
but authors should be willing to discuss the details of their work.
We have several submission tracks including long, short, and demo tracks.
If you have any questions, please contact us at: wanlp2022(a)gmail.com
The WANLP 2022 Organizing Committee
http://wanlp2022.arabic-nlp.net/
----
*Wajdi Zaghouani, Ph.D.*
*Assistant Professor*
College of Humanities and Social Sciences
P.O. Box 34110 | Education City | Doha, Qatar
tel: +974 4454 5601 | mob: +974 33454992
wzaghouani(a)hbku.edu.qa | Office A141, LAS Building
The Centre for Translation Studies (CTS) at the University of Surrey invites applications for a place in our MRes in Translation and Interpreting Studies course. Students attending this course get in-depth, systematic research training in translation and interpreting, and customised preparation for a PhD and an academic career. This unique and innovative course is the first of its kind in the UK and draws on the research areas CTS is well known for: translation and interpreting technologies, translation process research, translation as intercultural mediation, corpus-based translation, audiovisual translation and multimodality studies. CTS has more recently embarked on exciting, fast-developing areas, including machine translation, Natural Language Processing for translation/interpreting and hybrid workflows in translation/interpreting. The research we carry out at CTS is in touch with recent technological and social developments, as we maintain a strong focus on the responsible integration of technologies in workflows where multilingual and multimodal mediation is key.
By studying with us, you'll join our internationally recognised Centre for Translation Studies, thus benefiting from a combination of leading research expertise and professional relevance and honing skills you will need in order to thrive in academia or in the industry. As an MRes student, you will take two compulsory taught modules and select two optional modules (60 credits). You will then complete your degree with an MRes in Translation and Interpreting Studies Dissertation (120 credits). The dissertation, which is longer than a typical MA dissertation, will enable you to research a topic in greater depth than is the case in a conventional MA project format. This year, we invite in particular students interested in pursuing dissertation topics related to machine translation, corpora in translation and interpreting, and the use of NLP for translation and interpreting.
For further inspiration, take a look at what our current students say about the course and their MA projects: https://www.surrey.ac.uk/student-life/what-our-students-say/zeynep-polat-po…
For more details about the programme or how to apply, visit: https://www.surrey.ac.uk/postgraduate/translation-and-interpreting-studies-…
If you feel that an MRes is not for you, you can check our other postgraduate courses on topics related to translation and interpreting at:
https://www.surrey.ac.uk/centre-translation-studies/study/postgraduate-cour…
---
Prof Constantin Orăsan
Professor of Language and Translation Technologies
Centre for Translation Studies | School of Literature and Languages
Personal page: https://www.surrey.ac.uk/people/constantin-orasan
Office: 06LC03, Phone: +44 (0) 1483 68 4115
Library and Learning Centre, University of Surrey, Guildford, Surrey, GU2 7XH, UK
*Europhras’2022*
International Conference ‘*Computational and Corpus-based Phraseology’*
Malaga, 28-30 September 2022
The forthcoming international conference ‘Computational and Corpus-based
Phraseology’ (Europhras 2022) will take place in Malaga on 28, 29 and 30
September 2022. We are delighted to announce the new website of the
conference: https://europhras.com/2022/
*Conference topics*
The conference will focus on interdisciplinary approaches to phraseology
and invites submissions on a wide range of topics, covering, but not
limited to: computational, corpus-based, psycholinguistic and cognitive
approaches to the study of phraseology, and practical applications in
computational linguistics, translation, lexicography and language learning,
teaching and assessment.
These topics include the following:
*Computational approaches to the study of multiword expressions*, e.g.
automatic detection, classification and extraction of multiword
expressions; automatic translation of multiword expressions; computational
treatment of proper names; multiword expressions in NLP tasks and
applications such as parsing, machine translation, text summarisation, term
extraction, web search;
*Corpus-based approaches to phraseology*, e.g. corpus-based empirical
studies of phraseology, task-orientated typologies of phraseological units
(e.g. for annotation, lexicographic representation, etc.), annotation
schemes, applications in applied linguistics and more specifically
translation, interpreting, lexicography, terminology, language learning,
teaching and assessment (see also below);
*Phraseology in mono- and bilingual lexicography and terminography*, e.g.
new forms of presenting phraseological units in dictionaries and other
lexical resources based on corpus-based and corpus-driven approaches;
domain-specific terminology;
*Phraseology in translation and cross-linguistic studies*, e.g. use of
parallel and comparable corpora for translating phraseological units;
phraseological units in computer-aided translation; study of phraseology
across languages;
*Phraseology in specialised languages and language dialects*, e.g.
phraseology of specialised languages, study of phraseological use in
different dialects or varieties of a specific language;
*Phraseology in language learning, teaching and assessment*: e.g. second
language/bilingual processing of phraseological units and formulaic
language; phraseological units in learner language;
*Theoretical and descriptive approaches to phraseology*, e.g.
phraseological units and the lexis-grammar interface, the relevance of
phraseology for theoretical models of grammar, the representation of
phraseological units in constituency and dependency theories, phraseology
and its interaction with semantics;
*Cognitive and psycholinguistic approaches*: e.g. cognitive models of
phraseological unit comprehension and production; on-line measures
of phraseological unit processing (e.g. eye tracking, event-related
potentials, self-paced reading); phraseology and language disorders;
phraseology and text readability.
The above list is indicative and not exhaustive. Any submission presenting
a study related to the alternative terms of phraseological units, multiword
expressions, multiword units, formulaic language or polylexical
expressions, will be considered.
The Springer volume and the e-proceedings will both be available at the
conference.
In addition, a call for follow-up papers will be announced after the
conference, and the accepted papers reporting these new studies will be
published in a peer-reviewed and/or indexed volume (in English). A collection
of papers in Spanish will be published in an indexed journal (2023).
*Schedule*
28-30 September 2022 - conference takes place in Malaga
*Keynote Speakers*
Jean-Pierre Colson, Université Catholique de Louvain
Miloš Jakubíček, Lexical Computing
María del Carmen Mellado Blanco, University of Santiago de Compostela
Aline Villavicencio, Federal University of Rio Grande do Sul and University
of Sheffield
Conference and Programme Committee Co-Chairs
Gloria Corpas Pastor, University of Malaga
Ruslan Mitkov, University of Wolverhampton
Programme committee
Margarita María Alonso Ramos, University of A Coruña
M. Belén Alvarado Ortega, University of Alicante
Verginica Barbu Mititelu, Romanian Academy
Ignacio Bosque, Complutense University of Madrid
María Luisa Carrió-Pastor, Polytechnic University of Valencia
Anna Čermáková, University of Cambridge
Parthena Charalampidou, Aristotle University of Thessaloniki
Ken Church, Baidu
Jean-Pierre Colson, Université Catholique de Louvain
Dmitrij Dobrovolskij, Russian Language Institute
Peter Ďurčo, University of St. Cyril and Methodius
Natalia Filatkina, University of Hamburg
Elizaveta Goncharova, National Research University, Artificial Intelligence
Research Institute (AIRI)
María Isabel González Rey, University of Santiago de Compostela
Stefan Gries, University of California
Enrique Gutiérrez Rubio, Palacký University Olomouc
Kleanthes K. Grohmann, University of Cyprus
Amal Haddad Haddad, University of Granada
Miloš Jakubíček, Sketch Engine
Eva Lucía Jiménez-Navarro, University of Cordoba
Cvetana Krstev, University of Belgrade
Natalie Kübler, Université Paris Cité
Maria Kunilovskaya, University of Wolverhampton
Ljubica Leone, Lancaster University
Óscar Loureda Lamas, Heidelberg University
Elvira Manero Richard, University of Murcia
Ramón Martí Solano, University of Limoges
María del Carmen Mellado Blanco, University of Santiago de Compostela
Flor Mena Martínez, University of Murcia
Pedro Mogorrón Huerta, University of Alicante
Johanna Monti, “L’Orientale” University of Naples
Esteban Tomás Montoro del Arco, University of Granada
Inés Olza Moreno, University of Navarra
Adriane Orenha Ottaiano, São Paulo State University
Antonio Pamies Bertrán, University of Granada
Rozane Rebechi, Federal University of Rio Grande do Sul
Mª Ángeles Recio Ariza, University of Salamanca
Ute Römer, Georgia State University
Leonor Ruiz Gurillo, University of Alicante
Kathrin Steyer, University of Mannheim
Joanna Szerszunowicz, University of Bialystok
Yukio Tono, Tokyo University of Foreign Studies
Agnès Tutin, University of Grenoble Alpes
Tom Wasow, Stanford University
Eric Wehrli, University of Geneva
Stefanie Wulff, University of Florida
Aline Villavicencio, Federal University of Rio Grande do Sul and University
of Sheffield
Michael Zock, Laboratoire d’Informatique Fondamentale de Marseille
*Organisation and sponsors*
The forthcoming international conference ‘Computational and Corpus-based
Phraseology’ is jointly organised by the University of Malaga (Research
Group in Lexicography and Translation), the University of Wolverhampton
(Research Group in Computational Linguistics) and the Association for
Computational Linguistics - Bulgaria.
The Sketch Engine is the official sponsor of the conference.
*Accompanying events*
The 5th edition of the Workshop on Multiword Units in Machine Translation
and Translation Technology (MUMTTT 2022) will take place as part
of Europhras 2022.
In addition, as part of Europhras 2022 a Sketch Engine tutorial will be
given by Miloš Jakubíček, CEO, Lexical Computing.
*Further information and contact details*
Registration for EUROPHRAS 2022 is now open. To register, please complete
the *registration form*
<https://url6b.mailanyone.net/v1/?m=1nkNMi-0001pg-3h&i=57e1b682&c=wgKlKznP1z…>.
*** The early bird registration has been extended until 9 September 2022 ***
The conference website (https://europhras.com/2022/) will be updated on a
regular basis. For further information, please email europhras2022(a)gmail.com
Best regards,
EUROPHRAS 2022 Organising Committee