The Natural Language Processing Section at the Department of Computer Science at the University of Copenhagen is advertising an 18-month position for a Postdoctoral Researcher in Natural Language Processing. The position is funded by the European Union Horizon Europe project to Democratize Trustworthy and Efficient Large Language Model Technology for Europe. The overall goal of the project is to develop European large language models (LLMs) on an unprecedented scale, trained on the largest amount of text so far in European AI, covering a range of underrepresented languages, and pushing the limits of European exascale computing. The successful candidate will join a team developing hybrid token-pixel language models and retrieval-augmented language models. The project team includes a consortium of researchers across Europe, and, locally, the co-investigator, a postdoctoral researcher, and one PhD student. Further information about the project is available at https://cordis.europa.eu/project/id/101135671.
The successful candidate will join the Language and Multimodal Processing group, which is part of a section with a strong, international, and diverse environment for research within core as well as emerging topics in natural language processing, natural language understanding, computational linguistics and multi-modal language processing. It is housed within the main Science Campus, which is centrally located in Copenhagen. Further information about the group is available here: https://lampgroup.github.io/ and further information about research at the Department is available here: https://di.ku.dk/english/research/.
The application deadline is 31 January 2024, with a start date of 1 April 2024, or as soon as possible thereafter. Further information about the position can be found here: https://employment.ku.dk/faculty/?show=160726
Informal enquiries about the position can be made to the co-investigator Desmond Elliott, Department of Computer Science, University of Copenhagen, e-mail: de(a)di.ku.dk.
9th Symposium on Corpus Approaches to Lexicogrammar (LxGr2024)
CALL FOR PAPERS
Deadline for abstract submission: Friday 15 March 2024
The symposium will take place online on Friday 5 and Saturday 6 July 2024.
LxGr primarily welcomes papers reporting on corpus-based research on any aspect of the interaction of lexis and grammar - particularly studies that interrogate the system lexicogrammatically to get lexicogrammatical answers. However, position papers discussing theoretical or methodological issues are also welcome, as long as they are relevant to both lexicogrammar and corpus linguistics.
If you would like to present, send an abstract of 500 words (excluding references) to lxgr(a)edgehill.ac.uk. Make sure that the abstract clearly specifies the research focus (research questions or hypotheses), the corpus, the methodology (techniques and metrics), the theoretical orientation, and the main findings. Abstracts will be double-blind reviewed, and decisions will be communicated within four weeks.
Full papers will be allocated 35 minutes (including 10 minutes for discussion).
Work-in-progress reports will be allocated 20 minutes (including 5 minutes for discussion).
There will be no parallel sessions.
Participation is free.
For details, visit the LxGr website: https://sites.edgehill.ac.uk/lxgr/lxgr2024
If you have any questions, contact lxgr(a)edgehill.ac.uk.
We are seeking an enthusiastic PhD candidate to work in multimodal NLP for model-based systems engineering.
Details
This project is funded by the EPSRC iCASE (sponsored by BAE Systems) to conduct research in the area of multimodal natural language processing (NLP) for model-based systems engineering based on Large Language Models (LLMs). LLMs have demonstrated a remarkable ability to generate text when presented with images, text, audio and video as input. They are able to achieve higher performance than traditional neural methods and pre-trained language models, without the need for supervised training.
The project will examine different approaches to multimodal LLM-based NLP to address complex and fine-grained tasks such as reasoning in model-based systems engineering. The PhD will delve into LLM architectures, data augmentation methods, multi-task and domain-specific LLMs, prompt engineering and interpretability.
The candidate will have the opportunity to work with experts at BAE to gain experience in the practical application of model-based systems engineering. The candidate will join the world-class teams of Prof. S. Ananiadou (Computer Science and National Centre for Text Mining, Natural Language Processing, LLMs) and Prof. H. Yin (Electrical and Electronic Engineering, Deep Learning, Computer Vision).
Requirements
You will have a very good undergraduate degree in Computer Science (minimum 2:1 UK or equivalent for EU students). Experience and knowledge of NLP, multimodal LLMs, Ontologies, Semantic Web, Computer Aided Engineering (CAE) and Model-Based Systems Engineering (MBSE) tools and technology will be considered as an advantage.
The successful candidate must be capable of obtaining UK security clearance to fulfil any onsite industrial placement at the location of the host site.
Research Environment in Host Institution
The Department of Computer Science at the University of Manchester (UoM) is in the unique position of hosting the National Centre for Text Mining (NaCTeM), the first publicly funded centre for text mining in the world, focusing on fundamental research in Natural Language Processing (LLMs, interpretability, information extraction) in a variety of domains. Besides NaCTeM, academic expertise in AI is spread across a number of other institutes including the Institute for Data Science and AI (IDSAI), the Centre for AI Fundamentals and partnerships with the Alan Turing Institute and the European Laboratory for Learning and Intelligent Systems (ELLIS).
BAE Systems
BAE Systems provides some of the world's most advanced, technology-led defence, aerospace and security solutions. They employ a skilled workforce of more than 93,000 people in around 40 countries. Working with customers and local partners, they develop, engineer, manufacture, and support products and systems to deliver military capability, protect national security and people, and keep critical information and infrastructure secure.
Before you apply
We strongly recommend that you contact the supervisors of this project prior to application.
How to apply
To be considered for this project, you will need to complete a formal application through our online application portal<https://www.findaphd.com/common/clickCount.aspx?theid=167305&type=199&DID=1…> by the 26th of January 2024.
When applying, you will need to specify the full name of this project, the name of your supervisor, how you are planning to fund your research, details of your previous studies, and the names and contact details of two referees.
Please also send the following to Prof. Sophia Ananiadou (Sophia.ananiadou(a)manchester.ac.uk) and Prof. Hujun Yin (hujun.yin(a)manchester.ac.uk):
* Cover letter and full CV
* Full degree transcripts and relevant certificates
Candidates will be shortlisted by a panel comprising members of UoM and BAE Systems. Selected candidates will be invited to give a presentation followed by a formal interview. The interviews will be held during the week of 29th January 2024.
Your application will not be processed unless all of the required documents are submitted at the time of application, and we cannot accept responsibility for late or missed deadlines. Incomplete applications will not be considered.
If you have any questions about making an application, please contact our admissions team by emailing FSE.doctoralacademy.admissions(a)manchester.ac.uk.
Equality, diversity and inclusion<https://www.findaphd.com/common/clickCount.aspx?theid=167305&type=199&DID=1…> is fundamental to the success of The University of Manchester, and is at the heart of all of our activities. We know that diversity strengthens our research community, leading to enhanced research creativity, productivity and quality, and societal and economic impact.
We actively encourage applicants from diverse career paths and backgrounds and from all sections of the community, regardless of age, disability, ethnicity, gender, gender expression, sexual orientation and transgender status.
We also support applications from those returning from a career break or other roles. We will consider offering flexible study arrangements (including part-time: 50%, 60% or 80%, depending on the project/funder).
Funding Notes
This project is funded through EPSRC iCASE (with BAE Systems). The project will pay the tuition fees and provide a tax-free stipend set at the UKRI rate (£18,622). We are able to offer only a limited number of studentships to applicants outside the UK. Therefore, due to the competitive nature of this scheme, full studentships will only be awarded to candidates of exceptional quality. Additional research funds will be available.
----------
Professor Sophia Ananiadou
Department of Computer Science
Director, National Centre for Text Mining
Deputy Director, Institute for Data Science and Artificial Intelligence
Turing Fellow
The University of Manchester
(apologies for cross-posting)
The 9th Workshop on Linked Data in Linguistics: Resources, Applications,
Best Practices
Workshop colocated with *LREC-COLING 2024*
*Date*: May 25, 2024
*Venue*: Torino, Italy and online
For up to date info, check: https://ldl2024.linguistic-lod.org/
Call for Papers
The Linked Data in Linguistics (LDL) workshop series has established itself
as the premier venue for discussing the application of Semantic Web
technologies to the fields of linguistics, digital lexicography, and
digital humanities (DH).
While recent years have witnessed a steady growth in adoption of the
technology in these areas, its uptake in other relevant domains, most
notably in the case of natural language processing (NLP), continues to lag
behind.
This year, aside from embracing the full bandwidth of applications of LLOD
technologies and the closely related area of knowledge graphs in
linguistics, we welcome contributions addressing the application of LLOD
technologies to NLP applications, as well as those dealing with emerging
hot topics of future bridges between structured (linguistic) knowledge and
neural methods.
In addition, this year’s edition of the workshop will be a venue for
in-depth discussions on community standards and best practices, and, above
all, those related to the work of the W3C community groups OntoLex
<https://www.w3.org/community/ontolex/> [1], LD4LT
<https://www.w3.org/community/ld4lt/> [2] and BPMLOD
<https://www.w3.org/community/bpmlod/> [3]. To this end, it will include
featured talks on the latest achievements, developments, and perspectives
of these W3C Community Groups.
[1] Ontology-Lexica Community Group
[2] Linked Data in Language Technology Community Group
[3] Best Practices in Multilingual Linked Open Data
* Topics of interest *
We invite presentations of algorithms, methodologies, experiments, tools,
use cases, descriptions of ongoing or planned research projects as well as
position papers that describe the creation, publication or application of
linked linguistic data collections and their linking with other resources.
Descriptions of such data, and in particular, its uses in research
(linguistics, lexicology, digital humanities) and technology (NLP,
e-lexicography, localization) are also welcome. The following is a
non-exhaustive list of relevant topics:
1. Building, managing and linking language resources
- Lexicons and Lexical Data, including Dictionaries and Lexicographic
Resources
- Annotations and Annotated Corpora
- Entity Linking
2. Technologies, challenges and best practices for language technology
and language resources on the web:
- Interoperability
- Sustainability
- FAIRness
3. Structured data in language technology:
- Knowledge Graphs
- Machine Learning
- Multilingual Technologies
- Language Knowledge Injection in LLMs
4. Show cases, case studies and applications by different communities of
practice:
- Multimodality
- Corpus Linguistics
- Lexicography
- Digital Humanities
5. Current directions and critical reflection. Position papers on:
- Ethical, legal, technological aspects of structured data in the age
of LLMs
- The role of LLOD in promoting low-resource languages
- Extensions of RDF and graph formalisms
We invite both long (8 pages and 2 pages of references) and short papers (4
pages and 2 pages of references) representing original research, innovative
approaches and resource descriptions. Short papers may also represent
project descriptions. These do not have to be implemented but discuss to
what extent and for which purposes Linguistic Linked Open Data is reused or
created. Projects that are still in their early stages and seek advice from
the broader Linguistic Linked Data community are welcome, especially if
they include underrepresented fields of study.
Papers should be formatted according to the LREC-COLING guidelines; please
see https://lrec-coling-2024.org/authors-kit/. Please note that the review
process will be *single-blind*.
* Identify, Describe and Share your LRs! *
When submitting a paper from the START page, authors will be asked to
provide essential information about resources (in a broad sense, i.e. also
technologies, standards, evaluation kits, etc.) that have been used for the
work described in the paper or are a new result of your research. Moreover,
ELRA encourages all LREC-COLING authors to share the described LRs (data,
tools, services, etc.) to enable their reuse and replicability of
experiments (including evaluation ones).
* Important Dates *
- Submission Date: February 23, 2024
- Notification of Acceptance: March 22, 2024
- Camera-Ready: April 5, 2024
- Workshop: May 25, 2024
* Workshop Organizers *
- Christian Chiarcos (University of Augsburg, Germany)
- Katerina Gkirtzou (Athena Research Center, Greece)
- Maxim Ionov (University of Cologne, Germany)
- Fahad Khan (Consiglio Nazionale delle Ricerche, Italy)
- John P. McCrae (University of Galway, Ireland)
- Elena Montiel Ponsoda (Universidad Politécnica de Madrid, Spain)
- Patricia Martín Chozas (Universidad Politécnica de Madrid, Spain)
Please get in contact via ldl2024(a)linguistic-lod.org.
* Program Committee *
- Sina Ahmadi (George Mason University, USA)
- Verginica Barbu Mititelu (Research Institute for Artificial
Intelligence of the Romanian Academy, Romania)
- Paul Buitelaar (Insight, Ireland)
- Sara Carvalho (University of Aveiro, Portugal)
- Rute Costa (NOVA FCSH/NOVA CLUNL, Portugal)
- Milan Dojchinovski (Czech Technical University, Czech Republic)
- Agata Filipowska (Uniwersytet Ekonomiczny w Poznaniu, Poland)
- Francesca Frontini (CNR-ILC, Italy)
- Frances Gillis Webber (University of Cape Town, South Africa)
- Voula Giouli (Athena Research Center, Greece)
- Dagmar Gromann (University of Vienna, Austria)
- Yoshihiko Hayashi (Waseda University, Japan)
- Alik Kirillovich (Higher School of Economics, Russia)
- Penny Labropoulou (Athena Research Center, Greece)
- Chaya Liebeskind (Jerusalem College of Technology, Israel)
- David Lindemann (University of the Basque Country, Spain)
- Francesco Mambrini (Università Cattolica del Sacro Cuore, Italy)
- Monica Monachini (CNR-ILC, Italy)
- Diego Moussallem (Paderborn University, Germany)
- Roberto Navigli (“La Sapienza” Università di Roma, Italy)
- Petya Osenova (IICT-BAS, Bulgaria)
- Ana Ostroški Anić (Institute of Croatian Language and Linguistics,
Croatia)
- Giulia Pedonese (CNR-ILC, Italy)
- Sigita Rackevičienė (Mykolas Romeris University, Lithuania)
- Felix Sasaki (SAP, Germany)
- Andrea Schalley (Karlstad University, Sweden)
- Gilles Sérasset (University Grenoble Alpes, France)
- Milena Slavcheva (IICT-BAS, Bulgaria)
- Blerina Spahiu (Bicocca University, Italy)
- Ranka Stanković (University of Belgrade, Serbia)
- Armando Stellato (University of Rome, Italy)
- Federica Vezzani (University of Padua, Italy)
*SEM brings together researchers interested in the semantics of (many and diverse!) natural languages and its computational modeling. The conference embraces data-driven, neural, and probabilistic approaches, as well as symbolic approaches and everything in between; practical applications as well as theoretical contributions are welcome. The long-term goal of *SEM is to provide a stable forum for the growing number of NLP researchers working on all aspects of semantics of (many and diverse!) natural languages.
Topics of interest:
Lexical semantics and word representations
Compositional semantics and sentence representations
Statistical, machine learning, and deep learning methods in semantic tasks
Multilingual and cross-lingual semantics
Word sense disambiguation and induction
Semantic parsing, and syntax-semantics interface
Frame semantics and semantic role labeling
Textual inference, textual entailment, and question answering
Formal approaches to semantics
Extraction of events and of causal and temporal relations
Entity linking, pronouns and coreference
Discourse, pragmatics, and dialogue
Machine reading
Extra-propositional aspects of meaning
Multiword and idiomatic expressions
Metaphor, irony, and humor
Knowledge mining and acquisition
Common sense reasoning
Language generation
Semantics in NLP applications: sentiment analysis, abusive language detection, summarization, fact-checking, etc.
Multidisciplinary research on semantics
Grounding and multimodal semantics
Psycholinguistics
Interpretability and Explainability
Human semantic processing
Semantic annotation, evaluation, and resources
Ethical aspects and bias in semantic representations
We encourage authors to think about the ethical aspects of their work, and to address and discuss all ethical questions and implications relevant to their research. STARSEM values reproducibility and particularly welcomes submissions that adhere to the reproducibility guidelines as specified here.
Submission Instructions
Submissions must describe unpublished work and be written in English. We solicit both long and short papers. Please note that any double submission of a paper must be declared at submission time.
Long papers describe original research and may consist of up to eight (8) pages of content, plus unlimited pages for references. Appendices are allowed after the references, but the paper should be self-contained and reviewers will not be required to check the appendices, if any. Final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers' comments can be taken into account. Short papers describe original focused research and may consist of up to four (4) pages, plus unlimited pages for references. Upon acceptance, short papers will be given five (5) content pages in the proceedings. Authors are encouraged to use this additional page to address reviewers' comments in their final versions.
Submissions should follow the ARR formatting requirements. The deadline for direct submissions is Feb 22, 2024, and these submissions will be reviewed by the *SEM-2024 program committee. ACL Rolling Review (ARR) submissions can be committed to *SEM up to March 22, 2024 (authors of ARR-reviewed papers need to include their OpenReview link with reviews in the submission form). Both types of submissions are through OpenReview. Limitations and Ethics Statement sections are allowed and encouraged, but they are not mandatory. They should be placed after the conclusion and they will not count towards the overall page limit. In *SEM there is no special policy against multiple submissions, but the Program Chairs should be notified of them.
Submission link: https://openreview.net/group?id=aclweb.org/StarSEM/2024/Conference
Important Dates
Anonymity period for direct submissions begins Jan 22, 2024
Direct submission deadline Feb 22, 2024
ARR-reviewed paper submission deadline Mar 22, 2024
Notification of acceptance Apr 22, 2024
Camera-ready deadline May 5, 2024
Conference date Jun 16, 2024
Anonymity period
To protect the integrity of double-blind review and ensure that submissions are reviewed fairly, we adopt the rules and guidelines for ACL conferences. The following rules and guidelines make reference to the anonymity period, which runs from 1 month before the submission deadline (starting January 22, 2024 11:59PM UTC-12:00) up to the date when your paper is accepted or rejected (Apr 22, 2024) or withdrawn.
You may not make a non-anonymized version of your paper available online to the general community (for example, via a preprint server) during the anonymity period. By a version of a paper we understand another paper having essentially the same scientific content but possibly differing in minor details (including title and structure) and/or in length (e.g., an abstract is a version of the paper that it summarizes).
If you have posted a non-anonymized version of your paper online before the start of the anonymity period, you may submit an anonymized version to the conference. The submitted version must not refer to the non-anonymized version, and you must inform the program chair(s) that a non-anonymized version exists.
You may not update the non-anonymized version during the anonymity period, and we ask you not to advertise it on social media or take other actions that would further compromise double-blind reviewing during the anonymity period.
Note that, while you are not prohibited from making a non-anonymous version available online before the start of the anonymity period, this does make double-blind reviewing more difficult to maintain, and we therefore encourage you to wait until the end of the anonymity period if possible. Alternatively, you may consider submitting your work to the Computational Linguistics journal, which does not require anonymization and has a track for “short” (i.e., conference-length) papers.
Welcome to SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes!
Task description: SHROOM participants will need to detect grammatically sound output that contains incorrect semantic information (i.e. unsupported or inconsistent with the source input), with or without having access to the model that produced the output.
Overview of the task: The modern NLG landscape is plagued by two interlinked problems:
On the one hand, our current neural models have a propensity to produce inaccurate but fluent outputs; on the other hand, our metrics are most apt at describing fluency, rather than correctness. This leads neural networks to “hallucinate”, i.e., produce fluent but incorrect outputs that we currently struggle to detect automatically. For many NLG applications, the correctness of an output is however mission critical. For instance, producing a plausible-sounding translation that is inconsistent with the source text puts in jeopardy the usefulness of a machine translation pipeline. With our shared task, we hope to foster the growing interest in this topic in the community.
With SHROOM we adopt a post hoc setting, where models have already been trained and outputs already produced: participants will be asked to perform binary classification to identify cases of fluent overgeneration hallucinations in two different tracks: a model-aware and a model-agnostic track. In the former, participants have access to the model that produced the output; in the latter, they do not. To ensure a low barrier to entry, we format the task as a binary classification problem. We now also provide a baseline kit, containing a baseline system, a format checker and the scoring program.
All systems will be rated on accuracy (i.e., the proportion of test examples correctly labeled) and calibration (i.e., the correlation between the probability assigned by a system and the proportion of annotators marking a production as hallucinatory).
We provide to participants a collection of checkpoints, inputs, references and outputs of systems covering three NLG tasks: definition modeling (DM), machine translation (MT), and paraphrase generation (PG), trained with varying degrees of accuracy. The development set provides binary annotations from five different annotators and a majority vote gold label.
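As a rough illustration of the two evaluation criteria described above, the sketch below computes accuracy against a majority-vote gold label and a rank correlation between system probabilities and the proportion of annotators who marked an output as hallucinatory. This is not the official scoring program from the baseline kit: the choice of Spearman correlation, the 0.5 decision threshold, the toy data and all function names are assumptions made for the example.

```python
from statistics import mean


def majority_label(votes):
    """Gold label from binary annotator votes (1 = hallucination)."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def accuracy(predicted, gold):
    """Proportion of examples whose predicted label matches the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): five annotator votes per output, and the
# probability a system assigns to "this output is a hallucination".
votes = [[1, 1, 1, 0, 1], [0, 0, 1, 0, 0], [1, 0, 1, 1, 0], [0, 0, 0, 0, 0]]
system_probs = [0.9, 0.3, 0.4, 0.1]

gold = [majority_label(v) for v in votes]
annotator_proportion = [sum(v) / len(v) for v in votes]
predicted = [1 if p > 0.5 else 0 for p in system_probs]  # assumed threshold

acc = accuracy(predicted, gold)
rho = spearman(system_probs, annotator_proportion)
print(f"accuracy={acc:.2f} calibration={rho:.2f}")
```

In this toy example the third output is misclassified at the assumed threshold, which lowers accuracy, while the ordering of the probabilities still matches the ordering of annotator agreement, so the correlation stays high. The two criteria therefore reward different properties of a system.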
Anyone wishing to participate in the task is welcome! Participants will have to
* Submit at least once during the evaluation phase in January;
* Write a system description paper before February 19;
* Review other system description papers (max. 2).
Trial, dev and train data are now available on the task website:
https://helsinki-nlp.github.io/shroom/
Codalab competition: https://codalab.lisn.upsaclay.fr/competitions/15726
Join the mailing group: https://groups.google.com/u/1/g/semeval-2024-task-6-shroom
Updates on Twitter: @shroom2024<https://twitter.com/shroom2024>
Important dates:
* Sample data ready: July 15th, 2023
* Validation data ready: September 11th, 2023
* Unlabeled train data ready: September 22nd, 2023
* Evaluation period starts (test set released): January 10th, 2024
* Evaluation period ends: January 31st, 2024
* Workshop paper submission deadline: February 19th, 2024
* Notification to authors: March 18th, 2024
* SemEval workshop: 16–21 June, Mexico (collocated with NAACL 2024)
Task organizers
* Elaine Zosa, Silo AI, Finland
* Raúl Vázquez, University of Helsinki, Finland
* Jörg Tiedemann, University of Helsinki, Finland
* Vincent Segonne, Southern Brittany University, France
* Teemu Vahtola, University of Helsinki, Finland
* Alessandro Raganato, University of Milano-Bicocca, Italy
* Timothee Mickus, University of Helsinki, Finland
* Marianna Apidianaki, University of Pennsylvania, USA
Call for Papers: *HTRes 2024 – Holocaust Testimonies as Language
Resources*
Pre-conference workshop at LREC-COLING 2024
(https://lrec-coling-2024.org/)
Tuesday, 21st May, 2024 in Torino, Italy
Workshop webpage: https://www.clarin.eu/HTRes2024
** Final date for paper submission: 21 February 2024 **
Holocaust testimonies serve as a bridge between survivors and history’s
darkest chapters, providing a connection to the profound experiences of
the past. Testimonies stand as the primary source of information that
describes the Holocaust, offering first-hand accounts and personal
narratives of those who experienced it. The majority of testimonies are
captured in an oral format, as survivors vividly explain and share their
personal experiences and observations from that time period.
Transforming Holocaust testimonies into a machine-processable digital
format can be a difficult task owing to the unstructured nature of the
text. The creation of accessible, comprehensive, and well-annotated
Holocaust testimony collections is of paramount importance to our
society. These collections empower researchers and historians to
validate the accuracy of socially and historically significant
information, enabling them to share critical insights and trends derived
from these data. This workshop will investigate a number of ways in
which techniques and tools from natural language processing and corpus
linguistics can contribute to the exploration, analysis, dissemination
and preservation of Holocaust testimonies.
Topics of interest:
We expect contributions related to the following topics:
* Creation of datasets and development of tools for the study of
Holocaust testimonies:
* Creation of language corpora of Holocaust testimonies
* Digitisation and enhancement of oral and written testimonies
(including automatic speech recognition, alignment of text and
speech, format conversion, OCR, handwriting recognition, machine
translation)
* Named entity recognition for identifying people, places, and events
in testimonies
* Standards, representation formats, and guidelines for annotations
and vocabularies relevant to the Holocaust testimonies
* Creation, adaptation and tuning of software applications for the
creation, annotation, enhancement and use of Holocaust testimonies
as language resources
Research using NLP and Holocaust testimonies
* Applications of NLP in analysing Holocaust survivor testimonies
* Sentiment analysis and emotional content extraction from survivor
narratives.
Data Visualisation, Knowledge Representation and Information Extraction:
* Visualising complex data structures from Holocaust testimonies
* Building knowledge graphs and networks to represent historical
relationships
* Interactive data visualisations for education and research
* Extracting biographical and temporal information relevant to the
Holocaust
* Deep learning and large language models
Digital Archiving and Long-Term Preservation:
* Methods and tools for digitising and preserving Holocaust testimonies
* Best practices for metadata standards and cataloguing
* Ensuring long-term accessibility and data integrity
Ethical Considerations and Privacy
* Ethical challenges in digitising and sharing sensitive testimonies
* Anonymisation and privacy protection in Holocaust data
* Community engagement and consent in digital projects
User and application aspects
* Development of tools and interfaces for the search, analysis and
exploration of Holocaust testimonies
* Other relevant use cases and application scenarios
All papers must clearly state and explain their relevance to the topic
of 'Holocaust Testimonies as Language Resources'.
All papers must represent original and unpublished work that is not
currently under review. Papers will be evaluated according to their
significance, originality, technical content, style, clarity, and
relevance to the workshop. We welcome the following types of contributions:
Standard research papers (up to 8 pages, plus more pages for references
if needed);
Short research papers (from 4 to 6 pages, plus more pages for references
if needed).
Submissions should strictly follow the LREC2024 stylesheet formatting
guidelines. All papers should be submitted electronically in PDF format
via START, the main conference platform
(https://softconf.com/lrec-coling2024/htres2024/).
Important Dates:
Final date for paper submission: 21 February 2024
Notification of Acceptance: 20 March 2024
Camera-ready version submission: 15 April 2024
Workshop date: 21 May 2024
Programme:
Please refer to the website for the details of the programme, plus the
organizing and programme committees: https://www.clarin.eu/HTRes2024
--
Senior Researcher in Corpus Linguistics
Faculty of Linguistics, Philology and Phonetics, University of Oxford
National Co-ordinator, CLARIN-UK
martin.wynne(a)ling-phil.ox.ac.uk
https://orcid.org/0000-0002-4155-0530
Please note that students in NLP are also very welcome to apply to the
CS @ Max Planck PhD Program; NLP Faculty in this program include e.g.
Mariya Toneva and Vera Demberg.
CS @ Max Planck is a selective doctoral program that grants
admitted students full financial support to pursue doctoral research
in the field of computer and information science, with faculty at Max
Planck Institutes and some of the best German universities.
To qualify for the program, students must hold a Bachelor’s or
Master’s degree in computer science (or a related field) and have an
outstanding academic record. We especially encourage applications
from students who wish to explore research across the CS spectrum
before committing to a topic and advisor.
For more information about the program, see:
https://www.cis.mpg.de/graduate-programs/cs-max-planck
The next application deadline is December 31, 2023.
For further information, please contact Gretchen Gravelle (MPI-SWS Grad
Office, grad-office(a)mpi-sws.org)
Apologies for cross-posting.
----------------------------------------
*The International Conference on Spoken Language Translation*
*21st IWSLT 2024 – First Call for Participation*
*August 15-16, 2024 – Bangkok, Thailand*
*http://iwslt.org*
The International Conference on Spoken Language Translation (IWSLT) is the
premier annual conference for all aspects of Spoken Language Translation.
Every year, the conference organizes and sponsors open evaluation campaigns
around key challenges in simultaneous and consecutive translation, under
real-time/low latency or offline conditions and under low-resource or
multilingual constraints. System descriptions and results from
participants’ systems and scientific papers related to key algorithmic
advances and best practices are presented.
IWSLT is the venue of SIGSLT, the Special Interest Group on Spoken
Language Translation of ACL, ISCA and ELRA. With a track record of 20
years, IWSLT benchmarks and proceedings serve as reference for all
researchers and practitioners working on speech translation and related
fields.
The 21st edition of IWSLT will be run as an *ELRA/ACL* event and co-located
with ACL 2024 <https://2024.aclweb.org/> on August 15-16, 2024. It will be
run as a hybrid event.
Important Dates
January 15, 2024: Release of shared task training and dev data
April 01-15, 2024: Evaluation period
April 29, 2024: Paper submission due (all papers)
June 4, 2024: Notification of acceptance
June 24, 2024: Camera-ready paper due
July 22, 2024: Pre-recorded video due
August 15-16, 2024: Conference
Evaluation
The IWSLT 2024 features shared tasks <https://iwslt.org/2024/#shared-tasks>
that address the following focus areas:
- Speech-to-speech track
- Simultaneous track
- Subtitling track
- Offline track
- Dubbing track
- Low-resource track
- Indic track
Training, development and test data for each shared task will be prepared
and released by the respective organizers (for further information on this
initiative, please refer to the website <https://iwslt.org/2024/>).
Participants will receive instructions about how to submit their runs. In
addition, participants have the opportunity to present their work through
a system paper that will be published in the ACL Proceedings.
Conference
IWSLT also invites submissions of scientific papers to be published in the
ACL Proceedings and presented either in oral or poster format. The
conference selects high-quality, original contributions on theoretical and
practical issues of spoken language translation research, technologies and
applications. For further information on this initiative, please refer to
the website <https://iwslt.org/2024/#paper-submission>
Contact
Please send an email to iwslt-evaluation-campaign(a)googlegroups.com if you
have any questions related to the shared tasks.