Dear all,
We are hiring for the following two postdoctoral positions at the Alan
Turing Institute both focussed on probabilistic program scaffolds for large
language models. This is a collaborative project led by Dr. Pranava
Madhyastha from City, University of London along with Prof. Alessandra
Russo from Imperial College London and Prof. Anthony Cohn from the
University of Leeds.
Opportunity 1: LLM Inference Expert
The first position requires experience with controlling inference in LLMs
and transformer-based sequence-to-sequence models. More details and the
application link can be found here:
https://cezanneondemand.intervieweb.it/turing/jobs/senior-research-associat…
.
Opportunity 2: Probabilistic Programming Specialist
The second position requires a solid background in probabilistic
programming, logic programming or symbolic models for artificial
intelligence (more details and application link can be found here:
https://cezanneondemand.intervieweb.it/turing/jobs/research-associate-proba…
)
As a postdoctoral researcher at the Alan Turing Institute, you will be part
of a vibrant and collaborative research environment, surrounded by renowned
experts and cutting-edge technologies. This position provides an excellent
platform to advance your career and make lasting contributions to the field
of artificial intelligence.
For any questions, get in touch with me at pranava.madhyastha(a)city.ac.uk.
Kind regards,
Pranava
Dear colleagues,
(Apologies if you received multiple emails from different mailing lists.)
We are delighted to announce the call for task proposals for NTCIR-18. NTCIR
(NII Testbeds and Community for Information Access Research) is a series of
evaluation conferences that mainly focus on information access with East
Asian languages and English. The first NTCIR conference (NTCIR-1) took
place in August/September 1999, and the latest NTCIR-17 conference was held
in December 2023. Research teams from all over the world participate in one
or more NTCIR tasks to advance the state of the art and to learn from one
another's experiences.
We invite new task proposals within the expansive field of information
access. Organizing an evaluation task entails pinpointing significant
research challenges, strategically addressing them through collaboration
with fellow researchers (including co-organizers and participants),
developing the requisite evaluation framework to propel advancements in the
state of the art, and generating a meaningful impact on both the research
community and future developments. Prospective applicants are urged to
underscore the real-world applicability of their proposed tasks by
utilizing authentic data, focusing on practical tasks, and solving tangible
problems. Additionally, they should confront challenges in evaluating
information access technology, such as the extensive number of assessments
needed for evaluation, ensuring privacy while using proprietary data, and
conducting live tests with actual users.
Task Proposal Submission Due: Feb 9, 2024 (Anywhere on Earth)
Submission link: https://easychair.org/conferences/?conf=ntcir18proposal
Below are more details, and please feel free to contact us if you have any
questions.
Happy holidays, and happy new year.
Warm regards,
NTCIR-18 Program Committee Co-Chairs
Qingyao Ai, Chung-Chi Chen, and Shoko Wakamiya
Dear colleagues,
Have you ever worked at the intersection of natural language processing and
endangered language documentation, or are you curious about doing so?
My colleagues and I at the University of Colorado Boulder are surveying NLP
researchers and documentary linguists who have done or are interested in
this kind of work. Our goal is to better understand how to make NLP systems
more practically successful in language documentation settings.
If you have 15 minutes, we would be honored if you shared your experiences
with us to help advance our understanding of NLP in language documentation.
We invite you to participate by taking one of our two different surveys
based on which group you belong to:
- NLP researchers
<https://docs.google.com/forms/d/e/1FAIpQLSeCFdMrbWmRqz7OAYbhoJYKX5g2NHPooXo…>
- Documentary linguists <https://forms.gle/4pGhsbGQ36b58byn6>
We look forward to reading what you have to share!
Best regards,
Luke Gessler
Dear Corpora Members,
🌟 Exciting Announcement: OSACT 2024 Workshop 🌟
Calling All Researchers in Computational Linguistics, NLP, and IR Specializing in Arabic Language!
Are you at the forefront of research in low-resource languages, particularly Arabic? Do you delve into the complexities of computational linguistics (CL), natural language processing (NLP), and information retrieval (IR) with a focus on Arabic?
We invite you to explore and contribute to groundbreaking advancements in machine translation, particularly in developing models that seamlessly translate dialectal Arabic text into Modern Standard Arabic (MSA).
Moreover, if your research aims to elevate the integrity and dependability of Arabic Large Language Models (LLMs) by innovating in hallucination detection and mitigation strategies, this workshop is a perfect platform for you.
Join us at the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT6), a hub of innovation and scholarly exchange.
Featured Shared Tasks: Tackling the Forefront Issues in Arabic LLMs and Modern Standard Arabic (MSA) Machine Translation
Task 1: Arabic LLMs Hallucination Challenge: Address the critical issue of hallucinated content in Arabic language models. Engage in this vital conversation and present your solutions. Read details: https://osact-lrec.github.io/
Task 2: Dialect to MSA Machine Translation Challenge: Engage in the pivotal task of transforming dialectal Arabic into MSA through innovative translation models. We invite you to utilize your expertise in driving significant advancements in language processing, fostering more effective and meaningful exchanges in the Arabic-speaking world. Read details: https://osact-lrec.github.io/
Event Details:
Date: May 25, 2024
Location: Torino, Italy
In conjunction with the esteemed LREC-COLING 2024 - The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation.
Don’t miss this opportunity to contribute to a pioneering field!
Key Dates:
Paper Submission Deadline: February 25, 2024
Acceptance Notification: March 25, 2024
Visit our website, OSACT 2024: https://osact-lrec.github.io/ , to read more details and submission guidelines.
Please send any questions to OSACT.WORKSHOP(a)gmail.com
Looking forward to your participation and to seeing you at LREC-COLING in May 2024!
The OSACT 2024 Workshop Organizing Committee
--
Mona Ali MSc. PhD. Computer Science
Associate Professor
Northeastern University
Vancouver Campus
410 W Georgia, 14th Floor
Vancouver, BC | Land of: xʷməθkʷəy̓əm, Sḵwx̱wú7mesh, and səlilwətaɬ
*************************************************************************
CNLP4DH First Call for Papers:
Throughout 2024, the Journal of Data Mining and Digital Humanities (JDMDH)
is running a worldwide call for papers on the topic
Chinese Natural Language Processing for Digital Humanities (CNLP4DH)
As a reminder, JDMDH is an international journal managed by French
national research institutions and is green open access (no charge for
readers or authors).
This special issue is dedicated to natural language processing for digital
humanities involving documents written in Chinese, including Modern,
Ancient and dialectal Chinese. Work on Mandarin, the official national and
main common language, is accepted, and research on texts written in other
languages, such as those of Tibet and Inner Mongolia, is also welcome.
Suitable topics include, but are not limited to:
- Text analysis and processing related to humanities using computational
methods
- Dataset creation and curation for NLP (e.g. digitization, datafication,
and data preservation).
- Research on cultural heritage collections such as national archives and
libraries using NLP
- NLP for error detection, correction, normalization and denoising data
- Generation and analysis of literary works such as poetry and novels
- Analysis and detection of text genres
- Word segmentation, part-of-speech tagging of Ancient Chinese
- Large Language Models (LLM) for Chinese in Digital Humanities
- Cross modal Models (text-speech-video-image) for Chinese in Digital
Humanities
- Visualization of text analytics
- Ontology models for natural language text
- Applications in Chinese Literature, Traditional Chinese medicine,
Learning Chinese language as second language, Sentiment Analysis in Chinese
Social Media, China Cultural Heritage, Chinese History, Ancient Chinese
language
Submission guidelines: https://jdmdh.episciences.org/page/submissions
Paper submission: https://jdmdh.episciences.org/submit
Website and more details:
https://jdmdh.episciences.org/page/chinese-natural-language-processing-for-…
Guest Editors:
Dr. Wenhe FENG (Guangdong University of Foreign Studies, Laboratory of
Language Engineering and Computing)
Dr. Bin LI (Nanjing Normal University, School of Chinese Language and
Literature, Center of Linguistic Big Data and Computational Humanities)
Dr. Nicolas TURENNE (Guangdong University of Foreign Studies, School of
Information Science and Technology)
Dr. Tong WEI (Beijing University, Digital Humanities Center)
*************************************************************************
***********************************************************************************
First Call for Papers:
The 5th workshop on: "Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from
people with various forms of cognitive/psychiatric/developmental impairments"
Workshop: co-located with LREC-COLING 2024 | Turin, Italy | May 21st, 2024
RaPID-5 serves as an interdisciplinary platform for researchers to exchange insights, methods, and experiences related to collecting and processing data from individuals with mental, cognitive, neuropsychiatric, or neurodegenerative impairments. The workshop focuses on creating, processing, and applying such data resources from individuals at different stages and severity levels of these impairments. The ultimate goal of RaPID-5 is to facilitate the study of relationships among linguistic, paralinguistic, and extra-linguistic observations, with applications ranging from aiding diagnosis to enhancing monitoring and predicting individuals at higher risk, ultimately promoting multidisciplinary collaboration across clinical, language technology, computational linguistics, and computer science communities.
Submission deadline: Sun., 31st of March, 2024 (anywhere on earth)
Paper submission: https://softconf.com/lrec-coling2024/rapid2024/
Website and more details: https://spraakbanken.gu.se/en/rapid-2024
Contact: Dimitrios Kokkinakis
Contact email: dimitrios.kokkinakis(a)gu.se
Organizing committee:
* Kathleen C. Fraser, National Research Council, Canada;
* Dimitrios Kokkinakis, University of Gothenburg, Sweden;
* Kristina Lundholm Fors, Lund University, Sweden;
* Charalambos K. Themistocleous, University of Oslo, Norway;
* Athanasios Tsanas, The University of Edinburgh, UK;
* Fredrik Öhman, University of Gothenburg and Sahlgrenska University Hospital, Sweden
************************************************************************************
[Apologies for multiple postings]
CALL FOR PAPERS
Second International Workshop Towards Digital Language Equality (TDLE):
Focusing on Sustainability
co-located with LREC-COLING 2024, May 2024, Turin (Italy)
See further details at https://european-language-equality.eu/tdle-2024/
[1]
1 Description and Aims of the Workshop
The key aim of this half-day workshop co-located with LREC-COLING 2024
(https://lrec-coling-2024.org/), to be held in Turin (Italy) in May
2024, is to discuss and promote the importance of sustainability in the
design, development, creation, use, distribution and sharing of language
data, resources, platforms, infrastructures, tools and technologies,
with the intention of achieving Digital Language Equality (DLE). While
some important work has recently addressed these crucial areas (e.g.
Fort and Couillault, 2016; Hessenthaler et al., 2022; Ramesh et al.,
2023; Castilho et al., forthcoming), the relevant contributions seem to
be as yet unsystematic and relatively isolated. The workshop intends to
provide an inclusive forum to encourage in-depth debate and facilitate
collaborations to promote the sustainability of resources and
technologies in any (combination of) languages, in support of
multilingualism and of the overarching goal of DLE.
The sustainability of language resources and technologies is key to
enabling multilingualism and digital language equality in the age of
Artificial Intelligence.
2 Topics of Interest
The second international Towards Digital Language Equality (TDLE)
workshop focuses on sustainability in relation to the design,
development, creation, use, distribution and sharing of language data,
resources, platforms, infrastructures, tools and technologies, with a
view to promoting the broader goal of Digital Language Equality (DLE).
The concept of DLE has been firmly established in relation to all
languages of Europe (Rehm and Way, 2023), and has the potential to also
benefit other languages throughout the world, to support the prosperity
of the respective communities at a time of impressive - but as yet very
unevenly distributed and severely imbalanced - progress in
language-centric Artificial Intelligence (AI), e.g. through large
language models (LLMs). The workshop places particular emphasis on
multilingualism and on leveling up digital support for languages,
domains and applications that have so far been underserved, and wishes
to explore ways to develop policies and funding streams to work towards
sustainability in connection with DLE, especially in support of
regional, minority and territorial languages.
To this end, recognizing that the sustainability of Language Resources
and Technologies (LRTs) is key to enabling multilingualism and DLE in
the age of AI, topics of particular interest for the workshop on which
we invite original contributions covering any (combination of) languages
include, but are not limited to, the following:
- research on the factors affecting DLE and the sustainability of LRTs;
- best practices, case studies and validated guidelines related to the
design, implementation and improvement of sustainability of written,
oral/spoken, signed and/or multimodal LRTs (including LLMs),
particularly in support of DLE;
- how multilingual LLM technology can support DLE;
- retrospectively assessing the sustainability of legacy LRTs, and
future-proofing new LRTs in the interest of DLE;
- analyzing the costs and benefits of foregrounding sustainability for
LRTs;
- the role of metadata, accompanying documentation and licenses in
showing and improving the sustainability of LRTs;
- sustainability, fairness and accessibility (e.g. for users with
physical or cognitive disabilities, limited computing resources and
connectivity) of platforms and infrastructures hosting, distributing and
sharing LRTs in the interest of DLE;
- how current data and computing access inequality is affecting DLE (in
particular regarding LLMs);
- ecological sustainability and environmental fairness of developing and
deploying state-of-the-art LRTs, e.g. LLMs with regard to energy
consumption, global warming and climate change;
- developing data and parameter efficient methods to train or adapt
language models to new languages;
- how to evaluate, measure, compare and improve the sustainability of
LRTs;
- establishing benchmarks and protocols to ensure the sustainability of
LRTs;
- how to avoid the potential dangers of developing and using unfair and
unsustainable LRTs, e.g. for malicious, ill-intentioned or harmful
purposes;
- ethical, legal, cultural and/or socio-economic implications of
(ignoring) fairness and sustainability of LRTs;
- developing and implementing forward-looking policies to promote
fairness and long-term sustainability of LRTs to achieve DLE;
- education and training needs and experiences in relation to promoting
fairness and sustainability of LRTs and ways to raise broad awareness of
DLE and related topics, e.g. among the general public, policy- and
decision-makers.
Given this wide-ranging and inclusive remit, the workshop intends to
bring together developers, creators, vendors, distributors, brokers,
users, evaluators and researchers of written, oral/spoken, signed and/or
multimodal LRTs in any (combination of) languages.
3 Background and First TDLE Workshop Held in 2022
The second edition of the workshop (2024) builds on the success of the
first Towards Digital Language Equality (TDLE) workshop, which was held
at LREC 2022 in Marseille (France) on 20 June 2022, and whose accepted
papers were published in a dedicated volume of proceedings, Aldabe et
al. (2022).
Following this well-received inaugural workshop held in June 2022, the
second event in the series will be co-located with LREC-COLING 2024 in
Turin (Italy) in May 2024, and will focus specifically on the highly
relevant topic of the sustainability of LRTs in connection with
multilingualism and DLE.
4 Submissions
Up-to-date information on the workshop, including materials for authors,
guidelines, templates, stylesheet and key dates can be found at the
dedicated website https://european-language-equality.eu/tdle-2024/ [1].
To contact the organizing committee of the workshop directly, you can
email tdle2024.hitz(a)ehu.eus.
Papers submitted to the workshop should be completely anonymous for
double-blind peer review, written in English, and prepared using the
official LREC-COLING 2024 author's kit and submission
stylesheet/template available at
https://lrec-coling-2024.org/authors-kit/ [2]. The submissions to the
workshop should not exceed 8 pages, excluding references, and be saved
in unprotected PDF format. Papers should be submitted no later than 23
February 2024 through the START submission management system available
via the workshop website at
https://european-language-equality.eu/tdle-2024/ [1].
The workshop seeks original papers, i.e. it does not accept submissions
that have been, or will be, published elsewhere. The workshop allows
simultaneous submissions, and in these cases the authors should clearly
indicate in the manuscript to which other conference, workshop or venue
they have submitted the paper for review. Each paper submitted to the
workshop will receive three double-blind peer reviews. Papers accepted
for presentation will be included in the proceedings of the workshop.
In light of the LREC-COLING 2024 Map and the "Share your LRs!"
initiative, when submitting their papers through the START system
authors will be asked to provide essential information about resources
(in a broad sense, i.e. also technologies, standards, evaluation kits,
etc.) that have been used for the work described in the paper or are a
new result of their research. Moreover, ELRA encourages all LREC-COLING
authors to share the described LRs (data, tools, services, etc.) to
enable their reuse and replicability of experiments (including
evaluation ones).
5 Key Dates
Paper submission deadline: 23 February 2024
Notification of acceptance: 19 March 2024
Camera-ready papers due: 8 April 2024
Half-day workshop date: 20, 21 or 25 May 2024 (TBC)
6 Workshop Organizers
- Itziar Aldabe (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
- Begoña Altuna (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
- Aritz Farwell (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
- Federico Gaspari (University of Naples "Federico II", Italy & ADAPT
Centre, Dublin City University, Ireland - co-chair)
- Joss Moorkens (School of Applied Language & Intercultural
Studies/ADAPT Centre, Dublin City University, Ireland - co-chair)
- Stelios Piperidis (Institute of Language and Speech Processing, Athena
Research and Innovation Center in Information, Communication and
Knowledge Technologies, Greece)
- Georg Rehm (Speech and Language Technology Lab, Deutsches
Forschungszentrum für Künstliche Intelligenz, Germany)
- German Rigau (HiTZ Basque Center for Language Technology - Ixa,
University of the Basque Country, Spain)
7 Program Committee
- Antonios Anastasopoulos (GMU, USA)
- Anya Belz (ADAPT, DCU, Ireland)
- Steven Bird (CDU, Australia)
- Fred Blain (Uni. Tilburg, Netherlands)
- Franco Cutugno (Uni. Naples "Federico II", Italy)
- Bessie Dendrinos (NKUA, Greece & ECSPM, Denmark)
- Félix do Carmo (Uni. Surrey, UK)
- Annika Grützner-Zahn (DFKI, Germany)
- Ana Guerberof-Arenas (Uni. Groningen, Netherlands)
- Davyth Hicks (ELEN, Belgium)
- Monja Jannet (ADAPT, DCU, Ireland)
- John Judge (ADAPT, DCU, Ireland)
- Dorothy Kenny (SALIS/CTTS/ADAPT, DCU, Ireland)
- Sabine Kirchmeier (EFNIL, Luxembourg)
- Teresa Lynn (MBZUAI, United Arab Emirates)
- Maite Melero (BSC, Spain)
- Helena Moniz (Uni. Lisbon, Portugal & EAMT)
- Johanna Monti (UniOR, Italy)
- Rachele Raus (UniBO, Italy)
- Wessel Reijers (Uni. Paderborn, Germany)
- Celia Rico Pérez (Universidad Complutense de Madrid, Spain)
- Dimitar Shterionov (TU, Netherlands)
- Carlos S. C. Teixeira (IOTA Localisation Services & Uni. Rovira i
Virgili, Spain)
- Antonio Toral (Uni. Groningen, Netherlands)
- Vincent Vandeghinste (Instituut voor de Nederlandse Taal, Netherlands
& KU Leuven, Belgium)
References
Itziar Aldabe, Begoña Altuna, Aritz Farwell and German Rigau, editors.
2022. Proceedings of the Workshop Towards Digital Language Equality
(TDLE). European Language Resources Association, Marseille, France.
Sheila Castilho, Federico Gaspari, Joss Moorkens, Maja Popović and
Antonio Toral, editors. Forthcoming. Journal of Specialised Translation.
Special Issue n. 41 on "Translation Automation and Sustainability".
Karën Fort and Alain Couillault, 2016. "Yes, We Care! Results of the
Ethics and Natural Language Processing Surveys". Proceedings of the
Tenth International Conference on Language Resources and Evaluation
(LREC'16). European Language Resources Association, Portorož, Slovenia.
1593-1600.
Marius Hessenthaler, Emma Strubell, Dirk Hovy and Anne Lauscher, 2022.
"Bridging Fairness and Environmental Sustainability in Natural Language
Processing". Proceedings of the 2022 Conference on Empirical Methods in
Natural Language Processing, Abu Dhabi, United Arab Emirates. 7817-7836.
András Kornai, 2013. "Digital Language Death". PLoS ONE, 8(10):e77056.
Krithika Ramesh, Sunayana Sitaram and Monojit Choudhury, 2023. "Fairness
in Language Models Beyond English: Gaps and Challenges". Findings of the
Association for Computational Linguistics: EACL 2023. Association for
Computational Linguistics, Dubrovnik, Croatia. 2106-2119.
Georg Rehm and Andy Way, editors. 2023. European Language Equality: A
Strategic Agenda for Digital Language Equality. Berlin: Springer.
Links:
------
[1] https://european-language-equality.eu/tdle-2024/
[2] https://lrec-coling-2024.org/authors-kit/
The Robotics and Semantic Systems group at the Department of Computer Science, Lund University, has announced an Assistant Professor position (biträdande universitetslektor, BUL) in Computer Science with focus on semantic systems and natural language processing.
The group (https://rss.cs.lth.se) is doing research in cognitive robotics and AI, including Machine Learning, Natural Language Processing, Human-Robot Interaction and Advanced Robotics. The group is part of the RobotLab LTH (https://robotics.lth.se). It is involved in a number of research centers and programmes, including WASP (Wallenberg Autonomous Software and Systems Programme, https://wasp-sweden.org), ELLIIT (Excellence Center at Linköping – Lund in Information Technology, https://elliit.se), LTH profile area: Pillars of AI and Digitalization and LU profile area: Natural and Artificial Cognition.
The candidate is expected to have earned a PhD a couple of years ago, preferably followed by a postdoc, and to be planning to settle into an academic career at a vibrant university (with a fresh Nobel prize in house :-).
Detailed information about the position may be found in the announcement:
https://lu.varbi.com/what:job/jobID:654845/?lang=en
The deadline for applying has been postponed until January 9th, 2024.
You are welcome to contact me for more details.
Jacek Malec, head of the RSS group
--
Jacek Malec jacek.malec(a)cs.lth.se
Department of Computer Science tel. +46 46 2224950
LTH, Lund University cell +46 70 4950474
Box 118, 221 00 Lund, Sweden http://cs.lth.se/Jacek_Malec/
[apologies for cross-posting]
ISIR, in Paris, has an open position for a two-year non-permanent junior
researcher / postdoc in Machine Translation / Large Language Models.
Details available here:
https://emploi.cnrs.fr/Offres/CDD/UMR7222-FRAYVO-001/Default.aspx?lang=EN
Please apply before Jan 5th, 2024.
Best
F
--
François Yvon
ISIR/CNRS
4 Place Jussieu
F 75005 Paris
https://fyvo.github.io