PhD Title: Information Rating and Analysis of Knowledge Dynamics.
Application to the Temporal Monitoring of the Reliability of
Bibliographic Information on Insects as Vectors of Plant Pathogens.
We are looking for candidates who have proven knowledge of NLP and
Machine Learning. The deadline is June 30th.
Detailed description of the doctoral project:
<https://maiage.inrae.fr/sites/default/files/2024-04/ADUM%20INRAE%202024-eng…>
-----------
## Research subject
The PhD will be carried out within the framework of the research project
on NLP for insect monitoring in plant health. The central aim of the PhD
project is to develop original approaches for rating the reliability of
textual information, by integrating linguistic dimensions and knowledge
graphs (NLP, language models), and dynamic dimensions (time series). The
quality and relevance of the extracted information will be derived from
the documents collected over time and from the existing knowledge base.
For plant health and risk management, the biological interaction between
insect vectors, pathogens, and host plants is of primary interest for
anticipating contamination and reducing pesticide use.
You will be affiliated with the Computer Science Graduate School at
Paris-Saclay University. You will be employed by INRAE.
## Requirements
A successful candidate will have an MSc or equivalent in Artificial
Intelligence.
* Proven experience in applying natural language processing.
* Interest in learning about biology or bioinformatics.
* High level of academic English or French, both written and spoken.
* Good programming skills in Python or Java (and preferably experience
  with deep learning tools).
* Capacity to work as part of a team in a multidisciplinary framework.
* Experience in applied research in the life sciences is an asset.
## Offer
We offer a motivating research environment with many opportunities for
in-house, national and international collaborations and access to
GPU computing resources and state-of-the-art research equipment. The
gross monthly salary for the three-year contract rises from 2,100 euros
(in 2024) to 2,300 euros (in 2026), including the social security package
(healthcare, pensions, unemployment benefits).
## Start date: October 2024
Location: Paris-Saclay University campus, mainly at the MaIAGE lab [1]
and MIA-Paris-Saclay [2]. You will visit PHIM [3] at the INRAE research
center in Montpellier (South of France).
## Application
The closing date is June 30th, 2024.
Interested candidates should send their application files to Claire
Nédellec (claire.nedellec(a)inrae.fr), Vincent Guigue, Nicolas Sauvion
(Nicolas.sauvion(a)inrae.fr), and Robert Bossy (Robert.bossy(a)inrae.fr).
The application should comprise:
* a CV (max 5 pages) with transcripts (Master), diplomas, internships
* a cover letter
* the names and contact details of two referees for reference letters
[1] https://maiage.inrae.fr/en/bibliome
[2] https://vguigue.github.io/
[3] https://umr-phim.cirad.fr/en/recherche/comprendre-les-epidemies-dans-les-ch…
*Apologies for cross-posting*
First Workshop on Knowledge-Enhanced Machine Translation
Sheffield, United Kingdom, on June 27, 2024
https://kemt2024.wixsite.com/home
First Call for Papers
The 1st edition of the Workshop on Knowledge-Enhanced Machine Translation
will be held in Sheffield, United Kingdom, on June 27, 2024, co-located
with EAMT 2024. We welcome submissions either of research papers or
extended abstracts/industry reports. Full research papers should describe
original, unpublished content, while extended abstracts are open to
reporting preliminary results of ongoing research. Industry reports should
demonstrate the impact of conceptual modelling in a real-world setting,
arguing for generalisability of methods and lessons learned. Potential
submission topics encompass, but are not restricted to:
- Integration of external terminology and constrained decoding
- Integration of translation memories and similar translations from
  external sources
- Leveraging any kind of linguistic information
- Data augmentation techniques
- Using large language models to integrate external resources
- Knowledge graphs
- Integration of translation quality indicators for improving final MT
  output
- Quality assessment of knowledge-enhanced MT systems
- Utilising quality estimation systems for improving MT performance
Submission information
The workshop accepts submissions in two different modalities:
- Full research papers: Submissions will be accepted as papers of 4 to
  10 pages (plus unlimited pages for references and appendices).
- Extended abstracts/industry reports: Submissions will be accepted as
  papers of up to 2 pages. The references are not included in the 2-page
  limit.
Accepted submissions will be presented either as posters or oral
communications, as decided by the program committee. Accepted submissions
will be published online as proceedings included in the ACL Anthology,
unless the authors specify otherwise.
Submissions should be formatted according to the EAMT 2024 guidelines (PDF,
LaTeX, Word)
<https://eamt2024.sheffield.ac.uk/conference-calls/2nd-call-for-papers#h.45w…>
and submitted in PDF through the OpenReview page (
https://openreview.net/group?id=EAMT.org/2024/Workshop/KEMT ).
Important dates
- Workshop paper/abstracts due: 15th April 2024
- Notification of acceptance: 15th May 2024
- Camera-ready papers/extended abstracts due: 27th May 2024
- Workshop date: 27th June 2024
Follow us on Twitter: https://twitter.com/kemt2024
--
Miquel Esplà-Gomis
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
Carretera de Sant Vicent del Raspeig s/n
03690 Sant Vicent del Raspeig, Alacant (Spain)
Tel: +34 965903400 ext. 2424
The next meeting of the Edge Hill Corpus Research Group will take place online (via MS Teams) on Thursday 25 April 2024, 2:00-3:30 pm (UK time).
Attendance is free. You can register here:
https://store.edgehill.ac.uk/conferences-and-events/conferences/events/edge…
Topics: Corpus Methodology, Large Language Models
Speakers: Sylvia Jaworska <https://www.reading.ac.uk/elal/staff/dr-sylvia-jaworska> (University of Reading, UK) & Mathew Gillings <https://www.wu.ac.at/ebc/about-us/team/mathew-gillings/> (Vienna University of Economics and Business, Austria)
Title: How humans vs. machines identify discourse topics: an exploratory triangulation
Abstract
Identifying discourses and discursive topics in a set of texts has been of interest not only to linguists but also to researchers working across the social sciences. Traditionally, these analyses have been conducted based on small-scale interpretive analyses of discourse which involve some form of close reading. Naturally, however, that close reading is only possible when the dataset is small, and it leaves the analyst open to accusations of bias and/or cherry-picking.
Designed to avoid these issues, other methods have emerged which involve larger datasets and have some form of quantitative component. Within linguistics, this has typically been through the use of corpus-assisted methods, whilst outside of linguistics, topic modelling is one of the most widely used approaches. Increasingly, researchers are also exploring the utility of LLMs (such as ChatGPT) to assist analyses and identification of topics. This talk reports on a study assessing the effect that analytical method has on the interpretation of texts, specifically in relation to the identification of the main topics. Using a corpus of corporate sustainability reports, totalling 98,277 words, we asked 6 different researchers, along with ChatGPT, to interrogate the corpus and decide on its main ‘topics’ via four different methods. The amount of context available increases from one method to the next.
• Method A: ChatGPT is used to categorise the topic model output and assign topic labels;
• Method B: Two researchers were asked to view a topic model output and assign topic labels based purely on eyeballing the co-occurring words;
• Method C: Two researchers were asked to assign topic labels based on a concordance analysis of 100 randomised lines of each co-occurring word;
• Method D: Two researchers were asked to reverse-engineer a topic model output by creating topic labels based on a close reading.
The talk explores how the identified topics differed both between researchers in the same condition and between researchers in different conditions, shedding light on some of the mechanisms underlying topic identification by machines vs humans or machines assisted by humans. We conclude with a series of tentative observations regarding the benefits and limitations of each method, along with suggestions for researchers in selecting an analytical approach for discourse topic identification. While this study is exploratory and limited in scope, it opens up a way for further methodological and larger-scale triangulations of corpus-based analyses with other computational methods including AI.
Dear all,
We are excited to announce the upcoming "Celtic Languages in the Digital Age" workshop, scheduled for April 9, 2024, at Lancaster University.
This hybrid event of talks and a panel discussion, organised by the UCREL NLP Group and funded by the Faculty of Science and Technology's Research Catalyst Fund, aims to address the critical need for linguistic resources supporting Celtic languages.
Event Details:
- Date: April 9, 2024
- Location: Lancaster University
- Format: Hybrid with online broadcast (free registration)
- Register to attend online: https://bit.ly/clida2024
The workshop will gather experts in Celtic languages, linguistics, corpus linguistics, computer science, and computational linguistics to explore the development of language models for under-resourced languages, including Welsh, Irish, Scottish Gaelic, Cornish, Breton, and Manx. We also have a talk on the use of transfer learning to create language models for low-resourced languages, taking Luxembourgish as a use case.
The programme, list of speakers and talk details can be found on the event's website: https://wp.lancs.ac.uk/celtic/
If you are in or near Lancaster and would like to attend in person, do get in touch with me, as we have a few places left. Attending in person is free, and lunch and refreshments will be provided on the day.
Best wishes,
Mo
--------------------------------
Dr Mo El-Haj
Senior Lecturer in NLP
Director of Admissions (SCC)
Co-Director of UCREL NLP Group <https://ucrel.lancs.ac.uk/>
Natural Language Engineering (NLE) Journal Editorial Board
https://www.cambridge.org/core/journals/natural-language-engineering
Advisory Board of the Natural Language Processing Book Series
https://benjamins.com/catalog/nlp
School of Computing and Communications, Lancaster University
https://www.lancaster.ac.uk/staff/elhaj
@DocElhaj <https://twitter.com/DocElhaj>
*** Third Call for Papers ***
12th IEEE International Conference on Intelligent Systems (IS'24)
Invited Session on Intelligent Tools for e/m/d Learning
August 29-31, 2024, Golden Sands Resort, Varna, Bulgaria
https://www.ieee-is.org
(*** Submission Deadline: April 30, 2024 AoE ***)
This invited session aims to explore state-of-the-art advancements in intelligent tools
designed to revolutionize the landscape of e-learning (electronic learning), m-learning
(mobile learning) and d-learning (digital learning). With the rapid integration of technology
into education, there is a growing need to investigate and showcase innovative intelligent
solutions that can enhance the effectiveness of online, mobile and digital learning
environments. Intelligent solutions for the educational context can have a groundbreaking and
revolutionary impact on learners and faculty alike, with integrated educational tools offering
enhanced engagement, support for different learning levels, styles and abilities, and
personalized learning experiences with immediate feedback at their core.
Papers submitted to this session are expected to present results on various aspects of
intelligent tools for e-learning, m-learning and d-learning. Topics may include but are not
limited to adaptive learning systems, personalized learning experiences, artificial
intelligence-driven content creation, data analytics for educational insights, and the integration
of emerging technologies such as augmented reality and virtual reality in educational settings.
Authors are encouraged to share empirical research, case studies, and practical
implementations that demonstrate the impact of intelligent tools on learner/faculty
engagement, knowledge retention, and overall educational outcomes. The session aims to
provide a comprehensive overview of the current state of the art in intelligent tools in
e-learning, m-learning and d-learning, fostering discussions on their implications for the
future of education and their potential to address challenges in diverse learning
environments and educational levels, as well as diverse learner ability.
PAPER SUBMISSION INSTRUCTIONS
Submitted papers should be in IEEE 2-column format and should adhere to the template
available here: https://www.ieee-is.org/wp-content/uploads/2024/02/IS_A4_format-AAu.docx
The expected paper length in camera-ready format should not exceed 6 pages.
Submissions should be made in PDF via EasyChair; the submission link is:
https://easychair.org/conferences/?conf=itl24 .
PUBLICATION
All accepted papers will be included in the IS'24 proceedings to be published by IEEE. The
proceedings of the previous editions of IS can be found here:
https://ieeexplore.ieee.org/xpl/conhome/1000395/all-proceedings .
Traditionally, extended versions of conference-selected papers appear within 1-2 years after
the conference dates in well-known international journals and/or post-conference books.
More information can be found on the conference web site
(https://www.ieee-is.org/publication-information/ ).
CONTACT POINT
For any additional information or clarification please contact the Invited Session Chair,
George A. Papadopoulos at george(a)ucy.ac.cy .
IMPORTANT DATES
• Paper submission: April 30, 2024, AoE
• Notification: May 20, 2024
• Camera Ready: June 6, 2024
• Author Registration: June 15, 2024
ORGANISATION
Committees
https://www.ieee-is.org/program-committee/
Invited Session Chair
• George A. Papadopoulos, University of Cyprus, Cyprus (george(a)ucy.ac.cy)
****We apologize for multiple postings of this e-mail****
IberLEF 2024 Task - HOPE: Approaching Hope Speech Detection in Social
Media from Two Perspectives, for Equality, Diversity and Inclusion and as
Expectations
Held as part of the evaluation forum IberLEF 2024
<https://sites.google.com/view/iberlef-2024/home?authuser=0> in the 40th
edition of the International Conference of the Spanish Society for Natural
Language Processing (SEPLN 2024
<http://sepln2024.infor.uva.es/en/front-page-english/>)
Valladolid, Spain, 24-27 September 2024
Codalab link: https://codalab.lisn.upsaclay.fr/competitions/17714
Dear All,
Hope, a crucial aspect of human psychology, profoundly shapes emotions,
behavior, and mood, influencing how individuals perceive and navigate
challenges (Bruininks and Malle, 2005; Snyder, 1994, 2000). High levels of
hope correlate with positive outcomes such as academic success and lower
depression rates, while low hope is associated with diminished well-being
(Snyder, 2002; Snyder et al., 1997; Diener, 2009). Despite its
significance, hope has been underexplored in Natural Language Processing
(NLP) until recent years. Efforts have been made to integrate NLP
techniques into the analysis of hope through shared tasks, like those
organized in ACL 2022, RANLP 2023, and IberLEF 2023 (Chakravarthi et al.,
2022; Kumaresan et al., 2023; Jiménez-Zafra et al., 2023). The upcoming
IberLEF 2024 edition aims to delve deeper into hope from two angles: hope
for equality, diversity, and inclusion, and hope as expectations. This
edition promises to expand understanding by examining hope across different
domains and languages, thus addressing crucial questions in hope speech
detection research. Two tasks are outlined in this description, each
focusing on different aspects of hope.
- Task 1: It centers on "Hope for Equality, Diversity, and Inclusion,"
  emphasizing the importance of hope speech in mitigating hostility and
  supporting individuals facing challenges like illness, stress, or
  loneliness, particularly within vulnerable groups such as the LGBT
  community and racial minorities. Given a Spanish tweet, the task is to
  identify whether it contains hope speech or not. The possible
  categories for each text are:
  - hs: hope speech.
  - nhs: non-hope speech.
- Task 2: It delves into "Hope as Expectations," highlighting hope's role
  as an anticipatory mindset shaping human emotions and behaviors, especially
  in the context of social media where expressions are abundant. This task
  aims to analyze the presence of hope speech in English and Spanish texts,
  focusing on binary hope speech detection and multiclass hope speech
  detection. The subtasks are as follows:
  - Subtask 2a - Binary hope speech detection: A given text in
    English/Spanish will be classified as:
    - Hope
    - Not Hope
  - Subtask 2b - Multiclass hope speech detection: A given text in
    English/Spanish will be classified as:
    - Generalized Hope
    - Realistic Hope
    - Unrealistic Hope
    - Not Hope
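As a rough illustration of the label sets listed above, the following Python sketch encodes the categories and collapses Subtask 2b labels into Subtask 2a labels. It is not part of the official task materials: the names `TASK1_LABELS`, `BINARY_LABELS`, `MULTICLASS_LABELS` and `to_binary` are hypothetical, and treating all three hope subtypes as "Hope" in the binary setting is an assumption rather than something stated by the organizers.

```python
# Label sets as listed in the task description.
TASK1_LABELS = ("hs", "nhs")  # hope speech / non-hope speech
BINARY_LABELS = ("Hope", "Not Hope")  # Subtask 2a
MULTICLASS_LABELS = (  # Subtask 2b
    "Generalized Hope",
    "Realistic Hope",
    "Unrealistic Hope",
    "Not Hope",
)


def to_binary(label: str) -> str:
    """Collapse a Subtask 2b label into a Subtask 2a label.

    Assumption: every hope subtype counts as "Hope" in the binary setting.
    """
    if label not in MULTICLASS_LABELS:
        raise ValueError(f"unknown Subtask 2b label: {label!r}")
    return "Not Hope" if label == "Not Hope" else "Hope"


if __name__ == "__main__":
    for label in MULTICLASS_LABELS:
        print(f"{label} -> {to_binary(label)}")
```

A check like this could be run on a predictions file before uploading, so that a misspelled label is caught locally rather than using up one of the limited submissions.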
In both tasks, there will be a real-time leaderboard and the participants
will be allowed to make a maximum of 10 submissions through CodaLab, from
which each team will have to select the best one for ranking.
The dataset details and registration are available at:
https://codalab.lisn.upsaclay.fr/competitions/17714
Best regards,
The HOPE 2024 organizing committee
Important dates
- Release of training + development corpora: Feb 16, 2024.
- Release of test corpora and start of evaluation campaign: April 1, 2024.
- End of evaluation campaign (deadline for runs submission): Apr 16, 2024.
- Publication of official results: Apr 18, 2024.
- Paper submission: May 14, 2024.
- Review notification: Jun 11, 2024.
- Camera-ready submission: Jun 28, 2024.
- IberLEF Workshop (SEPLN 2024): Sep 27, 2024.
- Publication of proceedings: Sep ??, 2024.
Organizing Committee
- Daniel García-Baena, SINAI, Universidad de Jaén, Spain.
- Fazlourrahman Balouchzahi, CIC IPN, Mexico.
- Salud María Jiménez-Zafra, SINAI, Universidad de Jaén, Spain.
- Sabur Butt, Institute for the Future of Education (IFE) at Tecnológico
  de Monterrey, Mexico.
- Miguel Ángel García-Cumbreras, SINAI, Universidad de Jaén, Spain.
- Atnafu Lambebo Tonja, Centro de Investigación en Computación, Instituto
  Politécnico Nacional (IPN), Mexico.
- José Antonio García-Díaz, UMUTeam, Universidad de Murcia, Spain.
- Selen Bozkurt, Department of Biomedical Informatics, School of Medicine,
  Emory University.
- Bharathi Raja Chakravarthi, University of Galway, Ireland.
- Hector G. Ceballos, Institute for the Future of Education (IFE) at
  Tecnológico de Monterrey, Mexico.
- Rafael Valencia-García, UMUTeam, Universidad de Murcia, Spain.
- Grigori Sidorov, CIC IPN, Mexico.
- L. Alfonso Ureña-López, SINAI, Universidad de Jaén, Spain.
- Alexander Gelbukh, CIC IPN, Mexico.
Sabur Butt, Ph.D. (He/Him)
Institute for the Future of Education (IFE)
Tecnológico de Monterrey, Mexico
Address: Av. Eugenio Garza Sada 2501 Sur Tecnológico, 64849 Monterrey, N.L.
LinkedIn <https://www.linkedin.com/in/saburb> - GitHub
<https://github.com/saburbutt> - Scholar
<https://scholar.google.com/citations?user=re7md-0AAAAJ&hl=en> - Website
<https://saburbutt.github.io/>
The Department of Computer Science at the IT University of Copenhagen is
offering a PhD position in Natural Language Processing/Computational
Linguistics, with a start date of *1 September 2024*. The *application
deadline is 15 April 2024*. Applications for the position can be
submitted via the ITU job portal
<https://candidate.hr-manager.net/ApplicationInit.aspx?cid=119&ProjectId=181…>.
*Proposed project title: *Linguistic competence of language models in
prompt interpretation and text generation
*Proposed project description.* Recent generative systems based on
pre-trained language models are remarkably fluent when generating even
relatively exotic kinds of text, such as limericks or texts in Early Middle
English. At the same time, they remain sensitive to slight variation in the
wording of the prompts.
The proposed project will investigate this difference in competence for
different linguistic phenomena when the model interprets its instructions
(prompts) and generates text in response to prompts. *The specific focus of
the project is negotiable*; it could be syntactic constructions, variations
in lexical semantics, text registers etc. Model competence will be assessed
with respect to analysis of its pre-training data (hence, an open-source
model will be used).
The ideal candidate would have a strong background in computational
linguistics, as well as core skills in programming in Python and machine
learning. To enter the PhD program in Denmark, an M.Sc. or equivalent
degree is required. For this position, it is also possible to start as a
Master student if extra ECTS credits are needed (for students who currently
have 60-115 MA ECTS).
The successful candidate will be a member of the national Pioneer Centre
for Artificial Intelligence <https://aicentre.dk/>, a 5-university Danish
research endeavor, and of the NLPnorth <https://nlpnorth.github.io/> research
group at the IT University’s Computer Science Department. Both the centre
and research group are highly international and well-funded, working on a
broad range of research topics.
The project will be co-supervised by Associate Professors Anna Rogers
<https://annargrs.github.io/> (arog(a)itu.dk) and Rob van der Goot
<https://robvanderg.github.io/> (robv(a)itu.dk), to whom inquiries about the
project can be directed.
--
Anna Rogers
Associate Professor
IT University of Copenhagen
http://annargrs.github.io/
I have a postdoc opening in my lab with *applications due April 10th*. See
Bullard Research Fellow (BRF) area 7 (“*BRF7*”) in the job ad here:
https://apply.interfolio.com/142711.
"BRF7) We seek applicants in natural language processing (NLP), information
retrieval (IR), and human computation & crowdsourcing (HCOMP). Our work on
responsible AI develops methods for model explanations and fairness. We
build automated and human-in-the-loop models. We develop general methods to
advance the state-of-the-art, grounded in social challenges like curbing
disinformation and hate speech. A variety of our ongoing work touches on
large language models (LLMs). This position will be mentored by Matt Lease
<https://www.ischool.utexas.edu/~ml/>, as part of his lab for Artificial
Intelligence and Human-Centered Computing <http://ai.ischool.utexas.edu/>
(AI&HCC), and provide collaboration opportunities in UT Austin’s
campus-wide Good Systems <http://goodsystems.utexas.edu/> grand challenge
for responsible AI."
Please see the job ad for full details about the opening:
https://apply.interfolio.com/142711.
--
Matt Lease
Professor
Information & Computer Science
University of Texas at Austin
Voice: (512) 471-9350 · Fax: (512) 471-3971 · Office: UTA 5.536
http://www.ischool.utexas.edu/~ml
Apologies for cross-posting.
---------------------------------------------------------------------------
There is still time to participate in this shared task: the evaluation phase runs from April 17 to 24.
**Social Media Mining For Health 2024**
https://healthlanguageprocessing.org/smm4h-2024/
The Social Media Mining for Health (SMM4H) workshop and shared tasks have been running successfully since 2016. They now go into the 9th round, with the workshop being co-located with ACL 2024 in Bangkok.
https://2024.aclweb.org/
Bangkok, Thailand, August 12–17, 2024
**Important Dates for all SMM4H Shared Tasks**
Training data available: January 10, 2024
CodaLab Available: January 17, 2024
Evaluation Phase: April 17 - 24, 2024
System description paper due: May 17, 2024
Paper acceptance notification: June 17, 2024
Camera-ready papers due: July 1, 2024
Workshop in Bangkok, Thailand: August 15, 2024
**Task 2: Task Description**
Adverse Drug Events (ADEs) are negative medical side effects related to a drug. Mining ADEs from user-generated text has become a popular topic and is an important use case for research, as it could help detect crowd signals from users online. Being able to make use of information across languages by applying multi-lingual methods further supports this endeavor.
Our task targets the languages *German, French and Japanese* and is split into two subtasks. Subtask 2a focuses on Named Entity Recognition (NER) of medication, disorder, and function mentions from user-generated texts. Subtask 2b performs joint NER and Relation Extraction (RE) to determine if these disorders are ADEs by finding the correct relations between medications, disorders and functions. We distinguish two types of relations between medication mentions and disorder/function mentions:
- "caused": the disorder/function was caused by a medication, i.e., the disorder/function is an ADE
- "treatment_for": the disorder/function is the reason for the medication, i.e., the medication is supposed to treat the disorder/function
*Tasks*:
Participants can choose between participating in subtask 2a, or subtask 2b, or both.
~~ We explicitly encourage the submission of new and creative approaches! ~~
- Subtask 2a) Named entity recognition of the entities "drug", "disorder" and "function" from user-generated texts.
- Subtask 2b) Joint named entity and relation extraction of the entities "drug", "disorder" and "function" and the relations "caused" and "treatment_for".
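The entity and relation types described above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the official SMM4H data format: the classes `Entity` and `Relation`, the helpers `find_entity` and `adverse_drug_events`, the character-offset representation, and the example sentence are all hypothetical (the sentence is in English purely for readability, whereas the task data is in German, French and Japanese).

```python
from dataclasses import dataclass

# Entity and relation labels named in the task description.
ENTITY_TYPES = {"drug", "disorder", "function"}
RELATION_TYPES = {"caused", "treatment_for"}


@dataclass(frozen=True)
class Entity:
    start: int  # character offset of the first character
    end: int    # character offset one past the last character
    label: str
    text: str


@dataclass(frozen=True)
class Relation:
    drug: Entity    # the medication mention
    target: Entity  # the disorder or function mention
    label: str

    def __post_init__(self):
        if self.label not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.label!r}")


def find_entity(text: str, mention: str, label: str) -> Entity:
    """Build an Entity from the first occurrence of `mention` in `text`."""
    if label not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {label!r}")
    start = text.index(mention)  # raises ValueError if the mention is absent
    return Entity(start, start + len(mention), label, mention)


def adverse_drug_events(relations):
    """Return (drug, disorder/function) pairs linked by a "caused" relation."""
    return [(r.drug.text, r.target.text) for r in relations if r.label == "caused"]


# Invented example sentence, not taken from the task data.
text = "Ibuprofen for my headache gave me nausea."
drug = find_entity(text, "Ibuprofen", "drug")
headache = find_entity(text, "headache", "disorder")
nausea = find_entity(text, "nausea", "disorder")
relations = [
    Relation(drug, headache, "treatment_for"),
    Relation(drug, nausea, "caused"),
]
print(adverse_drug_events(relations))  # [('Ibuprofen', 'nausea')]
```

In this sketch, Subtask 2a corresponds to producing the `Entity` objects, and Subtask 2b to producing both the entities and the `Relation` objects; only "caused" relations mark a disorder or function as an ADE.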
*Data*:
The data originates from social media platforms, e.g., patient fora and X (Twitter). We provide data in German and Japanese, and a few examples in French. The submitted systems will be evaluated on German, French and Japanese data. Please find more information here: https://healthlanguageprocessing.org/smm4h-2024/
Please use this form to register: https://forms.gle/7w4si27uJrCMiTyL8
Organizers of Subtask 2:
Pierre Zweigenbaum, Université Paris-Saclay, CNRS, LISN, France
Sebastian Möller, Technische Universität Berlin, DFKI GmbH, Germany
Roland Roller, DFKI GmbH, Germany
Philippe Thomas, DFKI GmbH, Germany
Eiji Aramaki, NAIST, Japan
Shoko Wakamiya, NAIST, Japan
Shuntaro Yada, NAIST, Japan
Katherine Yeh, Université Paris-Saclay, CNRS, LISN, France
Lisa Raithel, Technische Universität Berlin, Germany & Université Paris-Saclay, CNRS, LISN, France
--
Pierre Zweigenbaum
Senior Researcher
Université Paris-Saclay, CNRS, LISN
PrivateNLP 2024: Fifth Workshop on Privacy in Natural Language Processing at ACL 2024
1st Call For Papers
ACL PrivateNLP is a full day workshop taking place on August 15, 2024 in conjunction with ACL 2024.
Workshop website: https://sites.google.com/view/privatenlp/
Important Dates:
• Submission Deadline: May 17, 2024
• Acceptance Notification: June 17, 2024
• Camera-ready versions: July 01, 2024
• Workshop: August 15, 2024
Privacy-preserving data analysis has become essential in the age of Large Language Models (LLMs) where access to vast amounts of data can provide gains over tuned algorithms. A large proportion of user-contributed data comes from natural language, e.g., text transcriptions from voice assistants.
It is therefore important to curate NLP datasets while preserving the privacy of the users whose data is collected, and train LLMs that only retain non-identifying user data.
The workshop aims to bring together practitioners and researchers from academia and industry to discuss the challenges and approaches to designing, building, verifying, and testing privacy preserving systems in the context of Natural Language Processing.
Topics of interest include but are not limited to:
* Privacy in Large Language Models
* Generating privacy preserving test sets
* Inference and identification attacks
* Generating Differentially private derived data
* NLP, privacy and regulatory compliance
* Private Generative Adversarial Networks
* Privacy in Active Learning and Crowdsourcing
* Privacy and Federated Learning in NLP
* User perceptions on privatized personal data
* Auditing provenance in language models
* Continual learning under privacy constraints
* NLP and summarization of privacy policies
* Ethical ramifications of AI/NLP in support of usable privacy
* Homomorphic encryption for language models
Submissions:
Accepted papers will be presented orally or as posters and included in the workshop proceedings. Submissions are open to all, and are to be submitted anonymously. All papers will be refereed through a double-blind peer review process by at least three reviewers with final acceptance decisions made by the workshop organizers.
We'll be using OpenReview; the final submission link will be specified later.
Organizers:
Sepideh Ghanavati, University of Maine
Abhilasha Ravichander, Allen AI
Niloofar Mireshghallah, University of Washington
Ivan Habernal, Paderborn University
Seyi Feyisetan, Amazon
Patricia Thaine, Private AI
Contact us: privatenlp24-orga(a)lists.uni-paderborn.de