*** CMCL – 2nd Call for Papers***
The 13th edition of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2024) will be co-located with the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024).
Webpage: https://cmclorg.github.io/
Direct submission page: https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/CMCL
ARR commitment page: https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/CMCL_ARR_Commi…
*Workshop Description*
CMCL 2024 is a one-day workshop held in conjunction with ACL 2024. CMCL invites papers on cognitive modeling, cognitively-inspired natural language processing, and, more broadly, the alignment of language models with human cognition/perception. The 2024 workshop follows in the tradition of earlier meetings at ACL 2010, ACL 2011, NAACL-HLT 2012, ACL 2013, ACL 2014, NAACL 2015, EACL 2017, LSA 2018, NAACL 2019, EMNLP 2020, NAACL 2021, and ACL 2022.
*Scope and Topics*
The research interests/questions include, but are not limited to:
- Human-like language acquisition/learning: How is language acquisition in language models (LMs) (dis)similar to that of humans, and why?
- Contrasting/aligning NLP models with human behavior data: What do humans compute during language comprehension/production, and how/why?
- Linguistic probing of NLP models: How well do current language models understand/represent/generalize language behaviorally/internally?
- Linguistically-motivated data modeling/analysis: How can one quantify a particular aspect of language?
- Emergent communication/language: What are the sufficient conditions for the emergence of language?
A more formal description of the workshop scope is:
- Stochastic models of factors influencing a speaker's production or comprehension decisions.
- Models of semantic interpretation, including psychologically realistic notions of word and phrase meaning and composition.
- Incremental parsers for diverse grammar formalisms and their psychological plausibility.
- Models of speaker-specific linguistic adaptation and/or generalization.
- Models of first and second language acquisition and bilingual language processing.
- Behavioral tasks for better understanding neural models of linguistic representation.
- Models and empirical analysis of the relationship between mechanistic psycholinguistic principles and pragmatics or semantics.
- Models of lexical acquisition, including phonology, morphology, and semantics.
- Psychologically motivated models of grammar induction.
- Psychologically plausible models of lexical or conceptual representations.
- Models of language disorders, such as aphasia, dyslexia, or dysgraphia.
- Behavioral datasets or resources for modeling language processing or production in languages other than English.
- Models of language comprehension difficulty.
- Models of language learning and generalization.
- Models of linguistic information propagation and language evolution in communities.
- Cognitively-motivated models of discourse and dialogue.
*Invited Speakers*
Aida Nematzadeh (Google DeepMind)
Frank Keller (University of Edinburgh)
*Important Dates*
- May 17, 2024: Paper submission/commitment deadline (cf. May 15, 2024: notification of ACL 2024)
- June 17, 2024: Notification of acceptance
- July 1, 2024: Camera-ready paper due
- August 15, 2024: Workshop date
Deadlines are at 11:59 pm AOE.
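For authors planning around the deadline: AoE (Anywhere on Earth) corresponds to the UTC-12 offset, so the cutoff falls later in most local time zones. The following is a minimal Python sketch of the conversion, using the May 17, 2024 submission deadline listed above. The helper name `aoe_deadline_in_utc` is our own and serves only as an illustration.

```python
from datetime import datetime, timedelta, timezone

# Anywhere on Earth (AoE) is the UTC-12 offset.
AOE = timezone(timedelta(hours=-12), name="AoE")

def aoe_deadline_in_utc(year: int, month: int, day: int) -> datetime:
    """Return 11:59 pm AoE on the given date, expressed in UTC (hypothetical helper)."""
    deadline_aoe = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return deadline_aoe.astimezone(timezone.utc)

# Paper submission/commitment deadline from the list above.
submission_deadline_utc = aoe_deadline_in_utc(2024, 5, 17)
print(submission_deadline_utc.isoformat())
```

Calling `astimezone()` with another `tzinfo` on the result gives the deadline in any local time zone.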
*Workshop submissions*
CMCL accepts direct submissions through the OpenReview site: https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/CMCL
We also accept papers already reviewed in the ACL Rolling Review (ARR) February cycle or earlier: https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/CMCL_ARR_Commi…
CMCL does not need to be listed as a preferred venue in the original ARR submission.
The detailed submission flow and schedule are shown on our workshop webpage: https://cmclorg.github.io/
*Submission types*
We invite three types of submissions:
(1) Archival regular workshop submissions that present original research in either long (8 pages + references) or short (4 pages + references) paper format.
(2) Non-archival submissions of extended abstracts that present preliminary results (from 2 to 4 pages + references).
(3) Non-archival cross-submission of long/short papers that present relevant research submitted/published elsewhere (including ACL "Findings of..." papers).
- Only regular workshop papers submitted via (1) will be included in the proceedings, but all types of papers will have a presentation opportunity in the workshop.
- Submissions must be formatted using the ACL style template (https://github.com/acl-org/acl-style-files) and be submitted as a PDF file.
- We adhere to the ACL anonymity policy: https://www.aclweb.org/adminwiki/index.php/ACL_Anonymity_Policy
- We are not hosting a shared task this year.
*Workshop Organizers*
Tatsuki Kuribayashi (MBZUAI, tatsuki.kuribayashi(a)mbzuai.ac.ae)
Giulia Rambelli (University of Bologna, giulia.rambelli4(a)unibo.it)
Ece Takmaz (University of Amsterdam, ece.takmaz(a)uva.nl)
Philipp Wicke (Ludwig Maximilian University LMU, pwicke(a)cis.lmu.de)
Yohei Oseki (University of Tokyo, oseki(a)g.ecc.u-tokyo.ac.jp)
*Program Committee*
Abdellah Fourtassi (Aix-Marseille University)
Adina Williams (FAIR)
Afra Alishahi (Tilburg University)
Aniello De Santo (University of Utah)
Carina Kauf (MIT)
Cassandra Jacobs (University of Buffalo)
Christos Christodoulopoulos (Amazon)
Cory Shain (MIT)
Ethan Wilcox (ETH Zurich)
Frances Yung (Saarland University)
Fred Mailhot (Dialpad)
Gianluca Lebani (University Ca' Foscari Venezia)
James Michaelov (The University of California San Diego)
John Hale (University of Georgia)
Laurent Prévot (Aix-Marseille University)
Lisa Beinborn (VU Amsterdam)
Ludovica Pannitto (University of Trento)
Micha Elsner (Ohio State University)
Nora Hollenstein (University of Copenhagen)
Rachel Ryskin (University of California Merced)
Raquel Garrido Alhama (Tilburg University)
Richard Futrell (UC Irvine Language Science)
Robert Frank (Yale University)
Ryo Yoshida (The University of Tokyo)
Samar Husain (IIT Delhi)
Sandra Kuebler (Indiana University)
Tal Linzen (New York University)
Ted Briscoe (MBZUAI)
Tiago Pimentel (ETH Zurich)
Tim Hunter (UCLA)
Vera Demberg (Saarland University)
William Schuler (Ohio State University)
Yao Yao (Hong Kong Polytechnic University)
*Website*
https://cmclorg.github.io/
*Sponsoring Institutions*
Japan Society for the Promotion of Science
*Contact*
cmclorganizers2024(a)gmail.com
The University of Amsterdam has a fully funded PhD position on AI/NLP/IR
for information access.
We seek an ambitious PhD student with a background in artificial
intelligence, natural language processing and information retrieval.
Your focus will be on large language models (LLMs) for information
access. How can we search specific collections, including full text,
metadata, and multimodal content? How can we support complex search
tasks and practices, such as scholarly research on cultural data, and
the research and workflow of investigative journalism?
The PhD position is one of four PhD vacancies in the digital
Humanities, Artificial Intelligence, Cultural Heritage (HAICu) project,
a large national science agenda project funded by the Netherlands
Organization for Scientific Research. We are one of the best European
and global places to study AI, and you will work together with other AI
and Digital Humanities researchers, and a range of external partners on
scientific breakthroughs. HAICu deploys artificial intelligence (AI) to
make digital heritage collections more accessible, and the extraordinary
challenges of cultural heritage provide a unique opportunity to push the
boundaries of AI. The PhD position is fully funded and you will be
employed by the University of Amsterdam for four years (full-time, with
all employment benefits) and are expected to complete a PhD thesis
within this period.
Are you interested? Strong candidates with an AI/NLP/IR background are
encouraged to apply by May 15. Details are in:
https://vacatures.uva.nl/UvA/job/4PhDs/792167402/ (Project #1).
Feel free to reach out with questions or comments!
Jaap Kamps
We invite you to participate and submit your work to the First Workshop
on Data Contamination (CONDA) co-located with ACL 2024 in Bangkok, Thailand.
Data contamination, where evaluation data is inadvertently included in
pre-training corpora of large-scale models, and language models (LMs) in
particular, has become a concern in recent times. The growing scale of
both models and data, coupled with massive web crawling, has led to the
inclusion of segments from evaluation benchmarks in the pre-training
data of LMs. The scale of internet data makes it difficult to prevent
this contamination from happening, or even detect when it has happened.
Crucially, when evaluation data becomes part of pre-training data, it
introduces biases and can artificially inflate the performance of LMs on
specific tasks or benchmarks. This poses a challenge for fair and
unbiased evaluation of models, as their performance may not accurately
reflect their generalization capabilities.
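To make the notion concrete, one simple heuristic for flagging possible contamination is to look for long word n-grams shared between a benchmark example and a pre-training document. The sketch below illustrates only that heuristic and is not a method prescribed by the workshop. The function names, the whitespace tokenization, and the n-gram length of 8 are all hypothetical choices.

```python
def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of lowercase, whitespace-tokenized word n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(benchmark_example: str, corpus_document: str, n: int = 8) -> float:
    """Fraction of the example's n-grams that also occur in the document."""
    example_ngrams = word_ngrams(benchmark_example, n)
    if not example_ngrams:
        return 0.0  # example shorter than n tokens: nothing to compare
    shared = example_ngrams & word_ngrams(corpus_document, n)
    return len(shared) / len(example_ngrams)

# Toy data (invented for illustration only).
example = "the quick brown fox jumps over the lazy dog near the river bank"
crawled_page = "intro text the quick brown fox jumps over the lazy dog near the river bank outro"
unrelated_page = "a completely different document about cooking pasta with tomato sauce and basil leaves"

print(overlap_ratio(example, crawled_page))    # example appears verbatim in the page
print(overlap_ratio(example, unrelated_page))  # no shared 8-grams
```

A high ratio only suggests that an example may have been seen during pre-training. Paraphrased or reformatted benchmark data would escape an exact-match check like this one.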
Although a growing number of papers and state-of-the-art models mention
issues of data contamination, there is no agreed-upon definition or
standard methodology to ensure that a model does not report results on
contaminated benchmarks. Addressing data contamination is a shared
responsibility among researchers, developers, and the broader community.
By adopting best practices, increasing transparency, documenting
vulnerabilities, and conducting thorough evaluations, we can work
towards minimizing the impact of data contamination and ensuring fair
and reliable evaluations.
We welcome paper submissions on all topics related to data
contamination, including but not limited to:
* Definitions, taxonomies, and gradings of contamination
* Contamination detection (both manual and automatic)
* Community efforts to discover, report, and organize contamination events
* Documentation frameworks for datasets or models
* Methods to avoid data contamination
* Methods to forget contaminated data
* Scaling laws and contamination
* Memorization and contamination
* Policies to avoid impact of contamination in publication venues and
open source communities
* Reproducing and attributing results from previous work to data
contamination
* Survey work on data contamination research
* Data contamination in other modalities
*Submission Instructions*
We welcome two types of papers: regular workshop papers and non-archival
submissions. Regular workshop papers will be included in the workshop
proceedings. All submissions must be in PDF format and made through
OpenReview.
* Regular workshop papers: Authors can submit papers up to 8 pages,
with unlimited pages for references. Authors may separately submit up
to 100 MB of supplementary materials, as well as their code for
reproducibility. All submissions undergo a double-blind single-track
review. Best Paper Award(s) will be given based on nomination by the
reviewers. Accepted papers will be presented as posters with the
possibility of oral presentations.
* Non-archival submissions: Cross-submissions are welcome. Accepted
papers will be presented at the workshop but not included in the
workshop proceedings. Papers must be in PDF format and will be
reviewed in a double-blind fashion by workshop reviewers. We also
welcome extended abstracts (up to 2 pages) of papers that are work
in progress, under review or to be submitted to other venues. Papers
in this category need to follow the ACL format.
In addition to papers submitted directly to the workshop, which will be
reviewed by our Programme Committee, we also accept papers reviewed
through ACL Rolling Review and committed to the workshop. Please check
the relevant dates for each type of submission.
*Important dates*
* Relevant deadlines to consider when submitting your paper are:
* Paper submission deadline: May 17 (Friday), 2024
* ARR pre-reviewed commitment deadline: TBD, 2024
* Notification of acceptance: June 17 (Monday), 2024
* Camera-ready paper due: July 1 (Monday), 2024
* Workshop date: August 16, 2024
*Sponsors*
* AWS AI and Amazon Bedrock
* HuggingFace
* Google
*Contact*
* Website: https://conda-workshop.github.io/
* Email: conda-workshop(a)googlegroups.com
*Organizers*
Oscar Sainz, University of the Basque Country (UPV/EHU)
Iker García Ferrero, University of the Basque Country (UPV/EHU)
Eneko Agirre, University of the Basque Country (UPV/EHU)
Jon Ander Campos, Cohere
Alon Jacovi, Bar Ilan University
Yanai Elazar, Allen Institute for Artificial Intelligence and University
of Washington
Yoav Goldberg, Bar Ilan University and Allen Institute for Artificial
Intelligence
--
Eneko Agirre
HiTZ Hizkuntza Teknologiako Zentroa - Ixa Taldea
Centro Vasco de Tecnología de la Lengua - Grupo Ixa
Basque Center for Language Technology - Ixa NLP Group
University of the Basque Country (UPV/EHU)
hitz.ehu.eus/eneko <https://hitz.ehu.eus/eneko>
The research group Data Mining and Machine Learning at the University of Vienna is looking for a Postdoctoral Researcher in Natural Language Processing.
Possible research topics are:
- Analysis, explainability and interpretability of large language models
- Linguistic capabilities of large language models
- Extraction of structured information from text, linking knowledge graphs and language
- Weak supervision of natural language processing models
- Multimodal and multilingual deep learning
For more details see:
https://jobs.univie.ac.at/job/Postdoctoral-Researcher-in-Natural-Language-P…
--
Univ.-Prof. Dr. Benjamin Roth
Digitale Textwissenschaften
Universität Wien
Kolingasse 14
Raum 5.17
1090 Wien
email: benjamin.roth(a)univie.ac.at
tel: +43 14277 79513
virtual coffee (Tuesday 2pm CEST): https://www.benjaminroth.net/virtual_coffee
video call: https://univienna.zoom.us/j/93796507934?pwd=VFg5dW9JbStPUml6WFVtOWJXV3phQT09
web: https://dm.cs.univie.ac.at/team/person/112089/
Dear all,
Please find below more information about a conference we are organising.
The conference, which will take place on 21-22 November 2024 at the University of Liège (Belgium), is meant for researchers interested in metaphor and national identity discourse from different perspectives (linguistics, cognitive science, etc.).
Regards,
_____________
CALL FOR PAPERS
METAPOL3: DISCOURSE, IDEOLOGIES AND SUB-STATE NATIONALISM
21-22 November 2024, University of Liège, Belgium
Despite attempts to discourage sub-state nationalism and keep the political map of the world in its present form, the struggle for separate identities still remains a serious issue in modern-day countries. Sub-state nationalism has led to violent conflicts in postcolonial Africa, the former Yugoslavia and Soviet Union, and has been one of the main causes of political upheaval in Belgium, Britain, Spain, China, etc.
As Anderson (1983) indicates, nations are imagined communities whose formation involves the spread of discourses aimed at establishing a clear difference between in-groups and out-groups. While national identity has attracted a fair amount of scholarly interest in the field of political science, it was only in the early 1990s that studies emphasizing the discursive manifestations of nationalism began to be conducted (Wodak & Matouschek, 1993; Wodak & Reisigl, 1999; Wodak et al. 1999).
Over the last two decades, the study of political discourse has been consolidated by metaphor analysis (Musolff, 2006; 2016; Saric & Stanojevic, 2019). Even though great strides have been made in political discourse analysis, research on sub-state nationalism remains scant. In an attempt to fill this gap, we are organizing this conference, which we hope will bring together researchers from different fields (linguistics, sociology, political science, cognitive science) who are interested in discourse, metaphor and nationalist ideologies.
Topics of interest include, but are not limited to:
- The discursive construction of (sub-state) national identity
- Characteristics of separatist discourse
- Conceptualisations of the body politic
- Metaphor scenarios in national identity discourse
- Visual metaphor in (sub-state) nationalist discourse
- Gender and metaphor in (sub-state) nationalist discourse
- ...
KEYNOTE SPEAKER
Professor Martin Reisigl, University of Vienna
SUBMISSION OF PROPOSALS
The conference will be held in person at the University of Liège, Belgium.
Each presentation will last 20 minutes, followed by 10 minutes of Q&A.
Conference proposals should include:
- A title (max. 15 words)
- Key words (max. 5 words)
- An abstract (300 words, excluding references)
Guide for submitting a proposal
To submit a proposal, you must create a user account on sciencesconf.org and log in as a registered user.
You can create an account either directly on the SciencesConf portal or by clicking the Login button at the top right of the conference website (https://metapol3.sciencesconf.org/).
Once connected, access "My submissions" and then go to New submission > Submit an abstract.
Individual and co-authored papers in English or French are welcome.
All abstracts will go through double-blind peer review.
IMPORTANT DATES
Submission deadline: 15/05/24
Notification of acceptance: 15/07/24
CONTACT AND MORE INFORMATION
For more information, please visit our website (https://metapol3.sciencesconf.org/), and if necessary, do not hesitate to email us at metapol3(a)sciencesconf.org.
The following Research Fellow position is available as part of the Edinburgh Clinical NLP Group and the Advanced Care Research Centre at the University of Edinburgh. The deadline is 19th of April 2024.
https://www.jobs.ac.uk/job/DGS698/research-fellow-in-clinical-natural-langu…
----------------------------------------------------------------
Dr. Beatrice Alex
Senior Lecturer and Chancellor’s Fellow
University of Edinburgh
Head of the Edinburgh Language Technology Group
Co-lead of the Edinburgh Clinical NLP Group
Registration Deadline 10 April
The 2nd Arabic Named Entity Recognition Shared Task, at ArabicNLP’24
https://dlnlp.ai/st/wojood/
Dataset: Wojood-Fine <https://aclanthology.org/2023.arabicnlp-1.25/> New version: Arabic Fine-Grained Entity Recognition (Wojood + Subtypes of entity types).
Subtask-1 (Closed-Track Flat Fine-Grain NER): We provide the Wojood-Fine Flat train (70%) and development (10%) datasets. The final evaluation will be on the test set (20%). External data is not allowed .... (read more <https://dlnlp.ai/st/wojood/>).
Subtask-2 (Closed-Track Nested Fine-Grain NER): This subtask is similar to Subtask-1: we provide the Wojood-Fine Nested train (70%) and development (10%) datasets. The final evaluation will be on the test set (20%) .... (read more <https://dlnlp.ai/st/wojood/>).
Subtask-3 (Open-Track NER - Gaza War): This subtask allows participants to reflect on the utility of NER in the context of real-world events, permits the use of external resources, and encourages the use of generative models in different ways (fine-tuned, zero-shot learning, in-context learning, etc.). The goal of focusing on generative models in this particular subtask is to help the Arabic NLP research community better understand the capabilities and performance gaps of LLMs in information extraction, an area currently understudied.
We provide development and test data related to the current War on Gaza. This is motivated by the assumption that discourse about recent global events will involve mentions from a different data distribution. For this subtask, we include data from five different news domains related to the War on Gaza, but we keep the names of the domains hidden. Participants will be given a development dataset (10K tokens, 2K from each of the five domains) and a testing dataset (50K tokens, 10K from each domain). Both development and testing sets are manually annotated with fine-grain named entities using the same annotation guidelines used in Subtask1 and Subtask2 (also described in Liqreina et al., 2023). .... (read more <https://dlnlp.ai/st/wojood/>).
BASELINES
Two baseline models trained on WojoodFine (flat and nested) are provided (See Liqreina et al., 2023 <https://aclanthology.org/2023.arabicnlp-1.25/>). The code used to produce these baselines is available on GitHub <https://github.com/SinaLab/ArabicNER>.
Subtask                             Precision   Recall   Average Micro-F1
Flat Fine-Grain NER (Subtask 1)     0.8870      0.8966   0.8917
Nested Fine-Grain NER (Subtask 2)   0.9179      0.9279   0.9229
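For reference, micro-averaged scores such as the baseline figures above are typically obtained by pooling true positives, false positives, and false negatives over all entity mentions before computing precision, recall, and F1. The sketch below assumes exact-match scoring over (start, end, label) spans. It is an illustration only, not the official scoring script: the function name, the span representation, and the toy labels are hypothetical.

```python
# Assumed span format: (start token index, end token index, entity label).
Span = tuple[int, int, str]

def micro_prf(gold: list[set[Span]], predicted: list[set[Span]]) -> tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 with exact span matching."""
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, predicted):
        tp += len(gold_spans & pred_spans)   # spans predicted exactly right
        fp += len(pred_spans - gold_spans)   # predicted spans not in gold
        fn += len(gold_spans - pred_spans)   # gold spans that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example with two sentences (invented data).
gold = [{(0, 1, "PERS"), (3, 4, "GPE")}, {(2, 2, "ORG")}]
pred = [{(0, 1, "PERS"), (3, 4, "LOC")}, {(2, 2, "ORG"), (5, 6, "DATE")}]
p, r, f = micro_prf(gold, pred)
print(f"precision={p:.4f} recall={r:.4f} micro-F1={f:.4f}")
```

Because each sentence holds a set of spans, nested mentions can be scored the same way as flat ones, provided each nested entity is listed as its own span.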
GOOGLE COLAB NOTEBOOKS
To allow you to experiment with the baseline, we authored four Google Colab notebooks that demonstrate how to train and evaluate our baseline models.
[1] Train Flat Fine-Grain NER <https://gist.github.com/mohammedkhalilia/72c3261734d7715094089bdf4de74b4a>: This notebook can be used to train our ArabicNER model on the flat Fine-grain NER task using the sample Wojood_Fine data.
[2] Evaluate Flat Fine-Grain NER <https://gist.github.com/mohammedkhalilia/c807eb1ccb15416b187c32a362001665>: This notebook uses the trained model saved by the notebook above to perform evaluation on an unseen dataset.
[3] Train Nested Fine-Grain NER <https://gist.github.com/mohammedkhalilia/a4d83d4e43682d1efcdf299d41beb3da>: This notebook can be used to train our ArabicNER model on the nested Fine-grain task using the sample Wojood data.
[4] Evaluate Nested Fine-Grain NER <https://gist.github.com/mohammedkhalilia/9134510aa2684464f57de7934c97138b>: This notebook uses the trained model saved by the notebook above to perform evaluation on an unseen dataset.
REGISTRATION
Participants need to register via this form (NERSharedTask 2024) <https://docs.google.com/forms/d/1ISMILgQYfUug3XuDpxFmuPASXkWaduYOUc3xOZuGwq…>. Participating teams will be provided with common training and development datasets. No external manually labelled datasets are allowed. A blind test set will be used to evaluate the output of the participating teams. Each team is allowed a maximum of 3 submissions. All teams are required to report results on the development and test sets (after results are announced) in their write-ups.
FAQ
For any questions related to this task, please check our Frequently Asked Questions <https://docs.google.com/document/d/1W_13FRpP3NbDx_ALYJWA3-ESXPRVomOjNovUuYf…>
IMPORTANT DATES
- February 25, 2024: Shared task announcement.
- March 1, 2024: Release of training data, development sets, scoring script, and Codalab links.
- April 10, 2024: Registration deadline.
- April 26, 2024: Test set made available.
- May 3, 2024: Codalab Test system submission deadline.
- May 10, 2024: Shared task system paper submissions due.
- June 17, 2024: Notification of acceptance.
- July 1, 2024: Camera-ready version.
- August 16, 2024: ArabicNLP 2024 conference in Thailand.
CONTACT
For any questions related to this task, please contact the organizers directly using the following email address: NERSharedtask(a)gmail.com <mailto:NERSharedtask@gmail.com> .
(Re-sending due to the initial attempt bouncing, apologies in advance if
you receive multiple copies!)
Professor and co-principal investigator Najoung Kim <https://najoung.kim/> of
the Boston University Department of Linguistics <https://ling.bu.edu/> (with
active affiliations in Computer Science <https://www.bu.edu/cs/> and Data
Science <https://www.bu.edu/cds-faculty/>) is seeking a Postdoctoral
Associate to join the Professor's TIN Lab in Fall 2024. The successful
applicant will have a background in one of the following disciplines:
Artificial Intelligence, Natural Language Processing, Computational
Linguistics, Cognitive Science, or other relevant areas. The postdoctoral
associate will work closely with the PIs (Najoung Kim, Boston University &
Sebastian Schuster, UCL) and will be responsible for co-leading a
collaborative research project minimally involving two PhD-level graduate
students.
*Responsibilities*
The postdoctoral associate’s primary responsibility is to lead a research
project, the aim of which is to develop detailed evaluation protocols for
AI technology applied to a consequential task of real-world
complexity—specifically, AI in the domain of academic AI research—and to
apply this evaluation to estimate the capacities of the current best
models. We expect there to be a substantial system development component
(for building the baselines) as well as a substantial human study design
component (for a rigorous evaluation of the system outputs), where
different expertise can be contributed by different members of the research
team (minimally, two PIs, the postdoctoral associate, and two PhD-level
graduate students).
The postdoctoral associate is also invited to engage with the broader
academic community at BU, spanning Linguistics, Computer Science, and the
Center for Computing & Data Sciences, and academic communities in Boston
and New England. There will also be regular opportunities to connect with
the community at UCL.
*Qualifications*
The postdoctoral associate needs to hold a PhD degree at the start of their
appointment. Hands-on experience in either: (1) building systems that use
language models as a core component to solve complex tasks, or (2) leading
human annotation efforts or human behavioral experiments is required.
Publications or prior research experience in one of the following topic
areas are desired, but not required:
- Compositional generalization
- Data-efficient training methods (e.g., BabyLM-scale)
- Language model evaluation
- General-purpose prompting techniques
*Location*
The postdoctoral associate will be based at Boston University. They will be
physically located in one of the office spaces in 621 Commonwealth Avenue
or 665 Commonwealth Avenue, subject to space availability.
*Duration*
This is a one-year position with the base expectation that it will renew
for a second year, conditioned on satisfactory progress.
*Compensation*
The 12-month compensation for this position will be $90K-100K USD,
commensurate with experience.
*Application*
Candidates must submit a CV, two of their most significant research
contributions, and contact details for two references at the time of application. We
will only contact the reference writers for letters of recommendation when
we decide to interview the candidate. Application materials should be
uploaded as individual PDF files through Academic Jobs Online at
https://academicjobsonline.org/ajo/jobs/27426. We will give full
consideration to applications received by April 15, 2024: two weeks from
the job posting date. Afterwards, applications will be considered on a
rolling basis until the position is filled.
Inquiries should be directed to najoung(a)bu.edu and s.schuster(a)ucl.ac.uk.
--
Najoung Kim
Assistant Professor
Department of Linguistics & Computer Science, Boston University
https://najoung.kim 🍪
*Apologies for cross-posting*
Fifth Workshop on Gender Bias in Natural Language Processing
Bangkok, Thailand, on August 16, 2024
https://genderbiasnlp.talp.cat/ <https://kemt2024.wixsite.com/home>
Second Call for Papers
Gender bias, among other demographic biases (e.g. race, nationality, religion), in machine-learned models is of increasing interest to the scientific community and industry. Models of natural language are highly affected by such biases, which are present in widely used products and can lead to poor user experiences. There is a growing body of research into improved representations of gender in NLP models. Key example approaches are to build and use balanced training and evaluation datasets (e.g. Webster et al., 2018), and to change the learning algorithms themselves (e.g. Bolukbasi et al., 2016). While these approaches show promising results, there is more to do to solve identified and future bias issues. In order to make progress as a field, we need to create widespread awareness of bias and a consensus on how to work against it, for instance by developing standard tasks and metrics. Our workshop provides a forum to achieve this goal.
Topics of interest
We invite submissions of technical work exploring the detection, measurement, and mediation of gender bias in NLP models and applications. Other important topics are the creation of datasets, identifying and assessing relevant biases or focusing on fairness in NLP systems. Finally, the workshop is also open to non-technical work addressing sociological perspectives, and we strongly encourage critical reflections on the sources and implications of bias throughout all types of work.
In addition, this year we are organising a Shared Task on Gender Bias Machine Translation evaluation.
Paper Submission Information
Submissions will be accepted as short papers (4-6 pages) and as long papers (8-10 pages), plus additional pages for references, following the ACL 2024 guidelines. Supplementary material can be added, but should not be central to the argument of the paper. Blind submission is required.
Each paper should include a statement which explicitly defines (a) what system behaviors are considered as bias in the work and (b) why those behaviors are harmful, in what ways, and to whom (cf. Blodgett et al. (2020)). More information on this requirement, which was successfully introduced at GeBNLP 2020, can be found on the workshop website. We also encourage authors to engage with definitions of bias and other relevant concepts such as prejudice, harm, discrimination from outside NLP, especially from social sciences and normative ethics, in this statement and in their work in general.
Non-archival option
The authors have the option of submitting research as non-archival, meaning that the paper will not be published in the conference proceedings. We expect these submissions to describe the same quality of work and format as archival submissions.
Important dates.
May 10, 2024: Workshop Paper Due Date
June 5, 2024: Notification of Acceptance
June 25, 2024: Camera-ready papers due
August 16, 2024: Workshop Dates
Keynote Speakers.
Isabelle Augenstein, University of Copenhagen
Hal Daumé III, University of Maryland and Microsoft Research NYC
Organizers.
Christine Basta, Alexandria University
Marta R. Costa-jussà, FAIR, Meta
Agnieszka Falénska, University of Stuttgart
Seraphina Goldfarb-Tarrant, Cohere
Debora Nozza, Bocconi University
Dear All,
The Association of Cyber Forensics and Threat Investigators invites you to
join our next webinars:
"DFIR Stream 0x6" on Tuesday, April 16 · 4:00 – 5:00 pm (GMT+00:00) UK Time
Title: Operationalizing Machine Learning for Networks,
by Shinan Liu, University of Chicago.
Register@ https://www.acfti.org/news-events/dfir-stream-0x6
"DFIR Stream 0x7" on Tuesday, May 7 · 1:30 – 2:30 pm (GMT+00:00) UK Time
Title: Malware Detection in Memory Forensics: Open Challenges and Issues,
by Dr. Ricardo J. Rodríguez, University of Zaragoza.
Register@ https://www.acfti.org/news-events/dfir-stream-0x7
"DFIR Stream 0x8" on Monday, May 13 · 4:00 – 5:00 pm (GMT+00:00) UK Time
Title: Low-Level Hardware Information Assisted Approach Towards System
Security,
by Dr. Chen Liu, Clarkson University.
Register@ https://www.acfti.org/news-events/dfir-stream-0x8
======Housekeeping Notes======
- Note that this event is online only, so you must register to receive
a link to connect. Due to limited availability, we kindly ask you to
register as soon as possible to ensure your participation in the webinar of
your choice.
- For students, a certificate of successful participation in the event will
be delivered free of charge upon request (after verifying attendance), indicating
the number of hours of the seminar (please make sure that you add the
correct name in the registration form). This should be sufficient for those
participants who plan to request ECTS recognition from their home
university.
Join Us & stay tuned! #CyberSecurity #MemoryForensics #MachineLearning
#AnomalyDetection
Finally, I would like to remind you that the call for speakers is currently
open on the dedicated DFIR stream website,
https://dfir.stream/call-for-guest-speakers
To get more news about our events, please join our low-traffic announcement
group @ https://groups.google.com/g/acfti
This event is brought to you by CFTIRC (Cyber Forensics & Threat
Investigations Research Community).
Best regards,
Andrew Zayin Ph.D., CISSP, CISM, CRISC, CDPSE, PMP
ACFTI Secretariat