August 2024 - Corpora

Data released for PLABA 2024 @ TREC
by Ondov, Brian 22 Aug '24


Training and testing data are now available for the 2024 Plain Language Adaptation of Biomedical Abstracts <https://bionlp.nlm.nih.gov/plaba2024/> track at TREC <https://trec.nist.gov>.

The PLABA task is to rewrite biomedical abstracts, line by line, in plain language for the general public. This includes:
- Simplifying and splitting sentences
- Replacing or omitting jargon
- Providing further background where appropriate

This year, PLABA also has a new subtask: expert term replacement. This includes:
- Identifying expert terms
- Classifying how they should be replaced (explained, substituted, generalized, exemplified, or omitted)
- Generating replacement text for the terms

As this subtask does not require generation of complete documents, we hope it will allow for participation with fewer compute resources.

By participating in PLABA, you will get:
- Access to a unique, manually annotated expert term replacement dataset
- Manual evaluations of system outputs by science communication experts

Further information
Task descriptions: https://bionlp.nlm.nih.gov/plaba2024/
Registration: https://ir.nist.gov/evalbase/
Mailing list: https://groups.google.com/g/plaba2024

Due dates
Aug 30: Task 2 (abstract rewriting) submissions due
Sep 15: Task 1 (term replacement) submissions due

We look forward to your submissions!


[CFP] NLP4Science workshop at EMNLP 2024
by Roi Reichart 22 Aug '24


Background and Scope
---------------------
The NLP4Science workshop invites submissions of papers that leverage NLP tools and methodologies for human-focused scientific modeling. As language is closely tied to human behavior, cognition, and communication, many researchers are turning to NLP to decode intricate patterns and gain meaningful insights into humanity. NLP tools and Large Language Models (LLMs) are reshaping research methodologies across a wide range of fields, including social science, psychology, psychiatry, psycholinguistics, health, neuroscience, finance, behavioral economics, political science, and more.

This workshop provides an opportunity for researchers to focus on how NLP tools can be harnessed for scientific modeling and exploration. We offer a platform for exchanging knowledge, methodologies, findings, tools, and resources, fostering synergies across different scientific domains. The workshop aims to facilitate discussions that encompass shared scientific methodologies across various disciplines. Topics will include, but are not limited to: principles of NLP-driven human-centered scientific modeling, state-of-the-art methods for statistically robust evaluation of NLP models in science, experimental design, causal inference, and the development and application of model interpretation and causality-based methods for text models in science.

The workshop will be held as part of EMNLP in November 2024 in Miami, Florida.

===Important Dates===
Paper submission deadline: August 16, 2024
Notification of acceptance: September 5, 2024
Camera-ready papers due: October 3, 2024

===Submission Topics===
We solicit novel papers on topics including, but not limited to, the following:
- NLP and LLMs for scientific modeling in various fields
  - Social science
  - Psychology
  - Psychiatry
  - Psycholinguistics
  - Health
  - Neuroscience
  - Finance
  - Behavioral economics
  - Political science
- Statistically robust evaluation of NLP models in science
- NLP Experimental design
- Causal inference
- Model Interpretability

===Submission Information===
We will be using the EMNLP Submission Guidelines for the workshop. Authors are invited to submit a full paper of up to 8 pages of content with unlimited pages for references. We also invite short papers of up to 4 pages of content, with unlimited pages for references. Final camera-ready versions of accepted papers will be given an additional page of content to address reviewer comments.

Submit your work here: https://openreview.net/group?id=EMNLP/2024/Workshop/NLP4Science

===Contact Information===
- Workshop contact email address: nlp4science [at] gmail.com
- Workshop Twitter: @nlp4science

Read more: https://sites.google.com/view/nlp4science/home


2nd CFP: Natural Language Processing for Digital Humanities NLP4DH @ EMNLP 2024
by Mika Hämäläinen 22 Aug '24


The 4th International Conference on Natural Language Processing for Digital Humanities will co-locate with EMNLP in Miami, USA! The proceedings will be published in the ACL Anthology. The event will take place on November 15-16, 2024.

https://www.nlp4dh.com/nlp4dh-2024

Submission deadline: September 1, 2024

The focus of NLP4DH is on applying natural language processing techniques to digital humanities research. The topics can be anything of digital humanities interest with a natural language processing or generation aspect.

Suitable NLP4DH topics include, but are not limited to:
- Text analysis and processing related to humanities using computational methods
- Dataset creation and curation for NLP (e.g. digitization, digitalization, datafication, and data preservation)
- Research on cultural heritage collections such as national archives and libraries using NLP
- NLP for error detection, correction, normalization and denoising data
- Generation and analysis of literary works such as poetry and novels
- Analysis and detection of text genres

Short papers can be up to 4 pages in length. Short papers can report on work in progress or a more targeted contribution such as software or partial results.

Long papers can be up to 8 pages in length. Long papers should report on previously unpublished, completed, original work.

Lightning talks can be submitted as 750-word abstracts. Lightning talks are suited for discussing ideas or presenting work in progress. Lightning talks will be published in lightning proceedings on Zenodo.

Accepted papers (short and long) will be published in the proceedings that will appear in the ACL Anthology. Accepted papers will also be given an additional page to address the reviewers’ comments. The length of a camera-ready submission can then be 5 pages for a short paper and 9 for a long paper, with an unlimited number of pages for references.

The authors of the accepted papers will be invited to submit an extended version of their paper to a special issue in the Journal of Data Mining & Digital Humanities <https://jdmdh.episciences.org/volume/view/id/593>.

Important dates
- Paper submission (full and short): September 1, 2024
- Notification of acceptance: September 22, 2024
- Camera-ready deadline: October 4, 2024
- Conference: November 15-16, 2024


Job Opening for Data Scientist with a focus on natural language processing
by Menno Van Zaanen 22 Aug '24


Application link: https://bit.ly/4fqI2Sl
Application deadline: 31 August 2024

The South African Centre for Digital Language Resources (SADiLaR) is looking for a data scientist with a focus on natural language processing (permanent position).

As a Data Scientist at the South African Centre for Digital Language Resources (SADiLaR) you will have the opportunity to initiate and lead projects focusing on Human Language Technology and Digital Humanities stemming from your own research interests. You will work closely together with a team of researchers as part of SADiLaR's extended network, both on your own and commissioned projects. Dissemination of project results at national and international conferences will be encouraged and supported.

This position is crucial for research and development in Human Language Technology and Digital Humanities, fields that form the essence of SADiLaR, which is a national Research Infrastructure supported by the Department of Science and Innovation. Read more about SADiLaR at https://www.sadilar.org.

Key responsibilities:
- Research: Research in the area of Human Language Technology and Digital Humanities.
- Project work: Initiating and contributing to Human Language Technology and Digital Humanities projects.
- Teaching: Teaching in the area of Human Language Technology and Digital Humanities.
- Mentorship: Mentorship of researchers in the field of Human Language Technology and Digital Humanities.

Minimum requirements:
- A PhD (NQF level 10) in one of the following fields: Computational Linguistics, Natural Language Processing, Human Language Technology, Digital Humanities, Data Science, Computer Science, Information Technology, Artificial Intelligence, or related fields. The PhD should have a focus on computational aspects of linguistics.
- A minimum of five (5) years' experience in the use of Python (other programming languages used within the computational linguistics or Digital Humanities domain can also be considered).
- Evidence of peer-reviewed academic publications.
- A minimum of three (3) years' experience as a supervisor/co-supervisor of students or playing a mentorship/supervising role for individuals.
- A minimum of three (3) years' experience with using and/or developing computational tools.
- A minimum of three (3) years' experience related to research within the domain of Language Technology or Digital Humanities.
- A minimum of one (1) year's experience related to teaching or training within the domain of Language Technology or Digital Humanities.

More information can be found at the application link. For informal inquiries please contact: Menno van Zaanen <menno.vanzaanen(a)nwu.ac.za>

--
Prof Menno van Zaanen
menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org


[CFP] The First Workshop and Shared Task on Multilingual Counterspeech Generation at COLING-2025
by mevallec＠ujaen.es 22 Aug '24


Background and Scope
---------------------
While interest in automatic approaches to Counterspeech generation has been steadily growing, including studies on data curation (Chung et al., 2019a; Fanton et al., 2021), detection (Chung et al., 2021a; Mathew et al., 2018), and generation (Tekiroglu et al., 2020; Chung et al., 2021b; Zhu and Bhat, 2021; Tekiroglu et al., 2022), the large majority of the published experimental work on automatic Counterspeech generation has been carried out for English. This is due both to the scarcity of non-English manually curated training data and to the crushing predominance of English in the generative Large Language Models (LLMs) ecosystem.

A workshop on exploring Multilingual Counterspeech Generation is proposed to promote and encourage research on multilingual approaches for this challenging topic. This workshop therefore aims to test monolingual and multilingual LLMs in particular, and Language Technology in general, on automatically generating counterspeech not only in English but also in languages with fewer resources. An important goal of the workshop will be to understand the impact of using LLMs, considering for example how to deal with pressing issues such as biases, hallucinated content, data scarcity or data contamination.

We seek to maximize the scientific and social impact of this workshop by promoting the creation of a community of researchers from diverse fields, such as computer and social sciences, as well as policy makers and other stakeholders interested in automatic counterspeech generation. By doing so we aim to gain a deeper understanding of how counterspeech is currently used to tackle abuse by individuals, activists, and organizations, and how Natural Language Processing (NLP) and Generation (NLG) may be best applied to counteract it.

Call for Papers
---------------------
We welcome submissions on topics including, but not limited to:
- Models and methods for generating counterspeech in different languages.
- Automatic Counterspeech generation for low-resource languages with scarce training data.
- Dialogue agents that use counterspeech to combat offensive messages that are directed to individuals or groups, targeted based on various aspects such as ideology, gender, sexual orientation and religion.
- Methods for human and automatic evaluation of counterspeech.
- Multidisciplinary studies providing different perspectives on the topic such as computer science, social science, psychology, etc.
- Development of taxonomies and quality datasets for counterspeech in multiple languages.
- Potentials and limitations (e.g., fairness, biases, hallucinated content) of applying different NLP methods, such as LLMs, to generate counterspeech.
- Social impact and empirical studies of counterspeech in social networks, including research on the effectiveness and consequences for users of using counterspeech to combat hate online.

Submission
---------------------
We welcome two types of papers: regular workshop papers and non-archival submissions. Regular workshop papers will be included in the workshop proceedings. All submissions must be in PDF format and made through START [https://softconf.com/coling2025/MCG25/]
- Regular workshop papers: Authors can submit papers up to 8 pages, with unlimited pages for references. Authors may submit up to 100 MB of supplementary materials separately and their code for reproducibility. All submissions undergo a double-blind, single-track review. Accepted papers will be presented as posters with the possibility of oral presentations.
- Non-archival submissions: Cross-submissions are welcome. Accepted papers will be presented at the workshop, but will not be included in the workshop proceedings.
  Papers must be in PDF format and will be reviewed in a double-blind fashion by workshop reviewers. We also welcome extended abstracts (up to 2 pages) of papers that are work in progress, under review or to be submitted to other venues. Papers in this category need to follow the COLING format.

Important Dates
---------------------
- Submission: November 20th, 2024
- Notification of Acceptance: December 2nd, 2024
- Camera-Ready Papers Due: December 10th, 2024

-----------------------------------------------------
Shared Task on Multilingual Counterspeech Generation
-----------------------------------------------------
In addition to paper contributions, we are organizing a shared task on multilingual counterspeech generation with the aim of bringing together in a central space current efforts, especially those for languages other than English. We envisage that the shared task will allow the community to study how to improve counterspeech generation for lower-resource languages and also to reinforce the strong body of research already existing for English. The counterspeech generated by participants should be respectful, non-offensive, and contain information that is specific and truthful with respect to the following targets: Jews, LGBT+, immigrants, people of color, women.

Data
---------------------
We release new data consisting of 597 Hate Speech-Counter Narrative (HS-CN) pairs. In this dataset, the HS are taken from MTCONAN [https://github.com/marcoguerini/CONAN/tree/master/Multitarget-CONAN], while the CN are newly generated. Together with each HS-CN pair, we also provide 5 background knowledge sentences, some of which are relevant for obtaining the Counter Narratives. The dataset is available in 4 different languages (Basque, English, Italian and Spanish) and divided into the following splits:
- Development: 100 pairs [AVAILABLE NOW!] [https://huggingface.co/datasets/LanD-FBK/ML_MTCONAN_KN]
- Train: 397 pairs [AVAILABLE in OCTOBER!]
- Test: 100 pairs [TBA]

In order to score the shared task participants, the CNs will be kept hidden during the shared task while the HS and the background knowledge will be released for participants to prepare their submissions.

The languages, Basque, English, Italian and Spanish, offer a varied spectrum of complexity including an agglutinative language isolate (Basque), two Romance languages (Italian, Spanish) and a Germanic one (English). The choice of languages reflects the linguistic expertise available among the organizers, which is required to successfully run the shared task.

Additional Multilingual Data
---------------------
Participants also have available the English, Italian, Basque and Spanish CONAN manually curated data:
- CONAN (English and Italian) [https://github.com/marcoguerini/CONAN/tree/master/CONAN]
- CONAN (Basque and Spanish) [https://huggingface.co/datasets/HiTZ/CONAN-EUS]
- CONAN-MT-SP (Spanish) [https://github.com/sinai-uja/CONAN-MT-SP] or [https://huggingface.co/datasets/SINAI/CONAN-MT-SP]
- Multitarget-CONAN (English) [https://github.com/marcoguerini/CONAN/tree/master/Multitarget-CONAN]

Participation
---------------------
The aim of the shared task is, given an HS (and optionally any additional knowledge the participants may like to use), to generate a CN that counteracts the HS. An example CN for each language appears below:
- English:
  + HS: We should ban all homosexuals.
  + CN: When will the love prosper and the hatred start to dissipate? I will not only respect my fellow LGBT+ people, I will promote their rights.
- Italian:
  + HS: Dovremmo espellere tutti gli omosessuali.
  + CN: Quand'è che l'amore prospererà e l'odio inizierà a dissiparsi? Non solo rispetterò i miei compagni LGBT+, ma promuoverò i loro diritti.
- Spanish:
  + HS: Deberíamos prohibir a todos los homosexuales.
  + CN: ¿Cuándo prosperará el amor y empezará a disiparse el odio? No sólo respetaré a mis compañeros LGBT+, sino que promoveré sus derechos.
- Basque:
  + HS: Homosexual guztiak debekatu beharko genituzke.
  + CN: Noiz hasiko da maitasuna irabazten eta gorrotoa desagertzen? LGBT+ pertsonak errespetatzeaz gain, haien eskubideak sustatuko ditut.

Participants will download the test HS for the 4 languages and generate at most three different CNs per HS for each language. The test window will last 5 days. Participants are allowed to use any resource (language model, data, etc.) to generate the CN.

Evaluation
---------------------
The CNs submitted by the participants will be evaluated:
- Using traditional automatic metrics as in Tekiroğlu et al. (2022), which include BLEU, ROUGE, Novelty and Repetition Rate.
- Using LLM as a Judge following the approach described in this paper: https://arxiv.org/abs/2406.15227

Important Dates
---------------------
- Test dataset release: October 21st, 2024
- Results submission: October 25th, 2024
- Results notification: November 10th, 2024
- Working papers submission: November 20th, 2024
- Notification of Acceptance: December 3rd, 2024
- Camera-Ready Papers Due: December 10th, 2024
- Workshop: January 19th, 2025

For more information you can join the Google group [https://groups.google.com/g/multilingual-cs-generation-coling2025] or visit our website [https://sites.google.com/view/multilang-counterspeech-gen/home]

Best regards,
The Multilingual Counterspeech Generation Workshop Organizers.
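
As an illustration of the participation rules stated above (four languages, at most three different CNs per HS in each language), a participant might sanity-check a run before submitting it. The sketch below is only that: the tuple layout (language, HS identifier, CN text), the lower-case language names, and the function name `check_submission` are assumptions made for this example, not the organizers' submission format.

```python
from collections import defaultdict

# The four task languages named in the call (lower-cased here by assumption).
LANGUAGES = {"basque", "english", "italian", "spanish"}
# "at most three different CNs per HS for each language"
MAX_CNS_PER_HS = 3


def check_submission(rows):
    """Return a list of human-readable problems found in a run.

    `rows` is an iterable of (language, hs_id, cn_text) tuples; this
    layout is hypothetical and chosen only for the sketch.
    """
    problems = []
    cns = defaultdict(list)
    for language, hs_id, cn_text in rows:
        if language not in LANGUAGES:
            problems.append(f"unknown language: {language!r}")
            continue
        cns[(language, hs_id)].append(cn_text.strip())

    # Check each (language, HS) group against the stated limits.
    for (language, hs_id), texts in sorted(cns.items()):
        if len(texts) > MAX_CNS_PER_HS:
            problems.append(
                f"{language}/{hs_id}: {len(texts)} CNs, limit is {MAX_CNS_PER_HS}"
            )
        if len(set(texts)) < len(texts):
            problems.append(f"{language}/{hs_id}: duplicate CNs")
        if any(not t for t in texts):
            problems.append(f"{language}/{hs_id}: empty CN")
    return problems
```

An empty list means none of these checks failed; it says nothing about the quality of the generated counter narratives, which is what the evaluation described above measures.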


**Deadline: 15 August** - Postdoctoral researcher in computational linguistics with a focus on grounding language in computer vision and robotics
by Simon Dobnik 22 Aug '24


[apologies for x-posting]

We are looking for a postdoctoral researcher in computational linguistics and/or natural language processing to work on the project "Beyond pixels and words: language technology generation and understanding of spatial language in interaction" (2023-01552) funded by the Swedish Research Council (VR).

Application deadline: August 15, 2024, 23:59 (CEST, UTC+2)

Project description: https://web103.reachmee.com/ext/I005/1035/job?site=7&lang=UK&validator=9b89… (English) and https://web103.reachmee.com/ext/I005/1035/job?site=6&lang=SE&validator=3038… (Swedish) and from my personal page (coming soon)

I am looking for candidates with a strong background in computational linguistics, natural language processing (or neighbouring fields such as computer vision and robotics) and machine learning, ideally with experience of computational semantics, language modelling and working with multi-modal representations.

Best regards,
Simon

--
Simon Dobnik
Professor of Computational Linguistics
CLASP & FLoV, University of Gothenburg
https://www.gu.se/en/about/find-staff/simondobnik


SEARCH SOLUTIONS 2024 - London, 27 Nov
by Udo Kruschwitz 22 Aug '24


SEARCH SOLUTIONS 2024
Wednesday 27 November 2024, London
Innovations in Search and Information Retrieval

Search Solutions is the BCS Information Retrieval Specialist Group’s (BCS IRSG) annual event focused on practitioner issues in the arena of search and information retrieval (IR). In contrast to other major industry events, Search Solutions aims to be highly interactive, with attendance strictly limited. We bring together practitioners, researchers, analysts and end users to discuss the latest developments in the IR community and to share insights between research and practice. It is a unique opportunity to bring together academic research and practitioner experience.

The Search Solutions event consists of a Tutorial day (26 November) and a Conference day (27 November), each of which has a separate registration. The main programme includes presentations, panels and keynote talks by influential industry leaders on novel and emerging applications in search and information retrieval.

CONFIRMED SPEAKERS (MORE TO COME)
* Dyaa Albakour, SIGNAL AI, “Mapping the landscape of narratives in Global Media”
* Alessandro Benedetti, Sease, “What I don’t like about RAG: can we do better?”
* Charlie Hull, OpenSource Connections, “Measure and Tune your Search with User Behaviour Insights”
* Maria-Inti Metzendorf, Heinrich Heine University Düsseldorf, “Searching for Living Systematic Reviews – overview and case report”
* Eugene Morozov, ISKO, “Search and Browse Use Cases in Data Governance”
* Daniel Tunkelang, Search Consultant & Aritra Mandal, eBay, “Modeling Queries as Bags of Documents”

We are also looking forward to a lively panel chaired by Michael Upshall entitled “Has Generative AI removed the need for conventional search?”

LOCATION
Search Solutions is organised by the Information Retrieval Specialist Group of the BCS (The Chartered Institute for IT) and ISKO (International Society for Knowledge Organization), and is held at the BCS Central London Office:

BCS, The Chartered Institute for IT
Ground Floor
25 Copthall Avenue
London EC2R 7BP
https://www.bcs.org/about-us/our-london-office-and-event-venue/

REGISTRATION
Registration fees (including VAT at 20%) for Search Solutions are as follows:
* BCS / ISKO member rate: £92
* Non-member rate: £110
* Students: £80

Registration fees include lunch. Tea and coffee will also be available throughout the day, followed by a drinks reception in the evening.

Register here: https://www.bcs.org/events-calendar/2024/november/search-solutions-2024-inf…

TUTORIAL DAY
Search Solutions also includes a Tutorial Programme on Tuesday, 26 November. A detailed programme will be announced in due course. A Call for Tutorials can be found here: https://www.bcs.org/membership-and-registrations/member-communities/informa…

Tutorials are payable separately.

PAST EVENTS
We have been running Search Solutions for more than 15 years now. For details on past events please visit the BCS IRSG Web site at: https://www.bcs.org/membership-and-registrations/member-communities/informa…


2nd call for Workshops: NoDaLiDa/Baltic-HLT 2025
by Sara Stymne 22 Aug '24


***Apologies for possible cross-posting***

The two major conferences in the Baltic and Nordic regions, NoDaLiDa, organized by the Northern European Association for Language Technology (NEALT), and Baltic HLT, are joining forces to organize NoDaLiDa/Baltic-HLT 2025 – The Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies, to be held in Tallinn, Estonia, on March 2–5, 2025.

We would like to invite proposals for workshops, to be held on Sunday, March 2, immediately before the main conference, or on Wednesday, March 5, immediately after the main conference. Workshops can be scheduled either for a full day (morning and afternoon) or for half a day. The main conference will be held on-site only, without an online option, in order to facilitate networking. Workshops are free to offer online presentations if they wish to do so.

NoDaLiDa/Baltic-HLT addresses all aspects of natural language processing, speech processing, and computational linguistics, including work in closely related neighboring disciplines (such as, for example, machine learning, linguistics, digital humanities, or psychology) that is sufficiently formalized or applied to bear relevance to speech and language technologies.

Workshop proposals can be submitted in free-text form as a PDF file, by email to ‘nodalida_baltichlt_2025-workshops(a)googlegroups.com’.

Workshop proposals must include adequate information on at least the following aspects:
- proposed workshop title
- topic and goals of the workshop
- target group and estimated attendance
- workshop organizer(s) and contact(s)
- mode of organization and program design, including:
  - information on full versus half-day workshop
  - information on the preference of workshop day: March 2, March 5, or either
  - information on whether you plan an on-site only or hybrid event

SCHEDULE
* Monday, August 26, 2024: Submission of workshop proposals
* Tuesday, September 10, 2024: Notification of workshop selection
* Monday, December 16, 2024: Recommended workshop paper submission deadline
* Monday, February 3, 2025: Camera-ready workshop papers due
* Sunday, March 2, 2025: Pre-conference workshops
* Wednesday, March 5, 2025: Post-conference workshops

Organizers of accepted proposals will be responsible for publicizing and running the workshop, including sending out calls for papers, reviewing submissions, producing the camera-ready workshop proceedings, and organizing the meeting day.

SELECTION
The assessment and selection of workshop proposals will be made by the NoDaLiDa/Baltic-HLT 2025 Workshop Chairs:
* Normunds Grūzītis, University of Latvia, Latvia
* Samia Touileb, University of Bergen, Norway

To inquire about the workshop submission process or any practical aspect of the organization of workshops, please email ‘nodalida_baltichlt_2025-workshops(a)googlegroups.com’. For any question about the conference in general, please email ‘nodalida_baltichlt_2025-pc(a)googlegroups.com’ and for any general practical inquiries, please email ‘nodalida_baltichlt_2025-loc(a)eki.ee’.

We look forward to your workshop proposals, which will help make NoDaLiDa/Baltic-HLT 2025 a success!

Sara Stymne, NoDaLiDa/Baltic-HLT 2025 general chair

PS: please also check our call for papers: https://www.nodalida-bhlt2025.eu/call-for-papers


PhD position at Queensland University of Technology and CSIRO
by dai.xiang.au＠gmail.com 22 Aug '24


Hi colleagues,

We are seeking candidates for a PhD position in Human-Computer Interaction and Natural Language Processing.

Location: Brisbane or Sydney, Australia
Deadline: We will review applications and conduct interviews on an ongoing basis until a suitable candidate for the role is found.
Project title: “Spot, what’s happening?” Explaining a robot’s state through language

Details about the project
Collaborative human-robot teams are increasingly being used for a variety of tasks. When working with robots, humans sometimes need to understand what the robot is currently doing, why and how it got to that state, and how it interprets its environment. This is important for the human operator to ensure the robot is working according to plan, and particularly important if the human needs to intervene or react when the robot needs help.

This project aims to explore the use of language as an interaction modality to facilitate the collaboration between robots and humans. The project will investigate what information about a robot’s state and perception is relevant to human operators in a given interaction and task context and how to present that information through language, including dialogue or chat interfaces.

The project requires expertise in Human-Computer Interaction and/or Human-Robot Interaction. In addition, knowledge of, or a willingness to upskill in, natural language processing (including using large language models), Explainable AI, and robotics is highly desired.

This project addresses Dynamic Situation Awareness and Human-AI Communication, two of the foundational research questions being addressed in the Collaborative Intelligence research programme at CSIRO CINTEL. The project also fits into a collaboration between CINTEL and the Australian Cobotic Centre’s research on Human-Robot Interaction.

More details about the post and instructions on how to apply are available at https://www.australiancobotics.org/project/project-2-5-spot-whats-happening….

Thanks,
Dai


BlackboxNLP 2024: The 7th Workshop on Analyzing and Interpreting Neural Networks for NLP
by Hosein Mohebbi 22 Aug '24


*Last Call for Papers*: The seventh edition of the BlackboxNLP workshop will be co-located with EMNLP 2024 in Miami on November 15, 2024.

*Important dates*
---------------------
*August 15, 2024* - Paper submission deadline (through OpenReview).
September 20, 2024 - Notification of acceptance.
October 4, 2024 - Camera-ready deadline.
*November 15, 2024* - Workshop date.

All deadlines are 11:59PM UTC-12:00 (“Anywhere on Earth”). We will accept ARR submissions; corresponding deadlines will be announced later.

*Workshop description:*
-----------------
Many recent performance improvements in NLP have come at the cost of understanding of the systems. How do we assess what representations and computations models learn? How do we formalize desirable properties of interpretable models, and measure the extent to which existing models achieve them? How can we build models that better encode these properties? What can new or existing tools tell us about these systems’ inductive biases?

The goal of this workshop is to bring together researchers focused on interpreting and explaining NLP models by taking inspiration from fields such as machine learning, psychology, linguistics, and neuroscience. We hope the workshop will serve as an interdisciplinary meetup that allows for cross-collaboration.

Topics of interest include, but are not limited to:
* Applying analysis techniques from neuroscience to analyze high-dimensional vector representations in artificial neural networks;
* Analyzing the network’s response to strategically chosen input in order to infer the linguistic generalizations that the network has acquired;
* Examining network performance on simplified or formal languages;
* Mechanistic interpretability, reverse engineering approaches to understanding particular properties of neural models;
* Proposing modifications to neural architectures that increase their interpretability;
* Testing whether interpretable information can be decoded from intermediate representations;
* Explaining specific model predictions made by neural networks;
* Generating and evaluating the quality of adversarial examples in NLP;
* Developing open-source tools for analyzing neural networks in NLP;
* Evaluating the analysis results: how do we know that the analysis is valid?

*Submissions*
-----------------
We call for two types of papers:

1) *Archival papers*. These are papers reporting on completed, original and unpublished research, with a maximum length of 8 pages + references (papers shorter than this maximum are also welcome). Broader Impacts/Ethics and Limitations sections are optional and can be included on a 9th page. An optional appendix may appear after the references in the same PDF file. Accepted papers are expected to be presented at the workshop and will be published in the workshop proceedings of the ACL Anthology, meaning they cannot be published elsewhere. They should report on obtained results rather than intended work. These papers will undergo double-blind peer review, and should thus be anonymized.

2) *Extended abstracts*. These may report on work in progress or may be cross-submissions that have already appeared (or are scheduled to appear) in another venue in 2024-2025. The extended abstracts are of a maximum of 2 pages + references. These submissions are non-archival and will not be included in the proceedings. The selection will not be based on a double-blind review and thus submissions of this type need not be anonymized.

Submissions should follow the official EMNLP 2024 style guidelines.

Accepted submissions for both tracks will be presented at the workshop: most as posters, some as oral presentations (determined by the program committee).

*The submission site is:* https://openreview.net/group?id=EMNLP/2024/Workshop/BlackBoxNLP

*Organizers*
-----------------
Yonatan Belinkov, Technion
Najoung Kim, Boston University
Jaap Jumelet, University of Amsterdam
Hosein Mohebbi, Tilburg University
Aaron Mueller, Northeastern University & Technion
Hanjie Chen, Rice University

*Contact*
---------------------
Please contact the organizers via email (blackboxnlp(a)googlegroups.com) for any questions.

Read more on the workshop's website: https://blackboxnlp.github.io

