TSD 2024 - 1st Call for Papers
by TSD 2024 23 Feb '24

[Apologies for cross-postings]

*********************************************************
TSD 2024 - FIRST CALL FOR PAPERS
*********************************************************

Twenty-seventh International Conference on TEXT, SPEECH and DIALOGUE (TSD 2024)
Brno, Czech Republic, 9-13 September 2024
http://www.tsdconference.org/

The conference is organized by the Faculty of Informatics, Masaryk University, Brno, and the Faculty of Applied Sciences, University of West Bohemia, Pilsen. The conference is supported by the International Speech Communication Association.

Venue: Brno, Czech Republic

THE SUBMISSION DEADLINES:

April 10 2024 ............ Submission of abstracts
April 17 2024 ............ Submission of full papers

Submission of abstracts serves for better organization of the review process only - for the actual review a full paper submission is necessary.

TSD SERIES

The TSD series has evolved as a prime forum for interaction between researchers in both spoken and written language processing from all over the world. Proceedings of TSD form a book published by Springer-Verlag in their Lecture Notes in Artificial Intelligence (LNAI) series. TSD Proceedings are regularly indexed by Thomson Reuters Conference Proceedings Citation Index. Moreover, the LNAI series is listed in all major citation databases such as DBLP, SCOPUS, EI, INSPEC or COMPENDEX.

CALL for SATELLITE WORKSHOP PROPOSALS

https://www.tsdconference.org/tsd2024/conf_workshop_proposals.html

The TSD 2024 conference will be accompanied by one-day satellite workshops or project meetings with organizational support by the TSD organizing committee. The organizing committee can arrange for a meeting room at the conference venue and prepare workshop proceedings as a book with ISBN by a local publisher. Workshop papers that also pass the standard TSD review process will appear in the Springer proceedings.

Each workshop is subject to a proposal that should be sent via the proposal submission form or discussed via the contact e-mail tsd2024(a)tsdconference.org ahead of the respective deadline.

TOPICS

Topics of the conference will include (but are not limited to):

- Corpora and Language Resources (monolingual, multilingual, text and spoken corpora, large web corpora, large language models, disambiguation, specialized lexicons, dictionaries)
- Speech Recognition (multilingual, continuous, emotional speech, handicapped speaker, out-of-vocabulary words, alternative way of feature extraction, new models for acoustic and language modelling)
- Tagging, Classification and Parsing of Text and Speech (morphological and syntactic analysis, synthesis and disambiguation, multilingual processing, sentiment analysis, credibility analysis, automatic text labeling, summarization, authorship attribution)
- Speech and Spoken Language Generation (multilingual, high fidelity speech synthesis, computer singing)
- Semantic Processing of Text and Speech (information extraction, information retrieval, data mining, semantic web, knowledge representation, inference, ontologies, sense disambiguation, plagiarism detection, fake news detection)
- Integrating Applications of Text and Speech Processing (machine translation, natural language understanding, question-answering strategies, assistive technologies)
- Automatic Dialogue Systems (self-learning, multilingual, question-answering systems, dialogue strategies, prosody in dialogues)
- Multimodal Techniques and Modelling (video processing, facial animation, visual speech synthesis, user modelling, emotions and personality modelling)

Papers on processing of languages other than English are strongly encouraged.

PROGRAM COMMITTEE

Elmar Noeth, Germany (general chair)
Rodrigo Agerri, Spain
Eneko Agirre, Spain
Vladimir Benko, Slovakia
Archna Bhatia, USA
Jan Cernocky, Czech Republic
Simon Dobrisek, Slovenia
Kamil Ekstein, Czech Republic
Karina Evgrafova, Russia
Yevhen Fedorov, Ukraine
Volker Fischer, Germany
Darja Fiser, Slovenia
Lucie Flek, Germany
Bjorn Gamback, Norway
Radovan Garabik, Slovakia
Alexander Gelbukh, Mexico
Louise Guthrie, USA
Jan Hajic, Czech Republic
Eva Hajicova, Czech Republic
Yannis Haralambous, France
Hynek Hermansky, USA
Jaroslava Hlavacova, Czech Republic
Ales Horak, Czech Republic
Eduard Hovy, USA
Milos Jakubicek, Czech Republic
Maria Khokhlova, Russia
Aidar Khusainov, Russia
Daniil Kocharov, Russia
Miloslav Konopik, Czech Republic
Valia Kordoni, Germany
Evgeny Kotelnikov, Russia
Pavel Kral, Czech Republic
Siegfried Kunzmann, USA
Nikola Ljubesic, Croatia
Natalija Loukachevitch, Russia
Bernardo Magnini, Italy
Vaclav Matousek, Czech Republic
Roman Moucek, Czech Republic
Agnieszka Mykowiecka, Poland
Hermann Ney, Germany
Joakim Nivre, Sweden
Juan Rafael Orozco-Arroyave, Colombia
Maciej Piasecki, Poland
Josef Psutka, Czech Republic
James Pustejovsky, USA
German Rigau, Spain
Paolo Rosso, Spain
Leon Rothkrantz, The Netherlands
Anna Rumshisky, USA
Milan Rusko, Slovakia
Pavel Rychly, Czechia
Mykola Sazhok, Ukraine
Pavel Skrelin, Russia
Pavel Smrz, Czech Republic
Petr Sojka, Czech Republic
Georg Stemmer, Germany
Marko Robnik Sikonja, Slovenia
Marko Tadic, Croatia
Jan Trmal, Czechia
Tamas Varadi, Hungary
Zygmunt Vetulani, Poland
Aleksander Wawer, Poland
Pascal Wiggers, The Netherlands
Alina Wroblewska, Poland
Jerneja Zganec Gros, Slovenia

FORMAT OF THE CONFERENCE

The conference program will include presentation of invited papers, oral presentations, and poster/demonstration sessions. Papers will be presented in plenary or topic oriented sessions. Social events including a trip in the vicinity of Brno will allow for additional informal interactions.

SUBMISSION OF PAPERS

Authors are invited to submit a full paper not exceeding 12 pages formatted in the LNCS style (including references). Those accepted will be presented either orally or as posters. The decision about the presentation format will be based on the recommendation of the reviewers. The authors are asked to submit their papers using the on-line form accessible from the conference website.

Papers submitted to TSD 2024 must not be under review by any other conference or publication during the TSD review cycle, and must not be previously published or accepted for publication elsewhere.

Authors are also invited to present actual projects, developed software or interesting material relevant to the topics of the conference. The presenters of demonstrations should provide an abstract not exceeding one page. The demonstration abstracts will not appear in the conference proceedings.

IMPORTANT DATES

April 10 2024 ............ Submission of abstracts
April 17 2024 ............ Submission of full papers
June 5 2024 .............. Notification of acceptance
June 15 2024 ............. Final papers (camera ready) and registration
August 8 2024 ............ Submission of demonstration abstracts
August 15 2024 ........... Notification of acceptance for demonstrations sent to the authors
September 9-13 2024 ...... Conference date

Submission of abstracts serves for better organization of the review process only - for the actual review a full paper submission is necessary.

The accepted conference contributions will be published in Springer proceedings that will be made available to participants at the time of the conference.

OFFICIAL LANGUAGE

The official language of the conference is English.

ACCOMMODATION

The organizing committee will arrange discounts on accommodation in the 4-star hotel at the conference venue. The current prices of the accommodation will be available at the conference website.

ADDRESS

All correspondence regarding the conference should be addressed to

Ales Horak, TSD 2024
Faculty of Informatics, Masaryk University
Botanicka 68a, 602 00 Brno, Czech Republic
phone: +420-5-49 49 18 63
fax: +420-5-49 49 18 20
email: tsd2024(a)tsdconference.org

The official TSD 2024 homepage is: http://www.tsdconference.org/tsd2024

LOCATION

Brno is the second largest city in the Czech Republic with a population of almost 400,000 and is the country's judiciary and trade-fair center. Brno is the capital of South Moravia, which is located in the south-east part of the Czech Republic and is known for a wide range of cultural, natural, and technical sights. South Moravia is a traditional wine region. Brno has been a Royal City since 1347 and with its six universities it forms a cultural center of the region.

Brno can be reached easily by direct flights from London and Milano, and by trains or buses from Vienna (150 km) or Prague (230 km).

Final CfP 5th workshop on Resources for African Indigenous Languages (RAIL) @ LREC-COLING
by Menno Van Zaanen 22 Feb '24

EXTENDED DEADLINE (28 February 2024)

The fifth workshop on Resources for African Indigenous Languages (RAIL)
Colocated with LREC-COLING 2024
https://bit.ly/rail2024

New: extended deadline

Conference dates: 20-25 May 2024
Workshop date: 25 May 2024
Venue: Lingotto Conference Centre, Torino (Italy)

The fifth RAIL workshop website: https://bit.ly/rail2024
LREC-COLING 2024 website: https://lrec-coling-2024.org/
Submission website: https://softconf.com/lrec-coling2024/rail2024/

The fifth Resources for African Indigenous Languages (RAIL) workshop will be co-located with LREC-COLING 2024 in Lingotto Conference Centre, Torino, Italy on 25 May 2024. The RAIL workshop is an interdisciplinary platform for researchers working on resources (data collections, tools, etc.) specifically targeted towards African indigenous languages. In particular, it aims to create the conditions for the emergence of a scientific community of practice that focuses on data, as well as computational linguistic tools specifically designed for or applied to indigenous languages found in Africa.

Many African languages are under-resourced while only a few of them are somewhat better resourced. These languages often share interesting properties such as writing systems, or tone, making them different from most high-resourced languages. From a computational perspective, these languages lack enough corpora to undertake high level development of Human Language Technologies (HLT) and Natural Language Processing (NLP) tools, which in turn impedes the development of African languages in these areas. During previous workshops, it has become clear that the problems and solutions presented are not only applicable to African languages but are also relevant to many other low-resource languages. Because these languages share similar challenges, this workshop provides researchers with opportunities to work collaboratively on issues of language resource development and learn from each other.

The RAIL workshop has several aims. First, the workshop brings together researchers who work on African indigenous languages, forming a community of practice for people working on indigenous languages. Second, the workshop aims to reveal currently unknown or unpublished existing resources (corpora, NLP tools, and applications), resulting in a better overview of the current state-of-the-art, and also allows for discussions on novel, desired resources for future research in this area. Third, it enhances sharing of knowledge on the development of low-resource languages. Finally, it enables discussions on how to improve the quality as well as availability of the resources.

The workshop has “Creating resources for less-resourced languages” as its theme, but submissions on any topic related to properties of African indigenous languages (including non-African languages) may be accepted. Suggested topics include (but are not limited to) the following:

- Digital representations of linguistic structures
- Descriptions of corpora or other data sets of African indigenous languages
- Building resources for (under resourced) African indigenous languages
- Developing and using African indigenous languages in the digital age
- Effectiveness of digital technologies for the development of African indigenous languages
- Revealing unknown or unpublished existing resources for African indigenous languages
- Developing desired resources for African indigenous languages
- Improving quality, availability and accessibility of African indigenous language resources

Submission requirements:

We invite papers on original, unpublished work related to the topics of the workshop. Submissions, presenting completed work, may consist of up to eight (8) pages of content for a long submission and up to four (4) pages of content for a short submission plus additional pages of references. Final camera-ready versions of accepted long papers are allowed one additional page of content (up to 9 pages) so that reviewers’ feedback can be incorporated. Papers should be formatted according to the LREC-COLING style sheet (https://lrec-coling-2024.org/authors-kit/), which is provided on the LREC-COLING 2024 website (https://lrec-coling-2024.org/). Reviewing is double-blind, so make sure to anonymise your submission (e.g., do not provide author names, affiliations, project names, etc.). Limit the number of self-citations (anonymised citations should not be used). The RAIL workshop follows the LREC-COLING submission requirements.

Please submit papers in PDF format to the START account (https://softconf.com/lrec-coling2024/rail2024/). Accepted papers will be published in proceedings linked to the LREC-COLING conference.

Important dates:

Submission deadline: 28 February 2024 (AoE)
Date of notification: 15 March 2024
Camera ready deadline: 29 March 2024
RAIL workshop: 25 May 2024

Organising Committee

Rooweither Mabuya, South African Centre for Digital Language Resources (SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources (SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources (SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources (SADiLaR), South Africa

--
Prof Menno van Zaanen
menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org

CfP: BUCC, extended deadline, 17th Workshop on Building and Using Comparable Corpora
by Pierre Zweigenbaum 22 Feb '24

17th Workshop on Building and Using Comparable Corpora
Co-located with LREC-COLING 2024
Torino, Italia, 20 May 2024

*Extended deadline: 6 March 2024*

Invited speaker: François Yvon, Sorbonne Université, CNRS, ISIR

Workshop website: https://comparable.limsi.fr/bucc2024/
LREC-COLING website: https://lrec-coling-2024.org/
Workshop proceedings to be published in the ACL Anthology

MOTIVATION

In the language engineering and linguistics communities, research in comparable corpora has been motivated by two main reasons. In language engineering, on the one hand, it is chiefly motivated by the need to use comparable corpora as training data for statistical NLP applications such as statistical and neural machine translation or cross-lingual retrieval. In linguistics, on the other hand, comparable corpora are of interest because they enable cross-language discoveries and comparisons. It is generally accepted in both communities that comparable corpora consist of documents that are comparable in content and form in various degrees and dimensions across several languages. Parallel corpora are on the one end of this spectrum, unrelated corpora on the other.

Comparable corpora have been used in a range of applications, including Information Retrieval, Machine Translation, Cross-lingual text classification, etc. The linguistic definitions and observations related to comparable corpora can improve methods to mine such corpora for applications of neural NLP, for example, to extract parallel corpora from comparable corpora for neural machine translation. As such, it is of great interest to bring together builders and users of such corpora.

TOPICS

We solicit contributions on all topics related to comparable (and parallel) corpora, including but not limited to the following:

Building Comparable Corpora:
- Automatic and semi-automatic methods
- Methods to mine parallel and non-parallel corpora from the web
- Tools and criteria to evaluate the comparability of corpora
- Parallel vs non-parallel corpora, monolingual corpora
- Rare and minority languages, across language families
- Multi-media/multi-modal comparable corpora

Applications of Comparable Corpora:
- Human translation
- Language learning
- Cross-language information retrieval & document categorization
- Bilingual and multilingual projections
- (Unsupervised) Machine translation
- Writing assistance
- Machine learning techniques using comparable corpora

Mining from Comparable Corpora:
- Cross-language distributional semantics, word embeddings and pre-trained multilingual transformer models
- Extraction of parallel segments or paraphrases from comparable corpora
- Methods to derive parallel from non-parallel corpora (e.g. to provide for low-resource languages in neural machine translation)
- Extraction of bilingual and multilingual translations of single words, multi-word expressions, proper names, named entities, sentences, paraphrases etc. from comparable corpora
- Induction of morphological, grammatical, and translation rules from comparable corpora
- Induction of multilingual word classes from comparable corpora

Comparable Corpora in the Humanities:
- Comparing linguistic phenomena across languages in contrastive linguistics
- Analyzing properties of translated language in translation studies
- Studying language change over time in diachronic linguistics
- Assigning texts to authors via authors' corpora in forensic linguistics
- Comparing rhetorical features in discourse analysis
- Studying cultural differences in sociolinguistics
- Analyzing language universals in typological research

IMPORTANT DATES

Deadlines are "anywhere on Earth".

6 Mar 2024: *Extended* paper submission deadline
24 Mar 2024: Notification of acceptance
7 Apr 2024: Camera-ready final papers
20 May 2024: Workshop date

For updates, please see the workshop website at https://comparable.limsi.fr/bucc2024/

PRACTICAL INFORMATION

The workshop is an in-person event. Workshop registration is via the main conference registration site, see https://lrec-coling-2024.org/

The workshop proceedings will be published in the ACL Anthology.

SUBMISSION GUIDELINES

Please follow the style sheet and templates (for LaTeX, Overleaf and MS-Word) provided for the main conference at https://lrec-coling-2024.org/authors-kit/

Papers should be submitted as a PDF file using the START conference manager at https://secure-web.cisco.com/1UaJIr7ltEdbVzt8EpJCgpyj2ZyxfgFf-boU68G__QPUm2…

Submissions must describe original and unpublished work and range from 4 to 8 pages plus unlimited references.

Reviewing will be double blind, so the papers should not reveal the authors' identity. Accepted papers will be published in the workshop proceedings, which will be included in the ACL Anthology.

Double submission policy: Parallel submission to other meetings or publications is possible but must be immediately (i.e. as soon as known to the authors) notified to the workshop organizers by e-mail.

For further information and updates, please see the BUCC 2024 website: https://comparable.limsi.fr/bucc2024/

WORKSHOP ORGANIZERS

- Pierre Zweigenbaum (Université Paris-Saclay, CNRS, LISN, Orsay, France)
- Reinhard Rapp (University of Mainz and Magdeburg-Stendal University of Applied Sciences, Germany)
- Serge Sharoff (University of Leeds, United Kingdom)

Contact: pz (at) lisn (dot) fr

PROGRAMME COMMITTEE

- Ebrahim Ansari (Institute for Advanced Studies in Basic Sciences, Iran)
- Thierry Etchegoyhen (Vicomtech, Spain)
- Kyo Kageura (University of Tokyo, Japan)
- Natalie Kübler (Université Paris Cité, France)
- Philippe Langlais (Université de Montréal, Canada)
- Yves Lepage (Waseda University, Japan)
- Shervin Malmasi (Amazon, USA)
- Michael Mohler (Language Computer Corporation, USA)
- Emmanuel Morin (Nantes Université, France)
- Dragos Stefan Munteanu (Language Weaver, Inc., USA)
- Ted Pedersen (University of Minnesota, Duluth, USA)
- Ayla Rigouts Terryn (KU Leuven, Belgium)
- Reinhard Rapp (University of Mainz and Magdeburg-Stendal University of Applied Sciences, Germany)
- Nasredine Semmar (CEA LIST, Paris, France)
- Silvia Severini (Leonardo Labs, Italy)
- Serge Sharoff (University of Leeds, UK)
- Richard Sproat (OGI School of Science & Technology, USA)
- Tim Van de Cruys (KU Leuven, Belgium)
- Pierre Zweigenbaum (Université Paris-Saclay, CNRS, LISN, Orsay, France)

2nd Call for Papers MMSYM 2024
by Andy Lücking 22 Feb '24

*Apologies for cross-postings*

Call for Papers MMSYM 2024 (http://mmsym.org/)

2nd CALL FOR PAPERS

2nd International MultiModal communication SYMposium (MMSYM 2024)
Frankfurt, Germany; 25.09.-27.09.2024

STRICT ABSTRACT SUBMISSION DEADLINE: March 8th, 2024 (anywhere on earth)!

The 2nd edition of MMSYM continues the symposium series on multimodal communication previously held as the 1st MMSYM in Barcelona (2023), as European Symposia in Leuven (2019), Bielefeld (2017), Copenhagen (2016), Tartu (2014) and Malta (2013), and even earlier as Nordic (2003-2012) and Swedish Symposia (1997-2000).

The symposium aims at gaining insights into the interaction and/or co-dependence of visual and acoustic modes of communication. To advance our understanding of communication, the symposium aims at further integrating multimodality as an integral part of linguistics and cognitive science.

We welcome innovative contributions in the broader field of multimodal communication. Investigating multimodality extends our knowledge about various ways of communication beyond (spoken) language. This year’s symposium has a particular interest in three main research themes: (1) The gesture-speech integration, in particular the prosody-gesture link, (2) formal, automatic and machine-learning approaches to multimodality, and (3) psycholinguistic approaches in multimodal settings.

For MMSYM 2024, we particularly encourage contributions relating to the conference themes. Topics include, but are not limited to:

• Annotation schemes and tools for multimodal data
• Articulation-Gesture-coordination
• Automatic recognition and interpretation of different modalities and their interaction
• Formal implementation of multimodality
• Intercultural aspects of multimodal behavior
• Kinematics of bodily movements
• Machine and deep learning techniques applied to multimodal data
• Memory effects in multimodal settings
• Multimodal aspects of language acquisition and learning (both L1 and L2)
• Multimodal communication disorders and communication support
• Multimodal corpora
• Multimodal dialogue systems
• Multimodal health communication
• Multimodal human-computer interaction and conversational agents
• Multimodal language processing
• Prosody-Gesture-Integration
• Semantic and pragmatic functions of multimodality
• Speech and gestures in human communication

Invited speakers:

- Petra Wagner (Bielefeld University)
- Judith Holler (Donders Institute for Brain, Cognition & Behaviour, Radboud University; MPI for Psycholinguistics Nijmegen)
- Julie Hunter (LinaGora Labs, Toulouse)

Semantic Methods for Events and Stories, 2nd Edition (SEMMES 2024) – 2nd Call for Papers
by Pasquale Lisena 22 Feb '24

Apologies for cross posting

***************
Semantic Methods for Events and Stories, 2nd Edition (SEMMES 2024) – 2nd Call for Papers
***************

Website: https://anr-kflow.github.io/semmes/
Workshop co-located with the Extended Semantic Web Conference (ESWC) in Hersonissos, Greece
Submission deadline: March 7th, 2024

Scope
***************

An important part of human history and knowledge is made of events, which can be aggregated and connected to create stories, be they real or fictional. These events as well as the stories created from them can typically be inherently complex, reflect societal or political stances and be perceived differently across the world population. The Semantic Web offers technologies and methods to represent these events and stories, as well as to interpret the knowledge encoded into graphs and use it for different applications, spanning from narrative understanding and generation to fact-checking.

The aim of the 2nd edition of our workshop on Semantic Methods for Events and Stories (SEMMES) is to offer an opportunity to discuss the challenges related to dealing with events and stories, and how we can use semantic methods to tackle them. We welcome approaches which combine data, methods and technologies coming from the Semantic Web with methods from other fields, including machine learning, narratology or information extraction. This workshop wants to bring together researchers working on complementary topics, in order to foster collaboration and sharing of expertise in the context of events and stories.

Topics
***************

Topics of interest include, but are not limited to:

- Ontologies and data models for representing events, event relations, and narratives;
- Event extraction, co-reference and linking;
- Event Relation extraction and linking (e.g. temporal, causal, modal relationships);
- Methods combining KGs and LLMs targeting event- or narrative-related research;
- Fake events detection and event verification;
- Event-centric question answering;
- Event information visualisation;
- Event-centric knowledge graphs and vocabularies;
- Completion of event-centric knowledge graphs and reasoning;
- Event summarisation;
- Automatic narrative understanding and generation;
- Storytelling Applications/Demos.

Submission Guidelines
***************

We welcome the following types of contributions.

- Long papers (10-15 pages including references)
- Short papers (5-9 pages including references)

We welcome any types of research, resource and application papers, as well as (short only) demonstration submissions.

Submissions must be written in English and formatted using the template for submissions to CEUR Workshop Proceedings (https://www.overleaf.com/latex/templates/template-for-submissions-to-ceur-w…)

All papers and abstracts have to be submitted electronically via EasyChair: https://easychair.org/conferences/?conf=semmes2024.

Each accepted paper needs to be presented by one of the authors, who agrees to register and participate in SEMMES. Authors may be requested to serve as reviewers for max 2 papers.

Important Dates
***************

- Submission deadline: March 7th, 2024
- Notifications: April 4th, 2024
- Camera-ready version: April 18th, 2024
- Workshop day: May 26th or 27th, 2024 (half-day, TBA)

All deadlines are 23:59 anywhere on earth (UTC-12).

Proceedings
***************

The complete set of papers will be published with the joint CEUR ESWC Workshop Proceedings (http://CEUR-WS.org), listed by the DBLP.

--
Pasquale Lisena
EURECOM, Campus SophiaTech
450 route des Chappes, 06410 Biot, France
e-mail: pasquale.lisena(a)eurecom.fr
site: http://pasqlisena.github.io/

CHANGE OF DATE for sign-lang@LREC 2024
by Schulder, Dr. Marc 22 Feb '24

Dear all,

the date for the sign-lang@LREC 2024 workshop has changed. It will now be on Saturday, 25 May, the day after the LREC-COLING 2024 main conference.

We are also pleased to confirm that the workshop will be a hybrid event. Similar to the 2022 workshop, participants will be given access to an online text chat before and during the event for online participants to present their work as well as for discussion of all workshop contributions. On-stage presentations will be live streamed (including International Sign/English interpretation) with opportunity for questions from online and on-site participants. The live poster sessions will be held on-site only, but posters will be made available online for discussion via text chat.

For further information, please visit the workshop website at https://www.sign-lang.uni-hamburg.de/lrec2024/

Yours,
the sign-lang@LREC 2024 workshop committee

Edge Hill Corpus Research Group, Thursday 29 February 2024
by Costas Gabrielatos 22 Feb '24

Edge Hill Corpus Research Group The next meeting of the Edge Hill Corpus Research Group will take place online (via MS Teams) on Thursday 29 February 2024, 2:00-3:30 pm (GMT). Attendance is free. Registration closes on Wednesday 28 February, 11 am (GMT) You can register here: https://store.edgehill.ac.uk/conferences-and-events/conferences/events/edge… Topic: Corpus Methodology Speaker: Matteo Di Cristofaro<https://infogrep.it/site/> (University of Modena and Reggio Emilia, Italy) Title: One dataset, many corpora: Problems of scientific validity in corpora and corpus-derived results Abstract Corpus linguistics has, since its inception, recognised the relevance of digital technologies as a major driving force behind corpus techniques and their (r)evolution in the study of language (cf. Tognini-Bonelli 2012). And yet, while both corpus linguistics and digital technologies have frequently benefited from each other (the case of NLP/NLU is one such macro example), their pathways have often diverged. The result is a disconnect between corpus linguistics and digital data processing whose effects directly impinge on the ability to analyse language through software tools. A disconnect becoming more and more relevant as corpus linguistics is being applied to vast amounts of data obtained from manifold sources – including a wide array of social media platforms, each one with its unique linguistic and technical peculiarities. As the ground-truth of an ever-increasing number of language studies, corpora must be able to correctly treat and represent such peculiarities: e.g. the dialogic dimension of comments or forum posts; the presence (and potential subsequent normalisation) of spelling variations; the use of hashtags and emojis. Failing to do so, the corpus-derived results will likely present researchers with a falsified view of the language under scrutiny. 
What is at stake is not the ability to “count” what is in a corpus, but rather whether what is being counted is or is not a feature present in the original data – of which the corpus should be a faithful representation. The presentation is consequently devoted to tackling digital technicalities, i.e. “those notions and mechanisms that – while not classically associated with natural language – are i) foundational of the digital environments in which language production and exchanges occur and ii) at the core of the techniques that are used to produce, collect, and process the focus of investigation, that is, digital textual data.” (Di Cristofaro 2023:5). One such example is represented by character encodings: although at the “core” of the whole corpus linguistics enterprise (cf. McEnery and Xiao 2005; Gries 2016:39,111) – since they allow written language to be processed by a computer and understood by humans -, these are often overlooked at all stages of corpus compilation and analysis, potentially leading linguists to involuntarily tampering with the data and its linguistic contents. Starting from practical examples, the presentation discusses the implications that digital technicalities have on corpora and their analyses – or rather, what happens when they are not properly treated – while outlining (also in the form of Python scripts and practical tools) potential new pathways that a “digital-aware” perspective of corpus linguistics can open up. References Di Cristofaro, Matteo. Corpus Approaches to Language in Social Media. Routledge Advances in Corpus Linguistics. New York: Routledge, 2023. https://doi.org/10.4324/9781003225218<https://doi.org/10.4324/9781003225218>. Gries, Stefan Th. Quantitative Corpus Linguistics with R: A Practical Introduction. 2nd ed. New York: Routledge, 2016. https://doi.org/10.4324/9781315746210<https://doi.org/10.4324/9781315746210>. McEnery, Tony, and Richard Xiao. ‘Character Encoding in Corpus Construction’. 
In Developing Linguistic Corpora: A Guide to Good Practice, edited by Martin Wynne, 47–58. Oxford: Oxbow Books, 2005. https://users.ox.ac.uk/~martinw/dlc/index.htm. Tognini Bonelli, Elena. ‘Theoretical Overview of the Evolution of Corpus Linguistics’. In The Routledge Handbook of Corpus Linguistics, edited by Anne O’Keeffe and Michael McCarthy, 14–27. Routledge Handbooks in Applied Linguistics. Milton Park, Abingdon, Oxon; New York: Routledge, 2012.

2 Assistant/Associate Professors in Artificial Intelligence at Leiden Institute of Advanced Computer Science (LIACS)
by Wijnholds, G.J. (Gijs) 22 Feb '24

Dear all, My institute is looking to recruit two assistant or associate professors in AI, which may be relevant to people on this list. Here’s the beginning of the vacancy: "The Faculty of Science, Leiden Institute of Advanced Computer Science (LIACS), is looking for: 2 Assistant/Associate Professors in Artificial Intelligence (0.8-1.0 FTE) The rapid evolution and expansion of Artificial Intelligence and Computer Science and the increasing integration with other disciplines creates new challenges in developing and understanding modern computation in its foundations, applications, and societal consequences. Our institute is at the center of this transformation, and we aim to strengthen our research and education in artificial intelligence. We are looking for candidates with expertise complementary to the one that is already present at LIACS and related to generative AI, human centered AI, interactive machine learning, and computational creativity." Here’s the full vacancy: https://www.universiteitleiden.nl/vacatures/2024/q1/14490-2-assistant_assoc… Best, dr. Gijs Wijnholds Assistant Professor in Natural Language Processing Text Mining and Retrieval Group<https://tmr.liacs.nl/> Leiden Institute of Advanced Computer Science https://gijswijnholds.github.io

[CfP] e-Commerce and NLP Workshop @ LREC-COLING 2024
by besnikf＠amazon.com 22 Feb '24

The Seventh Workshop on e-Commerce and NLP (ECNLP 7) Co-located with LREC-COLING 2024 in Torino, Italy – May 21, 2024 https://sites.google.com/view/ecnlp/ Submission Deadline: Friday Feb 23, 2024 - 23:59 (AoE) ECNLP focuses on NLP for e-Commerce and online shopping applications. We welcome papers covering all aspects of online commerce and data, including search, retrieval, and customer-facing applications and tasks. Important Dates Submission Deadline: Friday Feb 23, 2024 - 23:59 (AoE) Acceptance Notification: Friday March 29, 2024 Camera-ready versions: Friday April 12, 2024 Workshop: Tuesday May 21, 2024 Instructions for Authors Papers must be submitted in PDF format using the official LREC-COLING template. More details available on the website. Additional Information and Contact Details https://sites.google.com/view/ecnlp/home/ Workshop Scope ECNLP invites quality research contributions as short or long papers. All submissions will undergo a double-blind review process, and accepted submissions will be presented at the workshop. NLP and IR have been powering e-Commerce applications since the early days of the fields. Today, NLP and IR already play a significant role in e-commerce tasks, including product search, recommender systems, product question answering, machine translation, sentiment analysis, product description and review summarization, and customer review processing, among many other tasks. With the exploding popularity of chatbots and shopping assistants – both text- and voice-based – NLP, IR, question answering, and dialogue systems research is poised to transform e-commerce once again, but requires a forum where new and unfinished ideas could be discussed. The ECNLP workshop will provide a venue for the dissemination of NLP and IR research results related to e-commerce and online shopping, bringing together researchers from both academia and industry. 
The workshop welcomes submission of late-breaking and preliminary research results, as well as opinion and position papers. Topics of interest include but are not limited to: - Product classification and cataloguing (including into types and hierarchies) - NER for products, brands, attributes, and part names - Search and product query auto-completion - Recommender systems and product suggestions - Machine Translation applied to e-commerce (e.g. translating product titles/reviews) - Voice & dialogue-based e-commerce applications; ASR for e-commerce - Advertising and ad prediction/forecasting models - Fraud and spam detection in e-commerce (e.g. in customer reviews/comments) - Product description and review summarization - Product similarity and matching of seller-provided listings to catalog products - Technical support request processing (user emails, chat agents, etc.) - E-commerce related social media processing - The intersection of Computer Vision and NLP (e.g. product images and text) - Product Question Answering - Shopping assistants, agents, and chat bots - Sentiment analysis, opinion mining, and stance detection in user-generated content - Relevant resources and datasets Thank you, The ECNLP Organizing Committee

Discharge Me! shared task
by dina demner 22 Feb '24

The Discharge Me! shared task invites participants to streamline the generation of discharge summary sections in the EHR, with the goal of alleviating clinician burden and enhancing patient care quality. Leveraging a dataset derived from MIMIC-IV, participants are tasked with generating the "Brief Hospital Course" and "Discharge Instructions" sections using over 100,000 admissions from the Emergency Department (ED). Submission guidelines and data access agreements are detailed on the task and competition website (https://stanford-aimi.github.io/discharge-me), with system submissions due by May 10th, 2024. Accepted papers will be presented at the 23rd Workshop on Biomedical Natural Language Processing at ACL 2024. Join us in revolutionizing clinical documentation and improving healthcare workflows! For further details and registration, please visit the Codabench competition page linked on the task website.

special issue of SNAM journal - Datasets, Language Resources and Algorithmic Approaches on Online Wellbeing and Social Order in Asian Languages - Deadline July 2024
by Rajesh Sharma 21 Feb '24

Hello All, Special issue of the *Social Network Analysis and Mining (SNAM)* journal: *Datasets, Language Resources and Algorithmic Approaches on Online Wellbeing* *and Social Order in Asian Languages* https://link.springer.com/journal/13278/updates/26741080 *** Deadline for submission: July 2024 *** *** Guest Editors *** Vivek Kumar Singh, Banaras Hindu University, Varanasi, India David Pinto, Benemerita Universidad Autonoma de Puebla, Mexico Dr Sriparna Saha, Indian Institute of Technology, Patna, India Dr. Vedika Gupta, OP Jindal Global University, Haryana, India. Dr. Rajesh Sharma, University of Tartu, Estonia. *** Context *** The phenomenal growth of social media platforms has resulted in their becoming ubiquitous in the sense that now almost everyone on the planet is using or is being affected by content on social media platforms. Social media platforms have become so influential that they are not only affecting individual thoughts and behaviours but also guiding collective behaviours of groups and societies. There are now innumerable instances of hate speech, abusive content, cyberbullying, misogyny, fake news and disinformation etc. on social media platforms. Such content can severely impact our emotions, mental health, and well-being. The spread of hate speech, misinformation, fundamentalist propaganda, religious hate campaigns etc. on social media platforms can be even more dangerous as it could disturb the social order and harmony. The hateful and targeted campaigns can affect social structures and institutions, values, and norms. Therefore, it is extremely important that such content is identified and appropriately dealt with. However, due to the huge volume and speed of creation of such content, it can only be done by using sophisticated computational methods that can automatically detect and identify harmful content. 
Given that social media is accessible in a large number of languages across the world, the task becomes more challenging. Availability of enough and suitable data and resources is a fundamental requirement towards this endeavour. Asia, being the largest continent, embraces diverse cultures, ethnicities and languages. There are around 2300 languages spoken in Asia. Though there has been substantial research on the above-mentioned aspects in the English language, research in Asian languages is still in its infancy. The limited or non-existent availability of datasets and resources in these languages is a primary reason for this. This special issue aims to bring together contributions that advance the research in the area of computational methods for automatic detection and identification of harmful content on social media platforms, such as those reporting: · Algorithmic approaches · Computational resources · Datasets · Dictionaries and Lexicons · Software Resources Contributions that report novel methods and techniques, datasets and application of various state-of-the-art methods for different tasks in social media text analytics, including those in low resource languages are also welcome. Though the main focus area of the special issue is on the analysis of the textual content, studies and resources that report multimodal data (with text being the major part) will also be considered. 
***Topics of Interest*** The special issue invites original, unpublished contributions on datasets (elicitation, processing, annotation) and resources (corpora, lexica, database, ontologies, computational approaches, and methodologies) on the following non-exhaustive list of indicative topics: · Aggression and Abusive Content detection · Cognitive Analytics of Social Media Services · Collective Idea Generation and Opinion Dynamics · Depression Intensity Estimation · Detection of Hate Speech, Profanity, Hostility, Cyberbullying · Disinformation, Misinformation, Fake News and Rumours · Emotion analysis, Emotional conversation generation · Fraud detection in online social network · Making online environments safer · Personality trait assessment · Polarization in online discussions · Protecting Children from abusive content · Racial and targeted abuse detection · Religious abuse and bias detection · Sentiment Analysis · Sexism and Misogynistic attitude detection · Social Alignment Contagion in Online Social Networks · Social biases in online texts · Social Perception and Social Influence in social media · Suicide Ideation detection in the Online Environment · Violent Incident detection *** Important dates *** • Submission deadline: July 2024 • Notification to the authors after the first review: December 2024 • Notification to the authors after the second review: March 2025 • Publication: December 2024 *** Submission Guidelines *** Articles reporting original and unpublished research results pertaining to the above topics are solicited. Submitted articles will follow an academic review process. Manuscripts must be prepared according to the instructions for authors available at the journal webpage and submitted through the publisher's online submission system, available here <https://idp-personal-authenticator.springernature.com/gateway?response_type…> . 
Kind Regards Rajesh Sharma, Associate Professor, Head, Computational Social Science Lab, Institute of Computer Science, University of Tartu, Estonia Group webpage: https://css.cs.ut.ee/ Personal Webpage: https://rajeshsharma.cs.ut.ee/

Call for applications - Summer School 2024 - Digital Humanities and Digital Communication: Challenges and opportunities of interacting with and through technology
by jessicajane.nocella1＠gmail.com 21 Feb '24

Summer School 2024 Digital Humanities and Digital Communication: Challenges and opportunities of interacting with and through technology Host Institution: University of Modena and Reggio Emilia Coordinating Institution: Department of Studies on Language and Culture Website: https://www.summerschooldigitalhumanities.unimore.it/ Dates: 3 June 2024—7 June 2024 Location: Modena, Emilia Romagna, Italy CALL FOR APPLICATIONS We are happy to announce the 6th edition of our Summer School in Digital Humanities and Digital Communication, which will be hosted by the Department of Studies on Language and Culture of the University of Modena and Reggio Emilia, in collaboration with the Fondazione Marco Biagi. As part of the Doctoral Programme in Human Sciences, the Summer School aims to provide PhD students and young researchers with methodological tools for the study of digital communication and data analysis. This year’s focus is on challenges and opportunities of interacting through technology, with topics ranging from digital resources for research in the humanities to the use of new information technologies for data analysis; from tools for analysing communication in new media to ways of processing, accessing, and disseminating knowledge. SUMMER SCHOOL THEMES The digital world in which we live opens up numerous opportunities, but also challenges and risks. In recent years the impact of technology has been profound and far-reaching and the speed at which innovations have been introduced has radically changed the landscape of research and communication. New forms of media have transformed our working and social habits and the dynamics of interpersonal relationships. Digital technology has also facilitated the production, storage and access to information. 
Research, especially in the humanities, has benefited from increasingly complex digital archives, the flexibility and the multimodality of digital publishing, the wealth of tools for the compilation, annotation and analysis of corpora etc. The object itself of research has changed, often including digital data or focusing on digital communication and user-generated content in particular. Dissemination of knowledge has expanded its potential with the use of augmented reality and gaming. Indeed, generative AI is opening the whole field of the humanities to new methods and new research questions. However, these trends often pull in opposite directions, creating paradoxes and contradictions. For example, whilst an infinite amount of information is guaranteed, the reliability, trustworthiness and source of that information is unknown, with AI in the background and the legal issues associated with it (data leaks, misrepresenting information, unintended uses etc.). Access to global systems of communication potentially brings an infinite number of people into contact, but at the same time, alone with our computers or mobile phones, we can become detached and solitary. Contemporary forms of communication have blurred the distinction between what is real and what is virtual. What kind of demands do the new forms of technology pose on researchers in the humanities? What role does literacy play in fostering ethical understanding and critical thinking in today’s technologically evolving society? Is there a risk of undermining the active role of human agents with AI? May too much trust be placed on the machine? The summer school will try to discuss the challenges and opportunities of interacting with and through technology, considering new fields of study, new tools and resources, new forms of collaboration in research, while at the same time allowing participants to explore some of the recent advances in the field of digital humanities in hands-on workshops. 
APPLICATIONS AND SUBMISSION GUIDELINES https://www.summerschooldigitalhumanities.unimore.it/application/ IMPORTANT DATES Deadline for applications: March 28, 2024 Notification of acceptance: April 10, 2024 Conference website: https://www.summerschooldigitalhumanities.unimore.it/ For any inquiry, please contact the organisers at: digitalhumanities(a)unimore.it

Final call: The First Workshop on Language-driven Deliberation Technology (May 20th, LREC-COLING). Deadline: March 2nd
by Gabriella Lapesa 21 Feb '24

Call Deadline: 02-Mar-2024 Meeting Description: The DELITE workshop provides a forum for presenting new advances in technology around deliberation by addressing researchers in Natural Language Processing, human-computer interaction, corpus linguistics, political science and philosophy, as well as stakeholders and domain experts involved in integrating such technology into decision-making processes. With numerous projects all over the world interested in aspects of digital democracy, inclusivity and representation in the decision-making process and improving deliberative democracy, DELITE2024 is right at the center of a new interdisciplinary research community, with the language-driven angle representing a fundamental and distinctive contribution. 2nd Call for Papers: Deliberation is ubiquitous: from navigating divergent interests in everyday personal life to reaching consensus in the political decision-making process, deliberation describes the communicative process by which a group of people exchange ideas, weigh different arguments, and ultimately reach mutual understanding. In recent years, deliberative processes have gained momentum and have been shown to improve everyday and political decision-making. For the first time, technological solutions are maturing to the point that they can be deployed to support deliberation. In this context, we want to establish the foundations for collecting and curating data for deliberation domains and for evaluating technology in deliberative settings. 
Topics for DELITE2024 include, but are not limited to: - Technological advances for public decision making - Deliberation theory in NLP models - In-domain versus across domain resources and corpora - Data-driven theory development - Integration of language systems into deliberation processes and interfaces - Technological solutions for online deliberation at scale - Argument mining for deliberation scenarios - Visual Analytics for human sensemaking - Empirical foundations for evaluation - Integration and reflection on recent advances in LLMs for deliberation scenarios - Explainability - Ethical questions - Addressing bias Application areas include, but are not limited to: - Public policy making - Democratic innovations - Deliberative democracy - Political decision making - Participatory urban planning - Citizen engagement and co-creation - Intelligence services and military - Conflict resolution/mitigation - Case analysis in healthcare - Legal decision making - Scholarly discourse (written and spoken) Submissions *************** Papers must describe original (completed or in progress) and unpublished work. We invite long (8 pages, excluding references) and short papers (4 pages, excluding references). Papers must be anonymized to support double-blind reviewing, i.e., they must not include authors’ names and affiliations and should avoid links to non-anonymized repositories. Papers that do not conform to these requirements will be rejected without review. Upon acceptance, the papers will be given one additional page – for long papers, up to nine (9) pages of content plus unlimited pages for acknowledgments and references and five (5) pages for short papers. We also invite non-archival, non-anonymous papers (2-4 pages, including references) to describe ongoing work, introduce research projects, or summarize already published work. These will be presented in a poster session where ongoing projects are presented in order to serve community building. 
Submission of all papers is electronic, using the Softconf START conference management system (https://softconf.com/lrec-coling2024/delite2024/). Papers must follow the LREC-COLING 2024 two-column format, using the supplied official style files. The templates can be downloaded from the Style Files and Formatting page provided on the website. Please do not modify these style files, nor should you use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review. Important Dates ****************** Paper submission deadline: 2 March 2024 (extended) Notification of acceptance: 13 March 2024 Camera-ready versions due: 20 March 2024 Workshop date: 20 May 2024 (half-day)

NLPerspectives Final CFP & deadline extension
by Sara Tonelli 21 Feb '24

NLPerspectives: The 3rd Workshop on Perspectivist Approaches to NLP Collocated with LREC-COLING in Turin, Italy FINAL CALL FOR PAPERS (DEADLINE EXTENSION) https://nlperspectives.di.unito.it/w/3rd-workshop-on-perspectivist-approach… Until recently, the dominant paradigm in natural language processing (and other areas of artificial intelligence) has been to resolve observed label disagreement into a single “ground truth” or “gold standard” via aggregation, adjudication, or statistical means. However, in recent years, the field has increasingly focused on subjective tasks, such as abuse detection or quality estimation, in which multiple points of view may be equally valid, and a unique ‘ground truth’ label may not exist (Plank, 2022). At the same time, as concerns have been raised about bias and fairness in AI, it has become increasingly apparent that an approach which assumes a single “ground truth” can erase minority voices. Strong perspectivism in NLP (Cabitza et al., 2023) pursues the spirit of recent initiatives such as Data Statements (Bender and Friedman, 2018), extending their scope to the full NLP pipeline, including the aspects related to modelling, evaluation and explanation. In line with the first <https://nlperspectives.di.unito.it/w/w2022/> and second <https://nlperspectives.di.unito.it/w/2nd-workshop-on-perspectivist-approach…> editions, the third NLPerspectives (Perspectivist Approaches to Disagreement in NLP) workshop will explore current and ongoing work on: the collection and labelling of non-aggregated datasets; and approaches to modelling and including these perspectives in NLP pipelines, as well as evaluation and applications of multi-perspective Machine Learning models. We also welcome opinion pieces and literature reviews, e.g., fairness and inclusion in a perspectivist framework. 
Following our previous workshops, a key outcome of the third edition will be to continue the work begun at https://pdai.info/ to create a repository of perspectivist datasets with non-aggregated labels for use by researchers in perspectivist NLP modelling. Authors are, therefore, invited to share their LRs (data, tools, services, etc.) and provide essential information about resources (i.e., also technologies, standards, evaluation kits, etc.) that have been used for the work or are a result of their research. In addition, authors will be required to adhere to ethical research policies on AI and may include an ethics statement in their papers. The NLPerspectives workshop will be co-located with the 14th edition of LREC-COLING 2024 <https://lrec-coling-2024.org/> in Torino, Italy, on May 20-25, 2024 and online. Submissions The papers should be submitted as a PDF document, conforming to the formatting guidelines provided in the call for papers of LREC-COLING conference: authors-kit <https://lrec-coling-2024.org/authors-kit/> We accept three types of submissions: Regular research papers; Non-archival submissions: like research papers, but will not be included in the proceedings; Research communications: 4-page abstracts summarising relevant research published elsewhere. NLPerspectives will also accept submissions that have been rejected from the main LREC-COLING conference or ACL rolling review, provided they are accompanied by their reviews and they fit the topic of the workshop. Research papers (archival or non-archival) may consist of up to 8 pages of content. Research communications may consist of up to 4 pages of content. More details will be up soon. 
Please make submissions at https://softconf.com/lrec-coling2024/nlperspectives2024/ Topics We invite original research papers from a wide range of topics, including but not limited to: Non-aggregated data collection and annotation frameworks Descriptions of corpora collected under the perspectivist paradigm Multi-perspective Modelling and Machine Learning Evaluation of multi-perspective models/ models of disagreement Multi-perspective disagreement as applied to NLP evaluation Fairness and inclusive modelling Perspectivist approaches for social good Applications of multi-perspective modelling Computing with (dis)agreement Perspectivist Natural Language Generation Foundational aspects of perspectivism Opinion pieces and reviews on perspectivist approaches to NLP Submissions are open to all, and are to be submitted anonymously (and must conform to the instructions for double-blind review). All papers will be refereed through a double-blind peer review process by at least three reviewers, with final acceptance decisions made by the workshop organisers. Scientific papers will be evaluated based on relevance, significance of contribution, impact, technical quality, scholarship, and quality of presentation. Attendance At least one author of each accepted paper is required to participate in the conference and present the work, in-person or online. Important Dates * (EXTENDED) Friday March 1, 2024: Paper submission * Friday March 29, 2024: Notification of acceptance * Friday April 12, 2024: Camera-ready papers due * Tuesday May 21, 2024: Workshop Workshop organisers: Gavin Abercrombie, Heriot-Watt University Valerio Basile, University of Turin Davide Bernardi, Amazon Alexa Shiran Dudy, Northeastern University Simona Frenda, University of Turin Lucy Havens, University of Edinburgh Sara Tonelli, Fondazione Bruno Kessler Contact us at g.abercrombie(a)hw.ac.uk <mailto:g.abercrombie@hw.ac.uk> if you have any questions. 
Website: https://nlperspectives.di.unito.it/

36th International Conference on Advanced Information Systems Engineering (CAiSE 2024): Last Call for Tutorial Proposals
by Announce 21 Feb '24

*** Last Call for Tutorial Proposals *** 36th International Conference on Advanced Information Systems Engineering (CAiSE'24) June 3-7, 2024, 5* St. Raphael Resort and Marina, Limassol, Cyprus https://cyprusconferences.org/caise2024/ (*** Submission Deadline: 28th February, 2024 AoE ***) CAiSE'24 invites proposals for tutorials on advanced topics in the field of Information Systems Engineering. Tutorials should aim at offering new insights, knowledge, and skills to professionals, educators, researchers, and students seeking to gain a better understanding either about methods of broad interest in the field, or emergent paradigms that are ripe for practical adoption or that require further research to reach maturity. Proposals emphasizing the special theme of the CAISE'24 conference “Information Systems in the Age of Artificial Intelligence” are encouraged, but proposals on other new or long-standing topics in information systems engineering are also welcome. Tutorials should be focused on principles, concepts, and methods. Commercial or sales-oriented presentations are not allowed and will not be accepted. Tutorials are intended to provide a pedagogic introduction to or overview of a topic of relevance. Potential presenters should keep in mind that there may be a heterogeneous audience, including novice graduate students, experienced practitioners, and specialized researchers. Tutorial speakers should be prepared to cope with this diversity in the audience. Tutorials will be 90 minutes long and organized in parallel with the technical sessions of the main conference and participants of the conference will have free access to all of them. Potential proposers are free to contact the tutorial chairs via e-mail to validate their idea prior to the submission. SELECTION CRITERIA The tutorial chairs will review each proposal and select a subset of them based on the following criteria: 1. relevance to the field of IS engineering; 2. 
anticipated appeal to the conference audience; 3. timeliness and importance for the conference audience; 4. past experience and qualifications of the instructor(s). The tutorial chairs will also consider the complementarity of the proposal w.r.t. the conference program and other tutorial proposals. SUBMISSION GUIDELINES Tutorial proposals should be submitted to EasyChair using the conference submission site (https://easychair.org/conferences/?conf=caise2024) and then selecting the “CAiSE 2024 Tutorials” track. The proposal (length up to 1500 words) should cover the following points: • Title • Presenters and affiliation • Goal: The overall goal of the tutorial. • Scope: Intended audience, level (basic or advanced), and prerequisites. • Topic relevance and novelty: Specifically indicate the relevance to the scope of CAiSE, the relevance to practice, the novel aspects that would make this tutorial beneficial and appealing to CAiSE participants. • Structure of contents: Here you should provide a structured overview of your planned tutorial, organized into numbered sections and subsections. For each subsection, you should sketch its contents in a few sentences or bullet points. • References: Provide references to papers, books, etc. that your tutorial builds on. Please specify previous venues at which similar tutorials have been presented by you and indicate the difference between the proposed tutorial and previous ones. CAiSE usually does not accept tutorials that have been presented in other venues. • Sample Slides: Include at least 5 sample slides of the presentation you plan to give if your tutorial is accepted. Select slides that are typical of your presentation style. These slides have to be submitted in a separate PDF file. 
Services provided to tutorialists • A 2-page tutorial abstract will be published in the CAiSE LNCS proceedings • Tutorials will benefit from the local organizational infrastructure (registration, badges, refreshments, beamers, screens, etc.). • Advertisement of the tutorial on CAISE 2024 homepage and mailings. • The conference fee will be waived for tutorial presenters (one fee per tutorial). IMPORTANT DATES • Submission of Tutorial Proposals: 28th February, 2024 (AoE) • Notification of Acceptance: 15th March, 2024 • Camera-ready Abstracts: 5th April, 2024 • Tutorial Presenters Registration Deadline: 8th April, 2024 TUTORIAL CHAIRS • Adela del Rio Ortega, University of Seville, Spain (adeladelrio(a)us.es) • Tiago Prince Sales, University of Twente, The Netherlands (t.princesales(a)utwente.nl) Other Committee Members https://cyprusconferences.org/caise2024/committees/


[CFP] NLDB 2024 - The 29th International Conference on Natural Language & Information Systems - Submissions Open [CFP]
by Federico Torrielli 21 Feb '24


* We apologize if you receive multiple copies of this CFP * For the online version of this Call, visit: https://nldb2024.di.unito.it/submissions/ =============== *SUBMISSIONS ARE OPEN* AT https://easychair.org/conferences/?conf=nldb2024 =============== NLDB 2024 The 29th International Conference on Natural Language & Information Systems 25-27 June 2024, University of Turin, Italy. Website: https://nldb2024.di.unito.it/ Submission deadline: 22 March, 2024 About NLDB The 29th International Conference on Natural Language & Information Systems will be held at the University of Turin, Italy, and will be a face-to-face event. Since 1995, the NLDB conference has brought together researchers, industry practitioners, and potential users interested in various applications of Natural Language in the Database and Information Systems field. The term "Information Systems" has to be considered in the broader sense of Information and Communication Systems, including Big Data, Linked Data and Social Networks. The field of Natural Language Processing (NLP) has itself recently experienced several exciting developments. In research, these developments have been reflected in the emergence of Large Language Models and the importance of aspects such as transparency, bias and fairness, Large Multimodal Models and the connection of the NLP field with Computer Vision, chatbots and dialogue-based pipelines. Regarding applications, NLP systems have evolved to the point that they now offer real-life, tangible benefits to enterprises. Many of these NLP systems are now considered a de-facto offering in business intelligence suites, such as algorithms for recommender systems and opinion mining/sentiment analysis. Language models developed by the open-source community have become widespread and commonly used. Businesses are now readily adopting these technologies, thanks to the efforts of the open-source community. 
For example, fine-tuning a language model on a company’s own dataset is now easy and convenient, using modules created by thousands of academic researchers and industry experts. It is against this backdrop of recent innovations in NLP and its applications in information systems that the 29th edition of the NLDB conference takes place. We welcome research and industrial contributions, describing novel, previously unpublished works on NLP and its applications across a plethora of topics as described in the Call for Papers. Call for Papers: NLDB 2024 invites authors to submit papers on unpublished research that addresses theoretical aspects, algorithms, applications, architectures for applied and integrated NLP, resources for applied NLP, and other aspects of NLP, as well as survey and discussion papers. This year's edition of NLDB continues with the Industry Track to foster fruitful interaction between the industry and the research community. Topics of interest include but are not limited to: * Large Language Models: training, applications, transfer learning, interpretability of large language models. * Multimodal Models: Integration of text with other modalities like images, video, and audio; multimodal representation learning; applications of multimodal models. * AI Safety and ethics: Safe and ethical use of Generative AI and NLP; avoiding and mitigating biases in NLP models and systems; explainability and transparency in AI. * Natural Language Interfaces and Interaction: design and implementation of Natural Language Interfaces, user studies with human participants on Conversational User Interfaces, chatbots and LLM-based chatbots and their interaction with users. 
* Social Media and Web Analytics: Opinion mining/sentiment analysis, irony/sarcasm detection; detection of fake reviews and deceptive language; detection of harmful information: fake news and hate speech; sexism and misogyny; detection of mental health disorders; identification of stereotypes and social biases; robust NLP methods for sparse, ill-formed texts; recommendation systems. * Deep Learning and eXplainable Artificial Intelligence (XAI): Deep learning architectures, word embeddings, transparency, interpretability, fairness, debiasing, ethics. * Argumentation Mining and Applications: Automatic detection of argumentation components and relationships; creation of resources (e.g. annotated corpora, treebanks and parsers); Integration of NLP techniques with formal, abstract argumentation structures; Argumentation Mining from legal texts and scientific articles. * Question Answering (QA): Natural language interfaces to databases, QA using web data, multi-lingual QA, non-factoid QA (how/why/opinion questions, lists), geographical QA, QA corpora and training sets, QA over linked data (QALD). * Corpus Analysis: multi-lingual, multi-cultural and multi-modal corpora; machine translation, text analysis, text classification and clustering; language identification; plagiarism detection; information extraction: named entity, extraction of events, terms and semantic relationships. * Semantic Web, Open Linked Data, and Ontologies: Ontology learning and alignment, ontology population, ontology evaluation, querying ontologies and linked data, semantic tagging and classification, ontology-driven NLP, ontology-driven systems integration. * Natural Language in Conceptual Modelling: Analysis of natural language descriptions, NLP in requirement engineering, terminological ontologies, consistency checking, metadata creation and harvesting. 
* Natural Language and Ubiquitous Computing: Pervasive computing, embedded, robotic and mobile applications; conversational agents; NLP techniques for Internet of Things (IoT); NLP techniques for ambient intelligence * Big Data and Business Intelligence: Identity detection, semantic data cleaning, summarisation, reporting, and data to text. Important Dates: Full paper submission: 22 March, 2024 Paper notification: 19 April, 2024 Camera-ready deadline: 26 April, 2024 Conference: 25-27 June 2024 Submission Guidelines: Authors should follow the LNCS format (https://www.springer.com/gp/computer-science/lncs/conference-proceedings-gu…) and submit their manuscripts in pdf via Easychair (https://easychair.org/conferences/?conf=nldb2024) Papers can be submitted to either the main conference or the industry track. Submissions can be full papers (up to 15 pages including references and appendices), short papers (up to 11 pages including references and appendices) or papers for a poster presentation or system demonstration (6 pages including references). The program committee may decide to accept some full papers as short papers or poster papers. All questions about submissions should be emailed to federico.torrielli(a)unito.it (Web & Publicity Chair) General Chairs: Luigi Di Caro, University of Turin Farid Meziane, University of Derby Amon Rapp, University of Turin Vijayan Sugumaran, Oakland University


CfP: VarDial 2024 and Shared Tasks
by Yves Scherrer 21 Feb '24


VarDial 2024, the eleventh workshop on NLP for similar languages, varieties and dialects, will be held in conjunction with NAACL in Mexico City, on June 20/21, 2024. We welcome papers dealing with one or more of the following topics: - Corpora, resources, and tools for similar languages, varieties and dialects; - Adaptation of tools (taggers, parsers) for similar languages, varieties and dialects; - Evaluation of language resources and tools when applied to language varieties; - Reusability of language resources in NLP applications (e.g., for machine translation, POS tagging, syntactic parsing, etc.); - Corpus-driven studies in dialectology and language variation; - Computational approaches to mutual intelligibility between dialects and similar languages; - Automatic identification of lexical variation; - Automatic classification of language varieties; - Text similarity and adaptation between language varieties; - Linguistic issues in the adaptation of language resources and tools (e.g., semantic discrepancies, lexical gaps, false friends); - Machine translation between closely related languages, language varieties and dialects. In addition to the topics listed above, we also welcome papers dealing with diachronic language variation (e.g. phylogenetic methods, historical dialects). Paper submission deadline: March 10, 2024 (AoE) Details: https://sites.google.com/view/vardial-2024/call-for-papers The VarDial workshop has a history of hosting well-attended shared tasks on various dialects and languages. In 2024, we organize the two following tasks: 1. The DIALECT-COPA shared task on dialectal causal commonsense reasoning This shared task invites the community to propose, develop, and test approaches for adapting models for causal commonsense language understanding to three dialects of South-Slavic languages: the Slovenian Cerkno dialect, the Croatian Chakavian dialect, and the Serbian, Macedonian and Bulgarian Torlak dialect. 
Training and development data based on the COPA (Choice of plausible alternatives, Roemmele et al. 2011) dataset are available for four related standard languages (Slovenian, Croatian, Serbian, Macedonian) and two out of the three testing dialects (Cerkno, Torlak), the Chakavian dialect serving as a surprise dialect. 2. DSL-ML - Multi-label classification of similar languages The DSL-ML task is a multi-label extension of the classic "Discriminating similar languages" task that has been popular with VarDial since the beginnings of the workshop. The motivation behind this new task formulation is that some texts do not present any linguistic markers to unambiguously determine their origin. It therefore makes sense to predict several possible labels for such texts. The 2024 DSL-ML task is based on multi-label conversions of existing datasets from five different macro-languages: English, Spanish, Portuguese, French and BCMS (Bosnian, Croatian, Montenegrin, Serbian). Test results submission deadline: March 11, 2024 (AoE) System description paper submission deadline: March 24, 2024 (AoE) Registration: https://forms.gle/UcLYcPgDFJoiAVip7 Details: https://sites.google.com/view/vardial-2024/shared-tasks


Deadline Extension - Call for papers for the LREC-COLING2024 pre-conference workshop: Holocaust Testimonies as Language Resources
by Francesca Frontini 21 Feb '24


[Deadline Extension] *Call for papers for the LREC-COLING2024 pre-conference workshop: Holocaust Testimonies as Language Resources* Date: 21 May 2024 (full day) Venue: Lingotto Conference Centre, Turin, Italy Webpage: https://www.clarin.eu/HTRes2024 Submission Extended Deadline: 28 February 2024 Submission Portal: https://softconf.com/lrec-coling2024/htres2024/ *Workshop description* Holocaust testimonies serve as a bridge between survivors and history’s darkest chapters, providing a connection to the profound experiences of the past. Testimonies stand as the primary source of information that describes the Holocaust, offering first-hand accounts and personal narratives of those who experienced it. The majority of testimonies are captured in an oral format, as survivors vividly explain and share their personal experiences and observations from that time period. Transforming Holocaust testimonies into a machine-processable digital format can be a difficult task owing to the unstructured nature of the text. The creation of accessible, comprehensive, and well-annotated Holocaust testimony collections is of paramount importance to our society. These collections empower researchers and historians to validate the accuracy of socially and historically significant information, enabling them to share critical insights and trends derived from these data. This workshop will investigate a number of ways in which techniques and tools from natural language processing and corpus linguistics can contribute to the exploration, analysis, dissemination and preservation of Holocaust testimonies. The workshop is supported by CLARIN and the European Holocaust Research Infrastructure (EHRI). 
We expect contributions related to the following topics: Creation of datasets and development of tools for the study of Holocaust testimonies: - Creation of language corpora of Holocaust testimonies - Digitisation and enhancement of oral and written testimonies (including automatic speech recognition, alignment of text and speech, format conversion, OCR, handwriting recognition, machine translation) - Named entity recognition for identifying people, places, and events in testimonies. - Standards, representation formats, and guidelines for annotations and vocabularies relevant to the Holocaust testimonies - Creation, adaptation and tuning of software applications for the creation, annotation, enhancement and use of Holocaust testimonies as language resources. - Research using and Holocaust testimonies. - Applications of NLP in analysing Holocaust survivor testimonies - Sentiment analysis and emotional content extraction from survivor narratives. - Data Visualisation, Knowledge representation and Information Extraction: - Visualising complex data structures from Holocaust testimonies - Building knowledge graphs and networks to represent historical relationships - Interactive data visualisations for education and research - Extracting biographical and temporal information relevant to the Holocaust - Deep learning and large language models - Digital Archiving and Long-Term Preservation: - Methods and tools for digitising and preserving Holocaust testimonies - Best practices for metadata standards and cataloguing - Ensuring long-term accessibility and data integrity - Ethical Considerations and Privacy - Ethical challenges in digitising and sharing sensitive testimonies - Anonymisation and privacy protection in Holocaust data - Community engagement and consent in digital projects - User and application aspects - Development of tools and interfaces for the search, analysis and exploration of Holocaust testimonies - Other relevant use cases and application scenarios All 
papers must clearly state and explain their relevance to the topic of 'Holocaust Testimonies as Language Resources'. *Submission & Publication* Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop. We welcome the following types of contributions: - Standard research papers (up to 8 pages, plus more pages for references if needed); - Short research papers (from 4 to 6 pages, plus more pages for references if needed). Submissions must be anonymous and strictly follow the LREC2024 stylesheet formatting guidelines. All papers should be electronically submitted in PDF format via the main conference platform, START. *Important Dates* - *Paper submission deadline:* 21 February 2024 - *Extended paper submission deadline:* 28 February 2024 - *Notification of acceptance:* 20 March 2024 - *Camera-ready paper:* 15 April 2024 - *Workshop date:* 21 May 2024 *Organising Committee* - Isuri Anuradha, University of Wolverhampton, UK - Ingo Frommholz, University of Wolverhampton, UK - Francesca Frontini, CNR-ILC, Italy & CLARIN, Italy - Martin Wynne, Oxford University, UK - Ruslan Mitkov, Lancaster University, UK - Paul Rayson, Lancaster University, UK - Alistair Plum, University of Luxembourg, Luxembourg *Programme Committee* - Le An Ha, Ho Chi Minh City University of Foreign Languages and Information Technology, Vietnam - Federico Boschetti, CNR-Istituto di Linguistica Computazionale “A. Zampolli”, Italy - Estelle Bunout, University of Luxembourg, Luxembourg - Martin Bulin, University of West Bohemia, Czech Republic - Tim Cole, University of Bristol, UK - Angelo Mario Del Grosso, CNR-Istituto di Linguistica Computazionale “A. 
Zampolli”, Italy - Maria Dermentzi, King’s College London, UK - Robert Ehrenreich, USHMM, USA - Ignatius Ezeani, Lancaster University, UK - Ian Gregory, Lancaster University, UK - Arjan van Hessen, Radboud University - Henk van den Heuvel, Radboud University & CLARIN ERIC - Renana Keydar, The Hebrew University of Jerusalem, Israel - William J.B. Mattingly, USHMM, USA - Patricia Murrieta-Flores, Lancaster University, UK - Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences, Poland - Maciej Piasecki, Wroclaw University of Science and Technology, Poland - Rachel Pistol, King’s College London, UK - Johannes-Dieter Steinert, University of Wolverhampton, UK - Jan Svec, University of West Bohemia - Gabor Toth, University of Luxembourg, Luxembourg - Eveline Wandl-Vogt, Austrian Academy of Sciences, Vienna


Call for Papers - the 11th Workshop on Argument Mining (ACL 2024, Bangkok, Thailand)
by Yamen Ajjour 21 Feb '24


CALL FOR PAPERS The 11th Workshop on Argument Mining @ ACL 2024 August 15, 2024 https://argmining-org.github.io/2024/ The 11th Workshop on Argument Mining will be held on August 15, 2024, in Bangkok, Thailand, together with ACL 2024. The Workshop on Argument Mining provides a regular forum for presenting and discussing cutting-edge research in argument mining (a.k.a argumentation mining) for academic and industry researchers. By continuing a series of ten successful previous workshops, this edition will welcome the submission of long, short, and demo papers. Also, it will feature two shared tasks and a keynote talk. IMPORTANT DATES Direct paper submission deadline (OpenReview): May 17, 2024 Paper commitment from ARR: May 24, 2024 Notification of acceptance: June 17, 2024 Camera-ready papers due: July 1, 2024 Workshop: August 15, 2024 TOPICS OF INTEREST - Identification, Assessment, and Analysis of Arguments - Identification of argument components (e.g., premises and conclusions) - Structure analysis of arguments within and across documents - Relation Identification between arguments and counterarguments (e.g., support and attack) - Creation and evaluation of argument annotation schemes, relationships to linguistic and discourse annotations, (semi-) automatic argument annotation methods and tools, and creation of argumentation corpora - Assessment of arguments for various properties (e.g., stance, clarity) - Generation of Arguments, Multi-modal and Multi-lingual Argument Mining - Automatic generation of arguments and their components - Consideration of discourse goals in argument generation - Argument mining and generation from multi-modal/multi-lingual data - Mining and Analysis of different Genres and Domains of Arguments - Argument mining in specific genres and domains (e.g., education, law, scientific writing) - Analysis of unique styles within genres (e.g., short informal text, highly structured writing) - Knowledge Integration, Information Retrieval, and 
Real-world Applications - Integration of commonsense and domain knowledge into argumentation models - Combination of information retrieval methods with argument mining - Real-world applications, including argument web search, opinion analysis and summarization, and misinformation detection - Ethical Considerations and Future Reflections - Reflection on the ethical aspects and societal impact of argument-mining methods - Reflection on the future of argument mining in light of the fast advancement of large language models (LLMs) SUBMISSIONS The organizing committee welcomes submitting long papers, short papers, and demo descriptions. Accepted papers will be presented via oral or poster presentations and included in the ACL proceedings as workshop papers. - Long paper submissions must describe substantial, original, completed, and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers must be at most eight pages, including title, text, figures, and tables. An unlimited number of pages is allowed for references. Two additional pages are allowed for appendices, and an extra page is allowed in the final version to address reviewers’ comments. - Short paper submissions must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages, such as a small, focused contribution, a negative result, or an interesting application nugget. Short papers must be at most four pages, including title, text, figures, and tables. An unlimited number of pages is allowed for references. One additional page is allowed for the appendix, and an extra page is allowed in the final version to address reviewers’ comments. - Demo descriptions must be at most four pages, including title, text, examples, figures, tables, and references. 
A separate one-page document should be provided to the workshop organizers for demo descriptions, specifying furniture and equipment needed for the demo. Multiple Submissions ArgMining 2024 will not consider any paper under review in a journal or another conference or workshop at the time of submission, and submitted papers must not be submitted elsewhere during the review period. ArgMining 2024 will also accept submissions of ARR-reviewed papers, provided that the ARR reviews and meta-reviews are available by the ARR commitment deadline (May 24). However, ArgMining 2024 will not accept direct submissions that are actively under review in ARR, or that overlap significantly (>25%) with such submissions. Submission Format All long, short, and demonstration submissions must follow the two-column ACL 2024 format. Authors are expected to use the LaTeX or Microsoft Word style template (https://github.com/acl-org/acl-style-files). Submissions must conform to the official ACL style guidelines contained in these templates. Submissions must be electronic and in PDF format. Submission Link and Deadline For Direct Submissions Authors have to fill in the submission form in the OpenReview system and upload a PDF of their paper before May 17, 2024, 11:59 pm UTC-12h (anywhere on earth). https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/ArgMining For the ARR commitment process, we will provide details in our second call for papers. Double Blind Review ArgMining 2024 will follow the ACL policies for preserving the integrity of double-blind review for long and short paper submissions. Papers must not include authors’ names and affiliations. Furthermore, self-references or links (such as GitHub) that reveal the author’s identity, e.g., “We previously showed (Smith, 1991) …” must be avoided. Instead, use citations such as “Smith previously showed (Smith, 1991) …” Papers that do not conform to these requirements will be rejected without review. 
Papers should not refer, for further detail, to documents that are not available to the reviewers. For example, do not omit or redact important citation information to preserve anonymity. Instead, use the third person or named reference to this work, as described above (“Smith showed” rather than “we showed”). Papers may be accompanied by a resource (software and/or data) described in the paper, but these resources should also be anonymized. Unlike long and short papers, demo descriptions will not be anonymous. Demo descriptions should include the authors’ names and affiliations, and self-references are allowed. ANONYMITY PERIOD (taken from the ACL call for papers in verbatim for the most part) We follow the ACL Policies for Review and Citation. Submissions must be anonymized, but there is no anonymity period or limitation on posting or discussing non-anonymous preprints while the work is under peer review. BEST PAPER AWARD In order to recognize significant advancements in argument mining science and technology, ArgMining 2024 will include the Best Paper award. All papers at the workshop are eligible for the best paper award, and a selection committee consisting of prominent researchers in the fields of interest will select the award recipients. SHARED TASKS We will be hosting two shared tasks this year: 1. Perspective Argument Retrieval 2. DialAM-2024: The First Shared Task on Dialogical Argument Mining ArgMining 2024 ORGANIZING COMMITTEE Yamen Ajjour, Leibniz University Hannover Roy Bar-Haim, IBM Research Roxanne El Baff, German Aerospace Center (DLR) and Bauhaus-Universität, Weimar Zhexiong Liu, University of Pittsburgh Gabriella Skitalinskaya, Leibniz University Hannover


Call for participation: ‘Statistics for linguistics with R’ bootcamp (08 – 12/07/2024)
by Magali Paquot 21 Feb '24


The Linguistics Research Unit of the Institute of Language and Communication (Université catholique de Louvain, Belgium) will be hosting Stefan Gries’s next bootcamp on statistics for linguistics with R from 08 to 12 July 2024. The ‘Statistics for linguistics with R’ bootcamp is a hands-on introduction to statistical methods for both graduate students and seasoned researchers and is loosely based on the third edition (2021) of Gries’s textbook Statistics for linguistics with R. The course is intended for linguists who already have a basic knowledge of statistics and some experience using R and who wish to improve their proficiency in statistical modeling of linguistic data. Using the open source software and programming language R, we will deal with: • fundamental aspects of fixed-effects regression modeling for both numeric and binary response variables; these include exploration of data and their preparation for modeling, model formulation and selection; numerical and visual interpretation and evaluation of models; • more advanced aspects of fixed-effects regression modeling such as contrasts for ordinal predictors, orthogonal contrasts, curvature of numeric predictors, and maybe general linear hypothesis tests; • the theoretical foundations of mixed-effects regression modeling; • applications of mixed-effects modeling for both numeric and binary response variables; • tree-based methods and random forests: 'fitting' and interpreting them with importance scores, partial dependence scores, and detecting (not just capturing) interactions. Online registration will start on 4 March 2024, 1 pm CEST. The number of participants is limited. If you would like to participate, mark the date in your diary! https://uclouvain.be/en/research-institutes/ilc/cecl/rling2024.html Contact email: magali.paquot(a)uclouvain.be Magali Paquot Convenor


ArabicNLP 2024 Shared Tasks Announcement
by Salam Khalifa 20 Feb '24


Dear Corpora members, We are happy to announce *eight* exciting shared tasks in this year's ArabicNLP conference: - Shared Task 1: Arabic Financial NLP (*AraFinNLP*) - Shared Task 2: *FIGNEWS 2024*: Shared Task on News Media Narratives of the Israel War on Gaza - Shared Task 3: *ArAIEval*: Propagandistic Techniques Detection in Unimodal and Multimodal Arabic Content - Shared Task 4: *StanceEval2024*: Arabic Stance Evaluation Shared Task - Shared Task 5: *WojoodNER 2024*: The 2nd Arabic Named Entity Recognition Shared Task - Shared Task 6: *ArabicNLU* Shared-Task: Arabic Natural Language Understanding - Shared Task 7: *NADI 2024*: Nuanced Arabic Dialect Identification - Shared Task 8: *KSAA-CAD* Shared Task: Contemporary Arabic Reverse Dictionary and Word Sense Disambiguation For more information and details about participation, please visit https://arabicnlp2024.sigarab.org/shared-tasks Best, ArabicNLP 2024 publicity chairs. -- Salam Khalifa PhD Candidate at Stony Brook Linguistics <https://www.linguistics.stonybrook.edu/>.


[CfP] 2nd Call for Papers on LDL! Workshop on Linked Data in Linguistics @LREC-COLING 2024
by Patricia Martín Chozas 20 Feb '24


(apologies for cross-posting) The 9th Workshop on Linked Data in Linguistics: Resources, Applications, Best Practices - Second Call for Papers Workshop co-located with *LREC-COLING 2024* Date: *May 25, 2024* Venue: Torino, Italy and online For up-to-date info, check: https://ldl2024.linguistic-lod.org/ The Linked Data in Linguistics (LDL) workshop series has established itself as the premier venue for discussing the application of Semantic Web technologies to the fields of linguistics, digital lexicography, and digital humanities (DH). While recent years have witnessed a steady growth in adoption of the technology in these areas, its uptake in other relevant domains, most notably in the case of natural language processing (NLP), continues to lag behind. This year, aside from embracing the full bandwidth of applications of LLOD technologies and the closely related area of knowledge graphs in linguistics, we welcome contributions addressing the application of LLOD technologies to NLP applications, as well as those dealing with emerging hot topics of future bridges between structured (linguistic) knowledge and neural methods. In addition, this year’s edition of the workshop will be a venue for in-depth discussions on community standards and best practices, and, above all, those related to the work of the W3C community groups OntoLex [1], LD4LT [2] and BPMLOD [3]. To this end, it will include featured talks on the latest achievements, developments, and perspectives of these W3C Community Groups. [1] Ontology-Lexica Community Group [2] Linked Data in Language Technology Community Group [3] Best Practices in Multilingual Linked Open Data *Topics of interest* We invite presentations of algorithms, methodologies, experiments, tools, use cases, descriptions of ongoing or planned research projects as well as position papers that describe the creation, publication or application of linked linguistic data collections and their linking with other resources. 
Descriptions of such data, and in particular, its uses in research (linguistics, lexicology, digital humanities) and technology (NLP, e-lexicography, localization) are also welcome. The following is a non-exhaustive list of relevant topics: 1) Building, managing and linking language resources - Lexicons and Lexical Data, including Dictionaries and Lexicographic Resources - Annotations and Annotated Corpora - Entity Linking 2) Technologies, challenges and best practices for language technology and language resources on the web: - Interoperability - Sustainability - FAIRness 3) Structured data in language technology: - Knowledge Graphs - Machine Learning - Multilingual Technologies - Language Knowledge Injection in LLMs 4) Show cases, case studies and applications by different communities of practice: - Multimodality - Corpus Linguistics - Lexicography - Digital Humanities 5) Current directions and critical reflection. Position papers on: - Ethical, legal, technological aspects of structured data in the age of LLMs - The role of LLOD in promoting low-resource languages - Extensions of RDF and graph formalisms We invite both long (8 pages and 2 pages of references) and short papers (4 pages and 2 pages of references) representing original research, innovative approaches and resource descriptions. Short papers may also represent project descriptions. These do not have to be implemented but discuss to what extent and for which purposes Linguistic Linked Open Data is reused or created. Projects that are still in their early stages and seek advice from the broader Linguistic Linked Data community are welcome, especially if they include underrepresented fields of study. Papers should be formatted according to the LREC-COLING guidelines, please see https://lrec-coling-2024.org/authors-kit/. Please note that the review process will be *single-blind*. 
*Identify, Describe and Share your LRs!* When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones). *Important Dates* - Submission Date: *March 1, 2024* (extended from February 23) - Notification of Acceptance: March 22, 2024 - Camera-Ready: April 2, 2024 - Workshop: May 25, 2024 *Workshop Organizers* - Christian Chiarcos (University of Augsburg, Germany) - Katerina Gkirtzou (Athena Research Center, Greece) - Maxim Ionov (University of Cologne, Germany) - Fahad Khan (Consiglio Nazionale delle Ricerche, Italy) - John P. McCrae (University of Galway, Ireland) - Elena Montiel Ponsoda (Universidad Politécnica de Madrid, Spain) - Patricia Martín Chozas (Universidad Politécnica de Madrid, Spain) Please get in contact via ldl2024(a)linguistic-lod.org. 
*Program Committee*
- Sina Ahmadi (George Mason University, USA)
- Verginica Barbu Mititelu (Research Institute for Artificial Intelligence of the Romanian Academy, Romania)
- Paul Buitelaar (Insight, Ireland)
- Sara Carvalho (University of Aveiro, Portugal)
- Rute Costa (NOVA FCSH/NOVA CLUNL, Portugal)
- Milan Dojchinovski (Czech Technical University, Czech Republic)
- Agata Filipowska (Uniwersytet Ekonomiczny w Poznaniu, Poland)
- Francesca Frontini (CNR-ILC, Italy)
- Frances Gillis Webber (University of Cape Town, South Africa)
- Voula Giouli (Athena Research Center, Greece)
- Dagmar Gromann (University of Vienna, Austria)
- Yoshihiko Hayashi (Waseda University, Japan)
- Alik Kirillovich (Higher School of Economics, Russia)
- Penny Labropoulou (Athena Research Center, Greece)
- Chaya Liebeskind (Jerusalem College of Technology, Israel)
- David Lindemann (University of the Basque Country, Spain)
- Francesco Mambrini (Università Cattolica del Sacro Cuore, Italy)
- Monica Monachini (CNR-ILC, Italy)
- Steven Moran (Université de Neuchâtel, Switzerland)
- Diego Moussallem (Paderborn University, Germany)
- Roberto Navigli (“La Sapienza” Università di Roma, Italy)
- Petya Osenova (IICT-BAS, Bulgaria)
- Ana Ostroški Anić (Institute of Croatian Language and Linguistics, Croatia)
- Giulia Pedonese (CNR-ILC, Italy)
- Sigita Rackevičienė (Mykolas Romeris University, Lithuania)
- Felix Sasaki (SAP, Germany)
- Andrea Schalley (Karlstad University, Sweden)
- Gilles Sérasset (University Grenoble Alpes, France)
- Milena Slavcheva (IICT-BAS, Bulgaria)
- Blerina Spahiu (Bicocca University, Italy)
- Ranka Stanković (University of Belgrade, Serbia)
- Armando Stellato (University of Rome, Italy)
- Federica Vezzani (University of Padua, Italy)

*Patricia Martín Chozas - Postdoctoral Researcher*
*Ontology Engineering Group*
Artificial Intelligence Department
ETSI Informáticos - Universidad Politécnica de Madrid
Phone: (+34) 910673091


LT4HALA 2024 -- deadline extension
by Rachele SPRUGNOLI 20 Feb '24


LT4HALA 2024 -- deadline extension -- Third Workshop on Language Technologies for Historical and Ancient LAnguages @ LREC COLING 2024

The Third Workshop on Language Technologies for Historical and Ancient LAnguages (LT4HALA 2024) will be held on May 25th in Torino (Italy), co-located with LREC-COLING 2024. This one-day workshop seeks to bring together scholars who are developing and/or using Language Technologies (LTs) for historically attested languages, so as to foster cross-fertilization between the Computational Linguistics community and the areas in the Humanities dealing with historical linguistic data, e.g. historians, philologists, linguists, archaeologists and literary scholars.

* Submission deadline: 26th February 2024 **NEW DEADLINE: 1st March 2024**

Website: https://circse.github.io/LT4HALA/2024/
Submission page: https://softconf.com/lrec-coling2024/lt4hala2024/


[SEPLN] Call for participation - MentalRiskES@IberLEF2024: Early detection of mental disorders risk in Spanish
by María Dolores Molina González 20 Feb '24


------------------------------------------------------------------------------
MentalRiskES at IberLEF 2024: Call for Participation
Website: https://sites.google.com/view/mentalriskes2024
------------------------------------------------------------------------------

MentalRiskES is the second edition of a novel task on early risk identification of mental disorders in Spanish comments from social media sources. The first edition took place last year in the IberLEF evaluation forum as part of SEPLN 2023. The task was posed as an online problem, that is, the participants had to detect a potential risk as early as possible in a continuous stream of data. Therefore, performance depended not only on the accuracy of the systems but also on how fast the problem was detected. These dynamics are reflected in the design of the tasks and the metrics used to evaluate participants.

For this second edition, we propose three novel subtasks: the first is about disorder detection, the second consists of detecting the context that may be associated with the disorder, and the third is about suicidal ideation detection.

We would like to invite you to participate in the following tasks:

1. Disorders detection (multi-class classification)
2. Disorder contextualization (fine-grained classification)
3. Suicidal ideation detection (binary classification)

Find out more at https://sites.google.com/view/mentalriskes2024.

MentalRiskES 2024 is part of the IberLEF Workshop and will be held in conjunction with the SEPLN 2024 conference in Valladolid (Spain).
------------------------------------------------------------------------------
Important Dates
------------------------------------------------------------------------------

Feb 16th   Registration open
Feb 21st   Release of trial corpora (trial server available)
Mar 20th   Release of training corpora
Mar 29th   Registration closed
Apr 8th    Release of test corpora and start of the evaluation campaign (test server available and trial submissions closed)
Apr 12th   End of evaluation campaign (deadline for submission of runs)
Apr 18th   Publication of official results and release of test gold labels
May 10th   Deadline for paper submission
May 31st   Acceptance notification
Jun 17th   Camera-ready submission deadline
July 11th  Final camera-ready submission deadline (to IberLEF organisers)

Please reach out to the organizers at MentalRiskEs@IberLEF2024.

The MentalRiskES 2024 organizing committee.

--
M. Dolores Molina González


8th Workshop on Online Abuse and Harms (WOAH) @NAACL: 2nd CFP
by Yi-Ling Chung 20 Feb '24


*** Second Call for Papers ***

We invite paper submissions to the 8th Workshop on Online Abuse and Harms (WOAH), which will take place on June 20/21 at NAACL 2024.

Website: https://www.workshopononlineabuse.com/cfp.html

Join our WOAH community Slack channel<https://hatespeechdet-47d7560.slack.com/join/shared_invite/zt-2a8d96j4z-gkN…>!

Important Dates

Submission due: March 10, 2024
ARR reviewed submission due: April 7, 2024
Notification of acceptance: April 14, 2024
Camera-ready papers due: April 24, 2024
Workshop: June 20/21, 2024

Overview

Digital technologies have brought many benefits for society, transforming how people connect, communicate and interact with each other. However, they have also enabled abusive and harmful content such as hate speech and harassment to reach large audiences, and for their negative effects to be amplified. The sheer amount of content shared online means that abuse and harm can only be tackled at scale with the help of computational tools. However, detecting and moderating online abuse and harms is a difficult task, with many technical, social, legal and ethical challenges.

The Workshop on Online Abuse and Harms invites paper submissions from a wide range of fields, including natural language processing, machine learning, computational social sciences, law, politics, psychology, sociology and cultural studies. We explicitly encourage interdisciplinary submissions, technical as well as non-technical submissions, and submissions that focus on under-resourced languages. We also invite non-archival submissions and civil society reports.
The topics covered by WOAH include, but are not limited to:

* New models or methods for detecting abusive and harmful online content, including misinformation;
* Biases and limitations of existing detection models or datasets for abusive and harmful online content, particularly those in commercial use;
* New datasets and taxonomies for online abuse and harms;
* New evaluation metrics and procedures for the detection of harmful content;
* Dynamics of online abuse and harms, as well as their impact on different communities;
* Social, legal, and ethical implications of detecting, monitoring and moderating online abuse.

In addition, we invite submissions related to the theme for this eighth edition of WOAH, which will be online harms in the age of large language models. Highly capable Large Language Models (LLMs) are now widely deployed and easily accessible by millions across the globe. Without proper safeguards, these LLMs will readily follow malicious instructions and generate toxic content. Even the safest LLMs can be exploited by bad actors for harmful purposes. With this theme, we invite submissions that explore the implications of LLMs for the creation, dissemination and detection of harmful online content. We are interested in how to stop LLMs from following malicious instructions and generating toxic content, but also how they could be used to improve content moderation and enable countermeasures like personalised counterspeech. To support our theme, we have invited an interdisciplinary line-up of high-profile speakers across academia, industry and public policy.

Submission

Submission is electronic, using the Softconf START conference management system.

Submission link: https://softconf.com/naacl2024/WOAH2024/manager/scmd.cgi?scmd=submitPaperCu…

The workshop will accept three types of papers.

* Academic Papers (long and short): Long papers of up to 8 pages, excluding references, and short papers of up to 4 pages, excluding references. Unlimited pages for references and appendices. Accepted papers will be given an additional page of content to address reviewer comments. Previously published papers cannot be accepted.
* Non-Archival Submissions: Up to 2 pages, excluding references, to summarise and showcase in-progress work and work published elsewhere.
* Civil Society Reports: Non-archival submissions, with a minimum of 2 pages and no upper limit. Can include work published elsewhere.

Format and styling

All submissions must use the official ACL two-column format, using the supplied official style files. The templates can be downloaded in Style Files and Formatting <https://github.com/acl-org/acl-style-files>.

Please send any questions about the workshop to organizers(a)workshopononlineabuse.com

Organisers

Paul Röttger, Bocconi University
Yi-Ling Chung, The Alan Turing Institute
Debora Nozza, Bocconi University
Aida Mostafazadeh Davani, Google Research
Agostina Calabrese, University of Edinburgh
Flor Miriam Plaza-del-Arco, Bocconi University
Zeerak Talat, MBZUAI
