- Corpora - ELRA lists

free Taster session on Corpus linguistics etc.
by Brezina, Vaclav 22 Jan '24

Hi all,

We are offering a free taster session for anyone who would like to know more about Lancaster University's courses in corpus linguistics (MA, PG Certificate, individual 3-month courses). You can join us online on 21 March, 10am-11am UK time.

Free registration: https://www.lancaster.ac.uk/linguistics/events/study-online-corpus-linguist…

Best,
Vaclav

Professor Vaclav Brezina
Professor in Corpus Linguistics
Department of Linguistics and English Language
ESRC Centre for Corpus Approaches to Social Science
Faculty of Arts and Social Sciences, Lancaster University
Lancaster, LA1 4YD
Office: County South, room C05
T: +44 (0)1524 510828
@vaclavbrezina
<http://www.lancaster.ac.uk/arts-and-social-sciences/about-us/people/vaclav-…>

Final CfP: Translation in Transition
by Ekaterina Lapshinova-Koltunski 22 Jan '24

Dear colleagues, we are happy to announce the 7th edition of the Translation in Transition conference (https://sites.google.com/view/tt2024). This series of conferences has established itself as a central meeting point for researchers in the field of empirical translation studies through previous editions in Copenhagen, Germersheim, Ghent, Barcelona, Kent and Prague. In its 7th edition held at the Shota Rustaveli State University in Batumi it once again wants to be a forum of discussion for empirical research that is based on any kind of empirical methodology and that advances our knowledge in the fields of translation and interpreting. While the Batumi edition will be open to various topics within empirical translation studies, we also want to put special emphasis on two directions: low-resourced and less-researched language pairs, as well as an interplay between different methods and data types, e.g. combining product and process research. *Final Call for Papers* We invite original submissions that deal with any of the conference topics. To encourage a fruitful exchange of ideas and experience among the researchers of various fields of specialization, preference will be given to interdisciplinary contributions that cover two or more of the conference topics. The submissions are to be made in the form of anonymized extended abstracts that should be between 800 and 1000 words long (excluding references) by February 16, 2024. Apart from a clear outline of the aims and methods of the study, the abstracts should also provide (preliminary) results. The abstracts will be submitted through the open review system (https://openreview.net/group?id=TT/2024/Conference) and reviewed by at least two members of the scientific committee. The accepted contributions will be presented either as oral talks or as posters. All submissions must follow the abstract submission instructions (https://sites.google.com/view/tt2024/submission-instructions). 
We welcome contributions (in English) grounded in empirical approaches to studying both interlingual and intralingual translation, as well as theoretical and position papers on the following topics:

* Empirical methods and models (corpus-based, corpus-driven, experimental) or methods derived from computational linguistics and data mining (e.g. computational semantics, pragmatics) applied to translation studies
* Presentation of new resources for translation studies (spoken corpora, multimodal corpora, interpreting transcript datasets, corpora of low-resourced languages, lexicons, databases, etc.)
* Method and data triangulation: combined use of corpus data and methods and other sources of data
* Detection and analysis of specific features of translation (translationese, interpretese, editese, machine translationese, post-editese, etc.) using parallel and comparable corpora
* Analysis and interpretation of variation in translation, e.g. variation driven through register/genre, expertise, mode, etc.
* Empirical analysis of specialised translation, e.g. legal translation, technical translation and others
* Analysis of non-canonical forms of translation/interpreting and multilingual communication
* Cognitive and computational insights of variation in translation and translationese
* Cognitive modeling of translation processes, including cognitive load measurements
* Translation quality assessment and evaluation using corpora or experimental research
* Translation in specific settings: between close languages, from a third language, non-native translation, indirect/relay translation, etc.
* The use of corpora in translator and/or interpreter training
* Improving understanding of translation in the context of NLP
* Computer-assisted translation and/or interpreting (CAT/CAI)
* Machine translation (MT): analysis, evaluation, selection and preparation of data for MT, ‘machine translationese’

Important dates

* Conference abstract submission due: Feb 16, 2024
* Notification of acceptance: April 8, 2024
* Final abstract version due: April 29, 2024
* Registration open: May 6, 2024
* Early-bird registration: June 6, 2024
* Conference date: September 23-25, 2024

The conference is organized by the Department of European Studies, Faculty of Humanities, Batumi Shota Rustaveli State University in Batumi (Georgia) in cooperation with the Institute of Translation Studies and Specialised Communication, University of Hildesheim (Germany).

Local organizing committee at the Batumi Shota Rustaveli State University: Khatuna Beridze, Theona Beridze, Khatuna Diasamidze, Tamta Nagervadze

Program Chairs at the University of Hildesheim: Ekaterina Lapshinova-Koltunski and Silvana Deilen

--
Prof. Dr. Ekaterina Lapshinova-Koltunski
Geschäftsführende Direktorin
Institut für Übersetzungswissenschaft und Fachkommunikation
Fachbereich 3: Sprach und Informationswissenschaften
Stiftung Universität Hildesheim
Lübecker Straße 3
31141 Hildesheim
+49 5121 883-30934

Call for Workshop Proposals -- LLcD 2024
by Pascal Denis 22 Jan '24

Hi there,

Could you please distribute the following call for workshop proposals? Thanks.

Best,
Pascal Denis

======================================

Langues et langage à la croisée des disciplines (LLcD) is a new research network, supported by the French CNRS. It organizes an international conference in France aiming to advance the scientific understanding of human language and linguistic systems, as well as to foster new collaborations and cross-fertilization among different research areas and approaches. The LLcD Meeting, slated to occur annually, will change its location and its thematic focus every year. It will be organized around a conference, as well as a panel discussion and a summer school, both within the theme promoted that year. The LLcD conference will feature plenary lectures delivered by invited keynote speakers, a general session and thematic workshops.

The first edition of the conference is scheduled to take place at Sorbonne University, Paris, from September 9th to September 11th, 2024. Its main theme will be: Languages, variations, changes: synchronic, diachronic, typological and comparative approaches. The languages of the conference will be English and French.

CALL FOR WORKSHOP PROPOSALS

At this stage, we invite workshop proposals on all sub-fields of linguistics, languages, approaches (descriptive, theoretical, empirical, interdisciplinary, etc.), in order to promote diversity across disciplines and methodologies. The proposals are not required to address the main theme of the conference. Each workshop will be organized in 30-minute slots for talks (20 min. presentation, 5 min. discussion, 5 min. room change). The languages of the conference will be English and French. Please note that this call is only for workshop proposals, and separate calls will be issued for the general session and for the summer school.
WORKSHOP PROPOSAL REQUIREMENTS

Workshop proposals will be formatted both as a docx and pdf document, containing the following information:
- Workshop title and five keywords.
- Convenors’ names, affiliations and emails.
- Workshop description summarizing the main topic, research questions and objectives, including references (between 500 and 1000 words excluding references).

The workshop description may additionally include a provisional collection of short abstracts (around 300 words) from foreseen speakers. The names of tentative keynote speakers may also be submitted (note however that LLcD will not be able to cover the associated expenses).

Submission Deadlines and Details

Workshop proposals should be submitted no later than January 30th, 2024. They will be reviewed by the scientific committee of LLcD and their inclusion in the conference program will be decided with the conference co-chairs. Notification of acceptance/rejection of the proposal will be communicated by March 1st, 2024. The list of accepted workshop proposals will be published on the conference webpage (http://llcd2024.sciencesconf.org/), and a call for papers will then be issued by the organizers. Individual submissions should be made through EasyChair, and will undergo evaluation by three referees from the LLcD scientific committee, possibly with additional support from external experts suggested by the workshop convenor(s).

EasyChair submission page: https://easychair.org/conferences/?conf=llcd2024
Contact: llcd(a)sciencesconf.org

======================================

Langues et langage à la croisée des disciplines (LLcD) is a new research group, supported by the CNRS. It organizes an international conference in France with the aim of advancing the scientific understanding of human language and linguistic systems, and of serving as a forum that fosters new collaborations and cross-fertilization between different research disciplines and approaches. The annual LLcD Meeting will change its location and theme every year. It will include a conference, as well as a panel discussion and a summer school, both linked to the theme chosen for that year's conference. The LLcD conference will comprise plenary lectures, a general session and thematic workshops.

The first edition of the conference will be held at Sorbonne University in Paris, from September 9 to 11, 2024, and its main theme will be: Languages, variations, changes: synchronic, diachronic, typological and comparative approaches. The languages of communication will be English and French. The panel discussion will take place on September 11 on the theme Linguistic evolution and biological evolution. The summer school will take place from September 12 to 14 and will be devoted to the Cognitive status of language universals and their mechanisms of diffusion.

CALL FOR WORKSHOP PROPOSALS

We invite workshop proposals covering all sub-disciplines of linguistics, addressing a variety of languages and adopting different approaches (descriptive, theoretical, empirical, interdisciplinary, etc.), in order to promote diversity across disciplines and methodologies. Proposals are not required to relate to the theme of the conference. Each workshop will consist of 30-minute slots (20 min. presentation, 5 min. discussion, 5 min. room change). The languages of communication will be English and French. Please note that this call concerns workshop proposals only. Separate calls will be issued for paper proposals for the general session of the conference and for the summer school.

GUIDELINES FOR SUBMITTING WORKSHOP PROPOSALS

Workshop proposals will be sent as docx and PDF documents and will contain the following information:
- Workshop title and five keywords.
- Names, affiliations and email addresses of the workshop organizers.
- Workshop description summarizing its topic, research questions and objectives, accompanied by a bibliography (between 500 and 1000 words, excluding references).

The workshop description may include abstracts (around 300 words) of the planned talks. The names of prospective invited speakers may also be submitted (note however that LLcD will not be able to cover travel expenses).

DATES AND DETAILS FOR SUBMITTING PROPOSALS

Workshop proposals must be submitted no later than January 30, 2024. They will be reviewed by the LLcD scientific committee, and their inclusion in the conference program will be decided with the conference co-chairs. Notification of acceptance or rejection of the proposal will be communicated by March 1, 2024. The list of accepted workshop proposals will be published on the conference webpage (http://llcd2024.sciencesconf.org/), and a call for papers will then be issued by the organizers. Individual submissions will be made via EasyChair (https://easychair.org/conferences/?conf=llcd2024) and will be evaluated by three reviewers from the LLcD scientific committee. This committee may be extended to include experts suggested by the workshop organizer(s).

Link for submitting workshop proposals: https://easychair.org/conferences/?conf=llcd2024
Contact: llcd(a)sciencesconf.org

ACL2024: First Call for Main Conference Papers
by Yuki Arase 22 Jan '24

============================
ACL 2024
Website: https://2024.aclweb.org/
Submission Deadline: 15 February 2024 (February ARR cycle)
Conference Dates: August 11-16 2024
Location: Bangkok, Thailand
Special Theme: “Open science, open data, and open models for reproducible NLP research”
Contact:
- Claire Gardent (General Chair)
- Lun-Wei Ku, André Martins, Vivek Srikumar (Program Chairs)
For questions related to paper submission, email: editors(a)aclrollingreview.org. For all other questions, email: acl2024-programchairs(a)googlegroups.com.
============================

Call for Main Conference Papers

ACL 2024 invites the submission of long and short papers featuring substantial, original, and unpublished research in all aspects of Computational Linguistics and Natural Language Processing. As in recent years, some of the presentations at the conference will be of papers accepted by the Transactions of the ACL (TACL) and by the Computational Linguistics (CL) journals. Papers submitted to ACL 2024, but not selected for the main conference, will also automatically be considered for publication in the Findings of the Association for Computational Linguistics.

=== Important Dates ===

Anonymity period: ACL changed its policy for review and citation on January 12, 2022 (https://www.aclweb.org/adminwiki/index.php/ACL_Policies_for_Review_and_Cita…). As a result, no anonymity period will be required for papers submitted for the Feb. 1, 2024 TACL deadline or the Feb. 15, 2024 ARR deadline (https://aclrollingreview.org/dates). The submissions themselves must still be fully anonymized. Please see https://www.aclweb.org/adminwiki/index.php/ACL_Anonymity_Policy for details.

Submission deadline (all papers are submitted to ARR): February 15, 2024
Papers submitted to ARR no later than February 15, 2024 will have reviews and meta-reviews by April 15, 2024, in time for the ACL 2024 commitment deadline (see below).
At submission time to ARR, authors will be asked to select one preferred venue to calculate the acceptance rate. However, selecting ACL 2024 as a preferred venue does not require authors to commit to ACL 2024. ARR reviews & meta-reviews available to authors of February cycle: April 15, 2024 Commitment deadline for ACL 2024: April 20, 2024 Deadline for authors to commit their reviewed papers, reviews, and meta-review to ACL 2024. It is not necessary to have selected ACL as a preferred venue during submission. Notification of acceptance: May 15, 2024 Withdrawal deadline: June 5, 2024 Camera-ready papers due: June 5, 2024 Tutorials: August 11, 2024 Conference: August 12-14, 2024 Workshops: August 15-16, 2024 All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”). === Submission Topics === ACL 2024 aims to have a broad technical program. Relevant topics for the conference include, but are not limited to, the following areas (in alphabetical order): * Computational Social Science and Cultural Analytics * Dialogue and Interactive Systems * Discourse and Pragmatics * Efficient/Low-Resource Methods for NLP * Ethics, Bias, and Fairness * Generation * Information Extraction * Information Retrieval and Text Mining * Interpretability and Analysis of Models for NLP * Linguistic theories, Cognitive Modeling and Psycholinguistics * Machine Learning for NLP * Machine Translation * Multilinguality and Language Diversity * Multimodality and Language Grounding to Vision, Robotics and Beyond * NLP Applications * Phonology, Morphology and Word Segmentation * Question Answering * Resources and Evaluation * Semantics: Lexical * Semantics: Sentence-level Semantics, Textual Inference and Other areas * Sentiment Analysis, Stylistic Analysis, and Argument Mining * Speech recognition, text-to-speech and spoken language understanding * Summarization * Syntax: Tagging, Chunking and Parsing === Paper Submission Details === Papers must be submitted by the ARR’s February 2024 cycle. 
Papers submitted to one of the earlier ARR deadlines are also eligible, and it is not necessary to (re) submit on the current cycle. Both long and short paper submissions should follow all of the ARR submission requirements, including: * Anonymity Period and Instructions for Two-Way Anonymized Review * Authorship policies * Citation and Comparison to the literature * Multiple Submission Policy, Resubmission Policy, and Withdrawal Policy * Ethics Policy * Limitations section * Paper Submission and Templates * Optional Supplementary Materials Papers should be submitted to one of the ARR 2023 submission sites. Final versions of accepted papers will be given one additional page of content (up to 9 pages for long papers, up to 5 pages for short papers) to address reviewers’ comments. === Theme Track: Open science, open data, and open models for reproducible NLP research === Following the success of the ACL 2020-2023 Theme tracks, we are happy to announce that ACL 2024 will have a new theme with the goal of reflecting and stimulating discussion about open science and reproducible NLP research, as well as supporting the open source software movement. We encourage contributions related to the release of high quality datasets, novel ideas for evaluation, non-trivial algorithm and toolbox implementations, and models which are properly documented (e.g. via model cards). We believe this topic is very timely and addresses a growing concern from NLP researchers. The advent of large language models as a general purpose tool for NLP, often served as closed APIs, without public information about training data and model size, perhaps even containing test data, makes it very hard to reproduce prior work and compare fairly and rigorously with newly developed models and techniques. This brings the serious risk of hindering progress in the field. 
With this theme track we seek a discussion on increased transparency in the field by promoting the use of open models and open-source initiatives in NLP as an alternative to closed approaches. The theme track invites empirical and theoretical research, descriptions and release of high quality open datasets, open models, and open source software implementations, as well as position and survey papers reflecting on the ways in which open data, open models, and open-source initiatives can contribute to advances in the field. The possible topics of discussion include (but are not limited to) the following:

* What are the advantages (and risks, if any) of making available open-source software, open datasets and models to the research community? What are the risks (scientific and societal) of not making them available?
* What kind of incentive mechanisms should be in place to encourage the creation by the research community of open, high-quality datasets and models and their adoption in experimental workflows?
* What elements of open releases (e.g. documentation, cards, licensing, testing) are essential or should be highly recommended in order to be scientifically useful and adopted by the community?

The theme track submissions can be either long or short.

Visit https://2024.aclweb.org/calls/main_conference_papers/ for more details!

CFP The First Workshop on Visualization for Natural Language Processing (Vis4NLP)
by Tariq Yousef 21 Jan '24

[Apologies for cross-postings]

Dear Colleagues,

Please find below the CFP for our “*Visualization for Natural Language Processing (Vis4NLP-2024)*” workshop, co-located with EuroVis 2024 <https://www.linkedin.com/company/eurovis-2024/>, Odense, Denmark.

Online version: http://vis4nlp.com/
Workshop paper due: March 3, 2024
Workshop date: May 27, 2024
Venue: University of Southern Denmark <http://sdu.dk>

===WORKSHOP===

*The First Workshop on Visualization for Natural Language Processing (Vis4NLP), May 27th 2024, Odense, Denmark*

The workshop will be co-located with EuroVis 2024 <https://www.eurovis.org/eurovis> in Odense, Denmark, and will take place in person on May 27. The workshop aims to create a dedicated space for interdisciplinary collaboration at the intersections of NLP and visualization. Vis4NLP serves as a pivotal platform where researchers, practitioners, and academics come together to collectively tackle the ever-evolving challenges and opportunities in NLP visualization.

*Call for Papers*

In the rapidly evolving landscape of Natural Language Processing, visualization plays a pivotal role in aiding researchers and practitioners alike. The surge in the sophistication and complexity of NLP models, driven by advancements in machine learning and deep learning, has created a pressing need for effective visualization tools and techniques. These tools are essential for gaining insights into the behavior of NLP models, comparing their performance, and addressing issues related to debugging, diagnostics, and evaluation. In this context, the importance of visualization for developing NLP tasks cannot be overstated.

Topics of interest include, but are not limited to, the following:
- Visual exploration of corpora, datasets, and model outputs
- Visual analytics tools for debugging and diagnosing NLP models
- Performance troubleshooting and comparison
- Visual analytics tools for quantitative or qualitative evaluation of NLP models
- Text annotation tools
- Text reuse visualization
- Generative and explainable AI

*Important Dates*
- Workshop paper due: March 3, 2024
- Notification of acceptance: April 10, 2024
- Camera-ready papers due: April 20, 2024
- Workshop date: May 27, 2024

All submission deadlines are at 23:59 GMT on the date indicated.

For general questions, comments, etc., please send an email to eurovis(a)vis4nlp.com

Best regards,
Tariq Yousef
Assistant Professor of Data Science
Faculty of Science - Department of Mathematics and Computer Science
University of Southern Denmark

PhD Scholarship in e-Health and/or Sign Language Translation @ Queen's University Belfast, UK
by Mohammed Hasanuzzaman 21 Jan '24

*Apologies for the multiple postings*

I am looking for a PhD student in e-Health and/or Sign Language Translation at the School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, UK. The details of the projects can be found below:
https://www.qub.ac.uk/courses/postgraduate-research/phd-opportunities/causa…
https://www.qub.ac.uk/courses/postgraduate-research/phd-opportunities/endto…

Requirements: The successful candidate is expected to have a solid background in machine learning, statistics, computer science or a related discipline. The minimum academic requirement for admission to a research degree programme is normally an Upper Second Class Honours degree from a UK or ROI HE provider, or an equivalent qualification acceptable to the University. Refer to the official application process for details.

Application Process: For enquiries, please send an email to Dr. Mohammed Hasanuzzaman at m.hasanuzzaman(a)qub.ac.uk with your CV and transcript as well as a brief description of your research interests. Unfortunately, due to a high volume of inquiries, I may not be able to respond to all emails. Submit your formal application through the following link: https://dap.qub.ac.uk/portal/user/u_login.php
Your application should be clearly marked as EEECS/2024/MH1 and/or EEECS/2024/MH2 to ensure consideration for funding.

Application Deadline: 29 February, 2024

About Queen's: Queen's University Belfast, founded almost two centuries ago, is one of the oldest universities in the United Kingdom. As a member of the prestigious Russell Group, Queen’s is one of the UK’s 24 leading research-intensive universities (ranked 13th in the UK for research intensity). Queen's has 15 subjects in the top 200 in the world (QS World University Rankings 2023). Five of those subjects are in the World Top 100 (QS World Rankings by subject 2023).
For more details: https://www.qub.ac.uk/Study/Why-Study-at-Queens/rankings-and-reputation/

*International students are welcome to apply, but additional funding ( https://www.qub.ac.uk/Study/international-students/international-scholarshi…) or personal finances will be required to cover the difference between home (UK) and overseas fees.

Best regards,
Mohammed

------------------------------------------------------
Dr. Mohammed Hasanuzzaman
Lecturer/Assistant Professor, Queen's University Belfast <https://www.qub.ac.uk/>, UK & Munster Technological University <https://www.mtu.ie/>, Ireland
Funded Investigator, ADAPT Centre <https://www.adaptcentre.ie/>, A World-Leading SFI Research Centre
Chercheur Associé, GREYC UMR CNRS 6072 Research Centre, France <https://www.greyc.fr/en/home/>
Associate Editor: IEEE Transactions on Affective Computing, Nature Scientific Reports, IEEE Transactions on Computational Social Systems, ACM TALLIP, PLOS One, Computer Speech and Language
Website: https://mohammedhasanuzzaman.github.io/

ReproNLP 2024 - First Call for Participation - Shared Task on Reproducibility of Evaluations in NLP
by Thomson, Craig 21 Jan '24

ReproNLP 2024 First Call for Participation Background Across Natural Language Processing (NLP), a growing body of work is exploring the issue of reproducibility in machine learning contexts. The field is currently far from having a generally agreed toolbox of methods for defining and assessing reproducibility. Reproducibility of results of human evaluation experiments is particularly under-addressed which is of concern for areas of NLP where human evaluation is common including e.g. MT, text generation, and summarisation. More generally, human evaluations provide the benchmarks against which automatic evaluation methods are assessed across NLP. We previously organised the First ReproGen Shared Task on reproducibility of human evaluations in NLG as part of Generation Challenges (GenChal) at INLG’21, and the Second ReproGen Shared Task at INLG’22, where we extended the scope to encompass automatic evaluation methods as well as human. Last year we expanded the scope of the shared task series to encompass all NLP tasks, renaming it the ReproNLP Shared Task on Reproducibility of Evaluations in NLP (part of the HumEval workshop at RANLP’23). This year we are focussing on human evaluations, and the results session will be held at the 4th Workshop on Human Evaluation of NLP Systems (HumEval’24 at LREC-COLING’24). As with the previous shared tasks, our overall aim is (i) to shed light on the extent to which past NLP evaluations have been reproducible, and (ii) to draw conclusions regarding how NLP evaluations can be designed and reported to increase reproducibility. With this task being run over several years, we hope to be able to document an overall increase in levels of reproducibility over time. 
About ReproNLP

ReproNLP has two tracks, one an ‘unshared task’ in which teams attempt to reproduce any prior evaluation results (Track A below), the other a standard shared task in which teams repeat existing evaluation studies with the aim of reproducing their results (Track B):

A. Open Track: Reproduce previous evaluation results from any paper, and report what happened. Unshared task.
B. ReproHum Track (papers see below): For a shared set of selected evaluation studies from the ReproHum Project, participants repeat one or more of the studies and attempt to reproduce their results, using the information provided by the ReproNLP organisers only, and following a common reproduction approach.

Track B Papers

The specific experiments listed and described below are currently the subject of reproduction studies in the ReproHum<https://reprohum.github.io/> project. The authors have provided very detailed information about the experiments. In some cases, we have introduced standardisations to the experimental design as noted in the detailed instructions to participants which will be shared upon registration. The papers and studies, with many thanks to the authors for supporting ReproHum and ReproNLP, are:

Reif et al. (2022): A Recipe for Arbitrary Text Style Transfer with Large Language Models: https://aclanthology.org/2022.acl-short.94
* Absolute evaluation study; English; 3 quality criteria; 3 datasets; varies between 4 and 6 systems and between 200 and 300 evaluation items per dataset-criterion combination; crowdsourced.

Vamvas & Sennrich (2022): As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning: https://aclanthology.org/2022.acl-short.53
* Absolute evaluation study; ZH to EN, EN to DE; 1 quality criterion; 1 dataset; 1 system and 757 evaluation items.

Bai et al. (2021): Cross-Lingual Abstractive Summarization with Limited Parallel Resources: https://aclanthology.org/2021.acl-long.538
* Relative evaluation study; ZH to EN; 3 quality criteria; 1 dataset; 4 systems and 240 evaluation items per criterion.

Puduppully & Lapata (2021): Data-to-text Generation with Macro Planning: https://aclanthology.org/2021.tacl-1.31/
* Absolute evaluation study; English; 2 quality criteria; 2 datasets; 5 systems and 400 evaluation items per dataset-criterion combination; crowdsourced.
* Relative evaluation study; English; 3 quality criteria; 2 datasets; 5 systems and 200 evaluation items per dataset-criterion combination; crowdsourced.

Liu et al. (2021): DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts: https://aclanthology.org/2021.acl-long.522
* Relative evaluation study; English; 3 quality criteria; 2 datasets; varies between 5 and 6 systems and between 960 and 1200 evaluation items per dataset-criterion combination; crowdsourced.

Dwiastuti (2019): English-Indonesian Neural Machine Translation for Spoken Language Domains: https://aclanthology.org/P19-2043
* Relative evaluation study; Indonesian; 1 quality criterion; 1 dataset; 2 systems and 50 evaluation items.

Hosking & Lapata (2021): Factorising Meaning and Form for Intent-Preserving Paraphrasing: https://aclanthology.org/2021.acl-long.112
* Relative evaluation study; English; 3 quality criteria; 1 dataset; 4 systems and 1200 evaluation items per criterion; crowdsourced.

Atanasova et al. (2020): Generating Fact Checking Explanations: https://aclanthology.org/2020.acl-main.656
* Absolute evaluation study; English; 1 quality criterion; 1 dataset; 3 systems and 240 evaluation items.
* Relative evaluation study; English; 4 quality criteria; 1 dataset; 3 systems and 40 evaluation items per criterion.

August et al. (2022): Generating Scientific Definitions with Controllable Complexity: https://aclanthology.org/2022.acl-long.569
* Absolute evaluation study; English; 5 quality criteria; 2 datasets; 3 systems and 300 evaluation items per dataset-criterion combination; some crowdsourced.

Hosking et al. (2022): Hierarchical Sketch Induction for Paraphrase Generation: https://aclanthology.org/2022.acl-long.178
* Relative evaluation study; English; 3 quality criteria; 1 dataset; 4 systems and 1800 evaluation items per criterion; crowdsourced.

Yao et al. (2022): It is AI’s Turn to Ask Humans a Question: Question-Answer Pair Generation for Children’s Story Books: https://aclanthology.org/2022.acl-long.54
* Absolute evaluation study; English; 3 quality criteria; 1 dataset; 3 systems and 361 evaluation items per criterion.

Chakrabarty et al. (2022): It’s not Rocket Science: Interpreting Figurative Language in Narratives: https://aclanthology.org/2022.tacl-1.34/
* Absolute evaluation study; English; 1 quality criterion; 2 datasets; varies between 6 and 8 systems and between 150 and 200 evaluation items per dataset; crowdsourced.
* Relative evaluation study; English; 1 quality criterion; 2 datasets; 3 systems and varies between 225 and 300 evaluation items per dataset; crowdsourced.

Feng et al. (2021): Language Model as an Annotator: Exploring DialoGPT for Dialogue Summarization: https://aclanthology.org/2021.acl-long.117
* Absolute evaluation study; English; 3 quality criteria; 2 datasets; 7 systems and varies between 70 and 700 evaluation items per dataset-criterion combination.

Lux & Vu (2022): Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features: https://aclanthology.org/2022.acl-long.472
* Relative evaluation study; German; 1 quality criterion; 1 dataset; 2 systems and 12 evaluation items.

Gu et al. (2022): MemSum: Extractive Summarization of Long Documents Using Multi-Step Episodic Markov Decision Processes: https://aclanthology.org/2022.acl-long.450
* Relative evaluation study; English; 3 quality criteria; 1 dataset; 2 systems and varies between 63 and 67 evaluation items per criterion.

Gabriel et al. (2022): Misinfo Reaction Frames: Reasoning about Readers’ Reactions to News Headlines: https://aclanthology.org/2022.acl-long.222
* Absolute evaluation study; English; 3 quality criteria; 1 dataset; 3 systems and 588 evaluation items per criterion; crowdsourced.

Kasner & Dusek (2022): Neural Pipeline for Zero-Shot Data-to-Text Generation: https://aclanthology.org/2022.acl-long.271
* Absolute evaluation study; English; 5 quality criteria; 2 datasets; 6 systems and 600 evaluation items per dataset-criterion combination.

Shardlow & Nawaz (2019): Neural Text Simplification of Clinical Letters with a Domain Specific Phrase Table: https://aclanthology.org/P19-1037
* Relative evaluation study; English; 1 quality criterion; 1 dataset; 4 systems and 100 evaluation items; crowdsourced.

Castro Ferreira et al. (2018): NeuralREG: An end-to-end approach to referring expression generation: https://aclanthology.org/P18-1182
* Absolute evaluation study; English; 3 quality criteria; 1 dataset; 6 systems and 144 evaluation items per criterion; crowdsourced.

Same et al. (2022): Non-neural Models Matter: a Re-evaluation of Neural Referring Expression Generation Systems: https://aclanthology.org/2022.acl-long.380
* Absolute evaluation study; English; 3 quality criteria; 2 datasets; varies between 6 and 8 systems and between 180 and 384 evaluation items per dataset-criterion combination; crowdsourced.

Lin et al. (2022): Other Roles Matter! Enhancing Role-Oriented Dialogue Summarization via Role Interactions: https://aclanthology.org/2022.acl-long.182
* Absolute evaluation study; Chinese; 3 quality criteria; 1 dataset; 4 systems and 800 evaluation items per criterion.
Track A and B Instructions

Step 1. Fill in the registration form<https://docs.google.com/forms/d/e/1FAIpQLSetAzVp3BVkTaf8i0MZuya_sAKuP3ii6zC…>, indicating which of the above papers, or which other paper(s), you wish to carry out a reproduction study for.
Step 2. After registration, the ReproNLP participants information will be made available to you, plus data, tools and other materials for each of the studies you have selected in the registration form.
Step 3. Carry out the reproduction, and submit a report of up to 8 pages plus references and supplementary material including a completed ReproGen Human Evaluation Sheet (HEDS) for each reproduction study, by April 1st 2024.
Step 4. The organisers will carry out light touch review of the evaluation reports according to the following criteria:
* Evaluation sheet has been completed.
* Exact repetition of study has been attempted and is described in the report.
* Report gives full details of the reproduction study, in accordance with the reporting guidelines provided.
* All tools and resources used in the study are publicly available.
Step 5. Present paper at the results meeting.

Reports will be included in the HumEval’24 proceedings, and results will be presented at the workshop in May 2024. Full details and instructions will be provided as part of the ReproNLP participants information.

Important Dates
Report submission deadline: 1st of April 2024
Acceptance notification: 9th of April 2024
Camera-ready reports due: 19th of April 2024
Presentation of results: 21st of May 2024
All deadlines are 23:59 UTC-12.

Organisers
Anya Belz, ADAPT/DCU, Ireland
Craig Thomson, University of Aberdeen, UK
Ehud Reiter, University of Aberdeen, UK

Contact
anya.belz(a)adaptcentre.ie, c.thomson(a)abdn.ac.uk
https://repronlp.github.io

Call for Papers: 9th Symposium on Corpus Approaches to Lexicogrammar (LxGr2024)
by Costas Gabrielatos 19 Jan '24

9th Symposium on Corpus Approaches to Lexicogrammar (LxGr2024)

CALL FOR PAPERS

Deadline for abstract submission: Friday 15 March 2024

The symposium will take place online on Friday 5 and Saturday 6 July 2024.

Invited Speakers
Lise Fontaine<http://www.uqtr.ca/PagePerso/Lise.Fontaine> (Université du Québec à Trois-Rivières): Reconciling (or not) lexis and grammar
Ute Römer-Barron<http://alsl.gsu.edu/profile/ute-romer> (Georgia State University): Phraseology research in second language acquisition

LxGr primarily welcomes papers reporting on corpus-based research on any aspect of the interaction of lexis and grammar - particularly studies that interrogate the system lexicogrammatically to get lexicogrammatical answers. However, position papers discussing theoretical or methodological issues are also welcome, as long as they are relevant to both lexicogrammar and corpus linguistics.

If you would like to present, send an abstract of 500 words (excluding references) to lxgr(a)edgehill.ac.uk. Make sure that the abstract clearly specifies the research focus (research questions or hypotheses), the corpus, the methodology (techniques and metrics), the theoretical orientation, and the main findings.

Abstracts will be double-blind reviewed, and decisions will be communicated within four weeks.

Full papers will be allocated 35 minutes (including 10 minutes for discussion). Work-in-progress reports will be allocated 20 minutes (including 5 minutes for discussion). There will be no parallel sessions.

Participation is free.

For details, visit the LxGr website: https://sites.edgehill.ac.uk/lxgr/lxgr2024
If you have any questions, contact lxgr(a)edgehill.ac.uk.

[CFP] NLDB 2024 - The 29th International Conference on Natural Language & Information Systems
by Federico Torrielli 19 Jan '24

* We apologize if you receive multiple copies of this CFP *

For the online version of this Call, visit: https://nldb2024.di.unito.it/submissions/

NLDB 2024
The 29th International Conference on Natural Language & Information Systems
25-27 June 2024, University of Turin, Italy.
Website: https://nldb2024.di.unito.it/
Submission deadline: 22 March, 2024

About NLDB

The 29th International Conference on Natural Language & Information Systems will be held at the University of Turin, Italy, and will be a face-to-face event. Since 1995, the NLDB conference has brought together researchers, industry practitioners, and potential users interested in various applications of Natural Language in the Database and Information Systems field. The term "Information Systems" has to be considered in the broader sense of Information and Communication Systems, including Big Data, Linked Data and Social Networks.

The field of Natural Language Processing (NLP) has itself recently experienced several exciting developments. In research, these developments have been reflected in the emergence of Large Language Models and the importance of aspects such as transparency, bias and fairness; Large Multimodal Models and the connection of the NLP field with Computer Vision; and chatbots and dialogue-based pipelines.

Regarding applications, NLP systems have evolved to the point that they now offer real-life, tangible benefits to enterprises. Many of these NLP systems are now considered a de facto offering in business intelligence suites, such as algorithms for recommender systems and opinion mining/sentiment analysis. Language models developed by the open-source community have become widespread and commonly used. Businesses are now readily adopting these technologies, thanks to the efforts of the open-source community. For example, fine-tuning a language model on a company’s own dataset is now easy and convenient, using modules created by thousands of academic researchers and industry experts.
It is against this backdrop of recent innovations in NLP and its applications in information systems that the 29th edition of the NLDB conference takes place. We welcome research and industrial contributions describing novel, previously unpublished work on NLP and its applications across a plethora of topics as described in the Call for Papers.

Call for Papers:

NLDB 2024 invites authors to submit papers on unpublished research that addresses theoretical aspects, algorithms, applications, architectures for applied and integrated NLP, resources for applied NLP, and other aspects of NLP, as well as survey and discussion papers. This year's edition of NLDB continues with the Industry Track to foster fruitful interaction between the industry and the research community.

Topics of interest include but are not limited to:
* Large Language Models: training, applications, transfer learning, interpretability of large language models.
* Multimodal Models: Integration of text with other modalities like images, video, and audio; multimodal representation learning; applications of multimodal models.
* AI Safety and ethics: Safe and ethical use of Generative AI and NLP; avoiding and mitigating biases in NLP models and systems; explainability and transparency in AI.
* Natural Language Interfaces and Interaction: design and implementation of Natural Language Interfaces, user studies with human participants on Conversational User Interfaces, chatbots and LLM-based chatbots and their interaction with users.
* Social Media and Web Analytics: Opinion mining/sentiment analysis, irony/sarcasm detection; detection of fake reviews and deceptive language; detection of harmful information: fake news and hate speech; sexism and misogyny; detection of mental health disorders; identification of stereotypes and social biases; robust NLP methods for sparse, ill-formed texts; recommendation systems.
* Deep Learning and eXplainable Artificial Intelligence (XAI): Deep learning architectures, word embeddings, transparency, interpretability, fairness, debiasing, ethics.
* Argumentation Mining and Applications: Automatic detection of argumentation components and relationships; creation of resources (e.g. annotated corpora, treebanks and parsers); integration of NLP techniques with formal, abstract argumentation structures; Argumentation Mining from legal texts and scientific articles.
* Question Answering (QA): Natural language interfaces to databases, QA using web data, multi-lingual QA, non-factoid QA (how/why/opinion questions, lists), geographical QA, QA corpora and training sets, QA over linked data (QALD).
* Corpus Analysis: multi-lingual, multi-cultural and multi-modal corpora; machine translation, text analysis, text classification and clustering; language identification; plagiarism detection; information extraction: named entity, extraction of events, terms and semantic relationships.
* Semantic Web, Open Linked Data, and Ontologies: Ontology learning and alignment, ontology population, ontology evaluation, querying ontologies and linked data, semantic tagging and classification, ontology-driven NLP, ontology-driven systems integration.
* Natural Language in Conceptual Modelling: Analysis of natural language descriptions, NLP in requirement engineering, terminological ontologies, consistency checking, metadata creation and harvesting.
* Natural Language and Ubiquitous Computing: Pervasive computing, embedded, robotic and mobile applications; conversational agents; NLP techniques for Internet of Things (IoT); NLP techniques for ambient intelligence.
* Big Data and Business Intelligence: Identity detection, semantic data cleaning, summarisation, reporting, and data to text.
Important Dates:
Full paper submission: 22 March, 2024
Paper notification: 19 April, 2024
Camera-ready deadline: 26 April, 2024
Conference: 25-27 June 2024

Submission Guidelines:

Authors should follow the LNCS format (https://www.springer.com/gp/computer-science/lncs/conference-proceedings-gu… ) and submit their manuscripts in PDF via EasyChair (submission will open on 1 February, 2024). Papers can be submitted to either the main conference or the industry track.

Submissions can be full papers (up to 15 pages including references and appendices), short papers (up to 11 pages including references and appendices) or papers for a poster presentation or system demonstration (6 pages including references). The programme committee may decide to accept some full papers as short papers or poster papers.

All questions about submissions should be emailed to federico.torrielli(a)unito.it (Web & Publicity Chair).

General Chairs:
Luigi Di Caro, University of Turin
Farid Meziane, University of Derby
Amon Rapp, University of Turin
Vijayan Sugumaran, Oakland University

Invitation: UCREL seminar: Wednesday 24th @ 13:00 - Ar... - Fri 19 Jan 2024 3:30PM - 4:30PM (CET) (corpora@list.elra.info)
by Marica Belmonte 19 Jan '24

[Corpora-List] UCREL seminar: Wednesday 24th @ 13:00 - Are rule-based approaches a thing of the past? The case of anaphora resolution.

Friday 19 Jan 2024 ⋅ 3:30PM – 4:30PM Central European Time - Amsterdam
Join with Google Meet: https://meet.google.com/eft-bkah-zuc?hs=224

At the UCREL / NLP group (https://ucrel.lancs.ac.uk/) we’re excited to have Prof. Ruslan Mitkov with us on Wednesday 24th January 13:00 – 14:00 (UTC/GMT) in our next UCREL corpus research seminar which you can join online via the Teams link below. Ruslan will present an extended version of his recent talk from ICON2023:

Are rule-based approaches a thing of the past? The case of anaphora resolution.
Ruslan Mitkov https://wp.lancs.ac.uk/mitkov/

Abstract: In this talk I shall present the results of a study which evaluates and compares new variants of a popular rule-based anaphora resolution algorithm (Mitkov 1998, 2002) with the original version. We seek to establish whether configurations that benefit from Deep Learning, LLMs and eye-tracking data (always) outperform the original...

Organizer: Marica Belmonte marica.belmonte(a)gmail.com
Guests: Marica Belmonte - organizer; corpora(a)list.elra.info
View all guest info: https://calendar.google.com/calendar/event?action=VIEW&eid=NTRjc2I1MDJrMXZy…
Reply for corpora(a)list.elra.info and view more details: https://calendar.google.com/calendar/event?action=VIEW&eid=NTRjc2I1MDJrMXZy…
Your attendance is optional.
