June 2024 - Corpora - ELRA lists

Global WordNet Conference 2025 First Call for Papers
by Itziar Gonzalez Dios 21 Jun '24

[ Apologies for crossposting ]

*Global WordNet Conference 2025 - GWC2025*

The Global Wordnet Association is delighted to announce the *13th International Global Wordnet Conference* (GWC2025), to be held in *Pavia (Italy) from 27 to 31 January 2025*. The GWC2025 conference will be hosted by the Department of Humanities at the University of Pavia.

*Dates*: 27-Jan-2025 - 31-Jan-2025
*Location*: Pavia, Italy
*Meeting Email*: gwc2025pavia(a)unipv.it
*Web Site*: https://unipv-larl.github.io/GWC2025/
*Call Deadline*: 07-Oct-2024

We invite submissions of original research contributions addressing, though not limited to, the topics listed below. *Presentations of new WordNets* will be assigned to a dedicated panel. Additionally, proposals for tutorials and demonstrations or panel discussions on *WordNet for ancient languages* are encouraged.

Conference topics:
- Lexical semantics and meaning representation;
- Architecture of lexical databases;
- Tools and methods for WordNet development;
- Applications of WordNet;
- Standardization, distribution and availability of WordNet and WordNet tools

See the full call for papers here: https://easychair.org/cfp/gwc2025

PhD student contract in Natural Language Processing - Nancy, France
by Mathieu Constant 20 Jun '24

You can find below an offer for a PhD student contract in Natural Language Processing at Univ. of Lorraine, Nancy, France.

Subject: Automatic generation of explanations for multiword expressions in the context of language learning
Thesis supervisors: Mathieu Constant (ATILF, Univ. Lorraine, France) and Patrick Watrin (CENTAL, Univ. of Louvain, Belgium)
Thesis funded for three years by the ANR STAR-FLE project
Start date: 1 October 2024
Salary: 2135,00 € gross monthly
Host laboratory: ATILF (Computer Processing and Analysis of the French Language)
Location: Nancy, France
Application deadline: July 11, 2024

Scientific background:

The successful candidate will join the ATILF, a research unit in language sciences, and in particular the research group on natural language processing (NLP). This research group works, among other things, on exploiting recent NLP models for linguistic modelling (e.g. lexical modelling) with applications in the medical field and language learning. In particular, its work is based on the integration of large (generative) language models and knowledge bases (e.g. scientific textual data, lexical resources).

More specifically, the thesis will be part of the STAR-FLE project (STrategic Adaptations for better Reading and Text Comprehension in FFL) funded by the Agence Nationale de la Recherche for 4 years (2024-2027). The project is in the field of computer-assisted language teaching. The aim of STAR-FLE is to gain a better understanding of the difficulties encountered by learners of French as a foreign language (FFL) when faced with the lexicon present in authentic texts. It will propose digital solutions based on natural language processing (NLP) to facilitate text comprehension and enable teachers to better manage heterogeneous levels in the classroom. Contextual aids and personalized vocabulary adaptations are envisaged, particularly for multiword expressions.

Objectives:

The thesis will focus on multiword expressions. These are combinations of several lexical units that are composed in an irregular manner on one or more linguistic levels (morphology, syntax, semantics, etc.). The term covers a wide variety of phenomena, such as idiomatic expressions (run around in circles, dry run), support verb constructions (take a walk), complex functional units (in spite of), etc. This non-compositionality, which can lead to a certain semantic opacity, can pose problems for learners when reading.

In this thesis, the person recruited will develop methods based on new NLP techniques to produce in-context explanatory cards enabling learners to better understand these expressions. The production of these cards will be based on the prediction of linguistic properties (e.g. a dry run is not dry), on the generation of natural language explanations using large generative language models (e.g. paraphrases), or on semantic linking to different lexical resources (e.g. to retrieve definitions and lexical neighbors), depending on the context in which the expression occurs. One of the challenges will be to propose explanatory cards adapted to the learner's level.

Application requirements and procedures

Candidates should have the following skills and profiles:
- a Master's degree in computational linguistics, in natural language processing, in computer science or in cognitive science
- very good programming skills
- very good skills in recent models of natural language processing (e.g. large language models)

Applications should include a cover letter, CV and Master's grades, together with references or one or more letters of recommendation. They should be submitted at the following URL: https://emploi.cnrs.fr/Offres/Doctorant/UMR7118-SABMAR-020/Default.aspx?lan…

For more information, do not hesitate to contact Mathieu Constant (Mathieu.Constant(a)univ-lorraine.fr).

Practical D2T 2024 @ INLG 2024 - First call for papers (data-to-text, neuro-symbolic, shared task)
by Simone Balloccu 20 Jun '24

The 2nd Workshop on Practical LLM-assisted Data-to-Text Generation (Practical D2T 2024)

While large language models (LLMs) promise to become a viable alternative to traditional rule-based data-to-text (D2T) natural language generation (NLG), they still suffer from well-known neural model issues, such as lack of controllability and risk of producing harmful text. There are many potential solutions to this problem up for discussion. The Practical D2T workshop at INLG 2024 aims to build a space for researchers to discuss and present innovative work on D2T systems using LLMs. Building upon the 2023 edition’s hackathon, Practical D2T 2024 opens up a broader range of activities, including a special track for neuro-symbolic D2T approaches and a shared task in D2T evaluation focused on semantic accuracy.

Website: https://practicald2t.github.io/
Practical D2T 2023 at INLG 2023: https://practicald2t.github.io/2023/

Workshop Topic and Content

Practical D2T 2024 will be a full-day in-person-only event. We welcome contributions from both original unpublished work and non-archival submissions, in the form of long (8 pages) or short (4 pages) papers, on topics including but not limited to:
- Design, implementation and evaluation of LLM-assisted D2T systems
- Cross-domain adaptation of LLMs for D2T
- User perceptions and acceptance of LLM-generated text in D2T
- Bias, fairness and red-teaming issues in LLM-assisted D2T systems
- Leveraging LLMs for D2T in low-resource languages and domains
- Error analysis and debugging techniques for LLM-assisted D2T
- Human-in-the-loop approaches for improving LLM-assisted D2T
- Comparison between LLM-assisted D2T and traditional symbolic approaches

Special Track: Neuro-Symbolic D2T

Research is currently seeing a renewed interest in developing systems combining neural and symbolic approaches to improve explainability and reduce dependence on training data. Practical D2T 2024 will feature a special track on neuro-symbolic approaches to D2T.
Submissions for papers in the special track follow the same requirements and procedure as the main workshop submissions.

Shared task: Improving Semantic Accuracy in LLM-assisted D2T

This year will feature a shared task on improving semantic accuracy of D2T systems. Participants will build an LLM-assisted D2T system to generate textual reports from various domains, such as weather forecasting, product descriptions or sports reports. We will provide testing data obtained from public APIs, to limit the chance that the LLMs used have previously been exposed to it. We encourage participants to focus on system robustness and objective evaluation, rather than metric scores. Because of this, participants will receive an initial evaluation script, which they are encouraged to change or improve. All submitted systems’ outputs will be evaluated against every submitted custom evaluation, and correlated with human ratings. The system reaching the highest correlation with humans will be declared winner of the competition. Results and participants’ system descriptions will be featured in the workshop proceedings.

For more info, visit the workshop website: https://practicald2t.github.io/pages/cfp

Important dates

Note: all deadlines are 23:59 UTC-12.
- Evaluation script and data release for known domains (shared task): 24 June
- Regular paper submission (main & special track, archival & non-archival): 22 July
- Known domains system output submission & surprise domain data release: 29 July
- Surprise domain system outputs submission: 5 August
- System description submission (shared task): 12 August
- Notification of acceptance (main, special track and shared task): 19 August
- Camera-ready (main, special track and shared task): 28 August
- Workshop: 23/24 September (to be announced)

Contacts and more info:

Find detailed information about submission, deadlines and contacts on the official Practical D2T 2024 website: https://practicald2t.github.io/
For any query, contact the organisers at d2t2024(a)googlegroups.com
If you have any problem with the above mail group, contact balloccu(a)ufal.mff.cuni.cz

Organisers

Simone Balloccu, Ondřej Dušek, Patrícia Schmidtová, Zdeněk Kasner, Kristýna Onderková, Ondřej Plátek, Mateusz Lango - Charles University (CZ)
Ehud Reiter - University of Aberdeen (UK)
Lucie Flek - University of Bonn (DE)
Simon Mille - ADAPT Centre (UK)
Dimitra Gkatzia - Edinburgh Napier University (UK)

ArgNLE: Natural Language Argument-Based Explanations - Workshop at ECAI 2024
by elena_cabrio 20 Jun '24

*Call for Papers:* The First Workshop on Natural Language Argument-Based Explanations (ArgNLE - https://argnle.github.io/ECAI-ArgNLE/)

Co-located with ECAI 2024 (https://www.ecai2024.eu/). Universidad de Santiago de Compostela, Spain.

*Workshop description*

Explainability and Computational Argumentation have usually been approached as separate, independent research topics, which neglects many aspects arising from considering the interdependencies between them. To be effective for human users, explanations are required to be formulated in natural language, possibly in an argumentative fashion. A workshop on exploring Natural language Argument-based Explanations is proposed to investigate this challenging topic, at the crossroads of these different research fields.

Providing high quality explanations for AI predictions based on machine learning is a challenging and complex task. To work well it requires, among other factors: selecting a proper level of generality/specificity of the explanation; considering assumptions about the familiarity of the explanation beneficiary with the AI task under consideration; referring to specific elements that have contributed to the decision; making use of additional knowledge (e.g., metadata) which might not be part of the prediction process; selecting appropriate examples; providing evidence supporting negative hypotheses. Finally, the system needs to formulate the explanation in a clearly interpretable, and possibly convincing, way.

Given these considerations, the workshop welcomes contributions showing an integrated vision of Explainable AI (XAI), where low level characteristics of the deep learning process are combined with higher level schemas characteristic of human argumentation.
This integrated vision relies on three main considerations: i) In neural architectures the correlation between internal states of the network and the justification of the network classification outcome is not well studied; ii) High quality explanations are crucially based on argumentation mechanisms (e.g., provide supporting examples and rejected alternatives); iii) In real settings, providing explanations is inherently an interactive process involving the system and the user.

Accordingly, the workshop calls for cross-disciplinary contributions in three areas, i.e., deep learning, argumentation and interactivity, to support a broader and innovative view of explainable AI. More precisely, the workshop is intended to discuss research challenges that will help advance the state of the art in explainable AI.

Providing explanations to support a certain conclusion has been largely studied in logic, as a fundamental characteristic of human reasoning. As a result, both theoretical and computational models of human argumentation are investigated. The recent resurgence of AI highlighted the idea that low level system behaviors not only need to be interpretable (e.g., showing those elements that most contributed to the system decision), but also need to fit high level human schemas to produce convincing arguments.
*Topics of interest*

* Natural language argument-based explanations
* Dialectical, dialogical and conversational explanations
* AI methods to support argumentative explainability
* User-acceptance and evaluation of argumentation-based explanations
* Tools that provide argumentation-based explanations
* Use of argument-based explanations for research from the social sciences, digital humanities, and related fields
* Real-world applications

The workshop solicits the submission of three types of contributions relevant to the workshop topics and suitable to generate discussion:

* Original, unpublished contributions
* Dataset related submissions (presenting a dataset or a corpus related to the workshop topics that has been developed or is currently under development; these papers may have already been published in another venue)
* Projects related submissions (presenting funded projects or lines of work within the topics of the workshop, both academic and industrial)

*Invited speaker*

Professor Francesca Toni, Faculty of Engineering, Department of Computing, Imperial College London, UK. (https://www.imperial.ac.uk/people/f.toni)

*Important Dates*

* Paper submission: 31 May 2024
* Notification of acceptance: 1 July 2024
* Camera-ready papers: 31 July 2024
* ArgNLE workshop: 19 or 20 October 2024

*Submission Instructions*

Papers must be written in English, be prepared for double-blind review using the ECAI LaTeX template, and not exceed 7 pages (not including references). The ECAI LaTeX Template can be found at https://ecai2024.eu/download/ecai-template.zip.
Papers should be submitted via EasyChair: https://easychair.org/conferences/?conf=argnle2024

*Workshop Organizers:*

* Rodrigo Agerri <https://ragerri.github.io/> - HiTZ Center - Ixa, University of the Basque Country UPV/EHU, Spain
* Elena Cabrio <https://www-sop.inria.fr/members/Elena.Cabrio/> - Université Côte d’Azur, Inria, CNRS, I3S, France
* Serena Villata <https://webusers.i3s.unice.fr/~villata/Home.html> - Université Côte d’Azur, Inria, CNRS, I3S, France
* Marcin Lewinski <https://ifilnova.pt/en/people/marcin-lewinski/> - IFILNOVA, Universidade Nova de Lisboa, Portugal
* Bernardo Magnini <http://hlt.fbk.eu/people/magnini> - Fondazione Bruno Kessler, Italy
* Marie-Francine Moens <https://people.cs.kuleuven.be/~sien.moens/> - KU Leuven, Belgium

Final call: 3-years PhD/PostDoc position on NLP for Social Good at Leibniz University Hannover (due June 23)
by h.wachsmuth＠ai.uni-hannover.de 20 Jun '24

The Institute of Artificial Intelligence invites applications for the position of a

DOCTORAL OR POSTDOCTORAL RESEARCHER (M/F/D) ON THE TOPIC OF NATURAL LANGUAGE PROCESSING (NLP) FOR SOCIAL GOOD (SALARY SCALE 13 TV-L, 100%)

starting in September 2024 or soon afterwards. The position is limited to a period of three years with the possibility of extension.

TASKS

The goal of the offered position is to carry out innovative research on NLP, aiming for scientific publications at reputed international venues. The research should involve LARGE LANGUAGE MODELS (LLMs) related to NLP FOR SOCIAL GOOD. We support the development of own research directions in this broad context. The position also comes with a teaching duty of four hours per week; the candidate is expected to lead tutorials and/or programming labs as well as to support the supervision of bachelor's and master’s students. We are looking for highly motivated candidates with a passion for creativity and learning who seek to make a positive impact through open and independent research in a young team.

YOUR PROFILE

- Completed academic degree (Master or comparable) in computer science, computational linguistics, artificial intelligence, or related disciplines
- Solid understanding of machine learning with hands-on experience, ideally in the context of NLP and LLMs
- Proficient programming skills in Python
- Good scientific writing skills (for example, shown by a very good master’s thesis) are expected
- Strong communication skills in English, both in oral and in written form

TEAM

The position will be placed in the NLP Group at the Institute of Artificial Intelligence. We are a diverse and international team, studying how humans express their views and intentions in language, and how LLMs can understand and create such language in a fair, trustworthy, and explainable way.
Our research tackles interdisciplinary questions from the humanities and social sciences, while building on state-of-the-art NLP techniques, such as instruction fine-tuning and contrastive learning. We seek to do cutting-edge research on artificial intelligence methods that have a positive impact on society and the world.

OUR OFFER

- Creative and innovative work in a diverse and international team
- Possibility to obtain a Ph.D. degree or to shape your Postdoc profile
- State-of-the-art research facilities, including top-notch computing clusters
- Participation in international scientific events and research collaborations
- Salary at the level of 100% of salary scale 13 according to the Collective Agreement for the Public Service of the Länder (TV-L)

D&I

Leibniz University Hannover considers itself a family-friendly university and therefore promotes a balance between work and family responsibilities. Part-time employment can be arranged upon request. The university aims to promote equality between women and men. For this purpose, the university strives to reduce under-representation in areas where a certain gender is under-represented. Women are under-represented in the salary scale of the advertised position. Therefore, qualified women are encouraged to apply. Moreover, we welcome applications from qualified men. Preference will be given to equally-qualified applicants with disabilities.

QUESTIONS

In case you have questions, please contact Maja Stahl (email: m.stahl(a)ai.uni-hannover.de).
Further information about the NLP Group can be found at: https://www.ai.uni-hannover.de/en/institute/research-groups/nlp
For information on the salary scales, see: https://oeffentlicher-dienst.info/c/t/rechner/tv-l/west?id=tv-l-2023&matrix…

APPLICATION

Please submit your application with supporting documents (including CV, full set of transcripts, a brief statement of at most 1 page of why you apply to the NLP Group, and possibly further qualifications) by June 23, 2024 as A SINGLE PDF FILE to

Email: office(a)ai.uni-hannover.de (subject: “[ai-nlp] Application”)

or alternatively by post to:

Gottfried Wilhelm Leibniz Universität Hannover
Institute of Artificial Intelligence
Prof. Dr. Henning Wachsmuth
Welfengarten 1, 30167 Hannover
Germany

http://www.uni-hannover.de/jobs

Information on the collection of personal data according to article 13 GDPR can be found at https://www.uni-hannover.de/en/datenschutzhinweis-bewerbungen/.

LxGr2024: Call for Participation
by Costas Gabrielatos 19 Jun '24

9th Symposium on Corpus Approaches to Lexicogrammar (LxGr2024)
5-6 July 2024. Online. Attendance is free.

Symposium programme: https://sites.edgehill.ac.uk/lxgr/lxgr2023
Registration is now open: https://store.edgehill.ac.uk/conferences-and-events/conferences/conferences…

If you have any questions, or if you want to be added to the LxGr mailing list, contact: lxgr(a)edgehill.ac.uk.

CALL FOR PAPERS: KGLLM 2024 : Special session on Knowledge Graphs and Large Language Models, 19-20 October 2024
by Mourad Abbas 19 Jun '24

KGLLM 2024 : Special session on Knowledge Graphs and Large Language Models
Oct 19, 2024
Trento, Italy
Link: https://www.icnlsp.org/2024welcome/#special_session

The Special session on Knowledge Graphs and Large Language Models will be held within the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024 <https://www.icnlsp.org/2024welcome/>) on October 19, 2024.

** DESCRIPTION **

“In recent years, the fields of Knowledge Graphs (KGs) and Large Language Models (LLMs) have witnessed remarkable advancements, revolutionizing the landscape of artificial intelligence and natural language processing. KGs, structured representations of knowledge, and LLMs, powerful language models trained on vast amounts of text data, have individually demonstrated their prowess in various applications. However, the integration and synergy between KGs and LLMs have emerged as a new frontier, offering unprecedented opportunities for enhancing knowledge representation, understanding, and generation. This integration not only enriches the semantic understanding of textual data but also empowers AI systems with the ability to reason, infer, and generate contextually relevant responses.

** TOPICS **

This special session aims to delve into the theoretical foundations, historical perspectives, and practical applications of the fusion between Knowledge Graphs and Large Language Models. We invite contributions that explore the following areas:

1- Theoretical Frameworks: Papers elucidating the theoretical underpinnings of integrating KGs and LLMs, including methodologies, algorithms, and models for knowledge-enhanced language understanding and generation.
2- Historical Perspectives: Insights into the evolution of KGs and LLMs, tracing their development trajectories, seminal works, and transformative milestones leading to their integration.
3- Design and Implementation: Research articles focusing on the design principles, architectures, and techniques for effectively combining KGs and LLMs to facilitate tasks such as information retrieval, question answering, knowledge inference, and natural language understanding.
4- Explanatory Capabilities: Explorations into how the fusion of KGs and LLMs enables the development of explainable AI systems, providing transparent and interpretable insights into model decisions and outputs.
5- Human-Centered Intelligent Systems: Studies examining the design and deployment of interactive AI systems that leverage KGs and LLMs to facilitate seamless human-computer interaction, catering not only to experts but also to a broader lay audience.

We encourage submissions that contribute to advancing our understanding of the synergistic relationship between Knowledge Graphs and Large Language Models, fostering interdisciplinary collaborations across computer science, artificial intelligence, linguistics, cognitive science, and beyond. By shedding light on this burgeoning area of research, this special session aims to propel the field forward and inspire future innovations in AI-driven knowledge representation and natural language processing.”

** SESSION ORGANIZERS **

Gérard Chollet, CNRS-SAMOVAR Institut Polytechnique de Paris, France.
Hugues Sansen, Institut Polytechnique de Paris, France.

** IMPORTANT DEADLINES **

Submission deadline: 30 June 2024 11:59 PM (GMT)
Notification of acceptance: 15 September 2024
Camera-ready paper due: 25 September 2024

** PUBLICATION **

The accepted papers will be included in the ICNLSP Conference proceedings, which will be published in the ACL Anthology. The extended versions will be published in a special issue of the Machine Learning and Knowledge Extraction Journal (MAKE), indexed in Web of Science, Scopus, etc.

** CONTACT **

icnlsp(at)gmail(dot)com

PhD position, Grenoble, Structural Biases for Semantic Prediction, 3 years
by maximin coavoux 19 Jun '24

Title: Structural Biases for Compositional Semantic Prediction

# Scientific context

Compositionality is a foundational hypothesis in formal semantics and states that the semantic interpretation of an utterance is a function of its parts and how they are combined (i.e. their syntactic structure). In NLP, the current dominant paradigm is to design end-to-end models with no intermediate linguistically interpretable representations, which is often motivated by the fact that pretrained language models implicitly encode latent syntactical representations. However, recent studies suggest that the syntactic information learned by language models is insufficient and that, in their current form, they are unable to exploit the syntactic information provided in their input when they need to generate a structured output.

Strikingly, most systems that obtained decent results on compositional generalization benchmarks either (i) include some data augmentation methods that increase the exposure of the model to diverse syntactic structures at training time, or (ii) resort to a natural language parser and hand-crafted rules to derive the semantic representation from the syntactic tree. These two approaches are efficient, but they still have limitations that need to be addressed. Firstly, data augmentation bypasses the issue altogether, is tied to a particular dataset or task and requires additional computation, both for generating new data and for re-training or fine-tuning models. Secondly, approach (ii) leaves the seq2seq framework for a more conceptually complex framework, and often uses architectures that are tied to specific data or tasks. In contrast, we believe that with proper built-in inductive biases, a seq2seq model might provide a simple, yet effective solution to the structural compositionality issue.
# PhD Proposal

The goal of this PhD will be to explore inductive biases related to linguistic structures, in an attempt to build small NLP models with compositional skills, i.e. models with built-in knowledge making them able to infer generalization rules from few data points. Research directions will be defined together with the successful applicant (who is encouraged to bring their own ideas!) and may include:

- Learning invariant language representations. A risk of learning from little data or rare phenomena is that a model may rely on spurious correlations and be unable to generalize outside a specific context. Developing representations that are invariant to noise has been proposed as a way of improving generalization (Peyrard et al 2022). We propose to formalize invariants related to syntactic and semantic structures and explore ways to integrate them during the training phase.
- Syntactically constrained decoders. Unlike parsers, seq2seq models are unable to generate structures unseen at train time. We propose to explore the use of structural constraints to guide decoding in seq2seq models.

# Important information:

- Starting date: between September and December 2024 (duration 3 years)
- Place of work: Laboratoire d’Informatique de Grenoble, CNRS, Grenoble, France
- Funding: ANR project "COMPO: Inductive Biases for Compositionality-capable Deep Learning Models of Natural Language" (2024-2028)
- Partners: Université Paris Cité, Université Aix-Marseille, Université Grenoble Alpes
- The PhD will be supervised by Éric Gaussier and Maximin Coavoux, in close collaboration with other partners from the COMPO consortium. The PhD candidate will be part of two teams of the LIG: GETALP and APTIKAL.
- Salary: ~2300€ gross/month
- Profile: Master’s degree in NLP, computer science, experience in NLP and machine learning

To apply, please send CV, cover letter and most recent academic transcripts to eric.gaussier(a)univ-grenoble-alpes.fr and maximin.coavoux(a)univ-grenoble-alpes.fr

References:

- SLOG: A Structural Generalization Benchmark for Semantic Parsing. Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao, Najoung Kim. <https://aclanthology.org/2023.emnlp-main.194/>
- Structural generalization is hard for sequence-to-sequence models. Yuekun Yao, Alexander Koller. <https://aclanthology.org/2022.emnlp-main.337/>
- Compositional Generalization Requires Compositional Parsers. Pia Weißenhorn, Yuekun Yao, Lucia Donatelli, Alexander Koller. <https://arxiv.org/abs/2202.11937>
- Invariant Language Modeling. Maxime Peyrard, Sarvjeet Ghotra, Martin Josifoski, Vidhan Agarwal, Barun Patra, Dean Carignan, Emre Kiciman, Saurabh Tiwary, Robert West. <https://aclanthology.org/2022.emnlp-main.387/>

2nd CALL FOR PAPERS: ICNLSP 2024, 7th International Conference on Natural Language and Speech Processing
by Mourad Abbas 19 Jun '24

We are delighted to invite you to ICNLSP 2024 <https://www.icnlsp.org/2024welcome/>, the 7th edition of the International Conference on Natural Language and Speech Processing, which will be held at the University of Trento from October 19th to 20th, 2024 (*HYBRID*).

*Topics*

- Signal processing, acoustic modeling.
- Speech recognition (Architecture, search methods, lexical modeling, language modeling, language model adaptation, multimodal systems, applications in education and learning, zero-resource speech recognition, etc.).
- Speech Analysis.
- Paralinguistics in Speech and Language (Perception of paralinguistic phenomena, analysis of speaker states and traits, etc.).
- Spoken Dialog Systems and Conversational Analysis
- Speech Translation.
- Speech synthesis.
- Speaker verification and identification.
- Language identification
- Speech coding.
- Speech enhancement
- Speech intelligibility
- Speech Perception
- Speech Production
- Brain studies on speech
- Phonetics, phonology and prosody.
- Speech and hearing disorders.
- Paralinguistics of pathological speech and language.
- Speech technology for disordered speech/hearing.
- Cognition and natural language processing.
- Machine translation.
- Text categorization.
- Summarization.
- Sentiment analysis and opinion mining.
- Computational Social Web.
- Arabic dialects processing.
- Under-resourced languages: tools and corpora.
- Large language models.
- Arabic OCR.
- NLP tools for software requirements and engineering.
- Knowledge fundamentals.
- Knowledge management systems.
- Information extraction.
- Data mining and information retrieval.
- Lexical semantics and knowledge representation.
- Requirements engineering and NLP.
- NLP for Arabic heritage documents.
*Submission*

Papers must be submitted via the link: https://cmt3.research.microsoft.com/ICNLSP2024/

Each submitted paper will be reviewed by three program committee members. The reviewing process is double-blind.

Authors can use the *ACL format*: *LaTeX* <https://www.icnlsp.org/ACL%202023%20Proceedings%20Template.zip> or Word.

Authors may submit either a long or a short paper. Long papers consist of up to 8 pages of content + references. Short papers consist of up to 4 pages of content + references.

*Important dates*

*Submission deadline:* *30 June 2024 11:59 PM (GMT)*
*Notification of acceptance:* 15 September 2024
*Camera-ready paper due:* 25 September 2024
*Conference dates:* 19, 20 October 2024

*Publication*

*1- All accepted papers will be published in the ACL Anthology* <https://aclanthology.org/>.

*2- Selected papers will be published (after extension) in:*

*2-a-* A *SPECIAL ISSUE* <https://www.mdpi.com/journal/make/special_issues/POB4VNE0QP> of Machine Learning and Knowledge Extraction Journal <https://www.mdpi.com/journal/make> (MAKE), indexed in *Web of Science* <https://mjl.clarivate.com/search-results>, *Scopus* <https://www.scopus.com/sources.uri>, etc.
*Special issue title*: *Knowledge Graphs and Large Language Models* <https://www.mdpi.com/journal/make/special_issues/POB4VNE0QP>

*2-b-* Signals and Communication Technology (Springer), indexed in *Scopus* <https://www.scopus.com/> and *zbMATH* <https://zbmath.org/>.


Shared Task on Quality Estimation at WMT'24
by Fred Blain 19 Jun '24

Dear all, we are happy to invite you to participate in the Shared Task on Quality Estimation at WMT'24. The details of the task can be found at: https://www2.statmt.org/wmt24/qe-task.html New this year: * We introduce a new language pair (zero-shot): English-Spanish * Continuing from the previous edition, we will also analyse the robustness of submitted QE systems to a set of different phenomena which will span from hallucinations and biases to localized errors, which can significantly impact real-world applications. * We also introduce a new task, seeking not only to detect but also to correct errors: Quality-aware Automatic Post-Editing! We invite participants to submit systems capable of automatically generating QE predictions for machine-translated text and the corresponding output corrections. 2024 QE Tasks: Task 1 -- Sentence-level quality estimation This task follows the same format as last year but with fresh test sets and a new language pair: English-Spanish. We will test the following language pairs: * English to German (MQM) * English to Spanish (MQM) * English to Hindi (MQM & DA) * English to Gujarati (DA) * English to Telugu (DA) * English to Tamil (DA) More details: https://www2.statmt.org/wmt24/qe-subtask1.html Task 2 -- Fine-grained error span detection Sequence labelling task: predict the error spans in each translation and the associated error severity: Major or Minor. We will test the following language pairs: * English to German (MQM) * English to Spanish (MQM) * English to Hindi (MQM) More details: https://www2.statmt.org/wmt24/qe-subtask2.html Task 3 -- Quality-aware Automatic Post-editing We expect submissions of post edits correcting detected error spans of the original translation. Although the task is focused on quality-informed APE, we also allow participants to submit APE output without QE predictions to understand the impact of their QE system. Submissions w/o QE predictions will also be considered official. 
We will test the following language pairs: * English to Hindi * English to Tamil More details: https://www2.statmt.org/wmt24/qe-subtask3.html Important dates: 1. Test sets will be released on July 15th. 2. Participants can submit their systems by July 23rd on CodaLab. 3. System paper submissions are due by 20th August [aligned with WMT deadlines]. Note: Like last year, we aligned with the General MT and Metrics shared tasks to facilitate cross-submission on the common language pairs: English-German, English-Spanish, and English-Hindi (MQM). We look forward to your submissions; feel free to contact us if you have any further questions! Best wishes, on behalf of the organisers.

