- Corpora - ELRA lists

PhD thesis offer : Language and vision-based mobility assistance for visually impaired people
by Christophe Lohr 30 Apr '24

*Summary*
• Subject: Language & vision-based mobility assistance for visually impaired people
• Keywords: Assistive Technologies, Visual Impairment, Computer Vision, Image Captioning
• Research Unit: Lab-STICC (UMR CNRS 6285)
• Team: RAMBO - Robot interaction, Ambient system, Machine learning, Behaviour, Optimization
• Location: IMT Atlantique, Brest
• Start: September/October 2024
• Duration: 3 years
• Supervision: Panagiotis Papadakis, Christophe Lohr

*Full subject description:* https://www.imt-atlantique.fr/sites/default/files/recherche/Offres%20de%20t…

*Application*
The candidate must hold (or be about to obtain) a Master's degree in Computer Science, with theoretical and practical skills in AI algorithms and associated deep-learning tools (e.g. PyTorch) and a solid background in Computer Vision. The candidate should be fluent in English (the main working and publishing language); French is an advantage (meetings with end-user representatives).

A detailed application should be sent to thesis-application-rambo(a)imt-atlantique.fr, including a cover letter, an up-to-date CV, transcripts of grades (last two years), and a list of referees.

*Deadline: 17 May 2024*

[CFP] Workshop on Multimodal Semantic Representations
by Lucia Donatelli 30 Apr '24

SECOND CALL FOR PAPERS

*The Second Workshop on Multimodal Semantic Representations (MMSR II)*
Co-located with ECAI 2024 (https://www.ecai2024.eu/)
19-24 October, Santiago de Compostela, Spain (workshop on 19 or 20 October)

*Workshop website*: https://mmsr-workshop.github.io/

*Description*
The demand for more sophisticated natural human-computer and human-robot interactions is rapidly increasing as users become more accustomed to conversation-like interactions with AI and NLP systems. Such interactions require not only the robust recognition and generation of expressions through multiple modalities (language, gesture, vision, action, etc.), but also the encoding of situated meaning.

When communications become multimodal, each modality in operation provides an orthogonal angle through which to probe the computational model of the other modalities, including the behaviors and communicative capabilities afforded by each. Multimodal interactions thus require a unified framework and control language through which systems interpret inputs and behaviors and generate informative outputs. This is vital for intelligent and often embodied systems to understand the situation and context that they inhabit, whether in the real world or in a mixed-reality environment shared with humans.

Furthermore, multimodal large language models appear to offer the possibility for more dynamic and contextually rich interactions across various modalities, including facial expressions, gestures, actions, and language. We invite discussion on how representations and pipelines can potentially integrate such state-of-the-art language models.

We solicit papers on multimodal semantic representation, including but not limited to the following topics:
- Semantic frameworks for individual linguistic co-modalities (e.g. gaze, facial expression);
- Formal representation of situated conversation and embodiment, including knowledge graphs, designed to represent epistemic state;
- Design, annotation, and corpora of multimodal interaction and meaning representation;
- Challenges (including cross-lingual and cross-cultural) in multimodal representation and/or processing;
- Criteria or frameworks for evaluation of multimodal semantics;
- Challenges in aligning co-modalities in formal representation and/or NLP tasks;
- Design and implementation of neurosymbolic or fusion models for multimodal processing (with a representational component);
- Methods for probing knowledge of multimodal (language and vision) models;
- Virtual and situated agents that embody multimodal representations of common ground.

*Submission Information*
Two types of submissions are solicited: long papers and short papers. Long papers should describe original research and must not exceed 8 pages, excluding references. Short papers (typically system or project descriptions, or ongoing research) must not exceed 4 pages, excluding references. Both types will be published in the workshop proceedings. Accepted papers get an extra page in the camera-ready version.

We strongly encourage students to submit to the workshop.

*Important Dates*
May 15, 2024: Submissions due
July 1, 2024: Notification of acceptance decisions
August 2, 2024: Camera-ready papers due

Papers should be formatted using the ECAI style files, available at: https://www.ecai2024.eu/calls/main-track
Papers will be submitted in PDF format via Chairing Tool at the following link: https://chairingtool.com/conferences/2MMSR24/MainTrack

Please do not hesitate to reach out with any questions.

Best regards,
Richard Brutti, Lucia Donatelli, Nikhil Krishnaswamy, Kenneth Lai, & James Pustejovsky
MMSR II organizers
Web page: https://mmsr-workshop.github.io/

[Iber AuTexTification|IberLEF2024] Test set released
by Francisco Manuel Rangel Pardo 30 Apr '24

Event Notification Type: Test set released.
Website: https://sites.google.com/view/iberautextification

*TEST SET RELEASED*
*IberAuTexTification*
*Automated Text Identification on Languages of the Iberian Peninsula*

Dear All,

The test dataset for the IberAuTexTification 2024 shared task has been released. It can be found on the shared task website (https://sites.google.com/view/iberautextification/data), in the Genaios GitHub repository (https://github.com/Genaios/IberAuTexTification), and on Zenodo (https://zenodo.org/records/11034382).

Please remember that registration is on a per-team basis, meaning only one member from each team needs to sign up. Once you have requested the test dataset via one of the links above, you will receive the download permission and a password to decompress the data within 24 hours. Please make sure to write your email address correctly, since we will send passwords and future notifications to that address.

Please reach out to the organizers or join the Slack workspace to connect with the other participants and organizers.

Best regards,
on behalf of the IberAuTexTification shared task organizers

AIRiAL CFP closing today
by Voss, Erik 30 Apr '24

**Apologies for cross-posting**

2nd Annual Artificial Intelligence Research in Applied Linguistics (AIRiAL) Conference at Teachers College, Columbia University

Call for Proposals: *closing today (30 April 2024)*
Theme: AI in Education: Empowering Learners & Preparing Educators
Location: Teachers College, Columbia University, New York City
Dates: September 27-28, 2024
For more information and the submission form: https://tinyurl.com/CFPAIRiAL2024

--
Erik Voss, Ph.D.
Assistant Professor, Applied Linguistics & TESOL program
Language & Technology Specialization
Department of Arts & Humanities
Teachers College, Columbia University
TC Faculty Profile <https://www.tc.columbia.edu/faculty/ev2449/>, LinkedIn Profile <https://www.linkedin.com/in/erik-voss-ph-d-941a3ab9>, Google Scholar <https://scholar.google.com/citations?user=FMnVdjcAAAAJ&hl=en>
ALTESOL Language & Technology Research Group <https://sites.google.com/tc.columbia.edu/al-tesol-language-technology/home>
Editor-in-Chief of NYS TESOL Journal
Associate Editor of Language Assessment Quarterly

*Latest Publications*
TC Interview: How New Artificial Intelligence Tools Will Keep Changing Education <https://youtu.be/Zh1RB7DLRMI?si=vDIvowSnzrWy480P> (7:28 mins.)
Voss, E. et al. (2023). The Use of Assistive Technologies Including Generative AI by Test Takers in Language Assessment: A Debate of Theory and Practice. <https://doi.org/10.1080/15434303.2023.2288256> LAQ Journal
Voss, E. (2024) Duolingo Webinar: Current Applications of Artificial Intelligence in Language Assessment <https://youtu.be/b-mjLmvXLBU?si=nmph76-lizkfzi1J> (1 hour)

2nd Call For Papers: The 1st Annual Meeting of ACL SIGTURK at ACL 2024
by Duygu Ataman 29 Apr '24

SECOND CALL FOR PAPERS
---------------------

The First Annual Meeting of the Special Interest Group on Turkic Languages (SIGTURK)
August 15-16, 2024, Bangkok (Co-located with ACL)

INTRODUCTION

We present the first edition of the SIGTURK Workshop representing the annual meeting of the ACL Special Interest Group on Turkic Languages, which aims to provide a new venue that will promote studies on computational linguistics in Turkic languages. Our main objective is to bring together an interdisciplinary community of researchers working on different aspects of natural language processing (NLP) models and corpora in Turkic languages, providing the recently growing number of researchers working on the topic with a means of communication and an opportunity to present their work and exchange ideas.

TOPICS OF INTEREST

We welcome submissions on, but not limited to, the following topics:
* Computational linguistics: models of all aspects of linguistics in Turkic languages (e.g., semantics, syntax, lexicon, morphology)
* Systems: Case studies on the construction of NLP systems for Turkic languages
* Evaluation: Understanding the applicability of current NLP methods in Turkic languages
* Metrics: New metrics and measures for evaluating NLP systems suitable to Turkic languages
* Learning from sparse data: Novel methods for learning from small or sparse data in Turkic languages
* Resources: Datasets, benchmarks, and software libraries for NLP models in Turkic languages

IMPORTANT DATES

* First call for papers: February 9, 2024 (Friday)
* Second call for papers: March 4, 2024 (Monday)
* Paper submission deadline: May 31, 2024 (Friday)
* Notification of acceptance: June 16, 2024 (Monday)
* Camera-ready submission deadline: July 1, 2024 (Monday)
* Workshop dates: August 15-16 (Thursday - Friday)

Note: All deadlines are 11:59 pm UTC -12h (“anywhere on Earth”).

CONTACT INFORMATION

* Email: workshop(a)sigturk.com
* Submission Portal: https://urldefense.proofpoint.com/v2/url?u=https-3A__openreview.net_group-3…
* Official Website: https://urldefense.proofpoint.com/v2/url?u=https-3A__sigturk.com_workshop&d…

SUBMISSION GUIDELINES

Research papers
We invite all potential participants to submit their novel research contributions in the related fields as long papers following the ACL 2024 long paper format (anonymized, with 8 pages excluding the references, and an additional page for the camera-ready versions of accepted papers). All accepted research papers will be published as part of our workshop proceedings and will be presented either through oral presentations or poster sessions. Our research paper track will accept submissions through our own submission system available at https://urldefense.proofpoint.com/v2/url?u=https-3A__openreview.net_group-3… .

Extended abstracts
Besides long paper submissions, we also invite previously published or ongoing and incomplete research contributions to our non-archival extended abstract track. All extended abstracts can use the same EMNLP template with a 2-page limit, excluding the bibliography. Extended abstracts can be submitted to the workshop submission system using the link: https://urldefense.proofpoint.com/v2/url?u=https-3A__openreview.net_group-3… .

"HACK TOGETHER" PROGRAMMING EVENT

In addition to the workshop itself, the second day will be devoted to a full-day collaborative hybrid programming event, "Hack Together". Our goal is to demonstrate the SIGTURK NLP library; interested parties can contribute to the integration of new NLP methods and models into the SIGTURK pipeline. The SIGTURK infrastructure can be found at https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_sigturk&d=D… . Findings of the event will be combined into a system demonstration paper.

INVITED SPEAKERS

Kemal Oflazer, Carnegie Mellon University

MORE INFORMATION

For further details and updates, please visit our workshop website: https://urldefense.proofpoint.com/v2/url?u=https-3A__sigturk.com_workshop&d…

ORGANIZERS

Duygu Ataman, New York University
Deniz Zeyrek Bozşahin, Middle East Technical University
Mehmet Oguz Derin
Sardana Ivanova, University of Helsinki
Abdullatif Köksal, LMU Munich
Jonne Sälevä, Brandeis University

Job offer: Professor in Digital Humanities (AI Ethics, Bias, and Fairness) and Associate Professor in Digital Humanities or Digital Media and Communication
by George Mikros 29 Apr '24

The College of Humanities & Social Sciences (CHSS) at Hamad Bin Khalifa University (HBKU) in Qatar invites applications for the following positions:

1. Full Professor in Digital Humanities with expertise in AI ethics, bias, and fairness.
2. Associate Professor in Digital Humanities with expertise in Digital Cultural Heritage, or Digital Media and Communication with a particular emphasis on the analysis of mis/disinformation (2 positions).

Successful applicants will work closely with our established Digital Humanities and Societies Program, as well as other programs in the college, and will develop academic and research collaborations with potential national, regional, and international partners. They will teach graduate courses at the MA level, contribute to curriculum development in their area(s) of specialty, supervise MA and Ph.D. students, and maintain an active research agenda.

The ideal candidates will be dynamic, experienced, and internationally recognized scholars with an innovative research agenda, as demonstrated through a strong record of peer-reviewed publications and funding/grant acquisitions.

Candidates for both positions should hold a Ph.D. or terminal degree from an accredited university. For the Full Professor position, applicants should have 10-12 years of relevant full-time university teaching experience in research, higher education, or closely related areas in industry (additional experience may be required to show equivalency), with evidence of increasing professional maturity and productivity. For the Associate Professor position, applicants should have 6-8 years of relevant full-time university teaching experience, with evidence of increasing professional maturity and productivity.

All candidates should demonstrate proficiency in teaching, student supervision, and curriculum development, and have a strong portfolio of scholarly endeavors. They should also show evidence of participation in scholarly and academic affairs, an established regional and international reputation in their discipline, and the ability to work with diverse groups, cultures, and communities.

To apply, please submit your CV, cover letter, teaching philosophy, research statement, and the names of three references via the links at the end of this post. Applications will be accepted on a rolling basis until the positions are filled, with the first review stage starting in May.

Full Professor position: https://www.hbku.edu.qa/en/CHSS-P
Associate Professor positions: https://www.hbku.edu.qa/en/CHSS-APR

Second call for TextDetox CLEF-2024 Participants
by Daryna Dementieva 29 Apr '24

Dear all,

Our shared task on *multilingual text detoxification* (https://pan.webis.de/clef24/pan24-web/text-detoxification.html) is ongoing and reaching *its final phase* 😉

We are releasing the parallel pairs for the dev part: https://huggingface.co/datasets/textdetox/multilingual_paradetox
and new toxic sentences for *the test part*: https://huggingface.co/datasets/textdetox/multilingual_paradetox_test

You can also check out the new baseline, a fine-tuned multilingual text detoxification model: https://huggingface.co/textdetox/mt5-xl-detox-baseline

We are waiting for your submissions here: https://codalab.lisn.upsaclay.fr/competitions/18243 *until May 12th* 🤗

You can submit for *ANY* language! There are 9 of them: English, Spanish, German, Chinese, Arabic, Hindi, Ukrainian, Russian, and Amharic.

All kind regards,
Daryna Dementieva
on behalf of the TextDetox CLEF-2024 organizers

CfP: 2024-2025 Bloomberg Data Science Ph.D. Fellowship
by Bloomberg Research Grant Program (BLOOMBERG/ 731 LEX) 29 Apr '24

Hello,

Bloomberg is happy to announce an exciting funding opportunity for Ph.D. students. The seventh edition of the Bloomberg Data Science Ph.D. Fellowship Program invites Ph.D. students working in broadly construed data science to apply for fellowships.

Our fellowship program, launched in 2018, provides the opportunity for outstanding Ph.D. candidates to be funded for up to three years of their Ph.D. studies to work on their research proposal. The recipients will collaborate with and be supported by our Data Science community throughout this time and will complete 14-week summer internships with Bloomberg for the duration of their fellowships. Previous recipients of the fellowship are presented here: 2022-2023, 2021-2022, 2020-2021, 2019-2020, 2018-2019.

Applications for the 2024-2025 academic year must be submitted by May 30, 2024. Fellowship recipients will be announced by July 15, 2024.

Full details about the fellowship, specific topics of interest for this year, and the application process can be found at: https://www.bloomberg.com/company/values/tech-at-bloomberg/data-science/aca…

We would appreciate it if you could share this opportunity with interested parties. Please direct all questions and future communications to rdml(a)bloomberg.net.

Bloomberg

Funded NLP PhD at Inria Paris and Télécom Paris
by Maria Boritchev 29 Apr '24

Dear all,

We have a PhD opportunity in NLP and computational linguistics on the automatic analysis of the human ability to collaborate in dyadic and group conversations, for educational applications: https://jobs.inria.fr/public/classic/en/offres/2024-07248 . Though the offer description in the link is in French, we strongly encourage non-French speakers to apply as well! The offer is translated into English below.

Prospective candidates are encouraged to get in touch with us as soon as possible.

We look forward to hearing from you,
Maria Boritchev and Chloé Clavel

______________________________________

Automatic analysis of human capacity to collaborate during dyadic and group conversations, for educational applications.

Context and scientific objectives

Work on dialog using NLP and deep learning approaches for Dialog Act prediction or sentiment analysis integrates the conversational aspects by capturing contextual dependencies between utterances using recurrent neural networks (RNN) or convolutional neural networks (CNN) for supervised learning (Bapna et al., 2017). Inter-speaker dynamics have also recently started to be integrated. For example, in (Hazarika et al., 2018), intra-speaker dynamics is modeled using a GRU (Gated Recurrent Unit). Other ways to model a conversation in structures that are more complex than flat sequences of utterances are also investigated by leveraging hierarchical neural architectures (Chapuis et al., 2020) or by using graphs in the neural architectures (Ghosal et al., 2019). The conversational aspects and contextual dependencies between the labels are also modeled using sequential decoders and attention mechanisms for NLP-oriented Dialog Act classification (Colombo et al., 2020).

Regarding neural architectures dedicated to generating an agent’s behavior, a few studies on affective computing attempt to integrate collaborative processes. These studies concern the generation of an agent’s non-verbal behaviors related to social stances (Dermouche & Pelachaud, 2016), where Long Short-Term Memory (LSTM) architectures are used as a black box in order to model inter-speaker dynamics. Other studies that do not rely on neural architectures address the question of selecting the agent’s utterance or best dialog policy (e.g., conversation strategies such as hedging or self-disclosure, or extroverted or introverted linguistic styles) according to the user’s social behaviors (multimodal behavior in (Ritschel et al., 2017) and verbal behavior in (Pecune & Marsella, 2020)). In both studies, a social reward is built for reinforcement learning. A recent work investigates neural architectures (a BERT model named CoBERT) trained on empathetic conversations for response selection, but offers no option for selecting the level or kind of empathy that is most relevant (Zhong et al., 2020).

While these existing neural architectures (convolutional, recurrent and transformer) for tracking a speaker’s state in conversations are extremely promising in that they model inter-speaker dynamics and the sequential structure of the conversation, the phenomena they detect are restricted to sentiment, emotions, or dialogue acts. What is still missing in the module dedicated to tracking the user’s state in modular conversational systems is the consideration of collaborative processes as a joint action of the user and the agent to understand each other, maintain the flow of the interaction, and create a social relationship.

The aforementioned neural approaches are very effective, but they are not very data-efficient. There are many use cases where the amount of available data is not sufficient to use these methods, particularly when it comes to deep learning; this is notably the case in educational contexts, where the data at stake is quite confidential, especially when children are involved, as the data is considered to be personal data and is therefore subject to GDPR (https://gdpr-info.eu/).

Computational linguistics provides us with other approaches to the analysis of conversations, symbolic and logic-based. These approaches rely on small amounts of data and focus on specific phenomena, such as the management of implicit implications/information in dialogues (Breitholtz, 2020) and various contexts (Rebuschi, 2017). Segmented Discourse Representation Theory (SDRT, Asher and Lascarides, 2003) is one of the most widely used frameworks for dialogue analysis, used within both formal and neural approaches to dialogue. Another approach is to hybridize knowledge graphs for modelling social commonsense with large language models (Kim et al., 2023).

The objective of the thesis is to investigate approaches that hybridize neural and symbolic models. The approaches will be dedicated to analysing and controlling the level of collaboration between participants in conversations (e.g., misunderstanding analysis and management) through their verbal expressions. We will focus on educational applications such as classroom dynamics and student engagement analysis, and conversational systems for supporting students with difficulties or for learning social skills, following the ethical guidelines defined in (1).

(1) https://web-archive.oecd.org/2020-07-23/559610-trustworthy-artificial-intel…

(Breitholtz, 2020) Breitholtz, E. (2020). Enthymemes in Dialogue. Brill.
(Asher and Lascarides, 2003) Asher, N. and Lascarides, A. (2003). Logics of conversation. Cambridge University Press.
(Rebuschi, 2017) Rebuschi, M. (2017). Schizophrenic conversations and context shifting. In International and Interdisciplinary Conference on Modeling and Using Context, pages 708–721. Springer.
(Kim et al., 2023) Hyunwoo Kim, Jack Hessel, Liwei Jiang, Peter West, Ximing Lu, Youngjae Yu, Pei Zhou, Ronan Bras, Malihe Alikhani, Gunhee Kim, Maarten Sap, and Yejin Choi. 2023. SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12930–12949, Singapore. Association for Computational Linguistics.

Supervision:
Thesis supervisor: Chloé Clavel, senior researcher, ALMAnaCH team, Inria Paris
Co-supervisor: Maria Boritchev, associate professor, S2a team, Telecom-Paris

Funded NLP PhD at University of Birmingham (UK)
by Venelin Kovachev 29 Apr '24

PhD opportunity in data-centric NLP, with potential topics such as Active Learning, Curriculum Learning, Multi-objective Optimization, Dynamic Adversarial Data Collection, Synthetic Data Generation, Red-teaming for AI, and Interpretability for Natural Language Processing.

The position is open until filled. Prospective candidates are encouraged to get in touch to discuss topics prior to formal application.

More information: http://tiny.cc/mzbwxz

--
Dr. Venelin Kovatchev, PhD
Assistant Professor
University of Birmingham
