March 2025 - Corpora

Extension of Submission Deadline for NTCIR-19 Task Proposals (New Deadline: April 7th)
by CHUNG-CHI CHEN 27 Mar '25

Dear colleagues,

Apologies for sending multiple messages in a short period of time. After the 2nd Call for NTCIR-19 Task Proposals was sent out two days ago, we received several messages inquiring about the possibility of extending the submission deadline. In order to allow more participants to join and contribute, we have decided to extend the submission deadline by one week: *the new deadline is April 7 (AoE).*

We look forward to receiving your proposals. If you have any questions, please feel free to contact us.

Best regards,
NTCIR-19 Program Co-Chairs
Qingyao Ai, Chung-Chi Chen, Shoko Wakamiya

* IMPORTANT DATES:
April 7, 2025 (Extended): Task Proposal Submission Due (Anywhere on Earth)
May 15, 2025: Acceptance Notification of Task Proposals
June 10-13, 2025: NTCIR-18 Conference (Organizers of accepted tasks have a chance to present their proposed tasks)

* SUBMISSION LINK:
https://easychair.org/conferences/?conf=ntcir19proposal

* NTCIR-19 TENTATIVE SCHEDULE:
January 2026: Dataset release*
January-June 2026: Dry run*
March-July 2026: Formal run*
August 1, 2026: Evaluation results return
August 1, 2026: Task overview release (draft)
September 1, 2026: Submission due of participant papers (draft)
November 1, 2026: Camera-ready participant paper due
December 2026: NTCIR-19 Conference at NII, Tokyo, Japan
(* indicates that the schedule can be different for different tasks)

* WHO SHOULD SUBMIT NTCIR-19 TASK PROPOSALS?
We invite new task proposals within the expansive field of information access. Organizing an evaluation task entails pinpointing significant research challenges, strategically addressing them through collaboration with fellow researchers (including co-organizers and participants), developing the requisite evaluation framework to propel advancements in the state of the art, and generating a meaningful impact on both the research community and future developments.

Prospective applicants are urged to underscore the real-world applicability of their proposed tasks by utilizing authentic data, focusing on practical tasks, and solving tangible problems. Additionally, they should confront challenges in evaluating information access technology, such as the extensive number of assessments needed for evaluation, ensuring privacy while using proprietary data, and conducting live tests with actual users.

In the era of large language models (LLMs), these models are anticipated to significantly influence daily human activities. Nonetheless, the content produced by LLMs often exhibits issues, such as hallucinations. NTCIR-19 encourages tasks that focus on the evaluation of the quality of content generated by LLMs, continued from NTCIR-18, as well as information access exploiting LLMs, including generative information retrieval (IR), IR using generative queries, conversational search using generated utterances, evaluation using LLM (relevance judgements or language annotation using LLM), and RAG.

* PROPOSAL TYPES:
We will accept two types of task proposals:
- Proposal of a Core task: This is for fostering research on a particular information access problem by providing researchers with a common ground for evaluation. New test collections and evaluation methods may be developed through the collaboration between task organizers (proposers) and task participants. At NTCIR-18, the core tasks are AEOLLM, FairWeb-2, FinArg-2, Lifelog-6, MedNLP-CHAT, RadNLP, and Transfer-2. Details can be found at http://research.nii.ac.jp/ntcir/NTCIR-18/tasks.html.
- Proposal of a Pilot task: This is recommended for organizers who propose to focus on a novel information access problem, and there are uncertainties either in task design or organization. It may focus on a sub-problem of an information access problem and attract a smaller group of participating teams than core tasks. However, it may grow into a core challenging task in the next round of NTCIR. At NTCIR-18, the pilot tasks are HIDDEN-RAD, SUSHI, and U4. Details can be found at http://research.nii.ac.jp/ntcir/NTCIR-18/tasks.html.

Organizers are expected to run their tasks mainly with their own funding and to make the task as self-sustaining as possible. A part of the fund can be supported by NTCIR, which is called "seed funding." It is usually used for some limited purposes such as hiring relevance assessors. The seed funding allocated to each task varies depending on requirements and the number of accepted tasks. Typical cases would be around 1M JPY for a core task and around 0.5M JPY for a pilot task (note that the amount is subject to change).

Please submit your task proposal as a PDF file via EasyChair by April 7, 2025 (Anywhere on Earth).
https://easychair.org/conferences/?conf=ntcir19proposal

* TASK PROPOSAL FORMAT:
The proposal should not exceed four pages in A4 single-column format. The first three pages should contain the main part and appendix, and the last page should contain only a description of the data to be used in the task. Please describe the data in as much detail as possible so that we can help your data release process after the proposal is accepted. In the past NTCIRs, it took much time to create memorandums for data release, which sometimes slowed down the task organization.

Main part
- Task name and short name
- Task type (core or pilot)
- Abstract
- Motivation
- Methodology
- Expected results

Appendix
- Names and contact information of the organizers
- Prospective participants
- Data to be used and/or constructed
- Budget planning
- Schedule
- Other notes

Data (to be used in your task)
- Details (Please describe the details of the data, which should include the source of the data, methods to collect the data, range of the data, etc.)
- License (Please make sure that you have a license to distribute the data, and details of the license should be provided. If you do not have permission to release the data yet, please describe your plan to get the permission.)
- Distribution (Please describe how you plan to distribute the data to participants. There are mainly three choices: distributed by the data provider, distributed by organizers, and distributed by NII.)
- Legal / Ethical issues (If the data can cause legal or ethical problems, please describe how you propose to address them. e.g., some medical data may need approval from an ethical committee. e.g., some Web data may need filtering for excluding discriminative messages.)

If you want NII to distribute your data to task participants on your behalf, please email ntc-admin(a)nii.ac.jp before your task proposal submission attaching the task proposal.

* REVIEW CRITERIA:
- Importance of the task to the information access community and the society
- Timeliness of the task
- Organizers’ commitment in ensuring a successful task
- Financial sustainability (self-sustainable tasks are encouraged)
- Soundness of the evaluation methodology
- Detailed description about the data to be used
- Language scope

* NTCIR-19 PROGRAM CO-CHAIRS:
Qingyao Ai (Tsinghua University, China)
Chung-Chi Chen (National Institute of Advanced Industrial Science and Technology (AIST), Japan)
Shoko Wakamiya (Nara Institute of Science and Technology (NAIST), Japan)

* NTCIR-19 GENERAL CHAIRS:
Charles Clarke (University of Waterloo, Canada)
Noriko Kando (National Institute of Informatics, Japan)
Makoto P. Kato (University of Tsukuba, Japan)
Yiqun Liu (Tsinghua University, China)

[CfP] SEMANTiCS 2025 - 2nd Call for Workshops and Tutorials
by Beyza Yaman 27 Mar '25

SEMANTiCS 2025 - Last Call for Workshops and Tutorials
21st International Conference on Semantic Systems
Vienna, Austria
September 03-05, 2025

Important Dates for Workshops:
- *Proposals WS Deadline:* March 29, 2025 (11:59 pm, AoE) (previously March 22, 2025)
- *Notification of Acceptance:* April 5, 2025 (11:59 pm, AoE) (previously March 29, 2025)

Important Dates for Tutorials (and other meetings, e.g. seminars, show-cases, etc., without call for papers):
- *Proposals Tutorial Deadline:* June 11, 2025 (11:59 pm, AoE)
- *Notification of Acceptance:* June 18, 2025 (11:59 pm, AoE)

*Submission via Easychair on https://easychair.org/conferences/?conf=semantics2025*

*SEMANTiCS Workshops and Tutorials*
SEMANTiCS 2025 is a major venue for research and industrial innovation and features a workshop and tutorial program addressing the diverse practical interests of its audience. This program is intended to offer a rich diversity of topics to conference attendees and local participants seeking to pick up new skills and stay up-to-date regarding the latest developments in the community. We encourage submissions of proposals on all topics in the general areas of SEMANTiCS 2025 and proposals bridging or introducing new perspectives and/or challenges in these areas. Workshops and tutorials may incorporate panel discussions, lightning talks, meetings, networking or hands-on sessions, hackathons and other practical formats where applicable. Rooms for business or project meetings are available upon request as well.

*Scope and Goals*
Workshops and tutorials at SEMANTiCS 2025 allow your organization or project to advance and promote your topics and gain increased visibility. The workshops and tutorials will be announced on the SEMANTiCS website, and they will be seen by all participants. SEMANTiCS 2025 workshops and tutorials can be incubators for industrial and scientific communities that form and share a particular research and development agenda, and they will provide a forum for presenting contributions and findings to a diverse and knowledgeable community. Furthermore, the event can be used as a dissemination activity in the scope of large research projects or as a closed format for research/commercial project consortia meetings.

*Proceedings*
Workshop papers will be published in the SEMANTiCS side event proceedings through CEUR. Side events proceedings will include posters & demos and contributions from workshops.

*Setup and Requirements*
SEMANTiCS 2025 workshops and tutorials may be either half or full-day long. Workshops and tutorials take place on the days before and/or after the main SEMANTiCS 2025 EU conference (3rd of September 2025). Further details will be communicated in due time. Organizers of workshops and tutorials will be granted three free tickets (only for the workshop & tutorial day) for organization purposes or keynotes. Participants of workshops and tutorials will only be charged a reduced fee to cover the basic costs.

Workshop and tutorial proposals must include the following information:
- outline of the *themes and goals of the event*, including a title and a brief abstract (less than 200 words) intended for the SEMANTiCS 2025 website.
- a statement addressing why the event is important, *why the event is timely*, and how it is relevant to SEMANTiCS 2025 and the field of Semantic Web. For the tutorials, why the presenters are qualified for a high-quality introduction to the topic.
- *related workshops and conferences*, i.e., specifying if this is a continuation of a workshop series or a new workshop. Please provide information about past versions (if any) and other related workshops (including URLs and submission/acceptance counts, if available).
- a statement addressing the *quality assurance criteria* that will be used by the event organizers to select the papers for the workshops and the presenters for the tutorials (e.g., peer review or review/evaluation by event organizers). If a peer review process is chosen as a quality assurance criterion for the workshops, the organizers will be responsible for their own reviewing process. Workshop organizers will also be responsible for their own publicity (e.g., website, timelines and call for papers) and proceedings production.
- *structure of the event* and plans for generating and stimulating discussion; how the interaction will be organized in case of a hybrid event.
- expected *number of event participants* and (in case of previously held events) number of registered attendees and website for previous editions of the event.
- a *description* of the intended audience and the expected learning *outcomes.*
- desired *prerequisite* knowledge of the audience.
- proposed *duration of the event* (i.e., half or full day), different sessions if applicable (final time slot will be assigned in accordance with the SEMANTiCS program).
- any *equipment*, room capacity, or other logistic constraints.
- full *contact information* of all organizers of the event and main contact person; a brief description of each *organizer's background*, including relevant past experience in organizing events.

Workshop and tutorial proposals must be submitted via Easychair: https://easychair.org/conferences/?conf=semantics2025 (max 4 pages)

*Important Dates*
Important Dates for Workshops:
- *Proposals WS Deadline:* March 29, 2025 (11:59 pm, AoE) (previously March 22, 2025)
- *Notification of Acceptance:* April 5, 2025 (11:59 pm, AoE) (previously March 29, 2025)
- *Workshop website is online:* April 15th, 2025

*Suggested* dates for Workshop organizers (with Call for Papers):
- *Submission WS papers Deadline:* June 14, 2025 (11:59 pm, AoE)
- *Notification of Acceptance:* July 05, 2025 (11:59 pm, AoE)

Important Dates for Tutorials (and other meetings, e.g. seminars, show-cases, etc., without call for papers):
- *Proposals Tutorial Deadline:* June 11, 2025 (11:59 pm, AoE)
- *Notification of Acceptance:* June 18, 2025 (11:59 pm, AoE)

*Review and Evaluation Criteria*
Workshop and tutorial proposals will be reviewed by the SEMANTiCS 2025 Workshop Chairs, as well as by the SEMANTiCS 2025 organizing committee, according to the following criteria:
- The potential to advance the state of Semantic Web research and practice
- The quality assurance criteria proposed by the organizers to select high-quality papers for workshops and presenters for tutorials
- The organizers' experience and ability to lead a successful event
- Timeliness and expected interest in the event topics
- The balance and synergy between all SEMANTiCS 2025 events

*Topics of interest include (but are not limited to):*
- Web Semantics & Linked (Open) Data
- Enterprise Knowledge Graphs, Graph Data Management
- Machine Learning Techniques for/using Knowledge Graphs (e.g. reinforcement learning, deep learning, data mining and knowledge discovery)
- Interplay between Large Language Models, generative AI and Knowledge Graphs (e.g., Retrieval Augmented Generation)
- Knowledge Management (e.g. acquisition, capture, extraction, authoring, integration, publication)
- Terminology, Thesaurus & Ontology Management, Ontology engineering
- Reasoning, Rules, and Policies
- Natural Language Processing for/using Knowledge Graphs (e.g. entity linking and resolution using target knowledge such as Wikidata and DBpedia, foundation models)
- Crowdsourcing for/using Knowledge Graphs
- Data Quality Management and Assurance
- Mathematical Foundation of Knowledge-aware AI
- Multimodal Knowledge Graphs
- Semantics in Data Science
- Semantics in Blockchain environments
- Trust, Data Privacy, and Security with Semantic Technologies
- IoT, Stream Processing, dealing with temporal data
- Conversational AI and Dialogue Systems
- Provenance and Data Change Tracking
- Semantic Interoperability (via mapping, crosswalks, standards, etc.)
- Linked Data storage, triple stores, graph databases
- Robust and scalable management, querying and analysis of semantics and data
- User interfaces for the Semantic Web & its management
- Explainable and Interoperable AI
- Decentralised and Federated Knowledge Graphs (e.g., Federated querying, link traversal)
- Application of Semantically-Enriched and AI-based Approaches, such as, but not limited to:
  - Knowledge Graphs in Bioinformatics, Medical AI and Preventive Healthcare
  - Clinical Use Case of semantic-enabled AI-based Approaches
  - AI for Environmental Challenges
  - Semantics in Scholarly Communication and Scientific Knowledge Graphs
  - AI and LOD within GLAM (galleries, libraries, archives, and museums) institutions
  - Knowledge Graphs & hybrid AI for predictive maintenance and Industry 4.0/5.0
  - Digital Humanities and Cultural Heritage
  - LegalTech, AI Safety, EU AI Act
  - Economics of Data, Data Services, and Data Ecosystems

We especially invite contributions that illustrate the applicability of the topics mentioned above for industrial purposes and/or illustrate the business relevance of their contribution for specific industries. Workshop proposals on *emerging themes* and *open challenges* for the topics listed above are encouraged.

In case you have additional questions concerning the submission process, please do not hesitate to contact us via Easychair. We are looking forward to your contribution!

*Workshop & Tutorial Chairs:*
- Daniel Garijo, Universidad Politécnica de Madrid, Spain (email: daniel.garijo(a)upm.es)
- David Chaves-Fraga, Universidade de Santiago de Compostela, Spain (email: david.chaves(a)usc.es)

Kind Regards,
Beyza Yaman
On behalf of the organising committee.

[CFP] ACL 2025 - Second Call for System Demonstrations
by ACL Announcements 26 Mar '25

ACL 2025 - Call for System Demonstrations

The ACL 2025 System Demonstration Program Committee invites proposals for the Demonstrations Program. Demonstrations may range from early research prototypes to mature production-ready systems. Publicly available open-source or open-access systems are of special interest. We additionally strongly encourage demonstrations of industrial systems that are technologically innovative given the current state of the art of theory and applied research in natural language processing.

Areas of interest include all topics related to theoretical and applied natural language processing, such as (but not limited to) the topics listed on the main conference website.

Submitted systems may be of the following types:
- Natural language processing systems or system components
- Application systems using language technology components
- Software tools for natural language processing research
- Software for demonstration or evaluation
- Software supporting learning or education
- Tools for data visualization and annotation
- Tools for model inspection
- Development tools

Papers describing accepted demonstrations will be published in a companion volume of the ACL 2025 conference proceedings. We expect at least one of the authors to present a live demo during a demo session at ACL 2025 in Vienna, with an accompanying poster.

Please note: Commercial sales and marketing activities are not appropriate in the Demonstrations Program and should be arranged as part of the Exhibit Program.

Check the full Call at: https://2025.aclweb.org/calls/system_demonstration/
Link to submission system: https://openreview.net/group?id=aclweb.org/ACL/2025/Demo

2nd CFP: Natural Logic Meets Machine Learning (NALOMA)
by Abzianidze, L. (Lasha) 26 Mar '25

[Apologies for cross-posting]

The 5th iteration of the NALOMA (Natural Logic Meets Machine Learning) workshop invites submissions on any (theoretical or computational) aspect of hybrid methods concerning Natural Language Understanding and Reasoning (NLU&R). The topics include but are not limited to:
* Hybrid NLU&R systems that integrate logic-based/symbolic methods with neural networks
* Explainable NLU&R (with structured explanations)
* Opening the black-box of deep learning in NLU&R
* Downstream applications of hybrid NLU&R systems
* Probabilistic semantics for NLU&R
* Comparison and contrast between symbolic and deep learning work on NLU&R
* Creation, criticism, refinement, and augmentation of NLU&R datasets
* (Dis)Alignment of humans and machines on NLU&R tasks
* Addressing inherent human disagreements in NLU&R tasks
* Generalization of NLU&R systems
* Fine-grained evaluation of NLU&R systems

NALOMA accepts archival papers (to appear in the ACL anthology proceedings) and (non-archival) extended abstracts. The workshop is co-located with ESSLLI (https://2025.esslli.eu), 4-8 August 2025, Bochum (Germany). The submission deadline is 25 April 2025. Visit https://naloma.github.io for more details.

- The NALOMA chairs, Lasha Abzianidze and Valeria de Paiva

Lasha Abzianidze
Assistant professor
Institute for Language Sciences, Utrecht University

[CfP] IEEE FedCSIS 2025: "Digital Humanities, Computational Social Sciences and Economics Research (AI-HuSo)"
by Jens Dörpinghaus 26 Mar '25

Dear all,

I would like to inform you about a call for papers for a thematic track at FedCSIS 2025 (IEEE #61123) called "AI in Digital Humanities, Computational Social Sciences and Economics Research (AI-HuSo)". FedCSIS 2025 will be held in Kraków, Poland, 14-17 September 2025. See https://2025.fedcsis.org/thematic/ai-huso for details.

This thematic session is dedicated to the computational study of Social Sciences, Economics and Humanities, including all subjects like, for example, education, labour market, history, religious studies, theology, cultural heritage, and informative predictions for decision-making and behavioural-science perspectives. While digital methods, intelligence systems, and AI have been emerging topics in these fields for several decades, this thematic session is not only limited to discoveries in these domains, but also dedicated to the reflections of these methods and results within the field of computer science. Thus, we are in particular interested in interdisciplinary exchange and dissemination with a clear focus on computational and AI methods for intelligence systems.

Since there is a clear methodological overlap between these three domains and often similar algorithms and AI approaches are considered, we see this thematic session as a place for interdisciplinary learning, discussing a joint toolbox as a support for scholars from these fields with human and context-aware agents. The aim of this thematic session is thus to bridge the gap between scientific domains, foster interdisciplinary exchange and discuss how research questions from other domains challenge current computer science. In particular, we are interested in communications between researchers from different fields of computer science, social sciences, economics, humanities, and practitioners from different fields.

Topics
======
The list of topics includes, but is not limited to:
- AI and computational approaches for the interdisciplinary work of the social sciences, economics, and humanities: report on theoretical, methodological, experimental, and applied research.
- AI and computational approaches for linking data from different digital resources, including online social networks, web and data mining, Knowledge Graphs, Ontologies.
- AI and computational methods for text mining and textual analysis, for example texts within social sciences, digital literary studies, computational stylistics and stylometry.
- Text encoding, computational linguistics, annotation guidelines, OCR for humanities, economics, and social sciences.
- Network analysis, including social and historical network analysis.
- Ethical and philosophical considerations of AI in society, education and humanities research.

In general, the applications of interest are included in the list below, but are not limited to:
- Labour market research and qualification, including behavioral-science perspectives.
- Education: Digital methods and systems, e-learning, adult education, etc.
- Contributions to the application of technology to culture, history, and societal issues: For example, computational text analysis, analytics and visualization, databases, etc.
- In particular, we welcome submissions which focus on a critical reflection of digital methods in the humanities, economics and social sciences within computer science.
- Linking of digital resources, a discussion of data sets, their quality and reliability, combining quantitative and qualitative data, anonymization and data protection.

Contact: ai-huso(a)fedcsis.org

Submission rules
================
- Authors should submit their papers as Postscript, PDF or MSWord files.
- The total length of a paper should not exceed 12 pages IEEE style (including tables, figures and references). More pages can be added, for an additional fee. IEEE style templates are available here.
- Papers will be refereed and accepted on the basis of their scientific merit and relevance to the Topical Area.
- Preprints containing accepted papers will be published online.
- Only papers presented at the conference will be published in Conference Proceedings and submitted for inclusion in the IEEE Xplore® database.
- Conference proceedings will be published in a volume with ISBN, ISSN and DOI numbers and posted at the conference WWW site.
- Conference proceedings will be submitted for indexation according to information here.
- Organizers reserve the right to move accepted papers between FedCSIS Sessions.

Extended versions of selected papers presented at the conference will be published in a volume entitled "Advances in Computational Social Sciences: AI, Computational Methods and Applications for the Study of Society" from Springer.

ACL 2025 - Second Call for System Demonstrations
by Horacio Saggion 26 Mar '25

ACL 2025 - Call for System Demonstrations

The ACL 2025 System Demonstration Program Committee invites proposals for the Demonstrations Program. Demonstrations may range from early research prototypes to mature production-ready systems. Publicly available open-source or open-access systems are of special interest. We additionally strongly encourage demonstrations of industrial systems that are technologically innovative given the current state of the art of theory and applied research in natural language processing.

Areas of interest include all topics related to theoretical and applied natural language processing, such as (but not limited to) the topics listed on the main conference website.

Submitted systems may be of the following types:
- Natural language processing systems or system components
- Application systems using language technology components
- Software tools for natural language processing research
- Software for demonstration or evaluation
- Software supporting learning or education
- Tools for data visualization and annotation
- Tools for model inspection
- Development tools

Papers describing accepted demonstrations will be published in a companion volume of the ACL 2025 conference proceedings. We expect at least one of the authors to present a live demo during a demo session at ACL 2025 in Vienna, with an accompanying poster.

Please note: Commercial sales and marketing activities are not appropriate in the Demonstrations Program and should be arranged as part of the Exhibit Program.

Check the full Call at: https://2025.aclweb.org/calls/system_demonstration/
Link to submission system: https://openreview.net/group?id=aclweb.org/ACL/2025/Demo

--
Horacio Saggion
Full Professor / Chair in Computer Science and Artificial Intelligence
Head of the Natural Language Processing Group - TALN
Project Coordinator iDEM Project (HE)
Co-PI of the AI-BOOST project (HE)
Co-PI of the IDEAL project (HE)
Universitat Pompeu Fabra
https://twitter.com/h_saggion
https://www.linkedin.com/in/horacio-saggion-1749b916

ELRA Catalogue of Language Resources - Update
by info＠elda.org 26 Mar '25

[Apologies for multiple postings]

We are happy to announce that 1 new phonetic database and 1 new speech corpus are available in our catalogue.

*Comprehensive Arabic Phonetic Database* <https://catalog.elra.info/en-us/repository/browse/ELRA-S0493/>
ISLRN: 511-751-240-544-8 <https://islrn.org/resources/511-751-240-544-8/>
The Comprehensive Arabic Phonetic Database is a robust and detailed linguistic resource offering both phonemic and phonetic transcriptions, precisely reflecting how Modern Standard Arabic words are realized in actual speech. It is a highly comprehensive and accurate Arabic phonetic/phonemic database, covering over 329,000 entries, including over 61,000 general vocabulary entries, 101,000 Arab personal names, 143,000 foreign personal names in Arabic and 21,000 worldwide place names both Arab and non-Arab. Each entry consists of canonical forms both vocalized and unvocalized (as in natural language) accompanied by phonetic transcriptions in IPA and X-SAMPA and the user-friendly CARS phonemic transcription system. Additionally, unique features include explicit indication of vowel neutralization, accurate word stress, gender and number codes (singular or plural), and POS (part-of-speech) codes. The database is provided in a flat TSV text file.
See also the *DiaLEX* <https://catalog.elra.info/en-us/repository/search/?q=dialex> and *ArabLEX* <https://catalog.elra.info/en-us/repository/search/?q=arablex> collections for Arabic from the same provider…

*EthioSpeech* <https://catalog.elra.info/en-us/repository/browse/ELRA-S0494/>
ISLRN: 886-456-351-764-8 <https://islrn.org/resources/886-456-351-764-8/>
EthioSpeech Corpora comprises over 391 hours of recorded read speech in six different Ethiopian languages by ca. 200 speakers per language: Amharic (68 hours), Tigrigna (62 hours), Oromo (70 hours), Somali (56 hours), Afar (68 hours), and Sidama (68 hours). The dominating domain is media (mainly newspapers), but for some of the languages texts from different domains were used, including spiritual contents. The recordings were made on mobile devices using the LIG-Aikuma speech recording tool installed on the devices. The gender balance of readers is nearly equal for Amharic, Tigrigna and Oromo, whereas the readers are mainly male for the other 3 languages. Readers are between 18 and 40 years old.

For more information on the catalogue or if you would like to enquire about having your resources distributed by ELRA, please *contact us* <mailto:contact@elda.org>.
_________________________________________
Visit the *ELRA Catalogue of Language Resources* <http://catalog.elra.info>
*Archives* <https://www.elra.info/catalogues/language-resources-announcements/> of ELRA Language Resources Catalogue Updates

Call for Papers: Identity-Aware AI 2025
by Soda Marem Lo 26 Mar '25

Ethical and Technical Challenges for Identity-Aware AI Workshop at ECAI 2025, Bologna, Italy, October 25-30.
https://ecai2025.org/workshops/

Workshop theme: What makes each of us unique, and which ethical and technical challenges does this imply?

Overview

What makes us unique? Language (and thus the automatic processing of it) is about people and what they mean. However, current practice relies on the assumptions that the involved humans are all the same, and that if enough data (and compute power) is present, the resulting generalizations will be robust enough and represent the majority. This approach often harms marginalized communities and ignores the notion of identity in models and systems.

Our interdisciplinary workshop aims to raise the question of “what makes each of us unique?” to the AI community. We seek to gather researchers from diverse fields to understand how the identities of all stakeholders (e.g., the individuals projecting their views in texts, the individuals perceiving the texts, the individuals mentioned and those not mentioned in the texts) should be considered in future research in AI.

Workshop Goals

- The development of a shared and interdisciplinary understanding of identities and how identity is treated in AI.
- The development of new methods that push the effective, fair, and inclusive treatment of individuals in AI to the next level.

Topics of Interest

We invite submissions on the following topics:

- *Approaches to model subjective phenomena:* Personalization and perspectivist methods that leverage disaggregated labeled data, encoding annotator metadata on their beliefs, moral values, sociodemographic features, or personal narratives. ML methods to address the challenges of “learning from disagreements” both from the development of new models and the collection of data to train such models.
- *Methods for detecting and controlling bias in models and data:* Techniques to audit fairness, enforce fairness constraints, and learn fair representation from data, in order to enhance the fairness of models while maintaining their predictive reliability. Ethical challenges for LLMs in identity-aware dialog and tasks: diversity, stereotypes, harms.
- *The role of sociodemographics in LLMs:* Such as which characteristics (and disagreements) they embody and how to measure their capacity for representing and reasoning about diverse types of identities.
- *Challenges for applying AI methods to model socio-political phenomena:* Including polarization, impact of media consumption on public opinion formation, agenda setting, deliberation support, and how integrating identity into AI methods can influence the accuracy for these tasks.
- *NLP work at the intersection with social psychology:* The methodological foundation for quantitative investigation of identity-related topics. The reflection on best practices to reliably measure complex constructs such as morals and values. Detection and analysis of personal narratives across cultures.
- *Accountability of AI in the eye of the general public:* The role of LLMs, and the responsibilities of AI and NLP developers for ethical use of identities.
- *NLP work at the intersection of survey science:* The use of LLMs to model and simulate individuals and subpopulations; the role of LLMs in personalizing information elicitation; and methodological approaches to address data contamination and response validation when LLMs are used by either researchers or respondents.

Submission Types

We welcome the following types of submissions:

- Long papers: Up to 8 pages (excluding references)
- Short papers: Up to 4 pages (excluding references)
- Non-archival submissions, student project presentations, mixed-media submissions: No page limit
- For non-archival submissions, we welcome creative formats including:
  - Art, poetry, music
  - Blog posts
  - Jupyter notebooks
  - Teaching materials
  - TikToks and videos
  - Findings papers
  - Late-breaking papers
  - Extended abstracts
- For creative format submissions, please submit a PDF containing:
  - A summary or abstract of your work
  - A link to your work (if hosted externally)
  - Any additional context or documentation

Submission Guidelines

- All submissions will be double-blind reviewed
- Submissions should follow ECAI formatting guidelines <https://www.ecai2024.eu/calls/main-track> with the LaTeX template here <https://ecai2024.eu/download/ecai-template.zip>
- Submit your paper through EasyChair <https://easychair.org/conferences/?conf=identityawareai2025>
- Accepted papers will be published in the workshop proceedings through CEUR

Workshop Format

The workshop will be a half-day event featuring:

- Keynote speeches from leading experts in the field
- Paper presentations (oral and lightning talks)
- Participatory design activity to develop a shared interdisciplinary vocabulary, identify current gaps in datasets for studying identity, and design a vision for collecting new datasets
- Special student project session

We are committed to ensuring that our workshop is accessible to all. The workshop will be held in a hybrid format, allowing both in-person and virtual participation.

Important Dates

- Submissions: 22 August
- Notifications: 26 September
- Camera-ready: 3 October
- Workshop: 25 October

Diversity & Inclusion

We actively encourage submissions from underrepresented communities and countries. The workshop organizers will provide mentorship and thorough feedback, especially to first-time authors and reviewers.

Organizers

- Pranav A (University of Hamburg)
- Valerio Basile (University of Turin)
- Neele Falk (University of Stuttgart)
- David Jurgens (University of Michigan)
- Gabriella Lapesa (GESIS, Leibniz Institute for the Social Sciences & Heinrich-Heine University of Düsseldorf)
- Anne Lauscher (University of Hamburg)
- Soda Marem Lo (University of Turin)

Contact

For queries, please contact: identity-aware-ai(a)googlegroups.com

Join us at Identity-aware AI 2025 to contribute to this important conversation!

Open advanced research group meeting - today 12pm UK time
by Brezina, Vaclav 26 Mar '25

Dear all,

If you are interested in an open discussion about different aspects of corpus linguistics, please join us today at 12pm UK time. The topic is "Repetition and replication".

Free registration: https://forms.office.com/e/YT5md2fjka

We will also discuss topics that will appear in the research group meetings in the future.

Vaclav

Professor Vaclav Brezina
Professor in Corpus Linguistics
Department of Linguistics and English Language
ESRC Centre for Corpus Approaches to Social Science
Faculty of Arts and Social Sciences, Lancaster University
Lancaster, LA1 4YD
Office: County South, room C05
T: +44 (0)1524 510828
@vaclavbrezina
<http://www.lancaster.ac.uk/arts-and-social-sciences/about-us/people/vaclav-…>

eLex 2025 - Call for papers - reminder and extended deadline
by Iztok Kosem 26 Mar '25

(Apologies for cross-posting)

Dear colleagues,

This is a reminder that eLex 2025, the ninth biennial conference on electronic lexicography in the 21st century, will be held in Bled, Slovenia, 18–20 November 2025.

We have decided to extend the deadline for abstracts until 7 April 2025. Abstracts should be submitted via the EasyChair website: https://easychair.org/my/conference?conf=elex2025.

Confirmed keynote speakers:
Carole Tiberius / Jesse de Does (Dutch Language Institute)
Marko Robnik-Šikonja (University of Ljubljana)
Michal Měchura (Lexical Computing and Dublin City University)

Two workshops have also been confirmed: »Globalex Workshop on Lexicography and Neology« and »CLASSLA-express 2.0 workshop«.

More information on keynote talks, workshops, the call for papers, etc. can be found on the conference website (https://elex.link/elex2025/).

Iztok Kosem
Head of the organising committee
