- Corpora - ELRA lists

Free webinar: 22 July 2-3pm UK time
by Brezina, Vaclav 16 Jul '25


Dear all, We would like to invite you to a free webinar, Corpus Linguistics: Skills for the Future, from our Lancaster webinar series. In this webinar, we will focus on two domains that have used corpus methods to develop and improve their practice. Prof Elena Semino will talk about the use of corpus methods in healthcare communication and Dr Dana Gablasova will look at the role played by corpus methods in the development and evaluation of GenAI tools for language learning and teaching. ⏲️ Time: 22 July 2025, 2-3pm UK time 🔗 Link for free registration: https://forms.office.com/e/uppRBrE5AF Best, Vaclav Professor Vaclav Brezina Professor in Corpus Linguistics Co-Director of ESRC Centre for Corpus Approaches to Social Science Lancaster University Lancaster, LA1 4YD Office: County South, room C05 T: +44 (0)1524 510828 @vaclavbrezina <http://www.lancaster.ac.uk/arts-and-social-sciences/about-us/people/vaclav-…>


New book - Automatic Question Generation
by Flor, Michael 15 Jul '25


Dear colleagues, I am happy to announce the availability of the new book, Automatic Question Generation https://link.springer.com/book/10.1007/978-3-031-92072-1 Published by Springer, in the series Synthesis Lectures on Human Language Technologies. Many thanks to Graeme Hirst, the series editor! The book describes a variety of approaches, including generating questions from syntactic analyses, semantic resources, neural architectures, ontologies and knowledge graphs, and large language models. Also covers evaluation and some fundamentals of questions. Hopefully, the book might be useful for NLP/AI researchers, students, educators, test-developers, and anyone interested in this topic. Michael Flor Senior Research Scientist ETS Research Institute Educational Testing Service Princeton, NJ, USA mflor(a)ets.org


July 2025 Newsletter - LDC
by Penn LDC 15 Jul '25


In this newsletter: Fall 2025 LDC data scholarship program New publications: AnnoDIFP Session Audio and Transcripts<https://catalog.ldc.upenn.edu/LDC2025S06> Penn Parsed Corpora of Historical English Second Release<https://catalog.ldc.upenn.edu/LDC2025T09> LoReHLT Uzbek Representative Language Pack<https://catalog.ldc.upenn.edu/LDC2025T08> ________________________________ Fall 2025 LDC data scholarship program Student applications for the Fall 2025 LDC data scholarship program are being accepted now through September 15, 2025. This program provides eligible students with no-cost access to LDC data. Students must complete an application consisting of a data use proposal and letter of support from their advisor. For application requirements and program rules, visit the LDC Data Scholarships page<https://www.ldc.upenn.edu/language-resources/data/data-scholarships>. ________________________________ New publications: AnnoDIFP (Annotated Data for the Investigation of Facets of Personality) Session Audio and Transcripts<https://catalog.ldc.upenn.edu/LDC2025S06> was developed by LDC, the Florida Institute of Technology <https://www.fit.edu/> (FIT), and the University of New Haven<https://www.newhaven.edu/index.php> (UNH) to support algorithm development for predicting personality traits. It contains 438.34 hours of English audio and transcripts from in-person interviews of 366 participants paired with scores from two self-reported personality assessments, HEXACO Personality Inventory (Revised) (HEXACO-PI-R) and Short Dark Triad (SD3). In-person interviews were recorded at LDC, FIT, and UNH. In each session, the participant and interviewer were in separate sound-isolated rooms with communication between them supplied by audio/video hardware. Sessions consisted of the following tasks: rapport building, a YouTube task, a map task, and a business task. Further details on collection methodology and session tasks are contained in the documentation accompanying this release. 
2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee. * Penn Parsed Corpora of Historical English Second Release<https://catalog.ldc.upenn.edu/LDC2025T09> was developed at the University of Pennsylvania and consists of running texts and text samples of British English prose from the earliest Middle English documents (1100 CE) up to the period of the First World War (1914 CE). This second release corrects errors and inconsistencies in Penn Parsed Corpora of Historical English (LDC2020T16<https://catalog.ldc.upenn.edu/LDC2020T16>), further streamlines annotation, simplifies the directory structure, and includes updated documentation. This data set contains three corpora covering traditionally recognized periods of English: * The Penn-Helsinki Parsed Corpus of Middle English, second edition * The Penn-Helsinki Parsed Corpus of Early Modern English * The Penn Parsed Corpus of Modern British English, second edition The texts are in two forms: part-of-speech tagged text and syntactically annotated text. Annotations were manually reviewed for accuracy and consistency. Included in this release are updated annotation guidelines, philological information for each corpus, and the CorpusSearch 2 program, which allows users to search the data for words, word sequences, and syntactic structure. 2025 members can access this corpus through their LDC accounts provided they have submitted a completed copy of the special license agreement. Non-members may license this data for a fee. * LoReHLT Uzbek Representative Language Pack<https://catalog.ldc.upenn.edu/LDC2025T08> was developed by LDC and is comprised of approximately 47 million words of Uzbek monolingual text, 563,000 words of found Uzbek-English parallel text, 100,000 Uzbek words translated from English data, and 6.4 hours of Uzbek broadcast news and amateur web audio recordings. 
Approximately 151,000 words were annotated for named entities and over 28,000 words were annotated for full entity including nominals and pronouns. Noun-phrase chunking was applied to more than 13,000 words. Over 20,890 words were labeled with simple semantic annotation. Topic annotation was applied to the audio recordings. Data was collected from discussion forum, news, reference, social network, broadcast news, web audio recordings, and weblogs. LoReHLT was a companion project of the DARPA LORELEI program. The LORELEI (Low Resource Languages for Emergent Incidents) program was concerned with building human language technology for low resource languages in the context of emergent situations. Representative languages were selected to provide broad typological coverage. 2025 members can access this corpus through their LDC accounts. Non-members may license this data for a fee. Membership Coordinator Linguistic Data Consortium<ldc.upenn.edu> University of Pennsylvania T: +1-215-573-1275 E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu> M: 3600 Market St. Suite 810 Philadelphia, PA 19104


LREC 2026 - First Call for Tutorials
by Sara Goggi 14 Jul '25


*LREC 2026 - FIRST CALL FOR TUTORIALS* *Organized by the ELRA Language Resources Association* *Palma, Mallorca, Spain* *11-16 May 2026* The 15th edition of the Language Resources and Evaluation Conference (LREC 2026) invites proposals for tutorials to be held in conjunction with the conference. We seek proposals in all areas of natural language processing and computation, language resources (LRs) and evaluation, including spoken language, sign language, and multimodal interaction. The tutorials will be held at LREC 2026 in Palma de Mallorca (Spain), on 11, 12, or 16 May 2026. *IMPORTANT DATES* * 17 October 2025: Proposal submission due * 17 November 2025: Notification of acceptance * 11-16 May 2026: LREC 2026 conference *SUBMISSION DETAILS* We invite proposals for three types of tutorials: *Cutting-edge:* tutorials that cover advances in newly emerging areas. The tutorials are expected to give a brief introduction to the topic, but participants are assumed to have some prior knowledge of the topic. The focus of the class will be on discussing the most recent developments in the field, and it will spend a considerable amount of time pointing out open research questions and important novel research directions. *Introductory to computational linguistics (CL)/natural language processing (NLP) topics:* tutorials that provide introductions to topics that are established in the LREC communities. The lecturers provide an overview of the development of the field from the beginning until now. Attendees are not expected to come with prior knowledge. They acquire sufficient understanding of the topic to understand the most recent research in the field. *Introductory to adjacent areas:* tutorials that provide introductions to topics that are established or emerging in areas adjacent to CL/NLP. The lecturers provide an overview of the development of the field from the beginning until now. Attendees are not expected to come with prior knowledge.
They acquire a sufficient understanding of the topic to understand the most recent research in the field and its relevance for the CL/NLP domains. In all cases, the aim of a tutorial is primarily to help understand a scientific problem, its tractability, and its theoretical and practical implications. Presentations of particular technological solutions or systems are welcome, provided that they serve as illustrations of broader scientific considerations. None of the tutorial types are expected to be “self-invited” long talks – the content should be a good balance between research from multiple groups and perspectives, not only of the teachers of the tutorial. Proposals should be prepared according to the style files that will be available from the LREC website (https://lrec2026.info/). Proposals should not exceed 4 pages of content (plus unlimited pages for references), and they should be submitted as PDF documents. Tutorial proposals do not have to be anonymized. They should contain: * A title that helps potential attendees to understand what the tutorial will be about. * An abstract that summarizes the topics, goals, target audience, and type (see above) of the tutorial (this abstract will also be on the LREC website). * A section called “Introduction” that explains the topic and summarizes the starting point and relevance for our community, and in general. * A section called “Target Audience” that explains for whom the tutorial will be developed and what the expected prior knowledge is. Clearly specify what attendees should know and be able to practically do to get the most out of your tutorial. Examples of what to specify include prior mathematical knowledge, knowledge of specific modeling approaches and methods, programming skills, or adjacent areas like computer vision. Also specify the number of expected participants. * A section called “Outline” in which the various topics are explained. 
This can be a list of bullet points or a set of paragraphs explaining the content. Explain what you intend to cover and how long the tutorial will be. * A section called “Diversity Considerations”, discussing each of the three aspects of diversity listed under DIVERSITY AND INCLUSION, or others. * A section called “Reading List”: What are introductory papers or books that potential attendees can read to get a first impression of the tutorial content? What do you expect them to have read before attending? What provides further information beyond the content of the tutorial? * A section called “Presenters” in which each tutorial presenter is briefly introduced in one paragraph, including their research interests, their areas of expertise for the tutorial topic, and their experience in teaching a diverse and international audience. * A section called “Other Information” which should include information on how many people are expected to participate and how you came to this estimate. You can also explain any other aspects that you find important, including special equipment that you would need. * A section called “Ethics Statement” which discusses ethical considerations related to the topics of the tutorial. Tutorials can be half-day (morning 9:00 to 13:00 or afternoon 14:00 to 18:00) or full-day (9:00 to 18:00) and must follow fixed hours for breaks (morning coffee break: 10:30-11:00, lunch break: 13:00-14:00, afternoon coffee break: 16:00-16:30). *EVALUATION CRITERIA* The tutorial proposals will be evaluated according to their originality and impact, the expected interest level of participants, as well as the quality of the organizing team and Program Committee and their contribution to the diversity of the conference. *DIVERSITY AND INCLUSION* We particularly encourage submissions from underrepresented groups in computational linguistics, including researchers from any demographic or geographic minority, with disabilities, or others.
In the evaluation of the proposal, we will take these aspects into account to create a varied and balanced set of tutorials. This includes several aspects of diversity, namely (1) how the topic of the tutorial contributes to improved diversity and increased fairness in the field, (2) if the topic is particularly relevant for a specific underrepresented group of potential participants, and (3) if the presenters are from an underrepresented group. *INSTRUCTOR RESPONSIBILITIES* Accepted tutorial presenters will be notified by the date mentioned above. They must then provide abstracts of their tutorials for inclusion in the conference registration material by the specific deadlines. The abstract needs to be provided in ASCII format. The summary will be submitted in PDF format and can be updated from the version submitted for review. The instructors will make their material available in an appropriate way, for instance, by setting up a website. They will be invited to submit their slides to the ACL Anthology. Finally, at least one tutorial presenter must attend the event in person to organise the tutorial. *CONTACT* * Tutorial Chairs: lrec2026-tutorial-chairs(a)googlegroups.com <mailto:lrec2026-tutorial-chairs@googlegroups.com> * General contact: mailto:info@lrec2026.info <mailto:info@lrec2026.info> * More information on LREC 2026: https://lrec2026.info/ <https://lrec2026.info/>


LREC 2026 - First Call for Workshops
by Sara Goggi 14 Jul '25


*LREC 2026 - FIRST CALL FOR WORKSHOPS* *Organized by the ELRA Language Resources Association* *Palma, Mallorca, Spain* *11-16 May 2026* The Organisers of LREC 2026 invite proposals for workshops to be held in conjunction with the main conference at Palau de Congressos de Palma, Palma de Mallorca (Spain). We solicit proposals in all areas of language resources, language technology, and evaluation of the underlying technologies, broadly conceived to also include related disciplines such as linguistics, language documentation, natural language processing, speech and multimodal processing, computational social science, and the digital humanities. The workshops will be held at LREC 2026 in Palma de Mallorca (Spain) on 11, 12 and 16 May 2026. *IMPORTANT DATES* (All deadlines are 11:59 PM UTC-12:00, “anywhere on Earth”) * 17 October 2025: Proposal submission deadline * 17 November 2025: Notification of acceptance * 11-16 May 2026: LREC 2026 conference *SUBMISSION INFORMATION* Proposals should be submitted as PDF documents using the START system (URL will soon be available on the conference website). Note that submissions should essentially be ready to be turned into a Call for Workshop Papers within one week of notification of acceptance (see Important dates above). The proposals should be at most two pages for the main proposal + at most two additional pages for information about organisers, program committee, and references. Thus, the whole proposal should not be more than FOUR pages long, excluding references. The two pages for the main proposal must include: * A title and a brief description of the workshop topic and content. * Workshops can be half-day (morning 9:00 to 13:00 or afternoon 14:00 to 18:00) or full-day (9:00 to 18:00) and must follow fixed hours for breaks (morning coffee break: 10:30-11:00, lunch break: 13:00-14:00, afternoon coffee break: 16:00-16:30).
* A list of invited speakers, if applicable, with an indication of which ones have already agreed and which are tentative, and sources of funding for the speakers, if needed. * An estimate of the number of attendees. * A description of any shared tasks associated with the workshop, and estimate of the number of participants. Note that any shared task will also need to be reviewed by the workshop committee for ethical concerns. * A description of special requirements and technical needs, where relevant. * If the workshop has been held before, a note specifying where previous iterations of the workshops were held, how many submissions the workshop received, how many papers were accepted (also specify if they were not regular papers, e.g., shared task system description papers, non-archival papers), and how many attendees the workshop attracted. The two pages for information about the workshop, the organisers and the program committee must include: * A very brief advertisement or tagline for the workshop, up to 140 characters, that highlights any key information you wish prospective attendees to know, and which would be suitable to be put onto a web-based survey (see below). * The names, affiliations, and email addresses of the organisers, with one-paragraph statements of their research interests, areas of expertise, and experience in organising workshops and related events. * A list of Program Committee members, with an indication of which members have already agreed. Organisers should do their best to estimate the number of submissions (especially for recurring workshops) in order to (a) ensure a sufficient number of reviewers so that each paper receives 3 reviews, and (b) anticipate that no one is committed to reviewing more than 3 papers. This practice is likely to ensure on-time, and more thorough and thoughtful reviews. 
*EVALUATION CRITERIA* The workshop proposals will be evaluated according to their originality and impact, the expected interest level of participants, as well as the quality of the organising team and Program Committee, and their contribution to the diversity of the conference. *DIVERSITY AND INCLUSION* We particularly encourage submissions from underrepresented groups in language resources and language technology, including researchers from any demographic or geographic minority, with disabilities, or others. In the evaluation of the proposal, we will take these aspects into account to create a varied and balanced set of workshops. Workshop proposals are evaluated on a range of aspects, including diversity, such as (1) how the topic of the workshop contributes to improved diversity and increased fairness in the field, (2) if the topic is particularly relevant for a specific underrepresented group of potential participants, and (3) if the presenters are from an underrepresented group. *WORKSHOP ORGANISER RESPONSIBILITIES* At least one of the accepted organisers must attend the workshop in person. The organisers of the accepted proposals are responsible for publicizing and running the workshop, including reviewing submissions, producing the workshop program and the camera-ready workshop proceedings according to LREC requirements, organising the meeting days, and playing their part to ensure that all participants are aware of LREC’s anti-harassment policy and code of conduct (see https://lrec2026.info/lrec-2026-code-of-conduct/ ). It is crucial that organisers commit to all deadlines. In particular, failure to produce the camera-ready proceedings in the correct format on time will lead to the exclusion of the workshop from the unified proceedings and author indexes.
Workshop organisers cannot accept submissions for publication that will be (or have been) published elsewhere, although they are free to set their own policies on simultaneous submission and review, as well as to accept additional non-archival presentations. *CONTACT* * Workshop Chairs: lrec2026-workshop-chairs(a)googlegroups.com * General contact: mailto:info@lrec2026.info <mailto:info@lrec2026.info> * More information on LREC 2026: https://lrec2026.info/


CfP — Dialogue in the Era of Multimodal Foundation Models @ IWSDS 2026
by Giuseppe Riccardi 14 Jul '25


Dear all, We’re excited to announce IWSDS 2026, the 16th International Workshop on Spoken Dialogue Systems. It will take place on Feb 26 – Mar 1, 2026 in Trento, the gateway to the Dolomites, following the Milano Cortina 2026 Winter Olympics. The theme of this year is "Human-Machine Dialogue in the Era of Multimodal Foundation Models". IWSDS 2026 aims to bring together researchers working on the theoretical foundations, systems and methods, and applications of spoken and multimodal dialogue systems. The Call for Papers is now open for long papers (up to 8 pages + references), as well as short papers, position papers, and demos (up to 4 pages + references). Accepted papers will be included in the proceedings published in the ACL Anthology. Important Dates: Paper Submission Deadline: October 12, 2025 Acceptance Notification: December 10, 2025 Workshop Dates: February 26 – March 1, 2026 📌 Website & CfP: https://sites.google.com/unitn.it/iwsds26/ 🐦 Twitter: https://x.com/iwsdsmeeting 🌿 Bluesky: https://bsky.app/profile/iwsdsmeeting.bsky.social We look forward to welcoming you to Trento in 2026! On behalf of the Organizing Committee, Giuseppe Riccardi IWSDS'26 General Chair University of Trento


[Workshop CFP] - (R2LM) From Rules to Language Models: Comparative Performance Evaluation @ RANLP
by alicia.picazo＠ua.es 14 Jul '25


[Final CFP] - (R2LM) From Rules to Language Models: Comparative Performance Evaluation @ RANLP 2025 (Varna, Bulgaria) - 11, 12 or 13 September 2025 Dear colleagues, We are pleased to announce the FINAL call for papers for the R2LM Workshop - From Rules to Language Models: Comparative Performance Evaluation at RANLP 2025. https://r2lm2025.github.io/R2LM/ Workshop Description Deep learning (DL) and large language models (LLMs) have driven major advances in natural language processing (NLP), enabling impressive performance across many tasks. However, they continue to face key challenges in handling complex linguistic phenomena such as multiword expressions, long-context reasoning, and robustness to adversarial inputs. In parallel, concerns remain about the scalability, interpretability, and domain adaptability of these models, particularly in applications requiring high precision, such as grammar checking, legal analysis, or medical NLP. These limitations have sparked renewed interest in rule-based and knowledge-based approaches, which often offer better explainability and remain competitive, especially in low-resource or high-stakes scenarios. 
Our workshop aims to gather contributions that deal with the following topics: • Role of rule-based and knowledge-based NLP methods in modern applications • Comparative analysis of rule-based, machine-learning, deep-learning and large language models for different NLP tasks • Emerging trends in NLP research beyond deep learning and Large Language Models • Limitations and performance bottlenecks in scalability and accuracy of deep learning models Submission Details • Long papers: up to 8 pages (excluding references) • Short papers: up to 4 pages (excluding references) • Format: ACL-style (LaTeX or MS Word) • Submission portal and template info available on the RANLP 2025 website Important dates Paper Submission Deadline: 15 July 2025 Notification of Acceptance: 10 August 2025 Workshop date: 11, 12 or 13 September 2025 Organising Committee: Alicia Picazo-Izquierdo, University of Alicante, Spain Ernesto Luis Estevanell-Valladares, University of Alicante, Spain Rafael Muñoz Guillena, University of Alicante, Spain Ruslan Mitkov, Lancaster University, UK Raúl García Cerdá, University of Alicante, Spain


First Call for Papers - Knowledge and Natural Language Processing Track @ ACM-SAC
by Patrizio Bellan 14 Jul '25


*Knowledge and Natural Language Processing Track @ ACM-SAC* The aim of the Knowledge and Natural Language Processing (KNLP) track at ACM SAC is to investigate techniques and applications of knowledge engineering and natural language processing, focusing in particular on approaches combining them. This is an extremely interdisciplinary emerging research area, at the core of Artificial Intelligence, combining and complementing the scientific results from Natural Language Processing and Knowledge Representation and Reasoning. Topics of interest Topics of interest include, but are not limited to: - Natural Language Processing - NLP tasks for Knowledge Extraction - NLP for Ontology Population and Learning - Sentiment Analysis and Opinion Mining for Knowledge Applications - Interplay between Language and Ontologies - NLP for Explainable Knowledge - Machine Translation techniques for Multilingual Knowledge - NLP for the Web - Bias detection and mitigation in small/large LM - (Small/Large) LM and Knowledge - Knowledge - Knowledge to improve NLP tasks - Knowledge for Information Retrieval - Knowledge-based Sentiment Analysis and Opinion Mining - Combining Knowledge and Deep Learning for NLP - Knowledge for Text Summarization and Generation - Knowledge for Persuasion - Knowledge-based Machine Translation - Knowledge for the Web - Linked Data for NLP - Knowledge-based NL Explainability - LM-enhanced ontology and knowledge engineering methodologies and tools - LM-based agent for knowledge extraction, reasoning, and management - Ontology evaluation via small/large LMs - (Ontological) knowledge memorization in LMs - Knowledge-based techniques for LMs (Retrieval Augmented Generation based approaches, fact-checking, and bias mitigation) - Question answering over knowledge graphs via small/large LMs - Real-world applications that exploit Knowledge and NLP - Knowledge and NLP Systems for Big Data scenarios - Knowledge and NLP
technology for a diverse, equitable, and inclusive society - Deployment of Knowledge and NLP Systems in specific domains, such as: - Digital Humanities and Social Sciences - eGovernment and public administration - Life sciences, health, and medicine - News and Data Streaming Paper Submission Submissions must not have been published or be concurrently considered for publication elsewhere. Papers should be submitted in PDF using the ACM-SAC proceedings format <https://www.sigapp.org/sac/sac2026/authorkit.php>. Authors' names and affiliations should be entered separately at the submission site and not appear in the submitted papers. Each submission will be reviewed in *a DOUBLE-BLIND *process according to the ACM-SAC Regulations. Student Research Competition (SRC) submissions are welcome (see SAC 2026 SRC page for details <https://www.sigapp.org/sac/sac2026/src_program.php>). Initial Submission Policy - All submissions must initially be submitted as regular papers. There is no separate submission track for poster papers. - Paper selection is based on originality, technical contribution, presentation quality, and relevance to the Knowledge and Natural Language Processing Track. - Based on the outcome of the review process, some submissions—although technically sound—may not be accepted as regular papers due to overall acceptance rate constraints, and could be accepted as posters Minimum Length for Review Consideration - While there is no formal minimum page requirement, submissions of fewer than four (4) full pages that do not demonstrate substantial contributions may be subject to desk rejection without external review. Camera-ready Page Limits - Regular Papers (accepted for publication): - Up to eight (8) pages are included with standard registration. Poster Papers (recommended for acceptance): - Up to two (2) pages are included with standard registration. 
*Important Dates (check SAC website <https://www.sigapp.org/sac/sac2026/#important-dates> for up-to-date dates)* September 26, 2025: Regular Paper & SRC Abstract Submission For further information, please visit the Knowledge and Natural Language Processing Track <https://knlp.fbk.eu/> and ACM-SAC 2026 <https://www.sigapp.org/sac/sac2026/> conference websites or feel free to contact the Track Co-Chairs <knlp(a)fbk.eu>.


Deadline extension: DHASA Conference 2025
by Menno Van Zaanen 12 Jul '25


Deadline extension: DHASA Conference 2025 https://dh2025.digitalhumanities.org.za Due to several requests, we have decided to extend the deadline. NEW DEADLINE: 28 July 2025 Theme: The role of humanities in digital humanities and artificial intelligence The Digital Humanities Association of Southern Africa (DHASA) is pleased to announce its fifth conference, focusing on the theme The role of humanities in digital humanities and artificial intelligence. In a region where the field of Digital Humanities is still relatively underdeveloped, this conference aims to address this gap and foster growth and collaboration in the field. The conference offers an opportunity for researchers interested in showcasing their work in the broad field of Digital Humanities to come together. By doing so, the conference provides a comprehensive overview of the current state of the art in Digital Humanities, particularly within the Southern Africa region. As such, we welcome submissions related to Digital Humanities research conducted by individuals from Southern Africa or research focused on the geographical area of Southern Africa in the broad sense. Furthermore, the conference serves as a platform for information sharing and networking among researchers passionate about Digital Humanities. By bringing together experts working on Digital Humanities in Southern Africa or with a focus on Southern Africa, we aim to promote collaboration and facilitate further research in this dynamic field. In addition to the main conference, affiliated workshops and tutorials will be organised, providing researchers with valuable insights into novel technologies and tools. These supplementary events are designed for researchers interested in specific aspects of Digital Humanities or seeking practical information to enter or advance their knowledge in the field.
The DHASA conference welcomes interdisciplinary contributions from researchers in various domains of Digital Humanities, including, but not limited to, language, literature, visual art, performance and theatre studies, media studies, music, history, sociology, psychology, language technologies, library studies, philosophy, methodologies, software and computation, AI, and more. Our goal is to cultivate an inclusive scientific community of practice within Digital Humanities. Suggested topics include the following: * The role of AI in digital humanities, the role of Digital Humanities in shaping AI, and the broader role of the humanities in both AI and DH projects; * Digital archives and the preservation of marginalised voices; * Intersectionality and the digital humanities: exploring the intersections of race, gender, sexuality, culture, and class in digital research and activism; * Activism and social change through digital media: how digital humanities tools and methodologies can be used to promote inclusion; * Engaging marginalised communities in the creation and use of digital tools, resources, and AI; * Exploring the role of digital humanities in decolonising knowledge and promoting indigenous perspectives; * The ethics of data collection and analysis in digital humanities and AI research; * The role of digital humanities and AI in promoting inclusive and equitable pedagogy; * Digital humanities and inclusion in the context of African and global perspectives and international collaborations; * Critical approaches to digital humanities and inclusion: examining the limitations and possibilities of digital tools and methodologies in promoting inclusion; and * Collaborative digital humanities projects with non-profit organisations, community groups, and cultural institutions; * Development of digital and AI tools for supporting digital humanities; * Novel utilisation of digital and AI tools for performing digital humanities research; * The role of digital humanities 
in the classroom: reimagining literacy and AI fluency; * Digital humanities data and project management; * The role of librarians in the digital humanities project; * Any other digital humanities-related topic that serves the Southern African community. Submission Guidelines The DHASA conference 2025 asks for three types of submissions: * Long papers: Authors may submit long papers with a maximum of 8 content pages and unlimited pages for references and appendices. The final versions of accepted long papers will be granted an additional page (leading to a total of up to 9 content pages) to incorporate reviewers' comments. Long papers accepted for the conference will be presented in 30-minute time slots (which includes 10 minutes for questions). * Short papers: Authors may submit short papers with a maximum of 5 content pages and unlimited pages for references and appendices. The final versions of accepted short papers will be allowed an extra page (leading to a total of up to 6 content pages) to accommodate reviewers' comments. Short papers accepted for the conference will be presented in 15-minute time slots (which includes 5 minutes for questions). * Executive summaries: Authors can submit an executive summary for work in progress, limited to 1 page. Executive summaries accepted for the conference will be presented as posters during a dedicated poster presentation slot. All accepted long and short paper submissions that are presented at the conference will be published in the JDHASA journal, see https://upjournals.up.ac.za/index.php/dhasa. In addition, the executive summaries for the poster presentations will be published in a book of executive summaries before the conference. We particularly encourage student submissions where the first author is a student. All submissions should adhere to the ACL style guide: https://acl-org.github.io/ACLPUB/formatting.html Submissions should be submitted in PDF format. 
Submissions that do not adhere to the prescribed style guide will be rejected. Follow this link to go to the submission platform: https://dh2025.digitalhumanities.org.za/submission/ Authors are encouraged to upload their datasets to the SADiLaR repository: https://repo.sadilar.org/. In case of difficulties uploading the datasets, please reach out to Benito Trollip (benito.trollip(a)nwu.ac.za). Important dates Submission deadline: 28 July 2025 Date of notification: 16 September 2025 Camera-ready copy deadline: 24 October 2025 Conference: 10 November 2025 - 14 November 2025 Conference venue: CSIR ICC, Pretoria, South Africa Co-located events Several co-located events are currently being prepared, including workshops and tutorials. These will be updated on the conference website. Organising Committee Aby Louw, Council for Scientific and Industrial Research Andiswa Bukula, South African Centre for Digital Language Resources Avi Moodley, Council for Scientific and Industrial Research Franco Mak, Council for Scientific and Industrial Research Franziska Pannach, Rijksuniversiteit Groningen Ilana Wilken, Council for Scientific and Industrial Research Johannes Sibeko, Nelson Mandela University Juan Steyn, South African Centre for Digital Language Resources Laurette Marais, Council for Scientific and Industrial Research Marissa Griesel, South African Centre for Digital Language Resources Menno van Zaanen, South African Centre for Digital Language Resources Privolin Naidoo, Council for Scientific and Industrial Research Sthembiso Mkhwanazi, Council for Scientific and Industrial Research -- Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za Professor in Digital Humanities South African Centre for Digital Language Resources https://www.sadilar.org ________________________________ NWU PRIVACY STATEMENT: http://www.nwu.ac.za/it/gov-man/disclaimer.html DISCLAIMER: This e-mail message and attachments thereto are intended solely for the recipient(s) and may contain confidential and 
privileged information. Any unauthorised review, use, disclosure, or distribution is prohibited. If you have received the e-mail by mistake, please contact the sender or reply e-mail and delete the e-mail and its attachments (where appropriate) from your system. ________________________________


Deadline extension: Sixth Workshop on Resources for African Indigenous Languages (RAIL)
by Menno Van Zaanen 12 Jul '25

Deadline extension: Sixth Workshop on Resources for African Indigenous Languages (RAIL)
Co-located with DHASA 2025
https://sadilar.org/rail-2025/

Due to several requests, we have decided to extend the deadline.

NEW DEADLINE: 28 July 2025

RAIL Workshop date: 10 November 2025
DHASA Conference dates: 10-14 November 2025
Venue: CSIR International Convention Centre
The sixth RAIL workshop website: https://sadilar.org/rail-2025/
DHASA website: https://digitalhumanities.org.za/

The sixth Resources for African Indigenous Languages (RAIL) workshop will be co-located with the Digital Humanities Association of Southern Africa (DHASA) 2025 conference at the CSIR International Convention Centre in Pretoria, South Africa, on 10 November 2025.

The RAIL workshop is an interdisciplinary platform for researchers working on resources for African indigenous languages, such as natural language processing (NLP) tools, Human Language Technologies (HLT), data collections, and annotations. This workshop aims to foster a scientific community of practice that focuses on computational linguistic tools and data that are designed for or applied to the indigenous languages of Africa.

Many African languages are under-resourced while only a few are considered to be somewhat better resourced. These languages often share interesting properties such as writing systems, making them different from most high-resourced languages. From a computational perspective, these languages lack enough corpora to undertake high-level development of NLP and HLT tools, which in turn impedes the development of African languages in these areas. During previous workshops, it was noted that the problems and solutions presented were not only applicable to African languages but were also relevant to many other low-resource languages across the world. Because these languages share similar challenges, this workshop provides researchers with opportunities to work collaboratively on issues of language resource development and learn from each other.

The RAIL workshop has several aims. First, the workshop brings together researchers who work on African indigenous languages, forming a community of practice for people working on indigenous languages. Second, the workshop aims to reveal currently unknown or unpublished existing resources (corpora, NLP tools, and applications), resulting in a better overview of the current state of the art, and also allows for discussions on novel, desired resources for future research in this area. Third, it enhances sharing of knowledge on the development of low-resource languages. Finally, it enables discussions on how to improve the quality as well as availability of the resources.

The workshop has "Language resources in the age of large language models" as its theme, but submissions on any topic related to properties of African indigenous languages (including related non-African languages) may be accepted. Suggested topics include (but are not limited to) the following:

* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African indigenous languages
* Revealing unknown or unpublished existing resources for African indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African indigenous language resources

Submission requirements:

We invite papers on original, unpublished work related to the topics of the workshop. Submissions, presenting completed work, may consist of up to eight (8) pages of content plus additional pages of references. The final camera-ready versions of accepted long papers are allowed one additional page of content (up to 9 pages) so that reviewers' feedback can be incorporated. Papers should be formatted according to the DHASA style sheet, which is provided on the Journal of the Digital Humanities Association of Southern Africa website (https://upjournals.up.ac.za/index.php/dhasa/about). Reviewing is double-blind, so make sure to anonymise your submission (e.g., do not provide author names, affiliations, project names, etc.). Limit the number of self-citations (anonymised citations should not be used). The RAIL workshop follows the DHASA submission requirements.

Please submit papers in PDF format (the submission link is available on the website). Accepted papers will be published in proceedings linked to the DHASA conference.

Important dates:

Submission deadline: 28 July 2025
Date of notification: 16 September 2025
Camera-ready copy deadline: 24 October 2025
Workshop: 10 November 2025
DHASA conference: 10 November 2025 - 14 November 2025

Organising Committee

Rooweither Mabuya, South African Centre for Digital Language Resources (SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources (SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources (SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources (SADiLaR), South Africa

--
Prof Menno van Zaanen
menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org

