
Third call for papers DHASA Conference 2023
by Menno Van Zaanen 27 Jul '23

Third call for papers DHASA Conference 2023
https://dh2023.digitalhumanities.org.za/
Theme: "Digital Humanities for Inclusion"

The Digital Humanities Association of Southern Africa (DHASA) is pleased to announce its fourth conference, focusing on the theme "Digital Humanities for Inclusion." In a region where the field of Digital Humanities is still relatively underdeveloped, this conference aims to address this gap and foster growth and collaboration in the field.

The conference offers an opportunity for researchers interested in showcasing their work in the broad field of Digital Humanities to come together. By doing so, the conference provides a comprehensive overview of the current state of the art in Digital Humanities, particularly within the Southern Africa region. As such, we welcome submissions related to Digital Humanities research conducted by individuals from Southern Africa or research focused on the geographical area of Southern Africa.

Furthermore, the conference serves as a platform for information sharing and networking among researchers passionate about Digital Humanities. By bringing together experts working on Digital Humanities in Southern Africa or with a focus on Southern Africa, we aim to promote collaboration and facilitate further research in this dynamic field.

In addition to the main conference, affiliated workshops and tutorials will be organized, providing researchers with valuable insights into novel technologies and tools. These supplementary events are designed for researchers interested in specific aspects of Digital Humanities or seeking practical information to enter or advance their knowledge in the field.

The DHASA conference welcomes interdisciplinary contributions from researchers in various domains of Digital Humanities, including, but not limited to, language, literature, visual art, performance and theatre studies, media studies, music, history, sociology, psychology, language technologies, library studies, philosophy, methodologies, software and computation, and more. Our goal is to cultivate an inclusive scientific community of practice within Digital Humanities.

Suggested topics include the following:
* Digital archives and the preservation of marginalized voices;
* Intersectionality and the digital humanities: exploring the intersections of race, gender, sexuality, and class in digital research and activism;
* Activism and social change through digital media: how digital humanities tools and methodologies can be used to promote inclusion;
* Engaging marginalized communities in the creation and use of digital tools and resources;
* Exploring the role of digital humanities in decolonizing knowledge and promoting indigenous perspectives;
* The ethics of data collection and analysis in digital humanities research;
* The role of digital humanities in promoting inclusive and equitable pedagogy;
* Digital humanities and inclusion in the context of global perspectives and international collaborations;
* Critical approaches to digital humanities and inclusion: examining the limitations and possibilities of digital tools and methodologies in promoting inclusion;
* Collaborative digital humanities projects with non-profit organizations, community groups, and cultural institutions; and
* Any other digital humanities-related topic that serves the Southern African community.

Submission Guidelines

The DHASA conference 2023 asks for three types of submissions:
* Long papers: Authors may submit long papers consisting of a maximum of 8 content pages and unlimited pages for references and appendix. The final versions of accepted long papers will be granted an additional page (up to 9 pages) to incorporate reviewers' comments.
* Short papers: Authors may submit short papers with a maximum of 5 content pages and unlimited pages for references and appendix. The final versions of accepted short papers will be allowed an extra page (up to 6 pages) to accommodate reviewers' comments. Short papers accepted for the conference will be presented as posters.
* Abstracts: Authors can submit abstracts of 250-300 words.

Note that you are required to submit an abstract before the abstract submission deadline. This holds for *all* submissions. The actual submission must then be submitted before the submission deadline. More information on the submission process can be found on the submission page: https://dh2023.digitalhumanities.org.za/submission/

We particularly encourage submissions where the first author is a student.

All accepted long and short paper submissions that are presented at the conference will be published in the Journal of Digital Humanities Association of Southern Africa, see https://upjournals.up.ac.za/index.php/dhasa. In addition, the abstracts of the full papers and the lightning talks will be published in a book of abstracts before the conference.

Important dates

Abstract submission deadline: 8 August 2023
Full paper submission deadline: 15 August 2023
Date of notification: 30 September 2023
Camera-ready copy deadline: 6 November 2023
Conference: 27 November 2023 - 1 December 2023
Conference format: Face-to-face
Conference venue: Nelson Mandela University, Eastern Cape, South Africa
NOTE: Non-presenting delegates have the option to attend online.

Co-located events

Several co-located events are currently being prepared. These will be updated on the conference website.

Organizing Committee

* Johannes Sibeko, Nelson Mandela University
* Aby Louw, Council for Scientific and Industrial Research
* Alan Murdoch, Nelson Mandela University
* Amanda du Preez, University of Pretoria
* Andiswa Bukula, South African Centre for Digital Language Resources
* Andiswa Mvanyashe, Nelson Mandela University
* Avashna Govender, Council for Scientific and Industrial Research
* Gabby Dlamini, Nelson Mandela University
* Ilana Wilken, Council for Scientific and Industrial Research
* Jonathan van der Walt, Nelson Mandela University
* Laurette Marais, Council for Scientific and Industrial Research
* Mukhtar Raban, Nelson Mandela University
* Nomfundo Khumalo, Nelson Mandela University
* Menno Van Zaanen, South African Centre for Digital Language Resources

--
Prof Menno van Zaanen
menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org

Research Associate for the project “Predicting COVID-19 Vaccination Uptake from Public Discourse”
by Robert Fuchs 26 Jul '23

Research Associate (English linguistics) for the project “Predicting COVID-19 Vaccination Uptake from Public Discourse”
https://www.uni-hamburg.de/stellenangebote/ausschreibung.html?jobID=7d287ef…

Start date: 01.10.2023
Application deadline: 20.08.2023

The Department of English at the University of Hamburg is seeking a Research Associate to join a team of researchers working on an interdisciplinary project, funded through the Cross-Disciplinary Labs initiative, titled "Predicting COVID-19 Vaccination Uptake from Public Discourse: A Machine Learning Approach".

In this project, we will explore the relationship between public discourse and COVID-19 vaccination uptake and how to use real-world data from Germany and England to track public opinion on COVID-19 vaccination, with the ultimate aim of identifying strategies to increase the uptake of COVID-19 vaccinations. The analysis will apply big data and machine learning techniques to Twitter data and will link these to information on local vaccination rates. From a policy perspective, the output of this project will inform public health responses in real time in future pandemics. At the heart of the project is an interdisciplinary approach, combining health economics and linguistics with new methods from data science.

The successful candidate will work under the supervision of Professor Robert Fuchs (English Linguistics) and will join a vibrant working group focused on research in the fields of data-intensive discourse analysis, varieties of English and Learner Corpus Research (see https://sites.google.com/view/rflinguistics/home). We offer flexible working arrangements, a supportive research environment and the opportunity for professional growth through skills development in cutting-edge research methods in linguistics and data science.

Specific duties of the Research Associate include the collection and analysis of Twitter data, the preparation and co-authorship of research publications and conference presentations, participation in international conferences, and contributing to the strengthening of data science at the University of Hamburg through the Cross-Disciplinary Labs network. The position does not involve any teaching duties. The Research Associate is also encouraged to work towards a PhD, the topic of which should be broadly aligned with the Principal Investigator’s research interests, and may be, but does not need to be, connected to the project.

Applicants need to be in possession of a Master’s degree (or equivalent) by the starting date of the position. Applications should include a cover letter explaining the candidate’s qualifications and interest in the position, a CV, copies of Bachelor’s and Master’s degree certificates (where applicable), a representative piece of writing (e.g. MA thesis, term paper) demonstrating the candidate’s skills in academic writing, data analysis and/or English linguistics, as well as a PhD proposal (optional, no more than five pages).

Requirements

* A university degree in a relevant field
* Excellent proficiency in English
* Excellent skills in academic writing
* Experience with the statistical analysis of linguistic data (esp. with R and/or Python), the analysis of metaphorical language, corpus linguistics and/or discourse analysis is an advantage

--
Prof. Dr. Robert Fuchs (JP) | Department of English Language and Literature/Institut für Anglistik und Amerikanistik | University of Hamburg | Überseering 35, 22297 Hamburg, Germany | Room 07076 | https://uni-hamburg.academia.edu/RobertFuchs | https://sites.google.com/view/rflinguistics/

Mailing list on varieties of English/World Englishes/ENL-ESL-EFL. Subscribe here: https://groups.google.com/forum/#!forum/var-eng/join

Are you a non-native speaker of English? Please help us by taking this short survey on when and how you use the English language: https://lamapoll.de/englishusageofnonnativespeakers-1/

Call4Papers: GEM (Generation, Evaluation & Metrics Workshop) @ EMNLP 2023
by Kaustubh Dhole 26 Jul '23

Hello All,

The third edition of the Generation, Evaluation & Metrics (GEM) Workshop<https://gem-benchmark.com/> will be held as part of EMNLP<https://2023.emnlp.org/>, December 6-10, 2023, Singapore.

The GEM workshop aims to encourage the development of model auditing & human evaluation strategies, and to popularize model evaluations in languages beyond English. We welcome submissions related, but not limited to, the following topics:

* 💎 Automatic evaluation of generation systems (example<https://aclanthology.org/2021.gem-1.8/>, example<https://aclanthology.org/2021.gem-1.1/>, example<https://aclanthology.org/2022.gem-1.26/>)
* 💎 Creating NLG corpora and challenge sets (example<https://aclanthology.org/2022.tacl-1.4/>, example<https://openreview.net/forum?id=CSi1eu_2q96>, example<https://aclanthology.org/2022.gem-1.6/>)
* 💎 Critiques of benchmarking efforts and responsibly measuring progress in NLG (example<https://aclanthology.org/2020.emnlp-main.393/>, example<https://openreview.net/forum?id=j6NxpQbREA1>)
* 💎 Effective and/or efficient NLG methods that can be applied to a wide range of languages and/or scenarios (example<https://aclanthology.org/2020.tacl-1.47/>, example<https://aclanthology.org/2021.gem-1.16/>, example<https://aclanthology.org/2022.gem-1.1/>)
* 💎 Application and evaluation of generation models interacting with external data and tools (example<https://arxiv.org/abs/2302.04761>, example<https://arxiv.org/abs/2304.09842>, example<https://arxiv.org/abs/2302.07842>)
* 💎 Sociotechnical perspectives of employing large language models (example<https://dl.acm.org/doi/abs/10.1145/3531146.3533088>)
* 💎 Standardizing human evaluation and making it more robust (example<https://aclanthology.org/2021.tacl-1.87/>, example<https://aclanthology.org/2022.humeval-1.7/>, example<https://aclanthology.org/2022.gem-1.12/>)

If you are interested, you can check out the websites of the previous workshops at ACL 2021<https://gem-benchmark.com/workshop/2021> and EMNLP 2022<https://gem-benchmark.com/workshop/2022>.

Industrial Track - Unleashing the Power of NLP: Bridging the Gap between Academia and Industry

GEM 2023 is proud to announce the launch of its Industrial Track, which aims to provide actionable insights to industry professionals and to foster collaborations between academia and industry.

Shared Task

We are organizing a shared task focused on multilingual summarization, including human and automatic evaluation. The Shared Task will be run "Backwards": the workshop will serve as a platform to pre-register your hypotheses. More info on how to participate to come!

Important Dates

Note: For any questions, please email gem-benchmark-chairs(a)googlegroups.com<mailto:gem-benchmark-chairs@googlegroups.com>

Paper Submission Dates
* 📅 8 September 2023: Workshop paper submission deadline
* 📅 6 October 2023: Workshop paper notification deadline
* 📅 18 October 2023: Workshop paper camera ready deadline

Workshop Dates
* 📅 December 2023, EMNLP

Website: https://gem-benchmark.com/workshop

Regards,
Kaustubh Dhole
Twitter<https://twitter.com/KaustubhDhole> <http://in.linkedin.com/pub/kaustubh-dhol%C3%A9/2a/9b3/392>

Deadline extension 15 August: Workshop on Computational Terminology in NLP and Translation Studies (ConTeNTs)
by amalhaddad＠ugr.es 26 Jul '23

The 1st Workshop on Computational Terminology in NLP and Translation Studies (ConTeNTs)
Varna, 7th-8th September, 2023
In conjunction with RANLP 2023 – International Conference “Recent Advances in Natural Language Processing”

Final call for papers

Computational Terminology and new technologies applied to translation studies have attracted the interest of researchers with very different multidisciplinary backgrounds and motivations. These fields cover a range of areas in Natural Language Processing (NLP) such as information retrieval, terminology extraction, question-answering systems, ontology building, machine translation, computer-aided translation, automatic or semi-automatic abstracting, text generation, etc.

Terminological identification, extraction and coinage of new terms are essential for knowledge mining from texts, in both high- and low-resource languages. Rapid evolution and new developments in specialised domains require efficient and systematic automatic term management. New terms need to be coined and translated to ensure the equitable development of domains in all languages.

During the last decade, deep learning and neural methods have become the state of the art for most NLP applications. These methods have been shown to outperform previous ones on various tasks, including automatic term extraction, language mining, assessment of quality in machine translation, accessibility of terminology, etc.

On the one hand, NLP and computational linguistics try to improve the work of translators and interpreters by developing Computer-Assisted Translation (CAT) tools, Translation Memories (TMs), terminological databases, terminology extraction tools, etc. On the other hand, the NLP field still needs the efforts and knowledge of translators, interpreters and linguists to provide better services and tools based on the real necessities of those language professionals.

The aim of this workshop is to promote new insights into the ongoing and forthcoming developments in computational terminology by bringing together NLP experts, as well as terminologists and translators. By uniting researchers with such diverse profiles, we hope to bridge some of the gaps between these disciplines and inspire a dialogue between various parties, thus paving the way to more artificial intelligence applications based on mutual collaboration between language and technology.

Topics of Interest

The ConTeNTs workshop invites the submission of papers reporting on original and unpublished research on topics related to Computational Terminology in NLP and Translation Studies, including but not limited to:

- Automatic term extraction: monolingual and multilingual extraction of terms from parallel and comparable corpora, including single and multiword expressions;
- Extraction and acquisition of semantic relations between terms;
- Extraction and generation of domain-specific definitions and disambiguation of terms;
- Representation of terms, management of term variation and the discovery of synonym terms or term clusters and its relation to NLP applications;
- Extraction of terminological context through the use of comparable and parallel corpora;
- Accessibility of terminology in certain domains, relevant to non-experts or to laypersons, and its relevance to NLP applications such as chatbots, automatic email generation or spoken language interfaces;
- The impact of terminology on MT (applying terminology constraints, evaluation of MT in domain-specific settings, etc.);
- The creation of domain ontologies, thesauri and terminological resources in specialised domains;
- The use of new technologies in translation studies and research and the use of terminological resources in specialised translation;
- Identification of key problems in terminology and new technologies used in translation studies;
- Evaluation of terminological resources in various NLP applications and the impact these resources have on the performance of the automatic systems;
- Emerging language technologies: how the increased reliance on real-time language technologies would change the structure of language;
- Corpus-based studies applied to translation and interpreting: the use of parallel and comparable corpora for translating phraseological units;
- Phraseology and multiword expressions in cross-linguistic studies;
- Translation and interpreting tools, such as translation memories, machine translation and alignment tools;
- User requirements for interpreting and translation tools.

Submission Guidelines

Submissions must consist of full-text papers of at least 5 and at most 7 pages, excluding references. The accepted papers will be published as ConTeNTs workshop e-proceedings with ISBN, will be assigned a DOI and will also be available at the time of the conference. The papers should be in English.

Authors of accepted papers will receive guidelines regarding how to produce camera-ready versions of their papers for inclusion in the proceedings. Each submission will be reviewed by at least two programme committee members. Accepted papers will be presented orally as part of the programme of the workshop.

Submissions

Link to START system: https://softconf.com/ranlp23/ConTeNTS
Website of the workshop: https://contents2023.kulak.kuleuven.be/

Should you require any assistance with the submission, please do not hesitate to contact us at amalhaddad(a)ugr.es and ayla.rigoutsterryn(a)kuleuven.be.

Important Dates

Deadline for paper submission: 15 August 2023
Workshop camera-ready proceedings ready: 31 August 2023
ConTeNTs workshop: 7/8 September 2023

Workshop Chairs & Organising Committee

Ayla Rigouts Terryn, Katholieke Universiteit Leuven, Belgium
Amal Haddad Haddad, Universidad de Granada, Spain
Ruslan Mitkov, University of Wolverhampton, United Kingdom

Programme Committee

- Sophia Ananiadou (University of Manchester)
- Maria Andreeva Todorova (Bulgarian Academy of Sciences)
- Silvia Bernardini (University of Bologna)
- Melania Cabezas García (Universidad de Granada)
- Rute Costa (Universidade Nova de Lisboa)
- Esther Castillo Pérez (Universidad de Granada)
- Patrick Drouin (Université de Montréal)
- Pamela Faber (Universidad de Granada)
- Mercedes García de Quesada (Universidad de Granada)
- Dagmar Gromann (Centre for Translation Studies – University of Vienna)
- Tran Thi Hong Hanh (L3i Laboratory, University of La Rochelle)
- Rejwanul Haque (National College of Ireland)
- Amir Hazem (Nantes University)
- Kyo Kageura (University of Tokyo)
- Barbara Karsch (BIK Terminology – USA)
- Dorothy Kenny (Dublin City University)
- Miloš Jakubíček (Sketch Engine)
- Hendrik Kockaert (KU Leuven)
- Philipp Koehn (Johns Hopkins University)
- Maria Kunilovskaya (Saarland University)
- Marie-Claude L’Homme (Université de Montréal)
- Hélène Ledouble (Université de Toulon)
- Pilar León-Araúz (Universidad de Granada)
- Rodolfo Maslias (former Head of TermCoord, European Parliament)
- Silvia Montero Martínez (Universidad de Granada)
- Emmanuel Morin (LS2N-TALN)
- Rogelio Nazar (Pontificia Universidad Católica de Valparaíso)
- Sandrine Peraldi (University College Dublin)
- Silvia Piccini (Italian National Research Council)
- Thierry Poibeau (CNRS)
- Senja Pollak (Jožef Stefan Institute)
- Maria Pozzi Pardo (El Colegio de México)
- Tharindu Ranasinghe (Aston University)
- Arianne Reimerink (Universidad de Granada)
- Andres Repar (Jožef Stefan Institute)
- Christophe Roche (Université Savoie Mont-Blanc)
- Antonio San Martín Pizarro (Université du Québec à Trois-Rivières)
- Beatriz Sánchez Cárdenas (Universidad de Granada)
- Vilelmini Sosoni (Ionian University)
- Irena Spasic (Cardiff University)
- Elena Isabelle Tamba (Romanian Academy, Iași Branch)
- Rita Temmerman (Vrije Universiteit Brussel)
- Jorge Vivaldi Palatresi (Universitat Pompeu Fabra)

International workshop NLP for translation and interpreting applications (NLP4TIA)-Last Call for Papers
by Nanomi Arachchige, Isuri 26 Jul '23

International workshop NLP for translation and interpreting applications (NLP4TIA)
Varna, Bulgaria, 8 September 2023
https://nlp4tia.web.uah.es/

Last Call for Papers

***Extended deadline: 10 August 2023***

In the last two decades, we have witnessed a technological turn in translation and interpreting studies, with Natural Language Processing (NLP) and deep learning playing an increasingly prominent part. There is already a growing number of NLP applications that are used to support the work of translators and interpreters. In addition, the recent advances in (and latest models of) deep learning have powered the further development and success of high-performing Neural Machine Translation (NMT) systems.

Translation technology has revolutionised the translation profession, and nowadays most professional translators employ tools such as translation memory (TM) systems in their daily work. The latest advances in NMT have resulted in NMT not only becoming an integral part of most state-of-the-art TM tools but also typical for the translation workflow of many companies, organisations and freelance translators.

Although translation has benefited more from technological advances, interpreting has also experienced a technological turn. However, it is only in recent years that soft technology has permeated interpreting practice and research. Computer-assisted translation, MT and NLP tools have been adapted to be used by interpreters. In addition, corpus-based studies have also underpinned dialogue interpreting.

The increasing interest in NLP, MT and the automation of processes has brought us to multidisciplinary projects that deal with the development of models for automated oral communication. Machine interpreting has already been developed and is being improved, focusing on speed and accuracy matters.
Whether domain-specific (commercial, military, humanitarian) or general (Skype Translator), machine interpreting still has a long way to go to become more human-like.

Many of the above recent developments have to do with the employment of Natural Language Processing tools and resources to support the work of translators and interpreters. This workshop is expected to discuss the growing importance of NLP in different translation and interpreting scenarios.

Workshop topics

The workshop invites submissions reporting original unpublished work on topics including but not limited to:

* NLP and MT for under-resourced languages;
* Translation Memory systems;
* NLP and MT for translation memory systems;
* NLP for CAT and CAI tools;
* Integration of NLP tools in remote interpreting platforms;
* NLP for dialogue interpreting;
* Development of NLP based applications for communication in public service settings (healthcare, education, law, emergency services);
* Corpus-based studies applied to translation and interpreting;
* Machine translation and machine interpreting;
* Resources for translation and machine translation;
* Resources for interpreting and interpreting technology application;
* Quality estimation of human and machine translation;
* Post-editing strategies and tools;
* Automatic post-editing of MT;
* NLP and MT for subtitling;
* Technology acceptance by interpreters and translators;
* Machine Translation and translation tools for literary texts;
* Evaluation of machine translation and translation and interpreting tools in general;
* The impact of the technological turn in translation and interpreting;
* Cognitive effort and eye-tracking experiments in translation and interpreting;
* Development of models for research and practice of translation and interpreting;
* Multidisciplinary cooperation in NLP applied to translation and interpreting.

Submissions and publication

Submissions must consist of full-text papers of at least 5 and at most 7 pages, excluding references. The accepted papers will be published as NLP4TIA workshop e-proceedings with ISBN, will be assigned a DOI and will also be available at the time of the conference. The papers should be in English and should be submitted via the conference management system START using this link<https://softconf.com/ranlp23/NLP4TIA/>.

Authors of accepted papers will receive guidelines regarding how to produce camera-ready versions of their papers for inclusion in the proceedings. Each submission will be reviewed by at least two programme committee members. Accepted papers will be presented orally as part of the programme of the workshop.

Submissions should comply with the templates below and should be uploaded as pdf files in START (START is configured to accept pdf files only). The following templates should be used: LaTeX at Overleaf<https://www.overleaf.com/latex/templates/instructions-for-ranlp-2023-procee…>, LaTeX<http://ranlp.org/ranlp2023/Templates/ranlp2023-LaTeX.zip>, MS Office<http://ranlp.org/ranlp2023/Templates/ranlp2023-word.docx>

Important dates

Deadline for paper submission: 23 July 2023
Deadline for paper submission (extended): 10 August 2023
Acceptance notification: 20 August 2023
Final camera-ready version: 30 August 2023
Workshop camera-ready proceedings ready: 3 September 2023
NLP4TIA workshop: 8 September 2023

Workshop Chairs

Raquel Lázaro Gutiérrez (Universidad de Alcalá)
Antonio Pareja Lora (Universidad de Alcalá)
Ruslan Mitkov (Lancaster University)

Programme Committee

Cristina Aranda (Big Onion)
Juanjo Arevalillo (Hermes Traducciones)
Silvia Bernardini (University of Bologna)
Gabriel Cabrera Méndez (Dualia Teletraducciones)
Matt Coler (University of Groningen)
Gloria Corpas Pastor (University of Malaga)
Elena Davitti (University of Surrey)
Joanna Drugan (Heriot-Watt University)
Marie Escribe (LanguageWire)
Claudio Fantinuoli (Mainz University/KUDO Inc)
Antonio García Cabot (Universidad de Alcalá)
Adriana Jaime Pérez (Migralingua Voze)
Miguel Ángel Jiménez Crespo (Rutgers University)
Óscar Luis Jiménez Serrano (University of Granada)
Koen Kerremans (Free University Brussel)
Maria Kunilovskaya (Saarland University)
Els Lefever (Ghent University)
Pilar León Arauz (University of Granada)
Johanna Monti (University of Naples L'Orientale)
Elena Montiel Ponsoda (Polytechnic University of Madrid)
Helena Moriz (University of Lisbon)
Elena Murgolo (Orbital 14)
Dora Murgu (Interprefy)
Constantin Orasan (University of Surrey)
María Teresa Ortego Antón (University of Valladolid)
Tharindu Ranasinghe (Aston University)
Celia Rico (Universidad Complutense de Madrid)
Caroline Rossi (University Grenoble les Alpes)
María del Mar Sánchez Ramos (Universidad de Alcalá)
Miriam Seghiri (University of Malaga)
Vilelmini Sosoni (Ionian University)
Rui Manuel Sousa Silva (University of Porto)
Nicoletta Spinolo (University of Bologna)

Venue

The workshop will take place at hotel Cherno More<https://www.chernomorebg.com/en/> in Varna.

Further information and contact details

Registration for NLP4TIA is now open and is done via the RANLP main conference page. To register, please complete the registration form<https://url6.mailanyone.net/scanner?m=1pii0v-000B6E-3x&d=4%7Cmail%2F14%2F16…>.

The conference website (https://nlp4tia.web.uah.es/) will be updated on a regular basis. For further information, please email raquel.lazaro(a)uah.es<mailto:raquel.lazaro@uah.es>.

Deep Learning Summer School at RANLP 2023 - Call for participation
by Nanomi Arachchige, Isuri 26 Jul '23

DLinNLP 2023 - Deep Learning Summer School at RANLP 2023
Call for Participation
Varna, Bulgaria
30th August - 1st September
https://dlinnlp2023.github.io/

We invite everyone interested in Machine Learning and Natural Language Processing to attend the Deep Learning Summer School at the 14th biennial RANLP conference (RANLP 2023).

Purpose:

Deep Learning is a branch of machine learning that has gained significant traction in the field of Artificial Intelligence, pushing the envelope in the state of the art, with many sub-areas including natural language, image, and speech processing employing it widely in their best-performing models.

This summer school will feature presentations from outstanding researchers in the field of Natural Language Processing (NLP) and Deep Learning. These will include coverage of recent advances in theoretical foundations and extensive practical coding sessions showcasing the latest relevant technology.

The summer school will be of interest to novices and established practitioners in the fields of NLP, corpus linguistics, language technologies, and related areas.

Important Dates:

30 August - 1 September: Deep Learning Summer School in NLP

Lectures:

* Lucas Beyer (Google Brain)
* Tharindu Ranasinghe (Aston University, UK)
* Iacer Calixto (University of Amsterdam, Holland)

Practical Sessions:

* Damith Premasiri (University of Wolverhampton, UK)
* Isuri Anuradha (University of Wolverhampton, UK)
* Anthony Hughes (University of Wolverhampton, UK)

Registration:

Registration is now open: https://ranlp.org/ranlp2023/index.php/fees-registration/

Programme:

Please refer to the website for the details of the programme: https://dlinnlp2023.github.io/#programme

Contact Email: dlinnlp2023(a)gmail.com<mailto:dlinnlp2023@gmail.com>

IWSLT 2024 - Call for the Shared Tasks
by Atul K. Ojha 26 Jul '23

Apologies for cross-posting.

----------------------------------------

The International Conference on Spoken Language Translation (IWSLT) <https://iwslt.org/> is the premier annual conference for all aspects of Spoken Language Translation. Every year, the conference organizes and sponsors open evaluation campaigns around key challenges in simultaneous and consecutive translation, under real-time/low latency or offline conditions, and for a variety of languages in under-resourced or multilingual conditions. System descriptions and results from participants’ systems and scientific papers related to key algorithmic advances and best practices are presented. IWSLT is the venue of the SIGSLT, the Special Interest Group on Spoken Language Translation of ACL, ISCA, and ELRA.

With a track record of 20 years, IWSLT benchmarks and proceedings serve as a reference for all researchers and practitioners working on speech translation and related fields. 2024 will mark IWSLT’s 21st edition.

There are many challenges in speech translation that have not yet been addressed. Among them, we are particularly interested in topics related to new application scenarios (e.g. meetings, subtitling, dubbing), specific aspects (e.g. names, accents), different styles, multilinguality, discourse and summarization, multimodal and multi-party speech translation, or many other ideas that researchers have not yet focused on.

Therefore, we invite proposals for shared tasks. For more details about this initiative, please refer to https://iwslt.org/assets/pdfs/IWSLT2024-Call_for_Tasks.pdf

If you want to propose a new task to encourage researchers around the world to work on particular timely challenges in SLT, please fill out the following form <https://iwslt.org/assets/pdfs/IWSLT2024-Call_for_Tasks.pdf> and send it to iwslt-organizers(a)googlegroups.com by August 31st, 2023.

Best,
Marine, Marcello, Alex, Jan, Sebastian, Elizabeth, Atul
IWSLT Organisers


Any Europarl Corpus update?
by alexandr.rosen＠ff.cuni.cz 26 Jul '23


Dear All, I'm wondering if there is a more recent version of the "European Parliament Proceedings Parallel Corpus 1996-2011" (https://www.statmt.org/europarl/). Alternatively, any experience downloading the EP proceedings using https://data.europarl.europa.eu/en/developer-corner/opendata-api would be welcome. Thanks! Alexandr


Funded PhD Natural language processing of electronic health records to improve empirical prescribing in acute admissions
by Mark Lee 26 Jul '23


Application Deadline: 30 August 2023 Details This project focuses on managing the single greatest threat to global health, the increasing burden from infections caused by bacteria that are resistant to antibiotics (antimicrobial resistance, AMR). Doctors (humans) can’t reliably know which antibiotic to administer in an emergency. In fact, based on our earlier research, they get it wrong about 20% of the time. A serious bacterial infection will look the same whether the bacteria causing the infection are resistant to certain antibiotics or not, and the first antibiotic must be selected on very limited information and be given within the first hour of admission to hospital if there is a risk that the patient has developed an infection that is spreading through their body. Understandably, this ‘high stakes’ uncertainty promotes the use of ‘broad-spectrum’ antibiotics which should be held in reserve for known drug-resistant infections. Natural language processing (NLP) has the potential to safely unlock successful antimicrobial stewardship for AMR at the first dose. In earlier work, we used quantitative and categorical data from electronic health records (EHRs) from patients who needed emergency hospital admission to see which antibiotics were given in the emergency room, how often a patient was prescribed an antibiotic that their bacterial infection was resistant to (under-prescribing), and how often a broad-spectrum antibiotic was used when another antibiotic alternative would have been equally effective (over-prescribing). We trained a machine learning algorithm that was allowed to under-prescribe at the same rate as doctors (about 20% of the time) and that could also reduce the use of broad-spectrum antibiotics by about 40% by anticipating which patients were unlikely to have an AMR infection. This powerful proof-of-concept work shows the huge potential for AI in personalised medicine and antimicrobial stewardship at the first and most important dose. 
Taking the next steps in AI for AMR. We know that a lot of important information is held in free-text clinician notes that isn’t reflected in the data we used to build the model, and we want to understand what valuable information contained in the free-text data would help improve prediction accuracy. This project aims to analyse free-text clinician notes to retrieve valuable information that can improve the prescribing of antibiotics by more accurately predicting an individual patient’s risk of having an antibiotic-resistant infection. We are seeking a motivated student to undertake a 4-year funded PhD, in collaboration with Shionogi, a pharmaceutical company with offices in London. Eligibility The successful candidate will hold a bachelor’s degree (or above) in Computer Science, Physics, Mathematics, Psychology or related discipline and have proven experience in computational linguistics, natural language processing, machine learning. Previous experience of applying AI methods to the medical domain is a strong advantage. Furthermore, the candidate will have strong programming skills, expertise in machine learning approaches and be excited by the challenges of interdisciplinary research between medicine and computer science. We want our PhD student cohorts to reflect our diverse society. UoB is therefore committed to widening the diversity of our PhD student cohorts. UoB studentships are open to all and we particularly welcome applications from under-represented groups, including, but not limited to BAME, disabled and neuro-diverse candidates. We also welcome applications for part-time study. The University of Birmingham works closely with University Hospitals Birmingham NHS Foundation Trust (UHB), which is the single-largest Acute NHS Trust in the UK, and serves the healthcare needs of over 1.2m people in the second-largest city in the UK. 
PIONEER, the Health Data Research Hub for Acute Care, alone includes >1.2m patient episodes per year with >10yrs longitudinal health data. This experienced collaboration means we are uniquely positioned to develop, model and then later embed AI-supported antimicrobial stewardship within a clinical trial and electronic prescribing systems. The student will be located at the Institute of Microbiology and Infection (IMI) of the University of Birmingham, the largest academic research institute in the field of microbiology and infectious diseases in the United Kingdom. The IMI is part of the School of Medical and Dental Sciences, defining the future of health and medicine through the provision of innovative education and exceptional research. Throughout the PhD project, regular meetings with industry partner colleagues at Shionogi will be held to monitor progression and support the student in their research. About Shionogi Established in Japan 140 years ago, Shionogi has a history of drug discovery and scientific rigour in addressing some of the toughest challenges in healthcare. Shionogi’s work in antimicrobial resistance (AMR) is a key part of our contribution to the UN Sustainable Development Goals (SDGs) - we invest the highest proportion of our pharmaceutical revenues in relevant anti-infectives R&D compared to other large pharmaceutical companies. Shionogi announced the first-ever licence agreement for an antibiotic to treat serious bacterial infections between a pharmaceutical company and a non-profit organisation driven by public health priorities. Working with the Global Antibiotic Research and Development Partnership (GARDP) and the Clinton Health Access Initiative (CHAI), the agreement aims to provide 135 countries with access. At Shionogi, our belief is that sustainable growth hinges not only on new drug creation, but also on consolidating our strengths in areas of strategic focus. 
Through external partnerships, we seek to bring benefits to more patients through collaboration in areas where it would be difficult for us to go it alone. Globally, our many partners, spanning a range of industries as well as academia, enable us to accelerate innovation to better help societies manage some of the most important public health threats and to take on areas where the unmet clinical need is greatest. Funding Notes The position offered is for three and a half years of full-time study. The current (2023-24) value of the award is stipend: £18,622 pa; tuition fee: £4,712 pa. Awards are usually incremented on 1 October each following year. The package includes a MacBook Air and funding for additional training and conference attendance. References Moran E, Robinson E, Green C, Keeling M, Collyer B. Towards personalized guidelines: using machine-learning algorithms to guide antimicrobial selection. J Antimicrob Chemother. 2020. doi:10.1093/jac/dkaa222 Cavallaro M, Moran E, Collyer B, McCarthy ND, Green C, Keeling MJ. Informing antimicrobial stewardship with explainable AI. bioRxiv. 2022. doi:10.1101/2022.08.12.22278678 https://www.findaphd.com/phds/project/natural-language-processing-of-electr… With best regards, Mark Lee Professor of Artificial Intelligence School of Computer Science University of Birmingham http://www.cs.bham.ac.uk/~mgl


HASOC 2023 tasks - Call for Participation - Hate Speech and Offensive Content Identification
by Thomas Mandl 26 Jul '23


15th meeting of the Forum for Information Retrieval Evaluation: HASOC-2023
We are excited to announce the 5th edition of HASOC, consisting of four interesting shared tasks. We invite you to participate.
Task 1 focuses on identifying hate speech, offensive language, and profanity in different languages using natural language processing techniques.
* Task 1A deals with identifying hate and offensive content in Sinhala, a low-resource Indo-Aryan language spoken in Sri Lanka. The task involves classifying tweets into Hate and Offensive (HOF) or Non-Hate and Offensive (NOT). The dataset for this task is based on the Sinhala Offensive Language Detection dataset.
* Task 1B focuses on identifying hate and offensive content in Gujarati, another low-resource Indo-Aryan language spoken by approximately 50 million people in India. Similarly, participants need to classify tweets into HOF or NOT categories. The training set for this task consists of around 200 tweets.
For more details, please visit the Task 1 page <https://hasocfire.github.io/hasoc/2023/task1.html>.
Task 2, Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL), addresses the challenge of identifying hate speech and offensive content in code-mixed conversations on social media. Code-mixed text includes multiple languages within a single conversation. The task is divided into two subtasks.
* In Task 2a, participants need to perform binary classification on conversational tweets with tree-structured data. They must determine whether a tweet, comment, or reply contains hate speech, offensive language, or profanity (HOF) or is non-hate and offensive (NOT). The classification should consider both the individual content and support for hate expressed in the parent tweet.
* Task 2b involves the classification of conversational tweets with tree-structured data into specific forms of hate.
Participants must identify whether the tweet, comment, or reply contains standalone hate (SHOF), contextual hate (CHOF) that supports hate expressed in the parent, or is non-hate (NONE).
For more details, please visit the Task 2 webpage <https://hasocfire.github.io/hasoc/2023/ichcl.html>.
Task 3 aims to detect hateful spans within a sentence already considered hateful. A hate span is a set of contiguous tokens that, in tandem, communicate the explicit hatefulness in a sentence.
* For instance, in the statement "Women ... Can't live with them... Can't shoot them," the portion "Can't shoot them" is considered a hateful span. This shared task aims to extract all such spans from a hateful text.
* The input texts are all in English. The detection of hateful spans is framed as a sequence labeling problem. For every token of the sequences, we have manually annotated the start and end of a hateful span. This is done with BIO tagging, where "B" marks the beginning of a hate span, "I" marks the continuation of a hate span, and "O" marks a non-hate token. The task is then to learn the correct sequence of BIO tags for a given sentence. For example, for the above sentence, the tag sequence for the preprocessed sentence will be of the form "women can't live with them can't shoot them" → "O O O O O B I I". An "I" tag cannot exist on its own and will always be preceded by either an "I" or a "B". A "B" tag, however, can be immediately followed by an "O" when the span is just a single word.
For more details, please visit the Task 3 webpage <https://lcs2.in/hatenorm-2023/>.
Task 4 aims to detect hate speech in the Bengali, Bodo, and Assamese languages. It is a binary classification task. Each dataset (one for each of the three languages) consists of a list of sentences with their corresponding class (hate or offensive (HOF) or not hate (NOT)). Data is primarily collected from Twitter, Facebook, and YouTube comments.
* The macro F1 score will be the evaluation metric for the task. Team rank will be determined based on the macro F1 score of the first part.
For more details, please visit the Task 4 webpage <https://sites.google.com/view/hasoc-2023-annihilate-hates/home>.
Registration for all four tasks is open on our registration page <https://hasocfire.github.io/hasoc/2023/registration.html>.
We believe that your expertise and contribution will be invaluable in advancing the state of the art in hate speech classification. We encourage you to participate in this exciting shared task and contribute to the research community.
Regards, HASOC organizing team
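For readers new to the two technical notions in the call above, the BIO tagging used in Task 3 and the macro F1 score used for ranking in Task 4, here is a minimal Python sketch. It is not the organizers' code: the function names (`is_valid_bio`, `extract_spans`, `macro_f1`) are hypothetical, the bare "B"/"I"/"O" tag strings are assumed from the example in the call, and the choice to score an empty class as 0.0 is an assumption rather than the official scoring rule.

```python
from typing import List, Sequence, Tuple


def is_valid_bio(tags: Sequence[str]) -> bool:
    """Check the rule from the call: an 'I' must follow a 'B' or an 'I'."""
    prev = "O"
    for tag in tags:
        if tag not in {"B", "I", "O"}:
            return False
        if tag == "I" and prev == "O":
            return False
        prev = tag
    return True


def extract_spans(tokens: Sequence[str], tags: Sequence[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, text) for every tagged span."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    if not is_valid_bio(tags):
        raise ValueError("invalid BIO sequence")
    spans = []
    start = None
    # A trailing sentinel 'O' closes a span that runs to the end of the sentence.
    for i, tag in enumerate(list(tags) + ["O"]):
        if tag != "I" and start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        if tag == "B":
            start = i
    return spans


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted instances scores 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# The example sentence and tag sequence from the Task 3 description.
tokens = "women can't live with them can't shoot them".split()
tags = "O O O O O B I I".split()
print(extract_spans(tokens, tags))  # [(5, 8, "can't shoot them")]

# Made-up labels, only to illustrate the metric on a HOF/NOT task.
gold = ["HOF", "HOF", "NOT", "NOT"]
pred = ["HOF", "NOT", "NOT", "NOT"]
print(round(macro_f1(gold, pred), 3))  # 0.733
```

Because macro F1 averages the per-class scores without weighting by class size, a system that does poorly on the rarer class is penalised more than it would be under plain accuracy.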

