- Corpora - ELRA lists

First CFP: LoResMT 2024 at ACL 2024
by Atul K. Ojha 27 Feb '24


Apologies for cross-posting.

---------------------------------------------------------------------------
The Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024)
https://www.loresmt.org/
@ ACL 2024 (August 11–16, 2024)
Bangkok, Thailand

SUBMISSION
https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/LoResMT

TIMELINE
Paper submission due: May 17 (Friday), 2024, at 23:59 (Anywhere on Earth)
Notification of acceptance: June 17 (Monday), 2024
Camera-ready papers due: July 1 (Monday), 2024, at 23:59 (Anywhere on Earth)
Workshop dates at ACL: August 15, 2024

SCOPE
Based on the success of past low-resource machine translation (MT) workshops at AMTA 2018 (https://amtaweb.org/), MT Summit 2019 (https://www.mtsummit2019.com), AACL-IJCNLP 2020 (http://aacl2020.org/), AMTA 2021, COLING 2022 and EACL 2023, we introduce the Seventh LoResMT Workshop at ACL 2024. The workshop provides a discussion panel for researchers working on MT systems/methods for low-resource and under-represented languages in general. We would like to help review/overview the state of MT for low-resource languages and define the most important directions. We also solicit papers dedicated to supplementary NLP tools that are used in any language and especially in low-resource languages. Overview papers on these NLP tools are very welcome. It will be beneficial if the evaluations of these tools in research papers include their impact on the quality of MT output.

TOPICS
We are highly interested in (1) original research papers, (2) review/opinion papers, and (3) online systems on the topics below; however, we welcome all novel ideas that cover research on low-resource languages.
- Neural machine translation (NMT) for low-resource languages
- Use of LLMs (large language models) for low-resource MT systems
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation

SUBMISSION INFORMATION
We are soliciting two types of submissions: (1) research, review, and position papers and (2) system demonstration papers. For research, review and position papers, the length of each paper should be at least four (4) and not exceed eight (8) pages, plus unlimited pages for references. For system demonstration papers, the limit is four (4) pages. Submissions should be formatted according to the official ACL 2024 style templates. Accepted papers will be published online in the ACL 2024 proceedings and will be presented at the conference.

Submissions must be anonymized and should be done using the provided submission system. Scientific papers that have been or will be submitted to other venues must be declared as such and must be withdrawn from the other venues if accepted and published at LoResMT. The review will be double-blind.

Authors of an accepted paper should present their paper in person at ACL 2024. Papers should be submitted in PDF to the LoResMT Open Review. We would like to encourage authors to cite papers written in ANY language that are related to the topics, as long as both original bibliographic items and their corresponding English translations are provided. Registration is handled by the main conference (https://2024.aclweb.org/).

ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jade Abbott, Retro Rabbit
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Varvara Logacheva, Skolkovo Institute of Science and Technology
Xiaobing Zhao, Minzu University of China

PROGRAM COMMITTEE (LISTED ALPHABETICALLY)
Abigail Walsh, ADAPT Centre, Dublin City University, Ireland
Alberto Poncelas, Rakuten, Singapore
Alina Karakanta, Leiden University
Amirhossein Tebbifakhr, Fondazione Bruno Kessler
Anna Currey, Amazon Web Services
Aswarth Abhilash Dara, Amazon
Arturo Oncevay, University of Edinburgh
Atul Kr. Ojha, DSI, University of Galway & Panlingua Language Processing LLP
Barry Haddow, University of Edinburgh
Bogdan Babych, Heidelberg University
Chao-Hong Liu, Potamu Research Ltd
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Diptesh Kanojia, University of Surrey, UK
Duygu Ataman, University of Zurich
Ekaterina Vylomova, University of Melbourne, Australia
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Flammie Pirinen, UiT The Arctic University of Norway, Tromsø
Koel Dutta Chowdhury, Saarland University (Germany)
Jade Abbott, Retro Rabbit
Jasper Kyle Catapang, University of the Philippines
Jindřich Libovicky, Charles University
John P. McCrae, DSI, University of Galway
Liangyou Li, Noah’s Ark Lab, Huawei Technologies
Majid Latifi, University of York, York, UK
Maria Art Antonette Clariño, University of the Philippines Los Baños
Mathias Müller, University of Zurich
Nathaniel Oco, De La Salle University (Philippines)
Rajdeep Sarkar, Yahoo
Rico Sennrich, University of Zurich
Saliha Muradoglu, The Australian National University
Sangjee Dondrub, Qinghai Normal University
Santanu Pal, WIPRO AI
Sardana Ivanova, University of Helsinki
Shantipriya Parida, Silo AI
Sunit Bhattacharya, Charles University
Surafel Melaku Lakew, Amazon AI
Wen Lai, Center for Information and Language Processing, LMU Munich
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University

CONTACT
Please email loresmt(a)googlegroups.com if you have any questions/comments/suggestions.


Extended deadline [March 6th]: 7th Workshop on Indian Language Data: Resources and Evaluation (WILDRE) @LREC-COLING 2024
by Atul K. Ojha 27 Feb '24


Apologies for cross-posting. Please circulate this call for wider publicity.

---------------------------------------------------------------------------
7th Workshop on Indian Language Data: Resources and Evaluation (WILDRE)
Venue: Lingotto Conference Centre - Torino, Italy
(Organized under LREC-COLING 2024, 20-25 May 2024, https://lrec-coling-2024.org/)
Website: http://sanskrit.jnu.ac.in/conf/wildre7

WILDRE-7, the 7th Workshop on Indian Language Data: Resources and Evaluation, is proposed to be organised at the Lingotto Conference Centre - Torino, Italy, under the LREC-COLING platform. India has huge linguistic diversity and has seen concerted efforts from the Indian government and industry to develop language resources. The European Language Resource Association (ELRA) and its associate organizations have been very active and successful in addressing the challenges and opportunities related to language resource creation and evaluation. It is therefore a big opportunity for resource creators of Indian languages to showcase their work on this platform and also to interact with and learn from those involved in similar initiatives all over the world.

The broader objectives of WILDRE will be
- To map the status of Indian Language Resources
- To investigate challenges related to creating and sharing various levels of language resources
- To promote a dialogue between language resource developers and users
- To provide an opportunity for researchers from India to collaborate with researchers from other parts of the world

*Dates for Short/Long papers and Posters and Demos*
March 06, 2024 (extended from February 28, 2024): Paper submissions due
March 28, 2024: Notification of paper acceptance
April 10, 2024: Camera-ready papers due

SUBMISSIONS
Papers must describe original, completed or in-progress, and unpublished work. Each submission will be reviewed by three program committee members. Accepted papers will be given up to 10 pages (for full papers) or 5 pages (for short papers and posters) in the workshop proceedings, and will be presented as an oral paper or poster. Papers should be formatted according to the LREC-COLING style sheet, which is provided on the LREC-COLING 2024 website (https://lrec-coling-2024.org/authors-kit/). Papers should be submitted in PDF format to the LREC-COLING website (https://softconf.com/lrec-coling2024/wildre-7/).

We are seeking submissions under the following categories
- Full papers (10 pages)
- Short papers (work in progress: 5 pages)
- Posters (innovative ideas/proposals, research proposals of students)
- Demo (of working online/standalone systems)

WILDRE-7 will have a special focus on Demos of Indian Language Technology. In the past few years, as more resources have been developed and made available, there has been increased activity in developing usable technology using these. WILDRE-7 would like to encourage and widen the Demo track to allow the community to showcase their demos and have mutually beneficial interactions with each other as well as with resource developers.

WILDRE-7 is seeking full papers, short papers, posters and demos on the following topics related to Indian Language Resources:
- Digital Humanities, heritage computing
- Corpora: text, speech, multimodal, methodologies, annotation and tools
- Lexicons and Machine-readable dictionaries
- Ontologies, Grammars
- Language resources for NLP/IR/Speech tasks, tools and Infrastructure for language resources
- Standards or specifications for language resources application
- Licensing and copyright issues
- Data mining
- Text summarization

Both submission and review processes will be handled electronically. The review process will be double-blind. The workshop website will provide the submission guidelines and the link for the electronic submission.

When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments, including evaluation ones. For further information on this initiative, please refer to https://lrec-coling-2024.org/

Shared Task
Following the success of the previous WILDRE workshops, WILDRE-7 will include *Code-mixed Less-Resourced Sentiment Analysis (Code-mixed)* and *Discourse Machine Translation (DiscoMT)* Shared Tasks. The organizers of the shared tasks will provide datasets and evaluation platforms to evaluate systems developed by the participants. For further information on this initiative, please refer to http://sanskrit.jnu.ac.in/conf/wildre7

Workshop *Organisers*
- Girish Nath Jha, Jawaharlal Nehru University, India
- Kalika Bali, Microsoft Research India Lab, Bangalore, India
- Sobha L, AU-KBC, Anna University, Chennai, India
- Atul Kr. Ojha, University of Galway, Ireland & Panlingua Language Processing LLP, India

Workshop contact: Atul Kr. Ojha, University of Galway, Ireland & Panlingua Language Processing LLP, India, shashwatup9k(a)gmail.com

Identify, Describe and Share your LRs
Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.

As scientific work requires accurate citations of referenced work to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC-COLING 2024 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC-COLING papers will be offered at submission time.


Survey of English Usage UCL Annual Report 2023
by Aarts, Bas 27 Feb '24


Dear colleagues,

The Survey of English Usage Annual Report for 2023 can be viewed here: <https://www.ucl.ac.uk/english-usage/archives/2023report.htm>

Apologies for cross-posting.

Best wishes,
Bas

Prof. Bas Aarts
Department of English Language and Literature
UCL

Grammarianism Blog: http://bit.ly/1d1zKzN
Continuous Professional Development and INSET courses for teachers: https://bit.ly/39qnKIH
Twitter: @UCLEnglishUsage and @EngliciousUCL

Note: I respect your work/life balance. If I send you an email outside of your normal working hours there is no expectation that you will read or respond to the message at that time.


1st CFP-GITT-2024 at EAMT 2024
by bsavoldi＠fbk.eu 27 Feb '24


1st CALL FOR PAPERS

Second International Workshop on Gender-Inclusive Translation Technologies (GITT) at EAMT 2024
27 June 2024, Sheffield, UK
https://sites.google.com/tilburguniversity.edu/gitt2024
@GITT2024

** Important Dates ** (Time zone: Anywhere on Earth)
Submission deadline: 15 April, 2024
Notification of Acceptance: 15 May, 2024
Camera Ready Copy due: 24 May, 2024
Workshop: 27 June, 2024

** Aim and scope **
The Gender-Inclusive Translation Technologies Workshop (GITT) sets out to be the only dedicated workshop that focuses on gender-inclusive language in translation and cross-lingual scenarios. The workshop aims to bring together researchers from diverse areas, including industry partners, MT practitioners, and language professionals. GITT aims to encourage multidisciplinary research that develops and interrogates both solutions and challenges for addressing bias and promoting gender inclusivity in MT and translation tools, including LLM applications for the translation task.

** Topics **
GITT invites technical as well as non-technical submissions, which consist of experimental, theoretical or methodological contributions. We explicitly welcome interdisciplinary submissions and submissions that focus on innovative, non-binary linguistic strategies and/or with sociolinguistically-informed perspectives. The topics of interest include, but are not limited to:
- Models or methods for assessing and mitigating gender bias
- New resources for inclusive language and gender translation (e.g., datasets, translation memories, dictionaries)
- Social, cross-lingual, and ethical implications of gender bias
- Qualitative and quantitative analyses on the potential limits of current approaches to gender bias in translation and MT, error taxonomies as well as best practices and guidelines
- User-centric case studies on the impact of biased language and/or mitigating approaches which can include translators, post-editors, or monolingual MT users

GITT is also open to other non-listed topics aligned with the scope of the workshop and works focusing on non-textual modalities (e.g., audiovisual translation).

** Submission **
We welcome three types of submissions:
- Research papers: at least 4 and up to 10 pages (including references)
- Extended Abstracts: up to 2 pages (including references)
- Research Communications: up to 2 pages (including references)

Accepted papers and extended abstracts consisting of novel work will be published online as proceedings in the ACL Anthology. For research communications, we include a parallel submission policy for papers accepted in other venues in 2023. Research communications will not be included in the proceedings, but will serve to promote the dissemination of research aligned with the scope of the workshop.

Submissions should adhere to the EAMT 2024 guidelines and style templates (PDF, LaTeX, Word) and be uploaded on OpenReview: https://openreview.net/group?id=EAMT.org/2024/Workshop/GITT

** Workshop organizers **
Beatrice Savoldi, Fondazione Bruno Kessler
Janiça Hackenbuchner, University of Ghent
Luisa Bentivogli, Fondazione Bruno Kessler
Eva Vanmassenhove, University of Tilburg
Joke Daems, University of Ghent
Jasmijn Bastings, Google DeepMind


ESRC-funded PhD available - please promote
by Bell, Melanie 27 Feb '24


Dear colleagues,

Ian van der Linde and I are recruiting for an ESRC-funded PhD studentship on ‘The role of gesture in spoken communication’. We are looking for people with a background in computational linguistics/NLP, psycholinguistics, or cognitive science, who ideally have experience of using large language corpora and/or data collection with human participants as well as programming skills in R, MATLAB, or Python. The successful candidate will have the opportunity to develop their own project in consultation with us, in the general topic area of gesture and speech.

Full details can be found at this link: https://www.esrcdtp.group.cam.ac.uk/current-studentship-opportunities/

Applications close on 15 March 2024. We would appreciate it if you could forward this information to anyone who might be suitable and interested or to any relevant lists. Please do encourage potential candidates to get in touch with us; we would be happy to discuss this further.

Yours sincerely
Melanie Bell

Professor Melanie J. Bell
Professor of Quantitative Linguistics
ARU, East Road, Cambridge, CB1 1PT
aru.ac.uk <http://www.aru.ac.uk/>


Call for participation: DETESTS-Dis IberLEF 2024
by Simona Frenda 27 Feb '24


Please consider contributing and/or forwarding to appropriate colleagues and groups.

*******We apologize for the multiple copies of this e-mail******

---------------------------------------------------------------------------
Call for Participation
---------------------------------------------------------------------------

DETESTS-Dis IberLEF 2024
Task: DETESTS-Dis (DETEction and classification of racial Stereotypes in Spanish – Learning with Disagreement)

This task will be part of IberLEF 2024, the 6th Workshop on Iberian Languages Evaluation Forum at the SEPLN 2024 Conference, which will be held in Valladolid, Spain, on September 24th.

---------------------------------------------------------------------------

Here, we introduce the second edition of the DETESTS task (Ariza-Casabona, 2022), which was first presented at IberLEF 2022. The aim of the new edition, DETESTS-Dis, is to detect and classify explicit and implicit stereotypes in texts from social media and comments on news articles, incorporating learning with disagreement techniques. A description of both subtasks follows:

- Subtask 1, Stereotype Identification: This is a binary classification task, the aim of which is to determine whether a comment or sentence contains at least one stereotype or none, considering the full distribution of labels provided by the annotators. This subtask follows the SemEval 2021 Task 12 (Uma et al., 2021) proposal about learning with disagreement, in which the authors state that there does not necessarily exist a single gold label for every sample in the dataset. This fact is particularly evident when multiple contradictory annotations arise at the data labeling stage due to “debatable, subjective, or linguistic ambiguity”. The actual gold label of this subtask serves as a proxy to determine the subset of comments that will be evaluated in the subsequent subtask.

- Subtask 2 (Optional), Implicitness Identification: This subtask introduces a novel binary classification problem to determine whether the stereotype is manifested or latent within the text, that is, whether the stereotype is explicit or implicit. The added difficulty in this case is that implicit stereotypes are not directly expressed in the text, and a process of inference must be applied by the annotators. Moreover, there are different strategies in which an implicit stereotype can be coded, such as metaphors, irony and other figures of speech, evaluations of the in-group, and the overgeneralization of a social group from features of some of its members. This subtask will be presented as a hierarchical binary classification problem.

Although we recommend participating in both subtasks, participants are allowed to participate in just one of them (e.g., subtask 1). Teams will be allowed (and encouraged) to submit multiple runs (max. 5).

To avoid any conflict with the sources of the comments regarding their intellectual property rights (IPR), the data will be sent privately to each participant who is interested in the task. The corpus will only be made available for research purposes.

Important dates (All deadlines are 11:59 PM UTC-12:00):
Training dataset release: March 04, 2024
Test dataset release: April 15, 2024
Systems results: April 29, 2024
Results notification: May 13, 2024
Working papers submission: June 3, 2024
Working papers (peer-)reviewed: June 17, 2024
Camera-ready versions: July 4, 2024
Workshop: September 24, 2024

Task organizers:
- Mariona Taulé (Universitat de Barcelona, UB)
- Wolfgang Schmeisser (Universitat de Barcelona, UB)
- Alejandro Ariza (Universitat de Barcelona, UB)
- Pol Pastells (Universitat de Barcelona, UB)
- Mireia Farrús (Universitat de Barcelona, UB)
- Simona Frenda (Università degli Studi di Torino, UniTo)
- Paolo Rosso (Universitat Politècnica de València, UPV)

Contact: Contact the organizers by writing to: detests.iberlef(a)gmail.com
Web page: https://detests-dis.github.io/

We invite participants to join our Google Group to be kept up to date with the latest news related to the task.


CFP: NSLP 2024 Workshop – Natural Scientific Language Processing and Research Knowledge Graphs
by Georg Rehm 26 Feb '24


1st Workshop on Natural Scientific Language Processing and Research Knowledge Graphs (NSLP 2024)
26 or 27 May 2024 (tbc)
Hersonissos, Crete, Greece (co-located with ESWC 2024)
Submission deadline (extended): March 14, 2024
https://nfdi4ds.github.io/nslp2024/

Scientific research is almost exclusively published in unstructured text formats, which are not readily machine-readable. While technological approaches can help to get this flood of scientific information and new knowledge under control, the development of such technologies is very complex in practice and hinders the creation of infrastructures and systems to track research and assist the scientific community with applications such as dedicated scientific search engines and recommender systems. The 1st Workshop on Natural Scientific Language Processing and Research Knowledge Graphs (NSLP) aims to bring together researchers working on the processing, analysis, transformation and use of scientific language and RKGs, including all relevant sub-topics.

NSLP 2024 is a full-day workshop co-located with ESWC 2024 <https://2024.eswc-conferences.org/> to be held in Crete, Greece, in May 2024. The workshop will consist of two invited keynotes and two shared tasks (FoRC: Field of Research Classification of Scholarly Publications <https://nfdi4ds.github.io/nslp2024/docs/forc_shared_task.html>, SOMD: Software Mention Detection in Scholarly Publications <https://nfdi4ds.github.io/nslp2024/docs/somd_shared_task.html>), as well as presentations and posters of accepted papers.

Topics of interest include, but are not limited to
- Research/Scientific Knowledge Graphs (RKGs/SKGs) and other forms of Structured Scientific Knowledge Representation
- Information Extraction for Research/Scientific Knowledge Graphs
- Question Answering over Research/Scientific Knowledge Graphs
- Scientific LLMs: LLMs for Natural Scientific Language Processing
- Natural Scientific Language Processing (monolingual, cross-lingual, multilingual)
- Language Resources and Language Technologies for Natural Scientific Language Processing
- Information Extraction from Scholarly Publications
- Classification of Scholarly Publications (document collections, individual documents, parts of documents)
- Summarisation of Scholarly Articles
- Scholarly Information Retrieval and Scientific Search Engines
- Digital Libraries of Scholarly Information
- Metadata and Cataloging
- Bibliometrics and Scientometrics
- Domain-specific Adaptation of Natural Language Processing (NLP) methods for NSLP purposes
- Micropublications and Nanopublications

Important dates
Deadline for submissions: March 14, 2024 (extended from March 07, 2024)
Notification of acceptance: April 4, 2024
Deadline for camera-ready papers: April 18, 2024

Submissions
The workshop invites anonymous submissions of regular long papers (up to 15 pages), position papers, and short papers (up to 8 pages) presenting negative results, in-progress projects, and demos. We especially encourage submissions from junior researchers and students from diverse backgrounds.

Format of submissions: Springer LNCS style (full submission guidelines <https://nfdi4ds.github.io/nslp2024/docs/submission.html>). Submissions are made via EasyChair: https://easychair.org/conferences/?conf=nslp2024

The workshop proceedings will be published in the Springer series Lecture Notes in Artificial Intelligence (LNAI) as an Open Access book. Note that all fees for the Open Access book publication will be covered by the project NFDI4DS, which financially supports this workshop.

Shared tasks
The workshop offers two shared tasks:
- FoRC: Field of Research Classification of Scholarly Publications <https://nfdi4ds.github.io/nslp2024/docs/forc_shared_task.html> (two sub-tasks)
- SOMD: Software Mention Detection in Scholarly Publications <https://nfdi4ds.github.io/nslp2024/docs/somd_shared_task.html> (three sub-tasks)

Confirmed keynote speakers
- Natalia Manola, OpenAIRE, Greece
- Francesco Osborne, Open University, UK

Organisers
- Georg Rehm, DFKI, Germany
- Sonja Schimmler, TU Berlin & Fraunhofer FOKUS, Germany
- Stefan Dietze, GESIS & HHU Düsseldorf, Germany
- Frank Krüger, Wismar University, Germany

Contact
Georg Rehm <georg.rehm(a)dfki.de> – NSLP 2024 website <https://nfdi4ds.github.io/nslp2024/>

--
Prof. Dr. Georg Rehm <http://georg-re.hm/>
Principal Researcher and Research Fellow, DFKI
Adjunct Professor, Humboldt-Universität zu Berlin
DFKI GmbH <https://www.dfki.de/>, Alt-Moabit 91c, 10559 Berlin, Germany
Phone: +49 30 23895-1833 – Fax: -1810
georg.rehm(a)dfki.de


Summer Internships at ITHAKA (jstor)
by heather froehlich 26 Feb '24


All,

With apologies for cross-posting, the folks at ITHAKA (the company that brings us JSTOR) are seeking paid interns in the following roles:

Outreach
- Intern, Outreach <https://www.ithaka.org/job/4369666005/?gh_jid=4369666005>

Technology
- Intern, Software Engineering (Platform Engineering) <https://www.ithaka.org/job/4369659005/?gh_jid=4369659005>
- Intern, Software Engineering (Full-stack) <https://www.ithaka.org/job/4369668005/?gh_jid=4369668005>
- Intern, Software Engineering <https://www.ithaka.org/job/4369658005/?gh_jid=4369658005>

Data & Analytics
- Intern, Data Intelligence <https://www.ithaka.org/job/4371666005/?gh_jid=4371666005>
- Intern, Machine Learning and Data Science <https://www.ithaka.org/job/4371667005/?gh_jid=4371667005>

These are 12-week positions starting on June 3, 2024 and will be paid at a rate of $20/hr. To be considered for this opportunity, please ensure your application is submitted through the links above by the deadline of *March 1, 2024.*

Thank you very much!
Heather Froehlich

--
Dr Heather Froehlich
w // http://hfroehli.ch
t // @heatherfro


Last Call for Papers for ParlaCLARIN IV
by t.k.kontino＠uu.nl 26 Feb '24


Last Call for Papers for ParlaCLARIN IV Date: to be held at LREC-COLING 2024, Monday 20 May, 2024 Location: Lingotto Conference Centre - Torino (Italy) Webpage: https://www.clarin.eu/ParlaCLARIN-IV Submission Deadline: 26 February 2024 (Extended) Submission Portal: https://softconf.com/lrec-coling2024/parlaclarin-iv/ ---------------------------------- Workshop description Parliamentary data is an important source of scholarly and socially relevant content, serving as a verified communication channel between the elected political representatives and members of the society. The development of accessible, comprehensive and well-annotated parliamentary corpora is therefore crucial for the information society, as such corpora help scientists and investigative journalists to ascertain the accuracy of socio-politically relevant information, and to inform the citizens about the trends and insights on the basis of such data explorations. Research-wise, parliamentary corpora are a quintessential resource for a number of disciplines in digital humanities and social sciences, such as political science, sociology, history, and (socio)linguistics. The distinguishing characteristic of parliamentary data is that it is spoken language produced in controlled circumstances. Such data has traditionally been transcribed in a formal way but is now also increasingly transcribed with speech-to-text software as well as released in the original audio and video formats, which encourages resource and software development and provides research opportunities related to structuring, synchronization, visualization, querying and analysis of parliamentary corpora. Therefore, a harmonized approach to data curation practices for this type of data can support the advancement of the field significantly. 
One of the ways in which the research community is supported in this line of work is through the conversion of existing corpora and further development of new cross-national parliamentary corpora into a highly comparable, harmonized set of multilingual resources. These allow researchers to share comparative perspectives and to perform multidisciplinary research on parliamentary data. We envision that the ParlaCLARIN IV workshop, as a venue for knowledge and experience exchange on the topic, will contribute to the development and growth of the field of digital parliamentary science.

Objective

This fourth ParlaCLARIN workshop is a continuation of the 2018, 2020 and 2022 editions held at the respective LREC conferences (see the references below). On the one hand, it continues to bring together developers, curators and researchers of regional, national and international parliamentary debates from across diverse disciplines in the Humanities and Social Sciences. On the other hand, we envisage the appearance of new discussion threads, tasks, and challenges that are partially inspired by or related to new data releases such as ParlaMint and data formats such as Parla-CLARIN.

Topics of interest

We invite unpublished original work focusing on (but not limited to):

- Compilation, annotation, visualisation and utilisation of historical or contemporary parliamentary written or audio records
- Harmonisation of existing multilingual parliamentary resources, containing either synchronic or diachronic data or both
- Linking or comparing parliamentary records with other datasets of political discourse such as party manifestos, political speeches, political campaign debates, and social media posts, and to other sources of structured knowledge, such as formal ontologies and LOD datasets (in particular for the description of speakers, political parties, etc.)

Special themes for this year’s workshop are:

- Enrichment of parliamentary proceedings (with e.g. sentiment annotation, political profiling of speakers etc.) and research using such data
- Machine translation of parliamentary proceedings and research using such data
- Argument mining of parliamentary debates

Apart from the dissemination of the results, the workshop also aims to address the identified obstacles, discuss open issues and coordinate future efforts in this increasingly trans-national and cross-disciplinary community.

Previous editions for reference:

2022: https://www.clarin.eu/ParlaCLARIN-III
2020: https://www.clarin.eu/ParlaCLARIN-II
2018: https://www.clarin.eu/ParlaCLARIN

Submission and Publication

We accept submissions of long papers (up to 8 pages), short papers (up to 4 pages) and demo papers (up to 4 pages) to be presented as a long or short oral presentation at the workshop. The papers of the workshop will be published in online proceedings.

When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones).

Important Dates

Paper submission deadline: 26 February 2024 (Extended)
Notification of acceptance: 26 March 2024
Camera-ready paper: 1 April 2024
Workshop date: 20 May 2024

Organizing Committee

Darja Fiser, Institute of Contemporary History and CLARIN ERIC
Maria Eskevich, Huygens Institute, KNAW
David Bordon, University of Ljubljana

Programme Committee

Andreas Blaette, University of Duisburg-Essen
Kaspar Beelen, School of Advanced Study, University of London
Robert Borges, Department of Statistics, Uppsala University
Hajo Boomgaarden, University of Vienna
Çağrı Çöltekin, University of Tübingen
Francesca Frontini, CNR-ILC and CLARIN ERIC
Maria Gavriilidou, ILSP/Athena RC
Haidee Kotze, Utrecht University
Bente Maegaard, University of Copenhagen, Denmark
Cristina Lastres-López, University of Seville
Maarten Marx, University of Amsterdam
Christian Mair, University of Freiburg, Germany
Simone Paolo Ponzetto, University of Mannheim
Petya Osenova, IICT-BAS and Sofia University
Maria Pontiki, ILSP/Athena RC, Greece
Hugo Sanjurjo-González, University of Deusto
Adam Smith, Macquarie University, Australia
Stelios Piperidis, ILSP/Athena RC
Tanja Wissik, Austrian Academy of Sciences
Tomaž Erjavec, Jožef Stefan Institute
Henk van den Heuvel, CLST, Radboud University
Turo Hiltunen, University of Helsinki
Jan Odijk, Utrecht University
Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences
Turo Vartiainen, University of Helsinki

The workshop is supported by the CLARIN ERIC research infrastructure.

To contact the organisers, please mail parlaclarin(a)clarin.eu (Subject: [ParlaCLARIN@LREC2024]).


[CfP] Call for Papers on TermTrends @ MDTT24: Models and Best Practices for Terminology Representation in the Semantic Web
by Patricia Martín Chozas 26 Feb '24

*Apologies for crossposting*

TermTrends24: Models and Best Practices for Terminology Representation in the Semantic Web
Workshop co-located with MDTT 2024 <https://mdtt2024.dei.unipd.it/en/>

Date: 26th June, 2024
Venue: Granada, Spain
More info: https://termtrends.linkeddata.es/

*About TermTrends*

TermTrends 2024, co-located with MDTT 2024, aims to provide a discussion forum on the theoretical and methodological approaches for the representation of terminological data, both at a conceptual and a linguistic level. In particular, we would like to focus on their connection to the Linguistic Linked (Open) Data (LLOD) paradigm through the representation of these data according to Semantic Web formats. By adopting models or vocabularies proposed for the representation of linguistic data, we would contribute to the creation of interoperable and reusable terminological resources.

With this objective, the workshop intends to explore the advantages and challenges underlying various terminology-related standardisation approaches, ranging from the standards initially proposed to represent terminology within the International Organization for Standardization (ISO), such as the TermBase eXchange (TBX) format, to models that represent linguistic descriptions associated with ontologies in the Semantic Web, such as SKOS and Ontolex-lemon. Being multidisciplinary in scope, it focuses on identifying terminological representation needs, as well as limitations of current models in addressing such needs, with the aim of also exploring the development of an extension of the Ontolex-lemon vocabulary and how that may contribute to overcoming such challenges.

*Call for Papers*

The topics of interest for this workshop include, but are not limited to:

- Terminology Representation Standards
- Terminology as Linguistic Linked (Open) Data
- Interoperability of Terminological Resources
- Reusability of Terminological Resources
- Challenges in Terminology Representation
- Analysis of the structure of Terminological Resources

*Submissions*

Paper proposals should follow the CEUR template. Short and long papers will be accepted. Following CEUR guidelines, short papers should be 5-6 pages long and long papers 8-10 pages long. Authors must submit their papers through the EasyChair platform following this link.

*Important Dates*

*15 March 2024* - Deadline for paper submission
*20 April 2024* - Deadline for notification for paper submission
*15 May 2024* - Deadline for camera-ready paper submission
*26 June 2024* - TermTrends Workshop

*Workshop Organisers*

Rute Costa, NOVA FCSH / NOVA CLUNL (Portugal)
Elena Montiel-Ponsoda, Universidad Politécnica de Madrid (Spain)
Sara Carvalho, Univ. de Aveiro / NOVA CLUNL (Portugal)
Patricia Martín-Chozas, Universidad Politécnica de Madrid (Spain)
Federica Vezzani, University of Padova (Italy)

*Patricia Martín Chozas - Postdoctoral Researcher*
*Ontology Engineering Group*
Artificial Intelligence Department
ETSI Informáticos - Universidad Politécnica de Madrid
Phone: (+34) 910673091
