Apologies for cross-posting.
---------------------------------------------------------------------------
The Seventh Workshop on Technologies for Machine Translation of Low-Resource
Languages (LoResMT 2024)
https://www.loresmt.org/
@ ACL 2024 (August 11–16, 2024)
Bangkok, Thailand
SUBMISSION
https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/LoResMT
TIMELINE
Paper submission due: May 17 (Friday), 2024, at 23:59 (Anywhere on Earth)
Notification of acceptance: June 17 (Monday), 2024
Camera-ready papers due: July 1 (Monday), 2024, at 23:59 (Anywhere on Earth)
Workshop dates at ACL: August 15, 2024
SCOPE
Based on the success of past low-resource machine translation (MT)
workshops at AMTA 2018 (https://amtaweb.org/), MT Summit 2019
(https://www.mtsummit2019.com), AACL-IJCNLP 2020 (http://aacl2020.org/),
AMTA 2021, COLING 2022 and EACL 2023, we introduce the Seventh LoResMT
Workshop at ACL 2024. The workshop provides a discussion panel for
researchers working on MT systems/methods for low-resource and
under-represented languages in general. We would like to help
review/overview the state of MT for low-resource languages and define the
most important directions. We also solicit papers dedicated to
supplementary NLP tools that are used in any language and especially in
low-resource languages. Overview papers on these NLP tools are very
welcome. It will be beneficial if the evaluations of these tools in
research papers include their impact on the quality of MT output.
TOPICS
We are highly interested in (1) original research papers, (2)
review/opinion papers, and (3) online systems on the topics below; however,
we welcome all novel ideas that cover research on low-resource languages.
- Neural machine translation (NMT) for low-resource languages
- Use of LLMs (large language models) for low-resource MT systems
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation
SUBMISSION INFORMATION
We are soliciting two types of submissions: (1) research, review, and
position papers and (2) system demonstration papers. For research, review
and position papers, the length of each paper should be at least four (4)
and not exceed eight (8) pages, plus unlimited pages for references. For
system demonstration papers, the limit is four (4) pages. Submissions
should be formatted according to the official ACL 2024 style templates.
Accepted papers will be published online in the ACL 2024 proceedings and
will be presented at the conference.
Submissions must be anonymized and should be done using the provided
submission system. Scientific papers that have been or will be submitted to
other venues must be declared as such and must be withdrawn from the other
venues if accepted and published at LoResMT. The review will be
double-blind. Authors of an accepted paper should present their paper in
person at ACL 2024. Papers should be submitted in PDF via the LoResMT
OpenReview page.
We would like to encourage authors to cite papers written in ANY language
that are related to the topics, as long as both original bibliographic
items and their corresponding English translations are provided.
Registration is handled by the main conference (https://2024.aclweb.org/).
ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jade Abbott, Retro Rabbit
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Varvara Logacheva, Skolkovo Institute of Science and Technology
Xiaobing Zhao, Minzu University of China
PROGRAM COMMITTEE (LISTED ALPHABETICALLY)
Abigail Walsh, ADAPT Centre, Dublin City University, Ireland
Alberto Poncelas, Rakuten, Singapore
Alina Karakanta, Leiden University
Amirhossein Tebbifakhr, Fondazione Bruno Kessler
Anna Currey, Amazon Web Services
Aswarth Abhilash Dara, Amazon
Arturo Oncevay, University of Edinburgh
Atul Kr. Ojha, DSI, University of Galway & Panlingua Language Processing LLP
Barry Haddow, University of Edinburgh
Bogdan Babych, Heidelberg University
Chao-Hong Liu, Potamu Research Ltd
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Diptesh Kanojia, University of Surrey, UK
Duygu Ataman, University of Zurich
Ekaterina Vylomova, University of Melbourne, Australia
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Flammie Pirinen, UiT The Arctic University of Norway, Tromsø
Koel Dutta Chowdhury, Saarland University (Germany)
Jade Abbott, Retro Rabbit
Jasper Kyle Catapang, University of the Philippines
Jindřich Libovicky, Charles University
John P. McCrae, DSI, University of Galway
Liangyou Li, Noah’s Ark Lab, Huawei Technologies
Majid Latifi, University of York, York, UK
Maria Art Antonette Clariño, University of the Philippines Los Baños
Mathias Müller, University of Zurich
Nathaniel Oco, De La Salle University (Philippines)
Rajdeep Sarkar, Yahoo
Rico Sennrich, University of Zurich
Saliha Muradoglu, The Australian National University
Sangjee Dondrub, Qinghai Normal University
Santanu Pal, WIPRO AI
Sardana Ivanova, University of Helsinki
Shantipriya Parida, Silo AI
Sunit Bhattacharya, Charles University
Surafel Melaku Lakew, Amazon AI
Wen Lai, Center for Information and Language Processing, LMU Munich
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
CONTACT
Please email loresmt(a)googlegroups.com if you have any
questions/comments/suggestions.
Apologies for cross-posting. Please circulate this call for wider
publicity.
---------------------------------------------------------------------------
7th Workshop on Indian Language Data: Resources and Evaluation (WILDRE)
Venue: Lingotto Conference Centre - Torino, Italy (Organized under LREC-COLING
2024 (20-25 May 2024) <https://lrec-coling-2024.org/>)
Website: http://sanskrit.jnu.ac.in/conf/wildre7
WILDRE-7, the 7th Workshop on Indian Language Data: Resources and
Evaluation is proposed to be organised in Lingotto Conference Centre -
Torino, Italy under the LREC-COLING platform. India has a huge linguistic
diversity and has seen concerted efforts from the Indian government and
industry to develop language resources. European Language Resource
Association (ELRA) and its associate organizations have been very active
and successful in addressing the challenges and opportunities related to
language resource creation and evaluation. It is therefore a big
opportunity for resource creators of Indian languages to showcase their
work on this platform and also to interact and learn from those involved in
similar initiatives all over the world. The broader objectives of the
WILDRE will be:
- To map the status of Indian Language Resources
- To investigate challenges related to creating and sharing various levels
  of language resources
- To promote a dialogue between language resource developers and users
- To provide an opportunity for researchers from India to collaborate with
  researchers from other parts of the world
*Dates for Short/Long papers and Posters and Demos*
March 06, 2024: Paper submissions due [extended deadline; originally February 28, 2024]
March 28, 2024: Notification of paper acceptance
April 10, 2024: Camera-ready papers due
SUBMISSIONS
Papers must describe original, completed or in-progress, and unpublished work.
Each submission will be reviewed by three program committee members.
Accepted papers will be given up to 10 pages (for full papers) or 5 pages (for
short papers and posters) in the workshop proceedings, and will be
presented as an oral paper or poster.
Papers should be formatted according to the LREC-COLING style sheet, which
is provided on the LREC-COLING 2024 website (
https://lrec-coling-2024.org/authors-kit/). Papers should be submitted in
PDF format to the LREC-COLING website (
https://softconf.com/lrec-coling2024/wildre-7/)
We are seeking submissions under the following categories:
- Full papers (10 pages)
- Short papers (work in progress: 5 pages)
- Posters (innovative ideas/proposals, research proposal of students)
- Demo (of working online/standalone systems)
WILDRE-7 will have a special focus on Demos of Indian Language Technology.
In the past few years, as more resources have been developed and made
available, there has been an increased activity in developing usable
technology using these. WILDRE-7 would like to encourage and widen the Demo
track to allow the community to showcase their demos and have mutually
beneficial interactions with each other as well as resource developers.
WILDRE-7 is seeking full, short papers, posters and demos on the following
topics related to Indian Language Resources:
- Digital Humanities, heritage computing
- Corpora - text, speech, multimodal, methodologies, annotation and tools
- Lexicons and Machine-readable dictionaries
- Ontologies, Grammars
- Language resources for NLP/IR/Speech tasks, tools and Infrastructure
  for language resources
- Standards or specifications for language resources application
- Licensing and copyright issues
- Data mining
- Text summarization
Both submission and review processes will be handled electronically. The
review process will be double-blind. The workshop website will provide the
submission guidelines and the link for the electronic submission.
When submitting a paper from the START page, authors will be asked to
provide essential information about resources (in a broad sense, i.e.
technologies, standards, evaluation kits, etc.) that have been used for the
work described in the paper or are a new result of your research. Moreover,
ELRA encourages all LREC-COLING authors to share the described LRs (data,
tools, services, etc.), to enable their reuse, and replicability of
experiments, including evaluation ones, etc.
For further information on this initiative, please refer to
https://lrec-coling-2024.org/
Shared Task
Following the success of the five WILDRE workshops, WILDRE-7 will include
*Code-mixed Less-Resourced Sentiment Analysis (Code-mixed)* and *Discourse
Machine Translation (DiscoMT)* Shared Tasks. The organizers of shared tasks will
provide datasets and evaluation platforms to evaluate systems developed by
the participants. For further information on this initiative, please refer
to http://sanskrit.jnu.ac.in/conf/wildre7
Workshop *Organisers*
- Girish Nath Jha, Jawaharlal Nehru University, India
- Kalika Bali, Microsoft Research India Lab, Bangalore, India
- Sobha L, AU-KBC, Anna University, Chennai, India
- Atul Kr. Ojha, University of Galway, Ireland & Panlingua Language
  Processing LLP, India
Workshop contact:
Atul Kr. Ojha, University of Galway, Ireland & Panlingua Language
Processing LLP, India, shashwatup9k(a)gmail.com
Identify, Describe and Share your LRs
Describing your LRs in the LRE Map is now a normal practice in the
submission procedure of LREC (introduced in 2010 and adopted by other
conferences). To continue the efforts initiated at LREC 2014 about “Sharing
LRs” (data, tools, web services, etc.), authors will have the possibility,
when submitting a paper, to upload LRs in a special LREC repository. This
effort of sharing LRs, linked to the LRE Map for their description, may
become a new “regular” feature for conferences in our field, thus
contributing to creating a common repository where everyone can deposit and
share data.
As scientific work requires accurate citations of referenced work to allow
the community to understand the whole context and also replicate the
experiments conducted by other researchers, LREC-COLING 2024 endorses the
need to uniquely identify LRs through the use of the International Standard
Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique
Identifier to be assigned to each Language Resource. The assignment of
ISLRNs to LRs cited in LREC-COLING papers will be offered at submission
time.
1st CALL FOR PAPERS
Second International Workshop on Gender-Inclusive Translation Technologies (GITT) at EAMT 2024
27 June 2024, Sheffield, UK
https://sites.google.com/tilburguniversity.edu/gitt2024
@GITT2024
** Important Dates ** (Time zone: Anywhere on Earth)
Submission deadline: 15 April, 2024
Notification of Acceptance: 15 May, 2024
Camera Ready Copy due: 24 May, 2024
Workshop: 27 June, 2024
** Aim and scope **
The Gender-Inclusive Translation Technologies Workshop (GITT) sets out to be the only dedicated workshop that focuses on gender-inclusive language in translation and cross-lingual scenarios. The workshop aims to bring together researchers from diverse areas, including industry partners, MT practitioners, and language professionals. GITT aims to encourage multidisciplinary research that develops and interrogates both solutions and challenges for addressing bias and promoting gender inclusivity in MT and translation tools, including LLM applications for the translation task.
** Topics **
GITT invites technical as well as non-technical submissions, which consist of experimental, theoretical or methodological contributions. We explicitly welcome interdisciplinary submissions and submissions that focus on innovative, non-binary linguistic strategies and/or with sociolinguistically-informed perspectives. The topics of interest include, but are not limited to:
- Models or methods for assessing and mitigating gender bias
- New resources for inclusive language and gender translation (e.g., datasets, translation memories, dictionaries)
- Social, cross-lingual, and ethical implications of gender bias
- Qualitative and quantitative analyses on the potential limits of current approaches to gender bias in translation and MT, error taxonomies as well as best practices and guidelines
- User-centric case studies on the impact of biased language and/or mitigating approaches which can include translators, post-editors, or monolingual MT users
GITT is also open to other non-listed topics aligned with the scope of the workshop and to works focusing on non-textual modalities (e.g., audiovisual translation).
** Submission **
We welcome three types of submissions:
- Research papers: at least 4 and up to 10 pages (including references)
- Extended Abstracts: up to 2 pages (including references)
- Research Communications: up to 2 pages (including references)
Accepted papers and extended abstracts consisting of novel work will be published online as proceedings in the ACL Anthology.
We include a parallel submission policy for papers accepted in other venues in 2023. Research communications will not be included in the proceedings, but will serve to promote the dissemination of research aligned with the scope of the workshop.
Submissions should adhere to the EAMT 2024 guidelines and style templates (PDF, LaTeX, Word) and be uploaded on OpenReview: https://openreview.net/group?id=EAMT.org/2024/Workshop/GITT
** Workshop organizers **
Beatrice Savoldi, Fondazione Bruno Kessler
Janiça Hackenbuchner, University of Ghent
Luisa Bentivogli, Fondazione Bruno Kessler
Eva Vanmassenhove, University of Tilburg
Joke Daems, University of Ghent
Jasmijn Bastings, Google DeepMind
Dear colleagues,
Ian van der Linde and I are recruiting for an ESRC-funded PhD studentship on ‘The role of gesture in spoken communication’. We are looking for people with a background in computational linguistics/NLP, psycholinguistics, or cognitive science, who ideally have experience of using large language corpora and/or collecting data from human participants, as well as programming skills in R, MATLAB, or Python. The successful candidate will have the opportunity to develop their own project in consultation with us, in the general topic area of gesture and speech.
Full details can be found at this link: https://www.esrcdtp.group.cam.ac.uk/current-studentship-opportunities/
Applications close on 15 March 2024.
We would appreciate it if you could forward this information to anyone who might be suitable and interested or to any relevant lists. Please do encourage potential candidates to get in touch with us; we would be happy to discuss this further.
Yours sincerely
Melanie Bell
Professor Melanie J. Bell
Professor of Quantitative Linguistics
ARU, East Road, Cambridge, CB1 1PT
aru.ac.uk<http://www.aru.ac.uk/>
Please consider contributing and/or forwarding to appropriate colleagues
and groups.
*******We apologize for the multiple copies of this e-mail******
--------------------------------------------------------------------------------------------------------------------
Call for Participation
--------------------------------------------------------------------------------------------------------------------
DETESTS-Dis IberLEF 2024
Task: DETESTS-Dis (DETEction and classification of racial Stereotypes in
Spanish – Learning with Disagreement)
This task will be part of IberLEF 2024, the 6th Workshop on Iberian
Languages Evaluation Forum at the SEPLN 2024 Conference, which will be held
in Valladolid, Spain, on September 24th.
-------------------------------------------------------------------------------------------------------------------
Here, we introduce the second edition of the DETESTS task (Ariza-Casabona,
2022), which was first presented at IberLEF 2022. The aim of the new
edition, DETESTS-Dis, is to detect and classify explicit and implicit
stereotypes in texts from social media and comments on news articles,
incorporating learning with disagreement techniques. Next, a description of
both subtasks is provided:
- Subtask 1, Stereotype Identification: This is a binary classification
  task whose aim is to determine whether a comment or sentence
  contains at least one stereotype or none, considering the full distribution
  of labels provided by the annotators. This subtask follows the SemEval 2021
  Task 12 (Uma et al., 2021) proposal about learning with disagreement, in
  which the authors state that there does not necessarily exist a single gold
  label for every sample in the dataset. This fact is particularly evident
  when multiple contradictory annotations arise at the data labeling stage
  due to “debatable, subjective, or linguistic ambiguity”. The actual gold
  label of this subtask is left as a proxy to determine the subset of
  comments that will be evaluated in the subsequent subtask.
- Subtask 2 (Optional), Implicitness Identification: This subtask
  introduces a novel binary classification problem to determine whether the
  stereotype is manifested or latent within the text, that is, whether the
  stereotype is implicit or explicit. The added difficulty in this case is
  that implicit stereotypes are not directly expressed in the text, and a
  process of inference must be applied by the annotators. Moreover, there are
  different strategies in which an implicit stereotype can be coded, such as
  metaphors, irony and other figures of speech, evaluations of the in-group,
  and the overgeneralization of a social group from features of some of its
  members. This subtask will be presented as a hierarchical binary
  classification problem.
Although we recommend participating in both subtasks, participants are
allowed to participate in just one of them (e.g., subtask 1).
Teams will be allowed (and encouraged) to submit multiple runs (max. 5).
To avoid any conflict with the sources of the comments regarding their
intellectual property rights (IPR), the data will be sent privately to each
participant who is interested in the task. The corpus will only be made
available for research purposes.
Important dates (All deadlines are 11:59 PM UTC-12:00):
Training dataset release: March 04, 2024
Test dataset release: April 15, 2024
Systems results: April 29, 2024
Results notification: May 13, 2024
Working papers submission: June 3, 2024
Working papers (peer-)reviewed: June 17, 2024
Camera-ready versions: July 4, 2024
Workshop: September 24, 2024
Task organizers:
- Mariona Taulé (Universitat de Barcelona, UB)
- Wolfgang Schmeisser (Universitat de Barcelona, UB)
- Alejandro Ariza (Universitat de Barcelona, UB)
- Pol Pastells (Universitat de Barcelona, UB)
- Mireia Farrús (Universitat de Barcelona, UB)
- Simona Frenda (Università degli Studi di Torino, UniTo)
- Paolo Rosso (Universitat Politècnica de València, UPV)
Contact:
Contact the organizers by writing to: detests.iberlef(a)gmail.com
Web page: https://detests-dis.github.io/
We invite participants to join our Google Groups to be kept up to date with
the latest news related to the task.
1st Workshop on Natural Scientific Language Processing and Research Knowledge Graphs (NSLP 2024)
26 or 27 May 2024 (tbc)
Hersonissos, Crete, Greece (co-located with ESWC 2024)
Submission deadline (extended): March 14, 2024
https://nfdi4ds.github.io/nslp2024/
Scientific research is almost exclusively published in unstructured text formats, which are not readily machine-readable. While technological approaches can help to get this flood of scientific information and new knowledge under control, the development of such technologies is very complex in practice and hinders the creation of infrastructures and systems to track research and assist the scientific community with applications such as dedicated scientific search engines and recommender systems. The 1st Workshop on Natural Scientific Language Processing and Research Knowledge Graphs (NSLP) aims to bring together researchers working on the processing, analysis, transformation and use of scientific language and research knowledge graphs (RKGs), including all relevant sub-topics. NSLP 2024 is a full-day workshop co-located with ESWC 2024 <https://2024.eswc-conferences.org/> to be held in Crete, Greece, in May 2024. The workshop will consist of two invited keynotes and two shared tasks (FoRC: Field of Research Classification of Scholarly Publications <https://nfdi4ds.github.io/nslp2024/docs/forc_shared_task.html>, SOMD: Software Mention Detection in Scholarly Publications <https://nfdi4ds.github.io/nslp2024/docs/somd_shared_task.html>), as well as presentations and posters of accepted papers.
Topics of interest include, but are not limited to:
- Research/Scientific Knowledge Graphs (RKGs/SKGs) and other forms of Structured Scientific Knowledge Representation
- Information Extraction for Research/Scientific Knowledge Graphs
- Question Answering over Research/Scientific Knowledge Graphs
- Scientific LLMs: LLMs for Natural Scientific Language Processing
- Natural Scientific Language Processing (monolingual, cross-lingual, multilingual)
- Language Resources and Language Technologies for Natural Scientific Language Processing
- Information Extraction from Scholarly Publications
- Classification of Scholarly Publications (document collections, individual documents, parts of documents)
- Summarisation of Scholarly Articles
- Scholarly Information Retrieval and Scientific Search Engines
- Digital Libraries of Scholarly Information
- Metadata and Cataloging
- Bibliometrics and Scientometrics
- Domain-specific Adaptation of Natural Language Processing (NLP) methods for NSLP purposes
- Micropublications and Nanopublications
Important dates
Deadline for submissions: March 14, 2024 (deadline extended; originally March 07, 2024)
Notification of acceptance: April 4, 2024
Deadline for camera-ready papers: April 18, 2024
Submissions
The workshop invites anonymous submissions of regular long papers (up to 15 pages), position papers, and short papers (up to 8 pages). Papers can present negative results, in-progress projects, and demos. We especially encourage submissions from junior researchers and students from diverse backgrounds. Format of submissions: Springer LNCS style (full submission guidelines <https://nfdi4ds.github.io/nslp2024/docs/submission.html>).
Submissions are made via EasyChair: https://easychair.org/conferences/?conf=nslp2024
The workshop proceedings will be published in the Springer series Lecture Notes in Artificial Intelligence (LNAI) as an Open Access book. Note that all fees for the Open Access book publication will be covered by the project NFDI4DS, which financially supports this workshop.
Shared tasks
The workshop offers two shared tasks:
FoRC: Field of Research Classification of Scholarly Publications <https://nfdi4ds.github.io/nslp2024/docs/forc_shared_task.html> (two sub-tasks)
SOMD: Software Mention Detection in Scholarly Publications <https://nfdi4ds.github.io/nslp2024/docs/somd_shared_task.html> (three sub-tasks)
Confirmed keynote speakers
Natalia Manola, OpenAIRE, Greece
Francesco Osborne, Open University, UK
Organisers
Georg Rehm, DFKI, Germany
Sonja Schimmler, TU Berlin & Fraunhofer FOKUS, Germany
Stefan Dietze, GESIS & HHU Düsseldorf, Germany
Frank Krüger, Wismar University, Germany
Contact
Georg Rehm <georg.rehm(a)dfki.de> – NSLP 2024 website <https://nfdi4ds.github.io/nslp2024/>
--
Prof. Dr. Georg Rehm <http://georg-re.hm/>
Principal Researcher and Research Fellow, DFKI
Adjunct Professor, Humboldt-Universität zu Berlin
DFKI GmbH <https://www.dfki.de/>, Alt-Moabit 91c, 10559 Berlin, Germany
Phone: +49 30 23895-1833 – Fax: -1810
georg.rehm(a)dfki.de
Last Call for Papers for ParlaCLARIN IV
Date: to be held at LREC-COLING 2024, Monday 20 May, 2024
Location: Lingotto Conference Centre - Torino (Italy)
Webpage: https://www.clarin.eu/ParlaCLARIN-IV
Submission Deadline: 26 February 2024 (Extended)
Submission Portal: https://softconf.com/lrec-coling2024/parlaclarin-iv/
----------------------------------
Workshop description
Parliamentary data is an important source of scholarly and socially relevant content, serving as a verified communication channel between the elected political representatives and members of the society. The development of accessible, comprehensive and well-annotated parliamentary corpora is therefore crucial for the information society, as such corpora help scientists and investigative journalists to ascertain the accuracy of socio-politically relevant information, and to inform the citizens about the trends and insights on the basis of such data explorations. Research-wise, parliamentary corpora are a quintessential resource for a number of disciplines in digital humanities and social sciences, such as political science, sociology, history, and (socio)linguistics.
The distinguishing characteristic of parliamentary data is that it is spoken language produced in controlled circumstances. Such data has traditionally been transcribed in a formal way but is now also increasingly transcribed with speech-to-text software as well as released in the original audio and video formats, which encourages resource and software development and provides research opportunities related to structuring, synchronization, visualization, querying and analysis of parliamentary corpora. Therefore, a harmonized approach to data curation practices for this type of data can support the advancement of the field significantly. One of the ways in which the research community is supported in this line of work is through the conversion of existing corpora and further development of new cross-national parliamentary corpora into a highly comparable, harmonized set of multilingual resources. These allow researchers to share comparative perspectives and to perform multidisciplinary research on parliamentary data. We envision that the ParlaCLARIN IV workshop, as a venue for knowledge and experience exchange on the topic, will contribute to the development and growth of the field of digital parliamentary science.
Objective
This fourth ParlaCLARIN workshop is a continuation of the 2018, 2020 and 2022 editions held at the respective LREC conferences, see references below. On the one hand, it continues to bring together developers, curators and researchers of regional, national and international parliamentary debates from across diverse disciplines in the Humanities and Social Sciences. On the other hand, we envisage the appearance of new discussion threads, tasks, and challenges that are partially inspired by or related to the new data releases such as ParlaMint and data formats such as Parla-CLARIN.
Topics of interest
We invite unpublished original work focusing on (but not limited to):
- Compilation, annotation, visualisation and utilisation of historical or contemporary parliamentary written or audio records
- Harmonisation of existing multilingual parliamentary resources, containing either synchronic or diachronic data or both
- Linking or comparing parliamentary records with other datasets of political discourse such as party manifestos, political speeches, political campaign debates, and social media posts, and to other sources of structured knowledge, such as formal ontologies and LOD datasets (in particular for the description of speakers, political parties, etc.)
Special themes for this year’s workshop are:
- Enrichment of parliamentary proceedings (with e.g. sentiment annotation, political profiling of speakers etc.) and research using such data
- Machine translation of parliamentary proceedings and research using such data
- Argument mining of parliamentary debates
Apart from the dissemination of the results, the workshop also aims to address the identified obstacles, discuss open issues and coordinate future efforts in this increasingly trans-national and cross-disciplinary community.
Previous editions, for reference:
2022: https://www.clarin.eu/ParlaCLARIN-III
2020: https://www.clarin.eu/ParlaCLARIN-II
2018: https://www.clarin.eu/ParlaCLARIN
Submission and Publication
We accept submission of long papers (up to 8 pages), short papers (up to 4 pages) and demo papers (up to 4 pages) to be presented as a long or short oral presentation at the workshop. The papers of the workshop will be published in online proceedings.
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of your research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and replicability of experiments (including evaluation ones).
Important Dates
Paper submission deadline: 26 February 2024 (Extended)
Notification of acceptance: 26 March 2024
Camera-ready paper: 1 April 2024
Workshop date: 20 May 2024
Organizing Committee
Darja Fiser, Institute of Contemporary History and CLARIN ERIC
Maria Eskevich, Huygens Institute, KNAW
David Bordon, University of Ljubljana
Programme Committee
Andreas Blaette, University of Duisburg-Essen
Kaspar Beelen, School of Advanced Study, University of London
Robert Borges, Department of Statistics, Uppsala University
Hajo Boomgaarden, University of Vienna
Çağrı Çöltekin, University of Tübingen
Francesca Frontini, CNR-ILC and CLARIN ERIC
Maria Gavriilidou, ILSP/Athena RC
Haidee Kotze, Utrecht University
Bente Maegaard, University of Copenhagen, Denmark
Cristina Lastres-López, University of Seville
Maarten Marx, University of Amsterdam
Christian Mair, University of Freiburg, Germany
Simone Paolo Ponzetto, University of Mannheim
Petya Osenova, IICT-BAS and Sofia University
Maria Pontiki, ILSP/Athena RC, Greece
Hugo Sanjurjo-González, University of Deusto
Adam Smith, Macquarie University, Australia
Stelios Piperidis, ILSP/Athena RC
Tanja Wissik, Austrian Academy of Sciences
Tomaž Erjavec, Jožef Stefan Institute
Henk van den Heuvel, CLST, Radboud University
Turo Hiltunen, University of Helsinki
Jan Odijk, Utrecht University
Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences
Turo Vartiainen, University of Helsinki
The workshop is supported by the CLARIN ERIC research infrastructure.
To contact the organisers, please mail parlaclarin(a)clarin.eu (Subject: [ParlaCLARIN@LREC2024]).
*Apologies for crossposting*
TermTrends24: Models and Best Practices for Terminology Representation in
the Semantic Web
Workshop co-located with MDTT 2024 <https://mdtt2024.dei.unipd.it/en/>
Date: 26th June, 2024
Venue: Granada, Spain
More info: https://termtrends.linkeddata.es/
*About TermTrends*
TermTrends 2024, co-located with MDTT 2024, aims to
provide a discussion forum on the theoretical and methodological approaches
for the representation of terminological data, both at a conceptual and a
linguistic level. In particular, we would like to focus on their connection
to the Linguistic Linked (Open) Data (LLOD) paradigm through the
representation of these data according to Semantic Web formats. By adopting
models or vocabularies proposed for the representation of linguistic data,
we would contribute to the creation of interoperable and reusable
terminological resources.
With this objective, the workshop intends to explore the advantages and
challenges underlying various terminology-related standardisation
approaches, ranging from the standards initially proposed to represent
terminology within the International Organization for Standardization (ISO),
such as the TermBase eXchange (TBX) format, to models that represent
linguistic descriptions associated with ontologies in the Semantic Web,
such as SKOS and Ontolex-lemon.
Being multidisciplinary in scope, it focuses on identifying terminological
representation needs, as well as limitations of current models in
addressing such needs, with the aim of also exploring the development of an
extension of the Ontolex-lemon vocabulary and how that may contribute to
overcoming such challenges.
*Call for Papers*
The topics of interest for this workshop include, but are
not limited to, the following:
- Terminology Representation Standards
- Terminology as Linguistic Linked (Open) Data
- Interoperability of Terminological Resources
- Reusability of Terminological Resources
- Challenges in Terminology Representation
- Analysis of the structure of Terminological Resources
*Submissions*
Paper proposals should follow the CEUR template. Short and long papers
will be accepted. Following CEUR guidelines, short papers should be 5-6
pages long and long papers 8-10 pages long. Authors must submit their
papers through the EasyChair platform following this link.
*Important Dates*
*15 March 2024* - Deadline for paper submission
*20 April 2024* - Deadline for notification of acceptance
*15 May 2024* - Deadline for camera-ready paper submission
*26 June 2024* - TermTrends Workshop
*Workshop Organisers*
Rute Costa, NOVA FCSH / NOVA CLUNL (Portugal)
Elena Montiel-Ponsoda, Universidad Politécnica de Madrid (Spain)
Sara Carvalho, Univ. de Aveiro / NOVA CLUNL (Portugal)
Patricia Martín-Chozas, Universidad Politécnica de Madrid (Spain)
Federica Vezzani, University of Padova (Italy)
*Patricia Martín Chozas - Postdoctoral Researcher*
*Ontology Engineering Group*
Artificial Intelligence Department
ETSI Informáticos - Universidad Politécnica de Madrid
Phone: (+34) 910673091