June 10-13, 2024, University of Colorado, Boulder
Held in conjunction with the UMR Parsing Workshop, June 14, 2024
https://umr4nlp.github.io/web/UMRParsingWorkshop.html
Impressive progress has been made in many aspects of natural language processing (NLP) in recent years. Most notably, the achievements of transformer-based large language models such as ChatGPT would seem to obviate the need for any type of semantic representation beyond what can be encoded as contextualized word embeddings of surface text. Advances have been particularly notable in areas where large training data sets exist, and it is advantageous to build an end-to-end training architecture without resorting to intermediate representations. For any truly interactive NLP applications, however, a more complete understanding of the information conveyed by each sentence is needed to advance the state of the art. Here, "understanding" entails the use of some form of meaning representation. NLP techniques that can accurately capture the required elements of the meaning of each utterance in a formal representation are critical to making progress in these areas and have long been a central goal of the field. As with end-to-end NLP applications, the dominant approach for deriving meaning representations from raw textual data is through the use of machine learning and appropriate training data. This allows the development of systems that can assign appropriate meaning representations to previously unseen text.
In this four-day course, instructors from the University of Colorado and Brandeis University will describe the framework of Uniform Meaning Representations (UMRs), a recent cross-lingual, multi-sentence incarnation of Abstract Meaning Representations (AMRs), that addresses these issues and comprises such a transformative representation. Incorporating Named Entity tagging, discourse relations, intra-sentential coreference, negation and modality, and the popular PropBank-style predicate argument structures with semantic role labels into a single directed acyclic graph structure, UMR builds on AMR and keeps the essential characteristics of AMR while making it cross-lingual and extending it to be a document-level representation. It also adds aspect, multi-sentence coreference and temporal relations, and scope. Each day will include lectures and hands-on practice.
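As a rough illustration of the "single directed acyclic graph structure" mentioned above, the following Python sketch encodes the commonly used AMR example sentence "The boy wants to go" as a list of labelled edges. It checks that the graph is acyclic while still letting one node be shared by two parents (re-entrancy). The variable names, the sense labels, the edge-list encoding and the helpers `is_acyclic` and `parents` are illustrative assumptions; this is not the UMR file format or the API of any UMR tool.

```python
from collections import defaultdict

# Nodes and edges of a small AMR-style graph for "The boy wants to go".
# Variable names (w, b, g) and sense labels are illustrative assumptions.
concepts = {"w": "want-01", "b": "boy", "g": "go-02"}
edges = [
    ("w", ":ARG0", "b"),  # the boy is the one who wants
    ("w", ":ARG1", "g"),  # what is wanted is the going
    ("g", ":ARG0", "b"),  # re-entrancy: the boy is also the one who goes
]


def is_acyclic(edges):
    """Return True if the directed graph has no cycle (depth-first search)."""
    children = defaultdict(list)
    for source, _role, target in edges:
        children[source].append(target)
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return False  # reached a node that is still on the current path
        if state.get(node) == "done":
            return True
        state[node] = "visiting"
        ok = all(visit(child) for child in children[node])
        state[node] = "done"
        return ok

    nodes = {n for source, _role, target in edges for n in (source, target)}
    return all(visit(n) for n in nodes)


def parents(node, edges):
    """Return the sorted list of nodes that have an edge into `node`."""
    return sorted({source for source, _role, target in edges if target == node})


print(is_acyclic(edges))    # the example graph contains no cycle
print(parents("b", edges))  # "b" is shared by "g" and "w"
```

The edge list keeps the sketch independent of any particular serialization; a document-level graph would add further edges (for example, for coreference or temporal relations) to the same structure.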
Topics to be covered June 10-13:
1. The basic structural representation of UMR and its application to multiple languages;
2. How UMR encodes different types of MWE (multi-word expressions), discourse and temporal relations, and TAM (tense-aspect-modality) information in multiple languages, and differences between AMR and UMR;
3. Going from IGT (interlinear glossed text) to UMR graphs semi-automatically;
4. Formal semantic interpretation of UMR incorporating a continuation-based semantics for scope phenomena involving modality, negation, and quantification;
5. Extension to UMR for encoding gesture in multimodal dialogue, Gesture AMR (GAMR), which aligns with speech-based UMR to account for situated grounding in dialogue.
The fifth day of the summer school, June 14, will be co-located with a UMR Parsing Workshop, focusing on parsing algorithms that generate AMR and UMR representations over multiple languages. https://umr4nlp.github.io/web/UMRParsingWorkshop.html
To apply, please complete this form by Jan. 30, 2024. https://www.colorado.edu/linguistics/umrs-boulder-summer-school-application
Other important dates:
● Notification of acceptance: Feb. 20, 2024
● Confirmation of participation: Mar. 1, 2024
● Arrival in Boulder June 9, departure June 15, 2024.
Participation will be fully funded (reasonable airfare, lodging, and meals). This summer school has been made possible by funding from NSF Collaborative Research: Building a Broad Infrastructure for Uniform Meaning Representations (Award # 2213805), with additional support from the University of Colorado Boulder and the CLEAR Center.
UMR Parsing Workshop - First Call
University of Colorado, Boulder
June 14, 2024
This workshop will focus on developing parsers for Uniform Meaning Representations. The goal is to start from raw text from real-world settings that could be in any one of many typologically different languages, even low-resource languages for which there is little or no training data. This can be achieved by exploiting a common semantic annotation standard. This workshop has been made possible by funding for NSF Collaborative Research: Building a Broad Infrastructure for Uniform Meaning Representations (Award # 2213805), which is aimed at developing guidelines and annotation for cross-lingual Uniform Meaning Representations, based on the original Abstract Meaning Representation guidelines for English, but ensuring cross-linguistically consistent annotation and recoverability of the original raw texts.
This workshop will overlap with the last day of the Colorado UMR Annotation Summer School.
The workshop is open to everyone and will cover the fundamentals of UMR annotation and the differences between AMR and UMR. In addition to the talk from our invited speaker, there will be presentations on recent successful approaches to AMR parsing and how they can be applied to UMR parsing. We welcome submissions from anyone on related topics, such as:
● AMR or UMR parsing for any language
● AMR or UMR generation for any language
● Evaluation metrics for AMR or UMR parsing
● Bootstrapping of AMRs or UMRs from related semantic representations such as PropBanks
● Projections of English AMR onto other languages
● Challenges of applying AMR annotation to languages other than English
● Challenges of accurate multi-sentence coreference as a subtask of AMR parsing
● Any other topic related to the parsing and generation of AMRs or UMRs.
Important dates
● Workshop paper submissions due: March 30, 2024
● Notification of acceptance: April 25, 2024
● Camera-ready versions due: May 30, 2024
Submissions
Submissions should report original and unpublished research on topics of interest to the workshop. Accepted papers are expected to be presented at the workshop and will be published in the workshop proceedings. They should emphasize obtained results rather than intended work and should clearly indicate the state of completion of the reported results.
Submission is electronic, via the workshop submission site on EasyChair. https://easychair.org/my/conference?conf=umrpw2024
Submissions must adhere to the two-column format of ACL venues, using the Overleaf template taken from ACL 2023. https://www.overleaf.com/latex/templates/acl-2023-proceedings-template/qjdg…
Initial submissions should be fully anonymous to ensure double-blind reviewing. Long papers must not exceed eight (8) pages of content; short papers must not exceed four (4) pages of content. References and appendices do not count against these limits.
To ensure double-blind reviews, papers must not include the authors’ names and affiliations or self-references that reveal any author’s identity. Papers that do not conform to these requirements will be rejected without review.
Dear Colleagues,
We are delighted to launch the Call for Papers for the *11th Inter-Varietal
Applied Corpus Studies (IVACS) Biennial Conference*, which will be hosted by
the University of Cambridge, U.K., on Tuesday 16th and Wednesday 17th July
2024.
Conference website: https://www.ivacs2024.com/
Abstract deadline 20th December, 2023.
*Plenary Speakers*
We are delighted that the following researchers will be giving plenary
talks at the conference:
- Dr Brian Clancy <https://www.mic.ul.ie/staff/276-brian-clancy>
- Dr Geraldine Mark <https://profiles.cardiff.ac.uk/staff/markg2>
Please spread the word!
Best wishes,
Anne and Andrew
*Dr Andrew Caines, Conference Convenor, University of Cambridge*
*Prof. Anne O'Keeffe, Inter-Varietal Applied Corpus Studies (IVACS) Network
Director*
*Call for Papers*
*The 11th Inter-Varietal Applied Corpus Studies (IVACS) Biennial Conference*
We are particularly interested in papers in, but not limited to, the
following areas:
Strand 1 – Corpus Methods and Innovations: Innovations in Corpus Design,
Analysis and Annotation Tools; Critical Reflections on Corpus Methods;
Advances in Quantitative and Qualitative Approaches to Analysing Corpora;
Innovations in Statistics for CL.
Strand 2 – Corpus Linguistics, Pragmatics and Discourse: Corpus Approaches
to Discourse Analysis, Conversation Analysis, Critical Discourse Analysis;
Corpus Pragmatics; CL and Real-World Contexts (e.g. Media Discourse,
Classroom Discourse, Workplace Discourse).
Strand 3 – Corpus Linguistics and Applied Linguistics: Learner Corpus
Research; CL and Second Language Acquisition; Data-Driven Learning; CL for
Materials Development; CL and Teacher Education; CL and Lexicography.
Strand 4 – Corpus Linguistics, Literature, Texts and Register: CL and
Register Studies; Corpus Stylistics; CL and Literary Linguistics; CL and
Translation Studies; Forensic Linguistics.
Strand 5 – Corpus Linguistics and Speech: CL and Speech Technology; CL and
Multimodality; Spoken Corpora; Corpus Phonology.
Strand 6 – Corpus Linguistics and Sociolinguistics: CL and Language Change;
Language Varieties and Variation; CL and Minority Language Studies.
Strand 7 – Computational Linguistics and Corpora: The use of Corpora for
Computational Linguistics research; Exploration and analyses of Corpora
using Computational Linguistic methods; Data collection and annotation for
Computational Linguistics.
*Abstract Submission and Timeline*
Full papers will involve a 20-minute presentation, plus 10 minutes for
questions and discussion.
Posters can present work in progress or summaries of completed studies,
research projects or other innovations. Posters will be printed in portrait
A0 size.
Abstracts will be 300 words (not including reference list, if any).
Note that the deadlines are 11.59 pm UTC -12h
<https://www.timeanddate.com/time/zone/timezone/utc-12> (“anywhere on
Earth”).
Abstract deadline: 20th December, 2023
Notification: 31st January, 2024
Conference: 16th-17th July, 2024
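For authors unsure how the "anywhere on Earth" convention translates to their own clock, the following minimal Python sketch converts the abstract deadline stated above to UTC. It assumes only the standard library; the names `AOE`, `deadline_aoe` and `deadline_utc` are illustrative.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" (AoE) corresponds to the fixed offset UTC-12.
AOE = timezone(timedelta(hours=-12), name="AoE")

# Abstract deadline from this call: 11.59 pm AoE on 20th December, 2023.
deadline_aoe = datetime(2023, 12, 20, 23, 59, tzinfo=AOE)

# Convert to UTC; use .astimezone() with no argument for the local time zone.
deadline_utc = deadline_aoe.astimezone(timezone.utc)
print(deadline_utc.isoformat())  # 2023-12-21T11:59:00+00:00
```

In other words, the deadline falls at 11:59 UTC on 21st December, 2023.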
Submission of abstracts: OpenReview
<https://openreview.net/group?id=IVACS/2024/Conference>
*Seeking Reviewers*
Would you have time to help us review the abstracts in January? Maximum 5
per person. Please sign up here <https://forms.gle/BkopQZ12esXMAnv36>
The WNUT Workshop will be co-located with EACL 2024 (Malta). The website for
the workshop is at:
http://noisy-text.github.io/
The WNUT workshop focuses on core NLP tasks (e.g., POS/NER tagging and
translation; not computational social science) over user-generated text, such
as that found on social media, web forums, online reviews, digital health
records, or language learner essays.
We seek submissions of long and short papers on original and unpublished work
(same format and page limit as EACL main conference). All accepted
submissions will be presented as posters. Additionally, selected submissions
will be presented orally. There will be best paper awards for both short and
long papers.
Topics of interest include but are not limited to:
* NLP of noisy text, e.g. POS, NER tagging, Parsing
* Text normalization and error correction
* Paraphrase identification and semantic similarity of short text or noisy
text
* Extracting user demographics, profiles, and major life events
* Machine translation and Multilingual NLP over noisy text
* Information extraction from noisy text, global and regional trend
detection, and event extraction
* Colloquial language, e.g. idiom detection
* Domain adaptation to user-generated text
* Detecting rumors, contradictory information, sarcasm and humor on social
media
* Sentiment analysis
* Temporal aspects of user-generated content (resolving time expressions,
concept drift, etc.)
* Representing and mining language variation in user-generated content
* Processing of automatically generated data
* Robustness to Noise, both Natural and Adversarial
[IMPORTANT DATES]
* Submission Deadline: December 18, 2023 (anywhere on Earth; dual-submission
allowed)
* Acceptance Notification: January 20, 2024
* Camera-Ready Deadline: January 30, 2024
* Workshop Day: March 21/22, 2024
[INVITED SPEAKERS]
* Su Lin Blodgett
* Jennifer Foster
[ORGANIZERS]
* Tim Baldwin (University of Melbourne)
* Wei Xu (Georgia Institute of Technology)
* Alan Ritter (Georgia Institute of Technology)
* Rob van der Goot (IT University of Copenhagen)
* Max Müller-Eberstein (IT University of Copenhagen)
[SUBMISSION]
Submissions should conform to the ACL style guidelines. Long and short paper
submissions must be anonymized. Please submit your papers via OpenReview:
https://openreview.net/group?id=eacl.org/EACL/2024/Workshop/WNUT
*** Ph.D. Award: First Call for Applications ***
36th International Conference on Advanced Information Systems Engineering
(CAiSE'24)
June 3-7, 2024, 5* St. Raphael Resort and Marina, Limassol, Cyprus
https://cyprusconferences.org/caise2024/
(*** Submission Deadline: 1st March, 2024 AoE ***)
The deadline to apply for the CAiSE 2024 PhD Award is March 1st 2024. The conditions to
apply are:
• having participated as an author in a previous CAiSE Doctoral Consortium or at a main
CAiSE Event: either the main conference, the CAiSE Forum, EMMSAD, or BPMDS;
• having successfully defended the PhD thesis in the last two years (i.e., since January 2022).
The application must be submitted electronically to the PhD Awards track of CAiSE 2024 via
EasyChair <https://easychair.org/conferences/?conf=caise2024> . The application must be
a single PDF file containing:
• a short cover letter that includes the list of PhD committee members,
• a support letter from the thesis advisor,
• the candidate's defended PhD thesis,
• the candidate’s CV.
About the PhD Award
The CAiSE PhD Award 2024 is granted annually to an outstanding recent PhD thesis in the
field of Information Systems Engineering.
The award is co-sponsored by the CAiSE Steering Committee and Springer. It consists of a
certificate, free full registration (5 days) to the next two editions of the CAiSE conference,
and a book voucher for a free selection worth EUR 500 from Springer’s printed books
collection. In addition, the selected thesis will be recommended for publication as a
monograph in the LNBIP series published by Springer, provided that Springer’s publication
conditions are met.
The PhD theses submitted for the award will be reviewed by a standing committee of senior
members selected from the CAiSE Advisory Committee, the CAiSE Steering Committee, and
the CAiSE Program Committee.
Award Chair
Professor Andreas L Opdahl, University of Bergen, Norway
Key Dates
• Submission of application: 1st March, 2024 (AoE)
• Notification: 15th April, 2024
Past Recipients
• 2023: Anna Bernasconi, PhD from Politecnico di Milano (Italy), thesis title “Model, Integrate,
Search... Repeat: a Sound Approach to Building Integrated Repositories of Genomic Data” (link to the forthcoming monograph: https://link.springer.com/book/9783031449062)
• 2022: Volodymyr Leno, PhD from University of Melbourne (Australia), thesis title
“Robotic Process Mining: Accelerating the adoption of Robotic Process Automation” (link to the thesis: https://minerva-access.unimelb.edu.au/bitstream/handle/11343/297274/98f9efca-4dd2-eb11-94dc-0050568d0279_manuscript.pdf)
• 2021: Orlenys Lopez Pintado, PhD from University of Tartu (Estonia), thesis title
“Collaborative Business Process Execution on the Block Chain: the Caterpillar System” (link to the thesis: https://dspace.ut.ee/items/1e09072c-5442-463a-b8c6-0425951cb90b)
• 2020: Steven Mertens, PhD from Ghent University (Belgium), thesis title “Enabling process
management for loosely framed knowledge-intensive processes” (link to the published monograph: https://www.springer.com/gp/book/9783030661922)
• 2019: Giovanni Meroni, PhD from Politecnico di Milano (Italy), thesis title “Artifact-driven
business process monitoring” (link to the published monograph: https://www.springer.com/gp/book/9783030324117)
• 2018: Wei Wang, PhD from University of Queensland (Australia), thesis title “Integrated
Modeling of Business Processes and Business Rules” (link to the published monograph: https://www.springer.com/gp/book/9783030118082)
• 2017: Marcela Ruiz, PhD from the Universidad Politécnica de Valencia (Spain), thesis title
“TraceME: A Traceability-Based Method for Conceptual Model Evolution” (link to the published monograph: https://www.springer.com/gp/book/9783319897158)
• 2016: Le Minh Sang Tran, PhD from University of Trento (Italy), thesis title “Managing the
Uncertainty of the Evolution of Requirements Models” (testimony of the 2016 CAiSE PhD Award winner: https://www.youtube.com/watch?v=q-vvlH66lC4)
* We apologize if you receive multiple copies of this CFP *
For the online version of this Call, visit:
https://easychair.org/cfp/3rdDHandNLP
===============
3rd DHandNLP
Third Workshop on Digital Humanities and Natural Language Processing
Co-located with PROPOR 2024
14-15 March 2024, Universidade de Santiago de Compostela, Galizia, Spain
Website: https://sites.google.com/view/dhandnlp-propor
Submission deadline: 20 January 2024 (23:59 GMT)
Submission link: https://easychair.org/conferences/?conf=3rddhandnlp
3rd DHandNLP is a one-day workshop during PROPOR - 14-15 March 2024
*Workshop description*
Digital humanities (DH) stand at the intersection of computing and the
humanities, involving collaborative transdisciplinary research. While
current DH practice already shows an impressive array of new digital tools
and methods for the study of the humanities, we believe that natural
language processing techniques and experience can significantly enhance the
field, while DH can also bring new testbeds and problems for the NLP
community.
As shown in the previous workshops, a growing number of researchers in the
processing of Portuguese are interested in this active collaboration, and
we believe that we should provide a forum that brings together the two
communities, DH and NLP, showcasing the different aspects enabled by this
cross-fertilization.
The 3rdDHandNLP welcomes papers stemming from humanities that deal with
language, such as philosophy, history, geography, law, philology,
linguistics, or literature, and that can benefit from a digital approach or
be enhanced with computational linguistics methods or techniques, be it by
using large sets of (written or spoken) textual data or by developing
applications for an increasingly digital world.
We also welcome papers that use “traditional” DH tools or techniques, such
as topic modeling, and papers that use standard NLP tools that were already
applied in different DH contexts, such as named entity recognition,
document clustering and classification, sentiment analysis,
dialect/language identification and linked data.
*Main workshop topics*
- Digital philology, critical editions production and textual criticism
- Lexicometrics, lexicology and lexicography
- Visualization or sonification of large textual bodies in specific domains
- Computational stylometry, authorship attribution and profiling
- Distant reading of literature
- Construction of historical thesauri
Finally, we are especially interested in approaches that deal with
historical material, involving not only historical linguistics but
historical lexicology, corpus processing and their multilingual analysis.
*SUBMISSION GUIDELINES*
All papers must be anonymous, original and not simultaneously submitted to
another journal or conference. They must strictly adhere to the submission
templates of the main conference.
We welcome submissions of:
- Short papers, consisting of up to 4 pages of content, plus unlimited
pages of references
- Full papers, consisting of up to 8 pages of content, plus unlimited pages
of references
Kind regards,
Maria José B. Finatto and Leonardo Zilio (on behalf of the organising
committee)
The PROPOR 2024 demonstration program committee invites submissions for
demonstrations. Following the spirit of previous PROPOR editions, the
demonstration track aims at bringing together academia and industry,
creating a forum where more than written or spoken descriptions of research
are available. Thus, demos should allow attendees to try and test them
during their presentation in a dedicated session that will provide a more
informal and interactive setting. Products, systems, or tools are examples
of acceptable demos. Both early-research prototypes and mature systems may
be considered.
*Important dates:*
Demos Submission: January 10, 2024
Notification of acceptance or rejection: February 21, 2024
Camera-ready demo paper: February 28, 2024
Conference: March 14 and 15, 2024
*Topics:*
The areas of interest include all topics related to theoretical and applied
issues of written and spoken Portuguese and Galician, such as, but not
limited to, the same topics as for the conference paper submission:
- Natural language processing tasks (e.g. parsing, word sense disambiguation,
  coreference resolution)
- Natural language processing applications (e.g. question answering,
  subtitling, summarization, sentiment analysis)
- Natural language generation
- Information extraction and information retrieval
- Speech technologies (e.g. spoken language generation, speech and speaker
  recognition, spoken language understanding)
- Speech applications (e.g. spoken language interfaces, dialogue systems,
  speech-to-speech translation)
- Resources, standardization and evaluation (e.g. corpora, ontologies,
  lexicons, grammars)
- NLP-oriented linguistic description or theoretical analysis
- Distributional semantics and language modeling
- Portuguese language varieties and dialect processing (including the
  language varieties of Angola, Brazil, Cape Verde, East Timor, Galicia,
  Guinea-Bissau, Macau, Mozambique, Portugal, and São Tomé and Príncipe)
- Multilingual studies, methods, applications, and resources including
  Portuguese/Galician
The systems may be of the following kinds:
- Natural Language Processing systems or system components
- Application systems using language technology components
- Software tools for computational linguistics research
- Software for demonstration or evaluation
- Development tools
*Submissions:*
Submissions should consist of a non-anonymous brief description document of
up to three pages of content, including references. Developers must outline
the main characteristics of their system/product/tool, provide sufficient
details to allow its evaluation, and give information on how they plan to
demonstrate it. Developers are encouraged to focus their description on the
relevance of the computational processing component of Portuguese or
Galician in the proposed system.
Submissions should be written in English. At submission time, only PDF
format is accepted. For the final versions, authors of accepted papers will
be given one extra content page to take the reviews into account. Authors
of accepted papers will be requested to send the source files for the
production of the proceedings.
Submissions must be sent via EasyChair (
https://easychair.org/my/conference?conf=propor2024) — please select the
track: PROPOR2024 Demo Paper.
All submitted papers must conform to the official ACL style guidelines. ACL
provides style files for LaTeX and Microsoft Word that meet these
requirements. They can be found at:
LaTeX stylesheet:
https://github.com/acl-org/acl-style-files/tree/master/latex
MS Word stylesheet:
https://github.com/acl-org/acl-style-files/tree/master/word
Publication:
Accepted demo papers are expected to be published by ACL as a volume in ACL
Anthology (https://aclanthology.org/) as part of the PROPOR 2024
proceedings. They will be available online. To ensure publication, at least
one author of each accepted paper must complete an adequate registration
for PROPOR 2024 by the early registration deadline.
*Presentation format:*
Accepted demos will be presented at a designated demo session with an
optional accompanying poster. Developers should make sure they can run
their demos properly. Thus, it is the authors’ responsibility to provide
the necessary technical conditions (i.e. equipment) for the demo at the
conference. Note that the local organizers will not provide any hardware or
software. Free high-speed Internet access will be available.
There will be a best demo award for the best-presented project.
Further details on the date, time, and instructions of the demonstration
session(s) will be determined and provided at a later date.
*Demo chairs:*
Marlo Souza (Universidade Federal da Bahia, Brazil)
Iria de-Dios-Flores (Universidade de Santiago de Compostela, Spain)
--
*Iria de-Dios-Flores (PhD)*
*https://sites.google.com/view/iriadediosflores/*
We are very pleased to share our first call for papers for our workshop on Reference, Framing, and Perspective co-located with LREC-COLING 2024.
* Workshop website: https://cltl.github.io/reference-framing-perspective/
* When: Saturday, May 25th, 2024
* Where: Torino, Italy (co-located with LREC-COLING 2024)
* Deadline for submissions: February (details tba)
* Paper submission link: tba
* Deadline for camera-ready papers: beginning of April 2024 (details tba)
When something happens in the world, we have access to an unlimited range of ways (from lexical choices to specific syntactic structures) to refer to the same real-world event. Variations in reference may convey radically different perspectives. This process of making reference to something by adopting a specific perspective is also known as framing. Although there is previous work in this area (see the survey by Ali and Hassan (2022) for an overview), a unified framework is lacking, and only a few targeted datasets (Chen et al., 2019) and tools based on Large Language Models (Minnema et al., 2022) exist. In this workshop, we propose to adopt Frame Semantics (Fillmore, 1968, 1985, 2006) as a unifying theoretical framework and analysis method to understand the choices made in linguistic references to events. The semantic frames (expressed by predicates and roles) we choose give rise to our understanding, or framing, of an event. We aim to bring together different research communities interested in lexical and syntactic variation, referential grounding, frame semantics, and perspectives. We believe that there is significant overlap in the goals and interests of these communities, but not the necessary common ground to enable collaborative work.
Shared dataset:
To facilitate discussion among participants and to make this a real working workshop, we make available a shared corpus. The corpus is composed of news articles reporting on the 2020/2021 Eurovision Song Contest (canceled in 2020 and held in 2021) that took place in Rotterdam (the Netherlands). The news articles have been collected using the structured data-to-text approach (Vossen et al., 2018). At this point, the corpus contains texts in English and Dutch. We are extending it to a range of other European languages. We invite participants to submit short and targeted analyses using the data (extended abstracts to be discussed in a hands-on data session). Participants are also free to use the data in regular contributions. More information about the corpus will be released soon.
Regular contributions:
We aim to lay the groundwork for such efforts. We invite contributions (regular long papers of 8 pages or short papers of 4 pages) targeting any topic in the following non-exhaustive list:
* Theoretical models of framing and perspective
* Annotation frameworks for framing and perspectives
* Computational models of framing and perspective
* Approaches for creating and analyzing referentially grounded datasets (containing different perspectives, written at different points in time, written in different languages)
* Approaches for and analyses of texts about contested and divisive events triggering different opinions and perspectives
* Analyses of and methods for analyzing (diachronic) lexical variation and framing
* Language resources for reference, frames, and perspectives
* Approaches and tools to compare claims of sources
* Frames as expressions of bias in the representation of social groups
* User interfaces for the visualization of multiple perspectives
Extended abstracts:
We invite extended abstracts (1,500 words maximum) about small analyses or experiments conducted on our Shared Data. The abstracts will be non-archival and discussed in a dedicated data session.
Invited speakers:
Maria Antoniak
Vered Shwartz
Organizers:
Pia Sommerauer, Tommaso Caselli, Malvina Nissim, Levi Remijnse, Piek Vossen
***Third Call for Papers***
**Overview**
- Submission of long and short papers: December 18, 2023 (no deadline
extension possible)
- Submission page: https://softconf.com/eacl2024/LAW-XVIII/
- Website: https://sigann.github.io/LAW-XVIII-2024/
**Workshop Description**
LAW-XVIII will be the 18th annual meeting endorsed by the ACL Special
Interest Group for Annotation (SIGANN). It will take place in March 2024 at
EACL in St. Julians, Malta.
Linguistic annotation of natural language corpora is the backbone of
supervised methods in both statistical and neural natural language
processing. Annotated corpora are also a major supporting source of
information for unsupervised methods, multitask learning, and evaluation of
both NLP tools and theories about language within and outside of
linguistics. The LAW-XVIII will provide a forum for presentation and
discussion of innovative research on all aspects of linguistic annotation,
including creation/evaluation of annotation schemes, methods for automatic
and manual annotation, use and evaluation of annotation software and
frameworks, representation of linguistic data and annotations,
semi-supervised “human in the loop” methods of annotation, crowd-sourcing
approaches, and more.
The LAW will also provide a forum for annotation researchers to work
towards standardization, best practices, and interoperability of annotation
information and software.
In line with the EACL main conference, LAW will be hybrid, allowing both
in-person and virtual presentations.
**Special Theme**
The special theme of LAW-XVIII is “Annotation in the Age of Large Language
Models (LLMs).” In addition to LAW’s general topics, we specifically invite
submissions on the following topics:
- Comparison of linguistically annotated datasets vs. datasets created
using large language models. Potential topics include:
- Comparison of models that have been trained on the respective datasets
- Impact of data size of manually annotated resources already available
prior to dataset creation with LLMs
- Is synthetic dataset creation a viable option for non-standard domains,
e.g., the medical domain, where expert knowledge is required?
- Non-performance-related considerations of manual vs. synthetic dataset
creation (e.g., explainability)
- Impact and prevention of test dataset contamination in LLM training
- Usefulness of LLMs for linguistic research (in relation to annotation).
- Any other topics related to the special theme.
**Submissions**
We accept both direct submissions and commitments from ACL Rolling Review
(ARR).
We welcome submissions of long and short papers, posters, and
demonstrations relating to the special theme or any aspect of linguistic
annotation, including:
- Annotation procedures
  - Innovative automated and manual strategies for annotation
  - Machine learning and knowledge-based methods for automation of corpus
    annotation
  - Creation, maintenance, and interactive exploration of annotation
    structures and annotated data
- Annotation evaluation
  - Inter-annotator agreement and other evaluation metrics and strategies
  - Qualitative evaluation of linguistic representations
  - Innovative means to evaluate annotation quality
- Annotation access and use
  - Representation formats/structures for annotations of different phenomena,
    especially annotations at multiple levels, and means to explore/manipulate
    them
  - Linguistic considerations for merging annotations of distinct phenomena
- Annotation schemes, guidelines and standards
  - New and innovative annotation schemes, comparison of annotation schemes
  - Methodologies and resources for annotation scheme development
  - Best practices for annotation procedures and/or development and
    documentation of annotation schemes
  - Interoperability of annotation formats and/or frameworks among different
    systems as well as different tasks, frameworks, modalities, and languages
  - Results from the application and evaluation of standards for linguistic
    annotation
- Annotation software and frameworks
  - Development, evaluation and/or innovative use of annotation software
    frameworks
Submissions should report original and unpublished research on topics of
interest to the workshop. We also invite substantiated position papers, in
particular with regard to our special theme. Accepted papers are expected
to be presented at the workshop and will be published in the workshop
proceedings. They should emphasize obtained results rather than intended
work, and should indicate clearly the state of completion of the reported
results.
A paper accepted for presentation at the workshop must not be or have been
presented at any other meeting with publicly available proceedings.
Long/short paper submissions must use the official ACL style templates.
Long papers must not exceed eight (8) pages of content. Short papers and
demonstration papers must not exceed four (4) pages of content. References
do not count against these limits.
Note: Supplementary material does not count toward the page limit and
should not be included in the paper; submit it separately using the
appropriate field on the submission website. All submissions must be in
PDF format.
Reviewing of papers will be double-blind. Therefore, the paper must not
include the authors' names and affiliations or self-references that reveal
the authors’ identity--e.g., "We previously showed (Smith, 1991) ..."
should be replaced with citations such as "Smith (1991) previously showed
...". Papers that do not conform to these requirements will be rejected
without review.
Authors of papers that have been or will be submitted to other meetings or
publications must provide this information to the workshop co-chairs (
law-xviii-2024(a)googlegroups.com). Authors of accepted papers must notify
the program chairs within 10 days of acceptance if the paper is withdrawn
for any reason.
We follow previous and current ACL policy to establish an anonymity period
(from submission to author notification) during which non-anonymous posting
of preprints is not allowed. Also included in that policy are instructions
to reviewers to not rate papers down for not citing recent preprints.
Authors are asked to cite published versions of papers instead of preprint
versions when possible.
Papers can be submitted at https://softconf.com/eacl2024/LAW-XVIII/.
If you have any questions, please feel free to contact the program
co-chairs via e-mail or check the workshop website (
https://sigann.github.io/LAW-XVIII-2024/) for updates.
**Dates**
(All submission deadlines are 11:59 p.m. UTC-12:00 “anywhere on Earth” and
will not be extended)
Anonymity period starts: November 18, 2023
Submission of long and short papers: December 18, 2023
ARR Commitment deadline: January 17, 2024
Notification of acceptance: January 20, 2024
Camera-ready papers due: January 30, 2024
Workshop: March 21 or 22, 2024
**Workshop Organizers**
Manfred Stede (Program Co-Chair)
Sophie Henning (Program Co-Chair)
Amir Zeldes (ACL SIGANN President)
Ines Rehbein (ACL SIGANN Secretary)
-------------------------
2nd Workshop on Resources and Technologies for Indigenous, Endangered and
Lesser-resourced Languages in Eurasia (EURALI) @ LREC-COLING 2024
Date: 20-25 May, 2024
Venue: Lingotto Conference Centre - Torino (Italia)
Main website: https://sites.google.com/view/eurali/
LREC-COLING 2024 website: https://lrec-coling-2024.org/
——————————————————————————————————
Workshop overview and objectives
This workshop will focus on the development of language technology
resources and tools for indigenous, endangered and lesser-resourced
languages on the Eurasian continent.
In a media-centric world where language technology allows people to break
cultural and language barriers, it is important that speakers of endangered
and indigenous languages can be empowered to use this technology to
continue to share their knowledge and culture with the world. With the hope
of bridging this gap, the goal of this workshop is to increase visibility
and promote research for lesser-resourced and underrepresented languages in
Europe and Asia. Through collaboration between NLP researchers, language
experts and linguists working for the benefit of endangered languages in
these communities, we aim to create language technology resources that will
help to preserve and revive these languages for future generations.
Furthermore, the workshop aims to promote the emergence of new methods that
benefit linguists (for instance, by automating analysis and validation
processes), field linguists (by facilitating data collection and analysis),
and computational linguists (by developing new techniques for linguistic
analysis, including supervised or weakly supervised methods for analyzing
poorly written or undocumented languages).
The main objective of the workshop is to create basic resources and develop
tools for Eurasiatic languages, including but not limited to the following
topics:
- identifying languages and variants spoken in these regions
- creation of language resources and applications, e.g., sentiment
  analysis, named entity recognition, and syntactic parsing
- standardization for endangered languages
- automatic identification and classification of lexical variation and
  language varieties
- adaptation of fundamental NLP tools for these languages, e.g.,
  morphological analysis, taggers and parsers
- reusability of language resources in NLP applications, e.g., machine
  translation and POS tagging
- machine translation between closely related languages
- evaluation of language resources and tools when applied to
  lesser-resourced languages in the same language families
- corpora, resources, and tools for closely related languages
- linguistic and textual similarities among languages in Eurasia
- digitalization of endangered languages
- challenges in the creation of language resources and tools from
  linguistic perspectives (including any formal-theory perspective)
Submissions
We are seeking submissions in the following categories:
Full papers: 8 pages + unlimited references
Short papers (work in progress): 4 pages + unlimited references
Posters (innovative ideas/proposals, student research ideas): 4 pages +
unlimited references
Demos (of working online/standalone systems): 2 pages
Papers must describe original and unpublished work, whether completed or
in progress. Accepted papers will appear in the workshop proceedings as
full papers, short papers, or posters and will be presented at the workshop
either orally or as a poster.
Papers should be formatted according to the LREC-COLING style sheet (
https://lrec-coling-2024.org/authors-kit/), which is provided on the
LREC-COLING 2024 website (https://lrec-coling-2024.org/). Please submit
papers in PDF format to the START account (the submission link will be
available soon). For further information on this initiative, please refer
to the workshop website (https://sites.google.com/view/eurali/).
Important Dates (tentative)
February 23, 2024: Paper submissions due
March 22, 2024: Paper notification of acceptance
May 20-25, 2024: Workshop
Workshop Chairs:
Atul Kr. Ojha, University of Galway, Galway (Ireland) & Panlingua (India)
Sina Ahmadi, George Mason University (USA)
Chao-Hong Liu, Potamu Research Ltd, Dublin (Ireland)
John P. McCrae, University of Galway, Galway (Ireland)
Theodorus Fransen, Università Cattolica del Sacro Cuore, Milan (Italy)
Silvie Cinková, Charles University, Prague (Czech Republic)
Programme Committee (to be updated):
Abigail Walsh*, Dublin City University, Dublin (Ireland)
Agata Savary, University of Paris-Saclay, Paris-Saclay (France)
A. Seza Doğruöz, Ghent University, Ghent (Belgium)
Alina Karakanta, University of Leiden, Leiden (Netherlands)
Alina Wróblewska, Institute of Computer Science, Jana Kazimierza, Warszawa
(Poland)
Akanksha Bansal, Panlingua, Delhi (India)
Anabela Barreiro*, INESC-ID, Lisboa (Portugal)
Atul Kr. Ojha, University of Galway, Galway (Ireland) & Panlingua, (India)
Bharathi Raja Chakravarthi, University of Galway, Galway (Ireland)
Bogdan Babych, Heidelberg University, Heidelberg (Germany)
Chao-Hong Liu, Potamu Research Ltd, Dublin (Ireland)
Daan van Esch, Google, Amsterdam (Netherlands)
Daniel Zeman, Charles University, Prague (Czech Republic)
Deepak Alok, IIT-Delhi, Delhi (India)
Dorothee Beermann, Norwegian University of Science and Technology,
Trøndelag (Norway)
Esha Banerjee, J.P. Morgan, Bengaluru (India)
Ekaterina Vylomova, University of Melbourne, Melbourne (Australia)
George Rehm, GmbH, Berlin (Germany)
Jamal Abdul Nasir, University of Galway, Galway (Ireland)
Joakim Nivre, Uppsala University, (Sweden)
John P. McCrae, University of Galway, (Ireland)
Jonathan Washington, Swarthmore College, Swarthmore (USA)
Joseph Mariani, LIMSI-CNRS, Paris (France)
Kaja Dobrovoljc, University of Ljubljana, Ljubljana (Slovenia)
Katharina Kann*, University of Colorado at Boulder, USA
Kevin Patrick Scannell, Cadhan Aonair, LLC, Missouri (USA)
Khalid Choukri, ELDA/ELRA, Paris (France)
Marie-Catherine de Marneffe, UCLouvain, Collège Léon Dupriez (Belgium)
Massimo Moneglia, University of Florence (Italy)
Nicoletta Calzolari, CNR-ILC, (Italy)
Olesea Caftanatov, Vladimir Andrunachievici Institute of Mathematics and
Computer Science, Chişinău (Moldova)
Richard Sproat, Google, Tokyo (Japan)
Rico Sennrich, University of Zurich, Zurich (Switzerland)
Ritesh Kumar, Agra University, Agra (India)
Saliha Muradoglu, Australian National University, Canberra (Australia)
Silvie Cinková, Charles University, Prague (Czech Republic)
Sina Ahmadi, George Mason University, (USA)
Stella Markantonatou, Athena RC, Athens (Greece)
Sourabrata Mukherjee, Charles University, Prague (Czech Republic)
Sunipa Dev, Google, Washington (USA)
Theodorus Fransen, Università Cattolica del Sacro Cuore, Milan (Italy)
Valentin Malykh, MTS AI / ITMO University
Verginica Barbu Mititelu, Research Institute for Artificial Intelligence,
Bucharest (Romania)
Voula Giouli, Institute for Language and Speech Processing, Athens (Greece)