Workshop URL: https://elex.link/elex2023/workshop-on-lexicography-and-cefr/
INTRODUCTION
The workshop ‘Lexicography and CEFR: Linking lexicographic resources and language proficiency levels’ will be held in conjunction with eLex 2023 in Brno, Czech Republic, on the afternoon of June 29, in hybrid mode.
The focus is lexicographic projects connected with the Common European Framework of Reference for Languages (CEFR). CEFR is a widely established international standard for describing language proficiency, and CEFR-graded resources have been developed for many languages in Europe. However, incorporating their information is still not common practice in modern lexicography for most languages, with the notable exception of two English dictionaries for advanced learners (Cambridge and Oxford). Moreover, there are substantial unresolved issues, such as inconsistencies in vocabulary size per level between languages; no, or limited, sense disambiguation in CEFR resources; words from a higher CEFR level in definitions and example sentences; and limited collaboration among the related fields of lexicography, language acquisition, and linguistic linked data.
The main objectives are to examine approaches and methods used for linking lexicographic data and language proficiency levels and to discuss strategies for greater convergence between lexicographic resources and CEFR-based language learning programs.
TOPICS
The workshop will feature an overview by the organizers as well as invited talks. In addition, we invite submissions for papers (20 minutes, plus discussion) on the following topics:
• the creation of CEFR-graded lexicographic resources
• the implementation of vocabulary and grammar profiles in lexicographic resources
• the creation of crosslingual concept-based CEFR resources
• the use of lexicographic resources for creating language proficiency-level learning applications and tools
• the use of CEFR-graded lexicographic resources in CALL
• the linking of lexicographic resources to CEFR-graded vocabularies
• data collection for creating CEFR-graded lexicographic resources
SUBMISSION AND DATES
Abstracts of 300-500 words should be submitted by February 15, 2023 via Indico [https://indico.elex.link/event/1/]. Please choose the ‘cefr’ track when submitting your abstract.
Notification of acceptance will be made by February 22.
The authors of accepted papers will be invited to submit a full paper for the eLex 2023 Proceedings (indexed by SCOPUS).
Paper submissions will be due by March 31 and acceptance notification will be made by April 15, 2023.
ORGANIZERS AND CONTACT
Kris Heylen. Dutch Language Institute
Jelena Kallas. Institute of the Estonian Language (jelena.kallas(a)eki.ee)
Ilan Kernerman. Lexicala by K Dictionaries
Carole Tiberius. Dutch Language Institute (carole.tiberius(a)ivdnt.org)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Following this workshop at eLex, a related workshop ‘Linking Lexicographic and Language Learning Resources’ (4LR 2023) will be held in conjunction with LDK 2023 – the fourth conference on Language, Data and Knowledge – on September 13 in Vienna, Austria. 4LR will emphasize the linked data and knowledge management methodologies and technologies for linking lexicography and language learning in general. The call for papers will be published in February 2023.
We are hiring in the CAISA Lab! (https://caisa-lab.github.io/team/,
https://twitter.com/CaisaLab)
The position is part of the BMBF project "Dynamically Social Natural
Language Processing for Online Discourse Analysis" (
https://caisa-lab.github.io/projects/).
In this project we develop novel representation methods for users +
language + network, to model the stance framing as well as the perception
of opinionated statements in a personalized manner.
The position is available for 2 years starting asap, with the foreseen
salary of E13-3 100% (about 3k Eur net/month, health and social insurance
included). Fluent scientific English is necessary, German not required.
Relevant experience in the field (e.g. some of: argument mining,
computational social science, misinformation analysis, discourse modeling)
is expected, as well as an ability to work independently and support more
junior researchers.
Apply here until Jan 29th:
https://stellenangebote.uni-marburg.de/jobposting/f66a7770882c529633774957a…
Or contact lucie.flek(a)uni-marburg.de for more information.
Call for Submission of Extended Abstracts CLARIN Annual Conference 2023
CLARIN ERIC is pleased to announce the CLARIN Annual Conference 2023<https://www.clarin.eu/event/2023/clarin-annual-conference-2023> and calls for the submission of extended abstracts. CLARIN is the European research infrastructure that makes digital language resources available to scholars, researchers, students and citizen-scientists from a wide range of disciplines, coordinates the collection of language resources and tools, and offers advanced tools to explore, exploit, annotate, analyse or combine such datasets, regardless of their location.
Submission deadline: 14 April 2023
________________________________
Location
After the successful hybrid edition of 2022, we plan to repeat the same format in 2023. The CLARIN Conference 2023 will be a face-to-face event, which will also be fully accessible virtually. The conference will take place in the historic city of Leuven, Belgium, at the heritage campus of the Irish College<https://www.irishcollegeleuven.eu/>. The event will be hosted and organised by CLARIN ERIC, in collaboration with KU Leuven<https://www.kuleuven.be/english/about-kuleuven/>, CLARIN-BE<https://clarin-be.ivdnt.org/> and the Instituut voor de Nederlandse Taal<http://www.ivdnt.org>.
________________________________
Important Dates
* 23 January 2023: First call published on CLARIN website, disseminated, and submission system open
* 14 April 2023: Submission deadline
* 30 June 2023: Notification of acceptance
* 4 September 2023: Camera-ready version deadline
* 16-18 October 2023: CLARIN Annual Conference
________________________________
Conference Aims
The CLARIN Annual Conference is organised for the wider Social Sciences and Humanities (SSH) community in order to exchange experiences and best practices in working with the CLARIN infrastructure and to share plans for future developments. The programme will cover a range of topics, including the design, construction and operation of the CLARIN infrastructure, the data, tools and services that it contains or should contain, its actual use by researchers, teachers or interested parties, its relation to other infrastructures and projects, and the CLARIN Knowledge Infrastructure.
________________________________
Keynote Speakers
To be confirmed.
________________________________
Conference Topics
We invite submissions describing CLARIN-related work addressing the following aspects:
Use of the CLARIN Infrastructure:
* Use of the CLARIN infrastructure in SSH research and beyond
* Usability studies and evaluations of CLARIN services
* Analysis of the CLARIN infrastructure usage and impact studies/use cases
* Identification and analysis of user audiences and developer communities, including digital humanities, libraries, computer science, information science, cognitive science and human-centred AI
* Showcases, demonstrations and research projects that are relevant to CLARIN
* Teaching and learning cases for which CLARIN resources and services are used.
Design and Construction of the CLARIN Infrastructure:
* Recent tools and resources added to the CLARIN infrastructure
* Metadata and concept registries, cataloguing and browsing
* Persistent identifiers and citation mechanisms
* Access, including single-sign-on authentication and authorisation
* Search functions, including Federated Content Search
* Web applications, web services and workflows
* Standards and solutions for interoperability of language resources, tools and services
* Models for the sustainability of the infrastructure, including curation, migration, financing and cooperation
* Legal and ethical issues in operating the infrastructure.
CLARIN Knowledge Infrastructure and Dissemination:
* User assistance (help desks, user manuals, FAQs)
* CLARIN portals and outreach to users
* Videos, screencasts, recorded lectures
* Researcher training activities, hackathons
* Knowledge infrastructure centres.
CLARIN vis-à-vis other Infrastructures and Initiatives:
* SSH research infrastructures, such as DARIAH<https://www.dariah.eu/> and CESSDA<https://www.cessda.eu/> and the collaboration under the umbrella of the SSH Open Cluster<https://www.sshopencloud.eu/news/sshoc-ssh-open-cluster>, etc.
* Generic infrastructural initiatives, such as EUDAT<https://www.clarin.eu/glossary#EUDAT>, EOSC<https://eosc.eu/about-eosc>, Europeana<https://www.europeana.eu>, Language Data Space, etc.
* Projects such as EOSC Future<https://eoscfuture.eu/>, FAIRCORE4EOSC<https://faircore4eosc.eu/> and TRIPLE<https://project.gotriple.eu/about/>
* National and regional initiatives
________________________________
Format of the Programme Sessions
The programme of the conference will include oral presentations and posters, and may also include demos. Papers are allocated a presentation format based on the suitability of the paper for the type of session (i.e. more or less interactive), not based on their quality or other factors. Authors of accepted submissions will be offered the opportunity to demo their work in addition to their presentation.
________________________________
Submissions
The language of the conference is English and presentations will be made in English. Proposals for oral, poster or demo presentations must be submitted as extended abstracts (length: 3 to 4 pages A4, including references) in PDF format, in accordance with the template (ZIP-archive<https://www.dropbox.com/s/s7ocg2i15y0q1gy/Template_CLARIN2023.zip?dl=0>, Overleaf template<https://www.overleaf.com/read/qtvdcbqrmpfs>). Authors can choose whether to submit on an anonymous or non-anonymous basis.
Extended abstracts should address one or more topics that are relevant to CLARIN’s activities, resources, tools or services. This relevance should be explicitly articulated in the submission, as well as in the presentation at the conference. Contributions addressing desiderata for the CLARIN infrastructure that are currently not in place are also eligible. It is not required for authors to be or have been directly involved in national or cross-national CLARIN projects.
Extended abstracts must be submitted through the EasyChair submission system<https://easychair.org/conferences/?conf=clarin2023> and will be reviewed by the Programme Committee. All proposals will be reviewed on the basis of the following criteria:
* Appropriateness: The contribution must pertain to the CLARIN infrastructure or be relevant for it (e.g. its use, design, construction, operation, exploitation, illustration of possible applications, etc.), and this relevance should be explicitly articulated in the submission.
* Soundness and correctness: The content must be technically and factually correct and methods must be scientifically sound, according to best practice, and preferably evaluated.
* Meaningful comparison: The abstract must indicate that the author is aware of alternative approaches, if any, and highlight relevant differences.
* Substance: Concrete work and experiences will be given preference over ideas and plans.
* Impact: Contributions with a higher impact on the research community and society more broadly will be given preference over papers with lower impact.
* Clarity: The abstract should be clearly written and well structured.
* Timeliness and novelty: The work must convey relevant new knowledge to the audience at this event.
________________________________
Attendance
For each accepted abstract, one author will be granted reimbursement of travel costs up to 220 Euros, free accommodation and meals (conditional on the event taking place face-to-face; this does not apply if the conference needs to shift to a virtual format due to epidemiological reasons).
________________________________
Proceedings
Accepted submissions will be published in the online conference Book of Extended Abstracts, ISSN: 2773-2177. After the conference, the author(s) of accepted submissions will be invited to submit full papers (10-12 pages) to be reviewed according to the same criteria as the abstracts. Accepted full papers will be published in a digital conference proceedings volume after the conference: Linköping Electronic Conference Proceedings (peer reviewed) ISSN: 1650-3686 (print), 1650-3740 (online) https://ep.liu.se/en/conferences.aspx
________________________________
Conference Programme Committee
The Programme Committee for the conference consists of the following members:
* Krister Lindén, University of Helsinki, Finland (Chair)
* Starkaður Barkarson, Árni Magnússon Institute for Icelandic Studies, Iceland
* Lars Borin, University of Gothenburg, Sweden
* António Branco, University of Lisbon, Portugal
* Tomaž Erjavec, Jožef Stefan Institute, Slovenia
* Eva Hajičová, Charles University Prague, Czech Republic
* Monica Monachini, Institute of Computational Linguistics ‘A. Zampolli’, Italy
* Karlheinz Mörth, Austrian Academy of Sciences, Austria
* Costanza Navarretta, University of Copenhagen, Denmark
* Gijsbert Rutten, Leiden University, the Netherlands
* Maciej Piasecki, Wrocław University of Science and Technology, Poland
* Stelios Piperidis, ILSP, Athena Research Center, Greece
* Kiril Simov, IICT, Bulgarian Academy of Sciences, Bulgaria
* Inguna Skadiņa, Institute of Mathematics and Computer Science, University of Latvia, Latvia
* Koenraad De Smedt, University of Bergen, Norway
* Marko Tadić, University of Zagreb, Croatia
* Jurgita Vaičenonienė, Vytautas Magnus University, Lithuania
* Vincent Vandeghinste, Instituut voor de Nederlandse Taal (Dutch Language Institute), the Netherlands & KU Leuven, Belgium
* Tamás Váradi, Research Institute for Linguistics, Hungarian Academy of Sciences, Hungary
* Joshua Wilbur, Center of Estonian Language Resources, Estonia
* Andreas Witt, University of Mannheim, Germany
* Friedel Wolff, South African Centre for Digital Language Resources, North-West University, South Africa
* Martin Wynne, University of Oxford, United Kingdom
* Marianne Hundt, University of Zurich, Switzerland
________________________________
Links
* CLARIN Annual Conference 2023 website: https://www.clarin.eu/event/2023/clarin-annual-conference-2023
* EasyChair submission: available here<https://easychair.org/conferences/?conf=clarin2023>
* Template for submissions:
* ZIP-archive: available here<https://www.dropbox.com/s/s7ocg2i15y0q1gy/Template_CLARIN2023.zip?dl=0>
* Overleaf template: available here<https://www.overleaf.com/read/qtvdcbqrmpfs>
* Contact for any questions regarding the conference: events(a)clarin.eu (Please mention [CLARIN2023] in the email subject)
* Proceedings of selected papers from previous CLARIN conferences:
* CLARIN 2021: https://doi.org/10.3384/9789179294441
* CLARIN 2020: https://doi.org/10.3384/ecp180
* CLARIN 2019: https://doi.org/10.3384/ecp2020172
* CLARIN 2018: http://www.ep.liu.se/ecp/contents.asp?issue=159
* CLARIN 2017: http://www.ep.liu.se/ecp/contents.asp?issue=147
* CLARIN 2016: http://www.ep.liu.se/ecp/contents.asp?issue=136
* CLARIN 2015: http://www.ep.liu.se/ecp/contents.asp?issue=123
* CLARIN 2014: http://www.ep.liu.se/ecp/contents.asp?issue=116
---
Elisa Gorgaini
External Relations Officer
CLARIN ERIC www.clarin.eu
E-mail: elisa(a)clarin.eu
Phone: +31648213015
Dear colleagues,
We are pleased to share information about the 3rd Workshop on Financial
Technology on the Web (FinWeb) with you. FinWeb-2023 will be held on April
30, 2023 in conjunction with The Web Conference 2023. Our keynote speaker,
Dr. James Zhang, Managing Director of AI Prediction and Strategy Platform
of Ant Group, will share their experience on the topic of China's First
Natural Language-based AI Chatbot Trader.
We invite the submission of papers on original research in this area. We
offer a prize in the main track (USD$500 to the Best Paper Award winner).
*Submission Deadline: Feb. 06, 2023*
The proceedings of the workshops will be published jointly with the
conference proceedings.
Please refer to the site of FinWeb-2023 for more details:
https://sites.google.com/nlg.csie.ntu.edu.tw/finweb-2023/home
Sincerely,
Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen
FinWeb-2023 Organizers
*Topics of Interest*
Some possible topics are listed below, grouped by the research tracks of The Web
Conference; submissions are not limited to these topics.
- *FinTech*
  - Analyzing Cloud, Online, and Mobile Financial Services
  - Anti-Money Laundering
  - Client Financial Security
  - Credit Analysis and Pricing
  - Crowdfunding
  - Digital Financial Advising
  - Financial Crime Detection
  - Financial Digital Authentication
  - Internet Payment
  - Internet Wealth Management
  - Mobile Payment
  - Modeling Financial Chaos, Uncertainty, and Change
  - Novel Financial Service Design
  - Online Banking
  - Peer-to-Peer Lending
  - Regulation
- *Web and Internet Economics*
  - Algorithmic Game Theory
  - Algorithmic Mechanism Design
  - Auction Algorithms and Analysis
  - Computational Advertising
  - Computational Aspects of Equilibria
  - Computational Social Choice
  - Learning in Markets and Mechanism Design
  - Learning under Strategic Behavior
  - Coalitions, Coordination, and Collective Action
  - Economic Aspects of Security and Privacy
  - Economic Aspects of Distributed Computing and Cryptocurrencies
  - Econometrics, Machine Learning, and Data Science
  - Behavioral Economics and Behavioral Modeling
  - Fairness and Trust in Games and Markets
  - Price Differentiation and Price Dynamics
  - Revenue Management
  - Social Networks and Network Games
*Best Paper Award*
We will offer *USD$500* to the Best Paper Award winner.
*Submission Details* (Time zone: Anywhere On Earth (AOE)
<https://www.google.com/url?q=https%3A%2F%2Fwww.timeanddate.com%2Ftime%2Fzon…>)
*Submission System:*
*https://easychair.org/conferences/?conf=thewebconf2023iwpd*
- *Submission Deadline: Feb. 06, 2023*
- Notification: March 06, 2023
- Camera-ready version: March 20, 2023
The proceedings of the workshop will be published jointly with the
conference proceedings.
*Instructions for Authors of Main Track Submissions*
- *Regular Paper (anonymous):* *8 pages* for the main text (including
all figures but excluding references), and one additional page for
references.
- *Poster & Demonstration (non-anonymous):* *4 pages* in total
(including all figures and references).
- Papers submitted to the main track must be formatted according to *The
Web Conference 2023 Guidelines*
<https://www.google.com/url?q=https%3A%2F%2Fwww2023.thewebconf.org%2Fcalls%2…>
and must follow the page limitation. Papers must be submitted in PDF
according to the ACM format published in the ACM guidelines, selecting the
generic “sigconf” sample. The PDF files must have all non-standard fonts
embedded. Papers must be self-contained and in English.
- At least one author of each accepted paper is required to attend the
workshop to present the work. *All presenters need to be physically
present.* No virtual presentations are allowed. As such, registration is
mandatory for all presenters / speakers. A proxy (i.e. somebody else who
presents the paper physically) may be designated in case the author
cannot attend. Authors will be required to agree to this requirement at the
time of submission.
Dear Colleagues
I am writing to advertise a new PhD position in Translation Studies and
Lexicography at Innsbruck University.
All details (German and English) and the link for applying are at
https://lfuonline.uibk.ac.at/public/karriereportal.details?asg_id_in=13221
Kind regards,
Laura Giacomini
--
Univ.-Prof. Dr. Laura Giacomini
Institut für Translationswissenschaft
Herzog-Siegmund-Ufer 15
A-6020 Innsbruck
Special Issue:
Current Trends in Natural Language Processing (NLP) and Human Language Technology (HLT)
MATHEMATICS
NEW IMPACT FACTOR 2.592
An Open Access Journal by MDPI
link: https://www.mdpi.com/journal/mathematics
Guest Editor:
* Florentina Hristea, University of Bucharest
Deadline for manuscript submissions: April 23, 2023
Accepted papers are published continuously in the journal (as soon as accepted).
Message from the Guest Editor and Special Issue Web page:
https://www.mdpi.com/si/mathematics/NLP_HLT
A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors<https://www.mdpi.com/journal/mathematics/instructions> page.
For further information and questions, please contact:
Florentina Hristea
University of Bucharest
fhristea(a)fmi.unibuc.ro
https://cs.unibuc.ro/~fhristea/
[apologies for x-posting]
Call for Papers and Extended Abstracts
Workshop on RESOURCEs and representations For Under-resourced Languages and domains (RESOURCEFUL-2023)
collocated with the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Norðurlandahúsið - The Nordic House in Tórshavn, Faroe Islands
22nd May 2023
https://resourceful-workshop.github.io/resourceful-2023/
Important dates:
- Submission deadline (both papers and abstracts): 28th March 2023
- Notification of acceptance: 25th April 2023
- Camera-ready version: 9th May 2023
- Workshop date: 22nd May 2023
All deadlines are 11:59PM UTC-12:00 ("anywhere on Earth").
Workshop description
The second workshop on resources and representations for under-resourced languages and domains (RESOURCEFUL-2023) explores the role of the kind and quality of the resources available to us, as well as the challenges and directions for constructing new resources in light of the latest trends in natural language processing.
Data-driven machine-learning techniques in natural language processing have achieved remarkable performance (e.g., BERT, GPT, ChatGPT), but to do so they require large quantities of quality data (mostly text). Interpretability studies of large language models in both text-only and multi-modal setups have revealed that even where large text datasets are available, the models still do not cover all the contexts of human social activity and are prone to capturing unwanted bias where data is focused on only some contexts. The question has also been raised whether textual data is enough to capture the semantics of natural language, or whether other modalities such as visual representations or the situated context of a robot might be required. Annotator-based resources have been constructed over the years based on theoretical work in linguistics, psychology and related fields, and a large amount of work has been done both theoretically and practically.
The purpose of the workshop is to initiate a discussion between the two communities involved in building resources (data vs annotation-based) and exploring their synergies for the new challenges in natural language processing. We encourage contributions in the areas of resource creation, representation learning and interpretability in data-driven and expert-driven machine learning setups and both uni-modal and multi-modal scenarios.
In particular we would like to open a forum by bringing together students, researchers, and experts to address and discuss the following questions:
- What is relevant linguistic knowledge the models should capture and how can this knowledge be sampled and extracted in practice?
- What kind of linguistic knowledge do we want and can capture in different contexts and tasks?
- To what degree are resources that have been traditionally aimed at rule-based natural language processing approaches relevant today both for machine learning techniques and hybrid approaches?
- How can they be adapted for data-driven approaches?
- To what degree can data-driven approaches be used to facilitate expert-driven annotation?
- What are current challenges for expert-based annotation?
- How can crowd-sourcing and citizen science be used in building resources?
- How can we evaluate and reduce unwanted biases?
Intended participants are researchers, PhD students and practitioners from diverse backgrounds (linguistics, psychology, computational linguistics, speech, computer science, machine learning, computer vision etc). We foresee an interactive workshop with plenty of time for discussion, complemented with invited talks and presentations of on-going or completed research.
This workshop is a continuation of the first workshop on resources and representations for under-resourced languages and domains held together with the SLTC 2020, https://gu-clasp.github.io/resourceful-2020/.
Submission
We invite submissions of both long (8 pages) and short papers (4 pages) with any number of pages for references. All submissions must follow the NoDaLiDa template. The templates, in both LaTeX and MS Word, are available at the official conference website (https://www.nodalida2023.fo/authorkit-nodalida23). Submissions must be anonymous and submitted in PDF format through OpenReview.
We also invite submissions of non-anonymous extended abstracts of at most 2 pages, with any number of pages for references, describing work in progress, negative results and opinion pieces. Papers related to our theme that have already been presented at other venues or published elsewhere will be considered for acceptance for presentation as well. The abstracts, which should follow the same formatting templates as the archival track, will be reviewed by the workshop organisers and the accepted ones will be posted on the workshop website.
Workshop organisers
Dana Dannélls, Språkbanken Text, University of Gothenburg
Simon Dobnik, CLASP, University of Gothenburg
Adam Ek, CLASP, University of Gothenburg
Stella Frank, University of Copenhagen
Nikolai Ilinykh, CLASP, University of Gothenburg
Beáta Megyesi, Uppsala University
Felix Morger, Språkbanken Text, University of Gothenburg
Joakim Nivre, RISE and Uppsala University
Magnus Sahlgren, AI Sweden
Sara Stymne, Uppsala University
Jörg Tiedemann, University of Helsinki
Lilja Øvrelid, University of Oslo
resourceful-2023(a)listserv.gu.se
********************************************************************************
Second Call for Papers
19th Workshop on Multiword Expressions (MWE 2023)
Organized and sponsored by SIGLEX, the Special Interest Group
on the Lexicon of the ACL
Full-day workshop collocated with EACL 2023, Dubrovnik, Croatia, May 2 or
6, 2023
Hybrid (on-site & on-line)
Submission deadline: February 13, 2023
NEW: ARR commitment deadline: March 6, 2023
NEW: Special track on MWEs in Clinical NLP (see below)
NEW: Best paper award (see below)
MWE 2023 website: https://multiword.org/mwe2023/
********************************************************************************
Multiword expressions (MWEs) are word combinations that exhibit lexical,
syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (Baldwin
& Kim 2010), such as by and large, hot dog, pay a visit and pull one's leg.
The notion encompasses closely related phenomena: idioms, compounds,
light-verb constructions, phrasal verbs, rhetorical figures, collocations,
institutionalised phrases, etc. Their behaviour is often unpredictable; for
example, their meaning often does not result from the direct combination of
the meanings of their parts. Given their irregular nature, MWEs often pose
complex problems in linguistic modelling (e.g. annotation), NLP tasks (e.g.
parsing), and end-user applications (e.g. natural language understanding
and MT), hence still representing an open issue for computational
linguistics (Constant et al. 2017).
For almost two decades, modelling and processing MWEs for NLP has been the
topic of the MWE workshop, organised since 2003 by the MWE section of SIGLEX
in conjunction with major NLP conferences. Impressive progress has
been made in the field, but our understanding of MWEs still requires much
research considering their need and usefulness in NLP applications. This is
also relevant to domain-specific NLP pipelines that need to tackle
terminologies most often realised as MWEs. Following previous years, for
this 19th edition of the workshop, we identified the following topics on
which contributions are particularly encouraged:
MWE processing and identification in specialized languages and domains:
Multiword terminology extraction from domain-specific corpora (Bonin et al.
2010) is of particular importance to various applications, such as MT
(Semmar & Laib, 2017), or for the identification and monitoring of
neologisms and technical jargon (Chatzitheodorou et al., 2021). We expect
that approaches dealing with the processing of MWEs and those dealing with
the processing of terminology in specialised domains can benefit from each
other.
MWE processing to enhance end-user applications: MWEs have gained
particular attention in end-user applications, including MT (Zaninello &
Birch 2020; Han et al. 2021), simplification (Kochmar et al. 2020),
language learning and assessment (Paquot et al. 2019; Christiansen & Arnon
2017), social media mining (Maisto et al. 2017), and abusive language
detection (Zampieri et al. 2020; Caselli et al. 2020). We believe that it
is crucial to extend and deepen these first attempts to integrate and
evaluate MWE technology in these and further end-user applications.
MWE identification and interpretation in pre-trained language models:
Most current MWE processing is limited to their identification and
detection using pre-trained language models, but we still lack
understanding about how MWEs are represented and dealt with therein
(Nedumpozhimana & Kelleher 2021; Garcia et al. 2021; Fakharian & Cook
2021), and how to better model the compositionality of MWEs from semantics
(Moreau et al. 2018). Now that NLP has shifted towards end-to-end neural
models like BERT, capable of solving complex tasks with little or no
intermediary linguistic symbols, questions arise about the extent to which
MWEs should be implicitly or explicitly modelled (Shwartz & Dagan, 2019).
MWE processing in low-resource languages: The PARSEME shared tasks
(Ramisch et al. 2020; 2018; Savary et al. 2017), among others, have
fostered significant progress in MWE identification, providing datasets
that include low-resource languages, evaluation measures, and tools that
now allow fully integrating MWE identification into end-user applications.
A few efforts have recently explored methods for the automatic
interpretation of MWEs (Bhatia et al. 2018; 2017), and their processing in
low-resource languages (Liu & Wang 2020; Kumar et al. 2017). Resource
creation and sharing should be pursued in parallel with the development of
methods able to capitalize on small datasets (Han et al. 2020).
Through this workshop, we would like to bring together and encourage
researchers in various NLP subfields to submit MWE-related research, so
that approaches that deal with processing of MWEs including processing for
low-resource languages and for various applications can benefit from each
other. We also intend to consolidate the converging effects of previous
joint workshops LAW-MWE-CxG 2018, MWE-WN 2019 and MWE-LEX 2020, the joint
MWE-WOAH panel in 2021, and the MWE-SIGUL 2022 joint session, extending our
scope to MWEs in e-lexicons and WordNets, MWE annotation, as well as
grammatical constructions. Correspondingly, we call for papers on research
related (but not limited) to MWEs and constructions in:
• Computationally-applicable theoretical work in psycholinguistics and
  corpus linguistics;
• Annotation (expert, crowdsourcing, automatic) and representation in
  resources such as corpora, treebanks, e-lexicons, and WordNets (also for
  low-resource languages);
• Processing in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG,
  LFG, TAG, UD, etc.);
• Discovery and identification methods, including for specialized
  languages and domains such as clinical or biomedical NLP;
• Interpretation of MWEs and understanding of text containing them;
• Language acquisition, language learning, and non-standard language
  (e.g. tweets, speech);
• Evaluation of annotation and processing techniques;
• Retrospective comparative analyses from the PARSEME shared tasks;
• Processing for end-user applications (e.g. MT, NLU, summarisation,
  language learning, etc.);
• Implicit and explicit representation in pre-trained language models and
  end-user applications;
• Evaluation and probing of pre-trained language models;
• Resources and tools (e.g. lexicons, identifiers) and their integration
  into end-user applications;
• Multiword terminology extraction;
• Adaptation and transfer of annotations and related resources to new
  languages and domains including low-resource ones.
Shared Task
We do not have a shared task this year, but a new release of the PARSEME
corpus of verbal MWEs is currently underway. We encourage submission of
research papers that include analyses of the new edition of the PARSEME
data and improvements over the results of the PARSEME 2020 shared task as
well as SemEval 2022 Task 2 on idiomaticity prediction.
*** Special Track on MWEs in Clinical NLP ***
Pursuing the MWE Section’s tradition of synergies with other communities,
this year, we are organizing a joint session with the Clinical NLP workshop
for shared papers/poster presentations. Since clinical texts contain a
substantial number of multiword expressions (e.g. medical terms or
domain-specific collocations), a joint session should benefit both
communities. The goal is to foster future synergies that could address
scientific challenges in the creation of resources, models and applications
to deal with multiword expressions and related phenomena in the specialised
domain of Clinical NLP. Submissions describing research on MWEs in the
specialized domain of Clinical NLP, especially those introducing new datasets or
new tools and resources, are welcome. Papers accepted in this track will
have the option to present their work in the Clinical NLP workshop at ACL
2023 as well, after being presented at MWE 2023.
Best paper award
All full papers in the workshop will be considered by the program committee
for a best paper award.
Submission formats
The workshop invites two types of submissions:
• archival submissions that present substantially original research, in
  either long paper format (8 pages + references) or short paper format (4
  pages + references);
• non-archival submissions of abstracts describing relevant research
  presented/published elsewhere, which will not be included in the MWE
  proceedings.
Paper submission and templates
Papers should be submitted via the workshop's START submission page (link
will be provided once available). Please choose the appropriate submission
format (archival/non-archival). Submissions must follow the ACL 2023
stylesheet.
Archival papers with existing reviews from ACL Rolling Review (ARR) will
also be considered. A paper may not be simultaneously under review through
ARR and MWE. A paper that has received or will receive reviews through ARR
may not be submitted for review to MWE.
Important Dates
Paper submission: February 13, 2023
ARR paper commitment: March 6, 2023
Notification of acceptance: March 13, 2023
Camera-ready papers due: March 27, 2023
Workshop: May 2 or 6, 2023
All deadlines are at 23:59 UTC-12 (Anywhere on Earth)
Organizing Committee
Program chairs: Marcos Garcia, Voula Giouli, Lifeng Han, Shiva Taslimipoor
Publication chair: Archna Bhatia
Publicity chair: Kilian Evang
Anti-harassment policy
The workshop follows the ACL anti-harassment policy.
Contact
For any inquiries regarding the workshop, please send an email to the
Organizing Committee at mweworkshop2023(a)googlegroups.com.
We are delighted to announce that we have released the consolidated version
of the Chilean Waiting List corpus. This dataset comprises 9,000 clinical
referrals in Spanish, annotated with ten entity types (almost half nested),
relations, and attributes. For more details, refer to the papers published
at ACM Healthcare (https://lnkd.in/dJskpprV) and the EMNLP conference
(https://lnkd.in/dPt6RFsj). The corpus is available through the following
resources:
1. Zenodo (https://lnkd.in/dWfF_Cj6): Here, we make available the corpus in
its original version (the referrals in text file format and the annotations
following the standoff format). In addition, we transformed these files
into the CoNLL format, which is the most suitable format for performing NER
experiments.
2. Papers with code (https://lnkd.in/dsAw3Npt): This page contains the
benchmark of the dataset, including references to the NER models tested to
date. In particular, we published the first results on our corpus for the
nested named entity recognition task at COLING’s main conference. Please
refer to the following link: https://lnkd.in/dHnnA3aV.
3. Hugging Face (https://huggingface.co/plncmm): To facilitate the testing
of transformer-based models, we have made available seven NER datasets on
Hugging Face, one for each entity type (disease, medication, body part,
finding, abbreviation, family member, and procedure). Here is a simple
notebook showing how to load these datasets: https://lnkd.in/dVddWXux.
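For readers who want to work with the CoNLL version of the corpus directly, the following is a minimal Python sketch of how such files are commonly read. It is not taken from the corpus release: it assumes a whitespace-separated layout with the token in the first column and a BIO tag in the last column, with blank lines between sentences, and the exact column layout of the Zenodo files should be checked before use. The function names and the entity labels in the usage note are hypothetical. A single tag column can only encode flat entities, so nested annotations would need one column or file per entity type.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # (token, BIO tag) pairs


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group CoNLL-style lines into sentences of (token, tag) pairs.

    Assumption: token in the first column, tag in the last column,
    blank lines separate sentences.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence: Sentence) -> List[Tuple[str, str]]:
    """Collect flat (entity text, label) spans from BIO tags."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    label = ""
    for token, tag in sentence:
        if tag.startswith("B-"):
            # A B- tag always starts a new span.
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [token], tag[2:]
        elif tag.startswith("I-") and tokens and tag[2:] == label:
            # An I- tag extends the open span of the same label.
            tokens.append(token)
        else:
            # "O" (or an inconsistent I- tag) closes any open span.
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [], ""
    if tokens:
        entities.append((" ".join(tokens), label))
    return entities
```

As a usage example, passing the lines "Paciente O", "con O", "diabetes B-Disease", "mellitus I-Disease" to read_conll yields one sentence, and extract_entities on that sentence returns the single span ("diabetes mellitus", "Disease"). For a file on disk, read_conll can be given an open file handle, since it only iterates over lines.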
Contact: wassa2023(a)googlegroups.com
Website: https://wassa-workshop.github.io/
BACKGROUND AND ENVISAGED SCOPE
Subjectivity and Sentiment Analysis has become a highly developed research area, ranging from binary classification of reviews to the detection of complex emotion structures between entities found in text. This field has expanded both on a practical level, finding numerous successful applications in business, as well as on a theoretical level, allowing researchers to explore more complex research questions related to affective computing. Its continuing importance is also shown by the interest it generates in other disciplines such as Economics, Sociology, Psychology, Marketing, Crisis Management & Digital Humanities.
The aim of WASSA 2023 is to bring together researchers working on Subjectivity, Sentiment Analysis, Emotion Detection and Classification and their applications to other NLP or real-world tasks (e.g. public health messaging, fake news, media impact analysis, social media mining, computational literary studies) and researchers working on interdisciplinary aspects of affect computation from text. For this edition, we encourage the submission of long and short research and demo papers including, but not restricted to, the following topics:
• Resources for subjectivity, sentiment, emotion and social media analysis
• Opinion retrieval, extraction, categorization, aggregation and summarization
• Humor, Irony and Sarcasm detection
• Mis- and disinformation analysis and the role of affective attributes
• Aspect and topic-based sentiment and emotion analysis
• Analysis of stable traits of social media users, incl. personality analysis and profiling
• Transfer learning for domain, language and genre portability of sentiment analysis
• Modelling commonsense knowledge for subjectivity, sentiment or emotion analysis
• Improvement of NLP tasks using subjectivity and/or sentiment analysis
• Intrinsic and extrinsic evaluation of subjectivity and/or sentiment analysis
• The role of emotions in argument mining
• Application of theories from related fields to subjectivity and sentiment analysis
• Multimodal emotion detection and classification
• Applications of sentiment and emotion mining
• Public sentiments and communication patterns of public health emergencies.
We furthermore encourage submissions to the special theme Ethics in Affective Computing, including opinion papers, as well as experimental papers. This includes the following topics, but is not limited to them:
• Which properties of a model render an automatic analysis task unethical?
• Which characteristics of an annotation task need to be taken into account in ethical considerations?
• What are appropriate methods to analyze data and models from an ethical perspective?
• What aspects are particularly important for affective analysis tasks, in contrast to other NLP settings?
IMPORTANT DATES
April 24, 2023 – Submission deadline for main workshop papers.
May 1, 2023 – Commitment deadline for submitting through ARR with reviews.
May 22, 2023 – Notification of acceptance.
June 6, 2023 – Camera-ready papers due.
June 12, 2023 – Pre-recorded video due.
July 13 or 14, 2023 – Workshop.
Note that the shared tasks follow a different timeline that will be communicated separately.
SUBMISSION
At WASSA 2023, we will accept four types of submissions: long, short, ARR commitments, and industry track demo papers. For the regular research track, we accept long and short papers. Submission is electronic, through the OpenReview portal for the workshop, with a deadline of April 24, 2023. Both long and short papers must be anonymised for double-blind reviewing, must follow the ACL Author Guidelines, and must use the ACL 2023 templates available on the ACL Rolling Review website. The submitting author must have an OpenReview profile. Please ensure profiles are complete before submission.
Long: Long papers may consist of up to eight (8) pages of content, with any number of additional pages of references, and will be presented orally.
Short: Short papers may consist of up to four (4) pages of content, with two (2) additional pages of references, and will be presented either orally or as a poster.
ARR Commitments: Additionally, we accept double submissions and double commitment of ARR reviews in parallel to WASSA and another venue. Please note that you must immediately withdraw your paper from WASSA if you decide to publish it elsewhere. ARR papers must be committed to the workshop (together with their reviews) no later than May 1, 2023.
Industry Demos: We also include an industry track, for which we accept demo papers that describe system demonstrations, ranging from early prototypes to mature production-ready systems. Please note: Commercial sales and marketing activities are not appropriate for this track. Demo papers may consist of up to six (6) pages of content; they will be presented as a poster and should include a live demonstration.
Additionally, system description papers from the shared tasks will be presented either orally or as a poster.
SHARED TASK
Following the success of the shared tasks organized in 2017, 2018, 2021 and 2022, we will continue our line of shared tasks. We will propose a first shared task on Empathy Detection and Emotion Classification in conversation at the speech-turn level, and a second shared task on multi-class and multi-label emotion classification on code-mixed (Roman Urdu + English) text messages. The tasks and deadlines will be communicated in due time. Keep a close eye on the workshop website for more details: https://wassa-workshop.github.io/
ORGANIZERS
Jeremy Barnes, IXA group, University of the Basque Country UPV/EHU
Orphée De Clercq, LT3 Language and Translation Technology Team, Ghent University
Roman Klinger, Institut für Maschinelle Sprachverarbeitung, University of Stuttgart, Germany
Valentin Barriere, Centro Nacional de Inteligencia Artificial
Shabnam Tafreshi, University of Maryland: ARLIS
João Sedoc, Technology, Operations, and Statistics department, New York University
Iqra Ameer, Yale University