11th Workshop on the Challenges in the Management of Large Corpora (CMLC)
The next meeting of CMLC will be held as part of Corpus Linguistics 2023
<https://wp.lancs.ac.uk/cl2023/> in Lancaster, UK, on the 2nd of July,
2023.
See https://corpora.ids-mannheim.de/cmlc-2023.html for up-to-date
information.
Important dates
* Deadline for abstract submission: the 3rd of May 2023 (Wednesday,
23:59 UTC)
* Notification of acceptance: the 19th of May 2023 (Thursday)
* Deadline for the submission of camera-ready papers: the 4th of June
2023 (Sunday)
* Meeting: Sunday, the 2nd of July 2023, 9.30-12.30 in George Fox LT2
(Lancaster University Campus)
Abstract submission
* We invite anonymised extended abstracts for /oral presentations/ on
the topics listed below (/ideally/ using the ACL-2023 templates
<https://2023.aclweb.org/calls/style_and_formatting/>, or PDF,
750-1000 words excluding references, font preferably 11 pt, line
spacing 1.5).
* CMLC has always reserved a track for national corpus project
reports, and to this end, we invite /poster proposals/ of 500-750
words. National project reports need not be anonymised.
Submissions are accepted through the EasyChair submission system,
at https://easychair.org/conferences/?conf=cmlc11.
Please note that each CMLC event produces a volume of proceedings
(published in Open Access before the meeting), where both oral and
poster contributions have equal status. /All/ final submissions to the
2023 proceedings volume will be expected to be formatted according to
the ACLPUB guidelines
<https://acl-org.github.io/ACLPUB/formatting.html> and to pass
the aclpubcheck <https://github.com/acl-org/aclpubcheck>.
Workshop description
The upcoming CMLC meeting continues the successful series of “Challenges
in the management of large corpora” events, previously hosted at LREC
(since 2012) and CL (since 2015) conferences. As in the previous
meetings, we wish to explore common areas of interest across a range of
issues in language resource management, corpus linguistics, natural
language processing, and data science.
Large textual datasets require careful design, collection, cleaning,
encoding, annotation, storage, retrieval, and curation to be of use for
a wide range of research questions and to users across a number of
disciplines. A growing number of national and other very large corpora
are being made available, many historical archives are being digitised,
numerous publishing houses are opening their textual assets for text
mining, and many billions of words can be quickly sourced from the web
and online social media.
A number of key themes and questions emerge of interest to the
contributing research communities: (a) what can be done to deal with IPR
and data protection issues? (b) what sampling techniques can we apply?
(c) what quality issues should we be aware of? (d) what infrastructures
and frameworks are being developed for the efficient storage,
annotation, analysis and retrieval of large datasets? (e) what
affordances do visualisation techniques offer for the exploratory
analysis approaches of corpora? (f) what kinds of APIs or other means of
access would make the corpus data as widely usable as possible without
interfering with legal restrictions? (g) how to guarantee that corpus
data remain available and usable in a sustainable way?
Motivation and topics of interest
This year’s event will cover the entire range of the standard CMLC
themes, with some new additions:
* New and hot topics
  o Language Models
    + What linguistic insights can we gain by post-hoc language
      model analysis in the age of ChatGPT?
    + How can we avoid the proliferation of stereotypes in terms
      of both linguistic surface form and content when using
      language models for linguistic analysis?
  o Societal and legal issues relevant for corpora and studies
    + political and sociological balance
    + social media bubbles, hate speech and fake news
    + proliferation of stereotypes via corpora and language models
    + corpora as archives of the past: evolution in mentalities or
      laws, personality rights
  o How to make corpora as accessible as possible despite big data
    issues, application heterogeneity, and IPR issues
    + What are the most interesting APIs and libraries to build,
      analyse and access very large corpora?
    + How can we get us researchers to use existing research
      tools, infrastructures, libraries and APIs in research and
      teaching?
* Linguistic content challenges
  o Dealing with the variety of language resources: multilinguality,
    historical texts, noisy OCR texts, user-generated content, etc.
  o Integration of human computation (crowdsourcing) and automatic
    annotation
  o Quality management of annotations
* Technical challenges
  o Storage and retrieval solutions for big textual data corpora:
    primary data, metadata, and annotation data
  o Scalable and efficient NLP tooling for annotating and analysing
    large datasets: distributed and GPGPU computing; using big data
    analysis frameworks for language processing
  o Dealing with streaming (e.g. Social Media) and rapidly changing
    underlying data
* Exploitation challenges
  o Legal and privacy issues
  o Query languages, data models, and standardisation
  o Licensing models of open and closed data, coping with
    intellectual property restrictions
  o Innovative approaches for aggregation and visualisation of text
    analytics
In the tradition of CMLC, we invite reports on national corpus
initiatives; submitters of these reports should be prepared to present a
poster along with a short presentation.
Programme Committee
Names are being added as Programme Committee members confirm their
participation.
* Laurence Anthony (Waseda University, Japan)
* Vladimír Benko (Slovak Academy of Sciences)
* Tomaž Erjavec (Jožef Stefan Institute, Ljubljana)
* Stephanie Evert (Friedrich-Alexander-Universität Erlangen-Nürnberg)
* Johannes Graën (University of Zurich, Switzerland)
* Andrew Hardie (Lancaster University, UK)
* Serge Heiden (ENS de Lyon)
* Dawn Knight (Cardiff University)
* Paweł Kamocki (IDS Mannheim)
* Natalia Kotsyba (Samsung Poland)
* Michal Křen (Charles University, Prague)
* Paul Rayson (Lancaster University)
* Martin Reynaert (Tilburg University)
* Kevin Scannell (Saint-Louis University)
* Marko Tadić (University of Zagreb, Faculty of Humanities and Social
Sciences)
Organising Committee
Institut für Deutsche Sprache, Mannheim
📩 Piotr Bański, Marc Kupietz, Harald Lüngen
Berlin-Brandenburg Academy of Sciences
📩 Adrien Barbaresi
Institute of Computational Linguistics, University of Zurich
Simon Clematide
Homepage
The CMLC series homepage is located at http://corpora.ids-mannheim.de/cmlc.html
Hello All,
*** Apologies for Cross-Posting ***
The First Arabic Natural Language Processing Conference (WANLP 2023)
Co-located with EMNLP 2023 in Singapore.
Conference URL: https://wanlp2023.sigarab.org/
We invite you to submit proposals for shared tasks to be run as part of
WANLP 2023. WANLP 2023 will run as a conference for the first time. WANLP
2023 builds on seven previous workshop editions, which have been extremely
successful, drawing large and active participation in various capacities.
With the move to a conference format, we aim to bring a larger
participation from the Arabic NLP community. The conference is organized
by the Special Interest Group on Arabic NLP (SIGARAB), an Association for
Computational Linguistics Special Interest Group on Arabic Natural Language
Processing.
Submission Details
The proposals should provide an overview of the proposed task, motivation,
data/resources (how the data will be collected), task description (what are
the tasks to be included), evaluation (proposed evaluation method for each
task), pilot run (if available), tentative timeline that matches the
submission dates below, and task organizers (name, affiliation). Proposals
(up to 4 pages) should be sent to: wanlp-shared-task-chair(a)sigarab.org
Please use the ACL template files:
https://2023.emnlp.org/calls/style-and-formatting/
Selection Process
The proposals will be reviewed by the organizing committee and will be
selected based on multiple factors such as the novelty of the task, the
expected interest from the community, how convincing the data collection
plans are, the soundness of the evaluation method, and the expected impact
of the task.
Task Organization
Upon acceptance, the task organizers are expected to verify that the task
organization and data delivery to participants are happening in a timely
manner, provide the participants with all needed resources related to the
task, create a mailing list and maintain communication and support to
participants, create and manage CodaLab or similar competition website,
manage submissions to CodaLab, write a task description paper, manage
participants submissions of system description papers, and review and
maintain the quality of submitted system description papers.
Important Dates
- May 7, 2023: submission of shared task proposals
- May 14, 2023: notification of acceptance of shared tasks
- September 5, 2023: conference paper & shared task papers due date
- October 12, 2023: notification of acceptance
- October 20, 2023: camera-ready papers due
- Conference Date (one day): TBD (timeframe: December 6-10)
All deadlines are 11:59 pm UTC -12h
<https://www.timeanddate.com/time/zone/timezone/utc-12> (“Anywhere on
Earth”).
If you have any questions, please contact us at:
wanlp-shared-task-chair(a)sigarab.org
The WANLP 2023 Organizing Committee
Best regards,
WANLP publicity chairs: Salam Khalifa and Amr Keleg
====
SEMANTiCS - 19th International Conference on Semantic Systems
Leipzig, Germany
Call for Tutorials
September 20 - 22, 2023
https://2023-eu.semantics.cc/page/cfp_ws
====
SEMANTiCS 2023 is a major venue for research and industrial innovation
and features a workshop and tutorial program addressing the diverse
practical interests of its audience. This program is intended to offer a
rich diversity of topics to conference attendees and local participants
seeking to pick up new skills and stay up-to-date regarding the latest
developments in the community. We encourage submissions of proposals on
all topics in the general areas of SEMANTiCS 2023 and proposals bridging
or introducing new perspectives in these areas.
=Important Dates for Tutorials (and other meetings, e.g. seminars,
show-cases, etc., without call for papers)=
* Tutorial Proposal Deadline: June 06, 2023 (11:59 pm, Hawaii time)
* Notification of Acceptance: June 20, 2023 (11:59 pm, Hawaii time)
Submission via Easychair on https://easychair.org/conferences/?conf=sem23
=Scope & Goals=
Tutorials at SEMANTiCS 2023 allow your organisation or project to
advance and promote your topics and gain increased visibility. The
tutorials will be announced on the SEMANTiCS website and they will be
seen by all participants. SEMANTiCS 2023 tutorials can be incubators for
industrial and scientific communities that form and share a particular
research and development agenda. They provide a forum for presenting
contributions and findings to a diverse and knowledgeable community.
Furthermore, the event can be used as a dissemination activity in the
scope of large research projects or as a closed format for
research/commercial project consortia meetings.
=Setup and Requirements=
SEMANTiCS 2023 tutorials may be either half or full day long. Tutorials
take place on the days before and/or after the main SEMANTiCS 2023 EU
conference (20th, 21st, and/or 22nd of September 2023). Details will be
communicated on time.
Organizers of tutorials will be granted three free tickets (only for the
workshop & tutorial day) for organization purposes or keynotes.
Participants of tutorials will be charged a marginal fee to cover the
basic costs.
Tutorial proposals must include the following information:
* outline of the themes and goals of the event, including a title and a
brief abstract (less than 200 words) intended for the SEMANTiCS 2023 website
* a statement addressing why the event is important, why the event is
timely, and how it is relevant to SEMANTiCS 2023 and the field of semantic
web; for tutorials, also why the presenters are qualified to give a
high-quality introduction to the topic
* a statement addressing the quality assurance criterion that will be
used for the tutorial presenters
* structure of the event and plans for generating and stimulating
discussion; how will the interaction be organized in case of a hybrid event
* desired minimum and maximum number of event participants, expected
number of participants, and (in case of previously held events) number
of registered attendees and web site for previous editions of the event
* a description of the intended audience and the expected learning outcomes
* desired prerequisite knowledge of the audience
* proposed duration of the event (i.e., half or full day), different
sessions if applicable (final time slot will be assigned in accordance
with the SEMANTiCS program)
* any equipment, room capacity, or other logistic constraints
* full contact information of all organizers of the event and main
contact person; a brief description of each organizer's background,
including relevant past experience in organizing events
Proposals for tutorials must be submitted via Easychair:
https://easychair.org/my/conference?conf=sem23
=Review and Evaluation Criteria=
Tutorial proposals will be reviewed by the SEMANTiCS 2023 Workshop &
Tutorial Chairs, as well as by the SEMANTiCS 2023 organizing committee,
according to the following criteria:
* The potential to advance the state of semantic web research and practice
* The quality assurance criterion proposed by the organizers to select
high-quality presenters for tutorials
* The organizers' experience and ability to lead a successful event
* Timeliness and expected interest in the event topics
* The balance and synergy between all SEMANTiCS 2023 events
=Topics of interest include (but are not limited to)=
* Web Semantics & Linked (Open) Data
* Enterprise Knowledge Graphs, Graph Data Management and Deep Semantics
* Machine Learning & Deep Learning Techniques
* Semantic Information Management & Knowledge Integration
* Terminology, Thesaurus & Ontology Management
* Data Mining and Knowledge Discovery
* Reasoning, Rules and Policies
* Natural Language Processing and Computational Linguistics
* Social and Human aspects of Semantic Web
* Data Quality Management and Assurance
* Explainable Artificial Intelligence
* Semantics in Data Science
* Semantics of Blockchain & Distributed Ledger Technologies
* Trust, Data Privacy, and Security with Semantic Technologies
* Economics of Data, Data Services and Data Ecosystems
* Applications of Semantic Web technologies in domains such as law,
medicine, life sciences, digital humanities, mobility and smart cities, etc.
We especially invite contributions that illustrate the applicability of
the topics mentioned above for industrial purposes and/or illustrate the
business relevance of their contribution for specific industries.
Workshop proposals on emerging themes for the topics listed above are
encouraged.
In case you have additional questions concerning the submission process,
please do not hesitate to contact us via Easychair.
We are looking forward to your contribution!
Jennifer D’Souza - jennifer.dsouza(a)tib.eu
Anisa Rula - anisa.rula(a)unibs.it
Workshop & Tutorial Chairs
*Apologies for cross-posting*
_____________________________________________________________________
EMit
Categorical Emotions Detection in Italian shared task at EVALITA 2023
Info: http://www.di.unito.it/~tutreeb/emit23/index.html
EVALITA 2023, the 8th evaluation campaign of Natural Language Processing
and Speech tools for Italian, 7-8 September 2023, Parma, Italy
Registration is required to obtain data and participate in the shared task.
Subscribe to the google group: emit_evalita2023(a)googlegroups.com
_____________________________________________________________________
CALL FOR PARTICIPATION
The detection of emotions in texts has a long history in international
evaluation campaigns (at SemEval in 2007, 2018 and 2019, or TASS 2020,
EmoEvalEs 2021, EmotionX 2018 and 2019, and WASSA 2022 and 2023), but has
never been addressed in EVALITA where the only shared task to deal with
emotions was about emotional speech recognition systems (ERT 2014).
In this context, the EMit (Emotions in Italian) task aims to provide the
first evaluation framework for categorical emotion detection in Italian
texts (with specific attention to the entertainment sector) and to make new
annotated data available to the community.
Task Description
EMit is organized according to two subtasks, both designed as multilabel
classification problems:
1. SUBTASK A: Categorical Emotion Detection (mandatory). The main proposed
subtask concerns the detection of emotions in social media messages about
TV shows broadcast by RAI (Radiotelevisione italiana, the national public
broadcasting company of Italy) and other out-of-domain texts.
2. SUBTASK B: Target Detection (optional). The second subtask is about the
detection of the target addressed by the author of the message: the topic
or the direction. For each text, it should be indicated whether the message
refers to what the broadcast is about (the topic) or whether it refers to
something that is under the control of the broadcast itself (the direction).
*Important Dates*
7th February 2023: training data available to participants
30th April 2023: registration closes
2nd-19th May 2023: evaluation window and collection of participants’ results
30th May 2023: assessment returned to participants
14th June 2023: final reports from task participants due to task organizers
25th July 2023: camera ready version deadline
7th-8th September 2023: final workshop in Parma
*Organizers*
Oscar Araque: Universidad Politécnica de Madrid, Madrid, Spain
Simona Frenda: Università degli Studi di Torino, Turin, Italy
Debora Nozza: Università Bocconi, Milan, Italy
Viviana Patti: Università degli Studi di Torino, Turin, Italy
Rachele Sprugnoli: Università di Parma, Parma, Italy
If you have any enquiries/comments, contact us via:
emit_evalita2023(a)googlegroups.com
****NLPerspectives****
2nd Workshop on Perspectivist Approaches to Disagreement in NLP (and Beyond)
https://nlperspectives.di.unito.it/w/2nd-workshop-on-perspectivist-approach…
Until recently, the dominant paradigm in natural language processing (and
other areas of artificial intelligence) has been to resolve observed label
disagreement into a single “ground truth” or “gold standard” via
aggregation, adjudication, or statistical means. However, in recent years,
the field has increasingly focused on subjective tasks, such as abuse
detection or quality estimation, in which multiple points of view may be
equally valid, and a unique ‘ground truth’ label may not exist (Plank,
2022). At the same time, as concerns have been raised about bias and
fairness in AI, it has become increasingly apparent that an approach which
assumes a single “ground truth” can erase minority voices.
Strong perspectivism in NLP (Cabitza et al., 2023) pursues the spirit of
recent initiatives such as Data Statements (Bender and Friedman, 2018),
extending their scope to the full NLP pipeline, including the aspects
related to modelling, evaluation and explanation.
In line with the first edition <https://nlperspectives.di.unito.it/w/w2022/>,
the NLPerspectives (Perspectivist Approaches to Disagreement in NLP)
workshop will explore current and ongoing work on: the collection and
labelling of non-aggregated datasets; and approaches to modelling and
including these perspectives, as well as evaluation and applications of
multi-perspective Machine Learning models. We also welcome opinion pieces
and literature reviews, e.g., in the context of fairness and inclusion.
A key outcome of this second edition will be to build on the work begun at
https://pdai.info/ to create a repository of perspectivist datasets with
non-aggregated labels for use by researchers in perspectivist NLP
modelling.
Authors are, therefore, invited to share their LRs (data, tools, services,
etc.) and provide essential information about resources (i.e., also
technologies, standards, evaluation kits, etc.) that have been used for the
work or are a result of their research. In addition, authors will be
required to adhere to ethical research policies on AI and may include an
ethics statement in their papers.
The NLPerspectives workshop will be hosted in person during the 26th
edition of ECAI 2023 <https://ecai2023.eu/> in Kraków, Poland, on 30
September or 1 October 2023.
Submissions
Contributions must not exceed 7 pages (4 for research communications, see
below), not including references; as established by the ECAI 2023
conference, overlength submissions will be rejected without review.
The papers should be submitted as a PDF document, conforming to the
formatting guidelines provided in the call for papers of the ECAI 2023
conference: https://ecai2023.eu/ECAI2023
We accept three types of submissions:
- Regular research papers;
- Non-archival submissions: like research papers, but will not be included
in the proceedings;
- Research communications: 4-page abstracts summarising relevant research
published elsewhere.
Topics
We invite original research papers from a wide range of topics, including
but not limited to:
- Non-aggregated data collection and annotation frameworks
- Descriptions of corpora collected under the perspectivist paradigm
- Multi-perspective Modelling and Machine Learning
- Evaluation of multi-perspective models / models of disagreement
- Multi-perspective disagreement as applied to NLP evaluation
- Fairness and inclusive modelling
- Perspectivist approaches for social good
- Applications of multi-perspective modelling
- Computing with (dis)agreement
- Perspectivist Natural Language Generation
- Foundational aspects of perspectivism
- Opinion pieces and reviews on perspectivist approaches to NLP
Submissions are open to all, and are to be submitted anonymously (and must
conform to the instructions for double-blind review). All papers will be
refereed through a double-blind peer review process by at least three
reviewers, with final acceptance decisions made by the workshop organisers.
Scientific papers will be evaluated based on relevance, significance of
contribution, impact, technical quality, scholarship, and quality of
presentation.
More information about the submission, publication of proceedings and date
of the workshop will be provided soon. We are seeking sponsors in order to
provide financial support for conference registration, travel, and
accommodation for participants.
Attendance
At least one author of each accepted paper is required to participate in
the conference and present the work.
Important Dates
* Friday June 23, 2023: Paper submission
* Friday August 4, 2023: Notification of acceptance
* Friday September 1, 2023: Camera-ready papers due
* Saturday September 30 or Sunday October 1, 2023: Workshop
Workshop organisers:
Gavin Abercrombie, Heriot-Watt University
Valerio Basile, University of Turin
Davide Bernardi, Amazon Alexa
Shiran Dudy, University of Colorado, Boulder
Simona Frenda, University of Turin
Lucy Havens, University of Edinburgh
Elisa Leonardelli, Fondazione Bruno Kessler
Sara Tonelli, Fondazione Bruno Kessler
Contact us at g.abercrombie(a)hw.ac.uk if you have any questions.
Website: https://nlperspectives.di.unito.it/
2nd Call for Papers: 'TwinTalks 4: Understanding and Facilitating Remote Collaboration in DH'
The workshop is a joint initiative by the European Social Sciences and Humanities Research Infrastructures CLARIN <http://www.clarin.eu> and DARIAH <https://www.dariah.eu/> and it will be organised as part of the DH 2023 Collaboration and Opportunity Conference <https://dh2023.adho.org/> that will take place on July 10-14 in Graz, Austria.
Dates and Location
Main conference: 10-14 July, Messe Congress Graz convention centre <http://www.mcg.at/messegraz.at/en/locations/messecongress-graz/veranstalter…>
TwinTalks workshop: 10 July, 9:00 - 12:30, University of Graz <https://www.uni-graz.at/en/>
Important Dates
* 15 March 2023: Call for Papers
* 15 May 2023: Submission deadline
* 15 June 2023: Notification of acceptance
* 30 June 2023: Deadline for the final version of extended abstracts
* 10 July 2023 (9:00 - 12:30): Workshop
Workshop Aims
The main objective of the workshop is to develop a better understanding of the dynamics on the Digital Humanities work floor when researchers, teachers and/or professionals with different – but often overlapping – areas of competence engage in remote collaboration to solve humanities research questions, and to explore how education and training of humanities scholars, cultural heritage professionals and technical experts can help to make remote collaboration across disciplines more efficient and effective, more creative and innovative, and more inclusive and rewarding for all participants.
To this end, we invite submissions reporting on all aspects and stages of engaging in remote collaborative research and teaching in DH, including the obstacles encountered and solutions found. We also welcome position papers on the role of research infrastructures in facilitating remote collaboration in DH.
The insights gained should help those involved in the education of humanities scholars, professionals and technical experts alike to develop better training programmes, tailored towards the needs of a diverse group of potential learners.
The workshop is a follow-up to three previous successful TwinTalks workshops that have taken place at various DH conferences from 2019 onwards (TwinTalks 1 proceedings <https://ceur-ws.org/Vol-2365/>; TwinTalks 1 blog <http://www.parthenos-project.eu/clarin-and-parthenos-twintalks>; TwinTalks 2+3 proceedings <https://ceur-ws.org/Vol-2717/>).
Audience
Researchers, cultural heritage professionals, educators, scientific programmers, research infrastructure operators and policy-makers with a special interest in creating the conditions where people with humanities research skills and technical expertise (or both) can fruitfully collaborate in answering humanities research questions remotely.
Workshop Format
The programme starts with an invited talk by a prominent speaker, which will set the scene for the rest of the day. The main component of the workshop programme consists of two types of (submitted) talks:
* Twin talks, i.e. talks presented by pairs or teams consisting of someone rooted primarily in humanities research (with a humanities research problem, i.e. not a technical problem or tool), someone with a background in a totally different discipline (e.g. technical) who has contributed their specific capabilities to arrive at the answers, and/or a cultural heritage professional whose collection knowledge has contributed to the development of the research corpus. Talks will usually consist of three parts, followed by questions from the audience: In the first part, the humanities research question is the point of focus, while in the second part, it is shown how the joint effort resulted in an answer to the respective question. In the third part, these perspectives come together, as the team describes how the remote collaboration went, including obstacles that were encountered, and how better training and education could help to make remote collaboration more efficient and effective.
* Teach talks by people with experience with or interesting ideas about how remote cross-discipline collaboration is or can be addressed in curricula or other training activities.
Submissions
The language of the workshop is English.
What we expect from the submissions for the Twin Talks track:
* They are authored and presented by one or more humanities scholars and one or more digital experts
* They start from a humanities research question (i.e. not a technical question, a presentation of a tool, a platform or a data collection)
* They describe the remote research carried out jointly and its results
* They describe the technical aspects of the methods used and the results obtained
* They analyse the way the scholars and the technicians collaborated remotely, addressing issues such as (but not limited to):
  * What was easy and what was difficult, and why?
  * How did the researchers, technicians or cultural heritage professionals change each other’s way of looking at things?
  * Did they, for instance, make each other aware of blind spots they had?
  * Did the combination of thinking from a DH research question and thinking from a technical solution lead to new insights?
  * How could better training or education of scholars and digital experts make remote collaboration easier, more effective and more efficient?
With regard to the Teach Talks track, a single author and presenter is sufficient. Of course, multi-author papers are equally welcome.
Submission instructions
* Format: PDF. For format instructions, see http://ceur-ws.org/Vol-XXX/CEURART.zip
* Size: Extended abstracts of ca. 4-8 pages (2000-4000 words), covering research questions and answers, technical aspects and collaboration experience for Twin Talks, or relevant educational experience for Teach Talks.
* Publication: The workshop proceedings will be published at CEUR-WS <https://ceur-ws.org/>.
* Submission URL: https://easychair.org/conferences/?conf=twintalksdh2023
Workshop Programme Committee
* Bente Maegaard (CLARIN ERIC / University of Copenhagen, Denmark)
* Barbara McGillivray (King's College London & The Alan Turing Institute, UK)
* Benjamin Wiggins (University of Manchester, UK)
* Eleni Gouli (Academy of Athens, Greece)
* Francesca Frontini (CNR, Italy & CLARIN ERIC)
* Frank Uiterwaal (EHRI / NIOD / KNAW, Netherlands)
* Folgert Karsdorp (Meertens Institute, KNAW, Netherlands)
* Geoffrey Rockwell (University of Alberta, Canada)
* Hitoshi Isahara (Center for IT-Based Education, Japan)
* Jennifer Edmond (Trinity College Dublin, Ireland)
* Koenraad De Smedt (CLARINO, University of Bergen, Norway)
* Maria Gavrilidou (Institute for Language and Speech Processing, Athens, Greece)
* Menno Van Zaanen (South African Centre for Digital Language Resources, South Africa)
* Milena Dobreva (Sofia University “St. Kliment Ohridski”, Bulgaria)
* Mikko Tolonen (University of Helsinki, Finland)
* Radim Hladik (Academy of Sciences, Czech Republic)
* Ulrike Wuttke (University of Applied Sciences Potsdam, Germany)
* Vicky Garnett (Trinity College Dublin, Ireland)
Chairs and Organisers
The workshop is a joint initiative by European SSH Research Infrastructures CLARIN (http://www.clarin.eu/) and DARIAH (https://www.dariah.eu/).
* Steven Krauwer (CLARIN ERIC / Utrecht University, Netherlands)
* Darja Fišer (CLARIN ERIC / Institute of Contemporary History, Slovenia)
* Iulianna van der Lek-Ciudin (CLARIN ERIC, Netherlands)
* Sally Chambers (DARIAH-EU / Ghent Centre for Digital Humanities, Belgium)
* Agiatis Benardou (DARIAH-EU / Digital Curation Unit, ATHENA R.C., Athens, Greece)
Contact Information
For any questions, please contact Iulianna van der Lek at events(a)clarin.eu.
—
Elisa Gorgaini
CLARIN ERIC External Relation Officer
elisa(a)clarin.eu
+31648213015
www.clarin.eu
*Call for Papers: The 1st International Workshop on Implicit Author
Characterization from Texts for Search and Retrieval (IACT’23) *
The workshop will be held in conjunction with the 46th International ACM
SIGIR Conference on Research and Development in Information Retrieval
Workshop website: https://en.sce.ac.il/news/iact23
July 27, 2023. Taipei, Taiwan.
Paper submission deadline: extended from April 25 to May 2, 2023 (AoE)
Submission link: https://easychair.org/conferences/?conf=iact23
To bring the research community's attention to the limitations of current
models in recognizing and characterizing AI vs. human authors, we organize
the first edition of IACT workshops under the umbrella of the SIGIR
conference. Research works submitted to the workshop should foster
scientific advances in all aspects of author characterization.
Organizing Committee:
- Marina Litvak - marinal(a)ac.sce.ac.il; Shamoon College of Engineering;
Beer Sheva, Israel
- Irina Rabaev - irinar(a)ac.sce.ac.il; Shamoon College of Engineering;
Beer Sheva, Israel
- Alípio Mário Jorge - amjorge(a)fc.up.pt; University of Porto; Porto,
Portugal
- Ricardo Campos - ricardo.campos(a)ipt.pt; Polytechnic Institute of Tomar
INESC TEC, Portugal; Porto, Portugal
- Adam Jatowt - adam.jatowt(a)uibk.ac.at; University of Innsbruck;
Innsbruck, Austria
Invited Speakers:
- Prof. Mark Last - Ben-Gurion University of the Negev, Israel
- Prof. Dr. Valia Kordoni - Humboldt-Universität Berlin, Germany
IACT’23 proceedings will be published in the CEUR Workshop Proceedings
(indexed in Scopus and DBLP), as long as they do not conflict with previous
publication rights.
Contact:
- Dr. Marina Litvak: litvak.marina(a)gmail.com
- Dr. Irina Rabaev: irinar(a)ac.sce.ac.il
--
Best regards,
Marina Litvak
FINAL CALL FOR PAPERS
NLP-TeMA’23 will be held at the 22nd Portuguese Conference on Artificial
Intelligence (EPIA 2023), taking place in Horta, Faial Island, Azores,
on September 5-8, 2023. This track is organized under the auspices
of the Portuguese Association for Artificial Intelligence (APPIA) and is part
of the EPIA 2023 Conference on Artificial Intelligence, URL:
https://epia2023.inesctec.pt/
This announcement contains the following: [1] Track description; [2] Topics
of interest; [3] Important dates; [4] Paper submission; [5] Track fees; [6]
Organizing Committee; and [7] Program Committee.
[1] Track Description
The NLP-TeMA 2023 track is a forum for researchers working in Human
Language Technologies, i.e., Natural Language Processing (NLP),
Computational Linguistics (CL), Natural Language Engineering (NLE), Text
Mining (TM), Information Retrieval (IR), and related areas.
A huge amount of information is openly published every day, on many
different topics and written in natural language, thus offering new
insights and many opportunities for innovative applications of Human
Language Technologies.
Following advances in AI sub-fields such as NLP, Machine Learning (ML) and
Deep Learning (DL), NLP and TM are now even more valuable for bridging the
gap between language theories and effective use of natural language
contents, for harnessing the power of semi-structured and unstructured
data, and for enabling important applications in real-world heterogeneous
environments. Both hidden and new knowledge can be discovered by using NLP
and TM methods, at multiple levels and in multiple dimensions, and often
with high commercial value.
Authors are invited to submit their papers on any of the topics listed in
section [2]. Submitted papers will be subject to a double-blind review
process and will be peer-reviewed by at least three members of the track
Program Committee. It is the responsibility of the authors to remove names
and affiliations from the submitted papers, and to take reasonable care to
ensure anonymity during the review process. Accepted papers will be
included in the conference proceedings (a volume of Springer’s LNAI, Lecture
Notes in Artificial Intelligence), provided that at least one author is
registered for EPIA 2023 by the early registration deadline. EPIA 2023
proceedings are indexed in Thomson Reuters ISI Web of Science, Scopus, DBLP
and Google Scholar. Each accepted paper must be presented by one of the
authors in a track session.
The conference will grant the following awards:
* Best Paper Award, for the best research paper presented at the conference.
* Best Student Paper Award, for the best research paper presented at the
conference where the first author is a student.
[2] Topics of Interest
Natural Language Processing
• Language and Cognitive Modeling
• Sentence-level Semantics and Text Inference
• Language Resources: Acquisition and Usage
• Entailment and Paraphrase Recognition
• Entity Recognition and Word Sense Disambiguation
• Distributional Models and Semantics
• Mathematical Properties of Language
• Tagging, Chunking and Parsing
• Morphology and Word Segmentation
• Natural Language Generation
• Discourse and Pragmatics
• NLP for Low-Resource Languages
Text Mining and Applications
• Text Clustering, Classification and Summarization
• Sentiment Analysis and Argument Mining
• Computational Social Science
• Multi-Word Units
• Machine Learning for NLP and Text Mining
• Spatio-Temporal and Big Text Mining
• Cross-Lingual Approaches
• Algorithms and Data Structures for Text Mining
• Information Retrieval and Information Extraction
• Question-Answering and Dialogue Systems
• Text-Based Prediction and Forecasting
• Web Content Annotation
• Health/Biomedical/Legal and other Text Mining Applications
[3] Important dates
Paper submission: April 28, 2023 [EXTENDED DEADLINE]
Notification of paper acceptance: May 26, 2023
Camera-ready papers deadline: June 15, 2023
Conference dates: September 5-8, 2023
[4] Paper submission
Submissions must be full technical papers on substantial, original, and
previously unpublished research. Papers can have a maximum length of 12
pages. All papers should be prepared according to the Springer LNCS
formatting instructions and submitted in PDF format through
the EPIA 2023 EasyChair submission page
https://easychair.org/my/conference?conf=epia2023.
For the preparation of their papers, authors should consult Springer’s
authors’ guidelines and use their proceedings templates, either for LaTeX
or for Word. Springer encourages authors to include their ORCIDs in their
papers. In addition, the corresponding author of each paper, acting on
behalf of all of the authors of that paper, must complete and sign a
Consent-to-Publish form. The corresponding author signing the copyright
form should match the corresponding author marked on the paper. Once the
files have been sent to Springer, changes relating to the authorship of the
papers cannot be made.
[5] Track Fees:
Track participants must register at the main EPIA 2023 conference.
[6] Organizing Committee:
Joaquim Silva, jfs(a)fct.unl.pt, DI – FCT/UNL, Portugal (Contact person).
Pablo Gamallo, Pablo.gamallo(a)usc.es, Universidade de Santiago de
Compostela, Galiza/Spain.
Paulo Quaresma, pq(a)uevora.pt, DI – Universidade de Évora, Portugal.
Irene Rodrigues, ipr(a)uevora.pt, DI – Universidade de Évora, Portugal.
Hugo Gonçalo Oliveira, hroliv(a)dei.uc.pt, Universidade de Coimbra, Portugal.
[7] Program Committee:
Adam Jatowt – University of Kyoto, Japan
Alberto Simões – 2Ai Lab – IPCA
Alexandre Rademaker – IBM / FGV, Brazil
Antoine Doucet – University of Caen, France
Altigran Silva – Universidade Federal do Amazonas, Brazil
António Branco – Universidade de Lisboa, Portugal
Béatrice Daille – University of Nantes, France
Bruno Martins – Instituto Superior Técnico – Universidade de Lisboa,
Portugal
Fernando Batista – Instituto Universitário de Lisboa, Portugal
Gaël Dias – University of Caen Basse-Normandie
Hugo Gonçalo Oliveira – Universidade de Coimbra, Portugal
Irene Rodrigues – Universidade de Évora, Portugal
Jesús Vilares – University of A Coruña, Spain
Joaquim Ferreira da Silva – Faculdade de Ciências e Tecnologia –
Universidade Nova de Lisboa
Luísa Coheur – IST/INESC–ID Lisboa
Manuel Vilares Ferro – University of Vigo, Spain
Marcos Garcia – Universidade de Santiago de Compostela, Galiza/Spain
Mário Silva – Instituto Superior Técnico – Universidade de Lisboa, Portugal
Nuno Marques – Universidade Nova de Lisboa, Portugal
Pablo Gamallo – Universidade de Santiago de Compostela, Galiza/Spain
Paulo Quaresma – Universidade de Évora, Portugal
Pavel Brazdil – University of Porto, Portugal
Sophia Ananiadou – University of Manchester
Sérgio Nunes – Faculdade de Engenharia – Universidade do Porto, Portugal
----------
Hugo Gonçalo Oliveira
CISUC, Department of Informatics Engineering, University of Coimbra
http://eden.dei.uc.pt/~hroliv
Deadline extension: DISRPT 2023 - Shared Task on Discourse Relation Parsing and Treebanking
In conjunction with CODI 2023, ACL 2023 - 14 July 2023
News: The deadline has been extended, please consult the new timeline below
This year, we are organizing DISRPT 2023 as a shared task on discourse processing across formalisms, for a variety of languages and genres. It is the third iteration of a cross-formalism shared task on discourse analysis, with three subtasks:
* Task 1: discourse segmentation
* Task 2: connective identification
* Task 3: relation classification
We will provide training, development and test datasets from all available languages in RST, SDRT, PDTB and Discourse Dependencies using a uniform format. Because different corpora, languages, and frameworks use different guidelines, the shared task will promote the design of flexible methods for dealing with various guidelines, and will help to push forward the discussion of converging standards for discourse units, discourse relations and discourse markers. For datasets which have treebanks, we will evaluate segmentation in two different scenarios: with and without gold syntax. An automatically parsed version is provided for all corpora without a gold parse.
Shared Task Data and Formats
Data for the shared task is released via GitHub together with format documentation and tools: https://github.com/disrpt/sharedtask2023
See here for more information about the previous shared tasks:
- 2019: https://sites.google.com/view/disrpt2019/shared-task
- 2021: https://sites.google.com/georgetown.edu/disrpt2021/
Schedule:
* 25 January 2023 – Sample data released
* 22 February 2023 – Train / dev data release
* 17 April 2023 – Test data release
* 14 May 2023 (was: 8 May 2023) – Submission of system and paper
* 26 May 2023 (was: 22 May 2023) – Notification of acceptance
* 5 June 2023 (was: 1 June 2023) – Camera-ready paper due
* 13-14 July 2023 - CODI Workshop at ACL
Information:
Contact the organizers: disrpt_chairs(a)googlegroups.com
Official website: https://sites.google.com/view/disrpt2023/
Organization:
* Amir Zeldes (Georgetown University, Washington, DC, USA)
* Janet Liu (Georgetown University, Washington, DC, USA)
* Philippe Muller (IRIT, University of Toulouse, Toulouse, France)
* Chloé Braud (IRIT, CNRS, Toulouse, France)
* Laura Rivière (IRIT, University of Toulouse, Toulouse, France)
* Attapol Te Rutherford (Faculty of Arts, Chulalongkorn University, Bangkok, Thailand)
*apologies for cross-postings*
CODI, 4th Workshop on Computational Approaches to Discourse
https://sites.google.com/view/codi-2023/
2023-07-13–14 - ACL 2023 - Toronto, Canada
** Submission deadline extended: May 3, 2023 **
(Was: April 24, 2023)
Aims and scope
The last ten years have seen a dramatic improvement in the ability of NLP systems to understand and produce words and sentences. This development has created a renewed interest in discourse phenomena as researchers move towards the processing of long-form text and conversations. There is a surge of activity in discourse parsing, coherence models, text summarization, corpora for discourse level reading comprehension, and discourse related/aided representation learning, to name a few, but the problems in computational approaches to discourse are still substantial. At this juncture, we have organized three Workshops on Computational Approaches to Discourse (CODI) at EMNLP 2020, EMNLP 2021 and COLING 2022 to bring together discourse experts and upcoming researchers. These workshops have catalyzed work to improve the speed and knowledge needed to solve such problems and have served as a forum for the discussion of suitable datasets and reliable evaluation methods.
The previous workshops on discourse in machine translation (DiscoMT), linking lexical, sentential and discourse semantics (LSDSem), discourse structure in natural language generation (DSNNLG), discourse relation parsing and treebanking (DISRPT) and coreference (CORBON/CRAC) have shown that there is considerable interest and success in bringing together the community working on specific problems in discourse. We believe that the discourse community will also benefit from a general forum where work ranging from corpus development/analysis to computational models and evaluation is discussed, and desiderata can be drawn for future progress.
The 4th CODI workshop is planned as a two-day event which brings together different subcommunities. It will feature invited talks and regular papers on the first day. The second day will be dedicated to shared tasks and special sessions which focus on the issues mentioned above. After successful iterations in 2019 and 2021, the shared task on Discourse Relation Parsing and Treebanking (DISRPT) will be held again in 2023, with three tasks: discourse segmentation, discourse connective identification and discourse relation classification, including new datasets and languages. For more information on the shared task see:
https://sites.google.com/view/disrpt2023/
Topics of interest
We welcome symbolic and probabilistic approaches, corpus development and analysis, as well as machine and deep learning approaches to discourse. We appreciate theoretical contributions as well as practical applications, including demos of systems and tools. The goal of the workshop is to provide a forum for the community of NLP researchers working on all aspects of discourse.
Topics of interest include, but are not limited to:
* discourse structure
* discourse connectives
* discourse relations
* annotation tools and schemes for discourse phenomena
* corpora annotated with discourse phenomena
* discourse parsing
* cross-lingual discourse processing
* cross-domain discourse processing
* anaphora and coreference resolution
* event coreference
* argument mining
* coherence modeling
* discourse and semantics
* discourse in applications such as machine translation, summarization, etc.
* evaluation methodology for discourse processing
Submissions
We solicit four categories of papers: regular workshop papers, demos, shared task papers and extended abstracts. Only regular workshop long and short papers, shared task papers and demos will be included in the proceedings as archival publications, while extended abstracts will be non-archival (see below).
Regular papers must describe original unpublished research. Long papers may consist of up to 8 pages of content, plus unlimited pages for references.
Short papers can be up to 4 pages, plus unlimited pages for references.
Demo submissions may describe systems, tools, visualizations, etc., and may consist of up to 4 pages, plus unlimited pages for references.
Each submission can contain unlimited pages for Appendices but the paper submissions need to remain fully self-contained, as these supplementary materials are completely optional, and reviewers are not even asked to review them.
Accepted long, short, and demo papers will be presented orally.
Extended abstracts can describe work in progress or work already published elsewhere. These may be two pages long (without references). Extended abstracts are non-archival. They will be presented orally, and included in the workshop program and handbook, but will not appear in the workshop proceedings.
Double submission of papers is allowed but will need to be indicated at submission.
Submission website
All submissions must be anonymous and follow the ACL 2023 formatting instructions described here:
https://2023.aclweb.org/calls/style_and_formatting/
Please submit your workshop papers at https://www.softconf.com/acl2023/CODI2023
Shared task papers should be submitted to the links specified on the shared task pages.
Important dates
* 2023-03-24: Anonymity period starts
* 2023-05-03: CODI papers due (was: 04-24)
* 2023-05-29: Notification of acceptance (was: 05-22)
* 2023-06-08: Camera ready deadline
* 2023-07-13 – 2023-07-14: CODI workshop
All deadlines are 11:59 pm UTC -12h ("anywhere on Earth").
Invited Speakers
* Yufang Hou, IBM Research
* Giuseppe Carenini, University of British Columbia
Organizers
* Chloé Braud, CNRS-IRIT
* Christian Hardmeier, IT University of Copenhagen and Uppsala University
* Jessy Li, University of Texas, Austin
* Sharid Loáiciga, University of Gothenburg
* Michael Strube, Heidelberg Institute for Theoretical Studies
* Amir Zeldes, Georgetown University
To contact the organizers, please send an email to: codi-workshop(a)googlegroups.com