*MWE-UD2024 @LREC-COLING2024 Call for Sponsorships*
We are pleased to announce that the multiword expressions (MWE) and
Universal Dependencies (UD) research communities are joining forces in 2024
to organize a joint workshop MWE-UD2024 <https://multiword.org/mweud2024/>,
to be held as a full-day event collocated with LREC-COLING 2024
<https://lrec-coling-2024.org/about-lrec-coling/>, Torino, Italy, May 25,
2024.
The MWE WS, organised by the MWE section <https://multiword.org/> of SIGLEX
<http://www.siglex.org/> (Special Interest Group on the Lexicon of the ACL
<https://www.aclweb.org>), reached its 19th edition in 2023 and has been
organized within major NLP conferences since 2003. The UniDive COST action
(CA21167) <https://unidive.lisn.upsaclay.fr/doku.php?id=start> is an
interdisciplinary scientific network devoted to universality, diversity,
and idiosyncrasy in language technology. Both communities share an interest
in developing guidelines, data-sets, and tools that can be applied to a
wide range of typologically diverse languages, raising fundamental
questions about tokenization, lemmatization, and morphological
decomposition of tokens.
The joint workshop invites submissions of original research on MWE, UD, and
the interplay of both. In particular, MWE-UD2024 calls for submissions on
the following topics:
● *Sensitivity of large language models (LLMs) to MWEs and syntactic
dependencies*
● *Applicability of UD and MWE annotation and discovery for
low-resource and typologically diverse languages and language varieties*
● *Case studies* on the consistency, coverage or universal
applicability of MWE annotation in the UD or PARSEME frameworks, as well as
studies on automatic detection and interpretation of MWEs in corpora
● *MWE and UD processing to enhance end-user applications*, including
Machine Translation (MT), text simplification, language learning and
assessment, social media mining, and abusive language detection
● *Testing developed systems on the latest dataset versions*
MWE-UD2024 features both long and short research papers, archival and
non-archival papers, oral and poster presentations, “Best Paper Awards”,
and keynote talks by rising-star scientists, including Dr. N.G. (Natalia)
Levshina <https://www.ru.nl/en/people/levshina-n>, Assistant Professor at
the Centre for Language Studies, Department of Language and Communication,
Radboud Universiteit, NL, and Dr. Harish Tayyar Madabushi
<https://www.harishtayyarmadabushi.com/>, Lecturer in Artificial
Intelligence, University of Bath, UK.
MWE-UD is proud of its *gender balance* among STEM conferences: this year's
confirmed PC members (reviewers) comprise 26 women and 23 men, a ratio of
almost 50% each, from 22 countries across most continents.
MWE-UD2024 offers a variety of sponsorship opportunities suitable for
organizations of all sizes. Sponsors of the MWE-UD2024 workshop at
LREC-COLING gain *visibility* for their companies and institutes at an
*international and well-known venue* and contribute to the success of the
event.
Sponsorship will help increase in-person attendance and engagement at the
WS and broaden the diversity of attendees (e.g., regions, levels of
seniority, gender) through grants and reimbursements for delegates and
organizers (e.g., waiving registration fees), especially for those with
limited funds. Sponsors receive more visibility at the MWE-UD workshop and
the LREC-COLING conference by having their names and logos on our website,
having posters/desks/flags/leaflets at the venue, etc.
For more information about the levels of sponsorship available at
MWE-UD2024 and their benefits, please e-mail
mweud2024-organizers(a)uni-duesseldorf.de, including details of the contact
person (e.g. e-mail address) and the level of sponsorship desired.
*Organizing Committee*
Archna Bhatia (Institute for Human and Machine Cognition, USA)
Gosse Bouma (Groningen University, NL)
Kilian Evang (Heinrich Heine University Düsseldorf, DE)
Marcos Garcia (University of Santiago de Compostela, Galiza, Spain)
Voula Giouli (Institute for Language & Speech Processing, ATHENA RC,
Greece)
Lifeng Han (University of Manchester, UK)
Joakim Nivre (Uppsala University and Research Institutes of Sweden,
Sweden)
*Standing by | At Large Members (SIGLEX-MWE):*
● A. Seza Doğruöz <https://research.flw.ugent.be/en/as.dogruoz> (Ghent
University) - SIGLEX-MWE nominated officer in 2023-2025
● Alexandre Rademaker <http://arademaker.github.io/> (IBM Research and
FGV EMAp) - SIGLEX-MWE nominated officer in 2023-2025
We look forward to seeing you at MWE-UD2024 in Italia!
Best regards,
The MWE-UD 2024 organizing committee
--
https://www.research.manchester.ac.uk/portal/lifeng.han.html Office: 2.90
Kilburn, Oxford Road, Manchester
Apologies for cross-posting.
----------------------------------------
*The International Conference on Spoken Language Translation*
*21st IWSLT 2024 – **third** Call for Participation*
*August 17-18, 2024 – Bangkok, Thailand*
*http://iwslt.org <http://iwslt.org/>*
The International Conference on Spoken Language Translation (IWSLT) is the
premier annual conference for all aspects of Spoken Language Translation.
Every year, the conference organizes and sponsors open evaluation campaigns
around key challenges in simultaneous and consecutive translation, under
real-time/low latency or offline conditions and under low-resource or
multilingual constraints. System descriptions and results from
participants’ systems and scientific papers related to key algorithmic
advances and best practices are presented.
IWSLT is the venue of SIGSLT, the Special Interest Group on Spoken
Language Translation of ACL, ISCA and ELRA. With a track record of 20
years, IWSLT benchmarks and proceedings serve as a reference for all
researchers and practitioners working on speech translation and related
fields.
The 21st edition of IWSLT <https://iwslt.org/2024/> will be run as an
*ELRA/ACL* event and co-located with ACL 2024 <https://2024.aclweb.org/> on
August 17-18, 2024. It will be run as a hybrid event.
Important Dates
January 15, 2024: Release of shared task training and dev data
April 01-15, 2024: Evaluation period
April 29, 2024: Paper submission due (all papers)
June 4, 2024: Notification of acceptance
June 24, 2024: Camera-ready paper due
July 22, 2024: Pre-recorded video due
August 17-18, 2024: Conference
Evaluation
IWSLT 2024 features shared tasks <https://iwslt.org/2024/#shared-tasks>
that address the following focus areas:
- Speech-to-speech track
- Simultaneous track
- Subtitling track
- Offline track
- Dubbing track
- Low-resource track
- Indic track
*Registration
<https://docs.google.com/forms/d/e/1FAIpQLSdGhCusmyPVmBz36EB8ABUFignw7nZCoKx…>
for the evaluation* is now open and training and development data for each
shared task are available through the website. The results of all tasks
will be collected and discussed in an overview paper that will be presented
at the conference. In addition, participants have the opportunity to
present their work through a system paper that will be published in the ACL
Proceedings.
Conference
IWSLT also invites submissions of scientific papers to be published in the
ACL Proceedings and presented either in oral or poster format. The
conference selects high-quality, original contributions on theoretical and
practical issues of spoken language translation research, technologies and
applications. For further information on this initiative, please refer to
the website <https://iwslt.org/2024/#paper-submission>
*Special Session: Recent Highlights in SLT*
IWSLT 2024 will introduce a special session entitled “SLT Recent Highlights
<https://iwslt.org/2024/special-session>”. The session intends to provide
an overview of recent highlights from the field across venues, with a
series of short presentations giving focused reviews of recent developments and
new/emerging trends. Additionally, a short and lively discussion may follow
the presentations. We believe this initiative will contribute to a better
understanding of the current landscape of SLT research and foster
collaboration and exchange of ideas within the IWSLT community. For more
information on this initiative, please refer to the website
<https://iwslt.org/2024/special-session>
Contact
Please send an email to iwslt-evaluation-campaign(a)googlegroups.com if you
have any questions related to the shared tasks.
Thanks,
Marine, Marcello, Alex, Jan, Sebastian, Elizabeth, Atul
(IWSLT organisers)
*******************************************************
EAMT 2024: The 25th Annual Conference of
The European Association for Machine Translation
24 - 27 June 2024
Sheffield, UK
https://eamt2024.sheffield.ac.uk/
@eamt_2024 (X account)
Keynote speakers:
- Alexandra Birch (University of Edinburgh, UK)
- Valter Mavrič (DG TRAD, European Parliament)
EXTENDED Tutorial proposal deadline: 22 March 2024
Tutorial date: 27 June 2024
More information:
https://eamt2024.sheffield.ac.uk/conference-calls/final-call-for-tutorials
*******************************************************
*****
IMPORTANT: In order to use OpenReview (https://openreview.net/), authors
need to have an account in the system. Please create your account as soon
as possible and use an institutional e-mail if possible. OpenReview may
take some time to approve accounts that do not use institutional domains.
If you have any issues, contact eamt2024(a)gmail.com
*****
*** Overview ***
The European Association for Machine Translation (EAMT) invites proposals
for tutorials to be held in conjunction with the EAMT 2024 conference
taking place in Sheffield, UK, from 24 to 27 June, with tutorials held on
27 June. We seek proposals in all areas of machine translation (see the
call for papers of the main conference for the focus areas of EAMT 2024).
The aim of a tutorial is primarily to help the audience develop an
understanding of particular technical, applied, and business matters
related to research, development, and use of MT and translation technology.
Presentations of particular technological solutions or systems are welcome,
provided that they serve as illustrations of broader scientific
considerations.
We recommend that the tutorial cover work by the presenters as well as by
other researchers. The submission should explain how this breadth is
ensured. Tutorials should not be “self-invited talks”.
*** Submission Details ***
Proposals should not exceed 4 pages of content (plus unlimited pages for
references), should be in PDF format, and should contain the following:
- A title and authors, affiliations, and contact information.
- A brief description of the tutorial content and its relevance to the
machine translation community.
- A short description of the target audience and any prerequisite
background the audience is expected to have.
- An outline of the tutorial structure and content, and how it will be
covered in a three-hour slot (half-day). In exceptional cases, six-hour
tutorial slots (full day) are available. These time limits do not include
coffee breaks; e.g., a three-hour tutorial in fact occupies a 3.5-hour
slot, and a six-hour tutorial occupies a 7-hour slot.
- Diversity considerations, e.g. use of multilingual data, indications of
how the described methods scale up to various languages or domains,
participation of both senior and junior instructors, demographic and
geographical diversity of the instructors, plans for how to diversify
audience participation, etc.
- Reading list. Work that you expect the audience to read before the
tutorial can be indicated by an asterisk. Recommended papers should reflect
a breadth of authorship and include work by other authors; work from other
disciplines is welcome if relevant.
- For each tutorial presenter, a one-paragraph statement of their research
interests and areas of expertise for the tutorial topic, as well as
experience in instructing an international audience.
- An estimate of the audience size for the tutorial. If the same or a similar
tutorial has been given before, include information on where any previous
version of the tutorial was given and how many attendees the tutorial
attracted.
- A description of special requirements for technical equipment.
Tutorial proposals should be submitted as PDF files to OpenReview:
https://openreview.net/group?id=EAMT.org/2024/Tutorials_Track.
Submissions should be formatted according to the templates specified below.
Anonymisation is not required. Submissions should be no longer than 4 pages
(excluding references).
*** Templates for writing your proposal ***
Templates are available in the following formats (check our website --
https://eamt2024.sheffield.ac.uk/conference-calls/final-call-for-tutorials):
- LaTeX
- Cloneable Overleaf template
- Word
- Libre Office/Open Office
- PDF
*** Evaluation Criteria ***
Each tutorial proposal will be evaluated according to its clarity and
preparedness, novelty or timely character of the topic, and instructors’
experience.
*** Tutorial Instructor Responsibilities ***
Accepted tutorial presenters will be notified by 22 April 2024. They must
then provide abstracts of their tutorials for inclusion in the conference
registration material by the specific conference deadlines. The description
should be in two formats: (a) an ASCII version that can be included in
email announcements and published on the conference website, and (b) a PDF
version for inclusion in the electronic proceedings (detailed instructions
will be provided). Tutorial speakers must provide tutorial materials by 22
May 2024. The final submitted tutorial materials must minimally include
copies of the course slides and a bibliography for the material covered in
the tutorial.
For each tutorial being held at EAMT 2024, we offer free registration to
the conference for one tutor only.
*** Important Dates ***
- Submission deadline for tutorial proposals (extended): 22 March 2024
- Notification of acceptance (extended): 22 April 2024
- Tutorial slides + abstract + bibliography + any other materials
(extended): 22 May 2024
All deadlines are at 23:59 CEST.
*** Workshop Co-Chairs ***
Mary Nurminen (Tampere University)
Diptesh Kanojia (University of Surrey)
*** Local organising committee ***
Carolina Scarton (University of Sheffield)
Charlotte Prescott (ZOO Digital)
Chris Bayliss (ZOO Digital)
Chris Oakley (ZOO Digital)
Xingyi Song (University of Sheffield)
*** Sponsors ***
Silver: Translated <https://translated.com/welcome>, Unbabel
<https://unbabel.com/>
Bronze: Pangeanic <https://pangeanic.com/>, STAR
<https://www.star-group.net/en/home.html>, Transperfect
<https://globallink.transperfect.com/>
Collaborator: Apertium <https://apertium.org/>
Supporter: Springer Nature <https://www.springernature.com/gp>
Confirmed sponsors: Crosslang <https://crosslang.com> and RWS Language
Weaver <https://www.rws.com>
--
*Carolina Scarton*
Lecturer in Natural Language Processing
Department of Computer Science
University of Sheffield
http://staffwww.dcs.shef.ac.uk/people/C.Scarton/
*******************************************************
EAMT 2024: The 25th Annual Conference of
The European Association for Machine Translation
24 - 27 June 2024
Sheffield, UK
https://eamt2024.sheffield.ac.uk/
https://twitter.com/eamt_2024 (X account)
Keynote speakers:
- Alexandra Birch (University of Edinburgh, UK)
- Valter Mavrič (DG TRAD, European Parliament)
EXTENDED Paper submission deadline: 22 March 2024
More information:
https://eamt2024.sheffield.ac.uk/conference-calls/final-call-for-papers
*******************************************************
The European Association for Machine Translation (EAMT) invites everyone
interested in machine translation (MT) and translation-related tools and
resources ― developers, researchers, users, translation and localization
professionals and managers ― to participate in this conference.
Driven by the state of the art, the research community will demonstrate
their cutting-edge research and results. Professional MT users will provide
insights into successful implementation of MT in business scenarios as
well as implementation scenarios involving large corporations, governments,
or NGOs. Translation scholars and translation practitioners are also
invited to share their first-hand MT experience, which will be addressed
during a special track.
Note that papers that have been archived in arXiv can be accepted for
submission provided that they have not already been published elsewhere.
EAMT 2024 has four tracks, namely Research: Technical, Research:
Translators & Users, Implementations & Case Studies, and Products &
Projects.
*** Research: technical ***
Submissions (up to 10 pages, plus unlimited pages for references and
appendices) are invited for reports of significant research results in any
aspect of MT and related areas. Such reports should include a substantial
evaluation component, or have a strong theoretical and/or methodological
contribution where results and in-depth evaluations may not be appropriate.
Papers are welcome on all topics in the areas of MT and translation-related
technologies, including, but not limited to:
- Deep-learning approaches for MT and MT evaluation
- Advances in classical MT paradigms: statistical, rule-based, and hybrid
approaches
- Comparison of various MT approaches
- Technologies for MT deployment: quality estimation, domain adaptation,
etc.
- Resources and evaluation
- MT in special settings: low resources, massive resources, high volume,
low computing resources
- MT applications: translation/localization aids, speech translation,
multimodal MT, MT for user generated content (blogs, social networks), MT
in computer-aided language learning, etc.
- Linguistic resources for MT: corpora, terminologies, dictionaries, etc.
- MT evaluation techniques, metrics, and evaluation results
- Human factors in MT and user interfaces
- Related multilingual technologies: natural language generation,
information retrieval, text categorization, text summarization, information
extraction, optical character recognition, etc.
Papers should describe original work. They should emphasise completed work
rather than intended work, and should indicate clearly the state of
completion of the reported results. Where appropriate, concrete evaluation
results should be included.
Papers should be anonymized, prepared according to the templates specified
below, and be no longer than 10 pages (plus unlimited pages for references
and appendices). Submit the paper as a PDF to OpenReview:
https://openreview.net/group?id=EAMT.org/2024/Technical_Track. Submissions
that do not conform to the required styles may be rejected without review.
** Track co-chairs
Rachel Bawden (Inria, Paris)
Víctor M Sánchez-Cartagena (University of Alicante)
*** Research: translators & users ***
Submissions (up to 10 pages, plus unlimited pages for references and
appendices) are invited for academic research on all topics related to how
professional translators and other types of MT users interact with, are
affected by, or conceptualise MT. Papers should report significant research
results with a strong theoretical and/or methodological contribution.
Topics for the track include, but are not limited to:
- The impact of MT and post-editing: including studies on processes,
effort, strategies, usability, productivity, pricing, workflows, and
post-editese
- Human factors and psycho-social aspects of MT adoption (ergonomics,
motivation, and social impact on the profession, relationship between user
profiles and MT adoption)
- Emerging areas for MT & post-editing: e.g. audiovisual, game
localisation, literary texts, creative texts, social media, health care
communication, crisis translation
- MT and ethics
- The impact of using translators’ metadata and user activity data for
monitoring their work
- The evaluation and reception of different modalities of translation:
human translation, post-edited, raw MT
- MT and interpreting
- Human evaluations of MT output
- MT for gisting and the impact of MT on users: use cases, expectations,
perceptions, trust, views on acceptability
- MT and usability
- MT and education/language learning
- MT in the translation/interpreting classroom
Papers should describe original work. They should emphasise completed work
rather than intended work, and should indicate clearly the state of
completion of the reported results.
Papers should be anonymized, prepared according to the templates specified
below, and be no longer than 10 pages (plus unlimited pages for references
and appendices). Submit the paper as a PDF to OpenReview:
https://openreview.net/group?id=EAMT.org/2024/Research_Translators_Users_Tr….
Submissions that do not conform to the required styles may be rejected
without review.
** Track co-chairs
Patrick Cadwell (DCU)
Ekaterina Lapshinova-Koltunski (University of Hildesheim)
*** Implementations & case studies ***
Submissions (approximately 4–6 pages) are invited for reports on case
studies and implementation experience with MT in organisations of all
types, including small businesses, large corporations, governments, NGOs,
or language service providers. We also invite translation practitioners to
share their views and observations based on their day-to-day experience
working with MT in a variety of environments.
Topics for the track include, but are not limited to:
- Integrating or optimising MT and computer-assisted translation in
translation production workflows (translation memory/MT thresholds, mixing
online and offline tools, using interactive MT, dealing with MT confidence
scores)
- Managing change when implementing and using MT (e.g. switching between
multiple MT systems, limiting degradations when updating or upgrading an MT
system)
- Implementing open-source MT (e.g. strategies to get support, reports on
taking pilot results into full deployment, examples of advanced
customization sought and obtained thanks to the open-source paradigm,
collaboration within open-source MT projects)
- Evaluating MT in a real-world setting (e.g. error detection strategies
employed, metrics used, productivity or translation quality gains achieved)
- Ethical and confidentiality issues when using MT, especially MT in the
cloud
- Using MT in social networking or real-time communication (e.g. enterprise
support chat, multilingual content for social media)
- MT and usability
- Implementing MT to process multilingual content for assimilation purposes
(e.g. cross-lingual information retrieval, MT for e-discovery or spam
detection, MT for highly dynamic content)
- MT in literary, audiovisual, game localization and creative texts
- Impact of MT and post-editing on translation practices and the
profession: processes, effort, compensation
- Psycho-social aspects of MT adoption (ergonomics, motivation, and social
impact on the profession)
- Error analysis and post-editing strategies (including automatic
post-editing and automation strategies)
- The use of translators’ metadata and user activity data in MT development
- Freelance translators’ independent use of MT
- MT and interpreting
Papers should highlight real-world use scenarios, solutions, and problems
in addition to describing MT integration processes and project settings.
Where solutions do not seem to exist, suggestions for MT researchers and
developers should be clearly emphasized. For papers on implementations and
case studies produced by academics, we require co-authorship with the
actual organizations working with MT implementations.
Papers (approximately 4–6 pages, with a maximum of 10 pages -- plus
unlimited pages for references) should be formatted according to the
templates specified below and submitted as PDF files to OpenReview:
https://openreview.net/group?id=EAMT.org/2024/Implementations_Case_Studies_….
Anonymization is not required in the Implementations & Case Studies track
submissions. Submissions that do not conform to the required styles may be
rejected without review.
** Track co-chairs
Vera Cabarrão (Unbabel)
Konstantinos Chatzitheodorou (Strategic Agenda)
*** Products & Projects ***
Submissions (2 pages, including references) are invited on either of the
subtracks (Products or Projects).
- Products: Tools for MT, computer-aided translation, and other translation
technologies (including commercial products and free/open-source
software). Descriptions should include information about product
availability and licensing, an indication of cost if applicable, basic
functionality, (optionally) a comparison with other products, and a
description of the technologies used. The authors should be ready to
present the tools in the form of demos or posters during the conference.
- Projects: Research projects, funded through grants obtained in
competitive public or private calls related to MT. Descriptions should
contain: project title and acronym, funding agency, project reference,
duration, list of partner institutions or companies in the consortium if
there is one, project objectives, and a summary of partial results
available or final results if the project has ended. The authors should be
ready to present the projects in the form of posters during the conference.
This follows on from the successful ‘project villages’ held at the last
EAMT conferences.
There will be a poster boaster session for this track, in which authors
will have 120 seconds to attract attendees to their posters or demos with a
two-slide presentation.
Submissions should be formatted according to the templates specified
below. Anonymization is not required. Submissions should be no longer than
2 pages (including references), and submitted as PDF files to OpenReview:
https://openreview.net/group?id=EAMT.org/2024/Products_Projects_Track.
** Track co-chairs
Helena Moniz (University of Lisbon (FLUL), INESC-ID)
Mikel Forcada (Prompsit Language Engineering, Elx)
*** Templates for writing your proposal ***
Templates are available in the following formats (check our website --
https://eamt2024.sheffield.ac.uk/conference-calls/call-for-papers):
- LaTeX
- Cloneable Overleaf template
- Word
- Libre Office/Open Office
- PDF
*** Important deadlines ***
- Deadline for paper submission (further extended): 22 March 2024
- Notification to authors (further extended): 22 April 2024
- Camera ready deadline (further extended): 6 May 2024
- Author Registration (extended): 15 May 2024
All deadlines are at 23:59 CEST.
*** Local organising committee ***
Carolina Scarton (University of Sheffield)
Charlotte Prescott (ZOO Digital)
Chris Bayliss (ZOO Digital)
Chris Oakley (ZOO Digital)
Joanna Wright (University of Sheffield)
Xingyi Song (University of Sheffield)
*** Sponsors ***
Silver: Translated <https://translated.com/welcome>, Unbabel
<https://unbabel.com/>
Bronze: Pangeanic <https://pangeanic.com/>, STAR
<https://www.star-group.net/en/home.html>, Transperfect
<https://globallink.transperfect.com/>
Collaborator: Apertium <https://apertium.org/>
Supporter: Springer Nature <https://www.springernature.com/gp>
Confirmed sponsors: Crosslang <https://crosslang.com> and RWS Language
Weaver <https://www.rws.com>
--
*Carolina Scarton*
Lecturer in Natural Language Processing
Department of Computer Science
University of Sheffield
http://staffwww.dcs.shef.ac.uk/people/C.Scarton/
We offer a 3-year postdoctoral position in NLP at the University of Oslo, Norway, on the topic "Evaluating large language models - model architectures, training regimes and data selection". The application deadline is April 14, 2024. This position is funded by the DSTrain program (https://www.uio.no/dscience/english/dstrain/).
In the past years, (generative) large language models have become the core foundation models for a wide range of traditional NLP tasks, and they have also seen widespread adoption by the general public. At the same time, little is known about the specific training setups of commercial models, and some design decisions (in terms of model architecture, training regimes, and data selection) are based on traditions rather than empirical or theoretical considerations. Moreover, most current LLMs rely heavily on English training and evaluation data, and their performance on non-English languages remains difficult to assess. Potential candidates are expected to formulate their research project within the broad area of LLM evaluation. Examples of research topics are given below:
- Compare fine-tuning external pre-trained LLMs with training language-specific LLMs from scratch.
- Compare encoder-decoder LLMs with decoder-only LLMs.
- Evaluate generative LLMs on various text generation tasks, such as summarization, simplification, text normalization.
- Assess the multilingual (e.g. machine translation) and cross-lingual capabilities (cross-lingual transfer) of LLMs.
- Investigate how closely related low-resource languages are best accommodated in LLMs.
- Implement benchmarking datasets for LLM evaluation.
Applicants are expected to submit a research project that fits in the proposed research theme (Evaluating large language models). Prospective applicants are encouraged to discuss their application with the contact person (me) to explore scientific focus and cooperation possibilities.
The application process for the DSTrain call is described here:
https://www.uio.no/dscience/english/dstrain/guide-for-applicants/applicatio…
This is the relevant research theme description:
https://www.uio.no/dscience/english/dstrain/research-areas/informatics/eval…
Please apply here:
https://www.jobbnorge.no/en/available-jobs/job/255679/dstrain-msca-postdoct…
Contact:
Yves Scherrer, LTG, University of Oslo
yves.scherrer(a)ifi.uio.no
[apologies if you receive multiple copies of this call]
Dear colleagues and friends,
*We are pleased to release the 2nd Call for Participation - CLEF 2024
SimpleText Task4: SOTA?*
*Overview:* SOTA? is introduced as Task 4 in the SimpleText track of CLEF
2024. The goal of the SOTA? shared task is to develop systems which, given
the full text of an AI paper, are capable of recognizing whether the paper
indeed reports model scores on benchmark datasets and, if so, of extracting
all pertinent (Task, Dataset, Metric, Score) quadruples presented within
the paper.
More info on the task website:
https://sites.google.com/view/simpletext-sota/home
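As a rough illustration of the output described above, a system's prediction
for one paper could be held as a list of (Task, Dataset, Metric, Score)
records. This is only a sketch: the class name, the function name, the
"unanswerable" label for papers without reported scores, and the example
values are all assumptions made for illustration, not the official
submission format (see the task website for that).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Quadruple:
    """One (Task, Dataset, Metric, Score) result reported in a paper."""
    task: str
    dataset: str
    metric: str
    score: str  # kept as text, since papers report scores in varied formats


def format_prediction(quadruples: List[Quadruple]) -> str:
    """Render a prediction for one paper as a single line of text.

    An empty list means the paper reports no benchmark scores; the
    "unanswerable" label is a hypothetical placeholder for that case.
    """
    if not quadruples:
        return "unanswerable"
    return "; ".join(
        f"({q.task}, {q.dataset}, {q.metric}, {q.score})" for q in quadruples
    )


# Made-up example values, for illustration only.
example = [Quadruple("Task A", "Dataset B", "Metric C", "12.3")]
print(format_prediction(example))
print(format_prediction([]))
```

Keeping the score as a string avoids committing to a numeric format before
the official evaluation conventions have been checked.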
SOTA? will be divided into two evaluation phases:
- Evaluation Phase 1: Few-shot Testing;
- Evaluation Phase 2: Zero-shot Testing
*To participate in SOTA? i.e. SimpleText Task 4 @ CLEF 2024, please
register your team*:
1. CLEF 2024 official registration page
https://clef2024.imag.fr/index.php?page=Pages/registration.html
2. Codalab competition site:
https://codalab.lisn.upsaclay.fr/competitions/16616
Note, SOTA? is organized as a new task this year under the "SimpleText -
Improving Access to Scientific Texts for Everyone" initiative
https://simpletext-project.com/. Please take a look at the other 3 tasks,
i.e. Task 1, 2, and 3, offered by SimpleText and select one or more of
those task options too if you are interested. Note that there is no
interdependence of the dataset between "Task 4 - SOTA?" and the other three
tasks of SimpleText.
*Dates*
Training and validation datasets available: Feb 1, 2024 March 13, 2024
Test data available/Evaluation starts: April 23, 2024
Evaluation ends: May 3, 2024
Participant paper submissions due: May 31, 2024
Notification to authors: June 24, 2024
Camera ready due: July 8, 2024
CLEF 2024 Workshop, Grenoble, France: 9-12 September 2024
*Task Organizers*
Jennifer D’Souza (TIB Leibniz Information Centre for Science and Technology
- Germany)
Salomon Kabongo (L3S Research Center, Germany)
Hamed Babaei Giglou (TIB Leibniz Information Centre for Science and
Technology - Germany)
Yue Zhang (Berlin Technical University, Germany)
Sören Auer (TIB Leibniz Information Centre for Science and Technology -
Germany)
*We look forward to having you on board!*
*Contact:* sota.task [at] gmail.com
Extended deadline for abstract submission: 24 March 2024
The 8th International Conference 'Discourse Markers in Romance Languages'
https://sites.google.com/view/disrom2024
Lisbon, Portugal, 19-21 June 2024
*Important Dates*
24 March 2024: New deadline for abstract submission!
30 April 2024: Notification of acceptance
19-21 June 2024: Conference dates
*Meeting Description*
The Conference is one of a series of conferences on discourse markers in Romance languages (Madrid, 2010; Buenos Aires, 2011; Campinas, 2012; Heidelberg, 2015; Louvain-la-Neuve, 2017; Bergamo, 2019; Craiova, 2022) and aims to build on the previous events, serving as a platform for internationally renowned linguists and young researchers alike to exchange views and ideas and to broaden their research perspectives.
This Conference’s theme will deal specifically:
1. with interactions between DMs and their explicit/implicit context, overcoming the traditional divide between their textual and interpersonal functions;
2. with the subjective adjustment function of DMs.
Researchers on discourse markers in Romance languages are invited to submit contributions on these topics, as well as on related subjects including (but not restricted to):
- definition of the discourse marker category;
- lexicons of discourse markers;
- discourse markers and their relation to other pragmatic categories;
- syntax-prosody-discourse interface;
- sociolinguistic approaches to discourse markers;
- variation of discourse markers across registers, languages and language varieties;
- translation studies;
- L1 and L2 acquisition of discourse markers;
- diachronic studies;
- experimental studies;
- corpus-based and computational studies;
- applied studies (business language, legal discourse, educational settings, etc.).
*Submissions*
The Conference will be on-site. Two presentation modalities will be possible: oral presentation and poster presentation.
Abstracts should not exceed one page (single spacing, 12-point Times New Roman font), not including figures and references, and must be uploaded as PDF. Abstracts can be written in any Romance language or in English.
They should be anonymous.
They must be submitted via EasyChair (https://easychair.org/conferences/?conf=disrom2024).
Authors must select the option oral presentation or poster presentation during the submission process on EasyChair.
*Keynote Speakers (provisional list)*
Denis Paillard (CNRS and Université Paris Diderot)
Isabel Margarida Duarte (Universidade do Porto)
*Workshop organizers (University of Lisbon)*
- Pierre Lejeune
- Marco Favaro
- Fabrizio Macagno
- Amália Mendes
*Scientific Committee*
Joanna Blochowiak (Université de Genève)
Margarita Borreguero Zuloaga (Universidad Complutense de Madrid)
Chloé Braud (University of Copenhagen)
Sorina Ciobanu (University of Iasi)
Maria Antónia Diniz Caetano Coutinho (Universidade Nova de Lisboa)
Maria Josep Cuenca (Universitat de València)
Antonio Briz Gómez (Universitat de València)
Conceição Carapinha (Universidade de Coimbra)
Anna-Maria De Cesare (Universität Dresden)
Iria da Cunha (Universidad Nacional de Educación a Distancia)
Gaétane Dostie (Université de Sherbrooke)
Oana Adriana Duta (University of Craiova)
Chiara Fedriani (Università di Genova)
Mar Garachana Camarero (Universitat de Barcelona)
Chiara Ghezzi (Università di Bergamo)
Sonia Gómez-Jordana (Universidad Complutense de Madrid)
Pedro Gras (Université d’Anvers)
Martin Hummel (Universität Graz)
Julia Lavid Lopez (Universidad Complutense de Madrid)
Diana Lewis (Université Aix-Marseille)
Araceli López Serena (Universidad de Sevilla)
José Pinto de Lima (Centro de Linguística da Universidade Nova de Lisboa)
Maria Aldina Marques (Universidade do Minho)
Piera Molinelli (Università di Bergamo)
Silvia Murillo Ornat (Universidad de Zaragoza)
Cornelia Plag (Universidade de Coimbra)
Salvador Pons Bordería (Universitat de València)
Cecilia Popescu (University of Craiova)
Laurent Prévot (Université Aix-Marseille)
Augusto Soares da Silva (Universidade Católica Portuguesa)
Laure Vieu (IRIT – Université de Toulouse III – Paul Sabatier)
Jacqueline Visconti (Università di Genova)
Sandrine Zufferey (Universität zu Bern)
Call for Papers
1st Workshop on Reliable Evaluation of LLMs for Factual Information (REAL-Info)
Co-located with ICWSM 2024, June 3, 2024, Buffalo, NY
https://sites.google.com/view/real-info-2024
LLMs have achieved state-of-the-art performance in several textual inference tasks and are gaining popularity. There is a significant focus on their integration with web and online applications, including web search, thus allowing them to reach millions of users. LLMs can influence various information tasks in our everyday lives, ranging from personal content creation to education, financial advice, and mental health support (Augenstein, 2023). However, with their vast linguistic capabilities and opaque nature, LLMs can inadvertently generate or amplify false information. There is growing concern about the factuality of LLM-generated content and its potential adverse impact on our information ecosystem (Chen, 2023; Peskoff, 2023).
Thus the need for reliable methods to assess the factuality of information is more critical than ever. This is where the synergy of AI, Natural Language Processing (NLP), and Human-Computer Interaction (HCI) becomes essential. AI and NLP techniques can be employed to analyze and identify the factuality of information through various tasks (Augenstein, 2023), such as fact-checking, stance detection, claim verification, and misinformation detection. These techniques can sift through the vast amounts of data to spot inconsistencies, biases, or inaccuracies that could indicate misinformation. Still, these approaches often use language models themselves, and epistemological questions arise when one LLM is fact-checked using another (or itself). Meanwhile, HCI plays a vital role in designing interactions and tools that enable humans to effectively oversee, interpret, and correct the outputs of LLMs. This human-in-the-loop approach ensures a critical evaluation and context-sensitive understanding of the factuality of information, which pure algorithmic methods might overlook. The combination of NLP's analytical capabilities and HCI's focus on human-centric design is instrumental in creating a digital ecosystem where LLMs can be utilized safely and responsibly, minimizing the risks of false information while maximizing their potential for user-centric applications.
The goals of the 1st ICWSM workshop Reliable Evaluation of LLMs for Factual Information (REAL-Info) are to facilitate discussion within the community around new LLM evaluation approaches, metrics, and benchmarks for factuality assessment tasks, and to shed light on the scope, biases, and blindspots of LLMs. It aims to spark interdisciplinary conversations among academic and industry researchers in computational social sciences (CSS), natural language processing (NLP), human-computer interaction (HCI), data science, and social computing. The workshop will solicit research and position papers with novel ideas, including but not limited to:
- New evaluation methods and metrics for evaluating LLMs’ factuality considering diverse social contexts, e.g., source and domain of data, language, temporal generalization of information, or hallucination in generated/summarized content.
- Human-centered design approaches to aid LLMs in detecting and mitigating false information, e.g., human experts in the loop, and variation in prompting.
- New LLM-powered tools, methods, and applications for improving factuality assessment in social computing and computational social science.
- Biases and blindspots of LLMs in factuality assessment, including approaches for error analysis and model diagnostics.
- Limitations of existing benchmarks for tasks relevant to factuality assessment, e.g., claim verification, fact-checking, stance detection, and misinformation detection.
- Improving dataset and evaluation quality, e.g., avoidance of selection bias, addressing subjective judgments and biases in crowd-sourced annotation.
- Comparative evaluation and implications of open source and commercial LLMs for tasks relevant to factuality assessment.
- How do the reliability and factuality of LLMs impact users (e.g., journalists, software engineers, artists) and communities?
Submission instructions can be found on the workshop website. The workshop will take place as a half-day meeting in June. Authors of accepted papers will have the opportunity to publish their papers through workshop proceedings by the AAAI Press.
Timeline
- Workshop Papers Submission deadline: March 24, 2024
- Notifications: April 14, 2024
- Final Camera-Ready Paper Due: May 5, 2024
- ICWSM-2024 Workshops Day: June 3, 2024
_*INVITATION*_
We kindly invite you to the debate on
*Artificial Intelligence and the Future of the Portuguese Language*
which will take place on March 15, 2024, from 10h30 to 12h00 (Lisbon time)
as part of PROPOR 2024 - 16th International Conference on the Computational
Processing of the Portuguese Language.
As a plenary session of this conference, it will draw on contributions from
the expert researchers in this field who are gathering here.
It will also include contributions from guests who are experts in the field of
public policies for language promotion and who will help launch the debate:
*Ana Paula Laborinho*
Former President of the Camões Institute for Cooperation and Language,
current Director in Portugal of the OEI Organization of Ibero-American States,
and professor at the University of Lisbon, Faculty of Letters
*Claudio Pinhanez*
Deputy Director of the C4AI Artificial Intelligence Center in São Paulo,
and principal investigator at IBM Research, Brazil
*Ismael Gómez García*
Director of the OEI's Global Digital Strategy
*Valentín García*
Secretary General for Language Policy, Galicia Regional Government
*António Branco* *(moderator)*
Honorary President of the ELRA Language Resources Association,
Director General of PORTULAN CLARIN Research Infrastructure for the Science
and Technology of Language,
and Professor at the University of Lisbon, Faculty of Sciences
Information about this debate can be found here
https://propor2024.citius.gal/index.php/discussion-panel/
where details on how to participate will be made available
in due course.
++++++++++++++++++++++++++++++++++++++++
_*BACKGROUND*_
For about a year now, it's been a rare day when we don't come across
news, comments, opinions, interviews, debates, podcasts,
prognoses, plans, panics, condemnations, glorifications, warnings, regulations,
fears and hopes about Artificial Intelligence. We have the privilege,
rare in human history, of facing the unprecedented promises
and challenges of a civilizational transformation induced by a technological shock
of a scope never before experienced.
This scientific and social tsunami has its origins in what for decades
has been considered the subarea of AI with the most difficult and challenging
interdisciplinarity. Also known as natural language processing, computational
linguistics, computational language processing, etc., language technology deals
with the most distinctively human cognitive capacity.
No area of human activity will be immune to this technological shock.
Even less so will the very object of its scientific inquiry, natural languages.
It is therefore timely for the scientists themselves to hold a debate on AI
and the Portuguese language, shifting the perspective from passive analysis
to that of building an active contribution:
What is the impact on the future of the Portuguese language and on citizenship
and sovereignty in the age of artificial intelligence?
What is the impact on public policies promoting language and how should
they be rethought and reconfigured?
What is the impact on public policies promoting science and technology and
how should their priorities be rethought and reconfigured?
What is the role of international cooperation, given that Portuguese
is a multicentric language with global projection?
What should we learn from the responses being advanced in other geographies
and for other languages? etc.
The scientific community dedicated to research into Portuguese language
technology has been meeting every other year for 30 years, alternately in Portugal
and Brazil, at the PROPOR international conference, which will be held again soon,
between March 13 and 15, 2024, for the first time in another geography:
https://propor2024.citius.gal
With the help of guest speakers who are experts in the fields of language promotion
and international cooperation, scientific researchers in this field will try to open up
this reflection and contribute to finding answers to these questions in a debate
that will take place on March 15, 2024 between 10h30 and 12h00 (Lisbon time).
Information on this debate can be found here
https://propor2024.citius.gal/index.php/discussion-panel/
where details on how to participate remotely, as technical conditions allow,
will be made available in due course.
The debate will take place in Portuguese.
The Faculty of Mathematics and Natural Sciences at Heinrich Heine
University Düsseldorf is inviting applications for a full professorship
(W2) in Machine Learning at the Department of Computer Science, to be
filled as soon as possible.
Ideally, candidates should have outstanding expertise in the field of
Machine Learning, particularly in modern machine learning techniques (e.g.,
large language models and deep learning architectures such as transformers
and related sequence models) and be willing to contribute to collaborative
projects, especially in the field of Natural Language Processing (NLP).
Application deadline: 17 April 2024.
For more information see
https://berufungsportal.hhu.de/VAADIN/dynamic/resource/2/96c63060-a332-4561…
--
Prof. Dr. Laura Kallmeyer
Institut für Linguistik
Heinrich-Heine Universität Duesseldorf
Universitaetsstr. 1
D-40225 Duesseldorf, Germany
https://user.phil.hhu.de/kallmeyer/
Phone +49 (0)211 8113899