We offer a 3-year postdoctoral position in NLP at the University of Oslo, Norway, on the topic "Evaluating large language models - model architectures, training regimes and data selection". The application deadline is April 14, 2024. This position is funded by the DSTrain program (https://www.uio.no/dscience/english/dstrain/).
In recent years, (generative) large language models have become the core foundation models for a wide range of traditional NLP tasks, and they have also seen widespread adoption by the general public. At the same time, little is known about the specific training setups of commercial models, and some design decisions (in terms of model architecture, training regimes, and data selection) are based on tradition rather than empirical or theoretical considerations. Moreover, most current LLMs rely heavily on English training and evaluation data, and their performance on non-English languages remains difficult to assess. Potential candidates are expected to formulate their research project within the broad area of LLM evaluation. Examples of research topics are given below:
- Compare fine-tuning external pre-trained LLMs with training language-specific LLMs from scratch.
- Compare encoder-decoder LLMs with decoder-only LLMs.
- Evaluate generative LLMs on various text generation tasks, such as summarization, simplification, text normalization.
- Assess the multilingual (e.g. machine translation) and cross-lingual capabilities (cross-lingual transfer) of LLMs.
- Investigate how closely related low-resource languages are best accommodated in LLMs.
- Implement benchmarking datasets for LLM evaluation.
Applicants are expected to submit a research project that fits in the proposed research theme (Evaluating large language models). Prospective applicants are encouraged to discuss their application with the contact person (me) to explore scientific focus and cooperation possibilities.
The application process for the DSTrain call is described here:
https://www.uio.no/dscience/english/dstrain/guide-for-applicants/applicatio…
This is the relevant research theme description:
https://www.uio.no/dscience/english/dstrain/research-areas/informatics/eval…
Please apply here:
https://www.jobbnorge.no/en/available-jobs/job/255679/dstrain-msca-postdoct…
Contact:
Yves Scherrer, LTG, University of Oslo
yves.scherrer(a)ifi.uio.no
[apologies if you receive multiple copies of this call]
Dear colleagues and friends,
*We are pleased to release the 2nd Call for Participation - CLEF 2024
SimpleText Task4: SOTA?*
*Overview:* SOTA? is introduced as Task 4 in the SimpleText track of CLEF
2024. The goal of the SOTA? shared task is to develop systems which, given
the full text of an AI paper, can recognize whether the paper reports
model scores on benchmark datasets and, if so, extract all pertinent
(Task, Dataset, Metric, Score) quadruples presented within the paper.
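As a rough illustration of the output described above, the sketch below represents one paper's result as a list of (Task, Dataset, Metric, Score) quadruples. It is a minimal sketch under stated assumptions, not the official submission format: the names `Quadruple` and `format_prediction`, the `NO_LEADERBOARD` label and the JSON layout are all hypothetical, and the sample values are made up. Please refer to the task website for the actual format.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

# Hypothetical label for papers that report no benchmark scores (an assumption).
NO_LEADERBOARD = "NO_LEADERBOARD"


@dataclass(frozen=True)
class Quadruple:
    """One (Task, Dataset, Metric, Score) result extracted from a paper."""
    task: str
    dataset: str
    metric: str
    score: str  # kept as a string so the value stays exactly as printed


def format_prediction(quadruples: List[Quadruple]) -> str:
    """Serialize a system's output for a single paper.

    An empty list means the system decided that the paper reports no
    model scores on benchmark datasets.
    """
    if not quadruples:
        return NO_LEADERBOARD
    return json.dumps([asdict(q) for q in quadruples])


# Made-up example values, for illustration only.
example = [Quadruple("Question Answering", "SQuAD", "F1", "90.0")]
print(format_prediction(example))
print(format_prediction([]))
```

Keeping the score as a string is a deliberate choice in this sketch: it avoids changing the value through float formatting when comparing against the text of the paper.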
More info on the task website:
https://sites.google.com/view/simpletext-sota/home
SOTA? will be divided into two evaluation phases:
- Evaluation Phase 1: Few-shot Testing;
- Evaluation Phase 2: Zero-shot Testing
*To participate in SOTA? i.e. SimpleText Task 4 @ CLEF 2024, please
register your team*:
1. CLEF 2024 official registration page
https://clef2024.imag.fr/index.php?page=Pages/registration.html
2. Codalab competition site:
https://codalab.lisn.upsaclay.fr/competitions/16616
Note, SOTA? is organized as a new task this year under the "SimpleText -
Improving Access to Scientific Texts for Everyone" initiative
https://simpletext-project.com/. Please take a look at the other 3 tasks,
i.e. Task 1, 2, and 3, offered by SimpleText and select one or more of
those task options too if you are interested. Note that there is no
interdependence of the dataset between "Task 4 - SOTA?" and the other three
tasks of SimpleText.
*Dates*
Training and validation datasets available: Feb 1, 2024 March 13, 2024
Test data available/Evaluation starts: April 23, 2024
Evaluation ends: May 3, 2024
Participant paper submissions due: May 31, 2024
Notification to authors: June 24, 2024
Camera ready due: July 8, 2024
CLEF 2024 Workshop, Grenoble, France: 9-12 September 2024
*Task Organizers*
Jennifer D’Souza (TIB Leibniz Information Centre for Science and Technology
- Germany)
Salomon Kabongo (L3S Research Center, Germany)
Hamed Babaei Giglou (TIB Leibniz Information Centre for Science and
Technology - Germany)
Yue Zhang (Berlin Technical University, Germany)
Sören Auer (TIB Leibniz Information Centre for Science and Technology -
Germany)
*We look forward to having you on board!*
*Contact:* sota.task [at] gmail.com
Extended deadline for abstract submission: 24 March 2024
The 8th International Conference 'Discourse Markers in Romance Languages'
https://sites.google.com/view/disrom2024
Lisbon, Portugal, 19-21 June 2024
*Important Dates*
24 March 2024 New deadline for abstract submission !
30 April 2024 Notification of acceptance
19-21 June 2024 Conference dates
*Meeting Description*
The Conference is one of a series of conferences on discourse markers in Romance languages (Madrid, 2010; Buenos Aires, 2011; Campinas, 2012; Heidelberg, 2015; Louvain-la-Neuve, 2017; Bergamo, 2019; Craiova, 2022) and aims to build on the previous events, serving as a platform for internationally renowned linguists and young researchers alike to exchange views and ideas and to broaden their research perspectives.
This Conference’s theme will deal specifically:
1. with interactions between DMs and their explicit/implicit context, overcoming the traditional divide between their textual and interpersonal functions;
2. with the subjective adjustment function of DMs.
Researchers on discourse markers in Romance languages are invited to submit contributions on these topics, as well as on related subjects including (but not restricted to):
- definition of the discourse marker category;
- lexicons of discourse markers;
- discourse markers and their relation to other pragmatic categories;
- syntax-prosody-discourse interface;
- sociolinguistic approaches to discourse markers;
- variation of discourse markers across registers, languages and language varieties;
- translation studies;
- L1 and L2 acquisition of discourse markers;
- diachronic studies;
- experimental studies;
- corpus-based and computational studies;
- applied studies (business language, legal discourse, educational settings, etc.).
*Submissions*
The Conference will be on-site. Two presentation modalities will be possible: oral presentation and poster presentation.
Abstracts should not exceed one page (single spacing, 12-point Times New Roman font, not including figures and references) and must be uploaded as PDF. Abstracts can be written in any Romance language or in English.
They should be anonymous.
They will be submitted via EasyChair (https://easychair.org/conferences/?conf=disrom2024).
Authors must select the option oral presentation or poster presentation during the submission process on EasyChair.
*Keynote Speakers (provisional list)*
Denis Paillard (CNRS and Université Paris Diderot)
Isabel Margarida Duarte (Universidade de Porto)
*Workshop organizers (University of Lisbon)*
- Pierre Lejeune
- Marco Favaro
- Fabrizio Macagno
- Amália Mendes
*Scientific Committee*
Joanna Blochowiak (Université de Genève)
Margarita Borreguero Zuloaga (Universidad Complutense de Madrid)
Chloé Braud (University of Copenhagen)
Sorina Ciobanu (University of Iasi)
Maria Antónia Diniz Caetano Coutinho (Universidade Nova de Lisboa)
Maria Josep Cuenca (Universitat de València)
Antonio Briz Gómez (Universitat de València)
Conceição Carapinha (Universidade de Coimbra)
Anna-Maria De Cesare (Universität Dresden)
Iria da Cunha (Universidad Nacional de Educación a Distancia)
Gaétane Dostie (Université de Sherbrooke)
Oana Adriana Duta (University of Craiova)
Chiara Fedriani (Università di Genova)
Mar Garachana Camarero (Universitat de Barcelona)
Chiara Ghezzi (Università di Bergamo)
Sonia Gómez-Jordana (Universidad Complutense de Madrid)
Pedro Gras (Université d’Anvers)
Martin Hummel (Universität Graz)
Julia Lavid Lopez (Universidad Complutense de Madrid)
Diana Lewis (Université Aix-Marseille)
Araceli López Serena (Universidad de Sevilla)
José Pinto de Lima (Centro de Linguística da Universidade Nova de Lisboa)
Maria Aldina Marques (Universidade do Minho)
Piera Molinelli (Università di Bergamo)
Silvia Murillo Ornat (Universidad de Zaragoza)
Cornelia Plag (Universidade de Coimbra)
Salvador Pons Bordería (Universitat de València)
Cecilia Popescu (University of Craiova)
Laurent Prévot (Université Aix-Marseille)
Augusto Soares da Silva (Universidade Catolica Portuguesa)
Laure Vieu (IRIT – Université de Toulouse III – Paul Sabatier)
Jacqueline Visconti (Università di Genova)
Sandrine Zufferey (Universität zu Bern)
Call for Papers
1st Workshop on Reliable Evaluation of LLMs for Factual Information (REAL-Info)
Co-located with ICWSM 2024, June 3, 2024, Buffalo, NY
https://sites.google.com/view/real-info-2024
LLMs have achieved state-of-the-art performance in several textual inference tasks and are gaining popularity. There is a significant focus on their integration with web and online applications, including web search, thus allowing them to reach millions of users. LLMs can influence various information tasks in our everyday lives, ranging from personal content creation to education, financial advice, and mental health support (Augenstein, 2023). However, with their vast linguistic capabilities and opaque nature, LLMs can inadvertently generate or amplify false information. There is growing concern about the factuality of LLM-generated content and its potential adverse impact on our information ecosystem (Chen, 2023; Peskoff, 2023).
Thus the need for reliable methods to assess the factuality of information is more critical than ever. This is where the synergy of AI, Natural Language Processing (NLP), and Human-Computer Interaction (HCI) becomes essential. AI and NLP techniques can be employed to analyze and identify the factuality of information through various tasks (Augenstein, 2023), such as fact-checking, stance detection, claim verification, and misinformation detection. These techniques can sift through the vast amounts of data to spot inconsistencies, biases, or inaccuracies that could indicate misinformation. Still, these approaches often use language models themselves, and epistemological questions arise when one LLM is fact-checked using another (or itself). Meanwhile, HCI plays a vital role in designing interactions and tools that enable humans to effectively oversee, interpret, and correct the outputs of LLMs. This human-in-the-loop approach ensures a critical evaluation and context-sensitive understanding of the factuality of information, which pure algorithmic methods might overlook. The combination of NLP's analytical capabilities and HCI's focus on human-centric design is instrumental in creating a digital ecosystem where LLMs can be utilized safely and responsibly, minimizing the risks of false information while maximizing their potential for user-centric applications.
The goals of the 1st ICWSM workshop Reliable Evaluation of LLMs for Factual Information (REAL-Info) are to facilitate discussion around such new LLM evaluation approaches, metrics, and benchmarks for factuality assessment tasks within the community, to inform the scope, biases, and blindspots of LLMs. It will spark interdisciplinary conversations from academic and industry researchers in computational social sciences (CSS), natural language processing (NLP), human-computer interaction (HCI), data science, and social computing. The workshop will solicit research and position papers with novel ideas, including but not limited to:
- New evaluation methods and metrics for evaluating LLMs’ factuality considering diverse social contexts, e.g., source and domain of data, language, temporal generalization of information, or hallucination in generated/summarized content.
- Human-centered design approaches to aid LLMs in detecting and mitigating false information, e.g., human experts in the loop, and variation in prompting.
- New LLM-powered tools, methods, and applications for improving factuality assessment in social computing and computational social science.
- Biases and blindspots of LLMs in factuality assessment, including approaches for error analysis and model diagnostics.
- Limitations of existing benchmarks for tasks relevant to factuality assessment, e.g., claim verification, fact-checking, stance detection, and misinformation detection.
- Improving dataset and evaluation quality, e.g., avoidance of selection bias, addressing subjective judgments and biases in crowd-sourced annotation.
- Comparative evaluation and implications of open source and commercial LLMs for tasks relevant to factuality assessment.
- How do the reliability and factuality of LLMs impact users (e.g., journalists, software engineers, artists) and communities?
Submission instructions can be found on the workshop website. The workshop will take place as a half-day meeting in June. Authors of accepted papers will have the opportunity to publish their papers through workshop proceedings by the AAAI Press.
Timeline
- Workshop Papers Submission deadline: March 24, 2024
- Notifications: April 14, 2024
- Final Camera-Ready Paper Due: May 5, 2024
- ICWSM-2024 Workshops Day: June 3, 2024
_*INVITATION*_
We kindly invite you to the debate on
*Artificial Intelligence and the Future of the Portuguese Language*
which will take place on March 15, 2024, from 10h30 to 12h00 (Lisbon time)
as part of PROPOR 2024 - 16th International Conference on the Computational
Processing of the Portuguese Language.
As a plenary session of this conference, it will draw on contributions from
the researchers in this field who are gathering here.
It will also include contributions from guests who are experts in the field of
public policies for language promotion and who will help launch the debate:
*Ana Paula Laborinho*
Former President of the Camões Institute for Cooperation and Language,
current Director in Portugal of the OEI Organization of Ibero-American States,
and professor at the University of Lisbon, Faculty of Letters
*Claudio Pinhanez*
Deputy Director of the C4AI Artificial Intelligence Center in São Paulo,
and principal investigator at IBM Research, Brazil
*Ismael Gómez García*
Director of the OEI's Global Digital Strategy
*Valentín García*
Secretary General for Language Policy, Galicia Regional Government
*António Branco (moderator)*
Honorary President of the ELRA Language Resources Association,
Director General of PORTULAN CLARIN Research Infrastructure for the Science
and Technology of Language,
and Professor at the University of Lisbon, Faculty of Sciences
Information about this debate can be found here
https://propor2024.citius.gal/index.php/discussion-panel/
where details on how to participate will be made available
in due course.
++++++++++++++++++++++++++++++++++++++++
_*BACKGROUND*_
For about a year now, it's been a rare day when we don't come across
news, comments, opinions, interviews, debates, podcasts,
prognoses, plans, panics, condemnations, glorifications, warnings, regulations,
fears and hopes about Artificial Intelligence. We have the privilege,
rare in human history, of finding ourselves facing the unprecedented promises
and challenges of a civilizational transformation induced by a technological shock
of a scope never before experienced.
This scientific and social tsunami has its origins in what for decades
has been considered the subarea of AI with the most difficult and challenging
interdisciplinarity. Also known as natural language processing, computational
linguistics, computational language processing, etc., language technology deals
with the most distinctively human cognitive capacity.
No area of human activity will be immune to this technological shock.
Even less so will the very object of its scientific inquiry, natural languages.
It is opportune for the scientists themselves to hold a debate on AI and the
Portuguese language, shifting the perspective from that of passive analysis
to that of building an active contribution:
What is the impact on the future of the Portuguese language and on citizenship
and sovereignty in the age of artificial intelligence?
What is the impact on public policies promoting language and how should
they be rethought and reconfigured?
What is the impact on public policies promoting science and technology and
how should their priorities be rethought and reconfigured?
What is the role of international cooperation, given that Portuguese
is a multicentric language with global projection?
What should we learn from the responses being advanced in other geographies
and for other languages? etc.
The scientific community dedicated to research into Portuguese language
technology has been meeting every other year for 30 years, alternately in Portugal
and Brazil, at the PROPOR international conference, which will be held again soon,
between March 13 and 15, 2024, for the first time in another geography:
https://propor2024.citius.gal
With the help of guest speakers who are experts in the fields of language promotion
and international cooperation, scientific researchers in this field will try to open up
this reflection and contribute to finding answers to these questions in a debate
that will take place on March 15, 2024 between 10h30 and 12h00 (Lisbon time).
Information on this debate can be found here
https://propor2024.citius.gal/index.php/discussion-panel/
where details on how to participate remotely will be made available
in due course, as technical conditions allow.
The debate will take place in Portuguese.
The Faculty of Mathematics and Natural Sciences at Heinrich Heine
University Düsseldorf is inviting applications for a full professorship
(W2) in Machine Learning at the Department of Computer Science, to be
filled as soon as possible.
Ideally, candidates should have outstanding expertise in the field of
Machine Learning, particularly in modern machine learning techniques (e.g.,
large language models and deep learning architectures such as transformers
and related sequence models), and be willing to contribute to collaborative
projects, especially in the field of Natural Language Processing (NLP).
Application deadline: 17 April 2024.
For more information see
https://berufungsportal.hhu.de/VAADIN/dynamic/resource/2/96c63060-a332-4561…
--
Prof. Dr. Laura Kallmeyer
Institut für Linguistik
Heinrich-Heine Universität Duesseldorf
Universitaetsstr. 1
D-40225 Duesseldorf, Germany
https://user.phil.hhu.de/kallmeyer/
Phone +49 (0)211 8113899
We are reaching out to the research community to ask for volunteers to
join the programme committee of ECAI-2024, the 27th European Conference
on Artificial Intelligence.
We are looking for volunteers who have completed their PhD, who have
published at good AI conferences in the past, and who have prior
experience with reviewing.
If you're interested, please sign up here:
https://forms.gle/24AxSkGdKv57cYqSA
Thanks a lot for your support!
Ulle Endriss and Francisco Melo
ECAI-2024 PC Chairs
Humor and Artificial Intelligence Track
=======================================
34th International Society for Humor Studies Conference (ISHS 2024) and
14th Humor Research Conference (HRC)
Hosted online by Texas A&M University-Commerce, April 19 to 21, 2024
https://tamuc.edu/humor
ABSTRACT SUBMISSION DEADLINE: MARCH 25, 2024
Call for papers
---------------
As in previous years, the Humor and AI Special Interest Group
<https://humorstudies.org/Forum/forumdisplay.php?fid=9> of the
International Society for Humor Studies will hold a panel at the 34th
International Society for Humor Studies Conference (ISHS 2024). This
year's conference is being held concurrently with the 14th Humor
Research Conference (HRC), hosted online by Texas A&M
University-Commerce from April 19 to 21, 2024.
We invite 20-minute presentations on AI-based technology for generating,
processing, or analyzing humor, for our dedicated panel that kicks off
ISHS's 2024 webinar series: <https://humorstudies.org/WebinarCenter2024.htm>
Application areas include, but are not limited to:
* human–computer interaction
* computer-mediated communication
* intelligent writing assistants
* conversational agents
* machine and computer-assisted translation
* digital humanities
* natural language processing
* computer vision
Abstracts of 250 words, excluding references, should be submitted by
e-mail to the conveners by March 25, 2024.
Conveners
---------
Kiki Hempelmann, Texas A&M University-Commerce <kiki(a)tamuc.edu>
Tristan Miller, University of Manitoba <Tristan.Miller(a)umanitoba.ca>
Julia M. Rayz, Purdue University <jtaylor1(a)purdue.edu>
--
Dr. Tristan Miller, Assistant Professor
Department of Computer Science, University of Manitoba
https://logological.org/ | Tel. +1 204 474 6792
*Apologies for crossposting*
TermTrends24: Models and Best Practices for Terminology Representation in
the Semantic Web
Workshop colocated with MDTT 2024 <https://mdtt2024.dei.unipd.it/en/>
Date: 26th June, 2024
Venue: Granada, Spain
More info: https://termtrends.linkeddata.es/
*Deadline for paper submission: 7th April (extended from 15 March 2024)*
*About TermTrends*
TermTrends 2024, co-located with MDTT 2024, aims to
provide a discussion forum on the theoretical and methodological approaches
for the representation of terminological data, both at a conceptual and a
linguistic level. In particular, we would like to focus on their connection
to the Linguistic Linked (Open) Data (LLOD) paradigm through the
representation of these data according to Semantic Web formats. By adopting
models or vocabularies proposed for the representation of linguistic data,
we would contribute to the creation of interoperable and reusable
terminological resources.
With this objective, the workshop intends to explore the advantages and
challenges underlying various Terminology-related standardisation
approaches, ranging from the initially proposed standards to represent
terminology within the International Standardisation Organisation (ISO),
such as the TermBase eXchange (TBX) format, to models that represent
linguistic descriptions associated with ontologies in the Semantic Web,
such as SKOS and Ontolex-lemon.
Being multidisciplinary in scope, it focuses on identifying terminological
representation needs, as well as limitations of current models in
addressing such needs, with the aim of also exploring the development of an
extension of the Ontolex-lemon vocabulary and how that may contribute to
overcoming such challenges.
*Call for Papers*
The topics of interest for this workshop include, but are
not limited to, the following topics:
- Terminology Representation Standards
- Terminology as Linguistic Linked (Open) Data
- Interoperability of Terminological Resources
- Reusability of Terminological Resources
- Challenges in Terminology Representation
- Analysis of the structure of Terminological Resources
*Submissions*
Paper proposals should follow the CEUR template. Short and long papers
will be accepted. Following CEUR guidelines, short papers should be 5-6
pages long and long papers 8-10 pages long. Authors must submit their
papers through the EasyChair platform following this link.
*Important Dates*
*7th April (extended from 15 March 2024)* - Deadline for paper submission
*20 April 2024* - Notification to authors
*15 May 2024* - Deadline for camera-ready paper submission
*26 June 2024* - TermTrends Workshop
*Workshop Organisers*
Rute Costa, NOVA FCSH / NOVA CLUNL (Portugal)
Elena Montiel-Ponsoda, Universidad Politécnica de Madrid (Spain)
Sara Carvalho, Univ. de Aveiro / NOVA CLUNL (Portugal)
Patricia Martín-Chozas, Universidad Politécnica de Madrid (Spain)
Federica Vezzani, University of Padova (Italy)
*Patricia Martín Chozas - Postdoctoral Researcher*
*Ontology Engineering Group*
Artificial Intelligence Department
ETSI Informáticos - Universidad Politécnica de Madrid
Phone: (+34) 910673091
Dear all,
Applications are invited for a postdoctoral fellowship with the TartuNLP lab in the Institute of Computer Science at the University of Tartu. The funding for the position is provided by the recently established Estonian Centre of Excellence in AI (EXAI), which gathers various research teams from several Estonian research institutions.
The successful candidate will work with Kairit Sirts on AI-related projects at the interface between natural language processing (mainly using large language model technology) and psychology with the goal of developing chatbots for supporting mental health. The candidate will also participate with their NLP expertise in collaborative projects with other teams that are part of the EXAI.
A suitable candidate has a PhD in natural language processing, artificial intelligence, computer science, or another relevant discipline. They should have a good research and publication track record in NLP. Interest in psychology or mental health related topics is a bonus.
Employer: Institute of Computer Science, University of Tartu
Title: Researcher in natural language processing
Speciality: Natural language processing
Location: Tartu, Estonia
Deadline: 1 April 2024
More info about the job offer, application process and the requirements: https://ut.ee/en/job-offer/research-fellow-natural-language-processing-0
All questions related to the position should be sent to me (kairit.sirts(a)ut.ee).
Best regards
Kairit Sirts