We invite the community to participate in a shared task organized in the
context of the CONDA workshop: https://conda-workshop.github.io/.
Data contamination, where evaluation data is inadvertently included in
the pre-training corpora of large-scale models, and language models (LMs) in
particular, has become a concern in recent times (Sainz et al. 2023
<https://aclanthology.org/2023.findings-emnlp.722/>; Jacovi et al. 2023
<https://aclanthology.org/2023.emnlp-main.308/>). The growing scale of
both models and data, coupled with massive web crawling, has led to the
inclusion of segments from evaluation benchmarks in the pre-training
data of LMs (Dodge et al., 2021
<https://aclanthology.org/2021.emnlp-main.98/>; OpenAI, 2023
<https://arxiv.org/abs/2303.08774>; Google, 2023
<https://arxiv.org/abs/2305.10403>; Elazar et al., 2023
<https://arxiv.org/abs/2310.20707>). The scale of internet data makes it
difficult to prevent this contamination from happening, or even detect
when it has happened (Bommasani et al., 2022
<https://arxiv.org/abs/2108.07258>; Mitchell et al., 2023
<https://arxiv.org/abs/2212.05129>). Crucially, when evaluation data
becomes part of pre-training data, it introduces biases and can
artificially inflate the performance of LMs on specific tasks or
benchmarks (Magar and Schwartz, 2022
<https://aclanthology.org/2022.acl-short.18/>). This poses a challenge
for fair and unbiased evaluation of models, as their performance may not
accurately reflect their generalization capabilities.
The shared task is a community effort on centralized data contamination
evidence collection. While the problem of data contamination is
prevalent and serious, the breadth and depth of this contamination are
still largely unknown. The concrete evidence of contamination is
scattered across papers, blog posts, and social media, and it is
suspected that the true scope of data contamination in NLP is
significantly larger than reported.
With this shared task we aim to provide a structured, centralized
platform for contamination evidence collection to help the community
understand the extent of the problem and to help researchers avoid
repeating the same mistakes. The shared task also gathers evidence of
clean, non-contaminated instances. The platform is already available for
perusal at
https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Report.
Participants in the shared task need to submit their contamination
evidence (see instructions below). The CONDA 2024 workshop organizers
will review the evidence through pull requests.
*Compilation Paper*
As a companion to the contamination evidence platform, we will produce a
paper that will provide a summary and overview of the evidence collected
in the shared task. The participants who contribute to the shared task
will be listed as co-authors in the paper.
*Instructions for Evidence Submission*
Each submission should report a case of contamination, or a lack
thereof. The submission can be either about (1)
contamination in the corpus used to pre-train language models, where the
pre-training corpus contains a specific evaluation dataset, or about (2)
contamination in a model that shows evidence of having seen a specific
evaluation dataset while being trained. Each submission needs to mention
the corpus (or model) and the evaluation dataset, in addition to some
evidence of contamination. Alternatively, we also welcome evidence of a
lack of contamination.
Reports must be submitted through a Pull Request in the Data
Contamination Report space at HuggingFace. The reports must follow the
Contribution Guidelines provided in the space and will be reviewed by
the organizers. If you have any questions, please contact us at
conda-workshop@googlegroups.com or open a discussion in the
space itself.
URL with contribution guidelines:
https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Report
(“Contribution Guidelines” tab)
*Important dates*
* Deadline for evidence submission: July 1, 2024
* Workshop day: August 16, 2024
*Sponsors*
* AWS AI and Amazon Bedrock
* HuggingFace
* Google
*Contact*
* Website: https://conda-workshop.github.io/
* Email: conda-workshop@googlegroups.com
*Organizers*
Oscar Sainz, University of the Basque Country (UPV/EHU)
Iker García Ferrero, University of the Basque Country (UPV/EHU)
Eneko Agirre, University of the Basque Country (UPV/EHU)
Jon Ander Campos, Cohere
Alon Jacovi, Bar Ilan University
Yanai Elazar, Allen Institute for Artificial Intelligence and University
of Washington
Yoav Goldberg, Bar Ilan University and Allen Institute for Artificial
Intelligence
Dear all,
(Apologies for cross-posting)
This is the third CFP for the second Arabic Natural Language Processing
Conference (ArabicNLP 2024), co-located with ACL 2024 in Bangkok,
Thailand, on August 16, 2024 (hybrid mode).
Conference URL: https://arabicnlp2024.sigarab.org/
Upcoming deadline: May 3, 2024: Abstracts of direct conference paper
submissions due (OpenReview)
ArabicNLP 2024 builds on eight previous conference and workshop editions,
which have been very successful, drawing large and active participation in
various capacities (see the Scholar Page
<https://scholar.google.com/citations?user=LGzh8jYAAAAJ>). This conference
is timely given the continued rise in research projects focusing on Arabic
NLP. The conference is organized by the Special Interest Group on Arabic
NLP (SIGARAB <https://www.sigarab.org/>), an Association for Computational
Linguistics Special Interest Group.
Call for Papers
We invite long (up to 8 pages), short (up to 4 pages), and demo paper (up
to 4 pages) submissions. Long and short papers will be presented orally or
as posters as determined by the program committee; presentation mode does
not reflect the quality of the work.
Submissions are invited on topics that include, but are not limited to, the
following:
- Enabling technologies: (any size) language models, diacritization,
lemmatization, morphological analysis, disambiguation, tokenization, POS
tagging, named entity detection, chunking, parsing, semantic role labeling,
sentiment analysis, Arabic dialect modeling, etc.
- Applications: dialog modeling, machine translation, speech recognition,
speech synthesis, optical character recognition, pedagogy, assistive
technologies, social media analytics, etc.
- Resources: dictionaries, annotated data, corpora, etc.
Submissions may include work in progress as well as finished work.
Submissions must have a clear focus on specific issues pertaining to the
Arabic language whether it is standard Arabic, dialectal, classical, or
mixed. Papers on other languages sharing problems faced by Arabic NLP
researchers, such as Semitic languages or languages using Arabic script,
are welcome provided that they propose techniques or approaches that would
be of interest to Arabic NLP, and they explain why this is the case.
Additionally, papers on efforts using Arabic resources but targeting other
languages are also welcome. Descriptions of commercial systems are welcome,
but authors should be willing to discuss the details of their work. We also
welcome position papers and surveys about any of the above topics.
Conference Paper Submission URL:
https://openreview.net/group?id=SIGARAB.org/ArabicNLP/2024/Conference
Important Dates for Conference Papers
- May 3, 2024: Abstract of direct conference paper submissions due (OpenReview)
- May 10, 2024: Full direct conference paper submissions due (OpenReview)
- May 17, 2024: ARR commitment date <https://aclrollingreview.org/dates>
- May 31, 2024: Reviews submission deadline
- June 17, 2024: Notification of acceptance
- July 1, 2024: Camera-ready papers due
- August 16, 2024: ArabicNLP conference
All deadlines are 11:59 pm UTC-12
<https://www.timeanddate.com/time/zone/timezone/utc-12> (“Anywhere on
Earth”).
There are eight exciting shared tasks:
https://arabicnlp2024.sigarab.org/shared-tasks
- Task 1: AraFinNLP: Arabic Financial NLP
- Task 2: FIGNEWS 2024: Shared Task on News Media Narratives of the Israel
War on Gaza
- Task 3: ArAIEval: Propagandistic Techniques Detection in Unimodal and
Multimodal Arabic Content
- Task 4: StanceEval2024: Arabic Stance Evaluation Shared Task
- Task 5: WojoodNER 2024: The 2nd Arabic Named Entity Recognition Shared Task
- Task 6: ArabicNLU Shared-Task: Arabic Natural Language Understanding
- Task 7: NADI 2024: Nuanced Arabic Dialect Identification
- Task 8: KSAA-CAD Shared Task: Contemporary Arabic Reverse Dictionary and
Word Sense Disambiguation
If you have any questions, please contact us at
arabicnlp-pc-chairs@sigarab.org
The ArabicNLP 2024 Organizing Committee
--
Salam Khalifa
PhD Student at Stony Brook Linguistics
<https://www.linguistics.stonybrook.edu/>.
Job title: Design of Information Extraction Tools to characterize
molecules produced or degraded by microbes and applications to
plant-fermented food ecosystems.
MaIAGE-Bibliome (INRAE, University Paris-Saclay), a transdisciplinary
research lab, offers a PhD position in NLP applied to biology and food
science. The candidate will work within the FAIROmics doctoral network,
a Marie Skłodowska-Curie Action that aims to leverage AI techniques to
improve and discover knowledge about fermented food.
The position is located in Jouy-en-Josas (near Paris) and includes a
twelve-month secondment at the Applied AI Research Group at the
University of Szeged (Hungary). Both universities will award the PhD
diploma.
We are looking for candidates with:
- Master’s degree in Computer Science with a solid background in NLP,
AI, and/or ML. A strong academic record is highly desirable.
- Experience in deep learning approaches for NLP.
- Programming skills in Python.
- Very good English skills (both writing and speaking).
- An interest in biology, bioinformatics, and food science.
Application deadline: 15/05/2024 23:59 - Europe/Brussels.
Application form: https://sondages.inrae.fr/index.php/342264 (select DC9)
Detailed description:
https://www.dn-fairomics.eu/open-phd-positions/dc-9-phd-position
FAIROmics Doctoral Network: https://www.dn-fairomics.eu
MaIAGE-Bibliome: https://maiage.inrae.fr/en/bibliome
Department of Software Engineering - University of Szeged:
https://www.sed.inf.u-szeged.hu
Feel free to contact us with any questions: Robert.Bossy@inrae.fr
Apologies for crossposting.
Call for Papers
Information Processing & Management (IPM), Elsevier
- CiteScore: 14.8
- Impact Factor: 8.6
Guest editors:
- Omar Alonso, Applied Science, Amazon, Palo Alto, California, USA.
E-mail: omralon@amazon.com
- Stefano Marchesin, Department of Information Engineering, University of
Padua, Padua, Italy. E-mail: stefano.marchesin@unipd.it
- Gianmaria Silvello, Department of Information Engineering, University
of Padua, Padua, Italy. E-mail: gianmaria.silvello@unipd.it
Special Issue on “Large Language Models and Data Quality for Knowledge
Graphs”
In recent years, Knowledge Graphs (KGs), encompassing millions of
relational facts, have emerged as central assets to support virtual
assistants and search and recommendations on the web. Moreover, KGs are
increasingly used by large companies and organizations to organize and
comprehend their data, with industry-scale KGs fusing data from various
sources for downstream applications. Building KGs involves data management
and artificial intelligence areas, such as data integration, cleaning,
named entity recognition and disambiguation, relation extraction, and
active learning.
However, the methods used to build these KGs involve automated components
that are still imperfect, resulting in KGs with high sparsity that
incorporate several inaccuracies and incorrect facts. As a result, evaluating the KG
quality plays a significant role, as it serves multiple purposes – e.g.,
gaining insights into the quality of data, triggering the refinement of the
KG construction process, and providing valuable information to downstream
applications. In this regard, the information in the KG must be correct to
ensure an engaging user experience for entity-oriented services like
virtual assistants. Despite its importance, there is little research on
data quality and evaluation for KGs at scale.
In this context, the rise of Large Language Models (LLMs) opens up
unprecedented opportunities – and challenges – to advance KG construction
and evaluation, providing an intriguing intersection between human and
machine capabilities. On the one hand, integrating LLMs within KG
construction systems could trigger the development of more context-aware
and adaptive AI systems. At the same time, however, LLMs are known to
hallucinate and can thus generate mis/disinformation, which can affect the
quality of the resulting KG. In this sense, reliability and credibility
components are of paramount importance to manage the hallucinations
produced by LLMs and avoid polluting the KG. On the other hand,
investigating how to combine LLMs and quality evaluation has excellent
potential, as shown by promising results from using LLMs to generate
relevance judgments in information retrieval.
Thus, this special issue promotes novel research on human-machine
collaboration for KG construction and evaluation, fostering the
intersection between KGs and LLMs. To this end, we encourage submissions
related to using LLMs within KG construction systems, evaluating KG
quality, and applying quality control systems to empower KG and LLM
interactions in both research- and industry-oriented scenarios.
Topics include but are not limited to:
- KG construction systems
- Use of LLMs for KG generation
- Efficient solutions to deploy LLMs on large-scale KGs
- Quality control systems for KG construction
- KG versioning and active learning
- Human-in-the-loop architectures
- Efficient KG quality assessment
- Quality assessment over temporal and dynamic KGs
- Redundancy and completeness issues
- Error detection and correction mechanisms
- Benchmarks and Evaluation
- Domain-specific applications and challenges
- Maintenance of industry-scale KGs
- LLM validation via reliable/credible KG data
Submission guidelines:
Authors are invited to submit original and unpublished papers. All
submissions will be peer-reviewed and judged on originality, significance,
quality, and relevance to the special issue topics of interest. Submitted
papers should not have appeared in or be under consideration for another
journal.
Papers can be submitted up to 1 September 2024. The estimated publication
date for the special issue is 15 January 2025.
Paper submission via the IP&M electronic submission system:
https://www.editorialmanager.com/IPM
To submit your manuscript to the special issue, please choose the article
type:
"VSI: LLMs and Data Quality for KGs".
More info here:
https://www.sciencedirect.com/journal/information-processing-and-management…
Instructions for authors:
https://www.sciencedirect.com/journal/information-processing-and-management…
Important dates:
- Submissions close: 1 September 2024
- Publication date (estimated): 15 January 2025
References:
G. Weikum, X. L. Dong, S. Razniewski, et al. 2021. Machine Knowledge:
Creation and Curation of Comprehensive Knowledge Bases. Found. Trends
Databases 10, 108–490.
A. Hogan, E. Blomqvist, M. Cochez, et al. 2021. Knowledge Graphs. ACM
Comput. Surv. 54, 71:1–71:37.
B. Xue and L. Zou. 2023. Knowledge Graph Quality Management: A
Comprehensive Survey. IEEE Trans. Knowl. Data Eng. 35, 5 (2023), 4969–4988.
G. Faggioli, L. Dietz, C. L. A. Clarke, G. Demartini, M. Hagen, C. Hauff,
N. Kando, E. Kanoulas, M. Potthast, B. Stein, and H. Wachsmuth. 2023.
Perspectives on Large Language Models for Relevance Judgment. In Proc. of
the 2023 ACM SIGIR International Conference on Theory of Information
Retrieval, ICTIR 2023, Taipei, Taiwan, 23 July 2023. ACM, 39–50.
S. MacAvaney and L. Soldaini. 2023. One-Shot Labeling for Automatic
Relevance Estimation. In Proc. of the 46th International ACM SIGIR
Conference on Research and Development in Information Retrieval, SIGIR
2023, Taipei, Taiwan, July 23-27, 2023. ACM, 2230–2235.
X. L. Dong. 2023. Generations of Knowledge Graphs: The Crazy Ideas and the
Business Impact. Proc. VLDB Endow. 16, 12 (2023), 4130–4137.
S. Pan, L. Luo, Y. Wang, C. Chen, J. Wang, and X. Wu. 2023. Unifying Large
Language Models and Knowledge Graphs: A Roadmap. CoRR abs/2306.08302 (2023).
--
Stefano Marchesin, PhD
Assistant Professor (RTD/a)
Information Management Systems (IMS) Group
Department of Information Engineering
University of Padua
Via Gradenigo 6/a, 35131 Padua, Italy
Home page: http://www.dei.unipd.it/~marches1/
9th Symposium on Corpus Approaches to Lexicogrammar (LxGr2024)
CALL FOR PAPERS
Extended deadline for abstract submission: 15 April 2024
The symposium will take place online on Friday 5 and Saturday 6 July 2024.
Invited Speakers
Lise Fontaine<http://www.uqtr.ca/PagePerso/Lise.Fontaine> (Université du Québec à Trois-Rivières): Reconciling (or not) lexis and grammar
Ute Römer-Barron<http://alsl.gsu.edu/profile/ute-romer> (Georgia State University): Phraseology research in second language acquisition
LxGr primarily welcomes papers reporting on corpus-based research on any aspect of the interaction of lexis and grammar - particularly studies that interrogate the system lexicogrammatically to get lexicogrammatical answers. However, position papers discussing theoretical or methodological issues are also welcome, as long as they are relevant to both lexicogrammar and corpus linguistics.
If you would like to present, send an abstract of 500 words (excluding references) to lxgr@edgehill.ac.uk
Abstracts for research papers should specify the research focus (research questions or hypotheses), the corpus, the methodology (techniques, metrics), the theoretical orientation, and the main findings. Abstracts for position papers should specify the theoretical orientation and the potential contribution to both lexicogrammar and corpus linguistics.
Abstracts will be double-blind reviewed by members of the Programme Committee<https://sites.edgehill.ac.uk/lxgr/committee>.
Full papers will be allocated 35 minutes (including 10 minutes for discussion).
Work-in-progress reports will be allocated 20 minutes (including 5 minutes for discussion).
There will be no parallel sessions.
Participation is free.
For details, visit the LxGr website: https://sites.edgehill.ac.uk/lxgr/lxgr2024
If you have any questions, contact gabrielc@edgehill.ac.uk
The Department of Digital Humanities, Faculty of Arts, University of Helsinki, invites applications for the position of
UNIVERSITY LECTURER IN HUMANITIES DATA SCIENCE / COMPUTATIONAL HUMANITIES
for a permanent appointment starting 1st of September 2024.
https://jobs.helsinki.fi/job/Helsinki-University-Lecturer-in-Humanities-Dat…
Due date: April 25, 2024
The position relates to the application of computational and/or statistical methods in the humanities. The application areas are to be interpreted broadly, from area studies to cognitive science, linguistics to history, phonetics to literature. Application, on the other hand, is to be understood primarily from the viewpoint of end-use across this plethora of humanistic research, e.g. through matching approaches to research questions and data, and not as a focus on methodological development itself. The lecturer will be attached to the Liberal Arts and Sciences bachelor’s programme currently under preparation at the university.
——————————————
Jörg Tiedemann
University of Helsinki
https://blogs.helsinki.fi/language-technology/
The first workshop on evaluating IR systems with Large Language Models
(LLMs) is accepting submissions that describe original research findings,
preliminary research results, proposals for new work, and recent relevant
studies already published in high-quality venues.
Topics of interest
We welcome both full papers and extended abstract submissions on the
following topics, including but not limited to:
- LLM-based evaluation metrics for traditional IR and generative IR.
- Agreement between human and LLM labels.
- Effectiveness and/or efficiency of LLMs to produce robust relevance
labels.
- Investigating LLM-based relevance estimators for potential systemic
biases.
- Automated evaluation of text generation systems.
- End-to-end evaluation of Retrieval Augmented Generation systems.
- Trustworthiness in LLM-based evaluation.
- Prompt engineering for LLM-based evaluation.
- Effectiveness and/or efficiency of LLMs as ranking models.
- LLMs in specific IR tasks such as personalized search, conversational
search, and multimodal retrieval.
- Challenges and future directions in LLM-based IR evaluation.
Submission guidelines
We welcome the following submissions:
- Previously unpublished manuscripts will be accepted as extended
abstracts or full papers (any length between 1 and 9 pages) with unlimited
references, formatted according to the latest ACM SIG proceedings template
available at http://www.acm.org/publications/proceedings-template.
- Published manuscripts can be submitted in their original format.
All submissions should be made through Easychair:
https://easychair.org/conferences/?conf=llm4eval
All papers will be peer-reviewed (single-blind) by the program committee
and judged by their relevance to the workshop, especially to the main
themes identified above, and their potential to generate discussion. For
already published studies, the paper can be submitted in the original
format. These submissions will be reviewed for their relevance to this
workshop. All submissions must be in English (PDF format).
All accepted papers will have a poster presentation with a few selected for
spotlight talks. Accepted papers may be uploaded to arXiv.org, allowing
submission elsewhere as they will be considered non-archival. The
workshop’s website will maintain a link to the arXiv versions of the papers.
Important Dates
- Submission Deadline: April 25th, 2024 (AoE time)
- Acceptance Notifications: May 31st, 2024 (AoE time)
- Workshop date: July 18, 2024
Website
For more information, visit the workshop website:
https://llm4eval.github.io/
Contact
For any questions about paper submission, you may contact the workshop
organizers at llm4eval@easychair.org
--
Apologies for cross-posting.
--
Have you recently completed, or do you expect to complete very soon, an MSc
or equivalent degree in computer science, artificial intelligence,
computational linguistics, engineering, or a related area? Are you
interested in carrying out research on automatic translation over the next
few years? Are you excited to spend a part of your life in a pleasant city
in the heart of the Italian Alps?
WE ARE LOOKING FOR YOU!!!
The Machine Translation <https://mt.fbk.eu/> (MT) group at Fondazione Bruno
Kessler (Trento, Italy) in conjunction with the ICT International Doctorate
School of the University of Trento <https://iecs.unitn.it/> is pleased to
announce the availability of the following fully-funded PhD position:
TITLE: Resource-efficient Foundation Models for Automatic Translation
DESCRIPTION:
The advent of foundation models has led to impressive advancements in all
areas of natural language processing. However, their huge size poses
limitations due to the significant computational costs associated with
their use or adaptation. When applying them to specific tasks, fundamental
questions arise: do we actually need all the architectural complexity of
large and - by design - general-purpose foundation models? Can we optimize
them to achieve higher efficiency? These questions spark interest in
research aimed at reducing models’ size, or deploying efficient decoding
strategies, so as to accomplish the same tasks while maintaining or even
improving performance. Success in this direction would lead to significant
practical and economic benefits (e.g., lower adaptation costs, the
possibility of local deployment on small-sized hardware devices), as well
as advantages from an environmental impact perspective towards sustainable
AI. Focusing on automatic translation, this PhD aims to understand the
functioning dynamics of general-purpose massive foundation models and
explore possibilities to streamline them for specific tasks. Possible areas
of interest range from textual and speech translation (e.g., how to
streamline a massively multilingual model to best handle a subset of
languages?) to scenarios where the latency is a critical factor, such as in
simultaneous/streaming translation (e.g., how to streamline the model to
reduce latency?), to automatic subtitling of audiovisual content (e.g., how
to streamline the model without losing its ability to generate compact
outputs suitable for subtitling?).
CONTACTS: Matteo Negri (negri@fbk.eu), Luisa Bentivogli (bentivo@fbk.eu)
COMPLETE DETAILS AVAILABLE AT:
https://iecs.unitn.it/education/admission/call-for-application
IMPORTANT DATES:
The deadline for applications is May 7th, 2024, at 4:00 PM (CEST).
Prospective candidates are strongly invited to contact us in advance for
preliminary interviews. Precedence for interviews will be given to
short-listed candidates who send us a complete CV via email
(negri@fbk.eu, bentivo@fbk.eu) by April 22, 2024.
Candidate profile
The ideal candidate will have recently completed, or expect to complete very
soon, an MSc or equivalent degree in computer science, artificial
intelligence, computational linguistics, engineering, or a closely related
area. In addition, the applicant should:
- Have an interest in Machine and Speech Translation
- Have experience in deep learning and machine learning, in general
- Have good programming skills in Python and experience in PyTorch
- Enjoy working with real-world problems and large data sets
- Have good knowledge of written and spoken English
- Enjoy working in a closely collaborating team
Working Environment
The doctoral student will be employed at the MT group at Fondazione Bruno
Kessler, Trento, Italy. The group (about 10 people including staff and
students) has a long tradition in research on machine and speech
translation and is currently involved in several projects. Former students
are now employed at leading IT companies around the world.
Benefits
Fondazione Bruno Kessler offers an attractive benefits package, including a
flexible work week, full reimbursement for conferences and summer schools,
a competitive salary, an excellent team of supervisors and mentors, help
with housing, full health insurance, the possibility of Italian courses,
and sporting facilities.
Further Information
For preliminary interviews, or should you need further information about
the position, please contact Matteo Negri (negri@fbk.eu) and Luisa
Bentivogli (bentivo@fbk.eu).
Best Regards,
Matteo Negri
--
Dear Colleagues,
We are pleased to announce that the 2024 edition of the *Lectures on
Computational Linguistics*, a series of lectures dedicated to central topics in
Computational Linguistics and Natural Language Processing, will be held in
Bari from June 19 to 21.
The programme and all information are available at
https://www.ai-lc.it/en/lectures-2/lectures-2024/.
The 2024 edition is organized by the Italian Association of Computational
Linguistics/Associazione Italiana di Linguistica Computazionale (AILC) with
the Department of Computer Science and the Department of Humanistic
Research and Innovation of the University of Bari 'Aldo Moro'.
The school is interdisciplinary, spanning several areas, particularly the
Humanities, Computer Science, and Artificial Intelligence.
The program includes tutorials, labs, evening lectures, and two student
presentation sessions. The 2024 edition features a four-hour tutorial
dedicated to introducing Large Language Models to a broad audience.
*Programme*
*Wednesday, June 19, 2024*
9:00 – 9:30: Welcome and opening
9:30 – 11:30: Tutorial 1 (part 1) – Introduction to Large Language Models –
Andrey Kutuzov, Language Technology Group, University of Oslo
11:30 – 12:00: BREAK
12:00 – 13:30: Student session
13:30 – 15:00: LUNCH
15:00 – 17:00: Tutorial 1 (part 2) – Introduction to Large Language
Models – Andrey Kutuzov, Language Technology Group, University of Oslo
17:00 – 17:30: BREAK
17:30 – 18:30: Evening lecture
19:30: Welcome drink
*Thursday, June 20, 2024*
9:00 – 11:00: Tutorial 2 – Computational methods for lexical semantic
change detection – Nina Tahmasebi, University of Gothenburg
11:00 – 11:30: BREAK
11:30 – 13:30: Lab 1 (part 1) – Hands-on Large Language Models – Marco
Polignano & Lucia Siciliani, University of Bari Aldo Moro
13:30 – 15:00: LUNCH
15:00 – 17:00: Lab 1 (part 2) – Hands-on Large Language Models – Marco
Polignano & Lucia Siciliani, University of Bari Aldo Moro
17:00 – 17:30: BREAK
17:30 – 18:30: Evening lecture
19:00: Tour of the Old Town and dinner with typical local food
*Friday, June 21, 2024*
9:00 – 11:00: Tutorial 3 – Dissociating language and thought in Large
Language Models – Anna Ivanova, School of Psychology, Georgia Tech
11:00 – 11:30: BREAK
11:30 – 13:00: Student session
13:00 – 14:00: LUNCH
14:00 – 16:00: Lab 2 – Computational methods for lexical semantic
change detection – Pierluigi Cassotti, University of Gothenburg
*Registration*
The school is mainly aimed at doctoral and Master's degree students,
although no minimum qualification is required for access. Participation
is free but subject to registration, and places are limited to 200.
Students wishing to present aspects of their work in the "Student
Presentations" sessions are asked to send a 500-word abstract to
ailc.lectures@gmail.com by May 10, 2024. Notifications of acceptance will
be sent by May 31.
Scientific Committee
Pierpaolo Basile (University of Bari Aldo Moro)
Raffaella Bernardi (University of Trento)
Tommaso Caselli (University of Groningen)
Felice Dell'Orletta (Institute of Computational Linguistics CNR – Pisa)
Elisabetta Jezek (University of Pavia)
Local Organizing Committee
Pierpaolo Basile (Department of Computer Science, University of Bari Aldo
Moro)
Marco de Gemmis (Department of Computer Science, University of Bari Aldo
Moro)
Maristella Gatto (Department of Humanistic Research and Innovation,
University of Bari Aldo Moro)
Olimpia Imperio (Coordinator of the Doctorate in Letters, Languages and
Arts, Department of Humanistic Research and Innovation, University of Bari
Aldo Moro)
Secretariat
Lucia Siciliani (Department of Computer Science, University of Bari Aldo
Moro)
Contacts: ailc.lectures@gmail.com
--
*Linguistica computazionale. Introduzione all'analisi automatica dei testi
<https://www.mulino.it/isbn/9788815290359>.*
Bologna, Il Mulino, in bookstores from 3 March 2023
--
Elisabetta Jezek
Dipartimento di Studi Umanistici
Professore Associato di Glottologia e Linguistica
Corso Strada Nuova 65 - 27100 Pavia (Italia)
<http://maps.google.com/?q=Corso+Strada+Nuova+65+27100+Pavia+%28Italia%29>
T. 0382984391
https://studiumanistici.unipv.it/?pagina=docenti&id=135
Elisabetta Jezek's Personal Meeting Room
https://us02web.zoom.us/j/7814331810