Processing of figurative language is a rapidly growing area in NLP, including computational modeling of metaphors, idioms, puns, irony, sarcasm, simile, and other figures. Characteristic of all areas of human activity (from poetic and ordinary to scientific and social media discourse) and, thus, of all types of discourse, figurative language poses an important problem for NLP systems. Its ubiquity in language has been established in a number of corpus studies, and the role it plays in human reasoning has been confirmed in psychological experiments. This makes figurative language an important research area for computational and cognitive linguistics, and its automatic identification, interpretation, and generation indispensable for any semantics-oriented NLP application.
The proposed workshop will be the fourth edition of the biennial Workshop on Figurative Language Processing, whose previous editions were held at NAACL 2018, ACL 2020, and EMNLP 2022. The workshop builds upon a long series of related workshops that the current organizers have been involved with: the “Metaphor in NLP” series (2013-2016) and the “Computational Approaches to Linguistic Creativity” series (2009-2010). We expand the scope to incorporate various types of figurative language, with the aim of maintaining and nourishing a community of NLP researchers interested in this topic. The main focus will be on computational modeling of figurative language; however, papers on cognitive, linguistic, social, rhetorical, and applied aspects are also of interest, provided that they are presented within a computational, formal, or quantitative framework. Recent advances in language models have led to several works on figurative language understanding (Chakrabarty et al., 2022a; Chakrabarty et al., 2022b; Liu et al., 2022; Hu et al., 2023) and generation (Stowe et al., 2021; Chakrabarty et al., 2021; Sun et al., 2022; Tian et al., 2021). At the same time, large language models have opened up opportunities to utilize figurative language in scientific (Kim et al., 2023) as well as creative writing (Chakrabarty et al., 2022c; Tian et al., 2022). Additionally, there has been recent work on multimodal figurative language generation (Chakrabarty et al., 2023; Akula et al., 2023), understanding (Hessel et al., 2023; Yosef et al., 2023), and interpretation (Hwang et al., 2023; Desai et al., 2022; Kumar et al., 2022). We encourage submissions along these axes.
Topics of Interest
The workshop will solicit both full papers and short papers for either oral or poster presentation. Topics will include, but will not be limited to, the following:
Identification and interpretation of different types of figurative language: linguistic, conceptual, and extended metaphor; irony, sarcasm, puns, simile, metonymy, personification, synecdoche, hyperbole
Generation of different types of figurative language: sarcasm, simile, metaphors, humor, hyperbole
Multilingual and multimodal figurative language processing
Resources and evaluation
Annotation of figurative language in corpora
Datasets for evaluation of tools
Evaluation methodologies
Figurative use in low-resource languages
Processing of figurative language for NLP applications
Figurative language in sentiment analysis; dialogue systems; computational social science; educational applications
Figurative language and mental health
Figurative language in digital humanities
Figurative language in creative writing
Figurative language and cognition
Cognitive models of processing of figurative language by the human brain
Human-AI collaboration for figurative language
Shared Tasks
Multilingual euphemism detection: Euphemisms are a linguistic device used to soften or neutralize language that may otherwise be harsh or awkward to state directly (e.g. "between jobs" instead of "unemployed", "late" instead of "dead", "collateral damage" instead of "war-related civilian deaths"). By acting as alternative words or phrases, euphemisms are used in everyday language to maintain politeness, mitigate discomfort, or conceal the truth. While they are culturally dependent, the need to discuss sensitive topics in a non-offensive way is universal, suggesting similarities in the way euphemisms are used across languages and cultures. We propose a shared task in which participants will need to disambiguate sentences in multiple languages as either euphemistic or not. The dataset will include English, Mandarin, Spanish, Yoruba, and possibly additional languages.
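As an illustration only (not part of the official task materials), a minimal lexicon-lookup baseline for this binary formulation might look as follows in Python; the toy lexicon and the input format are assumptions:

# Minimal baseline sketch for binary euphemism detection.
# The lexicon below is a toy example; the official task data may differ.
EUPHEMISM_LEXICON = {"between jobs", "late", "collateral damage"}

def is_euphemistic(sentence: str) -> int:
    """Label a sentence 1 (euphemistic) if it contains a known
    euphemistic expression, else 0. A real system must disambiguate
    literal vs. euphemistic uses of the same phrase in context."""
    text = sentence.lower()
    return int(any(term in text for term in EUPHEMISM_LEXICON))

print(is_euphemistic("He has been between jobs since March."))  # 1
print(is_euphemistic("The train was late again."))  # 1: false positive on a literal use

Such a lookup baseline illustrates why the task is framed as disambiguation: the same phrase can be literal or euphemistic depending on context.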
Understanding of Figurative Language through Visual Entailment: One important modality that has gained interest recently is vision, namely the interpretation of figurative language in media such as memes, art, or comics. This task is challenging because it involves reasoning abstractly about images, as well as understanding social commonsense and cultural context. We will frame this as a visual entailment task where a model not only has to predict whether a caption entails the content of the image but must also provide a free-text explanation justifying the label prediction. These tasks have proved difficult for state-of-the-art multimodal models in the past. We will provide a paper and a baseline for this task.
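Purely for illustration, the input/output of such a system could be structured as in the following Python sketch; the field names and the binary label set are assumptions, not the official data format:

from dataclasses import dataclass

@dataclass
class VisualEntailmentExample:
    image_path: str   # e.g. a meme, artwork, or comic panel
    caption: str      # textual hypothesis to check against the image
    label: str        # "entailment" or "no entailment" (assumed label set)
    explanation: str  # free-text justification for the predicted label

# A system receives (image_path, caption) and must output (label, explanation).
prediction = VisualEntailmentExample(
    image_path="meme_001.png",
    caption="The meme ironically praises Monday mornings.",
    label="entailment",
    explanation="The cheerful text contrasts with the exhausted figure, signalling irony.",
)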
Important Dates
Long, Short & Demonstration Paper Submission: March 10th, 2024
Long, Short & Demonstration Paper Notification: April 14th, 2024
Final Paper Submission: April 24th, 2024
Workshop: June 21/22, 2024
For more information, please check https://sites.google.com/view/figlang2024
In this newsletter:
LDC membership discounts expire March 1
Spring 2024 data scholarship recipients
Four corpora withdrawn from the LDC Catalog
New publications:
Second Language University Speech Intelligibility Corpus<https://catalog.ldc.upenn.edu/LDC2024S02>
AIDA Scenario 1 Practice Topic Annotation<https://catalog.ldc.upenn.edu/LDC2024T02>
________________________________
LDC membership discounts expire March 1
Time is running out to save on 2024 membership fees. Renew your LDC membership, rejoin the Consortium, or become a new member by March 1 to receive a discount of up to 10%. For more information on membership benefits and options, visit Join LDC<https://www.ldc.upenn.edu/members/join-ldc>.
Spring 2024 data scholarship recipients
Congratulations to the recipients of LDC's Spring 2024 data scholarships:
Jordan Chandler: Université Rennes 2 (France): Master's student, English Studies. Jordan is awarded a copy of Penn Parsed Corpora of Historical English LDC2020T16 to continue his research on the historical development of adjective, quantifier, and article indefiniteness in the English language.
Nikhil Raghav: TCG Crest (India): PhD candidate, Institute for Advancing Intelligence. Nikhil is awarded copies of Third DIHARD Challenge Development LDC2022S12 and Third DIHARD Challenge Evaluation LDC2022S14 for his work in speaker diarization.
Abraham Sanders: Rensselaer Polytechnic Institute (USA): PhD candidate, Cognitive Science. Abraham is awarded copies of Fisher English Training Speech Part 1 Speech LDC2004S13, Fisher English Training Speech Part 1 Transcripts LDC2004T19, Fisher English Training Part 2 Speech LDC2005S13, and Fisher English Training Part 2 Transcripts LDC2005T19, for his work in spoken dialogue systems.
The next round of applications will be accepted in September 2024. For information about the program, visit the Data Scholarships page<https://www.ldc.upenn.edu/language-resources/data/data-scholarships>.
Four corpora withdrawn from the LDC Catalog
We regret to announce that The New York Times Annotated Corpus, LDC2008T19, has been withdrawn from the LDC Catalog by the data provider. Because they contain data from LDC2008T19, the following three corpora are also withdrawn from the Catalog: Benchmarks for Open Relation Extraction LDC2014T27, Concretely Annotated New York Times LDC2018T12, and News Sub-domain Named Entity Recognition LDC2023T12. Organizations and individuals who have previously licensed any of these data sets can continue to use them under the terms of their respective special license agreements.
________________________________
New publications:
Second Language University Speech Intelligibility Corpus<https://catalog.ldc.upenn.edu/LDC2024S02> was developed by Northern Arizona University, The Pennsylvania State University, and The University of Texas at Dallas. It contains 10.5 hours of English speech collected from 66 international faculty and university students representing 15 language backgrounds at 10 North American universities. This release also includes orthographic transcriptions for all recordings, intelligibility scores for 73% of the files, speaker metadata, and aligned Praat textgrids.
The speech data comprises presentations, descriptions, reflections, and microteaching tasks. Speakers were recruited from courses at intensive English programs and from oral skills courses for international graduate students seeking to become international teaching assistants.
2024 members can access this corpus through their LDC accounts provided they have submitted a completed copy of the special license agreement. Non-members may license this data for a fee.
*
AIDA Scenario 1 Practice Topic Annotation<https://catalog.ldc.upenn.edu/LDC2024T02> was developed by LDC and comprises annotations for 212 English, Russian, and Ukrainian web documents (text, image, and video) from AIDA Scenario 1 Practice Topic Source Data (LDC2023T11)<https://catalog.ldc.upenn.edu/LDC2023T11>, specifically, the set of practice documents designated for annotation in Phase 1.
Annotations are presented as tab separated files in the following categories for each topic:
* Mentions: single references in source data to a real-world entity or filler, event, or relation.
* Slots: pre-defined roles in an event or relation filled by an argument (entity mention).
* Linking: entity mentions linked to entries in the knowledge base as a way of indicating the real-world entity to which each mention refers.
2024 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc@ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
[apologies if you receive multiple copies of this call]
Dear colleagues and friends,
*We are pleased to release the 1st Call for Participation - CLEF 2024
SimpleText Task 4: SOTA?*
*Overview:* SOTA? is introduced as Task 4 in the SimpleText track of CLEF
2024. The goal of the SOTA? shared task is to develop systems which, given
the full text of an AI paper, can recognize whether the paper reports model
scores on benchmark datasets and, if so, extract all pertinent (Task,
Dataset, Metric, Score) quadruples presented within the paper.
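For illustration, the target output can be thought of as a list of
(Task, Dataset, Metric, Score) quadruples, as in the Python sketch below;
the example values are invented, not drawn from the task data:

from typing import NamedTuple

class LeaderboardEntry(NamedTuple):
    task: str     # e.g. "question answering"
    dataset: str  # e.g. "SQuAD 1.1"
    metric: str   # e.g. "F1"
    score: str    # the value as reported in the paper

# Expected output for a paper that reports benchmark results; an empty
# list (or an explicit "no leaderboard" flag) for papers that do not.
predictions = [
    LeaderboardEntry("question answering", "SQuAD 1.1", "F1", "93.2"),
]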
More info on the task website:
https://sites.google.com/view/simpletext-sota/home
SOTA? will be divided into two evaluation phases:
- Evaluation Phase 1: Few-shot Testing;
- Evaluation Phase 2: Zero-shot Testing
*To participate in SOTA?, i.e. SimpleText Task 4 @ CLEF 2024, please
register your team*:
1. CLEF 2024 official registration page
https://clef2024.imag.fr/index.php?page=Pages/registration.html
2. Codalab competition site:
https://codalab.lisn.upsaclay.fr/competitions/16616
Note that SOTA? is organized as a new task this year under the "SimpleText -
Improving Access to Scientific Texts for Everyone" initiative
https://simpletext-project.com/. Please also take a look at the other three
tasks (Tasks 1, 2, and 3) offered by SimpleText and select one or more of
them if you are interested. Note that the dataset for "Task 4 - SOTA?" is
independent of those used in the other three SimpleText tasks.
*Dates*
Training and validation datasets available: Feb 1, 2024
Test data available/Evaluation starts: April 23, 2024
Evaluation ends: May 3, 2024
Participant paper submissions due: May 31, 2024
Notification to authors: June 24, 2024
Camera ready due: July 8, 2024
CLEF 2024 Workshop, Grenoble, France: 9-12 September 2024
*Task Organizers*
Jennifer D’Souza (TIB Leibniz Information Centre for Science and Technology
- Germany)
Salomon Kabongo (L3S Research Center, Germany)
Hamed Babaei Giglou (TIB Leibniz Information Centre for Science and
Technology - Germany)
Yue Zhang (Berlin Technical University, Germany)
Sören Auer (TIB Leibniz Information Centre for Science and Technology -
Germany)
*We look forward to having you on board!*
*Contact:* sota.task [at] gmail.com
The Institute of Translation Studies and Specialised Communication, Department of Language and Information Sciences, University of Hildesheim (Germany, https://www.uni-hildesheim.de/fb3/institute/institut-fuer-uebersetzungswiss…) is seeking to fill a lectureship position. (Near-)native command of English is a must; very good command of German is also required. See details in the official announcement (in German):
https://bewerbung.uni-hildesheim.de/jobposting/3653d72a8c32e0c740078c88667d…
Application *deadline*: 22nd of March 2024
--
Prof. Dr. Ekaterina Lapshinova-Koltunski
Managing Director
Institut für Übersetzungswissenschaft und Fachkommunikation
Fachbereich 3: Sprach- und Informationswissenschaften
Stiftung Universität Hildesheim
Lübecker Straße 3
31141 Hildesheim
+49 5121 883-30934
I will start a new research group on natural language processing as part
of the Bamberg AI Center (https://www.uni-bamberg.de/en/bacai/). We do
fundamental NLP research at the intersection of computational psychology,
digital humanities, and computational social sciences.
There are currently four open positions (deadline February 28, 2024):
1. Postdoc, Open Topic (3 years)
2. PhD student in interactive prompt optimization (3 years)
3. Researcher in event-centered emotion analysis (1 year)
4. Researcher in multimodal emotion analysis (1 year)
Positions 3 and 4 can be combined into a single 2-year position.
Please find more details at
https://www.bamnlp.de/openpositions/
Do not hesitate to contact me if you have questions!
Roman Klinger
Dear all,
Some of you might be interested in LancsLex, a new free online tool developed at Lancaster University for the analysis of English vocabulary. It is available at https://lancslex.lancs.ac.uk/
It is based on recent research (2024) that led to the publication of the Frequency Dictionary of British English: Core Vocabulary and Exercises for Learners https://cass.lancs.ac.uk/words-words-words-a-new-frequency-dictionary-of-br…
Best,
Vaclav
Professor Vaclav Brezina
Professor in Corpus Linguistics
Department of Linguistics and English Language
ESRC Centre for Corpus Approaches to Social Science
Faculty of Arts and Social Sciences, Lancaster University
Lancaster, LA1 4YD
Office: County South, room C05
T: +44 (0)1524 510828
@vaclavbrezina
<http://www.lancaster.ac.uk/arts-and-social-sciences/about-us/people/vaclav-…>
We invite proposals for tasks to be run as part of SemEval-2025
<https://semeval.github.io/SemEval2025/>. SemEval (the International
Workshop on Semantic Evaluation) is an ongoing series of evaluations of
computational semantics systems, organized under the umbrella of SIGLEX
<https://siglex.org/>, the Special Interest Group on the Lexicon of the
Association for Computational Linguistics.
SemEval tasks explore the nature of meaning in natural languages: how to
characterize meaning and how to compute it. This is achieved in practical
terms, using shared datasets and standardized evaluation metrics to
quantify the strengths and weaknesses of possible solutions. SemEval tasks
encompass a broad range of semantic topics from the lexical level to the
discourse level, including word sense identification, semantic parsing,
coreference resolution, and sentiment analysis, among others.
For SemEval-2025 <https://semeval.github.io/SemEval2025/cft>, we welcome
tasks that can test an automatic system for the semantic analysis of text
(e.g., intrinsic semantic evaluation, or an application-oriented
evaluation). We especially encourage tasks for languages other than
English, cross-lingual tasks, and tasks that develop novel applications of
computational semantics. See the websites of previous editions of SemEval
to get an idea about the range of tasks explored, e.g. SemEval-2020
<http://alt.qcri.org/semeval2020/> and SemEval-2021 through SemEval-2024
<https://semeval.github.io/>.
We strongly encourage proposals based on pilot studies that have already
generated initial data, evaluation measures, and baselines. In this way, we
can avoid unforeseen challenges down the road that may delay the task.
In case you are not sure whether a task is suitable for SemEval, please
feel free to get in touch with the SemEval organizers at
semevalorganizers@gmail.com to discuss your idea.
=== Task Selection ===
Task proposals will be reviewed by experts, and reviews will serve as the
basis for acceptance decisions. Everything else being equal, more
innovative new tasks will be given preference over task reruns. Task
proposals will be evaluated on:
- Novelty: Is the task on a compelling new problem that has not been
explored much in the community? Is the task a rerun, but covering
substantially new ground (new subtasks, new types of data, new languages,
etc.)?
- Interest: Is the proposed task likely to attract a sufficient number
of participants?
- Data: Are the plans for collecting data convincing? Will the resulting
data be of high quality? Will annotations have meaningfully high
inter-annotator agreements? Have all appropriate licenses for use and
re-use of the data after the evaluation been secured? Have all
international privacy concerns been addressed? Will the data annotation be
ready on time?
- Evaluation: Is the methodology for evaluation sound? Is the necessary
infrastructure available or can it be built in time for the shared task?
Will research inspired by this task be able to evaluate in the same manner
and on the same data after the initial task?
- Impact: What is the expected impact of the data in this task on future
research beyond the SemEval Workshop?
- Ethical: The data must be compliant with privacy policies, e.g.:
a) avoid personally identifiable information (PII); tasks aimed at
identifying specific people will not be accepted;
b) avoid medical decision making (comply with HIPAA; do not try to
replace medical professionals, especially in anything related to mental
health);
c) these examples are representative, not exhaustive.
=== New Tasks vs. Task Reruns ===
We welcome both new tasks and task reruns. For a new task, the proposal
should address whether the task would be able to attract participants.
Preference will be given to novel tasks that have not received much
attention yet.
For reruns of previous shared tasks (whether or not the previous task was
part of SemEval), the proposal should address the need for another
iteration of the task. Valid reasons include: a new form of evaluation
(e.g. a new evaluation metric, a new application-oriented scenario), new
genres or domains (e.g. social media, domain-specific corpora), or a
significant expansion in scale. We further discourage carrying over a
previous task and just adding new subtasks, as this can lead to the
accumulation of too many subtasks. Evaluating on a different dataset with
the same task formulation, or evaluating on the same dataset with a
different evaluation metric, typically should not be considered a separate
subtask.
=== Task Organization ===
We welcome people who have never organized a SemEval task before, as well
as those who have. Apart from providing a dataset, task organizers are
expected to:
- Verify the data annotations have sufficient inter-annotator agreement
- Verify licenses for the data allow its use in the competition and
afterwards. In particular, text that is publicly available online is not
necessarily in the public domain; unless a license has been provided, the
author retains all rights associated with their work, including copying,
sharing and publishing. For more information, see:
https://creativecommons.org/faq/#what-is-copyright-and-why-does-it-matter
- Resolve any potential security, privacy, or ethical concerns about the
data
- Commit to make the data available after the task
- Provide task participants with format checkers and standard scorers.
- Provide task participants with baseline systems to use as a starting
point (in order to lower the obstacles to participation). A baseline system
typically contains code that reads the data, creates a baseline response
(e.g. random guessing, majority class prediction), and outputs the
evaluation results; a minimal illustrative sketch is given after this list.
Whenever possible, baseline systems should be written in widely used
programming languages and/or should be implemented as a component for
standard NLP pipelines.
- Create a mailing list and website for the task and post all relevant
information there.
- Create a CodaLab or other similar competition for the task and upload
the evaluation script.
- Manage submissions on CodaLab or a similar competition site.
- Write a task description paper to be included in SemEval proceedings,
and present it at the workshop.
- Manage participants’ submissions of system description papers, manage
participants’ peer review of each other’s papers, and possibly shepherd
papers that need additional help in improving the writing.
- Review other task description papers.
- Define roles for each organizer:
  - Lead Organizer: the main point of contact; expected to ensure
  deliverables are met on time and to contribute to task duties (see below).
  - Co-Organizers: provide significant contributions to ensuring the task
  runs smoothly. Examples include maintaining communication with task
  participants, preparing data, creating and running evaluation scripts,
  and leading paper reviewing and acceptance.
  - Advisory Organizers: more of a supervisory role; may not contribute to
  detailed tasks but will provide guidance and support.
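As a purely illustrative sketch of such a baseline in Python (assuming a
one-gold-label-per-line file format and accuracy scoring, neither of which
is prescribed by SemEval):

from collections import Counter

def read_labels(path: str) -> list[str]:
    """Read one gold label per line (assumed format)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def majority_class_baseline(train_path: str, test_path: str) -> float:
    """Predict the most frequent training label for every test item
    and report accuracy against the test gold labels."""
    train_labels = read_labels(train_path)
    test_labels = read_labels(test_path)
    majority = Counter(train_labels).most_common(1)[0][0]
    correct = sum(label == majority for label in test_labels)
    return correct / len(test_labels)

# Example usage (hypothetical file names):
# print(majority_class_baseline("train_labels.txt", "test_labels.txt"))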
=== Important dates ===
- Task proposals due March 31, 2024 (Anywhere on Earth)
- Task selection notification May 18, 2024
=== Preliminary timetable ===
- Sample data ready July 15, 2024
- Training data ready September 1, 2024
- Evaluation data ready December 1, 2024 (internal deadline; not for public
release)
- Evaluation starts January 10, 2025
- Evaluation ends by January 31, 2025 (latest date; task organizers may
choose an earlier date)
- Paper submission due February 2025
- Notification to authors in March 2025
- Camera-ready due April 2025
- SemEval workshop Summer 2025 (co-located with a major NLP conference)
Tasks that fail to keep up with crucial deadlines (such as the dates for
having the task and CodaLab website up and dates for uploading sample,
training, and evaluation data) or that diverge significantly from the
proposal may be cancelled at the discretion of SemEval organizers. While
consideration will be given to extenuating circumstances, our goal is to
provide sufficient time for the participants to develop strong and
well-thought-out systems. Cancelled tasks will be encouraged to submit
proposals for the subsequent year’s SemEval. To reduce the risk of tasks
failing to meet the deadlines, we are unlikely to accept multiple tasks
with overlap in the task organizers.
=== Submission Details ===
The task proposal should be a self-contained document of no longer than 3
pages (plus additional pages for references). All submissions must be in
PDF format, following the ACL template
<https://github.com/acl-org/acl-style-files>.
Each proposal should contain the following:
- Overview
- Summary of the task
- Why this task is needed and which communities would be interested
in participating
- Expected impact of the task
- Data & Resources
- How the training/testing data will be produced. Please discuss whether
existing corpora will be re-used.
- Details of copyright, so that the data can be used by the research
community both during the SemEval evaluation and afterwards
- How much data will be produced
- How data quality will be ensured and evaluated
- An example of what the data would look like
- Resources required to produce the data and prepare the task for
participants (annotation cost, annotation time, computation time, etc.)
- Assessment of any concerns with respect to ethics, privacy, or
security (e.g. personally identifiable information of private
individuals;
potential for systems to cause harm)
- Pilot Task (strongly recommended)
- Details of the pilot task
- What lessons were learned and how these will impact the task design
- Evaluation
- The evaluation methodology to be used, including clear evaluation
criteria
- For Task Reruns
- Justification for why a new iteration of the task is needed (see
criteria above)
- What will differ from the previous iteration
- Expected impact of the rerun compared with the previous iteration
- Task organizers
- Names, affiliations, email addresses
- (optional) brief description of relevant experience or expertise
- (if applicable) years and task numbers of any SemEval tasks you
have run in the past
- Role of each organizer
Proposals will be reviewed by an independent group of area experts who may
not have familiarity with recent SemEval tasks, and therefore all proposals
should be written in a self-explanatory manner and contain sufficient
examples.
*The submission webpage is:* SemEval2025 Task Proposal Submission
<https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/SemEval>
For further information on this initiative, please refer to
https://semeval.github.io/SemEval2025/cft
=== Chairs ===
Atul Kr. Ojha, Insight SFI Centre for Data Analytics, DSI, University of
Galway
A. Seza Doğruöz, Ghent University
Giovanni Da San Martino, University of Padua
Harish Tayyar Madabushi, The University of Bath
Sara Rosenthal, IBM Research AI
Aiala Rosá, Universidad de la República - Uruguay
Contact: semevalorganizers@gmail.com
*** With apologies for multiple postings ***
Call for Applications
Resident Academic Full-Time Post in Experimental Linguistics
Institute of Linguistics and Language Technology
Applications are invited for a Resident Academic full-time post in Experimental Linguistics at the Institute of Linguistics and Language Technology of the University of Malta.
The appointment will be on an initial definite four-year contract of employment. Following the successful completion of the one-year probationary period, the Resident Academic will commence a two-year Tenure Track process. At the conclusion of this two-year period of service, the Resident Academic will be subject to a Tenure Review.
Candidates must be in possession of a PhD in Linguistics/Language Sciences, and have demonstrable expertise in empirical linguistics and commitment to research that involves linguistic data processing and statistics from an experimental perspective. Teaching experience at tertiary level will be considered an asset.
The appointee will be required to contribute to the teaching and supervision of students, research and administrative duties, as well as outreach activities as may be required by the Institute and/or the University. More specifically, s/he will be required to teach courses in areas which apply quantitative approaches to the study of language and speech; these include courses on empirical linguistics, linguistic data processing and statistics, as well as experimental techniques. Moreover, s/he will be required to contribute to the development of the research programmes of the Institute and is additionally expected to be available to offer support to staff and students on matters such as experimental design, practical aspects in the implementation of an experiment and data processing and use of appropriate statistical tests, should the need arise. Good communicative skills, the ability to work in a team, and a strong willingness to engage in the Institute’s administrative and outreach activities are also essential.
The Resident Academic Stream is composed of four grades, namely Professor, Associate Professor, Senior Lecturer, and Lecturer. Entry into the grade of Lecturer or above shall only be open to persons in possession of a PhD or an equivalent research-based doctorate within strict guidelines established by the University.
The annual salary for 2024 attached to the respective grades in the Resident Academic Stream is as follows:
*Professor: €48,818 plus an Academic Supplement of €33,119 and a Professorial Allowance of €2,330
*Associate Professor: €44,864 plus Academic Supplement of €25,377 and a Professorial Allowance of €1,423
Senior Lecturer: €40,677 plus an Academic Supplement of €18,358
Lecturer: €34,008 with an annual increment of €641 to €35,931 and an Academic Supplement of €14,932
*The University will only consider appointing an applicant at the grade of Professor or Associate Professor, when the applicant already holds an equivalent appointment at a University or Research Institute of repute.
The University of Malta will provide academic staff with financial resources through the Academic Resources Fund to support continuous professional development and to provide the tools and resources required by an academic to adequately fulfil the teaching and research commitments within the University.
The University of Malta may also appoint promising and exceptional candidates into the grade of Assistant Lecturer, provided that they are committed to obtain the necessary qualifications to enter the Resident Academic Stream. Such candidates will either have achieved exceptional results at undergraduate level, be already in possession of a relevant Masters qualification, or would have been accepted for or already in the process of achieving their PhD.
Assistant Lecturer with Masters: €31,764 with an annual increment of €596 to €33,552 and an Academic Supplement of €5,294
Assistant Lecturer: €29,589 with an annual increment of €531 to €31,182 and an Academic Supplement of €5,037.
Candidates must upload a covering letter, a curriculum vitae, certificates (certificates should be submitted in English), and the names and email addresses of three referees through this form: https://www.um.edu.mt/hrmd/workatum-general
Applications should be received by Sunday, 25 February 2024.
Late applications will not be considered.
For more detail, see https://www.um.edu.mt/hrmd/recruitment/generalrecruitment/residentacademicf…
*********************
Patrizia Paggio
Professor
University of Malta
Institute of Linguistics and Language Technology
patrizia.paggio@um.edu.mt
Associate Professor
University of Copenhagen
Centre for Language Technology
paggio@hum.ku.dk
The fifth workshop on Resources for African Indigenous Languages (RAIL)
Colocated with LREC-COLING 2024
https://bit.ly/rail2024
Conference dates: 20-25 May 2024
Workshop date: 25 May 2024
Venue: Lingotto Conference Centre, Torino (Italy)
The fifth RAIL workshop website: https://bit.ly/rail2024
LREC-COLING 2024 website: https://lrec-coling-2024.org/
Submission website: https://softconf.com/lrec-coling2024/rail2024/
The fifth Resources for African Indigenous Languages (RAIL) workshop
will be co-located with LREC-COLING 2024 in Lingotto Conference Centre,
Torino, Italy on 25 May 2024. The RAIL workshop is an interdisciplinary
platform for researchers working on resources (data collections, tools,
etc.) specifically targeted towards African indigenous languages. In
particular, it aims to create the conditions for the emergence of a
scientific community of practice that focuses on data, as well as
computational linguistic tools specifically designed for or applied to
indigenous languages found in Africa.
Many African languages are under-resourced, while only a few of them are
somewhat better resourced. These languages often share interesting
properties, such as writing systems or tone, making them different from
most high-resourced languages. From a computational perspective, these
languages lack sufficient corpora to undertake high-level development of
Human Language Technologies (HLT) and Natural Language Processing (NLP)
tools, which in turn impedes the development of African languages in
these areas. During previous workshops, it has become clear that the
problems and solutions presented are not only applicable to African
languages but are also relevant to many other low-resource languages.
Because these languages share similar challenges, this workshop
provides researchers with opportunities to work collaboratively on
issues of language resource development and learn from each other.
The RAIL workshop has several aims. First, the workshop brings together
researchers who work on African indigenous languages, forming a
community of practice for people working on indigenous languages.
Second, the workshop aims to reveal currently unknown or unpublished
existing resources (corpora, NLP tools, and applications), resulting in
a better overview of the current state-of-the-art, and also allows for
discussions on novel, desired resources for future research in this
area. Third, it enhances sharing of knowledge on the development of
low-resource languages. Finally, it enables discussions on how to
improve the quality as well as availability of the resources.
The workshop has “Creating resources for less-resourced languages” as
its theme, but submissions on any topic related to properties of
African indigenous languages (including non-African languages) may be
accepted. Suggested topics include (but are not limited to) the
following:
* Digital representations of linguistic structures
* Descriptions of corpora or other data sets of African indigenous
languages
* Building resources for (under-resourced) African indigenous languages
* Developing and using African indigenous languages in the digital age
* Effectiveness of digital technologies for the development of African
indigenous languages
* Revealing unknown or unpublished existing resources for African
indigenous languages
* Developing desired resources for African indigenous languages
* Improving quality, availability and accessibility of African
indigenous language resources
Submission requirements:
We invite papers on original, unpublished work related to the topics of
the workshop. Submissions presenting completed work may consist of up
to eight (8) pages of content for a long submission and up to four (4)
pages of content for a short submission, plus additional pages of
references. The final camera-ready versions of accepted long papers are
allowed one additional page of content (up to 9 pages) so that
reviewers’ feedback can be incorporated. Papers should be formatted
according to the LREC-COLING style sheet
(https://lrec-coling-2024.org/authors-kit/), which is provided on the
LREC-COLING 2024 website (https://lrec-coling-2024.org/). Reviewing is
double-blind, so make sure to anonymise your submission (e.g., do not
provide author names, affiliations, project names, etc.). Limit the
number of self-citations (anonymised citations should not be used). The
RAIL workshop follows the LREC-COLING submission requirements.
Please submit papers in PDF format to the START account
(https://softconf.com/lrec-coling2024/rail2024/). Accepted papers will
be published in proceedings linked to the LREC-COLING conference.
Important dates:
Submission deadline: 23 February 2024
Date of notification: 15 March 2024
Camera ready deadline: 29 March 2024
RAIL workshop: 25 May 2024
Organising Committee
Rooweither Mabuya, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Muzi Matfunjwa, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Mmasibidi Setaka, South African Centre for Digital Language Resources
(SADiLaR), South Africa
Menno van Zaanen, South African Centre for Digital Language Resources
(SADiLaR), South Africa
--
Prof Menno van Zaanen menno.vanzaanen@nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources
https://www.sadilar.org