Call for Papers
2024 CORE Project Workshop Unpacking Efficient Communication: The Roles of
Cognitive Bias and Extralinguistic Context in Referring Expression Choice
When: April 18-19, 2024
Where: Universitat Pompeu Fabra, Barcelona
Language offers a rich set of lexical and syntactic options for reference,
reflecting the different ways we can choose to identify,
describe, categorize, and differentiate the entities and events we talk
about. For example, in any given context, a speaker can choose between a
more or less specific expression (the dog, the spotted dog, the Dalmatian),
or between expressions that convey complementary information about the
referent (the woman, the skier). A well-established line of research
highlights the role of efficiency in referring expression choice. But what
makes a referring expression “efficient”? Efficiency in communication has
been frequently characterized in terms of an informativity/effort
trade-off, with informativity operationalized in terms of inference, and
effort, in terms of cognitive or physical cost (Horn 1984, Levshina 2021).
However, there is also evidence that other factors such as the salience of
visual features (e.g., color, Rubio-Fernández 2016) or the prototypicality
of an entity as an exemplar of a category (see, e.g., Degen et al. 2020)
can lead speakers to use expressions that are, strictly speaking,
overinformative. Efficiency can also be
examined at the level of the whole system; for instance, Brochhagen and
Boleda (2022) argue that the informativity/effort trade-off helps explain
cross-linguistic patterns in colexification, or how meanings are organized
in the lexicon.
The goal of this workshop, supported by the Spanish AEI-funded CORE project
(“COntextual effects in the choice of Referring Expressions for visually
presented entities”, PID2020-112602GB-I00), is to dig deeper into what
makes a linguistic expression “efficient”, considering factors such as:
- Cognitive biases that influence the potential for rapid/efficient
discrimination.
- Potential for exploiting inferences due to choice of one expression vs.
another.
- Information load a referring expression has to bear given extralinguistic
sources of information in the context, especially visual information.
- Lexical/constructional frequency effects and association strength between
RE options and the referent in question.
The workshop aims to give a forum to new and especially exploratory
research in this area. The workshop will include a combination of invited
talks, presentations of ongoing research by project members, and
presentations and/or posters selected in this open call.
We invite submissions on topics including, but not limited to:
- The general principles that intervene in efficient communication,
especially alternatives to or refined definitions of notions such as
“efficiency”, “effort”, and “informativity”.
- Which features of entities or events are more likely to be used for
discrimination.
- The role of the visual context and/or distractor entities in influencing
RE choice; more generally, the role of multi-modal aspects.
- The role of the implicit semantic organization of RE alternatives and the
conventionalized division of labor between them, especially organization
based on implicative semantic relations (e.g. hyponymy, troponymy).
- The factors influencing the choice among alternative
cross-classifications of a target referent (e.g. the choice between
“taxonomic” descriptions such as woman vs. role-based descriptions such as
skier).
- The dynamics between reference and the linguistic system, that is, how
efficient communication is enabled by and at the same time transforms a
given language.
We take a methodologically pluralistic approach and thus welcome
presentations on experimental studies, analysis of corpus data,
computational modeling, critiques or analyses of published research, as
well as position papers.
Invited speakers:
Lilia Rissman, University of Wisconsin - Madison
Paula Rubio-Fernández, Max Planck Institute for Psycholinguistics
Sina Zarrieß, University of Bielefeld
Abstract guidelines: Abstracts should not exceed 2 pages in length (A4 or
letter-size), in 12 pt. font, with 1-inch/2,5-cm margins; a third page can
be used for references, data, and figures. Please indicate whether you want
the submission to be considered for a paper, a poster, or either. Abstracts
should be submitted to EasyChair at the following link:
https://easychair.org/conferences/?conf=core2024.
Important dates:
Deadline for abstract submission: December 20, 2023
Notification of acceptance: January 15, 2024
Workshop dates: April 18-19, 2024
Organizers: Louise McNally, Gemma Boleda, Jialing Liang, Marina Bolea.
References:
Degen, J., Hawkins, R. D., Graf, C., Kreiss, E., & Goodman, N. D. (2020).
When redundancy is useful: A Bayesian approach to “overinformative”
referring expressions. Psychological Review, 127(4), 591–621.
Gualdoni, E., T. Brochhagen, A. Mädebach, G. Boleda. 2023. What's in a
name? A large-scale computational study on how competition between names
affects naming variation. Journal of Memory and Language, 133, 104459.
Brochhagen, T., G. Boleda. 2022. When do languages use the same word for
different meanings? The Goldilocks Principle in colexification. Cognition,
226, 105179.
Horn, L.R. (1984). Towards a new taxonomy for pragmatic inference: Q-based
and R-based implicature. In Schiffrin, D. (ed.), Meaning, Form, and Use in
Context: Linguistic Applications, 11-42. Georgetown University Press,
Washington, DC.
Levshina, N. (2023). Communicative efficiency: Language structure and use.
Cambridge: Cambridge University Press.
Rissman, L., & Lupyan, G. (2022). A Dissociation Between Conceptual
Prominence and Explicit Category Learning: Evidence From Agent and Patient
Event Roles. Journal of Experimental Psychology: General, 151(7):1707-1732.
Rubio-Fernandez, P., Mollica, F., & Jara-Ettinger, J. (2021). Speakers and
listeners exploit word order for communicative efficiency: A
cross-linguistic investigation. Journal of Experimental Psychology:
General, 150(3), 583–594.
Schüz, S., Han, T., Zarrieß, S. (2021) Diversity as a By-Product:
Goal-oriented Language Generation Leads to Linguistic Variation.
Proceedings of the 22nd Annual SIGdial Meeting on Discourse and Dialogue.
Association for Computational Linguistics.
Dear Professor/Scholar,
We would like to invite you to contribute a chapter to the upcoming book
entitled “Applied Speech and Text Processing For Low Resource Languages”, to
be published by River Publishers.
------------------------------
Motivation and Scope of the Book
Out of over 7000 recognized languages worldwide, only a small proportion
offer sufficient resources to build adequate speech and natural language
processing (NLP) technologies. Developing such solutions for low-resource
languages is challenging in multiple respects. Deep learning conventionally
depends on large volumes of resources. Hence, research in this domain is
propelled by different ideas for training and validating systems, such as
data augmentation, transfer learning, and hybrid multi-modal architectures,
to name a few. This book aims to collect such ideas, advances, and
solutions for building speech/NLP technologies in low-resource scenarios.
------------------------------
Table of Contents
We invite submissions of high-quality, original chapters addressing both
theoretical and practical aspects, including the ethical and social
implications of NLP in healthcare. The book aims to cover (but is not
limited to) the following topics:
- Speech Processing for Low Resource Languages
- Natural Language Processing for Low Resource Languages
- Deep learning methods for low-resource languages
- End-to-end speech Recognition for Low-Resource Language
- Development of a Speech Corpus for Low Resource Language
- Multimodal architectures for social media analysis
- Speech synthesis for low-resource language
- Speech translation for low-resource language
- Indigenous Language revitalization/preservation
- Transfer learning applications
- Leveraging Large pre-trained language models
  knowledge under few-shot and zero-shot in NLP tasks
- Efficiently aligning acoustic and textual embeddings
- Speech Recognition for Specific Dialects
------------------------------
Important Dates and Submission Guidelines
- Full Chapter Abstract Submission: 10.01.2024
- Abstract Acceptance/Rejection Notification: 25.01.2024
- Full Chapter Submission: 15.03.2024
- First Review Notification: 15.04.2024
- Revised Version Notification: 15.05.2024
- Final Acceptance/Rejection Notification: 30.05.2024
- Camera Ready Chapter Submission: 10.06.2024
Authors should send their abstracts and chapters through EasyChair
<https://easychair.org/my/conference?conf=appliedspeechnlpbook0> only. The
submission guidelines and other detailed information can be found on the
book website <https://sites.google.com/view/speech-nlp-boook-river/home>.
For any queries, email speech_nlp_book_river(a)googlegroups.com
<https://groups.google.com/g/speech_nlp_book_river>.
------------------------------
Editors
- Dr. Shantipriya Parida (Silo AI, Finland)
  <https://www.linkedin.com/in/shantipriya-parida-9781a9127/>
- Assoc. Prof. Satya Ranjan Dash (KIIT University, India)
  <https://ksca.kiit.ac.in/profiles/satya-ranjan-dash/>
- Asst. Prof. Biswa Ranjan Acharya (Marwadi University, India)
  <https://www.linkedin.com/in/acharyabiswa/?originalSubdomain=in>
- Dr. Ravi Shankar Prasad (Idiap Research Institute, Switzerland)
  <https://www.linkedin.com/in/ravishankar-prasad-b9907924/>
- Prof. Esaú Villatoro-Tello (Idiap Research Institute, Switzerland)
  <https://www.linkedin.com/in/esa%C3%BA-villatoro-tello-bb185b1aa/>
Please share with NLP researchers/scholars.
--
Shantipriya Parida
Senior AI Scientist @ Silo AI <http://www.silo.ai/>
Mobile:# +358 (0465787840)
LinkedIn: Shantipriya Parida
<https://www.linkedin.com/in/shantipriya-parida-9781a9127/>
http://www.shantipriya.me/
[Apologies for multiple postings]
The European Language Resources Distribution Agency (ELDA), a company
specialized in Human Language Technologies within an international
context, is currently seeking to fill an immediate vacancy for a
permanent Senior Project Manager position, specialised in Speech
Technologies.
Under the supervision of the CEO, the Senior Project Manager, specialised
in Speech Technologies, will be in charge of conducting the activities
related to the production of language resources and the co-ordination of
R&D projects. Their responsibilities include language resources
design/specification, production frameworks and platforms setup, quality
control and assessment, and project-dedicated team members recruitment and
management. They will also contribute to improving or updating the current
language resources production workflows. This position offers excellent
opportunities for qualified, creative, and motivated candidates wishing to
participate actively in the Language Engineering field.
The position is based in Paris (13th).
Required profile:
* PhD in computer science specialised in speech technologies. A proven
background in research (scientific publications) will be a strong plus
* At least 3 years of experience in speech technologies (speech
recognition, synthesis, language modelling) and with the tools commonly
used to produce and collect data and to assess quality
* Ability to experiment with various techniques for improving or
building tools (e.g., transcription and annotation tools)
* Contribution to international projects
* Good knowledge of Linux and open source software
* Proficiency in Python programming language
* Good knowledge of scripting languages: bash, R, Perl
* Experience and ability to supervise members of a multidisciplinary team
* Dynamic and communicative, flexible enough to combine and work on
different tasks
* Proficiency in English with ability to write user guides,
administration documentation and reports, and a good command of
French. Knowledge of other languages would be a plus.
* Citizenship (or residency papers) of a European Union country
Salary: Commensurate with qualifications and experience (between 40-50K€).
Other benefits: complementary health insurance and meal vouchers.
About
ELDA is an SME established in 1995 to promote the development and
exploitation of Language Resources (LRs). Language Resources include all
data necessary for language engineering, such as monolingual and
multilingual lexica, text corpora, speech databases and terminology.
ELDA’s role is to produce LRs, to collect and to validate them and,
foremost, make them available to users in compliance with applicable
regulations and ethical requirements.
For further information about ELDA, visit: http://www.elda.org
Applicants should email a cover letter addressing the points listed
above together with a curriculum vitae to:
ELDA
9, rue des Cordelières
75013 Paris
FRANCE
Email: job(a)elda.org
Dear colleagues,
We are pleased to invite you to the 6th edition of the International Conference on Computational Linguistics in Bulgaria (CLIB 2024, http://dcl.bas.bg/clib/), to be held on 9 and 10 September 2024 in Sofia, Bulgaria.
Computational Linguistics in Bulgaria (CLIB) is an international conference that aims at exploring novel approaches and methods in computational linguistics and natural language processing (NLP), especially with a view to their application to small and less-resourced languages such as Bulgarian and the bridging of the discrepancies between big and small languages with respect to language technologies.
IMPORTANT DATES
Tutorial submission deadline: 15 February 2024
Tutorial notification deadline: 15 March 2024
Paper abstract submission deadline: 15 March 2024
Paper submission deadline: 15 April 2024 (23:59 UTC/GMT+2)
Author notification deadline: 15 May 2024
Camera-ready PDF due: 15 June 2024
Official proceedings publication date: 7 September 2024
Conference: 9 – 10 September 2024
TOPICS OF INTEREST
CLIB invites contributions on original research, including, but not limited to:
* computer-aided learning, training and education
* dialogue and interactive systems
* information retrieval, information extraction, text mining and knowledge graph derivation
* language grounding for computer vision and robotics
* language modelling
* language theories and cognitive modelling for NLP
* large language models and NLP evaluation methodologies
* language resources and benchmarking for large language models
* language resources construction and annotation
* machine learning for NLP
* machine translation, multilingualism, translation aids
* morphology and segmentation
* natural language generation, understanding, summarisation and simplification
* ontologies, terminology and knowledge representation
* sentiment analysis, stylistic analysis, opinion and argument mining
* speech recognition, synthesis and spoken language understanding
* tagging, chunking, syntax and parsing
CLIB 2024 also solicits submissions presenting project reports, new data resources, system demonstrations, and position papers.
SPECIAL SESSION ON WORDNETS, FRAMENETS AND ONTOLOGIES
The Special Session on Wordnets, Framenets and Ontologies brings together researchers interested in the principles, theory, practice and applications of wordnets, ontologies, related linguistic resources and their interoperability and seeks to establish a dedicated community and to foster joint initiatives in this particular field.
PAPER TYPES AND FORMAT
Long papers must describe substantial, original, completed, and unpublished work. Long papers may consist of up to eight (8) pages of content.
Short paper submissions must describe original and unpublished work dealing with a small, focused contribution. Short papers may consist of up to four (4) pages.
Both types of submissions allow for an unlimited number of pages of references and appendices.
All accepted papers will be included in the Conference Proceedings.
Additional information and the CLIB 2024 style guidelines and templates are available in the Instructions for Authors section <http://dcl.bas.bg/clib/instructions-for-authors/> of the Conference website.
PAPER SUBMISSION
Papers must be submitted in English and should be anonymous.
Reviewing will be double-blind. Each submission will be reviewed by at least two anonymous reviewers.
We invite authors to submit a provisional title along with a brief abstract (approx. 150 words) by 15 March 2024 in PDF format. Abstracts should be anonymous.
Submission of papers and abstracts will be managed online by the EasyChair conference management system through the CLIB 2024 EasyChair page <https://easychair.org/conferences/?conf=clib2024>.
BEST STUDENT PAPER AWARD
In order to encourage talented young researchers, the best paper that has a Master's/PhD student among its authors, with the student presenting the work at the conference, will be awarded a small prize and a diploma.
CALL FOR TUTORIALS
CLIB 2024 invites proposals for tutorials, which will be held before the Conference (on September 8).
Proposals should not exceed 4 pages of content (plus unlimited pages for references) using CLIB paper templates, and they should be submitted as PDF documents through the CLIB 2024 EasyChair page <https://easychair.org/conferences/?conf=clib2024>. Tutorial proposals are not anonymous.
Guidelines for the proposals for tutorials are available in the Call for Tutorials section <https://dcl.bas.bg/clib/call-for-tutorials/> of the Conference website.
The CLIB 2024 style guidelines and templates are available in the Instructions for Authors section <http://dcl.bas.bg/clib/instructions-for-authors/> of the Conference website.
CLIB PROCEEDINGS INDEXING
The Proceedings from CLIB 2016, CLIB 2018 and CLIB 2020 are indexed in ISI Web of Science. As of November 2020 the Proceedings are indexed in Scopus. CLIB Proceedings published since 2020 are also included in the ACL Anthology.
You can contact us via the Conference e-mail: clib2024(a)dcl.bas.bg
Best regards,
The CLIB2024 Organising Committee
Second CFP: The 6th Workshop on Research in Computational Linguistic
Typology and Multilingual NLP (SIGTYP 2024)
To be held at EACL 2024 (March 21 or 22, 2024, Malta)
Website: https://sigtyp.github.io/
Submission website:
https://openreview.net/group?id=eacl.org/EACL/2024/Workshop/SIGTYP
Submission deadline: December 18, 2023
We invite submissions to the 6th edition of the SIGTYP workshop on
Research in Computational Linguistic Typology and Multilingual NLP, to be
held at EACL 2024 on March 21 or 22, 2024.
Workshop description
The aim of the 6th edition of the SIGTYP workshop is to act as a platform
and a forum for the exchange of information between typology-related
research, multilingual NLP, and other research areas that can lead to
the development of truly multilingual NLP methods. The workshop is
specifically aimed at raising awareness of linguistic typology and its
potential in supporting and widening the global reach of multilingual
NLP, as well as at introducing computational approaches to linguistic
typology. It will foster research and discussion on open problems, not
only within the active community working on cross- and multilingual NLP
but also inviting input from leading researchers in linguistic typology.
Our workshop will serve as a platform to enable fruitful discussions. In
2024, we additionally focus on bridging the gap between cross-linguistic
and universal annotation, models, and technology.
SIGTYP is the first dedicated venue for typology-related research and
its integration in multilingual NLP. Appropriate topics include (but are
not limited to) the following as they relate to the areas of the workshop:
* Integration of typological features in language transfer and joint
  multilingual learning. In addition to established techniques such as
  “selective sharing”, are there alternative ways of encoding
  heterogeneous external knowledge in machine learning algorithms?
* Development of unified taxonomy and resources. Building universal
  databases and models to facilitate understanding and processing of
  diverse languages.
* Automatic inference of typological features. The pros and cons of
  existing techniques (e.g. heuristics derived from morphosyntactic
  annotation, propagation from features of other languages, supervised
  Bayesian and neural models) and discussion on emerging ones.
* Typology and interpretability. The use of typological knowledge for
  interpretation of hidden representations of multilingual neural
  models, multilingual data generation and selection, and typological
  annotation of texts.
* Improvement and completion of typological databases. Combining
  linguistic knowledge and automatic data-driven methods towards the
  joint goal of improving the knowledge on cross-linguistic variation
  and universals.
* Linguistic diversity and universals. Challenges of cross-lingual
  annotation. Which linguistic phenomena or categories should be
  considered universal? How should they be annotated?
* Language-specific studies to support or contradict universals.
  Framing a study on 1-3 languages that would shed more light on common
  linguistic structures and properties.
* Extra topics also include: generation of constructed languages,
  universals in diachronic language change, information-theoretic
  approaches to typology, automated approaches to etymology.
Important Dates (all deadlines are 23:59 AoE)
— December 18, 2023: Paper submission deadline
— January 20, 2024: Notification of acceptance
— January 30, 2024: Camera-ready deadline
— March 21 or 22, 2024: Workshop
Submissions
We invite both extended abstract submissions (non-archival) and general
paper submissions (archival). The accepted submissions will be presented
at the workshop, providing new insights and ideas. Extended abstracts
should describe already published work or work in progress and should
not exceed two (2) pages. This way, we will not discourage researchers
from preferring main conference proceedings, at the same time ensuring
that interesting and thought-provoking research is presented at the
workshop. For general (archival) submissions we accept both long and
short papers. Short papers should not exceed four (4) pages; long papers
should not exceed eight (8) pages. Unlimited additional pages are
allowed for the references section in all submission types.
Submissions should be anonymous, without authors or an acknowledgement
section; self-citations should appear in third person.
Submissions must follow the EACL 2024 stylesheet
https://github.com/acl-org/acl-style-files; both long and short paper
submissions must follow the two-column format of ACL proceedings. All
submissions must be in PDF format.
These should be submitted via OpenReview:
https://openreview.net/group?id=eacl.org/EACL/2024/Workshop/SIGTYP.
ARR submissions that were rejected or withdrawn from EACL can be
submitted to SIGTYP by January 17, 2024. We will create a web form for
submitting, and announce it at
https://sigtyp.github.io/sigtyp-cfp2024.html by January 15, 2024.
Acceptance decisions will be made based on the existing ARR reviews.
Authors will be notified by January 20, 2024.
Shared Task
In 2024, SIGTYP is hosting a Word Embedding Evaluation for Ancient and
Historical Languages. More details can be found here:
https://sigtyp.github.io/st2024.html.
Organizing Committee
Michael Hahn, Rena Gao, Saliha Muradoglu, Yulia Otmakhova, Andreas
Shcherbakov, Oleg Serikov, Jinrui Yang, Alexey Sorokin, Priya Rani,
Ritesh Kumar, Ryan Cotterell, Edoardo M. Ponti, Kat Vylomova
Anti-harassment policy
The workshop follows the ACL anti-harassment policy:
https://www.aclweb.org/adminwiki/index.php?title=Anti-Harassment_Policy.
Contact
For any inquiries regarding the workshop, please send an email to the
Organizing Committee at sigtyp(a)gmail.com
=================================
IberLEF 2024 -- Second Call for Task Proposals
=================================
The goal of IberLEF is to encourage the research community to organize
competitive text processing, understanding and generation tasks, with the
aim of defining new research challenges and advancing the state of the art
in Natural Language Processing challenges involving at least one of the
following Iberian languages: Spanish, Portuguese, Catalan, Basque or
Galician. Researchers and practitioners from all areas of Natural Language
Processing and related communities are invited to submit task proposals
that fit IberLEF goals by December 22, 2023.
Proposals must be submitted (as a pdf file) to iberlef(a)googlegroups.com,
and should include the following fields:
- Title of the task.
- Description of the task, highlighting:
- Relevance and novelty of the task, and the challenges involved.
- Evaluation measures, and other relevant methodological aspects.
- Expected target community, and actual or potential industrial takeup.
- Related evaluation activities, if any.
- Previous editions of the task, if any. If it has been organized
  previously, what the roadmap is and what the novelties for 2024 are.
- Linguistic resources to be gathered, created and/or reused. Please
  include as many details on data gathering, selection and annotation
  procedures as possible: sources and representativity,
  training/validation/test sizes, harvesting procedures, profile of
  annotators (experts, linguists, crowdworkers, etc.), multiple annotation
  policy, IPR issues, baselines, etc.
- Tentative schedule (note that camera-ready versions of the proceedings
  must be ready by July 11, 2024).
- Organization committee: full name and affiliation of the organizers,
  with a succinct description of their research interests, areas of expertise
  and experience organizing similar events.
- Funding, if available.
- Contact person.
- Any other relevant issues.
Task organizers' duties
Note that organizers of accepted tasks are expected to:
- Set up the evaluation exercise according to the submitted proposal.
- Promote the task within the target research community.
- Manage the submission and scientific evaluation of the system
  description papers of the corresponding systems submitted by the
  participants. The accepted papers will be published in the IberLEF
  proceedings.
- Prepare and submit an overview of the evaluation exercise.
- Present the results of the task at IberLEF 2024.
Task selection procedure
Each submitted proposal will be reviewed by members of the IberLEF steering
and program committee, and decisions will be sent back to the task
organizers by January 26, 2024.
Proceedings
IberLEF 2024 Proceedings, including the description of the participating
systems, will be published at CEUR-WS.org. Task Overviews will be published
in the SEPLN journal (http://www.sepln.org/en/journal, indexed in Clarivate
ESCI (JCI: 0.21), CiteScore (Scopus): 2.9 and SJR: 0.421) in its September
2024 issue. Task Organizers are expected to send the camera-ready task and
system description papers for their task to IberLEF organizers by July
11, 2024.
Important dates
- Task proposals due: December 22, 2023.
- Notification of acceptance: January 26, 2024.
- Camera ready submissions due: July 11, 2024.
- IberLEF Workshop: September 2024.
IberLEF general chairs:
Salud María Jiménez Zafra, SINAI, Universidad de Jaén (Spain)
Luis Chiruzzo, Universidad de la República (Uruguay)
Francisco Rangel, Symanto Research (Spain)
Website
https://sites.google.com/view/iberlef-2024
Contact
E-mail: iberlef(a)googlegroups.com
=================================
Salud María Jiménez Zafra
sjzafra(a)ujaen.es
Universidad de Jaén
Grupo de Investigación SINAI <http://sinai.ujaen.es/> | Departamento de
Informática
EPS Jaén, Edificio A3, Despacho 219
Campus Las Lagunillas s/n 23071 - Jaén | +34 953212992
The Namkin company and Loria - Université de Lorraine invite applications for a postdoctoral position on business event extraction.
Location: Troyes, France and Nancy, France
Application Deadline: 31st January 2024
Starting Date: March 2024
Contract Duration: 1 year (with possible extension)
The industry faces numerous challenges that necessitate the evolution of BtoB marketing tools, in order to develop a valuable offer and provide an enhanced customer experience. Namkin's BrainLab develops industrial marketing tools for digitalizing customer relations, evolving business models, and exploiting business and economic data for business development. One of the key challenges of marketing intelligence is to identify risks and opportunities so as to guide marketing strategies. Among the sources of information useful to detect risks and opportunities, Namkin has identified Business Events, that is, “textually reported real-world occurrences, actions, relations, and situations involving companies and firms” (Jacobs et al., 2018).
The Loria Sémagramme team specialises in modelling natural language semantics to represent discourse. While modern semantic representations may contain vast quantities of information, they do not always (or necessarily) contain the information that is useful for a concrete application. For instance, significant challenges still persist in dealing with temporal relations and fine-grained negation interpretation.
A number of studies at the crossroads of business intelligence and NLP have focused on the detection or extraction of Business Events (e.g., Arendarenko & Kakkonen, 2012; Han et al., 2018; Jacobs et al., 2018; Jacobs & Hoste, 2020; Jacobs & Hoste, 2022). Despite the richness of the event extraction literature, many challenges still remain. Some of these challenges are concerned with the modelling of the task itself, such as the necessity / benefit of trigger identification for event extraction (see Zhu et al. 2021), some with the scope of the task, such as sentence level vs document level extraction (e.g., Zheng et al. 2019), some with the information necessary to the integration of events in a coherent knowledge base, like factuality detection (e.g., Zhang et al., 2022) and event disambiguation (e.g., Barhom et al., 2019).
Recent research has looked into the benefits of exploiting semantic representations, and in particular Abstract Meaning Representation (AMR; Banarescu et al. 2013), for low-resource scenarios (Huang et al., 2018) and document-level event argument extraction (e.g., Xu et al., 2022). However, it appears that AMR has to be adapted in order to optimally support event extraction related tasks (Yang et al., 2023). One major limitation of AMR for document-level event extraction is that AMR works at the sentence level, and thus requires the aggregation of sentence-level representations. AMR is also limited in its expressive power for negation and universal quantification.
To overcome these issues, we seek to appoint a Postdoctoral Researcher to work on semantic modelling. A promising new lead was recently provided by Bos (2023), who proposes a new meaning representation system that overcomes expressive power limitations, supports discourse relations and inter-sentential coreferences, and reduces the annotation load. The appointed Postdoctoral Researcher will explore semantic modelling solutions and their application to event extraction in the field of business.
The topic covers various subjects, including:
- Computational semantics
- Machine learning with neural networks
- Cross-domain model transfer
- Learning from small data
- Combining top-down (expert-driven) and bottom-up (dataset-driven) models
- Design of meaning representations
- Shallow and deep semantic processing and reasoning
- Hybrid symbolic and statistical approaches to semantics
- Neural semantic parsing
- Semantics and ontologies
The successful candidate will be part of Namkin's Data & IA team and the Sémagramme Team at Loria, with co-supervision provided by Agata Marcante and Professor Maxime Amblard.
As part of the role, you will have the opportunity to...
- Design, develop and test semantic representation algorithms for text-mining with the aim of identifying significant information in unstructured text.
- Collaborate with Namkin’s experts to evaluate the algorithms on real-world use cases.
You will be responsible for writing academic papers, technical reports and project deliverables. You will also attend academic conferences or project meetings to present your findings and act as a representative for the team.
Requirements include expertise in semantic representation algorithms, excellent technical writing skills and the ability to work well in a team.
* Applicants must hold a PhD in Computer Science, related to Data Systems, Natural Language Processing, or Artificial Intelligence.
* They should have proven fluency in at least one programming language, such as Python, R, Java or C++.
* Candidates must possess a curious and passionate attitude towards research and learning in general.
* Proficiency in French would be considered a bonus.
* Previous experience in the NLP field would be considered advantageous.
How to apply:
Send an email to:
applications@namkin.fr
- with the subject starting with "Namkin-Loria Postdoc"
- with a single PDF attached containing:
* Cover letter detailing motivation and qualifications for this position.
* Curriculum vitae, with a list of publications and contact details for references.
Interested parties are encouraged to contact us for further information regarding the position before applying.
References
Arendarenko, E., & Kakkonen, T. (2012). Ontology-based information and event extraction for business intelligence. In Artificial Intelligence: Methodology, Systems, and Applications: 15th International Conference, AIMSA 2012, Varna, Bulgaria, September 12-15, 2012. Proceedings 15 (pp. 89-102). Springer Berlin Heidelberg.
Barhom, S., Shwartz, V., Eirew, A., Bugert, M., Reimers, N., & Dagan, I. (2019). Revisiting joint modeling of cross-document entity and event coreference resolution. arXiv preprint arXiv:1906.01753.
Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., ... & Schneider, N. (2013, August). Abstract meaning representation for sembanking. In Proceedings of the 7th linguistic annotation workshop and interoperability with discourse (pp. 178-186).
Jacobs, G., & Hoste, V. (2020). Extracting fine-grained economic events from business news. In COLING 2020 (pp. 235-245). COLING.
Jacobs, G., & Hoste, V. (2022). SENTiVENT: enabling supervised information extraction of company-specific events in economic and financial news. Language Resources and Evaluation, 56(1), 225-257.
Jacobs, G., Lefever, E., & Hoste, V. (2018). Economic event detection in company-specific news text. In 1st Workshop on Economics and Natural Language Processing (ECONLP) at Meeting of the Association-for-Computational-Linguistics (ACL) (pp. 1-10). Association for Computational Linguistics (ACL).
Han, S., Hao, X., & Huang, H. (2018). An event-extraction approach for business analysis from online Chinese news. Electronic Commerce Research and Applications, 28, 244-260.
Huang, L., Ji, H., Cho, K., Dagan, I., Riedel, S., & Voss, C. (2018, July). Zero-Shot Transfer Learning for Event Extraction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 2160-2170).
Xu, R., Wang, P., Liu, T., Zeng, S., Chang, B., & Sui, Z. (2022). A two-stream AMR-enhanced model for document-level event argument extraction. arXiv preprint arXiv:2205.00241.
Yang, Y., Guo, Q., Hu, X., Zhang, Y., Qiu, X., & Zhang, Z. (2023). An AMR-based link prediction approach for document-level event argument extraction. arXiv preprint arXiv:2305.19162.
Zhang, H., Qian, Z., Li, P., & Zhu, X. (2022, November). Evidence-Based Document-Level Event Factuality Identification. In PRICAI 2022: Trends in Artificial Intelligence: 19th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2022, Shanghai, China, November 10–13, 2022, Proceedings, Part II (pp. 240-254). Cham: Springer Nature Switzerland.
Zheng, S., Cao, W., Xu, W., & Bian, J. (2019). Doc2EDAG: An end-to-end document-level framework for Chinese financial event extraction. arXiv preprint arXiv:1904.07535.
Zhu, T., Qu, X., Chen, W., Wang, Z., Huai, B., Yuan, N. J., & Zhang, M. (2021). Efficient document-level event extraction via pseudo-trigger-aware pruned complete graph. arXiv preprint arXiv:2112.06013.
----------------------
Maxime Amblard
Université de Lorraine
https://members.loria.fr/mamblard
http://espoir-ul.fr
Job offer: W1 Professorship for Digital Humanities in the Study of Religion
(with Tenure Track W2)
The Center for Religious Studies (CERES) at the Ruhr University Bochum (RUB),
Germany, invites applications for the position of a W1 Professorship for
Digital Humanities in the Study of Religion (with Tenure Track W2).
The successful applicant is expected to represent the field of Digital
Humanities in the study of religion in research and teaching. The aim is to
strengthen Digital Humanities at RUB and the study of religion at CERES and
especially in the context of the Collaborative Research Center "Metaphors of
Religion".
The full text of the advertisement (in German) is available at
https://jobs.ruhr-uni-bochum.de/jobposting/8c4b276c0e75e597169c5082e2051ca7…
(for an unofficial English version, see
https://ceres.rub.de/en/news/w1-professur-fur-digital-humanities-in-der-rel…).
Address for questions and applications: Dr. Tim Karis/tim.karis(a)rub.de
Application deadline: January 31, 2024
Stefanie Dipper
DLnLD: Deep Learning and Linked Data
Workshop co-located with LREC-COLING 2024
Date: May 21, 2024
Venue: Torino, Italy and online
For up-to-date information, see: https://dl-n-ld.github.io/
Call for Papers
----------------------------------------------------------------------------------------
What does Linguistic Linked Data bring to Deep Learning, and vice versa? Let’s bring together these two complementary approaches in NLP.
----------------------------------------------------------------------------------------
Motivations for the Workshop
Since the appearance of transformers (Vaswani et al., 2017), Deep Learning (DL) and neural approaches have made a huge contribution to Natural Language Processing (NLP), either with highly specialized models for specific applications or via Large Language Models (LLMs) (Devlin et al., 2019; Brown et al., 2020; Touvron et al., 2023) that are efficient few-shot learners for many NLP tasks. Such models usually build on huge web-scale data (raw multilingual corpora and annotated, specialized, task-related corpora) that are now widely available on the Web. This approach has clearly shown many successes, but still suffers from several weaknesses, such as the cost and impact of training on raw data, biases, hallucinations, and limited explainability, among others (Nah et al., 2023).
The Linguistic Linked Open Data (LLOD) (Chiarcos et al., 2013) community aims at creating and distributing explicitly structured data (modelled as RDF graphs) and interlinking such data across languages. This collection of datasets, gathered inside the LLOD Cloud (Chiarcos et al., 2020), contains a huge amount of multilingual information that is ontological (e.g., DBpedia (Lehmann et al., 2015)), lexical (e.g., DBnary (Sérasset, 2015), WordNet (McCrae et al., 2014), Wikidata (Vrandečić and Krötzsch, 2014)), or linguistic (e.g., Universal Dependencies Treebank (Nivre et al., 2020; Chiarcos et al., 2021), DBpedia Abstract Corpus (Brümmer et al., 2016)), structured using common metadata (e.g., OntoLex (McCrae et al., 2017), NIF (Hellmann et al., 2013)) and standardised data categories (e.g., LexInfo (Cimiano et al., 2011), OLiA (Chiarcos and Sukhareva, 2015)).
Both communities bring striking contributions that seem to be highly complementary. However, while knowledge (ontological) graphs are now routinely used in DL, there is still very little research studying the value of linguistic/lexical knowledge in the context of DL. We think that, today, there is a real opportunity to bring both communities together. Indeed, with more and more work on Graph Neural Networks (Wu et al., 2023) and embeddings on RDF graphs (Ristoski et al., 2019), there are growing opportunities to apply DL techniques to build, interlink or enhance Linguistic Linked Open Datasets, to borrow data from the LLOD Cloud for enhancing neural models on NLP tasks, or to take the best of both worlds for specific NLP use cases.
Submission Topics
This workshop aims at gathering researchers who work on the interaction between DL and LLOD in order to discuss what each approach has to bring to the other. To this end, we welcome contributions on original work involving some of the following (non-exhaustive) topics:
• Deep Learning for Linguistic Linked Data, among which (but not exclusively):
    • Modelling, Resources & Interlinking
    • Relation Extraction
    • Corpus annotation
    • Ontology localization
    • Knowledge/Linguistic Graphs creation or expansion
• Linguistic Linked Data for Deep Learning, among which (but not exclusively):
    • Linguistic/Knowledge Graphs as training data
    • Fine tuning LLMs using Linguistic Linked (meta)Data
    • Graph Neural Networks
    • Knowledge/Linguistic Graphs embeddings
    • LLOD for model explainability/sourcing
    • Neural models for under-resourced languages
• Joint Deep Learning and Linguistic Data applications
    • Use cases combining Language Models and Structured Linguistic Data
    • LLOD and DL for Digital Humanities
    • Question-Answering on graph data
All application domains (Digital Humanities, FinTech, Education, Linguistics, Cybersecurity…) as well as approaches (NLG, NLU, Data Extraction…) are welcome, provided that the work is based on the use of BOTH Deep Learning techniques and Linguistic Linked (meta)Data.
Important Dates
(Current dates are tentative and will be revised when we have more input from the LREC-COLING Workshop Chairs)
All deadlines are 11:59PM UTC-12:00 (“anywhere on Earth”)
• Final submissions due: 25 February 2024
• Notification of acceptance: 25 March 2024
• Camera-ready due: 2 April 2024
Authors kit
All papers must follow the LREC-COLING 2024 two-column format, using the supplied official style files. The templates can be downloaded from the Style Files and Formatting page provided on the website. Please do not modify these style files or use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review.
LREC-COLING 2024 Author’s Kit Page: https://lrec-coling-2024.org/authors-kit/
Paper submission
Submission is electronic, using the Softconf START conference management system. For the submission link, refer to the DLnLD website: https://dl-n-ld.github.io/
Workshop Chairs
• Gilles Sérasset, Université Grenoble Alpes, France
• Hugo Gonçalo Oliveira, University of Coimbra, Portugal
• Giedre Valunaite Oleskeviciene, Mykolas Romeris University, Lithuania
Program Committee
• Mehwish Alam, Télécom Paris, Institut Polytechnique de Paris, France
• Russa Biswas, Hasso Plattner Institute, Potsdam, Germany
• Milana Bolatbek, Al-Farabi Kazakh National University, Kazakhstan
• Michael Cochez, Vrije Universiteit Amsterdam, Netherlands
• Milan Dojchinovski, Czech Technical University in Prague, Czech Republic
• Basil Ell, University of Oslo, Norway
• Robert Fuchs, University of Hamburg, Germany
• Radovan Garabík, L’. Štúr Institute of Linguistics, Slovak Academy of Sciences, Slovakia
• Daniela Gifu, Romanian Academy, Iasi branch & Alexandru Ioan Cuza University of Iasi, Romania
• Katerina Gkirtzou, Athena Research Center, Maroussi, Greece
• Jorge Gracia del Río, University of Zaragoza, Spain
• Dagmar Gromann, University of Vienna, Austria
• Dangis Gudelis, Mykolas Romeris University, Lithuania
• Ilan Kernerman, Lexicala by K Dictionaries, Israel
• Chaya Liebeskind, Jerusalem College of Technology, Israel
• Marco C. Passarotti, Università Cattolica del Sacro Cuore, Milan, Italy
• Heiko Paulheim, University of Mannheim, Germany
• Alexandre Rademaker, IBM Research Brazil and EMAp/FGV, Brazil
• Georg Rehm, DFKI GmbH, Berlin, Germany
• Harald Sack, Karlsruhe Institute of Technology, Karlsruhe, Germany
• Didier Schwab, Université Grenoble Alpes, France
• Ranka Stanković, University of Belgrade, Serbia
• Andon Tchechmedjiev, IMT Mines Alès, France
• Dimitar Trajanov, Ss. Cyril and Methodius University – Skopje, Macedonia
• Ciprian-Octavian Truică, POLITEHNICA Bucharest, Romania
• Nicolas Turenne, Guangdong University of Foreign Studies, China
• Slavko Žitnik, University of Ljubljana, Slovenia