We are offering a fully funded, industry-sponsored PhD scholarship on the topic of Language Models.
The selected candidate will have the opportunity to conduct research at the junction of industry and academia.
She/he will also join an exciting team of data scientists and PhD researchers from both the corporate and the academic world.
For more details, please see
https://www.akadeus.com/announcement,a7165.html
https://www.digitallab.be/en/
Co-located with COLING 2022, at VarDial we anticipate discussion on computational methods and on language resources for closely related languages, language varieties and dialects. We plan to organize VarDial 2022 as a hybrid workshop with options for both on-site and remote participation. We accept paper submissions until July 22, 2022 (details below).
https://sites.google.com/view/vardial-2022
We welcome papers dealing with one or more of the following topics:
- Language resources and tools for similar languages, varieties and dialects;
- Adaptation of tools (taggers, parsers) for similar languages, varieties and dialects;
- Evaluation of language resources and tools when applied to language varieties;
- Reusability of language resources in NLP applications (e.g., for machine translation, POS tagging, syntactic parsing, etc.);
- Corpus-driven studies in dialectology and language variation;
- Computational approaches to the study of mutual intelligibility between dialects and similar languages;
- Automatic identification of lexical variation;
- Automatic classification of language varieties;
- Text similarity and adaptation between language varieties;
- Linguistic issues in the adaptation of language resources and tools (e.g., semantic discrepancies, lexical gaps, false friends);
- Machine translation between closely related languages, language varieties and dialects.
In addition to the topics listed above, we also welcome papers dealing with diachronic language variation (e.g. phylogenetic methods, historical dialects).
Instructions for Authors
Submissions should be formatted according to the COLING template and submitted in PDF format. The review process will be double-blind. More information is available on the website.
Important Dates
Submission deadline: EXTENDED TO JULY 22, 2022 (anywhere on earth)
Notification of acceptance: August 22, 2022
Camera-ready papers due: September 5, 2022
VarDial Workshop at COLING 2022: October 16, 2022
Organizers
Yves Scherrer - University of Helsinki (Finland)
Tommi Jauhiainen - University of Helsinki (Finland)
Nikola Ljubešić - Jožef Stefan Institute (Slovenia) and University of Zagreb (Croatia)
Preslav Nakov - Qatar Computing Research Institute, HBKU (Qatar)
Jörg Tiedemann - University of Helsinki (Finland)
Marcos Zampieri - Rochester Institute of Technology (USA)
Contact: yves.scherrer(a)helsinki.fi
--- apologies for cross-postings ---
Dear colleagues,
We have an open position for a postdoctoral researcher on natural
language processing / information retrieval / machine learning (SCAI/BnF
research program)
Starting period: autumn 2022
Duration: 12-month postdoctoral contract (renewable)
Location: Sorbonne University (MLIA team, ISIR lab) / DataLab of
the BnF
Supervision:
Laure Soulier, MCF in computer science at Sorbonne University, MLIA
team, ISIR.
Emmanuelle Bermès, Scientific and Technical Assistant to the Director of
Services and Networks at BnF.
Jean-Philippe Moreux, Scientific expert of Gallica at the BnF.
More info:
https://scai.sorbonne-universite.fr/public/news/view/27d72d260c950c8d66c6/1
_*Context*_
Gallica, the digital library of the BnF, contains nearly 10 million
digitized documents that are freely accessible online (18.5 million
visits per year). However, most users do not know that Gallica contains
not only printed documents, but also photographs, sound recordings,
videos, and 3D objects. In satisfaction surveys, only a minority of
users consider the search engine's answers to be relevant and a majority
would like to be better guided in their searches. A recommendation
system should be able to help users find their way through the mass of
collections and improve the visibility of the least known. In this
project, BnF is committed to adopting a resolutely ethical approach. The
exploitation of user logs must respect their privacy and guarantee both
the relevance and transparency of the algorithms, avoiding the risk of
filter bubbles. The interface design is also at the heart of the
approach: a trustworthy system relies on a good user experience and on
the diversity and relevance of the proposed recommendations. Three lines
of thought emerge:
1) based on the available data, including both user logs and collection
descriptions, how to develop predictive algorithms?
2) how to integrate diversity in the recommendation algorithm while
letting users adjust their own serendipity threshold?
3) how to build user trust in algorithm design and audit?
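As an illustration of question 2, the sketch below shows one well-known way to trade relevance against diversity under a user-controlled setting: maximal marginal relevance (MMR) re-ranking. This is an assumption chosen for illustration, not the method planned for the project; the function name, item identifiers, and scores are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def mmr_rerank(relevance: Dict[str, float],
               similarity: Dict[FrozenSet[str], float],
               trade_off: float,
               k: int) -> List[str]:
    """Greedy MMR: pick k items, balancing relevance against redundancy.

    trade_off = 1.0 ranks purely by relevance; lower values penalise items
    that resemble already selected ones (a user-adjustable setting).
    """
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def score(item: str) -> float:
            # Highest similarity to anything already recommended (0 if none yet).
            max_sim = max(
                (similarity.get(frozenset((item, s)), 0.0) for s in selected),
                default=0.0,
            )
            return trade_off * relevance[item] - (1 - trade_off) * max_sim

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical items and scores, invented for this example.
relevance = {"book_a": 0.9, "book_b": 0.85, "photo_c": 0.6}
similarity = {
    frozenset(("book_a", "book_b")): 0.95,
    frozenset(("book_a", "photo_c")): 0.1,
    frozenset(("book_b", "photo_c")): 0.1,
}

relevance_only = mmr_rerank(relevance, similarity, trade_off=1.0, k=2)
more_diverse = mmr_rerank(relevance, similarity, trade_off=0.5, k=2)
```

With `trade_off=1.0` the two near-duplicate books are returned; lowering it to `0.5` replaces the second book with the less similar photograph, which is the kind of control a user-facing serendipity setting could expose.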
_*Main missions*_
This project consists in working on information access in the Gallica
library, from the point of view of machine and deep learning techniques.
The research axes concern (1) the analysis and indexing of textual
documents as well as (2) the analysis of user traces and (3)
recommendation systems. We are particularly interested in multimodal
techniques that allow contextualizing a document or a query based on
user interactions.
The successful candidate will be responsible for:
● Implementing models to learn the semantics of textual data for the
purpose of indexing them.
● Developing algorithms based on representation learning methodologies
to effectively blend text and user traces.
● Reporting and presenting development work in a clear and effective
manner, both for discussion with BnF experts and writing machine
learning publications.
The printed book collection will be the primary focus of the program
described above, but an extension to other collections with textual
descriptors (in particular iconographic collections) may be considered.
--
-------------
Laure Soulier
Maître de conférences
Equipe MLIA - Laboratoire ISIR - Sorbonne Université
Tour 26, Couloir 26-00, Bureau 515
(+33) 1 44 27 74 91
https://pages.isir.upmc.fr/soulier/
[Apologies for multiple postings]
*** SECOND CALL FOR PAPERS ***
EMNLP 2022 Workshop - The 13th International Workshop on Health Text Mining
and Information Analysis (LOUHI 2022)
https://louhi2022.fbk.eu/
Colocated with EMNLP 2022, Abu Dhabi (7 December 2022)
but also accessible online
*Please note that this year we use both the ACL Rolling Review (ARR)
system and Softconf as paper submission platforms.*
ARR Submission deadline: July 15, 2022
Direct submission deadline: 7 September 2022
** Call for Papers **
The 13th International Workshop on Health Text Mining and Information
Analysis provides an interdisciplinary forum for researchers interested in
automated processing of health documents. Health documents encompass
textual content of electronic health records, clinical guidelines,
spontaneous reports for pharmacovigilance, biomedical literature, health
forums/blogs or any other type of health-related documents.
The LOUHI workshop series fosters interactions between the Computational
Linguistics, Medical Informatics, and Artificial Intelligence communities.
It started in 2008 in Turku, Finland and has been organized 12 times: LOUHI
2010 was co-located with NAACL in Los Angeles, CA; LOUHI 2011 was
co-located with Artificial Intelligence in Medicine (AIME) in Bled,
Slovenia; LOUHI 2013 was held in Sydney, Australia during NICTA Techfest;
LOUHI 2014 was co-located with EACL in Gothenburg, Sweden; LOUHI 2015 was
co-located with EMNLP in Lisbon, Portugal; LOUHI 2016 was co-located with
EMNLP in Austin, Texas; LOUHI 2017 was held in Sydney, Australia; LOUHI
2018 was co-located with EMNLP in Brussels, Belgium; LOUHI 2019 was
co-located with EMNLP-IJCNLP in Hong Kong; LOUHI 2020 was co-located with
EMNLP; and LOUHI 2021 was co-located with EACL.
LOUHI 2022 is soliciting papers describing original research. Papers must
describe substantial and completed work but could also focus on other
contributions, such as a negative result, a software package or work in
progress. The topics include, but are not limited to, the following
language processing techniques and related areas:
- Techniques supporting information extraction, e.g., named entity
recognition, negation and uncertainty detection
- Classification and text mining applications (e.g., diagnostic
classifications such as ICD-10 and nursing intensity scores) and problems
(e.g., handling of unbalanced data sets)
- Text representation, including dealing with issues of data sparsity and
dimensionality
- Domain adaptation, e.g., adaptation of standard NLP tools (incl.
tokenizers, PoS-taggers, etc.) to the medical domain
- Information fusion, i.e. integrating data from various sources, e.g.
structured and narrative documentation
- Unsupervised methods, including distributional semantics
- Evaluation, gold/reference standard construction and annotation
- Syntactic, semantic and pragmatic analysis of health documents
- Anonymization / de-identification of health records and ethics
- Supporting the development of medical terminologies and ontologies
- Individualization of content, consumer health vocabularies, summarization
and simplification of text
- NLP for supporting documentation and decision making practices
- Predictive modeling of adverse events, e.g., adverse drug events and
hospital acquired infections
- Terminology and information model standards (SNOMED CT, FHIR) for health
text mining
- Bridging gaps between formal ontology and biomedical NLP
We welcome submissions on topics related to text mining of health
documents, particularly emphasizing multidisciplinary aspects of health
documentation and the interplay between nursing and medical sciences,
information systems, computational linguistics and computer science. We
also encourage submissions reporting work on low-resourced languages,
addressing the challenges of data sparsity and language characteristic
diversity.
** Important Dates **
ARR submission deadline: July 15, 2022 (via ARR)
Direct submission deadline: 7 September 2022
Notification to authors: October 9, 2022
Camera-ready papers due: October 16, 2022
Workshop: December 7, 2022
** Submission Instructions **
Submissions go through a double-blind review process, where each submission
is reviewed by three program committee members. Accepted papers will be
presented by the authors in a regular workshop session either as a talk or
a poster. All accepted papers will be published in the workshop proceedings.
The submissions should be in PDF format and anonymized for review. All
submissions must be written in English and follow the EMNLP 2022 style
guidelines: https://2022.emnlp.org/calls/style-and-formatting/
* Long paper submission: up to 8 pages of content, plus 2 pages for
references; final versions of long papers: one additional page (so that
reviewers' comments can be taken into account): up to 9 pages with
unlimited pages for references
* Short paper submission: up to 4 pages of content, plus 2 pages for
references; final version of short papers: up to 5 pages with unlimited
pages for references
LOUHI 2022 will accept electronic submission both via ARR and Softconf.
** Invited Speaker **
TO BE ANNOUNCED
Workshop Organizers:
Alberto Lavelli (FBK, Trento, Italy)
James Pustejovsky (Brandeis University, USA)
Eben Holderness (Brandeis University, USA)
Antonio Jimeno Yepes (RMIT University, Australia)
Anne-Lyse Minard (University of Orleans, LLL CNRS, France)
Fabio Rinaldi (Dalle Molle Institute for Artificial Intelligence Research -
IDSIA, Switzerland & FBK, Trento, Italy)
** Programme Committee **
TO BE ANNOUNCED
https://sites.google.com/view/figlang2022/shared-tasks?authuser=0
*Euphemism Detection Shared Task*
Euphemisms are mild or indirect expressions used in place of harsher or
more offensive ones. Euphemisms are often used to mask profanity or refer
to taboo topics such as death, disability, sex, religion or personal
relationships in a polite way. Euphemisms are often ambiguous: their
literal and non-literal interpretation is context-dependent:
Asked to choose *between jobs* and the environment, a majority -- at least
in our warped, first-past-the-post system -- will pick jobs.
[non-euphemistic]
vs.
This summer, the budding talent agent was *between jobs* and free to
babysit pretty much any time. [euphemistic]
State-of-the-art language models perform well on many major NLP
benchmarks; however, it is unclear how such models perform on euphemisms.
Thus, we propose a euphemism detection task: given an input sentence,
identify whether the sentence contains a euphemism.
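To make the input/output shape of the task concrete, here is a minimal sketch of a naive lexicon-matching baseline. It is not the organizers' baseline or data format: the term list, the function name, and the shortened versions of the two example sentences above are assumptions made for illustration.

```python
import re

# Hypothetical mini-lexicon of potentially euphemistic terms; the shared
# task's actual term list and data are not reproduced here.
CANDIDATE_TERMS = ["between jobs", "passed away", "let go"]


def contains_candidate(sentence: str) -> bool:
    """Return True if any candidate term occurs in the sentence."""
    lowered = sentence.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered)
        for term in CANDIDATE_TERMS
    )


# Shortened versions of the two example sentences, with labels
# (1 = euphemistic, 0 = non-euphemistic).
examples = [
    ("Asked to choose between jobs and the environment, a majority will pick jobs.", 0),
    ("This summer, the budding talent agent was between jobs and free to babysit.", 1),
]

predictions = [int(contains_candidate(text)) for text, _ in examples]
gold = [label for _, label in examples]
accuracy = sum(p == g for p, g in zip(predictions, gold)) / len(gold)
```

Both sentences contain the string "between jobs", so this baseline labels both as euphemistic and gets only one of the two right. That is the context dependence the task targets: surface matching alone cannot separate the literal from the euphemistic reading.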
For more information about the shared task and to participate, visit
https://codalab.lisn.upsaclay.fr/competitions/5726
*Important dates:*
- July 5, 2022: CodaLab competition is open; training data can be
  downloaded
- Aug 5, 2022: Test data can be downloaded and results submitted;
  performance will be tracked on CodaLab dashboard
- Aug 20, 2022: Last day for submitting predictions on test data
- Sept 7, 2022: Papers describing the systems are due
- Oct 9, 2022: Notification of acceptance
- TBD, 2022: Camera-ready papers due
- December 7 or 8, 2022: Workshop
Hello,
Could you please distribute the following job offer? Thanks.
Best,
Pascal
-------------------------------------------------------------------------------------
3-year PhD position in Computational Models of Semantic Memory and its Acquisition (Inria and University of Lille, France)
We invite applications for a 3-year PhD position at the University of
Lille in the context of the recently funded research project
"COMANCHE" (Computational Models of Lexical Meaning and Change). The
position is funded by Inria, the French national research institute in
Computer Science and Applied Mathematics.
COMANCHE proposes to transfer and adapt neural word embeddings
algorithms to model the acquisition and evolution of word meaning, by
comparing them with linguistic theories on language acquisition and
language evolution. At the intersection between Natural Language
Processing, psycholinguistics and historical linguistics, this project
intends to validate or revise some of these theories, while also
developing computational models that are less data-hungry and less
computationally intensive, as they exploit new inductive biases
inspired by these disciplines.
The first strand of the project, on which the successful candidate
will work, focuses on the development of computational models of
semantic memory and its acquisition. Two main research directions will
be pursued. On the one hand, we will compare the structural properties
associated with different semantic spaces derived from word embedding
algorithms to those found in human semantic memory as reflected in
behavioral data (such as typicality norms) as well as brain imaging
data. The latter data will then be used as additional supervision to
inject more hierarchical structure into the learned semantic
spaces. On the other hand, we intend to experiment with training
regimes for word embedding algorithms that are closer to those of
humans when they acquire language, controlling the quantity as well as
the linguistic complexity of the inputs fed to the learning algorithms
through the use of longitudinal and child directed speech corpora
(e.g., CHILDES, Colaje). In both cases, both English and French data
will be considered.
The successful candidate holds a Master's degree in computational
linguistics or computer science or cognitive science and has prior
experience in word embedding models. Furthermore, the candidate should
have strong programming skills and expertise in machine learning
approaches, and be eager to work across languages.
The position is affiliated with the MAGNET team at Inria, Lille [1] as
well as with the SCALAB group at University of Lille [2] in an effort
to strengthen collaborations between these two groups, and ultimately
foster cross-fertilizations between Natural Language Processing and
Psycholinguistics.
Applications will be considered until the position is filled. However,
you are encouraged to apply early as we shall start processing the
applications as and when they are received. Applications, written in
English or French, should include a brief cover letter with research
interests and vision, a CV (including your contact address, work
experience, publications), and contact information for at least 2
referees. Applications (and questions) should be sent to Angèle
Brunellière (angele.brunelliere(a)univ-lille.fr) and Pascal Denis
(pascal.denis(a)inria.fr).
The starting date of the position is 1 October 2022 or soon
thereafter, for a total of 3 full years.
Best regards,
Angèle Brunellière and Pascal Denis
[1] https://team.inria.fr/magnet/
[2] https://scalab.univ-lille.fr/
--
Pascal
----
For independent, transparent, and rigorous evaluation!
I support the Inria Evaluation Committee.
----
+++++++++++++++++++++++++++++++++++++++++++++++
Pascal Denis
Equipe MAGNET, INRIA Lille Nord Europe
Bâtiment B, Avenue Heloïse
Parc scientifique de la Haute Borne
59650 Villeneuve d'Ascq
Tel: +33 3 59 35 87 24
Url: http://researchers.lille.inria.fr/~pdenis/
+++++++++++++++++++++++++++++++++++++++++++++++
Australia's publicly funded science organisation, the CSIRO, is offering a full PhD scholarship on the topic of: "Using AI and NLP to assist in scientific and technical document creation".
Recent methods in artificial intelligence (AI) and natural language processing (NLP) have shown the tremendous potential of neural network language models for text generation. However, there are many areas that can still be improved through further research and development. These relate to issues such as: (1) the system-generated text is not factually consistent, with respect to the source documents, (2) inappropriate wording or style (e.g., formal vs informal) is present in the generated text, and (3) the coherence and overall readability of the passage is poor, even if the component sentences seem fluent.
Working with text generation and document creation scenarios that utilise scientific and technical literature, the candidate will research solutions to problems like the ones described above. The candidate will draw on methods from NLP and user-centric approaches, noting that these AI methods often have the most impact when they consider the context of the user, or author, and the AI-collaborative scenario in which the generated text is being used.
The successful candidate should have a passion for working with natural language data, an interest in linguistics, and a desire to research novel NLP algorithms and prepare new NLP data sets. Additional time to build up expertise in data science/machine learning/user-centered methods can be negotiated.
For more details, please email Stephen.Wan(a)csiro.au, or apply at https://jobs.csiro.au/job/Sydney%2C-NSW-Data61-PhD-scholarships/900646900/
First round application deadline: 8 July 2022
----------------------------------------------------------
Dr Stephen Wan
Team Leader, Senior Research Scientist | Language and Social Computing Team
CSIRO's Data61
E stephen.wan(a)data61.csiro.au
T +61 2 93724703
W http://people.csiro.au/W/S/Stephen-Wan.aspx | www.data61.csiro.au
Special Issue:
Current Trends in Natural Language Processing (NLP) and Human Language Technology (HLT)
MATHEMATICS
NEW IMPACT FACTOR 2.592
An Open Access Journal by MDPI
link: https://www.mdpi.com/journal/mathematics
Guest Editor:
* Florentina Hristea, University of Bucharest
Deadline for manuscript submissions: April 23, 2023
(Please note that this deadline will not be extended.)
Message from the Guest Editor and Special Issue Web page:
https://www.mdpi.com/si/mathematics/NLP_HLT
A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page: https://www.mdpi.com/journal/mathematics/instructions
For further information and questions, please contact:
Florentina Hristea
University of Bucharest
fhristea(a)fmi.unibuc.ro
https://cs.unibuc.ro/~fhristea/
Dear colleagues and friends,
*We invite submissions to a special issue on "Information Extraction and
Language Discourse Processing" of the journal Information
<https://www.mdpi.com/journal/information> (ISSN 2078-2489).*
*Special Issue Information*
This Special Issue seeks novel research reports on the spectrum that blends
information extraction and language discourse processing research in
diverse communities. The editors welcome submissions along various
dimensions derived from the nature of the extraction task, the advanced
neural techniques used for extraction, the variety of input resources
exploited, and the type of output produced. Quantitative, qualitative, and
mixed methods studies are welcome, as are case studies and experience
reports if they describe an impactful application at a scale that delivers
useful lessons to the journal readership.
Topics of interest include (but are not limited to):
- Knowledge base population with discourse-centric information
extraction (IE)
- Coreference resolution and its impact on discourse-centric IE
- Relationship extraction leveraging linguistic discourse
- Template filling
- Impact of pragmatics or rhetorics on information extraction
- Discourse-centric IE at scale
- Intelligent and novel assessment models of discourse-centric IE
- Survey of discourse-centric IE in natural language processing (NLP)
- Challenges implementing discourse-centric IE in real-world scenarios
- Modeling domains using discourse-centric IE
- Human–AI hybrid systems for learning discourse and IE
*Submission Instructions*
https://www.mdpi.com/journal/information/special_issues/WYS02U2GTD
*Deadline for manuscript submissions* Submissions to the SI will be
accepted and published on a rolling basis until the close of the issue on
10 December 2022.
Yours cordially,
Dr. Jennifer D'Souza
Prof. Dr. Chengzhi Zhang
*Guest Editors*
** With apologies for multiple posting **
--------------
Onto4FAIR Workshop at SEMANTiCS 2022
--------------
1st Workshop on Ontologies for FAIR and FAIR Ontologies (Onto4FAIR)
in conjunction with SEMANTiCS 2022, Vienna, Austria
Website: https://onto4fair.github.io
--------------
Important dates
--------------
- July 04, 2022 (11:59 pm, Hawaii time): Submission deadline
- July 30, 2022 (11:59 pm, Hawaii time): Notification of acceptance
- August 15, 2022 (11:59 pm, Hawaii time): Camera-ready version
- September 13, 2022: Workshop
The workshop is planned to take place physically in Vienna, Austria.
https://2022-eu.semantics.cc/venue
--------------
Presentation
--------------
Making the huge and diverse kinds of data produced by researchers, data stewards, and service providers fully reusable and understood requires specific efforts. The Findable, Accessible, Interoperable and Reusable (FAIR) principles were elaborated to address these issues, describing a set of requirements for data reusability and interoperability. These principles have been gaining increasing attention in a range of different areas and applications, including in the industrial area.
A key aspect in making data FAIR is the ability of machines to automatically find, access, interoperate, and reuse data with no or minimal human intervention. For that, the ability to describe data properly and semantically is essential.
The workshop has the following main goals:
- to bring together leaders from academia, industry and user institutions to discuss the adoption of FAIR principles in real-world requirements.
- to inform industry and user representatives about existing research efforts that may meet their requirements.
- to investigate how the FAIR principles are supported by the use of schemes, vocabularies, and ontologies that ideally are themselves FAIR.
- to discuss the challenges and perspectives in adopting FAIR principles.
--------------
Workshop topics
--------------
The topics of interest include, but are not limited to, the following:
- schemes, ontologies and vocabularies for FAIR data and metadata;
- domain and cross-domain ontologies for FAIR data;
- making vocabularies and ontologies FAIR;
- alignment of schemes, vocabularies and ontologies for FAIR;
- data management for FAIR data;
- best practices for implementing the FAIR principles;
- FAIRification process and use cases;
- metrics for FAIRness assessment;
- provenance in FAIR environments;
- FAIR principles and open science;
- FAIR principles and linked open data;
- FAIR in industry, scientific communities (life science, digital humanities, health, smart cities, etc.).
--------------
Submissions
--------------
- Full research papers: 12 pages (including references).
- Short papers: 6 pages (including references).
Please submit your contribution on EasyChair (https://easychair.org/conferences/?conf=onto4fair).
Submissions must be in PDF, formatted in the style of LNCS conference proceedings.
The workshop proceedings will be published in the CEUR-WS.org online proceedings.
--------------
Workshop Chairs
--------------
- Luiz Olavo Bonino da Silva Santos, University of Twente and Leiden University Medical Center, the Netherlands
- Giancarlo Guizzardi, Free University of Bozen-Bolzano, Italy & University of Twente, the Netherlands
- Clement Jonquet, French National Research Institute for Agriculture, Food and Environment, Mathematics, Informatics and STatistics for Environment and Agronomy research unit, Montpellier, France
- Cassia Trojahn, Institut de Recherche en Informatique de Toulouse, France
--------------
Program Committee
--------------
(to be completed)
Joao Paulo Almeida, Federal University of Espirito Santo
Emna Amdouni, Université de Lyon 2
Nathalie Aussenac-Gilles, IRIT CNRS
Maria Luiza Campos, PPGI - IM/NCE - Federal University of Rio de Janeiro
Daniel Garijo, Universidad Politécnica de Madrid
Nicolas Matentzoglu, Semanticly Ltd
María Poveda-Villalón, Universidad Politécnica de Madrid
Tiago Prince Sales, Free University of Bozen-Bolzano