CEA List, a research institute of Paris-Saclay University, is looking for a Postdoctoral Fellow to join its laboratory of semantic analysis of texts and images.
In the context of the DeepGenSeq project, the person hired will join an interdisciplinary team aiming to move closer to the goal of predictive and generative artificial intelligence for biology by exploiting deep contextual language models of biological sequences, whose representations generalize to several applications such as the prediction of mutational effects.
BACKGROUND
Exponential growth in sequencing throughput, together with the sampling of natural (uncultured) populations, is providing a deeper view of the diversity of protein sequences across the tree of life. Proteins are molecular engines sustaining cellular life and the unobserved determinants of their structure and function are encoded in the distribution of observed natural sequences. Therefore, such vast amounts of (unlabeled) sequences provide evolutionary data that can form the ground for unsupervised learning of predictive and generative models of biological function.
Recent advances in machine learning, with the development of the transformer architecture, have allowed the emergence of powerful language models that can be used to model protein sequences. Through transfer learning, the learned representations can be used to detect homology (i.e. the relatedness between two protein sequences), predict secondary and tertiary structures, predict residue-residue contacts or predict fluorescence landscapes.
CHALLENGES AND OBJECTIVES
Our focus here will be to develop high-capacity transformer-based language models on protein sequence data. Intrinsic organising principles captured in the resulting representations can then be applied in transfer learning settings to different predictive sub-tasks using limited experimental data (e.g. the effect of sequence variation on protein function). Following promising recent results, we plan to also explore zero-shot inference with no additional training and/or supervision from experimental data.
Responsibilities:
* Tune and optimize existing unsupervised transformer-based language models for protein sequences.
* Develop and optimize code and machine learning algorithms for predictive models.
* Integrate and analyze large data volumes.
* Interact continuously with scientists in an interdisciplinary team.
APPLICATION
This project will be an excellent opportunity for a candidate who is looking to contribute to cutting-edge research and to train with experts in the field. We are seeking a detail-oriented computer scientist and problem solver who is passionate about science. This 2-year position is open to a range of candidates, from recent college graduates to more experienced scientists (e.g. post-docs).
The ideal candidate should have the following qualifications:
* Ph.D. or M.Sc. in Applied Mathematics, Computer Science, or Computational Biology.
* Experience in Deep Learning methods.
* Experience with Python, open-source software libraries for machine learning and Linux.
* Strong mathematical background and analytical skills.
* Effective organizational skills, e.g. the ability to prioritize work and contribute to the planning of a program of scientific research.
* Demonstrated interpersonal skills including both the ability to work independently and perform collaborative research in an interdisciplinary team environment.
* Good oral and written communication skills.
Preferred: Previous experience with transformer-based techniques for NLP pre-training and transformer language models.
TERMS & COMPENSATION
This 2-year position is open to a range of candidates, from recent college graduates to more experienced scientists (e.g. post-docs) – the chosen candidate's salary will be commensurate with their level of education, skills, and experience. Other benefits include:
- 48 days of paid holidays
- on-site subsidized restaurant
- partial remote work is possible, up to 3 days per week within the limit of 100 days per year
- CEA contribution to the personal company savings plan
LOCATION
We are based on the Paris-Saclay research campus in the south of Paris, France.
HOW TO APPLY
Interested candidates should submit a resume and short cover letter to deepgenseq «at» saxifrage.saclay.cea.fr
ABOUT US
About CEA-List: https://list.cea.fr/en/
About the LASTI lab: https://kalisteo.cea.fr/index.php/ai/ and https://kalisteo.cea.fr/index.php/textual-and-visual-semantic/
About Genoscope: https://www.genoscope.cns.fr
NooJ 2023: Last Call for Papers & Deadline Extension
Dear Colleagues,
Due to many requests and expressed interest, the submission deadline has been extended.
The new deadline for abstract submission is now January 29th, 2023.
*********************************************************************************
Conference URL: https://conference.unizd.hr/noojconference/
*********************************************************************************
The 17th NooJ International Conference 2023
Zadar, Croatia
May 31st – June 2nd, 2023
*********************************************************************************
The University of Zadar (Department of Classical Philology and Department of Information Sciences), in cooperation with the Centre de Recherches Interdisciplinaires et Transculturelles (C.R.I.T.) from the Université de Franche-Comté (Besançon) and the NooJ association, is organizing the 17th NooJ International Conference 2023, to be held from May 31st to June 2nd, 2023 in Zadar (Croatia).
NooJ annual conferences give NooJ users the opportunity to meet and share their experience as developers, researchers and teachers; to present the latest linguistic resources, Digital Humanities experiments and NLP applications developed with NooJ; to offer researchers and graduate students a tutorial to help them parse corpora and build NLP applications with NooJ.
ABOUT NOOJ
********************
NooJ is a linguistic development environment software as well as a corpus processor. NooJ provides linguists with tools to develop dictionaries, Regular Grammars, Context-Free Grammars, Context-Sensitive Grammars, as well as their graphical equivalents, to formalize various linguistic phenomena. NooJ’s multi-layer approach allows linguists to accumulate elementary descriptions across different linguistic levels.
NooJ is used as a corpus processor in the Digital Humanities as it allows researchers in the Social sciences to apply sophisticated queries to large corpora in real time, annotate texts automatically and perform various statistical analyses.
NooJ’s linguistic engine has been integrated into various NLP applications that perform automatic semantic annotation, Named Entity Recognition, Information Extraction, Paraphrase Generation, Business Intelligence, Machine Translation, Web Semantics.
NooJ is free, open-source software promoted by the METASHARE European programme. It can run on Windows (C# .NET), macOS, LINUX and UNIX (Java). Its new engine and its source “RA” can be downloaded from GitLab and run natively on Windows, macOS and LINUX.
TOPICS OF INTEREST
********************
* Linguistic Resources: Typography, Spelling, Syllabification, Phonemic and Prosodic Transcription, Morphology, Lexical Analysis, Local Syntax, Structural Syntax, Transformational Analysis, Paraphrase Generation, Semantic Annotations, Semantic Analysis.
* Digital Humanities: Corpus Linguistics, Discourse Analysis, Literature Studies, Second-Language Teaching, Narrative content analysis, Corpus processing for the Social Sciences.
* Natural Language Processing Applications: Business Intelligence, Text Mining, Text Generation, Language Teaching Software, Automatic Paraphrasing, Machine Translation, etc.
SUBMISSIONS
********************
We invite the submission of abstracts in English until January 29th, 2023 (extended from January 15th, 2023). Abstracts should be between 300 and 600 words and submitted via Easy Abstract: http://linguistlist.org/easyabs/nooj2023. The scientific committee will review all proposals and authors will be given notice of acceptance of their papers no later than March 1st, 2023. All papers must be original and cannot simultaneously be submitted to another journal or conference.
IMPORTANT DATES
********************
Abstract submission: January 29th, 2023
Notification of acceptance: March 1st, 2023
Camera-ready abstract submission: March 20th, 2023
Early Registration: until April 15th, 2023
Selected papers submission: September 13th, 2023
POST-PROCEEDINGS
********************
A selection of the papers presented at NooJ 2023 will be published by Springer Verlag in their CCIS Series. The deadline for submission of full camera-ready papers is September 13th, 2023.
Meeting Location: Zadar, Croatia
Contact Information: Linda Mijić nooj2023conf(a)gmail.com
Meeting Dates: May 31st, 2023 to June 2nd, 2023
Abstract Submission Information: Abstracts can be submitted from November 11th, 2022 until January 29th, 2023.
I am looking for brilliant candidates who hold (or are about to hold) an MSc in Computer Science with a strong NLP research background for two fully funded PhD studentships at Queen Mary University of London (QMUL), School of Electronic Engineering and Computer Science.
Next generation methods for meaning change
The PhD student will develop new tools and models for detecting, understanding and predicting meaning change across languages. This student will be part of the Change is Key!<https://www.changeiskey.org/> international research program, and will be co-supervised by Prof Maria Liakata. To learn more about the project please visit here<http://eecs.qmul.ac.uk/phd/phd-studentships/principal-and-epsrc-dtp-phd-stu…> (under the section: Next generation NLP methods for meaning change).
Eligibility: This fully funded 3-year Principal's studentship (tuition fees + monthly stipend) is open for UK home students only. For further details on the application process and eligibility criteria please read this<http://eecs.qmul.ac.uk/phd/phd-studentships/principal-and-epsrc-dtp-phd-stu…>.
Understanding neural representations via their algebraic-topological structures.
The PhD student will work in an interdisciplinary environment and at the forefront of NLP research and will develop new methods to better understand and describe how neural Language Models work. The student will be co-supervised by Dr Omer Bobrowski. To learn more about the project please visit here<http://eecs.qmul.ac.uk/phd/phd-studentships/csc-phd-studentships-in-electro…> (under the section: Understanding neural representations via their algebraic-topological structures).
Eligibility: This fully funded 4-year studentship (tuition fees + monthly stipend) is part of the collaboration scheme between QMUL and the China Scholarship Council (CSC), and is therefore open for Chinese students only. Additional information about the scheme's requirements (English Language test (IELTS) from the last 2 years among other things) is found here<https://www.qmul.ac.uk/scholarships/items/china-scholarship-council-scholar…>.
If you are interested, please get in touch with me at h.dubossarsky(a)qmul.ac.uk.
Best,
Haim
25th ACM International Conference on Multimodal Interaction (ICMI 2023)
9-13 October 2023, Paris, France
The 25th International Conference on Multimodal Interaction (ICMI 2023) will
be held in Paris, France. ICMI is the premier international forum that
brings together multimodal artificial intelligence (AI) and social
interaction research. Multimodal AI encompasses technical challenges in
machine learning and computational modeling such as representations, fusion,
data and systems. The study of social interactions encompasses both human-human
interactions and human-computer interactions. A unique aspect of ICMI is its
multidisciplinary nature which values both scientific discoveries and
technical modeling achievements, with an eye towards impactful applications
for the good of people and society.
ICMI 2023 will feature a single-track main conference which includes:
keynote speakers, technical full and short papers (including oral and poster
presentations), demonstrations, exhibits, doctoral consortium, and
late-breaking papers. The conference will also feature tutorials, workshops
and grand challenges. The proceedings of all ICMI 2023 papers, including
Long and Short Papers, will be published by ACM as part of their series of
International Conference Proceedings and Digital Library, and the adjunct
proceedings will feature the workshop papers.
Novelty will be assessed along two dimensions: scientific novelty and
technical novelty. Accepted papers at ICMI 2023 will need to be novel along
one of the two dimensions:
* Scientific Novelty: Papers should bring new scientific knowledge about
human social interactions, including human-computer interactions. For
example, discovering new behavioral markers that are predictive of mental
health or how new behavioral patterns relate to children's interactions
during learning. It is the responsibility of the authors to perform a proper
literature review and clearly discuss the novelty in the scientific
discoveries made in their paper.
* Technical Novelty: Papers should propose novelty in their computational
approach for recognizing, generating or modeling multimodal data. Examples
include: novelty in the learning and prediction algorithms, in the neural
architecture, or in the data representation. Novelty can also be associated
with new usages of an existing approach.
Please see the Submission Guidelines for Authors https://icmi.acm.org/ for
detailed submission instructions. Commitment to ethical conduct is required
and submissions must adhere to ethical standards in particular when
human-derived data are employed. Authors are encouraged to read the ACM Code
of Ethics and Professional Conduct (https://ethics.acm.org/).
ICMI 2023 conference theme: The theme for this year's conference is "Science
of Multimodal Interactions". As the community grows, it is important to
understand the main scientific pillars involved in deep understanding of
multimodal social interactions. As a first step, we want to acknowledge key
discoveries and contributions that the ICMI community enabled over the past
20+ years. As a second step, we reflect on the core principles, foundational
methodologies and scientific knowledge involved in studying and modeling
multimodal interactions. This will help establish a distinctive research
identity for the ICMI community while at the same time embracing its
multidisciplinary collaborative nature. This research identity and long-term
agenda will enable the community to develop future technologies and
applications while maintaining commitment to world-class scientific
research.
Additional topics of interest include but are not limited to:
* Affective computing and interaction
* Cognitive modeling and multimodal interaction
* Gesture, touch and haptics
* Healthcare, assistive technologies
* Human communication dynamics
* Human-robot/agent multimodal interaction
* Human-centered A.I. and ethics
* Interaction with smart environment
* Machine learning for multimodal interaction
* Mobile multimodal systems
* Multimodal behaviour generation
* Multimodal datasets and validation
* Multimodal dialogue modeling
* Multimodal fusion and representation
* Multimodal interactive applications
* Novel multimodal datasets
* Speech behaviours in social interaction
* System components and multimodal platforms
* Visual behaviours in social interaction
* Virtual/augmented reality and multimodal interaction
Important Dates
Paper Submission: May 1, 2023
Rebuttal period: June 26-29, 2023
Paper notification: July 21, 2023
Camera-ready paper: August 14, 2023
Presenting at main conference: October 9-13, 2023
Dear Colleagues,
We are pleased to announce that the submission deadline for the 10th International Contrastive Linguistics Conference (ICLC-10) in Mannheim, Germany, has been extended to 31st January 2023.
Best regards,
Marc
(on behalf of the organizers)
Final call for papers: ICLC-10
The Leibniz Institute for the German Language in Mannheim is pleased to announce the 10th International Contrastive Linguistics Conference (ICLC-10). The conference will take place in Mannheim,
Germany, from 18 to 21 July 2023.
The aim of the ICLC conference series, running since 1998, is to encourage fine-grained cross-linguistic research comprising two or more languages from a broad range of theoretical and methodological
perspectives. ICLC brings together researchers from different linguistic subfields (and neighboring disciplines) to continue the (interdisciplinary) dialog on comparing languages, to foster the
development of an international community, to discuss the state of the art, and to advance possible new areas of cross-linguistic research. Contrastive Linguistics as a linguistic subfield has had a
checkered history, but comparative and contrastive work has always been and continues to be an important part of linguistic research. New impulses for comparative and contrastive work include the
increasing availability of multilingual corpora or comparative work drawing on naturalistic interaction data. At this anniversary edition of ICLC, we want to provide a stage for the presentation of
such new work, and reflect on the past, current and future developments of contrastive research in linguistics.
We invite contributions addressing (meta)theoretical, methodological or empirical issues, such as (but not limited to) the following:
* Comparison of phenomena in two or more languages addressing topics from any area and level of linguistic analysis, including lexicon, phonetics and phonology, morphology, syntax and morphosyntax,
semantics, pragmatics as well as matters such as register and socio-cultural context
* The state of the art and recent advances in contrastive linguistic research
* The aims, objectives and scope of contrastive linguistic research
* The status of contrastive research within linguistic studies and its relationship with neighbouring or complementary approaches such as historical, typological, micro-variationist, intercultural
and contact linguistics
* The link between contrastive studies and fields of applied linguistics such as foreign language teaching and learning, translation studies and corpus linguistics
* Potentials and limits of theoretical frameworks in relation to contrastive analysis (e.g., functional, cognitive, interactional, generative, constructional approaches)
* Theoretical and theoretical-methodological issues (comparability, incommensurability, the socio-cultural context, tertia comparationis, language universals)
* Empirical and data-related methodological issues (parallel / translation corpora, comparable corpora, learner corpora, multimodal corpora, naturalistic data of face-to-face interaction, psycho- and
neurolinguistic experiments, surveys)
* The significance of the contrastive perspective for language-specific description on the one hand and for cross-linguistic generalizations and the development of linguistic theory on the other hand
Some of these issues will be addressed by five invited keynote speakers.
Keynote speakers are:
* Artemis Alexiadou (Humboldt-Universität zu Berlin and Leibniz-Centre for General Linguistics, Germany)
* Jenny Audring (Leiden University, The Netherlands)
* Elwys De Stefani (University of Heidelberg, Germany, and KU Leuven, Belgium)
* Martin Haspelmath (Max Planck Institute for the Science of Human History, Germany)
* Hilde Hasselgård (University of Oslo, Norway)
The conference will include a poster session. The conference language will be English. Following the conference, all participants will be offered the possibility to submit their contribution for
publication in a volume of selected conference papers.
Submission of Abstracts
We invite submissions for 20-minute oral presentations (plus 10 minutes for discussion) or poster presentations. Abstracts should formulate a clear research question and include a description of the
methods, results and conclusions. All submissions will be reviewed anonymously by at least two reviewers.
One person may submit only one (oral or poster) paper as the first author. The number of co-authored submissions is not limited. However, presenting more than one paper (oral or poster) at the
conference by a single person should be avoided.
All submissions must be in English, fully anonymous, and no longer than one page (12 point Times New Roman), with up to one additional page for data, figures and references.
Abstracts must be submitted via the EasyChair system through the following submission web page https://easychair.org/conferences/?conf=iclc10
Important Dates
* 16.01.2023: Deadline for abstract submission
* 31.01.2023: Extended deadline for abstract submission
* 31.03.2023: Notification of acceptance
* 14.04.2023: Confirmation of participation
* 18.07.2023: Arrival, Registration, Get-together
* 19.-21.07.2023: Conference
Conference Web Site
https://iclc10.ids-mannheim.de
Organizing Committee
Beata Trawinski (Chair)
Marc Kupietz
Kristel Proost
Jörg Zinken
*** *FINAL* Call for Abstracts and *Extended Deadline*
*** 2023 NARNiHS Research Incubator
*** North American Research Network in Historical Sociolinguistics
*** 5th edition
20-22 April 2023 - entirely online!
The 2023 NARNiHS Research Incubator will take place as an entirely online
event (with free registration). This presents a great opportunity for
scholars in historical sociolinguistics from all over the world to
participate as presenters and/or attendees without the limitations imposed
by international travel, and we encourage our fellow historical
sociolinguists, and scholars from related fields, from our global scholarly
community (in addition to North America), to join us online for our
Research Incubator this spring.
==> NEW Abstract submission deadline:
==> 15 January 2023, 11:59 PM (U.S. Eastern Time).
==> Abstract submission online:
==> http://linguistlist.org/easyabs/NARNiHS2023_RI
The North American Research Network in Historical Sociolinguistics (NARNiHS)
is accepting abstracts for its 2023 NARNiHS Research Incubator. Building on
the great success of the first four years, the 5th edition of this unique
kind of NARNiHS conference seeks to provide a collaborative environment
where presenters bring work that is in-progress, exploratory,
proof-of-concept, prototyping; and the audience actively participates in
the brainstorming and workshopping of those new ideas. We see the
NARNiHS Research Incubator as a place for testing/pushing boundaries; developing new
theories, methods, models, tools; seeking feedback from peers willing to
engage in productive assessment of fledgling ideas and nascent projects.
Successful abstracts for this research incubator environment will
demonstrate thorough grounding in the field, scientific rigor in the
formulation of research questions, and promise for rich discussion of ideas.
NARNiHS welcomes papers in all areas of historical sociolinguistics, which
is understood as the application/development of sociolinguistic theories,
methods, and models for the study of historical language variation and
change over time, or more broadly, the study of the interaction of language
and society in historical periods and from historical perspectives. Thus,
a wide range of linguistic areas, subdisciplines, and methodologies easily
find their place within the field, and we encourage submission of abstracts
that reflect this broad scope.
We are soliciting abstracts for 25-minute presentations. Presenters will
have the entire 25 minutes for their presentations, with discussion
happening in the "incubation session" at the end of each panel. Abstracts
should be no more than one page (not including examples and references, see
below). Abstracts will be accepted until the extended deadline of 15 January
2023; late abstracts will not be considered.
Successful abstracts will be explicit about which theoretical frameworks,
methodological protocols, and analytical strategies are being applied or
critiqued; and data sources and examples should be sufficiently (if
briefly) presented, so as to allow reviewers a full understanding of the
scope and claims of the research. Please note that the connection of your
research to the field of historical sociolinguistics should be explicitly
outlined in your abstract. Failure to adhere to these criteria will likely
result in non-acceptance.
To encourage maximum exchange of ideas in the brainstorming/workshopping
environment of the NARNiHS Research Incubator, presentations will be
grouped into thematic panels of three presentations, each panel followed by
an hour-long discussion with the audience led by specialists. Discussion
will encompass specific feedback on the individual papers as well as
consideration of overarching questions of theory, methods, and models
emerging from the papers. To facilitate such discussion, authors will be
required to submit a draft of their presentation materials for distribution
to the panel discussants and to the other presenters 10 days prior to the
start of the conference.
General Requirements:
1) Abstracts must be submitted electronically, using the following link:
http://linguistlist.org/easyabs/NARNiHS2023_RI
2) Papers must be delivered as projected in the abstract or represent bona
fide developments of the same research.
3) Authors are expected to virtually attend the conference and present
their own papers.
4) Presentations will be delivered via a video-conferencing platform, most
likely Zoom. Technical details and instructions regarding the platform for
our NARNiHS Research Incubator will be sent to authors in due time.
Content Requirements:
1) Abstracts should be explicit about which theoretical frameworks,
methodological protocols, and analytical strategies are being applied or
critiqued.
2) Data sources and examples should be sufficiently (if briefly) presented,
so as to allow reviewers a full understanding of the scope and claims of
the research.
3) The connection of your research to the field of historical
sociolinguistics should be explicitly outlined.
Abstract Format Guidelines:
1) Abstracts must be submitted in PDF format.
2) Abstracts must fit on one standard 8.5×11 inch page, with margins no
smaller than 1 inch and a font style and size no smaller than Times New
Roman 12 point. All additional content (visualizations, trees, tables,
figures, captions, examples, and references) must fit on a single (1)
additional page. No exceptions to these requirements are allowed.
3) Anonymize your abstract. We realize that sometimes it is not possible to
attain complete anonymity, but there is a difference between "inability to
anonymize completely" (due to the nature of the research) and "careless
non-anonymizing" (for example: "In Jones 2021, I describe..."). In
addition, be sure to anonymize your PDF file (you may do so in Adobe
Acrobat Reader by clicking on "File", then "Properties", removing your name
if it appears in the "Author" line of the "Description" tab, and re-saving
before submitting it). Please be aware that abstract file names might not
be automatically anonymized by the system; do not use your name (e.g.
Smith_Abstract.pdf) when saving your abstract in PDF format, rather, use
non-identifying information (e.g. HistSoc4Lyfe_NARNiHS.pdf). Your name
should only appear in the online form accompanying your abstract
submission. Papers that are not sufficiently anonymized wherever possible
(whether in the text of the abstract or in the metadata of the digital
file) risk being rejected.
Please contact us at NARNiHistSoc(a)gmail.com with any questions.
########################################################################
Call for Participation
The 4th Slav-NER Shared Task on
Named Entities in Slavic Languages:
Recognition, Normalization, Classification and Cross-Lingual Linking
co-located with the Slav-NLP <http://bsnlp.cs.helsinki.fi/> Workshop, EACL 2023
http://bsnlp.cs.helsinki.fi/shared-task.html
TASK DESCRIPTION:
The 4th Slav-NER Shared Task focuses on Named Entities in Slavic languages.
Due to rich inflection, free word order, derivation, and other phenomena common to the Slavic languages, work on Named Entities poses important challenges. Fostering research & development on the problems of Named Entities — detecting names, lemmatization (normalization), classification, and cross-lingual matching — is crucial for information access and wider use of NLP in Slavic languages.
The 4th Slav-NER Shared Task covers three languages:
* Czech,
* Polish,
* Russian,
and five types of named entities:
* persons,
* locations,
* organizations,
* events,
* products.
For information about training and test data, guidelines, and participation, please see the Shared Task Home Page. <http://bsnlp.cs.helsinki.fi/shared-task.html>
IMPORTANT: Participants are NOT required to perform all tasks or to cover all languages. For example, a monolingual entry, without lemmatization of the names, can participate.
The Shared Task focuses on cross-lingual extraction of named entities: the systems should recognize, classify, and extract all mentions of a name in a document; detecting the position of each name mention is NOT required. Name mentions should be lemmatized, and mentions referring to the same real-world object should be linked across documents and languages. The text collection consists of sets of documents retrieved from the Web, each set about a certain major entity or event. The corpus was collected by crawling the Web and parsing the HTML documents.
For background, see the details about the 1st edition (2017) <http://bsnlp-2017.cs.helsinki.fi/shared_task.html>, 2nd edition (2019) <http://bsnlp.cs.helsinki.fi/bsnlp-2019/shared_task.html> and the 3rd edition (2021) <http://bsnlp.cs.helsinki.fi/shared-task.html> of this shared task.
Participation
Teams that wish to participate should register via email to: bsnlp(a)cs.helsinki.fi, with the following information:
* name of team,
* team members,
* contact person,
* contact email.
Important Dates
* Shared task announcement: 11 January 2023 ⇒ Training data available
* Registration deadline: 19 February 2023
* Release of test data to registered participants: 20 February 2023
* Submission of system responses: 22 February 2023
* Results announced to participants: 24 February 2023
* Submission of shared task papers (optional): 27 February 2023
Call for Papers
SNLP: The 9th Workshop on NLP for Slavic languages <http://bsnlp.cs.helsinki.fi/>
2 May 2023 or 6 May 2023
co-located with EACL 2023
http://bsnlp.cs.helsinki.fi/
Submission Deadline: 27 February 2023
WORKSHOP DESCRIPTION
The 9th edition of the SNLP Workshop at EACL
Sponsored by SIGSLAV: the ACL Special Interest Group on Slavic NLP
The languages from the Slavic group play an important role due to their diverse cultural heritage and widespread use — with over 400 million speakers worldwide. The current political and economic developments in Central and Eastern Europe bring Slavic societies and languages into focus in terms of rapid technological advancement and expanding consumer markets.
Research on theoretical and applied topics in the context of Slavic languages is still lagging in the community. Linguistic phenomena that are common to Slavic languages (rich morphology, free word order, etc.) make NLP for these languages a challenging task. Slavic NLP gathers researchers from academia and industry. It aims to stimulate research in Slavic NLP, and to foster the creation of tools and resources. The Workshop provides a forum for exchanging ideas and experience, discussing current challenges, and making the available resources widely known. The structural similarity, as well as the easily recognizable core vocabulary and inflectional inventory spanning this entire large language group, creates a special environment where researchers can appreciate the shared problems and communicate naturally, despite the lack of mutual intelligibility. This year, we are especially glad to have an opportunity to organize Slav NLP in a Slavic-speaking country.
This Workshop addresses Natural Language Processing (NLP) for the Slavic languages. The NLP tasks in urgent need of attention include:
* language modeling,
* morphological analysis and generation,
* syntactic and semantic parsing,
* lexical semantics,
* named-entity recognition,
* text normalisation and processing non-standard language,
* coreference resolution,
* information extraction,
* question answering,
* text summarization,
* machine translation,
* development of linguistic resources,
* text classification,
* disinformation detection,
* fact verification,
* sentiment analysis.
This Workshop continues the proud tradition established by the 8 previous BSNLP Workshops.
IMPORTANT DATES
* Submission deadline: 27 February 2023
* Notification of acceptance: 19 March 2023
* Camera-ready papers due: 27 March 2023
* Workshop: 2 or 6 May 2023
SHARED TASK
This year's SNLP features the 4th edition of the Shared Task on Multilingual Named Entity Recognition: recognizing mentions of named entities in Web documents, lemmatization, and cross-lingual matching in Slavic languages. The shared task covers:
* Czech,
* Polish,
* Russian.
Information about the Shared Task and training data is available on the Workshop web page.
SUBMISSION INFORMATION
At the Workshop Web page: bsnlp.cs.helsinki.fi <http://bsnlp.cs.helsinki.fi/call-for-papers.html>
Workshop contact email address: bsnlp(a)cs.helsinki.fi
--
Roman Yangarber
Associate Professor, University of Helsinki
Digital Humanities
INEQ: Helsinki Inequality Initiative <https://helsinki.fi/en/ineq-helsinki-inequality-initiative> — Linguistic Inequalities and Translation Technologies
e-Learning & language learning: helsinki.fi/revita <https://www.helsinki.fi/revita>
Language Learning Lab: helsinki.fi/language-learning-lab <https://www.helsinki.fi/language-learning-lab>
Unioninkatu 40, Metsätalo A214 mobile: +358 50 41 51 71 3
Helsinki, Finland
------------------------------------------------------------------------
BIONLP 2023 and Shared Tasks @ ACL 2023
https://aclweb.org/aclwiki/BioNLP_Workshop#SHARED_TASKS_2023
WORKSHOP OVERVIEW AND SCOPE
The BioNLP workshop associated with the ACL SIGBIOMED special interest group has established itself as the primary venue for presenting foundational research in language processing for the biological and medical domains. The workshop has run every year since 2002 and continues to grow stronger. BioNLP welcomes and encourages work on languages other than English, and inclusion and diversity. BioNLP truly encompasses the breadth of the domain and brings together researchers in bio- and clinical NLP from all over the world. The workshop will continue presenting work on a broad and interesting range of topics in NLP. Interest in biomedical language has broadened significantly due to the COVID-19 pandemic and continues to grow: as access to information becomes easier and more people generate and access health-related text, it becomes clearer that only language technologies can enable and support adequate use of biomedical text.
BioNLP 2023 will be particularly interested in language processing that supports DEIA (Diversity, Equity, Inclusion and Accessibility). Work on detection and mitigation of bias and misinformation continues to be of interest. Research on languages other than English, particularly under-represented languages, and on health disparities is always of interest to BioNLP.
Other active areas of research include, but are not limited to:
Tangible results of biomedical language processing applications;
Entity identification and normalization (linking) for a broad range of semantic categories;
Extraction of complex relations and events;
Discourse analysis;
Anaphora/coreference resolution;
Text mining / Literature based discovery;
Summarization;
Text simplification;
Question Answering;
Resources and strategies for system testing and evaluation;
Infrastructures and pre-trained language models for biomedical NLP (Processing and annotation platforms);
Development of synthetic data & data augmentation;
Translating NLP research into practice;
Getting reproducible results.
SHARED TASKS 2023
Shared Tasks on Summarization of Clinical Notes and Scientific Articles
The first task focuses on Clinical Text.
Task 1A. Problem List Summarization
Automatically summarizing patients’ main problems from the daily care notes in the electronic health record can help mitigate information and cognitive overload for clinicians and provide augmented intelligence via computerized diagnostic decision support at the bedside. The task of Problem List Summarization aims to generate a list of diagnoses and problems in a patient’s daily care plan using input from the provider’s progress notes during hospitalization. This task aims to promote NLP model development for downstream applications in diagnostic decision support systems that could improve efficiency and reduce diagnostic errors in hospitals. This task will contain 768 hospital daily progress notes and 2783 diagnoses in the training set, and a new set of 300 daily progress notes will be annotated by physicians as the test set. The annotation methods and annotation quality have previously been reported here. The goal of this shared task is to attract future research efforts in building NLP models for real-world decision support applications, where a system generating relevant and accurate diagnoses will assist the healthcare providers’ decision-making process and improve the quality of care for patients.
Shared Task 1A Registration: https://forms.gle/yp6TKD66G8KGpweN9
Please join our Google discussion group for important updates: https://groups.google.com/g/bionlp2023problemsumm
Important Dates:
Registration Started: January 13th, 2023
Release of training and validation data: January 13th, 2023
Release of test data: April 13th, 2023
System submission deadline: April 20th, 2023
System papers due date: May 4th, 2023
Notification of acceptance: June 1st, 2023
Camera-ready system papers due: June 13th, 2023
BioNLP Workshop Date: July 13th or 14th, 2023
Task 1A Organizers:
Majid Afshar, Department of Medicine, University of Wisconsin-Madison.
Yanjun Gao, University of Wisconsin-Madison.
Dmitriy Dligach, Department of Computer Science at Loyola University Chicago.
Timothy Miller, Boston Children’s Hospital and Harvard Medical School.
Task 1B. Radiology report summarization
Radiology report summarization is a growing area of research. Given the Findings and/or Background sections of a radiology report, the goal is to generate a summary (called an Impression section) that highlights the key observations and conclusions of the radiology study.
The research area of radiology report summarization currently faces an important limitation: most research is carried out on chest X-rays. To mitigate this limitation, we propose two datasets:
A shared summarization task that includes six different modalities and anatomies, totalling 79,779 samples, based on the MIMIC-III database.
A shared summarization task on chest X-ray radiology reports with images and a brand new out-of-domain test set from Stanford.
SEE MORE at: https://vilmedic.app/misc/bionlp23/sharedtask
Task 1B Organizers:
Jean-Benoit Delbrouck, Stanford University.
Maya Varma, Stanford University.
Task 2. Lay Summarization of Biomedical Research Articles
Biomedical publications contain the latest research on prominent health-related topics, ranging from common illnesses to global pandemics. This can often result in their content being of interest to a wide variety of audiences including researchers, medical professionals, journalists, and even members of the public. However, the highly technical and specialist language used within such articles typically makes it difficult for non-expert audiences to understand their contents.
Abstractive summarization models can be used to generate a concise summary of an article, capturing its salient points using words and sentences that aren’t used in the original text. As such, these models have the potential to help broaden access to highly technical documents when trained to generate summaries that are more readable, containing more background information and less technical terminology (i.e., a “lay summary”).
This shared task focuses on the abstractive summarization of biomedical research articles, with an emphasis on controllability and catering to non-expert audiences. Through this task, we aim to help foster increased research interest in controllable summarization that helps broaden access to technical texts and progress toward more usable abstractive summarization models in the biomedical domain.
For more information, see:
Main site: https://biolaysumm.org/
CodaLab page - subtask 1: https://codalab.lisn.upsaclay.fr/competitions/9541
CodaLab page - subtask 2: https://codalab.lisn.upsaclay.fr/competitions/9544
Detailed descriptions of the motivation, the tasks, and the data are also published in:
Goldsack, T., Zhang, Z., Lin, C., Scarton, C. Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature. EMNLP 2022.
Luo, Z., Xie, Q., Ananiadou, S. Readability Controllable Biomedical Document Summarization. EMNLP 2022 Findings.
Task 2 Organizers:
Chenghua Lin, Deputy Director of Research and Innovation in the Computer Science Department, University of Sheffield.
Sophia Ananiadou, Turing Fellow, Director of the National Centre for Text Mining and Deputy Director of the Institute of Data Science and AI at the University of Manchester.
Carolina Scarton, Computer Science Department at the University of Sheffield.
Qianqian Xie, National Centre for Text Mining (NaCTeM).
Tomas Goldsack, University of Sheffield.
Zheheng Luo, the University of Manchester.
Zhihao Zhang, Beihang University.
Organizers
Dina Demner-Fushman, US National Library of Medicine
Kevin Bretonnel Cohen, University of Colorado School of Medicine
Sophia Ananiadou, National Centre for Text Mining and University of Manchester, UK
Jun-ichi Tsujii, National Institute of Advanced Industrial Science and Technology, Japan
*SEM 2023 Call for Papers
*SEM brings together researchers interested in the semantics of natural languages and its computational modeling. The conference embraces data-driven, neural, and probabilistic approaches, as well as symbolic approaches and everything in between; practical applications and resources as well as theoretical contributions are welcome. The long-term goal of *SEM is to provide a stable forum for the growing number of NLP researchers working on all aspects of semantics of (many and diverse!) natural languages.
Topics of interest include, but are not limited to:
* Lexical semantics and word representations
* Compositional semantics and sentence representations
* Statistical, machine learning and deep learning methods for semantics
* Multilingual and cross-lingual semantics
* Word sense disambiguation and induction
* Semantic parsing; syntax-semantics interface
* Frame semantics and semantic role labeling
* Textual inference, entailment and question answering
* Formal approaches to semantics
* Extraction of events and causal and temporal relations
* Entity linking; pronouns and coreference
* Discourse, pragmatics, and dialogue
* Machine reading
* Extra-propositional aspects of meaning
* Multiword and idiomatic expressions
* Metaphor, irony, and humor
* Knowledge mining and acquisition
* Common sense reasoning
* Language generation
* Semantics in NLP applications: sentiment analysis, abusive language detection, summarization, fact-checking, etc.
* Multidisciplinary research on semantics
* Grounding and multimodal semantics
* Human semantic processing
* Semantic annotation, evaluation, and resources
* Ethical aspects and bias in semantic representations
We encourage authors to think about the ethical aspects of their work, and to address and discuss all ethical questions and implications relevant to their research. STARSEM values reproducibility and particularly welcomes submissions that adhere to the reproducibility guidelines as specified here <https://folk.idi.ntnu.no/odderik/reproducibility_guidelines.pdf>.
Important dates
Anonymity period begins: February 18 2023, AoE
Paper submission deadline: March 18 2023, AoE
Commitment deadline for ARR-reviewed papers: April 16 2023, AoE
Notification of acceptance: May 12 2023
STARSEM conference: July 13-14 2023
Submission instructions
Submissions must describe unpublished work and be written in English. We solicit both long and short papers. Please note that double submission of papers must be declared at submission.
Long papers describe original research and may consist of up to eight (8) pages of content, plus unlimited pages for references. Final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers' comments can be taken into account. Short papers describe original focused research and may consist of up to four (4) pages, plus unlimited pages for references. Upon acceptance, short papers will be given five (5) content pages in the proceedings. Authors are encouraged to use this additional page to address reviewers' comments in their final versions. Submissions should follow the ACL 2023 formatting requirements <https://2023.aclweb.org/calls/style_and_formatting/>.
Submission link: Softconf link TBA
Organisers
General Chair:
Mohammad Taher Pilehvar, Tehran Institute for Advanced Studies
Program Chairs:
Jose Camacho-Collados, Cardiff University
Alexis Palmer, University of Colorado Boulder
Anonymity period
To protect the integrity of double-blind review and ensure that submissions are reviewed fairly, we adopt the rules and guidelines for ACL conferences. The following rules and guidelines make reference to the anonymity period, which runs from 1 month before the submission deadline (starting February 18, 2023 11:59PM UTC-12:00) up to the date when your paper is accepted or rejected (May 12, 2023), or withdrawn.
* You may not make a non-anonymized version of your paper available online to the general community (for example, via a preprint server) during the anonymity period. By a version of a paper we understand another paper having essentially the same scientific content but possibly differing in minor details (including title and structure) and/or in length (e.g., an abstract is a version of the paper that it summarizes).
* If you have posted a non-anonymized version of your paper online before the start of the anonymity period, you may submit an anonymized version to the conference. The submitted version must not refer to the non-anonymized version, and you must inform the program chair(s) that a non-anonymized version exists.
* You may not update the non-anonymized version during the anonymity period, and we ask you not to advertise it on social media or take other actions that would further compromise double-blind reviewing during the anonymity period.
* Note that, while you are not prohibited from making a non-anonymous version available online before the start of the anonymity period, this does make double-blind reviewing more difficult to maintain, and we therefore encourage you to wait until the end of the anonymity period if possible. Alternatively, you may consider submitting your work to the Computational Linguistics journal, which does not require anonymization and has a track for “short” (i.e., conference-length) papers.
Website
Further information can be found online at: https://sites.google.com/view/starsem2023
========================================================================
Alexis Palmer
Assistant Professor
Department of Linguistics
University of Colorado Boulder
alexis.palmer(a)colorado.edu <mailto:alexis.palmer@colorado.edu>
303-735-0418