*** Second Workshop on Information Extraction from Scientific Publications
(WIESP) at IJCNLP-AACL 2023 ***
*** Website: https://ui.adsabs.harvard.edu/WIESP/2023/ ***
*** Twitter: https://twitter.com/wiesp_nlp ***
Building on the success of the First WIESP at AACL-IJCNLP 2022, the Second
Workshop on Information Extraction from Scientific Publications (WIESP)
will provide a platform for researchers to foster discussion and research on
information extraction, mining, generation, and knowledge discovery from
scientific publications using Natural Language Processing and Machine
Learning techniques. Much has changed technologically in the year since the
first WIESP, especially in Generative Artificial Intelligence research. We
are incorporating a few additional topics to stay abreast of the latest
developments and research in the community. The second iteration of WIESP
will focus on (but is not limited to) the following topics:
- Large Language Models (LLMs) for Science
- Application of LLMs on information extraction, generation, mining and
knowledge discovery from scientific publications
- Probing LLMs for scientific fact checking and misinformation
- Scientific document parsing
- Scientific named-entity recognition
- Scientific article summarization
- Question-answering on scientific articles
- Citation context/span extraction
- Structured information extraction from full-text, tables, figures,
bibliography
- Novel datasets curated from scientific publications
- Argument extraction and mining
- Challenges in information extraction from scientific articles
- Building knowledge graphs via mining scientific literature; querying
scientific knowledge graphs
- Novel tools for IE on scientific literature and interaction with users
- Mathematical information extraction
- Scientific concepts, facts extraction
- Visualizing scientific knowledge
- Bibliometric and Altmetric studies via information extraction from
scientific articles and metadata
In addition to research paper presentations, WIESP will also feature
keynote talks, a panel discussion on "Large Language Models and Scientific
Literature Mining", and shared tasks. We will update the details on our
website as they become available. We especially welcome
participation from academic and research institutions, government and
industry labs, publishers, and information service providers. Projects and
organizations using NLP/ML techniques in their text mining and enrichment
efforts are also welcome to participate. We strongly encourage students,
researchers, and science practitioners from diverse backgrounds, especially
from underrepresented groups and communities, to take part in WIESP events
and to help make the workshop diverse and inclusive.
****Call for Papers****
We invite papers in the following categories:
***Long papers*** must describe substantial, original, completed, and
unpublished work. Wherever appropriate, concrete evaluation and analysis
should be included. Papers must not exceed eight (8) pages of content, plus
unlimited pages of references. The final versions of long papers will be
given one additional page of content (up to 9 pages) so that reviewers'
comments can be taken into account.
***Short papers*** must describe original and unpublished work. Please note
that a short paper is not a shortened long paper. Instead, short papers
should have a point that can be made in a few pages, such as a small,
focused contribution, a negative result, or an interesting application
nugget. Short papers must not exceed four (4) pages, plus unlimited pages
of references. The final versions of short papers will be given one
additional page of content (up to 5 pages) so that reviewers' comments can
be taken into account.
In addition to papers, WIESP will also host shared tasks. More details on
the WIESP shared tasks will be available on our website shortly, and we
will publish separate CfPs for the shared tasks. Shared task authors will be
invited to write system descriptions, which will be subject to
peer review.
***Shared Task: Function of Citation in Astrophysics Literature (FOCAL)***
The citation graph is an essential tool for helping researchers find
relevant literature. To further empower discovery, we aim to label the
edges of the graph with the function of the citation: e.g., is the cited
work necessary background knowledge, or is it used as a comparison to the
citing work? To start this process, we propose a shared task of
automatically labelling citations with a function based on the textual
context of the citation. A sample dataset and more instructions can be
found at: https://ui.adsabs.harvard.edu/WIESP/2023/SharedTasks
*All accepted papers will be published in the WIESP proceedings as part of
IJCNLP-AACL 2023 and indexed in the ACL Anthology.*
***Important Dates***
- Paper Submission Deadline: *September 11, 2023 (final extended deadline)*
- Notification of workshop paper/abstract acceptance: October 5, 2023
- Camera-ready Submission Deadline: October 12, 2023
- Workshop: November 1, 2023 (online)
***All submission deadlines are 11.59 pm UTC -12h ("Anywhere on Earth")***
****Submission Website and Format****
Submission Link: https://softconf.com/ijcnlp2023/WorkshopWIESP2023/
Submission will be via softconf. Submissions should follow the ACLPUB
formatting guidelines (https://acl-org.github.io/ACLPUB/formatting.html)
and template files (https://github.com/acl-org/acl-style-files/tree/master).
Submissions (Long and Short Papers) will be subject to a double-blind
peer-review process. We follow the same policies as IJCNLP-AACL 2023
regarding anonymity, preprints and double submissions.
***Organizers***
- Tirthankar Ghosal, National Center for Computational Sciences | Oak Ridge
National Laboratory, USA
- Felix Grezes, Center for Astrophysics | Harvard & Smithsonian, USA
- Thomas Allen, Center for Astrophysics | Harvard & Smithsonian, USA
- Kelly Lockhart, Center for Astrophysics | Harvard & Smithsonian, USA
- Alberto Accomazzi, Center for Astrophysics | Harvard & Smithsonian, USA
--
+++++++++++++++++++++++++++++++++++
Tirthankar Ghosal
https://member.acm.org/~tghosal
++++++++++++++++++++++++++++++++++++
===== Call for Participation to online FOIS 2023 conference, showcases
and demos =====
Program: https://fois2023.griis.ca/online-conference/
Registration: https://event.fourwaves.com/fr/fois2023/inscription
(free registration for students)
====================================================================================
13th International Conference on Formal Ontology in Information Systems
(FOIS 2023), September 18-20, 2023 (Online)
Definition and scope
====================
The FOIS conference is a meeting point for all researchers with an
interest in formal ontology. Formal ontology is the systematic study of
the types of entities and relations making up the domains of interest
represented in modern information systems. FOIS 2023 will have distinct
tracks for foundational issues, ontology applications and methods, and
domain ontologies. FOIS aims to be a nexus of interdisciplinary research
and communication for researchers from many domains engaging with formal
ontology. Common application areas include conceptual modeling, database
design, knowledge engineering and management, software engineering,
organizational modeling, artificial intelligence, robotics,
computational linguistics, the life sciences, bioinformatics and
scientific research in general, geographic information science,
information retrieval, library and information science, as well as the
Semantic Web.
FOIS is the flagship conference of the International Association for
Ontology and its Applications (IAOA: http://iaoa.org/), which is a
non-profit organization promoting interdisciplinary research and
international collaboration in formal ontology.
Program
====================
Monday, September 18th
EDT (UTC-4)   CEST (UTC+2)   Event                                                  Venue
08:15-08:30   14:15-14:30    FOIS online Welcome                                    Zoom
08:30-10:30   14:30-16:30    Session 1: Foundational concepts (Chair: Laure Vieu)   Zoom
10:30-11:00   16:30-17:00    Coffee break                                           gather.town
11:00-12:00   17:00-18:00    Ontology showcases and demos                           gather.town
Tuesday, September 19th
08:30-09:00   14:30-15:00    Invited talks special session (TBC)                    Zoom
09:00-10:30   15:00-16:30    Session 2: Methodological issues (TBA)                 Zoom
10:30-11:00   16:30-17:00    Coffee break                                           gather.town
11:00-12:00   17:00-18:00    ESAO panel                                             Zoom
Wednesday, September 20th
09:00-10:30   15:00-16:30    Session 3: Domain ontologies (TBA)                     Zoom
10:30-11:00   16:30-17:00    Coffee break                                           gather.town
11:00-12:00   17:00-18:00    IAOA General Assembly                                  Zoom
12:00-12:15   18:00-18:15    Closing                                                Zoom
Details: https://fois2023.griis.ca/onlinesession/
Registration fees
====================
Online presenter: 500 CAN / 340 EUR
Listener - regular fee (academia or industry): 100 CAN / 70 EUR
Listener - reduced fee (student or participant from a less developed
country): free
More information: https://fois2023.griis.ca/registration/
Conference Organization
=======================
General Chair: Antony Galton, University of Exeter, UK
PC Chairs: Nathalie Aussenac-Gilles, IRIT-CNRS Toulouse, France
Torsten Hahmann, University of Maine, USA
Local Organization Chair: Jean-François Ethier, University of
Sherbrooke, Canada
Online Chair: Cassia Trojahn, IRIT Université Toulouse 2, France
Workshop and Tutorial Chairs: Megan Katsumi, University of Toronto,
Canada
Emilio Sanfilippo, ISTC-CNR, Trento, Italy
Early Career Chairs: Antoine Zimmermann, École des Mines de
Saint-Étienne (EMSE), France
Guendalina Righetti, Free University
Bozen/Bolzano, Italy
Demo & Showcase Chairs: Sergio de Cesare, University of Westminster, UK
Tiago Prince Sales, University of Twente,
Netherlands
Publicity Chairs: Lucia Gomez Alvarez, TU Dresden, Germany
Selja Seppälä, University College Cork, Ireland
Proceedings Chair: Maria Hedblom, Jönköping University, Sweden
Program committee: https://fois2023.griis.ca/conference-organization/
The 8th Biomedical Linked Annotation Hackathon (BLAH8)
15 - 19 January, 2024
Kashiwa, Chiba, Japan
https://blah8.linkedannotation.org/
Project proposal submission deadline: 20 Oct. 2023
INTRODUCTION
BLAH (Biomedical Linked Annotation Hackathon) represents a series of annual
hackathon events, specifically designed to foster open collaboration. The
goal is to achieve a breakthrough in the sharing and linking of various
resources for biomedical literature annotation and mining. By enhancing the
interoperability of these resources, the initiative aims to substantially
increase both the productivity and the impact within the community.
Within the scope of BLAH, the term "resources" encompasses a wide range of
elements including corpora, annotation datasets, databases, language
models, software tools, web services, terminologies, ontologies, graphical
representations, movies, and more. The aspiration of BLAH is to create
connections between all these resources, allowing them to interoperate
seamlessly. We believe this integration will foster a more cohesive and
effective environment for all stakeholders.
Unfortunately, the pandemic led to a temporary halt in the organization of
BLAH events. However, with the world gradually reopening, BLAH is excited
to announce its return with the 8th edition (BLAH8). In recognition of the
era of Large Language Models (LLMs), BLAH8 will center around a special
theme: "Biomedical Annotations in the Age of LLMs." This theme represents a
contemporary focus for the community and signals a commitment to staying at
the forefront of technological advancements in the biomedical field.
Through BLAH8, we aspire to explore the potential synergy between LLMs and
literature annotations, diving deep into various facets of biomedical
applications.
CALL FOR PROJECT PROPOSALS
We invite submission of project proposals from those who are interested in
contributing to biomedical literature annotation with their literature
annotation resources and expertise, particularly, this year, with a
connection to LLMs. We invite projects which can be accomplished during the
hackathon. The type of contribution may include, but is not restricted to:
- Integration of annotation resources
- Evaluation of annotation resources
- Application of annotation resources
- ...
The submission deadline for project proposals is 20 Oct. 2023.
TRAVEL SUPPORT
Those who submit project proposals are eligible to apply for travel
support. See the homepage for detailed information.
PUBLICATION
Immediately after BLAH8, participants will be invited to submit papers,
e.g., reports of the hackathon outputs, to either of the two venues:
- Genomics & Informatics: an open access journal, which is indexed by
PubMed. All the papers of the journal will be immediately included in the
PMC open access subset.
- BioHackrxiv: a preprint server, which is powered by OSF preprints and
indexed by Europe PMC.
PROGRAM COMMITTEE
- Jin-Dong Kim (DBCLS, ROIS-DS)
- Fabio Rinaldi (IDSIA)
- Lars Juhl Jensen (Univ. Copenhagen)
- Zhiyong Lu (NCBI, NLM)
- Event Notification Type: Call for Papers
- Abbreviated Title: [CFP 2nd] EACL 2024
- Location: Hotel Radisson Blu, St. Julians
Sunday, 17 March 2024 to Friday, 22 March 2024
- Country: Malta
- Contact Email:
michael.strube(a)h-its.org
YGRAHAM(a)tcd.ie
m.purver(a)qmul.ac.uk
- Contact:
Michael Strube
Yvette Graham
Matthew Purver
- Website: https://2024.eacl.org/
- Submission Deadline: Sunday, 15 October 2023
============================
* Second Call for Papers: EACL 2024
The 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024) invites the submission of long and short papers on substantial, original, and unpublished research on Natural Language Processing. EACL 2024 will be held at the Hotel Radisson Blu, St. Julians, in Malta on 17th-22nd March 2024, with online attendance possible.
Papers must be submitted to EACL 2024 via the ACL Rolling Review (ARR) system. As in recent years, some of the presentations at the conference will be for papers accepted by the Transactions of the ACL (TACL) and Computational Linguistics (CL) journals.
* Important Dates
- Anonymity period begins: Friday, 15 September 2023
- Paper submission deadline (via ARR): Sunday, 15 October 2023
- Author response period: Friday-Tuesday, 8-12 December 2023
- Paper commitment deadline: Sunday, 20 December 2023
- Notification of acceptance (long & short papers): Monday, 15 January 2024
- Withdrawal deadline (long & short papers): Monday, 22 January 2024
- Camera-ready papers due (long & short papers): Wednesday, 31 January 2024
- Workshops & Tutorials: Sunday; Thu-Fri, 17 & 21-22 March 2024
- Main Conference: Monday-Wed, 18-20 March 2024
All deadlines are 11.59 pm UTC -12h (“anywhere on Earth”).
============================
* Paper Submission Information
* Topics of Interest
EACL 2024 has the goal of a broad technical program. Relevant topics for the conference include, but are not limited to, the following areas (in alphabetical order):
- Computational Social Science and Cultural Analytics
- Dialogue and Interactive Systems
- Discourse and Pragmatics
- Efficient/Low-resource methods in NLP
- Ethics and NLP
- Generation
- Information Retrieval and Text Mining
- Information Extraction
- Interpretability and Model Analysis in NLP
- Language Grounding to Vision, Robotics and Beyond
- Linguistic Theories, Cognitive Modeling and Psycholinguistics
- Machine Learning for NLP
- Machine Translation
- Multilinguality and Language Diversity
- NLP Applications
- Phonology, Morphology, and Word Segmentation
- Question Answering
- Resources and Evaluation
- Semantics: Lexical
- Semantics: Sentence-level Semantics, Textual Inference and other areas
- Sentiment Analysis, Stylistic Analysis and Argument Mining
- Speech and Multimodality
- Summarization
- Syntax: Tagging, Chunking and Parsing
* Long Papers
Long paper submissions must describe substantial, original, completed and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers may consist of up to 8 pages of content, plus unlimited pages for references and appendices. Upon acceptance, long papers will be given one additional page of content (i.e. up to 9 pages) in the proceedings so that reviewers’ comments can be taken into account.
* Short Papers
Short paper submissions must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages. Short papers may consist of up to 4 pages of content, plus unlimited references and appendices. Upon acceptance, short papers will be given one additional page of content (i.e. up to 5 pages) in the proceedings so that reviewers’ comments can be taken into account.
* Findings of the ACL
Papers submitted to EACL 2024, but not selected for the main conference, will also automatically be considered for publication in the Findings of the Association for Computational Linguistics. Acceptance notifications for the main track and Findings will come out simultaneously.
* Presentation Mode
Long and short papers will be presented orally or as posters, as determined by the programme committee based on the nature rather than the quality of the work. While short papers will be distinguished from long papers in the proceedings, there will be no distinction in the proceedings between papers presented orally and as posters. Papers accepted to the Findings of the ACL may present a poster.
* Presentation Requirements
All accepted papers must be presented at the conference—either online or in-person—in order to appear in the proceedings. Authors of papers accepted for presentation at EACL 2024 must notify the program chairs by the withdrawal deadline if they wish to withdraw the paper. At least one author of each accepted paper must register for EACL 2024 by the early registration deadline.
* Paper Submission and Anonymity
Following standard ACL and ARR policy, submitted papers must be prepared for two-way anonymized review, and no deanonymized preprint may be posted in the month prior to submission. Please see the ARR CfP for more detail.
https://aclrollingreview.org/cfp
* Policies on Authorship, Citation and Ethics
EACL 2024 follows the ARR policies on authorship, citation and comparison and ethics - please see the ARR CfP.
* Multiple Submission Policy
EACL 2024 follows the ARR policy on multiple submission: we will not consider any paper that is under review in a journal or another conference at the time of submission, and submitted papers must not be submitted elsewhere during the review period. See the ARR CfP for more detail. Please note that the EACL 2024 submission deadline is currently timed to come after EMNLP 2023 decisions have been announced, and that EACL 2024 acceptance decisions will be announced before the likely submission deadline for ACL 2024, although after that for NAACL 2024.
* Mandatory Discussion of Limitations
We believe that it is also important to discuss the limitations of your work, in addition to its strengths. Following EACL 2023, EACL 2024 requires all papers to have a clear discussion of limitations, in a dedicated section titled “Limitations”. This section will appear at the end of the paper, after the discussion/conclusions section and before the references, and will not count towards the page limit. Papers without a limitations section will be automatically rejected without review. Papers resubmitted from previous ARR review rounds that did not include a limitations section must ensure that such a section is included in the EACL 2024 version.
While we are open to different types of limitations, just mentioning that a set of results have been shown for English only probably does not reflect what we expect. Mentioning that the method works mostly for languages with limited morphology, like English, is a much better alternative. In addition, limitations such as low scalability to long text, the requirement of large GPU resources, or other things that inspire further investigation are welcome.
The School of Informatics, University of Edinburgh, is thrilled to
announce a PhD scholarship funded by Google DeepMind.
The scholarship covers tuition fees (at the Home/International tuition
fee rate), provides an annual stipend of £18,622 per annum (for 3.5
years full time study) and provides a research training and support
grant. The student will be supervised by Dr. Mirella Lapata and will
also benefit from mentoring from DeepMind staff during their period of
study.
Applicants would be expected to work on a topic drawn from the
following research areas:
- multimodal natural language understanding and generation
- long-form and retrieval-augmented text generation
- multilingual generation
Applicants wishing to apply for the scholarship should meet one or
more of the following criteria:
- are a resident of a country and/or region underrepresented in AI;
- identify as women, including cis and trans people and non-binary or
gender fluid people who identify in a significant way as women or
female;
- and/or identify as Black or other minority ethnicity.
The successful candidate will have a good honours degree or equivalent
in artificial intelligence, computer science, machine learning, or a
related discipline; or have a breadth of relevant experience in
industry/academia/public sector, etc. They will have strong
programming skills and previous experience in natural language
processing.
If you have further questions, please contact Dr. Mirella Lapata:
mlap(a)inf.ed.ac.uk.
To apply, please follow the instructions at:
http://www.inf.ed.ac.uk/postgraduate/apply.html
As your research area, please select "Informatics: ILCC: Language
Processing, Speech Technology, Information Retrieval, Cognition". On
the application form under "Research Project", please state "DeepMind
Scholarship".
IMPORTANT: After submitting your application through the website,
please email your applicant number to mlap(a)inf.ed.ac.uk.
Application deadline: 24 November 2023 [applications received after
the deadline may be considered, but this cannot be guaranteed].
Call for Papers: North Africans in Machine Learning Affinity Workshop at
NeurIPS 2023!
We are thrilled to announce the Call for Papers for the North Africans in
Machine Learning Affinity Workshop, which will be held at NeurIPS 2023.
This is your chance to share your groundbreaking research, insights, and
discoveries with a vibrant community of peers in the field of Machine
Learning. Whether you're a junior researcher or a seasoned expert of
North African origin, if you have a passion for advancing the theory and
applications of ML, we want to hear from you!
Why submit your paper?
- Showcase your work on a prestigious stage.
- Gain valuable feedback from experts in the ML community.
- Connect with like-minded professionals from North African institutions
and beyond.
- Contribute to the collective knowledge of the ML field.
📜 Submission Guidelines:
- Papers related to all aspects of Machine Learning are welcome.
- Submissions from North Africans are encouraged.
- The workshop is open to academia and industry professionals.
🔗 Submission Link and Deadlines:
https://lnkd.in/eSKVv2H5
🏆 Awards and Recognition:
Outstanding contributions will be recognized, and selected papers may have
the opportunity to be featured prominently during the workshop.
Join us in making the North Africans in Machine Learning Affinity Workshop
at NeurIPS 2023 a resounding success! Submit your paper, share your
insights, and be part of this exciting journey in advancing the field of
Machine Learning.
Stay tuned for more updates and mark your calendars for NeurIPS 2023! Let's
shape the future of ML together.
(Apologies for cross-posting)
CFP: SYMPTEMIST Shared Task (BioCreative VIII run with AMIA 2023)
Named entity recognition and linking of symptoms, signs & findings (incl.
multilingual dataset)
https://temu.bsc.es/SYMPTEMIST/
The SYMPTEMIST track focuses on the automatic detection of mentions of
clinical symptoms (NER) and mapping to concept identifiers in clinical case
reports in Spanish (entity linking). Also a multilingual version of the
dataset will be released including versions in English, French, Italian,
Dutch, Portuguese, Romanian and Swedish.
Key information:
- Web: https://temu.bsc.es/symptemist
- Data: https://zenodo.org/record/8223654
- Annotation guidelines: https://zenodo.org/record/8246440
- BioCreative web: https://biocreative.bioinformatics.udel.edu
- Registration form (Track 2 - SYMPTEMIST):
https://docs.google.com/forms/d/e/1FAIpQLScoSNulOoxRju3c8v9Q-CSv-w5jJcXu93G…
Motivation
Systems able to detect and normalize clinical symptom mentions from medical
texts are crucial for almost any healthcare data mining, AI, medical
analytics or predictive application. As opposed to other clinical
information types, such as diagnoses (diseases/procedures), lab test
results or even medications, clinical symptoms can only be recovered
directly from written clinical narratives. Due to the high complexity,
variability and difficulty in generating annotated corpora for clinical
symptoms, only a few large manually annotated data collections have been
constructed so far, with certain underlying limitations in terms of a)
entity linking / normalization of the symptom mentions to controlled
vocabularies, b) a lack of attempts to promote the development of
multilingual solutions, and c) a lack of detailed annotation criteria and
guidelines. To address these issues, we have posed the SYMPTEMIST track at
the upcoming BioCreative VIII initiative, which will be run in the context
of the prestigious AMIA 2023 conference, which received over 1400
submissions this year.
Automatic detection of symptom mentions is key for a range of clinical
use cases and real-world applications such as:
- Predictive modeling of diseases
- Differential diagnosis of complex diseases
- Rare disease characterization & analysis
- Selection of appropriate treatment & therapy
- Study of disease-symptom associations
- Early detection of disease outbreaks & epidemiological surveillance
- Extraction of phenotypes
- Drug repurposing & off-label indications
The SYMPTEMIST organizers will also release multilingual resources to
foster the development of multilingual tools and generate systems not only
for Spanish but also for content in English and Romance languages (French,
Portuguese, Italian, Romanian and Catalan) as well as versions in Dutch,
Swedish and Czech.
Inspired by previous initiatives (e.g. n2c2, CLEF or TREC) and shared tasks
(CANTEMIST, PharmaCoNER, or CodiEsp), we are launching the SYMPTEMIST
shared task as part of the BioCreative 2023 evaluation initiative, with the
following three sub-tracks:
- SYMPTEMIST-entities: automatic detection of mentions of symptoms.
- SYMPTEMIST-linking: finding mentions of symptoms and normalizing them to
their Snomed-CT concept identifiers.
- SYMPTEMIST-multilingual: automatic detection of mentions of symptoms in
versions of the corpus generated in English, French, Italian, Portuguese,
Romanian, Catalan, Dutch, Swedish and Czech.
Tentative schedule
- Annotation Guidelines: August 8th, 2023
- Train Set Subtask 1 (NER): August 8th, 2023
- Train Set Subtask 2 (Linking): September 10th, 2023
- Train Set Subtask 3 (Multilingual): September 10th, 2023
- SympTEMIST Test Set: September 30th, 2023
- Participants Test Predictions Deadline: October 5th, 2023
- Participants Evaluation Results Release: October 10th, 2023
- Submission of Participant Papers Deadline: October 22nd, 2023
- Notification of Acceptance Participant Papers: October 30th, 2023
- Submission of Camera-ready Participant Papers Deadline: November 1st, 2023
- BioCreative VIII workshop @ AMIA 2023: November 11-15, 2023, in New
Orleans, LA.
BioCreative proceedings and AMIA workshop
Teams participating in SYMPTEMIST will be invited to contribute a systems
description paper for the BioCreative 2023 Working Notes proceedings and a
flash presentation of their approach at the BioCreative 2023 session. The
BioCreative VIII workshop will run with AMIA 2023, November 11-15, 2023, in
New Orleans, LA. See:
https://amia.org/education-events/amia-2023-annual-symposium
Workshop Proceedings and Special Issue:
The BioCreative VIII Proceedings will host all the submissions from
participating teams, and it will be freely available by the time of the
workshop. In addition, we are happy to announce that the journal Database
will host the BioCreative VIII special issue for work that has passed their
peer-review process. Invitation to submit will be sent after the workshop.
All BioCreative VIII tracks
Track 1: BioRED (Biomedical Relation Extraction Dataset)
*Track 2: SYMPTEMIST (Symptom TExt Mining Shared Task)
Track 3: Genetic Phenotype Extraction and Normalization from Dysmorphology
Physical Examination Entries
Track 4: Clinical Annotation Tool Track
Main Organizers
- Martin Krallinger, Barcelona Supercomputing Center, Spain
- Eulàlia Farré-Maduell, Barcelona Supercomputing Center, Spain
- Luis Gascó, Barcelona Supercomputing Center, Spain
- Salvador Lima, Barcelona Supercomputing Center, Spain
- Jan Rodriguez, Barcelona Supercomputing Center, Spain
=======================================
Martin Krallinger, Dr.
Head of NLP for Biomedical Information Analysis Unit
Barcelona Supercomputing Center (BSC-CNS)
https://www.linkedin.com/in/martin-krallinger-85495920/
=======================================
We are innovating university! Interdisciplinary, international and digital – these are the pillars of the University of Technology Nuremberg. The aim is to combine and interlink engineering science with other topics of society. Besides the outlined interdisciplinary approach, the university will put its emphasis on courses in English, on digital learning as well as future-oriented research. In the medium term, the university is to provide a place for learning and personal development for up to 6,000 students – on a campus combining research, learning, and living. The project is currently one of the most important higher education projects of the Free State of Bavaria (Germany).
The University of Technology Nuremberg is looking to fill, at the earliest possible date, a position as a
Professor (m/f/d) (W3) of Natural Language Processing
at the Department of Engineering.
You represent the subject Natural Language Processing in research and teaching. You will play a key role in establishing the first main research area of the Department, which will focus on "Robotics and Artificial Intelligence". Here you will collaborate, among others, with excellent international researchers from the fields of Robotics, Machine Learning, Data Science and Computer Vision. The interdisciplinary collaboration with humanities scholars and social and natural scientists at the Department of Liberal Arts and Sciences, within the topics of “Human and Artificial Intelligence” and “Rhetoric and Political Communication” is a further goal.
To find the complete advertisement follow this link: https://www.utn.de/en/career/professorships/
CoCo4MT has extended its deadline for paper submission to July 16th!
The Second Workshop on Corpus Generation and Corpus Augmentation for
Machine Translation (CoCo4MT) @MT-SUMMIT XIX
The 19th Machine Translation Summit
Sep 4-8, 2023, Macau SAR, China
https://sites.google.com/view/coco4mt
SCOPE
It is a well-known fact that machine translation systems, especially
those that use deep learning, require massive amounts of data. Several
resources for languages are not available in their human-created format.
Some of the types of resources available are monolingual, multilingual,
translation memories, and lexicons. Those types of resources are
generally created for formal purposes, such as parliamentary collections,
when parallel, and for more informal situations when monolingual. The quality
and abundance of resources, including corpora, used for formal purposes is
generally higher than those used for informal purposes. Additionally,
corpora for low-resource languages (languages with fewer digital
resources available) tend to be less abundant and of lower quality.
CoCo4MT is a workshop centered around research that focuses on manual
and automatic corpus creation, cleansing, and augmentation techniques
specifically for machine translation. We accept work that covers any
language (including sign language) but we are specifically interested in
those submissions that explicitly report on work with languages with
limited existing resources (low-resource languages). Since techniques
from high-resource languages are generally statistical in nature and
could be used as generic solutions for any language, we welcome
submissions on high-resource languages also.
CoCo4MT aims to encourage research on new and undiscovered techniques.
We hope that the methods presented at this workshop will lead to the
development of high-quality corpora that will, in turn, lead to
high-performing MT systems and to the creation of new datasets. We
also hope that submissions will provide high-quality corpora that are
publicly available for download and can be used to improve machine
translation performance, thus encouraging new dataset creation for
multiple languages, and that the workshop will become a general venue
to consult for corpora needs in the future. The workshop’s success will be
measured by the following key performance indicators:
- Promotes the ongoing increase in quality of machine translation
systems when measured by standard measurements,
- Provides a meeting place for collaboration from several research areas
to increase the availability of commonly used corpora and new corpora,
- Drives innovation to address the need for higher quality and abundance
of low-resource language data.
Topics of interest include:
- Difficulties with using existing corpora (e.g., political
considerations or domain limitations) and their effects on final MT
systems,
- Strategies for collecting new MT datasets (e.g., via crowdsourcing),
- Data augmentation techniques,
- Data cleansing and denoising techniques,
- Quality control strategies for MT data,
- Exploration of datasets for pretraining or auxiliary tasks for
training MT systems.
SHARED TASK
To encourage research on corpus construction for low-resource machine
translation, we introduce a shared task focused on identifying
high-quality instances that should be translated into a target
low-resource language. Participants are provided access to multi-way
corpora in the high-resource languages of English, Spanish, German,
Korean, and Indonesian and, using these, are required to identify
beneficial instances that, when translated into the low-resource
languages of Cebuano, Gujarati, and Burmese, lead to high-performing MT
systems. More details on data, evaluation and submission can be found on
the website (https://sites.google.com/view/coco4mt/shared-task) or by
emailing coco4mt-shared-task(a)googlegroups.com.
SUBMISSION INFORMATION
CoCo4MT will accept research, review, or position papers. The length of
each paper should be at least four (4) and not exceed ten (10) pages,
plus unlimited pages for references. Submissions should be formatted
according to the official MT Summit 2023 style templates
(https://www.overleaf.com/latex/templates/mt-summit-2023-template/knrrcnxhkq…).
Accepted papers will be published in the MT Summit 2023 proceedings
which are included in the ACL Anthology and will be presented at the
conference either orally or as a poster.
Submissions must be anonymized and should be made to the workshop using
the Softconf conference management system
(https://softconf.com/mtsummit2023/CoCo4MT). Scientific papers that have
been or will be submitted to other venues must be declared as such, and
must be withdrawn from the other venues if accepted and published at
CoCo4MT. The review will be double-blind.
We would like to encourage authors to cite papers written in ANY
language that are related to the topics, as long as both original
bibliographic items and their corresponding English translations are
provided.
Registration will be handled by the main conference. (To be announced)
IMPORTANT DATES
May 18, 2023 - Call for papers released
May 19, 2023 - Shared task release of train, dev and test data
May 25, 2023 - Shared task release of baselines
June 5, 2023 - Second call for papers
June 20, 2023 - Third and final call for papers
July 16, 2023 - Paper submissions due
July 16, 2023 - Shared task deadline to submit results
July 27, 2023 - Notification of acceptance
July 27, 2023 - Shared task system description papers due
August 3, 2023 - Camera-ready due
September 4-5, 2023 - CoCo4MT workshop
CONTACT
CoCo4MT Workshop Organizers:
coco4mt-2023-organizers(a)googlegroups.com
CoCo4MT Shared Task Organizers:
coco4mt-shared-task(a)googlegroups.com
ORGANIZING COMMITTEE (listed alphabetically)
Ananya Ganesh University of Colorado Boulder
Constantine Lignos Brandeis University
John E. Ortega Northeastern University
Jonne Sälevä Brandeis University
Katharina Kann University of Colorado Boulder
Marine Carpuat University of Maryland
Rodolfo Zevallos Universitat Pompeu Fabra
Shabnam Tafreshi University of Maryland
William Chen Carnegie Mellon University
PROGRAM COMMITTEE (tentative, listed alphabetically)
Abteen Ebrahimi University of Colorado Boulder
Adelani David Saarland University
Ananya Ganesh University of Colorado Boulder
Alberto Poncelas ADAPT Centre at Dublin City University
Anna Currey Amazon
Amirhossein Tebbifakhr University of Trento
Atul Kr. Ojha National University of Ireland Galway
Ayush Singh Northeastern University
Barry Haddow University of Edinburgh
Bharathi Raja Chakravarthi National University of Ireland Galway
Beatrice Savoldi University of Trento
Bogdan Babych Heidelberg University
Constantine Lignos Brandeis University
Dossou Bonaventure Mila Quebec AI Institute
Duygu Ataman New York University
Eleftheria Briakou University of Maryland
Eleni Metheniti Université Toulouse - Paul Sabatier
Jasper Kyle Catapang University of Birmingham
John E. Ortega Northeastern University
Jonne Sälevä Brandeis University
Kalika Bali Microsoft
Katharina Kann University of Colorado Boulder
Kochiro Watanabe The University of Tokyo
Koel Dutta Chowdhury Saarland University
Liangyou Li Huawei
Manuel Mager University of Stuttgart
Maria Art Antonette Clariño University of the Philippines Los Baños
Marine Carpuat University of Maryland
Mathias Müller University of Zurich
Nathaniel Oco De La Salle University
Patrick Simianer Lilt
Rico Sennrich University of Zurich
Rodolfo Zevallos Universitat Pompeu Fabra
Sangjee Dondrub Qinghai Normal University
Santanu Pal Saarland University
Sardana Ivanova University of Helsinki
Shantipriya Parida Silo AI
Shiran Dudy Northeastern University
Surafel Melaku Lakew Amazon
Tommi A Pirinen University of Tromsø
Valentin Malykh Moscow Institute of Physics and Technology
Xing Niu Amazon
Xu Weijia University of Maryland
Dear all,
We are organising a free training event (online and in person): Language Data Analysis for Business and Professional Communication.
It will take place on 22 September 2023, 10:00-15:30 UK time.
More details and registration: https://www.lancaster.ac.uk/events/language-data-analysis-for-business-and-…
The ESRC Centre for Corpus Approaches to Social Science, Lancaster University, offers a practical training workshop on the computational analysis of language data for businesses, professional organisations, and anyone interested in communication in professional contexts. The data includes social media, newspapers, business reports, marketing materials and other data sources.
The workshop will introduce a new software tool, #LancsBox X (https://lancsbox.lancs.ac.uk/), developed at Lancaster University, which can analyse and visualise large amounts of language data (millions and billions of words). Practical examples of uses of #LancsBox X (case studies) will be provided.
Best,
Vaclav
Professor Vaclav Brezina
Professor in Corpus Linguistics
Department of Linguistics and English Language
ESRC Centre for Corpus Approaches to Social Science
Faculty of Arts and Social Sciences, Lancaster University
Lancaster, LA1 4YD
Office: County South, room C05
T: +44 (0)1524 510828
@vaclavbrezina
http://www.lancaster.ac.uk/arts-and-social-sciences/about-us/people/vaclav-…