The Free University of Bozen-Bolzano (https://www.unibz.it/) has opened a
public competition for 21 fully funded PhD scholarships in Computer Science
(*deadline July 1, 2022*). They cover a range of epistemologies, theories,
methods and applications of computer science. Topics range from
theoretical AI, data science and machine learning applications to the
design of the most advanced user interfaces and critical user research.
In particular, the following two topics (in collaboration with Fondazione
Bruno Kessler) may be of interest to the mailing list.
*Emotions in Multilingual Texts (Carlo Strapparava)*
The affective dimension of word meaning often forms part of our reservoir
of common-sense knowledge, and it is reflected in the way we use words.
This project aims at producing and evaluating new technologies for
recognition of emotional language and possibly other subtle pragmatic
aspects of communication. Because there are diverse subtleties in emotional
expressions in different languages, the project will pay particular
attention to approaching the problem from a multilingual point of view.
*Neural Models of Collaborative Behaviours in Conversational Agents
(Bernardo Magnini)*
Human-human dialogues are characterized by collaborative behaviours,
through which interlocutors achieve their communicative goals. As an
example, proactivity (i.e., anticipating user needs during dialogue) and
grounding (e.g., posing clarification questions) are two relevant cases
that have been investigated from a linguistics perspective. However, such
collaborative behaviours are still largely absent in current neural
dialogue models. There are several open research challenges in this
direction, including investigating how dialogue systems can learn when and
how to be collaborative, depending on the dialogue context, and how to
evaluate whether collaborative behaviours have improved the efficacy of
dialogue. This PhD project addresses collaborative behaviours in
conversational agents from a computational perspective, exploiting the
integration of machine learning approaches based on neural models,
reinforcement learning, and knowledge-based techniques.
Key information to apply and gain admission to the PhD Programme can be
found here:
https://www.unibz.it/en/faculties/computer-science/phd-computer-science/
The scholarship includes:
- University fees
- 3-year personal grant (approx. € 17,000 net per year)
- 50% pay increase to support international mobility for a period of
between 6 months and one year, depending on the type of project
- Personal budget for research and travel expenses (€ 2,500)
- State-of-the-art technical equipment
Further financial possibilities are available in the form of teaching
contracts and research consultancies during the years of study, for top
students.
The Natural Language Processing Program (nlp.ucsc.edu) in the Computer
Science and Engineering Department at the University of California, Santa
Cruz (UCSC) invites applications for the Natural Language Processing
Postdoctoral Researcher, under the direction of Professor Marilyn Walker.
We seek outstanding applicants with research expertise in all areas of
Natural Language Processing (NLP). The NLP Postdoctoral Researcher will be
expected to contribute to the research profile of the NLP group. We also
expect the successful candidate to support graduate students and other
Postdoctoral Scholars as a peer mentor.
Feel free to contact me at nlp(a)ucsc.edu with any questions.
Applications are open now. Full consideration will be given to applications
submitted by July 15th, 2022, for a start date of September 1st. For
details, and to apply, go here: https://recruit.ucsc.edu/JPF01330
--
Professor Marilyn Walker
Fellow of the Association for Computational Linguistics
Program Director, NLP MS Program, https://nlp.ucsc.edu/
Natural Language and Dialogue Systems Lab
Department of Computer Science and Engineering
Baskin School of Engineering
University of California Santa Cruz
users.soe.ucsc.edu/~maw
(Dis)embodiment
University of Gothenburg, Sweden, September 14-16, 2022
REMINDER: Late-breaking and non-archival round
https://sites.google.com/view/disembodiment/home
(Dis)embodiment will bring together researchers from various areas looking
to answer the question of the role of grounding and embodiment in modelling
human language tasks and behaviour -- or limits thereof. The conference is
open to viewpoints from machine learning, computational linguistics,
theoretical linguistics and philosophy, cognitive science and
psycholinguistics, as well as artificial intelligence ethics and policy. We
hope to see technical contributions and the full spectrum of reasoned
debate.
Important dates
***** NEW! Late-breaking and archival submission deadline: 2022 July 11,
anywhere on Earth *****
Submission deadline: 2022 May 30 (previously 2022 May 16), anywhere on Earth
Notification of acceptance: 2022 June 30, anywhere on Earth
Camera ready: 2022 August 19, anywhere on Earth
Conference: 2022 September 14-16, not anywhere on Earth, but in Gothenburg
The 8th Workshop on Noisy User-generated Text (WNUT @COLING 2022)
The WNUT Workshop will be collocated with COLING 2022 (Hybrid - Gyeongju, Republic of Korea). The website for the workshop is at:
http://noisy-text.github.io/
The WNUT workshop focuses on Natural Language Processing applied to noisy user-generated text, such as that found in social media, online reviews, crowdsourced data, web forums, clinical records, and language learner essays.
We seek submissions of long and short papers on original and unpublished work (same format and page limit as COLING main conference). All accepted submissions will be presented as posters. Additionally, selected submissions will be presented orally. We have Best Paper Awards sponsored by Megagon Labs this year.
Topics of interest include but are not limited to:
* NLP Preprocessing of Noisy Text
- Part of speech tagging
- Named entity tagging, including a wide range of categories, e.g. product names
- Chunking of user-generated text
- Parsing
* Text Normalization and Error Correction
- Normalizing noisy text for downstream tasks and for human readability
- Error detection and correction
* Robustness to Noise, both Natural and Adversarial
* Multilingual NLP in noisy text
* Machine Translation of Noisy Text
* Sentiment analysis
* Crowdsourcing of text data
* User prediction, e.g. gender, age, etc
* Stylistics, e.g. formality, politeness, etc
* Colloquial language, e.g. code-switching, idiom detection
* Bilingual translation of the noisy text
* Paraphrase identification and semantic similarity of short text or noisy text
* Information extraction from noisy text
* Domain adaptation to user-generated text
* Geolocation prediction
* Global and regional trend detection and event extraction
* Detecting rumors, contradictory information, sarcasm, and humor on social media
* Extracting user demographics, profiles, and major life events
* Temporal aspects of user-generated content (resolving time expressions, concept drift, diachronic analyses, etc...)
= IMPORTANT DATES =
* August 19, 2022: Submission Deadline (dual-submission w/ COLING main conference allowed)
* September 7, 2022: Acceptance Notification
* September 14, 2022: Camera-ready Deadline
* October 17, 2022: Workshop Day
= ORGANIZERS =
Tim Baldwin (University of Melbourne)
Afshin Rahimi (University of Queensland)
Wei Xu (Georgia Institute of Technology)
Alan Ritter (Georgia Institute of Technology)
= SUBMISSION =
Formatting should be according to COLING 2022 specifications.
Dual submission is allowed but must be declared at the time of submission.
Please submit through the START system at the following URL:
https://www.softconf.com/coling2022/W-NUT_2022
Our team (cocodev.fr) at Aix-Marseille University offers a fully-funded
Ph.D. research position (with no teaching duties) in the framework of the
ANR grant MACoMiC (Mastering the Art of Conversation in Middle Childhood).
The broad goal of the PhD researcher is to lead the development of deep
learning models of child-parent multimodal communication, across several
cultures, using data of face-to-face conversations recorded using portable
eye-tracking systems and Zoom calls.
We are interested in studying the development of various conversational
skills including mechanisms of building shared understanding, multimodal
synchrony/alignment, and discourse coherence/contingency.
We are also interested in the application of this research both to help
design more effective clinical interventions (for children with
communicative difficulties) and to build child-oriented conversational AI.
The selected candidate can focus on one or several of these dimensions,
defining a personalized research program together with the main advisor.
The PhD researcher will be integrated into a supportive and highly
interdisciplinary team of senior and early career researchers in computer
science (with expertise in conversational AI), developmental psychology,
and neuro-linguistics. They will be located at the Department of Computer
Science of Aix-Marseille University and will be part of the Institute of
Language, Communication and the Brain (ILCB, https://www.ilcb.fr/).
Additionally, the PhD researcher will have the opportunity to
interact/collaborate with CoCoDev’s internal network, especially
researchers from the Dialog Modelling Group (University of Amsterdam), the
Interacting Minds Center (University of Aarhus), and the Multimodal
Language and Cognition group (Max Planck Institute for Psycholinguistics).
Requirements
- The ideal candidate for this position should have a strong
background/training in computer science and experience with deep-learning
modeling.
- Interest in cognitive science (though no prior experience is required).
- Good mastery of English.
Key dates
Open until filled.
Please send (as soon as possible for full consideration):
1) A CV
2) A recent transcript (a university document with courses taken and grades)
3) Contact info of one reference (ideally a research supervisor)
4) (Optional) Evidence of prior experience with deep-learning modeling (a
publication, dissertation, code on GitHub, etc.)
*Latest starting date:* October 1st, 2022
Inquiries
All kinds of inquiries (about the scientific project, the university, life
in Marseille, etc) as well as the application documents should be addressed
to Abdellah Fourtassi (abdellah.fourtassi(a)univ-amu.fr)
--
Abdellah Fourtassi
Assistant Professor
Department of Computer Science
Institute of Language, Communication, and the Brain
Aix-Marseille University, France
https://sites.google.com/site/fourtassi/
***2nd SummDial: A SemDial 2022 <https://semdial2022.github.io/#> Special
Session on Summarization of Dialogues and Multi-Party Meetings***
***Website: https://elitr.github.io/automatic-minuting/summdial-2022.html
***
***Submission Deadline: August 1, 2022 ***
***Event Date: August 24, 2022 ***
With a sizeable working population of the world going virtual, resulting in
information overload from multiple online meetings, imagine how convenient
it would be to just hover over past calendar invites and get concise
summaries of the meeting proceedings? How about automatically minuting a
multimodal multi-party meeting? Are minutes and multi-party dialogue
summaries the same? We believe Automatic Minuting is challenging. There are
possibly no agreed-upon guidelines for taking minutes, and people adopt
different styles to record meeting minutes. The minutes also depend on the
meeting's category, the intended audience, and the goal or objective of the
meeting. We hosted the First SummDial Special Session at SIGDial 2021.
Several significant problems and challenges in multi-party dialogue and
meeting summarization came from the discussions in the first SummDial,
which we documented in our event report
<https://dl.acm.org/doi/10.1145/3527546.3527561>.
Since we witnessed enthusiastic participation of the dialogue and
summarization community in the first SummDial special session
(https://elitr.github.io/automatic-minuting/summdial.html), we are hosting
the Second SummDial special session at SemDial 2022
(https://semdial2022.github.io/#). This
year, we intend to continue discussing these challenges and lessons learned
from the previous SummDial. Our goal for this special session would be to
stimulate intense discussions around this topic and set the tone for
further interest, research, and collaboration in both Speech and Natural
Language Processing communities. Our topics of interest are Dialogue
Summarization, including but not limited to Meeting Summarization, Chat
Summarization, Email Threads Summarization, Customer Service Summarization,
Medical Dialogue Summarization, and Multi-modal Dialogue Summarization. Our
shared task on Automatic Minuting (AutoMin)
<https://elitr.github.io/automatic-minuting/> at Interspeech 2021
<https://www.interspeech2021.org/> was another community effort in this
direction.
***Call for papers***
We invite regular and work-in-progress papers that report:
- Current research in multi-party dialogue summarization for summarizing
meetings and spoken dialogue, using speech, text, or multi-modal data
(audio, video),
- Challenges in dialogue summarization evaluation (manual + automatic),
- New methods and metrics for dialogue summarization evaluation,
- Relevant corpus collection, pre-processing, development, and ethical
issues involved,
- Comparisons of speech-specific systems with systems imported from
text summarization,
- Tools for meeting transcript generation and automatic summarization,
- Topic detection and span identification in meeting transcripts for
multi-topic summarization,
- Position papers to reflect on the current state of the art in this
topic, to take stock of where we have been, where we are, where we are
going and where we should go.
Researchers may choose to submit:
- ***Long papers*** Authors should submit an anonymous paper of at most 8
pages of content (up to 2 additional pages are allowed for references).
- ***Short papers*** Authors should submit a non-anonymized paper of at
most 2 pages of content (up to 1 additional page allowed for references).
Submissions to this track can be non-archival on request.
- ***Position Papers*** Including extended abstracts, work-in-progress,
and late-breaking papers.
***Submission Link***
https://easychair.org/my/conference?conf=summdial2022
Submissions should follow the ACL format. Papers that have been or will be
submitted to other meetings or publications must provide this information
using a footnote on the title page of the submission. SummDial 2022 cannot
accept for publication work that will be (or has been) published
elsewhere.
***Special Session Program***
The special session will consist of a keynote, a panel, and oral and/or
poster paper presentations.
***Organizers***
- Tirthankar Ghosal <https://elitr.eu/tirthankar-ghosal/>, Institute of
Formal and Applied Linguistics, Charles University, Czech Republic
- Muskaan Singh, IDIAP, Switzerland
- Xinnou Xu, University of Edinburgh, UK
- Ondřej Bojar <https://ufal.mff.cuni.cz/ondrej-bojar>, Institute of
Formal and Applied Linguistics, Charles University, Czech Republic
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Tirthankar Ghosal
Researcher at UFAL, Charles University, CZ
https://member.acm.org/~tghosal
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
***Shared Task: Detecting Entities in the Astrophysics Literature (DEAL)***
***Website: https://ui.adsabs.harvard.edu/WIESP/2022/SharedTasks ***
***Twitter: https://twitter.com/wiesp_nlp ***
A good amount of astrophysics research makes use of data coming from
missions and facilities such as ground observatories in remote locations or
space telescopes, as well as digital archives that hold large amounts of
observed and simulated data. These missions and facilities are frequently
named after historical figures or use some ingenious acronym which,
unfortunately, can be easily confused when searching for them in the
literature via simple string matching. For instance, Planck can refer to
the person, the mission, the constant, or several institutions.
Automatically recognizing entities such as missions or facilities would
help tackle this word sense disambiguation problem.
The shared task consists of Named Entity Recognition (NER) on samples of
text extracted from astrophysics publications. The labels were created by
domain experts and designed to identify entities of interest to the
astrophysics community. They range from simple to detect (e.g., URLs) to
highly unstructured (e.g., Formula), and from useful to researchers (e.g.,
Telescope) to more useful to archivists and administrators (e.g., Grant).
Overall, 31 different labels are included, and their distribution is highly
unbalanced (e.g., ~100x more Citations than Proposals). Submissions will be
scored using both the CoNLL-2000 shared task seqeval F1-Score at the entity
level, and scikit-learn's Matthews correlation coefficient method at the
token level. We also encourage authors to propose their own evaluation
metrics. A sample dataset and more instructions can be found at:
https://ui.adsabs.harvard.edu/WIESP/2022/SharedTasks
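
As a rough illustration of the two kinds of scores mentioned above, the
following minimal sketch computes an entity-level F1 and a token-level
Matthews correlation coefficient in plain Python. It is not the official
scoring script (which, per the description above, relies on seqeval and
scikit-learn). The IOB2 tag scheme, the lenient handling of stray I- tags,
the example labels, and the helper names extract_entities, entity_f1 and
token_mcc are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def extract_entities(tags):
    """Return the set of (label, start, end) spans in one IOB2 tag sequence.

    Assumption: a stray "I-X" that does not continue an open "X" span
    starts a new span (lenient decoding).
    """
    spans, label, start = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if label is not None and not inside:
            spans.add((label, start, i))
            label = None
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


def entity_f1(gold_seqs, pred_seqs):
    """Micro-averaged F1 over exact (label, start, end) span matches."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def token_mcc(gold_seqs, pred_seqs):
    """Multi-class Matthews correlation coefficient over all tokens."""
    gold = [t for seq in gold_seqs for t in seq]
    pred = [t for seq in pred_seqs for t in seq]
    s = len(gold)                                   # total number of tokens
    c = sum(g == p for g, p in zip(gold, pred))     # correctly tagged tokens
    t, p = Counter(gold), Counter(pred)             # per-class token counts
    cov_tp = c * s - sum(t[k] * p[k] for k in t)
    cov_tt = s * s - sum(v * v for v in t.values())
    cov_pp = s * s - sum(v * v for v in p.values())
    denom = sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical one-sentence example: the system misses the Grant entity.
    gold = [["B-Telescope", "I-Telescope", "O", "B-Grant"]]
    pred = [["B-Telescope", "I-Telescope", "O", "O"]]
    print("entity-level F1:", round(entity_f1(gold, pred), 3))
    print("token-level MCC:", round(token_mcc(gold, pred), 3))
```

The two numbers can diverge: a single wrong token breaks a whole entity for
the F1 score but changes the token-level coefficient only slightly, which
is one reason to look at both given the highly unbalanced label
distribution.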
Participants (individuals or groups) will have the opportunity to present
their findings during the workshop and write a short paper. The
best-performing or most interesting approaches might be invited to further
collaborate with the NASA Astrophysics Data System
(https://ui.adsabs.harvard.edu/).
The DEAL shared task is a part of the *1st Workshop on Information
Extraction from Scientific Publications (WIESP) at AACL-IJCNLP 2022: *
https://ui.adsabs.harvard.edu/WIESP/2022/
***Please fill in this form to report your intention to participate in the
shared task***
https://forms.office.com/r/KKpeKJBLy3
***Shared Task Submission***
Link to data and scoring scripts:
https://huggingface.co/datasets/fgrezes/WIESP2022-NER
CodaLab link to the online competition:
https://codalab.lisn.upsaclay.fr/competitions/5062
***Important Dates***
- Training+Validation Data Release: June 1, 2022
- Validation Phase: June 1 - July 31, 2022
- Test Data Release: August 1, 2022
- Final Scoring Period: August 1 - August 10, 2022
- System Report Submission: August 25, 2022
- Notification: September 25, 2022
- Camera-ready Submission Deadline: October 10, 2022
- Event Date: November 20, 2022 (online)
***All submission deadlines are 11.59 pm UTC -12h (“Anywhere on Earth”)***
***Organizers***
- Tirthankar Ghosal <https://elitr.eu/tirthankar-ghosal>, Charles
University, CZ
- Sergi Blanco-Cuaresma <https://www.blancocuaresma.com/s/>, Center for
Astrophysics | Harvard & Smithsonian, USA
- Alberto Accomazzi
<https://ui.adsabs.harvard.edu/about/team/team/aaccomazzi.html>, Center
for Astrophysics | Harvard & Smithsonian, USA
- Robert M. Patton <https://www.ornl.gov/staff-profile/robert-m-patton>,
Oak Ridge National Laboratory, USA
- Felix Grezes <https://ui.adsabs.harvard.edu/about/team/team/fgrezes.html>,
Center for Astrophysics | Harvard & Smithsonian, USA
- Thomas Allen <https://ui.adsabs.harvard.edu/about/team/team/tallen.html>,
Center for Astrophysics | Harvard & Smithsonian, USA
***Call for Participation***
***First Shared Task on Multi-Perspective Scientific Document Summarization
(MuP)***
Website: https://github.com/allenai/mup
Generating summaries of scientific documents is known to be a challenging
task. The majority of existing work in summarization assumes only one
single best gold summary for each given document. Having only one gold
summary negatively impacts our ability to evaluate the quality of
summarization systems, as writing summaries is a subjective activity. At
the same time, annotating multiple gold summaries for scientific documents
can be extremely expensive as it requires domain experts to read and
understand long scientific documents. This shared task will enable
exploring methods for generating multi-perspective summaries. We introduce
a novel summarization corpus, leveraging data from scientific peer reviews
to capture diverse perspectives from the reader's point of view (each paper
has multiple summaries reflecting multiple perspectives of the reader).
The MuP shared task is a part of the 3rd Scholarly Document Processing
(SDP) workshop at COLING 2022. https://sdproc.org/2022/
More details on the shared task and the corresponding dataset can be found
at: https://github.com/allenai/mup
***Please fill in this form to participate in the shared task***
https://forms.gle/K2UECKvmghzDHUpo7
The leaderboard for the shared task will be announced soon on the website.
Shared Task Timelines
Training Data Release: May 10, 2022
Test Data Release: June 30, 2022
Evaluation Period: July 1 - July 15, 2022
System Description Papers Due: August 1, 2022
Reviews Notification: August 15, 2022
Camera-Ready Papers Due: September 5, 2022
Event at SDP @ COLING 2022: October 16/17, 2022
MuP 2022 Organizers
1. Guy Feigenblat - Piiano, Israel
2. Arman Cohan - AI2, US
3. Tirthankar Ghosal - ÚFAL, Charles University, Czechia
4. Michal Shmueli-Scheuer - IBM Research AI, Israel
Is anyone aware of metadata for the BNC 2014 *Written* corpus -- source,
date, # words, (sub)genre, etc for each of the ~88,000 texts?
I've contacted the BNC people, but no response.
Thanks,
Mark Davies
============================================
Mark Davies
english-corpora.org
mark-davies.org
============================================
In our newly established Research Training Group
Dimensions of Constructional Space
we're offering
13 PhD positions (65%, 3 years)
on a wide range of topics connected to Construction Grammar as a common theoretical core, and
1 postdoc position (100%, 4.5 years) on developing a multilingual research constructicon
to integrate results obtained in the PhD projects and create a new model for linguistic research documentation.
You can apply for one of the 13 PhD projects offered or for the postdoc position. Please include a motivation letter that explains why you're interested in, and qualified for, this particular position.
Application deadline: 10 July 2022
More information is available online:
Call for applications – https://www.linguistics.phil.fau.eu/fau-linguistics/research-training-group…
Project descriptions – https://www.linguistics.phil.fau.eu/fau-linguistics/research-training-group…
Homepage of the RTG – https://www.linguistics.phil.fau.eu/fau-linguistics/research-training-group…
Full details – https://www.linguistics.phil.fau.eu/files/2022/05/rtg-dimensions-of-constru…
Please share this call with anyone who might be interested!
Best wishes,
Stephanie
--
Prof. Stephanie Evert
Chair of Computational Corpus Linguistics
Friedrich-Alexander-Universität Erlangen-Nürnberg
Bismarckstr. 6, 91054 Erlangen, Germany
office: Bismarckstr. 6, room 4.000
phone: +49 9131 8522426
e-mail: stephanie.evert(a)fau.de
web: www.linguistik.fau.de