Dear corpora readers:
We -- Danielle Bragg, Alex Lu, and Hal Daumé III -- are looking to hire
research interns to work on data-driven accessibility research projects,
alongside leading researchers and engineers in the field. We are recruiting
both graduate research interns and undergraduate research interns for
Summer 2023. (ASL recruitment video: https://youtu.be/Gb-8CTpKxhU.)
Our team takes a human-centered and data-driven approach to advancing the
state of accessible technologies. Recent work has focused on data
collection methods, sign language modeling, understanding concerns and
perspectives of user communities, and building novel apps and experiences.
You can learn more about some of the team’s recent efforts at the data-driven
accessibility systems page
<https://www.microsoft.com/en-us/research/project/data-driven-accessibility-…>.
These positions sit within Microsoft Research New York City, with
opportunities to collaborate with Microsoft Research New England and others
across the company. Our team is highly interdisciplinary and offers the
opportunity to interact with diverse researchers.
For graduate students, please apply (short research statement and two
letters) at the Research Intern Portal
<https://careers.microsoft.com/us/en/job/1483492/Research-Intern-Data-Driven…>.
For undergraduate students, please apply to the MSR Undergraduate Research
Internship and mention one or more of us by name (CV, 2-3 reference
letters, and two essays) at the Undergraduate Research Intern Portal
<https://www.microsoft.com/en-us/research/academic-program/undergraduate-res…>.
Microsoft is an equal opportunity employer. All qualified applicants will
receive consideration for employment without regard to age, ancestry,
color, family or medical care leave, gender identity or expression, genetic
information, marital status, medical condition, national origin, physical
or mental disability, political affiliation, protected veteran status,
race, religion, sex (including pregnancy), sexual orientation, or any other
characteristic protected by applicable laws, regulations and ordinances.
We also consider qualified applicants regardless of criminal histories,
consistent with legal requirements.
If you need assistance and/or a reasonable accommodation due to a
disability during the application or the recruiting process, please send a
request via the Accommodation request form
<https://careers.microsoft.com/us/en/accommodationrequest>.
Sincerely,
Danielle Bragg, Alex Lu, and Hal Daumé III
Call for Papers: AAAI-2023 Workshop On Multimodal AI For Financial Forecasting
Venue: AAAI 2023
Location: Washington DC, USA
Workshop Date: Monday, 13 February 2023
Submission deadline: December 23, 2022
Submission Site: https://easychair.org/my/conference?conf=muffinaaai2023
Workshop Website: https://muffin-aaai23.github.io/
Abbreviated Title: Muffin-AAAI2023
Contact Email: muffin-aaai23(a)googlegroups.com
Primary Contact: Puneet Mathur
Overview
Financial forecasting is an essential task that helps investors make sound investment decisions and create wealth. With increasing public interest in trading stocks, cryptocurrencies, bonds, commodities, currencies, and non-fungible tokens (NFTs), there have been several attempts to utilize unstructured data for financial forecasting. Unparalleled advances in multimodal deep learning have made it possible to utilize multimedia such as textual reports, news articles, streaming video content, audio conference calls, user social media posts, customer web searches, etc., for identifying profit creation opportunities in the market. For example, how can we leverage new and better information to predict movements in stocks and cryptocurrencies well before others? However, there are several hurdles towards realizing this goal: (1) large volumes of chaotic data; (2) the non-trivial task of combining text, audio, video, social media posts, and other modalities; (3) long context of media spanning multiple hours, days or even months; (4) user sentiment and media hype-driven stock/crypto price movement and volatility; (5) difficulties with traditional statistical methods; and (6) misinformation and non-interpretability of financial systems leading to massive losses and bankruptcies.
At the AAAI-2023 Workshop on Multimodal AI for Financial Forecasting (Muffin@AAAI2023), we aim to bring together researchers from natural language processing, computer vision, speech recognition, machine learning, statistics, and quantitative trading communities to expand research on the intersection of AI and financial time series forecasting. We will also organize 2 shared tasks in this workshop – (1) Stock Price and Volatility Prediction post-Monetary Conference Calls and (2) Cryptocurrency Bubble Detection.
This workshop will hold a research track and a shared task track. The research track aims to explore recent advances and challenges of multimodal AI for finance. As this topic is an inherently multi-modal subject, researchers from artificial intelligence, computer vision, speech processing, natural language processing, data mining, statistics, optimization, and other fields are invited to submit papers on recent advances, resources, tools, and challenges on the broad theme of Multimodal AI for finance.
The topics of the workshop include but are not limited to the following:
Transformer models / Self-supervised / Transfer Learning on Financial Data
Machine Learning for Finance
Natural Language Processing and Speech Applications for Finance
Conversational dialogue modeling for Financial Conference Calls
Social media and User NLP for Finance
Entity extraction and linking, Named-entity recognition, information extraction, relationship extraction, and ontology learning in financial documents
Financial Document Processing
Multi-modal financial knowledge discovery
Financial Event detection from Multimedia
Visual-linguistic learning for financial video analysis
Video understanding (human behavior cognition, topic mining, facial expression detection, emotion detection, deception detection, gait and posture analysis, etc.)
Data annotation, acquisition, augmentation, and feature engineering, for financial/time-series analysis
Bias analysis and mitigation in financial models and data
Statistical Modeling for Time Series Forecasting
Interpretability and explainability for financial AI models
Privacy-preserving AI for finance
All papers will be double-blind peer-reviewed. The Muffin workshop accepts both long papers and short papers:
Short Paper: Up to 4 pages of content including the references. Upon acceptance, the authors are provided with 1 more page to address the reviewers' comments.
Long Paper: Up to 8 pages of content including the references. Upon acceptance, the authors are provided with 1 more page to address the reviewers' comments.
Shared Task Track: Participants are invited to take part in shared tasks: (1) Financial Prediction from Conference Call Videos and (2) Cryptocurrency Bubble Detection. Participants are invited to submit a system paper of 4-8 pages of content including the references.
Important Dates
Paper submission deadline: December 23, 2022
Acceptance notification: January 5, 2023
Camera-ready submission: January 15, 2023
Muffin workshop at AAAI 2023: February 13, 2023
All deadlines are “anywhere on earth” (UTC-12)
About the Shared Task
The Multimodal AI for Financial Forecasting (Muffin) workshop will host two shared tasks on challenging multimodal financial forecasting problems using artificial intelligence. Follow this link for details on shared tasks: https://muffin-aaai23.github.io/shared_task.html
Task 1: Financial Prediction from Conference Call Videos
Monetary policy calls (MPC) provide important insights into the actions taken by a country’s central bank on economic goals related to inflation, employment, prices, and interest rates. Investors and analysts critically analyze these video calls to forecast prices of the stock market, treasury bonds, gold, and currency exchange rates after the conference call. Prior works in the NLP literature have looked at what is being said during press conferences, although there is a greater need to focus on how it is being said. The use of multimodal (visual+textual+audio) input to answer this question has been largely limited. Non-verbal behavioral cues from conference videos, such as eye movements, facial expressions, postures, gaits, the complexity of language, and vocal tone of the speakers, may reflect emotions that subjects may not express through words and have been found to be strongly correlated with enhanced trading activity in the financial markets. Interpreting and extracting information from financial conference calls poses difficult challenges such as: (1) a gap in current multimodal AI methods for simultaneously leveraging visual, vocal, and verbal modalities; (2) the long length of videos (50 minutes to 1 hour) with multi-page text transcripts; (3) the need to explore few-shot, semi-supervised, and self-supervised methods due to limited training data; and (4) large variability in conference calls across geographies due to different speakers, demographics, and economic conditions, causing unintended bias.
To this end, we curated a dataset of video conference calls from 2009 to 2022 released by central banks of 6 major English-speaking economies - USA, Canada, European Union, United Kingdom, New Zealand, and South Africa. The data has been processed to extract video frames, audio recordings, and utterance-aligned text transcripts.
The task is to predict the volatility and price movement of stock market indices, gold, currency exchange rates, and bond prices T days after a conference call. We provide a total of 25K data points split across training/development/testing for experimentation.
Relevant research paper: [1] MONOPOLY: Financial Prediction from MONetary POLicY Conference Videos Using Multimodal Cues
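To make the Task 1 prediction targets concrete, the following is a minimal sketch of how a movement label and a volatility value could be derived from a daily price series for a horizon of T days after a call. The definitions used here (movement as the sign of the price change over the window, volatility as the standard deviation of daily log returns) are common conventions and are assumptions on our part; the function name `targets_after_call` and its parameters are hypothetical, and the official target definitions are those on the shared task page.

```python
import math
from typing import Sequence, Tuple


def targets_after_call(
    prices: Sequence[float], call_index: int, horizon: int
) -> Tuple[int, float]:
    """Return (movement, volatility) for the `horizon` days after a call.

    Assumed definitions (not taken from the shared task description):
    - movement: 1 if the price at day call_index + horizon is above the
      price on the call day, else 0;
    - volatility: population standard deviation of daily log returns
      over the same window.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1 day")
    window = prices[call_index : call_index + horizon + 1]
    if len(window) < horizon + 1:
        raise ValueError("not enough prices after the call for this horizon")

    # Daily log returns between consecutive closing prices.
    returns = [math.log(b / a) for a, b in zip(window, window[1:])]
    mean_return = sum(returns) / len(returns)
    variance = sum((r - mean_return) ** 2 for r in returns) / len(returns)

    movement = 1 if window[-1] > window[0] else 0
    return movement, math.sqrt(variance)


if __name__ == "__main__":
    # Toy closing prices; index 1 is the (hypothetical) call day.
    closes = [100.0, 101.0, 99.5, 102.0, 103.5]
    print(targets_after_call(closes, call_index=1, horizon=3))
```

In this sketch the horizon T is passed in explicitly, so the same function can produce targets for several horizons from one price series.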
Task 2: Cryptocurrency Bubble Detection on Social Media
Cryptocurrency trading presents a new investment opportunity for maximizing profits. The rising ubiquity of speculative trading of cryptocurrencies over social media leads to rapid price escalations and crashes within a short period of time, also called bubbles, causing investment losses and bankruptcy. These crypto bubbles are strongly tied to user sentiment and social media usage, as opposed to conventional value-driven stocks and equities. Such financial bubbles are often a result of social media hype and the intensity of contagion among users, rendering both conventional statistical models and contemporary ML models weak, as they are not built to deal with large volumes of unstructured, user-generated text on social media. In order to identify and safeguard against such bubbles, we formulate the CryptoBubbles Detection Challenge - a novel multi-span prediction task over future days of time series price data for crypto assets. We have curated a dataset of the 50 most traded crypto coins by volume from the top 9 crypto exchanges such as Binance, Gatio, etc., to obtain a time series of prices for 450+ crypto assets over five years, accompanied by over 2.4 million related tweets.
Relevant research paper: [2] Cryptocurrency Bubble Detection: A New Stock Market Dataset, Financial Task & Hyperbolic Models
[apologies for cross-posting]
==============
Call for Workshops and Tutorials @ Fourth Conference on Language, Data and
Knowledge (LDK2023)
Date: 13 September 2023 (Workshop/Tutorials day), 14–15 September 2023
(Main Conference)
Location: Vienna, Austria
Website: http://2023.ldk-conf.org
Submission Deadline: 19 December 2022
Submissions via: EasyChair
==============
We are inviting proposals for workshops and tutorials to be held on
September 13, 2023 in conjunction with the fourth biennial conference on
Language, Data and Knowledge (LDK 2023) in Vienna, Austria. Building upon
the success of the previous events held in Galway, Ireland in 2017, in
Leipzig, Germany in 2019, and in Zaragoza, Spain in 2021, this conference
will bring together researchers from across disciplines concerned with the
acquisition, curation and use of language data in the context of data
science and knowledge-based applications.
Proposal submission
We welcome workshop and tutorial proposals that are of relevance to the
topics listed below. Submissions should be consistent with the main
conference formatting guidelines and be 3–5 pages in length, plus unlimited
pages for appendices.
The decision on acceptance or rejection of a workshop proposal will be made
on the basis of the overall quality of the proposal and its appeal to
linguistics and knowledge-based communities. Other factors, such as overlap
with other workshop proposals, or issues regarding logistics, will also be
taken into account when making the final decision.
Submissions should include the following details:
1. Title of the workshop or tutorial
2. List of organisers or tutorial presenters: List the names, affiliations,
   home page links, and provide short (one paragraph) biographies for all
   workshop organisers or tutorial presenters.
3. List of topics: For workshops, provide a list of topics of relevance to
   the workshop. For tutorials, provide a detailed outline of the topics that
   will be covered.
4. Detailed description: Provide a 200-word summary that motivates and
   describes the list of topics in further detail.
5. Past workshops or tutorials: Describe any previous editions of the same
   workshop or tutorial, along with an estimated size of the audience.
6. Format: Describe the proposed format for the event (physical/virtual).
   For instance, for workshops, describe the envisioned mix, i.e. invited
   talks, submitted paper presentations, poster sessions, panels, or demo
   sessions. For tutorials, describe to what extent the tutorial will consist
   of slide presentations vs. practical hands-on training.
7. Expected audience: Describe the expected size and composition of the
   audience.
8. Duration: Indicate if the workshop or tutorial is planned to last half a
   day (4h) or a full day (8h).
9. Technical requirements: List any special requirements, such as
   audio-visual equipment, poster boards, etc.
10. Committee members: For workshops, if available, provide a preliminary
    list of people who have agreed to serve as program committee members.
11. References: List relevant publications.
Accepted workshops will be required to prepare a workshop website
containing their call for papers and detailed information about the
workshop organisation and timelines.
Language: All submissions must be written in English.
Proposals should be submitted via EasyChair at:
https://easychair.org/cfp/LDK2023WP.
Each proposal will be reviewed by the workshop and tutorial chairs and
relevant members of the organising committee of LDK2023, and ranked based
on the overall quality of the proposal and the workshop’s fit to the
conference as detailed below. In particular, workshops should address
research topics that satisfy each of the following criteria:
1. The topic proposed by the workshops should fall in the general scope of
   LDK2023.
2. The workshop’s target should be clear and anchor a specific technology,
   problem or application.
3. The aim of the workshop is to attract a broad community of interested
   individuals.
4. The format of the workshop can invite various types of contributions
   (oral and poster presentations) and can be accommodated in online/hybrid
   settings.
5. Workshop proposals on emerging topics are encouraged.
Topics:
1. Fundamental technical and theoretical problems of language data / linked
   data and knowledge graphs.
2. Applications of language data and semantic web technologies in domains
   such as law, medicine, life science, digital humanities, mobility and smart
   cities, etc.
3. Research areas that have been largely neglected or underrepresented in
   language data and semantic web studies.
4. Other research areas relevant to language data and semantic web research
   (such as data science, artificial intelligence, big data analytics,
   human-computer interaction, natural language processing, and information
   retrieval).
5. New emerging topics.
We encourage diversity in the organising and program committees, including
members from different institutions and the presence of young researchers
and PhD students. At least one of the organisers should be registered for
LDK2023.
All organisers (presenters) of accepted workshops are expected to:
1. Have their own (open) reviewing process and take care of their publicity
   (e.g., website, timelines, and call for papers).
2. Workshop organisers will be asked to:
   1. Attend LDK2023 in person (at least one organiser).
   2. Closely cooperate with the Workshop Chairs to finalise all
      organisational details.
Workshop and tutorial chairs:
- Ana Ostroški Anić – Institute of Croatian Language and Linguistics
- Blerina Spahiu – University of Milano-Bicocca
Important Dates:
19 December 2022: Proposal submission deadline
16 January 2023: Notification of accepted proposals
6 February 2023: Workshop website and CfP published
19 May 2023: Suggested deadline for workshop paper submissions
16 June 2023: Suggested deadline for notification for workshop paper submissions
30 June 2023: Camera-ready submission deadline
13 September 2023: Workshop and tutorial day
14–15 September 2023: Main conference
All deadlines refer to anywhere-on-earth time.
Ana Ostroški Anić and Blerina Spahiu (Workshop and Tutorial Chairs)
Dear List members,
WNLPe-Health 2022, the first Workshop on Context-aware NLP in eHealth, will
be held at IIIT Delhi, India, on December 15th, 2022, in conjunction with the
19th International Conference on Natural Language Processing (ICON 2022).
It is currently recognised that as much as 30% of the world’s stored data
is produced by the healthcare sector. However, this ‘data-rich’ sector does
not currently explore its data to its full potential, which could enable the
development of much more individual and person-centred AI technologies. For
example, by combining ubiquitous data with user-generated and publicly
available data, AI algorithms can guide and inform citizens about risk
modifying behaviors in an appropriate context. Context can be defined as
“any information that can be used to characterize the situation of an
entity. An entity is a person, place, or object that is considered relevant
for the interaction between a user and an application, including the user
and applications themselves.”
The goal of this workshop is to provide a unique platform to bring together
researchers and practitioners in healthcare informatics working with
health-related data, especially textual data, and facilitate close
interaction among students, scholars, and industry professionals on eHealth
language processing tasks. In particular, we are interested in works that
advance state-of-the-art NLP and ML techniques for eHealth domains by
incorporating more contextual knowledge in order to make models
explainable, trustable and robust in changing situations.
We are interested in research on novel approaches, works in progress,
comparative analyses of tools, and advancing state-of-the-art work in
eHealth NLP methods, tools, and applications. Relevant topics for the
workshop include, but are not limited to, the following areas:
- Modelling of healthcare text in classical NLP tasks (tagging, chunking,
  parsing, entity identification, relation extraction, coreference,
  summarization, etc.) for under-resourced languages.
- Person-centred NLP applications for eHealth including early risk
  prediction.
- Algorithms for context data reasoning.
- Context-sensitive recommendations to individual citizens and patients.
- Integration of structured and unstructured resources for health
  applications.
- Domain adaptation techniques for clinical data.
- Medical terminologies and ontologies.
- Interpretability and analysis of NLP models for healthcare applications.
- Processing clinical literature and trial reports.
- Bayesian modelling and feature selection techniques for high-dimensional
  healthcare data.
- Multimodal learning for decision support systems: Ubiquitous data,
  public databases, user generated content (in combination with wearable
  sensor technology).
Full paper submissions are limited to 8 pages, while short paper
submissions should be less than 4 pages (including bibliography). For more
information: https://sites.google.com/view/wnlpe-health2022/submissions
Important Dates
Paper submission deadline: 20th November, 2022 (23:59 Hawaii Standard Time)
Notification of acceptance: 26th November, 2022
Camera ready copy deadline: 1st December, 2022
Workshop date: 15th December, 2022
Best regards,
Mohammed
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Dr. Mohammed Hasanuzzaman, Lecturer, Munster Technological University <https://www.mtu.ie/>
Funded Investigator, ADAPT Centre, a World-Leading SFI Research Centre <https://www.adaptcentre.ie/>
Member, Lero, the SFI Research Centre for Software <https://lero.ie/>
Chercheur Associé, GREYC UMR CNRS 6072 Research Centre, France <https://www.greyc.fr/en/home/>
Associate Editor: IEEE Transactions on Affective Computing, Nature
Scientific Reports, IEEE Transactions on Computational Social Systems, ACM
TALLIP, PLOS ONE, Computer Speech and Language
Dept. of CS
Munster Technological University
Bishopstown campus
Cork, Ireland
e: mohammed.hasanuzzaman(a)adaptcentre.ie
https://mohammedhasanuzzaman.github.io/
Dear colleagues,
Please find below an internship proposal.
Feel free to forward it to your M2 students.
Kind regards
----------------------------------------------------------------------------------------------------------------------------------
The LIG (Laboratoire d'Informatique de Grenoble) is offering the following Master 2 level internship:
Title: Context-Aware Neural Machine Translation Evaluation
Description:
Context-Aware Neural Machine Translation (CA-NMT) [Tiedemann and Scherrer, 2017; Laubli et al., 2018; Miculicich et al., 2018; Maruf et al., 2019; Zheng et al., 2020; Ma et al., 2021; Lupo et al., 2022] is currently one of the main research axes in NLP, with a strong impact on both academic and industrial research.
CA-NMT systems are evaluated with both "average-quality-measuring" metrics such as BLEU [Papineni et al., 2002] and dedicated contrastive test suites [Voita et al., 2019; Muller&Rios 2018; Lopes et al., 2020].
The latter have been designed to measure specifically to what degree CA-NMT systems are able to exploit context when scoring sentences to be translated in context. Indeed, the average translation quality measured by BLEU has been shown to be inadequate in this respect [Lupo et al., 2022].
When evaluating models with contrastive test suites, however, models are only asked to score sentences and not to translate them. The ability of models to use context is thus only implicitly evaluated.
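As an illustration of how contrastive test suites score rather than translate, here is a minimal sketch of the usual accuracy computation: an example counts as correct when the model assigns a higher score to the reference translation than to every contrastive (wrong) variant. The names `contrastive_accuracy`, `toy_score`, and the hard-coded scores are hypothetical stand-ins; in practice `score` would return the log-probability that the CA-NMT model assigns to a target given the source and its context.

```python
from typing import Callable, Iterable, Sequence, Tuple

# (source with context, reference translation, contrastive translations)
Example = Tuple[str, str, Sequence[str]]


def contrastive_accuracy(
    examples: Iterable[Example], score: Callable[[str, str], float]
) -> float:
    """Fraction of examples where the reference outscores all contrastive variants."""
    hits = 0
    total = 0
    for source, reference, contrastive in examples:
        reference_score = score(source, reference)
        if all(reference_score > score(source, wrong) for wrong in contrastive):
            hits += 1
        total += 1
    return hits / total if total else 0.0


# Hypothetical scores standing in for model log-probabilities.
TOY_SCORES = {
    "Elle est rouge.": -1.2,
    "Il est rouge.": -2.5,
    "Il est cassé.": -3.0,
    "Elle est cassée.": -1.9,
}


def toy_score(source: str, target: str) -> float:
    # A real scorer would condition on `source`; this toy one ignores it.
    return TOY_SCORES[target]


EXAMPLES = [
    ("I bought a car. It is red.", "Elle est rouge.", ["Il est rouge."]),
    ("I dropped my phone. It is broken.", "Il est cassé.", ["Elle est cassée."]),
]

if __name__ == "__main__":
    # The toy scorer prefers the reference in the first example only.
    print(contrastive_accuracy(EXAMPLES, toy_score))  # 0.5
```

Note that nothing in this computation requires the model to generate a translation, which is why the use of context is only implicitly evaluated.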
With the work planned in this internship, we would like to take a step forward in the evaluation of CA-NMT systems.
The idea is to exploit annotated data like those already used in [Muller&Rios 2018; Lopes et al., 2020] to explicitly involve discourse phenomena, such as coreference and anaphora, in the evaluation procedure of CA-NMT models.
Such an evaluation procedure may make it possible to design more accurate and adequate evaluation measures for "discourse-phenomena-aware" CA-NMT systems.
Practical Aspects:
In this internship the student will use Machine Learning and Deep Learning tools to automatically annotate parallel data (at least English-French, but possibly also English-German and other language pairs) used for NMT with discourse phenomena, as well as Neural Machine Translation tools for automatically generating translations that will be used for CA-NMT evaluation.
Based on the annotation of discourse phenomena, we will design an adequate evaluation metric for CA-NMT systems, taking into account the capability of the system to exploit discourse phenomena. Finally, the evaluation metric will be tested by evaluating CA-NMT systems already available or trained from scratch at LIG by the student.
Profile:
Master 2 student level in computer science or NLP
Interested in Natural Language Processing and Deep Learning approaches
Skills in machine learning for probabilistic models
Computer science skills:
Python programming. Some knowledge of deep learning libraries such as PyTorch (Fairseq would be a plus).
Data manipulation and annotation
The internship may last from 5 to 6 months. It will take place at the LIG laboratory, GETALP team (http://lig-getalp.imag.fr/), starting from January/February 2023.
The student will be tutored by Marco Dinarelli (http://www.marcodinarelli.it) and Emmanuelle Esperança-Rodier (https://lig-membres.imag.fr/esperane/).
Interested candidates must send a CV and a motivation letter to both addresses: marco.dinarelli(a)univ-grenoble-alpes.fr, Emmanuelle.Esperanca-Rodier(a)univ-grenoble-alpes.fr.
[Tiedemann and Scherrer, 2017] Neural machine translation with extended context. Workshop on Discourse in Machine Translation 2017.
[Laubli et al., 2018] Has machine translation achieved human parity? A case for document-level evaluation. EMNLP 2018.
[Miculicich et al., 2018] Document-level neural machine translation with hierarchical attention networks. EMNLP 2018.
[Maruf et al., 2019] Selective attention for context-aware neural machine translation. NAACL 2019.
[Zheng et al., 2020] Towards Making the Most of Context in Neural Machine Translation. IJCAI 2020.
[Ma et al., 2021] A Comparison of Approaches to Document-level Machine Translation. arXiv pre-print 2021.
[Lupo et al., 2022] Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models. ACL 2022.
[Papineni et al., 2002] Bleu: a method for automatic evaluation of machine translation. ACL 2002.
[Voita et al., 2019] When a good translation is wrong in context: Context-aware machine translation improves on deixis, ellipsis, and lexical cohesion. ACL 2019.
[Muller&Rios 2018] A large-scale test set for the evaluation of context-aware pronoun translation in neural machine translation. WMT 2018.
[Lopes et al., 2020] Document-level neural MT: A systematic comparison. EAMT 2020.
----------------------------------------------------------------------------------------------------------------------------------
___________________________________________
Emmanuelle Esperança-Rodier
Enseignante-Chercheuse en Linguistique Informatique (Section 7)
Maîtresse de Conférences - Hors Classe
UMR 5217 - LIG (Laboratoire d’Informatique de Grenoble)
GETALP (Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole)
Bâtiment IMAG - 700 avenue Centrale - Domaine Universitaire de Saint-Martin-d’Hères
04 57 42 14 92
Service des Langues UGA
Coordinatrice des enseignements d’anglais pour la composante IM2AG - Mathématiques
* Title: Diving into neural language models for improving discourse
analysis tasks
* Keywords: Neural Language Models, Discourse analysis, Argumentative
structure, Probing, Transfer Learning
* Supervisors: Nicolas.Hernandez(a)univ-nantes.fr and
Laura.Monceaux(a)univ-nantes.fr
* Location: TALN@LS2N, Nantes, France - https://taln-ls2n.github.io
* Starting date: Jan-2023 (flexible) ~6 months
* Opportunity: to pursue a PhD in the Lexhnology ANR project
https://www.ls2n.fr/stage-these/diving-into-neural-language-models-for-impr…
# MISSION
Fine-tuning a pre-trained language model has become the de facto
standard for handling natural language processing tasks. Since many of
these tasks are dealing with discourse and dialogue structures (e.g.
conversational agent, summarization, dialogue acts recognition,
argumentation mining), it is crucial to understand how such information
is captured by the language models and to study how to intervene on the
learning of this type of information: what is learned, what is missing,
how to add it, how to keep the useful information in a fine-tuned,
distilled, pruned or quantized model...
The internship mission will be defined in this context, collaboratively
with the candidate. One possibility would be to start by probing the
language models on discourse analysis tasks.
We hope the successful candidate will go on to pursue a PhD on the subject
in the Lexhnology project.
* A. Rogers, O. Kovaleva, and A. Rumshisky. A Primer in BERTology: What
We Know About How BERT Works. Transactions of the Association for
Computational Linguistics (TACL), 8:842–866. 2020.
* V. Araujo, A. Villa, M. Mendoza, M.-F. Moens, and A. Soto,
“Augmenting BERT-style Models with Predictive Coding to Improve
Discourse-level Representations,” In EMNLP, Nov. 2021.
* M. Lukasik, B. Dadachev, G. Simões, & K. Papineni, Text Segmentation
by Cross Segment Attention, In Proceedings of the 2020 Conference on
Empirical Methods in Natural Language Processing (EMNLP), 4707–4716,
November 16–20, 2020.
* L. Huber, C. Memmadi, M. Dargnat, and Y. Toussaint. Do sentence
embeddings capture discourse properties of sentences from scientific
abstracts? In the First ACL Workshop on Computational Approaches to
Discourse, 86–95, 2020.
* F. Koto, J. H. Lau, and T. Baldwin. Discourse Probing of Pretrained
Language Models. In Proceedings of the 2021 Conference of the North
American Chapter of the Association for Computational Linguistics
(NAACL), Mexico (virtual), 2021.
# THE LEXHNOLOGY PROJECT
Lexhnology is a project funded by the French National Research Agency
(ANR). It will start on January 2, 2023 for a period of 42 months.
Given the growing extraterritoriality of American law, this domestic law
is increasingly impacting other countries' jurisdiction. It is of prime
importance that second-language (L2) users of legal English be able to
analyze case law. Teaching the argumentative structure to L2 learners is
a widely accepted method in languages for specific purposes (LSP) L2
teaching/learning and may help learners understand the legally-binding
rationale behind judicial decisions.
Despite this context, consensus about the linguistic definition of the
communicative functions, also known as moves, in case law does not yet
exist. In addition, no Natural Language Processing (NLP) techniques are
currently able to automatically identify moves in case law. Finally, the
effectiveness of making moves explicit to L2 learners has not been
measured experimentally.
To answer these questions, Lexhnology will take an innovative
interdisciplinary approach – linguistic, NLP, LSP teaching/learning.
The project is the joint collaboration of four laboratories, namely
LS2N, CRINI, LAIRDIL and ATILF.
# APPLICATION
The successful candidate is expected to:
* Hold or be completing a Master's degree (or equivalent) in Natural
Language Processing, Computer Science, Computational Linguistics or Data
Science,
* Have an excellent background in deep learning and more generally
machine learning,
* Have strong programming skills (software development and Python),
* Have good verbal communication and writing skills (in French/English),
* Be comfortable both with teamwork and with working autonomously,
* Be dynamic and curious.
We look forward to receiving your online application,
including:
* a letter of motivation
* a CV
* contacts for two references
Apply by November 15, 2022, to join the ELLIS PhD Program in 2023 – Details at: https://ellis.eu/news/ellis-phd-program-call-for-applications-2022
The ELLIS PhD program has launched its yearly recruiting round and is now accepting applications. A key pillar of the ELLIS initiative, the program's central aim is to foster and educate the best talent in machine learning and related research areas by pairing outstanding students from across the globe with leading researchers in Europe. The program also offers a variety of networking and training activities. Each PhD student is co-supervised by two ELLIS scientists based in different European countries. Over the course of their degree, students complete a mandatory exchange of at least six months at their co-advisor's lab. One of the advisors may also come from industry, in which case the student will collaborate closely with the industry partner and spend their exchange conducting research at an industrial lab.
Research areas include (but are not limited to) the following machine learning-driven research fields:
- AutoML
- Bayesian & Probabilistic Learning
- Bioinformatics
- Causality
- Computational Neuroscience
- Computer Graphics
- Computer Vision
- Deep Learning
- Earth & Climate Sciences
- Health
- Human Behavior, Psychology & Emotion
- Human Computer Interaction
- Human Robot Interaction
- Information Retrieval
- Interactive & Online Learning
- Interpretability & Fairness
- Law & Ethics
- Machine Learning Algorithms
- Machine Learning Theory
- ML in Chemistry & Material Sciences
- ML in Finance
- ML in Science & Engineering
- ML Systems
- Multi-agent Systems & Game Theory
- Natural Language Processing
- Optimization & Meta Learning
- Privacy
- Quantum & Physics-based ML
- Reinforcement Learning & Control
- Robotics
- Robust & Trustworthy ML
- Safety
- Security, Synthesis & Verification
- Symbolic Machine Learning
- Unsupervised Learning
You can watch our introductory video here: https://www.youtube.com/watch?v=kWXNpnxkfg0.
The deadline for applications is November 15, 2022. Interested candidates should apply online through the ELLIS application portal. For details on the program, specific research areas, and the application process, please consult the call for applications: https://ellis.eu/news/ellis-phd-program-call-for-applications-2022.
The School of Informatics, University of Edinburgh, is thrilled to
announce a PhD scholarship funded by DeepMind.
The scholarship covers tuition fees (at the Home/International tuition
fee rate), provides an annual stipend of £17,668 (for 4 years of full-time
study), and provides a research training and support grant. The
student will be supervised by Dr. Mirella Lapata and will also benefit
from mentoring from DeepMind staff during their period of study.
Applicants would be expected to work on a topic drawn from the
following research areas:
- multimodal natural language understanding and generation
- long-form and retrieval-augmented text generation
- multilingual generation
Applicants wishing to apply for the scholarship should meet one or
more of the following criteria:
- are a resident of a country and/or region underrepresented in AI;
- identify as women, including cis and trans people and non-binary or
gender fluid people who identify in a significant way as women or
female;
- identify as Black or other minority ethnicity.
The successful candidate will have a good honours degree or equivalent
in artificial intelligence, computer science, machine learning, or a
related discipline; or have a breadth of relevant experience in
industry/academia/public sector, etc. They will have strong
programming skills and previous experience in natural language
processing.
If you have further questions, please contact Dr. Mirella Lapata,
mlap(a)inf.ed.ac.uk.
To apply, please follow the instructions at:
http://www.inf.ed.ac.uk/postgraduate/apply.html
As your research area, please select "Informatics: ILCC: Language
Processing, Speech Technology, Information Retrieval, Cognition". On
the application form under "Research Project", please state "DeepMind
Scholarship".
IMPORTANT: After submitting your application through the website,
please email your applicant number to mlap(a)inf.ed.ac.uk.
Application deadline: 10th December 2022 [applications received after
the deadline may be considered, but this cannot be guaranteed].
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Dear colleague,
We are happy to announce the next webinar in the Language Technology
webinar series organized by the HiTZ research center (Basque Center for
Language Technology, http://hitz.eus). We are organizing one seminar
every month. You can check the videos of previous webinars and the
schedule for upcoming webinars here: http://www.hitz.eus/webinars
Next webinar:
* *Speaker*: Vered Shwartz (The University of British Columbia-Vancouver)
* *Title*: Incorporating Commonsense Reasoning into NLP Models
* *Date*: November 3, 2022, 15:30 CET
* *Summary*: NLP models are primarily supervised, and are by design
trained on a sample of the situations they may encounter in
practice. The ability of models to generalize to and address unknown
situations reasonably is limited, but may be improved by endowing
models with commonsense knowledge and reasoning skills. In this
talk, I will present several lines of work in which commonsense is
used for improving the performance of NLP tasks: for completing
missing knowledge in underspecified language, interpreting
figurative language, and resolving context-sensitive event
coreference. Finally, I will discuss open problems and future
directions in building NLP models with commonsense reasoning abilities.
* *Bio*: Vered Shwartz is an Assistant Professor of Computer Science at
the University of British Columbia and a faculty member at the
Vector Institute for Artificial Intelligence. Her research interests
include commonsense reasoning, computational semantics and
pragmatics, and multiword expressions. Previously, Vered was a
postdoctoral researcher at the Allen Institute for AI (AI2) and the
University of Washington, and received her PhD in Computer Science
from Bar-Ilan University.
* *Upcoming webinars*:
o Machine translation as a tool for multilingual information:
different users and use scenarios -- Maarit Koponen (December 1,
2022)
Check past and upcoming webinars at the following URL:
http://www.hitz.eus/webinars
If you are interested in participating,
please complete this registration form:
http://www.hitz.eus/webinar_izenematea
If you cannot attend this seminar, but you want to be informed of the
following HiTZ webinars, please complete this registration form instead:
http://www.hitz.eus/webinar_info
Best wishes,
HiTZ Zentroa
*Research Assistant*
*Natural Language Processing and Linked Data*
*School of Computer Science and Data Science Institute*
*Ref. No. University of Galway 272-22*
Applications are invited from suitably qualified candidates for a
full-time, fixed-term position as a Research Assistant with the School of
Computer Science and Data Science Institute at the University of Galway.
This position is funded by SFI Insight Research Centre for Data Analytics
and Fidelity Investments and is available from 1 November 2022 to the contract
end date of 30 October 2023.
The School of Computer Science is ambitious and growing, and we invite the
new appointee to contribute to this together with us. The vision of the
School of Computer Science is to build a strong and sustainable learning
environment with world-recognised research that informs high-quality
undergraduate and postgraduate teaching that is inclusive and relevant to
the needs of our stakeholders and society in general. The School of
Computer Science was initially established in 1991 as the Information
Technology Discipline, and became a School in 2019, recognising its growth
and significance.
DSI incorporates the University of Galway node of the nationwide Insight
Centre for Data Analytics. DSI hosts more than 100 staff and has
established itself as a top player worldwide in the areas of Semantic Web
and Linked Data. It has successfully implemented a research strategy around
the goal of “Enabling Networked Knowledge”, which aims at capitalizing on
knowledge as the fuel for the digital service economy, by linking
information and exploiting the resulting knowledge graphs as the basis for
economic productivity. The institute performs fundamental and applied
research in a range of research areas to enable this, including data
streams and sensor networks, knowledge discovery, natural language
processing, social semantics and social network analysis, among others.
Research outcomes are applied in use cases across a range of domains,
including eGovernment, financial services, manufacturing, eHealth and Life
Sciences.
School of Computer Science - University of Galway
<https://www.universityofgalway.ie/science-engineering/school-of-computer-sc…>
*https://www.insight-centre.org/* <https://www.insight-centre.org/>
*Job Description:*
This research assistant position is in the area of inclusive language
detection, and will focus on the combination of existing natural language
processing and linked data technologies. This work will build on the
definition of inclusive language provided in the Fidelity report “Inclusion
Guide: Language, Accessibility and more”. In addition, we will investigate
open benchmarks.
The successful candidate will support the activities of the project through
provision of research and administrative assistance and will work under the
direction of the Project Leaders Dr. Bharathi Raja Chakravarthi and Dr. John
McCrae.
The School of Computer Science and Data Science Institute (DSI) at
University of Galway is inviting applications for the position of research
assistant for 1 year in the context of the SFI Insight Research Centre for
Data Analytics.
*Duties:*
*Research*
- Actively participate as a member of a research team and assist an
individual research leader or team to conduct a particular study (or group
of studies).
- Provide assistance in conducting research activities, including
planning, organizing, conducting, and communicating research studies within
the overall scope of a research project.
- Coordinate and perform a variety of independent tasks and team
activities involved in the collection, analysis, documentation and some
interpretation of information/results.
- Conduct literature and database searches and interpret and present the
findings of the literature searches as appropriate.
- Assist in analysis and interpretation of results of own research.
*Write up & Disseminate*
- Write up results from own research activity (e.g. as project report)
for review by PI, including preparing technical reports, conclusions and
recommendations.
- Contribute to the publication of findings.
- Provide input into the research project’s dissemination, in whatever
form (report, papers, chapters, book) as directed by the PI/project leader.
Authorship should be decided in line with guidelines such as the Vancouver
Protocol, or similar authorship guidelines as appropriate.
- Present on research progress and outcomes e.g. to bodies supervising
research; steering groups; other team members, as agreed with the
PI/project leader.
- Write at least workshop-level papers.
*Management*
- Work under the direction of the Principal Investigator/Project
Leader. Plan and manage own day-to-day research activity within this
framework & direction.
- Provide guidance as required to any support staff and/or research
students assisting with the research project, as agreed with the Principal
Investigator/Grant holder.
- Perform other related duties incidental to the work described
herein.
- Where appropriate, provide advice and/or assistance to support staff and
research students.
*Qualifications/Skills required:*
*Essential Requirements:*
- MSc in Natural Language Processing, Computer Science or Linguistics
- Experience with natural language processing, linked data or related
technologies
- Excellent understanding of experimental design and scientific
methodologies
- Strong command of oral and written English
- Good programming skills and evidence of previously completed software
projects
*Desirable Requirements:*
- Strong knowledge of language technology for equality, diversity, and
inclusion
- Knowledge of debiasing techniques in NLP tasks
- Knowledge of gender inclusive languages
- Programming experience with deep learning in Python
- Strong publication record
- Track record of contribution to open source projects.
*Employment permit restrictions apply for this category of post*
*Salary*: €27,380 to €31,050 per annum, pro rata for shorter
and/or part-time contracts (public sector pay policy rules pertaining to
new entrants will apply)
*Start date*: Position is available from November 2022
*Continuing Professional Development/Training*:
Further information on research and working at University of Galway is
available on Research at University of Galway
<http://www.nuigalway.ie/our-research/>. Researchers at University of Galway
are encouraged to avail of a range of training and development
opportunities designed to support their personal career development plans.
University of Galway provides continuing professional development supports
for all researchers seeking to build their own career pathways either
within or beyond academia. Researchers are encouraged to engage with our
Researcher Development Centre (RDC) upon commencing employment - see
https://www.universityofgalway.ie/rdc/ for further information.
For information on moving to Ireland please see www.euraxess.ie
Further information about the School of Computer Science and Data Science
Institute is available at School of Computer Science - University of Galway
<https://www.universityofgalway.ie/science-engineering/school-of-computer-sc…>
https://www.universityofgalway.ie/dsi/
*NB*: Garda vetting is a requirement for this post.
*To Apply:*
Applications, to include a covering letter, CV, and the contact details of
three referees, should be sent via e-mail (in Word or PDF only) to Dr.
Bharathi Raja Chakravarthi
(bharathiraja.asokachakravarthi(a)universityofgalway.ie) and Dr. John P.
McCrae (john.mccrae(a)universityofgalway.ie). Please put the reference
number *University of Galway 272-22* in the subject line of the e-mail
application.
*Closing date for receipt of applications is 5.00 pm 28th October 2022*
We reserve the right to re-advertise or extend the closing date for this
post.
University of Galway is an equal opportunities employer.
All positions are recruited in line with Open, Transparent, Merit (OTM) and
Competency-based recruitment.
with regards,
Dr. Bharathi Raja Chakravarthi,
Assistant Professor / Lecturer-above-the-bar
School of Computer Science, University of Galway, Ireland
Insight SFI Research Centre for Data Analytics, Data Science Institute,
University of Galway, Ireland
E-mail: bharathiraja.akr(a)gmail.com ,
bharathiraja.asokachakravarthi(a)universityofgalway.ie
Google Scholar: https://scholar.google.com/citations?user=irCl028AAAAJ&hl=en