Hi there,
Could you please distribute the following job offer? Thanks.
Best,
Pascal
-------------------------------------------------------------------------------------
We invite applications for a 3-year PhD position co-funded by Inria,
the French national research institute in Computer Science and Applied
Mathematics, and LexisNexis France [2], the leader in legal information
in France and a subsidiary of the RELX Group.
The overall objective of this project is to develop an automated
system for detecting argumentation structures in French legal
decisions, using recent machine learning-based approaches (i.e. deep
learning approaches). In the general case, these structures take the
form of a directed labeled graph, whose nodes are the elements of the
text (propositions or groups of propositions, not necessarily
contiguous) which serve as components of the argument, and edges are
relations that signal the argumentative connection between them (e.g.,
support, attack). By revealing the argumentation structure behind
legal decisions, such a system will provide a crucial milestone
towards their detailed understanding, their use by legal
professionals, and, above all, greater transparency of
justice.
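For illustration only, here is a minimal sketch in Python of such a
directed labeled graph (class and field names are purely hypothetical
and do not reflect the project's actual data model):

from dataclasses import dataclass, field

# Hypothetical, minimal data model for an argumentation graph: nodes are
# (possibly non-contiguous) text spans serving as argument components,
# edges are labeled argumentative relations between them.

@dataclass
class ArgumentComponent:
    component_id: str
    spans: list          # (start, end) character offsets, possibly non-contiguous
    component_type: str  # e.g. "premise" or "conclusion"

@dataclass
class ArgumentRelation:
    source_id: str
    target_id: str
    label: str           # e.g. "support" or "attack"

@dataclass
class ArgumentationGraph:
    components: dict = field(default_factory=dict)  # component_id -> ArgumentComponent
    relations: list = field(default_factory=list)   # list of ArgumentRelation

    def add_component(self, component):
        self.components[component.component_id] = component

    def add_relation(self, relation):
        self.relations.append(relation)

# Example: one premise supporting one conclusion.
graph = ArgumentationGraph()
graph.add_component(ArgumentComponent("c1", [(120, 185)], "premise"))
graph.add_component(ArgumentComponent("c2", [(300, 360)], "conclusion"))
graph.add_relation(ArgumentRelation("c1", "c2", "support"))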
The main challenges and milestones of this project start with the
creation and release of a large-scale dataset of French legal
decisions annotated with argumentation structures. To minimize the
manual annotation effort, we will resort to semi-supervised and
transfer learning techniques to leverage existing argument mining
corpora, such as the European Court of Human Rights (ECHR) corpus, as
well as annotations already started by LexisNexis. Another promising
research direction, which is likely to improve over state-of-the-art
approaches, is to better model the dependencies between the different
sub-tasks (argument span detection, argument typing, etc.) instead of
learning these tasks independently. A third research avenue is to find
innovative ways to inject domain knowledge (in particular the rich
legal ontology developed by LexisNexis) to enrich the
representations used in these models. Finally, we would like to take
advantage of other discourse structures, such as coreference and
rhetorical relations, conceived as auxiliary tasks in a multi-task
architecture.
The successful candidate will hold a Master's degree in computational
linguistics, natural language processing, or machine learning, ideally
with prior experience in legal document processing and discourse
processing. Furthermore, the candidate will have strong programming
skills and expertise in machine learning approaches, and will be eager
to work at the interface between academia and industry.
The position is affiliated with MAGNET [1], a research group at
Inria, Lille, which has expertise in Machine Learning and Natural
Language Processing, in particular Discourse Processing. The PhD
student will also work in close collaboration with the R&D team at
LexisNexis France, who will provide their expertise in the legal
domain and the data they have collected.
Applications will be considered until the position is filled. However,
you are encouraged to apply early as we shall start processing the
applications as and when they are received. Applications, written in
English or French, should include a brief cover letter with research
interests and vision, a CV (including your contact address, work
experience, publications), and contact information for at least 2
referees. Applications (and questions) should be sent to Pascal Denis
(pascal.denis(a)inria.fr).
The starting date of the position is 1 November 2022 or soon
thereafter, for a total of 3 full years.
Best regards,
Pascal Denis
[1] https://team.inria.fr/magnet/
[2] https://www.lexisnexis.fr/
--
Pascal
----
For an independent, transparent and rigorous evaluation!
I support Inria's Evaluation Committee (Commission d'Évaluation).
----
+++++++++++++++++++++++++++++++++++++++++++++++
Pascal Denis
Equipe MAGNET, INRIA Lille Nord Europe
Bâtiment B, Avenue Heloïse
Parc scientifique de la Haute Borne
59650 Villeneuve d'Ascq
Tel: +33 3 59 35 87 24
Url: http://researchers.lille.inria.fr/~pdenis/
+++++++++++++++++++++++++++++++++++++++++++++++
Dear colleagues,
Last month, we shared the result of our collaborative work on a core metadata scheme for learner corpora with LCR2022 participants. Our proposal builds on Granger and Paquot's (2017) first attempt to design such a scheme; during our presentation, we explained the rationale for expanding on the initial proposal and discussed selected aspects of the revised scheme.
Our proposal is available at https://docs.google.com/spreadsheets/d/1-RbX5iUCUtCBkZU9Rfk-kv-Vzc--F-eUW2O…
We firmly believe that our efforts to develop a core metadata scheme for learner corpora will only be successful to the extent that (1) the LCR community is given the opportunity to engage with our work in various ways (provide feedback on the general structure of the scheme, the list of variables that we identified as core and their operationalization; test the metadata on other learner corpora; use the scheme to start a new corpus compilation, etc.) and (2) the core metadata scheme is the result of truly collaborative work.
As mentioned at LCR2022, we will be collecting feedback on the metadata scheme until the end of October. The online feedback form is available at:
https://docs.google.com/document/d/1NeDUuxGJlPSJI9wHVA1xgGM-aV8jXTa8Qlb45K-…
We'd like to thank all the colleagues who already got back to us (at LCR2022, by email or via the online form). We also thank them for their appreciation and enthusiasm for our work! We'd also like to encourage more colleagues (and particularly those of you who have experience in learner corpus compilation) to provide feedback! We need help in finalizing the core metadata scheme to make sure that it can be applied in all learner corpus compilation contexts. In short, we need you to make sure the scheme meets the needs of the LCR community at large.
With very best wishes,
Magali Paquot (also on behalf of Alexander König, Jennifer-Carmen Frey, and Egon W. Stemle)
Reference
Granger, S. & M. Paquot (2017). Towards standardization of metadata for L2 corpora. Invited talk at the CLARIN workshop on Interoperability of Second Language Resources and Tools, 6-8 December 2017, University of Gothenburg, Sweden.
Dr. Magali Paquot
Centre for English Corpus Linguistics
Institut Langage et Communication
UCLouvain
https://perso.uclouvain.be/magali.paquot/
Dear all
Just wanted to let you know that APJCR Vol. 3, No. 1 is now available to
view online.
http://icr.or.kr/ejournals-apjcr
CK
---
CK Jung BEng(Hons) Birmingham MSc Warwick EdD Warwick Cert Oxford
Department of English Language and Literature, Incheon National University, South Korea
Vice President | The Korea Association of Primary English Education (KAPEE), South Korea
Vice President | The Korea Association of Secondary English Education (KASEE), South Korea
Director | Institute for Corpus Research, Incheon National University, South Korea (http://icr.or.kr)
Editor | Asia Pacific Journal of Corpus Research, ICR, International (http://icr.or.kr/apjcr)
Deputy Editor | Korean Journal of English Language and Linguistics, KASELL, South Korea
Editorial Board | Corpora, Edinburgh University Press, UK
Editorial Board | English Today, Cambridge University Press, UK
E: ckjung(a)inu.ac.kr / T: +82 (0)32 835 8129
H(EN): http://ckjung.org
H(KR): http://prof1.inu.ac.kr/user/ckjung
PhD in ML/NLP – Efficient, fair, robust and knowledge-informed
self-supervised learning for speech processing
Starting date: November 1st, 2022 (flexible)
Application deadline: September 5th, 2022
Interviews (tentative): September 19th, 2022
Salary: ~2000€ gross/month (social security included)
Mission: research oriented (teaching possible but not mandatory)
*Keywords:* speech processing, natural language processing,
self-supervised learning, knowledge-informed learning, robustness, fairness
*CONTEXT*
The ANR project E-SSL (Efficient Self-Supervised Learning for Inclusive
and Innovative Speech Technologies) will start on November 1st 2022.
Self-supervised learning (SSL) has recently emerged as one of the most
promising artificial intelligence (AI) methods, as it is now feasible
to take advantage of the colossal amounts of existing unlabeled data to
significantly improve performance on various speech processing tasks.
*PROJECT OBJECTIVES*
Recent SSL models for speech such as HuBERT or wav2vec 2.0 have shown an
impressive impact on downstream task performance. This is mainly due to
their ability to benefit from large amounts of data, at the cost of a
tremendous carbon footprint, rather than to improvements in learning
efficiency. Another concern with SSL models is their unpredictable
behaviour once applied to realistic scenarios, which exposes their lack
of robustness. Furthermore, as for any pre-trained model applied in
society, it is important to be able to measure the bias of such models,
since they can amplify social unfairness.
The goals of this PhD position are threefold:
- to design new evaluation metrics for SSL speech models;
- to develop knowledge-driven SSL algorithms;
- to propose methods for learning robust and unbiased representations.
SSL models are evaluated with downstream task-dependent metrics, e.g.,
word error rate for speech recognition. This couples the evaluation of
the universality of SSL representations to a potentially biased and
costly fine-tuning step that also hides the efficiency information
related to the pre-training cost. In practice, we will seek to measure
training efficiency as the ratio between the amount of data, computation
and memory needed and the gain observed on a metric of interest, whether
downstream-dependent or not. The first step will be to document standard
markers that can be used to assess these quantities robustly at training
time. Potential candidates are, for instance, floating point operations
for computational intensity, number of neural parameters coupled with
precision for storage, online measurement of memory consumption for
training, and cumulative input sequence length for data.
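Purely as an illustration (defining the actual metric is part of the PhD
work; all names and the aggregation below are assumptions), such an
efficiency ratio could be tracked at training time roughly as follows:

from dataclasses import dataclass

# Hypothetical markers logged during pre-training (see the candidates above).
@dataclass
class TrainingCost:
    flops: float             # cumulative floating point operations
    num_parameters: int      # neural parameters (to be weighted by precision)
    peak_memory_gb: float    # online measurement of memory consumption
    seen_audio_hours: float  # cumulative input sequence length, in hours

def efficiency_ratio(metric_gain, cost, weights=(1.0, 1.0, 1.0, 1.0)):
    """Illustrative efficiency score: performance gain per unit of aggregated
    cost. The weighted-sum aggregation is an assumption, not the project's
    actual definition."""
    w_flops, w_params, w_mem, w_data = weights
    aggregated_cost = (w_flops * cost.flops / 1e18              # exaFLOPs
                       + w_params * cost.num_parameters / 1e9   # billions of parameters
                       + w_mem * cost.peak_memory_gb / 100.0
                       + w_data * cost.seen_audio_hours / 1e4)
    return metric_gain / aggregated_cost

# Example: +2.5 points on a downstream metric for a given pre-training budget.
cost = TrainingCost(flops=3.2e18, num_parameters=95_000_000,
                    peak_memory_gb=64.0, seen_audio_hours=60_000)
print(efficiency_ratio(2.5, cost))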
Most state-of-the-art SSL models for speech rely on masked prediction,
e.g. HuBERT and WavLM, or contrastive losses, e.g. wav2vec 2.0. Such
prevalence in the literature is mostly linked to the size, amount of
data and computational resources invested by the companies producing
these models. In fact, vanilla masking approaches and contrastive losses
may be regarded as uninformed solutions, as they do not benefit from
in-domain expertise. For instance, it has been demonstrated that blindly
masking frames in the input signal, as in HuBERT and WavLM, results in
much worse downstream performance than using unsupervised phonetic
boundaries [Yue2021] to generate informed masks. Recently, some studies
have demonstrated the superiority of an informed multitask learning
strategy, which carefully selects self-supervised pretext tasks with
respect to a set of downstream tasks, over the vanilla wav2vec 2.0
contrastive learning loss [Zaiem2022]. In this PhD project, our
objective is: 1. to continue developing knowledge-driven SSL algorithms
reaching higher efficiency ratios and better results in terms of
convergence, data consumption and downstream performance; and 2. to
scale these novel approaches to a point enabling comparison with current
state-of-the-art systems and therefore motivating a paradigm change in
SSL for the wider speech community.
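As a minimal sketch of the contrast between blind and boundary-informed
masking (the helper functions below are hypothetical and are not the
implementations used in HuBERT, WavLM or [Yue2021]):

import random

def blind_masks(num_frames, mask_ratio=0.08, span=10):
    """Vanilla masking: spans of frames starting at randomly chosen positions."""
    num_starts = int(num_frames * mask_ratio)
    starts = random.sample(range(num_frames - span), num_starts)
    masked = set()
    for s in starts:
        masked.update(range(s, s + span))
    return sorted(masked)

def informed_masks(segments, mask_prob=0.3):
    """Informed masking: mask whole segments delimited by (possibly
    unsupervised) phonetic boundaries, so masked regions align with
    linguistic units. `segments` is a list of (start_frame, end_frame)."""
    masked = []
    for start, end in segments:
        if random.random() < mask_prob:
            masked.extend(range(start, end))
    return masked

# Example: 500 frames; segments from a hypothetical boundary detector.
segments = [(0, 40), (40, 95), (95, 160), (160, 230), (230, 500)]
print(len(blind_masks(500)), len(informed_masks(segments)))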
Despite remarkable performance on academic benchmarks, SSL-powered
technologies, e.g. speech and speaker recognition, speech synthesis and
many others, may exhibit highly unpredictable results once applied to
realistic scenarios. This can translate into a global accuracy drop due
to a lack of robustness to adverse acoustic conditions, or into biased
and discriminatory behaviour with respect to different pools of end
users. Documenting and facilitating the control of such aspects prior to
the deployment of SSL models in real-life settings is necessary for the
industrial market. To evaluate such aspects within the project, we will
create novel robustness regularization and debiasing techniques along
two axes: 1. debiasing and regularizing speech representations at the
SSL level; 2. debiasing and regularizing downstream-adapted models
(e.g. using a pre-trained model).
To ensure the creation of fair and robust SSL pre-trained models, we
propose to act both at the optimization and data levels, following some
of our previous work on adversarial protected-attribute disentanglement
and the NLP literature on data sampling and augmentation [Noé2021].
Here, we wish to extend this technique to more complex SSL architectures
and more realistic conditions by increasing the disentanglement
complexity (the sex attribute studied in [Noé2021] is particularly easy
to discriminate). Then, to benefit from the expert knowledge induced by
the scope of the task of interest, we will build on a recently
introduced task-dependent counterfactual equal-odds criterion [Sari2021]
to minimize the downstream performance gap observed between individuals
with different protected attributes and to maximize the overall
accuracy. Following this multi-objective optimization scheme, we will
then inject further identified constraints, as inspired by previous NLP
work [Zhao2017]. Intuitively, constraints are injected so that the
predictions are calibrated towards a desired, i.e. unbiased,
distribution.
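As an illustration of the kind of quantity such constraints target, here
is a minimal sketch of a per-group downstream performance gap (the data
layout is hypothetical and this is not the counterfactual criterion of
[Sari2021]):

from collections import defaultdict

def group_accuracies(predictions, labels, protected_attributes):
    """Per-group accuracy on a downstream task, grouped by a protected attribute."""
    correct, total = defaultdict(int), defaultdict(int)
    for pred, label, group in zip(predictions, labels, protected_attributes):
        total[group] += 1
        correct[group] += int(pred == label)
    return {g: correct[g] / total[g] for g in total}

def fairness_gap(predictions, labels, protected_attributes):
    """Gap between the best- and worst-served groups: one of the quantities a
    debiasing objective would try to shrink while keeping overall accuracy high."""
    accs = group_accuracies(predictions, labels, protected_attributes)
    return max(accs.values()) - min(accs.values()), accs

# Toy example with two groups "A" and "B".
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
golds  = [1, 0, 0, 1, 0, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap, accs = fairness_gap(preds, golds, groups)
print(gap, accs)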
*SKILLS*
* Master 2 in Natural Language Processing, Speech Processing, Computer
Science or Data Science.
* Good command of Python programming and deep learning frameworks.
* Previous experience in self-supervised learning, acoustic modeling or
ASR would be a plus.
* Very good communication skills in English.
* Good command of French would be a plus but is not mandatory.
*SCIENTIFIC ENVIRONMENT*
The thesis will be conducted within the GETALP team of the LIG
laboratory (https://lig-getalp.imag.fr/) and the LIA laboratory
(https://lia.univ-avignon.fr/). The GETALP team and the LIA have strong
expertise and a solid track record in natural language processing and
speech processing. The recruited person will be welcomed within teams
that offer a stimulating, multinational and pleasant working
environment.
The means to carry out the PhD will be provided, both in terms of
missions in France and abroad and in terms of equipment. The candidate
will have access to the GPU clusters of both LIG and LIA. Furthermore,
access to the national supercomputer Jean Zay will make it possible to
run large-scale experiments.
The PhD position will be co-supervised by Mickael Rouvier (LIA, Avignon)
and Benjamin Lecouteux and François Portet (Université Grenoble Alpes).
Joint meetings are planned on a regular basis and the student is
expected to spend time in both places. Moreover, the PhD student will
collaborate with several team members involved in the project, in
particular the two other PhD candidates who will be recruited and the
partners from LIA, LIG and Dauphine Université PSL, Paris. Furthermore,
the project will involve one of the founders of SpeechBrain, Titouan
Parcollet, with whom the candidate will interact closely.
*INSTRUCTIONS FOR APPLYING*
Applications must contain: a CV, a letter/message of motivation, and
Master's grade transcripts; candidates should also be ready to provide
letter(s) of recommendation on request. Applications should be addressed
to Mickael Rouvier (mickael.rouvier(a)univ-avignon.fr), Benjamin
Lecouteux (benjamin.lecouteux(a)univ-grenoble-alpes.fr) and François
Portet (francois.portet(a)imag.fr). We celebrate diversity and are
committed to creating an inclusive environment for all employees.
*REFERENCES:*
[Noé2021] Noé, P.- G., Mohammadamini, M., Matrouf, D., Parcollet, T.,
Nautsch, A. & Bonastre, J.- F. Adversarial Disentanglement of Speaker
Representation for Attribute-Driven Privacy Preservation in Proc.
Interspeech 2021 (2021), 1902–1906.
[Sari2021] Sarı, L., Hasegawa-Johnson, M. & Yoo, C. D. Counterfactually
Fair Automatic Speech Recognition. IEEE/ACM Transactions on Audio,
Speech, and Language Processing 29, 3515–3525 (2021)
[Yue2021] Yue, X. & Li, H. Phonetically Motivated Self-Supervised Speech
Representation Learning in Proc. Interspeech 2021 (2021), 746–750.
[Zaiem2022] Zaiem, S., Parcollet, T. & Essid, S. Pretext Tasks Selection
for Multitask Self-Supervised Speech Representation in AAAI, The 2nd
Workshop on Self-supervised Learning for Audio and Speech Processing,
2023 (2022).
[Zhao2017] Zhao, J., Wang, T., Yatskar, M., Ordonez, V. & Chang, K.-W.
Men Also Like Shopping: Reducing Gender Bias Amplification using
Corpus-level Constraints in Proceedings of the 2017 Conference on
Empirical Methods in Natural Language Processing (2017), 2979–2989.
--
François PORTET
Professeur - Univ Grenoble Alpes
Laboratoire d'Informatique de Grenoble - Équipe GETALP
Bâtiment IMAG - Office 333
700 avenue Centrale
Domaine Universitaire - 38401 St Martin d'Hères
FRANCE
Phone: +33 (0)4 57 42 15 44
Email: francois.portet@imag.fr
www: http://membres-liglab.imag.fr/portet/
*Call for Research Fellow Chairs 2023*
MIAI, the Grenoble Interdisciplinary Institute in Artificial
Intelligence (https://miai.univ-grenoble-alpes.fr/), is opening three
research fellow chairs in AI reserved for persons who have spent most of
their research career outside France (see below). MIAI is one of the
four AI institutes created by the French government and is dedicated to
AI for human beings and the environment. Research activities in MIAI
aim to cover all aspects of AI and applications of AI with a current
focus on embedded and hardware architectures for AI, learning and
reasoning, perception and interaction, AI & society, AI for health, AI
for environment & energy, and AI for industry 4.0.
These research fellow chairs aim to address important and ambitious
research problems in AI-related fields and will partly pave the way for
the future research to be conducted in MIAI. Successful candidates will
be appointed by MIAI and will be allocated, for the whole duration of
the chair, a budget of 250k€ covering PhD and/or postdoc salaries,
internships, travel, etc. They will be part of MIAI and the French network
of AI institutes (comprising, in addition to MIAI, the AI institutes in
Paris, Toulouse and Nice) which provide a very dynamic environment for
conducting research in AI.
*Eligibility* To be eligible, candidates must hold a PhD from a
non-French university obtained after January 2014 for male applicants
and after 2014 minus n, where n is the number of children, for female
applicants. They must also have spent more than two thirds of their
research career since the beginning of their PhD outside France. Lastly,
they should be pursuing internationally recognized research in
AI-related fields (including applications of AI to any research field).
*To apply* Interested candidates should first contact Eric Gaussier
(eric.gaussier(a)univ-grenoble-alpes.fr) to discuss salary and application
modalities. It is important to note that candidates should identify a
local collaborator working in one of the Grenoble academic research labs
with whom they will interact. If selected, they will join the research
team of this collaborator. They should then send their application to
Manel Boumegoura (manel.boumegoura(a)univ-grenoble-alpes.fr) and Eric
Gaussier (eric.gaussier(a)univ-grenoble-alpes.fr) by March 11, 2023.
Each application should comprise a 2-page CV, a complete list of
publications, 2 reference letters, a letter from the local collaborator
indicating the relevance and importance of the proposed project, and a
4-page description of the research project which can target any topic of
AI or applications of AI. It is important to emphasize, in the
description, the ambition, the originality and the potential impact of
the research to be conducted, as well as the collaborations the
candidate has or will develop with Grenoble researchers in order to
achieve her or his research goals.
*Starting date and duration* Each chair is intended for 3 to 4 years,
starting no later than September 2023.
*Location* The work will take place in Grenoble, in the research lab of
the identified collaborator.
For any question, please contact Eric Gaussier
(eric.gaussier(a)univ-grenoble-alpes.fr) or Manel Boumegoura
(manel.boumegoura(a)univ-grenoble-alpes.fr).
*******
** apologies for cross-posting **
Linking Lexicographic and Language Learning Resources (4LR)
Workshop at LDK 2023 – Call for Papers
Workshop website: https://lexicala.com/4lr/
The workshop ‘Linking Lexicographic and Language Learning Resources’ (4LR) will be held in conjunction with LDK 2023 – 4th conference on Language, Data and Knowledge – (http://2023.ldk-conf.org/) at the University of Vienna, Austria, on September 13 (tentative), in hybrid mode.
The aim of this workshop is to explore linguistic linked (open) data and knowledge management methods and technologies for linking lexicographic and language learning resources, tools and applications in general and dictionaries and CEFR lists in particular.
Our starting point is, on the one hand, enhancing CEFR-graded language proficiency lists with lexicographic content and, on the other hand, incorporating CEFR labels in learner’s dictionaries. CEFR – the Common European Framework of Reference for Languages – is a generally established international standard for describing language proficiency, and CEFR-graded resources have been developed for many languages in Europe. However, incorporating their information is still not a common practice in modern lexicography for most languages, except for notably two English dictionaries for advanced learners (Cambridge and Oxford). There are substantial unsolved issues, such as inconsistencies in vocabulary size per level between languages; no, or limited, sense disambiguation in CEFR resources; words from a higher CEFR level in definitions and example sentences. Moreover, there has been limited collaboration and interoperability so far among the related fields of lexicography, language acquisition, and linguistic linked data, whether regarding research, development, or practical application.
4LR will feature an overview by the organizers, as well as an invited talk on Linked Data for Lexicographic Resources by Jorge Gracia from the University of Zaragoza, chair of the NexusLinguarum COST Action.
In addition, we invite submissions for papers (20 minutes, plus discussion) on the following topics:
• Linking lexicographic content to CEFR-graded vocabularies
• Pedagogical lexicography and knowledge graphs
• Attributing CEFR labels in learner’s dictionaries
• Incorporating vocabulary and grammar profiles in lexicographic resources
• Creating and linking crosslingual concept-based CEFR resources
• Multilingual knowledge management and language learning applications and tools
SUBMISSION AND DATES
Please submit your abstract of 300-500 words via EasyChair [https://easychair.org/conferences/?conf=4lr2023].
19 May 2023 Deadline for abstract submission
29 May 2023 Deadline for notification for abstract submission
30 June 2023 Deadline for camera-ready paper submission
13 Sep 2023 (tentative) 4LR workshop
14–15 Sep LDK 2023 conference
ORGANIZERS AND CONTACT
Kris Heylen. Dutch Language Institute (Kris DOT Heylen AT ivdnt DOT org)
Jelena Kallas. Institute of the Estonian Language
Ilan Kernerman. Lexicala by K Dictionaries
Carole Tiberius. Dutch Language Institute
Website: https://lexicala.com/4lr/
4LR is supported by NexusLinguarum COST Action (CA18209) – European network for Web-centered linguistic data science.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
The 4LR workshop at LDK 2023 will follow a related workshop, ‘Lexicography and CEFR: Linking Lexicographic Resources and Language Proficiency Levels’, which will be held in conjunction with eLex 2023 on June 29 in Brno, Czech Republic.
(apologies for multiple postings)
*CALL FOR PAPERS* <https://elex.link/elex2023/call-for-papers/>
*eLex 2023: Electronic lexicography in the 21st century.* The topic of next
year's conference is Invisible Lexicography.
Dates: 27-29 June 2023 (with workshops on June 26th)
Venue: Hotel Passage, Brno, Czechia
Deadline for abstract submissions: January 31st 2023
Conference website: https://elex.link/elex2023/
Language of the conference: English
Format:
The conference will be organized as a hybrid event, and while we encourage
everyone to participate on-site, we plan to provide live streaming and
recording of the event for registered participants.
Looking forward to seeing you all in Brno,
Miloš Jakubíček
in the name of the organising committee
First Call for Papers
The 18th Workshop on Innovative Use of NLP for Building Educational
Applications (BEA 2023)
Toronto
Thursday, July 13, 2023
(co-located with ACL 2023)
https://sig-edu.org/bea/current
*Submission Deadline: Monday, April 24, 2023, 11:59pm UTC-12*
WORKSHOP DESCRIPTION
The BEA Workshop is a leading venue for NLP innovation in the context of
educational applications. It is one of the largest one-day workshops in the
ACL community with over 100 registered attendees in the past several years.
The growing interest in educational applications and a diverse community of
researchers involved resulted in the creation of the Special Interest Group
in Educational Applications (SIGEDU)
<https://www.aclweb.org/adminwiki/index.php?title=2019Q3_Reports:_SIGEDU>
in 2017, which currently has over 300 members.
We will solicit papers that incorporate NLP methods, including, but not
limited to:
- automated scoring of open-ended textual and spoken responses;
- automated scoring/evaluation for written student responses (across multiple genres);
- game-based instruction and assessment;
- educational data mining;
- intelligent tutoring;
- collaborative learning environments;
- peer review;
- grammatical error detection and correction;
- learner cognition;
- spoken dialog;
- multimodal applications;
- annotation standards and schemas;
- tools and applications for classroom teachers, learners and/or test developers; and
- use of corpora in educational tools.
INVITED TALKS
The workshop will feature invited talks from Susan Lottridge (Cambium
Assessment) and Jordana Heller (Textio), as well as a speaker from one of
the IAALDE <https://alliancelss.com/> societies.
IMPORTANT DATES
All deadlines are 11:59 pm UTC-12 (anywhere on earth).
- Anonymity Period Begins: *Friday, March 24, 2023*
- Submission Deadline: Monday, April 24, 2023
- Notification of Acceptance: Monday, May 22, 2023
- Camera-ready Papers Due: Tuesday, May 30, 2023
- Workshop: Thursday, July 13, 2023
SUBMISSION INFORMATION
We will be using the ACL Submission Guidelines for the BEA Workshop this
year. Authors are invited to submit a long paper of up to eight (8) pages
of content, plus unlimited references; final versions of long papers will
be given one additional page of content (up to 9 pages) so that reviewers’
comments can be taken into account. We also invite short papers of up to
four (4) pages of content, plus unlimited references. Upon acceptance,
short papers will be given five (5) content pages in the proceedings.
Authors are encouraged to use this additional page to address reviewers’
comments in their final versions. Authors of papers that describe systems
are also invited to give a demo of their system. If you would like to present a demo
in addition to presenting the paper, please make sure to select either
“long paper + demo” or “short paper + demo” under “Submission Category” in
the START submission page.
Previously published papers cannot be accepted. The submissions will be
reviewed by the program committee. As reviewing will be blind, please
ensure that papers are anonymous. Self-references that reveal the author’s
identity, e.g., “We previously showed (Smith, 1991) …”, should be avoided.
Instead, use citations such as “Smith previously showed (Smith, 1991) …”.
We have also included conflict of interest in the submission form. You
should mark all potential reviewers who have been authors on the paper, are
from the same research group or institution, or who have seen versions of
this paper or discussed it with you.
We will be using the START conference system to manage submissions:
https://www.softconf.com/acl2023/bea2023/
DOUBLE SUBMISSION POLICY
We will follow the official ACL double-submission policy
<https://www.aclweb.org/archive/policies/current/double-submission-policy.ht…>.
Specifically:
Papers being submitted both to BEA and another conference or workshop must:
● Note on the title page the other conference or workshop to which
they are being submitted.
● State on the title page that if the authors choose to present their
paper at BEA (assuming it was accepted), then the paper will be withdrawn
from other conferences and workshops.
ORGANIZING COMMITTEE
- Ekaterina Kochmar <https://ekochmar.github.io/about/>, MBZUAI
- Jill Burstein <https://sites.google.com/site/jbursteinets/>, Duolingo
- Andrea Horbach <https://www.ltl.uni-due.de/team/andrea-horbach/>, FernUniversität in Hagen
- Ronja Laarmann-Quante <https://www.ltl.uni-due.de/team/ronja-laarmann-quante>, Ruhr University Bochum
- Nitin Madnani <https://desilinguist.org/>, Educational Testing Service
- Anaïs Tack <https://anaistack.github.io/>, KU Leuven
- Victoria Yaneva <http://www.victoriayaneva.info/>, National Board of Medical Examiners
- Zheng Yuan <https://www.cl.cam.ac.uk/~zy249/>, King’s College London
- Torsten Zesch <https://www.ltl.uni-due.de/team/torsten-zesch>, FernUniversität in Hagen
Workshop contact email address: bea.nlp.workshop(a)gmail.com
There are all kinds of lists on Wikipedia of various kinds of
authors: linguists, philosophers, mathematicians ... In most (almost
all?) cases there is a brief page about the author's biography and
their work.
As you might expect, some folks (isbndb.com) came up with the
great idea of selling you the air you breathe. exaly.com, WorldCat,
freelibrary.org ... do a marginally better job, but I find their web
interfaces too constraining and, "of course", there is no "download
the whole damn thing" option.
What I am looking for is an openly and collectively maintained DB à
la Wikipedia whose interface lets you download all search hits
as well-formatted, parsable lines in a text file without having to
"click next", copy and paste, and all that kind of nonsense.
I could imagine someone in the corpora research community has taken
the time to compile a database which IMO should include:
a) work:
a.1) original name
a.2) original language
a.3) topical bags index
a.4) received category index (children's book, book review, degree
thesis, article in a periodical, ...)
a.5) publications:
a.5.1) date
a.5.2) metadata RDF including: language, "co-"authors (preface, those
writing back-cover blurbs), editors, translators, ISBNs, publisher,
copyright notice, ...
b) name(s):
b.1) first/given name(s) (at birth)
b.2) last name(s) (at birth)
b.3) pen name(s)
b.4) also known as
c) birth place
d) date of birth
e) languages
f) date of death
Author-work pairs should be prioritized. In the case of
compilations of various author-work pairs in a single book, the
compilation in which an article appears should be specified in the
metadata.
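To make the wished-for structure concrete, here is a minimal sketch of
such records in Python (field names are only illustrative and do not
follow any existing standard):

from dataclasses import dataclass, field

@dataclass
class Publication:
    date: str
    # RDF-style metadata: language, "co-"authors (preface, blurb writers),
    # editors, translators, ISBNs, publisher, copyright notice, ...
    metadata_rdf: dict = field(default_factory=dict)

@dataclass
class Work:
    original_name: str
    original_language: str
    topical_bags: list = field(default_factory=list)         # topical bags index
    received_categories: list = field(default_factory=list)  # children's book, review, thesis, ...
    publications: list = field(default_factory=list)         # list of Publication
    compilation: str = ""  # compilation in which the article appears, if any

@dataclass
class Author:
    given_names: list  # first/given name(s) at birth
    last_names: list   # last name(s) at birth
    pen_names: list = field(default_factory=list)
    also_known_as: list = field(default_factory=list)
    birth_place: str = ""
    date_of_birth: str = ""
    date_of_death: str = ""
    languages: list = field(default_factory=list)
    works: list = field(default_factory=list)  # the prioritized author-work pairs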
Please let me know where I could find such a database (even a
partial one) that can be downloaded. If you don't know of such a
general registry of published books/texts, which other entries do
you think are important?
lbrtchx