*Asia Pacific Journal of Corpus Research (APJCR) is now available online:*
http://icr.or.kr/ejournals-apjcr
*The Incredible Shrinking Noun Phrase: Ongoing Change in Japanese Word
Formation*
Kevin Heffernan (Kwansei Gakuin University), JAPAN; Yusuke
Imanishi (Kwansei Gakuin University), JAPAN
DOI: https://doi.org/10.22925/apjcr.2023.4.1.1
________________________________________
*Identifying Key Grammatical Errors of Japanese English as a Foreign
Language Learners in a Learner Corpus: Toward Focused Grammar Instruction
with Data-Driven Learning*
Atsushi Mizumoto (Kansai University), JAPAN; Yoichi Watari (Chukyo
University), JAPAN
DOI: https://doi.org/10.22925/apjcr.2023.4.1.25
________________________________________
*A Comparison of the Constructions Make / Take a Decision in Malaysian
English with the Supervarieties*
Christina Sook Beng Ong (Wawasan Open University), MALAYSIA
DOI: https://doi.org/10.22925/apjcr.2023.4.1.43
________________________________________
*Effects of Corpus Use on Error Identification in L2 Writing*
Yoshiho Satake (Aoyama Gakuin University), JAPAN
DOI: https://doi.org/10.22925/apjcr.2023.4.1.61
---
*CK Jung BEng(Hons) Birmingham MSc Warwick EdD Warwick Cert Oxford*
Associate Professor | Department of English Language and Literature,
Incheon National University, *South Korea*
President | The Korea Association of Secondary English Education, *South
Korea* (http://kasee.org)
Vice President | The Korea Association of Primary English Education, *South
Korea* (http://kapee.or.kr)
Director | Institute for Corpus Research, Incheon National University, *South
Korea* (http://icr.or.kr)
Editor-in-Chief | Asia Pacific Journal of Corpus Research, ICR,
*International* (http://icr.or.kr/apjcr)
Editorial Board | Corpora, Edinburgh University Press, *UK*
Editorial Board | English Today, Cambridge University Press, *UK*
E: ckjung(a)inu.ac.kr / T: +82 (0)32 835 8129
H(EN): http://ckjung.org
== 12th NLP4CALL, Tórshavn, Faroe Islands==
The workshop series on Natural Language Processing (NLP) for Computer-Assisted Language Learning (NLP4CALL) is a meeting place for researchers working on the integration of Natural Language Processing and Speech Technologies in CALL systems and exploring the theoretical and methodological issues arising in this connection. The latter include, among others, drawing on insights from Second Language Acquisition (SLA) research, on the one hand, and promoting the development of “Computational SLA” by setting up Second Language research infrastructure(s), on the other.
The intersection of Natural Language Processing (or Language Technology / Computational Linguistics) and Speech Technology with Computer-Assisted Language Learning (CALL) brings “understanding” of language to CALL tools, thus making CALL intelligent. This has given the area of research its name: Intelligent CALL (ICALL). As the definition suggests, apart from having excellent knowledge of Natural Language Processing and/or Speech Technology, ICALL researchers need good insights into second language acquisition theories and practices, as well as knowledge of second language pedagogy and didactics. This workshop therefore invites a wide range of ICALL-relevant research, including studies where NLP-enriched tools are used for testing SLA and pedagogical theories and, vice versa, studies where SLA theories, pedagogical practices or empirical data are modeled in ICALL tools.
The NLP4CALL workshop series is aimed at bringing together competences from these areas for sharing experiences and brainstorming around the future of the field.
We welcome papers:
- that describe research directly aimed at ICALL;
- that demonstrate the actual use, or discuss the potential use, of existing Language and Speech Technologies or resources for language learning;
- that describe the ongoing development of resources and tools with potential usage in ICALL, either directly in interactive applications, or indirectly in materials, application or curriculum development, e.g. learning material generation, assessment of learner texts and responses, individualized learning solutions, provision of feedback;
- that discuss challenges and/or research agenda for ICALL;
- that describe empirical studies on language learner data.
This year a special focus is given to work done on error detection/correction and feedback generation.
We encourage paper presentations and software demonstrations describing the above-mentioned themes primarily, but not exclusively, for the Nordic languages.
==Shared task==
NEW for this year is the MultiGED shared task on token-level error detection for L2 Czech, English, German, Italian and Swedish, organized by the Computational SLA working group.
For more information, please see the Shared Task website: https://github.com/spraakbanken/multiged-2023
==Invited speakers==
This year, we are pleased to announce two invited talks.
The first talk is given by Marije Michel from the University of Amsterdam.
The second talk is given by Pierre Lison from the Norwegian Computing Center.
==Submission information==
Authors are invited to submit long papers (8-12 pages) or short papers (4-7 pages); page counts do not include references.
We will be using the NLP4CALL template for the workshop this year. The author kit can be accessed at the links below, or alternatively on Overleaf:
<https://spraakbanken.gu.se/sites/default/files/2023/NLP4CALL%20workshop%20t…>
<https://spraakbanken.gu.se/sites/default/files/2023/nlp4call%20template.doc>
<https://www.overleaf.com/latex/templates/nlp4call-workshop-template/qqqzqqy…>
Submissions will be managed through the electronic conference management system EasyChair <https://easychair.org/conferences/?conf=nlp4call2023>. Papers must be submitted digitally through the conference management system, in PDF format. Final camera-ready versions of accepted papers will be given an additional page to address reviewer comments.
Papers should describe original unpublished work or work-in-progress. Papers will be peer reviewed by at least two members of the program committee in a double-blind fashion. All accepted papers will be collected into a proceedings volume to be submitted for publication in the NEALT Proceedings Series (Linköping Electronic Conference Proceedings) and additionally published through the ACL Anthology, following the practice of previous NLP4CALL editions (<https://www.aclweb.org/anthology/venues/nlp4call/>).
==Important dates==
03 April 2023: paper submission deadline
21 April 2023: notification of acceptance
01 May 2023: camera-ready papers for publication
22 May 2023: workshop date
==Organizers==
David Alfter (1), Elena Volodina (2), Thomas François (3), Arne Jönsson (4), Evelina Rennes (4)
(1) Gothenburg Research Infrastructure for Digital Humanities, Department of Literature, History of Ideas, and Religion, University of Gothenburg, Sweden
(2) Språkbanken, Department of Swedish, Multilingualism, Language Technology, University of Gothenburg, Sweden
(3) CENTAL, Institute for Language and Communication, Université Catholique de Louvain, Belgium
(4) Department of Computer and Information Science, Linköping University, Sweden
==Contact==
For any questions, please contact David Alfter, david.alfter(a)gu.se
For further information, see the workshop website <https://spraakbanken.gu.se/en/research/themes/icall/nlp4call-workshop-serie…>
Follow us on Twitter @NLP4CALL <https://twitter.com/NLP4CALL/>
[Apologies for cross-posting]
Dear colleagues
We are inviting submissions for the next issue of Asia Pacific Journal of
Corpus Research, to appear on 31 December 2023.
*ABOUT*
The Asia Pacific Journal of Corpus Research (APJCR, e-ISSN
2733-8096, DOI: https://doi.org/10.22925/apjcr) is an international and
interdisciplinary peer-reviewed journal intended to explore corpus research
in the Asia Pacific region. APJCR addresses areas of methodological,
applied and theoretical work in the field of corpus research. Examples of
such include discourse analysis, lexical studies, grammatical studies,
language acquisition, language learning, language education, lexicography,
pragmatics, sociolinguistics, (machine) translation studies, (digital)
literary studies, computational linguistics, speech, phonetics, deep
learning and natural language understanding in conjunction with corpora.
*NO ARTICLE PROCESSING CHARGE*
APJCR does not charge authors an Article Processing Fee (APF).
*OPEN ACCESS POLICY*
APJCR provides open access to its content under the
principle in the academic field that making research freely available to
the public supports a greater global exchange of knowledge.
*SUBMISSION*
Papers (in English or Korean) should be sent to *apjcreditor(a)icr.or.kr*
*Full instructions can be found at http://icr.or.kr/apjcr*
*IMPORTANT DATES*
- Manuscript submission: 15 October 2023
- First decision (articles assessed by editors): October 2023
- Final decision: November 2023
- Production: December 2023
- Online publication: 31 December 2023
*APJCR ARCHIVE*
- Google Scholar:
https://scholar.google.co.kr/scholar?hl=ko&as_sdt=0%2C5&q=apjcr&btnG=
- KoreaScience: http://koreascience.or.kr/journal/CPSOBX/v1n1.page
*ENQUIRIES*
help(a)icr.or.kr
Hi there,
Could you please distribute the following job offer? Thanks.
Best,
Pascal
-------------------------------------------------------------------------------------
We invite applications for a 3-year PhD position co-funded by Inria,
the French national research institute in Computer Science and Applied
Mathematics, and LexisNexis France, leader of legal information in
France and subsidiary of the RELX Group.
The overall objective of this project is to develop an automated
system for detecting argumentation structures in French legal
decisions, using recent machine learning-based approaches (i.e. deep
learning approaches). In the general case, these structures take the
form of a directed labeled graph, whose nodes are the elements of the
text (propositions or groups of propositions, not necessarily
contiguous) which serve as components of the argument, and edges are
relations that signal the argumentative connection between them (e.g.,
support, attack). By revealing the argumentation structure behind
legal decisions, such a system will provide a crucial milestone
towards their detailed understanding and their use by legal
professionals and, above all, will contribute to greater transparency
of justice.
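
To make the description of the target structure concrete, here is a minimal sketch of a directed labeled graph over text spans. It is an illustration only, not the project's actual data format: the class names (Component, ArgumentGraph), the character-offset representation, the toy sentences and the relation labels "support" and "attack" are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """An argument component made of one or more text spans.

    Spans are (start, end) character offsets, end exclusive, and do not
    have to be contiguous.
    """
    cid: str
    spans: tuple


@dataclass
class ArgumentGraph:
    """Directed labeled graph: nodes are components, edges are relations."""
    text: str
    components: dict = field(default_factory=dict)  # cid -> Component
    edges: list = field(default_factory=list)       # (source, target, label)

    def add_component(self, cid, spans):
        self.components[cid] = Component(cid, tuple(spans))

    def add_relation(self, source, target, label):
        # Both endpoints must already be registered as components.
        if source not in self.components or target not in self.components:
            raise KeyError("both endpoints must be existing components")
        self.edges.append((source, target, label))

    def component_text(self, cid):
        # Join non-contiguous spans with a visible gap marker.
        spans = self.components[cid].spans
        return " [...] ".join(self.text[s:e] for s, e in spans)

    def incoming(self, target, label=None):
        # Components that point at `target`, optionally filtered by label.
        return [src for src, tgt, lab in self.edges
                if tgt == target and (label is None or lab == label)]


# Hypothetical toy text, invented for illustration.
text = ("The appeal is admissible. It was filed within the deadline. "
        "However, the fee was not paid.")


def span_of(snippet):
    """Locate a snippet in the toy text and return its offsets."""
    start = text.index(snippet)
    return (start, start + len(snippet))


graph = ArgumentGraph(text)
graph.add_component("c1", [span_of("The appeal is admissible.")])
graph.add_component("c2", [span_of("It was filed within the deadline.")])
graph.add_component("c3", [span_of("the fee was not paid.")])
graph.add_relation("c2", "c1", "support")
graph.add_relation("c3", "c1", "attack")
```

In this sketch, graph.incoming("c1", "attack") lists the components that argue against the first sentence, and a component with several spans would cover the non-contiguous case mentioned above.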
The main challenges and milestones of this project start with the
creation and release of a large-scale dataset of French legal
decisions annotated with argumentation structures. To minimize the
manual annotation effort, we will resort to semi-supervised and
transfer learning techniques to leverage existing argument mining
corpora, such as the European Court of Human Rights (ECHR) corpus, as
well as annotations already started by LexisNexis. Another promising
research direction, which is likely to improve over state-of-the-art
approaches, is to better model the dependencies between the different
sub-tasks (argument span detection, argument typing, etc.) instead of
learning these tasks independently. A third research avenue is to find
innovative ways to inject the domain knowledge (in particular the rich
legal ontology developed by LexisNexis) to enrich the
representations used in these models. Finally, we would like to take
advantage of other discourse structures, such as coreference and
rhetorical relations, conceived as auxiliary tasks in a multi-task
architecture.
The successful candidate holds a Master's degree in computational
linguistics, natural language processing, or machine learning, ideally
with prior experience in legal document processing and discourse
processing. Furthermore, the candidate should have strong programming
skills and expertise in machine learning approaches, and be eager to work
at the interface between academia and industry.
The position is affiliated with MAGNET [1], a research group at
Inria, Lille, which has expertise in Machine Learning and Natural
Language Processing, in particular Discourse Processing. The PhD
student will also work in close collaboration with the R&D team at
LexisNexis France [2], who will provide their expertise in the legal
domain and the data they have collected.
Applications will be considered until the position is filled. However,
you are encouraged to apply early as we shall start processing the
applications as and when they are received. Applications, written in
English or French, should include a brief cover letter with research
interests and vision, a CV (including your contact address, work
experience, publications), and contact information for at least 2
referees. Applications (and questions) should be sent to Pascal Denis
(pascal.denis(a)inria.fr).
The starting date of the position is 1 November 2022 or soon
thereafter, for a total of 3 full years.
Best regards,
Pascal Denis
[1] https://team.inria.fr/magnet/
[2] https://www.lexisnexis.fr/
--
Pascal
----
For independent, transparent and rigorous evaluation!
I support the Inria Evaluation Committee (Commission d'Évaluation de l'Inria).
----
+++++++++++++++++++++++++++++++++++++++++++++++
Pascal Denis
Equipe MAGNET, INRIA Lille Nord Europe
Bâtiment B, Avenue Heloïse
Parc scientifique de la Haute Borne
59650 Villeneuve d'Ascq
Tel: +33 3 59 35 87 24
Url: http://researchers.lille.inria.fr/~pdenis/
+++++++++++++++++++++++++++++++++++++++++++++++
Dear colleagues,
Last month, we shared the result of our collaborative work on a core metadata scheme for learner corpora with LCR2022 participants. Our proposal builds on Granger and Paquot's (2017) first attempt to design such a scheme. During our presentation, we explained the rationale for expanding on the initial proposal and discussed selected aspects of the revised scheme.
Our proposal is available at https://docs.google.com/spreadsheets/d/1-RbX5iUCUtCBkZU9Rfk-kv-Vzc--F-eUW2O…
We firmly believe that our efforts to develop a core metadata scheme for learner corpora will only be successful to the extent that (1) the LCR community is given the opportunity to engage with our work in various ways (provide feedback on the general structure of the scheme, the list of variables that we identified as core and their operationalization; test the metadata on other learner corpora; use the scheme to start a new corpus compilation, etc.) and (2) the core metadata scheme is the result of truly collaborative work.
As mentioned at LCR2022, we will be collecting feedback on the metadata scheme until the end of October. The online feedback form is available at:
https://docs.google.com/document/d/1NeDUuxGJlPSJI9wHVA1xgGM-aV8jXTa8Qlb45K-…
We'd like to thank all the colleagues who have already got back to us (at LCR2022, by email or via the online form), and we thank them for their appreciation of and enthusiasm for our work! We'd also like to encourage more colleagues (particularly those of you who have experience in learner corpus compilation) to provide feedback. We need help in finalizing the core metadata scheme to make sure that it can be applied in all learner corpus compilation contexts. In short, we need you to make sure the scheme meets the needs of the LCR community at large.
With very best wishes,
Magali Paquot (also on behalf of Alexander König, Jennifer-Carmen Frey, and Egon W. Stemle)
Reference
Granger, S. & M. Paquot (2017). Towards standardization of metadata for L2 corpora. Invited talk at the CLARIN workshop on Interoperability of Second Language Resources and Tools, 6-8 December 2017, University of Gothenburg, Sweden.
Dr. Magali Paquot
Centre for English Corpus Linguistics
Institut Langage et Communication
UCLouvain
https://perso.uclouvain.be/magali.paquot/
(Apologies for cross-posting)
CFP: SYMPTEMIST Shared Task (BioCreative VIII run with AMIA 2023)
Named entity recognition and linking of symptoms, signs & findings (incl.
multilingual dataset)
https://temu.bsc.es/SYMPTEMIST/
The SYMPTEMIST track focuses on the automatic detection of mentions of
clinical symptoms (NER) and mapping to concept identifiers in clinical case
reports in Spanish (entity linking). A multilingual version of the
dataset will also be released, including versions in English, French, Italian,
Dutch, Portuguese, Romanian and Swedish.
Key information:
- Web: https://temu.bsc.es/symptemist
- Data: https://zenodo.org/record/8223654
- Annotation guidelines: https://zenodo.org/record/8246440
- BioCreative web: https://biocreative.bioinformatics.udel.edu
- Registration form (Track 2 - SYMPTEMIST):
https://docs.google.com/forms/d/e/1FAIpQLScoSNulOoxRju3c8v9Q-CSv-w5jJcXu93G…
Motivation
Systems able to detect and normalize clinical symptom mentions from medical
texts are crucial for almost any healthcare data mining, AI, medical
analytics or predictive application. As opposed to other clinical
information types, such as diagnoses (diseases/procedures), lab test
results or even medications, clinical symptoms can only be recovered
directly from written clinical narratives. Due to the high complexity,
variability and difficulty in generating annotated corpora for clinical
symptoms, only a few large manually annotated data collections have been
constructed so far, with certain underlying limitations in terms of a)
entity linking / normalization of the symptom mentions to controlled
vocabularies, b) a lack of attempts to promote the development of
multilingual solutions and c) the provision of detailed annotation criteria
and guidelines. To address these issues, we are proposing the SYMPTEMIST
track at the upcoming BioCreative VIII initiative, which will be run in the
context of the prestigious AMIA 2023 conference, which received over 1400
submissions this year.
Automatic detection of symptom mentions is key for a range of clinical
use cases and real-world applications, such as:
- Predictive modeling of diseases
- Differential diagnosis of complex diseases
- Rare disease characterization & analysis
- Selection of appropriate treatment & therapy
- Study of disease-symptom associations
- Early detection of disease outbreaks & epidemiological surveillance
- Extraction of phenotypes
- Drug repurposing & off-label indications
The SYMPTEMIST organizers will also release multilingual resources to
foster the development of multilingual tools and generate systems not only
for Spanish but also for content in English and Romance languages (French,
Portuguese, Italian, Romanian and Catalan) as well as versions in Dutch,
Swedish and Czech.
Inspired by previous initiatives (e.g. n2c2, CLEF or TREC) and shared tasks
(CANTEMIST, PharmaCoNER, or CodiEsp), we are launching the SYMPTEMIST
shared task as part of the BioCreative 2023 evaluation initiative, with the
following three sub-tracks:
- SYMPTEMIST-entities: automatic detection of mentions of symptoms.
- SYMPTEMIST-linking: finding mentions of symptoms and normalizing them to
their SNOMED CT concept identifiers.
- SYMPTEMIST-multilingual: automatic detection of mentions of symptoms in
versions of the corpus generated in English, French, Italian, Portuguese,
Romanian, Catalan, Dutch, Swedish and Czech.
Tentative schedule
- Annotation Guidelines: August 8th, 2023
- Train Set Subtask 1 (NER): August 8th, 2023
- Train Set Subtask 2 (Linking): September 10th, 2023
- Train Set Subtask 3 (Multilingual): September 10th, 2023
- SympTEMIST Test Set: September 30th, 2023
- Participants Test Predictions Deadline: October 5th, 2023
- Participants Evaluation Results Release: October 10th, 2023
- Submission of Participant Papers Deadline: October 22nd, 2023
- Notification of Acceptance Participant Papers: October 30th, 2023
- Submission of Camera-ready Participant Papers Deadline: November 1st, 2023
- BioCreative VIII workshop @ AMIA 2023: November 11-15, 2023, in New
Orleans, LA.
BioCreative proceedings and AMIA workshop
Teams participating in SYMPTEMIST will be invited to contribute a systems
description paper for the BioCreative 2023 Working Notes proceedings and a
flash presentation of their approach at the BioCreative 2023 session. The
BioCreative VIII workshop will run with AMIA 2023, November 11-15, 2023, in
New Orleans, LA. See:
https://amia.org/education-events/amia-2023-annual-symposium
Workshop Proceedings and Special Issue:
The BioCreative VIII Proceedings will host all the submissions from
participating teams, and it will be freely available by the time of the
workshop. In addition, we are happy to announce that the journal Database
will host the BioCreative VIII special issue for work that has passed their
peer-review process. Invitations to submit will be sent after the workshop.
All BioCreative VIII tracks
Track 1: BioRED (Biomedical Relation Extraction Dataset)
*Track 2: SYMPTEMIST (Symptom TExt Mining Shared Task)*
Track 3: Genetic Phenotype Extraction and Normalization from Dysmorphology
Physical Examination Entries
Track 4: Clinical Annotation Tool Track
Main Organizers
- Martin Krallinger, Barcelona Supercomputing Center, Spain
- Eulàlia Farré-Maduell, Barcelona Supercomputing Center, Spain
- Luis Gascó, Barcelona Supercomputing Center, Spain
- Salvador Lima, Barcelona Supercomputing Center, Spain
- Jan Rodriguez, Barcelona Supercomputing Center, Spain
=======================================
Martin Krallinger, Dr.
Head of NLP for Biomedical Information Analysis Unit
Barcelona Supercomputing Center (BSC-CNS)
https://www.linkedin.com/in/martin-krallinger-85495920/
=======================================
CoCo4MT has extended its deadline for paper submission to July 16th!
The Second Workshop on Corpus Generation and Corpus Augmentation for
Machine Translation (CoCo4MT) @MT-SUMMIT XIX
The 19th Machine Translation Summit
Sep 4-8, 2023, Macau SAR, China
https://sites.google.com/view/coco4mt
SCOPE
It is a well-known fact that machine translation systems, especially
those that use deep learning, require massive amounts of data. Several
resources for languages are not available in their human-created format.
Some of the types of resources available are monolingual corpora,
multilingual corpora, translation memories, and lexicons. Such resources
are generally created for formal purposes (such as parliamentary
collections) when parallel, and for more informal situations when
monolingual. The quality and abundance of resources, including corpora,
used for formal purposes are generally higher than those of resources used
for informal purposes. Additionally, corpora for low-resource languages
(languages with fewer digital resources available) tend to be less
abundant and of lower quality.
CoCo4MT is a workshop centered around research that focuses on manual
and automatic corpus creation, cleansing, and augmentation techniques
specifically for machine translation. We accept work that covers any
language (including sign language) but we are specifically interested in
those submissions that explicitly report on work with languages with
limited existing resources (low-resource languages). Since techniques
from high-resource languages are generally statistical in nature and
could be used as generic solutions for any language, we welcome
submissions on high-resource languages also.
CoCo4MT aims to encourage research on new and undiscovered techniques.
We hope that the methods presented at this workshop will lead to the
development of high-quality corpora that will in turn lead to
high-performing MT systems and new dataset creation for multiple
corpora. We hope that submissions will provide high-quality corpora that
are available publicly for download and can be used to increase machine
translation performance, thus encouraging new dataset creation for
multiple languages that will, in turn, provide a general workshop to
consult for corpora needs in the future. The workshop’s success will be
measured by the following key performance indicators:
- Promotes the ongoing increase in quality of machine translation
systems when measured by standard measurements,
- Provides a meeting place for collaboration from several research areas
to increase the availability of commonly used corpora and new corpora,
- Drives innovation to address the need for higher quality and abundance
of low-resource language data.
Topics of interest include:
- Difficulties with using existing corpora (e.g., political
considerations or domain limitations) and their effects on final MT
systems,
- Strategies for collecting new MT datasets (e.g., via crowdsourcing),
- Data augmentation techniques,
- Data cleansing and denoising techniques,
- Quality control strategies for MT data,
- Exploration of datasets for pretraining or auxiliary tasks for
training MT systems.
SHARED TASK
To encourage research on corpus construction for low-resource machine
translation, we introduce a shared task focused on identifying
high-quality instances that should be translated into a target
low-resource language. Participants are provided access to multi-way
corpora in the high-resource languages of English, Spanish, German,
Korean, and Indonesian, and using these, are required to identify
beneficial instances that, when translated into the low-resource
languages of Cebuano, Gujarati, and Burmese, lead to high-performing MT
systems. More details on data, evaluation and submission can be found on
the website (https://sites.google.com/view/coco4mt/shared-task) or by
emailing coco4mt-shared-task(a)googlegroups.com.
SUBMISSION INFORMATION
CoCo4MT will accept research, review, or position papers. The length of
each paper should be at least four (4) and not exceed ten (10) pages,
plus unlimited pages for references. Submissions should be formatted
according to the official MT Summit 2023 style templates
(https://www.overleaf.com/latex/templates/mt-summit-2023-template/knrrcnxhkq…).
Accepted papers will be published in the MT Summit 2023 proceedings
which are included in the ACL Anthology and will be presented at the
conference either orally or as a poster.
Submissions must be anonymized and should be made to the workshop using
the Softconf conference management system
(https://softconf.com/mtsummit2023/CoCo4MT). Scientific papers that have
been or will be submitted to other venues must be declared as such, and
must be withdrawn from the other venues if accepted and published at
CoCo4MT. The review will be double-blind.
We would like to encourage authors to cite papers written in ANY
language that are related to the topics, as long as both original
bibliographic items and their corresponding English translations are
provided.
Registration will be handled by the main conference. (To be announced)
IMPORTANT DATES
May 18, 2023 - Call for papers released
May 19, 2023 - Shared task release of train, dev and test data
May 25, 2023 - Shared task release of baselines
June 5, 2023 - Second call for papers
June 20, 2023 - Third and final call for papers
July 16, 2023 - Paper submissions due
July 16, 2023 - Shared task deadline to submit results
July 27, 2023 - Notification of acceptance
July 27, 2023 - Shared task system description papers due
August 03, 2023 - Camera-ready due
September 4-5, 2023 - CoCo4MT workshop
CONTACT
CoCo4MT Workshop Organizers:
coco4mt-2023-organizers(a)googlegroups.com
CoCo4MT Shared Task Organizers:
coco4mt-shared-task(a)googlegroups.com
ORGANIZING COMMITTEE (listed alphabetically)
Ananya Ganesh University of Colorado Boulder
Constantine Lignos Brandeis University
John E. Ortega Northeastern University
Jonne Sälevä Brandeis University
Katharina Kann University of Colorado Boulder
Marine Carpuat University of Maryland
Rodolfo Zevallos Universitat Pompeu Fabra
Shabnam Tafreshi University of Maryland
William Chen Carnegie Mellon University
PROGRAM COMMITTEE (listed alphabetically, tentative)
Abteen Ebrahimi University of Colorado Boulder
Adelani David Saarland University
Ananya Ganesh University of Colorado Boulder
Alberto Poncelas ADAPT Centre at Dublin City University
Anna Currey Amazon
Amirhossein Tebbifakhr University of Trento
Atul Kr. Ojha National University of Ireland Galway
Ayush Singh Northeastern University
Barry Haddow University of Edinburgh
Bharathi Raja Chakravarthi National University of Ireland Galway
Beatrice Savoldi University of Trento
Bogdan Babych Heidelberg University
Briakou Eleftheria University of Maryland
Constantine Lignos Brandeis University
Dossou Bonaventure Mila Quebec AI Institute
Duygu Ataman New York University
Eleftheria Briakou University of Maryland
Eleni Metheniti Université Toulouse - Paul Sabatier
Jasper Kyle Catapang University of Birmingham
John E. Ortega Northeastern University
Jonne Sälevä Brandeis University
Kalika Bali Microsoft
Katharina Kann University of Colorado Boulder
Kochiro Watanabe The University of Tokyo
Koel Dutta Chowdhury Saarland University
Liangyou Li Huawei
Manuel Mager University of Stuttgart
Maria Art Antonette Clariño University of the Philippines Los Baños
Marine Carpuat University of Maryland
Mathias Müller University of Zurich
Nathaniel Oco De La Salle University
Niu Xing Amazon
Patrick Simianer Lilt
Rico Sennrich University of Zurich
Rodolfo Zevallos Universitat Pompeu Fabra
Sangjee Dondrub Qinghai Normal University
Santanu Pal Saarland University
Sardana Ivanova University of Helsinki
Shantipriya Parida Silo AI
Shiran Dudy Northeastern University
Surafel Melaku Lakew Amazon
Tommi A Pirinen University of Tromsø
Valentin Malykh Moscow Institute of Physics and Technology
Xing Niu Amazon
Xu Weijia University of Maryland
2nd Call for Abstracts: 1st Workshop on Readability for Low Resourced Languages (RLRL 2023)
Free registration is now open at https://bit.ly/3pwUwlG (a few tickets are still available).
Please join us for an exciting online workshop where experts in natural language processing will come together to discuss the latest research and innovative approaches to assessing the readability of low-resource languages. The workshop will take place as a free online event on September 5, 2023, and is being hosted jointly by Lancaster University, Sheffield Hallam University and King Saud University.
We welcome researchers and practitioners to submit presentation abstract proposals of up to 500 words for talks related to the development of a Readability Framework for low-resource languages.
The ultimate goal of the workshop is to discuss best practices and state-of-the-art AI-based approaches to create mathematical representations of expected readability levels at different school grade or cognitive ability levels. The workshop will also focus on utilising classifiers that are intuitive for humans to understand and adjust, enabling the analysis and improvement of the decision-making criteria. We welcome abstracts on work that is still in progress or that does not yet have conclusive results. We encourage authors to share their work at various stages of development to facilitate discussions and collaboration during the workshop.
Important Dates:
- Due date for workshop abstract submission: August 1, 2023 (extended)
- Notification of abstract acceptance to authors: August 10, 2023
- Workshop date: September 5, 2023 (online event: https://bit.ly/3pwUwlG)
Keynote speakers:
- Professor Laurence Anthony - Faculty of Science and Engineering at Waseda University, Japan.
- Dr Violetta Cavalli-Sforza - School of Science and Engineering at Al Akhawayn University, Morocco.
- Professor Hend Al-Khalifa - College of Computer and Information Sciences at King Saud University, KSA
- Dr Abdel-Karim Al Tamimi - Computer Science and Software Engineering at Sheffield Hallam University, UK
- Dr Mo El-Haj - School of Computing and Communications at Lancaster University, UK
For the list of speakers, talk titles and abstracts, please visit the workshop's website:
https://wp.lancs.ac.uk/acc/rlrl2023/
The main objectives of the workshop are three-fold:
1- Increase awareness of the importance of readability in low-resource languages and its impact on language learning and literacy.
2- Discuss the challenges of readability in low-resource languages, such as limited resources and lack of standardization, and brainstorm strategies for addressing these challenges.
3- Foster a community of practice among participants, allowing them to share their experiences and best practices for addressing readability issues in low-resource languages.
Abstract submission:
The abstract submission page is now open. Please submit abstracts of no more than 500 words at https://easychair.org/conferences/?conf=rlrl2023
Alternatively, you can contact the organisers directly with presentation ideas on topics related to readability or low resourced languages.
Topics of interest include, but are not limited to:
- Machine learning for text readability
- Applications of readability assessment
- Readability in low-resource languages
- Comprehensibility measures
- Mathematical representations of readability levels
- Text simplification for low-resource languages
- Readability and comprehensibility in language learning
- The effects of text simplification on readability
- Readability frameworks for indigenous languages
- Updating readability representations
We look forward to your contributions and to a productive and enlightening workshop on September 5, 2023.
RLRL 2023 Organisers:
- Dr Mo El-Haj (SCC/DSI/UCREL, Lancaster University)
- Dr Abdel-Karim Al Tamimi (CSSE, Sheffield Hallam University)
- Prof. Hend Al Khalifa (iWAN, King Saud University)
https://wp.lancs.ac.uk/acc/rlrl2023/
Best wishes,
Mahmoud
---------------------
Dr Mo El-Haj
Senior Lecturer in NLP
Co-Director of UCREL NLP Group
Strategic Lead of Arabic and Financial NLP Research
Advisory Board of the Natural Language Processing Journal
https://benjamins.com/catalog/nlp
School of Computing and Communications, Lancaster University
https://www.lancaster.ac.uk/staff/elhaj
@DocElhaj (https://twitter.com/DocElhaj)
*The NEW submission DEADLINE is: 09 September 2023*
6th International Conference on Natural Language and Speech Processing
<http://icnlsp.org/2023welcome>
We are delighted to invite you to ICNLSP 2023, which will be held virtually
from December 16th to 17th, 2023.
ICNLSP 2023 offers attendees (researchers, academics, students, and
industry practitioners) the opportunity to share their ideas, connect with
each other, and stay up to date on ongoing research in the field.
ICNLSP 2023 aims to attract contributions related to natural language and
speech processing. Authors are invited to present their work relevant to
the topics of the conference.
Topics of ICNLSP 2023 include, but are not limited to:
Signal processing, acoustic modeling.
Architecture of speech recognition system.
Deep learning for speech recognition.
Analysis of speech.
Paralinguistics in Speech and Language.
Pathological speech and language.
Speech coding.
Speech comprehension.
Summarization.
Speech Translation.
Speech synthesis.
Speaker and language identification.
Phonetics, phonology and prosody.
Cognition and natural language processing.
Text categorization.
Sentiment analysis and opinion mining.
Computational Social Web.
Arabic dialects processing.
Under-resourced languages: tools and corpora.
New language models.
Arabic OCR.
Lexical semantics and knowledge representation.
Requirements engineering and NLP.
NLP tools for software requirements and engineering.
Knowledge fundamentals.
Knowledge management systems.
Information extraction.
Data mining and information retrieval.
Machine translation.
NLP for Arabic heritage documents.
*IMPORTANT DATES*
Submission deadline: *31 August 2023*
Notification of acceptance: *31 October 2023*
Camera-ready paper due: *20 November 2023*
Conference dates: *16, 17 December 2023*
*PUBLICATION*
1- All accepted papers will be published in ACL Anthology
(https://aclanthology.org/venues/icnlsp/).
2- Selected papers will be published in Signals and Communication
Technology (Springer) (https://www.springer.com/series/4748), indexed by
Scopus and zbMATH.
*KEYNOTE SPEAKERS*
Alex Waibel, Carnegie Mellon University, USA
Najim Dehak, Johns Hopkins University, USA
For more details, visit the conference website:
https://www.icnlsp.org/2023welcome
*CONTACT*
icnlsp(at)gmail(dot)com
Best regards,
Mourad Abbas
All (with apologies for cross-posting)
The U.S. Copyright Office is conducting a study regarding the copyright
issues raised by generative artificial intelligence (AI). This study will
collect factual information and policy views relevant to copyright law and
policy. The Office will use this information to analyze the current state
of the law, identify unresolved issues, and evaluate potential areas for
congressional action.
Please go here to submit your comments:
https://www.copyright.gov/policy/artificial-intelligence/?fbclid=IwAR33KxMI…
There will be 2 rounds of opportunities to comment:
Initial written comments are due by 11:59 p.m. Eastern time on October 18,
2023.
Reply comments are due by 11:59 p.m. Eastern time on November 15, 2023.
Here is a quick and accessible write-up by The Verge as to why this is
relevant/important:
https://www.theverge.com/2023/8/29/23851126/us-copyright-office-ai-public-c…
Please feel welcome to share this with your networks, as my friend who
works for the copyright office says they want as much feedback as they can
get. I do not think you have to be a US citizen to participate.
Very best wishes
Heather Froehlich
--
Dr Heather Froehlich
w // http://hfroehli.ch
t // @heatherfro