*******************************************************************************************
20th Annual Workshop of the Australasian Language Technology Association (ALTA 2022)
** Flinders University, Adelaide **
14th - 16th December 2022
http://alta2022.alta.asn.au/
*******************************************************************************************
Important Dates
Submission Deadline (short and long papers): 30 September, 2022
Submission Deadline (presentation abstracts): 7 October, 2022
Author Notification: 7 November, 2022
Camera-Ready Deadline: 15 November, 2022
Tutorials: 14 December, 2022
Main Conference: 15-16 December, 2022
Submission deadlines are UTC-11
Overview
The 20th Annual Workshop of the Australasian Language Technology Association will be held in a hybrid format at Flinders University, Adelaide, from the 14th to the 16th of December 2022.
The hybrid format gives participants a valuable opportunity to socialise either in person or via an online platform.
The ALTA 2022 workshop is the key local forum for socialising research results in natural language processing and computational linguistics, with presentations and posters from students, industry, and academic researchers. As in previous years, we would also like to encourage submissions and participation from industry and government researchers and developers.
Note that ALTA is listed in the recently updated CORE 2021 Conference Rankings as Australasian B. See the CORE Rankings Portal for details.
Topics
ALTA invites the submission of papers and presentations on all aspects of natural language processing, including, but not limited to:
phonology, morphology, syntax, semantics, pragmatics, and discourse
speech recognition, understanding and generation
interpreting spoken and written language
natural language generation
linguistic, mathematical, and psychological models of language
NLP-based information extraction and retrieval
corpus-based and statistical language modelling
machine translation and translation aids
question answering and information extraction
natural language interfaces and dialogue systems
natural language and multimodal systems
message and narrative understanding systems
evaluations of language systems
embodied conversational agents
computational lexicography
summarisation
language resources
topic modelling, semantics and ontology
unsupervised language learning and analysis
social media analysis and processing
search and information retrieval
domain-specific adaptation of natural language processing algorithms
applied natural language processing and/or applications in industry
We particularly encourage submissions that broaden the scope of our community through the consideration of practical applications of language technology and through multi-disciplinary research. We also specifically encourage submissions from industry.
Format
We invite submissions of two different formats: (1) Original Research Papers and (2) Abstract-based Presentations.
(1) Original Research Papers
We invite the submission of papers on original and unpublished research on all aspects of natural language processing.
Long papers should be 6-8 pages and short papers should be 3-4 pages. Accepted papers will be delivered either as an oral presentation or as a poster presentation. Both short and long papers may include unlimited pages of references in addition to the page count requirements.
Note that the review process is double-blind. Accordingly, submitted papers should not include the identity of the author(s), and the text should be suitably anonymised, e.g. by using third-person wording for self-citations, not providing URLs to your personal website, etc. Original research papers will be included in the workshop proceedings, which will be published online in the ACL Anthology and on the ALTA website. Long papers will be distinguished from short papers in the proceedings.
(2) Abstract-based Presentations
To encourage broader participation and facilitate local socialisation of international results, we invite 1-2 page presentation abstracts. The organisers may offer the opportunity to give an oral presentation or a poster presentation. Submissions should include the presentation title and abstract, the name of the presenter, any publications relating to the work, and any information on collaboration with the local ALTA community. Abstracts will not be published in the proceedings, but simply reviewed by the ALTA executive committee to ensure that they are on topic, coherent and likely to be of interest to the ALTA community. Abstracts on work in progress and work published or submitted elsewhere are encouraged. ALTA invites submissions of all manner of interesting research, including, but not limited to:
established academics giving an overview of an exciting paper or papers published in international venues;
completing research students giving an overview of their thesis work;
early candidature research students presenting their work-in-progress and ideas, which may not have been published; and
industry presenting research and development over linguistic data in the context of their business.
Presentation abstracts should not be anonymised, any publications relating to the work should be cited in the submission, and the person who will give the presentation should be clearly stated.
Multiple Submission Policy
Original research papers that are under review for other publication venues or that you intend to submit elsewhere may be submitted in parallel to ALTA. We require that you declare at submission that your paper is submitted to another venue, and identify the venue. Should your paper be accepted to both ALTA and another venue, we allow you to decide whether the paper should be published in the ALTA proceedings, or if it should be treated as a Presentation (without archival publication). In this case you would still be able to present a research talk at the ALTA workshop. This is to encourage more internationally leading research to be presented at the workshop.
Instructions for Authors
Paper Submission
Authors should submit their papers via EasyChair.
There are 3 tracks in EasyChair this year:
ALTA 2022 (Long) – use this for long papers
ALTA 2022 (Short) – use this for short papers
ALTA 2022 (Abstracts) – use this for abstracts
Formatting Guidelines
Submissions must follow the two-column ACL format. We therefore strongly recommend that you use the LaTeX style files or the Microsoft Word template.
Paper Length
Long papers should be 6-8 pages
Short papers should be 3-4 pages
Abstracts ideally should be a few paragraphs and no more than 2 pages
Anonymisation
Short and long papers must be anonymised.
Abstracts are NOT to be anonymised and must include the authors' affiliations.
The Natural Language Processing Chair at JMU Würzburg (WüNLP) as a member
of the Center for AI and Data Science (CAIDAS:
https://www.uni-wuerzburg.de/caidas) offers one research position in the
area of natural language processing (NLP).
The position is bound to a European project EUINACTION (
https://www.euinaction.eu/), carried out in collaboration with two
Political Science research groups, at the University of Leiden and the
University of Strathclyde. The candidate's role in the project will be to
apply and advance state-of-the-art neural NLP to help answer interesting
research questions in Political Science.
The position is available from 1.10.2022 until the end of
the grant, 31.12.2023. There is a possibility of extension, subject to
mutual interest and availability of third-party research funding. Payment
is at the level of E13 according to the German federal wage agreement
scheme (TV-L). Candidates are expected to have a strong background in
computer science, with a specialisation in machine learning or natural
language processing and interest in the topic of the project. Both
applicants with a completed Master's (or equivalent) degree and those
with a completed PhD (postdocs) are welcome to apply.
Please send your application (letter of motivation, curriculum vitae,
academic records) at your earliest convenience, but no later than 31.7.2022
to Prof. Dr. Goran Glavaš (goran.glavas(a)uni-wuerzburg.de). You are welcome
to contact Prof. Glavaš (via the same email address) for additional
information.
WüNLP is a young research group that takes diversity very seriously. Female
and diversity candidates as well as international candidates are warmly
encouraged to apply. Among candidates of equal aptitude and qualifications,
a person with disabilities will be given preference.
(apologies for cross-postings)
<https://sites.google.com/view/crac2022/> CRAC 2022, the 5th Workshop on
Computational Models of Reference, Anaphora and Coreference, held at COLING
2022 <https://coling2022.org/> on October 16-17, in Gyeongju, Republic of
Korea (in hybrid mode) has a new submission deadline: July 30, 2022.
You are welcome to submit a paper:
* on any topic related to anaphora, reference, coreference
* in several categories (research paper, survey paper, position
paper, challenge paper, demo paper, extended abstract)
Please find all other important information on the CRAC 2022 website
<https://sites.google.com/view/crac2022/> .
See you at CRAC 2022!
Maciej Ogrodniczuk
(on behalf of all organizers: Vincent Ng, Sameer Pradhan, Anna Nedoluzhko
and Massimo Poesio)
Dear all,
As organisers of the BEA 2019 Shared Task on Grammatical Error Correction,
we wish to announce that the Open Phase Codalab evaluation platform for the
shared task has now moved to a new server:
https://codalab.lisn.upsaclay.fr/competitions/4057
This change was necessary because Codalab are phasing out their old
servers, and so the original evaluation platform will no longer accept new
submissions. We were unfortunately unable to migrate past submissions, but
Detailed Results are now accessible again and server stability should be
improved.
We have been really pleased to see how our competition has helped GEC take
off in the past few years, and hope that this update allows people to
continue benchmarking their systems against the BEA 2019 test set for years
to come.
Do get in touch if you encounter any issues.
Thank you,
The BEA 2019 Shared Task organisers
[Apologies if you receive multiple copies of this CfP]
Special Issue on Trends in Social Media Analysis to Address Fake News, Hate Speech, or Bias
==========================================================
Springer Datenbank-Spektrum https://www.springer.com/13222
==========================================================
Social media has many benefits: from staying in contact with close and not-so-close friends, through exercising the right to voice one's opinion, to communicating with many like-minded people all over the world and providing an additional channel for information exchange. Unfortunately, social media has also been abused and misused ever since its inception. Hate speech is prevalent on many sites, alienating trusting users and hindering fruitful discussions. Fake news is distributed through social media platforms with dangerous effects. But even without malicious intention, social media can be misleading due to various biases in the system.
Topics of Interest
==================
In this special issue of Datenbank-Spektrum, we will explore and present current trends in the field of automatically detecting and managing hate speech, fake news, bias and other toxic content in the context of social media.
We welcome original contributions including technical papers, application-oriented papers, case studies, survey papers and position papers. Topics of interest include, but are not limited to:
- Automatic detection of hate speech
- Methods to improve online discussions
- Trust and reputation of social media actors
- Identification of fake news
- Countermeasures to fight fake news
- Detection and/or mitigation of bias
- Dealing with bias in training data
- Content analysis and NLP
- Opinion mining and sentiment analysis on social media
- Information extraction and retrieval on social media
- Information diffusion within social networks
- Ethical and legal aspects
Submission Guidelines
=====================
Paper format: 8-10 pages, double-column (cf. author guidelines at https://www.springer.com/13222). We welcome contributions in both German and English through the Springer submission system https://www.editorialmanager.com/dasp/
Deadline for submissions: Oct. 1st, 2022
Publication of special issue: DASP-1-2023 (March 2023)
Guest editors
=============
Feel free to contact the guest editors in case you have questions.
Ralf Krestel, ZBW & CAU Kiel, r.krestel(a)zbw.eu
Udo Kruschwitz, Universität Regensburg, udo.kruschwitz(a)ur.de
Michael Wiegand, Universität Klagenfurt, michael.wiegand(a)aau.at
Dear all,
I'm looking for a post-doc to work with me & my students at Charles
University in Prague on semantic formalisms for text planning and
language generation, as part of the ERC Starting Grant project
Next-generation Natural Language Generation (NG-NLG).
The position is for 36 months (negotiable), job location is Prague, Czechia.
Applications by 31 July are preferred, but the position is open until filled.
Position details, requirements and application instructions can be found here:
https://ufal.mff.cuni.cz/ng-nlg/postdoc
If you're interested in language generation, semantics, deep learning,
and want to address the limitations of current state-of-the-art
models, please apply! If you know someone who might be interested,
please forward them this message. If you need to ask about any details
before applying or forwarding, do not hesitate to contact me.
Best regards (& apologies for cross-posting),
Ondrej Dusek
--
https://tuetschek.github.io
Call for Tutorials: Search Solutions 2022
Search Solutions is the BCS Information Retrieval Specialist Group's annual event focused on practitioner issues in the arena of search and information retrieval. Search Solutions consists of two parts: a tutorial day and a conference day. We invite tutorial proposals which focus on any area of the practical application of search technologies to real world problems, for the tutorial day due to take place on 22nd Nov 2022 before the conference day on 23rd Nov 2022. Tutorials in previous years have included: designing usability for search, multimedia information retrieval, evaluation, pattern search, city search in SmartCities, text analysis, introduction to natural language processing and introduction to reinforcement learning, etc. The details of the previous tutorials can be found here: https://www.bcs.org/membership-and-registrations/member-communities/informa…
Tutorials
Proposals for both full day (5-6 hours including breaks and lunch) and half day (2-3 hours including breaks) tutorials are invited. The tutorials will take place on Tuesday 22nd November 2022 at the BCS offices in London and/or online depending on the situation near the time. We encourage in person tutorials at the BCS offices if possible.
Proposal submission
Tutorial proposals should be submitted to the tutorial chair (h.liu(a)soton.ac.uk) by midnight Friday 22nd July 2022, using the following template:
* Name of presenter(s): please list the names and affiliations of presenter(s).
* Title: title of the tutorial.
* Contact details: email and snail mail address, phone numbers etc.
* Type of tutorial: half day or full day.
* Delivery format: online only, in person only, or either.
* Tutorial Abstract: for publicity.
* Target audience: please outline the practitioner audience to be addressed.
* Learning outcomes: what would the practitioners gain from attending this tutorial?
* Tutorial schedule and description: provide a draft schedule and detailed description of each of the items.
* Tutorial logistics/materials: required media and formats for tutorial. What will be provided to attendees (e.g. slides).
* Bio of presenter(s): including track record of presenting tutorials, lecturing experience etc. (200/300 words)
Selection Procedure
All tutorial proposals will be reviewed by the tutorial chair and approved by the organising committee. The selection criteria will focus on the quality of the tutorial content and its appropriateness to the main theme of Search Solutions.
Honorarium and other issues
Each tutorial will receive an honorarium of around £300. All travel and accommodation expenses must be met by the presenters themselves. The organising committee of Search Solutions reserves the right to cancel tutorials unless a minimum of four participants have registered.
Contact Tutorial Chair:
Dr Haiming Liu, h.liu(a)soton.ac.uk
Dr Haiming Liu, PhD, PgCAP, SFHEA
Associate Professor
Web and Internet Science (WAIS) Research Group
School of Electronics and Computer Science
Faculty of Engineering and Physical Science
University of Southampton
Highfield, Southampton, SO17 1BJ
Email: h.liu(a)soton.ac.uk
In this newsletter:
Fall 2022 LDC Data Scholarship Program
30th Anniversary Highlight: ATIS0 Complete
New publications:
Qatari Corpus of Argumentative Writing<https://catalog.ldc.upenn.edu/LDC2022T04>
Second DIHARD Challenge Evaluation - SEEDLingS<https://catalog.ldc.upenn.edu/LDC2022S07>
________________________________
Fall 2022 LDC Data Scholarship Program
Student applications for the Fall 2022 LDC Data Scholarship program are being accepted now through September 15, 2022. This program provides eligible students with no-cost access to LDC data. Students must complete an application consisting of a data use proposal and letter of support from their advisor. For application requirements and program rules, visit the LDC Data Scholarships page<https://www.ldc.upenn.edu/language-resources/data/data-scholarships>.
30th Anniversary Highlight: ATIS0 Complete
The ATIS corpora were among the first publications that appeared with the launch of LDC's catalog in 1993. ATIS0 Complete (LDC93S4A)<https://catalog.ldc.upenn.edu/LDC93S4A> is comprised of spontaneous speech, read speech, and other material from participants in the ATIS collection that is contained in ATIS0 Pilot (LDC93S4B)<http://catalog.ldc.upenn.edu/LDC93S4B>, ATIS0 Read (LDC93S4B-2)<http://catalog.ldc.upenn.edu/LDC93S4B-2>, and ATIS0 SD-Read (LDC93S4B-3)<http://catalog.ldc.upenn.edu/LDC93S4B-3>.
The ATIS (Air Travel Information Services) collection was developed to support the research and development of speech understanding systems. Participants were presented with various hypothetical travel planning scenarios and asked to solve them by interacting with partially or completely automated ATIS systems. The resulting utterances were recorded and transcribed. Data was collected in the early 1990s at five US sites: Raytheon BBN, Carnegie Mellon University, MIT Laboratory for Computer Science, National Institute for Standards and Technology, and SRI International.
The ATIS collection has been widely used to further research in spoken language understanding and slot filling (Kuo et al., 2020<https://arxiv.org/pdf/2009.14386.pdf>). Other data sets published from the collection include ATIS2 (LDC93S5)<https://catalog.ldc.upenn.edu/LDC93S5>, ATIS3 Training and Test Data (LDC94S19,<https://catalog.ldc.upenn.edu/LDC94S19> LDC95S26<https://catalog.ldc.upenn.edu/LDC95S26>) and, more recently, Multilingual ATIS (LDC2019T04)<https://catalog.ldc.upenn.edu/LDC2019T04> and ATIS - Seven Languages (LDC2021T04)<https://catalog.ldc.upenn.edu/LDC2021T04>.
All ATIS corpora are available for licensing by Consortium members and non-members. Visit Obtaining Data <https://www.ldc.upenn.edu/language-resources/data/obtaining> for more information.
________________________________
New publications:
(1) Qatari Corpus of Argumentative Writing<https://catalog.ldc.upenn.edu/LDC2022T04> was developed by Qatar University<http://www.qu.edu.qa/>, University of Exeter<https://www.exeter.ac.uk/>, and Hamad Bin Khalifa University<https://www.hbku.edu.qa/en>, and is comprised of approximately 200,000 tokens of Arabic and English writing by undergraduate students (159 female, 36 male) along with annotations and related metadata. Students were native Arabic speakers and fluent in English; each student wrote one Arabic and one English essay in response to specific argumentative prompts. They were instructed to include in their essays a clear thesis statement supported by relevant evidence.
The corpus is divided into Arabic and English parts, each of which contains 195 essays. Metadata includes information about the students (gender, major, first language, second language) and information about the essay texts (serial numbers of texts, word limits, genre, date of writing, time spent on writing, place of writing).
Qatari Corpus of Argumentative Writing is distributed via web download.
2022 Subscription Members will automatically receive copies of this corpus. 2022 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.
*
(2) Second DIHARD Challenge Evaluation - SEEDLingS<https://catalog.ldc.upenn.edu/LDC2022S07> was developed by Duke University and LDC and contains approximately two hours of English child language recordings along with corresponding annotations used in support of the Second DIHARD Challenge<https://dihardchallenge.github.io/dihard2>.
Source data is from the SEEDLingS<https://homebank.talkbank.org/access/Password/Bergelson.html> (The Study of Environmental Effects on Developing Linguistic Skills) corpus, designed to investigate how infants' early linguistic and environmental input plays a role in their learning. Recordings were generated in the home environment of infants in the Rochester, New York area. A subset of that data was annotated by LDC for use in the First and Second DIHARD Challenges.
Second DIHARD Challenge Evaluation - SEEDLingS is distributed via web download.
2022 Subscription Members will automatically receive copies of this corpus provided they have submitted a completed copy of the special license agreement. 2022 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options; or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
We invite researchers in the broad area of computational morphology to submit their recent, unpublished work to a special issue of the Journal of Language Modelling <https://jlm.ipipan.waw.pl/index.php/JLM>.
Motivation:
Computational techniques have a long history of use in the study of morphology, where they have been used both for practical tasks such as the analysis and production of complex word forms and for theoretical ones such as structural and informational analysis of morphological systems. As both systems and datasets improve, these techniques are increasingly developed and evaluated on a typologically diverse array of languages, including many which are endangered or lack large-scale resources. Detailed comparisons across languages can help to reveal typological biases or assumptions within existing computational techniques [1, 2]. Alternatively, computational methods and analyses can also shed light on questions within linguistic typology [3, 4, 5, 6].
The goal of this special issue is to bring researchers from multiple communities together in exploring issues of linguistic typology across a wide range of different languages and phenomena. We encourage the submission of work on endangered or less-studied languages.
The Journal of Language Modelling is a free (for readers and authors alike) open-access peer-reviewed journal. All articles are peer-reviewed by at least 3 reviewers, usually including at least one member of the Editorial Board.
Topics of interest:
- Typological clustering or classification of languages
- Investigation of particular linguistic features which improve or detract from the performance of computational morphology tools
- Comparison of morphological structures (e.g., inflection classes, implicative networks) across typologically different languages
- Investigation of diachronic typological change using computational methods
- Creation, curation or analysis of typological databases via computational methods
Submissions:
The submissions should be journal papers, not proceedings papers, totalling 25-50 pages, excluding references.
Authors are advised to use the journal's online manuscript submission system. Make sure to select the special issue when asked to provide the article type. More information, including formatting instructions for authors, can be found on the journal's webpage at:
https://jlm.ipipan.waw.pl/index.php/JLM/about/submissions.
Important dates:
Call for papers issued: 15/7/2022
Submissions due: 15/1/2023
Author notification: Spring 2023
Guest editors:
Sacha Beniamine (University of Surrey)
Micha Elsner (The Ohio State University)
Katharina Kann (University of Colorado, Boulder)
References
[1] Ryan Cotterell, Christo Kirov, John Sylak-Glassman, David Yarowsky, Jason Eisner, and Mans Hulden. 2016. The SIGMORPHON 2016 shared task—Morphological reinflection. In Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 10–22, Berlin, Germany. Association for Computational Linguistics.
[2] Huiming Jin, Liwei Cai, Yihui Peng, Chen Xia, Arya McCarthy, and Katharina Kann. 2020. Unsupervised morphological paradigm completion. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6696–6707, Online. Association for Computational Linguistics.
[3] Neil Rathi, Michael Hahn, and Richard Futrell. 2021. An Information-Theoretic Characterization of Morphological Fusion. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10115–10120, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
[4] Parker, J., Reynolds, R., & Sims, A. (2022). Network Structure and Inflection Class Predictability: Modeling the Emergence of Marginal Detraction. In A. Sims, A. Ussishkin, J. Parker, & S. Wray (Eds.), Morphological Diversity and Linguistic Cognition (pp. 247-281). Cambridge: Cambridge University Press. DOI: 10.1017/9781108807951.010
[5] Guzmán Naranjo, Matías and Becker, Laura. Statistical bias control in typology. Linguistic Typology, to appear, 2021. DOI: 10.1515/lingty-2021-0002
[6] Sacha Beniamine. 2021. One lexeme, many classes: Inflection class systems as lattices. In Berthold Crysmann & Manfred Sailer (eds.), One-to-many relations in morphology, syntax, and semantics, 23–51. Berlin: Language Science Press. DOI: 10.5281/zenodo.4729789