September 2022 - Corpora

[CFP - due Sep 13] EMNLP Workshop on Text Simplification, Accessibility, and Readability
by Xu, Wei 05 Sep '22

Workshop on Text Simplification, Accessibility, and Readability (TSAR) at EMNLP 2022
Website: https://taln.upf.edu/pages/tsar2022-ws

Call for Papers

The web provides an abundance of knowledge and information that can reach large populations. However, the way in which a text is written (vocabulary, syntax, or text organization/structure), or presented, can make it inaccessible for many people, especially for non-native speakers, people with low literacy, and people with cognitive or linguistic impairments. The results of the Adult Literacy Survey (OECD, 2013) indicate that approximately 16.7% of the adult population (averaged over 24 highly-developed countries) requires lexical, 50% syntactic, and 89.4% conceptual simplification of everyday texts (Štajner, 2021).

Research on automatic text simplification (TS), textual accessibility, and readability has the potential to improve the social inclusion of marginalized populations. These related research areas have attracted attention in the past ten years, as evidenced by the growing number of publications in NLP conferences. While only about 300 articles in Google Scholar mentioned TS in 2010, this number increased to about 600 in 2015 and to more than 1000 in 2020 (Štajner, 2021). Recent research in automatic text simplification has mostly focused on methods derived from the deep learning paradigm (Glavaš and Štajner, 2015; Paetzold and Specia, 2016; Nisioi et al., 2017; Zhang and Lapata, 2017; Martin et al., 2020; Maddela et al., 2021; Sheang and Saggion, 2021).
However, many important aspects of automatic text simplification still need the attention of our community: the design of appropriate evaluation metrics, the development of context-aware simplification solutions, the creation of appropriate language resources to support research and evaluation, the deployment of simplification in real environments for real users, the study of discourse factors in text simplification, and the identification of factors affecting the readability of a text. To address these issues, CL/NLP researchers, machine learning and deep learning researchers, UI/UX and accessibility professionals, and representatives of public organizations need to collaborate (Štajner, 2021).

The proposed TSAR workshop builds upon the recent success of several regional workshops that covered a subset of our topics of interest, including the READI Workshops at LREC 2020 and LREC 2022, the SEPLN 2021 Workshop on Current Trends in Text Simplification (CTTS), and the SimpleText workshop at CLEF 2021, as well as the birds-of-a-feather events on Text Simplification at NAACL 2021 (over 50 participants) and ACL 2022. The TSAR workshop aims to foster collaboration among all parties interested in making information more accessible to all people. Through two invited talks, a shared task on lexical simplification, a round table discussion, and oral and poster presentations of novel research, we will discuss recent trends and developments in automatic text simplification, text accessibility, automatic readability assessment, and language resources and evaluation for text simplification.
Topics

We invite contributions on the following topics (among others):
* Lexical simplification;
* Syntactic simplification;
* Modular and end-to-end TS;
* Sequence-to-sequence and zero-shot TS;
* Controllable TS;
* Text complexity assessment;
* Complex word identification and lexical complexity prediction;
* Corpora, lexical resources, and benchmarks for TS;
* Evaluation of TS systems;
* Domain-specific/adaptable TS (e.g. health, legal);
* Other related topics (e.g. empirical and eye-tracking studies);
* Assistive technologies for improving readability and comprehension, including those going beyond text;
* Text simplification in languages other than English;
* Multilingual TS;
* Readability-controlled MT.

Submissions

We welcome two types of papers: long papers and short papers. Submissions should be made to the Softconf submission management system: https://softconf.com/emnlp2022/tsar. The papers should present novel research. The review will be double-blind and thus all submissions should be anonymized.

Format: Paper submissions must use the official EMNLP template, which is available as an Overleaf template and also downloadable directly (LaTeX and Word) (see here: https://2022.emnlp.org/calls/style-and-formatting/). Authors may not modify these style files or use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review.

Long Papers: Long papers must describe substantial, original, completed, and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers may consist of up to eight (8) pages of content, plus unlimited pages of references. Final versions of long papers will be given one additional page of content (up to 9 pages), so that reviewers' comments can be taken into account. Long papers will be presented orally or as posters as determined by the program committee.
The decision as to which papers will be presented orally and which as posters will be based on the nature rather than the quality of the work. There will be no distinction in the proceedings between long papers presented orally and long papers presented as posters.

Short Papers: Short paper submissions must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages. Some kinds of short papers include: a small, focused contribution; a negative result; an opinion piece; an interesting application nugget. Short papers may consist of up to four (4) pages of content, plus unlimited pages of references. Final versions of short papers will be given one additional page of content (up to 5 pages), so that reviewers' comments can be taken into account. Short papers will be presented orally or as posters as determined by the program committee. While short papers will be distinguished from long papers in the proceedings, there will be no distinction in the proceedings between short papers presented orally and short papers presented as posters.

Important Dates
* 13 September 2022 (extended): paper submission deadline
* 2 October 2022: acceptance notification deadline
* 16 October 2022: camera-ready deadline
* 8 December 2022: Workshop at EMNLP

Proceedings

All accepted papers will be included in the workshop proceedings and published in the ACL Anthology. Extended versions of the best papers will be invited for a special issue of Frontiers in Artificial Intelligence focused on applied research for TS and readability assessment in the context of TS.
Organizers
* Sanja Štajner, NLP Researcher, Germany
* Horacio Saggion, Chair in Computer Science and Artificial Intelligence and Head of the LaSTUS Lab in the TALN-DTIC, Universitat Pompeu Fabra
* Wei Xu, Assistant Professor, Georgia Institute of Technology
* Marcos Zampieri, Assistant Professor, Rochester Institute of Technology
* Matthew Shardlow, Senior Lecturer, Manchester Metropolitan University
* Daniel Ferrés, Post-Doctoral Research Assistant, Universitat Pompeu Fabra
* Kai North, PhD student, Rochester Institute of Technology
* Kim Cheng Sheang, PhD student, Universitat Pompeu Fabra

Discours Journal, Issue 32 (deadline 10/15/2022), Publication June 2023
by Nicolas Hernandez 05 Sep '22

Discours Journal, Issue 32, Publication June 2023
Deadline: October 15th 2022
Coordinators: Lydia-Mai Ho-Dac and Nicolas Hernandez

Dear colleagues, we are inviting submissions for the next issue of Discours, to appear in June 2023.

THE DISCOURS JOURNAL

Discours is an international and interdisciplinary peer-reviewed e-journal, which publishes two issues a year in open access. The journal is intended as a forum for exchanging and comparing data, analyses and opinions for all linguists, psycholinguists and computational linguists working in fields involving the description, comprehension, formalization and processing of text organization.

EDITORIAL LINE

It focuses on the following topics (among others): discourse structure and discourse markers, discourse relations, coherence, cohesion, linearization, indexation, information structure, word order, discourse comprehension and production, and other related topics.

For this issue, we are particularly interested in studies that investigate the following topics:
- discourse and dialogue-level annotated corpora (e.g. rhetorical and argumentative structures, dialogue acts, reference chains, enumerative structures, document structures, thematic segmentation)
- studies on discourse and dialogue structures in (very) large corpora
- tools and methods in Natural Language Processing and Computational Linguistics for discourse and dialogue-level processing
- exploitation of discourse or dialogue-level processing in end-user applications (e.g. human language technology, education, health, risk management)

SUBMISSION

Papers (in English or French) should be sent to discours(a)univ-nantes.fr
Full instructions can be found at https://journals.openedition.org/discours/224

IMPORTANT DATES
- Manuscript submission: October 15th 2022
- Final decision of the editorial board: First quarter of 2023
- Online publication: June 2023

SCIENTIFIC COMMITTEE
- Scientific Committee: https://journals.openedition.org/discours/122
- Referees outside the Scientific Committee: https://journals.openedition.org/discours/8977

--
Dr. Nicolas Hernandez
Associate Professor (Maître de Conférences)
Nantes Université - LS2N UMR6004
https://nicolashernandez.github.io/
+33 (0)2 51 12 53 94
+33 (0)2 40 30 60 67
https://sciences-techniques.univ-nantes.fr/programme-du-m1-atal

[CFP] Special Issue on "Information Extraction and Language Discourse Processing" of Journal Information (ISSN 2078-2489)
by Jennifer D'Souza 05 Sep '22

Dear colleagues and friends, *We invite submissions to a special issue on "Information Extraction and Language Discourse Processing" of journal Information <https://www.mdpi.com/journal/information> (ISSN 2078-2489).* *Special Issue Information* This Special Issue seeks novel research reports on the spectrum that blends information extraction and language discourse processing research in diverse communities. The editors welcome submissions along various dimensions derived from the nature of the extraction task, the advanced neural techniques used for extraction, the variety of input resources exploited, and the type of output produced. Quantitative, qualitative, and mixed methods studies are welcome, as are case studies and experience reports if they describe an impactful application at a scale that delivers useful lessons to the journal readership. Topics of interest include (but are not limited to): - Knowledge base population with discourse-centric information extraction (IE) - Coreference resolution and its impact on discourse-centric IE - Relationship extraction leveraging linguistic discourse - Template filling - Impact of pragmatics or rhetorics on information extraction - Discourse-centric IE at scale - Intelligent and novel assessment models of discourse-centric IE - Survey of discourse-centric IE in natural language processing (NLP) - Challenges implementing discourse-centric IE in real-world scenarios - Modeling domains using discourse-centric IE - Human–AI hybrid systems for learning discourse and IE *Submission Instructions* https://www.mdpi.com/journal/information/special_issues/WYS02U2GTD *Deadline for manuscript submissions* Submissions to the SI will be accepted and published on a rolling basis until the close of the issue on 10 December 2022 Yours cordially, Dr. Jennifer D'Souza Prof. Dr. Chengzhi Zhang *Guest Editors*

Open call for tenders - Corpus collection + Use case study for a technology-based scientific translation service
by susanna.fiorini＠operas-eu.org 05 Sep '22

***Translations & Open Science calls for tenders*** The OPERAS Research Infrastructure launches a series of calls for tenders in order to lay the foundation of a technology-based scientific translation service to foster multilingualism in scholarly communication and thus help to remove language barriers according to Open Science principles. The first two calls are now open (submission deadline: 7 October 2022) 1. Mapping and collection of scientific bilingual corpora: identifying, collecting and preparing corpora of bilingual scientific texts which will serve as training dataset for specialised translation engines, source data for terminology extraction, and translation memory creation Link to call 1: https://www.operas-eu.org/mapping-and-collection-of-scientific-bilingual-co… 2. Use case study for a technology-based scientific translation service: drafting an overview of the current translation practices and challenges in scholarly communication and defining the use cases of a technology-based scientific translation service (expected users and usage scenarios, features, quality requirements, editorial and technical workflows) Link to call 2: https://www.operas-eu.org/use-case-study-for-a-technology-based-scientific-… Please note that two additional calls will be released in the coming months in the following areas: Machine translation output evaluation and Roadmap and budget projections. For any information about ongoing and future calls, please feel free to contact Susanna Fiorini at susanna.fiorini(a)operas-eu.org

2nd CfP: Special Issue on Trends in Social Media Analysis to Address Fake News, Hate Speech, or Bias (Springer Datenbank-Spektrum)
by Wiegand, Michael 05 Sep '22

[Apologies if you receive multiple copies of this CfP]

Special Issue on Trends in Social Media Analysis to Address Fake News, Hate Speech, or Bias
==========================================================
Springer Datenbank-Spektrum
https://www.springer.com/13222
==========================================================

Social media has many benefits: from staying in contact with close and not-so-close friends, to exercising the right to voice one's opinion, to communicating with many like-minded people all over the world and providing an additional channel for information exchange. Unfortunately, social media has also been abused and misused ever since its inception. Hate speech is prevalent on many sites, alienating trusting users and hindering fruitful discussions. Fake news is distributed through social media platforms with dangerous effects. But even without malicious intention, social media can be misleading due to various biases in the system.

Topics of Interest
==================
In this special issue of Datenbank-Spektrum, we will explore and present current trends in the field of automatically detecting and managing hate speech, fake news, bias and other toxic content in the context of social media. We welcome original contributions including technical papers, application-oriented papers, case studies, survey papers and position papers. Topics of interest include, but are not limited to:
- Automatic detection of hate speech
- Methods to improve online discussions
- Trust and reputation of social media actors
- Identification of fake news
- Countermeasures to fight fake news
- Detection and/or mitigation of bias
- Dealing with bias in training data
- Content analysis and NLP
- Opinion mining and sentiment analysis on social media
- Information extraction and retrieval on social media
- Information diffusion within social networks
- Ethical and legal aspects

Submission Guidelines
====================
Paper format: 8-10 pages, double-column (cf. author guidelines at https://www.springer.com/13222). We welcome contributions in both German and English through the Springer submission system https://www.editorialmanager.com/dasp/
Deadline for submissions: Oct. 1st, 2022
Publication of special issue: DASP-1-2023 (March 2023)

Guest editors
=============
Feel free to contact the guest editors in case you have questions.
Ralf Krestel, ZBW & CAU Kiel, r.krestel(a)zbw.eu
Udo Kruschwitz, Universität Regensburg, udo.kruschwitz(a)ur.de
Michael Wiegand, Universität Klagenfurt, michael.wiegand(a)aau.at

Call for bids to host EACL 2024
by Georg Rehm 04 Sep '22

*CALL FOR BIDS TO HOST EACL 2024* The European Chapter of the Association for Computational Linguistics (EACL) invites expressions of interest to host the 2024 EACL conference, to be held in Europe, the Middle East or Africa (EMEA) in Spring (preferably April/May) 2024. The 2024 conference will be the 18th meeting of the EACL. *At this stage, we seek draft proposals from prospective bidders.* These will be evaluated and promising bidders will be asked to provide additional information for the final selection. The EACL Board will appoint the general chair for the conference, the programme committee co-chairs, and all other chairs (tutorial co-chairs, workshop co-chairs, etc.), except for the local arrangements chair. Draft bid proposals (due *October 15th, 2022*) should include information on all of the following items: 1. *Proposed dates:* in Spring (preferably April/May) 2024 2. *Location:* city and conference venue. Indicate whether the conference would be held at a university, hotel or convention center. Bear in mind that EACL is growing. While Gothenburg (EACL 2014) had 520 registered participants, Valencia (EACL 2017, the last conference held in person) had 680 registered participants. So please suggest a location that could host 800+ people for plenary sessions, plus at least 4 conference rooms hosting parallel sessions (200-250 people each), a large poster or exhibit room; 11 rooms on the workshops/tutorials days, among which at least two host 200 people and the others 60 people; and rooms for demos, small meetings and registration 3. *Local arrangements team:* local chair/co-chair, committee, volunteer labour (e.g. students), registration handling. 
The local arrangements team will be responsible for activities such as arranging meeting rooms, equipment, refreshments, accommodation, on-site registration, participant internet access, the reception, the conference dinner, and working with the other chairs and the EACL Board to develop the budget and registration materials. Indicate whether a professional conference organizer (PCO) will be involved in the organization. Also, indicate whether any national/regional Computational Linguistics association would be on board of the local organization *The final bids will also include detailed information on the following items:* 1. Computing/wifi/audiovisual: whether there will be desktop/laptop in conference rooms and high-speed wireless Internet access, what the audiovisual facilities are 2. Printing of conference booklet 3. Food catering including breaks, reception, poster sessions and conference dinner 4. Accommodation options at the venue, including low-cost student accommodation 5. Travel alternatives to the venue from Europe and beyond 6. Social events including infrastructure for banquet/other social event and reception 7. Potential for local sponsorships 8. Opportunities for co-location with other meetings 9. The costs related to all of the above items, which should be indicated in the expenses spreadsheet (template provided below). 
Proposals will be evaluated with respect to a number of criteria (unordered): - Adequacy of conference and exhibit facilities for the anticipated number of registrants - Adequacy of accommodations and food services (in a range of price categories) and proximity to the conference facilities - Adequacy of expenses projections and expected surplus - Appropriateness of proposed dates - Geographical and national balance with regard to previous EACL and ACL conferences, and other major Natural Language Processing conferences held in EMEA - Co-location with national/regional conferences - Experience of the local arrangements team - Local CL community support - Local government and industry support - Appropriateness of expected registration fees - Accessibility of proposed site To help with your bid, you can check: - Bid Guidelines <https://wiki.coli.uni-saarland.de/eacl/Bid%20guidelines?highlight=%28bids%29> - EACL 17 report <http://aclweb.org/adminwiki/index.php?title=2017Q3_Reports:_EACL_2017> Reports, lessons learnt and successful bids from previous years: - Previous Calls for Conference Bids <https://wiki.coli.uni-saarland.de/eacl/Call%20for%20conference%20bids> The EACL conference handbook: - EACL conference handbook <https://wiki.coli.uni-saarland.de/eacl/EACL_conference_handbook> Please send your expressions of interest electronically to the EACL Chair-elect: Roberto Basili, University of Rome, Tor Vergata, Italy – basili(a)info.uniroma2.it The EACL board encourages groups who intend to submit a proposal to ask questions about how to prepare the proposal. *Important Dates:* - *October 15th, 2022:* Deadline for draft bids - *October 31st, 2022:* Feedback to bidders and announcement of shortlist of bidders - *December 22nd, 2022:* Deadline for final bids - *January 15th, 2023:* Final bid chosen - *April or May, 2024:* EACL Conference Best regards, Georg Rehm – Secretary of EACL – -- *Prof. Dr. 
Georg Rehm <http://georg-re.hm/>* Principal Researcher and Research Fellow DFKI GmbH <http://www.dfki.de/>, Alt-Moabit 91c, 10559 Berlin, Germany Phone: +49 30 23895-1833 – Fax: -1810 georg.rehm(a)dfki.de Deutsches Forschungszentrum für Künstliche Intelligenz GmbH Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern Geschäftsführung: Prof. Dr. Antonio Krüger (Vorsitzender), Helmut Ditzer Vorsitzender des Aufsichtsrats: Dr. Gabriël Clemens Amtsgericht Kaiserslautern, HRB 2313

2nd Call for Participation - EmoThreat: Emotions & Threat Detection in Urdu - CICLing 2022 track @FIRE 2022
by Maaz Amjad 04 Sep '22

*[Apologies for cross-posting]*

*EmoThreat: Emotions & Threat Detection in Urdu*
*CICLing 2022 track @FIRE 2022*

Website: https://sites.google.com/view/multi-label-emotionsfire-task/home
Registration is now open: https://docs.google.com/forms/d/e/1FAIpQLSfWSPSM5wlgkucnhq3lDEsnWdaitwfq2EF…
The training set is now available. Participants are invited to publish Working Notes of FIRE 2022.

*Task Description*
With the growth in the spread and importance of social media platforms, the effects of their misuse have become increasingly severe. In particular, numerous posts contain abusive language towards certain users and hence worsen users' experience of communicating via such platforms, while other posts contain actual threats that potentially put platform users in danger. The Urdu language has more than 230 million speakers worldwide, with vast representation on social networks and digital media. We encourage participants to take part in *EmoThreat: Emotion and Threat detection in Urdu (Nastaliq)*.

*Task A: Multi-label emotion classification in Urdu*
Link: https://sites.google.com/view/multi-label-emotionsfire-task/home/task-a
Task A requires you to classify each tweet as one or more of the six basic emotions (plus neutral) that best represent the emotion of the person tweeting.

*Task B: Threatening Language Detection Task in Urdu*
Link: https://sites.google.com/view/multi-label-emotionsfire-task/home/task-b
Task B focuses on detecting threatening language in Urdu tweets. This is a binary classification task in which participating systems are required to classify tweets into two classes, namely: Threatening and Non-Threatening.

*Note: Participants in this year's shared task can choose to participate in either one or both subtasks. Please visit the website for more information.*

*Important Dates*
30th June - Training data release
25th July - Codalab submission link release (Task A)
7th September - Test set release (Task B)
20th September - Run submission deadline
30th September - Results declared
12th October - Working Note submission
26th October - Review notifications
2nd November - Camera-ready due
9th - 13th December - FIRE 2022 (online event)

*Organizers*
Sabur Butt, Instituto Politécnico Nacional, Mexico
Maaz Amjad, Instituto Politécnico Nacional, Mexico
Noman Ashraf, Dana-Farber Cancer Institute, Harvard Medical School, United States
Fazlourrahman Balouchzahi, Instituto Politécnico Nacional, Mexico
Rajesh Sharma, University of Tartu, Estonia
Grigori Sidorov, Instituto Politécnico Nacional, Mexico
Alexander Gelbukh, Instituto Politécnico Nacional, Mexico

*Contact*
Email: emothreat2022(a)gmail.com
Google-group: https://groups.google.com/g/emothreat
FIRE 2022: http://fire.irsi.res.in/fire/2022/home

--
With my best regards,
Maaz Amjad, PhD
================================================
LinkedIn: http://www.linkedin.com/in/maazmjad
Twitter: https://twitter.com/maazamjad13?lang=en
Skype: maaz.amjad72
Mobile: +5215567332662
Email: h.maazamjad(a)gmail.com

[Final Deadline Extension] First Workshop on Information Extraction from Scientific Publications (WIESP) at AACL-IJCNLP 2022
by Tirthankar Ghosal 03 Sep '22

*** First Workshop on Information Extraction from Scientific Publications (WIESP) at AACL-IJCNLP 2022 ***
*** Website: https://ui.adsabs.harvard.edu/WIESP/ ***
Twitter: https://twitter.com/wiesp_nlp

The number of scientific papers published per year has exploded in recent years. Indexing the full text of articles in search engines helps discover and retrieve vital scientific information to continue building on the shoulders of giants, informing policy, and making evidence-based decisions. Nevertheless, it is difficult to navigate this ocean of data. Using simple string matching has substantial limitations: human language is ambiguous in nature, context matters, and we frequently use the same words and acronyms to represent a multitude of different meanings. Extracting structured and semantically relevant information from scientific publications (e.g., named-entity recognition, summarization, citation intention, linkage to knowledge graphs) allows for better selection and filtering of articles. The First Workshop on Information Extraction from Scientific Publications (WIESP) will create the necessary forum to foster discussion and research using Natural Language Processing and Machine Learning.
WIESP will specifically focus on topics related to information extraction from scientific publications, including (but not limited to):
- Scientific document parsing
- Scientific named-entity recognition
- Scientific article summarization
- Question-answering on scientific articles
- Citation context/span extraction
- Structured information extraction from full-text, tables, figures, bibliography
- Novel datasets curated from scientific publications
- Argument extraction and mining
- Challenges in information extraction from scientific articles
- Building knowledge graphs via mining scientific literature; querying scientific knowledge graphs
- Novel tools for IE on scientific literature and interaction with users
- Mathematical information extraction
- Scientific concepts, facts extraction
- Visualizing scientific knowledge
- Bibliometric and Altmetric studies via information extraction from scientific articles and metadata
- Information extraction from COVID-19 articles to inform public health policy

In addition to research paper presentations, WIESP will also feature keynote talks, a panel discussion, and a shared task. We will update the details on our website as and when they become available. We especially welcome participation from academic and research institutions, government and industry labs, publishers, and information service providers. Projects and organizations using NLP/ML techniques in their text mining and enrichment efforts are also welcome to participate.

***Call for Papers***
We invite papers of the following categories:

***Long papers*** must describe substantial, original, completed, and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Papers must not exceed eight (8) pages of content, plus unlimited pages of references. The final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers' comments can be taken into account.
***Short papers*** must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages, such as a small, focused contribution, a negative result, or an interesting application nugget. Short papers must not exceed four (4) pages, plus unlimited pages of references. The final versions of short papers will be given one additional page of content (up to 5 pages) so that reviewers' comments can be taken into account. ***Position papers*** will give voice to authors who wish to take a position on a topic listed above or the field of scholarly information extraction. Submissions need not present original work and should be two to four pages in length, including title, text, figures and tables, and references. ***Demo papers*** should be no more than four (4) pages in length, including references, and should describe implemented systems that are of relevance to the theme of the workshop. Authors of demo papers should be willing to present a demo of their system during WIESP at AACL-IJCNLP 2022. ***Extended Abstracts*** We welcome submissions of extended abstracts (2 pages max) related to the research topics mentioned above. Submissions may include previously published results, late-breaking results, or a description of ongoing projects in the broad field of information extraction and mining from scientific publications. Extended abstracts can also summarize existing work, work in progress, or a collection of works under a unified theme (e.g., a series of closely related papers that build on each other or tackle a common problem). ***Shared Task: Detecting Entities in the Astrophysics Literature (DEAL)*** A good amount of astrophysics research makes use of data coming from missions and facilities such as ground observatories in remote locations or space telescopes, as well as digital archives that hold large amounts of observed and simulated data. 
These missions and facilities are frequently named after historical figures or use some ingenious acronym which, unfortunately, can be easily confused when searching for them in the literature via simple string matching. For instance, Planck can refer to the person, the mission, the constant, or several institutions. Automatically recognizing entities such as missions or facilities would help tackle this word sense disambiguation problem. The shared task consists of Named Entity Recognition (NER) on samples of text extracted from astrophysics publications. The labels were created by domain experts and designed to identify entities of interest to the astrophysics community. They range from simple to detect (e.g., URLs) to highly unstructured (e.g., Formula), and from useful to researchers (e.g., Telescope) to more useful to archivists and administrators (e.g., Grant). Overall, 31 different labels are included, and their distribution is highly unbalanced (e.g., ~100x more Citations than Proposals). Submissions will be scored using both the CoNLL-2000 shared task seqeval F1-Score at the entity level and scikit-learn's Matthews correlation coefficient method at the token level. We also encourage authors to propose their own evaluation metrics. A sample dataset and more instructions can be found at: https://ui.adsabs.harvard.edu/WIESP/2022/SharedTasks Participants (individuals or groups) will have the opportunity to present their findings during the workshop and write a short paper. The authors of the best-performing or most interesting approaches might be invited to collaborate further with the NASA Astrophysical Data System (https://ui.adsabs.harvard.edu/).
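The two scores named in the shared task description can be illustrated with a minimal, dependency-free sketch. This is not the official scorer: the task itself uses seqeval and scikit-learn, whereas the code below re-implements micro-averaged entity-level F1 and the multiclass Matthews correlation coefficient by hand. It also assumes BIO-style tags (B-/I-/O), which the announcement does not specify; the function names and the toy sentence are hypothetical.

```python
from collections import Counter
from math import sqrt


def extract_entities(tags):
    """Return the set of (label, start, end) spans in one BIO tag sequence (assumed scheme)."""
    entities = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                entities.add((label, start, i))
            if tag.startswith(("B-", "I-")):  # lenient: a stray I- tag opens a span
                start, label = i, tag[2:]
            else:
                start, label = None, None
    return entities


def entity_f1(gold_seqs, pred_seqs):
    """Micro-averaged F1 over exact-match entity spans."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


def token_mcc(gold_seqs, pred_seqs):
    """Multiclass Matthews correlation coefficient over flattened token labels."""
    gold = [t for seq in gold_seqs for t in seq]
    pred = [t for seq in pred_seqs for t in seq]
    s = len(gold)                                   # total number of tokens
    c = sum(g == p for g, p in zip(gold, pred))     # correctly labelled tokens
    t_k, p_k = Counter(gold), Counter(pred)         # per-class true / predicted counts
    cov_tp = c * s - sum(t_k[k] * p_k[k] for k in t_k)
    cov_tt = s * s - sum(v * v for v in t_k.values())
    cov_pp = s * s - sum(v * v for v in p_k.values())
    denom = sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


# Toy example: the system finds the Telescope span but misses the Grant.
gold = [["B-Telescope", "I-Telescope", "O", "B-Grant"]]
pred = [["B-Telescope", "I-Telescope", "O", "O"]]
f1 = entity_f1(gold, pred)
mcc = token_mcc(gold, pred)
print(f"entity F1 = {f1:.3f}, token MCC = {mcc:.3f}")  # entity F1 = 0.667, token MCC = 0.730
```

The two numbers answer different questions: the entity-level F1 only rewards spans whose label and boundaries both match, while the token-level MCC gives partial credit per token and is less dominated by frequent labels, which matters given the highly unbalanced label distribution described above.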
***Important Dates*** - Paper/Abstract Submission Deadline: September 12, 2022 (Final extension) - Notification of workshop paper/abstract acceptance: October 7, 2022 - Camera-ready Submission Deadline: October 24, 2022 - Workshop: November 20, 2022 (online) ***All submission deadlines are 11:59 pm UTC -12h ("Anywhere on Earth")*** ***Submission Website and Format*** Submission Link: https://softconf.com/aacl2022/WIESP/ Submission will be via softconf. Submissions should follow the ACLPUB formatting guidelines (https://acl-org.github.io/ACLPUB/formatting.html) and template files (https://github.com/acl-org/acl-style-files/tree/master). Submissions (Long and Short Papers) will be subject to a double-blind peer-review process. Position papers, Demo papers, and Extended Abstracts need not be anonymized. The authors will present accepted papers at the workshop either as a talk or a poster. All accepted papers will be published in the workshop proceedings. We follow the same policies as AACL-IJCNLP 2022 regarding preprints and double submissions. The anonymity period for WIESP 2022 is from July 15 to September 25. ***Organizers*** - Tirthankar Ghosal, Charles University, CZ - Sergi Blanco-Cuaresma, Center for Astrophysics | Harvard & Smithsonian, USA - Alberto Accomazzi, Center for Astrophysics | Harvard & Smithsonian, USA - Robert M. Patton, Oak Ridge National Laboratory, USA - Felix Grezes, Center for Astrophysics | Harvard & Smithsonian, USA - Thomas Allen, Center for Astrophysics | Harvard & Smithsonian, USA -- +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Tirthankar Ghosal Researcher at UFAL, Charles University, CZ https://member.acm.org/~tghosal +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


1st CFP: Crowdsourcing & Human Computation Track @TheWebConf 2023
by Matt Lease 02 Sep '22


https://www2023.thewebconf.org/calls/research-tracks/crowdsourcing-hc/ We invite research contributions to the Crowdsourcing and Human Computation track at the 32nd edition of The Web Conference series (formerly known as WWW), to be hosted at Austin, TX, US, on April 30 - May 4, 2023 (https://www2023.thewebconf.org/). Fifteen years ago, a 2007 WWW paper entitled “Internet-Scale Collection of Human-Reviewed Data <https://dl.acm.org/doi/abs/10.1145/1242572.1242604>” was one of several forerunners to signal a new, emerging area of research on Human Computation and Crowdsourcing (HCOMP). Growing excitement and work in this new area would eventually lead to four years of HCOMP workshops across KDD and AAAI (2009-2012), a new annual AAAI HCOMP conference <https://www.humancomputation.com/> (2013 onward), and a new, annual HCOMP track at the WebConference (2014 onward). Today, the world and the research landscape look remarkably different than they did in 2007, with the Web playing a central role in orchestrating such advances. Of particular note, modern neural models have transformed AI capabilities, along with far greater ubiquity and significance of AI systems now in practical deployment around the world. As one effect of this, the commoditization and democratization of AI models today has also brought a new focus to “data-centric AI” in which AI models can succeed or fail based on the quality of underlying data and human annotations. The nature of human-AI interactions is also continually evolving in response to AI advances, posing an ever-changing frontier of new challenges for researchers and practitioners. Furthermore, the growth of AI power has brought a commensurate recognition of the need for responsible AI systems that are fair, accountable, transparent, and trustworthy – across diverse, global communities of human stakeholders who interact with or are impacted by AI systems. 
Given the central role of HCOMP in AI (creating reliable training and benchmark annotations, as well as enabling hybrid, human-in-the-loop systems), continuing innovation in HCOMP remains a key challenge for the further advancement of AI. HCOMP itself has made tremendous strides forward in the past fifteen years, yet many research challenges remain. *We invite AI, HCI, and related contributions that advance the broad spectrum of crowdsourcing and human computation (HCOMP) in the scope of the Web*: - algorithms, analysis, applications, methods, systems, and techniques - conceptual, empirical, theoretical, and mixed-methods - spanning fields (e.g., psychology, sociology, economics, ethics, etc.) - system-centered, human-centered, and hybrid More specifically, we invite work addressing contemporary HCOMP challenges including (but not limited to) the following Web-related themes: - Fundamental research challenges in Web-based HCOMP - *Data collection, generation, labeling, and cleaning*: data-centric AI; human and AI-assisted annotation; annotator agreement, aggregation, and modeling; annotation subjectivity and ambiguity, data excellence; human-in-the-loop data augmentation, generation, and adversarial attacks; label noise and bias detection and reduction; task decomposition, task and workflow design, novel modalities for input acquisition, etc. - *Human-centered explainability*: algorithmic/model explanations, interpretability, and transparency to enhance human success in using AI in decision-making, model and data debugging, task performance, trust in AI systems, appropriate reliance, etc. (please also read the CFP of the “Fairness, Accountability, Transparency and Ethics” track) - *Human-centered studies*: collaborative systems, computer-supported cooperative work, human-computer interaction, human factors, interaction design, usability, user experience, etc. 
- *Resources, benchmarking, reproducibility*: New resources for the community (e.g., datasets, open source toolkits, etc.), benchmarking studies comparing state of the art methods, and/or reproducibility studies of prior work. - *Addressing bias and diversity in annotation and human computation*: Methods and algorithms to identify and mitigate biases in annotations; bias-aware annotation workflows; diversity in annotators and workers, data labeling, and hybrid, human-in-the-loop systems; downstream effects of annotator diversity on bias and fairness measures; impact on evaluation of various systems (e.g. information retrieval systems, recommender systems, etc.); ethics and fairness of HCOMP practices - Underlying workforce powering Web-based HCOMP - *Social and economic impacts of human computation and crowdsourcing*: societal and methodological challenges around crowdsourcing labor and workforces; inequalities in access and representation in crowdsourcing workforces; platform affordances and economic impact - *Supporting HCOMP workers*: collective action; design activism; fair work <https://fair.work/en/fw/homepage/>; ghost work, heteromation, and invisible work; human computation, digital colonialism, and the global south; impact sourcing <https://en.wikipedia.org/wiki/Impact_sourcing> and responsible sourcing <https://partnershiponai.org/workstream/responsible-sourcing/>; regulation; worker empowerment, organization, protection and wellness; and workforce diversity, equity, and inclusion, etc. - *Future of work*: AI-assisted human coordination, team formation and work, distributed work, freelancer economy, hybrid, human+AI work and complementarity, etc. - Web-based HCOMP systems, frameworks, or architectures - *Crowd-powered systems*: data management, marketplace design and sustainability, platforms, scalability, security, privacy, programming languages, real-time crowdsourcing, etc. 
- *Human-in-the-loop architectures*: decision support; human-AI collaboration, interaction, and teaming; hybrid systems; mixed-initiative design, etc. - *Crowdsourcing*: citizen science, collective intelligence, crowd computing, crowd creativity, crowdfunding, crowd ideation, crowd intelligence, crowd sensing, crowdsourcing contests, crowd phenomena, crowd science, incentive schemes, gamification, human flesh search, open innovation, peer production, prediction markets, reputation systems, social web, wisdom of crowds, etc. - *Human computation*: decision-theoretic and game-theoretic design, design patterns, human algorithm design and complexity, mechanism and incentive design, etc. - Web-based Applications of HCOMP - *Machine learning for HCOMP*: aggregation, answer fusion, annotator and user modeling, quality assurance, optimization, task assignment and recommendation, truth inference, etc. - *New Applications and Services*: delivering beyond state-of-the-art AI capabilities and enhanced services through human computation and human-in-the-loop systems. Authors should consult the conference’s main Research Track CFP <https://www2023.thewebconf.org/calls/research-tracks/> to ensure their submissions are aligned with broader conference expectations, scope, and theme: “Web Research with Openness, Fairness and Reproducibility”. The CFP also details submission guidelines, relevant dates, and important policies. Review criteria <https://www.humancomputation.com/2016/review-criteria.html> will include considerations typical of those in past years of this track and the AAAI HCOMP conference. 
Submissions that are out of scope or unresponsive to the call above will be rejected early during the reviewing process (“desk rejected”) with minimal feedback. This includes submissions that: - merely apply HCOMP methods in standard, previously known ways, without novel contributions to advance the methodology itself; - do not relate to the web or web-based human computation platforms, methods, or applications. If you have doubts about whether your paper fits the scope of this track, please contact the track chairs at hcomp-webconf2023(a)easychair.org. Important dates - Abstract submission: October 6, 2022. This is compulsory for all papers. - Full papers submission: October 13, 2022 - Rebuttal: December 15 - 22, 2022 - Notification: January 25, 2023 Track chairs: - Ujwal Gadiraju <http://ujwalgadiraju.com/> (Delft University of Technology) - Matthew Lease <https://www.ischool.utexas.edu/~ml/> (University of Texas at Austin and Amazon) - Besmira Nushi <https://besmiranushi.com/> (Microsoft Research) *Senior Program Committee & Program Committee*: Stay Tuned! -- Matt Lease Professor School of Information University of Texas at Austin Voice: (512) 471-9350 · Fax: (512) 471-3971 · Office: UTA 5.536 http://www.ischool.utexas.edu/~ml


2nd call: EAMT Sponsorship of Activities (Students' edition) for 2023 (deadline 07/10/2022)
by Carol Scarton 02 Sep '22


************************************************************** EAMT Sponsorship of Activities (Students' edition) for 2023 Deadline: 07/10/2022 ************************************************************** == Call for Proposals == The European Association for Machine Translation (EAMT) is an organisation that serves the growing community of people interested in MT and translation tools, including translators, users, developers, and researchers of this increasingly viable technology. As part of its commitment to promote research, development and awareness about translation technologies, the EAMT is for the third consecutive year launching a call for proposals to fund MT-related activities led by students during 2023. == Purpose of the Call == The EAMT is planning to support various MT activities such as shared tasks, workshops, teaching and awareness initiatives, open-source initiatives, dataset creation and small research and development projects by its current student members. The EAMT particularly welcomes proposals from students at all levels of education, including undergraduates, postgraduates and PhD students. This call will also give priority to projects that extend work done during the Machine Translation Marathon 2022 (https://ufal.mff.cuni.cz/events/mt-marathon-2022), being held in Prague, Czech Republic from 5 to 10 September 2022. Topics of interest include, but are not limited to: - Recent developments in MT research. - MT evaluation methodology, metrics and results. - Launch of MT-specific evaluation campaigns. - New or prospective commercial users of MT technology. - MT environments (workflow, support tools, etc.). - Interaction between users and MT systems. - MT combined with other technologies (translation memories, speech translation, cross-language information retrieval, multilingual text categorization, multilingual text summarization, etc.). - MT for less-resourced languages: development, usage, etc. 
- MT in the social internet: new uses, new modes of development. - MT for crisis management. - Training events on MT, particularly on recent developments. - Events to disseminate MT, especially to the wider public (including shared tasks). - Creation of datasets for MT research. All proposals will be screened by a review committee that consists of EAMT Executive Committee members and possibly a few appointed external experts if necessary. == Submission information == * Eligibility requirements * In order to qualify for funding, the individual making the proposal must be a confirmed student member of the EAMT at submission time (Membership information: http://www.eamt.org/membership.php). Applicants will also need formal approval from their supervisor. It is important to emphasise that projects are expected to be student-led. Therefore, although we welcome projects showing collaboration with industry and other academic partners, the project is expected to directly benefit the student's own career and/or project. * Selection criteria * - The proposed activity should be of direct interest to the MT community at large: researchers, developers, vendors, translators and/or users of MT technologies. - The proposal shall clearly describe the purpose of the project and include measurable mid-project milestones for which a report should be submitted (see below). - Preference will be given to projects which by nature will involve and be beneficial for several persons, as for instance conferences, seminars, workshops, shared tasks and tutorials. - Proposals with a significant, clearly identified impact on the MT community (through the development, dissemination or use of project results) are those most likely to be accepted. - Proposals that bring together different aspects of MT will be especially valued. - The proposal should be clearly justified as being technically and/or scientifically sound. 
- The quality and efficiency of the implementation of the proposal will be evaluated. - The budget should be adequate for the proposed objectives and the actual implementation of the activity. == Budget == EAMT anticipates funding several proposals for various activities. The total foreseen EAMT Budget for this call is around €4,000 to cover all granted projects. The maximum amount EAMT can grant for a single project will be €4,000. During the negotiation stage, budget adjustments may be required by the EAMT executive committee. This means that the EAMT may only offer to partially fund a project. A project being granted financial support by EAMT according to this call will receive 50% of the granted amount at the start of the project. The proposer will receive the remaining 50% when the mid-project progress report has been received by the EAMT Secretary and substantiates that the mid-project milestones are met, and furthermore provided that the proposer is still a current member of the EAMT. == Contact for enquiries == Carolina Scarton EAMT Secretary e-mail: c.scarton(a)sheffield.ac.uk == Submission procedure == * Overview * Candidates should submit their proposals as a single PDF file, written in English, that is composed of the elements described below. - Proposal description: 2-page maximum - Person/organisation experience: 1-page maximum - Budget and project planning overview: 1-page maximum - Supervisor's letter of approval: 1-page maximum Proposals should be submitted no later than the deadline (see Important Dates below) through EasyChair: https://easychair.org/conferences/?conf=eamt2023 (Submission type: Project Proposals). Authors are encouraged to use the template available at http://www.eamt.org/eamt2020-projects.zip. Templates for both LaTeX and Microsoft Word are available. * Detailed description of sections of the proposal * 1-) Proposal summary (two pages) in English - Complete contact information of the candidate. 
- A clear and detailed description of the proposed event or activity. - A statement on why this event or activity would be helpful for the community and the development of your studies (you should establish a clear connection between this activity and your degree project). - A statement justifying why EAMT should support this event or activity. 2-) Experience of the proposing person in the field (up to one page) - It may include a list of experience and related skills of the participants of the team (your team may be composed of your supervisors and potential collaborators). 3-) Budget and project planning overview (up to one page) - A breakdown of the costs estimated for the entire activity or event. - Clear milestones and deliverables must be indicated. - An identification of the support requested from EAMT and possible other supporting funds. 4-) Supervisor's letter of approval (up to one page) - A letter from your supervisor stating that they approve your project submission and that they will act as fund manager if needed (please note that EAMT needs to make payments into a research account set up at your institution). == Important Dates == - Circulation of the Call: August 1, 2022 - Submission deadline for proposals: October 7, 2022, 23:59 CEST - Acceptance notifications and negotiations to start on: December 7, 2022 In case of acceptance: - Mid-project progress report due: June 30, 2023, 23:59 CEST - Final report and deliverables due: January 31, 2024, 23:59 CET == Additional provisions == - Only complete proposals will be reviewed. - All information submitted with proposals will be regarded as confidential and will only be used in the context of this project. - Following the recommendations from the reviewers and EAMT executive members, projects may be approved with amendments that will be discussed during the negotiation stage. - The funded projects may be required to report at the EAMT events (e.g.
Poster at the EAMT conference, a short progress report for the General assembly, etc.). If you think you will not have funds for attending the EAMT event you can add travel costs to your budget. - The EAMT should be acknowledged in all materials related to the project, activity or initiative. == No obligation to award the proposal == The EAMT shall be under no obligation to fund the proposals pursuant to this call for proposals. EAMT shall not be liable for any compensation with respect to candidates whose proposals have not been accepted. Nor shall it be liable in the event of it deciding not to award the proposal. -- *Carolina Scarton* Lecturer in Natural Language Processing Department of Computer Science University of Sheffield http://staffwww.dcs.shef.ac.uk/people/C.Scarton/

