- Corpora - ELRA lists

[CFP] Cambridge NLE - Special Issue - "Natural Language Processing Applications for Low-Resource Languages"
by Sivaji Bandyopadhyay 06 Sep '22

Dear Colleagues and Friends,

We welcome submissions to the special issue of the Cambridge University Press journal Natural Language Engineering <https://www.cambridge.org/core/journals/natural-language-engineering> on Natural Language Processing Applications for Low-Resource Languages.

*Cambridge NLE - Special Issue*
*Natural Language Processing Applications for Low-Resource Languages*
*The deadline for submissions is November 30, 2022.*

*Call for Papers*

We welcome papers dealing with one or more of the following topics:

● *Machine Translation for Low-Resource Languages*
● *Caption Generation for Low-Resource Languages*
● *POS Tagging on Low-Resource Languages*
● *Optical character recognition and generation of Low-Resource Languages*
● *Chatbot assistant for Low-Resource Languages*
● *Speech/Language recognition*
● *Regional music identification and classification*
● *Textual Entailment of Low-Resource Languages*
● *Face-to-Face Translation of Low-Resource Languages*
● *News/Social media analysis*
● *Text prediction and suggestion*
● *Sentiment analysis*
● *Text classification*

*Important Dates*

*Deadline for submissions*: November 30, 2022
*First-round author notification*: Between March 30, 2023 and April 30, 2023
*Submission of revised versions*: Between June 30, 2023 and July 30, 2023
*Second-round author notification (final)*: Between August 30, 2023 and September 30, 2023
*Camera-ready versions*: Between Oct 01, 2023 and Oct 30, 2023

*Submissions*

Instructions for preparing your manuscript for the *Journal of Natural Language Engineering* are available here <https://www.cambridge.org/core/journals/natural-language-engineering/inform…>. Please submit your article through the NLE manuscript submission system <https://mc.manuscriptcentral.com/nle>. When submitting your manuscript, please select *Natural Language Processing Applications for Low-Resource Languages* in the field *Special Issue Designation*.

*Guest Editors*

*Dr. Partha Pakray, National Institute of Technology Silchar*
*Prof. Alexander Gelbukh, Instituto Politécnico Nacional, Mexico City, Mexico*
*Prof. Sivaji Bandyopadhyay, National Institute of Technology Silchar*

Yours Cordially,
Professor Sivaji Bandyopadhyay
Guest Editors
Director, National Institute of Technology (NIT) Silchar (Assam)
Professor, Computer Science and Engineering Department NIT Silchar (Assam)
Professor, Computer Science and Engineering Department (on lien) Jadavpur University, Kolkata - 700032 INDIA

[CFP - due Sep 13] EMNLP Workshop on Text Simplification, Accessibility, and Readability
by Xu, Wei 05 Sep '22

Workshop on Text Simplification, Accessibility, and Readability (TSAR) at EMNLP 2022
website: https://taln.upf.edu/pages/tsar2022-ws

Call for Papers

The web provides an abundance of knowledge and information that can reach large populations. However, the way in which a text is written (vocabulary, syntax, or text organization/structure), or presented, can make it inaccessible for many people, especially for non-native speakers, people with low literacy, and people with some type of cognitive or linguistic impairment. The results of the Adult Literacy Survey (OECD, 2013) indicate that approximately 16.7% of the adult population (averaged over 24 highly-developed countries) requires lexical, 50% syntactic, and 89.4% conceptual simplification of everyday texts (Štajner, 2021).

Research on automatic text simplification (TS), textual accessibility, and readability has the potential to improve the social inclusion of marginalized populations. These related research areas have attracted attention in the past ten years, as evidenced by the growing number of publications in NLP conferences. While only about 300 articles in Google Scholar mentioned TS in 2010, this number increased to about 600 in 2015 and to more than 1000 in 2020 (Štajner, 2021).

Recent research in automatic text simplification has mostly focused on proposing the use of methods derived from the deep learning paradigm (Glavaš and Štajner, 2015; Paetzold and Specia, 2016; Nisioi et al., 2017; Zhang and Lapata, 2017; Martin et al., 2020; Maddela et al., 2021; Sheang and Saggion, 2021).
However, there are many important aspects of automatic text simplification that need the attention of our community: the design of appropriate evaluation metrics, the development of context-aware simplification solutions, the creation of appropriate language resources to support research and evaluation, the deployment of simplification in real environments for real users, the study of discourse factors in text simplification, the identification of factors affecting the readability of a text, etc. To address those issues, there is a need for collaboration among CL/NLP researchers, machine learning and deep learning researchers, UI/UX and accessibility professionals, as well as representatives of public organizations (Štajner, 2021).

The proposed TSAR workshop builds upon the recent success of several regional workshops that covered a subset of our topics of interest, including the READI Workshops at LREC 2020 and LREC 2022, the SEPLN 2021 Workshop on Current Trends in Text Simplification (CTTS), and the SimpleText workshop at CLEF 2021, as well as the birds-of-a-feather events on Text Simplification at NAACL 2021 (over 50 participants) and ACL 2022.

The TSAR workshop aims to foster collaboration among all parties interested in making information more accessible to all people. Through two invited talks, a shared task on lexical simplification, a round table discussion, and oral and poster presentations of novel research, we will discuss recent trends and developments in the area of automatic text simplification, text accessibility, automatic readability assessment, language resources and evaluation for text simplification, etc.
Topics We invite contributions on the following topics (among others): * Lexical simplification; * Syntactic simplification; * Modular and end-to-end TS; * Sequence-to-sequence and zero-shot TS; * Controllable TS; * Text complexity assessment; * Complex word identification and lexical complexity prediction; * Corpora, lexical resources, and benchmarks for TS; * Evaluation of TS systems; * Domain-specific/adaptable TS (e.g. health, legal); * Other related topics (e.g. empirical and eye-tracking studies); * Assistive technologies for improving readability and comprehension including those going beyond text. * Text Simplification in Languages other than English * Multilingual TS * Readability Controlled MT Submissions We welcome two types of papers: long papers and short papers. Submissions should be made to the Softconf submission management system: https://softconf.com/emnlp2022/tsar. The papers should present novel research. The review will be double-blind and thus all submissions should be anonymized. Format: Paper submissions must use the official EMNLP template, which is available as an Overleaf template and also downloadable directly (Latex and Word) (see here: https://2022.emnlp.org/calls/style-and-formatting/). Authors may not modify these style files or use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review. Long Papers: Long papers must describe substantial, original, completed, and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers may consist of up to eight (8) pages of content, plus unlimited pages of references. Final versions of long papers will be given one additional page of content (up to 9 pages), so that reviewers’ comments can be taken into account. Long papers will be presented orally or as posters as determined by the program committee. 
The decisions as to which papers will be presented orally and which as poster presentations will be based on the nature rather than the quality of the work. There will be no distinction in the proceedings between long papers presented orally and long papers presented as posters.

Short Papers: Short paper submissions must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages. Some kinds of short papers include: a small, focused contribution; a negative result; an opinion piece; an interesting application nugget. Short papers may consist of up to four (4) pages of content, plus unlimited pages of references. Final versions of short papers will be given one additional page of content (up to 5 pages), so that reviewers' comments can be taken into account. Short papers will be presented orally or as posters as determined by the program committee. While short papers will be distinguished from long papers in the proceedings, there will be no distinction in the proceedings between short papers presented orally and short papers presented as posters.

Important Dates

13 September 2022 (extended): paper submission deadline
2 October 2022: acceptance notification deadline
16 October 2022: camera-ready deadline
8 December 2022: Workshop at EMNLP

Proceedings

All accepted papers will be included in the workshop proceedings and published in the ACL Anthology. Extended versions of the best papers will be invited for a special issue of Frontiers in Artificial Intelligence focused on applied research for TS and readability assessment in the context of TS.
Organizers * Sanja Štajner, NLP Researcher, Germany * Horacio Saggion, Chair in Computer Science and Artificial Intelligence and Head of the LaSTUS Lab in the TALN-DTIC, Universitat Pompeu Fabra * Wei Xu, Assistant Professor, Georgia Institute of Technology * Marcos Zampieri, Assistant Professor, Rochester Institute of Technology * Matthew Shardlow, Senior Lecturer, Manchester Metropolitan University * Daniel Ferrés, Post-Doctoral Research Assistant, Universitat Pompeu Fabra * Kai North, Ph.D. student, Rochester Institute of Technology * Kim Cheng Sheang, PhD student, Universitat Pompeu Fabra

Discours Journal, Issue 32 (deadline 10/15/2022), Publication June 2023
by Nicolas Hernandez 05 Sep '22

Discours Journal, Issue 32, Publication June 2023
Deadline: October 15th 2022
Coordinators: Lydia-Mai Ho-Dac and Nicolas Hernandez

Dear colleagues,

we are inviting submissions for the next issue of Discours, to appear in June 2023.

THE DISCOURS JOURNAL

Discours is an international and interdisciplinary peer-reviewed e-journal, which publishes two issues a year in open access. The journal is intended as a forum for exchanging and comparing data, analyses and opinions for all linguists, psycholinguists and computer linguists working in fields involving the description, comprehension, formalization and processing of text organization.

EDITORIAL LINE

It focuses on the following topics (not limited to): discourse structure and discursive markers, discourse relations, coherence, cohesion, linearization, indexation, information structure, word order, discourse comprehension and production, and other related topics.

For this issue, we are particularly interested in studies that investigate the following topics:
- discourse and dialogue-level annotated corpora (e.g. rhetorical and argumentative structures, dialogue acts, reference chains, enumerative structures, document structures, thematic segmentation)
- studies on discourse and dialogue structures in (very) large corpora
- tools and methods in Natural Language Processing and Computational Linguistics for discourse and dialogue-level processing
- exploitation of discourse or dialogue-level processing in end-user applications (e.g. human language technology, education, health, risk management)

SUBMISSION

Papers (in English or French) should be sent to discours(a)univ-nantes.fr
Full instructions can be found on https://journals.openedition.org/discours/224

IMPORTANT DATES

- Manuscript submission: October 15th 2022
- Final decision of the editorial board: First quarter of 2023
- Online publication: June 2023

SCIENTIFIC COMMITTEE

- Scientific Committee https://journals.openedition.org/discours/122
- Referees outside the Scientific Committee https://journals.openedition.org/discours/8977

--
Dr. Nicolas Hernandez
Associate Professor (Maître de Conférences)
Nantes Université - LS2N UMR6004
https://nicolashernandez.github.io/
+33 (0)2 51 12 53 94
+33 (0)2 40 30 60 67
https://sciences-techniques.univ-nantes.fr/programme-du-m1-atal

[CFP] Special Issue on "Information Extraction and Language Discourse Processing" of Journal Information (ISSN 2078-2489)
by Jennifer D'Souza 05 Sep '22

Dear colleagues and friends, *We invite submissions to a special issue on "Information Extraction and Language Discourse Processing" of journal Information <https://www.mdpi.com/journal/information> (ISSN 2078-2489).* *Special Issue Information* This Special Issue seeks novel research reports on the spectrum that blends information extraction and language discourse processing research in diverse communities. The editors welcome submissions along various dimensions derived from the nature of the extraction task, the advanced neural techniques used for extraction, the variety of input resources exploited, and the type of output produced. Quantitative, qualitative, and mixed methods studies are welcome, as are case studies and experience reports if they describe an impactful application at a scale that delivers useful lessons to the journal readership. Topics of interest include (but are not limited to): - Knowledge base population with discourse-centric information extraction (IE) - Coreference resolution and its impact on discourse-centric IE - Relationship extraction leveraging linguistic discourse - Template filling - Impact of pragmatics or rhetorics on information extraction - Discourse-centric IE at scale - Intelligent and novel assessment models of discourse-centric IE - Survey of discourse-centric IE in natural language processing (NLP) - Challenges implementing discourse-centric IE in real-world scenarios - Modeling domains using discourse-centric IE - Human–AI hybrid systems for learning discourse and IE *Submission Instructions* https://www.mdpi.com/journal/information/special_issues/WYS02U2GTD *Deadline for manuscript submissions* Submissions to the SI will be accepted and published on a rolling basis until the close of the issue on 10 December 2022 Yours cordially, Dr. Jennifer D'Souza Prof. Dr. Chengzhi Zhang *Guest Editors*

Open call for tenders - Corpus collection + Use case study for a technology-based scientific translation service
by susanna.fiorini＠operas-eu.org 05 Sep '22

***Translations & Open Science calls for tenders*** The OPERAS Research Infrastructure launches a series of calls for tenders in order to lay the foundation of a technology-based scientific translation service to foster multilingualism in scholarly communication and thus help to remove language barriers according to Open Science principles. The first two calls are now open (submission deadline: 7 October 2022) 1. Mapping and collection of scientific bilingual corpora: identifying, collecting and preparing corpora of bilingual scientific texts which will serve as training dataset for specialised translation engines, source data for terminology extraction, and translation memory creation Link to call 1: https://www.operas-eu.org/mapping-and-collection-of-scientific-bilingual-co… 2. Use case study for a technology-based scientific translation service: drafting an overview of the current translation practices and challenges in scholarly communication and defining the use cases of a technology-based scientific translation service (expected users and usage scenarios, features, quality requirements, editorial and technical workflows) Link to call 2: https://www.operas-eu.org/use-case-study-for-a-technology-based-scientific-… Please note that two additional calls will be released in the coming months in the following areas: Machine translation output evaluation and Roadmap and budget projections. For any information about ongoing and future calls, please feel free to contact Susanna Fiorini at susanna.fiorini(a)operas-eu.org

2nd CfP: Special Issue on Trends in Social Media Analysis to Address Fake News, Hate Speech, or Bias (Springer Datenbank-Spektrum)
by Wiegand, Michael 05 Sep '22

[Apologies if you receive multiple copies of this CfP]

Special Issue on Trends in Social Media Analysis to Address Fake News, Hate Speech, or Bias
==========================================================
Springer Datenbank-Spektrum
https://www.springer.com/13222
==========================================================

Social media has many benefits: from staying in contact with close and not-so-close friends, through exercising the right to voice one's opinion, to communicating with many like-minded people all over the world and providing an additional channel for information exchange. Unfortunately, social media has also been abused and misused ever since its inception. Hate speech is prevalent on many sites, alienating trusting users and hindering fruitful discussions. Fake news is distributed through social media platforms with dangerous effects. But even without malicious intention, social media can be misleading due to various biases in the system.

Topics of Interest
==================

In this special issue of Datenbank-Spektrum, we will explore and present current trends in the field of automatically detecting and managing hate speech, fake news, bias and other toxic content in the context of social media. We welcome original contributions including technical papers, application-oriented papers, case studies, survey papers and position papers.

Topics of interest include, but are not limited to:
- Automatic detection of hate speech
- Methods to improve online discussions
- Trust and reputation of social media actors
- Identification of fake news
- Countermeasures to fight fake news
- Detection and/or mitigation of bias
- Dealing with bias in training data
- Content analysis and NLP
- Opinion mining and sentiment analysis on social media
- Information extraction and retrieval on social media
- Information diffusion within social networks
- Ethical and legal aspects

Submission Guidelines
====================

Paper format: 8-10 pages, double-column (cf. author guidelines at https://www.springer.com/13222). We welcome contributions in both German and English through the Springer submission system https://www.editorialmanager.com/dasp/

Deadline for submissions: Oct. 1st, 2022
Publication of special issue: DASP-1-2023 (March 2023)

Guest editors
=============

Feel free to contact the guest editors in case you have questions.

Ralf Krestel, ZBW & CAU Kiel, r.krestel(a)zbw.eu
Udo Kruschwitz, Universität Regensburg, udo.kruschwitz(a)ur.de
Michael Wiegand, Universität Klagenfurt, michael.wiegand(a)aau.at

Call for bids to host EACL 2024
by Georg Rehm 04 Sep '22

*CALL FOR BIDS TO HOST EACL 2024*

The European Chapter of the Association for Computational Linguistics (EACL) invites expressions of interest to host the 2024 EACL conference, to be held in Europe, the Middle East or Africa (EMEA) in Spring (preferably April/May) 2024. The 2024 conference will be the 18th meeting of the EACL.

*At this stage, we seek draft proposals from prospective bidders.* These will be evaluated and promising bidders will be asked to provide additional information for the final selection. The EACL Board will appoint the general chair for the conference, the programme committee co-chairs, and all other chairs (tutorial co-chairs, workshop co-chairs, etc.), except for the local arrangements chair.

Draft bid proposals (due *October 15th, 2022*) should include information on all of the following items:

1. *Proposed dates:* in Spring (preferably April/May) 2024
2. *Location:* city and conference venue. Indicate whether the conference would be held at a university, hotel or convention center. Bear in mind that EACL is growing. While Gothenburg (EACL 2014) had 520 registered participants, Valencia (EACL 2017, the last Conference held in person) had 680 registered participants. So please suggest a location that could host 800+ people for plenary sessions, plus at least 4 conference rooms hosting parallel sessions (200-250 people each), a large poster or exhibit room; 11 rooms on the workshops/tutorials days among which at least two host 200 people and the others 60 persons; and rooms for demos, small meetings and registration.
3. *Local arrangements team:* local chair/co-chair, committee, volunteer labour (e.g. students), registration handling.
The local arrangements team will be responsible for activities such as arranging meeting rooms, equipment, refreshments, accommodation, on-site registration, participant internet access, the reception, the conference dinner, and working with the other chairs and the EACL Board to develop the budget and registration materials. Indicate whether a professional conference organizer (PCO) will be involved in the organization. Also, indicate whether any national/regional Computational Linguistics association would be on board of the local organization *The final bids will also include detailed information on the following items:* 1. Computing/wifi/audiovisual: whether there will be desktop/laptop in conference rooms and high-speed wireless Internet access, what the audiovisual facilities are 2. Printing of conference booklet 3. Food catering including breaks, reception, poster sessions and conference dinner 4. Accommodation options at the venue, including low-cost student accommodation 5. Travel alternatives to the venue from Europe and beyond 6. Social events including infrastructure for banquet/other social event and reception 7. Potential for local sponsorships 8. Opportunities for co-location with other meetings 9. The costs related to all of the above items, which should be indicated in the expenses spreadsheet (template provided below). 
Proposals will be evaluated with respect to a number of criteria (unordered): - Adequacy of conference and exhibit facilities for the anticipated number of registrants - Adequacy of accommodations and food services (in a range of price categories) and proximity to the conference facilities - Adequacy of expenses projections and expected surplus - Appropriateness of proposed dates - Geographical and national balance with regard to previous EACL and ACL conferences, and other major Natural Language Processing conferences held in EMEA - Co-location with national/regional conferences - Experience of the local arrangements team - Local CL community support - Local government and industry support - Appropriateness of expected registration fees - Accessibility of proposed site To help with your bid, you can check: - Bid Guidelines <https://wiki.coli.uni-saarland.de/eacl/Bid%20guidelines?highlight=%28bids%29> - EACL 17 report <http://aclweb.org/adminwiki/index.php?title=2017Q3_Reports:_EACL_2017> Reports, lessons learnt and successful bids from previous years: - Previous Calls for Conference Bids <https://wiki.coli.uni-saarland.de/eacl/Call%20for%20conference%20bids> The EACL conference handbook: - EACL conference handbook <https://wiki.coli.uni-saarland.de/eacl/EACL_conference_handbook> Please send your expressions of interest electronically to the EACL Chair-elect: Roberto Basili, University of Rome, Tor Vergata, Italy – basili(a)info.uniroma2.it The EACL board encourages groups who intend to submit a proposal to ask questions about how to prepare the proposal. *Important Dates:* - *October 15th, 2022:* Deadline for draft bids - *October 31st, 2022:* Feedback to bidders and announcement of shortlist of bidders - *December 22nd, 2022:* Deadline for final bids - *January 15th, 2023:* Final bid chosen - *April or May, 2024:* EACL Conference Best regards, Georg Rehm – Secretary of EACL – -- *Prof. Dr. 
Georg Rehm <http://georg-re.hm/>* Principal Researcher and Research Fellow DFKI GmbH <http://www.dfki.de/>, Alt-Moabit 91c, 10559 Berlin, Germany Phone: +49 30 23895-1833 – Fax: -1810 georg.rehm(a)dfki.de Deutsches Forschungszentrum für Künstliche Intelligenz GmbH Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern Geschäftsführung: Prof. Dr. Antonio Krüger (Vorsitzender), Helmut Ditzer Vorsitzender des Aufsichtsrats: Dr. Gabriël Clemens Amtsgericht Kaiserslautern, HRB 2313

2nd Call for Participation - EmoThreat: Emotions & Threat Detection in Urdu - CICLing 2022 track @FIRE 2022
by Maaz Amjad 04 Sep '22

*[Apologies for cross-posting]*

*EmoThreat: Emotions & Threat Detection in Urdu*
*CICLing 2022 track @FIRE 2022*

Website: Link <https://sites.google.com/view/multi-label-emotionsfire-task/home>
Registration is now open: Link <https://docs.google.com/forms/d/e/1FAIpQLSfWSPSM5wlgkucnhq3lDEsnWdaitwfq2EF…>
The training set is now available.
*Participants are invited to publish Working Notes of FIRE 2022*

*Task Description*

As social media platforms have grown in reach and importance, the effects of their misuse have become increasingly severe. In particular, numerous posts contain abusive language towards certain users and thereby degrade users' experience of communicating on such platforms, while other posts contain actual threats that potentially put platform users in danger. The Urdu language has more than 230 million speakers worldwide, with vast representation on social networks and digital media.

We invite you to participate in *EmoThreat: Emotion and Threat detection in Urdu (Nastaliq)*

*Task A: Multi-label emotion classification in Urdu* Link <https://sites.google.com/view/multi-label-emotionsfire-task/home/task-a>
Task A requires you to classify each tweet as one or more of the six basic emotions (plus neutral), whichever best represent the emotion of the person tweeting.

*Task B: Threatening Language Detection Task in Urdu* Link <https://sites.google.com/view/multi-label-emotionsfire-task/home/task-b>
Task B focuses on detecting threatening language in Urdu tweets. This is a binary classification task in which participating systems are required to classify tweets into two classes, namely: Threatening and Non-Threatening.

*Note: Participants in this year's shared task can choose to participate in either one or both subtasks. Please visit the website for more information.*

*Important Dates*

30th June – Training data release
25th July – Codalab submission link release (Task A)
7th September - Test set release (Task B)
20th September – Run submission deadline
30th September – Results Declared
12th October - Working Note submission
26th October - Review Notifications
2nd November – Camera Ready Due
9th - 13th December - FIRE 2022 (Online Event)

*Organizers*

Sabur Butt, Instituto Politécnico Nacional, Mexico
Maaz Amjad, Instituto Politécnico Nacional, Mexico
Noman Ashraf, Dana-Farber Cancer Institute, Harvard Medical School, United States
Fazlourrahman Balouchzahi, Instituto Politécnico Nacional, Mexico
Rajesh Sharma, University of Tartu, Estonia
Grigori Sidorov, Instituto Politécnico Nacional, Mexico
Alexander Gelbukh, Instituto Politécnico Nacional, Mexico

*Contact*

Email: emothreat2022(a)gmail.com
Google-group: Link <https://groups.google.com/g/emothreat>
FIRE 2022: Link <http://fire.irsi.res.in/fire/2022/home>

--
With my best regards,
Maaz Amjad, PhD
================================================
LinkedIn <http://www.linkedin.com/in/maazmjad>
Twitter <https://twitter.com/maazamjad13?lang=en>
Skype: maaz.amjad72
Mobile: +5215567332662
Email: h.maazamjad(a)gmail.com

[Final Deadline Extension] First Workshop on Information Extraction from Scientific Publications (WIESP) at AACL-IJCNLP 2022
by Tirthankar Ghosal 03 Sep '22

*** First Workshop on Information Extraction from Scientific Publications (WIESP) at AACL-IJCNLP 2022 ***
*** Website: https://ui.adsabs.harvard.edu/WIESP/ ***
Twitter: https://twitter.com/wiesp_nlp

The number of scientific papers published per year has exploded in recent years. Indexing the full text of articles in search engines helps discover and retrieve vital scientific information to continue building on the shoulders of giants, informing policy, and making evidence-based decisions. Nevertheless, it is difficult to navigate this ocean of data. Using simple string matching has substantial limitations: human language is ambiguous in nature, context matters, and we frequently use the same words and acronyms to represent a multitude of different meanings. Extracting structured and semantically relevant information from scientific publications (e.g., named-entity recognition, summarization, citation intention, linkage to knowledge graphs) allows for better selection and filtering of articles.

The First Workshop on Information Extraction from Scientific Publications (WIESP) will create the necessary forum to foster discussion and research using Natural Language Processing and Machine Learning. WIESP would specifically focus on topics related to information extraction from scientific publications, including (but not limited to):

- Scientific document parsing
- Scientific named-entity recognition
- Scientific article summarization
- Question-answering on scientific articles
- Citation context/span extraction
- Structured information extraction from full-text, tables, figures, bibliography
- Novel datasets curated from scientific publications
- Argument extraction and mining
- Challenges in information extraction from scientific articles
- Building knowledge graphs via mining scientific literature; querying scientific knowledge graphs
- Novel tools for IE on scientific literature and interaction with users
- Mathematical information extraction
- Scientific concepts, facts extraction
- Visualizing scientific knowledge
- Bibliometric and Altmetric studies via information extraction from scientific articles and metadata
- Information extraction from COVID-19 articles to inform public health policy

In addition to research paper presentations, WIESP would also feature keynote talks, a panel discussion, and a shared task. We will update the details on our website as and when they become available. We especially welcome participation from academic and research institutions, government and industry labs, publishers, and information service providers. Projects and organizations using NLP/ML techniques in their text mining and enrichment efforts are also welcome to participate.

***Call for Papers***

We invite papers of the following categories:

***Long papers*** must describe substantial, original, completed, and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Papers must not exceed eight (8) pages of content, plus unlimited pages of references. The final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers' comments can be taken into account.

***Short papers*** must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages, such as a small, focused contribution, a negative result, or an interesting application nugget. Short papers must not exceed four (4) pages, plus unlimited pages of references. The final versions of short papers will be given one additional page of content (up to 5 pages) so that reviewers' comments can be taken into account.

***Position papers*** will give voice to authors who wish to take a position on a topic listed above or the field of scholarly information extraction. Submissions need not present original work and should be two to four pages in length, including title, text, figures and tables, and references.

***Demo papers*** should be no more than four (4) pages in length, including references, and should describe implemented systems that are of relevance to the theme of the workshop. Authors of demo papers should be willing to present a demo of their system during WIESP at AACL-IJCNLP 2022.

***Extended Abstracts*** We welcome submissions of extended abstracts (2 pages max) related to the research topics mentioned above. Submissions may include previously published results, late-breaking results, or a description of ongoing projects in the broad field of information extraction and mining from scientific publications. Extended abstracts can also summarize existing work, work in progress, or a collection of works under a unified theme (e.g., a series of closely related papers that build on each other or tackle a common problem).

***Shared Task: Detecting Entities in the Astrophysics Literature (DEAL)***

A good amount of astrophysics research makes use of data coming from missions and facilities such as ground observatories in remote locations or space telescopes, as well as digital archives that hold large amounts of observed and simulated data.
These missions and facilities are frequently named after historical figures or use some ingenious acronym which, unfortunately, can be easily confused when searching for them in the literature via simple string matching. For instance, Planck can refer to the person, the mission, the constant, or several institutions. Automatically recognizing entities such as missions or facilities would help tackle this word sense disambiguation problem.

The shared task consists of Named Entity Recognition (NER) on samples of text extracted from astrophysics publications. The labels were created by domain experts and designed to identify entities of interest to the astrophysics community. They range from simple to detect (ex: URLs) to highly unstructured (ex: Formula), and from useful to researchers (ex: Telescope) to more useful to archivists and administrators (ex: Grant). Overall, 31 different labels are included, and their distribution is highly unbalanced (ex: ~100x more Citations than Proposals).

Submissions will be scored using both the CoNLL-2000 shared task seqeval F1-Score at the entity level and scikit-learn's Matthews correlation coefficient method at the token level. We also encourage authors to propose their own evaluation metrics.

A sample dataset and more instructions can be found at: https://ui.adsabs.harvard.edu/WIESP/2022/SharedTasks

Participants (individuals or groups) will have the opportunity to present their findings during the workshop and write a short paper. The best-performing or most interesting approaches might be invited to further collaborate with the NASA Astrophysical Data System (https://ui.adsabs.harvard.edu/).
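To make the two scoring levels of the shared task concrete, here is a minimal pure-Python sketch of an entity-level F1 and a token-level Matthews correlation coefficient. It is an illustration only, not the official scorer: the task itself uses seqeval and scikit-learn. The BIO tagging scheme, the toy sentence, and the helper names (`extract_entities`, `entity_f1`, `token_mcc`) are assumptions made for this example; only the label names `Telescope` and `Grant` come from the task description.

```python
from collections import Counter
from math import sqrt


def extract_entities(tags):
    """Return the set of (label, start, end) spans in one BIO tag sequence."""
    entities = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues_span = tag.startswith("I-") and label == tag[2:]
        if continues_span:
            continue
        if label is not None:
            entities.add((label, start, i - 1))
        if tag.startswith(("B-", "I-")):
            start, label = i, tag[2:]  # a stray I- tag opens a new span
        else:
            start, label = None, None
    return entities


def entity_f1(gold_seqs, pred_seqs):
    """Micro-averaged F1 over exactly matching (label, start, end) spans."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def token_mcc(gold_seqs, pred_seqs):
    """Multiclass Matthews correlation coefficient over all tokens."""
    gold = [t for seq in gold_seqs for t in seq]
    pred = [t for seq in pred_seqs for t in seq]
    s = len(gold)                                   # total tokens
    c = sum(g == p for g, p in zip(gold, pred))     # correctly tagged tokens
    t_k, p_k = Counter(gold), Counter(pred)         # per-class true / predicted counts
    cov_tp = c * s - sum(t_k[k] * p_k[k] for k in t_k)
    cov_tt = s * s - sum(v * v for v in t_k.values())
    cov_pp = s * s - sum(v * v for v in p_k.values())
    denom = sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


# Toy example (hypothetical data): the prediction misses the Grant entity.
gold = [["B-Telescope", "I-Telescope", "O", "B-Grant"]]
pred = [["B-Telescope", "I-Telescope", "O", "O"]]

print(f"entity F1: {entity_f1(gold, pred):.3f}")  # entity F1: 0.667
print(f"token MCC: {token_mcc(gold, pred):.3f}")  # token MCC: 0.730
```

The two numbers differ because the entity-level score counts a span only when its label and both boundaries match, while the token-level score gives credit for every correctly tagged token, including the `O` tokens.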
***Important Dates***

- Paper/Abstract Submission Deadline: September 12, 2022 (Final extension)
- Notification of workshop paper/abstract acceptance: October 7, 2022
- Camera-ready Submission Deadline: October 24, 2022
- Workshop: November 20, 2022 (online)

***All submission deadlines are 11:59 pm UTC-12 ("Anywhere on Earth")***

***Submission Website and Format***

Submission Link: https://softconf.com/aacl2022/WIESP/

Submission will be via softconf. Submissions should follow the ACLPUB formatting guidelines (https://acl-org.github.io/ACLPUB/formatting.html) and template files (https://github.com/acl-org/acl-style-files/tree/master).

Submissions (Long and Short Papers) will be subject to a double-blind peer-review process. Position papers, Demo papers, and Extended Abstracts need not be anonymized. The authors will present accepted papers at the workshop either as a talk or a poster. All accepted papers will be published in the workshop proceedings.

We follow the same policies as AACL-IJCNLP 2022 regarding preprints and double submissions. The anonymity period for WIESP 2022 is from July 15 to September 25.

***Organizers***

- Tirthankar Ghosal, Charles University, CZ
- Sergi Blanco-Cuaresma, Center for Astrophysics | Harvard & Smithsonian, USA
- Alberto Accomazzi, Center for Astrophysics | Harvard & Smithsonian, USA
- Robert M. Patton, Oak Ridge National Laboratory, USA
- Felix Grezes, Center for Astrophysics | Harvard & Smithsonian, USA
- Thomas Allen, Center for Astrophysics | Harvard & Smithsonian, USA

--
Tirthankar Ghosal
Researcher at UFAL, Charles University, CZ
https://member.acm.org/~tghosal


1st CFP: Crowdsourcing & Human Computation Track @TheWebConf 2023
by Matt Lease 02 Sep '22


https://www2023.thewebconf.org/calls/research-tracks/crowdsourcing-hc/

We invite research contributions to the Crowdsourcing and Human Computation track at the 32nd edition of The Web Conference series (formerly known as WWW), to be hosted in Austin, TX, US, on April 30 - May 4, 2023 (https://www2023.thewebconf.org/).

Fifteen years ago, a 2007 WWW paper entitled “Internet-Scale Collection of Human-Reviewed Data <https://dl.acm.org/doi/abs/10.1145/1242572.1242604>” was one of several forerunners to signal a new, emerging area of research on Human Computation and Crowdsourcing (HCOMP). Growing excitement and work in this new area would eventually lead to four years of HCOMP workshops across KDD and AAAI (2009-2012), a new annual AAAI HCOMP conference <https://www.humancomputation.com/> (2013 onward), and a new, annual HCOMP track at the WebConference (2014 onward).

Today, the world and the research landscape look remarkably different than they did in 2007, with the Web playing a central role in orchestrating such advances. Of particular note, modern neural models have transformed AI capabilities, along with far greater ubiquity and significance of AI systems now in practical deployment around the world. As one effect of this, the commoditization and democratization of AI models today has also brought a new focus to “data-centric AI” in which AI models can succeed or fail based on the quality of underlying data and human annotations. The nature of human-AI interactions is also continually evolving in response to AI advances, posing an ever-changing frontier of new challenges for researchers and practitioners. Furthermore, the growth of AI power has brought a commensurate recognition of the need for responsible AI systems that are fair, accountable, transparent, and trustworthy – across diverse, global communities of human stakeholders who interact with or are impacted by AI systems.
Given the central role of HCOMP in AI (creating reliable training and benchmark annotations, as well as enabling hybrid, human-in-the-loop systems), continuing innovation in HCOMP remains a key challenge for the further advancement of AI. HCOMP itself has made tremendous strides forward in the past fifteen years, yet many research challenges remain.

*We invite AI, HCI, and related contributions that advance the broad spectrum of crowdsourcing and human computation (HCOMP) in the scope of the Web*:

- algorithms, analysis, applications, methods, systems, and techniques
- conceptual, empirical, theoretical, and mixed-methods
- spanning fields (e.g., psychology, sociology, economics, ethics, etc.)
- system-centered, human-centered, and hybrid

More specifically, we invite work addressing contemporary HCOMP challenges including (but not limited to) the following Web-related themes:

- Fundamental research challenges in Web-based HCOMP
  - *Data collection, generation, labeling, and cleaning*: data-centric AI; human and AI-assisted annotation; annotator agreement, aggregation, and modeling; annotation subjectivity and ambiguity, data excellence; human-in-the-loop data augmentation, generation, and adversarial attacks; label noise and bias detection and reduction; task decomposition, task and workflow design, novel modalities for input acquisition, etc.
  - *Human-centered explainability*: algorithmic/model explanations, interpretability, and transparency to enhance human success in using AI in decision-making, model and data debugging, task performance, trust in AI systems, appropriate reliance, etc. (please also read the CFP of the “Fairness, Accountability, Transparency and Ethics” track)
  - *Human-centered studies*: collaborative systems, computer-supported cooperative work, human-computer interaction, human factors, interaction design, usability, user experience, etc.
  - *Resources, benchmarking, reproducibility*: New resources for the community (e.g., datasets, open source toolkits, etc.), benchmarking studies comparing state of the art methods, and/or reproducibility studies of prior work.
  - *Addressing bias and diversity in annotation and human computation*: Methods and algorithms to identify and mitigate biases in annotations; bias-aware annotation workflows; diversity in annotators and workers, data labeling, and hybrid, human-in-the-loop systems; downstream effects of annotator diversity on bias and fairness measures; impact on evaluation of various systems (e.g. information retrieval systems, recommender systems, etc.); ethics and fairness of HCOMP practices
- Underlying workforce powering Web-based HCOMP
  - *Social and economic impacts of human computation and crowdsourcing*: societal and methodological challenges around crowdsourcing labor and workforces; inequalities in access and representation in crowdsourcing workforces; platform affordances and economic impact
  - *Supporting HCOMP workers*: collective action; design activism; fair work <https://fair.work/en/fw/homepage/>; ghost work, heteromation, and invisible work; human computation, digital colonialism, and the global south; impact sourcing <https://en.wikipedia.org/wiki/Impact_sourcing> and responsible sourcing <https://partnershiponai.org/workstream/responsible-sourcing/>; regulation; worker empowerment, organization, protection and wellness; and workforce diversity, equity, and inclusion, etc.
  - *Future of work*: AI-assisted human coordination, team formation and work, distributed work, freelancer economy, hybrid, human+AI work and complementarity, etc.
- Web-based HCOMP systems, frameworks, or architectures
  - *Crowd-powered systems*: data management, marketplace design and sustainability, platforms, scalability, security, privacy, programming languages, real-time crowdsourcing, etc.
  - *Human-in-the-loop architectures*: decision support; human-AI collaboration, interaction, and teaming; hybrid systems; mixed-initiative design, etc.
  - *Crowdsourcing*: citizen science, collective intelligence, crowd computing, crowd creativity, crowdfunding, crowd ideation, crowd intelligence, crowd sensing, crowdsourcing contests, crowd phenomena, crowd science, incentive schemes, gamification, human flesh search, open innovation, peer production, prediction markets, reputation systems, social web, wisdom of crowds, etc.
  - *Human computation*: decision-theoretic and game-theoretic design, design patterns, human algorithm design and complexity, mechanism and incentive design, etc.
- Web-based Applications of HCOMP
  - *Machine learning for HCOMP*: aggregation, answer fusion, annotator and user modeling, quality assurance, optimization, task assignment and recommendation, truth inference, etc.
  - *New Applications and Services*: delivering beyond state-of-the-art AI capabilities and enhanced services through human computation and human-in-the-loop systems.

Authors should consult the conference’s main Research Track CFP <https://www2023.thewebconf.org/calls/research-tracks/> to ensure their submissions are aligned with broader conference expectations, scope, and theme: “Web Research with Openness, Fairness and Reproducibility”. The CFP also details submission guidelines, relevant dates, and important policies.

Review criteria <https://www.humancomputation.com/2016/review-criteria.html> will include considerations typical of those in past years of this track and the AAAI HCOMP conference.
Submissions that are out of scope or unresponsive to the call above will be rejected early during the reviewing process (“desk rejected”) with minimal feedback. This includes submissions that:

- merely apply HCOMP methods in standard, previously known ways, without novel contributions to advance the methodology itself;
- do not relate to the web or web-based human computation platforms, methods, or applications.

If you have doubts about whether your paper fits the scope of this track, please contact the track chairs at hcomp-webconf2023(a)easychair.org

Important dates

- Abstract submission: October 6, 2022. This is compulsory for all papers.
- Full papers submission: October 13, 2022
- Rebuttal: December 15 - 22, 2022
- Notification: January 25, 2023

Track chairs:

- Ujwal Gadiraju <http://ujwalgadiraju.com/> (Delft University of Technology)
- Matthew Lease <https://www.ischool.utexas.edu/~ml/> (University of Texas at Austin and Amazon)
- Besmira Nushi <https://besmiranushi.com/> (Microsoft Research)

*Senior Program Committee & Program Committee*: Stay Tuned!

--
Matt Lease
Professor
School of Information
University of Texas at Austin
Voice: (512) 471-9350 · Fax: (512) 471-3971 · Office: UTA 5.536
http://www.ischool.utexas.edu/~ml
