February 2023 - Corpora

Reminder: PhD fellowship in NLP (event extraction), University of Oslo
by Erik Velldal 23 Feb '23

[Apologies for cross-posting]

A fully-funded position as PhD Research Fellow in Natural Language Processing is available in the Language Technology Group (LTG) in the Machine Learning Section at the Department of Informatics, University of Oslo (UiO), Norway. The 3-year position is affiliated with a new research project dubbed Peace Science Infrastructure (PSI), focusing on event extraction in the domain of armed conflicts.

For more information, please see the full announcement here:
https://www.jobbnorge.no/en/available-jobs/job/239414/phd-research-fellow-i…

The closing date is February 28th, 2023. Please do not hesitate to contact me for any further information.

Best regards,
-erik

--
Erik Velldal
Language Technology Group
Section for Machine Learning
Department of Informatics, University of Oslo

2nd Call for participation: DISRPT 2023 - Shared Task on Discourse Relation Parsing and Treebanking
by Chloé Braud 22 Feb '23

2nd Call for Participation: DISRPT 2023
Shared Task on Discourse Relation Parsing and Treebanking
In conjunction with CODI 2023, ACL 2023 - 14 July 2023

News: the training and dev data sets have been released on our GitHub: https://github.com/disrpt/sharedtask2023. Now you can start training your systems! Surprise datasets will be released along with the test sets in about two months! Get excited!

News: please join us on the Google group disrpt2023_participants(a)googlegroups.com and on the dedicated Discord channel https://discord.gg/JDdjhXaK

This year, we are organizing DISRPT 2023 as a shared task on discourse processing across formalisms, for a variety of languages and genres. It is the third iteration of a cross-formalism shared task on discourse analysis, with three subtasks:
* Task 1: discourse segmentation
* Task 2: connective identification
* Task 3: relation classification

We will provide training, development and test datasets from all available languages in RST, SDRT, PDTB and Discourse Dependencies using a uniform format. Because different corpora, languages, and frameworks use different guidelines, the shared task will promote the design of flexible methods for dealing with various guidelines, and will help to push forward the discussion of converging standards for discourse units, discourse relations and discourse markers. For datasets which have treebanks, we will evaluate segmentation in two different scenarios: with and without gold syntax. An automatically parsed version is provided for all corpora without a gold parse.

Shared Task Data and Formats
Data for the shared task is released via GitHub together with format documentation and tools: https://github.com/disrpt/sharedtask2023

See here for more information about the previous shared tasks:
* 2019: https://sites.google.com/view/disrpt2019/shared-task
* 2021: https://sites.google.com/georgetown.edu/disrpt2021/

Tentative Schedule:
* 25 January 2023 - Sample data released
* 21 February 2023 - Train / dev data release
* 15 April 2023 - Test data release
* 8 May 2023 - Submission of system and paper
* 22 May 2023 - Notification of acceptance
* 1 June 2023 - Camera-ready paper due (this date has been modified since the 1st call)
* 14 July 2023 - CODI Workshop at ACL

Information:
Contact the organizers: disrpt_chairs(a)googlegroups.com
Official website: https://sites.google.com/georgetown.edu/disrpt2021
Google group for participants, please join us on: disrpt2023_participants(a)googlegroups.com
Discord group for participants, please join us on: https://discord.gg/JDdjhXaK

Organization:
* Amir Zeldes (Georgetown University, Washington, DC, USA)
* Janet Liu (Georgetown University, Washington, DC, USA)
* Philippe Muller (IRIT, University of Toulouse, Toulouse, France)
* Chloé Braud (IRIT, CNRS, Toulouse, France)
* Laura Rivière (IRIT, University of Toulouse, Toulouse, France)
* Attapol Te Rutherford (Faculty of Arts, Chulalongkorn University, Bangkok, Thailand)

PhD fellowship in explainable semantic change detection, University of Oslo
by Andrey Kutuzov 22 Feb '23

[Apologies for cross-posting]

I am looking for a PhD student interested in studying explainable semantic change detection.

Current computational approaches to modeling synchronic and diachronic semantic change achieve considerable success as measured by scores in shared tasks and the number of research papers. But they are mostly non-transparent and obscure for historical linguists and lexicographers. One of the reasons is that these methods lack explanatory power. This PhD project is intended to address this issue. The overall aim is to transform numerical change predictions into human-readable explanations linked to the rich linguistic tradition of semantic shift categorization. We will define particular paths towards this aim jointly, in discussion with the PhD student.

The position is fully funded and linked to the Language Technology Group (LTG) in the Machine Learning Section at the Department of Informatics, University of Oslo (UiO), Norway. The fellowship period is 3 years, starting no later than August 2023. A fourth year may be considered with a workload of 25% that may consist of teaching, supervision duties, and/or research assistance.

For more information, please see the full announcement here:
https://www.jobbnorge.no/en/available-jobs/job/239381/phd-research-fellow-i…

The application deadline is February 28th, 2023. Please feel free to contact me for any further information.

--
Andrey
Associate professor
Language Technology Group (LTG)
University of Oslo

Final CFP: LoResMT 2023 at EACL 2023
by Atul K. Ojha 22 Feb '23

The Sixth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2023)
https://www.loresmt.org/
@ EACL 2023 (May 2-6, 2023)
Valamar Lacroma Dubrovnik Hotel, Iva Dulčića 34, 20000, Dubrovnik, Croatia

SUBMISSION
https://softconf.com/eacl2023/LoResMT2023

TIMELINE
Paper due: March 02, 2023, at 23:59 (Anywhere on Earth); previously February 22, 2023
Notification of acceptance: March 23, 2023
Camera-ready papers due: April 01, 2023, at 23:59 (Anywhere on Earth)
Conference dates: May 2-6, 2023

SCOPE
Based on the success of past low-resource machine translation (MT) workshops at AMTA 2018 (https://amtaweb.org/), MT Summit 2019 (https://www.mtsummit2019.com), AACL-IJCNLP 2020 (http://aacl2020.org/), AMTA 2021 and COLING 2022, we introduce the Sixth Workshop. The workshop provides a discussion panel for researchers working on MT systems/methods for low-resource and under-represented languages in general. We would like to help review/overview the state of MT for low-resource languages and define the most important directions. We also solicit papers dedicated to supplementary NLP tools that are used in any language and especially in low-resource languages. Overview papers on these NLP tools are very welcome. It will be beneficial if the evaluations of these tools in research papers include their impact on the quality of MT output.

TOPICS
We are highly interested in (1) original research papers, (2) review/opinion papers, and (3) online systems on the topics below; however, we welcome all novel ideas that cover research on low-resource languages.
- COVID-related corpora, their translations and corresponding NLP/MT systems
- Neural machine translation for low-resource languages
- Work that presents online systems for practical use by native speakers
- Word tokenizers/de-tokenizers for specific languages
- Word/morpheme segmenters for specific languages
- Alignment/Re-ordering tools for specific language pairs
- Use of morphology analyzers and/or morpheme segmenters in MT
- Multilingual/cross-lingual NLP tools for MT
- Corpora creation and curation technologies for low-resource languages
- Review of available parallel corpora for low-resource languages
- Research and review papers on MT methods for low-resource languages
- MT systems/methods (e.g. rule-based, SMT, NMT) for low-resource languages
- Pivot MT for low-resource languages
- Zero-shot MT for low-resource languages
- Fast building of MT systems for low-resource languages
- Re-usability of existing MT systems for low-resource languages
- Machine translation for language preservation

SUBMISSION INFORMATION
We are soliciting two types of submissions: (1) research, review, and position papers and (2) system demonstration papers. For research, review and position papers, the length of each paper should be at least four (4) and not exceed eight (8) pages, plus unlimited pages for references. For system demonstration papers, the limit is four (4) pages. Submissions should be formatted according to the official EACL 2023 style templates (LaTeX, Word, Overleaf). Accepted papers will be published online in the EACL 2023 proceedings and will be presented at the conference.

Submissions must be anonymized and should be made using the official conference management system (which will be available in the following weeks). Scientific papers that have been or will be submitted to other venues must be declared as such and must be withdrawn from the other venues if accepted and published at LoResMT. The review will be double-blind.

We would like to encourage authors to cite papers written in ANY language that are related to the topics, as long as both original bibliographic items and their corresponding English translations are provided.

ORGANIZING COMMITTEE (LISTED ALPHABETICALLY)
Atul Kr. Ojha, University of Galway & Panlingua Language Processing LLP
Chao-Hong Liu, Potamu Research Ltd
Ekaterina Vylomova, University of Melbourne, Australia
Jade Abbott, Retro Rabbit
Jonathan Washington, Swarthmore College
Nathaniel Oco, National University (Philippines)
Tommi A Pirinen, UiT The Arctic University of Norway, Tromsø
Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University
Varvara Logacheva, Skolkovo Institute of Science and Technology
Xiaobing Zhao, Minzu University of China

PROGRAM COMMITTEE (LISTED ALPHABETICALLY)
Alberto Poncelas, Rakuten, Singapore
Alina Karakanta, Fondazione Bruno Kessler
Amirhossein Tebbifakhr, Fondazione Bruno Kessler
Anna Currey, Amazon Web Services
Aswarth Abhilash Dara, Amazon
Arturo Oncevay, University of Edinburgh
Bharathi Raja Chakravarthi, University of Galway
Beatrice Savold, University of Trento
Bogdan Babych, Heidelberg University
Constantine Lignos, Brandeis University, USA
Daan van Esch, Google
Diptesh Kanojia, University of Surrey, UK
Duygu Ataman, University of Zurich
Eleni Metheniti, CLLE-CNRS and IRIT-CNRS
Francis Tyers, Indiana University
Kalika Bali, MSRI Bangalore, India
Koel Dutta Chowdhury, Saarland University (Germany)
Jade Abbott, Retro Rabbit
Jasper Kyle Catapang, University of the Philippines
John P. McCrae, DSI, University of Galway
Kevin Patrick Scannell, Saint Louis University
Liangyou Li, Noah’s Ark Lab, Huawei Technologies
Maria Art Antonette Clariño, University of the Philippines Los Baños
Majid Latifi, University of York, York, UK
Mathias Müller, University of Zurich
Monojit Choudhury, Microsoft Turing
Rajdeep Sarkar, University of Galway
Rico Sennrich, University of Zurich
Sangjee Dondrub, Qinghai Normal University
Santanu Pal, WIPRO AI
Sardana Ivanova, University of Helsinki
Shantipriya Parida, Silo AI
Sunit Bhattacharya, Charles University
Surafel Melaku Lakew, Amazon AI

CONTACT
Please email loresmt(a)googlegroups.com if you have any questions/comments/suggestions.
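The LoResMT deadlines above are given at 23:59 "Anywhere on Earth" (AoE). As a small aid for authors in other time zones, the sketch below converts such a deadline to UTC. It assumes the usual reading of AoE as the fixed UTC-12:00 offset; the function name `aoe_deadline_to_utc` is made up for this illustration and is not part of any conference tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "Anywhere on Earth" is treated as the fixed offset UTC-12:00.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_to_utc(year, month, day, hour=23, minute=59):
    """Return the UTC instant matching a wall-clock deadline given in AoE."""
    local = datetime(year, month, day, hour, minute, tzinfo=AOE)
    return local.astimezone(timezone.utc)


# Paper deadline listed in the timeline: March 02, 2023, 23:59 AoE.
paper_deadline_utc = aoe_deadline_to_utc(2023, 3, 2)
print(paper_deadline_utc.isoformat())  # 2023-03-03T11:59:00+00:00
```

In other words, a 23:59 AoE deadline falls at 11:59 UTC on the following calendar day; passing a different `tz` to `astimezone` gives the same instant in any other zone.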

Invitation to join ACL SIGTURK
by Duygu Ataman 22 Feb '23

(Please find Turkish and Russian translations below.)

We are excited to invite you to join SIGTURK, the new ACL special interest group working on improving and promoting computational linguistics in Turkic languages.

SIGTURK grew out of our recent initiatives to prepare public resources for language studies in Turkic languages, including pre-trained language models, evaluation benchmarks and multilingual corpora. These resources are intended for researchers in natural language processing and linguistics who are interested in Turkic languages or in related problems around low-resourced languages and few-shot learning. We are happy to officially have the opportunity to foster a new interdisciplinary scientific community that aims to promote an inclusive and diverse environment for exchanging ideas, promoting collaborations and allowing language research and technologies to become applicable in more languages and domains.

As part of our ongoing projects, we have recently started working on a new multilingual natural language processing benchmark in Turkic languages, and one of our future plans is to hold an annual workshop to allow productive knowledge exchange and collaborations.

We are already an international community of scientists and professionals from many different backgrounds and from all over the world. We would be glad to see anyone interested join our group, and we are happy to welcome new ideas and promote innovative research in different directions.
If you are interested in becoming a member of SIGTURK, please fill out the form below: https://forms.gle/NwWJ8B4LSRfEi2Wx5 <https://urldefense.proofpoint.com/v2/url?u=https-3A__forms.gle_NwWJ8B4LSRfE…> And for more information visit our website: https://sigturk.github.io <https://urldefense.proofpoint.com/v2/url?u=https-3A__sigturk.github.io_&d=D…> Sincerely, Duygu Ataman SIGTURK President Assistant Professor/Faculty Fellow New York University — (Turkish) Merhaba, Sizleri ACL SIGTURK, Türki dillerin doğal dil işlemesinin geliştirilmesi ve yaygınlaştırılması adına kurduğumuz yeni uluslararası çalışma grubuna davet etmekten sevinç duyuyorum. Geçtiğimiz yıl içinde Türkçe ve Türk dillerinde makine çevirisi, dil modellemesi gibi halihazırda sistem oluşturulması için kullanılabilecek hazır geliştirilmiş dil modelleri, ve diğer doğal dil işleme uygulamalarında kullanılabilecek Türki dillerde dilbilim ve doğal dil işleme gruplarının kullanımına açık kaynak ve derlemler hazırlanması gibi konularda faaliyetler organize etmek üzere başlattığımız uluslararası bir çalışma grubunu büyük bir özveriyle resmi bir hale getirmiş bulunuyoruz, şu anda çok dilli bir Türki dil ailesi doğal dil işleme derlemi geliştirilmesi üzerinde çalışmaktayız. Çalışmalarımızı mail grubumuz üzerinden yürütüyoruz, önümüzdeki yıl da projelerimizi paylaşabileceğimiz, ve uluslararası ortamda alanımızda çalışan herkese profesyonel ağlarını geliştirme fırsatı sunabilecek bir çalıştay düzenlemeyi umuyoruz. Grubun bir parçası olmak ve bizimle Türki dillerde dilbilim ve dil işlemesi üzerine çalışan meslektaşlarımızı resmi bir çatı altında toplayarak çalışmalarımıza katkı sağlamak isteyen herkesi aramızda görmekten çok memnun oluruz. Topluluğumuzda şimdiden birçok ülkeden farklı alanlarda Türki diller konuşan, dilbilim ve bilgisayar bilimi de dahil olmak üzere farklı alanlarda çalışan ve gönüllü işbirliği yapmak için oldukça hevesli birçok üyemiz bulunuyor. 
Gruba aşağıdaki forma iletişim bilgilerinizi ekleyerek kayıt olabilirsiniz. https://forms.gle/NwWJ8B4LSRfEi2Wx5 <https://urldefense.proofpoint.com/v2/url?u=https-3A__forms.gle_NwWJ8B4LSRfE…> Tanıdığınız ve bu alanda çalışmalarımıza katılmak isteyebileceğini düşündüğünüz çalışma arkadaşlarınızı da davet ederseniz çok seviniriz. Daha fazla bilgi için web sitemizi de ziyaret edebilirsiniz. https://sigturk.github.io <https://urldefense.proofpoint.com/v2/url?u=https-3A__sigturk.github.io_&d=D…> Saygılarımla, Duygu Ataman — (Russian) Мы рады пригласить вас присоединиться к SIGTURK, новой специальной группе ассоциации компьютерной лингвистики (ACL SIG), работающей над улучшением и продвижением компьютерной лингвистики для тюркских языков. Наши недавние инициативы по подготовке общедоступных ресурсов для языковых исследований тюркских языков включают предварительно обученные языковые модели, бенчмарки (оценочные тесты) и многоязычные корпуса для исследователей, работающих над темами, связанными с обработкой естественного языка, лингвистикой, и которые интересуются тюркскими языками и для тех, кто интересуется машинным обучением и обработкой естественных языков для малоресурсных языков. Мы рады создать новое междисциплинарное научное сообщество, целью которого является создание инклюзивной и разнообразной среды для обмена идеями, развития сотрудничества и предоставления возможности для языковых исследований и технологий для большого количества языков и областей. В рамках наших текущих проектов мы недавно начали работу над новым эталоном многоязычной обработки естественного языка для тюркских языков, и одним из наших будущих планов также является проведение ежегодного семинара (workshop), чтобы обеспечить продуктивный обмен знаниями и сотрудничество. 
Как международное сообщество исследователей со всего мира, уже состоящее из ученых и профессионалов из самых разных областей, мы будем рады видеть всех, кто заинтересован в присоединении к нашей группе, и мы рады приветствовать новые идеи и будем рады продвигать инновационные исследования в различных направлениях. Если вы заинтересованы стать членом сообщества, пожалуйста, заполните следующую форму: https://forms.gle/NwWJ8B4LSRfEi2Wx5 <https://urldefense.proofpoint.com/v2/url?u=https-3A__forms.gle_NwWJ8B4LSRfE…> Для дополнительной информации, пожалуйста, посетите наш сайт: https://sigturk.github.io <https://urldefense.proofpoint.com/v2/url?u=https-3A__sigturk.github.io&d=Dw…> C уважением, Дуюгу Атаман - Duygu Ataman, Ph.D. Assistant Professor/Faculty Fellow Courant Institute of Mathematical Sciences New York University

Rana Malhas, Arabic Reading Comprehension on the Holy Qur’an using CL-AraBERT 24.2.23 12:00
by Eric Atwell 22 Feb '23

Please share this invitation with your network. Thanks,
Eric Atwell, School of Computing, University of Leeds

School of Computing Colloquium, University of Leeds, Friday 24.2.23 12:00 (UK)
Guest speaker: Dr Rana Malhas, Qatar University
Title: Arabic Reading Comprehension on the Holy Qur’an using CL-AraBERT
Venue: Teams. Click here to join the meeting <https://emea01.safelinks.protection.outlook.com/ap/t-59584e83/?url=https%3A…>
Meeting ID: 355 926 814 82
Passcode: 4dJRfD

Abstract: In this talk, we tackle the problem of machine reading comprehension (MRC) on the Holy Qur’an to address the lack of Arabic datasets and systems for this important task. We construct QRCD as the first Qur’anic Reading Comprehension Dataset, composed of 1,337 question-passage-answer triplets for 1,093 question-passage pairs, of which 14% are multi-answer questions. We then introduce CLassical-AraBERT (CL-AraBERT for short), a new AraBERT-based pre-trained model, which is further pre-trained on a Classical Arabic (CA) dataset of about 1.0B words, to complement the Modern Standard Arabic (MSA) resources used in pre-training the initial model and make it a better fit for the task. Finally, we leverage cross-lingual transfer learning from MSA to CA, and fine-tune CL-AraBERT as a reader using two MSA-based MRC datasets followed by our QRCD dataset, to constitute the first (to the best of our knowledge) MRC system on the Holy Qur’an.

To evaluate our system, we introduce Partial Average Precision (pAP) as an adapted version of the traditional rank-based Average Precision measure, which integrates partial matching in the evaluation over multi-answer and single-answer MSA questions. Adopting two experimental evaluation setups (hold-out and cross validation (CV)), we empirically show that the fine-tuned CL-AraBERT reader model significantly outperforms the baseline fine-tuned AraBERT reader model by 6.12 and 3.75 points in pAP scores, in the hold-out and CV setups, respectively.

To promote further research on this task and other related tasks on Qur’an and Classical Arabic text, we make both the QRCD dataset and the pre-trained CL-AraBERT model publicly available.

Call for Participation: shared task on Guarani-Spanish code switching analysis - GUA-SPA at IberLEF 2023
by luischir＠fing.edu.uy 22 Feb '23

*** Call for Participation for GUA-SPA at IberLEF 2023 ***

GUA-SPA - Guarani-Spanish Code Switching Analysis at IberLEF 2023
https://codalab.lisn.upsaclay.fr/competitions/11030

Guarani is a South American indigenous language that has been in contact with Spanish and other Indo-European languages for about 500 years, which has resulted in many interesting varieties with different levels of mixture. In Paraguay, according to the most recent census, most of the population of the country speak at least some Guarani, and there is a high prevalence of Guarani-Spanish bilingualism. Bilingual speakers often make use of the two languages at the same time, mixing them in different ways, in a phenomenon called code-switching.

We propose a challenge for analyzing code-switched texts in Guarani and Spanish, trying to identify the language used in each span of text, the named entities mentioned in the text, and the way Spanish is used. For this, we will provide a corpus of news and tweets where each token is labeled with an appropriate language or category identifier.

Three tasks are presented:
* Language identification in code-switched data: identify each token in the text as Guarani, Spanish, mix, named entity, or other categories.
* Named entity classification: classify the named entities found in the text as locations, organizations or people.
* Spanish code classification: classify the spans in Spanish as code changes or unadapted loans.

How to participate:
If you want to participate in this task, please join our Codalab competition: https://codalab.lisn.upsaclay.fr/competitions/11030

Important Dates:
* March 22nd, 2023: training set.
* May 24th, 2023: test set and open for submissions.
* June 7th, 2023: publication of results.
* June 14th, 2023: paper submission.
* June 28th, 2023: notification of acceptance.
* July 3rd, 2023: camera-ready paper submission.
* September, 2023: IberLEF 2023 Workshop.
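To make the token-level labelling described in this call concrete, here is a minimal sketch of how per-token labels could be grouped into contiguous language spans. It is only an illustration: the label strings ("gn", "es", "ne") and the placeholder tokens are invented here, and the actual corpus format and label inventory are defined by the organizers on the competition page.

```python
from itertools import groupby
from typing import List, Tuple

# A token paired with its label, e.g. ("w1", "gn").
# The label names are hypothetical, not the official GUA-SPA tag set.
LabeledToken = Tuple[str, str]


def label_spans(tokens: List[LabeledToken]) -> List[Tuple[str, str]]:
    """Merge consecutive tokens that share a label into (label, text) spans."""
    spans = []
    for label, group in groupby(tokens, key=lambda pair: pair[1]):
        text = " ".join(token for token, _ in group)
        spans.append((label, text))
    return spans


# Placeholder tokens standing in for a code-switched sentence.
sentence = [
    ("w1", "gn"), ("w2", "gn"),   # Guarani stretch
    ("w3", "es"), ("w4", "es"),   # Spanish stretch
    ("w5", "ne"),                 # named entity
    ("w6", "gn"),                 # back to Guarani
]

for label, text in label_spans(sentence):
    print(label, text)
```

Grouping by adjacency rather than by label value keeps the two Guarani stretches separate, which is what a span-level view of code-switching needs.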

Call for Participation: shared task on binary gender prediction
by Marina Litvak 22 Feb '23

Binary Gender Prediction From Long Texts
Competition on classification of authors' binary gender from their writing style

Competition site: https://www.kaggle.com/competitions/binary-gender-prediction-from-long-text…

Timeline
- February 20, 2023: Competitions are published on the website and opened to participants.
- February 20, 2023: Registration for the competition. In addition to the registration on Kaggle, also fill out the registration form <https://forms.gle/LhYuGcjnYSBj5YgY6>.
- February 20, 2023: Training and validation datasets are available for registered participants.
- April 13, 2023: Test dataset available.
- April 20, 2023, 23:59 UTC: Delivery of results with a short description of the method.

This competition is organized in conjunction with the 1st International Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT’23) <https://en.sce.ac.il/news/iact23>, which will be held in Taipei, Taiwan on July 27.

The ability to accurately determine the binary gender of an author based on their writing style opens up a wealth of possibilities in a variety of fields. In historical document analysis, for example, researchers can gain deeper insights into the role and representation of women in different societies. In forensic investigations, this information can be used to focus the investigation on a smaller and more targeted group, thereby increasing the efficiency and effectiveness of the investigation.

By participating in this competition, researchers will have the chance to contribute to this growing field and make a meaningful impact on society. Whether you're a seasoned NLP professional or just starting out, this competition presents a unique opportunity to challenge yourself and take your skills to the next level. So don't miss out on this exciting opportunity to make your mark in the world of IR.

Organizers:
- Marina Litvak - marinal(a)ac.sce.ac.il; Shamoon College of Engineering Beer Sheva; Israel
- Irina Rabaev - irinar(a)ac.sce.ac.il; Shamoon College of Engineering Beer Sheva; Israel
- Alípio Mário Jorge - amjorge(a)fc.up.pt; University of Porto; Porto, Portugal
- Ricardo Campos - ricardo.campos(a)ipt.pt; Polytechnic Institute of Tomar INESC TEC, Portugal; Porto, Portugal
- Adam Jatowt - adam.jatowt(a)uibk.ac.at; University of Innsbruck; Innsbruck, Austria
- Vladimir Yonkin - vladiyo(a)ac.sce.ac.il; Shamoon College of Engineering, Beer Sheva, Israel

--
Best regards,
Marina Litvak

CFP: The 1st International Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT’23)
by Marina Litvak 22 Feb '23

Call for Papers: The 1st International Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT’23)

The workshop will be held in conjunction with the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval.

Workshop website: https://en.sce.ac.il/news/iact23
July 27, 2023. Taipei, Taiwan.
Paper submission deadline: April 25, 2023, AoE
Submission link: https://easychair.org/conferences/?conf=iact23

To bring the research community's attention to the limitations of current models at capturing the hidden author characteristics from texts, we organize the first edition of IACT workshops under the umbrella of the SIGIR conference. Research works submitted to the workshop should foster the scientific advance on all aspects of implicit author information extraction from text. All papers must be original and not simultaneously submitted to another journal or conference.

The following paper categories are welcome:
- Research papers: up to 8 pages. Original and high-quality unpublished contributions to the theory and practical aspects of the implicit author characterization task. Full papers should introduce existing approaches and describe the methodology and the experiments conducted in detail. Negative result papers highlighting tested hypotheses that did not get the expected outcome are also welcome.
- Shared task papers: up to 5 pages.

The submissions must be anonymous and will be peer-reviewed by at least two program committee members. Papers must be submitted electronically in PDF format through EasyChair <https://easychair.org/conferences/?conf=iact23>. All submissions must be in English and formatted according to the one-column CEUR-ART style with no page numbers. Templates in Word or LaTeX can be found in the following zip folder at http://ceur-ws.org/Vol-XXX/CEURART.zip. There is also an Overleaf page <https://www.overleaf.com/latex/templates/template-for-submissions-to-ceur-w…> for LaTeX users.

Research works submitted to the workshop should foster the scientific advance on all aspects of implicit author information extraction from text, including but not limited to the following:
- Query understanding
- Personalized information search and retrieval
- Demographics (gender, age, location, etc.) prediction and analysis
- Profiling protagonists
- Multi-modal, multi-genre, and multilingual author analysis
- Detecting implicit expressions of sentiment, emotion, opinion, and bias
- Transfer learning for implicit author characterization
- Implicit author profiling annotation schema
- Evaluation of implicit author profiling
- Differentiation between AI-generated content and human-generated content and bot profiling
- Author profiling in low-resource languages and under-studied domains
- Exploration of the relations between self-reported or binary gender identity and implicit personality traits
- Mental disorders and risk prediction
- Author identification and verification

Organizing Committee:
- Marina Litvak - marinal(a)ac.sce.ac.il; Shamoon College of Engineering Beer Sheva; Israel
- Irina Rabaev - irinar(a)ac.sce.ac.il; Shamoon College of Engineering Beer Sheva; Israel
- Alípio Mário Jorge - amjorge(a)fc.up.pt; University of Porto; Porto, Portugal
- Ricardo Campos - ricardo.campos(a)ipt.pt; Polytechnic Institute of Tomar INESC TEC, Portugal; Porto, Portugal
- Adam Jatowt - adam.jatowt(a)uibk.ac.at; University of Innsbruck; Innsbruck, Austria

Invited Speakers:
- Prof. Mark Last - Ben-Gurion University of the Negev, Israel
- Prof. Dr. Valia Kordoni - Humboldt-Universität Berlin, Germany

IACT’23 proceedings will be published at CEUR workshop proceedings (indexed in Scopus and DBLP) as long as they do not conflict with previous publication rights.

Contact:
- Dr. Marina Litvak: litvak.marina(a)gmail.com
- Dr. Irina Rabaev: irinar(a)ac.sce.ac.il

--
Best regards,
Marina Litvak

SIGDIAL 2023 Call for Papers (submission deadline May 15)
by Svetlana Stoyanchev 22 Feb '23

The 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL) and the 16th International Natural Language Generation Conference (INLG) will be held jointly in Prague on September 11-15, 2023.

The SIGDIAL venue provides a regular forum for the presentation of cutting-edge research in dialogue and discourse to both academic and industry researchers, continuing a series of 23 successful previous meetings. The conference is sponsored by the SIGDIAL organization, the Special Interest Group in discourse and dialogue for ACL and ISCA.

Topics of Interest
We welcome formal, corpus-based, implementation, experimental, or analytical work on discourse and dialogue including, but not restricted to, the following themes:
* Discourse Processing: Rhetorical and coherence relations, discourse parsing and discourse connectives. Reference resolution. Event representation and causality in narrative. Argument mining. Quality and style in text. Cross-lingual discourse analysis. Discourse issues in applications such as machine translation, text summarization, essay grading, question answering and information retrieval. Discourse issues in text generated by large language models.
* Dialogue Systems: Task-oriented and open-domain spoken, multi-modal, embedded, situated, and text-based dialogue systems, their components, evaluation and applications. Knowledge representation and extraction for dialogue. State representation, tracking and policy learning. Social and emotional intelligence. Dialogue issues in virtual reality and human-robot interaction. Entrainment, alignment and priming. Generation for dialogue. Style, voice, and personality. Safety and ethics issues in dialogue.
* Corpora, Tools and Methodology: Corpus-based and experimental work on discourse and dialogue, including supporting topics such as annotation tools and schemes, crowdsourcing, evaluation methodology and corpora.
* Pragmatic and Semantic Modeling: Pragmatics and semantics of conversations (i.e., beyond a single sentence), e.g., rational speech act, conversation acts, intentions, conversational implicature, presuppositions.
* Applications of Dialogue and Discourse Processing Technology.

Submissions
The program committee welcomes the submission of long papers, short papers, and demo descriptions. Submitted long papers may be accepted for oral or for poster presentation. Accepted short papers will be presented as posters.
* Long paper submissions must describe substantial, original, completed and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers must be no longer than 8 pages, including title, text, figures and tables. An unlimited number of pages is allowed for references. Two additional pages are allowed for appendices containing sample discourses/dialogues and algorithms, and an extra page is allowed in the final version to address reviewers’ comments.
* Short paper submissions must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages, such as a small, focused contribution; a negative result; or an interesting application nugget. Short papers should be no longer than 4 pages including title, text, figures and tables. An unlimited number of pages is allowed for references. One additional page is allowed for sample discourses/dialogues and algorithms, and an extra page is allowed in the final version to address reviewers’ comments.
* Demo descriptions should be no longer than 4 pages including title, text, examples, figures, tables and references. A separate one-page document should be provided to the program co-chairs for demo descriptions, specifying furniture and equipment needed for the demo.

Authors are encouraged to also submit additional accompanying materials, such as corpora (or corpus examples), demo code, videos and sound files.

Multiple Submissions
SIGDIAL 2023 cannot accept work for publication or presentation that will be (or has been) published elsewhere, or that has been or will be submitted to other meetings or publications whose review periods overlap with that of SIGDIAL. Overlap with the SIGDIAL workshop submissions is permitted for non-archived workshop proceedings. Any questions regarding submissions can be sent to program-chairs [at] sigdial.org.

Blind Review
Building on previous years’ move to anonymous long and short paper submissions, SIGDIAL 2023 will follow the ACL policies for preserving the integrity of double-blind review (see author guidelines <https://www.aclweb.org/adminwiki/index.php?title=ACL_Author_Guidelines>). Unlike long and short papers, demo descriptions will not be anonymous. Demo descriptions should include the authors’ names and affiliations, and self-references are allowed.

Submission Format
All long, short, and demonstration submissions must follow the two-column ACL format, which is available as an Overleaf template <https://www.overleaf.com/read/crtcwgxzjskr> and also downloadable directly <https://github.com/acl-org/acl-style-files> (LaTeX and Word). Submissions must conform to the official ACL style guidelines, which are contained in these templates. Submissions must be electronic, in PDF format.

Submission Deadline
SIGDIAL will accept regular submissions through the Softconf/START system, as well as commitment of already reviewed papers through the ACL Rolling Review (ARR) system.

Regular submission
Authors have to fill in the submission form in the Softconf/START system and upload an initial PDF of their papers before May 15, 2023 (23:59 GMT-11). Details and the submission link will be posted on the conference website <https://2023.sigdial.org/>.

Submission via ACL Rolling Review (ARR) <https://aclrollingreview.org/>
Please refer to the ARR Call for Papers <https://aclrollingreview.org/cfp> for detailed information about submission guidelines to ARR. The commitment deadline for authors to submit their reviewed papers, reviews, and meta-review to SIGDIAL 2023 is June 19, 2023. Note that the paper needs to be fully reviewed by ARR in order to make a commitment, thus the latest date for ARR submission will be April 15, 2023.

Mentoring
Acceptable submissions that require language (English) or organizational assistance will be flagged for mentoring, and accepted with a recommendation to revise with the help of a mentor. An experienced mentor who has previously published in the SIGDIAL venue will then help the authors of these flagged papers prepare their submissions for publication.

Best Paper Awards
In order to recognize significant advancements in dialogue/discourse science and technology, SIGDIAL 2023 will include best paper awards. All papers at the conference are eligible for the best paper awards. A selection committee consisting of prominent researchers in the fields of interest will select the recipients of the awards.

SIGDIAL 2023 Program Committee
Svetlana Stoyanchev and Shafiq Rayhan Joty

Conference Website: https://2023.sigdial.org/

