- Corpora - ELRA lists

Fwd: [Call for Papers] 4th NeurIPS ENLSP 2024 Workshop
by Ignacio J. Iacobacci 02 Aug '24

(Apologies for cross-posting)

The 4th NeurIPS ENLSP 2024 workshop on "Efficient Natural Language & Speech Processing: Highlighting New Architectures for Future Foundation Models" is now open for submissions! If your research aims to enhance the efficiency of large language and foundation models in architecture, training, and inference for real-world applications, we invite you to submit your work to this workshop. Please feel free to share this call with your network.

Link to the website for more details: https://neurips2024-enlsp.github.io

*Call for Papers*

The scope of this workshop includes, but is not limited to, the following topics:

- Efficient Architectures and Models 🧠
- Efficient Training (pre-training & fine-tuning) ⚡
- Efficient Inference 📈
- Evaluation and Benchmarking of Efficient Models 📊
- Efficient Solutions in Other Modalities and Applications 🌐
  - Efficiency of foundational or pre-trained models in multi-modal set-up and other modalities (beyond NLP and Speech) such as biology, chemistry, computer vision, and time series
  - Efficient representations (e.g. Matryoshka representation) and models in dense retrieval and search
  - Efficient Federated learning, lower communication costs, tackling heterogeneous data and models
- ... and much more!

*Submission Remarks*

You are invited to submit your papers in our CMT submission portal ( https://cmt3.research.microsoft.com/ENLSP2024 ). All submitted papers have to be anonymous for double-blind review. We expect each paper to be reviewed by at least three reviewers. The content of the paper (excluding the references and supplementary materials) should not be more than *8 pages for Long Papers* and *4 pages for Short Papers*, strictly following the NeurIPS template style.

According to the guidelines for NeurIPS workshops, already published papers are not encouraged for submission, but you are allowed to submit your arXiv papers or papers that are under submission (for example, *any NeurIPS submissions can be submitted concurrently to workshops*). To encourage higher quality submissions, our sponsors are offering the *Best Paper* and the *Best Poster* Awards to qualified outstanding original oral and poster presentations (upon nomination of the reviewers). Bear in mind that our workshop is not archival, but the accepted papers will be hosted on the workshop website. *Moreover, authors of accepted papers may opt in to publish their papers in a special issue proceeding for the workshop in PMLR.*

*Important Dates:*

- Submission Deadline: *August 30, 2024 Anywhere on Earth (AOE)*
- Acceptance Notification: October 14, 2024 AOE
- Camera-Ready Submission: October 28, 2024 AOE

We look forward to receiving your submissions and encourage you to share this call for papers with your network (or you may reshare this LINK <https://x.com/mrgzadeh/status/1814400987299238188> on X).

Thanks!

Best Regards,
NeurIPS ENLSP 2024 Organizers

Open Call to Join the WOAH Organizing Team
by Agostina Calabrese 02 Aug '24

Dear Colleagues,

We are looking for a new member to help us diversify our organizing team for the 9th edition of the Workshop on Online Abuse and Harms <https://www.workshopononlineabuse.com/> (WOAH).

If you are interested, please fill out the application form by August 31st: https://t.co/UoqdNV8ZBh.

All the best,
Agostina

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. Is e buidheann carthannais a th' ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.

2nd Call for Papers for the 22nd Annual Workshop of the Australasian Language Technology Association Workshop - #ALTA2024
by Kathy Reid 02 Aug '24

Dear Colleagues,

We're delighted to announce that the CfP for the 22nd Annual Workshop of the Australasian Language Technology Association - ALTA 2024 - is now open and closes on 20th September (23:59 Anywhere on Earth, UTC-12).

Details are available on our website at https://alta2024.alta.asn.au/calls/papers and a summary follows.

---

Important Dates

* Submission deadline for short/long papers, presentation abstracts and industry demonstrations: 20 September 2024 (23:59 Anywhere on Earth, UTC-12).
* Main conference: 3 December and 4 December 2024, ANU, Canberra, ACT, hybrid (in person and online)

Overview

The 22nd Annual Workshop of the Australasian Language Technology Association (ALTA) will be held in a hybrid format at the Australian National University, Canberra, from 2 December to 4 December 2024.

The ALTA 2024 workshop is the key local forum for socialising research results in Natural Language Processing (NLP) and Computational Linguistics (CL). It will feature presentations, posters, and demonstrations from students, industry, and academic researchers. As in previous years, we also encourage submissions and participation from industry and government researchers and developers.

Note that ALTA is listed in the CORE 2023 Conference Rankings as Australasian C <https://www.core.edu.au/conference-portal>.

Topics

ALTA invites the submission of papers and presentations on all aspects of NLP and CL, including, but not limited to:

* Commonsense Reasoning.
* Computational Social Science and Cultural Analytics.
* Dialogue and Interactive Systems.
* Discourse and Pragmatics.
* Efficient Methods for NLP.
* Ethics in NLP.
* Information Extraction.
* Information Retrieval and Text Mining.
* Interpretability, Interactivity and Analysis of Models for NLP.
* Language Grounding to Vision, Robotics and Beyond.
* Language Modeling and Analysis of Language Models.
* Linguistic Theories, Cognitive Modeling and Psycholinguistics.
* Machine Learning for NLP.
* Machine Translation.
* Multilinguality and Linguistic Diversity.
* Natural Language Generation.
* NLP Applications.
* Phonology, Morphology and Word Segmentation.
* Question Answering.
* Resources and Evaluation.
* Semantics: Lexical, Sentence level, Document Level, Textual Inference, etc.
* Sentiment Analysis, Stylistic Analysis, and Argument Mining.
* Speech and Multimodality.
* Summarisation.
* Syntax, Parsing and their Applications.

We particularly encourage submissions that broaden the scope of our community by considering practical applications of language technology and multidisciplinary research. We also specifically encourage submissions from industry.

Format and instructions for authors

Please refer to our CfP webpage for specifics: <https://alta2024.alta.asn.au/calls/papers>

We are using OpenReview for submissions, and invite submissions in three different formats: (1) Original Research Papers, (2) Abstract-based Presentations, and (3) Industry Demonstrations.

---

You can follow ALTA on social media at the following links:

* LinkedIn (page): https://www.linkedin.com/company/australasian-language-technology-associati…
* LinkedIn (group): https://www.linkedin.com/groups/1849979/
* Twitter: https://twitter.com/altanlp
* Mastodon: https://sigmoid.social/@ALTAnlp
* Hashtag is #ALTA2024

With kind regards, on behalf of the ALTA 2024 Team:

Dr Gabriela Ferraro, General Chair
Professor Tim Baldwin, Program Chair
Dr Sergio José Rodríguez Méndez, Program Chair
Dr Nicholas Kuo, Program Chair
Dr Anton Malko, Publication Chair
Dr Dawei Chen, Technology Chair
A/Prof Shunichi Ishihara, Finance Chair
Charbel El-Khaissi, PhD candidate, Sponsorship Chair
Ned Cooper, PhD candidate, Local Chair
Kathy Reid, PhD candidate, Publicity Chair

Final Call for Participation - CoLI-Dravidian@FIRE 2024: Word-level Code-Mixed Language Identification in Dravidian Languages
by Sabur B 01 Aug '24

****We apologize for multiple postings of this e-mail****

CALL FOR PARTICIPATION

FIRE 2024 Task - CoLI-Dravidian: Word-level Code-Mixed Language Identification in Dravidian Languages

Held as a shared task in the 16th meeting of Forum for Information Retrieval Evaluation (FIRE 2024 <http://fire.irsi.org.in/fire/2024/home>)
December 12-15, 2024. DAIICT, Gandhinagar, India

Website: https://sites.google.com/view/coli-dravidian-2024/datasets?authuser=0
Codalab link: https://codalab.lisn.upsaclay.fr/competitions/19357

Dear All,

We are inviting researchers and students to participate in the shared task CoLI-Dravidian: Word-level Code-Mixed Language Identification in Dravidian Languages, which is held as a shared task in the 16th meeting of Forum for Information Retrieval Evaluation (FIRE 2024 <http://fire.irsi.org.in/fire/2024/home>).

Language Identification (LI) involves detecting the language(s) used in a given text, which is a preliminary step for many applications such as sentiment analysis, machine translation, information retrieval, and natural language understanding. In multilingual India, especially among the youth, social media often features code-mixed text, blending local languages with English at various levels. However, this poses significant challenges for LI, particularly when languages are mixed within a single word.

Dravidian languages, extensively spoken in southern India, are under-resourced despite their rich morphological structure. These languages face technological challenges, especially in script representation on digital platforms, leading users to prefer Roman or hybrid scripts for communication. This prevalent code-mixing offers vast linguistic data for research yet remains understudied.

To address word-level LI challenges in code-mixed Dravidian languages, we are conducting a shared task by providing code-mixed datasets for four languages - Kannada, Tamil, Malayalam, and Tulu - to encourage the development of advanced LI models.

There will be a real-time leaderboard, and the participants will be allowed to make a maximum of 10 submissions in the training phase and 5 submissions in the testing phase through CodaLab. Each team will have to select the best submission for ranking.

To download the data and participate, go to: https://codalab.lisn.upsaclay.fr/competitions/19357.

Best regards,
The CoLI-Dravidian 2024 Organizing Committee

Important dates

- 14th June 2024 - open track websites and training data release
- 1st July 2024 - test data release
- 25th July - run submission deadline (7th August - run submission deadline extended)
- 8th August - results declared
- 29th August - Working notes due
- 10th September - Reviews
- 30th October - Camera-ready copies of working notes

NOTE: All dates mentioned here are in the AoE (Anywhere on Earth) zone.

Organizing Committee

- Shashirekha Hosahalli Lakshmaiah, Department of Computer Science, Mangalore University, India.
- Ameeta Agrawal, Department of Computer Science, Portland State University, USA.
- Fazlourrahman Balouchzahi, CIC, IPN, Mexico.
- Asha Hegde, Department of Computer Science, Mangalore University, India.
- Sabur Butt, IFE, Tecnologico de Monterrey, Mexico.
- Sharal Coelho, Department of Computer Science, Mangalore University, India.
- Kavya G, Department of Computer Science, Mangalore University, India.
- Harshitha, Department of Computer Science, Mangalore University, India.
- Sonith D, Department of Computer Science, Mangalore University, India.

*Sabur Butt, Ph.D.* (He/Him)
Institute for the Future of Education (IFE)
*Tecnológico de Monterrey, Mexico*
Address: Av. Eugenio Garza Sada 2501 Sur Tecnológico, 64849 Monterrey, N.L.
LinkedIn <https://www.linkedin.com/in/saburb> - GitHub <https://github.com/saburbutt> - Scholar <https://scholar.google.com/citations?user=re7md-0AAAAJ&hl=en> - Website <https://saburbutt.github.io/>

Position for a junior researcher in historical computational linguistics at the University of Milan
by Martin Ruskov 01 Aug '24

Hello everyone,

We have an open position for a junior researcher (with or without PhD) in linguistics or computational humanities at the University of Milan, as part of the project MetaLing that works on a corpus about the metalanguage of English linguistics between the 16th and the 18th century. We are particularly interested in the semantic shifts and variations in this context.

This is a one-year contract, with a possibility of extension. The application deadline is 30 August 2024.

Related links:
project overview: https://expertise.unimi.it/resource/project/PRIN202223AANDR_01
position opening (pdf): https://www.unimi.it/en/media/69915/download

For more information and help with the application process: angela.andreani(a)unimi.it or martin.ruskov(a)unimi.it

Best regards,
Martin Ruskov

Call For Papers: Workshop on Multimodal Search and Recommendations (CIKM MMSR ‘24)
by Aditya Nandkishore Chichani 31 Jul '24

Workshop on Multimodal Search and Recommendations (CIKM MMSR ‘24)

Date: October 25, 2024 (Full day workshop)
Venue: ACM CIKM 2024 <https://cikm2024.org/> (Boise, Idaho, United States)
Website: https://cikm-mmsr.github.io/
Organizers: Aditya Chichani, Surya Kallumadi, Tracy Holloway King, Andrei Lopatenko
Paper submission deadline: August 10, 2024 (23:59 GMT)

Overview:

The advent of multimodal LLMs like GPT-4o and Gemini has significantly boosted the potential for multimodal search and recommendations. Traditional search engines rely mainly on textual queries, supplemented by session and geographical data. In contrast, multimodal systems create a shared embedding space for text, images, audio, and more, enabling next-gen customer experiences. These advancements lead to more accurate and personalized recommendations, enhancing user satisfaction and engagement.

Topics of interest include, but are not limited to:

- Cross-modal retrieval techniques
  - Strategies for efficiently indexing and retrieving multimodal data.
  - Approaches to ensure cross-modal retrieval systems can handle large-scale data.
  - Development of metrics to measure similarity across different data modalities.
- Applications of Multimodal Search and Recommendations to Verticals (e.g. E-commerce, real estate)
  - Implementing and optimizing image-based product searches.
  - Creating multimodal conversational systems to enhance user experience and make search more accessible.
  - Utilizing AR to enhance product discovery and user interaction.
  - Leveraging multimodal search for efficient customer service and support.
- User-centric design principles for multimodal search interfaces
  - Best practices for designing user-friendly interfaces that support multimodal search.
  - Methods for evaluating the usability of multimodal search interfaces.
  - Personalizing multimodal search interfaces to individual user preferences.
  - Ensuring multimodal search interfaces are accessible to users with disabilities.
- Ethical Considerations and Privacy Implications of Multimodal Search and Recommendations
  - Strategies for ensuring user data privacy in multimodal applications.
  - Identifying and mitigating biases in multimodal algorithms.
  - Ensuring transparency in how multimodal results are generated and presented.
  - Approaches for obtaining and managing user consent for using their data.
- Modeling for Multimodal Search and Discovery
  - Multi-modal representation learning
  - Utilizing GPT-4o, Gemini, and other advanced pre-trained multimodal LLMs
  - Dimensionality reduction techniques to reduce complexity of multimodal data.
  - Techniques for fine-tuning pre-trained vision-language models.
  - Developing and standardizing metrics to evaluate the performance of vision-language models in multimodal search.

Submission Instructions:

All papers will be peer-reviewed by the program committee and judged based on their relevance to the workshop and their potential to generate discussion. Submissions must be in PDF format, following the latest CEUR single column format. For instructions and LaTeX/Overleaf/docx templates, refer to CEUR’s submission guidelines ( https://ceur-ws.org/HOWTOSUBMIT.html#CEURART ), reading up to and including the “License footnote in paper PDFs” section. Use Emphasizing Capitalized Style for Paper Titles.

Submissions must describe original work not previously published, not accepted for publication, and not under review elsewhere. All submissions must be in English. The workshop follows a single-blind review process and does not accept anonymous submissions. At least one author of each accepted paper must register for the workshop and present the paper.

Long paper limit: 15 pages.
Short paper limit: 8 pages.
References are not counted in the page limit.

Submit to CIKM MMSR’24: https://openreview.net/group?id=ACM.org/CIKM/2024/Workshop/MMSR

Contact: Aditya Chichani
E-mail: aditya_chichani(a)berkeley.edu

*Extended deadline: 30 Sept 2024* OSNEM Special Issue: AI in Online Social Networks: opportunities and challenges
by Andrea Passarella 30 Jul '24

------------------------------------------------------------------------------
CALL FOR PAPERS

Elsevier Online Social Networks and Media Journal (OSNEM)
Special issue on AI in Online Social Networks: opportunities and challenges

****** EXTENDED SUBMISSION DEADLINE ********
Submission deadline: continuous submissions until September 30th, 2024
********************************************

https://www.sciencedirect.com/journal/online-social-networks-and-media
------------------------------------------------------------------------------

Online Social Networks and Media are a fundamental component of everyday life and the use of AI technologies in OSNEM can further boost their role. The use of AI in online social networks offers great opportunities and, at the same time, raises several challenges.

AI's ability to analyze vast amounts of data in real-time allows social media platforms to offer highly personalized experiences to users. The use of AI may raise concerns about ethical issues such as privacy, algorithmic bias, misinformation, etc., but AI can also be used for content moderation on social media to detect and remove harmful or inappropriate content, identifying and mitigating the spread of fake news, etc. The use of AI on OSNEM can promote democratic processes by facilitating the dissemination of information and encouraging political engagement. On the other hand, AI algorithms can create echo chambers, influence voting behavior and generate significant risks for democracy. AI-driven security measures can help to protect OSNEM users from fraud and privacy breaches, but malicious actors can also use AI to support their attacks. The exponential diffusion of generative AI adds novel dimensions to this landscape, on the one hand supporting novel forms of interactions spanning into the Metaverse, but on the other hand exposing vulnerable users to dramatic threats.

The aim of this special issue is to push the state of the art in using AI in OSNEM, by presenting quantitative contributions that investigate the opportunities and challenges of using AI in Online Social Networks. Within this framework, topics include, but are not limited to:

- Using AI in OSNEM for personalization, efficiency, and recommendations;
- AI-based studies for analysis and modelling of information and opinion dynamics in OSNEM;
- AI-based predictions based on OSNEM data analysis;
- AI impact on OSNEM security, trustworthiness and privacy;
- Generative AI in OSNEM;
- AI and social networking in the Metaverse;
- AI methodologies for large-scale OSNEM data collection and analysis;
- AI methods to safeguard OSNEM users (e.g., bot detection, toxic content identification, content moderation, echo chamber avoidance);
- Case studies of AI application in OSNEM.

Online Social Networks and Media is a multidisciplinary journal for the wide community of computer and network scientists working on developing OSNEM platforms and services and using OSNEM as a big data source to mine, learn and model (online) human behaviour. Manuscripts based only on questionnaires, even if focused on the reported use of social media, are outside the scope of the journal. On the other hand, the journal welcomes papers which present analyses based on big data mined from social networks/media.

-----------------------------------------------------------------------------
Schedule

Manuscript submission deadline: continuous submission until September 30th, 2024 (*)
First notification: two months after the submission
Expected publication: papers are published a few weeks after acceptance.

Guest Editors

Marco Conti, IIT-CNR, Italy
Andrea Passarella, IIT-CNR, Italy

------------------------------------------------------------------------------
Instructions for submission

Manuscripts must not have been previously published nor be currently under review by other journals or conferences. If prior work was published in a conference, the submitted manuscript should include a substantial extension of at least 35% novel contributions. In this case, authors are also required to submit their published conference articles and a summary document explaining the enhancements made in the journal version.

The submission website for this journal is located at https://www2.cloud.editorialmanager.com/osnem/default2.aspx. Please select ''VSI:AI&OSNEM'' when you reach the ''Article Type'' step in the submission process. To ensure that all manuscripts are correctly identified for consideration by the special issue, the authors should indicate in the cover letter that the manuscript has been submitted for the special issue on “AI&OSNEM”.

(*) Manuscripts can be submitted continuously until the deadline. Once a paper is submitted, the review process will start immediately. Accepted papers will be published continuously in the journal (in the first issue available as soon as the paper is accepted). All accepted papers will be listed together in an online virtual special issue published on the journal website.

For further information, please contact the guest editors at {m.conti,a.passarella} at iit.cnr.it

Ulysses Tesemõ: a new large corpus for Brazilian legal and governmental domain
by Ellen Souza 29 Jul '24

Ulysses Tesemõ is a large corpus specifically built for the Brazilian legal domain. The corpus consists of over 3.5 million files, totaling 30.7 GiB of raw text, collected from 159 sources encompassing judicial, legislative, academic, news, and other related data.

https://doi.org/10.1007/s10579-024-09762-8

Best Regards,
Ellen Souza
