- Corpora - ELRA lists

CFP-Fourth Workshop on Language Technology for Equality, Diversity, Inclusion (LT-EDI-2024) at EACL 2024-reg
by Bharathi Raja Asoka Chakravarthi 13 Oct '23

Apologies for cross-posting.

*Fourth Workshop on Language Technology for Equality, Diversity, Inclusion (LT-EDI-2024) at EACL 2024*

*Website link: https://sites.google.com/view/lt-edi-2024/*

Equality, Diversity and Inclusion (EDI) is an important agenda across every field throughout the world. Language, as a major part of communication, should be inclusive and treat everyone with equality. Today’s large internet community uses language technology (LT), which has a direct impact on people across the globe. EDI is crucial to ensure everyone is valued and included, so it is necessary to build LT that serves this purpose. Recent results have shown that big data and deep learning are entrenching existing biases and that some algorithms are even naturally biased due to problems such as ‘regression to the mode’. Our focus is on creating LT that is more inclusive with respect to gender, race, sexual orientation, and disability. The workshop will focus on creating speech and language technology to address EDI not only in English, but also in less-resourced languages.

The broader objectives of LT-EDI-2024 are:
- To investigate challenges related to speech and language resource creation for EDI.
- To promote research in inclusive LT.
- To adopt and adapt appropriate LT models to suit EDI.
- To provide opportunities for researchers from the LT community around the world to collaborate with other researchers to identify and propose possible solutions for the challenges of EDI.

Our workshop theme focuses on being more inclusive and providing a platform for researchers to create LT of a more inclusive nature. We hope that through these engagements we can develop LT tools to be more inclusive of everyone, including marginalized people.

*Call for Papers:*

Our main theme in this workshop is equality, diversity, and inclusivity in LT.
We invite researchers and practitioners to submit papers reporting on these issues and on datasets that help avoid them. We also encourage qualitative studies related to these issues and how to avoid them. LT-EDI-2024 welcomes theoretical and practical paper submissions on any languages that contribute to research in Equality, Diversity and Inclusion. We particularly encourage studies that address either practical application or improving resources.

*Topics of interest include, but are not limited to:*
- Data set development to include EDI
- Gender inclusivity in LT
- LGBTQ+ inclusivity in LT
- Racial inclusivity in LT
- Persons with disability inclusivity in LT
- Speech and language recognition for minority groups
- Unconscious bias and how to avoid it in natural language processing, machine learning and other LT technologies
- Tackling rumours and fake news about gender, racial, and LGBTQ+ minorities
- Tackling discrimination against gender, racial, and LGBTQ+ minorities

*Submissions:*

At LT-EDI we accept the following submission types:
- Long paper submissions must describe substantial, original, completed and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers may consist of up to 8 pages of content, plus unlimited pages for references and appendices. Upon acceptance, long papers will be given one additional page of content (i.e. up to 9 pages) in the proceedings so that reviewers’ comments can be taken into account.
- Short paper submissions must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages. Short papers may consist of up to 4 pages of content, plus unlimited references and appendices. Upon acceptance, short papers will be given one additional page of content (i.e. up to 5 pages) in the proceedings so that reviewers’ comments can be taken into account.
- Poster and demo submissions should be no longer than 4 pages (plus an unlimited number of pages for references and the ethics/broader impact statement).

More information on submission can be found at https://sites.google.com/view/lt-edi-2024/submission

For electronic submission of all papers, please use: https://openreview.net/group?id=eacl.org/EACL/2024/Workshop/LTEDI

*Important Dates*
- Workshop paper due: December 12, 2023
- Direct Submission deadline (pre-reviewed ARR & main conference): January 17, 2024
- Notification of acceptance: January 15, 2024
- Camera-ready papers due: January 25, 2024
- Workshop dates: March 21-22, 2024

With regards,
Dr. Bharathi Raja Chakravarthi
Assistant Professor / Lecturer-above-the-bar
School of Computer Science, University of Galway, Ireland
Insight SFI Research Centre for Data Analytics, Data Science Institute, University of Galway, Ireland
E-mail: bharathiraja.akr(a)gmail.com, bharathi.raja(a)universityofgalway.ie <bharathiraja.asokachakravarthi(a)universityofgalway.ie>
Google Scholar: https://scholar.google.com/citations?user=irCl028AAAAJ&hl=en
Website: https://www.universityofgalway.ie/our-research/people/bharathirajaasokachak…

[Open Position] Research Engineer - Machine Learning and Large Language Models - EURECOM
by Raphaël Troncy 13 Oct '23

[Apologies for Cross-Posting]

The Data Science Department at EURECOM, Sophia-Antipolis, France, invites applications for a Research Engineer position. The candidate will work as part of a department-wide project, which aims at developing our research activities on large language models (LLMs). We are looking for fresh, ambitious and hard-working software engineers with a clear passion for turning ideas in the field of machine learning into fully functional prototypes, demonstrators, and systems, using cutting-edge machine learning libraries and computing facilities.

The position is funded by a French National project on Artificial Intelligence, and targets the development of LLMs in several flavors. Our first goal is to target an audience of EURECOM staff, including researchers and administration, to offer novel conversational agents that can use internal, multimodal information to interact with users for Q&A, summarization, in-filling and many other tasks. A second ambitious goal is to offer innovative educational services to EURECOM students, by designing novel conversational agents instructed to follow Socratic interactions. Such conversational agents will be augmented with teaching material (lecture notes, slides, video lectures, code from labs, etc.) which should be used and referenced in conversations.

The successful candidate will work in close relation with professors, postdoctoral researchers and Ph.D. candidates, and will lead contributions to:
(i) the implementation and deployment of state-of-the-art, open-source LLMs, including inner components (novel attention mechanisms, novel tokenization schemes, key-value caching, to name a few) and outer components (such as retrieval augmented generation),
(ii) the definition of appropriate methodologies for training and, most importantly, fine-tuning of our internal LLMs,
(iii) the definition of appropriate benchmarks and validation methodologies, and
(iv) communication of our results to a public of developers, engineers and practitioners through technical blogs and online demonstrators (e.g. HuggingFace spaces, on-site deployment), as well as demonstration of the various releases of the prototype to EURECOM management teams.

Since our activities are fueled by an important public fund, in the context of the French Government plan for AI, called the 3IA, the successful candidate will also participate in project meetings and reviews, as well as public demonstrations of research results.

Requirements:
* Education Level / Degree: M.Sc. or Ph.D. degree in Computer Science, Applied Mathematics, Physics, or a closely related area with a strong background in algorithmic development and software engineering
* Applicants should have a good background in machine learning, a taste for learning new topics related to machine learning research, as well as proven experience in the design of solid software engineering artifacts
* Applicants should have strong communication skills, with target audiences ranging from the scientific and academic community to engineers and product managers
* The working language in the department is English.

Application

The application must include:
* Detailed curriculum, including (if available) a list of software engineering projects, public demonstrators, and contributions to open-source projects
* a cover letter describing the applicant’s interests
* the contact details of 2/3 persons that can provide references about the candidate
* the transcripts of courses taken at graduate (and optionally undergraduate) level

Applications should be submitted by e-mail to Pietro.Michiardi(a)eurecom.fr and secretariat(a)eurecom.fr with the reference: DS/PM/EFELIA-LLM/102023

Applications will be accepted until the position is filled.
Start date: ASAP

--
Raphaël Troncy
EURECOM, Campus SophiaTech
Data Science Department
450 route des Chappes, 06410 Biot, France.
e-mail: raphael.troncy(a)eurecom.fr & raphael.troncy(a)gmail.com
Tel: +33 (0)4 - 9300 8242
Fax: +33 (0)4 - 9000 8200
Web: http://www.eurecom.fr/~troncy/

2nd CFP, the 8th Biomedical Linked Annotation Hackathon (BLAH8)
by Jin-Dong Kim 12 Oct '23

[Apologies for cross-posting]

The 8th Biomedical Linked Annotation Hackathon (BLAH8) - Biomedical Annotations in the Age of LLMs
15 - 19 January, 2024
Kashiwa, Chiba, Japan
https://blah8.linkedannotation.org/

KEYNOTE SPEAKERS
- Lawrence Hunter - University of Colorado
- Martin Krallinger - Barcelona Supercomputing Center

CALL FOR PROJECT PROPOSALS
We invite submission of project proposals from those who are interested in contributing to biomedical literature annotation with their literature annotation resources and expertise, particularly this year with a connection to LLMs. We invite projects which can be accomplished during the hackathon.
- Project proposal submission deadline: 20 Oct. 2023

TRAVEL SUPPORT
Those who submit project proposals are eligible to apply for travel support. See the homepage for detailed information.

PUBLICATION
Immediately after BLAH8, participants will be invited to submit papers to either of the two venues:
- Genomics & Informatics: an open access journal, which is indexed by PubMed. All the papers of the journal will be immediately included in the PMC open access subset.
- BioHackrxiv: a preprint server, which is powered by OSF preprints and indexed by EuropeanPMC.

Please refer to the homepage for more detailed information: https://blah8.linkedannotation.org/

PROGRAM COMMITTEE
- Jin-Dong Kim (DBCLS, ROIS-DS)
- Fabio Rinaldi (IDSIA)
- Lars Juhl Jensen (Univ. Copenhagen)
- Zhiyong Lu (NCBI, NLM)

Final CfP and Deadline Extension: PROPOR 2024 - 16th International Conference on Computational Processing of Portuguese
by Pablo Gamallo 12 Oct '23

[Apologies for cross-postings]

********************************************************
PROPOR 2024: 16th International Conference on Computational Processing of Portuguese
Universidade de Santiago de Compostela (Santiago de Compostela - Galicia)
March 14th to 15th, 2024
2nd Call for Papers
https://propor2024.citius.gal/
********************************************************

*Important dates*
* Full and short paper submission deadline: *06/11/2023 (23:59 GMT-3)*
* Notification of paper acceptance or rejection: 07/12/2023
* Camera-ready papers due: TBA
* Conference: March 14th - 15th, 2024

The International Conference on Computational Processing of Portuguese (PROPOR), whose next edition will take place for the first time in Galicia, birthplace of the Portuguese language, is the main event in the area of natural language processing that is focused on theoretical and technological issues of written and spoken Portuguese and Galician (considered as a local variety of the former). The meeting has been a very rich forum for the exchange of ideas and partnerships for the research and industry communities dedicated to the automated processing of this language, promoting the development of methodologies, resources and projects that can be shared among researchers and practitioners in the field.

We call for papers describing work on any topic related to computational language and speech processing of Portuguese/Galician by researchers in industry or academia. Topics of interest include, but are not limited to:
* Natural language processing tasks (e.g. parsing, word sense disambiguation, coreference resolution)
* Natural language processing applications (e.g. question answering, subtitling, summarization, sentiment analysis)
* Natural language generation
* Information extraction and information retrieval
* Speech technologies (e.g. spoken language generation, speech and speaker recognition, spoken language understanding)
* Speech applications (e.g. spoken language interfaces, dialogue systems, speech-to-speech translation)
* Resources, standardization and evaluation (e.g. corpora, ontologies, lexicons, grammars)
* NLP-oriented linguistic description or theoretical analysis
* Distributional semantics and language modeling
* Portuguese language varieties and dialect processing (including the language varieties of Angola, Brazil, Cape Verde, East Timor, Galicia, Guinea-Bissau, Macau, Mozambique, Portugal, and Sao Tome and Principe)
* Multilingual studies, methods, applications and resources including Portuguese/Galician

PROPOR 2024 will be held at the University of Santiago de Compostela (Santiago de Compostela - Galicia, Spain) from March 14th to March 15th. PROPOR 2024 will be the 16th edition of the biennial PROPOR conference, hosted alternately in Brazil and in Europe (Portugal/Galicia). Past meetings were held in Lisbon, PT (1993); Curitiba, BR (1996); Porto Alegre, BR (1998); Évora, PT (1999); Atibaia, BR (2000); Faro, PT (2003); Itatiaia, BR (2006); Aveiro, PT (2008); Porto Alegre, BR (2010); Coimbra, PT (2012); São Carlos, BR (2014); Tomar, PT (2016); Canela, BR (2018); Évora, PT (2020); and Fortaleza, BR (2022).

Submissions

Submissions should describe original, unpublished work. Authors are invited to submit two kinds of papers:
* Full papers: Reporting substantial and completed work, especially those that may contribute in a significant way to the advancement of the area. Wherever appropriate, concrete evaluation results should be included. Full papers may consist of up to 8 pages of content, plus unlimited pages of references.
* Short papers: Reporting small, focused contributions such as ongoing work, position papers, potential ideas to be discussed, or negative results. Short papers may consist of up to 4 pages of content, plus unlimited pages of references.

Both Full and Short papers will be published in the proceedings of the main conference. Each submission will be evaluated by at least three reviewers.

As reviewing will be double-blind, submitted papers must be anonymized, that is, they should not contain the authors’ names and affiliations. Authors must avoid self-references that reveal identity, like “We previously showed (Smith, 1991) …”. Instead, they should prefer citations such as “Smith (1991) previously showed …”. Separate author identification information will be required as part of the submission process.

Submissions to PROPOR 2024 may not be made available online (e.g. via a preprint server), and may not be submitted for review elsewhere while being under review for this conference.

Submissions should be written in English. At submission time, only PDF format is accepted. For the final versions, authors of accepted papers will be given 1 extra content page to take the reviews into account. Authors of accepted papers will be requested to send the source files for the production of the proceedings.

All submitted papers must conform to the official ACL style guidelines. ACL provides style files for LaTeX and Microsoft Word that meet these requirements. They can be found at:
* LaTeX stylesheet
* MS Word stylesheet

Papers should be submitted at the following URL: https://easychair.org/my/conference?conf=propor2024

Publication

The proceedings of PROPOR 2024 will be published by ACL as a volume in the ACL Anthology (https://aclanthology.org/). They will be available online. To ensure publication, at least one author of each accepted paper must complete an adequate registration for PROPOR 2024 by the early registration deadline.

Kindest regards,
António Teixeira, Livy Real & Marcos Garcia
PROPOR 2024 Program Chairs

Schools and LLMs: Are you ready for the challenge?
by Fabio Massimo Zanzotto 11 Oct '23

Schools and LLMs: Are you ready for the challenge?

Request for expression of interest for two 1-year open positions:
(1) Research Fellowship - Assegno di Ricerca I Fascia (Requirements: Master's Degree)
(2) Post-doc Research Fellowship - Assegno di Ricerca II Fascia (Requirements: Ph.D. Degree)

Send a resume to fabio.massimo.zanzotto(a)uniroma2.it to express your interest in one of the two positions.

We offer:
- an uncompetitive salary
- no extra benefits
- no clear career path

Yet, YOU can help us shape possible ways schools may integrate these disruptive LLMs to prepare "biological brains" for the vibrating future.

These positions are within an Italian Research Project of National Interest (PRIN): "Class-tAIs: Artificial Intelligence and multi-brain connectivity as a buddy to Enhancing Competencies in students"

Positions will start early next year and will be formally announced soon.

Lab: Human-centric Art at the University of Rome Tor Vergata (Italy)
Follow us on our newly established X account: @HumanCentricArt

Job opening: PhD candidate in Inclusive Machine Translation (1,0 fte, 4 years)
by Fred Blain 11 Oct '23

Dear all,

The Department of Cognitive Science and Artificial Intelligence at Tilburg University is delighted to announce a PhD vacancy for a highly motivated and talented student to contribute to the cutting-edge field of Machine Translation (MT). More specifically, the candidate will focus on the important yet challenging and exciting topic of Inclusive Machine Translation.

Traditional Neural Machine Translation (NMT) systems tend to suffer from overgeneralizations, biases and lack of transparency. By mitigating biases and enhancing the transparency of NMT systems, the PhD researcher will work towards fairer and more inclusive MT. In line with the research interests of the Inclusive and Sustainable Machine Translation (ISMT) Lab, the objective of this research proposal is thus to investigate efficient and sustainable ways to address challenges related to inclusivity, bias and transparency in NMT.

We offer a fully funded, 4-year PhD position to conduct research and innovation activities in the field of text-to-text Machine Translation. This PhD will address topics such as: data compilation, data preprocessing and data analysis tailored to the needs of Machine Translation; development of neural architectures for uncovering bias and enhancing explainability in translation, and/or adapting architectures with successful application to text-to-text MT; adapting and adopting pre-trained language models; and human-centred design, evaluation and validation activities.

This position is hosted by the ISMT Lab of the CSAI department, which focuses on (i) user- and use-case-centric MT, (ii) Inclusive Language Technology and (iii) Sustainable and Environmentally Conscious AI. Through its research activities it aims at promoting inclusiveness and cross-field collaborations, working with both academia and industry.

The PhD candidate will be supervised by dr. Eva Vanmassenhove <https://research.tilburguniversity.edu/en/persons/eva-vanmassenhove> and dr. Frédéric Blain <https://research.tilburguniversity.edu/en/persons/fred-blain> (daily supervisors), and dr. Afra Alishahi <http://afra.alishahi.name/> (promotor).

The work will be centred around objectives that will involve: Data Analysis, Data Processing Techniques, Model Training (design and implementation of state-of-the-art Deep Learning models for MT), Annotations (e.g. Linguistic Annotations), Bias Mitigation, Explainability Enhancement. The PhD candidate will work on tasks related to the aforementioned objectives in a multidisciplinary environment. This will allow the PhD candidate to gain expertise from the fields of Computational Linguistics, Computer Science, Engineering, Data Analysis, Machine- and Deep-Learning, and Cognitive Science.

Job description
* Develop and conduct research;
* Write and defend a thesis within 4 years;
* Write peer-reviewed papers for submission to international journals or leading conference proceedings in the first 3 years;
* Give presentations about study results on a regular basis at workshops and conferences;
* Participate in co-organising workshops;
* Participate in ISMT and CSAI activities;
* Participate in the Graduate Schools’ education program.

Your Profile

Candidates for this position should have a (research) Master's degree in Artificial Intelligence and/or Computational Linguistics with a focus on at least one of the following fields: computational linguistics, natural language processing, machine translation, machine learning, data analysis and/or deep learning.

In addition, we invite applicants who have a strong profile in a subset of these:
* Interest in languages, NLP and/or Linguistics;
* Excellent analytical and problem-solving abilities;
* Ability to work both independently and collaboratively as part of a team to drive innovative research;
* Proficiency in scientific programming languages, particularly Python (Numpy, Pandas, Matplotlib, Scikit-Learn, Tensorflow, PyTorch, etc.);
* Some experience with (experimental) evaluation and statistical analyses;
* Fluent spoken and written English communication skills;
* A proactive and goal-directed attitude, good organizational skills, and the ability to get things done.

Knowledge of a language other than English (the default language of communication) and/or Dutch is not required but would be a major advantage given the research topic.

Conditions of employment

Fixed-term contract: 4 years. A full-time position. Starting date is negotiable.

The selected candidate will start with a contract at Tilburg University at the Department of Cognitive Science and Artificial Intelligence for one year, concluded by an evaluation. Upon a positive outcome of the first-year evaluation, the candidate will be offered an employment contract for an additional 3 years. The PhD candidate will be ranked in the Dutch university job ranking system (UFO) as a PhD student (promovendus) with a starting full-time salary of € 2,770 gross per month in the first year, increasing up to € 3,539 in the fourth year.

* A holiday allowance of 8% and an end-of-year bonus of 8.3% (annually);
* Researchers from outside the Netherlands may qualify for a temporary tax-free allowance equal to 30% of their taxable salary. The University will apply for such an allowance on their behalf;
* The University will provide assistance in finding suitable accommodation (for foreign employees);
* Tilburg University is rated among the top Dutch employers and offers very good fringe benefits (it is one of the best non-profit employers in the Netherlands), also including excellent technical infrastructure, savings schemes and excellent sport facilities.

The collective labor agreement of the Dutch Universities applies.

Who we are

The CSAI department (https://csai.nl)

The Cognitive Science & Artificial Intelligence (CS&AI) department performs computational research in the domains of Artificial Intelligence and Cognitive Science and runs educational programs on Cognitive Science & Artificial Intelligence and Data Science and Society. The department is partially housed on the Tilburg University campus (Dante Building) and in the Deprez building (near Tilburg central station) as a member of MindLabs <https://www.mind-labs.eu/>. We maintain a close collaboration with the Jheronimus Academy of Data Science (JADS) in 's-Hertogenbosch. We are a member of the Benelux Association for Artificial Intelligence <http://ii.tudelft.nl/bnvki/> (BNVKI), participate in the Special Interest Group for AI <http://ii.tudelft.nl/bnvki/?page_id=1247> (SIG AI), contributed to the Dutch AI Manifesto <http://ii.tudelft.nl/bnvki/wp-content/uploads/2018/09/Dutch-AI-Manifesto.pdf>, and participate in the Confederation of Laboratories for Artificial Intelligence Research in Europe <https://claire-ai.org/> (CLAIRE).

The research group consists of about 70 researchers covering a broad range of topics relevant for cognitive science and artificial intelligence, and this includes approximately 35 PhD students.
Core research domains include cognitive science, machine learning, deep learning, games, virtual reality, computational psycholinguistics, brain-computer interfaces, robotics, cognitive modeling of language, computational linguistics, educational technologies, computational modeling of evolutionary and adaptive systems, image and signal processing, with a strong emphasis on quantitative methods.

The ISMT Lab (https://www.tilburguniversity.edu/about/schools/tshd/departments/dca/lab/ma…)

Machine translation (MT), the task of automatically translating text in one language into text in another language using a computer system, has undergone many shifts since its inception in the late 1950s. The latest of these, neural machine translation (NMT), has reached unprecedented translation quality, approaching human-level performance for some use cases and under certain conditions. MT has become an indispensable tool for professional translators (to assist in the translation workflow), for commercial users, e.g., e-commerce companies (to make their content quickly available in multiple languages), and for everyday users (to access information unrestricted by the language in which it is produced).

User- and use-case-centric MT
We focus on the specific user requirements as well as the domain, the language, the style, etc. of a specific use case. Through smart data analysis, selection and processing and optimized models we investigate faster and better tools that are tuned towards users and use cases.

Inclusive Language Technology
Current MT and NLP models exacerbate bias and may produce inaccurate or sometimes even offensive outputs. Exploring bias-related language phenomena and developing techniques to mitigate bias is an important research direction we are undertaking.

Sustainable and Environmentally Conscious AI
We study the environmental impact of language technology and ways to cut down the use of computing resources without losing quality.
Can we reduce, reuse and reorganize for a less intrusive technology?

More information

This PhD project is expected to start at the beginning of 2024; the preferred start date is February 1st, 2024. For more information on this position or the project, please contact: dr. Eva Vanmassenhove (E.O.J.Vanmassenhove(a)tilburguniversity.edu), or dr. Frédéric Blain (F.L.G.Blain(a)tilburguniversity.edu).

Applications

To apply for this position please submit a motivation letter, CV, thesis (and any publications), grade list and names of two references. The only way to apply is online.

Deadline for applications: November 15th, 2023. Interviews are expected to take place at the beginning of December 2023.

Tilburg School of Humanities and Digital Sciences

Research and education at the Tilburg School of Humanities and Digital Sciences (TSHD) has a unique focus on humans in the context of the globalizing digital society, on the development of artificial intelligence and interactive technologies, on their impact on communication, culture and society, and on moral and existential challenges that arise. The School of Humanities and Digital Sciences consists of four departments: Communication and Cognition, Cognitive Science and Artificial Intelligence, Culture Studies and Philosophy; several research institutes and a faculty office. University College Tilburg is also part of the School. Each year around 275 students commence a Bachelor or (Pre) Master Program. The School has approximately 2000 students and 250 employees.

Tilburg School of Humanities and Digital Sciences <https://www.tilburguniversity.edu/about/schools/humanities/>

--------
Met hartelijke groeten/With kind regards,
Fred Blain
Assistant Professor in AI
Department of Cognitive Science & Artificial Intelligence
Dante Building, D144
Tilburg School of Humanities and Digital Sciences
Tilburg University, The Netherlands

Postdoc position in Copenhagen (AAU, 24 months)
by Johannes Bjerva 11 Oct '23

Hi everyone,

We are hiring a 2-year postdoc at the Copenhagen campus of Aalborg University. The project is on NLP in an educational context, with the goal of exploring methodology for automated generation of feedback, tailored to specific students. There is a substantial amount of freedom in terms of the methodological research direction. The position is funded by a Villum Synergy project, hence in addition to core NLP research, there will be an opportunity to publish in venues focused on learning technologies.

You will be joining a growing department (with annual hiring rounds for assistant professors), and a fun NLP team with PhD students and postdocs focusing on low-resource NLP (https://typnlp.github.io/).

* Application deadline: 8 November 2023
* Interviews likely in mid-November
* Starting date: 1 January 2024 (or shortly thereafter)
* Quality of life in Copenhagen is high, and getting around by bike is both safe and easy. The AAU campus in Copenhagen is located by the waterfront, close to the city, and easily accessible both by bike and public transport.

For more details and to apply, see the official call here: https://www.vacancies.aau.dk/scientific-positions/show-vacancy?vacancyId=12…

Interested applicants are encouraged to reach out to me at jbjerva(a)cs.aau.dk.

Best,
Johannes

Johannes Bjerva
Associate Professor | Natural Language Processing | Department of Computer Science
Team Lead for CS CPH
Aalborg University Copenhagen
Office 2.2.089, A.C. Meyers Vænge 15, 2450 Copenhagen, Denmark

Job: Postdoctoral Researcher (f/m/d) in Privacy-Preserving Natural Language Processing, Paderborn University, Germany
by Ivan Habernal 11 Oct '23

Paderborn University is a high-performance and internationally oriented university with approximately 18,000 students. Within interdisciplinary teams, we undertake forward-looking research, design innovative teaching concepts and actively transfer knowledge into society. As an important research and cooperation partner, the university also shapes regional development strategies. We offer our more than 2,600 employees in research, teaching, technology and administration a lively, family-friendly, equal opportunity environment, a lean management structure and diverse opportunities. Join us to invent the future!

The Natural Language Processing group led by Prof. Dr. Ivan Habernal at the Faculty of Computer Science, Electrical Engineering and Mathematics offers a full-time position as

Postdoctoral Researcher (f/m/d)
(according to salary group E 14 TV-L)

starting as soon as possible in the research area "Privacy-Preserving Natural Language Processing". This position will focus on a broad range of research questions related to privacy-preserving NLP, including large language models or data privatization. A solid background in machine learning for natural language processing is essential; prior experience with differential privacy or other privacy frameworks is a plus.

The position is planned for three years with a possible extension. The period of employment is governed by the Academic Fixed-Term Contract Act (Wissenschaftszeitvertragsgesetz - WissZeitVG). An extension is possible within the time limits of the WissZeitVG.
Your duties and responsibilities:
* Independently conduct innovative research in the research area mentioned above
* Take an active role in supervising students (doctoral, master's and bachelor's students)
* Operational and strategic management and further development by leading research projects
* Acquisition of new third-party funds
* Teaching on the order of 4 teaching hours (SWS) per week

Your profile:
* PhD degree in computer science or a related area
* Excellent analytical and programming skills
* Publications in the area of privacy-preserving NLP/ML are a plus
* Communicative and team-oriented personality
* Independent, self-reliant and committed working style
* Very good command of English, both written and spoken (German is a plus)

We provide:
* Work on highly relevant research topics and technologies in an international research team
* A family-friendly workplace with the opportunity of partial remote work ("Mobiles Arbeiten")
* Personnel development through further training opportunities
* A supplementary employer pension scheme (VBL)

Applications from women are particularly welcome and, in case of equal qualifications and experiences, will receive preferential treatment according to the North Rhine-Westphalian Equal Opportunities Act (LGG), unless there are preponderant reasons to give preference to another applicant. Part-time employment is, in principle, possible. Applications from disabled people with appropriate suitability are explicitly welcome. This also applies to people with equal opportunities in accordance with the German social law SGB IX.

Apply by sending a short letter of motivation and a curriculum vitae including your publication list to Prof. Dr. Ivan Habernal (ivan.habernal(a)uni-paderborn.de) with reference code 6137. Don't hesitate to contact Prof. Habernal should you have any questions regarding the position. The application deadline is October 22nd, 2023. The position remains open until filled.
Information regarding the processing of your personal data can be found at: https://www.uni-paderborn.de/zv/personaldatenschutz .

Prof. Dr. Ivan Habernal
Faculty of Computer Science, Electrical Engineering and Mathematics – Department of Computer Science
Paderborn University
Warburger Str. 100
33098 Paderborn


Call for Papers: International Conference for Learner Corpus Research (LCR 2024)
by lcr2024＠ut.ee 10 Oct '23


Dear Colleagues,

We are delighted to extend an invitation to the forthcoming International Conference for Learner Corpus Research (LCR) in 2024. Organized under the aegis of the Learner Corpus Association, this event brings together researchers, language instructors, and software developers who share an interest in using learner corpora for research. LCR 2024 in Tartu aspires to provide a favourable environment for dialogue and the most recent research, allowing LCA members and other scholars in the field of LCR to share their latest ideas and findings.

Recognizing Estonia's unique position as the homeland of a lesser-known non-Indo-European language, we see it as suitable to prioritize corpus research that delves into the learning of smaller languages and by learners with less common L1 backgrounds. However, research focused on other languages, even those extensively studied, is, as always, most welcome.

Below, you will find the most essential details about the conference.

Keynote speakers:
Gaëtanelle Gilquin (Université Catholique de Louvain, Belgium)
Ilmari Ivaska (Turun Yliopisto, Finland)
Cristóbal Lozano (Universidad de Granada, Spain)

Event Details:
Date: 26-28 September 2024
Location: Institute of Foreign Languages and Cultures and the Institute of Estonian and General Linguistics, University of Tartu, Estonia.
Categories: The conference will include keynote talks, paper presentations, work in progress reports, poster presentations and software demonstrations.
Language: The language of the conference will be English.
Topics: Areas of interest include, but are not limited to, the following:
Language for academic purposes
Language for specific purposes
Language teaching, assessment and testing
Learner corpus-based SLA studies
Corpora as pedagogical resources
Multimodal learner corpora
Software for learner corpus analysis
Corpus-based translation studies
English as a Medium of Instruction (EMI)
English as a Lingua Franca (ELF)
Data mining and other explorative approaches to learner corpora
Statistical methods in learner corpus studies
Discourse analysis and pragmatics
Studies related to lexis: semantics, metaphor, etc.
NLP approaches
Complexity, accuracy and/or fluency (CAF) analysis

Abstracts:
A short summary of the intended presentation, capturing the central idea along with the research questions, methods of research and the (possibly tentative) key conclusions, also citing any relevant previous work or theoretical background of the field
On unpublished original research
Complemented by 3-5 keywords
Limited to 300 words, excluding keywords and references
Anonymous: the abstract itself should hold no reference to the author or their affiliation

Further information is available at our webpage: https://lcr2024.ut.ee/main
Contact information: lcr2024(a)ut.ee

Looking forward to your valuable contributions.


CfP: The 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
by Stan Szpakowicz 10 Oct '23


*LaTeCH-CLfL 2024: The 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature* to be held in March 2024 in conjunction with EACL 2024 <https://2024.eacl.org/> in Malta.

https://sighum.wordpress.com/latech-clfl-2024/

First Call for Papers (with apologies for cross-posting)

Organisers: Yuri Bizzoni, Stefania Degaetano-Ortlieb, Anna Kazantseva, Stan Szpakowicz

LaTeCH-CLfL 2024 is the eighth in a series of meetings for NLP researchers who work with data from the broadly understood arts, humanities and social sciences, and for specialists in those disciplines who apply NLP techniques in their work. The workshop continues a long tradition of annual meetings. The SIGHUM Workshops on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH) ran ten times in 2007-2016. The five Workshops on Computational Linguistics for Literature (CLfL) took place in 2012-2016. The first seven joint workshops (LaTeCH-CLfL) were held in 2017-2023.

*Topics and content*

In the Humanities, Social Sciences, Cultural Heritage and literary communities, there is increasing interest in, and demand for, NLP methods for semantic and structural annotation, intelligent linking, discovery, querying, cleaning and visualization of both primary and secondary data. This is even true of primarily non-textual collections, given that text is also the pervasive medium for metadata. Such applications pose new challenges for NLP research: noisy, non-standard textual or multi-modal input, historical languages, vague research concepts, multilingual parts within one document, and so on. Digital resources often have insufficient coverage; resource-intensive methods require (semi-)automatic processing tools and domain adaptation, or intense manual effort (e.g., annotation). Literary texts bring their own problems, because navigating this form of creative expression requires more than the typical information-seeking tools.
Examples of advanced tasks include the study of literature of a certain period, author or sub-genre, recognition of certain literary devices, or quantitative analysis of poetry. NLP methods applied in this context not only need to achieve high performance, but are often applied as a first step in research or scholarly workflow. That is why it is crucial to interpret model results properly; model interpretability might be more important than raw performance scores, depending on the context. More generally, there is a growing interest in computational models whose results can be used or interpreted in meaningful ways. It is, therefore, of mutual benefit that NLP experts, data specialists and Digital Humanities researchers who work in and across their domains get involved in the Computational Linguistics community and present their fundamental or applied research results. It has already been demonstrated how cross-disciplinary exchange not only supports work in the Humanities, Social Sciences, and Cultural Heritage communities but also promotes work in the Computational Linguistics community to build richer and more effective tools and models. 
Topics of interest include, but are not limited to, the following:
• adaptation of NLP tools to Cultural Heritage, Social Sciences, Humanities and literature;
• automatic error detection and cleaning of textual data;
• complex annotation schemas, tools and interfaces;
• creation (fully- or semi-automatic) of semantic resources;
• creation and analysis of social networks of literary characters;
• discourse and narrative analysis/modelling, notably in literature;
• emotion analysis for the humanities and for literature;
• generation of literary narrative, dialogue or poetry;
• identification and analysis of literary genres;
• interpretability of large language model output for DH-related tasks (explainable AI);
• linking and retrieving information from different sources, media, and domains;
• low-resource and historical language processing;
• modelling dialogue literary style for generation;
• modelling of information and knowledge in the Humanities, Social Sciences, and Cultural Heritage;
• profiling and authorship attribution;
• search for scientific and/or scholarly literature;
• work with linguistic variation and non-standard or historical use of language.

*Information for authors*

We invite papers on original, unpublished work in the topic areas of the workshop. In addition to long papers, we will consider short papers and system descriptions (demos). We also welcome position papers.
• Long papers, presenting completed work, may consist of up to eight (8) pages of content plus additional pages of references (just two if possible -:). The final camera-ready versions of accepted long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account.
• A short paper / demo presents work in progress, or the description of a system, and may consist of up to four (4) pages of content plus additional pages of references (one if you can). Upon acceptance, short papers will be given five (5) content pages in the proceedings.
• A position paper, clearly marked as such, should not exceed eight (8) pages including references.

All submissions are to use the EACL stylesheets (for LaTeX / Overleaf and MS Word); there will be a link soon (we hope) but last year's https://2023.eacl.org/calls/styles is a good guess. Papers should be submitted electronically, only in PDF, via the LaTeCH-CLfL2024 submission website on the SoftConf pages (we will publish the link as soon as we have it).

Reviewing will be double-blind. Please do not include the authors’ names and affiliations, or any references to Web sites, project names, acknowledgements and so on: anything that immediately reveals the authors’ identity. Self-references should be kept to a reasonable minimum, and anonymous citations cannot be used.

Accepted papers will be published in the workshop proceedings available as usual in the ACL Anthology.

*Important dates* (tentative)
Workshop paper due: December 18, 2023
Notification of acceptance: January 20, 2024
Camera-ready papers due: January 30, 2024
Workshop date: March 21 or 22, 2024

*More on the organizers*
Yuri Bizzoni, Center for Humanities Computing / School for Communication and Culture, Århus University
Stefania Degaetano-Ortlieb, Language Science and Technology, Saarland University
Anna Kazantseva, National Research Council Canada
Stan Szpakowicz, School of Electrical Engineering and Computer Science, University of Ottawa

*Contact*
latech-clfl(a)googlegroups.com

