December 2023 - Corpora

Advance notice: ‘Statistics for linguistics with R’ bootcamp (08 – 12/07/2024)
by Magali Paquot 01 Dec '25

The Linguistics Research Unit of the Institute of Language and Communication (Université catholique de Louvain, Belgium) will be hosting Stefan Gries’s next bootcamp on statistics for linguistics with R from 08 to 12 July 2024.

The ‘Statistics for linguistics with R’ bootcamp is a hands-on introduction to statistical methods for both graduate students and seasoned researchers and is loosely based on the third edition (2021) of Gries’s textbook Statistics for linguistics with R. The course is intended for linguists who already have a basic knowledge of statistics and some experience using R and who wish to improve their proficiency in statistical modeling of linguistic data.

Using the open source software and programming language R, we will deal with:

• fundamental aspects of fixed-effects regression modeling for both numeric and binary response variables; these include exploration of data and their preparation for modeling, model formulation and selection, and numerical and visual interpretation and evaluation of models;
• more advanced aspects of fixed-effects regression modeling such as contrasts for ordinal predictors, orthogonal contrasts, curvature of numeric predictors, and maybe general linear hypothesis tests;
• the theoretical foundations of mixed-effects regression modeling;
• applications of mixed-effects modeling for both numeric and binary response variables;
• tree-based methods and random forests: 'fitting' and interpreting them with importance scores, partial dependence scores, and detecting (not just capturing) interactions.

The website of the bootcamp will be online in early 2024 and online registration will start on 1 March 2024, 11 am CEST. The number of participants is limited. If you would like to participate, mark the date in your diary!

Contact email: magali.paquot(a)uclouvain.be

Magali Paquot
Convenor

3-year PhD position in Computational Models of Semantic Memory and its Acquisition (Inria and University of Lille, France)
by Pascal Denis 13 May '25

Hello,

Could you please distribute the following job offer? Thanks.

Best,
Pascal

-------------------------------------------------------------------------------------

3-year PhD position in Computational Models of Semantic Memory and its Acquisition (Inria and University of Lille, France)

We invite applications for a 3-year PhD position at the University of Lille in the context of the recently funded research project "COMANCHE" (Computational Models of Lexical Meaning and Change). The position is funded by Inria, the French national research institute in Computer Science and Applied Mathematics.

COMANCHE proposes to transfer and adapt neural word embedding algorithms to model the acquisition and evolution of word meaning, by comparing them with linguistic theories on language acquisition and language evolution. At the intersection between Natural Language Processing, psycholinguistics and historical linguistics, this project intends to validate or revise some of these theories, while also developing computational models that are less data-hungry and less computationally intensive, as they exploit new inductive biases inspired by these disciplines.

The first strand of the project, on which the successful candidate will work, focuses on the development of computational models of semantic memory and its acquisition. Two main research directions will be pursued. On the one hand, we will compare the structural properties associated with different semantic spaces derived from word embedding algorithms to those found in human semantic memory as reflected in behavioral data (such as typicality norms) as well as brain imaging data. The latter data will then be used as additional supervision to inject more hierarchical structure into the learned semantic spaces. On the other hand, we intend to experiment with training regimes for word embedding algorithms that are closer to those of humans when they acquire language, controlling the quantity as well as the linguistic complexity of the inputs fed to the learning algorithms through the use of longitudinal and child-directed speech corpora (e.g., CHILDES, Colaje). In both cases, both English and French data will be considered.

The successful candidate holds a Master's degree in computational linguistics, computer science or cognitive science and has prior experience in word embedding models. Furthermore, the candidate has strong programming skills and expertise in machine learning approaches, and is eager to work across languages.

The position is affiliated with the MAGNET team at Inria, Lille [1] as well as with the SCALAB group at University of Lille [2] in an effort to strengthen collaborations between these two groups, and ultimately foster cross-fertilization between Natural Language Processing and Psycholinguistics.

Applications will be considered until the position is filled. However, you are encouraged to apply early as we shall start processing the applications as and when they are received. Applications, written in English or French, should include a brief cover letter with research interests and vision, a CV (including your contact address, work experience, publications), and contact information for at least 2 referees. Applications (and questions) should be sent to Angèle Brunellière (angele.brunelliere(a)univ-lille.fr) and Pascal Denis (pascal.denis(a)inria.fr).

The starting date of the position is 1 October 2022 or soon thereafter, for a total of 3 full years.

Best regards,
Angèle Brunellière and Pascal Denis

[1] https://team.inria.fr/magnet/
[2] https://scalab.univ-lille.fr/

--
Pascal
----
For an independent, transparent and rigorous evaluation! I support the Inria Evaluation Committee (Commission d'Évaluation de l'Inria).
----
+++++++++++++++++++++++++++++++++++++++++++++++
Pascal Denis
Equipe MAGNET, INRIA Lille Nord Europe
Bâtiment B, Avenue Heloïse
Parc scientifique de la Haute Borne
59650 Villeneuve d'Ascq
Tel: ++33 3 59 35 87 24
Url: http://researchers.lille.inria.fr/~pdenis/
+++++++++++++++++++++++++++++++++++++++++++++++

Asia Pacific Journal of Corpus Research (APJCR) Vol. 4, No. 1 is now available online
by Prof CK Jung 03 Sep '24

*Asia Pacific Journal of Corpus Research (APJCR) is now available online:* http://icr.or.kr/ejournals-apjcr

*The Incredible Shrinking Noun Phrase: Ongoing Change in Japanese Word Formation*
Kevin Heffernan (Kwansei Gakuin University), JAPAN; Yusuke Imanishi (Kwansei Gakuin University), JAPAN
DOI: https://doi.org/10.22925/apjcr.2023.4.1.1
________________________________________
*Identifying Key Grammatical Errors of Japanese English as a Foreign Language Learners in a Learner Corpus: Toward Focused Grammar Instruction with Data-Driven Learning*
Atsushi Mizumoto (Kansai University), JAPAN; Yoichi Watari (Chukyo University), JAPAN
DOI: https://doi.org/10.22925/apjcr.2023.4.1.25
________________________________________
*A Comparison of the Constructions Make / Take a Decision in Malaysian English with the Supervarieties*
Christina Sook Beng Ong (Wawasan Open University), MALAYSIA
DOI: https://doi.org/10.22925/apjcr.2023.4.1.43
________________________________________
*Effects of Corpus Use on Error Identification in L2 Writing*
Yoshiho Satake (Aoyama Gakuin University), JAPAN
DOI: https://doi.org/10.22925/apjcr.2023.4.1.61

---
*CK Jung BEng(Hons) Birmingham MSc Warwick EdD Warwick Cert Oxford*
Associate Professor | Department of English Language and Literature, Incheon National University, *South Korea*
President | The Korea Association of Secondary English Education, *South Korea* (http://kasee.org)
Vice President | The Korea Association of Primary English Education, *South Korea* (http://kapee.or.kr)
Director | Institute for Corpus Research, Incheon National University, *South Korea* (http://icr.or.kr)
Editor-in-Chief | Asia Pacific Journal of Corpus Research, ICR, *International* (http://icr.or.kr/apjcr)
Editorial Board | Corpora, Edinburgh University Press, *UK*
Editorial Board | English Today, Cambridge University Press, *UK*
E: ckjung(a)inu.ac.kr / T: +82 (0)32 835 8129
H(EN): http://ckjung.org

NLP4CALL 2023 Final call for papers
by David Alfter 22 Aug '24

==12th NLP4CALL, Tórshavn, Faroe Islands==

The workshop series on Natural Language Processing (NLP) for Computer-Assisted Language Learning (NLP4CALL) is a meeting place for researchers working on the integration of Natural Language Processing and Speech Technologies in CALL systems and exploring the theoretical and methodological issues arising in this connection. The latter includes, among others, insights from Second Language Acquisition (SLA) research, on the one hand, and the promotion of the development of “Computational SLA” through setting up Second Language research infrastructure(s), on the other.

The intersection of Natural Language Processing (or Language Technology / Computational Linguistics) and Speech Technology with Computer-Assisted Language Learning (CALL) brings “understanding” of language to CALL tools, thus making CALL intelligent. This fact has given the name for this area of research – Intelligent CALL, ICALL. As the definition suggests, apart from having excellent knowledge of Natural Language Processing and/or Speech Technology, ICALL researchers need good insights into second language acquisition theories and practices, as well as knowledge of second language pedagogy and didactics. This workshop therefore invites a wide range of ICALL-relevant research, including studies where NLP-enriched tools are used for testing SLA and pedagogical theories, and vice versa, where SLA theories, pedagogical practices or empirical data are modeled in ICALL tools. The NLP4CALL workshop series is aimed at bringing together competences from these areas for sharing experiences and brainstorming around the future of the field.

We welcome papers:
- that describe research directly aimed at ICALL;
- that demonstrate actual or discuss the potential use of existing Language and Speech Technologies or resources for language learning;
- that describe the ongoing development of resources and tools with potential usage in ICALL, either directly in interactive applications, or indirectly in materials, application or curriculum development, e.g. learning material generation, assessment of learner texts and responses, individualized learning solutions, provision of feedback;
- that discuss challenges and/or research agenda for ICALL;
- that describe empirical studies on language learner data.

This year a special focus is given to work done on error detection/correction and feedback generation. We encourage paper presentations and software demonstrations describing the above-mentioned themes primarily, but not exclusively, for the Nordic languages.

==Shared task==

NEW for this year is the MultiGED shared task on token-level error detection for L2 Czech, English, German, Italian and Swedish, organized by the Computational SLA working group. For more information, please see the Shared Task website: https://github.com/spraakbanken/multiged-2023

==Invited speakers==

This year, we have the pleasure to announce two invited talks. The first talk is given by Marije Michel from the University of Amsterdam. The second talk is given by Pierre Lison from the Norwegian Computing Center.

==Submission information==

Authors are invited to submit long papers (8-12 pages) or, alternatively, short papers (4-7 pages), page count not including references.

We will be using the NLP4CALL template for the workshop this year. The author kit can be accessed here or, alternatively, on Overleaf:
<https://spraakbanken.gu.se/sites/default/files/2023/NLP4CALL%20workshop%20t…>
<https://spraakbanken.gu.se/sites/default/files/2023/nlp4call%20template.doc>
<https://www.overleaf.com/latex/templates/nlp4call-workshop-template/qqqzqqy…>

Submissions will be managed through the electronic conference management system EasyChair <https://easychair.org/conferences/?conf=nlp4call2023>. Papers must be submitted digitally through the conference management system, in PDF format. Final camera-ready versions of accepted papers will be given an additional page to address reviewer comments.

Papers should describe original unpublished work or work-in-progress. Papers will be peer reviewed by at least two members of the program committee in a double-blind fashion. All accepted papers will be collected into a proceedings volume to be submitted for publication in the NEALT Proceedings Series (Linköping Electronic Conference Proceedings) and, additionally, double-published through the ACL anthology, following experiences from the previous NLP4CALL editions (<https://www.aclweb.org/anthology/venues/nlp4call/>).

==Important dates==

03 April 2023: paper submission deadline
21 April 2023: notification of acceptance
01 May 2023: camera-ready papers for publication
22 May 2023: workshop date

==Organizers==

David Alfter (1), Elena Volodina (2), Thomas François (3), Arne Jönsson (4), Evelina Rennes (4)

(1) Gothenburg Research Infrastructure for Digital Humanities, Department of Literature, History of Ideas, and Religion, University of Gothenburg, Sweden
(2) Språkbanken, Department of Swedish, Multilingualism, Language Technology, University of Gothenburg, Sweden
(3) CENTAL, Institute for Language and Communication, Université Catholique de Louvain, Belgium
(4) Department of Computer and Information Science, Linköping University, Sweden

==Contact==

For any questions, please contact David Alfter, david.alfter(a)gu.se

For further information, see the workshop website <https://spraakbanken.gu.se/en/research/themes/icall/nlp4call-workshop-serie…>

Follow us on Twitter @NLP4CALL <https://twitter.com/NLP4CALL/>

1st CfP: Asia Pacific Journal of Corpus Research (APJCR) Vol. 4, No. 2 (Deadline: 15 October 2023)
by Prof CK Jung 30 May '24

[Apologies for cross-posting]

Dear colleagues

We are inviting submissions for the next issue of Asia Pacific Journal of Corpus Research, to appear on 31 December 2023.

*ABOUT*
The Asia Pacific Journal of Corpus Research (APJCR, e-ISSN 2733-8096, DOI: https://doi.org/10.22925/apjcr) is an international and interdisciplinary peer-reviewed journal intended to explore corpus research in the Asia Pacific region. APJCR addresses areas of methodological, applied and theoretical work in the field of corpus research. Examples include discourse analysis, lexical studies, grammatical studies, language acquisition, language learning, language education, lexicography, pragmatics, sociolinguistics, (machine) translation studies, (digital) literary studies, computational linguistics, speech, phonetics, deep learning and natural language understanding in conjunction with corpora.

*NO ARTICLE PROCESSING CHARGE*
APJCR does not charge authors an Article Processing Fee (APF).

*OPEN ACCESS POLICY*
APJCR provides open access to its content under the principle in the academic field that making research freely available to the public supports a greater global exchange of knowledge.

*SUBMISSION*
Papers (in English or Korean) should be sent to apjcreditor(a)icr.or.kr
Full instructions can be found on http://icr.or.kr/apjcr

*IMPORTANT DATES*
- Manuscript submission: 15 October 2023
- First decision (articles assessed by editors): October 2023
- Final decision: November 2023
- Production: December 2023
- Online publication: 31 December 2023

*APJCR ARCHIVE*
- Google Scholar: https://scholar.google.co.kr/scholar?hl=ko&as_sdt=0%2C5&q=apjcr&btnG=
- KoreaScience: http://koreascience.or.kr/journal/CPSOBX/v1n1.page

*ENQUIRIES*
help(a)icr.or.kr

---
*CK Jung BEng(Hons) Birmingham MSc Warwick EdD Warwick Cert Oxford*
Associate Professor | Department of English Language and Literature, Incheon National University, *South Korea*
President | The Korea Association of Secondary English Education, *South Korea* (http://kasee.org)
Vice President | The Korea Association of Primary English Education, *South Korea* (http://kapee.or.kr)
Director | Institute for Corpus Research, Incheon National University, *South Korea* (http://icr.or.kr)
Editor-in-Chief | Asia Pacific Journal of Corpus Research, ICR, *International* (http://icr.or.kr/apjcr)
Editorial Board | Corpora, Edinburgh University Press, *UK*
Editorial Board | English Today, Cambridge University Press, *UK*
E: ckjung(a)inu.ac.kr / T: +82 (0)32 835 8129

second call for papers: eacl 2024 SRW
by Neele Falk 07 Mar '24

** SECOND CALL FOR PAPERS: EACL 2024 STUDENT RESEARCH WORKSHOP **

Student Research Workshop co-located with EACL 2024 in St. Julians, Malta.
Workshop Dates: March 21-22, 2024
Paper Submission Deadline: December 18, 2023 (Direct) and January 17, 2024 (through ARR)

** About the Student Research Workshop **

The EACL 2024 Student Research Workshop (SRW) is a forum to bring together students investigating various areas of Computational Linguistics and Natural Language Processing. The workshop provides an excellent opportunity for participants to present their work and to receive mentorship and valuable feedback from the international research community. The workshop's goal is to aid students at multiple stages of their education, including undergraduate, MSc/MA, junior and senior PhD students, in getting familiar with conducting and presenting their research.

** General Invitation for Submission **

We invite papers in two different categories:

* Thesis Proposals: This category is appropriate for PhD students who have decided on a thesis topic and wish to get feedback on their proposal and broader ideas for their continuing work.
* Research Papers: Papers in this category can describe completed work, or work in progress with preliminary results. For these papers, the first author **MUST BE** a current student (graduate or undergraduate).

Topics of interest for the SRW are the same as for the main EACL 2024 conference: https://2024.eacl.org/calls/papers/

We are opening a unique opportunity for the submission of research papers that, while not accepted to the EACL main conference, align well with the themes of this workshop. To be eligible for submission, the first author must be a current student. Additionally, submissions should be accompanied by their ARR reviews to provide context and insights for evaluation. The submission deadline for this will be January 17, 2024.

** Why Submit to EACL SRW? **

* Mentorship program: EACL SRW provides a unique opportunity for students to receive constructive feedback and advice from more senior researchers through our on-site mentorship program.
* Improving your publication record: Publishing a paper as an undergraduate or as an MSc/MA student is beneficial when applying for a PhD program. Publishing a paper in an EACL SRW workshop can be really helpful for improving students’ publication records.
* Negative results: We encourage the submission of studies with negative results providing insights on why and in which scenarios a particular method fails.

All accepted papers and thesis proposals will be presented in the main conference poster sessions, which will give students an opportunity to interact with and to present their work to a large and diverse audience, including top researchers in the field and assigned mentors.

** Important Dates **

* Direct Workshop paper submission: December 18, 2023
* Pre-reviewed ARR paper submission: January 17, 2024
* Notification of acceptance: January 20, 2024
* Camera-ready deadline: January 30, 2024
* Workshop dates: March 21-22, 2024

All deadlines are 11:59PM UTC-12:00 ("anywhere on Earth").

** Submission Requirements **

We accept both archival submissions (which will be included in the conference proceedings) and non-archival submissions (which will be presented at the workshop but will not be included in the proceedings). The archival submissions must follow the anonymity period and the restrictions of the main conference.

Short papers consist of up to four (4) pages of content, plus unlimited references. Upon acceptance, they will be given five (5) content pages in the proceedings.

Long papers consist of up to eight (8) pages of content, plus unlimited references. Upon acceptance, they will be given nine (9) content pages in the proceedings.

Thesis proposals consist of up to eight (8) pages of content, plus unlimited references. The title must begin with “Thesis Proposal:”. Upon acceptance, they will be given nine (9) content pages in the proceedings.

We strongly recommend the use of the official ARR style templates. The paper templates are available as an Overleaf template and can also be downloaded directly (LaTeX and Word) via https://aclrollingreview.org/cfp under 'Paper Submission and Templates'.

All submissions must be in PDF format. Submissions that do not adhere to the above author guidelines or ACL policies will be rejected without review. Submission is electronic, using the OpenReview conference management system. The submission link is available here: https://openreview.net/group?id=eacl.org/EACL/2024/Workshop/SRW

** Grants **

We expect to have grants to offset some portion of students' travel, conference registration, and accommodation expenses. Further details will be posted on the SRW website.

** Website and Contact Information **

For more information, please visit https://sites.google.com/view/eacl2024srw and follow us on Twitter @eacl_srw. To contact the organizers of the workshop, please email us at eaclsrw(a)gmail.com

[3rd CfP]: CALD-pseudo workshop @ EACL 2024
by Elena Volodina 29 Feb '24

Third Call for Papers: CALD-pseudo workshop on Computational Approaches to Language Data Pseudonymization @ EACL 2024, March 21 or 22, 2024

Website: https://mormor-karl.github.io/events/CALD-pseudo/
Submission website: https://softconf.com/eacl2024/CALD-pseudo-2024/
Submission Deadline: Monday, 18 December 2023 (anywhere on earth)

We invite submissions to the first edition of the CALD-pseudo workshop on Computational Approaches to Language Data Pseudonymization, to be held at EACL 2024 on March 21 or 22, 2024.

[Important Dates]
* December 18, 2023: paper submission deadline
* January 17, 2024: resubmission of already pre-reviewed ARR papers
* January 20, 2024: notification of acceptance
* January 30, 2024: camera-ready papers due
* March 21 or 22, 2024: workshop date (the date to be confirmed by the EACL)

[Introduction]
Accessibility of research data is critical for advances in many research fields, but textual data often cannot be shared due to the personal and sensitive information which it contains, e.g. names, political opinions, sensitive personal information and medical data. The General Data Protection Regulation, GDPR (EU Commission, 2016), suggests pseudonymization as a solution to secure open access to research data, but we need to learn more about pseudonymization as an approach before adopting it for manipulation of research data (Volodina et al., 2023). The main challenge is how to effectively pseudonymize data so that individuals cannot be identified, while at the same time keeping the data usable for the research for which it was collected, in fields such as computational linguistics, linguistics and natural language processing.

[Topics of Interest]
The CALD-pseudo workshop invites a broad community of researchers in all concerned cross-disciplinary fields to jointly discuss challenges within pseudonymization, such as
* automatic approaches to detection and labelling of personal information in unstructured language data, including events and other context-dependent cues revealing a person;
* developing context-sensitive algorithms for replacement of personal information in unstructured data;
* studies into the effects of pseudonymization on unstructured data, e.g. applicability of pseudonymised data for the intended research questions, readability of pseudonymised data or addition of unwelcome biases through pseudonymization;
* effectiveness of pseudonymization as a way of protecting writer identity;
* reidentification studies, e.g. adversarial learning techniques that attempt to breach the privacy protections of pseudonymized data;
* constructing datasets for automatic pseudonymization, including methodological and ethical aspects of those;
* approaches to the evaluation of automatic pseudonymization both in concealing the private information and preserving the semantics of the non-personal data;
* pseudonymization tools and software: evaluating the available tools and software for pseudonymization in different languages, and their ease of use, scalability, and performance;
* and numerous other open questions.

[Submission Guidelines]
Authors are invited to submit, by December 18, 2023, original and unpublished research papers in the following categories:
* Full papers (up to 8 pages) for substantial contributions
* Short papers (up to 4 pages) for ongoing or preliminary work

All submissions must be in PDF format, must follow the EACL 2024 guidelines described in the ARR CfP (https://aclrollingreview.org/cfp), and use the official ACL style templates available here: https://github.com/acl-org/acl-style-files

Direct submission deadline: December 18, 2023 at https://softconf.com/eacl2024/CALD-pseudo-2024/
Deadline for registration of ARR reviewed papers: January 17, 2024. (Further instructions will follow.)

We also invite authors of papers on the topics of the workshop accepted to Findings to reach out to the organizing committee of CALD-pseudo to present them at the workshop.

[Invited speakers]
We are happy to announce that the workshop will host two invited speakers:
* Anders Søgaard, University of Copenhagen, Denmark
* Ildikó Pilán, the Norwegian Computing Center, Norway

[Workshop Organizers]
* Elena Volodina, University of Gothenburg, Sweden
* Therese Lindström Tiedemann, University of Helsinki, Finland
* Simon Dobnik, University of Gothenburg, Sweden
* Xuan-Son Vu, Umeå University, Sweden

[Program Committee]
A list of program committee members is available on the workshop website.

[Contact]
For inquiries, please contact mormor.karl(a)svenska.gu.se

ACL link to the call: https://www.aclweb.org/portal/content/computational-approaches-language-dat…

___________________
Elena Volodina, PhD, Docent
https://spraakbanken.gu.se/en/about/staff/elena

Life is like a mirror. Smile at it and it smiles back at you.
Peace Pilgrim

Open Positions in the Newly Established Bamberg NLP Research Group
by Roman Klinger 14 Feb '24

I will start a new research group on natural language processing as part of the Bamberg AI Center (https://www.uni-bamberg.de/en/bacai/). We do fundamental NLP research at the intersection with computational psychology, digital humanities, and computational social sciences.

We currently have four open positions (deadline February 28, 2024):

1. Postdoc, Open Topic (3 years)
2. PhD student in interactive prompt optimization (3 years)
3. Researcher in event-centered emotion analysis (1 year)
4. Researcher in multimodal emotion analysis (1 year)

Positions 3 and 4 can be combined into a single 2-year position.

Please find more details at https://www.bamnlp.de/openpositions/

Do not hesitate to contact me if you have questions!

Roman Klinger

Call for ARR Commitment WNUT 2023
by rob van der goot 17 Jan '24

Dear colleagues,

The 9th Workshop on Noisy and User-generated Text is welcoming paper commitments from ARR.

More info on the workshop: http://noisy-text.github.io
ARR commitment link: https://openreview.net/group?id=eacl.org/EACL/2024/Workshop/WNUT_ARR_C…

Our ARR commitment deadline is January 17th anywhere on earth, so you can also commit EACL submissions after rejection.

Best,
Rob

1st CfP: Special Session on Emergent Phenomena in Deep Representations and Large Language Models @IJCNN 2024 & IEEE WCCI 2024
by Ozge Alacam 16 Jan '24

Apologies for cross-posting
------------------------------------------------------

Dear colleagues,

We invite you to submit to the special session on “Emergent Phenomena in Deep Representations and Large Language Models” as a part of IJCNN 2024 and IEEE WCCI 2024, which will be held in Yokohama, Japan. We are looking forward to your contributions. Please find the CfP below.

Best wishes,
On behalf of the Organising Committee
Özge Alacam

------------------------------------------------------

First Call for Papers: Special Session on Emergent Phenomena in Deep Representations and Large Language Models @IJCNN 2024 & IEEE WCCI 2024

Deep learning models trained on large datasets have shown spectacular performance in a wide range of tasks, as demonstrated by current applications of Large Language Models. However, recent works have shown that the abilities large machine learning models acquire often emerge unpredictably with increasing model complexity or training dataset size. These emergent phenomena include the unexpected appearance of abilities for which the model was not explicitly trained, but they might also be related to unexpected performance boosts due to the increased model complexity. Emergent phenomena are not always beneficial: larger models may pick up new biases from the training data or start hallucinating.

To move towards increasingly sustainable, reliable, and explainable applications of AI systems, it is necessary to increase the understanding of the mechanisms surrounding emergent phenomena. Moreover, this effort provides increased insight into the learning process behind the acquisition of abilities of large models to perform specific tasks. Important research questions relate to the definition of emergent phenomena, their causes (what controls which abilities are acquired and when?), training efficiency and training data quality (e.g., acquiring desired abilities with less computational effort), prompting strategies to get or test for desired model behaviour (e.g., a chain of thought), and further verification methods of model abilities and properties.

The primary goal of this special session is (i) to discuss the emergent abilities and risks in deep neural networks and representations from very different angles and (ii) to facilitate networking and encourage collaboration between various research fields that approach this issue from different perspectives, like computational linguistics, ethics in AI, computer science, physics, etc.

Topics of interest include, but are not limited to:
• The definition of emergence in the context of NLP and ML
• Prompting strategies
• Physics-based/inspired analyses (e.g. phase transitions in ML models)
• Explainability and interpretability (XAI)
• Evaluation measures for model ability, monitoring strategies, assessment of model abilities (e.g. technical or psychology-based)
• Knowledge distillation, model pruning, energy-efficient models
• Mitigation strategies for emergent risks and model deterioration
• Fine-tuning and Retrieval-augmented generation (RAG)
• Papers focusing on specific emergent phenomena (reasoning, creativity, double descent phenomena etc.)

The website for the call for papers is accessible at https://sites.google.com/view/emergenn/call-for-papers

Organising Committee:
------------------------------
• Dr. Özge Alacam (Ludwig-Maximilian University & Uni Bielefeld, Germany)
• Dr. Michiel Straat (Uni Bielefeld, Germany)
• Prof. Dr. Hinrich Schütze (Ludwig-Maximilian University, Germany)
• Prof. Dr. Alessandro Sperduti (University of Padova, Italy)

Important Dates:
------------------------------
• January 15, 2024 - Paper Submission Deadline
• March 15, 2024 - Notification of Acceptance
• May 1, 2024 - Camera-ready Deadline & Early Registration Deadline
• June 30 - July 5, 2024 - Main Conference (IEEE WCCI 2024, Yokohama, Japan)

* All deadlines are 11:59 PM UTC-12:00 ("anywhere on Earth")

Submission Format and Platform:
------------------------------
• Submissions will be through the IEEE WCCI 2024 Submission page <https://edas.info/login.php?rurl=aHR0cHM6Ly9lZGFzLmluZm8vTjMxNjE0P2M9MzE2MT…>.
• Each paper is limited to 8 pages, including figures, tables, and references. Please refer to the author guidelines provided by IEEE WCCI 2024.
• Please specify during the submission that your paper is intended for the Special Session: Emergent Phenomena in Deep Representations and Large Language Models.
• Special session webpage: https://sites.google.com/view/emergenn/call-for-papers
• IEEE WCCI 2024 webpage: https://2024.ieeewcci.org/

Contact information:
------------------------------
• Özge Alacam: oezge.alacam(a)uni-bielefeld.de
• Michiel Straat: mstraat(a)techfak.uni-bielefeld.de
