March 2024 - Corpora

Postdoc opening at UT Austin
by Matt Lease 04 Apr '24

I have a postdoc opening in my lab with *applications due April 10th*. See Bullard Research Fellow (BRF) area 7 (“*BRF7*”) in the job ad here: https://apply.interfolio.com/142711.

"BRF7) We seek applicants in natural language processing (NLP), information retrieval (IR), and human computation & crowdsourcing (HCOMP). Our work on responsible AI develops methods for model explanations and fairness. We build automated and human-in-the-loop models. We develop general methods to advance the state-of-the-art, grounded in social challenges like curbing disinformation and hate speech. A variety of our ongoing work touches on large language models (LLMs). This position will be mentored by Matt Lease <https://www.ischool.utexas.edu/~ml/>, as part of his lab for Artificial Intelligence and Human-Centered Computing <http://ai.ischool.utexas.edu/> (AI&HCC), and provide collaboration opportunities in UT Austin’s campus-wide Good Systems <http://goodsystems.utexas.edu/> grand challenge for responsible AI."

Please see the job ad for full details about the opening: https://apply.interfolio.com/142711.

--
Matt Lease
Professor
Information & Computer Science
University of Texas at Austin
Voice: (512) 471-9350 · Fax: (512) 471-3971 · Office: UTA 5.536
http://www.ischool.utexas.edu/~ml

Call for papers 4th Workshop on Scholarly Document Processing - SDP@ACL 2024
by Tirthankar Ghosal 02 Apr '24

** Call for Research Papers **

Scholarly literature is the chief means by which scientists and academics document and communicate their results and is therefore critical to the advancement of knowledge and improvement of human well-being. At the same time, this literature poses challenges to NLP uncommon in other genres, such as specialized language and high background knowledge requirements, long documents and strong structural conventions, multimodal presentation, citation relationships among documents, an emphasis on rational argumentation, and the frequent availability of detailed metadata. These challenges necessitate the development of NLP methods and resources optimized for this domain.

The Scholarly Document Processing (SDP) workshop provides a venue for discussing these challenges, bringing together stakeholders from different communities including computational linguistics, machine learning, text mining, information retrieval, digital libraries, scientometrics and others, to develop methods, tasks, and resources in support of these goals. This workshop builds on the success of prior workshops: the 1st, 2nd, and 3rd SDP workshops held at EMNLP 2020, NAACL 2021, and COLING 2022, and the 1st and 2nd SciNLP workshops held at AKBC 2020 and 2021. In addition to having broad appeal within the NLP community, we hope the SDP workshop will attract researchers from other relevant fields including meta-science, scientometrics, data mining, information retrieval, and digital libraries, bringing together these disparate communities within ACL.

Website: https://sdproc.org/2024/
X (Twitter): https://twitter.com/sdpworkshop

Topics of Interest

We invite submissions from all communities demonstrating usage of and challenges associated with natural language processing, information retrieval, and data mining of scholarly and scientific documents.
Relevant topics include (but are not limited to):

- Large Language Models (LLMs) for Science
- Representation learning and language modeling
- Information extraction and NER
- Document understanding
- Summarization and generation
- Question-answering
- Discourse modeling/argumentation mining
- Network analysis
- Bibliometrics, scientometrics, and altmetrics
- Reproducibility and research integrity, including new challenges posed by generative AI
- Peer review tools, principles and technology
- Metadata and indexing
- Inclusion of datasets and computational resources
- Research infrastructures and digital libraries
- Increasing the representation in scholarly work of disadvantaged populations
- LLM-based interfaces to consume/produce scholarly documents

** Submission Information **

Authors are invited to submit full and short papers with unpublished, original work. Submissions will be subject to a double-blind peer-review process. Accepted papers will be presented by the authors at the workshop either as a talk or a poster. All accepted papers will be published in the workshop proceedings (proceedings from previous years can be found here: https://aclanthology.org/venues/sdp/).

The submissions must be in PDF format and anonymized for review. All submissions must be written in English and follow the ACL 2024 formatting requirements:

Long paper submissions: up to 8 pages of content, plus unlimited references.
Short paper submissions: up to 4 pages of content, plus unlimited references.

Submission Website: Paper submission has to be done through OpenReview: <https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/SDProc>

Final versions of accepted papers will be allowed 1 additional page of content so that reviewer comments can be taken into account.
** Important Dates (Main Research Track) **

Paper submission deadline: May 17 (Friday), 2024
Notification of acceptance: June 17 (Monday), 2024
Camera-ready paper due: July 1 (Monday), 2024
Workshop dates: August 16, 2024

** SDP 2024 Keynote Speakers **

We are excited to have several keynote speakers at SDP 2024.

1. Iryna Gurevych, Professor at Technical University Darmstadt and head of the UKP Lab, Germany.
2. Anna Rogers, Assistant Professor, University of Copenhagen, Denmark.
3. Heng Ji, Professor, University of Illinois at Urbana-Champaign, USA.
4. Doug Downey, Associate Professor at Northwestern University and Research Manager at Allen Institute for AI, USA.

** SDP 2024 Shared Tasks **

SDP 2024 will host two exciting shared tasks. More information about all shared tasks is provided on the workshop website: https://sdproc.org/2024/sharedtasks.html

DAGPap24: Detecting automatically generated scientific papers

A big problem with the ubiquity of Generative AI is that it has now become very easy to generate fake scientific papers. This can erode public trust in science and attack the foundations of science: are we standing on the shoulders of robots? The Detecting Automatically Generated Papers (DAGPAP) competition aims to encourage the development of robust, reliable AI-generated scientific text detection systems, utilizing a diverse dataset and varied machine learning models in a number of scientific domains.

Organizers: Savvas Chamezopoulos, Yury Kashnitsky, Drahomira Herrmannova, Anita de Waard (Elsevier), Domenic Rosati (Scite)

Context24: Contextualizing Scientific Figures and Tables

When making sense of results across many research papers on a topic, figures or tables of key results from the papers can serve as effective, information-dense summaries that can be compared/contrasted and synthesized with other results.
However, to understand the results, key elements (e.g., measures, sample) need to be contextualized with associated methodological details, which are typically dispersed throughout the text, often far from the figure/table and from each other. In this shared task, we are interested in contextualizing scientific figures and tables, i.e., automatically retrieving and ranking snippets from the paper that are most needed to interpret their results, with the goal of making figures/tables more self-contained.

Organizers: Joel Chan, Matthew Akamatsu

** Organizing Committee **

Tirthankar Ghosal, Oak Ridge National Laboratory, USA
Philipp Mayr, GESIS – Leibniz Institute for the Social Sciences, Germany
Aakanksha Naik, Allen Institute for AI, USA
Shannon Shen, Massachusetts Institute of Technology, USA
Amanpreet Singh, Allen Institute for AI, USA
Anita de Waard, Elsevier, Netherlands
Orion Weller, Johns Hopkins University, USA
Yanxia Qin, National University of Singapore, Singapore
Yoonjoo Lee, Korea Advanced Institute of Science & Technology, South Korea

--
+++++++++++++++++++++++++++++++++++
*Tirthankar Ghosal*
Scientist
National Center for Computational Sciences (NCCS)
Oak Ridge National Laboratory, United States
++++++++++++++++++++++++++++++++++++

Registrations open for The Alan Turing Institute AI UK Fringe Event Public Lectures on April 16, 2024
by Mohammed Hasanuzzaman 01 Apr '24

Dear Colleagues,

Exciting news! Join us for "*AI and Future*" at Queen’s University Belfast, UK, on April 16, 2024, at 1:00 PM BST. It's The Alan Turing Institute AI UK Fringe Event Public Lectures – don't miss out!

Event Details:

- Date: *April 16, 2024*, starting at 1:00 PM BST
- Format: Hybrid (Link will be shared after the registration deadline)
- Participation Fees: Free

Talks Include:

1. "Reflections on Consciousness in AI" by *Dr. Patrick Butlin, University of Oxford, UK*
2. "Scalable Multimodal Learning and Multimedia Recommendation" by *Prof. Jialie Shen, City, University of London, UK*
3. "Generalization Error of Neural Networks and Its Applications" by *Prof. Wing W. Y. Ng, South China University of Technology, China*

Registration Details:

Please register your interest through our application, accessible via the following link: Application Link <https://forms.office.com/e/6U2h0chaFS>. The deadline for registration is *April 10, 2024*.

We look forward to your participation in this insightful event. Should you have any questions or require further information, please feel free to reach out to me at m.hasanuzzaman(a)qub.ac.uk

Best regards,
Mohammed

------------------------------------------------------------------------------------------------------
Dr. Mohammed Hasanuzzaman
Lecturer/Assistant Professor
Queen's University Belfast <https://www.qub.ac.uk/>, UK
& Munster Technological University <https://www.mtu.ie/>, Ireland
Funded Investigator, ADAPT Centre - A World-Leading SFI Research Centre <https://www.adaptcentre.ie/>
Chercheur Associé, GREYC UMR CNRS 6072 Research Centre, France <https://www.greyc.fr/en/home/>
Associate Editor: IEEE Transactions on Affective Computing, Nature Scientific Reports, IEEE Transactions on Computational Social Systems, ACM TALLIP, PLOS One, Computer Speech and Language
Website: https://mohammedhasanuzzaman.github.io/

PhD position - Deadline April 15th - CASCADE EU Horizon MSCA - Diachronic development of text types in the English Language
by Stefania Degaetano-Ortlieb 29 Mar '24

New deadline April 15th!

Exciting EU Horizon Doctoral Network Opportunity!

Network: HORIZON Marie Skłodowska-Curie Actions (MSCA) Doctoral Network on Computational Analysis of Semantic Change Across Different Environments (CASCADE)
Job Title: PhD Candidate in Computational Linguistics/Digital Humanities/Computational Humanities/Corpus Linguistics
Project Title: Diachronic development of text types in the English Language

Please share with interested students!

Important: see eligibility requirements below!!

Full official advert available here: https://www.uni-saarland.de/fileadmin/upload/verwaltung/stellen/Wissenschaftler/W2440.pdf

Workplace

Computational Analysis of Semantic Change Across Different Environments (CASCADE) is a HORIZON Marie Skłodowska-Curie Actions (MSCA) Doctoral Networks scheme which will bring together researchers from a variety of backgrounds, including literary studies, historical text analysis, semantics, corpus linguistics, machine learning and natural language processing. The project will emphasise the importance of computational linguistics and humanities scholarship as skills that bring value and competitive edge to organisations concerned with semantically aware information retrieval and text analytics. CASCADE is a partnership between University College Cork, the University of Sheffield, KU Leuven, the University of Helsinki, and Saarland University.

Location

This position is to be filled at Saarland University, seeking applications for a PhD project entitled Diachronic development of text types in the English Language, which will be one of ten funded projects to emerge out of CASCADE's doctoral network. The successful candidate will be appointed as a Marie Curie Early Career Researcher at Saarland University, enrolled in a fully-funded three-year PhD programme. The successful candidate will be supervised by PD Dr. Stefania Degaetano-Ortlieb. The position will be located within the Department of Language Science and Technology.
The successful applicant will have the opportunity to collaborate with researchers affiliated with the DFG-funded Collaborative Research Center (CRC) 1102 on Information Density and Linguistic Encoding at Saarland University. CRC 1102 is a thriving research environment with over 30 PhD students and postdocs from many subfields of Linguistics, Computational Linguistics and Psycholinguistics. The Department of Language Science and Technology consists of about 100 research staff in nine research groups in the fields of Computational Linguistics, Psycholinguistics, Speech Processing, and Corpus Linguistics. It is part of Saarland Informatics Campus, which brings together computer science research at the university with world-class research institutions on campus.

Project description

The project Diachronic development of text types in the English Language will use computational approaches to model temporal dynamics of textual data (corpus-based and novel probabilistic measures) for the analysis of the development of text types in English. From the 15th century onward, increased use of the vernacular and perceived syntactic and lexical gaps (compared with Latin) prompted linguistic innovation in English, concurrent with the progressive conventionalization of text types. Existing variation was made functional and new distinctions were either borrowed or invented (e.g. transfer of features from medicine to cooking recipes). Applying corpus-based and novel probabilistic measures to investigate diachronic change and with support from Text+ and IDeaL, the PhD candidate will be able to:

1) determine the linguistic features of different text types;
2) determine how these features change over time;
3) assess the overlap of different text types and the possible mutual influence of text types;
4) develop a computational workflow to support the above objectives which is made publicly available (e.g. on github).

Eligibility: This is important!
The vacancy is open to applicants of all nationalities who comply with the mobility requirement and the degree requirement:

* applicants must not have resided in the country of the CASCADE university that will host them (university of the main supervisor, Germany in this case) for more than 12 months in the 36 months preceding appointment
* applicants must not hold a doctoral degree in any other field or have failed a corresponding doctoral examination at another higher education institution.

More information about the Network

CASCADE is committed to creating an environment in which all talents can develop to their maximum potential, regardless of gender, age, cultural origin, nationality or disability. We particularly encourage candidates from traditionally underrepresented groups to apply. CASCADE is advertising positions for 10 candidates within its network, who will be housed across the five participating universities. While applicants are welcome to apply for more than one project, applications for each position must be submitted separately. In the event an applicant does apply for more than one position and has a particular preference, this should be indicated by email to PD Dr. Stefania Degaetano-Ortlieb. Applicants are encouraged to only apply for projects for which they are ideally qualified and suitable. The recruitment process will be done centrally by the CASCADE selection committee.

Job requirements and responsibilities

* Participation in the general training programme of the doctoral network (kick-off, monthly online research meetings, annual workshops and schools, and final conference)
* Collaboration with a multidisciplinary research group focused on the intersection of Computational Linguistics, Linguistics, and Digital Humanities.
* Creation of a typology of diachronic text types.
* Large-scale analysis of the evolution of meaning across text types.
* Development of a computationally supported methodology for diachronic text typology.
* Participation in joint publications and engagement in academic service.
* The successful candidate will be expected to participate in a series of international conferences and workshops, as well as undertake secondments in Germany and abroad.

Your academic qualifications

* Completed university studies in English Linguistics with strong historical and/or corpus-based background, Digital Humanities, Corpus Linguistics, Computational Linguistics, or related fields, held by time of appointment (MA-degree)
* Language skills (according to GER): English B2

The successful candidate will also be expected to have:

* Demonstrated experience in applying text-/corpus-based or comparative approaches to the analysis of language variation (e.g., diachronic, sociolinguistic, multilingual variation) (desirable);
* Capability to bridge the gap between data science and the humanities (essential);
* Enthusiasm for inter- and multidisciplinary research and ability to work independently and collaboratively (essential);
* Strong research, analytical, and organizational skills (essential);
* Excellent written and verbal communication skills (essential);
* A good command of English is mandatory;
* Language skills (according to GER): English B2

What we can offer you:

* A flexible work schedule allowing you to balance work and family
* Interdisciplinary supervision and structured PhD training
* An occupational health management model with numerous attractive options, such as our university sports program
* A broad range of further education and professional development programmes (e.g. language courses)

Applicants are requested to enclose the following documents in English with their application:

1. A curriculum vitae, including any publications (max. 10 pages);
2. A cover statement outlining the applicant's ambitions for Diachronic development of text types in the English Language and the wider CASCADE network (max. 2 pages);
3. A single-authored writing sample from your previous studies (any length).

We look forward to receiving your meaningful online application (in a PDF file) by 15.04.2024 (extension is planned) to s.degaetano(a)mx.uni-saarland.de <mailto:s.degaetano@mx.uni-saarland.de>. Please include the reference number W2440 in the subject line of the e-mail.

If you have any questions, please contact us for assistance.

Your contact:
PD Dr. Stefania Degaetano-Ortlieb
Assistant Professor (Akademische Rätin)
Department of Language Science and Technology
Saarland University
Campus A2.2
66123 Saarbrücken
s.degaetano(a)mx.uni-saarland.de <mailto:s.degaetano@mx.uni-saarland.de>
www.stefaniadegaetano.com <http://www.stefaniadegaetano.com/>

PhD position - Deadline April 15th - CASCADE EU Horizon MSCA - Modeling context for the analysis of language variation and change
by Stefania Degaetano-Ortlieb 29 Mar '24

New deadline April 15th!

Exciting EU Horizon Doctoral Network Opportunity!

Network: HORIZON Marie Skłodowska-Curie Actions (MSCA) Doctoral Network on Computational Analysis of Semantic Change Across Different Environments (CASCADE)
Job Title: PhD Candidate in Computational Linguistics/Digital Humanities/Computational Humanities/Data Science (AI/NLP)
Project Title: Modeling context for the analysis of language variation and change

Please share with interested students!

Important: see eligibility requirements below!!

Full official advert available here: https://www.uni-saarland.de/fileadmin/upload/verwaltung/stellen/Wissenschaftler/W2439.pdf

Workplace

Computational Analysis of Semantic Change Across Different Environments (CASCADE) is a HORIZON Marie Skłodowska-Curie Actions (MSCA) Doctoral Networks scheme which will bring together researchers from a variety of backgrounds, including literary studies, historical text analysis, semantics, corpus linguistics, machine learning and natural language processing. The project will emphasise the importance of computational linguistics and humanities scholarship as skills that bring value and competitive edge to organisations concerned with semantically aware information retrieval and text analytics. CASCADE is a partnership between University College Cork, the University of Sheffield, KU Leuven, the University of Helsinki, and Saarland University.

Location

This position is to be filled at Saarland University, seeking applications for a PhD project entitled Modeling context for the analysis of language variation and change, which will be one of ten funded projects to emerge out of CASCADE's doctoral network. The successful candidate will be appointed as a Marie Curie Early Career Researcher at Saarland University, enrolled in a fully-funded three-year PhD programme. The successful candidate will be supervised by PD Dr. Stefania Degaetano-Ortlieb. The position will be located within the Department of Language Science and Technology.
The successful applicant will have the opportunity to collaborate with researchers affiliated with the DFG-funded Collaborative Research Center (CRC) 1102 on Information Density and Linguistic Encoding at Saarland University. CRC 1102 is a thriving research environment with over 30 PhD students and postdocs from many subfields of Linguistics, Computational Linguistics and Psycholinguistics. The Department of Language Science and Technology consists of about 100 research staff in nine research groups in the fields of Computational Linguistics, Psycholinguistics, Speech Processing, and Corpus Linguistics. It is part of Saarland Informatics Campus, which brings together computer science research at the university with world-class research institutions on campus.

Project description

The project Modeling context for the analysis of language variation and change will use computational approaches to model linguistic and extra-linguistic context for the analysis of language variation and change. Context has a major impact on how we process language. However, the notion of context is very broad, ranging from broadly conceived pragmatic conditions, i.e. the extra-linguistic context (e.g. socio-cultural factors, genres, time), to the relationship among linguistic elements that can substitute for each other in a given context, i.e. the paradigmatic context, up to the local linguistic context of linguistic elements, i.e. the syntagmatic context. While studies on language variation and change do encompass the notion of context, the coverage of contextual factors is often relatively limited. This project will apply and further develop computational modeling techniques to integrate contextual factors among the types of context described above to arrive at more comprehensive accounts of effects of contextual factors on language variation and change.

Eligibility: This is important!
The vacancy is open to applicants of all nationalities who comply with the mobility requirement and the degree requirement:

* applicants must not have resided in the country of the CASCADE university that will host them (university of the main supervisor, Germany in this case) for more than 12 months in the 36 months preceding appointment
* applicants must not hold a doctoral degree in any other field or have failed a corresponding doctoral examination at another higher education institution.

More information about the Network

CASCADE is committed to creating an environment in which all talents can develop to their maximum potential, regardless of gender, age, cultural origin, nationality or disability. We particularly encourage candidates from traditionally underrepresented groups to apply. CASCADE is advertising positions for 10 candidates within its network, who will be housed across the five participating universities. While applicants are welcome to apply for more than one project, applications for each position must be submitted separately. In the event an applicant does apply for more than one position and has a particular preference, this should be indicated by email to PD Dr. Stefania Degaetano-Ortlieb. Applicants are encouraged to only apply for projects for which they are ideally qualified and suitable. The recruitment process will be done centrally by the CASCADE selection committee.

Job requirements and responsibilities

* Participation in the general training programme of the doctoral network (kick-off, monthly online research meetings, annual workshops and schools, and final conference)
* Collaboration with a multidisciplinary research group focused on the intersection of Computational Linguistics, Linguistics, and Digital Humanities.
* Creation of a computational approach to model contextual factors.
* Large-scale analysis of effects of contextual factors on modeling language variation and change.
* Participation in joint publications and engagement in academic service.
* The successful candidate will be expected to participate in a series of international conferences and workshops, as well as undertake secondments in Germany and abroad.

Your academic qualifications

* Completed university studies in Digital Humanities, Computational Linguistics, Data Science, AI/NLP, Information Science, or related fields, held by time of appointment
* Language skills (according to GER): English B2

The successful candidate will also be expected to have:

* Demonstrated experience in applying computational methods to textual data (essential);
* Familiarity with eighteenth-century English data sources (desirable);
* Capability to bridge the gap between data science and the humanities (essential);
* Enthusiasm for inter- and multidisciplinary research and ability to work independently and collaboratively (essential);
* Strong research, analytical, and organizational skills (essential);
* Excellent written and verbal communication skills (essential);
* A good command of English is mandatory;
* Language skills (according to GER): English B2

What we can offer you:

* A flexible work schedule allowing you to balance work and family
* Interdisciplinary supervision and structured PhD training
* An occupational health management model with numerous attractive options, such as our university sports program
* A broad range of further education and professional development programmes (e.g. language courses)

Applicants are requested to enclose the following documents in English with their application:

1. A curriculum vitae, including any publications (max. 10 pages);
2. A cover statement outlining the applicant's ambitions for Modeling context for the analysis of language variation and change and the wider CASCADE network (max. 2 pages);
3. A single-authored writing sample from your previous studies (any length).
We look forward to receiving your meaningful online application (in a PDF file) by 15.4.2024 (extension is planned) to s.degaetano(a)mx.uni-saarland.de <mailto:s.degaetano@mx.uni-saarland.de>. Please include the reference number W2440 in the subject line of the e-mail.

If you have any questions, please contact us for assistance.

Your contact:
PD Dr. Stefania Degaetano-Ortlieb
Assistant Professor (Akademische Rätin)
Department of Language Science and Technology
Saarland University
Campus A2.2
66123 Saarbrücken
s.degaetano(a)mx.uni-saarland.de <mailto:s.degaetano@mx.uni-saarland.de>
www.stefaniadegaetano.com <http://www.stefaniadegaetano.com/>

CfP: Sixth Workshop on Teaching NLP
by Biester, Laura 29 Mar '24

Call For Papers: Sixth Workshop on Teaching NLP at ACL 2024

The Sixth Workshop on Teaching NLP will be co-located with the 2024 Annual Meeting of the Association for Computational Linguistics in Bangkok, Thailand. The workshop will occur on August 15 (hybrid option available). The one-day workshop will combine a program of traditional keynotes, posters, and oral presentations, with discourse through panel discussion, and focus on building a community for sharing resources.

Call for Papers

The field of Natural Language Processing (NLP) is growing rapidly, with new state-of-the-art methods emerging every year. This rapid growth challenges educators of NLP courses and degree programs to constantly revise their old material and create fresh NLP courses and degree programs, as well as new best practices and educational materials focused on emerging subareas of NLP. To support those facing these challenges, our one-day workshop will bring together the communities of NLP research and education to facilitate active discussion on questions including (but not limited to):

* How can we facilitate meaningful conversations about language among Computer Science students?
* How do we include user-centered design in core NLP curricula?
* How should NLP educators design curricula that equip students with the ability to advance responsible and ethical NLP?
* How can we design assignments that require GPU access or the use of paid APIs?
* What are best practices that NLP educators from universities, industry groups, and Massive Open Online Courses (MOOCs) can use to share tools and resources for NLP education?

This timely sixth edition of the Teaching NLP Workshop builds on prior successful offerings to tackle the most pressing issues in how to design NLP courses and bring together instructors from various backgrounds to discuss, create, and refine instructional design and material.
Submission Information

We welcome two submission types: teaching materials and papers.

Teaching Materials (short papers)

We invite short paper submissions of 1-2 pages that describe teaching materials such as curricula, course GitHub repositories, Jupyter notebooks, slides, homework, and assignments. These short papers do not need to be anonymised, but will be peer-reviewed and published in workshop proceedings, as well as presented in posters or demos. The corresponding teaching materials, while not being part of proceedings, should be submitted in addition to the short paper. We will create a Teaching NLP repository/wiki where authors may opt-in to make their materials available for the community after the workshop.

Papers

We invite papers of up to 8 pages discussing pedagogical aspects of NLP, focusing on (but not limited to) any of the following general topics:

* Tools and methodologies (e.g., active learning, flipped classroom)
* Scaling curricula to fit large class sizes
* Adapting existing curricula to incorporate new NLP advancements
* Teaching online NLP courses or adjusting courses to become remote
* Challenges of designing the first NLP course or related degree program at a college, university, or on a MOOC platform
* Teaching heterogeneous groups of students (e.g., with respect to prior experience in computer science and linguistics)
* Teaching underrepresented students
* Bridging the gap between academic training and industry needs
* Incorporating ethics, reproducibility, and responsible practices in NLP courses
* Teaching multilingual NLP

All submissions will be processed through OpenReview <https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/TeachNLP>.
Important Dates

* Paper Submission: May 17, 2024
* Notification of Acceptance: June 17, 2024
* Camera-Ready Deadline: July 1, 2024
* Teaching NLP Workshop: August 15, 2024

Website: https://sites.google.com/view/teachingnlpacl2024/
Contact: teachingnlp.yt(a)gmail.com <mailto:teachingnlp.yt@gmail.com>

Best,
TeachingNLP 2024 Organizers (Sana Al-azzawi, Laura Biester, György Kovács, Ana Marasović, Leena Mathur, Margot Mieskes, Leonie Weissweiler)

[CIKM 2024] - Call for PhD Symposium papers
by antonela.tommasel＠isistan.unicen.edu.ar 29 Mar '24

* We apologize if you receive multiple copies of this CfP * * For the online version of this Call, visit: https://cikm2024.org/call-for-phd-symposium/ =============== CIKM 2024: 33rd ACM International Conference on Information and Knowledge Management Boise, Idaho, USA October 21–25, 2024 =============== We are excited to invite Ph.D. students in databases (DB), information retrieval (IR), and knowledge management (KM) to submit their research proposals for the PhD Symposium at the 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024). The conference will take place at the Boise Centre in Boise, Idaho, USA, from October 21 to 25, 2024. The PhD Symposium is designed to provide a supportive environment where doctoral students can present their ongoing research, receive feedback from experienced researchers, and engage with peers at similar stages of their doctoral journey. This event aims to foster discussions on research questions, methodologies, and preliminary results, contributing to the student’s doctoral research progression. CIKM 2024 is deeply committed to improving the field by making the research community more diverse, equitable, and inclusive. We highly encourage women and students from other underrepresented demographic groups to submit their work. -------------------------- Key Dates -------------------------- * Submission Deadline: 17 June 2024 * Acceptance Notification: 16 July 2024 * Camera-ready Version Due: 8 August 2024 * Doctoral Consortium: 25 October 2024 -------------------------- Symposium Objectives -------------------------- * Feedback and Guidance: Offer a platform for doctoral students to present their research and receive constructive feedback from the CIKM community’s senior researchers. * Community Building: Help participants network with other doctoral students and researchers, facilitating knowledge exchange and potential collaborations. 
* Insight into Career Paths: Through panel discussions and networking sessions, provide insights into career opportunities post-PhD in academia and industry. * Prospective attendees should have written or be close to completing a thesis proposal (or equivalent). It is desirable that students are not so close to completing their Ph.D. that the event would have little impact on their work. Similarly, students should not be so early in their Ph.D. program that a concrete topic has not been chosen yet. We strongly advise students to discuss this criterion with their advisor(s) or supervisor(s) before submitting. Doctoral students who submit to the Symposium are allowed to have previously published their research. They are encouraged to submit full, short, or demo papers of their work to the CIKM 2024 conference and associated workshops. -------------------------- Topics of Interest -------------------------- We welcome submissions across the broad spectrum of AI, data science, databases, information retrieval, and knowledge management. Research with real-world social impact is particularly encouraged. 
Topics of interest include, but are not limited to, the following areas: * Data and information acquisition and preprocessing (e.g., data crawling, IoT data, data quality, data privacy, mitigating biases, data wrangling) * Integration and aggregation (e.g., semantic processing, data provenance, data linkage, data fusion, knowledge graphs, data warehousing, privacy and security, modeling, information credibility) * Efficient data processing (e.g., serverless, data-intensive computing, database systems, indexing and compression, architectures, distributed data systems, dataspaces, customized hardware) * Special data processing (e.g., multilingual text, sequential, stream, spatiotemporal, (knowledge) graphs, multimedia, scientific, and social media data) * Analytics and machine learning (e.g., OLAP, data mining, machine learning and AI, scalable analysis algorithms, algorithmic biases, event detection and tracking, understanding, interpretability) * Neural Information and knowledge processing (e.g., graph neural networks, domain adaptation, transfer learning, network architectures, neural ranking, neural recommendation, and neural prediction) * Information access and retrieval (e.g., ad hoc and web search, facets, and entities, question answering and dialogue systems, retrieval models, query processing, personalization, recommender systems) * Users and interfaces for information and data systems (e.g., user behavior analysis, user interface design, perception of biases, personalization, interactive information retrieval, interactive analysis, conversational interfaces) * Evaluation, performance studies, and benchmarks (e.g., online and offline evaluation, best practices, user studies) * Crowdsourcing (e.g., task assignment, worker reliability, optimization, trustworthiness, transparency, best practices) * Understanding multi-modal content (e.g., natural language processing, speech recognition, computer vision, content understanding, knowledge extraction, knowledge 
graphs, and knowledge representations) * Data presentation (e.g., visualization, summarization, readability, VR, speech input/output) * Applications (e.g., urban systems, biomedical and health informatics, legal informatics, crisis informatics, computational social science, data-enabled discovery, social media) * Fairness, accountability, transparency, and ethics (e.g., sociotechnical nature of information access systems, algorithmic fairness, transparency and explainability, misinformation and disinformation) -------------------------- Submission Guidelines -------------------------- PhD students interested in participating should submit a paper (up to 4 pages, including references) using the ACM camera-ready two-column template. Submissions are single-blind, should be solely authored by the student, and clearly state the Ph.D. supervisor(s) (“supervised by …”). The submitted paper should be discussed with the PhD supervisor(s) before submission. Submissions should cover the following aspects: * Problem: What research problem or question does your work address? * State of the Art: How does your work relate to existing research in CIKM-related fields (e.g., information retrieval, databases, machine learning, data mining)? * Approach: Your novel approach to addressing the problem. * Methodology: The methodology you use or plan to use, including evaluation strategies. * Results: Any preliminary results you have obtained. * Conclusion and Future Work: Your conclusions and future research directions so far. * Additionally, include a one-page appendix detailing: - Topics and questions you wish to discuss with mentors and peers. - A statement from your advisor(s) supporting your participation, describing the current status of your research, and providing an anticipated thesis completion date. 
-------------------------- Selection Procedure -------------------------- Candidates will be selected based on the potential of their research for future impact and their potential to benefit from participating in the Symposium. Submissions will be reviewed by the PhD Symposium Program Committee, comprising experienced researchers who will provide feedback and suggest future research directions. All accepted PhD Symposium papers (excluding the appendix) will be included in the main proceedings and available through the ACM Digital Library. If accepted, presenting the results at the PhD Symposium is mandatory. -------------------------- Symposium Format -------------------------- The symposium will include presentations by the Ph.D. students, plenary discussions, one-to-one mentorship sessions, and panel discussions focusing on career paths post-PhD. -------------------------- Student Travel Support -------------------------- Students are highly encouraged to apply for student travel support from CIKM. Application details will be available on the CIKM 2024 website. Students must apply for the support to be considered. -------------------------- Chairs Contact Information -------------------------- For more information, contact the PhD Symposium chairs at: CIKM2024-phdsymposium [at] easychair [dot] org Yanfang (Fanny) Ye (University of Notre Dame, US) Jiaxin Mao (Renmin University of China, China)

[CIKM 2024] - Call for AnalytiCup Competition Proposals
by antonela.tommasel＠isistan.unicen.edu.ar 29 Mar '24

* We apologize if you receive multiple copies of this CfP * * For the online version of this Call, visit: https://cikm2024.org/call-for-analyticup-competition-proposals/ =============== CIKM 2024: 33rd ACM International Conference on Information and Knowledge Management Boise, Idaho, USA October 21–25, 2024 =============== CIKM 2024 AnalytiCup is an open competition featuring compelling data challenges aimed at members of industry and academia interested in information and knowledge management. The challenges will be rolled out progressively and last for several weeks. The final solutions will be presented at the CIKM 2024 AnalytiCup, which will be held in conjunction with the CIKM conference in October 2024. -------------------------- Key Dates -------------------------- * Proposal due: May 31, 2024 * Notification: June 7, 2024 * Competition Kickoff: June 24, 2024 * Competition Ends: August 30, 2024 (All deadlines are at 11:59 pm AoE) -------------------------- Proposal Submissions -------------------------- We invite proposals from practitioners across industry and academia who are interested in the areas of information retrieval, databases, and knowledge management. The best-fit proposal should include a well-motivated goal with a positive social impact, a novel and challenging task, a fair setup with a stable evaluation approach, and an adequate amount of real-world data for the competition. * A well-motivated goal: The goal of the proposed competition should be to solve a challenging real-world problem while positively impacting the research community and other communities. Proposals whose outcomes serve the greater good are encouraged. * A challenging task: The task should be challenging in the sense that there is enough room for improvement over basic solutions, and novel ideas are required to succeed in the competition. At the same time, the task should be manageable in about 2 months’ time. 
* A fair setup: The organizers should guarantee the availability of the data and the confidentiality of the test set. The evaluation metrics should be both meaningful for the application at hand and statistically sound for objective comparison. A baseline should be established to show that non-trivial results can be achieved. * A real-world dataset: A proposal should clearly explain what data will be provided for the competition and the source of the data. It should also explain how and why the provided data is sufficient for the competition. A proposal should cover all the important details, such as dates and the submission and evaluation of results, and describe the competition rules clearly. Please provide the following details with your proposal: * Title: The title of your challenge. * Problem Description: Describe the problem clearly and in detail. Explain the importance of the problem and its impact. Discuss different scenarios for the problem, with their challenges and limitations. Share simple data samples and explain the data clearly. If the proposed competition includes more than one track, please describe each track clearly and show the unique value of each track. * Evaluation: Describe how you plan to evaluate submissions. Select an evaluation method that is fair and statistically robust. * Suggested Participants: Provide a list of suggested participants in the challenge. * Timeline: Dates for the expected start of the competition, user registration, team formation, submission, evaluation, and notification. * Awards: Specify the type and form of the awards you want to share with the winners. * Host information: Names, affiliations, email addresses, and short biographies of the organizers. -------------------------- Chairs Contact Information -------------------------- For more information, contact the AnalytiCup Chairs at: CIKM2024-analyticup [at] easychair [dot] org Vachik Dave, Walmart Global Tech Carl Yang, Emory University

1st CFP: EvalLAC 2024 – AIED 24 workshop
by Shiva Taslimipoor 28 Mar '24

***** *1st Workshop on Automated Evaluation of Learning and Assessment Content* AIED 2024 workshop | Recife (Brazil) & Hybrid | 8-12 July 2024 https://sites.google.com/view/eval-lac-2024/ ***** We are happy to announce that the first edition of the Workshop on Automated Evaluation of Learning and Assessment Content will be held in Recife (Brazil) & online during the AIED 2024 conference. *About the workshop* The evaluation of learning and assessment content has always been a crucial task in the educational domain, but traditional approaches based on human feedback are not always usable in modern educational settings. Indeed, the advent of machine learning models, in particular Large Language Models (LLMs), has made it possible to quickly and automatically generate large quantities of text, making human evaluation unfeasible. Still, these texts are used in the educational domain -- e.g., as questions, hints, or even to score and assess students -- and thus the need for accurate and automated techniques for evaluation becomes pressing. This hybrid workshop aims to attract professionals from both academia and industry, and to offer an opportunity to discuss the common challenges in evaluating learning and assessment content in education. Topics of interest include but are not limited to: - Question evaluation (e.g., in terms of alignment to learning objectives, factual accuracy, language level, cognitive validity, etc.). - Estimation of question statistics (e.g., difficulty, discrimination, response time, etc.). - Evaluation of distractors in Multiple Choice Questions. - Evaluation of reading passages in reading comprehension questions. - Evaluation of lectures and course material. - Evaluation of learning paths (e.g., in terms of prerequisites and topics taught before a specific exam). - Evaluation of educational recommendation systems (e.g., personalised curricula). - Evaluation of hints and scaffolding questions, as well as their adaptation to different students. 
- Evaluation of automatically generated feedback provided to students. - Evaluation of techniques for automated scoring. - Evaluation of bias in educational content and LLM outputs. Human-in-the-loop approaches are welcome, provided that there is also an automated component in the evaluation and there is a focus on the scalability of the proposed approach. Papers on generation are also very welcome, as long as there is an extensive focus on the evaluation step. *Important dates* Submission deadline: May 17, 2024 Notification of acceptance: June 4, 2024 Camera ready: June 11, 2024 Workshop: 8 July or 12 July 2024 *Submission guidelines* Authors are invited to submit short papers (5 pages, excluding references) and long papers (10 pages, excluding references), formatted according to the workshop style available on the website. Submissions should contain mostly novel work, but there can be some overlap between the submission and work submitted elsewhere (e.g., summaries, focus on the evaluation phase of a broader work). Each of the submissions will be reviewed by the members of the Program Committee, and the proceedings volume will be submitted for publication to CEUR Workshop Proceedings. *Organisers* Luca Benedetto (1), Andrew Caines (1), George Dueñas (2), Diana Galvan-Sosa (1), Anastassia Loukina (3), Shiva Taslimipoor (1), Torsten Zesch (4) (1) ALTA Institute, Dept. of Computer Science and Technology, University of Cambridge (2) National Pedagogical University, Colombia (3) Grammarly, Inc. (4) FernUniversität in Hagen

Reminder: Professor of Language Technology at the Department of Swedish, Multilingualism, Language Technology; University of Gothenburg [new deadline]
by Gerlof Bouma 28 Mar '24

The Department of Swedish, Multilingualism, Language Technology at the University of Gothenburg is inviting applications for the position of Professor of Language Technology. The new professor's main duties will be to lead and develop research, education, and outreach in the field of language technology at the department, in particular within its Språkbanken Text group. A detailed description of the position and the application requirements can be found in the University of Gothenburg's job application portal, at the link below. This detailed description is only available in Swedish, as proficiency in Swedish or another Scandinavian language is required for the position. Applications must be submitted no later than 6 May 2024. https://web103.reachmee.com/ext/I005/1035/job?site=6&lang=SE&validator=3038… -- GERLOF BOUMA Universitetslektor GÖTEBORGS UNIVERSITET Institutionen för svenska, flerspråkighet och språkteknologi Språkbanken Text https://spraakbanken.gu.se/om/personal/gerlof
