The Chair of Computational Corpus Linguistics (CCL) at Friedrich-Alexander-Universität Erlangen-Nürnberg has open positions for
PhD CANDIDATES AND POSTDOCTORAL RESEARCHERS IN COMPUTATIONAL CORPUS LINGUISTICS (65% or 100%, E13 TV-L, starting ASAP)
You will contribute to one or more research projects in the area of computational corpus linguistics. Depending on qualifications and your preferred topic, different appointments are possible:
• Statistical methods for the multivariate analysis of linguistic variation (65% for 3 years)
• Text mining with corpus queries and LLMs (65% for 3 years)
• Development and applications of deep learning models in NLP and legal tech (100% until 12/2025, with possible extension depending on availability of funding)
• Evaluation and analysis of generative LLMs (100% until 12/2025, with possible extension depending on availability of funding)
Required qualifications:
• M.A./M.Sc. or Ph.D. in Computational Linguistics, Mathematics, Computer Science, Machine Learning, Data Science, Computational Humanities, or a similar subject area (with a very good grade)
• Evidence of strong programming skills, ideally in Python and/or R
• Experience with machine learning in NLP, deep learning, or statistical analysis
• Near-native language skills in German or English
• Very good communication skills in English, including the ability to write for publication and present research findings at meetings and conferences
• Ability to work independently, manage own academic research and associated activities, and to supervise student assistants
• Ability to work effectively in small and large teams
Optional qualifications:
• Evidence of strong mathematical skills
• Evidence of ability to analyse linguistic data
• Experience with creation of corpora and gold standards
• Native-like language skills in German and English
• Experience with development of Web apps in Python/Flask and Web-based UIs
The contract will start ASAP (preferably by July 2024) and run for 3 years (65% positions) or until 14.12.2025 (100%, with extension depending on availability of funding). If you have any questions about the positions, the research projects, or the expected qualifications, please don’t hesitate to contact me at stephanie.evert(a)fau.de.
Please submit your application by e-mail to stephanie.evert(a)fau.de, providing evidence for all relevant qualifications. The application deadline is Sunday, 21 April 2024. Positions will be open until filled.
Full details: https://www.jobs.fau.de/jobs/phd-candidates-and-postdoctoral-researchers-in…
Call for Participation - JOKER @ CLEF 2024: Automatic Humour Analysis
=====================================================================
https://joker-project.com/
We invite individuals and teams to participate in JOKER 2024, a set of
shared tasks on automatic humour analysis. JOKER 2024 will be held at
the Conference and Labs of the Evaluation Forum (CLEF 2024) from 9 to 12
September 2024 in Grenoble, France.
Topics and tasks
----------------
The goal of the JOKER workshop is to bring together linguists and
computer scientists to work on an evaluation framework for humour,
including data and metric development, and to foster work on automatic
methods for humour analysis and translation. We invite submissions of
automatic or manual runs for any or all of the following tasks, for
which we have prepared annotated data:
Task 1: Humour-aware information retrieval
Task 2: Humour classification according to genre and technique
Task 3: Translation of puns from English to French
Unshared task: We welcome submissions that use our data for other tasks.
How to participate
------------------
At least one team member should register at the CLEF website, and all
team members should join the JOKER mailing list (see URLs below). Task
data will be made available to all registered participants. Runs and
system description papers should be submitted according to the schedule
and instructions posted on the JOKER and CLEF websites.
Deadlines
---------
2024-04-22: CLEF 2024 registration
2024-05-06: Submission of runs for JOKER
2024-05-24: Evaluation results posted
2024-05-31: Submission of system description papers
2024-06-24: Reviews of system description papers
2024-07-08: Submission of camera-ready papers
2024-09-09: CLEF 2024 conference begins
Contacts
--------
JOKER website: https://joker-project.com/
CLEF website: https://clef2024.clef-initiative.eu/
Email: contact(a)joker-project.com
Twitter: https://twitter.com/joker_research
Mailing list (Google Groups): https://groups.google.com/g/joker-project
Chairs
------
Liana Ermakova, Université de Bretagne Occidentale
Tristan Miller, University of Manitoba
Adam Jatowt, University of Innsbruck
Anne-Gwenn Bosser, ENIB
Victor Manuel Palma Preciado, Instituto Politécnico Nacional
Grigori Sidorov, Instituto Politécnico Nacional
--
Dr. Tristan Miller, Assistant Professor
Department of Computer Science, University of Manitoba
https://logological.org/ | Tel. +1 204 474 6792
Dear all,
A fully funded PhD position is available, with a text mining / digital humanities / computational social science specialization.
An interdisciplinary team of researchers is seeking to recruit a PhD student to a research project on industrial modernity and Deep Transitions at the Institute of Social Studies, University of Tartu, Estonia, led by Laur Kanger. The PhD study (4 years) will focus on the identification of long-term trends in industrial modernity in a comparative-historical perspective, combining the text mining of digitized newspapers with existing databases. The application deadline is 15.05.2024.
Please do not hesitate to forward this to anyone who might be interested in taking up the challenge. General information can be found here: https://ut.ee/en/content/phd-open-calls (navigate to “1-15 May and 1-15 June 2024” > “Faculty of Social Sciences” > “Media and Communication, Sociology” tab). A detailed description of the PhD position can be found here (https://ut.ee/sites/default/files/2024-04/Sociology_PhD%20call_Deep%20Trans…). Long story short: if you have experience in text mining and an interest in societal change, you're likely to be a good match. :) Don't hesitate to contact me for more information about the research!
Best,
Peeter Tinits
Dear All,
We are seeking a passionate and motivated Research Fellow at the University
of Essex, UK. You will be working on the EU/UKRI funded ELOQUENCE project
<https://eloquenceai.eu/>. The funding is available until 31 December 2026.
We are interested in attracting a researcher working on Natural Language
Processing, Speech Processing, and Machine Learning/AI. Some areas of
interest include (but are not limited to):
- Multimodal foundation models that combine text, speech and vision.
- Multi-lingual text and speech processing
- Factual retrieval-augmented generation
- Human feedback incorporation using HCI techniques.
- Ethics dimension of AI in real-world use cases
You will be responsible for developing, validating, and deploying a
deep-learning model. You will need experience building deep learning models
using established tools (e.g., PyTorch) and proficiency in Python. You
should also have, or be close to completing, a Ph.D. in computer science or
a related field. You will have opportunities to develop your research
interests, collaborate with other project partners, and present work at
international conferences and in journals.
*Application Deadline: 21 April 2024 (Application details
<https://www.jobs.ac.uk/job/DGX705/senior-research-officer>) *
Informal inquiries about the position can be made to Prof. Haris Mouratidis
and Dr. Ravi Shekhar (e-mail: {h.mouratidis and r.shekhar}(a)essex.ac.uk)
-- Ravi
Ravi Shekhar
Lecturer, University of Essex, UK
http://shekharravi.github.io/
The 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL) will be held in Kyoto, Japan on September 18-20, 2024. SIGDIAL will be co-located with INLG, which will take place after SIGDIAL in Tokyo, Japan.
The SIGDIAL venue provides a regular forum for the presentation of cutting-edge research in dialogue and discourse to both academic and industry researchers, continuing a series of 24 successful previous meetings. The conference is sponsored by the SIGDIAL organization - the Special Interest Group in discourse and dialogue for ACL and ISCA.
* Topics of Interest *
We welcome formal, corpus-based, implementation, experimental, or analytical work on discourse and dialogue including, but not restricted to, the following themes:
- Discourse Processing: Rhetorical and coherence relations, discourse parsing and discourse connectives. Reference resolution. Event representation and causality in narrative. Argument mining. Quality and style in text. Cross-lingual discourse analysis. Discourse issues in applications such as machine translation, text summarization, essay grading, question answering and information retrieval. Discourse issues in text generated by large language models.
- Dialogue Systems: Task-oriented and open-domain spoken, multimodal, embedded, situated, and text-based dialogue systems, their components, evaluation and applications. Knowledge representation and extraction for dialogue. State representation, tracking and policy learning. Social and emotional intelligence. Dialogue issues in virtual reality and human-robot interaction. Entrainment, alignment and priming. Generation for dialogue. Style, voice, and personality. Safety and ethics issues in dialogue.
- Corpora, Tools and Methodology: Corpus-based and experimental work on discourse and dialogue, including supporting topics such as annotation tools and schemes, crowdsourcing, evaluation methodology and corpora.
- Pragmatic and Semantic Modeling: Pragmatics and semantics of conversations (i.e., beyond a single sentence), e.g., rational speech act, conversation acts, intentions, conversational implicature, presuppositions.
- Applications of Dialogue and Discourse Processing Technology.
* Special Session *
SIGDIAL 2024 invites work on the special session “GEMINI - Graph-based knowledge for Modelling Intelligent Natural Interaction” that focuses on knowledge and knowledge modeling for dialogue systems, in particular on the opportunities and challenges for enhancing and stabilizing dialogue capabilities of chatbots, robots, and virtual agents with the use of LLMs.
* Submissions *
The program committee welcomes the submission of long papers, short papers, and demo descriptions. Submitted long papers may be accepted for oral or for poster presentation. Accepted short papers will be presented as posters.
- Long paper submissions must describe substantial, original, completed and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers must be no longer than 8 pages, including title, text, figures and tables. An unlimited number of pages is allowed for references and appendices, and an extra page is allowed in the final version to address reviewers’ comments.
- Short paper submissions must describe original and unpublished work. Please note that a short paper is not a shortened long paper. Instead, short papers should have a point that can be made in a few pages, such as a small, focused contribution; a negative result; or an interesting application nugget. Short papers should be no longer than 4 pages including title, text, figures and tables. An unlimited number of pages is allowed for references and appendices, and an extra page is allowed in the final version to address reviewers’ comments.
- Demo descriptions should be no longer than 4 pages including title, text, examples, figures, tables and references. A separate one-page document should be provided to the program co-chairs for demo descriptions, specifying furniture and equipment needed for the demo.
Note that content that is an important part of the contribution or that is important for the reviewers to assess the technical correctness of the work should be a part of the main paper, and not appear in appendices. Reviewers are not required to consider material in appendices.
Authors are encouraged to also submit additional accompanying materials, such as corpora (or corpus examples), demo code, videos and sound files.
* Multiple Submissions *
SIGDIAL 2024 cannot accept work for publication or presentation that will be (or has been) published elsewhere, or that has been or will be submitted to other meetings or publications whose review periods overlap with that of SIGDIAL. Any questions regarding submissions can be sent to program-chairs [at] sigdial.org.
* Blind Review *
Building on previous years’ move to anonymous long and short paper submissions, SIGDIAL 2024 will follow the ACL policies for preserving the integrity of double-blind review (see author guidelines: https://www.aclweb.org/adminwiki/index.php?title=ACL_Author_Guidelines). Unlike long and short papers, demo descriptions will not be anonymous. Demo descriptions should include the authors’ names and affiliations, and self-references are allowed.
* Submission Format *
All long, short, and demonstration submissions must follow the two-column ACL format, which is available as an Overleaf template (https://www.overleaf.com/read/crtcwgxzjskr) and can also be downloaded directly (LaTeX and Word) (https://github.com/acl-org/acl-style-files).
Submissions must conform to the official ACL style guidelines, which are contained in these templates. Submissions must be electronic, in PDF format.
* Submission Deadline *
SIGDIAL will accept regular submissions through the Softconf/START system, as well as commitment of already reviewed papers through the ACL Rolling Review (ARR) system.
* Regular submission *
Authors have to fill in the submission form in the Softconf/START system and upload an initial PDF of their papers before May 17, 2024 (23:59 GMT-11). Details and the submission link will be posted on the conference website (https://2024.sigdial.org/).
Submission via ACL Rolling Review (ARR, https://aclrollingreview.org/)
Please refer to the ARR Call for Papers (https://aclrollingreview.org/cfp) for detailed information about submission guidelines to ARR. The commitment deadline for authors to submit their reviewed papers, reviews, and meta-review to SIGDIAL 2024 is June 19, 2024. Note that the paper needs to be fully reviewed by ARR in order to make a commitment, thus the latest date for ARR submission will be April 15, 2024.
* Mentoring *
Acceptable submissions that require language (English) or organizational assistance will be flagged for mentoring, and accepted with a recommendation to revise with the help of a mentor. An experienced mentor who has previously published in the SIGDIAL venue will then help the authors of these flagged papers prepare their submissions for publication.
* Best Paper Awards *
In order to recognize significant advancements in dialogue/discourse science and technology, SIGDIAL 2024 will include best paper awards. All papers at the conference are eligible for the best paper awards. A selection committee consisting of prominent researchers in the fields of interest will select the recipients of the awards.
SIGDIAL 2024 Program Committee
Vera Demberg and Stefan Ultes
Conference Website: https://2024.sigdial.org/
*** Last Call for Papers ***
IEEE Mobile Cloud 2024
The 12th IEEE International Conference on Mobile Cloud Computing, Services
and Engineering
July 15-18, 2024 | Shanghai, China
https://ieeemobilecloud.com
(*** Submission Deadline: April 30th, 2024 AoE (extended) ***)
IEEE Mobile Cloud is a pioneering IEEE sponsored international conference devoted to the
research in mobile, edge, and cloud computing. It covers all aspects of mobile, edge, and
cloud computing from architectures, techniques, tools and methodologies to applications.
This year's conference is scheduled to take place in Shanghai, China, from 15-18 July 2024.
IEEE Mobile Cloud 2024 is part of the IEEE International Congress On Intelligent And Service-
Oriented Systems Engineering offering a broad spectrum of international events, sharing
renowned keynotes and fostering exchange among researchers and practitioners (see common
homepage for all colocated events, https://ieee-cisose-congress.org).
The fusion of mobile communications, computing, and intelligence is catalysing the emergence
of innovative systems and applications that facilitate intelligent resource provisioning, process
extensive data from mobile sensors and interconnected hardware platforms, and bolster the
Internet of Things (IoT) through robust edge and cloud-based backend infrastructure. The
pivotal role of current and forthcoming communication technologies, machine learning
implementation, and mobile cloud infrastructures as facilitators for this convergence cannot be
overstated. These mobile intelligent applications are poised to revolutionise various facets of
daily life, encompassing domains such as transportation, e-commerce, healthcare, smart
homes, smart cities, social interaction, and more.
Mobile intelligence serves as an inclusive platform for both academic and industrial researchers
to share their latest research insights, experimental findings, and the latest advancements in
industry technologies related to mobile systems, machine learning, edge and cloud computing,
services, and engineering. Leveraging the synergy of mobile communications, machine
intelligence, edge computing, and edge/cloud infrastructures, the future of Mobile Intelligence
Systems is envisioned to provide a multitude of critical and personalised services across diverse
application domains, ranging from education, transportation, to public health, safety, and
security. Submissions will be evaluated on the criteria of originality, significance, clarity,
relevance, and accuracy.
TOPICS OF INTEREST
Topics include, but are not limited to:
Theory, Modelling, and Methodologies
• Mobile cloud computing models, architectures, infrastructures, and platforms
• Mobile intelligence theories, concepts, algorithms, and methodologies
• Mobile cloud data management
• Mobile cloud tools, middleware, and data centres
• Mobile intelligence as a service
• Mobile networking, protocols, and technologies
• Quality of service (QoS)
• Mobile intelligence security and privacy
Applications and Industry Practice
• Mobile intelligence for autonomous driving systems, V2X, intelligent transportation systems
(ITS), telematics
• Mobile intelligence for robotics, unmanned aerial vehicles (UAVs), and unmanned ground
vehicles (UGVs)
• Mobile intelligence for sensor networks, Industrial IoT, Industry 4.0, and Industry 5.0
• Mobile intelligence for future wireless technologies, 5G/6G, WiFi, Satellite, etc.
• Mobile intelligence for aviation, airports, and railway
• Mobile intelligence for Augmented Reality/Virtual Reality (AR/VR)
• Mobile intelligence for computer vision and video analytics
• Mobile intelligence for surveillance and disaster management
• Mobile intelligence for healthcare
• Mobile intelligence for the metaverse
• Mobile intelligence for smart city
• Mobile intelligence for satellite
• Mobile intelligence for mission-critical systems
• Mobile intelligence for community services and social networking
• Mobile intelligence computing for sustainable development
PAPER SUBMISSION GUIDELINES
Papers must be written in English. All papers must be prepared in the IEEE double column
proceedings format. Please see the following link for details:
https://www.ieee.org/conferences/publishing/templates.html .
All accepted conference papers will be published by the IEEE Computer Society and in the IEEE
Xplore digital library with EI indexing. Selected papers will be recommended to SCI-indexed
journals as special issue papers.
The paper length should be up to 8 pages for regular conference papers and 6 pages for
work-in-progress papers. Submitted papers should contain original work and must not be
under submission elsewhere. Each paper must be presented by an author at the conference.
Presentations via teleconference are not permitted. Permission to have the paper presented
by a qualified substitute presenter may be granted by the TPC Chairs under extraordinary
circumstances, upon written request.
Submissions should be made via EasyChair using the following link:
https://easychair.org/my/conference?conf=imc24 .
IMPORTANT DATES
• Abstract submission: March 31st, 2024 (AoE)
• Paper submission: April 30th, 2024 (AoE) (extended)
• Notification of acceptance: May 15th, 2024
• Final manuscript submission: May 22nd, 2024
• Author registration: May 22nd, 2024
• Conference: July 15th-18th, 2024
COMMITTEES
General Chairs
• Hiroyuki Sato, University of Tokyo, Japan
• Yan Bai, University of Washington Tacoma, USA
Program Chairs
• Lan Zhang, Clemson University, USA
• Sun Yao, The University of Glasgow, Scotland, UK
• Tomoki Watanabe, Kanagawa Institute of Technology, Japan
• Fan Wu, Shanghai Jiao Tong University, China
Publicity Chair
George Angelos Papadopoulos, University of Cyprus, Cyprus
Program Committee
• Ouri Wolfson, University of Illinois
• Felix Beierle, University of Würzburg
• Thomas Richter, Rhein-Waal University of Applied Sciences
• Dan Grigoras, University College Cork
• Sergio Ilarri, University of Zaragoza
• Iulian Sandu Popa, University of Versailles Saint-Quentin & INRIA Saclay-Ile-de-France
• Haiping Xu, University of Massachusetts Dartmouth
• Prasad Calyam, University of Missouri
• Dana Petcu, West University of Timisoara
• Fabio Costa, Federal University of Goias
• Cristian Borcea, New Jersey Institute of Technology
• Lei Huang, Prairie View A&M University
• Chunsheng Zhu, Southern University of Science and Technology
• Xuyun Zhang, Macquarie University
• Jia Zhao, Changchun Institute of Technology
• Richard Han, University of Colorado Boulder
CISOSE General Chairs
• Jerry Gao, San Jose State University, USA
• Iraklis Varlamis, Harokopio University of Athens, Greece
CISOSE Steering Committee
• Jerry Gao, San Jose State University, USA
• Guido Wirtz, University of Bamberg, Germany
• Huaimin Wang, NUDT, China
• Jie Xu, University of Leeds, UK
• Wei-Tek Tsai, Arizona State University, USA
• Axel Kupper, TU Berlin, Germany
• Hong Zhu, Oxford Brookes University, UK
• Longbin Cao, University of Technology Sydney, Australia
• Cristian Borcea, New Jersey Institute of Technology, USA
• Sato Hiroyuki, University of Tokyo, Japan
*Two-year fully funded postdoctoral position in quantitative text analysis/
NLP*
*Location:* University College Dublin, School of Politics and
International Relations
*Start date:* 1st September, 2024
*Deadline:* 12th May, noon, 2024
University College Dublin is currently recruiting a post-doctoral
researcher to implement natural language processing (NLP) tools to analyse
interview data.
The main objective of this position is to develop tools to identify and
analyse so-called cognitive maps (Axelrod 1976) from interview data.
Dornschneider and Henderson (2016, 2023) and Dornschneider (2019) have
developed tools for the computational analysis of cognitive maps. What is
needed is a set of tools to infer cognitive maps from natural language.
This Irish Research Council funded project investigates the role of women
in Muslim resistance movements, based on Arabic interviews conducted by the
Principal Investigator. The cognitive mapping analysis has several main
objectives: 1- to show typical behavioral decisions (e.g. to join a
resistance movement) described by the interviewees; 2- to identify common
reasoning processes related to these decisions; and 3- to trace the role of
religious beliefs in these reasoning processes.
You will work with the PI, Dr. Stephanie Dornschneider-Elkink, to deliver
the research objectives of the project. You will support the development
and subsequent publication of new tools to convert text into cognitive
maps. Tasks will include but are not limited to POS tagging, sequence
analysis, word embeddings, and visualization. You will have the chance to
give substantial input to the analysis and to co-author papers with the PI.
Full ad: https://my.corehr.com/pls/coreportal_ucdp/apply?id=017201
*References*
Axelrod, R. (ed.). 1976. *Structure of decision: The cognitive maps of
political elites*. Princeton: Princeton University Press.
Dornschneider-Elkink, S. and Henderson, N., 2023. Repression and Dissent:
How Tit-for-Tat Leads to Violent and Nonviolent Resistance. *Journal of
Conflict Resolution*, p.00220027231179102.
https://doi.org/10.1177/0022002714540473
Dornschneider, S., 2019. High‐Stakes Decision‐Making Within Complex Social
Environments: A Computational Model of Belief Systems in the Arab
Spring. *Cognitive Science*, *43*(7), p.e12762.
https://doi.org/10.1111/cogs.12762
Dornschneider, S. and Henderson, N., 2016. A computational model of
cognitive maps: Analyzing violent and nonviolent activity in Egypt and
Germany. *Journal of Conflict Resolution*, *60*(2), pp.368-399.
--
Dr Stephanie Dornschneider-Elkink
Assistant Professor, School of Politics & International Relations (SPIRe)
University College Dublin
Newman Building, F316, Belfield, Dublin 4, Ireland
http://www.dornschneider.net/
[Apologies for cross-posting]
23rd EDITION OF THE SEPLN AWARD TO THE BEST DOCTORAL THESIS IN NATURAL LANGUAGE PROCESSING
[EXTENSION: May 15th, 2024]
The Spanish Society for Natural Language Processing announces the 23rd Edition of the SEPLN Award for the Best Doctoral Thesis in Natural Language Processing, which will be governed by the following rules:
1.- The purpose of this award is the promotion and dissemination of research in the field of natural language processing.
2.- The thesis will be awarded with a compact laptop (tablet) and €300 for attendance at the congress. The award will be presented at the 40th International Congress of the Spanish Society of Natural Language Processing (SEPLN 2024), after a brief presentation of the award-winning work by the author.
3.- In order to compete, the author of the doctoral thesis must be a member of the SEPLN at the time of submitting the work. No contestant may participate as author in more than one work.
4.- Doctoral theses defended during the year 2023, written in a language of the Spanish State or in English, may be submitted to the competition.
In addition to the complete thesis, it is essential to send:
a) a 4-page summary of the thesis, clearly describing the topic and the relevance of the research, the objectives, methods, results achieved and contributions.
b) a brief description of the scientific career of the author of the thesis, detailing participation in scientific activities such as the organization of competitive tasks and congresses, the generation of open-access resources such as datasets, language models, etc., and participation in projects, contracts, and/or patents.
The quality of the presentation, the technical and methodological correctness, the relevance, originality, the generation, evaluation and publication of resources, as well as the research trajectory during the pre-doctoral period will be the criteria used for the award of the prize by the jury.
The works will be submitted through the website of the Society's journal (http://journal.sepln.org) in PDF format before May 15th, 2024.
The final decision will be communicated during the 40th International Congress of the Spanish Society for Natural Language Processing (SEPLN 2024).
Submission instructions (http://www.sepln.org/sites/default/files/noticia/documentos_relacionados/20…)
For more information, contact aitziber.atucha(a)ehu.eus
EDICIÓN XXIII PREMIO SEPLN A LA MEJOR TESIS DOCTORAL EN PROCESAMIENTO DEL LENGUAJE NATURAL
[EXTENSIÓN: 15 de mayo de 2024]
La Sociedad Española para el Procesamiento del Lenguaje Natural convoca la Edición XXIII del Premio SEPLN a la Mejor Tesis Doctoral en Procesamiento del Lenguaje Natural, que se regirá por las siguientes bases:
1.- La finalidad de este premio es la promoción y divulgación de la investigación en el campo del procesamiento del lenguaje natural.
2.- La tesis será premiada con una computadora portátil compacta (tablet) y 300€ para la asistencia al congreso. Se dará entrega del premio en el 40 Congreso Internacional de la Sociedad Española del Procesamiento del Lenguaje Natural (SEPLN 2024), tras una breve presentación del trabajo premiado por parte del autor.
3.- Para poder concursar, el autor de la tesis doctoral debe ser socio de la SEPLN en el momento de presentar el trabajo. Ninguna persona concursante podrá participar como autora en más de un trabajo.
4.- Se podrán presentar a concurso tesis doctorales leídas durante el año 2023, escritas en una lengua del Estado español o en lengua inglesa.
Además de la tesis completa, es imprescindible enviar:
a.- Un breve resumen de 4 páginas donde claramente se indique el tema y la relevancia de la investigación, los objetivos, métodos, resultados alcanzados y contribuciones.
b.- Una breve descripción de la trayectoria científica del autor de la tesis, en la que se describa la participación en actividades científicas como organización de tareas competitivas, congresos, generación de recursos open access como conjuntos de datos, modelos de lenguaje, etc., y participación en proyectos, contratos, y/o patentes.
La calidad de la presentación, la corrección técnica y metodológica, la relevancia, originalidad, la generación, evaluación y publicación de recursos, así como la trayectoria investigadora durante el periodo predoctoral serán los criterios empleados para la adjudicación del premio por parte del jurado.
Los trabajos se enviarán a través de la web de la revista de la Sociedad (http://journal.sepln.org) en formato PDF antes del 15 de mayo de 2024.
La resolución del premio se comunicará durante el 40 Congreso Internacional de la Sociedad Española del Procesamiento del Lenguaje Natural (SEPLN 2024).
Documento con las instrucciones (http://www.sepln.org/sites/default/files/noticia/documentos_relacionados/20…)
Para más información dirigirse a aitziber.atucha(a)ehu.eus
Dear colleagues,
We have received many requests to extend the submission deadline for CMC-Corpora 2024 and are therefore pleased to announce an extension of the paper and abstract submission deadline to 23:59 CEST (GMT+2), April 26th, 2024.
We are also very happy to inform you that Susan Herring (Indiana University) will be our keynote speaker!
For submission details, please see the conference website: https://cmc-corpora-nice.sciencesconf.org/
Looking forward to receiving your submission!
On behalf of the organizing and steering committees,
Céline Poudat and Steven Coats
University Lecturer, Docent
English, Faculty of Humanities
University of Oulu
P.O. Box 8000, FI-90014 University of Oulu
Finland
https://cc.oulu.fi/~scoats
We invite the community to participate in a shared task organized in the
context of the CONDA workshop <https://conda-workshop.github.io/>.
Data contamination, where evaluation data is inadvertently included in
pre-training corpora of large-scale models, and language models (LMs) in
particular, has become a concern in recent times (Sainz et al. 2023
<https://aclanthology.org/2023.findings-emnlp.722/>; Jacovi et al. 2023
<https://aclanthology.org/2023.emnlp-main.308/>). The growing scale of
both models and data, coupled with massive web crawling, has led to the
inclusion of segments from evaluation benchmarks in the pre-training
data of LMs (Dodge et al., 2021
<https://aclanthology.org/2021.emnlp-main.98/>; OpenAI, 2023
<https://arxiv.org/abs/2303.08774>; Google, 2023
<https://arxiv.org/abs/2305.10403>; Elazar et al., 2023
<https://arxiv.org/abs/2310.20707>). The scale of internet data makes it
difficult to prevent this contamination from happening, or even detect
when it has happened (Bommasani et al., 2022
<https://arxiv.org/abs/2108.07258>; Mitchell et al., 2023
<https://arxiv.org/abs/2212.05129>). Crucially, when evaluation data
becomes part of pre-training data, it introduces biases and can
artificially inflate the performance of LMs on specific tasks or
benchmarks (Magar and Schwartz, 2022
<https://aclanthology.org/2022.acl-short.18/>). This poses a challenge
for fair and unbiased evaluation of models, as their performance may not
accurately reflect their generalization capabilities.
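One common way to look for this kind of overlap is to check whether word n-grams from an evaluation example also appear in a pre-training corpus. The following is only a minimal sketch of that idea, not the method used by the shared task or by any of the cited papers; the function names, the whitespace tokenization, and the default n-gram length of 8 are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple


def ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of lowercased, whitespace-tokenized n-grams in text."""
    tokens = text.lower().split()
    # If the text has fewer than n tokens, the range is empty and so is the set.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(eval_example: str, corpus_docs: Iterable[str], n: int = 8) -> float:
    """Fraction of the example's n-grams that occur anywhere in the corpus.

    Hypothetical helper: n=8 is an arbitrary illustrative default.
    """
    example_grams = ngrams(eval_example, n)
    if not example_grams:
        return 0.0
    corpus_grams: Set[Tuple[str, ...]] = set()
    for doc in corpus_docs:
        corpus_grams |= ngrams(doc, n)
    return len(example_grams & corpus_grams) / len(example_grams)


if __name__ == "__main__":
    example = "the quick brown fox jumps over the lazy dog"
    corpus = ["He said the quick brown fox jumps over the lazy dog yesterday"]
    print(overlap_ratio(example, corpus, n=3))
```

A ratio near 1 suggests the example may appear verbatim in the corpus, while a ratio of 0 only shows that no exact n-gram match was found; paraphrased or reformatted copies would not be detected by this sketch.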
The shared task is a community effort on centralized data contamination
evidence collection. While the problem of data contamination is
prevalent and serious, the breadth and depth of this contamination are
still largely unknown. The concrete evidence of contamination is
scattered across papers, blog posts, and social media, and it is
suspected that the true scope of data contamination in NLP is
significantly larger than reported.
With this shared task we aim to provide a structured, centralized
platform for contamination evidence collection to help the community
understand the extent of the problem and to help researchers avoid
repeating the same mistakes. The shared task also gathers evidence of
clean, non-contaminated instances. The platform is already available for
perusal at
https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Database
<https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Report>.
Participants in the shared task need to submit their contamination
evidence (see instructions below). The CONDA 2024 workshop organizers
will review the evidence through pull requests.
*/Compilation Paper/*
As a companion to the contamination evidence platform, we will produce a
paper that will provide a summary and overview of the evidence collected
in the shared task. The participants who contribute to the shared task
will be listed as co-authors in the paper.
*/Instructions for Evidence Submission/*
Each submission should report a case of contamination or of a lack
thereof. The submission can be either about (1)
contamination in the corpus used to pre-train language models, where the
pre-training corpus contains a specific evaluation dataset, or about (2)
contamination in a model that shows evidence of having seen a specific
evaluation dataset while being trained. Each submission needs to mention
the corpus (or model) and the evaluation dataset, in addition to some
evidence of contamination. Alternatively, we also welcome evidence of a
lack of contamination.
Reports must be submitted through a Pull Request in the Data
Contamination Report space at HuggingFace. The reports must follow the
Contribution Guidelines provided in the space and will be reviewed by
the organizers. If you have any questions, please contact us at
conda-workshop(a)googlegroups.com or open a discussion in the
space itself.
URL with contribution guidelines:
https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Database
<https://huggingface.co/spaces/CONDA-Workshop/Data-Contamination-Report> (“Contribution
Guidelines” tab)
*/Important dates/*
* Deadline for evidence submission: July 1, 2024
* Workshop day: August 16, 2024
*/Sponsors/*
* AWS AI and Amazon Bedrock
* HuggingFace
* Google
*/Contact/*
* Website: https://conda-workshop.github.io/
* Email: conda-workshop(a)googlegroups.com
*/Organizers/*
Oscar Sainz, University of the Basque Country (UPV/EHU)
Iker García Ferrero, University of the Basque Country (UPV/EHU)
Eneko Agirre, University of the Basque Country (UPV/EHU)
Jon Ander Campos, Cohere
Alon Jacovi, Bar Ilan University
Yanai Elazar, Allen Institute for Artificial Intelligence and University
of Washington
Yoav Goldberg, Bar Ilan University and Allen Institute for Artificial
Intelligence