(Apologies for cross-postings)
*** The GUM Corpus - Release 9.0.0 ***
*** Georgetown University Multilayer corpus ***
Corpling@GU <https://gucorpling.org/corpling/> is happy to announce the first release of series 9 of the Georgetown University Multilayer corpus (GUM V9.0.0):
https://gucorpling.org/gum/
New in this version:
- 20 new documents added including more conversational data (total tokens: 203,879)
- Abstractive summaries for each document
- Annotations for salient/non-salient entities in each document
- Foreign language tags to identify individual source languages where relevant
- New easier process for reconstructing Reddit text data
- Many corrections to all annotation layers
GUM is an open source corpus of richly annotated English texts from multiple genres: academic, bio, conversation, fiction, interview, news, speeches, textbooks, travel, vlogs, how-to and Reddit forum discussions. The corpus is created by students as part of the Computational Linguistics curriculum at Georgetown University and is available under Creative Commons licenses.
This is the first version of GUM series 9, containing roughly 200K tokens annotated for:
- Multiple POS tags (100% manual gold PTB, extended PTB, converted CLAWS5 and UPOS) and UD morphological features
- Manually corrected lemmatization
- Sentence segmentation and rough speech act (manual)
- Document structure using TEI tags (paragraphs, headings, figures, captions etc., all manual)
- Constituent and dependency syntax (manually corrected Universal Dependencies, and PTB parses from gold tags with function labels)
- Information status (given-active/inactive, accessible-inferable/common ground/aggregate, and new)
- Entity type, salience and coreference annotation (including non-named entities, singletons, appositions, cataphora and several types of bridging)
- Entity linking (Wikification) of all named entities with Wikipedia articles, including their non-named and pronominal mentions
- Discourse parses in Rhetorical Structure Theory and discourse dependencies
- Abstractive summaries
Note on Reddit data: token text is not contained in the release but can be downloaded with an included script.
For more information and to search or download the corpus online, see the corpus website <https://gucorpling.org/gum/> .
Best wishes,
The GUM team
We invite you to participate in our multilingual stance classification shared task, as part of the Touché Lab, which will be held in conjunction with the CLEF'23 conference in Thessaloniki, Greece [1].
Context:
Participatory Democracy at the scale of a continent like Europe brings many difficulties due to the high diversity of languages and cultures. At the same time, Machine Learning is an interesting tool for stance recognition in a large-scale context, in terms of data size, but also regarding the topics and themes addressed or the languages employed by the participants. Public consultations of citizens using Online Participatory Democracy platforms offer this kind of setting and are good use cases for automatic stance recognition systems.
In the context of the Touché Lab at CLEF 2023 [2], we are proposing a shared task on data from the platform used during the Conference on the Future of Europe [3], which was inaugurated in 2021 and where users can submit proposals and comment on them in any of the 24 official EU languages. A particularity of this platform is its use of a Machine Translation system that lets users interact with each other in their native languages, leading to what we call Intra-Multilingual data: pairs of proposal and comment in different languages.
[1] https://clef2023.clef-initiative.eu/
[2] https://touche.webis.de/
[3] https://futureu.europa.eu/
Tasks: Given a proposal on a socially important issue, the task is to classify whether a comment is in favor, against, or neutral towards the proposal.
Subtask 1: Cross-debate Stance Classification
Subtask 2: All-data-available Classification
Learn more about this and other argumentation- and causality-related tasks at https://touche.webis.de/
Data available at https://touche.webis.de/clef23/touche23-web/multilingual-stance-classificat…
Register via the CLEF website: https://clef2023-labs-registration.dei.unipd.it/
-------------------------------------------------------------------------------
Important Dates
-------------------------------------------------------------------------------
Now open: Registration
Jan. 15, 2023: Development data available
April 30, 2023: Test data available
May 2, 2023: Submission of approaches on the test data
June 5, 2023: Participant paper submission
July 7, 2023: Camera-ready participant papers submission
Sep. 18-21, 2023: Conference
One of the conference days: Touché Workshop on Argument and Causal Retrieval
-------------------------------------------------------------------------------
Special Announcements
-------------------------------------------------------------------------------
Touché Open Source Proceedings
Touché will host a collection of software developed by participants at GitHub.
The Touché team invites you to publish your software there too, and invites software submissions using TIRA [ https://www.tira.io/ ].
In case of questions / suggestions / etc., please reach us at touche(a)webis.de.
Best regards,
CoFE Team @ Touché
Dear colleagues,
The Fourth Workshop on Insights from Negative Results in NLP
Co-located with EACL, May 5 or 6, 2023
First Call for Participation
Insights Website: <https://insights-workshop.github.io/>
Contact email: insights-workshop-organizers(a)googlegroups.com
* Overview
Publication of negative results is difficult in most fields, but in NLP the
problem is exacerbated by the near-universal focus on improvements in
benchmarks. This situation implicitly discourages hypothesis-driven
research, and it turns creation and fine-tuning of NLP models into art
rather than science. Furthermore, it increases the time, effort, and carbon
emissions spent on developing and tuning models, as the researchers have no
opportunity to learn what has already been tried and failed.
This workshop invites both practical and theoretical unexpected or negative
results that have important implications for future research, highlight
methodological issues with existing approaches, and/or point out pervasive
misunderstandings or bad practices. In particular, the most successful NLP
models currently rely on different kinds of pretrained meaning
representations (from word embeddings to Transformer-based models like BERT
and GPT-3). To complement all the success stories, it would be insightful
to see where and possibly why they fail. Any NLP tasks are welcome:
sequence labeling, question answering, inference, dialogue, machine
translation - you name it.
A successful negative results paper would contribute one of the following:
** broadly applicable recommendations for training/fine-tuning, especially
if X that didn’t work is something that many practitioners would think
reasonable to try, and if the demonstration of X’s failure is accompanied
by some explanation/hypothesis;
** ablation studies of components in previously proposed models, showing
that their contributions are different from what was initially reported;
** datasets or probing tasks showing that previous approaches do not
generalize to other domains or language phenomena;
** trivial baselines that work suspiciously well for a given task/dataset;
** cross-lingual studies showing that a technique X is only successful for
a certain language or language family;
** experiments on (in)stability of the previously published results due to
hardware, random initializations, preprocessing pipeline components, etc;
** theoretical arguments and/or proofs for why X should not be expected to
work;
** demonstration of issues with data processing/collection/annotation
pipelines, especially if they are widely used;
** demonstration of issues with evaluation metrics (e.g. accuracy, F1 or
BLEU), which prevent their usage for fair comparison of methods.
* Important Dates
** Submission due: February 13, 2023
** Submission due for papers reviewed through ACL Rolling Review: March 17,
2023
** Notification of acceptance: March 13, 2023
** Camera-ready papers due: March 27, 2023
** Workshop: May 5 or 6, 2023
* Submission
Submission is electronic, using the Softconf START conference management
system.
Submission link: <https://softconf.com/eacl2023/insights2023/>
The workshop will accept short papers (up to 4 pages, excluding
references), as well as 1-2 page non-archival abstract submissions for
papers published elsewhere (e.g. in one of the main conferences or in
non-NLP venues). The goal of this event is to stimulate a meaningful
community-wide discussion of the deep issues in NLP methodology, and the
authors of both types of submissions will be welcome to take part in our
get-togethers.
The workshop will run its own review process, and papers can be submitted
directly to the workshop by Feb 13, 2023. It is also possible to submit a
paper accompanied with reviews from the ACL Rolling Review system by March
17, 2023. The submission deadline for ARR papers follows the ACL RR
calendar. Both research papers and abstracts must follow the ACL two-column
format. Official style sheets:
<https://www.overleaf.com/read/crtcwgxzjskr>
<https://github.com/acl-org/ACLPUB/tree/master/templates>
Please do not modify these style files, nor should you use templates
designed for other conferences. Submissions that do not conform to the
required styles, including paper size, margin width, and font size
restrictions, will be rejected without review.
* Multiple Submission Policy
The workshop cannot accept work for publication or presentation that will
be (or has been) published elsewhere, or that has been or will be
submitted to other meetings or publications whose review periods overlap
with that of Insights. Any questions regarding submissions can be sent to
insights-workshop-organizers(a)googlegroups.com.
If the paper has been rejected from another venue, the authors will have
the option to provide the original reviews and the author response. The new
reviewers will not have access to this information, but the organizers will
be able to take into account the fact that the paper has already been
revised and improved.
* Anonymity Period
We are not enforcing any anonymity period.
* Presentation
All accepted papers must be presented at the workshop to appear in the
proceedings. Authors of accepted papers must notify the program chairs by
the camera-ready deadline if they wish to withdraw the paper. At least one
author of each accepted paper must register for the workshop.
Previous presentations of the work (e.g. preprints on arXiv.org) should be
noted in a footnote in the camera-ready version (but not in the anonymized
version of the paper).
The workshop will take place on May 5 or 6, 2023. It will be
hybrid, with both in-person and virtual presentations.
* Organization Committee
** Shabnam Tafreshi, University of Maryland: ARLIS
** Arjun Reddy Akula, Google
** João Sedoc, New York University
** Anna Rogers, University of Copenhagen
** Aleksandr Drozd, RIKEN
** Anna Rumshisky, University of Massachusetts Lowell / Amazon Alexa
* Contact info
Any questions regarding the workshop can be sent to
insights-workshop-organizers(a)googlegroups.com.
Please continue reading about: Authorship, Citation and Comparison, Ethics
Policy, Reproducibility, Anonymity Period, and Presentation in the call for
paper page on our website: https://insights-workshop.github.io/2023/cfp/
Regards,
Insights 2023 Organizers
--
*Shabnam Tafreshi, PhD*
*Assistant Research Scientist*
*Computational Linguistics, NLP*
*UMD: ARLIS @ College Park*
*"All the problems of the world could be settled easily, if people only
willing to think."*
*-Thomas J. Watson*
Fully funded 4-year PhD position on NLP for video subtitling at the University of Amsterdam, Language Technology Lab. This is a collaboration with RTL and part of the LTP ROBUST program. The call text is below my signature, mirrored from the official listing:
https://vacatures.uva.nl/UvA/job/PhD-Candidate-in-Natural-Language-Processi…
Apply, only through the link above, before Feb 24. For more context, see also my web site https://vene.ro/jobs.html. For further questions, don’t hesitate to e-mail me—please include [PhD 11053] in the subject line so my filters can catch your email.
Vlad Niculae [he/him]
Asst. Prof. @ LTL, IvI, University of Amsterdam
https://vene.ro
---
PhD Candidate in Natural Language Processing for Video Subtitling
Faculty/Department: Faculty of Science
Educational level: Master's
Function type: PhD position
Closing date: 24 February 2023
Vacancy number: 11053
We are inviting applications for a fully-funded, four-year PhD position in natural language processing for video subtitling. This is a collaboration between core Computer Science, Science, Technology, and Social Studies. Are you eager to work on applied research models for accessibility language technologies? Do you want to research controllability of language generation for generating adequate, appropriate, and faithful subtitles? This position might be the one for you!
What are you going to do?
You will be embedded in the Language Technology Lab (LTL) under the supervision of Dr. Vlad Niculae and lead a project to investigate and improve NLP generative models for semi-automatic subtitling for Dutch and English television and video-on-demand. As captions provide access to information to many, high quality and unbiased performance are of critical societal importance. Powerful speech recognition systems are available today and provide a solid basis, but do not solve subtitling. We aim towards a subtitling system that is:
- contextualized: it uses speaker identity, available scripts, and visual cues for improved accuracy;
- machine-in-the-loop: it quantifies its own uncertainty, giving control to expert human operators;
- faithful: it maintains good performance across languages, topics, and speaker identities (such as gender, age, region).
The PhD position will be part of the large LTP ROBUST program “Trustworthy AI-based Systems for Sustainable Growth” consortium, comprising 17 universities, 19 industry partners, and 15 collaborating partners representing diverse stakeholder groups. You will gain valuable experience working with an industry partner and will be able to tap into a wealth of networking, career development, and training opportunities in conjunction with ICAI, the Innovation Center for Artificial Intelligence at the University of Amsterdam. You will be part of one of the 17 new ICAI labs, named TAIM (Trustworthy AI for Media Lab) consisting of 5 PhD students, who will collaborate on developing methods, metrics and tools to evaluate and improve diversity and inclusion in media.
Tasks and responsibilities
With our help and support, you will:
- innovate in research on contextual, uncertainty-aware, faithful generative models of language for subtitling;
- deploy prototypes and evaluate subtitling in the applied setting of RTL;
- complete and defend your PhD thesis;
- become an active participant in the research community and collaborate within and outside the TAIM lab and the Language Technology Lab;
- publish and present work regularly at international conferences, workshops, and journals;
- assist in educational tasks (labs/tutorials, supervising Bachelor's and Master's projects).
Additionally, you will have the opportunity to closely collaborate with a leading entertainment brand, RTL. You are expected to work at their premises one day per week in Hilversum and one day remotely, making use of their resources and deployment context.
We care strongly about respecting work-life balance and contractual hours.
What do you have to offer?
Your experience and profile:
- A Master’s degree (completed or near completion) with a thesis in Natural Language Processing, Machine Learning, Computer Science, or similar relevant areas;
- Serious interest in pursuing fundamental research with concrete applications;
- A good background in Natural Language Processing, Machine Learning, and Deep Learning;
- Advanced programming skills;
- Professional command of the English language;
- A commitment to maintaining an inclusive, collaborative, diverse, and supportive work environment.
Interdisciplinary collaborations and backgrounds are appreciated, especially in fields related to linguistics and communication science. Experience with using subtitles or similar accessibility language technologies is a plus. If this describes you, we encourage your application.
If you are interested but unsure if you are qualified, please contact Dr. Vlad Niculae before applying. If your Master’s degree is near completion, it must be completed before the start date. Knowledge of the Dutch language is not required for this position, but can help both for living in Amsterdam and for a good understanding of the video content. The UvA provides the opportunity to attend Dutch language classes.
Our offer
A temporary contract for 38 hours per week for the duration of 4 years (the initial contract will be for a period of 18 months and, after satisfactory evaluation, it will be extended to a total duration of 4 years). The preferred starting date is April 2023. Your work should lead to a dissertation (PhD thesis). We will draft an educational plan that includes attendance of courses and (international) meetings. We also expect you to assist in teaching undergraduate and Master's students.
The gross monthly salary, based on 38 hours per week and dependent on relevant experience, ranges from €2,541 in the first year to €3,247 in the last year (scale P). UvA additionally offers an extensive package of secondary benefits, including 8% holiday allowance and a year-end bonus of 8.3%. The UFO profile PhD Candidate is applicable. A favourable tax agreement, the ‘30% ruling’, may apply to non-Dutch applicants. The Collective Labour Agreement of Universities of the Netherlands is applicable.
Besides the salary and a vibrant and challenging environment at Science Park we offer you multiple fringe benefits:
- 232 holiday hours per year (based on full-time employment) and extra holidays between Christmas and 1 January;
- Multiple courses to follow from our Teaching and Learning Centre;
- A complete educational program for PhD students;
- Multiple courses on topics such as leadership for academic staff;
- Multiple courses on topics such as time management and handling stress, and an online learning platform with 100+ different courses;
- 7 weeks birth leave (partner leave) with 100% salary;
- Partly paid parental leave;
- The possibility to set up a workplace at home;
- A pension at ABP for which UvA pays two-thirds of the contribution;
- The possibility to follow courses to learn Dutch;
- Help with housing for a studio or small apartment when you’re moving from abroad.
Curious about our extensive package of secondary employment benefits? Details are available via the official listing.
About us
The University of Amsterdam is the Netherlands' largest university, offering the widest range of academic programmes. At the UvA, 42,000 students, 6,000 staff members and 3,000 PhD candidates study and work in a diverse range of fields, connected by a culture of curiosity.
The Faculty of Science has a student body of around 8,000, as well as 1,800 members of staff working in education, research or support services. Researchers and students at the Faculty of Science are fascinated by every aspect of how the world works, be it elementary particles, the birth of the universe or the functioning of the brain.
The mission of the Informatics Institute (IvI) is to perform curiosity-driven and use-inspired fundamental research in Computer Science. The main research themes are Artificial Intelligence, Computational Science and Systems and Network Engineering. Our research involves complex information systems at large, with a focus on collaborative, data driven, computational and intelligent systems, all with a strong interactive component.
The Language Technology Lab (LTL) is a research group focusing on information access from natural language data. Our work ranges from basic research in natural language processing to key applications in human language technology, and covers areas such as machine translation, summarization, question answering, language modeling, and image captioning. LTL positions itself primarily in the AI research theme, with some links to the Data Science theme of the Informatics Institute.
As part of TAIM (Trustworthy AI for Media Lab), you are joining a unique team that also includes the Department of Advanced Computing Sciences at Maastricht University (UM) and media and entertainment company RTL Nederland.
The TAIM lab will bring together two of the strongest groups on personalization and recommender systems in the Netherlands (UM and UvA), with a leading media organization (RTL), to develop trustworthy and personalized media. The lab will focus on the development of media that is inclusive, informed by democratic norms, and in line with RTL's values to represent, and give a voice to, all of the Netherlands in the design of their personalization algorithms.
Want to know more about our organisation? Read more about working at the University of Amsterdam.
Any questions?
Do you have any questions or do you require additional information? Please contact:
Dr. Vlad Niculae, Assistant Professor.
Job application
If you feel the profile fits you, and you are interested in the job, we look forward to receiving your application. You can apply online via the official listing linked above. We accept applications until and including 24 February 2023.
Applications should include the following information (all files besides your CV should be submitted in one single pdf file):
- a letter of motivation (max 2 pages) in which you:
  - motivate your choice for this position and your interest in the proposed project;
  - indicate your preferred starting date and availability;
  - sketch out some thoughts and ideas about tackling the project (not a fully-detailed or binding proposal).
- a Curriculum Vitae (including start/end months of education and work experience);
- a summary of, or a copy of, your Master’s thesis;
- a copy of your Master’s and Bachelor’s transcript/diploma.
If your MSc thesis is not finished or not in English, submit a brief summary in 1-4 pages. If your transcripts or diplomas are not available yet, please attach a note clearly stating which documents are not available, and when they will be available. This note can be in your own words.
Before submitting, please make sure to provide ALL requested documents mentioned above.
You can use the CV field to upload your resume as a separate pdf document. Use the Cover Letter field to upload the other requested documents, including the motivation letter, as one single pdf file.
Please do not submit applications by e-mail.
Only complete applications received within the response period via the official listing will be considered.
The interviews will be held in March 2023.
The UvA is an equal-opportunity employer. We prioritize diversity and are committed to creating an inclusive environment for everyone. We value a spirit of enquiry and perseverance, provide the space to keep asking questions, and promote a culture of curiosity and creativity.
If you encounter Error GBB451/GBC451, please try using a VPN connection when outside of the European Union. Please reach out directly to our HR Department. They will gladly help you continue your application.
No agencies please.
Please, consider participating and/or forwarding to colleagues and groups.
****We apologize for multiple postings of this e-mail****
----------------------------------------------------------------------------------------------------
Call for Participation
----------------------------------------------------------------------------------------------------
Second Call for Participation
EXIST 2023 at CLEF 2023
Task: EXIST 2023: sEXism Identification in Social neTworks
Website: http://nlp.uned.es/exist2023/
EXIST is a series of scientific events and shared tasks on sexism identification in social networks that aims to capture sexism in a broad sense, from explicit misogyny to other subtle expressions that involve implicit sexist behaviours (EXIST 2021, EXIST 2022). The third edition of the EXIST shared task will be held as a Lab at CLEF 2023, which will take place on September 18-21, 2023, in the Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece.
Social networks are the main platforms for social complaint, activism and expression of opinions and personal views in general. Movements like #MeToo, #8M or #TimesUp have spread rapidly. Under the umbrella of social networks, many women all around the world have reported abuse, discrimination and other sexist experiences suffered in real life. Social networks are also contributing to the transmission of sexism and other disrespectful and hateful behaviours. In this context, automatic tools may help not only to detect and alert against sexist behaviours and discourses, but also to estimate how often sexist and abusive situations are found on social media platforms, what forms of sexism are more frequent and how sexism is expressed in these media.
Given the success of the previous editions, EXIST 2023 follows up on the tasks addressed in previous years while adding a new challenge: the identification of the intention of the author of the sexist message. Additionally, the main novelty will be the adoption of the “learning with disagreements” paradigm for the development of the dataset and for the evaluation of the systems. The adoption of this paradigm, along with our effort to control bias in the annotations, will allow us to evaluate whether including the different views and sensibilities of the annotators contributes to the development of more accurate and fairer NLP systems.
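To illustrate what the “learning with disagreements” paradigm can mean in practice, here is a minimal Python sketch that turns the individual votes of several annotators into a soft label (a probability distribution over classes) instead of a single majority label. The label names, the number of annotators and the function name are assumptions made for illustration only; they are not taken from the EXIST 2023 dataset or its evaluation tooling.

```python
from collections import Counter
from typing import Dict, List


def soft_label(votes: List[str], classes: List[str]) -> Dict[str, float]:
    """Turn individual annotator votes into a probability distribution."""
    if not votes:
        raise ValueError("at least one annotator vote is required")
    counts = Counter(votes)
    # Classes that received no vote get probability 0.0.
    return {c: counts[c] / len(votes) for c in classes}


# Hypothetical votes from six annotators for one tweet (binary task).
votes = ["YES", "YES", "NO", "YES", "NO", "YES"]
dist = soft_label(votes, classes=["YES", "NO"])

# The majority ("hard") label is still recoverable from the soft label.
hard = max(dist, key=dist.get)
```

A system trained or evaluated against `dist` is rewarded for reproducing the annotators' disagreement, not only for matching the majority vote.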
Participants will be asked to classify tweets (in English and Spanish) according to the following three tasks:
TASK 1 - Sexism Identification: a binary classification where systems have to decide whether or not a given text (tweet) contains sexist expressions or behaviours (i.e., it is sexist itself, describes a sexist situation or criticizes a sexist behaviour).
TASK 2 - Source Intention: for the tweets that have been classified as sexist, the second task aims to classify each tweet according to the intention of the person who wrote it. We propose a ternary classification task: (i) direct sexist message, (ii) reported sexist message and (iii) judgemental message.
TASK 3 - Sexism Categorization: once a message has been classified as sexist, the third task aims to categorize the message in different types of sexism (according to the categorization proposed by experts and that takes into account the different facets of women that are undermined). In particular, each sexist tweet must be categorized in one or more of the following categories: (i) Ideological and inequality, (ii) Stereotyping and dominance, (iii) Objectification, (iv) Sexual violence and (v) Misogyny and non-sexual violence.
Although we recommend participating in all subtasks, participants may take part in just one of them. During the training phase, the task organizers will provide participants with the manually annotated EXIST 2023 dataset. For the evaluation of the teams, the unlabelled test data will be released.
We encourage participation from both academic institutions and industrial organizations. We invite the participants to register for the lab at CLEF 2023 Labs Registration site (http://clef2023-labs-registration.dei.unipd.it/registrationForm.php). Upon registration participants will receive information about how to join the Google Group about the EXIST 2023 shared task.
Important Dates:
* 14 November 2022: Registration open.
* 13 February 2023: Training set available.
* 27 March 2023: Development set available.
* 10 April 2023: Test set available.
* 28 April 2023: Registration closes.
* 10 May 2023: Runs submission due.
* 26 May 2023: Results notification.
* 5 June 2023: Submission of Working Notes by participants.
* 23 June 2023: Notification of acceptance (peer-reviews).
* 7 July 2023: Camera-ready participant papers due.
* 18-21 September 2023: EXIST 2023 at CLEF Conference.
**Note: All deadlines are 11:59PM UTC-12:00 ("anywhere on Earth").**
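For participants who want to convert an “anywhere on Earth” deadline to UTC (or to their own time zone), a small Python sketch using only the standard library may help. The example date is the run submission deadline listed above; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" corresponds to a fixed offset of UTC-12:00.
AOE = timezone(timedelta(hours=-12), name="AoE")

# Run submission deadline: 10 May 2023, 11:59 PM AoE.
deadline_aoe = datetime(2023, 5, 10, 23, 59, tzinfo=AOE)

# Convert to UTC; use astimezone() with another tzinfo for a local zone.
deadline_utc = deadline_aoe.astimezone(timezone.utc)
print(deadline_utc.isoformat())  # 2023-05-11T11:59:00+00:00
```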
Organizers:
Laura Plaza, Universidad Nacional de Educación a Distancia (UNED)
Jorge Carrillo-de-Albornoz, Universidad Nacional de Educación a Distancia (UNED)
Roser Morante, Universidad Nacional de Educación a Distancia (UNED)
Enrique Amigó, Universidad Nacional de Educación a Distancia (UNED)
Julio Gonzalo, Universidad Nacional de Educación a Distancia (UNED)
Damiano Spina, Royal Melbourne Institute of Technology (RMIT)
Paolo Rosso, Universitat Politècnica de Valencia (UPV)
Contact:
Contact the organizers by writing to: jcalbornoz(a)lsi.uned.es
Website: http://nlp.uned.es/exist2023/
[Apologies for multiple postings]
ImageCLEFfusion (2nd edition)
Registration: https://www.imageclef.org/2023/fusion
Run submission: May 10, 2023
Working notes submission: June 5, 2023
CLEF 2023 conference: September 18-21, Thessaloniki, Greece
*** CALL FOR PARTICIPATION ***
While deep neural networks have proven their predictive power in many
tasks, there are still several domains where a single deep learning
network is not enough for attaining high precision, e.g., prediction
of subjective concepts such as violence, memorability, etc.
Late fusion, also called ensembling or decision-level fusion,
represents one of the approaches that researchers employ to increase
the performance of single-system approaches. It consists of using a
series of weaker learner methods called inducers, whose prediction
outputs are combined in the final step, via a fusion mechanism to
create a new and improved super predictor. These systems have a long
history and are shown to be particularly useful in scenarios where the
performance of single-system approaches is not considered
satisfactory.
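
As a rough illustration of the idea described above, the following Python
sketch fuses the outputs of several inducers for a regression-style task by
weighted averaging, one of the simplest late fusion mechanisms. The inducer
scores, the weights and the function name are made up for illustration; they
are not part of the ImageCLEFfusion data or tooling, and the task leaves the
choice of fusion mechanism to participants.

```python
from typing import List, Sequence


def late_fusion(predictions: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted average of per-inducer predictions.

    predictions[i][j] is the score inducer i assigns to sample j.
    weights[i] is the non-negative trust placed in inducer i.
    """
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need at least one inducer and one weight per inducer")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all inducers must score the same samples")
    return [
        sum(w * p[j] for w, p in zip(weights, predictions)) / total
        for j in range(n_samples)
    ]


# Hypothetical outputs of three inducers on four samples.
inducers = [
    [0.2, 0.8, 0.5, 0.1],
    [0.4, 0.6, 0.5, 0.3],
    [0.0, 1.0, 0.5, 0.2],
]
# The third inducer is trusted twice as much as the other two.
fused = late_fusion(inducers, weights=[1.0, 1.0, 2.0])
```

In practice the weights would be chosen on the development set, for example
in proportion to each inducer's individual performance.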
The task challenges participants to develop and benchmark late fusion
schemes. This makes it possible to explore various aspects of late
fusion mechanisms, such as the performance of different fusion
methods, the methods for selecting inducers from a larger set, the
exploitation of positive and negative correlations between inducers,
and so on.
*** TASK ***
The participants will receive a data set of real inducers and are
expected to provide a fusion mechanism that combines them into a
super-system yielding superior performance compared to the highest
performing individual system. The provided inducers were developed to
solve three real tasks:
(i) prediction of visual interestingness (int --- regression task),
(ii) diversification of image search results (div --- retrieval task),
(iii) medical image captioning (cap --- multi-class labeling task).
*** DATA SET ***
ImageCLEFfusion-int. The data for this task is extracted from the
Interestingness10k dataset. We will provide output data from 33
inducers; 1,826 samples will be used for the development set, and 609
samples will be used for the testing set.
ImageCLEFfusion-div. The data for this task is extracted from the
Retrieving Diverse Social Images Task dataset. We will provide output
data from 117 inducers; 104 queries will be used for the development
set, and 35 samples will be used for the testing set.
ImageCLEFfusion-cap. The data for this task is extracted from the
ImageCLEFmedical Caption task. We will provide output data from 85
inducers; 5,700 images will be used for the development set, and
1,900 images will be used for the testing set.
*** METRICS ***
Evaluation will be performed using the metrics specific to each
dataset we use, e.g., MAP@10, F1@20, ClusterRecall@20, accuracy.
*** IMPORTANT DATES ***
- Run submission: May 10, 2023
- Working notes submission: June 5, 2023
- CLEF 2023 conference: September 18-21, Thessaloniki, Greece
(https://clef2023.clef-initiative.eu/)
*** OVERALL COORDINATION ***
Liviu-Daniel Stefan, Politehnica University of Bucharest, Romania
Mihai Gabriel Constantin, Politehnica University of Bucharest, Romania
Mihai Dogariu, Politehnica University of Bucharest, Romania
Bogdan Ionescu, Politehnica University of Bucharest, Romania
*** ACKNOWLEDGEMENT ***
The task is supported under the H2020 AI4Media “A European Excellence
Centre for Media, Society and Democracy” project, contract #951911
https://www.ai4media.eu/.
On behalf of the Organizers,
Bogdan Ionescu
https://www.AIMultimediaLab.ro/
Dear colleagues,
The Second Workshop on NLP Applications to Field Linguistics (Field Matters 2023) invites paper submissions. The workshop will take place at EACL 2023 (https://2023.eacl.org/) in Dubrovnik, Croatia on May 5 or 6 (online participants are also welcome).
We accept papers on the following topics:
- Application of NLP to field linguistics workflow;
- Transfer learning for under-resourced language processing;
- The use of fieldwork data to build NLP systems;
- Modeling morphology and syntax of typologically diverse languages in the low-resource setting;
- Speech processing for under-resourced languages;
- Computational analysis of field linguistics datasets;
- Using technology for preserving culture via language;
- Improving ways of interaction with Indigenous communities;
- Machine-readable field linguistic datasets.
The submission deadline is February 13. The workshop will run its own review process, and papers can be submitted directly to the workshop via START (https://softconf.com/eacl2023/FieldMatters2023/).
You can find more information on the submission process and format requirements on our website (https://field-matters.github.io/cfp2023).
Follow us on Twitter (https://twitter.com/field_matters) for updates.
If you have any questions, feel free to ask them!
Best regards,
Anna Postnikova
Field Matters workshop organizing committee
*Query Performance Prediction (QPP)* is currently primarily used for ad-hoc
retrieval tasks. The Information Retrieval (IR) field is reaching new
heights thanks to recent advances in large language models and neural
networks, as well as emerging new ways of searching, such as conversational
search. Such advancements are quickly spreading to adjacent research areas,
including QPP, necessitating reconsidering how we perform and evaluate QPP.
Important Dates
Submission deadline: February 12th, 2023
Notification of acceptance: March 5th, 2023
Camera ready: March 15th, 2023
Workshop day: April 2nd, 2023
Conference days: April 3rd-6th, 2023
Call for Papers
This workshop aims at stimulating discussion on three main aspects
concerning the future of QPP:
- *What are the emerging QPP challenges* posed by new methods and
  technologies, including but not limited to dense retrieval, contextualized
  embeddings, and conversational search?
- How might these *new techniques be used to improve the quality of QPP*?
- Can we claim that the current techniques for *evaluating QPP are
  effective in all arising scenarios*? Can we envision new evaluation
  protocols capable of granting generalizability in new domains?
We plan to foster the discussion via *two focus groups* led by the
workshop's organizers.
The first focus group will identify what possibilities QPP offers
regarding new research models and IR tasks, primary considerations, issues
linked to different aspects of QPP, and the potential provided by
new tools.
The second focus group will gather the community’s concerns and solutions
with respect to QPP evaluation, especially regarding emerging
domains.
The workshop will focus on the following themes:
- *Query performance prediction applied to new tasks*:
  Can existing QPP techniques be exploited, or which new QPP theories and
  models need to be devised for new tasks, such as passage-retrieval, Q&A,
  and conversational search?
- *Query performance prediction exploiting new techniques*:
  How can new technologies like contextualized embeddings, large language
  models, and neural networks be exploited to improve QPP?
- *Evaluation of query performance prediction*:
  How should QPP techniques be evaluated, including best practices,
  datasets, and resources, and, in particular, should QPP be evaluated the
  same for different IR tasks?
It is possible to submit three main categories of manuscripts to the
workshop:
*Full papers*: up to 6 pages.
*Short papers*: up to 3 pages.
*Discussion papers*: up to 3 pages.
All manuscripts are expected to address the workshop's themes as mentioned
above. *Full and short papers* should contain *innovative ideas* and their
experimental evaluation. *We are also interested in works containing*
(methodologically sound) *preliminary results and incremental endeavours*.
*Discussion papers should include work with or without preliminary results,
position papers, and papers describing failures*. Such papers should foster
the discussion and thus are not required to contain full-fledged results.
In this sense, the experimental evaluation of the submitted discussion
paper is appreciated but not required.
*We are also interested in receiving contributions regarding*
(methodologically sound) *failed experiments*; since the workshop will
focus on new research directions, we consider it necessary also to discuss
the reasons and causes of failures.
Each manuscript will be peer-reviewed by at least two program committee
members.
*Accepted papers will be published online as a volume of the CEUR-WS
proceedings series.*
Submit your contribution via Easychair at the following link:
*https://easychair.org/conferences/?conf=qpp2023*
To prepare the submission, use the one-column CEUR template. A precompiled
version is available at
*https://drive.google.com/file/d/1sTW16i0vlsVHVf75t0rC_30UVMPUmn3Z/view?usp=share_link*
Website
*https://qpp.dei.unipd.it/*
Organizers
Guglielmo Faggioli, University of Padova, Italy, faggioli(a)dei.unipd.it
Nicola Ferro, University of Padova, Italy, ferro(a)dei.unipd.it
Josiane Mothe, Université de Toulouse, IRIT, France, josiane.mothe(a)irit.fr
Fiana Raiber, Yahoo Research, Israel, fiana(a)yahooinc.com
==============
Call for Papers @ Fourth Biennial Conference on Language, Data and
Knowledge (LDK 2023)
Dates: 12–13 September 2023 (Workshops/Tutorials), 14–15 September 2023
(Main Conference)
Location: Vienna, Austria
Website: http://2023.ldk-conf.org
Submission Deadline: 10 March 2023
Submission page: https://openreview.net/group?id=LDK/2023/Conference
==============
We invite submissions to the fourth biennial conference on Language, Data
and Knowledge (LDK 2023) to be held in Vienna, Austria in September 2023.
This conference aims to bring together researchers from across different
disciplines concerned with the acquisition, treatment, curation and use of
language data in the context of data science and knowledge-based
applications. This edition builds upon the success of the inaugural event
held in Galway, Ireland in 2017, the second LDK in Leipzig, Germany in
2019, and the third LDK in Zaragoza, Spain in 2021.
Invited speakers
We are happy to announce Diana Maynard (University of Sheffield), Ruben
Verborgh (Ghent University), and Ruth Wodak (Lancaster
University/University of Vienna), as keynote speakers for LDK 2023.
Paper submission
We welcome submissions of relevance to the topics listed below.
Submissions can be in the form of:
- Long papers: 9–12 pages;
- Short papers: 4–6 pages.
All submission lengths are given including references. Accepted submissions
will be published by ACL in an open-access conference proceedings volume,
free of charge for authors. The ACL templates
<https://github.com/acl-org/acl-style-files> should therefore be used for
all conference submissions.
As the reviewing process is single-blind, submissions should not be
anonymised.
Papers should be submitted via OpenReview at the following address:
https://openreview.net/group?id=LDK/2023/Conference
The conference will be hybrid (face-to-face and remote). Note that at least
one author of each accepted paper must register to present the paper at the
conference (either remotely or on-site). There is no registration fee for
participating in LDK 2023.
Presentation format
Accepted submissions will be selected for oral or poster presentation based
on recommendations from the reviewers. This decision will not reflect any
difference in the quality of the papers, and there will be no distinction
between oral and poster presentations in the published proceedings. Authors
of accepted short papers or posters are welcome to present their work as a
demo in addition to the regular presentation.
Topics
Relevant topics for the conference include, but are not limited to, the
following fields:
Language Data
- Language data construction and acquisition
- Language data annotation
- FAIR data practices for language data
- Language data portals and metadata about language data
- Organisational and infrastructural management of language data
- Multilingual, multimedia and multimodal language data
- Evaluation, provenance and quality of language data
- Visualisation of language data
- Standards and interoperability of language data
- Legal aspects of publishing language data
- Under-resourced languages
- e-Lexicography
- Semantic processing
Knowledge Graphs
- Linguistic linked data and the multilingual semantic web
- Ontologies, terminologies, wordnets, framenets and related resources
- Information and knowledge extraction (taxonomy extraction, ontology
  learning)
- Data, information and knowledge integration across languages
- (Cross-lingual) ontology alignment
- Entity linking and relatedness
- Linked data profiling
- Knowledge representation and reasoning
- Knowledge graphs for corpora processing and analysis
Applications for Language, Data and Knowledge
- Question answering and semantic search
- Text analytics on big data
- NLP for language documentation and preservation
- Speech recognition and synthesis
- Spoken language processing
- Semantic content management
- Computer-aided language learning
- Natural language interfaces to big data
- Knowledge-based NLP
- Deep learning and machine learning for and on LLOD
- Other applications
Use Cases in Language, Data and Knowledge
Contributions are welcome where the topics above - and others within the
scope of Language, Data and Knowledge - are applied to domain-specific use
cases, including but not limited to: social sciences and humanities, legal,
life sciences, FinTech, cybersecurity.
Organising committee
Conference Chairs:
- Jorge Gracia – University of Zaragoza
- John P. McCrae – University of Galway
Program Chairs:
- Sara Carvalho – University of Aveiro | NOVA CLUNL
- Anas Fahad Khan – Institute for Computational Linguistics “A. Zampolli”
Workshop and Tutorial Chairs:
- Ana Ostroški Anić – Institute of Croatian Language and Linguistics
- Blerina Spahiu – University of Milano-Bicocca
Local Organisers:
- Dagmar Gromann – University of Vienna
- Barbara Heinisch – University of Vienna
Proceedings Chair:
- Ana Salgado – NOVA CLUNL | Lisbon Academy of Sciences
Important Dates
- 10 March 2023: Paper submission deadline
- 28 April 2023: Notification
- 26 May 2023: Camera-ready submission deadline
- 12–13 September 2023: Pre-conference events
- 14–15 September 2023: Main conference
All deadlines refer to anywhere-on-earth time.
Program Committee
http://2023.ldk-conf.org/program-committee/
Workshops and tutorials
You can check the list of accepted workshops and tutorials at
http://2023.ldk-conf.org/workshops-tutorials/
[Apologies for multiple postings]
Dear Colleagues,
The previous call contained a small error; please refer to this one
instead.
2nd Call for Papers
18th Congress of the Latin American Association of Systemic-Functional
Linguistics (ALSFAL2023)
2nd Brazilian Congress of Systemic-Functional Linguistics
New paths for Dialogue, Understanding and Reconnection
* Dates: 28 August to 1 September 2023
* Venue: University of Campinas - UNICAMP (Campinas - São Paulo - Brazil)
This second Call for Papers aims to provide information about submitting
abstracts and congress fees. The Call for Papers is available on the
official conference website; you can find it here
<https://www2.iel.unicamp.br/alsfal2023/?page_id=119&lang=en>.
Please follow us on Instagram (@alsfal2023
<https://www.instagram.com/alsfal2023/>) and Facebook
<https://www.facebook.com/ALSFAL2023>.
* Official website: http://www.iel.unicamp.br/alsfal2023
* Official email: alsfal2023(a)unicamp.br
We hope to meet you in Campinas!
Rodrigo Esteves de Lima-Lopes
On behalf of the organising committee
--
*Rodrigo Esteves de Lima Lopes (he)*
rll307@unicamp.br/@rll307
Associate Professor
University of Campinas
Dept. of Applied Linguistics