Dear All,
We are pleased to announce an online meeting with the organizers of the Learning With Disagreements (LeWiDi) shared task @SemEval2023, where you can find out more about the task.
*WHEN?* FRIDAY, 2ND DEC 2022, 1 p.m. CET
*WHERE?* ON ZOOM
*WHO?* The online meeting is open to everyone who is interested in the task, including those who have not subscribed yet
*WHAT?* The meeting will be structured as follows:
- introduction to the datasets and the task by the organisers
- Q&A time
*HOW?* Find the details of the meeting:
- on the pages of our Codalab competition: https://codalab.lisn.upsaclay.fr/competitions/6146
- by adding the event to your calendar at the following link: https://calendar.google.com/calendar/event?action=TEMPLATE&tmeid=N3RzMHJmcHBib29jYTZlNG9wcXBwYWhpb3AgbGV3aWRpc2VtZXZhbDIwMjNAbQ&tmsrc=lewidisemeval2023@gmail.com
We hope to see many of you at the meeting!
-------------------------------------------------------------------------------
*ABOUT THE TASK*
The assumption that natural language expressions have a single, clearly identifiable interpretation is increasingly recognized as just a convenient idealization. The focus of the LeWiDi task is entirely on subjective tasks, where the use of aggregated labels makes much less sense.
The objective of the LeWiDi task is to provide a unified testing framework for learning from disagreements and developing methods able to capture them, using datasets containing disaggregated annotations.
We propose 4 diverse (textual) datasets:
- with different characteristics in terms of genre (social media and conversations), language (English and Arabic), task (misogyny, hate speech, and offensiveness detection), and annotation methodology (experts, specific demographic groups, AMT crowd)
- all datasets are equipped with disaggregated annotations
- all datasets provide relevant information about the annotators
We developed a *harmonized JSON format* for the 4 datasets, so as to encourage participants to *develop methods able to capture agreements/disagreements*, rather than focusing on building the best model for each dataset.
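Purely as a hypothetical illustration of what a disaggregated record can look like (all field names and values below are made up; the actual harmonized schema is documented with the task data):

    import json

    # Hypothetical disaggregated record; NOT the actual LeWiDi schema.
    record = json.loads("""
    {
      "text": "an example post",
      "annotators": ["A1", "A2", "A3"],
      "annotations": [0, 1, 1],
      "hard_label": 1
    }
    """)

    # A soft label can be derived directly from the raw annotations,
    # e.g. as the fraction of annotators who chose the positive class:
    votes = record["annotations"]
    p_positive = sum(votes) / len(votes)
    print(p_positive)  # 0.666... for this made-up item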
Performance is evaluated using two metrics:
1. 'classical hard' evaluation (F1): how well the model predicts the aggregated labels (binary, based on the majority vote)
2. 'soft' evaluation (cross-entropy): how well the model's probabilities reflect the level of agreement among annotators
An ideal model would have high F1 and low cross-entropy results.
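As a rough sketch of the two evaluation views (this is not the official scorer, and all numbers below are invented for illustration), the binary case could be computed along these lines:

    import numpy as np
    from sklearn.metrics import f1_score

    hard_targets = np.array([1, 0, 1, 1])            # majority-vote labels (made up)
    soft_targets = np.array([0.67, 0.33, 1.0, 0.6])  # fraction of positive votes (made up)
    model_probs = np.array([0.8, 0.2, 0.9, 0.55])    # model P(label = 1) (made up)

    # 1. 'Hard' evaluation: F1 of thresholded predictions vs. majority labels.
    hard_preds = (model_probs >= 0.5).astype(int)
    print("F1:", f1_score(hard_targets, hard_preds))

    # 2. 'Soft' evaluation: cross-entropy between the model's probabilities
    #    and the annotator agreement distribution, averaged over items.
    eps = 1e-12  # avoid log(0)
    p = np.clip(model_probs, eps, 1 - eps)
    ce = -(soft_targets * np.log(p) + (1 - soft_targets) * np.log(1 - p))
    print("cross-entropy:", ce.mean())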
PARTICIPATE:
Get the data and participate: https://codalab.lisn.upsaclay.fr/competitions/6146
CONTACT US: le-wi-di-semeval2023_contactus@googlegroups.com
DATES:
Current status: train and dev sets are released; unlimited submissions via Codalab are allowed (evaluation on dev)
January 10, 2023: evaluation phase starts, unlabeled test set is released; limited submissions on Codalab are allowed (evaluation on test)
February 2023: participant paper submission
March 2023: peer review notification
April 2023: camera-ready participant paper submission
Summer 2023: SemEval workshop (co-located with a major NLP conference)
ORGANIZERS:
Elisa Leonardelli, Fondazione Bruno Kessler (FBK), Italy
Gavin Abercrombie, Heriot-Watt University, United Kingdom
Valerio Basile, Torino University, Italy
Tommaso Fornaciari, Bocconi University, Italy
Barbara Plank, IT University of Copenhagen, Denmark
Verena Rieser, Heriot-Watt University, United Kingdom
Massimo Poesio, Queen Mary University of London, United Kingdom
Alexandra Uma, Queen Mary University of London, United Kingdom
Best regards,
the LeWiDi organizers