Test set released!
IberLEF 2023 Task - HOPE: Multilingual Hope Speech detection
Held as part of the evaluation forum IberLEF 2023 (https://sites.google.com/view/iberlef-2023) at the XXXIX edition of the International Conference of the Spanish Society for Natural Language Processing (SEPLN 2023, http://sepln2023.sepln.org/en/home/)
September 26, 2023. Jaén, Andalusia, Spain
Codalab link: https://codalab.lisn.upsaclay.fr/competitions/10215
Dear All,
We invite researchers and students to participate in the shared task HOPE: Multilingual Hope Speech detection, held as part of IberLEF 2023, the shared evaluation campaign for Natural Language Processing systems in Spanish and other Iberian languages, co-located with the SEPLN 2023 conference.
The HOPE shared task is related to the inclusion of vulnerable groups and focuses on the study of the detection of hope speech, in pursuit of equality, diversity and inclusion. This task was previously organized at the second workshop on Language Technology for Equality, Diversity and Inclusion (LT-EDI-2022), as a part of ACL 2022, but for five languages: Tamil, Malayalam, Kannada, English and Spanish. The novelties of this shared task are: i) it is organized in two languages, Spanish and English; and ii) it provides an expanded and improved dataset. It consists of two subtasks:
Subtask 1: Hope Speech detection in Spanish. Given a Spanish tweet, identify whether it contains hope speech or not. The possible categories for each text are:
- HS: Hope Speech.
- NHS: Non Hope Speech.
Subtask 2: Hope Speech detection in English. Given an English YouTube comment, identify whether it contains hope speech or not. The possible categories for each text are:
- HS: Hope Speech.
- NHS: Non Hope Speech.
In both subtasks there will be a real-time leaderboard, and participants may make a maximum of 10 submissions through CodaLab, from which each team must select the best one for ranking.
The dataset for this task comprises two corpora, one in Spanish and another in English. The Spanish corpus was collected between 2021 and 2022. It is an extension of the SpanishHopeEDI dataset (García-Baena et al., 2023), to be published in the journal Language Resources and Evaluation, which was used in the ACL LT-EDI-2022 Spanish task (Chakravarthi et al., 2022). It consists of a set of LGBT-related tweets annotated as HS (Hope Speech) or NHS (Non Hope Speech). A tweet is considered HS if the text: i) explicitly supports the social integration of minorities; ii) is a positive inspiration for the LGBTI community; iii) explicitly encourages LGBTI people who might find themselves in a situation; or iv) unconditionally promotes tolerance. Conversely, a tweet is marked as NHS if the text: i) expresses negative sentiment towards the LGBTI community; ii) explicitly seeks violence; or iii) uses gender-based insults.
The English corpus is an extension of the English part of the HopeEDI dataset (Chakravarthi, 2020). It consists of comments posted on YouTube videos on a wide range of socially relevant topics such as Equality, Diversity and Inclusion, including LGBTIQ issues, COVID-19, women in STEM, Black Lives Matter, etc.
Today we released the test dataset, which can be found in the "Files" subsection of the "Participate" tab.
Finally, remember that the CodaLab competition remains open for submitting your results on the test set until Mar 28, 2023.
To download the data and participate, go to: https://codalab.lisn.upsaclay.fr/competitions/10215.
Best regards,
The HOPE 2023 organizing committee
García-Baena, D., García-Cumbreras, M. A., Jiménez-Zafra, S. M., García-Díaz, J. A., & Valencia-García, R. (2023). Hope Speech Detection in Spanish. The LGBT case. Language Resources and Evaluation. To be published.
Chakravarthi, B. R. (2020). HopeEDI: A multilingual hope speech detection dataset for equality, diversity, and inclusion. In Proceedings of the Third Workshop on Computational Modeling of People’s Opinions, Personality, and Emotion’s in Social Media (pp. 41-53). Association for Computational Linguistics, Barcelona, Spain (Online). https://aclanthology.org/2020.peoples-1.5
Chakravarthi, B. R., Muralidaran, V., Priyadharshini, R., Cn, S., McCrae, J. P., García-Cumbreras, M. Á., Jiménez-Zafra, S. M., Valencia-García, R., Kumar Kumaresan, P., Ponnusamy, R., García-Baena, D. & García-Díaz, J. (2022, May). Overview of the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion (pp. 378-388). https://aclanthology.org/2022.ltedi-1.58
Important dates
- Release of training + development corpora: Feb 13, 2023.
- Release of test corpora and start of evaluation campaign: Mar 13, 2023.
- End of evaluation campaign (deadline for runs submission): Mar 28, 2023.
- Publication of official results: Mar 30, 2023.
- Paper submission: Apr 25, 2023.
- Review notification: May 23, 2023.
- Camera ready submission: Jun 9, 2023.
- IberLEF Workshop (SEPLN 2023): Sep 26, 2023 (Jaén, Andalusia, Spain)
- Publication of proceedings: Sep ??, 2023
Organizing committee
- Miguel Ángel García Cumbreras (SINAI, Universidad de Jaén)
- Daniel García-Baena (SINAI, Universidad de Jaén)
- Bharathi Raja Chakravarthi (University of Galway)
- Salud María Jiménez-Zafra (SINAI, Universidad de Jaén)
- José Antonio García-Díaz (UMUTeam, Universidad de Murcia)
- Rafael Valencia-García (UMUTeam, Universidad de Murcia)
- L. Alfonso Ureña-López (SINAI, Universidad de Jaén)
Salud María Jiménez Zafra (sjzafra@ujaen.es)
Universidad de Jaén, Grupo de Investigación SINAI (http://sinai.ujaen.es/)
Departamento de Informática, EPS Jaén, Edificio A3, Despacho 219, Campus Las Lagunillas s/n, 23071 Jaén | +34 953212992