Call for Participation: SemRel SemEval Shared Task 1 - Corpora

19 Sep 2023


Dear corpora-list members,
We are glad to announce the first SemEval shared task on Semantic Textual
Relatedness (STR): A shared task on automatically detecting the degree of
semantic relatedness (closeness in meaning) between pairs of sentences.
The semantic relatedness of two language units has long been considered
fundamental to understanding meaning (Halliday and Hasan, 1976; Miller and
Charles, 1991), and automatically determining relatedness has many
applications such as evaluating sentence representation methods, question
answering, and summarization.
Two sentences are considered semantically similar when they have a
paraphrasal or entailment relation. On the other hand, relatedness is a
much broader concept that accounts for all the commonalities between two
sentences: whether they are on the same topic, express the same view,
originate from the same time period, one elaborates on (or follows from)
the other, etc. For instance, for the following sentence pairs:
- Pair 1: a. There was a lemon tree next to the house. b. The boy enjoyed
  reading under the lemon tree.
- Pair 2: a. There was a lemon tree next to the house. b. The boy was an
  excellent football player.
Most people will agree that the sentences in pair 1 are more related than
the sentences in pair 2.
In this task, new textual datasets will be provided for Afrikaans
https://en.wikipedia.org/wiki/Afrikaans, Algerian Arabic
https://en.wikipedia.org/wiki/Algerian_Arabic, Amharic
https://en.wikipedia.org/wiki/Amharic, English, Hausa
https://en.wikipedia.org/wiki/Hausa_language, Hindi
https://en.wikipedia.org/wiki/Hindi, Indonesian
https://en.wikipedia.org/wiki/Indonesian_language, Kinyarwanda
https://en.wikipedia.org/wiki/Kinyarwanda, Marathi
https://en.wikipedia.org/wiki/Marathi_language, Moroccan Arabic
https://en.wikipedia.org/wiki/Moroccan_Arabic, Modern Standard Arabic
https://en.wikipedia.org/wiki/Modern_Standard_Arabic, Punjabi
https://en.wikipedia.org/wiki/Punjabi_language, Spanish
https://en.wikipedia.org/wiki/Spanish_language, and Telugu
https://en.wikipedia.org/wiki/Telugu_language.
Data
Each instance in the training, development, and test sets is a sentence
pair. The instance is labeled with a score representing the degree of
semantic textual relatedness between the two sentences. The scores can
range from 0 (maximally unrelated) to 1 (maximally related). These gold
label scores have been determined through manual annotation. Specifically,
a comparative annotation approach was used, which avoids several known
limitations and biases of traditional rating scale annotation methods.
This process led to a high reliability of the final relatedness rankings.
Further details about the task, the method of data annotation, how STR is
different from semantic textual similarity, applications of semantic
textual relatedness, etc. can be found in this paper:
https://aclanthology.org/2023.eacl-main.55.pdf
Tracks
Each team can provide submissions for one, two or all of the tracks shown
below:
Track A: Supervised
Participants are to submit systems that have been trained using the labeled
training datasets provided. Participating teams are allowed to use any
publicly available datasets (e.g., other relatedness and similarity
datasets or datasets in any other languages). However, they must report
additional data they used, and ideally report how impactful each resource
was on the final results.
Track B: Unsupervised
Participants are to submit systems that have been developed without the use
of any labeled datasets pertaining to semantic relatedness or semantic
similarity between units of text more than two words long in any language.
The use of unigram or bigram relatedness datasets (from any language) is
permitted.
Track C: Cross-lingual
Participants are to submit systems that have been developed without the use
of any labeled semantic similarity or semantic relatedness datasets in the
target language and with the use of labeled dataset(s) from at least one
other language.  Note: Using labeled data from another track is mandatory
for submission to this track.
Deciding which track a submission should go to:
- If a submission uses labeled data in the target language: submit to
  Track A
- If a submission does not use labeled data in the target language but
  uses labeled data from another language: submit to Track C
- If a submission does not use labeled data in any language: submit to
  Track B
Note: Here ‘labeled data’ refers to labeled datasets pertaining to semantic
relatedness or semantic similarity between units of text more than two
words long.
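The decision rules above can be read as a short cascade. The following is only an illustrative sketch, not part of the official task materials; the function name choose_track and its two boolean parameters are hypothetical, and "labeled data" is assumed to mean relatedness or similarity labels for texts longer than two words, as in the note above.

```python
def choose_track(uses_target_language_labels: bool,
                 uses_other_language_labels: bool) -> str:
    """Return the track letter suggested by the decision rules.

    Both flags refer only to labeled semantic relatedness or similarity
    data for units of text more than two words long (assumption taken
    from the note in the announcement).
    """
    if uses_target_language_labels:
        # Any labeled data in the target language means Track A (supervised).
        return "A"
    if uses_other_language_labels:
        # No target-language labels, but labels from another language:
        # Track C (cross-lingual).
        return "C"
    # No such labeled data in any language: Track B (unsupervised).
    return "B"


if __name__ == "__main__":
    print(choose_track(True, True))
    print(choose_track(False, True))
    print(choose_track(False, False))
```

The target-language check comes first because, under the rules as listed, a system that uses target-language labels belongs in Track A regardless of what other data it uses.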
Evaluation
The official evaluation metric for this task is the Spearman rank
correlation coefficient, which captures how well the system-predicted
rankings of test instances align with human judgments. You can find the
evaluation script for this shared task on our GitHub page:
https://github.com/semantic-textual-relatedness/Semantic_Relatedness_SemEval2024/blob/main/evaluation_script/evaluation.py
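For readers unfamiliar with the metric, here is a minimal standard-library sketch of the Spearman rank correlation: rank both score lists (tied values share the mean of their positions) and take the Pearson correlation of the ranks. It is not the official evaluation script linked above, which remains the authoritative implementation; the function names average_ranks and spearman and the example scores are made up for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(gold, pred):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(gold) != len(pred) or len(gold) < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    rg, rp = average_ranks(gold), average_ranks(pred)
    n = len(rg)
    mean_g, mean_p = sum(rg) / n, sum(rp) / n
    cov = sum((a - mean_g) * (b - mean_p) for a, b in zip(rg, rp))
    var_g = sum((a - mean_g) ** 2 for a in rg)
    var_p = sum((b - mean_p) ** 2 for b in rp)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_g * var_p)


if __name__ == "__main__":
    # Hypothetical gold labels and system predictions for four sentence pairs.
    gold = [0.0, 0.25, 0.5, 1.0]
    pred = [0.1, 0.4, 0.3, 0.9]
    print(round(spearman(gold, pred), 3))
```

Because only the ordering of the scores matters, a system does not need to reproduce the gold values on the 0 to 1 scale; it needs to rank the sentence pairs in an order that agrees with the human judgments.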
Helpful Links
- Competition Website: https://codalab.lisn.upsaclay.fr/competitions/15704
- Task Website: https://afrisenti-semeval.github.io/
  https://semantic-textual-relatedness.github.io
- Twitter/X: https://twitter.com/AfriSenti2023
  https://twitter.com/SemRel2024
- Contact organisers: semrel-semeval-organisers@googlegroups.com
- Google group for participants:
  semrel-semeval-participants@googlegroups.com
Important Dates
- Training data ready: 11 September 2023
- Evaluation starts: 10 January 2024
- Evaluation ends: 31 January 2024
- System description paper due: February 2024
- SemEval workshop: Summer 2024 (co-located with a major NLP conference)
References
- Shima Asaadi, Saif Mohammad, and Svetlana Kiritchenko. 2019. Big BiRD: A
  Large, Fine-Grained, Bigram Relatedness Dataset for Examining Semantic
  Composition. In Proceedings of the 2019 Conference of the North American
  Chapter of the Association for Computational Linguistics: Human Language
  Technologies.
- M. A. K. Halliday and R. Hasan. 1976. Cohesion in English. London:
  Longman.
- George A. Miller and Walter G. Charles. 1991. Contextual Correlates of
  Semantic Similarity. Language and Cognitive Processes, 6(1):1–28.
- Mohamed Abdalla, Krishnapriya Vishnubhotla, and Saif Mohammad. 2023.
  What Makes Sentences Semantically Related? A Textual Relatedness Dataset
  and Empirical Study. In Proceedings of the 17th Conference of the European
  Chapter of the Association for Computational Linguistics, pages 782–796,
  Dubrovnik, Croatia. Association for Computational Linguistics.
Task Organizers
Nedjma Ousidhoum
Shamsuddeen Hassan Muhammad
Mohamed Abdalla
Krishnapriya Vishnubhotla
Vladimir Araujo
Meriem Beloucif
Idris Abdulmumin
Seid Muhie Yimam
Nirmal Surange
Christine De Kock
Sanchit Ahuja
Oumaima Hourrane
Manish Shrivastava
Alham Fikri Aji
Thamar Solorio
Saif M. Mohammad