CFP - SemEval 2025 Task 8: Question Answering on Tabular Data - Corpora

4 Oct 2024

      [apologies if you receive multiple copies of this call]
CALL FOR PARTICIPATION - SemEval 2025 Task 8: Question Answering on Tabular
Data
We are pleased to announce the first SemEval task on Question Answering on
Tabular Data.
Our SemEval 2025 task consists of Question Answering over Tabular Data
making use of the DataBench benchmark. DataBench is a benchmark composed of
real-world tabular datasets from different domains, with large numbers of
rows and columns and a wide variety of data types, which makes it possible
to assess distinct kinds of questions related to each data type.
We propose a task that encourages participants to develop a system that
answers questions of the kind present in DataBench over day-to-day
datasets, where the answer is either a number, a categorical value, a
boolean value or a list of one of several types. DataBench can be used as a
training and validation set, while we will release a separate test set
compiled specifically for the competition.
The system developed by the participants will be provided with a series of
(dataset, question) pairs and will need to provide an answer for each pair,
which will then be compared with a gold standard.
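To make the comparison step concrete, here is a minimal sketch of how a
predicted answer could be checked against a gold answer for the four answer
types listed above. This is not the official scorer: the function names
(normalize, is_correct, accuracy) and the matching conventions (rounding
numbers to two decimals, ignoring case and list order) are assumptions made
only for illustration.

```python
def normalize(value):
    """Map an answer to a tagged, comparable form.

    Assumed conventions (not the official scorer): numbers are rounded
    to two decimals, strings are compared case-insensitively, and list
    order is ignored.
    """
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return ("bool", value)
    if isinstance(value, (int, float)):
        return ("num", round(float(value), 2))
    if isinstance(value, (list, tuple)):
        return ("list", sorted((normalize(v) for v in value), key=repr))
    return ("str", str(value).strip().lower())


def is_correct(predicted, gold):
    """Return True when the prediction matches the gold answer."""
    return normalize(predicted) == normalize(gold)


def accuracy(predictions, golds):
    """Fraction of predictions that match their gold answers."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        return 0.0
    matches = sum(is_correct(p, g) for p, g in zip(predictions, golds))
    return matches / len(golds)
```

Tagging each value with its type keeps a boolean True from being counted as
equal to the number 1, which plain Python equality would otherwise allow.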
The answer may be obtained through a variety of methods. In our paper we
illustrate two different approaches: In-Context Learning and Code
Generation. You may use either of these or come up with your own approach.
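As a rough illustration of how the two approaches differ, the sketch below
builds one prompt that places the table itself in the model input
(In-Context Learning) and one that shows only the column schema and asks
for a program (Code Generation). It is not the setup from the paper: the
prompt wording, the helper names, the toy table and the hand-written
"generated" code are all hypothetical, and no real model is called.

```python
def table_to_text(rows):
    """Serialize a table (a list of dicts) as pipe-separated text."""
    columns = list(rows[0].keys())
    lines = [" | ".join(columns)]
    for row in rows:
        lines.append(" | ".join(str(row[c]) for c in columns))
    return "\n".join(lines)


def in_context_prompt(rows, question):
    """In-Context Learning: the whole table goes into the prompt."""
    return f"Table:\n{table_to_text(rows)}\n\nQuestion: {question}\nAnswer:"


def code_generation_prompt(rows, question):
    """Code Generation: only the schema goes into the prompt."""
    schema = ", ".join(
        f"{name} ({type(value).__name__})" for name, value in rows[0].items()
    )
    return (
        f"Columns: {schema}\n"
        f"Question: {question}\n"
        "Write a Python function answer(rows) that returns the answer."
    )


def run_generated_code(code, rows):
    """Run model-written code and call its answer(rows) function.

    exec is unsafe for untrusted code; a real system would sandbox this.
    """
    namespace = {}
    exec(code, namespace)
    return namespace["answer"](rows)


# Toy table (hypothetical data).
rows = [
    {"name": "Ana", "age": 34, "city": "Madrid"},
    {"name": "Luis", "age": 51, "city": "Cardiff"},
    {"name": "Mia", "age": 27, "city": "Madrid"},
]

# Hand-written stand-in for what a model might return.
generated = "def answer(rows):\n    return max(row['age'] for row in rows)\n"
result = run_generated_code(generated, rows)
```

The in-context prompt grows with the number of rows, whereas the code
generation prompt depends only on the number of columns, since the generated
function is run against the full table afterwards.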
There will be two subtasks:
Subtask I: DataBench QA
Participants will be provided with a dataset (of any size) and a question
over it. The question should be answered using the data from the dataset
only.
Subtask II: DataBench Lite QA
The task is essentially the same as the previous subtask, but uses a
sampled version of each dataset with a maximum of 20 rows per dataset (see
the explanation of DataBench Lite). The question should be answered using
the data from the sampled dataset only. For the test set, we will similarly
provide a reduced version of each dataset for this subtask. This subtask is
especially relevant when testing models with a smaller context window.
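For participants who want to build their own reduced tables during
development, the following sketch caps a table at 20 rows, the limit stated
for Subtask II. This call does not say how the official DataBench Lite
samples were drawn, so the uniform random sampling, the fixed seed and the
function name sample_rows are assumptions.

```python
import random

MAX_ROWS = 20  # row cap stated for Subtask II


def sample_rows(rows, max_rows=MAX_ROWS, seed=0):
    """Return at most max_rows rows, keeping their original order.

    Assumption: rows are drawn uniformly at random with a fixed seed so
    the sample is reproducible. The official sampling may differ.
    """
    if len(rows) <= max_rows:
        return list(rows)
    rng = random.Random(seed)
    kept = sorted(rng.sample(range(len(rows)), max_rows))
    return [rows[i] for i in kept]


# Hypothetical 100-row table reduced to 20 rows.
table = [{"id": i} for i in range(100)]
lite = sample_rows(table)
```

Questions whose answer depends on the full table (for example a maximum or a
count) can have a different answer on the sample, which is why the call asks
for answers based on the sampled dataset only.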
Important Dates
Official Competition start 10 January 2025
Competition end 31 January 2025
Task Organizers
Jorge Osés Grijalba - Graphext
L. Alfonso Ureña-López and Eugenio Martínez Cámara - University of Jaén
Jose Camacho-Collados - Cardiff University
Competition website: https://jorses.github.io/semeval/
Codabench: https://www.codabench.org/competitions/3360/
Google Group: https://groups.google.com/g/semeval-25-t8-tabularqa

-------

Eugenio Martínez Cámara.
Vice President of the SEPLN http://www.sepln.org/en
Postdoctoral Researcher in Natural Language Processing.
SINAI Research Group http://sinai.ujaen.es/
Associate Professor.
Computer Science Department.
University of Jaén.