[apologies if you receive multiple copies of this call]
CALL FOR PARTICIPATION - SemEval 2025 Task 8: Question Answering on Tabular Data
We are pleased to announce the first SemEval task on Question Answering on Tabular Data.
Our SemEval 2025 task consists of Question Answering over Tabular Data using the DataBench benchmark. DataBench is composed of real-world tabular datasets from different domains, with large numbers of rows and columns and a wide variety of data types, which makes it possible to assess distinct kinds of questions for each data type.
We propose a task that encourages participants to develop systems that answer questions of the kind found in DataBench over day-to-day datasets, where the answer is either a number, a categorical value, a boolean value or lists of several types. DataBench can be used as a training and validation set; we will release a separate test set compiled specifically for the competition.
The systems developed by participants will be given a series of (dataset, question) pairs and must provide an answer for each pair, which will then be compared against a gold standard.
Answers may be obtained through a variety of methods. In our paper we illustrate two different approaches: In-Context Learning and Code Generation. You may use either of these or come up with your own approach.
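As a rough illustration of this setup, the sketch below scores a system on (dataset, question, gold answer) triples. It is not the official scorer: the names `Table`, `answers_match`, `accuracy` and `toy_system`, the numeric tolerance and the order-sensitive list comparison are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical in-memory table: one dict per row.
Table = List[Dict[str, Any]]


def answers_match(predicted: Any, gold: Any, tol: float = 1e-2) -> bool:
    """Assumed comparison rule; the official evaluation may differ."""
    # Handle booleans first, because bool is a subclass of int in Python.
    if isinstance(gold, bool) or isinstance(predicted, bool):
        return (isinstance(gold, bool) and isinstance(predicted, bool)
                and predicted == gold)
    # Numbers: equal within an assumed absolute tolerance.
    if isinstance(gold, (int, float)):
        return isinstance(predicted, (int, float)) and abs(predicted - gold) <= tol
    # Lists: same length and element-wise match (order-sensitive by assumption).
    if isinstance(gold, list):
        return (isinstance(predicted, list) and len(predicted) == len(gold)
                and all(answers_match(p, g, tol) for p, g in zip(predicted, gold)))
    # Categorical values: case- and whitespace-insensitive string match.
    return str(predicted).strip().lower() == str(gold).strip().lower()


def accuracy(system: Callable[[Table, str], Any],
             examples: List[Tuple[Table, str, Any]]) -> float:
    """Fraction of (dataset, question) pairs answered correctly."""
    if not examples:
        return 0.0
    correct = sum(answers_match(system(table, question), gold)
                  for table, question, gold in examples)
    return correct / len(examples)


# Toy data and a toy system, both invented for this example.
people: Table = [{"name": "Ana", "age": 31}, {"name": "Luis", "age": 45}]


def toy_system(table: Table, question: str) -> Any:
    # Only knows one kind of question; answers everything else with None.
    if question == "What is the maximum age?":
        return max(row["age"] for row in table)
    return None


examples = [
    (people, "What is the maximum age?", 45),
    (people, "Is anyone older than 40?", True),
]
score = accuracy(toy_system, examples)
```

Here the toy system answers only the first question correctly, so `score` is 0.5.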
There will be two subtasks:
Subtask I: DataBench QA
Participants will be provided with a dataset (of any size) and a question over it. The question should be answered using the data from the dataset only.
Subtask II: DataBench Lite QA
This subtask is essentially the same as the previous one, but uses a sampled version of each dataset with a maximum of 20 rows per dataset (see explanation on DataBench Lite). The question should be answered using the data from the sampled dataset only. For the test set, we will similarly provide a reduced version of each dataset for this subtask. This subtask is especially relevant when testing models with a smaller context window.
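For participants who want to build similar reduced tables for their own experiments, the sketch below caps a table at 20 rows. It is only an assumption about how such a sample could be drawn; the function name `sample_rows`, the fixed seed and the uniform sampling are illustrative, and the reduced datasets for this subtask are the ones provided by the organizers.

```python
import random
from typing import Any, Dict, List

# Hypothetical in-memory table: one dict per row.
Table = List[Dict[str, Any]]


def sample_rows(rows: Table, max_rows: int = 20, seed: int = 0) -> Table:
    """Return at most max_rows rows, keeping their original order."""
    if len(rows) <= max_rows:
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    keep = sorted(rng.sample(range(len(rows)), max_rows))
    return [rows[i] for i in keep]


# Toy table with 100 rows, invented for this example.
full_table: Table = [{"id": i, "value": i * i} for i in range(100)]
lite_table = sample_rows(full_table)
```

Sorting the sampled indices keeps the rows in their original order, which can matter for questions that depend on row position.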
Important Dates
Official competition start: 10 January 2025
Competition end: 31 January 2025
Task Organizers
Jorge Osés Grijalba - Graphext
L. Alfonso Ureña-López and Eugenio Martínez Cámara - University of Jaén
Jose Camacho-Collados - Cardiff University
Competition website: https://jorses.github.io/semeval/
Codabench: https://www.codabench.org/competitions/3360/
Google Group: https://groups.google.com/g/semeval-25-t8-tabularqa