Apologies for the multiple postings.
-----------------------------
*Indian Language Summarization (ILSUM 2024)*
Website: https://ilsum.github.io/
To be organized in conjunction with FIRE 2024 (fire.irsi.org.in)
12th-15th December 2024, Gandhinagar, India
------------------------------
The third shared task on Indian Language Summarization (ILSUM) aims to extend the evaluation benchmark datasets for Indian language summarization. Three Dravidian languages, Kannada, Telugu and Tamil, are introduced this year. We also extend the misinformation detection subtask to a cross-lingual setup.
*Subtask 1*: This task builds upon the task from the first two editions. In the previous editions we covered three major Indian languages, Hindi, Gujarati and Bengali, alongside Indian English, a widely recognized dialect of the English language. This year's edition adds the three Dravidian languages Kannada, Tamil and Telugu, and an expanded dataset for the languages from last year.
Like the previous edition, this will be a classic summarization task, where we will provide article-summary pairs for each language and the participants are expected to generate a fixed-length summary.
*Subtask 2*: The task is centred around identifying factual errors in machine-generated summaries. While LLMs are very good at summarization, among other NLP tasks, they are often prone to hallucinations. This means the model generates information that is inaccurate, is not based on its training data, or is completely made up but looks accurate and reliable. Further, such tools can be misused to generate misleading or outright incorrect information. Identifying such inaccuracies can be a challenging task.
This year's subtask builds upon a similar task from the previous edition, now in a cross-lingual setup. Participants will be provided with an article in English and its corresponding machine-generated summary in Hindi and Gujarati. The objective is to identify factual errors in the summaries, if any, and classify them into one of the predefined categories.
*Tentative Timeline*
-------------
15th August - Training Data Released and Registrations open
30th August - Test Data Release
30th September - Run Submission Deadline
10th October - Results Declared
20th October - Working notes due
20th November - Camera Ready Submissions due
12th-15th December - FIRE 2024 at Gandhinagar, India
*Organisers*
----------------
Shrey Satapara, Indian Institute of Technology, Hyderabad, India
Sandip Modha, LDRP-ITR, Gandhinagar, India
Shashirekha HL, Mangalore University, India
Asha Hegde, Mangalore University, India
Parth Mehta, Parmonic, USA
Debasis Ganguly, University of Glasgow, Scotland
*For regular updates, subscribe to our mailing list: ilsum@googlegroups.com*