[apologies if you receive multiple copies of this call]
Dear colleagues and friends,
*We are pleased to release the first Call for Participation for the LLMs4Subjects Shared Task, organized as part of SemEval 2025.*
*Overview:* The first shared task of its kind, LLMs4Subjects invites the research community to *develop cutting-edge LLM-based semantic solutions for subject tagging of the open-access collection of Leibniz University's Technical Library*. It offers an opportunity to apply LLMs creatively to the subject tagging of technical records. *Systems need to demonstrate bilingual language modeling by understanding technical documents in both German and English.* Moreover, successful solutions may be directly integrated into the operational workflows of the TIB Leibniz Information Centre for Science and Technology University Library.
*What we provide to participants:* a human-readable form of a subject taxonomy (the *GND*, or *Gemeinsame Normdatei*, the integrated authority file used for cataloging in German-speaking countries) and a large collection of technical records tagged with these subjects, drawn from the TIB's open-access collection called *TIBKAT*.
More details on the task website: https://sites.google.com/view/llms4subjects/
*LLMs4Subjects defines the following three tasks:*
- Task 1: Learn the GND
- Task 2: Align subject tagging to the TIBKAT collection
- (Optional and Fun) Task 3: Develop Elegant Frontend Interfaces for Subject Tagging
*LLMs4Subjects will have three separate evaluations:*
- Evaluation 1: Quantitative Metrics-based Evaluations
- Evaluation 2: Qualitative Evaluations by Human Subject Specialists
- (Optional) Evaluation 3: HCI evaluations of the submitted subject indexing interfaces
*To participate in the LLMs4Subjects shared task,*
1. please submit your interest to participate using our online form (https://forms.gle/YQzupcoySAyJi45c6),
2. sign up to the shared task Google Group (https://groups.google.com/u/6/g/llms4subjects) for FAQs, news, and announcements, and
3. last but not least, download the datasets (https://github.com/jd-coderepos/llms4subjects/) to begin development.
*Dates*
- Training and validation datasets available: October 2, 2024
- Test data available/Evaluation starts: January 10, 2025
- Evaluation ends: January 31, 2025
- Participant paper submissions due: February 28, 2025
- Notification to authors: March 31, 2025
- Camera ready due: April 21, 2025
- SemEval workshop: TBD
*Task Organizers*
Jennifer D'Souza, Sameer Sadruddin, Holger Israel, Mathias Begoin, et al. All organizers are affiliated with the TIB Leibniz Information Centre for Science and Technology, Germany (https://www.tib.eu/en/).
*We look forward to having you on board!*
*Contact:* llms4subjects [at] gmail.com