Hello Colleagues,
Imagine asking an AI assistant for the latest updates on your favorite sports team, only to receive last year’s results. Or asking about a niche movie you love and getting no meaningful answer. These are classic examples of “hallucination,” where LLMs provide outdated or incorrect information.
The Meta Comprehensive RAG (CRAG) Benchmark Challenge aims to address these problems. Participate to improve how Large Language Models (LLMs) keep up with an ever-evolving reality and provide accurate responses by leveraging Retrieval-Augmented Generation (RAG).
Why RAG Matters
Despite the advancements of LLMs, hallucination persists as a significant challenge: LLMs may generate answers that lack factual accuracy or grounding. Retrieval-Augmented Generation (RAG) has recently emerged as a promising way to compensate for an LLM’s knowledge gaps, and it has attracted considerable attention from both academic research and industry.
Introducing the Meta Comprehensive RAG (CRAG) Benchmark Challenge
CRAG is a factual question-answering benchmark that covers 5 domains and 8 question types, and provides a practical setup for evaluating RAG systems. Unlike existing benchmarks, CRAG is designed to span a wide variety of domains and question types. In particular, it includes questions whose answers change over time spans ranging from seconds to years; it accounts for entity popularity, covering not only head but also torso and tail facts; and it contains simple-fact questions as well as 7 types of complex questions, such as comparison, aggregation, and set questions, to test the reasoning and synthesis capabilities of RAG solutions.
Why This Challenge Is a Game-Changer
Addressing "hallucination" and outdated information is critical to enhancing the reliability of LLM-powered question-answering systems. RAG tackles this by grounding responses in retrieved external data. The CRAG Benchmark comprehensively evaluates how effective these systems are across domains and question types, challenging them with scenarios that demand up-to-the-minute data as well as ones that probe less popular "tail" facts.
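To make the idea concrete, here is a minimal, illustrative sketch of the RAG pattern described above: retrieve relevant passages for a question, then build a prompt grounded in them for an LLM to answer from. The toy corpus, the word-overlap scorer, and the prompt format are all assumptions for illustration only, not the CRAG challenge's actual retrieval setup.

```python
def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by word overlap with the question (toy scorer)."""
    q_words = set(question.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Compose a grounded prompt; an LLM would answer from this context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Hypothetical mini-corpus standing in for web search results.
corpus = [
    "The final score of last night's game was 3-1.",
    "The movie premiered at a small festival in 2019.",
    "Quarterly revenue grew 4% year over year.",
]
question = "What was the final score of the game?"
prompt = build_prompt(question, retrieve(question, corpus))
print(prompt)
```

A production RAG system would replace the overlap scorer with dense or hybrid retrieval and pass the prompt to an LLM, but the shape of the pipeline is the same.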
What's unique about this challenge?
* Tasks Designed to Improve QA Systems: The three tasks focus on web-based retrieval summarization, knowledge graph and web augmentation, and an end-to-end RAG challenge, each built on top of the previous one.
* Rich Dataset Across Diverse Domains: The CRAG dataset covers domains from finance to music, to address questions that mirror real-world variability and complexity.
* Prizes For Winners: Compete for a chance to win a part of the $31,500 prize pool, with the top performing teams in each task winning up to $4,000.
Challenge Timeline
* Website Online and Registration Begin: 20th March, 2024 23:55 UTC
* Phase 1 Start Date: 1st April, 2024 23:55 UTC
* Phase 1 End Date: 20th May, 2024 23:55 UTC
* Phase 2 Start Date: 22nd May, 2024 23:55 UTC
* Registration and Team Freeze Deadline: 31st May, 2024 23:55 UTC
* Phase 2 End Date: 20th Jun, 2024 23:55 UTC
* Winner Notification: 15th July, 2024
* Winner Announcement: 26th August, 2024 (KDD Cup Winners)
👉 Engage Now: Dive into the challenge details at https://www.aicrowd.com/challenges/meta-comprehensive-rag-benchmark-kdd-cup-.... Join a community of innovative thinkers, share ideas, and take on this exciting challenge.
Connect with us on our Community Forum and Discord Server for support and collaboration. We're eager to see the innovations you'll bring to life. Refer to: https://www.aicrowd.com/challenges/meta-comprehensive-rag-benchmark-kdd-cup-... for the competition rules.
All the best, Team AIcrowd
*NO PURCHASE NECESSARY TO ENTER/WIN. Open to individuals who are 18+ and the age of majority and meet the full eligibility requirements in the full Rules. Open 3/20/2024 10:00:01 AM thru 6/15/2024 11:59:59 PM PT. Void where prohibited. Subject to full Rules at https://www.aicrowd.com/challenges/meta-comprehensive-rag-benchmark-kdd-cup-.... See Rules for prize details and values. Sponsor: Meta Platforms, Inc. 1 Hacker Way, Menlo Park, California 94025 (for US entrants) or Meta Ireland Limited, 4 Grand Canal Square, Dublin 2, Ireland (for all other entrants).