Dear all, We are happy to announce the release of our LLMeBench framework. The framework is designed to accelerate and simplify the evaluation and benchmarking of large language models (LLMs). It is modular, language-agnostic, and simple to extend. It currently supports interaction with LLMs through APIs, and it features both zero- and few-shot learning settings. The framework is open-sourced to encourage improvements and extensions from the community.
The framework currently hosts recipes for a diverse set of Arabic NLP tasks using OpenAI's GPT and BLOOMZ models. Specifically, it serves 31 unique NLP tasks (ranging from word-level to sentence-pair tasks), with a particular focus on Arabic, using 53 publicly available datasets. It also comes equipped with 200 prompts for these setups. It includes recipes for 12 languages (Arabic, Bangla, Bulgarian, Dutch, English, French, German, Italian, Polish, Russian, Spanish, and Turkish), with more to come.
We hope this will encourage experimentation with LLMs for multilingual studies. We invite the research community to participate in and improve the framework. We are excited to hear your feedback and suggestions, and we thank you for your contributions.
For further details, please take a look at the repository and the paper below.
Code: https://github.com/qcri/LLMeBench
Paper: https://arxiv.org/pdf/2308.04945.pdf
Regards,
Firoj
................ Firoj Alam, PhD http://sites.google.com/site/firojalam/