Webinar by Sebastian Ruder (Meta) - Corpora

31 Jan 2025


**** We apologize for the multiple copies of this email. If you are 
already registered for the next webinar, you do not need to register 
again. ****
------------------------------------------------------------------------
Dear colleague,
We are happy to announce the next webinar in the Language Technology 
webinar series organized by the HiTZ Chair of AI&LT (https://hitz.eus). 
You can check the videos of previous webinars and the schedule for 
upcoming webinars here: http://www.hitz.eus/webinars
Next webinar:
*Speaker:* Sebastian Ruder (Meta)
*Title:* Multilingual LLM Evaluation in Practical Settings
*Date:* Thursday, February 6, 2025 - 15:00 CET
*Summary:* Large language models (LLMs) are increasingly used in a 
variety of applications across the globe but do not provide equal 
utility across languages. In this talk, I will discuss multilingual 
evaluation of LLMs in two practical settings: conversational 
instruction-following and usage of quantized models. For the first part, 
I will focus on a specific aspect of multilingual conversational ability 
where errors result in a jarring user experience: generating text in the 
user’s desired language. I will describe a new benchmark and evaluation 
of a range of LLMs. We find that even the strongest models exhibit 
language confusion, i.e., they fail to consistently respond in the 
correct language. I will discuss what affects language confusion, how to 
mitigate it, and potential extensions. In the second part, I will 
discuss the first evaluation study of quantized multilingual LLMs across 
languages. We find that automatic metrics severely underestimate the 
negative impact of quantization and that human evaluation—which has been 
neglected by prior studies—is key to revealing harmful effects. Overall, 
I highlight limitations of multilingual LLMs and challenges of 
real-world multilingual evaluation.
*Bio:* Sebastian Ruder is a research scientist at Meta based in Berlin, 
Germany, where he works on improving evaluation and benchmarking of large 
language models (LLMs). He previously led the Multilinguality team at 
Cohere with the objective of improving the multilingual capabilities of 
Cohere's LLMs. Before that, he was a research scientist at Google 
DeepMind. He completed his PhD in Natural Language Processing (NLP) at 
the Insight Research Centre for Data Analytics, while working as a 
research scientist at Dublin-based text analytics startup AYLIEN. 
Previously, he studied Computational Linguistics at the University of 
Heidelberg, Germany and at Trinity College, Dublin.
*Upcoming webinars:*
· Christian Herff (Thursday, March 6, 2025)
· Emanuele Bugliarello (Thursday, April 3, 2025)
· André F. T. Martins (Thursday, May 8, 2025)
If you are interested in participating, please complete this 
registration form: http://www.hitz.eus/webinar_izenematea
If you cannot attend this webinar but want to be informed of 
upcoming HiTZ webinars, please complete this registration form instead: 
http://www.hitz.eus/webinar_info
Best wishes,
HiTZ Zentroa
P.S.: HiTZ will not grant any type of certificate for attendance at these 
webinars.