*** We apologize for the multiple copies of this email. In case you have already received it, please ignore it. ***
Dear colleague,
We are happy to announce an additional webinar in the Language Technology webinar series organized by the HiTZ research center (Basque Center for Language Technology, http://hitz.eus). Instead of the usual afternoon hour, it will take place at 10:00am CET (June 24).
Next webinar:
Speaker: Mikel Artetxe (FAIR, Meta AI)
Title: Is scale all you need?
Date: Jun 24, 2022, 10:00 CET
Summary: Every once in a while, a new language model with gazillions of parameters makes a big splash on Twitter, smashing the previous SOTA on some benchmarks or showing impressive emergent capabilities. While some may argue that scaling will eventually solve NLP, others are skeptical about the scientific value of this trend. In this talk, I will argue that scaling is not just engineering, but also comes with exciting research questions.
I will present some of our recent work on the topic, and discuss our efforts to make large language models more accessible to the community.
Bio: Mikel Artetxe is a Research Scientist at FAIR (Meta AI). His primary area of research is multilingual NLP. Mikel was one of the pioneers of unsupervised machine translation, and has done extensive work on cross-lingual representation learning. More recently, he has also been working on natural language generation, few-shot learning, and large-scale language models. Prior to joining FAIR, Mikel did his PhD at the IXA group at the University of the Basque Country, and interned at DeepMind, FAIR, and Google.
Check past and upcoming webinars at the following URL: http://www.hitz.eus/webinars
If you are interested in participating, please complete this registration form: http://www.hitz.eus/webinar_izenematea
If you cannot attend this seminar, but you want to be informed of the following HiTZ webinars, please complete this registration form instead: http://www.hitz.eus/webinar_info
Best wishes,
HiTZ Zentroa
Unsubscribe: If you do not wish to receive further emails from us, please feel free to contact us.