[Corpora-List] PhD in ML/NLP – Fairness and self-supervised learning for speech processing

1 Jun 2023


      PhD in ML/NLP – Fairness and self-supervised learning for speech processing
Starting date: October 1st, 2023 (flexible)
Application deadline: June 9th, 2023
Interviews (tentative): June 14th, 2023
Salary: ~2000€ gross/month (social security included)
Mission: research oriented (teaching possible but not mandatory)
*Keywords:* speech processing, fairness, bias, self-supervised 
learning, evaluation metrics
*CONTEXT*
This thesis is part of the ANR project E-SSL (Efficient 
Self-Supervised Learning for Inclusive and Innovative Speech 
Technologies). Self-supervised learning (SSL) has recently emerged as 
one of the most promising artificial intelligence (AI) methods, as it 
is now feasible to take advantage of the colossal amounts of 
existing unlabeled data to significantly improve performance on 
various speech processing tasks.
*PROJECT OBJECTIVES*
Speech technologies are widely used in our daily lives and are expanding 
the scope of our actions through decision-making systems, including in 
critical areas such as health and law. In these societal 
applications, the use of these tools raises the issue of 
possible discrimination against people according to criteria for which 
society requires equal treatment, such as gender, origin, religion or 
disability. Recently, the machine learning community has been 
confronted with the need to address the possible biases of algorithms, 
and many works have shown that the search for the best performance is 
not the only goal to pursue [1]. For instance, recent evaluations of ASR 
systems have shown that performance can vary according to gender, 
and that these variations depend both on the training data and on the 
models [2]. Such systems are therefore increasingly scrutinized for 
bias, while trustworthy speech technologies are a crucial 
expectation.
Both the question of bias and the concept of fairness have now become 
important aspects of AI, and we now have to find the right trade-off 
between accuracy and measures of fairness. Unfortunately, these 
notions of fairness and bias are challenging to define and their 
meanings can differ greatly [3].
The goals of this PhD position are threefold:
- First, survey the many definitions of robustness, fairness 
and bias, with the aim of arriving at definitions and metrics fit for 
speech SSL models.
- Then, gather speech datasets with a large amount of well-described metadata.
- Finally, set up an evaluation protocol for SSL models and analyze the results.
*SKILLS*
* Master 2 in Natural Language Processing, Speech Processing, computer
  science or data science.
* Good command of Python programming and deep learning frameworks.
* Previous experience with bias in machine learning would be a plus.
* Very good communication skills in English.
* Good command of French would be a plus but is not mandatory.
*SCIENTIFIC ENVIRONMENT*
The PhD position will be co-supervised by Alexandre Allauzen (Dauphine 
Université PSL, Paris) and by Solange Rossato and François Portet 
(both Université Grenoble Alpes). Joint meetings are planned on a regular 
basis and the student is expected to spend time in both places. 
Two other PhD positions are open in this project, and the PhD student 
will collaborate closely with the two other PhD candidates to be 
recruited and with the partners from LIA, LIG and Dauphine Université 
PSL, Paris. For instance, specific SSL models along with evaluation 
criteria will be developed by the other PhD students. The means to 
carry out the PhD will be provided, both in terms of missions in France 
and abroad and in terms of equipment. The candidate will have access to 
the GPU clusters of both LIG and Dauphine Université PSL. 
Furthermore, access to the national supercomputer Jean-Zay will make it 
possible to run large-scale experiments.
*INSTRUCTIONS FOR APPLYING*
Applications must contain a CV, a letter/message of motivation and 
Master's transcripts (grades); candidates should be ready to provide 
letter(s) of recommendation. Applications should be addressed to 
Alexandre Allauzen (alexandre.allauzen@espci.psl.eu), Solange 
Rossato (Solange.Rossato@imag.fr) and François Portet 
(francois.Portet@imag.fr). We 
celebrate diversity and are committed to creating an inclusive 
environment for all employees.
*REFERENCES:*
[1] Mengesha, Z., Heldreth, C., Lahav, M., Sublewski, J. & Tuennerman, 
E. “I don’t Think These Devices are Very Culturally Sensitive.”—Impact 
of Automated Speech Recognition Errors on African Americans. Frontiers 
in Artificial Intelligence 4. ISSN: 2624-8212. 
https://www.frontiersin.org/article/10.3389/frai.2021.725911 (2021).
[2] Garnerin, M., Rossato, S. & Besacier, L. Investigating the Impact 
of Gender Representation in ASR Training Data: a Case Study on 
Librispeech in Proceedings of the 3rd Workshop on Gender Bias in Natural 
Language Processing (2021), 86–92.
[3] Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. & Galstyan, A. A 
Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 54. 
ISSN: 0360-0300. https://doi.org/10.1145/3457607 (July 2021).
-- 
François PORTET
Professeur - Univ Grenoble Alpes
Laboratoire d'Informatique de Grenoble - Équipe GETALP
Bâtiment IMAG - Office 333
700 avenue Centrale
Domaine Universitaire - 38401 St Martin d'Hères
FRANCE

Phone:  +33 (0)4 57 42 15 44
Email:  francois.portet@imag.fr
www:    http://membres-liglab.imag.fr/portet/
