Deadline 15th January: 5-6 months M2 internship, A computational socio-linguistic exploration of the Ethics of AI landscape, at Télécom Paris - Corpora

16 Dec 2024


Hi everyone,
Tiphaine Viard and Maria Boritchev are offering a Master 2 internship at Télécom Paris. Feel free to contact us if you have any questions about the offer or the project. The details of the offer are below:
Title: A computational socio-linguistic exploration of the Ethics of AI landscape
Duration: 5-6 months, starting from April 1st (flexible)
Location: Télécom Paris, 19 Pl. Marguerite Perey, 91120 Palaiseau, France
Advisors: [ https://mboritchev.github.io/page-perso-TP/ | Maria Boritchev ] , [ https://tiphainev.github.io/ | Tiphaine Viard ]
Stipend (gratification): approximately 600 euros per month (give or take 20 euros per month; the precise amount will depend on changes to the French labor code in 2025).
Context: 
Recent years have seen a surge of initiatives aiming to define what “ethical” artificial intelligence would or should entail, resulting in the publication of various charters and manifestos discussing AI ethics. These documents originate from academia, AI industry companies, non-profits, regulatory institutions, and civil society. Their contents vary wildly, from short, vague position statements to verbatim transcripts of democratic debates or impact assessment studies. As such, they are a marker of the social world of artificial intelligence, outlining the tenets of different actors, the points of consensus and dissensus on important goals, and so on [1]. We have assembled a corpus of Ethics of AI charters and manifestos, in English, written by different actors of the current AI landscape. This corpus is called MapAIE: [ https://mapaie.telecom-paris.fr/ | https://mapaie.telecom-paris.fr/ ] . We are conducting research on data from MapAIE from both sociological and linguistic perspectives:
* Sociologically, who are the groups of people who write about Ethics of AI?
* Linguistically, what type of vocabulary or semantic constructions do people use to write about Ethics of AI?
* Socio-linguistically, is there a difference in linguistic usage between different groups of people who write about Ethics of AI?
To conduct these investigations, we would like to go further than traditional tools: we intend to develop graph-based natural language processing and computational sociology approaches that make better use of modern NLP methods to explore our data. In particular, we could exploit word sense induction approaches to automatically extract different linguistic usages.
Objectives: 
The goal of this internship is to investigate MapAIE by using and developing graph-based natural language processing and computational sociology approaches. 
The internship will proceed in three steps: 
(1) Conduct a state-of-the-art review of existing graph-based natural language processing and computational sociology techniques, starting from Abstract Meaning Representations (AMR, [ https://github.com/amrisi/amr-guidelines/blob/master/amr.md | https://github.com/amrisi/amr-guidelines/blob/master/amr.md ] ) and Cortext ( [ https://www.cortext.net/ | https://www.cortext.net/ ] ). 
(2) Re-implement existing techniques identified in (1), in particular [2], and analyse the obtained results sociologically and linguistically in view of the research questions of the project. 
(3) Propose new research questions and new graph-based data exploration approaches relevant to MapAIE.
Bibliography: 
[1] Mapping AI ethics: a meso-scale analysis of its charters and manifestos, Mélanie Gornet, Simon Delarue, Maria Boritchev, and Tiphaine Viard, ACM Conference on Fairness, Accountability and Transparency 2024. 
[2] Matan Eyal, Shoval Sadde, Hillel Taub-Tabib, and Yoav Goldberg. 2022. Large Scale Substitution-based Word Sense Induction. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 
[3] Roth, C., & Hellsten, I. (2023). Socio-semantic configuration of an online conversation space: The case of Twitter users discussing the #IPCC reports. Social Networks, 75, 186-196. 
[4] Becker, H. S. (1976). Art worlds and social types. American behavioral scientist, 19(6), 703-718. 
[5] Cefaï, D. (2016). Publics, problèmes publics, arènes publiques…. Que nous apprend le pragmatisme?. Questions de communication, (30), 25-64.
Application: 
* deadline: January 15th, 2025 * 
To apply for this position, please send an email with your CV and a few words explaining your interest in this project to Maria Boritchev and Tiphaine Viard ( [ mailto:firstname.lastname@univ-nantes.fr | firstname.lastname@telecom-paris.fr ] ). 
We are looking for applications from students preparing a Master’s degree or equivalent with solid skills (and ideally experience) in Natural Language Processing, Machine Learning and Deep Learning, Computational Social Sciences. Knowledge of English is necessary.