The Marseille Computer Science and Systems Laboratory (https://www.lis-lab.fr/) is seeking a candidate for a three-year PhD grant as part of the ANR Cre@lame project, in collaboration with the University of Turku in Finland.
The thesis topic is the modeling of the literary writing and revision process carried out by authors. The starting point is an already written text, which is to be revised in the manner of an author. The task is framed as edit-operation prediction: the model takes the original text as input and outputs edit operations, which may affect the lexicon, the syntax or the textual organization.
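As an illustration of this formulation, the sketch below derives word-level edit operations from an original sentence and a revised one using Python's standard difflib module, then replays them on the original. It rests on assumptions made only for this example: the function names, the tuple representation of an operation and whitespace tokenization are hypothetical and do not describe the project's method. A prediction model would have to output such operations without access to the revised text.

```python
from difflib import SequenceMatcher


def edit_operations(original: str, revised: str):
    """Return word-level edit operations turning `original` into `revised`.

    Each operation is a tuple (tag, start, end, new_tokens), where
    `start:end` indexes the tokens of the original text and `tag` is one of
    "replace", "insert" or "delete". This representation is hypothetical.
    """
    src, tgt = original.split(), revised.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans are not edit operations
        ops.append((tag, i1, i2, tgt[j1:j2]))
    return ops


def apply_operations(original: str, ops):
    """Replay the operations on the original text, left to right."""
    src = original.split()
    out, cursor = [], 0
    for _tag, start, end, new_tokens in ops:
        out.extend(src[cursor:start])  # copy untouched tokens
        out.extend(new_tokens)         # empty for a deletion
        cursor = end                   # skip replaced or deleted tokens
    out.extend(src[cursor:])
    return " ".join(out)


if __name__ == "__main__":
    original = "the old house stood on the hill"
    revised = "the ancient house stood alone on the hill"
    ops = edit_operations(original, revised)
    for op in ops:
        print(op)
    print(apply_operations(original, ops))
```

Operations of this kind only cover local lexical changes; revisions of syntax or textual organization may call for a richer representation than a flat sequence.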
The research is organized around three directions.
The first is the nature of the object produced by the prediction process, which could take the form of a sequence of edit operations or a more complex form, such as a graph. The prediction model itself will depend largely on the nature of the predicted object.
The second concerns data. Revision data, which associate revision operations with a text, are uncommon in general, and data on literary revision are rarer still. We will rely on all available data and, possibly, generate additional data using language models, in order to train the revision models.
The final direction concerns evaluation. Given an original text and a revised version, how can we judge the quality of the latter? And how can we assess whether the changes made to the original text are consistent with the objectives of the revision process?
We are looking for candidates with a strong background in machine learning, particularly deep learning, as well as knowledge of Natural Language Processing.
Application deadline: June 22
Contacts: Patrice Bellot (patrice.bellot@univ-amu.fr), Christophe Leblay (chrleb@utu.fi) and Alexis Nasr (alexis.nasr@univ-amu.fr)