Dear colleagues, We are pleased to announce the release of the FR-BSKY-ALO-2025-2026 Corpus: https://doi.org/10.5281/zenodo.19075495
FR-BSKY-ALO-2025-2026 is a corpus of anonymized, publicly available Bluesky posts collected for research and pedagogical purposes to explore lexical variation in French, between “sorte” and “espèce”, on the one hand, and between “enfin”, “finalement”, “au final”, and “à la fin”, on the other. These terms and expressions have been the subject of several studies in linguistics (Souza et al., 2011; Hansen & Mosegaard, 2005; Franckel, 1987, among others). The collection comprises 278,786 posts, published between 1 February 2025 and 27 January 2026, from all user accounts.
This academic activity has two main objectives. Data collection was carried out using Blueskyscraper (Moncomble, 2026: https://corpustools.prendrelangue.fr), with the participation of students enrolled in the course Analyse linguistique outillée (second-year undergraduate level, academic year 2025–2026) at the Université de Poitiers. The project aimed to introduce students to the practical and methodological aspects of corpus construction and to the analysis of linguistic data using corpus linguistics tools. Students were also invited to explore the corpus in order to investigate linguistic variation through a corpus-based approach grounded in digitized texts (Bluesky posts), thereby encouraging reflection on emerging forms of linguistic data and digital discourse studies.
The corpus is available on Zenodo in several formats (CSV, TEI/XML, TSV, TXT), with the aim of ensuring compatibility with a wide range of corpus analysis tools while preserving metadata intrinsic to digitized texts, notably posting dates and post URLs.
Best regards, Sangwan Jeon