Dear colleagues,

 

Last month, we shared the result of our collaborative work on a core metadata scheme for learner corpora with LCR2022 participants. Our proposal builds on Granger and Paquot's (2017) first attempt to design such a scheme. During our presentation, we explained the rationale for expanding on the initial proposal and discussed selected aspects of the revised scheme.

 

Our proposal is available at https://docs.google.com/spreadsheets/d/1-RbX5iUCUtCBkZU9Rfk-kv-Vzc--F-eUW2OgQcV58C8/edit#gid=238442486

 

We firmly believe that our efforts to develop a core metadata scheme for learner corpora will only be successful to the extent that (1) the LCR community is given the opportunity to engage with our work in various ways (provide feedback on the general structure of the scheme, the list of variables that we identified as core and their operationalization; test the metadata on other learner corpora; use the scheme to start a new corpus compilation, etc.) and (2) the core metadata scheme is the result of truly collaborative work.

 

As mentioned at LCR2022, we will be collecting feedback on the metadata scheme until the end of October. The online feedback form is available at:

 

https://docs.google.com/document/d/1NeDUuxGJlPSJI9wHVA1xgGM-aV8jXTa8Qlb45K-JanA/edit?usp=sharing

 

We'd like to thank all the colleagues who have already got back to us (at LCR2022, by email or via the online form), and we are grateful for their appreciation of and enthusiasm for our work! We'd also like to encourage more colleagues, particularly those of you who have experience in learner corpus compilation, to provide feedback. We need help in finalizing the core metadata scheme to make sure that it can be applied in all learner corpus compilation contexts. In short, we need you to make sure the scheme meets the needs of the LCR community at large.

 

With very best wishes,

 

Magali Paquot (also on behalf of Alexander König, Jennifer-Carmen Frey, and Egon W. Stemle)

 

Reference

Granger, S. & M. Paquot (2017). Towards standardization of metadata for L2 corpora. Invited talk at the CLARIN workshop on Interoperability of Second Language Resources and Tools, 6-8 December 2017, University of Gothenburg, Sweden.


Dr. Magali Paquot

Centre for English Corpus Linguistics

Institut Langage et Communication

UCLouvain

https://perso.uclouvain.be/magali.paquot/