Hi everyone (and sorry for cross-posting),
I wanted to highlight that we have released 80% of the ELLIPSE corpus on Kaggle as part of our final Feedback Prize competition: https://www.kaggle.com/competitions/feedback-prize-english-language-learning/overview
The corpus is open source and comprises 6,500 ELL writing samples, each with expert judgments on six analytic features related to language proficiency, including cohesion, syntax, vocabulary, and phraseology.
Once the competition ends in three months, we will release the entire dataset, which will also include a holistic score for language proficiency along with demographic and individual difference features.
I hope you can join the competition ($55K in prize money!) and help spread the news about this corpus. I am excited to see how the corpus can help advance research in second language studies!
Best,