Dear all,
I'm looking for a guide or advice on crowd-sourcing linguistic annotations via platforms such as Mechanical Turk. I have in mind rating tasks such as evaluating positive and negative sentiment in sentences, or annotating concordances from a corpus for a certain property (e.g. deontic vs. epistemic meaning in modal verbs).
Specifically, I'm wondering:
- How can I ensure that the annotations are of sufficient quality? I don't have a gold standard for all the data; after all, that is why I need the annotations. If I have all the data annotated by two or three independent annotators, I can at least check agreement, but I might still get annotators who more or less submit random annotations (or start doing so after a while), and it could take me a long time to find out who is doing so (see the sketch below for the kind of check I have in mind).
- How do I find out what remuneration is adequate?
- What is a good way to split up the data for annotation? Single annotation units or, say, 50 or 100 at a time? And how do I deliver them effectively to the annotators?
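To make the first question more concrete, here is a rough sketch of the kind of check I had in mind for spotting annotators who answer more or less at random: compare each annotator's labels against the per-item majority label and flag anyone whose agreement rate falls below a threshold. The data layout (item, annotator, label triples), the thresholds, and the function name are just my assumptions for illustration, not anything a platform provides.

```python
from collections import Counter, defaultdict

def flag_suspect_annotators(triples, min_agreement=0.7, min_items=20):
    """Return annotators whose agreement with the per-item majority label
    falls below `min_agreement`, considering only annotators with at least
    `min_items` judgements. `triples` is an iterable of
    (item_id, annotator_id, label) tuples."""
    labels_by_item = defaultdict(list)       # item_id -> [label, ...]
    items_by_annotator = defaultdict(list)   # annotator_id -> [(item_id, label), ...]

    for item_id, annotator_id, label in triples:
        labels_by_item[item_id].append(label)
        items_by_annotator[annotator_id].append((item_id, label))

    # Majority label per item (ties are resolved arbitrarily; note that each
    # annotator's own label also counts towards the majority, which makes the
    # check slightly lenient).
    majority = {item: Counter(labels).most_common(1)[0][0]
                for item, labels in labels_by_item.items()}

    suspects = {}
    for annotator, judgements in items_by_annotator.items():
        if len(judgements) < min_items:
            continue  # too few judgements to assess this annotator
        agreed = sum(1 for item, label in judgements if label == majority[item])
        rate = agreed / len(judgements)
        if rate < min_agreement:
            suspects[annotator] = rate
    return suspects

if __name__ == "__main__":
    # Toy example: annotator "a3" disagrees with the others on every item.
    demo = [
        ("s1", "a1", "deontic"),   ("s1", "a2", "deontic"),   ("s1", "a3", "epistemic"),
        ("s2", "a1", "epistemic"), ("s2", "a2", "epistemic"), ("s2", "a3", "deontic"),
    ]
    print(flag_suspect_annotators(demo, min_agreement=0.7, min_items=2))
    # -> {'a3': 0.0}
```

I realise this only catches disagreement after the fact; what I'd really like advice on is whether people instead embed a small set of pre-annotated gold items into each batch, and how large that set needs to be.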
Many thanks and best wishes,
Robert