Dear Corpora List members,
I am using the Books v1 corpus from OPUS (https://opus.nlpl.eu/datasets/Books?pair=en&es) in my research and have a question about how its alignments were produced.
The corpus description mentions that some texts were manually reviewed by András Farkas, but it is not clear whether this review covered sentence-level alignments, paragraph-level alignments, or both.
Specifically, I would like to know whether the sentence-level alignments (including 1-to-1, 1-to-2, 2-to-1, and unmatched sentences) can be considered manually verified gold-standard data, a partially reviewed silver standard, or fully automatic alignments without human validation.
I would be very grateful if someone could provide any clarification or pointers to relevant documentation. If this is not the most appropriate forum for this question, I apologize in advance, but I thought someone here might be able to help.
Thank you very much for your time.
Hugo