Dear corpora community,
We are organizing the ValueEval competition [1] as part of SemEval'23. We are looking for data suggestions or contributions to diversify our training and test datasets.
The task of ValueEval is to automatically identify the human value categories (e.g., "concern," "tradition," or "self-directed thinking") from which a statement draws its persuasive power. For example, the statement "Nuclear weapons have the potential to cause massive destruction" can draw on social security values to support the statement "Nuclear weapons should be abolished." For more details, see our paper and the existing dataset [2].
Specifically, we are looking for data that fulfill these criteria:
- The data consists of pairs of two causally related short statements (1 to 3 sentences each):
  - the first statement of the pair must provide one or more reasons
  - the second statement of the pair provides context for the first statement: the first statement must either be supporting or attacking the second statement
- The data contains between 50 and 1000 such pairs
- The statements are in English (possibly translated from a different language) and grammatically sound
The suitable datasets we know of (mainly from the computational argumentation community) focus on US or Western topics and contain debate-style statements. We are thus specifically looking for datasets that focus on issues from other parts of the world or on other genres. We are grateful for pointers to resources we could use (e.g., specific websites) or to existing corpora.
After assessing suitability per the criteria outlined above, we will take care of annotation. We will write a paper on the final dataset and invite each data contributor to join as a co-author.
Please respond with suggestions or contributions to this mail by August 31, 2022.
Yours sincerely,
Milad, Johannes, Henning, and Benno