Dear Ada,
Thank you for your interest in our workshop topic. First, to your concrete question about our shared task:
Will data used to train the LLMs be included in the evaluation?
We try to mitigate this issue by relying on very recently created data from 2023 sources. This may also help with your other questions about the shared task.
Your remaining questions, or, if I may, thoughts, concern the general topics that our workshop is about. Personally, I think they contain valid points and interesting reflections on evaluation and evaluation practices in NLP. Some of them, I believe, touch on deeper issues that are quite prevalent in NLP and ML, such as potential selection biases in human judges. More generally, Eval4NLP warmly welcomes contributions that discuss, introduce, or extend such thoughts, opinions, critiques, reflections, and outlooks on evaluation and evaluation practices in NLP. We hope the workshop can serve as a forum for exchanging ideas and feedback.
Best wishes,
Juri