The Chair of Data Science and Natural Language Processing (Prof.
Siegfried Handschuh) invites applications for a PhD position as part of
the recently granted Swiss National Science Foundation (SNF) research
project "Conversational AI: Dialogue-based Adaptive Argumentative
Writing Support". You will contribute to its successful implementation,
as well as the chair's varied activities in teaching and outreach.
Our research at the Data Science and NLP Chair at ICS-HSG focuses on the
cutting-edge field of Natural Language Processing (NLP) and Natural
Language Understanding (NLU). Our group delves into various aspects,
including conversational AI, Large Language Models (LLM), intricate
language analysis, Knowledge Graphs and more. Utilising sophisticated
methods and algorithms, we aim to extract deeper knowledge and
understanding from vast amounts of textual and auditory data. Our
research not only delves into the core principles of NLP but also
explores its practical applications across various industries.
Details of the post and how to apply: https://bit.ly/phd_conversational_ai
--
Prof. Dr. Siegfried Handschuh
Full Professor of Data Science and Natural Language Processing
Director, Institute of Computer Science
University of St.Gallen
E-mail: siegfried.handschuh(a)unisg.ch
*** Second Shared Task on Automatic Minuting ***
*** AutoMin 2023 as INLG 2023 Generation Challenge ***
Automatic summarization of meetings into meeting minutes (aka
`minuting') would be an amazing tool. Superficially, large language
models like BART, GPT*, and friends may seem to have solved it already,
but serious concerns still remain regarding completeness and factual
correctness.
We would like to invite you to take part in AutoMin 2023, the second
instance of the shared task on automatic minuting, i.e., summarization of
English and Czech meeting transcripts into meeting minutes and their
evaluation. See the details on task options below.
AutoMin 2023 is a registered INLG Challenge this year.
*** Website ***
https://ufal.github.io/automin-2023/
*** Key Tasks (one or both) ***
- Task A: The main task consists of automatically creating minutes
from multiparty meeting transcripts. The generated minutes will be
evaluated both via automatic and manual metrics.
- Task D: Given a meeting transcript, candidate minutes, and a set of
one or more reference minutes, assign a score indicating the quality
of the candidate. Task D will not be evaluated by a single criterion.
All submissions to Task D will be evaluated in terms of Pearson
correlation against all manual and all automatic evaluation scores
available in AutoMin 2023, including other submissions to Task D.
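As a rough illustration of the Task D evaluation described above, the sketch below computes the Pearson correlation between one system's quality scores and each available set of reference scores. The score names and values are hypothetical and chosen only for illustration; the organizers' actual evaluation scripts may differ.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores that a Task D system assigned to four candidate minutes.
system_scores = [0.2, 0.5, 0.9, 0.4]

# Hypothetical reference scores for the same four candidates (names are made up).
reference_scores = {
    "manual_adequacy": [1.0, 3.0, 5.0, 2.0],
    "automatic_metric": [0.30, 0.35, 0.60, 0.33],
}

# One correlation per reference score set, since no single criterion is used.
correlations = {
    name: pearson(system_scores, scores)
    for name, scores in reference_scores.items()
}
```

Reporting one correlation per reference score set mirrors the statement that Task D is not evaluated by a single criterion.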
*** Registration ***
Registration and participation are free of charge.
All participants need to register to get access to the test data.
Use this form and provide us with your GitHub user name (one member
per team is fine):
https://forms.office.com/e/jdvsdtV5Tm
After registration, we will give you GitHub access to the test data.
Please get in touch in case you would be willing to contribute to the
manual evaluation of AutoMin submissions.
*** Important Dates (EXTENDED) ***
May 1, 2023 System Submission Deadline
May 15, 2023 Report Submission Deadline
July 25, 2023 Camera Ready Submission; in line with INLG
September, 2023 AutoMin @ INLG 2023
*** Publication ***
All teams are required to submit a brief technical report describing
their method. Please submit using the INLG paper template (details
here: https://inlg2023.github.io/calls.html).
All reports must be a minimum of 2 pages and a maximum of 4
pages excluding references. Reports must be written in English.
The proceedings will be published in ACL Anthology.
*** Organizers ***
Ondřej Bojar, UFAL, Charles University, Czech Republic
Tirthankar Ghosal, UFAL, Charles University, Czech Republic
Marie Hledikova, UFAL, Charles University, Czech Republic
Anja Nedoluzhko, UFAL, Charles University, Czech Republic
*** Contact ***
For inquiries, contact <automin(a)ufal.mff.cuni.cz>.
--
+++++++++++++++++++++++++++++++++++
Tirthankar Ghosal
https://member.acm.org/~tghosal
++++++++++++++++++++++++++++++++++++
TrueHealth 2023: Combating Health Misinformation for Social Wellbeing
Workshop @ ICWSM 2023, the 17th International Conference on Web and Social Media
June 5th – 8th 2023, Limassol, Cyprus
https://truehealth.disco.unimib.it/
Scope and topics:
In recent years, people have increasingly referred to the Web and social media as sources of information about health-related problems and solutions, as confirmed by the U.S. Pew Research Center, and other European and international studies. Although, on the one hand, these platforms favor easier and more direct access to information sources by users without the intermediation of experts, on the other hand, it is precisely such democratization of health information that constitutes a potential danger for people. As we have seen especially in the last period, linked to the pandemic, the proliferation of false information, conspiracy theories, and unreliable remedies risks compromising the health not only of individuals but also of the community as a whole.
From this perspective, it becomes necessary to study and propose technological solutions to help users come into contact with genuine information, especially in a critical domain such as health, for social well-being.
To this end, it is essential to promote research of an interdisciplinary nature, involving computer scientists, physicians, lawyers, and communication experts who can address the problem of health misinformation from different points of view by combining their expertise.
The topics of interest of the TrueHealth 2023 Workshop at ICWSM include, but are not limited to:
Assessing the genuineness of Online Health Information (OHI);
Consumer Health Search (CHS) and genuine information access;
Debunking health misinformation;
Fake news/rumors and healthcare;
Measures, evaluation methods, and datasets for health misinformation detection;
Health misinformation detection;
Health literacy and information genuineness;
Fact-checking in Online Health Information (OHI);
Misinformation and public opinion on health;
Relationship between access to non-genuine information and danger to public health;
Relationship between psychological characteristics and perceptions of health misinformation;
Techniques for accessing and retrieving genuine Online Health Information (OHI).
Submission Instructions
We welcome 2-page abstracts as well as Long (8 pages) and Short (4 pages) papers – excluding references (11 pages max with references and ethics statement). Abstracts are ideal as Demo or Position papers, Short papers as presentations of ongoing research with preliminary results or summaries of previous work, and Long papers as presentations of novel research and results.
Long and Short papers will be published in ICWSM Workshop Proceedings (http://workshop-proceedings.icwsm.org/).
All submissions should be double-blind.
Papers have to follow the AAAI format, as outlined here: https://www.overleaf.com/latex/templates/aaai-2023-author-kit/wxnmhzcrjbpc
Submit here: https://easychair.org/conferences/?conf=truehealth2023
Other ICWSM submission instructions: https://www.icwsm.org/2023/index.html/call_for_submissions.html
Important Dates
Workshop Papers Submissions: March 27, 2023
Workshop Paper Acceptance Notification: April 10, 2023
Workshop Final Camera-Ready Paper Due: May 6, 2023
ICWSM-2023 Workshops Day: June 5, 2023
Organizers
Gabriella Pasi (Full Professor), University of Milano-Bicocca, Milan, Italy
Rishabh Upadhyay (Research Fellow), University of Milano-Bicocca, Milan, Italy
Marco Viviani (Associate Professor), University of Milano-Bicocca, Milan, Italy
Program Committee
Lorraine Goeuriot, Université Grenoble Alpes, France
Sanda Harabagiu, The University of Texas at Dallas, USA
Liadh Kelly, Maynooth University, Ireland
Dongwon Lee, The Pennsylvania State University, USA
Yelena Mejova, ISI Foundation, Italy
Marinella Petrocchi, Institute of Informatics and Telematics (CNR), Italy
Michael Sirivianos, Cyprus University of Technology, Cyprus
Xingyi Song, University of Sheffield, UK
Hanna Suominen, Australian National University, Australia
Francesca Spazzano, Boise State University, USA
Angelo Spognardi, Sapienza University of Rome, Italy
Bei Yu, Syracuse University, USA
Arkaitz Zubiaga, Queen Mary University of London, UK
Apologies for cross-posting
Link: https://codalab.lisn.upsaclay.fr/competitions/11077
Shared Task on Homophobia/Transphobia Detection in social media comments
at LT-EDI 2023 - RANLP 2023
<https://sites.google.com/view/lt-edi-2023/home?authuser=0>
*Languages: English, Spanish, Hindi, Tamil, and Malayalam*.
Participants will be provided with comments extracted from social
media. Given a comment, a system must predict whether or not it contains
any form of homophobia/transphobia. The seed data for this task is the
Homophobia/Transphobia Detection dataset [1], a collection of comments from
social media. The comments are manually annotated to show whether the text
contains homophobia/transphobia.
The participants will be provided with development, training, and test
datasets in *English, Spanish, Hindi, Tamil, and Malayalam*. To download
the data and participate, go to CodaLab and click the "Participate" tab.
As far as we know, this is the first shared task on
Homophobia/Transphobia Detection.
*Important Dates for shared task:*
Task announcement: Feb 20, 2023
Release of Training data: Feb 28, 2023
Release of Test data: May 10, 2023
Run submission deadline: June 1, 2023
Results declared: June 10, 2023
Paper submission: 1 July 2023
Peer review notification: 5 August 2023
Camera-ready paper due: 20 August 2023
with regards,
Dr. Bharathi Raja Chakravarthi,
Assistant Professor / Lecturer-above-the-bar
School of Computer Science, University of Galway, Ireland
Insight SFI Research Centre for Data Analytics, Data Science Institute,
University of Galway, Ireland
E-mail: bharathiraja.akr(a)gmail.com , bharathi.raja(a)universityofgalway.ie
<bharathiraja.asokachakravarthi(a)universityofgalway.ie>
Google Scholar: https://scholar.google.com/citations?user=irCl028AAAAJ&hl=en
Website:
https://www.universityofgalway.ie/our-research/people/bharathirajaasokachak…
Special Issue on Language Technology for Safer Online Social Media
Platforms in Low-resource Eurasian Languages
<https://dl.acm.org/pb-assets/static_journal_pages/tallip/pdf/TALLIP-SI-Lang…>
Cyber-Social Issues Prediction in Low-Resource Languages with Deep Internet
of Things (DIoT)
<https://dl.acm.org/pb-assets/static_journal_pages/tallip/pdf/TALLIP-SI-Pred…>
Dear Colleagues,
The Institute of Modern Languages at the University of Zielona Góra announces a linguistics conference: "Contemporary Trends in English-Language Studies". This year's edition will be held entirely online on May 18-19, 2023.
SUBMISSION DEADLINE EXTENSION: March 31, 2023.
More information is available at:
https://sites.google.com/view/ctiels/
Thank you!
Leszek Szymański
Hello,
Bloomberg is happy to announce an exciting funding opportunity for Ph.D. students. The sixth edition of the Bloomberg Data Science Ph.D. Fellowship Program invites Ph.D. students working in broadly-construed data science to apply for fellowships.
Our fellowship program, launched in 2018, provides the opportunity for outstanding Ph.D. candidates to be funded for up to three years of their Ph.D. studies to work on their research proposal. The recipients will collaborate with and be supported by our Data Science community throughout this time and will complete 14-week summer internships with Bloomberg for the duration of their fellowships. The past year's recipients of the fellowship are presented here: https://www.bloomberg.com/company/stories/introducing-the-fifth-cohort-of-b…
Applications for the 2023-2024 academic year must be submitted by April 28, 2023. Fellowship recipients will be announced by June 16, 2023.
Full details about the fellowship and application process can be found at: https://www.bloomberg.com/company/values/tech-at-bloomberg/data-science/aca…
We would appreciate it if you can share this opportunity with interested parties.
Please direct all questions and future communications to rdml(a)bloomberg.net.
Bloomberg
Call for Participants
LongEval: Longitudinal Evaluation of Model Performance
CLEF 2023 Lab
18-21 September, Thessaloniki, Greece
--------------------------------------------------------
** Registration open, training data available **
The CLEF 2023 LongEval lab is motivated by recent research showing that
the performance of information retrieval and text classification models
drops as the test data becomes more distant in time from the training
data. LongEval differs from traditional IR and classification shared
tasks with special considerations on evaluating models that mitigate
performance drop over time. It encourages participants to develop
temporal information retrieval systems and longitudinal text classifiers
that survive through dynamic temporal text changes, introducing time as
a new dimension for ranking model performance.
The lab consists of two tasks:
* Task 1. LongEval-Retrieval: The goal of Task 1 is to propose an
information retrieval system which can handle changes over time. The
proposed retrieval system should follow the temporal evolution
of Web documents. Contact: longeval-ir-task(a)univ-grenoble-alpes.fr
* Task 2. LongEval-Classification: The goal of Task 2 is to propose a
temporal persistence classifier which can mitigate performance drop over
short and long periods of time compared to a test set from the same time
frame as training. Contact: r.a.a.alkhalifa(a)qmul.ac.uk
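To make the notion of "performance drop over time" concrete, here is a minimal sketch of one way it could be quantified: the relative change in a score between a test set from the training time frame and a later test set. The formula and the numbers are assumptions for illustration only; the lab's official measures are defined by the organizers.

```python
def relative_performance_drop(within_score, later_score):
    """Relative change of a later score with respect to the within-period score.

    Negative values mean the model performs worse on the later test set.
    This formula is an illustrative assumption, not the lab's official metric.
    """
    if within_score <= 0:
        raise ValueError("within-period score must be positive")
    return (later_score - within_score) / within_score


# Hypothetical scores (e.g. macro-F1) for one classifier on three test sets.
within_period = 0.80   # test data from the same time frame as training
short_term = 0.72      # test data shortly after the training period
long_term = 0.60       # test data long after the training period

short_drop = relative_performance_drop(within_period, short_term)
long_drop = relative_performance_drop(within_period, long_term)
```

A system that is more persistent over time would show values closer to zero for both the short-term and long-term comparisons.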
Registration
------------
Please use the CLEF registration form
(http://clef2023-labs-registration.dei.unipd.it/) to register for the
tasks. The registration closes on Friday, 28 April 2023.
Timeline
--------
April 2023: Test data release
28 April 2023: Registration closes
10 May 2023: End of Evaluation Cycle [submission of runs]
5 June 2023: Submission of Participant Papers [CEUR-WS]
5–23 June 2023: Review process of participant papers
July 2023: Camera ready paper submission
September 2023: CLEF Conference
Organisers
----------
Rabab Alkhalifa, Iman Bilal, Hsuvas Borkakoty, Jose Camacho-Collados,
Romain Deveaud, Alaa El-Ebshihy, Luis Espinosa-Anke, Gabriela
Gonzalez-Saez, Petra Galuščáková, Lorraine Goeuriot, Elena Kochkina,
Maria Liakata, Daniel Loureiro, Harish Tayyar Madabushi, Philippe
Mulhem, Florina Piroi, Martin Popel, Christophe Servan, Arkaitz Zubiaga.
[Apologies for multiple postings]
ImageCLEFaware (3rd edition)
Registration: https://www.imageclef.org/2023/aware
Run submission: May 10, 2023
Working notes submission: June 5, 2023
CLEF 2023 conference: September 18-21, Thessaloniki, Greece
*** CALL FOR PARTICIPATION ***
Images constitute a large part of the content shared on social
networks. Their disclosure is often related to a particular context
and users are often unaware of the fact that, depending on their
privacy status, images can be accessible to third parties and be used
for purposes which were initially unforeseen. For instance, it is
common practice for employers to search information about their future
employees online. Another example of usage is that of automatic credit
scoring based on online data. Most existing approaches which propose
feedback about shared data focus on inferring user characteristics and
their practical utility is rather limited.
We hypothesize that user feedback would be more efficient if conveyed
through the real-life effects of data sharing.
The objective of the task is to automatically score user photographic
profiles in a series of situations with strong impact on her/his life.
Four such situations were modeled this year and refer to searching
for: (i) a bank loan, (ii) an accommodation, (iii) a job as
waitress/waiter, and (iv) a job in IT. The inclusion of several
situations is interesting in order to make it clear to the end-users
of the system that the same image will be interpreted differently
depending on the context.
The final objective of the task is to encourage the development of
efficient user feedback, such as the YDSYO Android app
https://ydsyo.app/.
*** TASK ***
Given an annotated training dataset, participants will propose machine
learning techniques which provide a ranking of test user profiles in
each situation which is as close as possible to a human ranking of the
test profiles.
*** DATA SET ***
This is the third edition of the task. A data set of more than 1,000
user profiles with 100 photos per profile was created and annotated
with an appeal score for a series of real-life situations via
crowdsourcing. Participants in the experiment were asked to provide a
global rating of each profile in each situation modeled using a
7-point Likert scale ranging from strongly unappealing to strongly
appealing. An averaged and normalized appeal score will be used to
create a ground truth composed of ranked users in each modeled
situation. User profiles are created by repurposing a subset of the
YFCC100M dataset.
*** METRICS ***
Participants in the task will provide an automatic ranking of user
profiles for each situation, which will be compared to a ground-truth
ranking obtained by crowdsourcing. The correlation between the two
ranked lists will be measured using Pearson's correlation coefficient.
The final score of each participating team will be obtained by
averaging correlations obtained for individual situations.
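The scoring described above can be sketched as follows, assuming each profile is represented by a numeric appeal score and that the four situations carry equal weight. The situation keys and the data layout are hypothetical; this is not the official evaluation script.

```python
from math import sqrt
from statistics import mean

# Hypothetical keys for the four modeled situations.
SITUATIONS = ("bank_loan", "accommodation", "waiter_job", "it_job")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def team_score(predicted, ground_truth):
    """Average the per-situation correlations into one final team score.

    Both arguments map a situation key to a list of per-profile scores,
    with profiles listed in the same order in both mappings.
    """
    return mean(pearson(predicted[s], ground_truth[s]) for s in SITUATIONS)
```

For example, a run that matches the ground truth in three situations and inverts it in the fourth would obtain (1 + 1 + 1 - 1) / 4 = 0.5.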
*** IMPORTANT DATES ***
- Run submission: May 10, 2023
- Working notes submission: June 5, 2023
- CLEF 2023 conference: September 18-21, Thessaloniki, Greece
(https://clef2023.clef-initiative.eu/)
*** OVERALL COORDINATION ***
Jérôme Deshayes-Chossart, CEA LIST, France
Adrian Popescu, CEA LIST, France
Bogdan Ionescu, Politehnica University of Bucharest, Romania
*** ACKNOWLEDGEMENT ***
The task is supported under the H2020 AI4Media "A European
Excellence Centre for Media, Society and Democracy" project,
contract #951911 https://www.ai4media.eu/.
On behalf of the Organizers,
Bogdan Ionescu
https://www.AIMultimediaLab.ro/
4th Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech2023)
================================================================================
The PatentSemTech2023 workshop aims to establish a long-term collaboration and a
two-way communication channel between the IP industry and academia from relevant
fields such as natural language processing (NLP), text and data mining (TDM), and
semantic technologies (ST) in order to explore and transfer new knowledge, methods,
and technologies for the benefit of industrial applications as well as support
research in applied sciences for the IP and neighboring domains.
Call for Contributions
======================
PatentSemTech2023 will be held as a full-day event in conjunction with SIGIR 2023.
Workshop website: http://ifs.tuwien.ac.at/patentsemtech/
Important Dates
===============
Submission deadline: April 25, 2023
Notification: May 23, 2023
SIGIR PatentSemTech2023 workshop: July 27, 2023
Topics of Interest
==================
We encourage submissions of high quality research papers on all topics related
to the IP domain. Topics of interest include (but are not limited to):
* Text mining and retrieval from patents, legal documents, or other
scientific-technical information sources
* Machine learning methods, in particular deep learning methods for
- Representation learning (word and document embeddings)
- Query expansion
- Clustering and classification
- Recommendation
- IPC/CPC prediction
- Trend detection
- Entity extraction
* Semantic approaches for
- Linking semantic information
- Integrating external knowledge sources
- Semantic enrichment
* Methods and applications for retrieving, mining, and analysing, including
- Patent landscaping
- Hot spot / White spot analysis
- Multi-modal analysis
- Technology trend analysis
- Innovative user interfaces
- Visual user interface concepts
Contributions
=============
We solicit two types of submissions: full papers and short papers for three tracks: research, demo, and summarization task. Full papers will be limited to 8 pages (including references); short papers will be 4 pages (including references).
The submissions will be peer-reviewed by at least two program committee members and evaluated based on innovativeness, novelty, interestingness, and impact.
We plan for three tracks:
*Research Track (full or short papers)*
For this track, we solicit contributions from academia that present
* Novel applications of existing state of the art methods for the IP domain
* Novel methods or tasks in the IP domain
* Novel user interfaces for the IP domain
* Novel evaluation or analysis insights in the IP domain
* Novel benchmark datasets or other resources of interest
* A survey or overview related to a particular task in the IP domain
*Demo/System Track (short papers)*
We solicit demos, case studies, insights, or novel ideas from industry that present
* Focused case studies making use of semantic technologies or machine learning
* Interesting IP-related task descriptions or best practices for patent analysis
* In-use systems or prototype implementations of semantic technologies
* Demos on processing or analysing data from the IP domain, or user interfaces
* In-use resources related to patents or external resources, e.g., linked open data.
*Summarization Task Track (short papers)*
Within the patent text mining community, especially from the industry, there is an interest in developing text mining tools targeting text summarization.
* Participants are free to use publicly available data sets to train their models. We recommend exploring US Patents, many of which contain the text section SUMMARY OF THE INVENTION.
* We will also publish a small training and test data set on the 23rd of February. The provided data set is composed of patents within the field of Green Plastics Technology.
* Participants are asked to submit a short (4 pages) scientific paper, which will be peer-reviewed by the workshop organizers. The most interesting submissions will be invited to present their solution at the workshop.
* Furthermore, we will have an additional interactive evaluation to reflect a more real-life scenario at the workshop, making it possible to evaluate not only the performance in terms of F1, ROUGE, recall, precision etc., but also efficiency. Therefore the invited participants will be asked to set up their solutions as a service and provide a REST API. Input will be a patent document (PDF, DOCX), and output should be a summary of not more than 700 words.
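For teams preparing for the interactive evaluation, the following is a minimal sketch of what such a service could look like using only the Python standard library. The request and response layout, the handler name, and the `summarize` placeholder are all assumptions made for illustration; the organizers have not specified an interface here, and real PDF/DOCX parsing and the summarization model itself are out of scope.

```python
import json
from http.server import BaseHTTPRequestHandler

MAX_WORDS = 700  # word limit stated in the call


def limit_words(text, max_words=MAX_WORDS):
    """Keep at most max_words whitespace-separated words (whitespace is normalized)."""
    return " ".join(text.split()[:max_words])


def summarize(document):
    """Hypothetical placeholder: a real service would parse the PDF/DOCX bytes
    and run a summarization model here."""
    return "Summary placeholder for a %d-byte document." % len(document)


class SummaryHandler(BaseHTTPRequestHandler):
    """Accepts a document as the raw POST body and returns a JSON summary."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        document = self.rfile.read(length)
        payload = {"summary": limit_words(summarize(document))}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To serve locally (assumed host and port):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), SummaryHandler).serve_forever()
```

Enforcing the word limit in a separate function keeps the 700-word constraint independent of whichever summarization model is plugged in.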
Submission Guidelines
=====================
Submissions must be in English, in PDF, and in the current ACM two-column conference
format. Suitable LaTeX, Word, and Overleaf templates are available from the ACM Website:
https://www.acm.org/publications/proceedings-template ("sigconf" template for LaTeX;
Interim Template for Word). Submissions should be at most 8 (full) or 4 (short) pages (including figures and references) in length. Submissions should be submitted electronically via EasyChair:
https://www.easychair.org/conferences/?conf=patentsemtech2023.
At least one author of each accepted paper is required to register for,
and present the work in person at the workshop.
Publication
===========
Accepted papers will be published as CEUR proceedings. Selected contributions
will be invited to submit extended, full papers to Elsevier's World Patent
Information (WPI) journal: https://www.journals.elsevier.com/world-patent-information/
Organizers
==========
Ralf Krestel (ZBW & CAU Kiel, Germany), Hidir Aras (FIZ Karlsruhe, Germany),
Linda Andersson (Artificial Researcher, Austria), Florina Piroi (Data Science Studio,
RSA FG, Austria), Allan Hanbury (TU Wien, Austria), Dean Alderucci (CMU, USA)
All questions about submissions should be emailed to:
r.krestel(a)zbw.eu and hidir.aras(a)fiz-karlsruhe.de
Dear CorporaList members,
we are aiming to compile a list of initiatives/funded projects working on creating Large Language Models, similar to this one (https://leam.ai/).
If you know of such projects, we would be highly grateful if you could either reply to this list or contact us personally.
Thank you very much.
Best regards
Dr. Annemarie Friedrich
https://annefried.github.io