Starting from May 2023, the Data & Knowledge Engineering group at
Heinrich-Heine-University (HHU, Düsseldorf), affiliated with Knowledge
Technologies for Social Sciences (KTS, https://www.gesis.org/en/kts) at
GESIS (Cologne) and the Computational Linguistics department at HHU
(https://www.ling.hhu.de/bereiche-des-institutes/abteilung-fuer-computerling…)
are looking for a
*PhD student – Information Extraction & Natural Language Processing*
(Salary group 13 TV-L, working time 75%-100%, initially limited to 36
months with the possibility of further extension)
In the context of the research project "NewOrder", we are investigating
scientific online discourse in news & social media, in an
interdisciplinary consortium involving researchers from Computer
Science, Psychology, Political and Communication Science. Our research
will be concerned with novel Natural Language Processing (NLP) methods
for the analysis of scientific online discourse (e.g. on Twitter)
addressing challenges arising from its informal nature and
heterogeneity. For instance, references to scientific works (e.g.
publications, studies, datasets), scientists or scientific organisations
are often provided in informal and ambiguous ways. Another challenge is
the dynamically evolving vocabulary, which complicates the reuse and
adaptation of both pretrained language models and NLP models
fine-tuned for specific downstream tasks. Hence, detecting and
disambiguating informal science discourse and associated claims remains
a challenging problem.
Your tasks will be:
*******************
* Research in fields such as NLP, Machine Learning, Language Modeling
and Representation learning, specifically with the aim to extract
structured information from online discourse data
* Development of NLP methods for (i) the detection, disambiguation and
classification of sources of science-related information on social
media, (ii) assessing the quality and credibility of sources and claims
and (iii) investigating implicit language cues for cognitive states and
source characteristics/traits
* Writing, publishing and presenting project results
* Collaboration with team members and project partners in an
interdisciplinary consortium
Your profile:
**************
* University degree (diploma/MSc) in Computer Science, Computational
Linguistics or related fields
* Research interests in NLP, machine learning, data mining, large
language models
* Hands-on experience with Python and handling big datasets, ideally
experience with Big Data Frameworks (e.g. Spark/Hadoop)
* Knowledge of ML-Frameworks such as TensorFlow and PyTorch
* Ability to communicate fluently in English mandatory, basic knowledge
of the German language desirable
What we offer:
***************
* Flexible working hours and home office arrangements
* A fast-growing and international working environment with a lot of
creative scientific freedom
* Access to unique research data, (social) web archives and behavioral data
* Support of collaborations with international research labs and experts
through an extensive international exchange programme
The PhD research will be supervised by Prof. Dr. Stefan Dietze
(Scientific Director of KTS at GESIS and Professor for Data & Knowledge
Engineering at HHU) & Prof. Dr. Laura Kallmeyer (Chair of Computational
Linguistics department at HHU).
For further information please contact Stefan Dietze
(stefan.dietze(a)hhu.de) and/or Laura Kallmeyer
(kallmeyer(a)phil.uni-duesseldorf.de).
Interested?
*************
Please apply by sending your complete application documents as a single
PDF file to kallmeyer(a)phil.hhu.de by 20 January 2023.
--
Prof. Dr. Laura Kallmeyer
Institut für Linguistik
Heinrich-Heine Universität Duesseldorf
Universitaetsstr. 1
D-40225 Duesseldorf, Germany
https://user.phil.hhu.de/kallmeyer/
Phone +49 (0)211 8113899
CALL FOR PAPERS
ACM OSNeHM 2023
First International workshop on
Online Social Networks in the Human-centric Metaverse
co-located with
The Web Conference 2023
Austin, Texas, USA
APRIL 30 - MAY 4, 2023
https://osnehm.iit.cnr.it/
____________________________________________________________________
SCOPE AND OVERVIEW
__________________
The cyber and physical worlds are increasingly becoming
indistinguishable. This is fostered by enabling technologies such as IoT
and pervasive networks, advanced data management and analytics
techniques, and advanced platforms with massive diffusion, chiefly among
them Online Social Networks. Whatever we do in one world has immediate
consequences on the other world, thanks to a constant flow of data - and
online analytics - between the two worlds.
In this context, the vision of the Metaverse provides additional
perspectives, augmenting human interactions with things and other humans
across the two worlds. The role of the humans in this socio-technical
complex system is key, and still largely unexplored. Quite
interestingly, while new tools characteristic of the cyber-physical
world - OSN among them - have been designed to largely extend human
capabilities, the real interplay between these tools and human
behaviours and cognitive constraints often leads to unexpected results.
Therefore, while humans are in principle at the center of the
cyber-physical convergence and, thus -- in the perspective -- of the
Metaverse, the interplay between both worlds and the technical solutions
underpinning this convergence are hitherto largely unexplored and yet to
be understood. This is a big gap our community should fill, in order to
develop cyber-physical worlds (and the Metaverse) as a truly
human-centric environment.
OSNeHM’s main theme will be the role of Online Social Networks in such a
human-centric cyber-physical convergence leading to the Metaverse. It
will provide a forum for discussion on early yet principled approaches
and results on all aspects related to this theme. A special emphasis
will be devoted to the characterisation of the individual and social
behaviour of humans, using OSN as “big data microscopes” for collecting
and analysing big data via robust big data analytics. Papers discussing
solutions focusing on the interplay between social and technical (online
and offline) worlds will be highly welcome. On the other hand, the
workshop will welcome papers proposing novel technical solutions to
support human-centric approaches to the evolution of OSN in the
perspective of the Metaverse.
Specific topics of interest include, but are not limited to, the following:
* OSNeHM platforms, protocols and applications;
* OSN and Metaverse services & applications;
* Decentralised, mobile and location-based OSNeHM;
* Trust, reputation, privacy and security in OSNeHM;
* Dynamics of trends, information and opinion diffusion in OSNeHM;
* Fake news, toxicity, radicalization and disinformation in OSNeHM;
* Detecting, modeling and tackling online harms in OSNeHM;
* Recommendations and advertising in OSNeHM;
* Measurement, analysis and modeling of popular OSN (Facebook, Twitter,
Instagram, Flickr, etc.),
including decentralized ones (e.g., Mastodon, Pleroma, ...);
* Data mining, and machine learning in OSNeHM systems;
* Social media analysis and social analytics in the perspective of OSNeHM;
* Information extraction and search in OSNeHM;
* Complex-network analysis of OSNeHM;
* Modeling of social behavior through OSN data;
* Crowdsourcing and OSNeHM;
* Multidisciplinary applications of OSNeHM (economics, medicine,
society, politics,
homeland security, psychology, etc.)
PAPER FORMAT AND SUBMISSION INSTRUCTIONS
________________________________________
Papers that have been previously published or are under review for
another journal, conference or workshop will not be considered for
publication. Submitted papers should not exceed 12 pages in length
(maximum 8 pages for the main paper content + maximum 2 pages for
appendixes + maximum 2 pages for references). Papers must be submitted
in PDF format according to the ACM template published in the ACM
guidelines, selecting the generic “sigconf” sample. The PDF files must
have all non-standard fonts embedded. Workshop papers must be
self-contained and in English.
Submissions that do not follow these guidelines may be rejected without
review.
Further, at least one author of each accepted workshop paper has to
register for the main conference. Workshop attendance is only granted
for registered participants. Accepted papers will be included in the
workshop proceedings, which will be published as companion proceedings
of The Web Conference, and indexed according to the main conference policy.
Please follow the submission link at:
https://easychair.org/conferences/submissions?a=29997356 and select the
full name of the workshop in the submission list.
AWARDS AND EDITORIAL FOLLOW-UPS
_______________________________
We will consider assigning a best paper award.
We will organise a special issue on the Elsevier Online Social Networks
and Media (OSNEM) Journal
https://www.journals.elsevier.com/online-social-networks-and-media/
soliciting submissions of extended versions of particularly promising
papers.
OSNEM is a recent yet very well-reputed (Q1 SJR) journal covering, among
others, 100% of the workshop topics.
IMPORTANT DATES (all deadlines are AoE)
_______________
23rd January 2023: Abstract submission deadline
6th February 2023: Workshop paper submission deadline
6th March 2023: Workshop paper (acceptance) notification
20th March 2023: Workshop papers camera-ready deadline
31st March 2023: Final program (with duration) provided to Workshop
Track leads
1st or 2nd May 2023: Workshops at WWW2023
ORGANISING COMMITTEE
____________________
Workshop chairs:
* Marco Conti, IIT-CNR, Italy
* Andrea Passarella, IIT-CNR, Italy
* Jussara M. Almeida, Universidade Federal de Minas Gerais, Brazil
* Arkaitz Zubiaga, Queen Mary University of London, UK
Technical Program Committee (TBC)
For more information, please write to the workshop co-chairs at osnehm23
<at> iit <dot> cnr <dot> it
We invite you to participate in the SemEval 2023 shared task on clickbait spoiling.
Clickbait spoiling means generating or extracting a short message for a clickbait post that spoils the clickbait by filling its curiosity gap.
Learn more at https://clickbait.webis.de/
-------------------------------------------------------------------------------
Important Dates
-------------------------------------------------------------------------------
Now open: Registration
January 10, 2023: Submission deadline
February 2023: Participant paper submission
March 2023: Peer review notification
April 2023: Camera-ready participant papers submission
Summer 2023: SemEval workshop (co-located with a major NLP conference)
Best regards,
PAN team
Don't miss the deadline (December 23rd, 2022) to submit your proposal for
the Research Award! Propose YOUR solution for interpreting old databases!
Dinosaur databases are running the world! As relics of the early steps of
the information era, these databases are still the basis of many economic
transactions. Despite their age, it seems extremely difficult to replace
them with novel and faster solutions. These databases were written in a
wonderful era in which memory was a problem. Hence, variable, table, and
column names were short and cryptic. Moreover, documents describing these
names are buried in forgotten places, if they still exist. The challenge,
then, is making sense of these dinosaur databases to help software engineers
produce a new version.
Ready to apply? Azimut is looking forward to your submission!
Deadline: December 23rd, 2022
More info at:
https://www.azimut.it/it/az-venture-tech-challenge
Azimut (https://www.azimut.it), a leading wealth management company in
Europe, offers a Research Award to whoever can propose a solution to
interpret old databases.
*** Apologies for cross-posting ***
++ CALL FOR PAPERS ++
****************************************************************************
Sixth International Workshop on Narrative Extraction from Texts (Text2Story'23)
Held in conjunction with the 45th European Conference on Information Retrieval (ECIR'23)
April 2nd, 2023 - Dublin, Ireland
Website: https://text2story23.inesctec.pt
****************************************************************************
++ Important Dates ++
- Submission deadline: January 23rd, 2023
- Acceptance Notification Date: March 3rd, 2023
- Camera-ready copies: March 17th, 2023
- Workshop: April 2nd, 2023
++ Overview ++
Recent years have brought a stream of continuously evolving information, making it unmanageable and time-consuming for an interested reader to track, process, and keep up with all the essential information and the various aspects of a story. Automated narrative extraction from text offers a compelling approach to this problem. It involves identifying the subset of interconnected raw documents, extracting the critical narrative story elements, and representing them in an adequate final form (e.g., timelines) that conveys the key points of the story in an easy-to-understand format. Although information extraction and natural language processing have made significant progress towards an automatic interpretation of texts, the problem of automated identification and analysis of the different elements of a narrative present in a document (set) still presents significant unsolved challenges.
++ List of Topics ++
In the sixth edition of the Text2Story workshop, we aim to bring to the forefront the challenges involved in understanding the structure of narratives and in incorporating their representation in well-established models, as well as in modern architectures (e.g., transformers) which are now common and form the backbone of almost every IR and NLP application. It is hoped that the workshop will provide a common forum to consolidate the multi-disciplinary efforts and foster discussions to identify the wide-ranging issues related to the narrative extraction task. In this regard, we encourage high-quality, original submissions covering the following topics:
* Narrative Representation Models
* Story Evolution and Shift Detection
* Temporal Relation Identification
* Temporal Reasoning and Ordering of Events
* Causal Relation Extraction and Arrangement
* Narrative Summarization
* Multi-modal Summarization
* Automatic Timeline Generation
* Storyline Visualization
* Comprehension of Generated Narratives and Timelines
* Big Data Applied to Narrative Extraction
* Personalization and Recommendation of Narratives
* User Profiling and User Behavior Modeling
* Sentiment and Opinion Detection in Texts
* Argumentation Analysis
* Bias Detection and Removal in Generated Stories
* Ethical and Fair Narrative Generation
* Misinformation and Fact Checking
* Bots Influence
* Narrative-focused Search in Text Collections
* Event and Entity Importance Estimation in Narratives
* Multilinguality: Multilingual and Cross-lingual Narrative Analysis
* Evaluation Methodologies for Narrative Extraction
* Resources and Dataset Showcase
* Dataset Annotation for Narrative Generation/Analysis
* Applications in Social Media (e.g. narrative generation during a natural disaster)
* Language Models and Transfer Learning in Narrative Analysis
* Narrative Analysis in Low-resource Languages
++ Dataset ++
We challenge the interested researchers to consider submitting a paper that makes use of the tls-covid19 dataset (published at ECIR'21) under the scope and purposes of the text2story workshop. tls-covid19 consists of a number of curated topics related to the Covid-19 outbreak, with associated news articles from Portuguese and English news outlets and their respective reference timelines as gold-standard. While it was designed to support timeline summarization research tasks it can also be used for other tasks including the study of news coverage about the COVID-19 pandemic. A script to reconstruct and expand the dataset is available at https://github.com/LIAAD/tls-covid19. The article itself is available at this link: https://link.springer.com/chapter/10.1007/978-3-030-72113-8_33
++ Submission Guidelines ++
We invite two kinds of submissions:
* Full papers (up to 7 pages + references): Original and high-quality unpublished contributions on the theory and practical aspects of the narrative extraction task. Full papers should introduce existing approaches and describe the methodology and the experiments conducted in detail. Negative-result papers highlighting tested hypotheses that did not yield the expected outcome are also welcome.
* Work in progress, demos and dissemination papers (up to 4 pages + references): unpublished short papers describing work in progress; demo and resource papers presenting research/industrial prototypes, datasets or software packages; position papers introducing a new point of view, a research vision or a reasoned opinion on the workshop topics; and dissemination papers describing project ideas, ongoing research lines, case studies or summarized versions of previously published papers in high-quality conferences/journals that are worth sharing with the Text2Story community, but where novelty is not a fundamental issue.
Submissions will be peer-reviewed by at least two members of the programme committee. The accepted papers will appear in the proceedings published at CEUR workshop proceedings (indexed in Scopus and DBLP) as long as they don't conflict with previous publication rights.
++ Workshop Format ++
Participants of accepted papers will be given 15 minutes for oral presentations.
++ Organizing committee ++
Ricardo Campos (INESC TEC; Ci2 - Smart Cities Research Center, Polytechnic Institute of Tomar, Tomar, Portugal)
Alípio M. Jorge (INESC TEC; University of Porto, Portugal)
Adam Jatowt (University of Innsbruck, Austria)
Sumit Bhatia (Media and Data Science Research Lab, Adobe)
Marina Litvak (Shamoon Academic College of Engineering, Israel)
++ Proceedings Chair ++
João Paulo Cordeiro (INESC TEC & Universidade da Beira do Interior)
Conceição Rocha (INESC TEC)
++ Web and Dissemination Chair ++
Hugo Sousa (INESC TEC & University of Porto)
Behrooz Mansouri (Rochester Institute of Technology)
++ Program Committee ++
Álvaro Figueira (INESC TEC & University of Porto)
Andreas Spitz (University of Konstanz)
Antoine Doucet (Université de La Rochelle)
António Horta Branco (University of Lisbon)
Arian Pasquali (CitizenLab)
Bart Gajderowicz (University of Toronto)
Begoña Altuna (Universidad del País Vasco)
Brenda Santana (Federal University of Rio Grande do Sul)
Bruno Martins (IST & INESC-ID, University of Lisbon)
Daniel Loureiro (Cardiff University)
Dennis Aumiller (Heidelberg University)
Dhruv Gupta (Norwegian University of Science and Technology)
Dyaa Albakour (Signal UK)
Evelin Amorim (INESC TEC)
Henrique Cardoso (INESC TEC & University of Porto)
Ismail Altingovde (Middle East Technical University)
João Paulo Cordeiro (INESC TEC & University of Beira Interior)
Kiran Bandeli (Walmart Inc.)
Luca Cagliero (Politecnico di Torino)
Ludovic Moncla (INSA Lyon)
Marc Finlayson (Florida International University)
Marc Spaniol (Université de Caen Normandie)
Moreno La Quatra (Politecnico di Torino)
Nuno Guimarães (INESC TEC & University of Porto)
Pablo Gamallo (University of Santiago de Compostela)
Pablo Gervás (Universidad Complutense de Madrid)
Paulo Quaresma (Universidade de Évora)
Paul Rayson (Lancaster University)
Raghav Jain (Indian Institute of Technology, Patna)
Ross Purves (University of Zurich)
Satya Almasian (Heidelberg University)
Sérgio Nunes (INESC TEC & University of Porto)
Simra Shahid (Adobe's Media and Data Science Research Lab)
Sriharsh Bhyravajjula (University of Washington)
Udo Kruschwitz (University of Regensburg)
Veysel Kocaman (John Snow Labs & Leiden University)
++ Contacts ++
Website: https://text2story23.inesctec.pt
For general inquiries regarding the workshop, reach the organizers at: text2story2023(a)easychair.org
Dear all,
We are happy to release six corpora (1.3 million tokens) with full morphological annotations for the Palestinian, Lebanese, Yemeni, Iraqi, Libyan, and Sudanese dialects. All are annotated using the LDC’s SAMA tagsets.
Search: https://portal.sina.birzeit.edu/curras
Download: https://portal.sina.birzeit.edu/curras/about-en.html
This video demonstrates how to search the corpora in Arabic/English.
https://twitter.com/mjarrar/status/1604078695068598273
#arabic_language_day We are very happy to release 6 Arabic dialects corpora (1.3 million tokens, morphologically annotated): Curras (Palestinian), Baladi (Lebanese), Lisani (Yemeni, Iraqi, Libyan, Sudanese) by @UN, @BirzeitU and @AUB_Lebanon. https://t.co/ZP3hqVSRWc
Best
--Mustafa
__________________________
Mustafa Jarrar, PhD
Professor of Artificial Intelligence
Chair, PhD Program in Computer Science
Birzeit University, Palestine
Whatsapp:+972599662258 | mjarrar(a)birzeit.edu
http://www.jarrar.info
Dear all --
In celebration of Arabic Language Day (Dec 18), we are happy to announce
the first release of Maknuune, the Open Source Palestinian Arabic Lexicon.
www.palestine-lexicon.org
Maknuune has over 36K entries from 17K lemmas, and 3.7K roots. All entries
include diacritized Arabic orthography, phonological transcription and
English glosses. Some entries are enriched with additional information such
as broken plurals and templatic feminine forms, associated phrases and
collocations, Standard Arabic glosses, and examples or notes on grammar,
usage, or location of collected entry.
We are honored to have received comments of endorsement from Profs. Noam
Chomsky, Hamid Dabashi, Abdelkader Fassi Fehri, Clive Holes, Ilan Pappe,
and Dr. Walid Saif.
https://sites.google.com/nyu.edu/palestine-lexicon/endorsements
--
Nizar Habash
Professor of Computer Science
New York University Abu Dhabi
A fully funded 4-year PhD position at the intersection of NLP and Topology
is offered at Queen Mary University of London (QMUL), School of Electronic
Engineering and Computer Science. It is part of the collaboration scheme
between QMUL and the China Scholarship Council (CSC), and is therefore
available for Chinese candidates only.
The CSC scheme provides full tuition fee waiver and living stipend for 4
years, and requires (among other things) an English Language test (IELTS)
from the last 2 years. You can read more about the scheme's requirements
here:
<https://www.qmul.ac.uk/scholarships/items/china-scholarship-council-scholar…>
I am looking for brilliant candidates who hold (or are about to hold) an MSc in
Computer Science with a strong NLP research background. Prospective
students can learn more about the project here
<http://eecs.qmul.ac.uk/phd/phd-studentships/csc-phd-studentships-in-electro…>,
under the section: *Understanding neural representations via their
algebraic-topological structures*. The PhD student will work in an
interdisciplinary environment, and will be at the forefront of NLP
research.
If you are interested, please get in touch with me at:
h.dubossarsky(a)qmul.ac.uk.
Bests,
Haim