In this newsletter:
LDC at Interspeech 2023
LDC releases speech activity detector
Fall 2023 LDC Data Scholarship Program
New publications:
2019 OpenSAT Public Safety Communications Simulation<https://catalog.ldc.upenn.edu/LDC2023S06>
Samrómur Queries Icelandic Speech 1.0<https://catalog.ldc.upenn.edu/LDC2023S05>
________________________________
LDC at Interspeech 2023
LDC is happy to be back in person as an exhibitor and longtime supporter of Interspeech, taking place this year August 20-24 in Dublin, Ireland. Stop by Stand A2 to say hello and learn about the latest developments at the Consortium. LDC is also delighted to once again be a silver sponsor for the Young Female Researchers in Speech Workshop<https://sites.google.com/view/yfrsw-2023> and to provide data in support of the CHiME-7 challenge<https://www.chimechallenge.org/current/workshop/index> satellite workshop and the MERLIon CCS Challenge<https://sites.google.com/view/merlion-ccs-challenge>.
LDC will post conference updates via our social media platforms. We look forward to seeing you in Dublin!
LDC releases speech activity detector
LDC announces the release of the LDC Broad Phonetic Class Speech Activity Detector. Based on the broad phonetic class recognizer implemented in the HTK Speech Recognition Toolkit<https://htk.eng.cam.ac.uk/>, LDC's speech activity detector runs the speech signal through a GMM-HMM recognizer to identify five broad phonetic classes: vowel, stop/affricate, fricative, nasal, and glide/liquid. The LDC Broad Phonetic Class Speech Activity Detector is available at no cost on GitHub<https://github.com/Linguistic-Data-Consortium/ldc-bpcsad> under a GPL v3 license<https://www.gnu.org/licenses/gpl-3.0.en.html>.
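As a rough illustration of the idea (this is not the LDC implementation, whose decoding is done by an HTK-trained GMM-HMM; the class names, frame duration, and function below are assumptions for the sketch), the final step of such a detector can be thought of as collapsing per-frame broad-class labels into speech segments:

```python
# Toy sketch only: assume we already have one broad-phonetic-class label
# per 10 ms frame, and merge contiguous speech-class frames into
# (onset, offset) speech segments, dropping very short segments.

SPEECH_CLASSES = {"vowel", "stop/affricate", "fricative", "nasal", "glide/liquid"}

def frames_to_segments(labels, frame_dur=0.01, min_dur=0.05):
    """Merge runs of speech-class frames into (onset, offset) pairs in
    seconds, discarding segments shorter than min_dur."""
    segments, start = [], None
    for i, lab in enumerate(labels):
        if lab in SPEECH_CLASSES:
            if start is None:
                start = i  # a speech run begins at this frame
        elif start is not None:
            segments.append((start * frame_dur, i * frame_dur))
            start = None
    if start is not None:  # run extends to the end of the recording
        segments.append((start * frame_dur, len(labels) * frame_dur))
    return [(on, off) for (on, off) in segments if off - on >= min_dur]

# 3 non-speech frames, 10 vowel frames, 4 non-speech, then a 3-frame blip
labels = ["nonspeech"] * 3 + ["vowel"] * 10 + ["nonspeech"] * 4 + ["fricative"] * 3
print(frames_to_segments(labels))  # the 3-frame blip is below min_dur and dropped
```

In the actual tool the segmentation falls out of the HMM decoding itself; the sketch just shows why frame-level phonetic class decisions yield speech/non-speech segments essentially for free.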
Fall 2023 LDC Data Scholarship Program
Student applications for the Fall 2023 LDC Data Scholarship program are being accepted now through September 15, 2023. This program provides eligible students with no-cost access to LDC data. Students must complete an application consisting of a data use proposal and a letter of support from their advisor. For application requirements and program rules, visit the LDC Data Scholarships page<https://www.ldc.upenn.edu/language-resources/data/data-scholarships>.
________________________________
New publications:
2019 OpenSAT Public Safety Communications Simulation<https://catalog.ldc.upenn.edu/LDC2023S06> contains 141 hours of English speech recordings and transcripts used in the NIST Open Speech Analytic Technologies (OpenSAT<https://www.nist.gov/itl/iad/mig/opensat>) 2019 evaluation's automatic speech recognition, speech activity detection, and keyword search tasks. The data is part of the SAFE-T (Speech Analysis For Emergency Response Technology) corpus, created by LDC, which comprises speakers engaged in a collaborative problem-solving activity representative of public safety communications in terms of speech content, noise types, and noise levels.
US English speakers played the board game Flash Point: Fire Rescue. Background noise was played through a participant's headset during the recording session. Recording sessions consisted of two 30-minute games. The corpus is divided into training, development, and evaluation data.
2023 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
Samrómur Queries Icelandic Speech 1.0<https://catalog.ldc.upenn.edu/LDC2023S05> was developed by the Language and Voice Lab, Reykjavik University<https://lvl.ru.is/> in cooperation with Almannarómur, Center for Language Technology<https://almannaromur.is/>. The corpus contains 20 hours of Icelandic prompted queries from 3,809 speakers representing 17,475 utterances.
Speech data was collected between October 2019 and December 2021 using the Samrómur website<https://samromur.is> which displayed prompts to participants. The prompts were mainly from The Icelandic Gigaword Corpus<http://clarin.is/en/resources/gigaword>, which includes text from novels, news, plays, and from a list of location names in Iceland. Additional prompts were taken from the Icelandic Web of Science<https://www.visindavefur.is/> and others were created by combining a name followed by a question or a demand. Prompts and speaker metadata are included in the corpus.
2023 members can access this corpus through their LDC accounts provided they have submitted a completed copy of the special license agreement. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc@ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
________________________________
Dear colleagues,
We (Christina Dahn, Pascal Siegers, Katrin Weller, and I) are hosting the Conference on Harmful Online Communication (CHOC 2023) later this year, which might also interest NLP researchers working on abusive language detection. The conference will take place in Cologne, Germany, and online on November 16-17, and is generously funded by the Thyssen Foundation. CHOC 2023 aims to bring together researchers and practitioners working on detecting harmful language through a variety of disciplinary lenses.
There are also some options for joining with a poster presentation or as participants. See https://www.gesis.org/en/research/conferences/gesis-conferences/conference-… for additional information.
We would appreciate it if you could also share this with other people who might be interested.
Best wishes,
Indira Sen
________________________________
Search Solutions is the BCS Information Retrieval Specialist Group’s annual event focused on practitioner issues in the arena of search and information retrieval. Search Solutions consists of two parts: a tutorial day and a conference day. We invite tutorial proposals focusing on any area of the practical application of search technologies to real-world problems, for the tutorial day taking place on 21st Nov 2023, before the conference day on 22nd Nov 2023. Tutorials in previous years have included: designing usability for search, multimedia information retrieval, evaluation, pattern search, city search in SmartCities, text analysis, introduction to natural language processing, introduction to reinforcement learning, Apache Solr and open-source technologies, etc. The details of previous tutorials can be found here: https://www.bcs.org/membership-and-registrations/member-communities/informa….
Tutorials
Proposals for both full day (5-6 hours including breaks and lunch) and half day (2-3 hours including breaks) tutorials are invited. The tutorials will take place on Tuesday 21st November 2023 at the BCS offices in London and/or online depending on the situation near the time. We encourage in person tutorials at the BCS offices if possible.
Proposal submission
Tutorial proposals should be submitted to the tutorial chair (h.liu@soton.ac.uk) by midnight Monday 11th Sep 2023, using the following template:
Name of presenter(s): please list the names and affiliations of presenter(s).
Title: title of the tutorial.
Contact details: email and snail mail address, phone numbers etc.
Type of tutorial: half day or full day.
Delivery format: online only, in person only, or both.
Tutorial Abstract: for publicity.
Target audience: please outline the practitioner audience to be addressed.
Learning outcomes: what would the practitioners gain from attending this tutorial?
Tutorial schedule and description: provide a draft schedule and detailed description of each of the items.
Tutorial logistics/materials: required media and formats for tutorial. What will be provided to attendees (e.g. slides).
Bio of presenter(s): including track record of presenting tutorials, lecturing experience, etc. (200-300 words)
Selection Procedure
All tutorial proposals will be reviewed by the tutorial chair and approved by the organising committee. The selection criteria will focus on the quality of the tutorial content and its appropriateness to the main theme of Search Solutions.
Contact Tutorial Chair:
Dr Haiming Liu, h.liu@soton.ac.uk
Dr Haiming Liu, PhD, PgCAP, SFHEA
Associate Professor @ Web and Internet Science (WAIS) Research Group
Director of Centre for Machine Intelligence (CMI)
School of Electronics and Computer Science (ECS)
Faculty of Engineering and Physical Science
University of Southampton
Highfield, Southampton, SO17 1BJ
Email: h.liu@soton.ac.uk
________________________________
Re "Wanderers, Kings, Merchants: The Story of India through Its Languages
by Peggy Mohan":
I look forward to reading it. (History or historical events aside, does it
have to do with how language(s) became an indicator of
power/privilege/status? If humankind has anything in common, this could be
one general observation...)
It does to some extent, particularly with regard to the history of
languages and related things in India.
Re "If people are used to writing in their own language on computers, then
that language is more likely to survive":
I don't disagree, but ---
[Forewarning: one might not like my reply to the following, but I ask for
it to be interpreted with a scientific mindset rather than with emotions
and sentiments related to language identity and particular cultural
practices.]
"Languages" (as in, particular language varieties, not "language" as in
language-at-large) come and go, born and die (and, sometimes, come back),
like trends, styles, cultures ("culture" as in a particular way of living,
a set of habits...). Esp. for users of varieties that have undergone
oppression/suppression, I understand that there is or can be much meaning
to many users in having the varieties be alive or in use. It is important
to the users. It is a symbol of their existence.
Yes, that is important.
But having witnessed how language has been abused (e.g. with research greed
by some CL/NLPers), I sometimes think one might have gone too far with how
much identity one attaches to any particular language.
Coming from the background that I do, your statement above seems very
similar to saying that we might have gone too far with political
correctness or about opposing racism/misogyny/etc. Like most Europeans or
Americans, you seem to have no (or very little) idea about the toll that
discrimination -- even unintended discrimination -- takes on a very large
part of the human population.
In early 2009, I had written a rant in a blog post and the title of the
post was "English is Language Independent". I had made roughly the same
points which the now famous Bender Rule makes. I have been writing about
it, although that blog is now defunct. I did not, however, make any
proposal, as the Bender Rule does. I just pointed out the problem, so I am
in no way undermining the importance of that.
And one anonymous comment on this blog post was this: "Why don't you work
on a good project. I don't see the prejudice persisting for long once you
do that."
It is like saying about gender or racism: "Why don't you make
accomplishments equal to us? Once you do that, I don't see the prejudice
persisting for long."
Not to be picky here, but "I have heard some native speakers *users *of
some "Dravidian languages" say that there are some (I guess minor) problems
with Unicode for their languages": using the terms "native" and "speakers"
to refer to "users" (or as I sometimes use among knowers of language:
"languagers") has been an unhealthy baggage from our past practices in the
language space. I can't comment on the issue of "Dravidian languages" and
Unicode, but it seems one might need more info/details on the complaints to
act further.
Well, yes, the word "native" has indeed a very dark history. How could I
not know it? But, the manner in which linguists use the term "native
speaker" is very different. At least I think so. Note that I am only
mentioning linguists here, not "CL/NLPer". By the way, I thought I had
coined the term "NLPer" in my blog long ago, but I may be wrong about that.
I am pretty sure one can find out these details in some online forum or
some academic publication etc. I have not so far done that, but it is a
good idea to do that. I will try.
Re "psycholinguistic validity from computational validity":
in (cognitive/psycholinguistic) modeling, there is / can be not much
difference between the two. (When one has enough experience with modeling
or with language/psycholinguistic phenomena, it's not hard to see that
results from computational modeling could also hold elsewhere. The art then
is to be able to connect the two "realms". But then again, it depends on
the claims, of course.)
Yes, of course, it depends on the claim. The two realms can definitely be
connected. We don't disagree about that. In fact, I think, we don't really
disagree about that many things it seems. Even so, isn't it possible to
implement the same thing in many different ways when it comes to
computation? That is not the case with the brain/mind, of course. Here, I
am again making a distinction between computation and mathematics. Perhaps
you don't agree with that? In that case, perhaps we mean different things
by the term "computation".
Re science and engineering:
I am not sure if engineering has to be "just about" heuristics or short
cuts. There is good engineering and there is bad engineering, for the sake
of my arguments here. In the context of ML with language/textual data, one
ought to be careful with "computing based on values of surface
elements/strings".
Much of what the CL/NLP community/communities have been doing for the past
few decades has been "computing based on values of surface
elements/strings". This practice deserves serious re-evaluation (there are
lots of grey areas here and opportunities to compare processing across
finer granularities (without all the preprocessing
hacks/"heuristics"/"engineering") for various tasks and data types/formats,
without "words", "sentences", "linguistic structure(s)", "grammar" et al.).
I don't think of it as "it's engineering!", but some bad practices/culture
having been promoted as such and normalized (for a couple of decades?).
Good engineering can also be fine, thoughtful, and robust.
I completely agree. But sometimes I do work on things which, theoretically,
seem ridiculous to me, but they may be practically useful. At least half
(perhaps more) of my motive to work on language processing is to address
somehow, to any extent, the issue of linguistic empowerment. I am prepared
to compromise theoretically for that purpose.
Re "I don't think there is anything wrong with what you call grammar
hacking from the engineering point of view":
I do (think there is something wrong), because:
i. "all grammars leak" (from Sapir, also in Manning and Schütze (1999));
I know. I tell this to students every time I teach NLP or any related
subject.
ii. "words" (whatever they are) are too coarse-grained for computing.
I already agreed to that, but if they help in my benign motives, I am
prepared to use them.
Re "it [language(s)] is still likely to have an 'organic' structure":
couldn't that structure (one not associated with
"words"/"sentences"/"grammar") be one from math or computing? Or one that
is a by-product of a combination of these factors?
It certainly could.
Some CL/NLPers have made various claims concerning "structures" in the
past, borrowing the concepts from "linguistic structure(s)", from
"grammar". There was a lot of chiming along, and many have neglected the
fact that grammar could effect the impression of "structures" through
"words" etc., or that it all in turn patterns some of our thoughts/judgments
sub-/unconsciously. And the loop goes on.
(See also:
https://twitter.com/adawan919/status/1532335891448057858)
Well, if you prefer the term "patterns" to "grammar" or "structure", I am
completely fine with that. As I said earlier, I am moving towards the
language games view of language, even for this exchange. We can't avoid
that if we are talking in a human/natural language. The only way to avoid
that, if there is a way, is to use only mathematical notation, but I don't
think we have reached that stage so far with the study of language.
Re "in English "John loves Mary" is in fact a very different thing than
"Mary loves John"":
one has to re-evaluate to what extent this matters in whichever form of
computing/computation one is engaged in and how often this "canonical form"
that you are implicitly referring to really occurs in data as well as how
this actually surfaces in data.
One should look at the data in front of one, not the framework/theory in
one's mind. (I believe in achieving better designs/systems through testing
from both a data-centric as well as an algorithm-centric perspective.
Hardware counts too!)
I mostly agree, but I am not sure whether you are saying that "John loves
Mary" may not perhaps be different from "Mary loves John"?
Re " it is unfair to blame Linguistics for that":
My focus in "[t]he "non-native speakers of X" has been a plague in
Linguistics" was on the "native" part. That has been my understanding, at
least to a great extent "nativeness" was so promoted/reinforced, esp.
within the school of generative Linguistics in the 2nd half of the 20th
century, when it comes to "linguistic judgments". I thought the propagation
stemmed from there. Who/What else do you think started it?
I think the word "native" was used in a derogatory/condescending way
throughout the English speaking world, even before the "birth of Modern
Linguistics". It was, in fact, the more polite word. One other common word
was "savage". I remember being shocked to find (very long ago) in Jane Eyre
the phrase "savages living on the banks of the Ganges". Savages? On the
bank of the Ganges?
But these usages are much older (than "Modern Linguistics").
The matter of "linguistic judgements" or "grammaticality" is very different
from that, regardless of what one's opinions are about the existence of
grammar.
All in all, your replies remind me of many of the reviewers' responses
"typical" of (i.e. I often got from) the *CL circle (of those who remained
in the past decade or so).
I don't know about that. I thought I had very unconventional views of NLP,
but I could be wrong about that, at least relatively.
If I may guess:
i. you don't have an academic background in Linguistics (esp. general
Linguistics),
That is true. I have learned about language(s) mostly on my own. So, if you
want me to show my degree in Linguistics, I have none, except a PhD in
Computational Linguistics. I was the second person to get a PhD
specifically in CL in India.
note that there is a difference between linguistics of particular languages
and that in a more general/theoretical manner (not about (p-)language
grammatical particularities),
Of course there is. What makes you think I don't know that? The fact is, my
knowledge of "(p)-language" is relatively very limited. Even my knowledge
of the syntax of Hindi (my "mother tongue"), in a formal sense, is very
limited. I mostly know about language in general.
ii. you learn about language(s) through mostly non-academic books or
through your own language experience(s) (which counts too, I am not
invalidating it/them here),
I have no idea what makes you say that. Am I supposed to list the
Linguistics books I have read, in addition to showing my Linguistics
degree? (Sorry if that sounds bitter, but it has happened in the past, not
literally, but effectively).
I can only say -- and it is strange to even have to say this -- that I
definitely know more about language in general and linguistic theory than
-- at least -- most graduates and postgraduates (including PhDs) of
Linguistics in India.
iii. you never had phonetics and phonology, nor
In my replies on this thread I have not mentioned anything related to
phonetics or phonology. So it must be from somewhere else that you have
this impression(?). Is it from some of the papers co-authored by me? I
think there are some which could give that impression. Explaining why that
could be so, will take this discussion somewhere else. I don't think it is
relevant here at all.
Is your point that I don't know about phonetics or phonology? If it is, I
would prefer not to answer that.
iv. do you realize how you can practice without "words"
I do. But I am also prepared to use "words" wherever they help. As I wrote
earlier, I have worked without "words" sometimes and have argued against
them.
--- did I get any right?
You got -- sort of -- the first and the third, assuming you were asking me
to show my Linguistics degree and whether I had formally taken courses on
Phonetics and Phonology.
I wanted to note this because --- and please do not take offense, it is not
meant personally for I respect your expertise and appreciate our exchanges
--- for a while, I didn't know where(else) to submit my findings. It wasn't
until I got all the rejections with rather shallow comments about language
(or language and computing) that I realized the "solidarity" one has built
with people with a background similar to yours might have been the driving
force of how some computational (general) linguists (as in, "general
language scientists" who also do computational work --- there are only a
few of us) got chased out of the arena. The "typical" excuses for practices
of this "culture" have been "engineering", "useful", "it works" --- but
without any/much grounding/interest in good generalizations. One puts
excess focus on processing but not on evaluation or interpretation. I think
it's time for a "culture" change in this regard.
To your other reply below (in triple quotes):
Sorry, but I didn't understand what you mean by triple quotes. I couldn't
find any triple quotes in your comments.
re "language policy":
not everything has to be or can be regulated. Policies can help with
promoting/reinforcing/rectifying a particular situation/initiative.
I agree.
Forcing people e.g. to use language in one particular way or to use "one
language" only (whatever "language" "means"*)?
Again, I agree. However, like most Europeans and Americans, you seem to
have no (or very little) idea how people are already being forced to use
some language or another. And it is hardly a new phenomenon, but it has
become much more serious now due mainly to colonization and all its
effects. Do you have any idea how much hundreds of millions of Indians
suffer simply from being forced to use English? My primary motive in my
whole life, for good reasons, has been to counter linguistic
discrimination, mainly due to the imposition of English on any Indian who
has any ambition at all. I think that alone makes me sufficiently qualified
to "work on language". This is analogous to any other kind of
discrimination or prejudice.
I do not think that would be a good direction to go. For any regions, we
have seen both good/better and bad/worse policies throughout the course of
history. One would really have to evaluate the proposed policy in question
carefully.
I never said anything about forcing people to speak one language. That's
why I said I don't know what exactly the connection is with the language
policy. But it sure has a very strong connection, because the problem in
the first place is due to language policy, written and unwritten. Do you
know that there are and have been schools in the world, including India,
where students are punished if they are caught speaking in their mother
tongue (or first, "native" language)? I still remember feeling recognition
and the impression it had on me when I read the famous novel about life in
Wales, How Green Was My Valley. I had just become fluent in English then.
Depending on the situation, some may best change things through the
economy, some through government support, some through education and/or
grassroot-type of initiatives, some a combination of all these and more....
I agree.
*I have an answer to this... please wait for my next pub or so.
I would love to read it. I am eternally hungry for any fresh look on
language in general. Not so much for particular languages or varieties.
That is to me, to some extent, boring.
Re "it is very much like conservation of ecology or of species. I don't
think it (the latter) will be considered unwarranted prescriptivism":
see my 2nd response above.
The same for my reply to that comment.
Also, with language documentation, one can just document data without
promoting grammar. (That's probably the least unethical thing one can do
with language or language data.)
Again I agree. I never said anything about promoting grammar. I don't like
to read grammar books. It's painful to me, compared to almost any other
topic under the sun, except perhaps finance, commerce, and the intricacies
of legal procedures.
For the sake of completeness, I should clarify -- as it seems to matter --
that I have "never had Syntax or even Semantics or Pragmatics". I am mostly
self-taught, not just in Linguistics, but also in Computer Science and
almost everything else. Do you really think it matters in the context of
this discussion?
Again, for the sake of completeness, I should mention that for decades, I
have been reading all kinds of books that had anything to do with language,
mostly in general, but also about Hindi or Indian languages, not to mention
English. These have included what you call academic books on language in
general and about Linguistics. I still keep reading, as I know very well
that, being self-taught, I have some gaps in my knowledge of Linguistics
and Computer Science. My undergraduate degree was in Mechanical Engineering
(from 1990), but I hardly remember anything in that area. I have similarly
been reading all kinds of books for decades about computers and Computer
Science.
I am unable to see how any of this matters in the context of this
discussion.
By the way, I like the metaphor you use for language: It being like a
graphical user interface for the brain. That reminds me of the views of
Daniel Dennett about consciousness. He constantly compares elements making
up consciousness to graphical user interfaces on computers. Not that I
completely agree with him about consciousness, but I still find the
metaphor quite good, perhaps as an approximation.
________________________________
Dear Sir/Ma'am,
I hope you are doing well and in good health. We are excited to announce a
call for book chapters for an upcoming book titled "*Empowering
Low-Resource Languages With NLP Solutions.*"
Link: https://www.igi-global.com/publish/call-for-papers/call-details/6596
The objective of this book is to provide an in-depth understanding of
Natural Language Processing (NLP) techniques and applications specifically
tailored for low-resource languages. We believe that your valuable insights
and research in this domain would greatly enrich the content of this book.
To ensure a comprehensive and high-quality book, all submitted chapters
will undergo a rigorous peer-review process. The accepted book will be *indexed
in Scopus and Web of Science*, thereby enhancing the visibility and impact
of your work.
The book aims to cover a wide range of topics related to NLP in
low-resource languages. Suggested topics include, but are not limited
to, the following:
· Introduction to Low-Resource Languages in NLP
· Language Resource Acquisition for Low-Resource Languages
· Morphological Analysis and Morpho-Syntactic Processing
· Named Entity Recognition and Entity Linking for Low-Resource
Languages
· Part-of-Speech Tagging and Syntactic Parsing
· Machine Translation for Low-Resource Languages
· Sentiment Analysis and Opinion Mining for Low-Resource Languages
· Speech and Audio Processing for Low-Resource Languages
· Text Summarization and Information Retrieval for Low-Resource
Languages
· Multimodal NLP for Low-Resource Languages
· Code-switching and Language Identification for Low-Resource
Languages
· Evaluation and Benchmarking for NLP in Low-Resource Languages
· Applications of NLP in Low-Resource Language Settings
· Future Directions and Challenges in NLP
We encourage you to contribute a book chapter focusing on any of the
above-mentioned topics or related areas within the scope of NLP in
low-resource languages. The submission guidelines are as follows:
1. Please submit a chapter proposal (maximum 500 words) outlining the
objective, methodology, and expected outcomes of your proposed chapter by
August 15, 2023, to the submission portal:
https://www.igi-global.com/publish/call-for-papers/call-details/6596
2. Chapter proposals should include the title of the chapter, the
author(s) name, and their affiliations.
3. All submissions should be original and should not have been
previously published or currently under review elsewhere.
4. The chapters should be written in English and adhere to the
formatting guidelines provided after the acceptance of the proposal.
*Important Dates:*
August 15, 2023: Proposal Submission Deadline
August 25, 2023: Notification of Acceptance
September 17, 2023: Full Chapter Submission
October 31, 2023: Review Results Returned
December 12, 2023: Final Acceptance Notification
December 26, 2023: Final Chapter Submission
Thank you for considering this invitation, and we look forward to receiving
your valuable contribution to this book. If you have any further questions
or require additional information, please do not hesitate to contact us.
Best regards,
Editorial Team
Dr. Partha Pakray
National Institute of Technology Silchar
Email: partha@cse.nits.ac.in
Dr. Pankaj Dadure
University of Petroleum and Energy Studies Dehradun
Email: pankajk.dadure@ddn.upes.ac.in
Prof. Sivaji Bandyopadhyay
Jadavpur University, Kolkata
Email: sivaji.cse.ju@gmail.com
--
With Best Regards
Pankaj Dadure
Mobile: 9545757478
________________________________
Third Call for Papers
6th International Conference on Natural Language and Speech Processing
<http://icnlsp.org/>
We are delighted to invite you to ICNLSP 2023, which will be held virtually
from December 16th to 17th, 2023.
ICNLSP 2023 offers attendees (researchers, academics, students, and
industry professionals) the opportunity to share their ideas, connect with
each other, and stay up to date on ongoing research in the field.
ICNLSP 2023 aims to attract contributions related to natural language and
speech processing. Authors are invited to present their work relevant to
the topics of the conference.
The topics of ICNLSP 2023 include, but are not limited to, the following:
Signal processing, acoustic modeling.
Architecture of speech recognition system.
Deep learning for speech recognition.
Analysis of speech.
Paralinguistics in Speech and Language.
Pathological speech and language.
Speech coding.
Speech comprehension.
Summarization.
Speech Translation.
Speech synthesis.
Speaker and language identification.
Phonetics, phonology and prosody.
Cognition and natural language processing.
Text categorization.
Sentiment analysis and opinion mining.
Computational Social Web.
Arabic dialects processing.
Under-resourced languages: tools and corpora.
New language models.
Arabic OCR.
Lexical semantics and knowledge representation.
Requirements engineering and NLP.
NLP tools for software requirements and engineering.
Knowledge fundamentals.
Knowledge management systems.
Information extraction.
Data mining and information retrieval.
Machine translation.
NLP for Arabic heritage documents.
*IMPORTANT DATES*
Submission deadline: *31 August 2023*
Notification of acceptance: *31 October 2023*
Camera-ready paper due: *20 November 2023*
Conference dates: *16, 17 December 2023*
*PUBLICATION*
1- All accepted papers will be published in ACL Anthology (
https://aclanthology.org/venues/icnlsp/).
2- Selected papers will be published in Signals and Communication
Technology (Springer) (https://www.springer.com/series/4748), indexed by
Scopus and zbMATH.
For more details, visit the conference website: https://www.icnlsp.org
*CONTACT*
icnlsp(at)gmail(dot)com
Best regards,
Mourad Abbas
*Call for Abstracts*
*'Towards Linguistically Motivated Computational Models of Framing'*
Date: Feb 28 - Mar 1, 2024
Location: Ruhr-University Bochum, Germany
Organizers: Annette Hautli-Janisz (University of Passau), Gabriella
Lapesa (University of Stuttgart), Ines Rehbein (University of Mannheim)
Homepage: https://sites.google.com/view/dgfs2024-framing
Call for Papers:
Framing is a central notion in the study of language use to rhetorically
package information strategically to achieve conversational goals
(Entman, 1993) but also, more broadly, in the study of how we organize
our experience (Goffman, 1974). In his seminal article, Entman (1993)
defines framing as "to select some aspects of a perceived reality and
make them more salient in a communicating text, in such a way as to
promote problem definition, causal interpretation, moral evaluation,
and/or treatment recommendation for the item described." This frame
definition has recently been operationalized in NLP in terms of
coarse-grained topic dimensions (Card et al., 2015), e.g., by modeling
the framing of immigration in the media as a challenge to economy vs. a
human rights issue. But there is more to frames than just topics.
The breadth of the debate on what constitutes a frame and on its (formal
and cognitive) definition naturally correlates to the interdisciplinary
relevance of this phenomenon: a theoretically motivated (computational)
model for framing is still needed, and this is precisely the goal of
this workshop, which will bring together researchers from theoretical,
applied and computational linguistics interested in framing analysis.
Our main interest is in furthering our understanding of how different
linguistic levels contribute to the framing of messages, and in paving
the way for the development of linguistically driven computational
models of how people use framing to communicate their attitudes,
preferences and opinions.
We thus invite contributions that cover all levels of linguistic
analysis and methods: from phonetics (e.g., euphony: the use of
repetition, alliteration, rhymes and slogans to create persuasive
messages) and syntax (e.g., topicalization, passivization) to semantics
(lexical choices, such as Pro-Life vs. Pro-Choice; the use of pronouns
to create in- vs. out-groups; the use of metaphors; different types of
implicit meaning) to pragmatics (e.g., pragmatic framing through the use
of presupposition-triggering adverbs). We also invite work on
experimental and computational studies on framing which employ
linguistic structure to better understand instances of framing.
The workshop is part of the 46th Annual Conference of the German
Linguistic Society (DGfS 2024), held from 28 Feb - 1 March 2024 at
Ruhr-Universität Bochum, Germany.
*Submission instructions*:
We invite the submission of anonymous abstracts for 30 min talks
including discussion. Submissions should not exceed one page, 11pt
single spaced (abstract + references), with an optional additional page
for images. The reviewing process is double-blind; please ensure that
the abstract does not include the authors' names or affiliations.
Furthermore, self-references that reveal the author's identity, e.g.,
"We previously showed (Smith, 1991) ...", should be avoided. Instead,
use citations such as "Smith previously showed (Smith, 1991) …".
*Submission deadline:* *August 25, 2023*
Abstract review period: Aug. 26, 2023 - Sept. 5, 2023
Meeting email: dgfs2024-framing(a)fim.uni-passau.de
--
Ines Rehbein
Data and Web Science Group
University of Mannheim, Germany
DLinNLP 2023 - Deep Learning Summer School at RANLP 2023
Second Call for Participation
Varna, Bulgaria
30th August - 1st September
https://dlinnlp2023.github.io/
We invite everyone interested in Machine Learning and Natural Language Processing to attend the Deep Learning Summer School at the 14th biennial RANLP conference (RANLP 2023).
Purpose:
Deep Learning is a branch of machine learning that has gained significant traction in the field of Artificial Intelligence, advancing the state of the art; many sub-areas, including natural language, image, and speech processing, employ it widely in their best-performing models.
This summer school will feature presentations from outstanding researchers in the field of Natural Language Processing (NLP) and Deep Learning. These will include coverage of recent advances in theoretical foundations and extensive practical coding sessions showcasing the latest relevant technology.
The summer school would be of interest to novices and established practitioners in the fields of NLP, corpus linguistics, language technologies, and similar related areas.
Important Dates:
30 August - 1 September: Deep Learning Summer School in NLP
Lectures:
* Lucas Beyer (Google Brain)
* Tharindu Ranasinghe (Aston University, UK)
* Iacer Calixto (University of Amsterdam, the Netherlands)
Practical Sessions:
* Damith Premasiri (University of Wolverhampton, UK)
* Isuri Anuradha (University of Wolverhampton, UK)
* Anthony Hughes (University of Wolverhampton, UK)
Registration:
*Registration is now open:*
https://ranlp.org/ranlp2023/index.php/fees-registration/
Programme:
Please refer to the website for the details of the programme:
https://dlinnlp2023.github.io/#programme
***There will be an informal poster presentation session where attendees can present their research work and get feedback from the experts in the field. ***
Contact Email: dlinnlp2023(a)gmail.com
[Note: the application deadline is Sunday 20 August]
The University of Gothenburg, Sweden, is offering four fully-funded PhD positions in computer science and engineering where the candidates can choose the project themselves out of fourteen options.
Two of the projects are related to NLP, one about efficient algorithms for corpus searching, and another about automatic generation of Wikipedia articles. See the ad for more information:
https://web103.reachmee.com/ext/I005/1035/job?site=7&lang=UK&validator=9b89…
The positions are fully funded for 5 years, including 20% teaching or other departmental duties.
Application deadline: 20 August 2023
best regards,
Peter Ljunglöf
------- ------ ----- ---- --- -- - - - - -
peter ljunglöf
peter.ljunglof(a)gu.se
computer science and engineering, and Språkbanken
University of Gothenburg and Chalmers University of Technology
-------------- --------- -------- ------- ------ ----- ---- --- -- - - - - -
Dear colleagues and friends,
The Research Center L3S invites applications for the position of a Research
Associate / PhD Candidate (m/f/d) “Computer Science: Knowledge Graphs &
Natural Language Processing” (Salary Scale 13 TV-L; 100 %) starting at the
earliest possible date. The position is limited to 3 years with the
possibility of extension. The regular weekly working time is 39.8 hours
(full-time).
*Description:*
The PhD topic will be in the context of the HybrInt project and the Open
Research Knowledge Graph (https://orkg.org/) focusing on building knowledge
graphs for the agricultural domain. The aim of these projects is to
research and develop NLP solutions, such as large language models (LLMs),
for crowdsourcing, representing, and managing semantically structured,
rich representations of scholarly contributions and research data in
knowledge graphs, and thus to develop a novel model for scholarly
communication. In the
context of the PhD thesis you will be responsible for building and
maintaining the ORKG data ingestion and processing pipelines to ensure the
flow of high-quality semantified resources from academic publications. Your
main responsibility in this position will be to build scalable solutions
that crawl, ingest, process publications, and thereby enrich the ORKG. You
will work alongside the ORKG engineering team to set up the AI/NLP ecosystem.
The tasks will focus on:
* Working in the areas of Natural Language Processing (text mining,
information extraction, information retrieval/search) and Machine Learning
of scholarly communication media (digital) data
* Research and development of suitable Large Language Models (LLMs) as NLP
solutions
* Conceptually designing, modeling, and implementing data-driven services
in one or more areas of information retrieval and extraction, data
enrichment, and linking of data
*Application Deadline:* 15.09.2023
*Web Address for Applications:* https://www.uni-hannover.de/en/jobs/id/6435/
(en); https://www.uni-hannover.de/de/jobs/id/6435/ (de)
Yours cordially,
Jennifer