Linguistics is not "supposed to be about spoken languages" --- that has just been a disciplinary bias. And the term "spoken languages" is also not quite correct. There are scripts, there are sounds, there are signs --- all of which are *ways in which we express ourselves and interpret information*. Such "ways" can be regarded as language (as in, as has been traditionally and commonly known), g-language*. P-language can be thought of as an instance of such (mixed with some historical contexts, but also A LOT OF identity politics and language attitudes).
*g-language as general/generalized language or "language-at-large" (or in Haspelmath's formulation "general phenomenon of Human Language"), p-language as "particular language(s)" such as those commonly referred to as e.g. FR, EN.... See also Haspelmath (2019), "Confusing p-linguistics and g-linguistics: Philosopher Ludlow on 'framework-free theory'", https://dlc.hypotheses.org/1801. Note though that I, by "decomposing 'words'", am taking things one step further in generalization compared to what he would refer to as g-language phenomena. Because prior to my work, one had considered language to be grammar and grammar to be language, at least in the academic setting, and even in Linguistics as an academic discipline.
Much of what I don't agree with in our current education and R&D landscape has to do with the abuse of "p-language(s)". It's research greed coupled with sentiment manipulation. Many who have studied linguistics (and many philological studies) as well as computational linguistics (and possibly natural language processing --- depending on which traditions/assumptions/practices is/are being taught) are likely to have been affected (I myself included). There was a stage in my life when I also wanted to speak up / advocate for minority languages. Then I realized, it's not really about the language (as texts or speech/sign data...).
[And yes, of course, there ARE the parts about the texts and speech/sign data --- the documentation, the processing, the interpretation/description. But it's not really what many/most people in CL/NLP or the "language space" are doing. They are not seeing interpretation as description. They are describing subjectively (mixing in some historical bias), then processing with biases historical and personal (and emotional), and IF they are interpreting at all, they are doing so based on a grammarian or "layperson" perspective (i.e. not a scientific/technical one).
And also, of course, there is much truth-telling to do when it comes to language attitudes and identity politics. For these could affect the availability of data and how data "quality" is being perceived, among other things. But these do not have anything to do with how data is to be computationally processed. For those who perform testing on humans, their studies could reveal potential human biases effected by language attitudes etc.]
=====
2. Re "John loves Mary" etc.: I think you are mixing your expectation of how a canonical form should behave with its relevance. Re "I am unable to imagine a scenario where "John loves Mary" could be the same "Mary loves John"": they are both 15 characters long. (As to how they can be different, one thing that has not been mentioned: we can interpret/describe these strings with e.g. their transition probabilities, from character to character or from one character n-gram to another --- there are a few who have worked on this, e.g. John Goldsmith (see his YouTube videos on MDL) and myself (though in a "word"-free manner). It could give one some insights on the interpretation of phonological phenomena.)
3. "Word" is not ill-defined. It is not definable, in ways that would be necessary or sufficient for science, engineering, R&D.
4. Re "phonetics and phonology": no, though having had p and p may help with understanding language in finer granularities, it's not so much about p and p per se (or any academic degree or courses). It's about having a scientific frame of mind. Among phoneticians, phonologists, morphologists, syntacticians, semanticists/semanticians, pragmaticists and social linguists, phoneticians (and some social linguists) were the ones who started using machines to study language. Seeing language as an object of scientific inquiry/investigation is key.
I am not writing what I wrote because I have had some "advanced Linguistics", I am not trying to impose any academic superiority here. But from what you wrote, I do see/sense there is a clearer demarcation between science and language attitudes & identity politics possible.
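To make the aside in point 2 above concrete, here is a minimal sketch of character-to-character transition probabilities for the two example strings. It is only an illustration: the function name bigram_transition_probs is hypothetical, and plain relative-frequency bigram estimates are assumed here, which is neither the MDL approach nor any specific published method referred to above.

```python
from collections import Counter, defaultdict


def bigram_transition_probs(text):
    """Estimate P(next char | current char) from a single string.

    Hypothetical helper for illustration: probabilities are plain
    relative frequencies of adjacent character pairs, with no smoothing.
    """
    pair_counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        pair_counts[current][following] += 1
    return {
        current: {
            following: count / sum(followers.values())
            for following, count in followers.items()
        }
        for current, followers in pair_counts.items()
    }


a = "John loves Mary"
b = "Mary loves John"

probs_a = bigram_transition_probs(a)
probs_b = bigram_transition_probs(b)

# Same length, but different transition distributions,
# e.g. for what follows a space character.
print(len(a), len(b))
print(probs_a[" "])
print(probs_b[" "])
```

On these two strings the lengths are equal, while the distributions over what follows a space differ (a space is followed by "l" or "M" in the first string, and by "l" or "J" in the second). With strings this short the estimates are of course not meaningful as statistics; the point is only that no notion of "word" is needed to describe the difference.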
Re "I don't think you do at all, based on your comments": I do understand and empathize. I might have just seen/experienced things from a different angle --- I have seen how language attitudes / identity politics have been exercised in the academic context (also industry), so as to "broaden the practitioner pool" for language-related technologies. It's almost like selling/promoting a less legitimate form of currency. In that respect, I think there are things to correct. (So I might come off a bit stern, but that should not be misinterpreted as my lacking empathy/compassion.)
Re "no one has to adopt a purist attitude when it comes to using/understanding of language by others. Perfectly true. Did I even hint at that to the least degree? You are again shadow boxing.": I am agreeing with you by echoing, (re-)confirming/reformulating.
Re "But even when I started, I was not a purist in any sense of the word.": Then how/where do you draw the line between p-languages, esp. outside the context of computing? Some puristic ideas/ideals are there to back the assumption/adoption of such naming.
Re "Why do you make assumptions as you comment on anything and hurl semi-insults? You don't really know me. I didn't assume anything about you.": I think there might have been some misunderstandings. I wasn't discoursing with you with prejudice. In fact, I have been trying to understand you better. Understanding readers' perspectives is important for authors. As you also identified yourself as a seasoned CLer, I thought to draw the connection between your views and some of what I've observed from "the CL population" (e.g. as exemplified by some dominant (and implicit) assumption/interpretation about the state of our art/discipline/area (or whether there is one)).
I did and do know about the language situation in India (e.g. linguistic hegemony) --- I may not have witnessed things in the ways you have, or be able to describe things in as much detail or report on the group sentiment (be it perceived or collectively acted upon or that which ends up shaping the social infrastructure). But there are many parallels in language phenomena. The crux of the matter lies not in language --- that's a message I've been trying to get across. (It took me a while to come to terms with that.)
On Mon, Aug 14, 2023 at 9:16 AM Anil Singh anil.phdcl@gmail.com wrote:
On Fri, Aug 11, 2023 at 9:52 PM Ada Wan adawan919@gmail.com wrote:
Just a quick reply before the weekend to some of the points that I thought deserve a short clarification:
- re linguistic empowerment: yes and no. As I commented on X (formerly Twitter) on 09Aug2023: "[t]here are divergent ways of thinking... but when it comes to language and the social sciences, one must be careful with how one "diverges"! Humans and [sic: are] humans. And much of what we postulate re "in-group/out-group" can be a matter of our "traditions" (if so, is it time to re-evaluate?), perspectives (if so, can we be biased sometimes?), or our willingness to include or will to exclude. How different can particular "languages" (in a folk psychological, proverbial usage) be, really? Where do the differences lie?".
Of course. That goes without saying. For almost 40 years now I have been looking at this issue and thinking about it. I had wanted to write a book about it. This is what got me into languages and then NLP, because I was then hooked by the study of language by itself, even without the issue of linguistic empowerment, particularly from the computational point of view. I have looked at it from all possible points of view. I was not a linguistic purist even when I began -- with a lot of bitterness -- and I am certainly not that now. I have no doubt at all that there is something universal and species-specific about human languages, although I don't know in what way it is universal exactly. No one does, as far as I know. No one could be more against linguistic chauvinism of any kind than me. Or any other kind of chauvinism.
The same goes with "diversity" efforts. AND OF COURSE I am NOT AGAINST these.
I believe that. I didn't say you were. I just gave an example to make my point.
But one has to be careful how far one goes with "difference(s)".
Only as far as is reasonable and fair to everyone.
- Re "John loves Mary" being "same/different" as "Mary loves John": it depends. Note stress/emphasis/topicalization, different usage pattern(s) etc., not just "subj verb obj".
Well, yes, that is the central contradiction of Linguistics. It is primarily supposed to be about spoken language, but -- quite naturally -- linguists in academic literature have to use examples in written form. And the written form misses "stress/emphasis/topicalization, different usage pattern(s) etc.". Being concerned with language for 40 years, how could I possibly not know it?
However, I am unable to imagine a scenario where "John loves Mary" could be the same "Mary loves John", with any possible stress/emphasis/topicalization, different usage pattern(s) etc. for either of them and their combinations. It may be that I am missing something here.
- Btw, your usage of the term "word" can be replaced by other alternative formulations, e.g. "term",
Let us terminate this terminological tussle about the term 'term', that is to say, the term 'word'.
I have already more than once agreed that the term 'word' is ill-defined and that I have even written about it. In this case, you are indulging in what can be called shadow boxing.
- Re phonetics and phonology: I was not referring to the relevance of phonetic/phonological knowledge per se, that a practitioner in the space of "language and computing" would "need" in order to be competent. But that, as well as a comprehensive knowledge of general language theories and a broad background in p-languages and their (social/usage) contexts, belongs in the toolkit of a good linguist (as in, a good language scientist). To me, progressing to finer granularities is just refining our assumptions, our model.
I mostly agree. Only mostly, since the statement above is a somewhat vague programmatic statement. If the details were there, I could agree to specific things.
But to those who may not have had phonetics/phonology, they may be more likely to think that they "need" "words" and hence my findings might be either a paradigm shift or the end of the world.
So, you are still equating "having had phonetics/phonology", which can be translated as having formally attended and passed courses and exams in phonetics/phonology. I can't imagine how you could possibly talk about de-pedatization if you subscribe to this -- in my opinion -- somewhat ridiculous way of thinking. Pardon me for using strong words, but what you say is on the borderline of being offensive, if not actually offensive. And it is extremely silly and childish, coming from such a well-read person.
- Re triple quotes: """
I copied and pasted your reply that didn't seem to have been sent to the list and put it in triple quotes, as a reference (for others).
OK.
- Re "Do you have any idea how much hundreds of millions of Indians suffer simply from being forced to use English?": in what ways are they "forced"?
I can't even begin to attempt to describe the innumerable ways in which people are forced to use English. There is tons of literature about that, but a lot of it may be in non-European languages. For example, it is there in Hindi. The book that I always wanted to write, but for various reasons couldn't, at least so far, was partly about that.
Just to mention a few examples. The medium of instruction in India, particularly for higher education, and exclusively for technical and scientific education, is English. Every day hundreds of millions of people suffer due to that. The result is that a lot of people grow up with complexes and stunted intellect, as they can't understand what the teacher is saying, what is written in the books, and so on. When they come to college, a majority of people have problems writing one decent page of content in either their own language(s) or English.
The legal system, particularly at higher levels, works in English. As a result, the overwhelming majority of people have no idea what is going on. They have to rely on others completely, some of whom themselves may not be very fluent in English.
All the lucrative jobs require not only knowledge of English, but spoken fluency in English. Not only that, your accent while speaking English puts you in a particular caste, so to speak. As a result, an incompetent and badly educated person who speaks fluent English can get through life much more easily than a competent well-educated person with a 'bad' English accent.
There is little incentive to write (and read) in Indian languages, and therefore it is very difficult to write and publish literary or academic or even other kinds of books in Indian languages.
And so on and on and on.
The challenge is that it is very difficult to solve this problem, since there are many major languages in India, and so speakers of one language will not accept 'imposition' of another Indian language, or even the requirement to learn another Indian language. As a result, just as the British ruled by Divide-and-Conquer, so English rules in this time-tested way.
And to top it all, if you want to be associated with the global economics/culture/world-at-large, you again need English.
But I can understand that.
I don't think you do at all, based on your comments.
The more important thing is to also understand that no one has to discriminate based on language(s),
Sure! Who can disagree with that except a language chauvinist?
no one has to adopt a purist attitude when it comes to using/understanding of language by others.
Perfectly true. Did I even hint at that to the least degree? You are again shadow boxing.
People can always have various language/linguistic habits, no one has to use "one language only".
Again, did I even hint at that in any possible way?
The point is not to use language as a weapon.
Ditto as above.
[These are things I think you know, but many on this list may not.]
I sure do. I have been thinking and researching about these matters for the last 40 years, almost obsessively, from all possible points of view. My position on this issue has changed a great deal over the years. But even when I started, I was not a purist in any sense of the word. I will always be against forcing people to do things they don't want to do.
Re "Do you know that there are and have been schools in the world, including India, where students are punished if they are caught speaking in their mother tongue (or first, "native" language)": is this still happening in India? I've only had similar experiences in my "foreign language" lessons and in real life (using a variety/style y when/where y can be "frowned upon") --- though not "punished", just looked upon with 🙄 or 👀 in ways condescending.
The last time I checked, it was happening. As of this moment, I don't know for sure. But, as pointed out above, innumerable people do suffer in innumerable ways due to the supremacy of English in India. I may have some suggestions, but I don't really know the solution to this issue, as it is complicated by so many factors. I will never ever support forcing people to use one or the other language.
Why do you make assumptions as you comment on anything and hurl semi-insults? You don't really know me. I didn't assume anything about you.
For example, if I have understood correctly, you simply wanted to say that one should pay attention to stress/emphasis/topicalization (I will add, from my side, prosody and intonation) when considering the meanings of the sentences "John loves Mary" and "Mary loves John" (note that there is no question mark here at the end). And you went about it by first saying "let me guess" and then making some silly statements about my competence and expertise about language(s). You could have simply mentioned the importance of stress/emphasis/topicalization in the beginning. I can't imagine any reason why you have to make such assumptions.
Well-read and well-educated as you are about language(s), perhaps it is possible, even if only remotely, that I could have a thing or two that I could tell you about language that you might not perhaps know?
Great weekend!