Re "Wanderers, Kings, Merchants: The Story of India through Its Languages by Peggy Mohan": look forward to reading it. (History or historical events aside, does it have to do with how language(s) became an indicator of power/privilege/status? If humankind has anything in common, this could be one general observation...)
It does to some extent, particularly with regard to the history of languages and related things in India.
Re "If people are used to writing in their own language on computers, then that language is more likely to survive": I don't disagree, but --- [Forewarning: one might not like my reply to the following, but I ask for it to be interpreted with a scientific mindset rather than with emotions and sentiments related to language identity and particular cultural practices.] "Languages" (as in, particular language varieties, not "language" as in language-at-large) come and go, are born and die (and, sometimes, come back), like trends, styles, cultures ("culture" as in a particular way of living, a set of habits...). Esp. for users of varieties that have undergone oppression/suppression, I understand that there is or can be much meaning to many users in having the varieties be alive or in use. It is important to the users. It is a symbol of their existence.
Yes, that is important.
But having witnessed how language has been abused (e.g. with research greed by some CL/NLPers), I sometimes think one might have gone too far with how much identity one attaches to any particular language.
Coming from the background that I do, your statement above seems very similar to saying that we might have gone too far with political correctness or about opposing racism/misogyny/etc. Like most Europeans or Americans, you seem to have no (or very little) idea about the toll that discrimination -- even unintended discrimination -- takes on a very large part of the human population.
In early 2009, I had written a rant in a blog post and the title of the post was "English is Language Independent". I had made roughly the same points which the now famous Bender Rule makes. I have been writing about it, although that blog is now defunct. I did not, however, make any proposal, as the Bender Rule does. I just pointed out the problem, so I am in no way undermining the importance of that.
And one anonymous comment on this blog post was this: "Why don't you work on a good project. I don't see the prejudice persisting for long once you do that."
It is like saying about gender or racism: "Why don't you make accomplishments equal to us? Once you do that, I don't see the prejudice persisting for long."
Not to be picky here, but "I have heard some native speakers *users *of some "Dravidian languages" say that there are some (I guess minor) problems with Unicode for their languages": using the terms "native" and "speakers" to refer to "users" (or as I sometimes use among knowers of language: "languagers") has been an unhealthy baggage from our past practices in the language space. I can't comment on the issue of "Dravidian languages" and Unicode, but it seems one might need more info/details on the complaints to act further.
Well, yes, the word "native" has indeed a very dark history. How could I not know it? But, the manner in which linguists use the term "native speaker" is very different. At least I think so. Note that I am only mentioning linguists here, not "CL/NLPer". By the way, I thought I had coined the term "NLPer" in my blog long ago, but I may be wrong about that.
I am pretty sure one can find out these details in some online forum or some academic publication etc. I have not so far done that, but it is a good idea to do that. I will try.
Re "psycholinguistic validity from computational validity": in (cognitive/psycholinguistic) modeling, there is / can be not much difference between the two. (When one has enough experience with modeling or with language/psycholinguistic phenomena, it's not hard to see that results from computational modeling could also hold elsewhere. The art then is to be able to connect the two "realms". But then again, it depends on the claims, of course.)
Yes, of course, it depends on the claim. The two realms can definitely be connected. We don't disagree about that. In fact, I think, we don't really disagree about that many things it seems. Even so, isn't it possible to implement the same thing in many different ways when it comes to computation? That is not the case with the brain/mind, of course. Here, I am again making a distinction between computation and mathematics. Perhaps you don't agree with that? In that case, perhaps we mean different things by the term "computation".
Re science and engineering: I am not sure if engineering has to be "just about" heuristics or short cuts. There is good engineering and there is bad engineering, for the sake of my arguments here. In the context of ML with language/textual data, one ought to be careful with "computing based on values of surface elements/strings". Much of what the CL/NLP community/communities have been doing for the past few decades has been "computing based on values of surface elements/strings". This practice deserves serious re-evaluation (there are lots of grey areas here and opportunities to compare processing across finer granularities (without all the preprocessing hacks/"heuristics"/"engineering") for various tasks and data types/formats, without "words", "sentences", "linguistic structure(s)", "grammar" et al.). I don't think of it as "it's engineering!", but some bad practices/culture having been promoted as such and normalized (for a couple of decades?). Good engineering can also be fine, thoughtful, and robust.
I completely agree. But sometimes I do work on things which, theoretically, seem ridiculous to me, but they may be practically useful. At least half (perhaps more) of my motive to work on language processing is to address somehow, to any extent, the issue of linguistic empowerment. I am prepared to compromise theoretically for that purpose.
Re "I don't think there is anything wrong with what you call grammar hacking from the engineering point of view": I do (think there is something wrong), because: i. "all grammars leak" (from Sapir, also in Manning and Schütze (1999));
I know. I tell this to students every time I teach NLP or any related subject.
ii. "words" (whatever they are) are too coarse-grained for computing.
I already agreed to that, but if they help in my benign motives, I am prepared to use them.
Re "it [language(s)] is still likely to have an 'organic' structure": couldn't that structure (one not associated with "words"/"sentences"/"grammar") be one from math or computing? Or one that is a by-product of a combination of these factors?
It certainly could.
Some CL/NLPers have made various claims concerning "structures" in the past, borrowing the concepts from "linguistic structure(s)", from "grammar". There was a lot of chiming along; many have often neglected the fact that grammar could effect the impression of "structures" through "words" etc. or that it all in turn patterns some of our thoughts/judgments sub-/unconsciously. And the loop goes on. (See also: https://twitter.com/adawan919/status/1532335891448057858)
Well, if you prefer the term "patterns" to "grammar" or "structure", I am completely fine with that. As I said earlier, I am moving towards the language games view of language, even for this exchange. We can't avoid that if we are talking in a human/natural language. The only way to avoid that, if there is a way, is to use only mathematical notation, but I don't think we have reached that stage so far with the study of language.
Re "in English "John loves Mary" is in fact a very different thing than "Mary loves John"": one has to re-evaluate to which extent this matters in whichever form of computing/computation one is engaged in and how often this "canonical form" that you are implicitly referring to really occurs in data as well as how this actually surfaces in data. One should look at the data in front of one, not the framework/theory in one's mind. (I believe in achieving better designs/systems through testing from both a data-centric as well as an algorithm-centric perspective. Hardware counts too!)
I mostly agree, but I am not sure whether you are saying that "John loves Mary" may not perhaps be different from "Mary loves John"?
Re "it is unfair to blame Linguistics for that": My focus in "[t]he "non-native speakers of X" has been a plague in Linguistics" was on the "native" part. My understanding has been that, at least to a great extent, "nativeness" was so promoted/reinforced, esp. within the school of generative Linguistics in the 2nd half of the 20th century, when it comes to "linguistic judgments". I thought the propagation stemmed from there. Who/What else do you think started it?
I think the word "native" was used in a derogatory/condescending way throughout the English speaking world, even before the "birth of Modern Linguistics". It was, in fact, the more polite word. One other common word was "savage". I remember being shocked to find (very long ago) in Jane Eyre the phrase "savages living on the banks of the Ganges". Savages? On the bank of the Ganges?
But these usages are much older (than "Modern Linguistics").
The matter of "linguistic judgements" or "grammaticality" is very different from that, regardless of what one's opinions are about the existence of grammar.
All in all, your replies remind me of many of the reviewers' responses "typical" of (i.e. I often got from) the *CL circle (of those who remained in the past decade or so).
I don't know about that. I thought I had very unconventional views of NLP, but I could be wrong about that, at least relatively.
If I may guess: i. you don't have an academic background in Linguistics (esp. general Linguistics,
That is true. I have learned about language(s) mostly on my own. So, if you want me to show my degree in Linguistics, I have none, except a PhD in Computational Linguistics. I was the second person to get a PhD specifically in CL in India.
note that there is a difference between linguistics of particular languages and that in a more general/theoretical manner (not about (p-)language grammatical particularities),
Of course there is. What makes you think I don't know that? The fact is, my knowledge of "(p)-language" is relatively very limited. Even my knowledge of the syntax of Hindi (my "mother tongue"), in a formal sense, is very limited. I mostly know about language in general.
ii. you learn about language(s) through mostly non-academic books or through your own language experience(s) (which counts too, I am not invalidating it/them here),
I have no idea what makes you say that. Am I supposed to list the Linguistics books I have read, in addition to showing my Linguistics degree? (Sorry if that sounds bitter, but it has happened in the past, not literally, but effectively).
I can only say -- and it is strange to even have to say this -- that I definitely know more about language in general and linguistic theory than -- at least -- most graduates and postgraduates (including PhDs) of Linguistics in India.
iii. you never had phonetics and phonology, nor
In my replies on this thread I have not mentioned anything related to phonetics or phonology. So it must be from somewhere else that you have this impression(?). Is it from some of the papers co-authored by me? I think there are some which could give that impression. Explaining why that could be so, will take this discussion somewhere else. I don't think it is relevant here at all.
Is your point that I don't know about phonetics or phonology? If it is, I would prefer not to answer that.
iv. do you realize how you can practice without "words"
I do. But I am also prepared to use "words" wherever they help. As I wrote earlier, I have worked without "words" sometimes and have argued against them.
--- did I get any right?
You got -- sort of -- the first and the third, assuming you were asking me to show my Linguistics degree and whether I had formally taken courses on Phonetics and Phonology.
I wanted to note this because --- and please do not take offense, it is not meant personally for I respect your expertise and appreciate our exchanges --- for a while, I didn't know where (else) to submit my findings. It wasn't until I got all the rejections with rather shallow comments about language (or language and computing) that I realized the "solidarity" one has built with people with a background similar to yours might have been the driving force of how some computational (general) linguists (as in, "general language scientists" who also do computational work --- there are only a few of us) got chased out of the arena. The "typical" excuses for practices of this "culture" have been "engineering", "useful", "it works" --- but without any/much grounding/interest in good generalizations. One puts excess focus on processing but not on evaluation or interpretation. I think it's time for a "culture" change in this regard.
To your other reply below (in triple quotes):
Sorry, but I didn't understand what you mean by triple quotes. I could not find any triple quotes in your comments.
re "language policy": not everything has to be or can be regulated. Policies can help with promoting/reinforcing/rectifying a particular situation/initiative.
I agree.
Forcing people e.g. to use language in one particular way or to use "one language" only (whatever "language" "means"*)?
Again, I agree. However, like most Europeans and Americans, you seem to have no (or very little) idea how people are already being forced to use some language or another. And it is hardly a new phenomenon, but it has become much more serious now due mainly to colonization and all its effects. Do you have any idea how much hundreds of millions of Indians suffer simply from being forced to use English? My primary motive in my whole life, for good reasons, has been to counter linguistic discrimination, mainly due to the imposition of English on any Indian who has any ambition at all. I think that alone makes me sufficiently qualified to "work on language". This is analogous to any other kind of discrimination or prejudice.
I do not think that would be a good direction to go. For any region, we have seen both good/better and bad/worse policies throughout the course of history. One would really have to evaluate the proposed policy in question carefully.
I never said anything about forcing people to speak one language. That's why I said I don't know what exactly the connection is with the language policy. But it sure has a very strong connection, because the problem in the first place is due to language policy, written and unwritten. Do you know that there are and have been schools in the world, including India, where students are punished if they are caught speaking in their mother tongue (or first, "native" language)? I still remember feeling recognition and the impression it had on me when I read the famous novel about life in Wales, How Green Was My Valley. I had just become fluent in English then.
Depending on the situation, some may best change things through the economy, some through government support, some through education and/or grassroots-type initiatives, some a combination of all these and more....
I agree.
*I have an answer to this... please wait for my next pub or so.
I would love to read it. I am eternally hungry for any fresh look on language in general. Not so much for particular languages or varieties. That is to me, to some extent, boring.
Re "it is very much like conservation of ecology or of species. I don't think it (the latter) will be considered unwarranted prescriptivism": see my 2nd response above.
The same for my reply to that comment.
Also, with language documentation, one can just document data without promoting grammar. (That's probably the less unethical thing one can do with language or language data.)
Again I agree. I never said anything about promoting grammar. I don't like to read grammar books. It's painful to me, compared to almost any other topic under the sun, except perhaps finance, commerce, and the intricacies of legal procedures.
For the sake of completeness, I should clarify -- as it seems to matter -- that I have "never had Syntax or even Semantics or Pragmatics". I am mostly self-taught, not just in Linguistics, but also in Computer Science and almost everything else. Do you really think it matters in the context of this discussion?
Again, for the sake of completeness, I should mention that for decades, I have been reading all kinds of books that had anything to do with language, mostly in general, but also about Hindi or Indian languages, not to mention English. These have included what you call academic books on language in general and about Linguistics. I still keep reading, as I know very well that, being self-taught, I have some gaps in my knowledge of Linguistics and Computer Science. My undergraduate degree was in Mechanical Engineering (from 1990), but I hardly remember anything in that area. I have similarly been reading all kinds of books for decades about computers and Computer Science.
I am unable to see how any of this matters in the context of this discussion.
By the way, I like the metaphor you use for language: It being like a graphical user interface for the brain. That reminds me of the views of Daniel Dennett about consciousness. He constantly compares elements making up consciousness to graphical user interfaces on computers. Not that I completely agree with him about consciousness, but I still find the metaphor quite good, perhaps as an approximation.
Just a quick reply before the weekend to some of the points that I thought deserve a short clarification:
1. re linguistic empowerment: yes and no. As I commented on X (formerly Twitter) on 09Aug2023: "[t]here are divergent ways of thinking... but when it comes to language and the social sciences, one must be careful with how one "diverges"! Humans and [sic: are] humans. And much of what we postulate re "in-group/out-group" can be a matter of our "traditions" (if so, is it time to re-evaluate?), perspectives (if so, can we be biased sometimes?), or our willingness to include or will to exclude. How different can particular "languages" (in a folk psychological, proverbial usage) be, really? Where do the differences lie?". The same goes with "diversity" efforts. AND OF COURSE I am NOT AGAINST these. But one has to be careful how far one goes with "difference(s)".
2. Re "John loves Mary" being "same/different" as "Mary loves John": it depends. Note stress/emphasis/topicalization, different usage pattern(s) etc., not just "subj verb obj".
3. Btw, your usage of the term "word" can be replaced by other alternative formulations, e.g. "term".
4. Re phonetics and phonology: I was not referring to the relevance of phonetic/phonological knowledge per se, that a practitioner in the space of "language and computing" would "need" in order to be competent. But that, as well as a comprehensive knowledge of general language theories and a broad background in p-languages and their (social/usage) contexts, belongs in the toolkit of a good linguist (as in, a good language scientist). To me, progressing to finer granularities is just refining our assumptions, our model. But to those who may not have had phonetics/phonology, they may be more likely to think that they "need" "words" and hence my findings might be either a paradigm shift or the end of the world.
5. Re triple quotes: """ I copied and pasted your reply that didn't seem to have been sent to the list and put it in triple quotes, as a reference (for others).
6. Re "Do you have any idea how much hundreds of millions of Indians suffer simply from being forced to use English?": in what ways are they "forced"? But I can understand that. The more important thing is to also understand that no one has to discriminate based on language(s), no one has to adopt a purist attitude when it comes to using/understanding of language by others. People can always have various language/linguistic habits, no one has to use "one language only". The point is not to use language as a weapon. [These are things I think you know, but many on this list may not.] Re "Do you know that there are and have been schools in the world, including India, where students are punished if they are caught speaking in their mother tongue (or first, "native" language)": is this still happening in India? I've only had similar experiences in my "foreign language" lessons and in real life (using a variety/style y when/where y can be "frowned upon") --- though not "punished", just looked upon with 🙄 or 👀 in ways condescending.
Great weekend!
On Fri, Aug 11, 2023 at 5:01 PM Anil Singh anil.phdcl@gmail.com wrote:
Re "Wanderers, Kings, Merchants: The Story of India through Its Languages by Peggy Mohan": look forward to reading it. (History or historical events aside, does it have to do with how language(s) became an indicator of power/privilege/status? If humankind has anything in common, this could be one general observation...)
It does to some extent, particularly with regard to the history of languages and related things in India.
Re "If people are used to writing in their own language on computers, then that language is more likely to survive": I don't disagree, but --- [Forewarning: one might not like my reply to the following, but I ask for it to be interpreted with a scientific mindset rather, not with emotions and sentiments related to language identity and particular cultural practices.] "Languages" (as in, particular language varieties, not "language" as in language-at-large) come and go, born and die (and, sometimes, come back), like trends, styles, cultures ("culture" as in a particular way of living, a set of habits...). Esp. for users of varieties that have undergone oppression/suppression, I understand that there is or can be much meaning to many users in having the varieties be alive or in use. It is important to the users. It is a symbol of their existence.
Yes, that is important.
But having witnessed how language has been abused (e.g. with research greed by some CL/NLPers), I sometimes think one might have gone too far with how much identity one attaches to any particular language.
Coming trom the background that I do, your statement above seems very similar to saying that we might have gone too far with political correctness or about opposing racism/misogyny/etc. Like most Europeans or Americans, you seem to have no (or very little) idea about the toll that discrimination -- even unintended discrimination -- takes on a very large part of the human population.
In early 2009, I had written a rant in a blog post and the title of the post was "English is Language Independent". I had made roughly the same points which the now famous Bender Rule makes. I have been writing about it, although that blog is now defunct. I did not, however, make any proposal, as the Bender Rule does. I just pointed out the problem, so I am in no way undermining the importance of that.
And one anonymous comment on this blog post was this: "Why don't you work on a good project. I don't see the prejudice persisting for long once you do that."
It is like saying about gender or racism: "Why don't you make accomplishments equal to us? Once you do that, I don't see the prejudice persisting for long."
Not to be picky here, but "I have heard some native speakers *users *of some "Dravidian languages" say that there are some (I guess minor) problems with Unicode for their languages": using the terms "native" and "speakers" to refer to "users" (or as I sometimes use among knowers of language: "languagers") has been an unhealthy baggage from our past practices in the language space. I can't comment on the issue of "Dravidian languages" and Unicode, but it seems one might need more info/details on the complaints to act further.
Well, yes, the word "native" has indeed a very dark history. How could I not know it? But, the manner in which linguists use the term "native speaker" is very different. At least I think so. Note that I am only mentioning linguists here, not "CL/NLPer". By the way, I thought I had coined the term "NLPer" in my blog long ago, but I may be wrong about that.
I am pretty sure one can find out these details in some online forum or some academic publication etc. I have not so far done that, but it is a good idea to do that. I will try.
Re "psycholinguistic validity from computational validity": in (cognitive/psycholinguistic) modeling, there is / can be not much difference between the two. (When one has enough experiences with modeling or with language/psycholinguistic phenomena, it's not hard to to see that results from computational modeling could also hold elsewhere. The art then is to be able to connect the two "realms". But then again, it depends on the claims, of course.)
Yes, of course, it depends on the claim. The two realms can definitely be connected. We don't disagree about that. In fact, I think, we don't really disagree about that many things it seems. Even so, isn't it possible to implement the same thing in many different ways when it comes to computation? That is not the case with the brain/mind, of course. Here, I am again making a distinction between computation and mathematics. Perhaps you don't agree with that? In that case, perhaps we mean different things by the term "computation".
Re science and engineering: I am not sure if engineering has to be "just about" heuristics or short cuts. There is good engineering and there is bad engineering, for the sake of my arguments here. In the context of ML with language/textual data, one ought to be careful with "computing based on values of surface elements/strings". Much of what the CL/NLP community/communities have been doing for the past few decades has been "computing based on values of surface elements/strings". This practice deserves serious re-evaluation (there are lots of grey areas here and opportunities to compare processing across finer granularities (without all the preprocessing hacks/"heuristics"/"engineering") for various tasks and data types/formats, without "words", "sentences", "linguistic structure(s)", "grammar" et al.). I don't think of it as "it's engineering!", but some bad practices/culture having been promoted as such and normalized (for a couple of decades?). Good engineering can also be fine, thoughtful, and robust.
I completely agree. But sometimes I do work on things which, theoretically, seem ridiculous to me, but they may be practically useful. At least half (perhaps more) of my motive to work on language processing is to address somehow, to any extent, the issue of linguistic empowerment. I am prepared to compromise theoretically for that purpose.
Re "I don't think there is anything wrong with what you call grammar hacking from the engineering point of view": I do (think there is something wrong), because: i. "all grammars leak" (from Sapir, also in Manning and Schütze (1999));
I know. I tell this to students every time I teach NLP or any related subject.
ii. "words" (whatever they are) are too coarse-grained for computing.
I already agreed to that, but if they help in my benign motives, I am prepared to use them.
Re "it [language(s)] is still likely to have an 'organic' structure": couldn't that structure (one not associated with "words"/"sentences"/"grammar") be one from math or computing? Or one that is a by-product of a combination of these factors?
It certainly could.
Some CL/NLPers have made various claims concerning "structures" in the past, borrowing the concepts from "linguistic structure(s)", from "grammar". There was a lot of chiming along, many often have neglected the fact that grammar could effect the impression of "structures" through "words" etc. or that it all in turn patterns some of our thoughts/judgments sub-/unconsciously. And the loop goes on. (See also: See also: https://twitter.com/adawan919/status/1532335891448057858)
Well, if you prefer the term "patterns" to "grammar" or "structure", I am completely fine with that. As I said earlier, I am moving towards the language games view of language, even for this exchange. We can't avoid that if we are talking in a human/natural language. The only way to avoid that, if there is a way, is to use only mathematical notation, but I don't think we have reached that stage so far with the study of language.
Re "in English "John loves Mary" is in fact a very different thing than "Mary loves John"": one has to re-evaluate to which extent this matters in whichever form of computing/computation one is engaged in and how often this "canonical form" that you are implicitly referring to really occurs in data as well as how this actually surfaces in data. One should look at the data in front of one, not the framework/theory in one's mind. (I believe in achieving better designs/systems through testing from both a data-centric as well as an algorithm-centric perspective. Hardware counts too!)
I mostly agree, but I am not sure whether you are saying that "John loves Mary" may not perhaps be different from "Mary loves John"?
Re " it is unfair to blame Linguistics for that": My focus in "[t]he "non-native speakers of X" has been a plague in Linguistics" was on the "native" part. That has been my understanding, at least to a great extent "nativeness" was so promoted/reinforced, esp. within the school of generative Linguistics in the 2nd half of the 20th century, when it comes to "linguistic judgments". I thought the propagation stemmed from there. Who/What else do you think started it?
I think the word "native" was used in a derogatory/condescending way throughout the English speaking world, even before the "birth of Modern Linguistics". It was, in fact, the more polite word. One other common word was "savage". I remember being shocked to find (very long ago) in Jane Eyre the phrase "savages living on the banks of the Ganges". Savages? On the bank of the Ganges?
But these usages are much older (than "Modern Linguistics").
The matter of "linguistic judgements" or "grammaticality" is very different from that, regardless of what one's opinions are about the existence of grammar.
All in all, your replies remind me of many of the reviewers' responses "typical" of (i.e. I often got from) the *CL circle (of those who remained in the past decade or so).
I don't know about that. I thought I had very unconventional views of NLP, but I could be wrong about that, at least relatively.
If I may guess: i. you don't have an academic background in Linguistics (esp. general Linguistics,
That is true. I have learned about language(s) mostly on my own. So, if you want me to show my degree in Linguistics, I have none, except a PhD in Computational Linguistics. I was the second person to get a PhD specifically in CL in India.
note that there is a difference between linguistics of particular languages and that in a more general/theoretical manner (not about (p-)language grammatical particularities),
Of course there is. What makes you think I don't know that? The fact is, my knowledge of "(p)-language" is relatively very limited. Even my knowledge of the syntax of Hindi (my "mother tongue"), in a formal sense, is very limited. I mostly know about language in general.
ii. you learn about language(s) through mostly non-academic books or through your own language experience(s) (which counts too, I am not invalidating it/them here),
I have no idea what makes you say that. Am I supposed to list the Linguistics books I have read, in addition to showing my Linguistics degree? (Sorry if that sounds bitter, but it has happened in the past, not literally, but effectively).
I can only say -- and it is strange to even have to say this -- that I definitely know more about language in general and linguistic theory than -- at least -- most graduates and postgraduates (including PhDs) of Linguistics in India.
iii. you never had phonetics and phonology, nor
In my replies on this thread I have not mentioned anything related to phonetics or phonology. So it must be from somewhere else that you have this impression(?). Is it from some of the papers co-authored by me? I think there are some which could give that impression. Explaining why that could be so will take this discussion somewhere else. I don't think it is relevant here at all.
Is your point that I don't know about phonetics or phonology? If it is, I would prefer not to answer that.
iv. do you realize how you can practice without "words"
I do. But I am also prepared to use "words" wherever they help. As I wrote earlier, I have worked without "words" sometimes and have argued against them.
--- did I get any right?
You got -- sort of -- the first and the third, assuming you were asking me to show my Linguistics degree and whether I had formally taken courses on Phonetics and Phonology.
I wanted to note this because --- and please do not take offense, it is not meant personally for I respect your expertise and appreciate our exchanges --- for a while, I didn't know where(else) to submit my findings. It wasn't until I got all the rejections with rather shallow comments about language (or language and computing) did I realize the "solidarity" one has built with people with a background similar to yours might have been the driving force of how some computational (general) linguists (as in, "general language scientists" who also do computational work --- there are only a few of us) got chased out of the arena. The "typical" excuses for practices of this "culture" have been "engineering", "useful", "it works" --- but without any/much grounding/interest in good generalizations. One puts excess focus on processing but not on evaluation or interpretation. I think it's time for a "culture" change in this regard.
To your other reply below (in triple quotes):
Sorry, but I didn't understand what you mean by triple quotes. I could not find any triple quotes in your comments.
re "language policy": not everything has to be or can be regulated. Policies can help with promoting/reinforcing/rectifying a particular situation/initiative.
I agree.
Forcing people e.g. to use language in one particular way or to use "one language" only (whatever "language" "means"*)?
Again, I agree. However, like most Europeans and Americans, you seem to have no (or very little) idea how people are already being forced to use some language or another. And it is hardly a new phenomenon, but it has become much more serious now due mainly to colonization and all its effects. Do you have any idea how much hundreds of millions of Indians suffer simply from being forced to use English? My primary motive in my whole life, for good reasons, has been to counter linguistic discrimination, mainly due to the imposition of English on any Indian who has any ambition at all. I think that alone makes me sufficiently qualified to "work on language". This is analogous to any other kind of discrimination or prejudice.
I do not think that would be a good direction to go. For any regions, we have seen both good/better and bad/worse policies throughout the course of history. One would really have to evaluate the proposed policy in question carefully.
I never said anything about forcing people to speak one language. That's why I said I don't know what exactly the connection is with the language policy. But it sure has a very strong connection, because the problem in the first place is due to language policy, written and unwritten. Do you know that there are and have been schools in the world, including India, where students are punished if they are caught speaking in their mother tongue (or first, "native" language)? I still remember the feeling of recognition, and the impression it made on me, when I read the famous novel about life in Wales, How Green Was My Valley. I had just become fluent in English then.
Depending on the situation, some may best change things through the economy, some through government support, some through education and/or grassroot-type of initiatives, some a combination of all these and more....
I agree.
*I have an answer to this... please wait for my next pub or so.
I would love to read it. I am eternally hungry for any fresh look at language in general. Not so much for particular languages or varieties. That is to me, to some extent, boring.
Re "it is very much like conservation of ecology or of species. I don't think it (the latter) will be considered unwarranted prescriptivism": see my 2nd response above.
The same for my reply to that comment.
Also, with language documentation, one can just document data without promoting grammar. (That's probably the less unethical thing one can do with language or language data.)
Again I agree. I never said anything about promoting grammar. I don't like to read grammar books. It's painful to me, compared to almost any other topic under the sun, except perhaps finance, commerce, and the intricacies of legal procedures.
For the sake of completeness, I should clarify -- as it seems to matter -- that I have "never had Syntax or even Semantics or Pragmatics". I am mostly self-taught, not just in Linguistics, but also in Computer Science and almost everything else. Do you really think it matters in the context of this discussion?
Again, for the sake of completeness, I should mention that for decades, I have been reading all kinds of books that had anything to do with language, mostly in general, but also about Hindi or Indian languages, not to mention English. These have included what you call academic books on language in general and about Linguistics. I still keep reading, as I know very well that, being self-taught, I have some gaps in my knowledge of Linguistics and Computer Science. My undergraduate degree was in Mechanical Engineering (from 1990), but I hardly remember anything in that area. I have similarly been reading all kinds of books for decades about computers and Computer Science.
I am unable to see how any of this matters in the context of this discussion.
By the way, I like the metaphor you use for language: It being like a graphical user interface for the brain. That reminds me of the views of Daniel Dennett about consciousness. He constantly compares elements making up consciousness to graphical user interfaces on computers. Not that I completely agree with him about consciousness, but I still find the metaphor quite good, perhaps as an approximation.
On Fri, Aug 11, 2023 at 9:52 PM Ada Wan adawan919@gmail.com wrote:
Just a quick reply before the weekend to some of the points that I thought deserve a short clarification:
- re linguistic empowerment: yes and no. As I commented on X (formerly Twitter) on 09Aug2023: "[t]here are divergent ways of thinking... but when it comes to language and the social sciences, one must be careful with how one "diverges"! Humans and [sic: are] humans. And much of what we postulate re "in-group/out-group" can be a matter of our "traditions" (if so, is it time to re-evaluate?), perspectives (if so, can we be biased sometimes?), or our willingness to include or will to exclude. How different can particular "languages" (in a folk psychological, proverbial usage) be, really? Where do the differences lie?".
Of course. That goes without saying. For almost 40 years now I have been looking at this issue and thinking about it. I had wanted to write a book about it. This is what got me into languages and then NLP, because I was then hooked by the study of language by itself, even without the issue of linguistic empowerment, particularly from the computational point of view. I have looked at it from all possible points of view. I was not a linguistic purist even when I began -- with a lot of bitterness -- and I am certainly not that now. I have no doubt at all that there is something universal and species-specific about human languages, although I don't know in what way it is universal exactly. No one does, as far as I know. No one could be more against linguistic chauvinism of any kind than me. Or any other kind of chauvinism.
The same goes with "diversity" efforts. AND OF COURSE I am NOT AGAINST these.
I believe that. I didn't say you were. I just gave an example to make my point.
But one has to be careful how far one goes with "difference(s)".
Only as far as is reasonable and fair to everyone.
- Re "John loves Mary" being "same/different" as "Mary loves John": it depends. Note stress/emphasis/topicalization, different usage pattern(s) etc., not just "subj verb obj".
Well, yes, that is the central contradiction of Linguistics. It is primarily supposed to be about spoken language, but -- quite naturally -- linguists in academic literature have to use examples in written form. And the written form misses "stress/emphasis/topicalization, different usage pattern(s) etc.". Being concerned with language for 40 years, how could I possibly not know it?
However, I am unable to imagine a scenario where "John loves Mary" could be the same as "Mary loves John", with any possible stress/emphasis/topicalization, different usage pattern(s) etc. for either of them and their combinations. It may be that I am missing something here.
- Btw, your usage of the term "word" can be replaced by other alternative formulations, e.g. "term",
Let us terminate this terminological tussle about the term 'term', that is to say, the term 'word'.
I have already more than once agreed that the term 'word' is ill-defined and that I have even written about it. In this case, you are indulging in what can be called shadow boxing.
- Re phonetics and phonology: I was not referring to the relevance of phonetic/phonological knowledge per se, that a practitioner in the space of "language and computing" would "need" in order to be competent. But that, as well as a comprehensive knowledge of general language theories and a broad background in p-languages and their (social/usage) contexts, belongs in the toolkit of a good linguist (as in, a good language scientist). To me, progressing to finer granularities is just refining our assumptions, our model.
I mostly agree. Only mostly, since the statement above is a somewhat vague programmatic statement. If the details were there, I could agree to specific things.
But to those who may not have had phonetics/phonology, they may be more likely to think that they "need" "words" and hence my findings might be either a paradigm shift or the end of the world.
So, you are still equating "having had phonetics/phonology", which can be translated as having formally attended and passed courses and exams in phonetics/phonology. I can't imagine how you could possibly talk about de-pedatization if you subscribe to this -- in my opinion -- somewhat ridiculous way of thinking. Pardon me for using strong words, but what you say is on the borderline of being offensive, if not actually offensive. And it is extremely silly and childish, coming from such a well-read person.
- Re triple quotes: """
I copied and pasted your reply that didn't seem to have been sent to the list and put it in triple quotes, as a reference (for others).
OK.
- Re "Do you have any idea how much hundreds of millions of Indians suffer simply from being forced to use English?": in what ways are they "forced"?
I can't even begin to describe the innumerable ways in which people are forced to use English. There is tons of literature about that, but a lot of it may be in non-European languages. For example, it is there in Hindi. The book that I always wanted to write, but for various reasons couldn't, at least so far, was partly about that.
Just to mention a few examples. The medium of instruction in India, particularly for higher education, and exclusively for technical and scientific education, is English. Every day hundreds of millions of people suffer due to that. The result is that a lot of people grow up with complexes and stunted intellect, as they cannot understand what the teacher is saying, what is written in the books, and so on. When they come to college, a majority of people have problems writing one decent page of content in either their own language(s) or English.
The legal system, particularly at higher levels, works in English. As a result, the overwhelming majority of people have no idea what is going on. They have to rely on others completely, some of whom themselves may not be very fluent in English.
All the lucrative jobs require not only knowledge of English, but spoken fluency in English. Not only that, your accent while speaking English puts you in a particular caste, so to speak. As a result, an incompetent and badly educated person who speaks fluent English can get through life much more easily than a competent well-educated person with a 'bad' English accent.
There is little incentive to write (and read) in Indian languages, and therefore it is very difficult to write and publish literary or academic or even other kinds of books in Indian languages.
And so on and on and on.
The challenge is that it is very difficult to solve this problem, since there are many major languages in India, and so speakers of one language will not accept 'imposition' of another Indian language, or even the requirement to learn another Indian language. As a result, just as the British ruled by Divide-and-Conquer, so English rules in this time-tested way.
And to top it all, if you want to be associated with the global economics/culture/world-at-large, you again need English.
But I can understand that.
I don't think you do at all, based on your comments.
The more important thing is to also understand that no one has to discriminate based on language(s),
Sure! Who can disagree with that except a language chauvinist?
no one has to adopt a purist attitude when it comes to using/understanding of language by others.
Perfectly true. Did I even hint at that to the least degree? You are again shadow boxing.
People can always have various language/linguistic habits, no one has to use "one language only".
Again, did I even hint at that in any possible way?
The point is not to use language as a weapon.
Ditto as above.
[These are things I think you know, but many on this list may not.]
I sure do. I have been thinking and researching about these matters for the last 40 years, almost obsessively, from all possible points of view. My position on this issue has changed a great deal over the years. But even when I started, I was not a purist in any sense of the word. I will always be against forcing people to do things they don't want to do.
Re "Do you know that there are and have been schools in the world, including India, where students are punished if they are caught speaking in their mother tongue (or first, "native" language)": is this still happening in India? I've only had similar experiences in my "foreign language" lessons and in real life (using a variety/style y when/where y can be "frowned upon") --- though not "punished", just looked upon with 🙄 or 👀 in ways condescending.
The last time I checked, it was happening. As of this moment, I don't know for sure. But, as pointed out above, innumerable people do suffer in innumerable ways due to the supremacy of English in India. I may have some suggestions, but I don't really know the solution to this issue, as it is complicated by so many factors. I will never ever support forcing people to use one or the other language.
Why do you make assumptions as you comment on anything and hurl semi-insults? You don't really know me. I didn't assume anything about you.
For example, if I have understood correctly, you simply wanted to say that one should pay attention to stress/emphasis/topicalization (I will add, from my side, prosody and intonation) when considering the meanings of the sentences "John loves Mary" and "Mary loves John" (note that there is no question mark here at the end). And you went about it by first saying "let me guess" and then making some silly statements about my competence and expertise about language(s). You could have simply mentioned the importance of stress/emphasis/topicalization in the beginning. I can't imagine any reason why you have to make such assumptions.
Well-read and well-educated as you are about language(s), perhaps it is possible, even if only remotely, that I could have a thing or two that I could tell you about language that you might not perhaps know?
Great weekend!
A typo correction:
Read:
"So, you are still equating "having had phonetics/phonology", which can be translated as having formally attended and passed courses and exams in phonetics/phonology. "
As:
"So, you are still equating "having had phonetics/phonology", which can be translated as having formally attended and passed courses and exams in phonetics/phonology with having knowledge about phonetics/phonology? And similarly for any kind of academic knowledge?"
Also, we don't have to think of a "grammar-based model" (or any other kind of model) and a computational model as being mutually exclusive. After all, programming languages are computational and used for all kinds of computation, but they have a well-defined grammar, based on insights and findings from Linguistics, not only Computer Science and mathematics and logic. We can build a computational model of almost everything.
On Mon, Aug 14, 2023 at 7:10 PM Anil Singh anil.phdcl@gmail.com wrote:
A typo correction:
Read:
"So, you are still equating "having had phonetics/phonology", which can be translated as having formally attended and passed courses and exams in phonetics/phonology. "
As:
"So, you are still equating "having had phonetics/phonology", which can be translated as having formally attended and passed courses and exams in phonetics/phonology with having knowledge about phonetics/phonology? And similarly for any kind of academic knowledge?"
On Mon, Aug 14, 2023 at 12:46 PM Anil Singh via Corpora < corpora@list.elra.info> wrote:
On Fri, Aug 11, 2023 at 9:52 PM Ada Wan adawan919@gmail.com wrote:
Just a quick reply before the weekend to some of the points that I thought deserve a short clarification:
- re linguistic empowerment: yes and no. As I commented on X (formerly
Twitter) on 09Aug2023: "[t]here are divergent ways of thinking... but when it comes to language and the social sciences, one must be careful with how one "diverges"! Humans and [sic: are] humans. And much of what we postulate re "in-group/out-group" can be a matter of our "traditions" (if so, is it time to re-evaluate?), perspectives (if so, can we be biased sometimes?), or our willingness to include or will to exclude. How different can particular "languages" (in a folk psychological, proverbial usage) be, really? Where do the differences lie?".
Of course. That goes without saying. For almost 40 years now I have been looking at this issue and thinking about it. I had wanted to write a book about it. This is what got me into languages and then NLP, because I was then hooked by the study of language by itself, even without the issue of linguistic empowerment, particularly from the computational point of view. I have looked at it from all possible points of view. I was not a linguistic purist even when I began -- with a lot of bitterness -- and I am certainly not that now. I have no doubt at all that there is something universal and species-specific about human languages, although I don't know in what way it is universal exactly. No one does, as far as I know. No one could be more against linguistic chauvinism of any kind than me. Or any other kind of chauvinism.
The same goes with "diversity" efforts. AND OF COURSE I am NOT AGAINST these.
I believe that. I didn't say you were. I just gave an example to make my point.
But one has to be careful how far one goes with "difference(s)".
Only as far as is reasonable and fair to everyone.
- Re "John loves Mary" being "same/different" as "Mary loves John": it
depends. Note stress/emphasis/topicalization, different usage pattern(s) etc., not just "subj verb obj".
Well, yes, that is the central contradiction of Linguistics. It is primarily supposed to be about spoken language, but -- quite naturally -- linguists in academic literature have to use examples in written form. And the written form misses "stress/emphasis/topicalization, different usage pattern(s) etc.". Being concerned with language for 40 years, how could I possibly not know it?
However, I am unable to imagine a scenario where "John loves Mary" could be the same "Mary loves John", with any possible stress/emphasis/topicalization, different usage pattern(s) etc. for either of them and their combinations. It may be that I am missing something here.
- Btw, your usage of the term "word" can be replaced by other
alternative formulations, e.g. "term",
Let us terminate this terminological tussle about the term 'term', that is to say, the term 'word'.
I have already more than once agreed that the term 'word' is ill-defined and that I have even written about it. In this case, you are indulging in what can be called shadow boxing.
- Re phonetics and phonology: I was not referring to the relevance of
phonetic/phonological knowledge per se, that a practitioner in the space of "language and computing" would "need" in order to be competent. But that, as well as a comprehensive knowledge of general language theories and a broad background in p-languages and their (social/usage) contexts, belongs in the toolkit of a good linguist (as in, a good language scientist). To me, progressing to finer granularities is just refining our assumptions, our model.
I mostly agree. Only mostly, since the statement above a somewhat vague programmatic statement. If the details were there, I could agree to specific things.
But to those who may not have had phonetics/phonology, they may be more likely to think that they "need" "words" and hence my findings might be either a paradigm shift or the end of the world.
So, you are still equating "having had phonetics/phonology", which can be translated as having formally attended and passed courses and exams in phonetics/phonology. I can't imagine how you could possibly talk about de-pedatization if you subscribe to this -- in my opinion -- somewhat ridiculous way of thinking. Pardon me for using strong words, but what you say is on the borderline of being offensive, if not actually offensive. And it is extremely silly and childish, coming from such a well-read person.
- Re triple quotes: """
I copied and pasted your reply that didn't seem to have been sent to the list and put it in triple quotes, as a reference (for others).
OK.
- Re "Do you have any idea how much hundreds of millions of Indians suffer simply from being forced to use English?": in what ways are they "forced"?
I can't even begin to attempt to describe the innumerable ways in which people are forced to use English. There are tons of literature about that, but a lot of it may be in non-European languages. For example, it is there in Hindi. The book that I always wanted to write, but for various reasons couldn't, at least so far, was partly about that.
Just to mention a few examples. The medium of instruction in India, particularly for higher education, and exclusively for technical and scientific education, is English. Every day hundreds of millions of people suffer due to that. The result is that a lot of people grow up with complexes and stunted intellect, as they cannot understand what the teacher is saying, what is written in the books, and so on. When they come to college, a majority of people have problems writing one decent page of content in either their own language(s) or English.
The legal system, particularly at higher levels, works in English. As a result, the overwhelming majority of people have no idea what is going on. They have to rely on others completely, some of whom themselves may not be very fluent in English.
All the lucrative jobs require not only knowledge of English, but spoken fluency in English. Not only that, your accent while speaking English puts you in a particular caste, so to speak. As a result, an incompetent and badly educated person who speaks fluent English can get through life much more easily than a competent well-educated person with a 'bad' English accent.
There is little incentive to write (and read) in Indian languages, and therefore it is very difficult to write and publish literary or academic or even other kinds of books in Indian languages.
And so on and on and on.
The challenge is that it is very difficult to solve this problem, since there are many major languages in India, and so speakers of one language will not accept 'imposition' of another Indian language, or even the requirement to learn another Indian language. As a result, just as the British ruled by Divide-and-Conquer, so English rules in this time-tested way.
And to top it all, if you want to be associated with the global economics/culture/world-at-large, you again need English.
But I can understand that.
I don't think you do at all, based on your comments.
The more important thing is to also understand that no one has to discriminate based on language(s),
Sure! Who can disagree with that except a language chauvinist?
no one has to adopt a purist attitude when it comes to using/understanding of language by others.
Perfectly true. Did I even hint at that to the least degree? You are again shadow boxing.
People can always have various language/linguistic habits, no one has to use "one language only".
Again, did I even hint at that in any possible way?
The point is not to use language as a weapon.
Ditto as above.
[These are things I think you know, but many on this list may not.]
I sure do. I have been thinking and researching about these matters for the last 40 years, almost obsessively, from all possible points of view. My position on this issue has changed a great deal over the years. But even when I started, I was not a purist in any sense of the word. I will always be against forcing people to do things they don't want to do.
Re "Do you know that there are and have been schools in the world, including India, where students are punished if they are caught speaking in their mother tongue (or first, "native" language)": is this still happening in India? I've only had similar experiences in my "foreign language" lessons and in real life (using a variety/style y when/where y can be "frowned upon") --- though not "punished", just looked upon with 🙄 or 👀 in ways condescending.
The last time I checked, it was happening. As of this moment, I don't know for sure. But, as pointed out above, innumerable people do suffer in innumerable ways due to the supremacy of English in India. I may have some suggestions, but I don't really know the solution to this issue, as it is complicated by so many factors. I will never ever support forcing people to use one or the other language.
Why do you make assumptions as you comment on anything and hurl semi-insults? You don't really know me. I didn't assume anything about you.
For example, if I have understood correctly, you simply wanted to say that one should pay attention to stress/emphasis/topicalization (I will add, from my side, prosody and intonation) when considering the meanings of the sentences "John loves Mary" and "Mary loves John" (note that there is no question mark here at the end). And you went about it by first saying "let me guess" and then making some silly statements about my competence and expertise about language(s). You could have simply mentioned the importance of stress/emphasis/topicalization in the beginning. I can't imagine any reason why you have to make such assumptions.
Well-read and well-educated as you are about language(s), perhaps it is possible, even if only remotely, that I could have a thing or two that I could tell you about language that you might not perhaps know?
Great weekend!
--
- Anil
Linguistics is not "supposed to be about spoken languages" --- that has just been a disciplinary bias. And the term "spoken languages" is also not quite correct. There are scripts, there are sounds, there are signs --- all of which are* ways in which we express ourselves and the way we interpret information*. Such "ways" can be regarded as language (as in, as has been traditionally and commonly known), g-language*. P-language can be thought of as an instance of such (mixed with some historical contexts, but also A LOT OF identity politics and language attitudes).
*g-language as general/generalized language or "language-at-large" (or in Haspelmath's formulation "general phenomenon of Human Language"), p-language as "particular language(s)" such as those commonly referred to e.g. FR, EN.... See also Haspelmath (2019). Confusing p-linguistics and g-linguistics: Philosopher Ludlow on “framework-free theory". https://dlc.hypotheses.org/1801. Note though that I, by "decomposing 'words'", am taking things one step further in generalization compared to what he would refer to as g-language phenomena. Because previous to my work, one had considered language to be grammar and grammar to be language, at least in the academic setting, and even in Linguistics, as an academic discipline.
Much of what I don't agree with in our current education and R&D landscape has to do with the abuse of "p-language(s)". It's research greed coupled with sentiment manipulation. Many who have studied linguistics (and many philological studies) as well as computational linguistics (and possibly natural language processing --- depending on which traditions/assumptions/practices is/are being taught) are likely to have been affected (I myself included). There was a stage in my life when I also wanted to speak up / advocate for minority languages. Then I realized, it's not really about the language (as texts or speech/sign data...).
[And yes, of course, there ARE the parts about the texts and speech/sign data --- the documentation, the processing, the interpretation/description. But it's not really what many/most people in CL/NLP or the "language space" are doing. They are not seeing interpretation as description. They are describing subjectively (mixing in some historical bias), then processing with biases historical and personal (and emotional), and IF they are interpreting at all, they are doing so based on a grammarian or "layperson" perspective (i.e. not a scientific/technical one).
And also, of course, there is much truth-telling to do when it comes to language attitudes and identity politics. For these could affect the availability of data, how data "quality" is being perceived, among others. But these do not have anything to do with how data is to be computationally processed. For those who perform testing on humans, their studies could reveal potential human biases effected by language attitudes etc.]
=====
2. Re "John loves Mary" etc.: I think you are mixing in your expectation of how a canonical form should behave with its relevance. Re "I am unable to imagine a scenario where "John loves Mary" could be the same "Mary loves John"": they are both 15 characters long. (As to how they can be different, one thing that has not been mentioned: we can interpret/describe these strings with e.g. their transition probabilities, character to character, character n-gram to another --- there are a few who have worked on this, e.g. John Goldsmith (see his youtube videos on MDL) and myself (though in a "word"-free manner). It could give one some insights on the interpretation of phonological phenomena.)
3. "Word" is not ill-defined. It is not definable, in ways that would be necessary or sufficient for science, engineering, R&D.
4. Re "phonetics and phonology": no, though having had p and p may help with understanding language in finer granularities, it's not so much about p and p per se (or any academic degree or courses). It's about having a scientific mindframe. Among phoneticians, phonologists, morphologists, syntacticians, semanticists/semanticians, pragmaticists or social linguists, phoneticians (and some social linguists) were the ones who started using machines to study language. Seeing language as an object of scientific inquiry/investigation is key.
I am not writing what I wrote because I have had some "advanced Linguistics", I am not trying to impose any academic superiority here. But from what you wrote, I do see/sense there is a clearer demarcation between science and language attitudes & identity politics possible.
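[Editorial illustration of point 2 above, as a minimal sketch only: it checks the "15 characters long" observation and estimates plain character-to-character transition probabilities for the two strings. The function name `char_transition_probs` and the variables `a` and `b` are hypothetical names introduced here; this is a simple bigram count under stated assumptions, not Goldsmith's MDL approach and not the method described in this message.]

```python
from collections import Counter, defaultdict


def char_transition_probs(text):
    """Estimate P(next char | current char) from adjacent character pairs."""
    pair_counts = defaultdict(Counter)
    for cur, nxt in zip(text, text[1:]):
        pair_counts[cur][nxt] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        cur: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for cur, followers in pair_counts.items()
    }


a = "John loves Mary"
b = "Mary loves John"

# Same length and same multiset of characters.
print(len(a), len(b))
print(sorted(a) == sorted(b))

probs_a = char_transition_probs(a)
probs_b = char_transition_probs(b)

# The strings differ in what follows a given character, e.g. a space.
print(probs_a[" "])
print(probs_b[" "])
print(probs_a == probs_b)
```

[On this view the two strings are "the same" by length and character inventory but differ in their transition statistics; with only one short string each, the estimates are of course illustrative rather than meaningful.]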
Re "I don't think you do at all, based on your comments": I do understand and empathize. I might have just seen/experienced things from a different angle --- I have seen how language attitudes / identity politics have been exercised in the academic context (also industry), so to "broaden the practitioner pool" for language-related technologies. It's almost like selling/promoting a less legitimate form of currency. In that respect, I think there are things to correct. (So I might come off a bit stern, but that should not be misinterpreted as my lacking empathy/compassion.)
Re "no one has to adopt a purist attitude when it comes to using/understanding of language by others. Perfectly true. Did I even hint at that to the least degree? You are again shadow boxing.": I am agreeing with you by echoing, (re-)confirming/reformulating.
Re "But even when I started, I was not a purist in any sense of the word.": Then how/where do you draw the line between p-languages, esp. outside the context of computing? Some puristic ideas/ideals are there to back the assumption/adoption of such naming.
Re "Why do you make assumptions as you comment on anything and hurl semi-insults? You don't really know me. I didn't assume anything about you.": I think there might have been some misunderstandings. I wasn't discoursing with you with prejudice. In fact, I have been trying to understand you better. Understanding readers' perspectives is important for authors. As you also identified yourself as a seasoned CLer, I thought to draw the connection between your views and some of what I've observed from "the CL population" (e.g. as exemplified by some dominant (and implicit) assumption/interpretation about the state of our art/discipline/area (or whether there is one)).
I did and do know about the language situation in India (e.g. linguistic hegemony) --- I may not have witnessed things in ways you have, or be able to describe things with as many details or report on the group sentiment (may it be perceived or collectively acted upon or that which ends up shaping the social infrastructure). But there are many parallels in language phenomena. The crux of the matter lies not in language --- that's a message I've been trying to get across. (It took me a while to come to terms with that.)
On Mon, Aug 14, 2023 at 9:16 AM Anil Singh anil.phdcl@gmail.com wrote:
On Fri, Aug 11, 2023 at 9:52 PM Ada Wan adawan919@gmail.com wrote:
Just a quick reply before the weekend to some of the points that I thought deserve a short clarification:
- re linguistic empowerment: yes and no. As I commented on X (formerly Twitter) on 09Aug2023: "[t]here are divergent ways of thinking... but when it comes to language and the social sciences, one must be careful with how one "diverges"! Humans and [sic: are] humans. And much of what we postulate re "in-group/out-group" can be a matter of our "traditions" (if so, is it time to re-evaluate?), perspectives (if so, can we be biased sometimes?), or our willingness to include or will to exclude. How different can particular "languages" (in a folk psychological, proverbial usage) be, really? Where do the differences lie?".
Of course. That goes without saying. For almost 40 years now I have been looking at this issue and thinking about it. I had wanted to write a book about it. This is what got me into languages and then NLP, because I was then hooked by the study of language by itself, even without the issue of linguistic empowerment, particularly from the computational point of view. I have looked at it from all possible points of view. I was not a linguistic purist even when I began -- with a lot of bitterness -- and I am certainly not that now. I have no doubt at all that there is something universal and species-specific about human languages, although I don't know in what way it is universal exactly. No one does, as far as I know. No one could be more against linguistic chauvinism of any kind than me. Or any other kind of chauvinism.
The same goes with "diversity" efforts. AND OF COURSE I am NOT AGAINST these.
I believe that. I didn't say you were. I just gave an example to make my point.