On 7/23/23, Darren Cook via Corpora <corpora@list.elra.info> wrote:
A tensor is just a generalization of vectors and matrices, so might that be the distracting search term?
Perhaps my doubts stem from my background as a theoretical physicist: the kind of "mathematical purity" I was trained in makes it hard for me to digest how you can apply vector/tensor algebra to texts when, as far as I can see, the concepts of space, vector and, consequently, the product of two vectors have not been properly defined.
How do they define "space" and "vector" when it comes to corpora?
Searching for "corpora NLP" on oreilly.com gets 993 hits, 944 of them books.
I haven't found a convincing definition yet. The concepts of metric space and measure are well defined in mathematics:
https://en.wikipedia.org/wiki/Metric_space
https://en.wikipedia.org/wiki/Measure_(mathematics)
but you don't see references to their application to text processing ... even though those cultivating the "AI" techne can't stop talking about "deep learning", "information", "the semantic web", ...
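To make the question concrete, here is a minimal sketch of the bag-of-words construction commonly described in NLP texts. It is only one possible reading of "space" and "vector" for corpora, not a definition taken from any source cited in this thread. Each distinct word is assumed to be one coordinate axis of R^n, and a document is the vector of its word counts. The toy documents and the function names are hypothetical.

```python
import math
from collections import Counter

def build_vocabulary(documents):
    """Sorted list of distinct tokens; each token is one coordinate axis."""
    return sorted({token for doc in documents for token in doc.lower().split()})

def to_vector(document, vocabulary):
    """Map a document to its word-count vector in R^n, n = len(vocabulary)."""
    counts = Counter(document.lower().split())
    return [float(counts[word]) for word in vocabulary]

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def euclidean_distance(u, v):
    """Euclidean distance, which satisfies the metric-space axioms on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (a similarity, not a metric)."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Hypothetical toy corpus, whitespace-tokenized (an assumption of this sketch).
documents = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
vocabulary = build_vocabulary(documents)
vectors = [to_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
print(cosine_similarity(vectors[0], vectors[1]))
print(euclidean_distance(vectors[0], vectors[1]))
```

Under this reading the "space" is plain R^n with the standard inner product, so the algebra is well defined. What remains open is whether the mapping from text to vector is meaningful, which is exactly the part the sketch takes for granted.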
lbrtchx