Tuesday, September 27, 2011

9/20/2011

Singular value decomposition (SVD) is a method for identifying and ordering the dimensions along which data points exhibit the most variation. Once we have identified where the most variation is, we can find the best approximation of the original data points using fewer dimensions. Hence, SVD can be seen as a method for data reduction.
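
For reference, the idea can be written down compactly. The symbols here are my own notation, not from the lecture: A is an m-by-n data matrix, r is its rank, and k is the number of dimensions we choose to keep. "Best" is meant in the least-squares (Frobenius norm) sense, which is the standard Eckart-Young result.

```latex
A = U \Sigma V^{T} = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{T},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0

A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T},
\qquad
\| A - A_k \|_F = \min_{\operatorname{rank}(B) \le k} \| A - B \|_F
= \sqrt{\sum_{i=k+1}^{r} \sigma_i^{2}}
```

The singular values are what order the dimensions: a larger value of sigma_i means more of the variation in the data lies along that direction.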

What makes SVD practical for NLP applications is that you can simply ignore variation below a particular threshold to massively reduce your data while still preserving the main relationships of interest.
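
Here is a minimal sketch of that thresholding step using NumPy. The function name `truncate_svd`, the toy matrix, and the threshold of 0.5 are all made up for illustration; in practice the cutoff depends on the data.

```python
import numpy as np

def truncate_svd(A, threshold):
    """Drop dimensions whose singular value is below `threshold` (hypothetical helper)."""
    # Thin SVD: A = U @ diag(s) @ Vt, with s sorted in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the dimensions whose singular value reaches the threshold.
    k = int(np.sum(s >= threshold))
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    return A_k, k, s

# Toy data: a rank-2 matrix plus a small perturbation standing in for noise.
base = np.array([
    [2.0, 0.0, 1.0],
    [4.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
    [1.0, 0.0, 0.5],
])
noise = 0.01 * np.array([
    [1.0, -1.0, 0.0],
    [0.0, 1.0, -1.0],
    [-1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
])
A = base + noise

A_k, k, s = truncate_svd(A, threshold=0.5)
error = np.linalg.norm(A - A_k)  # Frobenius norm of what was thrown away

print("singular values:", s)
print("dimensions kept:", k)
print("approximation error:", error)
```

The error left after truncation equals the square root of the sum of the squared singular values that were dropped, so a small threshold guarantees a small loss.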

When we map each document and query vector into a lower-dimensional space in this way, we call it latent semantic indexing (LSI).
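
A small sketch of that mapping, assuming a term-by-document count matrix and a plain projection onto the top k left singular vectors. The vocabulary, the four toy documents, and the helper names `lsi_project` and `cosine` are invented for this example; some LSI formulations also rescale the coordinates by the inverse singular values, which is not done here.

```python
import numpy as np

def lsi_project(A, k):
    """Return the top-k term directions and the documents mapped into k dimensions."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]          # terms x k
    docs_k = U_k.T @ A      # k x documents
    return U_k, docs_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Rows are terms, columns are documents (hypothetical toy corpus).
terms = ["car", "auto", "engine", "flower", "petal"]
A = np.array([
    [1.0, 0.0, 0.0, 0.0],   # car
    [0.0, 1.0, 0.0, 0.0],   # auto
    [1.0, 1.0, 0.0, 0.0],   # engine
    [0.0, 0.0, 1.0, 1.0],   # flower
    [0.0, 0.0, 1.0, 2.0],   # petal
])

U_k, docs_k = lsi_project(A, k=2)

# Query containing only the term "auto", mapped into the same space.
q = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
q_k = U_k.T @ q

raw_sim = cosine(q, A[:, 0])  # similarity to document 0 in the original term space
sims = [cosine(q_k, docs_k[:, j]) for j in range(A.shape[1])]

print("raw similarity to doc 0:", raw_sim)
print("LSI similarities:", sims)
```

Document 0 mentions "car" and "engine" but never "auto", so in the original term space it shares nothing with the query. In the reduced space the two end up close together because "car" and "auto" both co-occur with "engine", while the flower documents stay unrelated.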
- Shu Wang