Thursday, September 15, 2011

09/15/2011

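The vector space model treats the keywords/terms in a document as independent: each term is assumed to have a unique meaning, as if the language were non-redundant. Real text breaks this assumption (synonyms and correlated terms are common), so dimensionality reduction techniques are required before the assumption can be used. PCA (principal component analysis) is one such technique. PCA-style reduction applied to the term-document matrix, usually computed as a truncated SVD, is called latent semantic indexing (LSI).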
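To make the idea concrete, here is a minimal sketch of LSI on a toy term-document matrix using NumPy. Everything in it is an assumption for illustration: the terms, the counts, the helper names `lsi` and `cosine`, and the choice of k = 2. It uses a plain truncated SVD without mean-centering, which is how LSI is commonly computed; strict PCA would center the data first.

```python
import numpy as np

# Toy term-document matrix (rows = terms, columns = documents).
# The counts are made up purely for illustration.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [2, 0, 1, 0],   # car
    [0, 2, 1, 0],   # automobile
    [1, 1, 2, 0],   # engine
    [0, 0, 0, 2],   # flower
    [0, 0, 0, 1],   # petal
], dtype=float)


def lsi(matrix, k):
    """Return the rank-k truncated SVD factors of a term-document matrix."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Keep k = 2 latent dimensions (hypothetical choice for this toy data).
U_k, s_k, Vt_k = lsi(A, k=2)

# One k-dimensional row per document in the reduced space.
doc_vecs = (np.diag(s_k) @ Vt_k).T

# Document 0 says "car", document 1 says "automobile"; they share only "engine".
raw_sim = cosine(A[:, 0], A[:, 1])
lsi_sim = cosine(doc_vecs[0], doc_vecs[1])

print("similarity in term space:  ", round(raw_sim, 3))
print("similarity in latent space:", round(lsi_sim, 3))
```

In the raw term space the two documents look fairly different because "car" and "automobile" are counted as independent terms. In the reduced space they end up much closer, since both terms load on the same latent dimension, while the "flower" document stays separate.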




--Shreejay