Thursday, September 15, 2011

09/15/2011

There are three ways of doing correlation analysis:
1) Analysis of the entire document corpus (using the doc-term matrix) - this creates a global thesaurus
2) Analysis of the top-k documents most similar to the current query (those whose vector similarity to the query is high) - this creates a local thesaurus
3) Analysis of query logs - here, instead of the doc-term matrix, we consider a query-term matrix - this type of analysis runs into the cold-start problem, since a new system has no query logs to analyze yet!
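
As a rough illustration of option 1, here is a minimal sketch of building a global thesaurus from a doc-term matrix. The toy counts, the names VOCAB, DOC_TERM and build_global_thesaurus, and the choice of cosine normalization for the term-term scores are all assumptions made for this example; they are not from the lecture, and other normalizations are possible.

```python
import math

# Hypothetical toy data: rows are documents, columns are term counts.
VOCAB = ["car", "auto", "engine", "banana"]
DOC_TERM = [
    [2, 2, 1, 0],
    [1, 2, 0, 0],
    [0, 0, 0, 3],
    [1, 1, 1, 0],
]


def build_global_thesaurus(doc_term, vocab, k=2):
    """Map each term to its k most correlated terms over the given rows.

    Raw association: c[i][j] = sum over rows of count(i) * count(j).
    Normalization (an assumption here): cosine, c[i][j] / sqrt(c[i][i] * c[j][j]).
    """
    n = len(vocab)
    raw = [
        [sum(row[i] * row[j] for row in doc_term) for j in range(n)]
        for i in range(n)
    ]
    thesaurus = {}
    for i, term in enumerate(vocab):
        scored = []
        for j, other in enumerate(vocab):
            if i == j:
                continue
            denom = math.sqrt(raw[i][i] * raw[j][j])
            score = raw[i][j] / denom if denom > 0 else 0.0
            if score > 0:
                scored.append((other, score))
        # Highest score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        thesaurus[term] = scored[:k]
    return thesaurus


thesaurus = build_global_thesaurus(DOC_TERM, VOCAB, k=2)
for term, related in thesaurus.items():
    print(term, [(t, round(s, 3)) for t, s in related])
```

Under the same assumptions, passing only the rows of the top-k documents retrieved for a query would give a local thesaurus (option 2), and passing rows that represent logged queries instead of documents would correspond to the query-term matrix of option 3.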


--Shreejay