Tuesday, September 6, 2011




IDF (inverse document frequency) is the logarithm of the total number of documents in a collection divided by the number of documents containing a particular term, regardless of how many times the term appears in any one document.
That is, IDF = log(D/di), where D is the total number of documents and di is the number of documents containing a given term i [each document is counted only once, even if the term appears in it multiple times].
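
As a rough illustration of the formula, here is a minimal Python sketch. The function name idf, the sample collection, the whitespace tokenization, and the choice of natural logarithm are all assumptions made for this example (the definition above does not fix a log base); the sketch also returns None when no document contains the term, since D/di is undefined in that case.

```python
import math

def idf(term, documents):
    """Return log(D / di) for `term`, or None if no document contains it."""
    total_docs = len(documents)  # D
    # di: each document counts once, however often the term occurs in it
    docs_with_term = sum(
        1 for text in documents if term.lower() in set(text.lower().split())
    )
    if docs_with_term == 0:
        return None  # D / 0 is undefined
    return math.log(total_docs / docs_with_term)  # natural log (assumed base)

# Hypothetical four-document collection
collection = [
    "the cat sat on the mat",
    "the cat chased the other cat",
    "dogs bark at night",
    "the night was quiet",
]

print(idf("cat", collection))   # D = 4, di = 2
print(idf("dogs", collection))  # D = 4, di = 1
print(idf("the", collection))   # D = 4, di = 3
```

Note that "cat" appears twice in the second document but still adds only one to di. A rarer term such as "dogs" gets a higher IDF than a common one such as "the".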

IDF was useful with early search engines and IR systems. But because large Web search engines are too generic, it has slowly been phased out in favor of models that incorporate relevance information and remain more stable as the size of the collection grows.

Regards,
Rajasekhar bayapu.