Thursday, September 1, 2011

8/30/2011

1. Precision and Recall are good measures for evaluating an IR (information retrieval) engine. In terms of set theory, the quantities involved are:

U (universe) = tn + fp + tp + fn
Relevant docs the user is looking for = tp + fn
Docs the system returns = tp + fp
Precision (P) = tp / (tp + fp)
Recall (R) = tp / (tp + fn)

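where,
fp - false positive (returned, but not relevant)
tp - true positive (returned and relevant)
fn - false negative (relevant, but not returned)
tn - true negative (not relevant and not returned)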

For an ideal IR system, fn = fp = 0, so that the set of relevant docs the user is looking for is exactly the set of docs the system returns, and precision = recall = 1.
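As a rough illustration of the set view above, the following Python sketch derives tp, fp, fn and tn from sets of document ids and then computes precision and recall. The function names `confusion_counts` and `precision_recall`, the document ids, and the choice to return 0.0 when a denominator is zero are assumptions made for this example; they are not from the class notes.

```python
def confusion_counts(universe, relevant, returned):
    """Count tp, fp, fn, tn for one query, given sets of document ids."""
    tp = len(relevant & returned)             # relevant and returned
    fp = len(returned - relevant)             # returned but not relevant
    fn = len(relevant - returned)             # relevant but not returned
    tn = len(universe - relevant - returned)  # neither relevant nor returned
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision = tp/(tp+fp), Recall = tp/(tp+fn)."""
    # Assumed convention: report 0.0 instead of dividing by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical data: 10 documents, 4 relevant, 3 returned by the system.
universe = set(range(1, 11))
relevant = {1, 2, 3, 4}
returned = {3, 4, 5}

tp, fp, fn, tn = confusion_counts(universe, relevant, returned)
precision, recall = precision_recall(tp, fp, fn)
print("tp, fp, fn, tn =", tp, fp, fn, tn)
print("precision =", precision, "recall =", recall)
```

In this made-up example the system returns two of the four relevant documents plus one irrelevant one, so precision is 2/3 and recall is 1/2.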

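2. Low precision is easily caught by the user, because the irrelevant documents (fp) show up in the returned results. Low recall is harder to notice, because the missed relevant documents (fn) are never shown.
3. In a precision-recall curve, precision starts at 1.0 and usually decreases as recall increases.
4. The F-measure (or F-score) is the harmonic mean of Precision and Recall. The harmonic mean (HM) is used because it lies closer to the minimum of its terms than the arithmetic mean does, so a low value of either Precision or Recall pulls the score down.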
5. When precision (P) and recall (R) are weighted equally, the F-measure is called the F1-measure and is given by the formula:
     F1=2(P)(R)/(P+R)
A weight B (read Beta), which should be a positive real number, can be assigned to Recall relative to Precision, so that Recall is weighted B times as much as Precision. The F-measure in such cases is given by:
    FB=(B*B +1)(P)(R)/(B*B*P+R) 
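
The two formulas can be tried out with a small Python sketch. The function name `f_beta`, the sample values of P and R, and the choice to return 0.0 when both P and R are zero are assumptions made for illustration only.

```python
def f_beta(precision, recall, beta=1.0):
    """F-measure: (B*B + 1) * P * R / (B*B * P + R); beta=1 gives F1."""
    if beta <= 0:
        raise ValueError("beta must be a positive real number")
    denominator = beta * beta * precision + recall
    if denominator == 0:
        return 0.0  # assumed convention when P = R = 0
    return (beta * beta + 1) * precision * recall / denominator


# Hypothetical system with high precision but poor recall.
p, r = 0.9, 0.1

arithmetic_mean = (p + r) / 2
f1 = f_beta(p, r)                 # P and R weighted equally
f2 = f_beta(p, r, beta=2.0)       # recall weighted more than precision
f_half = f_beta(p, r, beta=0.5)   # precision weighted more than recall

print("arithmetic mean =", arithmetic_mean)
print("F1 =", f1)
print("F2 =", f2)
print("F0.5 =", f_half)
```

With these sample values F1 comes out at about 0.18, much closer to the smaller term (0.1) than the arithmetic mean of 0.5. Since recall is the weak side here, weighting it more (B = 2) lowers the score further, while weighting precision more (B = 0.5) raises it.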

-Rashmi Dubey