Wednesday, November 16, 2011

11/01/2011

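The probability of a variable in a Bayes network can be computed from
an expression over the variables it depends on: the joint probability
factors into a product of each node's probability given its parents.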
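
As a rough illustration of that factoring, here is a minimal sketch for a hypothetical two-node network (Rain -> WetGrass). The network, the table names, and all of the numbers are made up for the example and were not part of the lecture.

```python
# Hypothetical two-node network: Rain -> WetGrass. All numbers are invented.
P_RAIN = {True: 0.2, False: 0.8}

# P(WetGrass = wet | Rain = rain), indexed as P_WET_GIVEN_RAIN[rain][wet].
P_WET_GIVEN_RAIN = {
    True: {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}


def joint(rain, wet):
    """P(Rain = rain, WetGrass = wet) = P(rain) * P(wet | rain)."""
    return P_RAIN[rain] * P_WET_GIVEN_RAIN[rain][wet]


def marginal_wet(wet):
    """P(WetGrass = wet), found by summing the joint over both values of Rain."""
    return sum(joint(rain, wet) for rain in (True, False))
```

With these numbers, marginal_wet(True) is 0.2 * 0.9 + 0.8 * 0.1 = 0.26.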

These expressions can multiply together a very large number of small
probabilities. Underflow can be avoided by working with the log of
each value instead of the original, so that products become sums.
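
A small sketch of why the log trick helps, assuming ordinary 64-bit floats. The function names and the sample list of probabilities are hypothetical, chosen only to show the effect.

```python
import math


def product(probs):
    """Multiply the probabilities directly; this can underflow to 0.0."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def log_product(probs):
    """Sum the logs instead; the result is the log of the product."""
    return sum(math.log(p) for p in probs)


# One hundred factors of 1e-5: the true product is 1e-500, which is too
# small for a 64-bit float, while its log is a perfectly ordinary number.
probs = [1e-5] * 100
direct = product(probs)
in_logs = log_product(probs)
```

Here direct comes out as 0.0, while in_logs is about -1151.3. Comparing two classes only needs the larger log, so the values never have to be converted back.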

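Sample bias and 0 errors could be reduced by smoothing: add a small
"virtual" count to each observed count before dividing, as if V extra
"virtual" documents had been seen. (Multiplying a zero probability by
a constant would leave it at zero.)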
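
One common way to avoid zero estimates is additive (Laplace) smoothing. The sketch below assumes that is the technique these notes are pointing at. The function name and the parameters k and num_values are hypothetical.

```python
def smoothed_probability(count, total, k=1, num_values=2):
    """Additive (Laplace) smoothing of a relative frequency.

    count      -- how many times the value was observed
    total      -- how many observations there were in all
    k          -- virtual count added to every possible value (assumed: 1)
    num_values -- how many distinct values the variable can take
    """
    return (count + k) / (total + k * num_values)


# A word never seen in 10 documents no longer gets probability 0.
unseen = smoothed_probability(0, 10)
always = smoothed_probability(10, 10)
```

With k = 1 and two possible values, unseen is 1/12 rather than 0 and always is 11/12 rather than 1, so a single missing observation can no longer zero out a whole product.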

Feature selection is important for both performance and accuracy.
Including too much information can actually lower the "correctness"
of the results.

Diversity of features can be just as important as similarity.

-Thomas Hayden