like documents is tricky, as most document
pairs have a similarity approaching 0.
The cosine (theta) distance on raw term vectors
is not really a good measure, so use the costly
LSI (latent semantic indexing) to reduce
dimensions. The true distance is then
represented in the reduced-dimensional space.
Manjara did this, but it's not practical yet.
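
A minimal sketch of the LSI idea, assuming NumPy and a toy
term-by-document count matrix. The corpus, the choice of k = 2, and the
helper names cosine and lsi_project are all made up for illustration.
This is not how Manjara or any other system did it. Documents 0 ("car")
and 1 ("automobile") share no terms, so their raw cosine similarity is
0, but both co-occur with document 2 ("car automobile"), so they land
close together in the reduced space.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    return float(a @ b / (na * nb))


def lsi_project(X, k):
    """Project each document (column of X) onto the top-k left
    singular vectors of X, i.e. a rank-k truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k].T @ X  # shape: (k, n_docs)


# Hypothetical toy corpus. Rows are terms, columns are documents:
#   d0 "car", d1 "automobile", d2 "car automobile",
#   d3 "flower", d4 "petal", d5 "flower petal"
X = np.array([
    [1, 0, 1, 0, 0, 0],  # car
    [0, 1, 1, 0, 0, 0],  # automobile
    [0, 0, 0, 1, 0, 1],  # flower
    [0, 0, 0, 0, 1, 1],  # petal
], dtype=float)

# Raw term space: d0 and d1 have no term in common.
raw_sim = cosine(X[:, 0], X[:, 1])

# Reduced space with k = 2 latent dimensions (an arbitrary choice here).
Z = lsi_project(X, k=2)
lsi_sim = cosine(Z[:, 0], Z[:, 1])    # "car" vs "automobile"
cross_sim = cosine(Z[:, 0], Z[:, 3])  # "car" vs "flower"

print("raw cosine d0,d1:", raw_sim)
print("LSI cosine d0,d1:", lsi_sim)
print("LSI cosine d0,d3:", cross_sim)
```

The full SVD is the costly part. On a real collection the matrix is
huge and sparse, so one would likely reach for a sparse truncated
solver rather than the dense call used in this sketch.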
M.