Tweet Notes (CSE 494/598 F11)
This will contain collectively generated notes
Thursday, September 1, 2011
9/01/11
k Shingle - a contiguous sequence of k words.
k Gram - a contiguous sequence of k letters.
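A minimal sketch of the two definitions above, in Python. The function names k_shingles and k_grams are made up for illustration, words are assumed to be whitespace-separated, and "letters" is taken to mean any character (spaces included); the lecture may have intended a stricter tokenization.

```python
def k_shingles(text, k):
    """Return every contiguous run of k words in text, in order."""
    words = text.split()  # assumption: words are separated by whitespace
    return [tuple(words[i:i + k]) for i in range(len(words) - k + 1)]


def k_grams(text, k):
    """Return every contiguous run of k characters in text, in order."""
    # assumption: every character counts, including spaces and punctuation
    return [text[i:i + k] for i in range(len(text) - k + 1)]


print(k_shingles("the quick brown fox", 2))
# [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
print(k_grams("fox", 2))
# ['fo', 'ox']
```

A text shorter than k yields an empty list in both cases.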
When creating a query to retrieve documents,
we usually want a smaller k. But when comparing
documents for similarity, a larger k is desirable.
Larger k -> greater precision but lower recall, since a longer sequence has to match exactly and so matches fewer documents.
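A rough illustration of how k changes a similarity comparison. The notes do not say which similarity measure was used in class; Jaccard overlap of word-shingle sets is assumed here, and shingle_set and shingle_jaccard are hypothetical helper names.

```python
def shingle_set(text, k):
    """Set of all k-word shingles in text (whitespace tokenization assumed)."""
    words = text.split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def shingle_jaccard(text_a, text_b, k):
    """Jaccard similarity |A & B| / |A | B| of the two k-shingle sets."""
    a, b = shingle_set(text_a, k), shingle_set(text_b, k)
    union = a | b
    if not union:  # both texts shorter than k words
        return 0.0
    return len(a & b) / len(union)


doc_a = "the cat sat on the mat"
doc_b = "the mat sat on the cat"

print(shingle_jaccard(doc_a, doc_b, 1))            # 1.0
print(round(shingle_jaccard(doc_a, doc_b, 3), 3))  # 0.143
```

With k = 1 the two sentences look identical because they use the same words; with k = 3 only the shingle "sat on the" is shared (1 of 7), so the larger k is much stricter about what counts as a match.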
M.