Thursday, September 8, 2011

9/8/2011

Three techniques for generating keywords:
1. Stop word elimination: remove common words from the lexicon (i.e., do not index them). In English, examples include "the" and "an".
2. Noun phrase detection: combine words that frequently occur together into a single keyword, e.g. "data structure".
3. Stemming: strip word endings so that query terms match the indexed forms of the same word more easily, e.g. "walked" -> "walk".
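
The three techniques above can be sketched together in a few lines of Python. This is only a rough illustration, not a real indexer: the stop-word list, the noun-phrase table, and the suffix list are all made-up assumptions kept tiny for readability, and the suffix stripping is a naive stand-in for a proper stemmer such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}

# Hypothetical table of known noun phrases, stored as pairs of stemmed words.
NOUN_PHRASES = {("data", "structure")}

# Naive suffix list; a real stemmer uses far more careful rules.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def keywords(text):
    """Generate keywords using the three techniques from the notes."""
    tokens = tokenize(text)
    result = []
    i = 0
    while i < len(tokens):
        # Noun phrase detection: join two adjacent words if they form a known phrase.
        if i + 1 < len(tokens):
            pair = (stem(tokens[i]), stem(tokens[i + 1]))
            if pair in NOUN_PHRASES:
                result.append(" ".join(pair))
                i += 2
                continue
        # Stop word elimination: skip common words entirely.
        if tokens[i] not in STOP_WORDS:
            # Stemming: index the word without its ending.
            result.append(stem(tokens[i]))
        i += 1
    return result


print(keywords("The student walked to the data structures lecture"))
# prints ['student', 'walk', 'data structure', 'lecture']
```

Phrases are checked before stop words are dropped so that only words that are actually adjacent in the text get combined. The same `keywords` function would be applied to both documents and queries so that their terms line up.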

-James Cotter