Wednesday, October 5, 2011

October 4, 2011

One possible way to cluster pages is to use link analysis to find the disconnected components of the link graph. Essentially, the largest cluster gets all of the authority/hub value on the first round, so the pages left with nonzero scores form the largest community. The second largest can then be found by removing the pages in the largest community and recomputing. Iterate this process until all communities are found.
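
Here is a rough sketch of that loop in Python, assuming the link analysis is plain HITS run by power iteration. The function names `hits` and `peel_communities`, the `threshold` cutoff, and the iteration count are my own choices for illustration. Note that "largest" here really means the component with the dominant eigenvalue, which usually but not always is the one with the most pages, and components with tied eigenvalues would come out together in one round.

```python
import math

def hits(nodes, links, iterations=100):
    """Plain HITS power iteration over `nodes`; returns (hub, authority) dicts."""
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a page = sum of hub scores of the pages linking to it.
        auth = {n: 0.0 for n in nodes}
        for src in nodes:
            for dst in links.get(src, ()):
                if dst in auth:  # ignore links to pages already removed
                    auth[dst] += hub[src]
        # Hub of a page = sum of authority scores of the pages it links to.
        new_hub = {n: 0.0 for n in nodes}
        for src in nodes:
            for dst in links.get(src, ()):
                if dst in auth:
                    new_hub[src] += auth[dst]
        a_norm = math.sqrt(sum(v * v for v in auth.values()))
        h_norm = math.sqrt(sum(v * v for v in new_hub.values()))
        if a_norm == 0 or h_norm == 0:
            return new_hub, auth  # no links among the remaining pages
        auth = {n: v / a_norm for n, v in auth.items()}
        hub = {n: v / h_norm for n, v in new_hub.items()}
    return hub, auth

def peel_communities(links, threshold=1e-6):
    """Repeatedly run HITS, take the pages that kept a score, remove them."""
    remaining = set(links) | {d for dsts in links.values() for d in dsts}
    communities = []
    while remaining:
        hub, auth = hits(remaining, links)
        members = {n for n in remaining
                   if hub[n] > threshold or auth[n] > threshold}
        if not members:
            # Pages with no links left: treat each one as its own community.
            communities.extend({n} for n in sorted(remaining))
            break
        communities.append(members)
        remaining -= members
    return communities

# Hypothetical link graph: a four-page clique, a two-page pair, one lone page.
links = {
    "a1": ["a2", "a3", "a4"], "a2": ["a1", "a3", "a4"],
    "a3": ["a1", "a2", "a4"], "a4": ["a1", "a2", "a3"],
    "b1": ["b2"], "b2": ["b1"],
    "c": [],
}
communities = peel_communities(links)
```

On this toy graph the four-page clique should come out first, then the pair, then the lone page. The cutoff matters on real data: scores outside the dominant component shrink toward zero rather than hitting it exactly, so too few iterations or too low a threshold would blur the communities together.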

-James Cotter