Saturday, October 15, 2011

10/6/2011

A crawler that uses a plain depth-first search as its algorithm would not work: if the links between pages form a cycle, the crawler would enter the cycle and never be able to crawl out of it. Hence we should use a priority-based queue to store the URLs, ordered by how frequently each page should be visited.

How to prioritize:
1) Importance of the page.
2) How often the page changes.
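
A minimal sketch of such a queue in Python is given here to make the idea concrete. It is an illustration under assumptions, not a definitive crawler: the names Frontier and crawl, the in-memory links dictionary standing in for real page fetches, and the scoring rule (importance multiplied by change rate) are all hypothetical choices. The set of seen URLs is what keeps the crawler from looping on a cycle of links.

```python
import heapq


class Frontier:
    """Priority queue of URLs; the highest score is popped first."""

    def __init__(self):
        self._heap = []
        self._seen = set()   # URLs already queued, so cycles are not re-entered
        self._counter = 0    # tie-breaker that keeps insertion order stable

    def add(self, url, importance, change_rate):
        if url in self._seen:
            return False
        self._seen.add(url)
        # Assumed scoring rule: important pages that change often come first.
        score = importance * change_rate
        # heapq is a min-heap, so the score is negated.
        heapq.heappush(self._heap, (-score, self._counter, url))
        self._counter += 1
        return True

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def crawl(start, links, scores):
    """Visit every page reachable from start, in priority order.

    links maps a URL to the URLs it links to (a stand-in for fetching a page);
    scores maps a URL to an (importance, change_rate) pair.
    """
    frontier = Frontier()
    frontier.add(start, *scores[start])
    order = []
    while frontier:
        url = frontier.pop()
        order.append(url)
        for nxt in links.get(url, []):
            frontier.add(nxt, *scores[nxt])
    return order


# Toy graph with cycles: a -> b -> a and a -> c -> a.
links = {"a": ["b", "c"], "b": ["a"], "c": ["a", "b"]}
scores = {"a": (1.0, 1.0), "b": (0.2, 0.5), "c": (0.9, 0.8)}
print(crawl("a", links, scores))  # ['a', 'c', 'b']
```

Even though every page links back to "a", the crawl terminates, and "c" is visited before "b" because its score is higher. A real crawler would also re-queue pages over time according to how often they change, which this sketch leaves out.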


Anuj