A robots.txt file placed at the root of a web server lists a set of rules for crawlers, which they may or may not choose to respect. It can tell crawlers whether they are allowed to crawl and index the site, restrict specific crawlers, or limit crawling to certain directories only.
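For example, Python's standard library ships a robots.txt parser. Here's a minimal sketch of checking the rules before fetching a page; the example.com URLs and the "MyCrawler" user-agent name are just placeholders:

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    rp.set_url("https://example.com/robots.txt")  # robots.txt lives at the site root
    rp.read()                                      # fetch and parse the rules

    # Ask whether a given crawler may fetch a given URL.
    print(rp.can_fetch("MyCrawler", "https://example.com/private/page"))
    print(rp.can_fetch("*", "https://example.com/public/page"))

A well-behaved crawler would run a check like this before every request, but nothing in the protocol forces it to.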
--
Ivan Zhou
Graduate Student
Graduate Professional Student Association (GPSA) Assembly Member
School of Computing, Informatics and Decision Systems Engineering
Ira A. Fulton Schools of Engineering
Arizona State University