When Crawlers hit Dead Ends
It would be useful to know what happens when a basic crawler (spider, bot, searchbot) hits your site, and where it dead-ends. Is there a splash screen or popup in the way? Google discourages the use of doorway pages; I don't know about the others (Yahoo, MSN). There could be links from the homepage that are not real links. Can a crawler follow a link to a PDF, then follow links from that PDF? What happens when there are no more links: can it peek into subdirectories? Are there any open-source crawlers out there?
6:21:56 PM
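A rough way to explore the dead-end question is to write a tiny crawler and see where it stops. The sketch below is only an illustration, not how any real search engine works. It assumes a hypothetical `fetch` function that returns a page's HTML as a string, or `None` for anything it cannot read as HTML (a PDF, a failed request). The names `LinkParser` and `find_dead_ends` are made up for this example, and it only follows links within the starting site.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_dead_ends(start_url, fetch, max_pages=100):
    """Breadth-first crawl of one site; return (dead_ends, unfollowable).

    fetch is a hypothetical callable: url -> HTML string, or None when the
    page cannot be read as HTML.
    """
    site = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    dead_ends = []      # pages with no followable same-site links
    unfollowable = []   # (page, href) pairs a plain crawler cannot follow
    fetched = 0

    while queue and fetched < max_pages:
        page = queue.popleft()
        html = fetch(page)
        fetched += 1
        if html is None:
            dead_ends.append(page)  # nothing to parse, so the crawl stops here
            continue

        parser = LinkParser()
        parser.feed(html)
        followable = 0
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment part.
            target, _fragment = urldefrag(urljoin(page, href))
            parts = urlparse(target)
            if parts.scheme not in ("http", "https"):
                unfollowable.append((page, href))  # e.g. javascript: or mailto:
                continue
            if parts.netloc != site or target == page:
                continue  # off-site link, or a link back to the same page
            followable += 1
            if target not in seen:
                seen.add(target)
                queue.append(target)
        if followable == 0:
            dead_ends.append(page)

    return dead_ends, unfollowable
```

To try it without touching the network, pass a dictionary's `get` method as `fetch`, with URLs as keys and HTML strings as values. Any URL missing from the dictionary then behaves like an unreadable page and is reported as a dead end.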
PageRank
PageRank adjusts results so that sites deemed "more important" will appear higher in the result set. PageRank Uncovered (PDF, 56 pages). Some basics:
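As a rough illustration of the idea (not taken from PageRank Uncovered, and not Google's actual implementation), here is a minimal sketch of the commonly described iterative calculation. It assumes a damping factor of 0.85, ranks normalized to sum to 1, and that pages with no outgoing links share their rank evenly with every page. The function name `pagerank` and the toy link graph in the usage note are made up for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank by repeated redistribution of rank along links.

    links maps each page name to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared evenly (assumption).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Every page passes an equal share of its rank to each page it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank

    return rank
```

For example, with `links = {"home": ["about", "news"], "about": ["home"], "news": ["home"], "orphan": ["home"]}`, the page `home` ends up with the highest rank because every other page links to it, while `orphan`, which nothing links to, keeps only the small baseline share.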
See also
5:30:58 PM