Tuesday, May 20, 2003

When Crawlers hit Dead Ends

This would be useful: to know what happens when a basic crawler (spider, bot, searchbot) hits your site, and where it dead-ends.

Is there a splash screen or popup? Google discourages the use of doorway pages; I don't know about the others (Yahoo, MSN). There could be links from the homepage that are not real links. Can a crawler follow a link to a PDF, then follow links from that PDF? What happens when there are no more links: can a crawler peek into subdirectories? Are there any open-source crawlers out there?
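A rough sketch of how one might find those dead ends, assuming a very basic crawler that only follows plain <a href> links. Everything here is made up for illustration: the example.com URLs, the in-memory `site` dictionary standing in for real HTTP fetches, and the names `LinkExtractor` and `find_dead_ends`. Real search engine crawlers almost certainly behave differently.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags, the only links this crawler sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_dead_ends(site, start):
    """Breadth-first crawl of `site`, a dict mapping URL -> HTML text.

    Returns (visited, dead_ends); dead_ends maps a URL to the reason
    the crawl stopped there.
    """
    visited, dead_ends = [], {}
    queue, seen = deque([start]), {start}
    while queue:
        url = queue.popleft()
        html = site.get(url)
        if html is None:
            # Assumption: anything not in `site` (here, a PDF) cannot be parsed.
            dead_ends[url] = "not fetchable"
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        # javascript: pseudo-links are "not real links" to a basic crawler.
        targets = [urljoin(url, href) for href in parser.links
                   if not href.lower().startswith("javascript:")]
        if not targets:
            dead_ends[url] = "no followable links"
        for target in targets:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited, dead_ends


# Hypothetical three-page site: a homepage, an about page, a splash page.
site = {
    "http://example.com/": '<a href="/about.html">About</a> '
                           '<a href="/splash.html">Enter</a> '
                           '<a href="javascript:open()">Menu</a>',
    "http://example.com/about.html": '<a href="/docs/guide.pdf">Guide</a>',
    "http://example.com/splash.html": '<a href="javascript:enter()">Enter</a>',
}
visited, dead_ends = find_dead_ends(site, "http://example.com/")
for url, reason in dead_ends.items():
    print(url, "->", reason)
```

In this toy run the splash page dead-ends because its only link is a script call, and the PDF dead-ends because the sketch cannot read it. Note that the crawler never looks inside /docs/ on its own; it only reaches URLs that some page links to.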


6:21:56 PM

PageRank

PageRank adjusts results so that sites deemed "more important" appear higher in the result set. See PageRank Uncovered (PDF, 56 pages). Some basics:
  • Every link from a page to a target page is a vote for the target page.
  • The sum of all votes for a target page determines its PageRank.
  • Some votes carry more weight than others.
  • All votes count; links within the same website are not excluded.
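The voting idea in the basics above can be sketched in a few lines. This is a simplified iterative version, not Google's actual implementation. The damping factor of 0.85, the even spreading of rank from pages with no outgoing links, the function name `pagerank`, and the four-page example site are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over `links`, a dict mapping page -> list of targets.

    Each link is a vote; a vote's weight is the voter's own rank divided
    by the number of links on the voting page.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its vote evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical site: all links are internal, and they all count.
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
    "orphan": ["about"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

Here "home" comes out on top because both "about" and "news" vote for it, and "about" edges out "news" thanks to the extra vote from "orphan", which nobody links to and so has little weight of its own to give.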

See also
- PageRank Explained
- The Handy Dandy Google Page Rank Figurin' Guide
- Google Information for Webmasters: Why does my page's rank keep changing?
- Webmaster Guidelines


5:30:58 PM