Friday, September 17, 2004

Make Sure Your Hosting Provider Is Not Blocking Important Spiders

I just read a couple of articles that blew my mind - your hosting provider may be responsible for your sites not being listed in Google, Yahoo, or MSN.

How is that possible?

It seems that some hosting providers have made the idiotic decision to BAN SOME IP ADDRESSES THAT HIT THEIR SERVERS TOO MUCH.  This is insane!  If the IP addresses that the Googlebot spider uses happen to be on that ban list, GOOGLE WILL NOT BE ABLE TO CRAWL YOUR SITES. 

How can you tell if your host is blocking important spider IP addresses?  That's the (potentially) million-dollar question, and it's a tough one to answer.  One thing you can do is check the stats logs for your sites that are ranking well and being crawled often by Google, copy the user agent string that Googlebot leaves in those logs, use wannabrowser.com to spoof that user agent, and then try to bring up one of the pages you believe may be blocked.  If the page loads, your host is probably not blocking that user agent.  If the page does not load, your host may be blocking it.  Keep in mind that this test only checks the user agent: the request comes from wannabrowser.com's IP address, not Google's, so a ban on Googlebot's IP addresses would not show up here.
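The user agent check described above can also be run from a script instead of through wannabrowser.com. Here is a minimal Python sketch of that idea. The two user agent strings and the function names are my own assumptions for illustration, so copy the exact Googlebot string from your own logs. Like the wannabrowser.com test, this only spoofs the user agent. The request still comes from your own IP address, so it says nothing about IP-based bans.

```python
import urllib.error
import urllib.request

# Assumption: a Googlebot-style user agent. Replace it with the exact
# string that shows up in your own server logs.
GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"
# Assumption: a made-up "ordinary browser" string used as the baseline.
BROWSER_UA = "Mozilla/5.0 (compatible; ordinary-browser-check)"


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for url, or None if the request failed outright."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 403.
        return err.code
    except OSError:
        # No usable answer at all (DNS failure, refused connection, timeout).
        return None


def compare_user_agents(url, fetch=fetch_status):
    """Fetch url once as a browser and once as the spider, then compare."""
    browser = fetch(url, BROWSER_UA)
    spider = fetch(url, GOOGLEBOT_UA)
    # Suspicious only when the browser request works and the spider one does not.
    looks_blocked = browser == 200 and spider != 200
    return {"browser": browser, "spider": spider, "looks_blocked": looks_blocked}
```

Calling compare_user_agents with the address of one of your pages gives you both status codes side by side. If looks_blocked comes back True, something between you and the page is treating the spider's user agent differently, and that is worth raising with your host.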

You can also check Google's cached snapshot of your most important pages and look at the date stamp.  If the cached copy is badly out of date, Google may be having trouble crawling your site.
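Alongside the cache date, your own access log can tell you when the spider last got through. The sketch below is a rough Python example that assumes your host writes logs in the Apache "combined" format. That format, the regular expression, and the function name are my assumptions, so adjust them to whatever your logs actually look like.

```python
import re
from datetime import datetime

# Assumption: Apache "combined" log format, e.g.
# ip - - [day/Mon/year:HH:MM:SS +zone] "request" status bytes "referer" "user agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"'
)


def last_spider_visit(lines, marker="Googlebot"):
    """Return (timestamp, ip, status) for the newest hit whose user agent
    contains marker, or None if there is no such hit."""
    latest = None
    for line in lines:
        match = LINE_RE.match(line)
        if not match or marker not in match.group(4):
            continue  # unparseable line, or not the spider we care about
        when = datetime.strptime(match.group(2), "%d/%b/%Y:%H:%M:%S %z")
        if latest is None or when > latest[0]:
            latest = (when, match.group(1), int(match.group(3)))
    return latest
```

Pass it the lines of a log file, for example an open file object. If the newest visit is old, or its status code is something other than 200, that fits with Google having trouble crawling the site. Remember that anyone can fake a user agent, so a hit marked "Googlebot" is not proof that Google itself made the request.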

With all the things that can go wrong in the search engines, it's amazing to me that I ever get any top listings at all...

 


11:32:53 AM