Saturday, December 18, 2004
Glenn Fleishman: On Nov. 13, I posted a graph showing the fast growth in bytes requested for RSS and similar feeds from my Wi-Fi Networking News and a few (much smaller) other sites. Bandwidth usage grew from the mid-200 MB per day range to an average of about 350 MB per day. During that same time, I wasn't seeing an increase in visitors of that scale: maybe 10 to 20 percent, not 75 percent. After analyzing logs, I discovered that a small percentage of aggregation sites and aggregation servers were consuming as much as 20 to 30 percent of the bandwidth unnecessarily, through aggressive downloads that didn't use If-Modified-Since headers or other mechanisms to avoid retrieving a page that hadn't changed. I built a simple program running via Apache that throttles RSS downloads: a given IP and user agent combination can only request a given RSS feed file if it's changed since they last retrieved it. Pretty simple. But the effects are profound, as this graph shows...
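The throttling rule Glenn describes can be sketched in a few lines. This is not his program, only a minimal illustration of the idea, and everything named here is an assumption: the `FeedThrottle` class, the `status_for` method, the in-memory dictionary, and the choice to answer a repeat request with 304 Not Modified (his post doesn't say what the throttled client gets back).

```python
from typing import Dict, Tuple

# (client IP, user agent, feed path) identifies one requester of one feed.
Key = Tuple[str, str, str]


class FeedThrottle:
    """Hypothetical sketch: serve a feed to a client only if it has changed
    since that client last retrieved it."""

    def __init__(self) -> None:
        # Modification time of the feed version each client last received.
        self._last_served: Dict[Key, float] = {}

    def status_for(self, ip: str, user_agent: str,
                   feed_path: str, feed_mtime: float) -> int:
        """Return an HTTP status: 200 to send the feed, 304 to skip it."""
        key = (ip, user_agent, feed_path)
        last = self._last_served.get(key)
        if last is not None and feed_mtime <= last:
            # Client already has this version; the 304 is an assumption.
            return 304
        # New client or newer feed: remember what we are about to send.
        self._last_served[key] = feed_mtime
        return 200


throttle = FeedThrottle()
first = throttle.status_for("203.0.113.7", "ExampleAggregator/1.0", "/rss.xml", 1000.0)
repeat = throttle.status_for("203.0.113.7", "ExampleAggregator/1.0", "/rss.xml", 1000.0)
updated = throttle.status_for("203.0.113.7", "ExampleAggregator/1.0", "/rss.xml", 2000.0)
print(first, repeat, updated)
```

Keying on the IP and user agent pair rather than the IP alone means two different readers behind the same address are tracked separately. A real deployment would also need to expire old entries and persist the table across restarts, which this sketch leaves out.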
[GlennLog] 4:27:18 AM Link Google It!