Some Blogchalk Data
The phenomenon seems to be growing...
In the database currently:
- there are 2821 total weblogs (this counts unique titles)
- 1251 of those are as-yet unspidered
Of the 1570 that have been spidered:
- 28 had parse failures (I'm using Python's htmllib).
- 501 contain a META keywords tag
- 122 contain a META keywords tag containing "blogchalk".
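For the curious, here is a rough sketch of how the two META counts could be produced for a single page. It is not the spider's actual code: it swaps in Python 3's `html.parser` for htmllib, and the names `MetaKeywordsParser` and `classify_page`, as well as the sample pages, are made up for illustration. Note that `html.parser` is lenient with bad markup, so it would not reproduce the parse-failure count above.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collects the content of every <meta name="keywords"> tag on a page.

    Hypothetical helper, not the spider's real parser.
    """

    def __init__(self):
        super().__init__()
        self.keywords = []  # one entry per META keywords tag found

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # Attribute values keep their case, so compare case-insensitively.
        if (attr_map.get("name") or "").lower() == "keywords":
            self.keywords.append(attr_map.get("content") or "")


def classify_page(html_text):
    """Return (has_keywords_tag, keywords_mention_blogchalk) for one page."""
    parser = MetaKeywordsParser()
    parser.feed(html_text)
    parser.close()
    has_keywords = bool(parser.keywords)
    has_blogchalk = any("blogchalk" in k.lower() for k in parser.keywords)
    return has_keywords, has_blogchalk


if __name__ == "__main__":
    # Invented sample pages, just to exercise the function.
    chalked = '<head><meta name="Keywords" content="Blogchalk: sample"></head>'
    plain = '<head><meta name="keywords" content="cats, dogs"></head>'
    print(classify_page(chalked))
    print(classify_page(plain))
```

Running this over every spidered page and tallying the two booleans would give numbers like the 501 and 122 above, assuming the real spider matches "blogchalk" case-insensitively, which is a guess on my part.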
I exchanged some email today with Daniel Padua about a few things. Coming soon:
- A search function.
- An out-of-page data format (XML). This will make the database more capable. www.blogchalking.tk will probably generate the necessary code for you, as it does now.
If anyone has experience writing search functionality and would like to lend a hand, please let me know. Otherwise, I'll just hack together something that works, no matter how ugly.