“WorldCat is our union catalogue of about 56 million bibliographic records, which
represent approximately a billion holdings. It is about 50 gigabytes in
MARC Communications (100+ gigabytes in XML) format and about 23
gigabytes compressed.

OCLC Research recently acquired a 24-node (48-cpu) Beowulf cluster with 96 Gigabytes of memory. According to my colleague Thom Hickey,
whose team has been working on the machine, the cluster speeds up most
bibliographic processing by about a factor of 30. This means that what
might have taken a minute now takes two seconds, what might have taken
an hour takes two minutes, what might have taken a month takes a day.
For jobs that will fit entirely in memory (e.g. a `grep' of WorldCat)
avoiding disk i/o gives another factor of about 20, reducing 1-hour
jobs down to 6 seconds. We can 'frbrize' WorldCat on the cluster in about an hour.

WorldCat is also now more mobile. Thom has a 40 gig iPod which can accommodate
WorldCat on its disk with room left for 5,000 song tracks.
Now, you can't do much with the data on the iPod, but you can certainly
carry it around. Again, it takes about an hour to get it on and off the
iPod.” [Lorcan Dempsey’s Weblog, via It’s All Good]

They’re all amazing numbers, but think about that iPod statement for a moment.
What does it mean when a patron can carry around the whole freaking
WorldCat database? We’re not that far off from the introduction of the
personal, mobile server in your pocket.
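
For anyone who wants to check the numbers in the quote, here is a rough back-of-the-envelope sketch in Python. The constant and function names are made up for illustration, a "month" is assumed to be 30 days, and a gigabyte is assumed to be 1,000 megabytes; only the figures themselves come from the quoted post.

```python
# Figures taken from the quoted post; all names here are hypothetical.
CLUSTER_SPEEDUP = 30       # "about a factor of 30"
IN_MEMORY_SPEEDUP = 20     # "another factor of about 20"

MINUTE, HOUR, DAY = 60, 3600, 86400   # durations in seconds


def sped_up(seconds, *factors):
    """Return a run time in seconds after dividing by each speedup factor."""
    for factor in factors:
        seconds /= factor
    return seconds


minute_job = sped_up(MINUTE, CLUSTER_SPEEDUP)                   # seconds
hour_job = sped_up(HOUR, CLUSTER_SPEEDUP) / MINUTE              # minutes
month_job = sped_up(30 * DAY, CLUSTER_SPEEDUP) / DAY            # days (30-day month assumed)
in_memory_hour_job = sped_up(HOUR, CLUSTER_SPEEDUP, IN_MEMORY_SPEEDUP)  # seconds

# iPod: space left after loading the compressed copy of WorldCat.
IPOD_GB = 40
WORLDCAT_COMPRESSED_GB = 23
TRACKS = 5000
mb_per_track = (IPOD_GB - WORLDCAT_COMPRESSED_GB) * 1000 / TRACKS  # 1 GB = 1000 MB assumed

print(minute_job, hour_job, month_job, in_memory_hour_job, mb_per_track)
```

Under those assumptions the quoted figures hang together: a one-hour job run entirely in memory comes out at 6 seconds, and the space left on the iPod works out to roughly 3.4 MB per song track.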