Commodity microprocessors are certainly cheap, and more than fast enough for me to type this column. But certain tasks still need an extraordinary amount of computing power.
Claire Tristram reminds us of this in an extremely interesting article about the Earth Simulator, which NEC built at a cost of at least $350 million.
Please read it carefully because it's full of good information.
First, let's answer the basic question: why do we need such supercomputers?
What are the real advantages of making computers ever faster? Why, after all, can't we use a machine that takes a month or a week to complete a task instead of a day or an hour? For many problems, we can. But the truth is, we're just beginning to gain the computing power to understand what is going on in systems with thousands or millions of variables; even the fastest machines are just now revealing the promise of what's to come.
One recent example of both the promise and the limitations of today's most powerful computers came from IBM's ASCI White machine, the world's fourth-fastest supercomputer, which IBM researchers used to investigate how materials crack and deform under stress. The study, announced last spring, simulated the behavior of a billion copper atoms. A billion certainly sounds like a lot of variables -- until you realize that it would take more than a hundred trillion times that number of atoms to make up even a cubic centimeter of copper.
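To get a feel for that gap, here is a rough back-of-the-envelope check. It is only a sketch: the density, molar mass and Avogadro constant are standard reference values that I am supplying, not figures from the article, and the function name is my own.

```python
# Rough check of the scale gap between the simulation and a real sample.
# The physical constants are standard reference values (an assumption of
# this sketch), not numbers taken from the article.
AVOGADRO = 6.022e23          # atoms per mole
COPPER_DENSITY = 8.96        # grams per cubic centimeter
COPPER_MOLAR_MASS = 63.546   # grams per mole

SIMULATED_ATOMS = 1e9        # "a billion copper atoms", per the article


def atoms_per_cubic_cm(density, molar_mass):
    """Number of atoms in one cubic centimeter of a pure element."""
    return density / molar_mass * AVOGADRO


atoms_in_one_cm3 = atoms_per_cubic_cm(COPPER_DENSITY, COPPER_MOLAR_MASS)
ratio = atoms_in_one_cm3 / SIMULATED_ATOMS

print(f"atoms in 1 cm^3 of copper: {atoms_in_one_cm3:.2e}")
print(f"ratio to simulated atoms:  {ratio:.2e}")
```

With these inputs the ratio comes out on the order of 10^14, the same ballpark as the "hundred trillion" quoted above.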
Now that you're convinced that we need lots of computing power in some situations, let's concentrate on how programming for supercomputers differs from programming for superclusters.
Yoking together commodity machines with standard commercial networks shifts the speed burden from hardware to software. Computer scientists must write "parallel programs" that parse problems into chunks, then explicitly control which processors should handle each chunk -- all in an effort to minimize the time spent passing bits through communications bottlenecks between processors.
Such programming has proved extremely difficult: A straightforward FORTRAN program becomes a noodly mess of code that calls for rewriting and debugging by parallel-programming specialists. "I hope to concentrate my attention on my research rather than on how to program," says Hitoshi Sakagami, a researcher at Japan's Himeji Institute of Technology and a Gordon Bell Prize finalist for work using the Earth Simulator. "I don't consider parallel computers acceptable tools for my research if I'm constantly forced to code parallel programs."
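The chunk-and-assign style described above can be illustrated with a tiny sketch. This is not how the Earth Simulator or any real cluster is programmed: threads from Python's standard library stand in for processors here, and the function names are hypothetical. On a real cluster the chunks would typically be shipped between machines by message passing, which is where the communication cost comes from.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_workers):
    """Split data into n_workers contiguous chunks of near-equal size."""
    base, extra = divmod(len(data), n_workers)
    chunks, start = [], 0
    for worker_id in range(n_workers):
        # The first `extra` workers each take one additional element.
        size = base + (1 if worker_id < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def partial_sum_of_squares(chunk):
    """Work done independently by one worker on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    """Explicitly assign one chunk per worker, then combine the partial results."""
    chunks = split_into_chunks(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map keeps results in chunk order: chunk i goes with result i.
        partials = list(pool.map(partial_sum_of_squares, chunks))
    # The final combine step is the only point where workers "communicate".
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1000)), n_workers=4))
```

Even in this toy case, the programmer has to decide how to cut the data, who gets which piece, and how to merge the results. That bookkeeping is exactly what the researcher quoted above would rather not do.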
Not only is programming for clusters difficult, and thus expensive, it is also not very efficient.
A supercomputer comprising large numbers of commercial processors isn't just hard to program. It has become clear that the gains from adding more processors to a commodity system eventually flatten into insignificance as coaxing them to work together grows more difficult. What really got computational scientists' hearts racing about the Earth Simulator was not the peak -- or maximum number of calculations performed per second -- which is roughly four times the capacity of the next fastest machine and in itself is impressive enough. Instead, it was the computer's capability for real problem solving (which, after all, is what scientists care about). The Earth Simulator can crunch computations at up to 67 percent of its peak capacity over a sustained period. In comparison, the massively parallel approach -- well, it doesn't compare.
"If you throw enough of these commodity processors into a system, and you're not overwhelmed by the cost of the communications network to link them together, then you might eventually reach the peak performance of the Earth Simulator," says Thomas Sterling, faculty associate at the Center for Advanced Computing Research at Caltech. "But what is rarely reported publicly about these systems is that their sustained performance is frequently below five percent of peak, or even one percent of peak."
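The percentages quoted above lend themselves to a quick calculation. The sketch below uses only the efficiency figures from the article (67 percent for the Earth Simulator, five percent and one percent for commodity systems); the peak is normalized to 1.0 because the article excerpt gives no absolute figure, and the function names are my own.

```python
def sustained_rate(peak_rate, efficiency):
    """Sustained performance as a fraction of peak."""
    return peak_rate * efficiency


def peak_needed_to_match(target_sustained, efficiency):
    """Peak rate a system must have to sustain a given target rate."""
    return target_sustained / efficiency


EARTH_SIMULATOR_EFFICIENCY = 0.67   # "up to 67 percent of its peak capacity"
CLUSTER_EFFICIENCIES = (0.05, 0.01)  # "below five percent ... or even one percent"

# Normalize the Earth Simulator's peak to 1.0 (an assumption of this sketch).
target = sustained_rate(1.0, EARTH_SIMULATOR_EFFICIENCY)

for efficiency in CLUSTER_EFFICIENCIES:
    multiple = peak_needed_to_match(target, efficiency)
    print(f"at {efficiency:.0%} of peak, a cluster needs about "
          f"{multiple:.1f}x the Earth Simulator's peak to match it")
```

In other words, taking the quoted figures at face value, a commodity system would need roughly 13 to 67 times the peak rating just to deliver the same sustained work.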
A final note: if you're involved in high-end computing, you should definitely read this article.
Source: Claire Tristram, Technology Review, available from NewsFactor Network, February 11, 2003