CRAY CO-FOUNDER BURTON SMITH TALKS TRENDS
by Tom Tabor, Publisher
Cray Inc. co-founder and Chief Scientist Burton Smith is a recognized expert
on high performance computer architectures and programming languages for
parallel computers. An IEEE and ACM fellow, in 2003 he won the Seymour Cray
Computer Engineering Award from the IEEE Computer Society and was elected to
membership in the National Academy of Engineering. HPCwire publisher Tom
Tabor talked with Smith about HPC trends and challenges.
Tabor: Burton, first of all, thank you for taking the time to chat with us.
Since the early days of Tera, we've followed your work, and more recently your
achievements at Cray, with great admiration. I'm sure many of our readers would
like to know how you are spending your time these days.
Smith: It's the usual mix of time spent working and time spent with my family.
A lot of my work life is focused on future products for Cray and on the
Cascade Project, which is our DARPA HPCS initiative. I also spend a lot of
time talking with Cray customers and prospective customers.
Tabor: The breakthrough architectures you developed at Tera were very exciting
to the HPC community. I recall an HPC meeting in Europe many years ago where a
panel of CTOs from many of the large vendors was asked what they thought was
the most exciting architecture in HPC, and to a man they all referenced your
work. Are you and Cray still interested in multi-threaded architectures?
Smith: Yes. Multi-threading is an element of the Cascade Project.
Microprocessor vendors are already implementing multi-threading to a modest
extent. It's one of a number of promising technologies for HPC and other
computing markets. We think Cray has a real leg up in experience with
multi-threading.
Tabor: What other exciting HPC technologies do you see emerging in the next
five years or so?
Smith: I think heterogeneous computing will be a tremendously important trend,
especially when it's integrated to the point where we can efficiently apply it
to a single problem. Today, the HPC community doesn't know how to do this.
Within the next five years, I believe we'll learn how to decompose problems in
order to subject parts of them, as appropriate, to vector processing,
multi-threaded processing, off-the-shelf microprocessing, FPGA processing, and
so on.
Tabor: How important is that?
Smith: Aside from optimizing times-to-solution, heterogeneous computing is
important because microprocessor performance is running out of steam. We're
going to see more divergence from standard microprocessors. We'll see a
greater variety of offerings from microprocessor vendors, new processor
architectures like FPGAs, more aggressive multi-threaded architectures, EDGE
architectures. FPGAs, for example, already do some things well that standard
microprocessor architectures do poorly: bitwise and small integer operations.
I think we may also see greater use of optical interconnect technology,
although it's not particularly cost-effective today.
Tabor: How will Cray be involved in leveraging these technologies?
Smith: It's safe to say Cray will remain interested in balanced, high-
bandwidth systems, which means we'll be heavily involved in using, and in some
cases pioneering, new technologies.
Tabor: Were you involved in the OctigaBay acquisition? If so, in what way?
Smith: Sure, I was involved. I was very excited about what OctigaBay was
doing and the strong match with Cray's market philosophy.
Tabor: What do you think of Jack Dongarra's new benchmarks?
Smith: The HPC Challenge benchmark tests are very important. These are the
first metrics in a long time that give us more rationality in talking about
the balance of HPC systems. Granted, standard benchmarks are no substitute
for running your own codes on candidate machines, but standard benchmarks
allow people to make more informed guesses about which machines they should
look at. In that sense, the HPC Challenge tests are a very useful addition to
the HPC scene.
Tabor: What's the best way to measure a system's productivity vs. its speed?
Smith: On the topic of productivity and speed, they're not orthogonal. Speed
has a lot to do with productivity. I like the definition of productivity that
says it's utility divided by cost. Utility has to do with what an answer's
worth. That is often time-dependent. Taking 48 hours to predict tomorrow's
weather doesn't have much value. It's the same with anything time-critical.
Being able to do things faster is one of the great advantages of capability
computing. The price/performance ratio does not recognize that utility is
dependent on
time. The more time matters, the more you benefit from a capability system.
Also, since productivity equals utility divided by cost, it's important to
realize that cost is much more than the initial acquisition cost. It needs to
include programming time, power consumption, facilities renovation or
construction, and so forth. In the final analysis, if utility/cost is greater
than one, you're making money. If it's less than one, you're losing money.
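Smith's definition can be made concrete with a small sketch. The numbers, the
linear decay of utility, and the function names below are purely illustrative
assumptions, not anything from the interview; the point is only that a
time-dependent utility makes the same machine productive or unproductive
depending on how fast it delivers the answer.

```python
# Illustrative sketch of productivity = utility / cost, where utility
# decays with time-to-solution. All values and the decay model are
# hypothetical assumptions chosen for the example.

def productivity(utility: float, cost: float) -> float:
    """Productivity as utility divided by cost; above 1.0 means the
    answer was worth more than it cost to obtain."""
    return utility / cost

def forecast_utility(hours_to_solution: float,
                     deadline_hours: float = 24.0,
                     peak_value: float = 1_000_000.0) -> float:
    """Hypothetical utility of tomorrow's weather forecast: worth
    peak_value if delivered instantly, nothing once tomorrow has
    passed, decaying linearly in between (an assumed decay model)."""
    if hours_to_solution >= deadline_hours:
        return 0.0
    return peak_value * (1.0 - hours_to_solution / deadline_hours)

# Assumed total cost of ownership: acquisition, programming time,
# power, facilities, and so forth.
cost = 400_000.0

fast = productivity(forecast_utility(6.0), cost)   # delivered in time
slow = productivity(forecast_utility(48.0), cost)  # misses the deadline

print(f"6-hour forecast:  productivity = {fast:.2f}")   # above 1: gain
print(f"48-hour forecast: productivity = {slow:.2f}")   # zero utility
```

Under these assumed numbers, the 6-hour run yields a productivity of 1.875
while the 48-hour run yields 0, even though both consume the same cost, which
is exactly the point Smith makes about time-critical work.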
Tabor: What are the biggest challenges in running single applications on
future systems with tens of thousands or hundreds of thousands of processors?
Smith: The biggest challenge is designing systems with enough balance to get
high performance at those scales. It's easy to build big systems that don't
work well. Some other major challenges for systems on that scale are
reliability, programmability, debugability and performance tuning.
Tabor: In general, what are the biggest technical obstacles to progress in HPC?
Smith: Narrowness of applicability is one. Another is difficulty of
programming. We need more programmable machines.
Tabor: How would you propose to overcome these barriers?
Smith: Making balanced systems available is the first important step. Beyond
that, the HPC industry needs to work on new programming language ideas. We
need more productive HPC architectures and languages.
Tabor: What needs to happen with programming models and languages?
Smith: It used to be the case that architectures were designed for programming
models. Now, people build architectures and then try to figure out how to
program them after the fact. MPI evolved in this way, as a post-facto
attempt to develop a programming model for existing systems. We need to return
to designing computers for programmability.
Tabor: Going down a parallel HPC technology path, how would you define grid
computing, and what do you think of it?
Smith: Grid computing means different things to different people today. To me,
it's about using the Internet to access resources remotely and to accomplish
things with these resources. You could do data sharing on a resource at one
location, computing at a different location, visualization at a third
location. Grid computing is not as capable today as some people would like us
to believe, but it's a natural and cost-effective way to do some important
things. The great myth is that you can run supercomputing applications on
armies of Internet-connected PCs. That would not be a very balanced system.
Tabor: Thank you again, Burton, for your time. Any final thoughts?
Smith: One more thing: the uniprocessor has pretty well run out of
steam. Parallelism to date has been a nice strategy for HPC users and an
afterthought for microprocessor vendors. Now, it is becoming a matter of
business survival for all processor vendors. Parallelism is going to be taken
more seriously, starting with the idea of exploiting multi-threading and
multiple cores on a single problem. This is a major change. Imagine if
Microsoft wanted to write Office in a parallel language. What would that
language be, and what would be the architecture to support it? We don't have
good answers to these questions yet.
From 1985 to 1988 Smith was a Fellow at the Supercomputing Research Center of
the Institute for Defense Analyses in Maryland. He received the BSEE from the
University of New Mexico in 1967 and the Sc.D. from MIT in 1972. He now serves
as chief scientist of Cray Inc.