AN INTERVIEW WITH SUPERCOMPUTING LEGEND JACK DONGARRA
by Martin Meuer, Prometeus GmbH
Jack Dongarra, director of the Center for Information Technology Research
(CITR) and of the Innovative Computing Laboratory (ICL) at the University of
Tennessee, has been a legend in the supercomputing field for twenty years:
LINPACK, MPI and the TOP500 are closely associated with his name.
Jack Dongarra will give the keynote presentation on Friday, June 21, 2002 at
the 17th International Supercomputer Conference in Heidelberg, Germany. The
title of his talk is: "High Performance Computing, Computational Grids, and
Numerical Libraries".
This interview for HPCwire was conducted by Martin Meuer, Prometeus GmbH.
HPCwire: Jack, you are among the most renowned benchmark and HPC experts in
the world. How long have you been personally involved in benchmarking and in
the field of HPC?
DONGARRA: The original LINPACK Benchmark is, in some sense, an accident. So I
guess that makes me an "Accidental Benchmarker". It was originally designed to
assist users of the LINPACK numerical software package by providing
information on execution times required to solve a system of linear equations.
The first "LINPACK Benchmark" report appeared as an appendix in the LINPACK
Users' Guide in 1979. The appendix of the Users' Guide collected performance
data for one commonly used path in the LINPACK software package. Results were
provided for a matrix problem of size 100, on a collection of widely used
computers (23 computers in all). This was done so users could estimate the
time required to solve their matrix problem by extrapolation.
Over the years additional performance data was added, more as a hobby than
anything else, and today the collection includes around 1500 different
computer systems. In addition to the number of computers increasing, the
scope of the benchmark has also expanded. The benchmark report describes the
performance for solving a general dense matrix problem Ax=b at three levels of
problem size and optimization opportunity: 100 by 100 problem (inner loop
optimization), 1000 by 1000 problem (three loop optimization - the whole
program), and a scalable parallel problem.
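To give a concrete feel for what the original exercise measured, here is a
minimal sketch (in Python with NumPy rather than the original Fortran LINPACK
routines, so the timings are only illustrative) that records the time to solve
a dense system Ax=b at the 100 by 100 and 1000 by 1000 problem sizes:

    import time
    import numpy as np

    def time_solve(n, seed=0):
        """Time the solution of a random dense n-by-n system Ax = b."""
        rng = np.random.default_rng(seed)
        A = rng.standard_normal((n, n))
        b = rng.standard_normal(n)
        t0 = time.perf_counter()
        x = np.linalg.solve(A, b)   # LU factorization plus triangular solves
        return time.perf_counter() - t0

    for n in (100, 1000):
        print(f"n = {n:5d}: {time_solve(n):.4f} s")

A user could then extrapolate from timings like these to estimate the cost of
a matrix problem of his or her own size, which was the original purpose of the
appendix.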
HPCwire: The Linpack benchmark has been used up until now to determine the
TOP500-lists, of which you are a co-publisher. What are the strengths and
weaknesses of Linpack for the evaluation of supercomputers?
DONGARRA: In order to fully exploit the increasing computational power of
highly parallel computers, the application software must be scalable, that is,
able to take advantage of larger machine configurations to solve larger
problems with the same efficiency. The LINPACK benchmark addressed scalability
with the introduction of a new category in 1991. This new category is referred
to as the Highly-Parallel LINPACK (HPL) NxN benchmark. It requires the solution
of a system of linear equations by some method. The problem size is allowed to
vary, and the best floating-point execution rate should be reported. In
computing the execution rate, the number of operations should be 2n^3/3 + 2n^2
independent of the actual method used. If Gaussian elimination is chosen,
partial pivoting must be used. The accuracy of solution is measured and
reported. The following quantities are reported for the benchmark:
· Rmax is the performance in GF/s for the largest problem run on a computer,
· Nmax is the size of the largest problem run on a computer,
· N1/2 is the size where half the Rmax execution rate is achieved,
· Rpeak is the theoretical peak performance in GF/s for the computer.
As such this benchmark reports the performance of this one application with
its floating point operations and message passing. So there is one application
and one number that represents the performance for the computer system. The
weakness is that this is just one application and one number.
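As a rough illustration of how the reported rate is computed, the following
sketch (again NumPy rather than the actual HPL code, and with a simplified
accuracy check) times an LU-based solve and converts the elapsed time to
Gflop/s using the fixed operation count 2n^3/3 + 2n^2:

    import time
    import numpy as np

    def hpl_style_rate(n, seed=0):
        """Solve a random dense n-by-n system and report Gflop/s using
        the fixed operation count 2n^3/3 + 2n^2, plus a scaled residual."""
        rng = np.random.default_rng(seed)
        A = rng.standard_normal((n, n))
        b = rng.standard_normal(n)
        t0 = time.perf_counter()
        x = np.linalg.solve(A, b)   # LU with partial pivoting under the hood
        elapsed = time.perf_counter() - t0
        flops = 2.0 * n**3 / 3.0 + 2.0 * n**2
        # Simplified version of the accuracy check: scaled residual of Ax - b.
        resid = np.linalg.norm(A @ x - b, np.inf) / (
            np.linalg.norm(A, np.inf) * np.linalg.norm(x, np.inf)
            * n * np.finfo(float).eps)
        return flops / elapsed / 1e9, resid

    rate, resid = hpl_style_rate(2000)
    print(f"{rate:.2f} Gflop/s, scaled residual {resid:.2f}")

Running such a measurement over a range of problem sizes on a given machine
traces out the curve from which Nmax, Rmax, and N1/2 are read off.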
HPCwire: How would you rate the "IDC balanced rating HPC benchmark", which
considers other parameters like bandwidth/latency of the memory and the
scalability of the system? Is it a more realistic approach than Linpack and
therefore capable of replacing Linpack for the evaluation of supercomputers?
DONGARRA: The IDC benchmark looks on the surface to be a good starting point
for a follow-on benchmark. Unfortunately, because of the way the rating
procedure is implemented, there are problems. Others have gone into detail on
the problems.
As Aad van der Steen said in a recent article in the Primeur Weekly of
February 13, 2002, when asked how informative the IDC Balanced Rating HPC
Benchmark is: "It is not. It looks like all the time and effort invested
in the HPC Forum has yielded a sub-standard product that should be radically
improved or withdrawn." Presently it only adds to the confusion that already
exists on the HPC benchmark scene.
HPCwire: Are the TOP500 authors searching for an alternative to Linpack as
yardstick for such evaluations? Or will we be forced to remain satisfied with
Linpack for the expected PF/s systems within the next 10 years?
DONGARRA: Yes, the organizers of the Top500 are actively looking to expand the
scope of the benchmark reporting. It is important to include more performance
characteristics and signatures for a given system. There are a number of
alternatives to look at in expanding the effort, such as the STREAMS benchmark,
the EuroBen-DM benchmark, the NAS Parallel Benchmarks, and PARKBENCH. Each of
these describes more about the scalability of systems and provides more
insight than the IDC benchmark will.
HPCwire: In the TOP500-list, which will be published in June during the
ISC2002 in Heidelberg, the Japanese Earth Simulator (ES) will replace the
American ASCI White as the new #1, with a best Linpack Performance of 35.61
TF/s. How large was the dense system of linear equations that ES solved for
this new Linpack world record, and how long did it take ES to solve it?
DONGARRA: The system of linear equations was of size 1,041,216 (8.7 TB of
memory). This is the largest dense system of linear equations I have seen
solved on a computer. The benchmark took 5.8 hours to run. The results of the
computation were checked and were accurate to floating point arithmetic
specifications. The algorithm used to solve the system was a standard LU
decomposition with partial pivoting and the software environment, for the most
part, was FORTRAN using MPI with special coding for the computational kernels.
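As a rough back-of-the-envelope check of these figures, the matrix alone
occupies n^2 * 8 bytes of double precision storage, and the nominal operation
count of 2n^3/3 + 2n^2 can be combined with the quoted run time in a few
lines:

    n = 1_041_216                          # order of the ES Linpack matrix
    matrix_bytes = n**2 * 8                # 8 bytes per double precision entry
    flops = 2.0 * n**3 / 3.0 + 2.0 * n**2  # nominal operation count
    seconds = 5.8 * 3600.0                 # quoted run time, rounded
    print(f"matrix storage : {matrix_bytes / 1e12:.2f} TB")
    print(f"sustained rate : {flops / seconds / 1e12:.1f} TF/s")

This gives roughly 8.7 TB of storage and on the order of 36 TF/s, roughly
consistent with the reported 35.61 TF/s given that both the run time and the
rate are rounded.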
HPCwire: You yourself have referred to ES in the New York Times as Computenik,
alluding to the Russian Sputnik in 1957, which certainly motivated and
accelerated the US space program considerably. Was the rapid completion of ES
really such a great surprise in the USA? As far as I know, the Japanese ES
project, including the time schedule, was known worldwide from the beginning.
Or was the scheduled completion of ES with an efficiency of 87% considered too
ambitious?
DONGARRA: I think the time schedule was known, but the fact that it was
completed on time and is up and running is impressive. The result achieved with
87% on the benchmark is also very impressive. Another impressive feature of the
performance is that it is 5 times the performance of the ASCI White Pacific
computer on the benchmark. This is the largest difference we have seen in the
past 10 years.
Here are some additional impressive statistics of the Earth Simulator:
The performance of ES is approximately a fourth of the performance of all the
TOP500 computers from the November 2001 list, greater than the performance of
all the DOE computers put together, and greater than the sum of the Top 20
computers in the US.
HPCwire: Do you think that ES will accelerate the ASCI-Program?
DONGARRA: People in the US are taking notice of the Japanese accomplishments
with the Earth Simulator computer, and it is the hope of many of us that the US
government will continue to invest in the future of high performance computing.
HPCwire: What does the ES mean for "vector parallel multiprocessing
architectures"; will they gain terrain? Will not only NEC, but also Cray Inc.
with the SV2, profit from ES?
DONGARRA: I'm not sure about the Cray SV2, but the NEC SX architecture is very
impressive as a highly parallel system as configured in the Earth Simulator. I
would guess there are a number of customers around the world who are very
interested in acquiring such a system composed of the SX nodes and the Earth
Simulator's switch fabric.
HPCwire: Currently grid computing is of major interest. For example, at ISC2002
in Heidelberg a whole tutorial day will be dedicated to this topic, as will an
entire conference session, and you will deliver the Friday keynote address on
the topic "High Performance Computing, Computational Grids, and Numerical
Libraries".
What are your current projects concerning grid computing?
DONGARRA: At the University of Tennessee we are working on three Grid related
projects, NetSolve, GrADS, and Harness.
Since 1995 we have been working on an approach to grid computing called
NetSolve. NetSolve provides easy access to computational resources that are
distributed with respect to both geography and ownership. Using NetSolve, a
user can access both hardware and software computational resources distributed
across a network.
In addition we are working with a group of colleagues on an NSF-funded program
execution framework being developed by the Grid Application Development
Software (GrADS) Project. The goal of this framework is to provide good
resource allocation for Grid applications and to support adaptive reallocation
if performance degrades because of changes in the availability of Grid
resources.
HARNESS (Heterogeneous Adaptable Reconfigurable Networked SystemS) is an
experimental metacomputing framework built around the services of a highly
customizable and reconfigurable distributed virtual machine (DVM). A DVM is a
tightly coupled computation and resource grid that provides a flexible
environment to manage and coordinate parallel application execution. This is a
collaboration of researchers at the University of Tennessee, Emory University,
and Oak Ridge National Laboratory, funded by the Department of Energy.
HPCwire: How would you predict grid computing to affect the landscape of HPC
in the future?
DONGARRA: The Grid will make it possible to implement dramatically new classes
of applications. These applications, ranging from new systems for scientific
inquiry, through computing support for crisis management, to support for
personal lifestyle management, are characterized by three dominant themes:
computing resources are no longer localized, but distributed and hence
heterogeneous and dynamic; computation is increasingly sophisticated and
multidisciplinary; and computation is integrated into our daily lives, and
hence subject to stricter time constraints than at present.
A good resource on activity in the area is "The Grid: Blueprint for a New
Computing Infrastructure", edited by Globus originators Ian Foster and Carl
Kesselman (Foster and Kesselman, 1998).
HPCwire: There are many non-commercial projects in the area of grid computing,
for example SETI@home. When can we expect the first commercial applications?
DONGARRA: There are examples today. Companies like United Devices, Entropia,
Avaki, and Parabon Computation are trying to establish a viable commercial
operation using the concepts and ideas that came from the SETI@home project.
Web site: http://www.supercomp.de