ANTS TECHNOLOGY MAY IMPROVE COMPUTER PROGRAMMING
by Peter C. Patton, Ph.D.
07/28/00
San Diego, CA -- CHoPP, for Columbia Homogenous Parallel Processor, is a
parallel processor architecture developed by Herbert Sullivan when he was at
Columbia University in New York in the 1970s. Sullivan used the acronym CHoPP
as the name for a venture capital corporation he founded. He wanted funds to
build a highly parallel supercomputer based on the CHoPP architecture and
patents that resulted from his work at Columbia. He intended that this
supercomputer would take advantage of concurrency at all levels of granularity
in a computer program, even in existing programs that had been written for
sequential or vector machines. A prototype was built at the firm's facility in
San Diego, California and the first model was sold to the San Diego
Supercomputer Center. It was never delivered due to difficulty raising
capital, but the prototype demonstrated that the basic concepts of the CHoPP
architecture were effective and indeed, could be implemented in computer
hardware.
The firm was reduced to a small consulting and design staff that became
involved in a number of government sponsored studies and design projects
revolving around parallel processing. During this period Sullivan came to
realize that there was very little difference in the performance of various
shared memory architectures. Average latency increases approximately with the
log of the number of processors, processing power increases almost linearly
until the number of processors equals half, or so, the average concurrency.
Increasing the concurrency of the programs was much more important than any
particular architecture. Much of the uniqueness of the CHoPP architecture,
particularly the patented techniques for exploiting latent concurrency in
existing programs, could also be implemented in user-level control software.
The increasing use of distributed computing models, such as MPP and, more
importantly, networks, led CHoPP to concentrate on using increased concurrency
to reduce the effect of network latency on performance, in addition to the
original goal of allowing nodes of hundreds or thousands of processors to
scale. The original patents were sufficiently broad to cover this unforeseen
application to software. However, recent developments at CHoPP and the
increasing possibility of gaining software patents have led management of the
firm to file specific software patents to cover a software development
technology called ANTs: Asynchronous Non-preemptive Tasks. CHoPP has now
applied for a patent for ANTs as a non-preemptive threading technology.
The people at CHoPP realized that, while the commercial software market was
having difficulty scaling beyond a handful of processors, CHoPP had technology
that permitted scaling into the thousands of processors. While the scalability
and robustness of asynchronous, stateless communication has been made obvious
by the success of the World Wide Web, its application to general purpose
computing has been hidden by the overhead of existing methods. CHoPP has been
using a low overhead mechanism for asynchronous distributed control for some
time. The thirty year history of the development of this technology has been
distilled into a simmple and elegant set of primitive operations that are
object oriented, easy to use, and compatible with existing software systems.
CHoPP now offers this environment for sale as the ANTs Software System. It is
available in C++ and Java (soon). ANTs will also be available as part of
MinimalJ, the most secure and portable client ever developed.
When Herb Sullivan first told me about the software implementation of his
CHoPP technology, I reminded him of a termite colony. The mystery of a termite
colony arises from the number of bits of information required to describe the
architecture of a termite mound, a six-foot high structure commonly seen on
the African savanna. The information required to design the architecture of
the mound surpasses the number of neurons in a given termite's brain; no
individual termite's brain is large enough to store the entire design. So
where does the colony store it? Herb's response was that, of course, the
architecture is not stored explicitly, each termite knows only its
preprogrammed function and they all work together in a non-preemptive
multi-tasking environment, each to do their own thing, which in sum is the
thing of the colony. He then replied that ants do the same sorts of things as
termites but do it in the daylight and are beautiful and shiny black or red
creatures, rather than ugly and pasty white or semi-transparent, so he would
rather call the technology ants. Moreover, he pointed out, the acronym works
out much better, as well.
ANTs are a programming technology for simplifying large complex software
systems into independent components, thereby making them more reliable, more
efficient and easier to maintain. It achieves these results because, like the
ant or termite colony, it distributes control structures throughout the
computation. ANTs computations are a collection of essentially independent
component sub-programs whose interactions are described within the component
sub-programs themselves, and do not depend on a central control structure.
ANTs execute concurrently, or appear to be processing data simultaneously and
interact only when they explicitly exchange data. Thus ANTs computations are
concurrent computations in which all components appear to execute
simultaneously unless they are waiting for data to be provided by another ANT.
An inherent property of ANTs computations are that they execute concurrently
on parallel multiprocessor or distributed hardware. The benefit of the ANTs
programming technique can be seen from two perspectives. On the one hand, ANTs
simplify programming by eliminating the need for central control structures;
on the other, ANTs enhance efficiency by effectively exploiting the parallel
resources of a multi-processor.
The computer architect has faced two challenges: improvement of
computational efficiency - by avoiding the waste of time or space - and
reducing complexity by logical simplification and reductions of system
component count. In terms of computational efficiency, progress has been made
in two key areas. Improvement in communication between software and hardware
has come through the use of operating systems advances and threading
technology. Efficiency has also been improved through the use of multiple
processors. In this domain, performance is also improved because multiple
processors can cooperate to complete a given computation in minimum time.
Computer program resource demands are satisfied by calls to an operating
system. Traditionally, hardware resources are assigned to competing processes
(which may or may not be parts of one or more concurrent programs) by a
technique called multi-tasking or multi-threading. This process takes place in
the twilight zone between the application program and its assignment of
resources by the operating system upon which it depends. Today's
state-of-the-art technology in this area is synchronous, preemptive
multi-threading; that is, when an application asks the operating system to
accomplish a task, the system opens a thread and locks it; assigning the
requested task to that thread. The thread remains locked until the task is
complete or the operating system decides some more important resource needs
that thread's assigned resource. These multiple preemptive threads are then
scheduled by a single thread responsible only for scheduling all the others.
In conventional procedural programming practice, the control program
specifies the sequence in which component parts of a computation are executed.
Normally, this sequence ensures that program segments do not execute until the
data they need is available, and also ensures that appropriate program
segments execute in time to provide data before it is needed for the
computation to progress. As computations become larger and more complex,
control structures become more complex and limit the size and effectiveness of
the application. Complexity rises exponentially with the number of function
points in an application program. At 1,000 or more function points, complexity
usually becomes overwhelming and the application is abandoned. While systems
this large do not necessarily require more bits than the number of neurons in
a single human brain, they do seem to be well beyond the ability of a single
application architect to describe, and to predict, the real world performance
of the application system.
In software engineering, the usual method for solving complexity issues has
been to break complex problems down into a number of simpler, more manageable
parts. Unfortunately, the programmer must then also undertake the coordination
of the various components of the program; there has been no technique for
separating the central control mechanism and encapsulating it into the
program's components. In fact, the control structure becomes either highly
centralized in an explicit way or interwoven implicitly into the fabric of the
application program. The result is that the parts of the program are no longer
interchangeable or amenable to independent upgrades, and the entire program
must be analyzed when local changes or maintenance activities are undertaken.
The advantages of breaking up the program into quasi-independent segments are
undermined by centralized or embedded control structures.
Due to its central control structure preemptive threading technology may get
in the way of the effort to reduce complexity. ANTs solve this problem by
eliminating the central control structure and making the interaction between
program components explicit, asynchronous, and easily accessible for later
changes. These components, or ANTs, acquire input data, process that data, and
provide the results as output to be received by other ANTs, which are in a
suspended state until the data they need becomes available. The programmer is
no longer required to be concerned with scheduling or explicitly sequencing
tasks because the ANTs control kernel (ACK) automatically schedules the
execution of any program component or ANT whenever the data it awaits becomes
available for processing. There is no longer any need to sequentially order
processes and the computations will proceed concurrently from the point of
view of the programmer. In fact they will proceed in parallel on a
multiprocessor or distributed server system up to: Minimum {processors
available, concurrency in the program} which is one of Sullivan's theorems.
Dependencies of one part of the program on another are automatically
respected, because no ANT can execute until the ANTs on which it depends have
provided the data it needs to execute.
The efficiency of ANTs depends on the special properties of the ACK, which
is at the heart of CHoPP's proprietary technology. The ACK sits between the
program, which may be written in any language (commonly C++ or JAVA) and any
operating system (typically UNIX or NT). ANTs communicate with messages sent
to mailboxes and the ACK acts as postmaster. Some of the functions of the ACK
are usually provided by conventional operating system kernels, but these have
two shortcomings for ANTs. First, they are not efficient enough to manage a
program broken into so many small parts, and second, currently neither
operating systems nor databases provide a single object oriented interface
required for today's programs. The efficiency issue can be understood by
observing that the natural decomposition of a problem often produces numerous
fine grain components. For example, a business transaction may spawn as many
as 50 internal computing/database transactions or object references. A natural
programming style would turn each such activity into an ANT. Conventional
operating systems would associate each such component with a synchronous
operating system process, or thread. Then scheduling and task switching
overhead would be incurred every time a thread or process is activated or
suspended. The useful work done by these small grain components, say 50
instructions or so, would be very small compared to the thousands required for
thread management.
By contrast, the ACK executes 10 to 30 instructions for overhead when an ANT
is activated. The overhead is so low that the programmer can ignore size
considerations in partitioning a computation into its natural components,
using segments as large or as small as the application or its algorithm
dictate. The second reason the ACK must be substituted for certain operating
system functions is that protocols for thread management are inconsistent with
the view that ANTs are independent, encapsulated objects. Threads communicate
via unrestricted shared memory that provides no tools for private protection,
or for structured shared access. As a result, threads are notoriously
difficult to use; they provide mechanisms for concurrent (and parallel)
execution only at the expense of increased programming complexity. ANTs, which
make the interaction between components explicit and asynchronous, provide an
improvement in conventional encapsulation, while permitting concurrent
execution as a welcome side benefit.
ANTs are more efficient than threads for mathematical reasons and not simply
because of coding tricks. An ANT program is a partially ordered set and
therefore has more degrees of freedom in execution order than does a
deterministic set of threads. Because the thread control mechanism is used to
manage hardware and I/O, fairness and prioritization are essential. Fairness
is inherently blocking and prioritization is inherently centralized.
Lock/Unlock, etc. are inherently blocking because any locked thread must place
itself on a list and the unlocking thread must be able to restart it. One or
the other must wait while the list is altered. Once the thread count is
sufficiently high, blocking dominates execution time. ANTs are non-blocking.
ANTs cannot span multiple applications with a single mechanism. ANTs cannot
operate in real time or assign priorities. ANTs can only guarantee that all
the work in a batch of work is completed before it halts. This simplification
allows ANTs to be non-blocking and have distributed control.
Traditional programming techniques have difficulties when, as is usually the
case, the event, which determines a context switch, occurs before the
operations necessary to permit it are complete. The resulting flag variables
inevitably lead to errors. An ANTs program makes the context switch
immediately and the ANT representing the new context waits for the data to be
ready for it. This simplifies the code and removes the largest barrier to
concurrent execution. The two technologies differ in six fundamental aspects
as shown in the table below. ANTs operate in user space whereas threads
operate in the kernel space. This lowers overhead and improves protection
between processes. The operating system is not aware of ANTs, it deals with an
ANT program as if it were a single process.
ANTs impose no priorities and unlike threads, are non-preemptive. Priorities
can be established only by reference to a global view of execution, a view
that determines some order relationship between threads. Such a global view is
in conflict with the notion of independent execution. Priorities are
appropriate in the execution of operating system calls, for the management of
hardware resources, for example. In these cases ANTs make OS calls and
interface with threads in the usual ways. ANTs are encapsulated objects with
protected code and data. Threads are not objects, they are kernel managed
scheduling entities.
ANTs execute reentrant code and have private data that is not accessible
from other ANTs. They communicate by means of structured mailbox objects only.
These mailboxes provide the primary ANTs message passing control mechanisms,
because when ANTs access mailboxes they are suspended or continue execution
according to the availability of data in the mailboxes. Threads, by contrast,
communicate via shared memory. This provides no protection or structure to
threads communication priorities and is a significant source of error prone
code.
The programmer sees the synchronization and coordination of ANTs based
entirely on data availability; ANTs execute as soon as the data they need
becomes available. This data flow view appears natural to the programmer.
Threads are synchronized and coordinated using Wait/Signal or Lock/Unlock
primitives. Their role in the application program is unspecified and extrinsic
to the application logic or algorithm, and often manifested within obscure or
opaque coding sequences. This contributes to the complexity and error
proneness of threads usage. When an ANT is suspended and another starts to
execute certain housekeeping functions, an activity called task switching must
be carried out. The number of task switching instructions required averages
between 10 and 30 for ANTs. The analogous operation for threads takes a
thousand or more instructions. This distinction is at the heart of ANTs
efficiency and also contributes to ANTs programmability. Such low task
switching overhead allows the user to be indifferent to task granularity.
ANTs and Threads Compared
ANTs Threads
Execution Environment User Space Kernel Space
Priority Preemption No Priority Preemption Priority Based Scheduling
Encapsulation Private Data and Code None
Communication Message Passing Shared Memory
Synchronization Criteria Data Availability Lock/Unlock
Task Switching 10-30 Instructions 1000+ Instructions
ANTs technology originated as a programming technique for exploiting
parallel processing in multiprocessor computer systems. Much of the early work
on concurrent programming for parallel processors involved explicit mapping of
the application onto the computer architecture by the programmer, including
discovering new algorithms suitable for specific architectures. In many
embedded applications, the parallel processor was even specifically designed
to execute a given algorithm. From the earliest days of the CHoPP
architecture, the design team sought automatic mapping technology to remove
this burden from the programmer. During the past decade while the CHoPP team
has been developing ANTs as the software realization of its hardware
architecture, multiprocessor operating systems have developed around the
threads technology first used at Carnegie Mellon for Mach. This became the
kernel of AiX. More recently the JAVA programming language was developed with
light weight threads in an effort to reduce complexity.
Threads and ANTs both incorporate automatic mapping techniques that enable
the programmer to create concurrent programs without reference to the specific
hardware platform on which the application will run. ANTs cannot perform many
of the real-time operating system functions for which threads were designed,
and thus an ANTs program makes system calls to thread implemented functions
whenever needed. On the other hand, threads are poorly adapted as a basis for
a modern application programming system, because they introduce too much
overhead, they defeat encapsulation, and they rely on a central control
paradigm.
The original CHoPP goal of eliminating the manual mapping of programs onto
multiprocessor hardware was achieved through ANTs by a significant reduction
in the overhead required for intertask communication. To effectively exploit
the parallel resources of a multiprocessor, it is necessary to express the
application as a collection of tasks that can be distributed among the
available processors. The natural decomposition of an application into
interdependent cooperating tasks that communicate frequently requires
processors to communicate frequently. Over the past decade interprocessor
communication has improved dramatically and so has networked communication
between distributed processors. In fact, processors in IBM SP Series systems
connected by T3 lines are indifferent as to whether they communicate with
another on the same Omega switch or over the network with a processor
thousands of miles away. Today, hardware generated time required for
interprocess communication is about the same as ordinary processor memory
access when there is a cache miss. The software generated time required for
threads to communicate when potentially suspending is two to three orders of
magnitude higher than the hardware time. Thus threads are unsuited for
communication intensive applications because they generate so much overhead
they cancel the benefits of parallel processing.
ANTs use the conventions and tools of object oriented programming technology
to provide encapsulation, naming, and reuse. The initial implementation of
ANTs was done using C++; although a runtime library and several precompiled
files are required to program in ANTs, no special programming tools are
necessary. A JAVA version of ANTs is in its final stages of testing and will
soon be available in demonstration form from an ANTs Web site. ANTs provide
the programmer with a small set of reusable object classes, foremost of which
are the ANTclass and the Mailboxclass. The actual ANTs runtime environment is
invisible to the user and is entirely self-adjusting, since it executes object
code on different platforms without user intervention. The programmer will
need to give some attention to issues arising from partial ordering of their
ANT "colony" or set of tasks in order to gain the full performance benefit of
the technology when using a parallel processor.
Objects of the ANTsclass have private data, but all public data is
restricted to that contained in Mailboxes. This data is accessed by
communications instructions, which are method calls to the Mailboxclass.
Objects of ANTsclass execute concurrently to the extent of processors
available to host them. Their order of execution is determined by the
Mailboxclass methods invoked for interprocess communication and not by any
syntax ordering of the ANTs within the program. Unlike threads, ANTs operate
in the user space under the ACK in the ANTs runtime package.
The Mailbox object has several private states, two of which are Valid and
Invalid. If the Mailbox state is Valid, the Mailbox contains a pointer to
public data. The methods of the Mailboxclass are called communication
instructions because these are the only way that ANT objects communicate with
each other. Sample communication instructions for the C++ version are:
void Cmailbox::Send (void* pointername) // places void pointer in mailbox
and marks mailbox valid
void* Cmailbox::Receive (void) // waits until mailbox is valid and returns
from send
void Cmailbox::Reuse (void) // invalidates mailbox
These functions allow the ANTs programmer to send mail to a particular ANT
by storing a pointer in its mailbox, to receive mail by waiting until a
particular mailbox is filled, and to clear a mailbox for later reuse.
Naturally this syntax is a bit different for the JAVA version of ANTs since
the JAVA language does not support pointers. The JAVA versions are:
public class Mailbox {
Mailbox (Class type); // TypeDesc is always a reference
void Send (type ObjectReference); // in JAVA, Mailbox is strongly typed
type Receive (void);
void Reuse();
}
The JAVA Mailboxclass allows the programmer the same three functions without
the use of pointers. The constructor of a Mailbox checks a list of available
Mailbox types and creates the correct one; if it does not exist, a new class
is created and added to the list. Unlike in C++, a JAVA Mailbox may only pass
a single type of object. The advantages of this approach are that all ANTs
programs are strongly typed and no runtime type checking is necessary. The
disadvantages are that it is necessary to subclass mailbox (or use TypeDesc)
to know the actual class of a mailbox. Certain programs that create dynamic
classes without end may have their symbol and vector tables grow much like a C
memory leak; runtime type checking will be done on lists of mailboxes because
they are underspecified. JAVA casting rules make this necessary and dynamic
testing is too slow to be useful.
Experiments with ANTs have shown a factor of four performance improvements
without changing the code, other than to add ANTs wrappers to natural, easily
recognized application program components. Another promising component based
technology for distributed object computing has appeared as a part of the
greater JAVA phenomenon recently and comparison with ANTs is interesting.
JavaBeans is a portable, platform-independent component model written in the
JAVA language and developed in collaboration with software industry leaders.
It enables developers to write reusable components once and run them anywhere,
taking advantage of the platform independent nature of JAVA technology and the
object distribution capability of the Internet. JavaBeans acts as a bridge
between the proprietary component models and provides a seamless and powerful
means for developers to build components that will run in ActiveX container
applications, for example. The JavaBeans' API provides a framework for
defining reusable, embeddable, modular software components. The JavaBeans'
Specification defines a bean as a reusable software component that can be
manipulated visually in a builder tool. Any object that conforms to certain
basic rules can be a bean; there is no Bean class that all beans are required
to subclass. A bean exports properties, events and methods. A property is a
piece of the bean's internal state that can be set and queried by the
programmer; a bean may generate events and may export methods as public
methods defined by the bean. In addition to these common object features a
bean may have indexed properties, bound properties, and constrained
properties. An indexed property has an array value for which the bean provides
methods to get and set individual elements of the array. A bound property
sends out a notification when its value changes, and a constrained property
does the same but also allows that change to be vetoed by listeners.
ANTs always adopt the component mechanism of its host language. So, if a
JAVA ANT is compiled as a JAVA bean, it is componentized as a bean. Since all
ANTs calls are virtual and JAVA supports only virtual calls, it doesn't
matter, except that JAVA cannot compile across a bean boundary and thus
performance suffers. The JAVA communication model is not scalable but the use
of ANTs technology does fix that problem. For example, a JAVA program compiled
as ANTs ran four times faster than its non-ANTs version.
The reason ANTs simplify code is clarified by referring to the old 4GL
arguments between the "template" school and the "rules" school. The template
school permitted similar instances of code to be produced cookie-cutter style
by filling out a form. The rules school created code from a list of rules
specifying what ought to occur in any circumstance. The difficulty both
schools faced was that they could only create 90% of a program, but moreover,
there was no means to go from the 3GL output of the methodology back to the
original 4GL input. Any change to the 4GL wiped out the 3GL supplement that
finished the program. The most important contribution of object oriented
programming to productivity may be the fact that it provides a reversible
template technology. However, the objections of the rules side to templates
are still valid. The implementation of formulas, tests for conditions, names
of methods, and types of data are spread out in the code.
Cataloging and altering these a posteriori code augmentations becomes
virtually impossible in a large program. The complex job of determining
dependencies and order is still left to the programmer. ANTs adds to object
oriented programming the most important benefits of rules based programming,
that is, the ability to ignore ordering, the separation of rules from the rest
of the program, and removal of specific symbols from the object language
source code. While earlier systems used sophisticated algorithms to accomplish
these things, ANTs simplifies things by associating the execution of code with
the Boolean states of data references. There was little else that
sophisticated systems could do anyway. Virtually all of the objections of the
rules school to object technology are eliminated without resorting to
preprocessors or arcane syntax, and without giving up a single benefit.
Mailboxes contain references to data and not data itself. The structure of
C++ and JAVA, the only intended target languages for ANTs, make the cost of
that indirection insignificant since data is rarely passed by value in a real
program. Since pointers and references only differ as to whether arithmetic is
allowed, and since ANTs discourage address arithmetic, there is little
practical difference.
JAVA performance seems to have several significant bottlenecks. Those most
often complained about will become moot with the advent of optimizing
compilers for servers and frequently used applications. More difficult is the
synchronization of threads. When a thread encounters a lock it must test it
and if it is already locked, place itself on a list. To do that, it must lock
the list. When a thread releases a lock, it, or its proxy, must check to see
if a thread is waiting to gain the lock. If the list is locked, only bad
choices remain. Either the thread can spin on the lock or it can suspend,
leaving the next thread wanting the lock waiting. If the locked-out thread
finds the list locked, it must suspend and wait. Many implementations use a
third thread to maintain the list. Regardless of the strategy, a reliable
system is computationally costly. With a reasonably large number of
preemptively multi-tasked threads, synchronization will come to dominate
execution time. Example: If a list remains locked for a microsecond, and the
preemptive slice is ten milliseconds, one in ten thousand times, the thread
locking the list will be preempted. Because preemption usually lasts much
longer than the time slice, a busy list will almost always be locked by a
preempted thread.
When the odds are high that a thread will pause for a lock, the time for a
suspend/resume pair must be included in the cost of the lock. When two or more
threads are polling for a place on the wait list, and the odds are low of
getting in, many times the cost of a suspend/resume pair can be incurred. The
non-blocking structure of ANTs fixes this unfortunate situation, which often
occurs in large distributed object information systems.
Because JAVA disallows all of the performance enhancing hacks that C
programmers use to make elaborate control programs work (like table driven
state machines, which are rather slow without pointer arithmetic), JAVA tends
to have performance problems with parsers and formatters, for example. ANTs
make most of these programs run faster than their C equivalent by replacing
the table entries with ANTs waiting on mailboxes. Because JAVA is structured
as a call-return, synchronous language, distributed systems must either use
some programmer-devised optimization or fail from network delay. ANTs are
inherently asynchronous and naturally exploit the concurrency of a program to
moderate latency.
A principal goal of ANTs from the beginning was to reduce the need to map a
program onto the architecture of the processor that would execute it. In
today's business computing environment, that translates into avoiding API
sets. Programs that must operate on various machines have a difficult time
dealing with variations in their threading models, event models, message
queuing and other system functions. Even systems that only support a single
operating system face problems scaling to clusters, distributed computing,
handling network latency and properly packaging system calls. ANTs permit a
single code base that can be run on a single computer, clusters, distributed
networks, and large parallel processors and distributed systems without
alterations. Examples of large scale enterprise servers particularly
well-suited for the ANTS programming model are the IBM SP Series and the
Unisys Aquanta 32 processor enterprise server, both of which can be clustered
into sets of 4-8 processor groups.
Writing for a cluster usually involves mapping functions onto specific
processors and contending with a difficult API. There are several reasons that
this is the case: the call-return model does not start work on data until it
is needed (the function that creates the data is called as a request for the
data); synchronization is performed by event notification, which is a system
function; event notification across OS instances is accomplished by a
different mechanism than within the OS; and migration of computation from one
node to another is difficult or impossible. Moreover, Object Oriented
Programming (OOP) does not provide enough encapsulation to hide machine
boundaries and asynchronous communication is provided by an external entity
(the message queue) which is slow and clumsy.
ANTs employ asynchronous communication as its basic primitive. Which
instance of the OS in a cluster an ANT actually runs on is a matter of
considerable indifference to ANTs. ANTs are linked to the list of runable
tasks by the arrival of all requested data and, therefore, are neither
notified nor polled. Because no distinction is made about where an ANT runs,
the cluster API is not a factor. Because ANTs automatically load-balance in a
cluster, no special mapping is required. Because ANTs do not rely on the
system for synchronization, distributed and dissimilar systems are no problem.
(Special harnesses for fault tolerance have been designed, but not
implemented.) Hand mapping synchronously communicating processes does not work
very well. ANTs are a significant improvement. Windows NT, and, to a lesser
degree, UNIX are famous for their system inefficiency. ANTs dramatically
reduce the role of the operating system, improving efficiency and scalability.
Thread management overhead is lowered 90% using ANTs and can be distributed.
This is enough to make NT scale to very large parallel systems.
The call-return model of programming, embeds the name of the function called
in the calling code. ANTs embed the name of the desired value. Additionally,
the name is local to a component, and, therefore, components can be
interchanged without altering the source of other components. Rules are easy
to add and modify, because whenever Event X happens is the fundamental
primitive of ANTs and is usually the trigger for rules. Call-return programs
must embed the notification code in the calling sequences, however many of
them there may be, and that makes change difficult.
The reliable superclassing of objects makes JAVA an ideal language for forms
routing systems (because stations may need only the functionality of the
superclass, and keeping the class description of every form on every station
is a management problem), but the absence of useful synchronization mechanisms
in JAVA thwart most efforts in this direction. ANTs provide robust
synchronization at this global level, reduce the cost of network latency, and
improve component encapsulation.
Large online applications with routing requirements for many forms and,
similarly, directed map simulations are notoriously hard to parallelize. ANTs
make it easy because ANTs programs are partially ordered directed maps. Many
common instruction lists are partially ordered directed maps, such as recipes.
For instance, it is necessary to break the eggs before separating them, but it
is not necessary to break them before sifting the flour or vice versa. It is
necessary to do all three before mixing the flour and the eggs. It is
interesting to note that cooking a meal is often used as an everyday example
of parallel processing, even though the recipes or instructions for preparing
the meal's components or dishes, are procedural rather than descriptive. With
complex sets of instructions, it is often difficult to find an order for the
instructions that will not halt them before completion. ANTs naturally express
their latent partial ordering and the directed map. An ANTs program will not
halt before completion, if any path completes, regardless of whether the
programmer recognizes the proper path or whether there is even a consistent
set of rules for recognizing it. This fact simplifies the job of writing the
program, improving the performance by a factor of a hundred by eliminating a
complicated central control. In addition, the program will run at maximum
concurrency on a multiprocessor without modification.
Lawson Software has the first release C++ version of ANTs installed on a ten
processor Aquanta XR/6 NT server in the Research & Development benchmark
laboratory. The technology is being evaluated as a means of improving
performance against the strict preemptive multi-threading of the NT operating
system. We have an application written in JAVA and are looking forward to the
forthcoming JAVA version of ANTs to evaluate performance improvements over
JAVA's lightweight but still preemptive threading.
ANTs are an enabling technology not only for parallel processing but for
distributed object oriented systems as well. Since ANTs are language and
operating system independent (implemented in both C++ and JAVA for NT, UNIX
and any JVM), the components of a distributed object based application can be
assembled from objects written in either language, (or any other, with JAVA
wrappers), running on any OS that supports them or any JVM.
One might ask why anyone would want such a complex distributed application
platform. The answer is that nobody does. Such systems are usually "received
structures" and must be operated as effectively as possible until they can be
replaced by more highly integrated successors. It is unusual but not unknown
for customers of an ISV providing cross-industry applications on multiple
platforms, to (inter)operate software suites on AS/400 servers for
manufacturing, UNIX servers for CAD/CAM and perhaps supply chain, and NT
servers for European subsidiaries and some corporate functions like HR. As
these applications migrate to object technology with increasing functional and
geographic distribution and ever more complex interoperability, software tools
like ANTs will enable dramatic performance increases over preemptive threading
technology.
Our experience in the development and application of parallel processing
over the past thirty years and in distributed object oriented systems over the
past ten, tells us that ANTs represent a significant break-through technology
with a very promising future in the Age of JAVA. We have long known that the
real world operates concurrently; physicists tell us that time is all that
keeps everything from happening at once. The object paradigm offers an
opportunity to map real world business objects such as invoices, for example,
into computer stored, manipulated and distributed electronic objects in a
natural way. ANTs allow us to escape the constraints of the computer operating
system's in-built deterministic, sequential ordering and also to map onto our
computer models the partially ordered real world activities that make up
actual business processes.
The author wishes to acknowledge many helpful discussions with Clifford
Hersh, Vice President for Product Development at CHoPP Computer Corporation
and with Dr. Richard Wright recently of Lawson Software, now with The Pitts
Library at Emory University, Atlanta, G.A.
---
Peter Patton was formerly Vice President of International operations for
Lawson software group based out of MN. Patton has held numerous prestigious
high technology positions, including the post of chief information officer
(CIO) of the University of Pennsylvania and lead IT consultant to: the
National Technology Transfer Center; NASA's Jet Propulsion Laboratory; and Dun
and Bradstreet's DataQuest Division. He was founding director of Minnesota
Supercomputer Institute, vice president and director of Parallel Processing
Architecture at MCC and director of computing at the University of Minnesota.
He is the author of 16 books and 86 journal articles. Patton also holds a
Harvard A.B. degree in Engineering and Applied Physics, an M.A. in Mathematics
from the University of Kansas, and a Ph.D. in Aerospace Engineering, summa cum
laude, from the University of Stuttgart in Germany.
For more information, see http://www.antssoftware.com/