VAST
auto-parallelization, auto-vectorization tools
VAST is a family of powerful software tools that can automatically convert serial code into code designed for dual processors (Windows/Macintosh and Linux), add OpenMP directives, or vectorize your code for AltiVec (G4/G5 Macintosh and PPC/Linux). VAST is available for both Absoft and IBM compilers.
- VAST/AltiVec automatically translates Fortran, C and C++ to use the AltiVec vector unit on PowerPC G4- and G5-based systems.
- VAST/Parallel automatically translates Fortran and C to use multiple processors via thread parallelism on Linux-based, Windows and Mac OS X systems.
VAST Vector (AltiVec) features include:
- Optimization of entire loop nests, not just inner loops. Critical optimizations include loop fusion (merging multiple loops into one loop), outer loop unrolling (unrolling an outer loop inside an inner loop), loop collapse (making one long loop from a multi-dimensional loop nest), and loop interchange (changing the order of the loops in a loop nest to get more efficient memory access).
- Unrolled vector loops. Unrolling vectorized loops is very important in making sure that the vector instructions are overlapped to the maximum extent possible.
- Vectorization of reduction loops. Includes array summations, dot products, minimum and maximum element of an array, product of array elements, etc. These operations take a large fraction of the CPU time for many programs.
- Vectorization of conditional loops. "if" statements and conditional operators are vectorized.
- Non-aligned vectors can be vectorized efficiently. VAST introduces "permute" operations to align vectors "on the fly" prior to computation.
- 32-bit float and 8-, 16- and 32-bit integer vectorization. Integers can be signed or unsigned. Also, VAST can vectorize loops that contain mixed data sizes.
- ALIGNED pragma so that the user can inform VAST-C about arrays that are aligned on 16-byte boundaries. The -Valigned command line switch is also provided.
- -Vmessages switch to get vectorization messages for all loops in the program. Find out what constructs are inhibiting vectorization of your important loops.
- DISJOINT and NODEPCHK pragmas for disambiguating data dependencies. Especially useful if the target program uses lots of pointers rather than array notation.
- -L parameter for assertion levels to allow vectorization in the presence of pointer arguments. Can be very useful if the program is written to pass most of the data as pointer arguments.
- Vector load lifting. Moves all loads to the top of the loop, as far as they can safely go. Allows the compiler to do a better job of instruction scheduling.
- Vectorization of complex data types. Uses the permute instructions to reorder interleaved complex data so that it can be operated on with the vector unit.
- Testing for stride one on loops with variable stride. Inserts a run-time test to see if variable array strides are all one; executes a vector version of the loop if the strides are one, otherwise executes the original scalar loop.
- Partial vectorization of loops with strided or gather/scatter vectors.
- Vectorization of "table lookup" loops. Loops that have a branch out of the loop can be vectorized in certain cases.
VAST/Parallel features include:
- Full Loop Nest Analysis. Loops are analyzed in simple and complicated loop nests; loops containing the largest amount of work are parallelized. Loops do not have to be tightly nested.
- Extended Parallel Regions. VAST/Parallel extends parallel regions to include multiple parallel loops and intervening scalar code. This cuts down on parallel overhead.
- Threshold testing. All parallel systems have some overhead. When VAST/Parallel finds a parallel region, if the amount of work in the region is not clear at compile time, then VAST/Parallel creates a run-time test. Through this run-time test, the parallel region will only be executed if there is enough work; otherwise, the original serial version is executed.
- Dependence Analysis. VAST/Parallel has very sophisticated data dependency analysis capabilities that allow it to optimize complicated situations. All loop nests are examined to see if they can be executed in parallel safely. VAST/Parallel can resolve ambiguous subscripting by examining variable assignments outside of loops, and restructure the use of variables to avoid certain other dependencies.
- Potential Dependence Testing. When dependencies are unclear at compile time, sometimes VAST/Parallel can generate run-time tests to allow parallelism to proceed.
- Special Reduction Optimization. Summations and other reductions are parallelized through the use of locks or critical regions.
- Shared/Private Determination. All variables in a parallel loop are categorized as shared (seen by all threads) or private (copy in each thread). VAST/Parallel can detect and create private arrays.
- Interprocedural Analysis for Parallel Calls. VAST/Parallel can examine call chains to determine their dependencies, and then parallelize loops containing calls or groups of calls outside loops.
- Automatic recognition of parallel cases. When sections of code deal with disjoint operations, VAST/Parallel can process each section in a separate parallel case.
- Superscalar optimizations. VAST/Parallel includes scalar optimizations to boost performance even in a single thread. Parallel optimizations can be done to outer loops while inner loops are optimized for efficient execution on one thread.
- Array Syntax. VAST/Parallel can in general parallelize and optimize multi-dimensional array syntax just as efficiently as loop nests.
- Choice of static or dynamic partitioning of loop iterations. Load balancing can trade off against loop overhead. Use dynamic partitioning when you need more load balancing, static partitioning when you are concerned about overhead.
- Number of threads can be set with an environment variable. This allows the degree of parallelism to be changed from run to run. When the system is busy you can run with two threads, when it is empty you can run with eight threads, without recompiling your program.
- Choice of thread waiting strategy. You can select either busy waiting or sleep waiting for threads, so that the parallel program can adapt to loaded or dedicated workloads on the target system. Use busy waiting on a lightly loaded system, and sleep waiting when another job might need the cycles.
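Several of the parallel features listed above (threshold testing, shared/private determination, reductions through critical regions, and static partitioning) can be pictured with a hand-written OpenMP loop. The following is only a sketch of the ideas, not output generated by VAST/Parallel; the function name `sum_array` and the threshold `MIN_PARALLEL_WORK` are hypothetical and chosen purely for illustration.

```c
#include <stddef.h>

/* Hypothetical threshold (an assumption for this sketch); a tool such as
   VAST/Parallel would pick its own value. */
#define MIN_PARALLEL_WORK 10000

/* Sum an array. The parallel region is only used when there is enough work. */
double sum_array(const double *a, size_t n)
{
    double total = 0.0;                /* shared: seen by all threads */

    /* Run-time threshold test: run in parallel only for large n,
       otherwise the region executes on a single thread. */
    #pragma omp parallel if (n >= MIN_PARALLEL_WORK)
    {
        double partial = 0.0;          /* private: one copy per thread */

        /* Static partitioning: iterations are divided up front. */
        #pragma omp for schedule(static)
        for (long i = 0; i < (long)n; i++)
            partial += a[i];

        /* The shared accumulator is updated inside a critical region. */
        #pragma omp critical
        total += partial;
    }
    return total;
}
```

Replacing `schedule(static)` with `schedule(dynamic)` trades lower overhead for better load balancing. When the code is compiled without OpenMP support the pragmas are ignored and the function runs serially with the same result.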
VAST is a family of powerful software tools that can convert serial code into code designed for dual processors (Windows/Linux and Macintosh), or vectorize your code for AltiVec (G4/G5 Macintosh and PPC/Linux). All code restructuring can be done automatically at the click of a mouse. The original source files are retained in unmodified form.
VAST products are developed by Crescent Bay Software, an Absoft partner and the leading developer of superscalar, auto-parallelization and auto-vectorization tools for workstations and supercomputers.
VAST/Parallel
VAST/Parallel is a preprocessor that provides automatic parallelization
and superscalar optimization of code for maximum performance.
VAST/Parallel optimizations can provide performance gains
on single or multi-CPU systems. VAST also provides full support
for OpenMP. VAST is available for all Absoft products for AMD/Intel/Linux
(32-bit), AMD Opteron/Linux (64-bit), PPC/Linux, Macintosh OS X and
Windows NT/2000/XP SMP systems.
VAST/Parallel is the most cost-effective way to determine how much your code benefits from dual processors. There is no need to spend your time rewriting, debugging and editing code to test for potential performance gains; just click your mouse. And best of all, using VAST, you can build highly optimized versions of your application for different target systems, all without changing the original source tree. VAST/Parallel is very easy to use and can be operated from the IDE in Pro Fortran for Windows or Macintosh OS X. Just click the mouse and go. Once created, applications do not need VAST to be present on the system to run. There is no runtime redistribution fee for applications created with VAST/Parallel.
VAST/Vector
VAST/Vector is a preprocessor which automatically restructures code to take advantage of the AltiVec vector unit built into the Macintosh G4/G5 processors. VAST/Vector operates in single precision mode and can provide significant speedups in certain types of applications.
VAST/Vector is available for Absoft and IBM compilers. It is very easy to use and can be operated from the Absoft IDE. Pro Fortran v9.0 allows a single copy of VAST to work with both Absoft and IBM compilers. Just click the mouse and go. For XL Fortran users not using the Absoft IDE, a command line version is provided. Once created, applications do not need VAST to be present on the system to run. There is no runtime redistribution fee for applications created with VAST/Vector.
VAST/Vector is the most cost-effective way to see how much your code benefits from vectorization. There is no need to spend your time rewriting, debugging and editing code to test for potential performance gains; just click your mouse. And best of all, using VAST, you can build highly optimized AltiVec versions of your application without changing the original source tree.
VAST is now compatible with
XL Fortran and XL C/C++ as well as Absoft Fortran. Users of Pro Fortran v9.0
need only a single copy of VAST which will work with both XL Fortran and
Absoft Fortran, which saves money and increases efficiency.
VAST/Parallel works very
closely with the DEEP development environment to provide a
complete GUI interface to the world of data parallel programming.
VAST/Parallel gathers compile-time data for DEEP, and inserts
instrumentation code for run-time data gathering. DEEP uses
this information to display in detail the compile-time optimization
notes (which loop nests have been parallelized, where data
dependencies are preventing parallelization, etc.) and run-time
performance data (which loop nests use the most wallclock
time, which procedures are called the most, etc.) in many
useful views of the program. With DEEP, you can very quickly
zoom in on any performance bottlenecks in your code.
Performance Gains on a Single CPU System:
VAST/Parallel's superscalar optimization technology can enhance
the performance of certain types of code on standard, single
CPU systems. If your programs spend large amounts of time
in nested loops or operating on large arrays, a performance
improvement of over 35% may be possible. On other types of
code, VAST/Parallel may have little impact.
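Loop interchange, one of the loop-nest optimizations named in the feature list, illustrates the kind of restructuring that can help nested loops over large arrays even on a single thread. The sketch below is hand-written for illustration and is not VAST output; the function names and the array sizes `ROWS` and `COLS` are hypothetical.

```c
#define ROWS 4
#define COLS 3

/* Original order: the inner loop walks down a column. C stores arrays
   row by row, so consecutive accesses are COLS elements apart. */
void scale_by_columns(float a[ROWS][COLS], float s)
{
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            a[i][j] *= s;
}

/* Interchanged order: the inner loop walks along a row, so consecutive
   accesses are adjacent in memory (stride one). The result is the same
   because every element is updated independently. */
void scale_by_rows(float a[ROWS][COLS], float s)
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            a[i][j] *= s;
}
```

Both functions compute the same values; only the memory access pattern differs, which is why a restructuring tool can apply the interchange safely when no dependency crosses iterations.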

Performance Gains on a Dual CPU System:
VAST/Parallel can automatically parallelize your code and
also provides full OpenMP support to enable user-directed
parallelization. VAST/Parallel contains sophisticated data
dependency analysis technology to detect when optimized execution
will be safe, has very advanced in-lining capabilities, and
uses interprocedural analysis to optimize across procedure
boundaries.

VAST/Parallel fully supports the OpenMP standard. For calculations where you know exactly what you want parallelized, OpenMP provides a portable way to specify this. VAST/Parallel supports all OpenMP directives/pragmas and functions, and provides diagnostics on incorrect use of the directives.
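As an illustration of user-directed parallelization, a loop whose iterations are known to be independent can be marked with a standard OpenMP pragma. This is a minimal sketch using only standard OpenMP syntax; the function name `scale_add` is hypothetical and nothing here is specific to VAST/Parallel.

```c
/* y[i] = alpha * x[i] + y[i]. Each iteration touches a different
   element, so the iterations can safely run on different threads. */
void scale_add(float alpha, const float *x, float *y, int n)
{
    /* Standard OpenMP directive: split the loop iterations among threads. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        y[i] = alpha * x[i] + y[i];
}
```

A compiler or preprocessor without OpenMP support ignores the pragma, so the same source still builds and runs serially.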
The drivers that come with VAST Vector (AltiVec) combine VAST and the compilers in a transparent way, so that, for example, compilation can be as easy as replacing gcc with vcc or f90 with v90 in your makefiles.
There are several ways to use VAST. If your program spends most of its time in clean loops, then VAST may be able to vectorize your program automatically. Often with C programs, depending on the programming style they are written in, a potential "data dependency" between pointers and arrays may prevent some vectorization, and some simple assertions from the user can improve the amount of vectorization. VAST can provide messages that help you understand what parts of your program have been successfully optimized and what parts have not been optimized.
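The pointer "data dependency" problem can be seen in a small C example. VAST's own assertions are the DISJOINT and NODEPCHK pragmas and the -L switch mentioned in the feature list, whose exact syntax is not given on this page; the sketch below instead uses the standard C99 `restrict` qualifier to illustrate the same idea. The function names are hypothetical.

```c
/* dst and src may overlap. If a caller passes dst == src + 1, each
   iteration reads the value written by the previous one, so the loop
   cannot be executed several elements at a time without a check. */
void add_one_may_alias(float *dst, const float *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i] + 1.0f;
}

/* 'restrict' is the programmer's assertion that dst and src never
   overlap, which removes the potential dependency. Calling this
   function with overlapping arrays is undefined behavior. */
void add_one_no_alias(float *restrict dst, const float *restrict src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i] + 1.0f;
}
```

With a four-element array of zeros, `add_one_may_alias(a + 1, a, 3)` produces 0, 1, 2, 3: a true dependency between iterations. An assertion tells the tool that such calls never happen in your program.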
Advanced users may choose to write clean loops for new applications and have VAST automatically generate AltiVec code, rather than doing AltiVec coding by hand. Very advanced users may wish to modify the VAST intermediate C code and change the order or nature of vector operations that VAST generates.
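What counts as a "clean loop" is not defined precisely on this page. As a rough guide, the sketch below shows two loops of the kinds named in the feature list (a reduction and a conditional loop), written with unit stride, array indexing and 32-bit floats. The function names are hypothetical, and whether a given tool vectorizes them depends on the tool and its settings.

```c
/* Reduction loop: dot product of two single-precision arrays. */
float dot(const float *x, const float *y, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Conditional loop: replace negative elements with zero. */
void clamp_negative(float *a, int n)
{
    for (int i = 0; i < n; i++)
        if (a[i] < 0.0f)
            a[i] = 0.0f;
}
```

Both loops have a simple counted form, no function calls in the body, and no branch out of the loop, which keeps the dependency analysis straightforward.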
September 15 2003
- Electronic Delivery full Media Documentation
September 31 2003
Agreement: Absoft Software License Agreement.
Limited warranty applies: Yes
Warranty: Absoft and its Licensors warrant that the media on which the software is supplied shall be free of defects in materials and workmanship under normal use for a period of 90 days from the date of purchase.
Introductory Software Maintenance applies: From date of purchase
until version is no longer current; charges may apply afterwards.
Absoft branded products
include Premier Technical Support and Initial Software Maintenance
at no additional charge from the time of purchase for as long
as the purchased version remains the current release version.
Initial Software Maintenance (ISM)
Initial Software Maintenance
includes free electronic downloads of all interim Service Packs
and Quick Fixes from Absoft. Version upgrades require separate
purchase but at discounted prices.
Continuing Software Maintenance (CSM)
At the conclusion of the Initial Software Maintenance term (i.e. when a new version of the purchased product is released), customers may either purchase a Product Upgrade or, within 30 days of the new product release, purchase Continuing Software Maintenance (CSM) for an additional 12-month maintenance term. Continuing Software Maintenance includes all elements of Premier Technical Support plus all Product Upgrades issued during the 12-month Continuing Software Maintenance term. Annual renewals of Software Maintenance will be made available at low cost to customers who renew within their unexpired Maintenance term.
For users electing not
to renew Continuing Software Maintenance and having expired maintenance
terms but later wishing to rejoin CSM, Reinstatement Software
Maintenance (RSM) terms may be purchased.
Third party products distributed
by Absoft are subject to license terms and policies of their respective
manufacturers.
Disclaimer:
Premier Technical Support cannot be used as a tutorial to assist in the design and development of applications, to support the use of programs in other than their specified operating environment, to provide instructions on general system operation, or to explore failures caused by products for which Absoft is not responsible under this agreement.
MaxFlex Flexible Licensing available: No
Floating Network Licenses available: No
Copy and use on home/portable computer: No
Volume orders: Yes.