ACML
The principal mathematical subroutine library in Murska is ACML (AMD Core Math Library). This library is optimized for AMD64 processors.
The ACML routines can be called from both Fortran and C programs. ACML comprises the following parts:
- BLAS - Basic Linear Algebra Subprograms
- LAPACK - Linear Algebra Package for solving linear equations and eigenvalue problems
- FFT - a set of routines for Fast Fourier Transform
- RNG - a set of Random Number Generators and statistical distribution functions
- Fast vectorized versions of standard mathematical functions (in library ACML_MV).
For comprehensive documentation of ACML, see http://developer.amd.com/assets/acml_userguide.pdf
This short introduction follows the manual very closely.
Linear Algebra
BLAS and LAPACK are standard libraries for linear algebra operations. For documentation, see http://www.netlib.org/blas and http://www.netlib.org/lapack.
BLAS operations are divided into three classes:
- Level 1: operations on vectors (one or more)
- Level 2: operations involving a vector and a matrix
- Level 3: operations between two matrices
BLAS operations are used as building blocks for higher level operations in other libraries such as LAPACK.
LAPACK, originally written in FORTRAN 77, is a very widely used library for solving problems in (dense) linear algebra. In addition to solving linear equations, the subroutines handle least squares problems, eigenvalue problems and singular value problems.
The actual algorithms behind some LAPACK subroutines in ACML differ from those used in the public-domain LAPACK source. Both functionally and numerically, however, the subroutines conform to the usual LAPACK conventions.
Fast Fourier Transforms
Discrete Fourier Transforms in ACML come in two types:
- The transforms of the first type map complex data to complex data. These routines have names beginning with ZFFT (double precision) or CFFT (single precision). There are separate routines for 1D, 2D and 3D transforms. Applying forward and backward transforms consecutively recovers the original data.
- Transforms of the second type map real data to complex (Hermitian) data or vice versa. The names begin with DZFFT or SCFFT (real to complex) and ZDFFT or CSFFT (complex to real). These routines are available only for 1D sequences, and consecutive forward and backward transforms will NOT recover the original data; rather, the result of the forward transform must be conjugated before the backward transform in order to recover the data.
Random Number Generators
ACML has five different Base Random Number Generators (BRNG) for producing sequences of pseudo-random numbers uniformly distributed over the open interval (0,1). In addition, there are 23 distribution generators for transforming the uniformly distributed numbers to variates from specified distributions (for example, the normal distribution or the chi-squared distribution).
Usage
There are modules for the serial versions 3.6, 4.0, 4.1 and 4.2 of ACML. Each programming environment (PrgEnv-pgi, PrgEnv-pathscale, PrgEnv-gnu and PrgEnv-intel) automatically loads a matching acml module, which may be version 3.6, 4.0, 4.1 or 4.2 depending on the environment; run module list to see the exact version loaded. Otherwise the default version of ACML is 4.2.
Please note that PGI compiler versions 7.1 and 7.2 are not compatible with ACML versions 4.0 and 4.1. Use PGI 8.0 with ACML 4.2 instead (an ACML 4.2 module is loaded automatically with the PGI programming environment PrgEnv-pgi/8.0-4). The PGI 7.1 and 7.2 compilers can also be used, but only by swapping to their own programming environments, because these also load ACML 4.2. The exception is PGI 7.2-5, which uses the ACML library installed with the PGI compiler itself.
Please note also that GCC 4.2 is not compatible with ACML 4.2, and GCC 4.3 is not compatible with ACML 4.1. Use ACML 4.1 with GCC 4.2, and ACML 4.2 with GCC 4.3.
To link ACML subroutines into your program from these automatically loaded serial modules, use the option -lacml when compiling (or linking). This links against the shared (dynamic) version of the library. If you need the static version of ACML, add the option -static before -lacml. With some older versions, C compilers may also need the Fortran runtime libraries after -lacml: -lpgftnrtl, and sometimes -lpgftnrtl -lrt, for pgcc (-lm may also be needed); -lpathfortran for pathcc; -lg2c for gcc version 3; and -lgfortran for gcc version 4.
In addition, when linking in ACML routines with PGI compilers, you must compile and link all program units with -Mcache_align or an aggregate option such as -fastsse, which incorporates -Mcache_align. PathScale compilers do not need any extra options for linking ACML routines. With GCC compilers there may be various difficulties depending on the compiler command and you may need to add various options.
When you need the faster (but slightly less accurate) versions of standard mathematical functions, the appropriate option for linking is -lacml_mv.
For the OpenMP versions (suffix _mp in directory and library names) and the INTEGER*8 versions (suffix _int64 in directory names) there are no modules. You must define the needed environment variables yourself, make your own modules, or use the -I, -L, -l and OpenMP options in addition to the options mentioned above.
IMSL
IMSL is a collection of software libraries for numerical analysis. The IMSL Numerical Libraries are developed by Visual Numerics. The IMSL product page can be found at http://www.vni.com/products/imsl/index.php.
Function catalogue with Fortran bindings can be found here.
To use IMSL libraries, one has to load the module imsl. After that, special wrappers have to be employed. A Fortran 90/95 program containing calls to IMSL routines is compiled as follows:
$F90 $FFLAGS myprog.f90 -o myprog $LINK_F90
All the $-prefixed variables are necessary. The PGI programming environment has to be used with IMSL.
MPI-parallelized IMSL routines are linked with the wrapper $LINK_MPI. In order to call serial IMSL routines from an MPI program, use the wrapper $LINK_MPIS.
For information in Finnish see: IMSL Murskalla