PathIntegral
PathIntegral is a code for modeling electron structures (quantum dots) with the Path Integral Monte Carlo method. The first phase involved a study of hydrogen atoms on a nickel surface. The optimization mostly concerned reducing the runtime of the Many-Body Alloy (MBA) potential subroutine, and the runtime on the Corona server was reduced by 80 percent. The next optimization phase concerned a code used mainly to calculate quantum statistics of atoms and molecules. The code must account for the movement of both electrons and cores, which is difficult and often impossible with traditional methods. Once the pseudopotential was tabulated, the excessive calls to the exp and erf functions were no longer needed, and the code became significantly faster. The number of computational operations was reduced further by also tabulating other necessary data, such as distances, which shortened the runtime again. The serial performance was improved by a factor of ten. Finally, the code was parallelized with OpenMP. PathIntegral is based on a new method for modeling quantum dots. It can be used to solve nano-scale problems, which makes the code scientifically significant.
Image © Jussi Enkovaara
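The tabulation idea described for PathIntegral can be illustrated with a minimal Python sketch. It is not the PathIntegral code: the function `model_potential`, the grid limits, and the use of linear interpolation are all assumptions chosen only to show how a table lookup can stand in for repeated exp and erf calls.

```python
import math

def make_table(func, r_min, r_max, n_points):
    """Precompute func on a uniform grid so later lookups avoid exp/erf calls."""
    step = (r_max - r_min) / (n_points - 1)
    values = [func(r_min + i * step) for i in range(n_points)]
    return values, step

def lookup(values, r_min, step, r):
    """Linear interpolation between the two nearest tabulated points."""
    x = (r - r_min) / step
    i = min(max(int(x), 0), len(values) - 2)  # clamp to a valid interval
    frac = x - i
    return values[i] + frac * (values[i + 1] - values[i])

def model_potential(r):
    # Hypothetical smooth potential, picked only because it uses erf and exp.
    return -math.erf(r) / r + math.exp(-r * r)

# Assumed grid: the range and resolution are illustrative, not from the source.
R_MIN, R_MAX, N_POINTS = 0.1, 10.0, 20001
TABLE, STEP = make_table(model_potential, R_MIN, R_MAX, N_POINTS)

def tabulated_potential(r):
    """Approximate model_potential(r) from the table, with no exp or erf call."""
    return lookup(TABLE, R_MIN, STEP, r)
```

The table is built once, and every later evaluation costs a few arithmetic operations. The grid spacing sets the accuracy, so a real code would choose it from the error it can tolerate.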
LES
The computational time of LES, a Large Eddy Simulation code for turbulent fluid flow, was cut in half when an optimized solver for tridiagonal matrices was implemented and memory references were optimized. Turbulence is one of the key factors in the efficiency of devices with fast fluid flow, such as ships, paper machines, and hydropower plants.
Image © Ville Vuorinen
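For readers unfamiliar with tridiagonal solvers, the following Python sketch shows the standard Thomas algorithm, which solves such a system in a number of operations proportional to its size. It is only an illustration of the kind of solver mentioned for LES; the source does not say which algorithm the optimized solver uses, and the function name and argument layout here are assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused),
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    No pivoting is done, so this assumes a well-conditioned matrix,
    for example a diagonally dominant one.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Because each array is traversed in order, the algorithm also accesses memory sequentially, which is relevant to the memory reference optimization mentioned above.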
CohortComparator
CohortComparator is a sequence comparison program for genetic research. It localizes genomic regions in SNP (single-nucleotide polymorphism) data that may harbor recessively inherited mutations. The original program was written by Marko Laakso (University of Helsinki, Biomedicum) in the R language. Within FinHPC, the code was rewritten in Fortran 90. To maximize performance, the data structures were simplified to vectors and two-dimensional tables, the most time-consuming loops were restructured, and some auxiliary variables were introduced to eliminate unnecessary comparisons. For a reasonably sized test data set, the runtime was dramatically reduced.
ProCon
ProCon is the calculation engine for a web service at Tampere University/Institute of Medical Technology for analyzing correlations of aligned protein sequences. Several bottlenecks in the code were identified and removed. The efficiency of the original code suffered from excessive memory use and file access. The data structures used to search for pairwise or triplet correlations between amino acids were replaced by more efficient ones. Memory usage was improved by reserving memory only for the necessary computational operations. Significant improvements were also made by redesigning how results are written to output files: instead of several small files, the results are combined and saved into one larger file. Memory utilization was reduced by a factor of 88, hard disk space by a factor of 13, and runtime by a factor of 112, so the program is now more than 100 times faster than before the optimization. ProCon is used in, for example, drug design, where the desired functionality is achieved by using chemically more suitable amino acids.
Image © Jyrki Hokkanen
POY
POY, Phylogenetic Analysis of DNA and other Data using Dynamic Homology, compares DNA nucleotide sequences between DNA strands and searches for an association between DNA changes and the genome. Most of the program's runtime is spent in three core subroutines. One of them was rewritten in inline assembly that uses the cmov (conditional move) instruction instead of branch instructions (jmp). Together with loop unrolling and common subexpression elimination in the other subroutines, this gave a total speedup of more than a factor of 1.5. POY was originally developed at the American Museum of Natural History, and the modifications were included in the new version of POY.
Image © Jyrki Hokkanen
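The idea behind replacing branches with conditional moves can be sketched in Python as a branch-free selection between two integers. This is only an illustration: Python does not compile to cmov instructions, the source does not say which POY subroutine was rewritten, and the edit-distance kernel below is a hypothetical stand-in for a sequence comparison loop that repeatedly takes a minimum.

```python
def select_min(a, b):
    """Return min(a, b) for integers without an if/else.

    -(a < b) is -1 (all bits set) when a < b and 0 otherwise, so the mask
    either keeps (a ^ b), which turns b into a, or clears it, leaving b.
    """
    mask = -(a < b)
    return b ^ ((a ^ b) & mask)

def edit_distance(s, t):
    """Unit-cost edit distance; the inner loop takes a three-way minimum."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            substitute = prev[j - 1] + (cs != ct)
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(select_min(substitute, select_min(delete, insert)))
        prev = curr
    return prev[-1]
```

In compiled code, a selection of this kind lets the processor avoid mispredicted branches in a loop whose comparisons depend on the data, which is the usual motivation for using cmov.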
GenInter
GenInter is bioinformatics software developed by a research group at the University of Helsinki. It is used for establishing relations between single nucleotide polymorphisms (SNPs) and clinical patient data, e.g. to allow forecasting of recovery in prognoses for cancer patients. The original software was written in the R language. R contains powerful statistical analysis tools, but as an interpreted language it is slow and poorly suited to parallel computers. Within the FinHPC project, the code was rewritten in Fortran and parallelized with MPI. As a result, considerable speedups in runtime were attained. More importantly, the rewritten code opens up completely new possibilities for research. Due to an inherent memory bottleneck, the previous version could only treat combinations of three SNPs. The rewritten code can address a much larger amount of SNP data, perhaps even the whole genome.
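One common way to spread a search over SNP combinations across parallel processes is a round-robin split, sketched below in plain Python. This is an assumption about how such work could be divided, not a description of the GenInter implementation; the function names are hypothetical, and in an MPI program the values of `rank` and `n_ranks` would come from `MPI_Comm_rank` and `MPI_Comm_size`.

```python
from itertools import combinations, islice

def combos_for_rank(n_snps, k, rank, n_ranks):
    """Yield the k-SNP index combinations assigned to one rank (round-robin).

    Every rank walks the same ordered sequence of combinations and keeps
    each n_ranks-th item starting at its own rank, so no communication is
    needed to agree on the split.
    """
    return islice(combinations(range(n_snps), k), rank, None, n_ranks)

def count_combos(n_snps, k, n_ranks):
    """Number of combinations each rank would process."""
    return [sum(1 for _ in combos_for_rank(n_snps, k, r, n_ranks))
            for r in range(n_ranks)]
```

For example, `count_combos(10, 3, 4)` shows how the 120 three-SNP combinations of 10 SNPs would be shared among four ranks. The combinations are generated lazily, so no rank has to hold the full list in memory.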
Biogenetic
Biogenetic uses a Bayesian approach to analyze the inheritance of genetic traits in well-documented populations. The algorithm requires several thousand iterations involving matrix updates and the solution of a system of linear equations. The program was developed by Professor Otso Ovaskainen, presently at the Department of Biological and Environmental Sciences, University of Helsinki. The original code, which ran in the Mathematica environment for symbolic mathematics, was rewritten in Fortran 90. The sparse systems of equations involved are solved using open source libraries. Running a moderate-sized test data set with the original version took roughly one month. Solving the same problem with the rewritten Fortran code takes only a fraction of this time.
HOLO
HOLO is a code for simulating the effect of a hologram on the radiation pattern of a GSM antenna. The code works at two levels, investigating differences in the near field and the far field. Within the FinHPC project, the code was profiled and parallelized.
BLAST
BLAST (Basic Local Alignment Search Tool) is an amino acid and DNA sequence alignment program. It is the de facto standard for genomic sequence alignment and is used worldwide. The original code occasionally crashed during runs; the cause was found and corrected. Data structures were reconfigured, which improved the runtime by a factor of 1.5, and code development continues in order to shorten the runtime even further. A commercial version with parallel databases already exists.
GEANT4 Bertini code
The Bertini code, developed in collaboration by CERN and the University of Helsinki, simulates nuclear cascades from particle beams in matter. The class structure of the code was simplified and the results of repeated calculations were tabulated, which sped up the code by a factor of 3.1. The results were presented at a workshop arranged by CERN in Oporto in October 2006. The Bertini code has applications in targeted cancer treatments, where simulations are crucial for determining dosage and particle beam absorption in tissues.
MEMBRA
MEMBRA is a simulation program for physical and biophysical applications, in particular for simulating the texture of growing cellular aggregates. The former serial version was converted into an MPI parallel code, which can simulate aggregates 100–1000 times larger than the serial version could. MEMBRA is the first parallelized code for modeling cellular aggregates. It has obvious medical applications.
ACTIN
ACTIN is a cellular biology code for simulating the actin microfilament network in eukaryotic cells. The model is a coarse-grained representation of the physical system, enabling studies of large systems over long time scales. During this project, a parallel algorithm was developed and implemented in ACTIN. The parallelization enables much larger systems to be simulated, which opens new avenues for research. The code offers new ways to investigate cytokinesis (i.e. the division of cells), with obvious medical applications.