For PSI-BLAST searches, CSC uses an optimized version of the blastpgp program.
Optimization
The optimized blastpgp version is based on NCBI BLAST version 2.2.20. The modifications affect only the code that constructs the sequence profiles during a PSI-BLAST search. They do not change the PSI-BLAST algorithm in any way; they only enable the program to build the sequence profiles faster.
The speedup compared to the original BLAST 2.2.20 varies between 0 and 40 % depending on the search task. The more time a search spends on profile construction, the more it gains from the optimization.
Bug fixes
In addition to the code optimization, several bugs in the blastpgp code were fixed. If the number of hit sequences gets very high, the hit counter of the original blastpgp program overflows and the program fails to collect the hits. Instead, the program writes a large number of error messages of the form:
[blastpgp] ERROR: ncbiapi [000.000] ObjMgrNextAvailEntityID failed with idx 2048
This error typically occurs when the PSI-BLAST query file contains a large number of query sequences.
In the optimized blastpgp version, this bug has been fixed and the program is able to deal with tens of thousands of hits.
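To check whether an earlier run with the original blastpgp was affected by this bug, the saved program messages can be searched for the error text quoted above. The following is a minimal sketch, assuming the messages of the run were captured in a file; the helper name count_objmgr_errors and the file name in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: count how many lines of a saved blastpgp message log contain
# the ObjMgrNextAvailEntityID error described above.
# count_objmgr_errors is a hypothetical helper, not part of BLAST.
count_objmgr_errors() {
    logfile=$1
    if [ ! -r "$logfile" ]; then
        echo "cannot read log file: $logfile" >&2
        return 2
    fi
    # grep -c prints the number of matching lines. It exits with a
    # non-zero status when there are no matches, so that status is ignored.
    grep -c 'ObjMgrNextAvailEntityID failed' "$logfile" || true
}
```

For example, `count_objmgr_errors blastpgp.err` prints the number of matching lines in blastpgp.err. A count above zero suggests that the run lost hits and should be repeated with the optimized version.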
Usage
The blastpgp program installed at CSC is the optimized version, so no extra steps are needed to use it. Check the BLAST page of CSC BioBox for more information about using BLAST on the CSC server.
If you wish to use the optimized blastpgp on your local computer, please follow the instructions below.
1. Download the NCBI toolkit version 2.2.20 from the NCBI FTP site
(note: the optimized code can't be used with newer NCBI toolkit versions)
Unzip and untar the installation package.
2. Download the modified source code files using the link below
Unzip and untar this file.
3. You should now have two new directories on your computer: ncbi and 2.2.20_Opt
Next, copy the modified source code files from the 2.2.20_Opt directory to the ncbi directory.
cp 2.2.20_Opt/api/objmgr.c ncbi/api/objmgr.c
cp 2.2.20_Opt/api/objmgr.h ncbi/api/objmgr.h
cp 2.2.20_Opt/biostruc/cdd/cddposutil.c ncbi/biostruc/cdd/cddposutil.c
cp 2.2.20_Opt/demo/blastpgp.c ncbi/demo/blastpgp.c
cp 2.2.20_Opt/tools/posit.c ncbi/tools/posit.c
cp 2.2.20_Opt/tools/posit.h ncbi/tools/posit.h
4. Install the NCBI toolkit using the normal installation commands.
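The copy commands of step 3 can also be wrapped in a small shell function that checks that every file is present before anything is overwritten. This is only a sketch: the function name copy_optimized_files is hypothetical, and the check that each target file already exists in the ncbi tree is an added assumption meant to catch a wrongly unpacked toolkit. It must be run before the installation in step 4.

```shell
#!/bin/sh
# Sketch: copy the six modified source files from the 2.2.20_Opt tree
# into the unpacked NCBI toolkit tree, as in the cp commands of step 3.
# copy_optimized_files is a hypothetical helper name.
copy_optimized_files() {
    src=$1   # directory with the modified files, e.g. 2.2.20_Opt
    dst=$2   # unpacked NCBI toolkit directory, e.g. ncbi
    files="api/objmgr.c api/objmgr.h biostruc/cdd/cddposutil.c
           demo/blastpgp.c tools/posit.c tools/posit.h"

    # First pass: verify that every source and target file exists,
    # so that a failure leaves the ncbi tree untouched.
    for f in $files; do
        if [ ! -f "$src/$f" ]; then
            echo "missing modified file: $src/$f" >&2
            return 1
        fi
        if [ ! -f "$dst/$f" ]; then
            echo "missing toolkit file: $dst/$f" >&2
            return 1
        fi
    done

    # Second pass: overwrite the toolkit files with the modified ones.
    for f in $files; do
        cp "$src/$f" "$dst/$f" || return 1
    done
}
```

Run in the directory that contains both trees, the call would be `copy_optimized_files 2.2.20_Opt ncbi`.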
References
The optimization of the blastpgp program was done by Kristoffer Osowski, Mats Aspnäs and Jan Westerholm from the Department of Information Technologies, Åbo Akademi University. The work was done as part of the FinHPC project funded by Tekes, the Finnish Funding Agency for Technology and Innovation.
Kimmo Mattila, Kimmo.Mattila at csc.fi