Blastall may be used to perform all five flavors of blast comparison. One may obtain the blastall options by executing 'blastall -' (note the dash). A typical use of blastall, here a blastn search (nucl. vs. nucl.) of a file called QUERY, would be:
blastall -p blastn -d nr -i QUERY -o out.QUERY
The output is placed into the output file out.QUERY and the search is performed against the 'nr' database. If a protein vs. protein search is desired, then 'blastn' should be replaced with 'blastp' etc.
Some of the most commonly used blastall options are:
blastall arguments:
-p Program Name [String]
Input should be one of "blastp", "blastn", "blastx", "tblastn", or "tblastx".
-d Database [String]
default = nr
The database specified must first be formatted with formatdb.
Multiple database names (bracketed by quotations) will be accepted.
An example would be
-d "nr est"
which will search both the nr and est databases, presenting the results as if one
'virtual' database consisting of all the entries from both were searched. The
statistics are based on the 'virtual' database of nr and est.
-i Query File [File In]
default = stdin
The query should be in FASTA format. If multiple FASTA entries are in the input
file, all queries will be searched.
-e Expectation value (E) [Real]
default = 10.0
-o BLAST report Output File [File Out] Optional
default = stdout
-F Filter query sequence (DUST with blastn, SEG with others) [String]
default = T
BLAST 2.0 and 2.1 use the dust low-complexity filter for blastn and seg for the
other programs. Both 'dust' and 'seg' are integral parts of the NCBI toolkit
and are accessed automatically.
If one uses "-F T" then normal filtering by seg or dust (for blastn)
occurs (likewise "-F F" means no filtering whatsoever).
This option also takes a string as an argument. One may use such a
string to change the specific parameters of seg or invoke other filters.
Please see the "Filtering Strings" section (below) for details.
-S Query strands to search against database (for blast[nx], and tblastx). 3 is both, 1 is top, 2 is bottom [Integer]
default = 3
-T Produce HTML output [T/F]
default = F
-l Restrict search of database to list of GI's [String] Optional
This option specifies that only a subset of the database should be
searched, determined by the list of gi's (i.e., NCBI identifiers) in a
file. One can obtain a list of gi's for a given Entrez query from
http://www.ncbi.nlm.nih.gov/Entrez/batch.html. This file should
be in the same directory as the database, or in the directory that
BLAST is called from.
-U Use lower case filtering of FASTA sequence [T/F] Optional
default = F
This option specifies that any lower-case letters in the input FASTA file
should be masked.
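As an illustration of how several of these options combine, the following
shell sketch prints (rather than runs) a blastn command line with a stricter
expectation value, filtering turned off, and only the top query strand
searched. The names "mydb", "query.fasta" and "query.out" and the E value
0.001 are placeholders chosen for this example, not defaults; the database is
assumed to have been formatted with formatdb beforehand.

```shell
#!/bin/sh
# Sketch: combine several blastall options into one command line.
# "mydb", "query.fasta" and "query.out" are hypothetical names.
blastall_example() {
    # $1 = query file (FASTA), $2 = report output file
    # -e 0.001 : expectation value threshold (default is 10.0)
    # -F F     : no low-complexity filtering (default is T)
    # -S 1     : search the top query strand only (default is 3, both)
    echo blastall -p blastn -d mydb -i "$1" -o "$2" -e 0.001 -F F -S 1
}

# Prints the command line; remove the "echo" in the function to execute it.
blastall_example query.fasta query.out
```

Printing the command first makes it easy to check the options before
starting a long search.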
Documentation for PSI-TBLASTN
PSI-TBLASTN is a variant of blastall that searches a protein query
sequence against a nucleotide sequence database using a position
specific matrix created by PSI-BLAST. The nucleotide sequence database
is dynamically translated in all reading frames during a PSI-TBLASTN
search. Using a position specific matrix may enable finding more
distantly related sequences.
Programs:
blastpgp [takes a protein query and performs a PSI-BLAST search
against a protein database to create a position specific
matrix]
blastall [reads the position specific matrix and performs a PSI-TBLASTN
search]
Usage:
A user would typically run blastpgp to create and save a position
specific matrix, followed by a run of blastall for PSI-TBLASTN search.
blastpgp must be executed with the -C option followed by a file name to
save the position specific score matrix.
blastall with the "-p psitblastn" option executes a PSI-TBLASTN search; the -R option, followed by a file name, specifies the file that contains the position specific score matrix. All other options that apply when using "blastall -p tblastn ..." also apply when using "blastall -p psitblastn ...", but there are some restrictions on parameters:
1) The query must be the same as the one used in blastpgp for creating the position specific matrix.
2) By default, blastpgp has filtering off (-F F) and blastall has filtering on (-F T). To ensure consistent usage of the blastpgp/psitblastn combination, the -F option should be explicitly set in one or the other run.
Example:
One may run PSI-BLAST to create and save a position specific score matrix
as follows:
blastpgp -d nr -i ff.chd -j 2 -C ff.chd.ckp
The position specific score matrix is saved in ff.chd.ckp. Then, using
this matrix, one may run a PSI-TBLASTN search:
blastall -i ff.chd -d yeast -p psitblastn -R ff.chd.ckp
Note that this allows the score matrix to be constructed using one
database (nr in the example) and then used to search a second database (yeast in the example). Even if the two database names are the same, blastpgp uses the protein version while "blastall -p psitblastn" uses the DNA version.
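Because blastpgp and blastall have different filtering defaults, one way to
keep the two runs consistent is to pass the same -F value to both. The
following shell sketch prints the two commands from the example above with
-F set explicitly. Passing -F to both runs, and the choice of F as the value,
are assumptions made for illustration; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: the two-step PSI-TBLASTN workflow with an explicit -F on both runs.
psitblastn_steps() {
    query=$1     # protein query file, the same for both runs
    ckp=$2       # file holding the position specific score matrix
    filter=$3    # T or F, applied identically to both programs
    # Step 1: build and save the matrix against the protein database.
    echo blastpgp -d nr -i "$query" -j 2 -C "$ckp" -F "$filter"
    # Step 2: use the saved matrix to search the nucleotide database.
    echo blastall -i "$query" -d yeast -p psitblastn -R "$ckp" -F "$filter"
}

# Prints both commands; remove the "echo"s in the function to execute them.
psitblastn_steps ff.chd ff.chd.ckp F
```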