SNPHAP

Version

1.3.1, installed in Hippu

Description

SNPHAP is a program for population-based haplotype construction developed by David Clayton (2002). The program uses the expectation-maximization (EM) algorithm and can be used with biallelic SNP markers.

Usage

The program is installed in Hippu (hippu.csc.fi).

The software needs to be initialized before use:

module load snphap

SNPHAP requires a single input file in which each row corresponds to a single individual. Each row starts with a person identifier, followed by that individual's genotypes in map order.

An example of a SNPHAP input file, consisting of 200 individuals and 11 markers.
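Because every row must carry the same markers in the same order, it can be worth checking the row lengths before starting a run. The following is a minimal sketch of such a check, not part of SNPHAP itself. The function name check_snphap_input is hypothetical, and the sketch assumes that the fields are separated by whitespace.

```shell
# check_snphap_input FILE
# Hypothetical helper (not part of SNPHAP). Assumes whitespace-separated
# fields: a person identifier followed by the genotype columns.
# Succeeds only if every non-empty row has as many fields as the first one.
check_snphap_input() {
    awk '
        BEGIN { expected = 0; bad = 0 }
        NF == 0 { next }                       # ignore blank lines
        expected == 0 { expected = NF }        # first data row sets the field count
        NF != expected {
            printf "line %d: %d fields, expected %d\n", NR, NF, expected
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

Calling check_snphap_input in.1 prints one message per inconsistent row and returns a non-zero exit status if any row differs in length from the first.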

To run SNPHAP, type snphap followed by the command line options and then the input and output file names. An example of the execution command is:

snphap -i 300 -mm 100 in.1 out.1 out.2

The command ends with three file names:

  • in.1 is the name of the input file (see example)

  • out.1 is the name of the output file that contains the estimated population-level haplotype frequencies at the time of convergence (see example). Cumulative probabilities are also given.

  • out.2 is the name of the file that contains the individuals' haplotypes and their posterior probabilities (see example)
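As a small illustration of the argument order described above, the sketch below assembles the command line from named values and prints it, so that it can be inspected before it is actually run. The function name build_snphap_cmd is hypothetical. Only the -i and -mm options and the three file names from the example command are used.

```shell
# build_snphap_cmd ITERATIONS RESTARTS INPUT FREQ_OUT ASSIGN_OUT
# Hypothetical helper: prints a snphap command line in the order
# options first, then the input file and the two output files.
build_snphap_cmd() {
    if [ "$#" -ne 5 ]; then
        echo "usage: build_snphap_cmd ITERATIONS RESTARTS INPUT FREQ_OUT ASSIGN_OUT" >&2
        return 1
    fi
    printf 'snphap -i %s -mm %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5"
}

# Reproduces the example command from the text.
build_snphap_cmd 300 100 in.1 out.1 out.2
```

The printed line can be pasted into the shell once the snphap module has been loaded.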

Some of the most widely used command line options are:

-i n sets the maximum number of iterations for the EM algorithm.
-mm n repeats the haplotype construction n times, each time with a separate starting point for the EM algorithm. After the repetitions, the program selects the solution that produces the highest likelihood. Increasing the number of repetitions decreases the probability that the program gets stuck in a local optimum.
-th f sets the probability threshold for the individuals' haplotypes that will be included in the output. For example, setting the threshold to 0.7 prints, for each individual, those alternative haplotypes whose probability is at least 0.7 times the probability of the most likely haplotype.
-mi n generates n data sets in which each individual's haplotype is randomly selected from the posterior distribution, using the probabilities for that particular individual.
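The relative threshold described for -th can be illustrated with a short worked example. The sketch below is not SNPHAP code and does not read SNPHAP's output format. It assumes a hypothetical three-column table (individual identifier, haplotype label, probability) and keeps the rows whose probability is at least the threshold times the highest probability for the same individual. The function name filter_by_relative_threshold is likewise hypothetical.

```shell
# filter_by_relative_threshold THRESHOLD FILE
# Hypothetical illustration of a relative probability threshold.
# FILE is an assumed table with columns: id, label, probability.
filter_by_relative_threshold() {
    awk -v th="$1" '
        # First pass over the file: highest probability per individual.
        NR == FNR {
            p = $3 + 0
            if (!($1 in best) || p > best[$1]) best[$1] = p
            next
        }
        # Second pass: keep rows at or above th times that maximum.
        ($3 + 0) >= th * best[$1]
    ' "$2" "$2"
}
```

With a threshold of 0.7, an individual whose most likely haplotype has probability 0.60 has a cutoff of 0.7 × 0.60 = 0.42, so an alternative with probability 0.45 is kept and one with probability 0.10 is dropped.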

A full description of the SNPHAP command line parameters can be found in the SNPHAP documentation.


Documentation

A more detailed description of how to run SNPHAP is available in the SNPHAP documentation.

User support

Saren Ari-Matti +358 9 457 2282 Ari-Matti.Saren at csc.fi