BMC Bioinformatics (May 2011)
SNPPicker: High quality tag SNP selection across multiple populations
Abstract
Abstract Background Linkage Disequilibrium (LD) bin-tagging algorithms identify a reduced set of tag SNPs that can capture the genetic variation in a population without genotyping every single SNP. However, existing tag SNP selection algorithms for designing custom genotyping panels do not take into account all platform dependent factors affecting the likelihood of a tag SNP to be successfully genotyped and many of the constraints that can be imposed by the user. Results SNPPicker optimizes the selection of tag SNPs from common bin-tagging programs to design custom genotyping panels. The application uses a multi-step search strategy in combination with a statistical model to maximize the genotyping success of the selected tag SNPs. User preference toward functional SNPs can also be taken into account as secondary criteria. SNPPicker can also optimize tag SNP selection for a panel tagging multiple populations. SNPPicker can optimize custom genotyping panels including all the assay-specific constraints of Illumina's GoldenGate and Infinium assays. Conclusions A new application has been developed to maximize the success of custom multi-population genotyping panels. SNPPicker also takes into account user constraints including options for controlling runtime. Perl Scripts, Java source code and executables are available under an open source license for download at http://mayoresearch.mayo.edu/mayo/research/biostat/software.cfm