SNP selection for association studies: maximizing power across SNP choice and study size.
Pardi, F;
Lewis, CM;
Whittaker, JC;
(2005)
Annals of Human Genetics, 69 (Pt 6).
pp. 733-746.
ISSN 0003-4800
DOI: https://doi.org/10.1111/j.1529-8817.2005.00202.x
Selection of single nucleotide polymorphisms (SNPs) is a problem of primary importance in association studies, and several approaches have been proposed. However, none provides a satisfying answer to the question of how many SNPs should be selected, or how this number should depend on the pattern of linkage disequilibrium (LD) in the region under consideration. Moreover, SNP selection is usually treated as independent of the choice of sample size, yet when resources are limited there is a tradeoff between the study size and the number of SNPs to genotype. We show that the SNP density can be tuned to the LD pattern by looking for the best solution to this tradeoff. Our approach formulates SNP selection as an optimization problem: the objective is to maximize the power of the final association study whilst keeping the total cost below a given budget. We also propose two alternative algorithms for solving this optimization problem: a genetic algorithm and a hill climbing search. These standard techniques efficiently find good solutions, even when the number of candidate SNPs is large. We compare the performance of the two algorithms on different chromosomal regions and show that, as expected, the selected SNPs reflect the LD pattern: the optimal SNP density varies dramatically between chromosomal regions.
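The budget-constrained formulation can be illustrated with a minimal hill climbing sketch. It is not the authors' implementation: the abstract does not specify the cost model, the power calculation or the neighbourhood used in the search, so everything below is an assumption made for illustration. The sketch assumes a fixed cost per subject plus a cost per genotype, a toy power proxy in which the non-centrality of a tagged causal SNP scales with the sample size times its best r² to a selected SNP, and a neighbourhood of single additions, removals and swaps. The names `block_ld`, `study_size`, `average_power` and `hill_climb`, and the parameters `effect` and `alpha`, are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def block_ld(block_sizes, rho):
    """Toy r^2 matrix: r^2 decays as rho**distance inside a block, 0 across blocks."""
    m = sum(block_sizes)
    r2 = [[0.0] * m for _ in range(m)]
    start = 0
    for size in block_sizes:
        for i in range(start, start + size):
            for j in range(start, start + size):
                r2[i][j] = rho ** abs(i - j)
        start += size
    return r2


def study_size(n_selected, budget, cost_per_subject, cost_per_genotype):
    """Largest number of subjects affordable when each is typed at n_selected SNPs."""
    if n_selected == 0:
        return 0
    return int(budget // (cost_per_subject + cost_per_genotype * n_selected))


def average_power(selected, r2, budget, cost_per_subject, cost_per_genotype,
                  effect=0.02, alpha=0.05):
    """Toy power proxy, averaged over every candidate SNP taken as causal in turn.

    Assumption: non-centrality = n * effect * (best r^2 to a selected SNP),
    with power from a normal approximation that ignores the far tail.
    """
    if not selected:
        return 0.0
    n = study_size(len(selected), budget, cost_per_subject, cost_per_genotype)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    total = 0.0
    for causal in range(len(r2)):
        best_r2 = max(r2[causal][s] for s in selected)
        ncp = n * effect * best_r2
        total += _STD_NORMAL.cdf(ncp ** 0.5 - z_crit)
    return total / len(r2)


def hill_climb(r2, budget, cost_per_subject, cost_per_genotype):
    """Steepest-ascent hill climbing over SNP subsets; returns (sorted SNPs, power)."""
    m = len(r2)

    def score(subset):
        return average_power(subset, r2, budget, cost_per_subject, cost_per_genotype)

    current = frozenset()
    current_score = score(current)
    while True:
        # Neighbours: toggle one SNP in or out, or swap a selected SNP for an unselected one.
        neighbours = [current ^ {snp} for snp in range(m)]
        for out in current:
            for inn in range(m):
                if inn not in current:
                    neighbours.append((current - {out}) | {inn})
        best = max(neighbours, key=score)
        best_score = score(best)
        if best_score <= current_score:
            # No neighbour improves the objective: a local optimum has been reached.
            return sorted(current), current_score
        current, current_score = best, best_score


if __name__ == "__main__":
    r2 = block_ld([4, 4], rho=0.9)
    snps, power = hill_climb(r2, budget=20000, cost_per_subject=5, cost_per_genotype=1)
    print(snps, study_size(len(snps), 20000, 5, 1), round(power, 3))
```

Because every added SNP raises the per-subject cost, the affordable sample size shrinks as the selection grows, which is what lets the search settle on a SNP density rather than simply selecting every SNP. Hill climbing only guarantees a local optimum, so in practice one would restart it from several initial subsets.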