57755 Statistical Methods for Association Mapping
This is an intensive graduate-level short course intended for students of statistics, mathematics and the biological sciences who are interested in association mapping. The course is sponsored by the ComBi graduate school in computational biology, bioinformatics and biometry.
This course introduces association mapping and the statistical methods and computational techniques it requires. The course is based on chapters 7 and 8 of the book Association Mapping in Plants, together with case studies drawn from recent large-scale case-control studies of human disease.
Association mapping is a gene mapping method based on detecting and utilising population-level associations, i.e. non-independence or 'linkage disequilibrium', between genetic loci, e.g. between DNA markers and loci affecting traits of interest. There is interest in using association mapping to find genetic loci associated with variation in complex traits and diseases.
The advent of dense maps of SNP markers covering the genome (e.g. 500,000 markers or more), and of technologies for screening large numbers of markers per individual, is leading to the generation of vast amounts of genomic data. However, obtaining useful information from the data is non-trivial, and many published associations are spurious. Statistical methods for analysing the data are presented. Experimental designs with sufficient power to overcome the low prior odds for genomic associations are equally vital. The course introduces methods and software for ensuring that designs have sufficient power to obtain reasonable posterior odds for associations.
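The link between power, significance threshold and posterior odds can be illustrated with a small calculation. The following is a minimal sketch in Python (not the R software used in the course), assuming the simple setting in which an association is declared when a test is significant at level alpha, so that posterior odds = prior odds x power / alpha. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
def posterior_odds(prior_odds, power, alpha):
    """Posterior odds that an association is real, given a significant test.

    Assumes the only evidence used is "the test was significant at level
    alpha", so the likelihood ratio is power / alpha.
    """
    if prior_odds < 0:
        raise ValueError("prior_odds must be non-negative")
    if not 0 <= power <= 1:
        raise ValueError("power must lie in [0, 1]")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    return prior_odds * power / alpha


def odds_to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical values: prior odds of 1 in 10,000 for any given marker
    # and a design with 80% power.
    prior = 1e-4
    power = 0.8
    for alpha in (0.05, 1e-4, 5e-8):
        odds = posterior_odds(prior, power, alpha)
        print(f"alpha={alpha:g}: posterior odds={odds:g}, "
              f"probability={odds_to_probability(odds):.4f}")
```

Under these assumed values, a conventional threshold such as 0.05 leaves the posterior odds far below 1, which is one way to see why many published associations may be spurious and why both stringent thresholds and adequate power matter.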
The basic concepts of Bayesian statistics, and how to use them for testing scientific hypotheses, are introduced. This enables computation of posterior probabilities for scientific hypotheses. A range of techniques is introduced, including analytical approximations, conjugate prior distributions and MCMC sampling. Bayesian computations for case studies are demonstrated and compared with classical 'frequentist' inference based on p-values, which are shown to be particularly problematic in a genomics context.
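As an illustration of two of the techniques named above, the sketch below estimates an allele frequency in two ways: analytically, using a conjugate Beta prior with a binomial likelihood, and by a simple random-walk Metropolis sampler, one basic form of MCMC. It is written in Python rather than the R and BUGS systems used in the course, and the data (30 copies of an allele among 200 sampled chromosomes), the uniform Beta(1, 1) prior and the sampler settings are all hypothetical assumptions made for the example.

```python
import math
import random


def beta_binomial_update(a, b, k, n):
    """Conjugate update: Beta(a, b) prior plus k successes in n trials."""
    return a + k, b + (n - k)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def log_posterior(p, a, b, k, n):
    """Unnormalised log posterior of p (Beta prior, binomial likelihood)."""
    if p <= 0.0 or p >= 1.0:
        return -math.inf
    return (a + k - 1) * math.log(p) + (b + n - k - 1) * math.log(1.0 - p)


def metropolis(a, b, k, n, n_iter=20000, burn_in=2000, step=0.05, seed=1):
    """Random-walk Metropolis sampler for the allele frequency p."""
    rng = random.Random(seed)
    p = 0.5
    current = log_posterior(p, a, b, k, n)
    samples = []
    for i in range(n_iter):
        proposal = p + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, a, b, k, n)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            p, current = proposal, candidate
        if i >= burn_in:
            samples.append(p)
    return samples


if __name__ == "__main__":
    a0, b0 = 1, 1    # hypothetical uniform prior
    k, n = 30, 200   # hypothetical allele counts
    a1, b1 = beta_binomial_update(a0, b0, k, n)
    draws = metropolis(a0, b0, k, n)
    print("conjugate posterior: Beta(%d, %d), mean %.4f"
          % (a1, b1, beta_mean(a1, b1)))
    print("Metropolis estimate of posterior mean: %.4f"
          % (sum(draws) / len(draws)))
```

In this conjugate case the sampler is unnecessary, since the posterior is available in closed form; the comparison is only a check that the two approaches agree. MCMC becomes useful for models where no closed form exists.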
The R system for data analysis and graphics and the BUGS system are introduced, and the required computations demonstrated. R functions and libraries (ldDesign) will be provided.
The course starts from first principles of Bayesian statistics. Knowledge of the basics of calculus (differentiation, integration), matrix algebra, and probability theory is an advantage. Basic knowledge of genetics (e.g. Mendelian inheritance, heritability) is also an advantage. The course aims to cater for both biologically and statistically oriented students.
There is a 1-hour exam at the end of the course and an optional project due by the end of May. Students passing the exam will receive 4 credits. Satisfactory completion of the project will be worth an extra 2 credits. Students are encouraged to work in pairs, with one biologist and one statistician.
There are 5 days with 5 hours of lectures per day from Monday the 5th of May to Friday the 9th of May. This includes a 5-hour introduction to the R system for statistical analysis and graphics and a 5-hour introduction to the BUGS system for Bayesian analysis using Gibbs sampling.
Last updated 2010-01-19 13:23