School of Public Health
Twin Cities
These researchers are developing novel statistical methods and efficient computational tools for discovering disease-associated variants (primarily rare or low-frequency) and for risk prediction, using UK Biobank and UK10K data and whole genome sequence (WGS) data from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program. The whole genome sequencing data offer more comprehensive coverage of genetic variants than traditional GWAS chips. The TOPMed data have been released to the scientific community through dbGaP.
The researchers are currently developing several statistical methods for efficiently testing rare variant set associations across the genome and with multiple traits, and for robust risk prediction. The group will explore several adaptive testing methods to improve overall power to detect rare variants. A common theme in these analyses is the sheer volume of data and testing. Significance testing typically relies on very small p-values; for example, 5E-8 is a common genome-wide significance cutoff. As a result, comparing and validating any newly developed method requires hundreds of millions of simulation experiments to verify its behavior at such small p-values. HPC resources are therefore essential to the project's success.
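The scale of simulation mentioned above can be motivated with a rough calculation. The sketch below is illustrative only and is not the group's procedure: it assumes each null simulation is an independent Bernoulli trial that rejects with probability equal to the cutoff, so the empirical type I error rate follows a binomial distribution. The function names and the 25% precision target are hypothetical choices made for this example.

```python
import math


def expected_false_positives(n_sims: int, alpha: float) -> float:
    """Expected number of null simulations with p-value below alpha."""
    return n_sims * alpha


def relative_std_error(n_sims: int, alpha: float) -> float:
    """Relative standard error of the empirical type I error rate.

    Under the binomial assumption, sd(p_hat) = sqrt(alpha * (1 - alpha) / n),
    so sd(p_hat) / alpha = sqrt((1 - alpha) / (n * alpha)).
    """
    return math.sqrt((1.0 - alpha) / (n_sims * alpha))


def sims_needed(alpha: float, target_rse: float) -> int:
    """Smallest n whose relative standard error is at most target_rse."""
    return math.ceil((1.0 - alpha) / (alpha * target_rse ** 2))


if __name__ == "__main__":
    alpha = 5e-8  # genome-wide significance cutoff cited in the text
    for n in (10_000_000, 200_000_000, 1_000_000_000):
        print(
            f"n={n:>13,}  expected rejections={expected_false_positives(n, alpha):8.2f}  "
            f"relative SE={relative_std_error(n, alpha):.2f}"
        )
    # Hypothetical precision target: 25% relative standard error.
    print("simulations for 25% relative SE:", f"{sims_needed(alpha, 0.25):,}")
```

Under these assumptions, 200 million null simulations yield only about 10 expected rejections at the 5E-8 cutoff, which is why verifying type I error at this level calls for runs of this size.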
Research by this group was featured on the MSI website in February 2016: Finding Genetic Markers of Transplant Rejection.