Insight into the biology of common diseases using summary statistics of large genome-wide association studies
Data from genome-wide association studies (GWAS) contain valuable information about the genetic basis of disease. For most common diseases, obtaining insights from these data is difficult because the signal is diffuse: there are likely thousands or tens of thousands of causal variants, each with a very small effect on disease risk. Moreover, for many of the largest disease GWAS, no individual researcher has access to all of the genotype data; the only data available are meta-analyzed marginal effect size estimates for each variant. I will describe a powerful approach to modeling these summary statistics that allows us, for example, to identify disease-relevant tissues and cell types, or to quantify the degree to which two traits share a genetic basis. The approach, called LD score regression, is based on a commonly used model in genetics in which the effect size of each variant on the disease is treated as random. The parameters of this model provide information about the disease, such as whether regions of the genome active in a given tissue (e.g., liver) tend to be more strongly associated with disease than regions active in a second tissue (e.g., brain). I will present results from two applications of LD score regression: one identifying relevant tissues and cell types from several large GWAS, and one identifying pairs of phenotypes with a shared genetic basis.
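To make the idea concrete, here is a minimal sketch of the basic regression behind the approach, not the method as it will be presented in the talk. It assumes the standard single-trait relationship in which the expected chi-square statistic of variant j is 1 + N h2 l_j / M (plus a confounding term, set to zero here), where N is the sample size, M the number of variants, h2 the heritability, and l_j the LD score of variant j. The data are simulated, the function name `ldsc_h2` and all parameter values are hypothetical, and the fit is plain unweighted least squares; practical analyses use regression weights and resampling-based standard errors, which are omitted.

```python
import numpy as np


def ldsc_h2(chi2, ld_scores, n_samples, n_variants):
    """Unweighted least-squares fit of chi2 ~ intercept + slope * ld_score.

    Under the assumed model the slope equals N * h2 / M, so h2 is
    recovered as slope * M / N. Hypothetical helper for illustration only.
    """
    design = np.column_stack([np.ones_like(ld_scores), ld_scores])
    (intercept, slope), *_ = np.linalg.lstsq(design, chi2, rcond=None)
    h2 = slope * n_variants / n_samples
    return h2, intercept


# Simulated summary statistics (all values are illustrative assumptions).
rng = np.random.default_rng(0)
M = 50_000        # number of variants
N = 100_000       # GWAS sample size
true_h2 = 0.4     # heritability used to generate the data

# Simulated LD scores with mean 100.
ld = rng.gamma(shape=2.0, scale=50.0, size=M)

# Expected chi-square per variant under the model, with no confounding.
expected_chi2 = 1.0 + N * true_h2 * ld / M

# Draw z-scores whose variance equals the expected chi-square, then square.
z = rng.normal(0.0, np.sqrt(expected_chi2))
chi2 = z ** 2

h2_hat, intercept_hat = ldsc_h2(chi2, ld, N, M)
print(f"estimated h2 = {h2_hat:.3f}, intercept = {intercept_hat:.2f}")
```

Only per-variant summary statistics and LD scores enter the fit, which is why the approach works when individual-level genotype data are unavailable. Restricting the LD scores to regions active in a particular tissue, or replacing the chi-square statistic with the product of z-scores from two traits, leads to the tissue-enrichment and shared-genetic-basis analyses described above.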