Detecting effects of transcription factors on disease/Generalized least squares


Harvard CS, Harvard/MIT MD/PhD program
Detecting effects of transcription factors on disease

Abstract: Learning biology from GWAS data frequently involves identifying genomic regions involved in a biological process and testing for enrichment of GWAS signal in those regions. But in some cases, e.g., binding of a transcription factor (TF), improving models and growing data sets allow us to estimate in a signed way whether genetic variants promote or hinder a biological process. I'll present a new method, signed LD profile regression, for combining this type of information with GWAS data to draw relatively strong inferences about trait mechanism. I'll then describe how this method can be applied in conjunction with signed genomic annotations, generated using a convolutional neural network (Basset), that reflect binding of ~100 TFs in various cell lines. Finally, I'll discuss some results from applying our method to GWAS data for a range of traits including gene expression, epigenetic traits, and several diseases.


Finucane Lab
Primer: Generalized least squares

Abstract: Linear models are a very common choice when modeling the relation between inputs and outputs because of their simplicity and interpretability. We will explore methods for parameter estimation in these models, with an eye toward understanding some of the more advanced techniques. We will start by reviewing the most commonly used estimator: the ordinary least squares (OLS) estimator. Then we will explore some limitations of the OLS estimator when the residuals are not i.i.d. and discuss how to overcome these limitations, first with weighted least squares and then with generalized least squares. We'll close by discussing linear models in the context of genome-wide association studies (GWAS) as a lead-in to the talk.
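
As a rough illustration of the three estimators named in the primer abstract, the sketch below writes out the textbook closed forms in NumPy. It is not taken from the primer itself. The function names (ols, wls, gls), the simulated data, and the assumption that the noise variances are known are all choices made here for illustration.

```python
import numpy as np

def ols(X, y):
    # Ordinary least squares: beta = (X'X)^{-1} X'y
    return np.linalg.solve(X.T @ X, X.T @ y)

def wls(X, y, w):
    # Weighted least squares with per-observation weights w,
    # typically w_i = 1 / Var(residual_i): beta = (X'WX)^{-1} X'Wy
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

def gls(X, y, Sigma):
    # Generalized least squares for residual covariance Sigma:
    # beta = (X' Sigma^{-1} X)^{-1} X' Sigma^{-1} y
    Si_X = np.linalg.solve(Sigma, X)  # Sigma^{-1} X without forming the inverse
    Si_y = np.linalg.solve(Sigma, y)  # Sigma^{-1} y
    return np.linalg.solve(X.T @ Si_X, X.T @ Si_y)

# Simulated example (assumption: heteroscedastic, uncorrelated noise
# whose variances are known exactly).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
sigma2 = np.linspace(0.1, 5.0, n)  # residual variance grows across samples
y = X @ beta_true + rng.normal(size=n) * np.sqrt(sigma2)

beta_ols = ols(X, y)
beta_wls = wls(X, y, 1.0 / sigma2)
beta_gls = gls(X, y, np.diag(sigma2))
print("OLS:", beta_ols)
print("WLS:", beta_wls)
print("GLS:", beta_gls)
```

With a diagonal covariance, GLS reduces to WLS with weights equal to the inverse variances, and with an identity covariance it reduces to OLS. In practice the covariance is rarely known and has to be estimated or modeled, which this sketch does not attempt.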