Using whole genome scores to compare three clinical phenotyping methods in complex diseases.

Sci Rep
Abstract

Genome-wide association studies depend on accurate ascertainment of patient phenotype. However, phenotyping is difficult, and it is often treated as an afterthought in these studies because of the expense involved. Electronic health records (EHRs) may provide higher fidelity phenotypes for genomic research than other sources such as administrative data. We used whole genome association models to evaluate different EHR and administrative data-based phenotyping methods in a cohort of 16,858 Caucasian subjects for type 1 diabetes mellitus, type 2 diabetes mellitus, coronary artery disease and breast cancer. For each disease, we trained and evaluated polygenic models using three different phenotype definitions: phenotypes derived from billing data, the clinical problem list, or a curated phenotyping algorithm. We observed that for these diseases, the curated phenotype outperformed the problem list, and the problem list outperformed administrative billing data. This suggests that using advanced EHR-derived phenotypes can further increase the power of genome-wide association studies.
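The comparison described in the abstract can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the study's method or data: the three misclassification rates, the liability threshold, and the function names (`auc`, `flip`, `simulate`) are all hypothetical assumptions chosen for illustration. It shows only the general mechanism, that the same genetic score appears to discriminate less well when it is evaluated against noisier phenotype labels.

```python
import random


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney U); tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def flip(labels, rate, rng):
    """Flip each binary label independently with probability `rate`."""
    return [1 - y if rng.random() < rate else y for y in labels]


def simulate(n=5000, seed=0):
    """Score one synthetic genetic predictor against three noisy label sets."""
    rng = random.Random(seed)
    # Hypothetical polygenic score per subject.
    score = [rng.gauss(0, 1) for _ in range(n)]
    # True disease status: score plus non-genetic noise above a threshold.
    truth = [1 if s + rng.gauss(0, 1) > 1.0 else 0 for s in score]
    # ASSUMED misclassification rates per phenotype source (illustrative only).
    noise = {"curated": 0.02, "problem_list": 0.10, "billing": 0.25}
    return {name: auc(score, flip(truth, rate, rng)) for name, rate in noise.items()}


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: AUC = {value:.3f}")
```

The ordering this sketch produces follows directly from the assumed noise rates, so it is not evidence for the paper's finding. It only shows why a cleaner phenotype definition would be expected to yield a better-performing polygenic model.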

Year of Publication
2018
Journal
Sci Rep
Volume
8
Issue
1
Pages
11360
Date Published
2018-07-27
ISSN
2045-2322
DOI
10.1038/s41598-018-29634-w
PubMed ID
30054501
PubMed Central ID
PMC6063939
Grant list
K01 DK114379 / DK / NIDDK NIH HHS / United States
R01 HL122225 / HL / NHLBI NIH HHS / United States
T15 LM007092 / LM / NLM NIH HHS / United States