advertisement
Electronic medical records (EMRs) are being widely implemented for use in genetic and genomic studies. As a phenotypic rich resource, EMRs provide researchers with the opportunity to identify disease cohorts and perform genotype-phenotype association studies. The Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study, as part of the Population Architecture using Genomics and Epidemiology (PAGE) I study, has genotyped more than 15,000 individuals of diverse genetic ancestry in BioVU, the Vanderbilt University Medical Center's biorepository linked to a de-identified version of the EMR (EAGLE BioVU). Here we develop and deploy an algorithm utilizing data mining techniques to identify primary open-angle glaucoma (POAG) in African Americans from EAGLE BioVU for genetic association studies. The algorithm described here was designed using a combination of diagnostic codes, current procedural terminology billing codes, and free text searches to identify POAG status in situations where gold-standard digital photography cannot be accessed. The case algorithm identified 267 potential POAG subjects but underperformed after manual review with a positive predictive value of 51.6% and an accuracy of 76.3%. The control algorithm identified controls with a negative predictive value of 98.3%. Although the case algorithm requires more downstream manual review for use in large-scale studies, it provides a basis by which to extract a specific clinical subtype of glaucoma from EMRs in the absence of digital photographs.
Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, United States of America.
Full article3.4.1 Linkage studies (Part of: 3 Laboratory methods > 3.4 Molecular genetics)