International Conference on Healthcare Informatics (ICHI)
Clinico-genomic Data Analytics for Precision Diagnosis and Disease Management
November 30, 2015
Patient data can reside in clinical notes, lab results, genomic data sources, environmental and geospatial data sources, and tissue banks, to name a few. A holistic view of a patient's health can be achieved when relevant data from multiple heterogeneous sources are extracted and analyzed in a personalized manner. Moreover, comparative analysis of patients can be performed when multiple patient records are viewed across these heterogeneous data sources. To address this need, we propose clinico-genomic data analytics to enhance personalized medicine treatment decisions using heterogeneous, high-dimensional, sparse, and massive datasets. We use this framework to discover similar patients and overlaps among patients in a set of features, toward the goals of: (1) better cohort discovery for clinical trials, (2) better disease management by studying peer groups of patients with a similar diagnosis but better prognosis, and (3) early disease diagnosis by identifying similar features in patients with an existing diagnosis. We propose a novel approach in two areas: (1) integrating clinical and genomic data of patients and (2) combined data analytics on such heterogeneous datasets. Our approach is modeled as a unified clustering algorithm for finding correlations among clinical and genomic factors of patients. We integrate data containing risk-causing Single Nucleotide Polymorphisms (SNPs) known from the literature with clinical records of patients. For such heterogeneous data, we propose a combined similarity measure for numeric and nominal data attributes, which we use in our clustering algorithm. Our results show compelling overlaps among patients in the same cluster. These patients had high pairwise similarity and mirrored the real-world similarities between patients with co-morbid diseases.
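The abstract does not give the form of the combined similarity measure, so the following is only a minimal sketch of one common way to compare records that mix numeric and nominal attributes (a Gower-style average). The function name `mixed_similarity`, the attribute names, and the value ranges are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Union

Value = Optional[Union[float, str]]


def mixed_similarity(
    a: Dict[str, Value],
    b: Dict[str, Value],
    numeric_ranges: Dict[str, float],
) -> float:
    """Gower-style similarity in [0, 1] for two patient records.

    Assumption (not from the paper): numeric attributes are scored as
    1 - |x - y| / range, nominal attributes as exact match (1) or not (0),
    and the per-attribute scores are averaged with equal weight.
    Attributes missing (None or absent) in either record are skipped,
    which is one simple way to cope with sparse data.
    """
    total = 0.0
    count = 0
    for key in a.keys() & b.keys():
        x, y = a[key], b[key]
        if x is None or y is None:
            continue
        if key in numeric_ranges:
            rng = numeric_ranges[key]
            # Clamp so that values outside the stated range cannot go negative.
            score = 1.0 if rng == 0 else 1.0 - min(abs(x - y) / rng, 1.0)
        else:
            score = 1.0 if x == y else 0.0
        total += score
        count += 1
    return total / count if count else 0.0


# Hypothetical records: two clinical measurements, one SNP genotype, one diagnosis.
patient_1 = {"age": 60.0, "hba1c": 7.0, "snp_a": "CT", "diagnosis": "T2D"}
patient_2 = {"age": 50.0, "hba1c": 8.0, "snp_a": "CT", "diagnosis": "T2D"}
ranges = {"age": 40.0, "hba1c": 4.0}  # assumed observed range of each numeric attribute

print(mixed_similarity(patient_1, patient_2, ranges))
```

A pairwise score of this kind could feed any similarity-based clustering method; the equal weighting of clinical and genomic attributes is a simplification chosen for illustration.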