Making Sense of Biomedical Big Data: Integrated Biomedical and Health Informatics for Personalized Care
May D. Wang, Georgia Tech and Emory University
Rapid advances in biotechnologies such as –omics (genomics, proteomics, metabolomics, lipidomics, etc.), next-generation sequencing, bio-nanotechnologies, molecular imaging, and mobile and wearable sensors give hope that personalized, predictive, preventive, and participatory health will become a reality. My research focuses on biomedical Big Data analytics, especially “Integrated Translational Biomedical and Health Informatics (BHI) for Personalized and Predictive Health”, and has produced 160+ peer-reviewed publications. Our research includes (1) bioinformatics that extracts personalized –omic biomarkers from data acquired by microarray, next-generation sequencing, and mass spectrometry technologies; (2) tissue imaging informatics for clinical decision support to improve diagnostic accuracy; and (3) health informatics for health condition monitoring and behavior change. Our informatics solutions aim to address challenges in three phases, from discovery (e.g., novel biomarkers) to development (e.g., clinical decision support systems) and ultimately delivery (e.g., mHealth and patient-centric education interventions), for both chronic conditions (e.g., cancer, cardiovascular disease, sickle cell disease, asthma, diabetes, and neurological brain injury) and acute conditions (e.g., intensive care unit stays and emergency room visits).
BHI problem solving involves multiple steps: data quality control, information feature extraction, advanced knowledge modeling, decision making, and proper action taking through feedback. In this talk, I will use cancer to explain two Big Data analytics areas. The first is Big Cancer Genomics Data Analytics for microarray chips and next-generation sequencing (e.g., caCORRECT, which improves genomics data quality; omniBiomarker, which identifies biomarkers from high-throughput –omic data based on clinical knowledge; and SEQC-pipelines, which analyzes hundreds of sequencing analysis algorithms). The second is Big Cancer Tissue Imaging Data Analytics (e.g., whole-slide-imaging informatics for clinical decision support; Q-IHC, which quantifies multiplexing in vitro diagnostic quantum dot (QD) imaging data; and TissueWiki, which archives and analyzes multiple terabytes of raw data and meta-information from the Human Protein Atlas (HPA), tissue imaging mass spectrometry (IMS) data, and multiplexing QD imaging data). Our research has been supported by NIH, NSF, Georgia Research Alliance, Microsoft Research, HP, Emory-Georgia Tech Cancer Nanotechnology Center, and Children’s Healthcare of Atlanta.
Background Review Articles:
(1) Kothari S, Phan JH, Stokes TH, and Wang MD, "Pathology imaging informatics for quantitative analysis of whole-slide images," Journal of the American Medical Informatics Association 20 (6), 1099-1108.
(2) Wu PY, Phan JH, and Wang MD, "Assessing the impact of human genome annotation choice on RNA-seq expression estimates," BMC Bioinformatics 14 (Suppl 11), S8.