Mobile sensors offer tremendous opportunities for accelerating biomedical discovery and optimizing care delivery; however, they present grand transdisciplinary data analytics challenges due to the unique combination of high volume, velocity, variety, variation, versatility, and semantic gap in their generated data. To address these challenges, the Data Science Research (DSR) Core of MD2K will pursue research in four primary thrust areas.
Thrust 1: Mobile Sensor Data-to-Information (MD2I)
Extracting useful and actionable information from mobile sensor data requires addressing several challenging problems, including real-world uncertainty and dynamics; inherent variability; non-stationary noise processes and missing data associated with on-body sensors; and a wide range of sensor signals with different properties.
Thrust 1 will develop general principles and computational methods for inferring markers (i.e., measures) of patient health, as well as markers of behavioral, physical, social, and environmental risk factors, that are robust to wide variability in subject behaviors, an array of known and unknown confounders, errors in self-report data, and the variable quality and availability of sensor data. It will provide a mobile sensor data processing toolkit as open-source software that implements all data analytic steps needed to obtain robust markers and that data science researchers can use to develop new markers. For our two biomedical applications, MD2I will produce a variety of markers to detect a smoking lapse event (e.g., smoking event), to detect onset of congestion in CHF patients (e.g., lung water volume), to discover risk factors for smoking lapse (e.g., stress), and to discover risk factors for CHF (e.g., fast food consumption). Biomedical researchers can directly use these markers in their applications.
Thrust 2: Mobile Sensor Information-to-Knowledge (MI2K)
After Thrust 1 produces robust markers of health state and of behavioral, contextual, and environmental risk factors, the data collected by mobile sensors can be converted to time series of these markers. Thrust 2 will develop data analytic tools to mine marker time series to discover detectors and predictors of vulnerable states, as well as tools for using these discoveries to inform care decisions. It will develop discriminative latent variable models to discover patterns in multivariate time series of markers, in order to robustly detect intermediate health outcomes (e.g., a smoking lapse) and generate alerts for patients and care providers, accompanied by infographics of the surrounding context to inform care decisions. It will develop frequent pattern mining and Granger causality models to discover predictors of adverse health events from marker time series data, and it will develop a discovery dashboard to engage biomedical researchers in the discovery process. It will also develop learning algorithms for online adaptation of the rules that decide the content and timing of just-in-time adaptive interventions (JITAIs).
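As a rough illustration of the frequent pattern mining idea mentioned above, the following minimal Python sketch counts which combinations of discretized markers recur in the time windows preceding an adverse event. It is not the MD2K method: the function name `frequent_marker_patterns`, the marker labels, the set-based window encoding, and the support threshold are all hypothetical, and the enumeration is brute force rather than an optimized algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations


def frequent_marker_patterns(windows, min_support=0.5, max_size=2):
    """Return marker combinations seen in at least `min_support` of windows.

    windows: list of sets; each set holds the discretized markers observed
    in the time window preceding one adverse event (hypothetical encoding).
    The result maps each frequent combination (a sorted tuple) to its support.
    """
    counts = Counter()
    for window in windows:
        items = sorted(window)  # sort so each combination has one canonical form
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    n = len(windows)
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


# Hypothetical marker windows preceding four smoking lapses (illustrative only).
windows = [
    {"stress_high", "at_bar"},
    {"stress_high", "with_smokers"},
    {"stress_high", "at_bar", "with_smokers"},
    {"at_bar"},
]
patterns = frequent_marker_patterns(windows, min_support=0.5)
for combo, support in sorted(patterns.items()):
    print(combo, support)
```

A pattern that is frequent before events is only a candidate predictor; it would still need to be compared against windows that do not precede an event before any claim about predictive value could be made.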
Thrust 3: MD2K – Computation (MD2K-I)
Thrust 3 will architect and implement a scalable, responsive, trustworthy, accessible, generalizable, interoperable, and evolvable Big Data software platform to support MD2K research. Thrust 3 will make it possible for data science researchers to learn their models from large population-scale data and for biomedical researchers to apply these models for biomedical discovery. It will also enable the application of these models to individual-scale data for near-real-time care delivery (e.g., JITAI) on mobile devices.
To provide a responsive user experience to researchers, care providers, and individuals, Thrust 3 will build upon our ongoing research on Big Data computational methods such as Iterative Map-Reduce. Thrust 3 will also develop computational mechanisms and software for management of participant privacy and identity by leveraging our ongoing work on sensor data vaults and recent developments in fully homomorphic encryption.
To make MD2K extensible to new mobile data sources and applicable to other biomedical applications, Thrust 3 will build upon the Open mHealth platform to standardize all the data types, analytic modules, visualization modules, and storage modules developed in MD2K, so that the platform is interoperable, extensible, and generalizable.
Thrust 4: MD2K-Applications (MD2K-A)
Thrust 4 will help realize the vision of P5 medicine by using team science principles and design thinking expertise to create a bi-directional, rapid feedback loop between real-world applications and Thrusts 1 to 3, to inform the design of MD2K’s core technology and application platforms.
Thrust 4 will design and conduct user studies in Years 2, 3, and 4, each involving a new pool of 75 smokers in the lab and in the field (1 week ad lib, 1 week post-quit) and 75 CHF patients in the hospital and for 30 days post-discharge in the field. The first study will evaluate the accuracy of markers and provide data for the development of new markers and the improvement of existing markers by Thrust 1. These data will also be used by researchers in Thrust 2 to develop models for detecting and predicting vulnerable states, all of which will be evaluated in Year 3. Finally, the Year 4 study will field test end-to-end just-in-time notification systems for each application.
By demonstrating a sensor-information-knowledge pipeline applied to two quite different use cases, Thrust 4 will validate the robustness, accuracy, and feasibility of the MD2K platform. Evaluation of the platform by other centers in the BD2K Consortium will further validate our approach and demonstrate the potential of MD2K to enhance the Big Data capacity and capability of the biomedical enterprise to realize the P5 vision.