Advancing biomedical discovery and improving health through mobile sensor Big Data

MD2K Software Platform


Introduction

The MD2K software platform includes a mobile phone software platform called mCerebrum and a cloud counterpart called Cerebral Cortex. mCerebrum is a configurable software platform for mobile and wearable sensors. It provides support for reliable data collection from mobile and wearable sensors, and real-time processing of these data for sensor-triggered just-in-time adaptive interventions.

Cerebral Cortex is the big data companion of mCerebrum designed to support population-scale data analysis, visualization, model development, and intervention design for mobile sensor data. It provides the ability to do machine learning model development on population scale data sets and provides interoperable interfaces for aggregation of diverse data sources.

Limitations of Existing mHealth Platforms: Several software platforms have recently emerged for collecting mobile health data. For example, Apple and Google each provide a set of APIs and tools for mHealth data collection. However, the Apple HealthKit and Google Fit APIs are primarily designed for low-frequency collection of digital biomarkers such as blood pressure, weight, or physical activity. Both platforms have limited expressiveness for time series data streams.

Uniqueness of MD2K: The MD2K platform (https://www.github.com/MD2Korg/), on the other hand, is designed from the ground up as a high-frequency data stream processing toolchain that provides flexible data types and custom object storage. It can collect and analyze data from tens of wearable sensors via a wide array of wireless radios (ANT, Bluetooth, Bluetooth LE, etc.). It also provides native support for triggering notifications, self-report prompts, and interventions based on real-time values of digital biomarkers derived from sensor data.

Background: mCerebrum is backed by 9 years of software development on the Fieldstream (http://www.fieldstream.org/) and AutoSense (https://sites.google.com/site/autosenseproject/) projects, which yielded in excess of 20,000 hours of wearable sensor data from a variety of lab and field studies with hundreds of participants. Over 27 research articles (with over 400 citations) have been published using analysis of these data (see the list of articles below). mCerebrum is based on these extensive experiences with real-life high-frequency sensor data and its analysis for both technology and health research.

The signal processing for Fieldstream and AutoSense was implemented in a Matlab batch-processing codebase that operates on data after a participant has returned the data collection device to the lab. Therefore, no real-time sensor-triggered notifications or interventions were possible with the existing framework. Furthermore, the codebase does not lend itself well to distributed processing for scalable application performance.

Novelty of MD2K: mCerebrum, on the other hand, is designed to be compatible with mobile platforms so as to support real-time signal processing of high-frequency data streams in excess of 800 hertz, while meeting quality of service requirements for the mobile platform. In addition, mCerebrum is designed as an open-source project so that it can be easily used by the community and modified to suit specific research needs.

Cerebral Cortex is the big data companion to mCerebrum designed to support population-scale analytics, model development, and data visualizations. One of the primary capabilities of Cerebral Cortex is its ability to support scalable big data machine learning model development and iterative data analysis and model generation across population-scale data sets. Models learned on population-scale data can be sent back to a smartphone in the field to improve detection and classification accuracy.

Functionality & Architecture of mCerebrum (Mobile)

Figure 1: Multi-layer architecture of mCerebrum

mCerebrum is a configurable software platform for mobile and wearable devices. The mCerebrum platform is divided into functional layers so that each component is flexible and can be adapted and extended without adversely affecting the other components. Two components, an access controller and a data router, link the layers. The access controller is responsible for ensuring that pairs of components within the system have appropriate credentials to communicate with each other through the data router, which is responsible for routing data objects throughout the platform.
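The relationship between the access controller and the data router can be illustrated with a minimal sketch. This is not mCerebrum's actual API: the class names, method names, and component identifiers below are hypothetical, and credentials are reduced to a simple allow-list of (publisher, subscriber) pairs.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Set, Tuple


class AccessController:
    """Hypothetical allow-list of (publisher, subscriber) pairs."""

    def __init__(self) -> None:
        self._allowed: Set[Tuple[str, str]] = set()

    def grant(self, publisher: str, subscriber: str) -> None:
        self._allowed.add((publisher, subscriber))

    def is_allowed(self, publisher: str, subscriber: str) -> bool:
        return (publisher, subscriber) in self._allowed


class DataRouter:
    """Hypothetical router: delivers data only to approved subscribers."""

    def __init__(self, access: AccessController) -> None:
        self._access = access
        self._subscribers: Dict[str, List[Tuple[str, Callable[[Any], None]]]] = (
            defaultdict(list)
        )

    def subscribe(self, stream: str, subscriber: str,
                  callback: Callable[[Any], None]) -> None:
        self._subscribers[stream].append((subscriber, callback))

    def publish(self, stream: str, publisher: str, data: Any) -> int:
        """Route one data object; return how many subscribers received it."""
        delivered = 0
        for subscriber, callback in self._subscribers[stream]:
            # The router consults the access controller for every pair.
            if self._access.is_allowed(publisher, subscriber):
                callback(data)
                delivered += 1
        return delivered
```

In this sketch a component that subscribes without being granted access simply never receives data, which keeps the routing logic independent of how credentials are issued.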

To meet future needs, we anticipate continuing to adapt and augment the mCerebrum platform to support future technologies and the needs of new studies. The component-based architecture is easily modified and adapted to specific studies and the simple APIs should provide for easy integration into existing applications. mCerebrum has the following layers and associated application components:

Mobile Sensor Support for Data Collection

mCerebrum provides support for reliable data collection from mobile and wearable sensors in excess of 800 hertz, and real-time processing of these data for sensor-triggered just-in-time adaptive interventions. mCerebrum currently supports a variety of data sources including:

These sensor platforms communicate with mCerebrum over one of four interfaces: 802.15.4, Bluetooth, headphone port, or local API.

mCerebrum supports real-time data processing algorithms to evaluate stress, activity, driving/riding, smoking, and conversation. mCerebrum also collects user-reported data through EMAs (ecological momentary assessments), intervention responses, and self-reports.

Finally, mCerebrum provides end-to-end access control, encryption, and data source lineage along with a simple set of APIs for application development, all freely available under the open source BSD 2-clause license.

Communication Interfaces

Data sources are either on the smartphone or come from external devices that can be connected to the platform through various radios or wired interfaces:

The Microsoft Band, EasySense (lung fluid and heart/lung motion), and Omron (blood pressure and weight) devices have custom applications that use the Bluetooth radio to interact with the device and relay data into mCerebrum. Microsoft Band (https://github.com/MD2Korg/mCerebrum-MicrosoftBand) samples accelerometer, gyroscope, GSR, light, and heart rate to measure arm movements for eating and smoking behavior detection. AutoSense (https://github.com/MD2Korg/mCerebrum-AutoSense) receives data from multiple devices over an 802.15.4 (ANT+) radio: a chest unit, which provides ECG, respiration, and accelerometer data, and a wrist unit, which samples accelerometer and gyroscope data. It then sends the data into Data Kit. A Bluetooth version of AutoSense is under development for more flexibility in deployments. These two radio chipsets represent most of the wireless communication between wearable sensors and a smartphone and are currently integrated into mCerebrum.

The ICO Smokerlyzer is connected to mCerebrum via the headphone jack. Our platform is able to communicate with the device over this connection to measure the carbon monoxide level of a smoker in the field. The Phone Sensor application (https://github.com/MD2Korg/mCerebrum-PhoneSensor) can record all available sensors on a smartphone platform and is typically configured to record the accelerometer, gyroscope, GPS, CPU, and battery levels from the device. Once data has arrived on mCerebrum through any one of the interfaces, it is routed via the data router and Data Kit. We have applications that integrate into mCerebrum across a variety of different communication modalities and push data into our common data core, Data Kit, for use by additional signal processing.

Signal processing

A signal processing layer is responsible for converting sensor data into markers on which the Intervention Kit acts.  The primary real-time data processor, Stream Processor (https://github.com/MD2Korg/mCerebrum-StreamProcessor), subscribes to data sources produced by the lower tiers and produces markers for the upper tiers.  Currently, it contains signal processing algorithms designed to compute various features and markers including: data quality of raw sensor signals:

The stream processor performs this computation in one-minute blocks, which provides near real-time markers for other applications. A visualization layer contains multiple components for displaying the results of the signal processing. These can be shown to the participant as necessary but are typically used to gauge how well the system is functioning on the backend.
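As an illustration of block-wise processing, the sketch below groups timestamped samples into one-minute blocks and computes one value per block. It is not the Stream Processor's code: the function name is hypothetical, and the per-block mean is only a stand-in for whatever features the real algorithms compute. At a sampling rate of 800 hertz, each one-minute block would hold 800 × 60 = 48,000 samples.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

WINDOW_MS = 60_000  # one-minute blocks, as described in the text


def window_means(samples: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Average (timestamp_ms, value) samples within one-minute blocks.

    Keys of the result are block start times in milliseconds.
    The mean is a placeholder feature chosen for illustration.
    """
    blocks: Dict[int, List[float]] = defaultdict(list)
    for timestamp_ms, value in samples:
        # Integer division snaps each timestamp to the start of its block.
        block_start = (timestamp_ms // WINDOW_MS) * WINDOW_MS
        blocks[block_start].append(value)
    return {start: mean(values) for start, values in sorted(blocks.items())}
```

Because a block is only complete once its minute has elapsed, markers produced this way lag the raw signal by up to one minute, which is what "near real-time" means in this context.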

Storage interfaces

Storage interfaces provide encrypted data storage and transport capabilities and are subject to the privacy controller, which allows a participant to temporarily disable sensor data flow within the system according to rules dictated by the study. There are currently three storage interfaces:

Participant interaction

Participants interact with the system through a suite of applications.

For example, puffMarker uses multiple data streams (respiration, and wrist-based accelerometers and gyroscopes) to detect when a cigarette puff occurs. This puff information is used by the intervention manager to alert the participant through mCerebrum’s notification system, which sends a set of configurable messages, tones, and vibrations to a Microsoft Band and the smartphone. Failure to acknowledge the intervention or message can result in repeated attempts to contact the participant, followed by an escalation that causes additional alerts to be sent.
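The repeat-then-escalate behavior can be sketched as a small loop. Everything here is an assumption made for illustration: the function name, the `send` callback, the order of the alert levels, and the number of attempts per level are hypothetical, not mCerebrum's actual configuration.

```python
from typing import Callable, List, Sequence


def deliver_with_escalation(
    send: Callable[[str, str], bool],
    message: str,
    levels: Sequence[str] = ("message", "tone", "vibration"),
    attempts_per_level: int = 2,
) -> List[str]:
    """Repeat an alert and escalate until the participant acknowledges it.

    `send(level, message)` is a hypothetical callback that returns True
    when the alert was acknowledged. Returns the levels tried, in order.
    """
    log: List[str] = []
    for level in levels:
        for _ in range(attempts_per_level):
            log.append(level)
            if send(level, message):
                return log  # acknowledged, stop contacting the participant
    return log  # never acknowledged after all levels were exhausted
```

Returning the log of attempts makes it easy for a study to record how many prompts were needed before a participant responded.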

Similarly, ECG, respiration, and accelerometer data is used by the cStress model to assess the likelihood of stress that is then used to generate triggers for launching stress intervention apps.
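One simple way to turn a stream of stress likelihoods into intervention triggers is a threshold with a cooldown period. The sketch below is not the cStress model or the actual trigger logic: the function name, the threshold of 0.5, and the 30-minute cooldown are assumptions chosen only to make the idea concrete.

```python
from typing import Iterable, List, Optional, Tuple


def stress_triggers(
    likelihoods: Iterable[Tuple[int, float]],
    threshold: float = 0.5,   # assumed cutoff, not a published value
    cooldown_min: int = 30,   # assumed minimum gap between triggers
) -> List[int]:
    """Return the minutes at which an intervention would be launched.

    `likelihoods` yields (minute, stress_likelihood) pairs in time order.
    A trigger fires when the likelihood reaches `threshold` and at least
    `cooldown_min` minutes have passed since the previous trigger.
    """
    triggers: List[int] = []
    last: Optional[int] = None
    for minute, likelihood in likelihoods:
        cooled_down = last is None or minute - last >= cooldown_min
        if likelihood >= threshold and cooled_down:
            triggers.append(minute)
            last = minute
    return triggers
```

The cooldown keeps a sustained stress episode from launching the intervention app repeatedly within a short period.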

Functionality & Architecture of Cerebral Cortex (cloud)

Figure 2: Architecture of Cerebral Cortex Big Data Cloud Software Platform

Cerebral Cortex is a flexible layered architecture designed around different functional layers so that each component can be adapted and extended without adversely affecting the other components.  A Kernel links the layers to provide security controls between modules and a unified data interface to abstract implementation specifics. To meet future needs, we anticipate continuing to adapt and augment the Cerebral Cortex platform to support future technologies and the needs of new studies. Cerebral Cortex has the following layers and associated modules:

Gateway

A gateway layer operates in front of all the other layers to provide secure interfaces and APIs for interfacing with the platform.  This currently includes maintaining HTTP over SSL and routing requests to the appropriate internal platforms.

User Interfaces

The user interface layer is currently composed of four core applications:

Machine Interfaces

A set of machine interfaces complements the UI by providing several different APIs for both Cerebral Cortex’s internal applications and as an interface to external web services and other data sources.

Both the user and machine interfaces are built around a rapidly deployable containerized platform (Docker) to support a more flexible set of deployments across multiple different cloud architectures.

Analytics

The analytics layer contains modules primarily designed around the Apache Spark toolchain to run algorithms on the population-scale data sets that Cerebral Cortex contains. Many processes parallel what is computed in mCerebrum and extend the analysis to examine data at larger time resolutions (e.g., processing an entire day’s worth of data for a better estimate of baseline physiology rather than relying on history for online computation in mCerebrum) for higher accuracy and for population-scale analytics. Described are several modules within the system.
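The difference between a whole-day baseline and an online baseline built from history can be shown with a small example. This sketch uses plain Python rather than Apache Spark, and both function names and the choice of z-scoring are assumptions for illustration, not the platform's actual analytics.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def normalize_with_daily_baseline(values: Sequence[float]) -> List[float]:
    """Z-score each value against the mean and spread of the whole day.

    This requires the full day of data, so it suits batch processing.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def deviation_from_running_baseline(values: Sequence[float]) -> List[float]:
    """Subtract from each value the mean of all values seen so far.

    This only needs history, so it can run online as samples arrive.
    """
    deviations: List[float] = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        deviations.append(value - total / count)
    return deviations
```

Early in the day the running baseline rests on very few samples, so its estimates are less stable than those computed against the full day's data.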

Data Storage

Data storage is currently provided by a combination of the Apache Hadoop Distributed File System (HDFS) and HBase, a distributed Bigtable-style datastore. HBase is responsible for storing the vast majority of the high-frequency time series data streams and provides a queryable interface that integrates well with the analytics layer tools.

Protection of PHI

The MD2K platform, comprising mCerebrum and Cerebral Cortex, is designed to encrypt all data both at rest and in transport. mCerebrum provides inter-app data protections to prevent unauthorized services from retrieving information from Data Kit without appropriate security credentials. Data Kit ensures that all data persisted on the phone is stored in a SQLite database on an encrypted storage device. This database can only be read after providing the correct passphrase on the phone that encrypted the storage medium. This prevents the removal of the data from the smartphone without utilizing built-in Android functionality.

Cerebral Cortex provides an encrypted API for data transport from mCerebrum to our cloud services.  This API follows standard industry practices and utilizes HTTPS with Nginx handling the SSL certificates for the internal services and can be easily adapted to the AWS architecture. Data is deidentified as much as possible before being stored within the Cerebral Cortex system.

Bios of Contributors

Timothy Hnat, PhD

Dr. Hnat is Chief Software Architect for the MD2K Center. He previously served as Assistant Professor of Computer Science at the University of Memphis. His research interests cover several areas of the construction and evaluation of distributed systems, including compilers, programming languages, networking, and wireless sensor networks. He seeks to harness the potential of distributed systems to affect and interact with the physical world to address mHealth issues.

Mani Srivastava, PhD

Dr. Srivastava is a Professor of Electrical Engineering and Computer Science at the University of California, Los Angeles. His research is broadly in the area of networked human-cyber-physical systems, and spans problems across the entire spectrum of applications, architectures, algorithms, and technologies. His current interests include issues of sensing, privacy, security, data quality, and variability in the context of applications in mHealth and sustainability. He is a deputy director of NSF Expeditions on Variability and is the lead investigator on an NSF Cyber Physical Systems Frontier Project called RoseLine. His works have been cited extensively (over 30,000 times) and have won several best paper awards. He has served as editor-in-chief of IEEE Transactions on Mobile Computing and the ACM Mobile Computing and Communication Review. He is a Fellow of IEEE.

Santosh Kumar, PhD

Dr. Kumar is the Lillian and Morrie Moss Chair of Excellence Professor in the Department of Computer Science at the University of Memphis. He received his Ph.D. in Computer Science and Engineering from The Ohio State University in 2006, where his dissertation won a presidential fellowship. In 2010, the Popular Science magazine named him one of America’s ten most brilliant scientists under the age of 38 (called “Brilliant Ten”). In 2011, he chaired the “mHealth Evidence” meeting jointly organized by NIH, NSF, RWJF, and McKesson Foundation to establish evidence requirements for mHealth. In 2013, he was invited to meet with the NIH Director to advise him on NIH efforts in the area of mHealth and was invited to the White House to give a talk on the future of Biosensors. In 2014, he co-organized and co-chaired the NSF-NIH Workshop on Computing Challenges in Future Mobile Health (mHealth) Systems and Applications.

Syed Monowar Hossain

Mr. Hossain is Lead Software Engineer for the MD2K Center. He is a Ph.D. student in the Department of Computer Science at the University of Memphis. He has 4+ years of experience in designing, implementing, integrating, testing, and supporting mHealth applications to conduct research studies using wearable sensors for mobile devices on the Android platform. His research interest is in real-time inference of user behavior and context from physiological measurements collected from body-worn sensors.

References

  1. Sarker, H., Tyburski, M., Rahman, M., Hovsepian, K., Sharmin, M., Epstein, D., Preston, K., Furr-Holden, D., Nahum-Shani, I., al’Absi, M., and Kumar, S. (2016). Finding Significant Stress Episodes in a Discontinuous Time-series of Rapidly Varying Mobile Sensor Data. ACM Conference on Human Computer Interaction (CHI 2016).
  2. Moushumi Sharmin, Andrew Raij, David Epstein, Inbal Nahum-Shani, J Gayle Beck, Sudip Vhaduri, Kenzie Preston, and Santosh Kumar. Visualization of Time-Series Sensor Data to Inform the Design of Just-In-Time Adaptive Stress Interventions, ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2015).
  3. Karen Hovsepian, Mustafa al’Absi, Emre Ertin, Thomas Kamarck, and Santosh Kumar. cStress: Towards a Gold Standard for Continuous Stress Assessment in the Mobile Environment, ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2015).
  4. Nazir Saleheen, Amin A. Ali, Syed Monowar Hossain, Hillol Sarker, Soujanya Chatterjee, Benjamin Marlin, Emre Ertin, Mustafa al’Absi, and Santosh Kumar. puffMarker: A Multi-sensor Approach for Pinpointing the Timing of First Lapse in Smoking Cessation, ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2015).
  5. Kurt Plarre, Andrew Raij, Syed Monowar Hossain, Amin Ahsan Ali, Motohiro Nakajima, Mustafa al’Absi, Emre Ertin, Thomas Kamarck, Santosh Kumar, Marcia Scott, Daniel Siewiorek, Asim Smailagic, Lorentz E Wittmers Jr. Continuous in-the-field measurement of heart rate: correlates of drug use, craving, stress, and mood in polydrug users, 10th International Conference on Information Processing in Sensor Networks (IPSN), 2011
  6. S. Kumar, M. al’Absi, J.G. Beck, E. Ertin, M. Scott. Behavioral Monitoring and Assessment via Mobile Sensing Technologies. In: Marsch, Lord, Dallery (Eds.): Leveraging Technology to Transform Behavioral Healthcare, pp. 27-39, Oxford Press, 2014.
  7. M Rahman, R. Bari, A.A. Ali, M. Sharmin, A. Raij, K. Hovsepian, S.M. Hossain, E. Ertin, A. Kennedy, D.H. Epstein, K.L. Preston, M. Jobes, J.G. Beck, S. Kedia, K.D. Ward, M. al’Absi, S. Kumar. Are we there yet?: feasibility of continuous stress assessment via wireless physiological sensors, 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 479–488, ACM 2014.
  8. S. Vhaduri, A.A. Ali, M. Sharmin, K. Hovsepian, S. Kumar. (2014) Estimating Drivers’ Stress from GPS Traces, 6th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI ’14)
  9. H. Sarker, M. Sharmin, A.A. Ali, M.M. Rahman, R. Bari, S.M. Hossain, S. Kumar. Assessing the Availability of Users to Engage in Just-in-time Intervention in the Natural Environment. 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp ’14).
  10. S.M. Hossain, A.A. Ali, M.M. Rahman, E. Ertin, D. Epstein, A. Kennedy, K. Preston, A. Umbricht, Y. Chen, S. Kumar. Identifying Drug (Cocaine) Intake Events from Acute Physiological Response in the Presence of Free-living Physical Activity, 13th ACM/IEEE Conference on Information Processing in Sensor Networks, 2014.
  11. Ju Gao, Emre Ertin, Sudhakar Kumar, Mustafa al’Absi. Contactless sensing of physiological signals using wideband RF probes, Asilomar Conference on Signals, Systems and Computers, 2013.
  12. Mustafa Al’Absi, Motohiro Nakajima, Bingshuo Li, Santosh Kumar, Emre Ertin, Marcia S Scott, Lorentz Wittmers. The Assessment of Psychophysiological Response to Laboratory Stress Using A Wireless Ambulatory System: A Validation Study, Vol. 50, Psychophysiology, 2013.
  13. Motohiro Nakajima, Santosh Kumar, Lorentz Wittmers, Marcia S Scott, Mustafa al’Absi. Psychophysiological responses to stress following alcohol intake in social drinkers who are at risk of hazardous drinking, Vol. 93, Issue 1, Biological Psychology, 2013
  14. Motohiro Nakajima, Bingshuo Li, Santosh Kumar, Emre Ertin, Marcia S Scott, Lorentz Wittmers, Mustafa al’Absi. Validating a Novel Wireless Ambulatory Technology in the Assessment of Stress: A Pilot Study, Vol 73, Issue 3, Psychosomatic Medicine, 2013
  15. Amin Ahsan Ali, Syed Monowar Hossain, Karen Hovsepian, Md Mahbubur Rahman, Kurt Plarre, Santosh Kumar. mPuff: automated detection of cigarette smoking puffs from respiration measurements, 11th international conference on Information Processing in Sensor Networks, 2012
  16. Santosh Kumar, Mustafa al’Absi, Emre Ertin. Automated Assessment of Naturally Occurring Conversations, Vol 23, Annals of Behavioral Medicine, 2012
  17. Emre Ertin, Nathan Stohs, Santosh Kumar, Andrew Raij, Mustafa al’Absi, Siddharth Shah. AutoSense: unobtrusively wearable sensor suite for inferring the onset, causality, and consequences of stress in the field. 9th ACM Conference on Embedded Networked Sensor Systems. 2011
  18. Md Mahbubur Rahman, Amin Ahsan Ali, Kurt Plarre, Mustafa al’Absi, Emre Ertin, Santosh Kumar. mConverse: inferring conversation episodes from respiratory measurements collected in the field,  2nd Conference on Wireless Health, 2011
  19. Mohamed Musthag, Andrew Raij, Deepak Ganesan, Santosh Kumar, Saul Shiffman. Exploring micro-incentive strategies for participant compensation in high-burden studies, 13th international conference on Ubiquitous Computing (UbiComp 2011).
  20. Motohiro Nakajima, Mustafa al’Absi, Santosh Kumar, Emre Ertin, Angela K George, Nancy Dold, Lorentz Wittmers, Subjective and Cortisol Responses to Stress Following Alcohol Intake in Individuals with Hazardous Alcohol Use, Vol. 48, Psychophysiology, 2011.
  21. Andrew Raij, Animikh Ghosh, Santosh Kumar, Mani Srivastava. Privacy risks emerging from the adoption of innocuous wearable sensors in the mobile environment, SIGCHI Conference on Human Factors in Computing Systems, 2011.
  22. Md Mahbubur Rahman, Amin Ahsan Ali, Andrew Raij, Mustafa al’Absi, Emre Ertin, Santosh Kumar. Demo abstract: Online detection of speaking from respiratory measurements collected in the natural environment, 10th International Conference on Information Processing in Sensor Networks (IPSN), 2011.
  23. Kurt Plarre, Andrew Raij, Syed Monowar Hossain, Amin Ahsan Ali, Motohiro Nakajima, Mustafa al’Absi, Emre Ertin, Thomas Kamarck, Santosh Kumar, Marcia Scott, Daniel Siewiorek, Asim Smailagic, Lorentz E Wittmers Jr. Continuous inference of psychological stress from sensory measurements collected in the natural environment, 10th International Conference on Information Processing in Sensor Networks (IPSN), 2011.
  24. Kurt Plarre, Andrew Raij, Santanu Guha, Mustafa al’Absi, Emre Ertin, Santosh Kumar. Automated detection of sensor detachments for physiological sensing in the wild, Wireless Health 2010.
  25. Nan Hua, Ashwin Lall, Justin Romberg, Jun Jim Xu, Mustafa al’Absi, Emre Ertin, Santosh Kumar, Shikhar Suri. Just-in-time sampling and pre-filtering for wearable physiological sensors: going from days to weeks of operation on a single charge, Wireless Health 2010
  26. Andrew Raij, Patrick Blitz, Amin Ahsan Ali, Scott Fisk, Brian French, Somnath Mitra, Motohiro Nakajima, M Nuyen, Kurt Plarre, Mahbubur Rahman, Siddharth Shah, Yuan Shi, Nathan Stohs, Mustafa al’Absi, Emre Ertin, Thomas Kamarck, Santosh Kumar, Marcia Scott, Daniel Siewiorek, Asim Smailagic. mStress: Supporting Continuous Collection of Objective and Subjective Measures of Psychosocial Stress on Mobile Devices, Technical Report No. CS-10-004, Department of Computer Science, University of Memphis, 2010
  27. Yuan Shi, Minh Hoai Nguyen, Patrick Blitz, Brian French, Scott Fisk, Fernando De la Torre, Asim Smailagic, Daniel P Siewiorek, Mustafa al’Absi, Emre Ertin, Thomas Kamarck, Santosh Kumar. Personalized stress detection from physiological measurements, International Symposium on Quality of Life Technology, 2010