Adams RJ and Marlin BM (2017), "Learning Time Series Detection Models from Temporally Imprecise Labels", In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics.
Abstract: In this paper, we consider a new low-quality label learning problem: learning time series detection models from temporally imprecise labels. In this problem, the data consist of a set of input time series, and supervision is provided by a sequence of noisy time stamps corresponding to the occurrence of positive class events. Such temporally imprecise labels commonly occur in areas like mobile health research where human annotators are tasked with labeling the occurrence of very short duration events. We propose a general learning framework for this problem that can accommodate different base classifiers and noise models. We present results on real mobile health data showing that the proposed framework significantly outperforms a number of alternatives including assuming that the label time stamps are noise-free, transforming the problem into the multiple instance learning framework, and learning on labels that were manually re-aligned.
BibTeX:
@inproceedings{adams17,
  author = {Roy J. Adams and Benjamin M. Marlin},
  title = {Learning Time Series Detection Models from Temporally Imprecise Labels},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  year = {2017},
  url = {https://md2k.org/images/papers/methods/adams17a.pdf}
}
Adams RJ, Parate A and Marlin BM (2016), "Hierarchical Span-Based Conditional Random Fields for Labeling and Segmenting Events in Wearable Sensor Data Streams", In Proceedings of The 33rd International Conference on Machine Learning, pp. 334-343.
Abstract: The field of mobile health (mHealth) has the potential to yield new insights into health and behavior through the analysis of continuously recorded data from wearable health and activity sensors. In this paper, we present a hierarchical span-based conditional random field model for the key problem of jointly detecting discrete events in such sensor data streams and segmenting these events into high-level activity sessions. Our model includes higher-order cardinality factors and inter-event duration factors to capture domain-specific structure in the label space. We show that our model supports exact MAP inference in quadratic time via dynamic programming, which we leverage to perform learning in the structured support vector machine framework. We apply the model to the problems of smoking and eating detection using four real data sets. Our results show statistically significant improvements in segmentation performance relative to a hierarchical pairwise CRF.
BibTeX:
@inproceedings{adams2016hierarchical,
  author = {Roy J Adams and Abhinav Parate and Benjamin M Marlin},
  title = {Hierarchical Span-Based Conditional Random Fields for Labeling and Segmenting Events in Wearable Sensor Data Streams},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  year = {2016},
  pages = {334--343},
  url = {https://md2k.org/images/papers/methods/adams16.pdf}
}
Alzantot M, Wang Y, Ren Z and Srivastava MB (2017), "RSTensorFlow: GPU Enabled TensorFlow for Deep Learning on Commodity Android Devices", In Proceedings of the 1st International Workshop on Deep Learning for Mobile Systems and Applications.
Abstract: Mobile devices have become an essential part of our daily lives. By virtue of both their increasing computing power and the recent progress made in AI, mobile devices evolved to act as intelligent assistants in many tasks rather than a mere way of making phone calls. However, popular and commonly used tools and frameworks for machine intelligence are still lacking the ability to make proper use of the available heterogeneous computing resources on mobile devices. In this paper, we study the benefits of utilizing the heterogeneous (CPU and GPU) computing resources available on commodity android devices while running deep learning models. We leveraged the heterogeneous computing framework RenderScript to accelerate the execution of deep learning models on commodity Android devices. Our system is implemented as an extension to the popular open-source framework TensorFlow. By integrating our acceleration framework tightly into TensorFlow, machine learning engineers can now easily make benefit of the heterogeneous computing resources on mobile devices without the need of any extra tools. We evaluate our system on different android phones models to study the trade-offs of running different neural network operations on the GPU. We also compare the performance of running different models architectures such as convolutional and recurrent neural networks on CPU only vs using heterogeneous computing resources. Our result shows that although GPUs on the phones are capable of offering substantial performance gain in matrix multiplication on mobile devices. Therefore, models that involve multiplication of large matrices can run much faster (approx. 3 times faster in our experiments) due to GPU support.
BibTeX:
@inproceedings{alzantot17,
  author = {Moustafa Alzantot and Yingnan Wang and Zhengshuang Ren and Mani B. Srivastava},
  title = {RSTensorFlow: GPU Enabled TensorFlow for Deep Learning on Commodity Android Devices},
  booktitle = {Proceedings of the 1st International Workshop on Deep Learning for Mobile Systems and Applications},
  year = {2017},
  url = {https://md2k.org/images/papers/methods/p7-alzantot.pdf},
  doi = {10.1145/3089801.3089805}
}
Dadkhahi H and Marlin B (2017), "Learning Tree-Structured Detection Cascades for Heterogeneous Networks of Embedded Devices", In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Abstract: In this paper, we present a new approach to learning cascaded classifiers for use in computing environments that involve networks of heterogeneous and resource-constrained, low-power embedded compute and sensing nodes. We present a generalization of the classical linear detection cascade to the case of tree-structured cascades where different branches of the tree execute on different physical compute nodes in the network. Different nodes have access to different features, as well as access to potentially different computation and energy resources. We concentrate on the problem of jointly learning the parameters for all of the classifiers in the cascade given a fixed cascade architecture and a known set of costs required to carry out the computation at each node. To accomplish the objective of joint learning of all detectors, we propose a novel approach to combining classifier outputs during training that better matches the hard cascade setting in which the learned system will be deployed. This work is motivated by research in the area of mobile health where energy efficient real time detectors integrating information from multiple wireless on-body sensors and a smart phone are needed for real-time monitoring and the delivery of just-in-time adaptive interventions. We evaluate our framework on mobile sensor-based human activity recognition and mobile health detector learning problems.
BibTeX:
@inproceedings{Dadkhahi17,
  author = {Hamid Dadkhahi and Benjamin Marlin},
  title = {Learning Tree-Structured Detection Cascades for Heterogeneous Networks of Embedded Devices},
  booktitle = {Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year = {2017},
  url = {https://md2k.org/images/papers/methods/p1773-dadkhahi.pdf}
}
Dadkhahi H, Saleheen N, Kumar S and Marlin BM (2016), "Learning Shallow Detection Cascades for Wearable Sensor-Based Mobile Health Applications", In Proceedings of the 33rd International Conference on Machine Learning.
Abstract: The field of mobile health aims to leverage recent advances in wearable on-body sensing technology and smart phone computing capabilities to develop systems that can monitor health states and deliver just-in-time adaptive interventions. However, existing work has largely focused on analyzing collected data in the off-line setting. In this paper, we propose a novel approach to learning shallow detection cascades developed explicitly for use in a real-time wearable-phone or wearable-phone-cloud systems. We apply our approach to the problem of cigarette smoking detection from a combination of wrist-worn actigraphy data and respiration chest band data using two and three stage cascades.
BibTeX:
@inproceedings{dadkhahi2016learning,
  author = {Hamid Dadkhahi and Nazir Saleheen and Santosh Kumar and Benjamin M Marlin},
  title = {Learning Shallow Detection Cascades for Wearable Sensor-Based Mobile Health Applications},
  booktitle = {Proceedings of the 33rd International Conference on Machine Learning},
  year = {2016},
  url = {https://md2k.org/images/papers/methods/Shallow-Detection-Cascades_Marlin16.pdf}
}
Dempsey WH, Moreno A, Scott CK, Dennis ML, Gustafson DH, Murphy SA and Rehg JM (2017), "iSurvive: An Interpretable, Event-time Prediction Model for mHealth", In Proceedings of the 34th International Conference on Machine Learning. International Convention Centre, Sydney, Australia, Vol. 70, pp. 970-979. PMLR.
Abstract: An important mobile health (mHealth) task is the use of multimodal data, such as sensor streams and self-report, to construct interpretable time-to-event predictions of, for example, lapse to alcohol or illicit drug use. Interpretability of the prediction model is important for acceptance and adoption by domain scientists, enabling model outputs and parameters to inform theory and guide intervention design. Temporal latent state models are therefore attractive, and so we adopt the continuous time hidden Markov model (CT-HMM) due to its ability to describe irregular arrival times of event data. Standard CT-HMMs, however, are not specialized for predicting the time to a future event, the key variable for mHealth interventions. Also, standard emission models lack a sufficiently rich structure to describe multimodal data and incorporate domain knowledge. We present iSurvive, an extension of classical survival analysis to a CT-HMM. We present a parameter learning method for GLM emissions and survival model fitting, and present promising results on both synthetic data and an mHealth drug use dataset.
BibTeX:
@inproceedings{pmlr-v70-dempsey17a,
  author = {Walter H. Dempsey and Alexander Moreno and Christy K. Scott and Michael L. Dennis and David H. Gustafson and Susan A. Murphy and James M. Rehg},
  editor = {Doina Precup and Yee Whye Teh},
  title = {iSurvive: An Interpretable, Event-time Prediction Model for mHealth},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  publisher = {PMLR},
  year = {2017},
  volume = {70},
  pages = {970--979},
  url = {https://md2k.org/images/papers/methods/dempsey17a.pdf},
  note = {Available at http://proceedings.mlr.press/v70/dempsey17a.html}
}
Fang D, Hohman F, Polack P, Sarker H, Kahng M, Sharmin M, al'Absi M and Chau DH (2017), "mHealth Visual Discovery Dashboard", In Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2017 ACM International Symposium on Wearable Computers. New York, NY, USA, pp. 237-240. ACM.
Abstract: We present Discovery Dashboard, a visual analytics system for exploring large volumes of time series data from mobile medical field studies. Discovery Dashboard offers interactive exploration tools and a data mining motif discovery algorithm to help researchers formulate hypotheses, discover trends and patterns, and ultimately gain a deeper understanding of their data. Discovery Dashboard emphasizes user freedom and flexibility during the data exploration process and enables researchers to do things previously challenging or impossible to do --- in the web-browser and in real time. We demonstrate our system visualizing data from a mobile sensor study conducted at the University of Minnesota that included 52 participants who were trying to quit smoking.
BibTeX:
@inproceedings{Fang:2017:MVD:3123024.3123170,
  author = {Fang, Dezhi and Hohman, Fred and Polack, Peter and Sarker, Hillol and Kahng, Minsuk and Sharmin, Moushumi and al'Absi, Mustafa and Chau, Duen Horng},
  title = {mHealth Visual Discovery Dashboard},
  booktitle = {Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2017 ACM International Symposium on Wearable Computers},
  publisher = {ACM},
  year = {2017},
  pages = {237--240},
  url = {https://md2k.org/images/papers/methods/p237-fang.pdf},
  doi = {10.1145/3123024.3123170}
}
Gao J, Teng D and Ertin E (2016), "ECG feature detection using randomly compressed samples for stable HRV analysis over low rate links", In 2016 IEEE 13th International Conference on Wearable and Implantable Body Sensor Networks (BSN), pp. 165-170.
Abstract: Wireless biosensors enable continuous monitoring of physiology and can provide early signs of imminent problems allowing for quick intervention and improved outcomes. Wireless communication of the sensor data for remote storage and analysis dominates the device power budget and puts severe constraints on lifetime and size of these sensors. Traditionally, to minimize the wireless communication bandwidth, data compression at the sensor node and signal reconstruction at the remote terminal is utilized. Here we consider an alternative strategy of feature detection with compressed samples without the intermediate step of signal reconstruction. Specifically, we present a compressed matched subspace detection algorithm to detect fiducial points of ECG waveform from streaming random projections of the data. We provide a theoretical analysis to compare the performance of the compressed matched detector performance to that of a matched detector operating with uncompressed data. We present extensive experimental results with ECG data collected in the field illustrating that the proposed system can provide high quality heart rate variability indices and achieve an order of magnitude better RMSE in beat-to-beat heart rate estimation than the traditional filter/downsample solutions at low data rates.
BibTeX:
@inproceedings{7516253,
  author = {J. Gao and D. Teng and E. Ertin},
  title = {ECG feature detection using randomly compressed samples for stable HRV analysis over low rate links},
  booktitle = {2016 IEEE 13th International Conference on Wearable and Implantable Body Sensor Networks (BSN)},
  year = {2016},
  pages = {165--170},
  url = {http://ieeexplore.ieee.org/document/7516253/},
  doi = {10.1109/BSN.2016.7516253}
}
Gulzar MA, Interlandi M, Condie T and Kim M (2017), "Debugging Big Data Analytics in Spark with BigDebug", In Proceedings of the 2017 ACM International Conference on Management of Data, pp. 1627-1630.
Abstract: Developing Big Data Analytics workloads often involves trial and error debugging, due to the unclean nature of datasets or wrong assumptions made about data. When errors (e.g., program crash, outlier results, etc.) arise, developers are often interested in identifying a subset of the input data that is able to reproduce the problem. BIGSIFT is a new faulty data localization approach that combines insights from automated fault isolation in software engineering and data provenance in database systems to find a minimum set of failure-inducing inputs. BIGSIFT redefines data provenance for the purpose of debugging using a test oracle function and implements several unique optimizations, specifically geared towards the iterative nature of automated debugging workloads. BIGSIFT improves the accuracy of fault localizability by several orders-of-magnitude (∼10³ to 10⁷×) compared to Titian data provenance, and improves performance by up to 66× compared to Delta Debugging, an automated fault-isolation technique. For each faulty output, BIGSIFT is able to localize fault-inducing data within 62% of the original job running time.
BibTeX:
@inproceedings{gulzar2017debugging,
  author = {Muhammad Ali Gulzar and Matteo Interlandi and Tyson Condie and Miryung Kim},
  title = {Debugging Big Data Analytics in Spark with BigDebug},
  booktitle = {Proceedings of the 2017 ACM International Conference on Management of Data},
  year = {2017},
  pages = {1627--1630}
}
Hohman F, Hodas N and Chau DH (2017), "ShapeShop: Towards Understanding Deep Learning Representations via Interactive Experimentation", In Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems. Denver, Colorado, USA, pp. 1694-1699. ACM.
Abstract: Deep learning is the driving force behind many recent technologies; however, deep neural networks are often viewed as “black-boxes” due to their internal complexity that is hard to understand. Little research focuses on helping people explore and understand the relationship between a user’s data and the learned representations in deep learning models. We present our ongoing work, ShapeShop, an interactive system for visualizing and understanding what semantics a neural network model has learned. Built using standard web technologies, ShapeShop allows users to experiment with and compare deep learning models to help explore the robustness of image classifiers.
BibTeX:
@inproceedings{Hohman:2017:STU:3027063.3053103,
  author = {Fred Hohman and Nathan Hodas and Duen Horng Chau},
  title = {ShapeShop: Towards Understanding Deep Learning Representations via Interactive Experimentation},
  booktitle = {Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems},
  publisher = {ACM},
  year = {2017},
  pages = {1694--1699},
  url = {https://md2k.org/images/papers/methods/ea1694-hohman.pdf},
  doi = {10.1145/3027063.3053103}
}
Interlandi M, Ekmekji A, Shah K, Gulzar MA, Tetali SD, Kim M, Millstein T and Condie T (2017), "Adding data provenance support to Apache Spark", The VLDB Journal, pp. 1-21. Springer Berlin Heidelberg.
Abstract: Debugging data processing logic in data-intensive scalable computing (DISC) systems is a difficult and time-consuming effort. Today’s DISC systems offer very little tooling for debugging programs, and as a result, programmers spend countless hours collecting evidence (e.g., from log files) and performing trial-and-error debugging. To aid this effort, we built Titian, a library that enables data provenance—tracking data through transformations—in Apache Spark. Data scientists using the Titian Spark extension will be able to quickly identify the input data at the root cause of a potential bug or outlier result. Titian is built directly into the Spark platform and offers data provenance support at interactive speeds—orders of magnitude faster than alternative solutions—while minimally impacting Spark job performance; observed overheads for capturing data lineage rarely exceed 30% above the baseline job execution time.
BibTeX:
@article{interlandi2017adding,
  author = {Matteo Interlandi and Ari Ekmekji and Kshitij Shah and Muhammad Ali Gulzar and Sai Deep Tetali and Miryung Kim and Todd Millstein and Tyson Condie},
  title = {Adding data provenance support to Apache Spark},
  journal = {The VLDB Journal},
  publisher = {Springer Berlin Heidelberg},
  year = {2017},
  pages = {1--21},
  url = {https://md2k.org/images/papers/methods/adding-provenance_interlandi.pdf}
}
Liu Y-Y, Moreno A, Li S, Li F, Song L and Rehg JM (2017), "Learning Continuous-Time Hidden Markov Models for Event Data", In Mobile Health: Sensors, Analytic Methods, and Applications. Cham, pp. 361-387. Springer International Publishing.
Abstract: The Continuous-Time Hidden Markov Model (CT-HMM) is an attractive modeling tool for mHealth data that takes the form of events occurring at irregularly-distributed continuous time points. However, the lack of an efficient parameter learning algorithm for CT-HMM has prevented its widespread use, necessitating the use of very small models or unrealistic constraints on the state transitions. In this paper, we describe recent advances in the development of efficient EM-based learning methods for CT-HMM models. We first review the structure of the learning problem, demonstrating that it consists of two challenges: (1) the estimation of posterior state probabilities and (2) the computation of end-state conditioned expectations. The first challenge can be addressed by reformulating the estimation problem in terms of an equivalent discrete time-inhomogeneous hidden Markov model. The second challenge is addressed by exploiting computational methods traditionally used for continuous-time Markov chains and adapting them to the CT-HMM domain. We describe three computational approaches and analyze the tradeoffs between them. We evaluate the resulting parameter learning methods in simulation and demonstrate the use of models with more than 100 states to analyze disease progression using glaucoma and Alzheimer's Disease datasets.
BibTeX:
@incollection{Liu2017,
  author = {Liu, Yu-Ying and Moreno, Alexander and Li, Shuang and Li, Fuxin and Song, Le and Rehg, James M.},
  editor = {Rehg, James M. and Murphy, Susan A. and Kumar, Santosh},
  title = {Learning Continuous-Time Hidden Markov Models for Event Data},
  booktitle = {Mobile Health: Sensors, Analytic Methods, and Applications},
  publisher = {Springer International Publishing},
  year = {2017},
  pages = {361--387},
  url = {https://md2k.org/images/papers/methods/LearningC-THMM.pdf},
  doi = {10.1007/978-3-319-51394-2_19}
}
Natarajan A, Angarita G, Gaiser E, Malison R, Ganesan D and Marlin BM (2016), "Domain Adaptation Methods for Improving Lab-to-field Generalization of Cocaine Detection Using Wearable ECG", In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. Heidelberg, Germany, pp. 875-885. ACM.
Abstract: Mobile health research on illicit drug use detection typically involves a two-stage study design where data to learn detectors is first collected in lab-based trials, followed by a deployment to subjects in a free-living environment to assess detector performance. While recent work has demonstrated the feasibility of wearable sensors for illicit drug use detection in the lab setting, several key problems can limit lab-to-field generalization performance. For example, lab-based data collection often has low ecological validity, the ground-truth event labels collected in the lab may not be available at the same level of temporal granularity in the field, and there can be significant variability between subjects. In this paper, we present domain adaptation methods for assessing and mitigating potential sources of performance loss in lab-to-field generalization and apply them to the problem of cocaine use detection from wearable electrocardiogram sensor data.
BibTeX:
@inproceedings{Natarajan:2016:DAM:2971648.2971666b,
  author = {Annamalai Natarajan and Gustavo Angarita and Edward Gaiser and Robert Malison and Deepak Ganesan and Benjamin M. Marlin},
  title = {Domain Adaptation Methods for Improving Lab-to-field Generalization of Cocaine Detection Using Wearable ECG},
  booktitle = {Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing},
  publisher = {ACM},
  year = {2016},
  pages = {875--885},
  url = {https://md2k.org/images/papers/methods/nihms835285_Marlin.pdf},
  doi = {10.1145/2971648.2971666}
}
Sugavanam N, Baskar S and Ertin E (2016), "Recovery guarantees for high resolution radar sensing with compressive illumination", In Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa), 2016 4th International Workshop on, pp. 252-256.
Abstract: We present a compressive radar design that combines multitone linear frequency modulated (LFM) waveforms on transmit with classical stretch processor and sub-Nyquist sampling on receive. The proposed compressive illumination scheme has much fewer random elements compared to previously proposed compressive radar designs based on stochastic waveforms, resulting in reduced storage and complexity for implementation. We present bounds on the operator norm and mutual coherence of the sensing matrix of the proposed scheme and show that for sufficiently large number of modulating tones, high resolution range recovery is guaranteed for a sparse scene using sampling rates that scale linearly with the scene sparsity. Simulation results are presented to study recovery performance as a function of system parameters for targets both on and off the grid. In addition, we present experimental results using a high speed digital waveform generator and a custom designed analog stretch processor.
BibTeX:
@inproceedings{sugavanam2016recovery,
  author = {Nithin Sugavanam and Siddharth Baskar and Emre Ertin},
  title = {Recovery guarantees for high resolution radar sensing with compressive illumination},
  booktitle = {Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa), 2016 4th International Workshop on},
  year = {2016},
  pages = {252--256},
  url = {http://ieeexplore.ieee.org/document/7745739/}
}
Teng D and Ertin E (2015), "Learning to Aggregate Information for Sequential Inferences", ArXiv e-prints.
Abstract: We consider the problem of training a binary sequential classifier under an error rate constraint. It is well known that for known densities, accumulating the likelihood ratio statistics is time optimal under a fixed error rate constraint. For the case of unknown densities, we formulate the learning for sequential detection problem as a constrained density ratio estimation problem. Specifically, we show that the problem can be posed as a convex optimization problem using a Reproducing Kernel Hilbert Space representation for the log-density ratio function. The proposed binary sequential classifier is tested on synthetic data set and UC Irvine human activity recognition data set, together with previous approaches for density ratio estimation. Our empirical results show that the classifier trained through the proposed technique achieves smaller average sampling cost than previous classifiers proposed in the literature for the same error rate.
BibTeX:
@article{2015arXiv150807964T,
  author = {Teng, D. and Ertin, E.},
  title = {Learning to Aggregate Information for Sequential Inferences},
  journal = {ArXiv e-prints},
  year = {2015},
  url = {https://md2k.org/images/papers/methods/Learning-to-Aggregate_Ertin16.pdf}
}