
Monitoring and Coaching the Use of Home Medical Devices

  • Yang Cai
  • Yi Yang
  • Alexander Hauptmann
  • Howard Wactlar

Abstract

Despite the popularity of home medical devices, serious safety concerns have been raised because use errors of these devices have been linked to a large number of fatal hazards. To address this problem, we introduce a cognitive assistive system that automatically monitors the use of home medical devices. Accurately recognizing user operations is one of the most important functions of the proposed system. However, even though various action recognition algorithms have been proposed in recent years, it remains unknown whether they are adequate for recognizing the operations involved in using home medical devices, largely because no suitable database has been available. In the first part of this paper, we therefore present a database specifically designed for studying the use of home medical devices, and we evaluate the performance of existing approaches on it. Although we use state-of-the-art approaches that have demonstrated near-perfect performance in recognizing certain general human actions, we observe a significant performance drop when they are applied to recognizing device operations. We conclude that the tiny actions involved in operating the devices are one of the main reasons for this decrease. To recognize tiny actions accurately, it is critical to focus on where the target action happens, namely the region of interest (ROI), and to build a more elaborate action model based on that ROI. In the second part of this paper, we therefore introduce a simple but effective approach to estimating the ROI for recognizing tiny actions. The key idea is to analyze the correlation between an action and the sub-regions of a frame; the estimated ROI is then used as a filter for building more accurate action representations. Experimental results show significant performance improvements over baseline methods when the estimated ROI is used for action recognition. We also introduce an interaction framework that considers both the confidence of the detection and the seriousness of the potential error, and generates messages to the user that take both aspects into account.
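The ROI-filtering idea summarized above can be illustrated with a small sketch. The following Python example is a minimal, hypothetical illustration, not the authors' implementation: the function names (roi_weights, weighted_bow), the Fisher-style per-region scoring, and the fixed spatial grid are assumptions made for exposition. Each sub-region of the frame is scored by how well its local-feature activity separates the target action from the other actions, and each quantized local feature is then weighted by the score of the region it falls in when the bag-of-visual-words representation is built.

    import numpy as np

    def roi_weights(region_counts, labels, target_class):
        # region_counts: (num_videos, num_regions) counts of local features
        # (e.g., spatio-temporal interest points) falling into each grid cell.
        # labels: (num_videos,) action labels for the training videos.
        # Returns one weight per region, normalized to [0, 1]: a Fisher-style
        # score of how well that region separates the target action from the rest.
        pos = region_counts[labels == target_class]
        neg = region_counts[labels != target_class]
        score = (pos.mean(0) - neg.mean(0)) ** 2 / (pos.var(0) + neg.var(0) + 1e-8)
        return score / (score.max() + 1e-8)

    def weighted_bow(word_ids, region_ids, weights, vocab_size):
        # Build a bag-of-visual-words histogram for one video, weighting each
        # quantized feature by the ROI weight of the region it falls in.
        hist = np.zeros(vocab_size)
        for word, region in zip(word_ids, region_ids):
            hist[word] += weights[region]
        norm = np.linalg.norm(hist)
        return hist / norm if norm > 0 else hist

    # Toy usage: 8 training videos, a 2x2 grid (4 regions), 3 visual words.
    rng = np.random.default_rng(0)
    region_counts = rng.integers(0, 10, size=(8, 4)).astype(float)
    labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    weights = roi_weights(region_counts, labels, target_class=1)
    hist = weighted_bow(np.array([0, 2, 1, 2]), np.array([0, 1, 1, 3]),
                        weights, vocab_size=3)
    print(weights, hist)

In this toy usage, grid cells whose feature activity is distinctive for the target class receive high weights, and features falling elsewhere are suppressed, which mirrors the idea of using the estimated ROI as a filter on the action representation.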

Keywords

Visual word · Action recognition · Infusion pump · Class distance · Fisher score

Acknowledgements

This work was supported in part by the National Science Foundation under Grant No. IIS-0917072. Any opinions, findings, and conclusions expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Yang Cai 1
  • Yi Yang 1
  • Alexander Hauptmann 1
  • Howard Wactlar 1

  1. Carnegie Mellon University, Pittsburgh, USA
