
uMoDT: an unobtrusive multi-occupant detection and tracking using robust Kalman filter for real-time activity recognition


Human activity recognition (HAR) is an important branch of human-centered research. Advances in wearable and unobtrusive technologies offer many opportunities for HAR. While much progress has been made in HAR using wearable technology, HAR with unobtrusive (non-wearable) sensors remains a challenging task. This paper investigates the detection and tracking of multiple occupants for HAR in a smart-home environment, using a novel low-resolution Thermal Vision Sensor (TVS). Specifically, the research presents the development and implementation of a two-step framework, consisting of a Computer Vision-based method to detect and track multiple occupants combined with Convolutional Neural Network (CNN)-based HAR. The proposed algorithm uses frame differencing over consecutive frames for occupant detection, a set of morphological operations to refine the identified objects, and feature extraction followed by a Kalman filter for tracking. In parallel, a 19-layer CNN architecture is used for HAR, and the results from both methods are then fused using a time-interval-based sliding window. This approach is evaluated through a series of experiments based on a benchmark Thermal Infrared dataset (VOT-TIR2016) and multi-occupant data collected from the TVS. Results demonstrate that the proposed framework is capable of detecting and tracking 88.46% of multiple occupants, with a classification accuracy of 90.99% for HAR.
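The detection and tracking steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is pure Python, handles a single blob rather than multiple occupants, and uses a plain constant-velocity Kalman filter instead of the robust variant in the title. All function names, the difference threshold, the 3x3 structuring element, and the noise values `q` and `r` are assumptions chosen for illustration.

```python
# Illustrative sketch only. Names, thresholds, and noise values are assumptions,
# not taken from the uMoDT paper or its source code.

def frame_difference(prev, curr, threshold=10):
    """Binary motion mask: 1 where |curr - prev| exceeds the threshold."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(rp, rc)]
            for rp, rc in zip(prev, curr)]


def _morph(mask, combine):
    """Apply a 3x3 neighbourhood operation; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])

    def at(y, x):
        return mask[y][x] if 0 <= y < h and 0 <= x < w else 0

    return [[combine(at(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)] for y in range(h)]


def opening(mask):
    """Erosion (min) followed by dilation (max); removes isolated noisy pixels."""
    return _morph(_morph(mask, min), max)


def centroid(mask):
    """Centre of mass (x, y) of the foreground pixels, or None if there are none."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return None
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))


class AxisKalman:
    """Constant-velocity Kalman filter for one image axis (state: position, velocity)."""

    def __init__(self, pos, q=0.01, r=1.0):
        self.pos, self.vel = float(pos), 0.0
        self.p = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r              # assumed process / measurement noise

    def predict(self):
        """Advance the state by one frame: P <- F P F^T + Q with F = [[1, 1], [0, 1]]."""
        (p00, p01), (p10, p11) = self.p
        self.pos += self.vel
        self.p = [[p00 + p01 + p10 + p11 + self.q, p01 + p11],
                  [p10 + p11, p11 + self.q]]
        return self.pos

    def update(self, z):
        """Correct the state with a measured position z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r                   # innovation variance
        k0, k1 = p00 / s, p10 / s          # Kalman gain
        y = z - self.pos                   # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.p = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.pos
```

In use, each pair of consecutive frames would be passed through `frame_difference` and `opening`, the resulting `centroid` would be fed to one `AxisKalman` per axis via `predict` followed by `update`, and `predict` alone would carry the track through frames where no blob is detected. Tracking several occupants would additionally require labelling connected components and associating detections with tracks, which this sketch omits.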





This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-0-01629) supervised by the IITP (Institute for Information & communications Technology Promotion). It was also supported by an IITP grant funded by the Korea government (MSIT) (No. 2017-0-00655), and by NRF-2016K1A3A7A03951968 and NRF-2019R1A2C2090504. This work was further supported by the REMIND project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 734355.

Author information


Corresponding author

Correspondence to Sungyoung Lee.

Additional information

Communicated by C. Xu.


About this article


Cite this article

Razzaq, M.A., Quero, J.M., Cleland, I. et al. uMoDT: an unobtrusive multi-occupant detection and tracking using robust Kalman filter for real-time activity recognition. Multimedia Systems 26, 553–569 (2020).


  • Human activity recognition
  • Image processing
  • Object detection
  • Tracking
  • Classification