An Approach to Extract and Compare Metadata of Human Activity Recognition (HAR) Data Sets

  • Conference paper
Proceedings of the International Conference on Ubiquitous Computing & Ambient Intelligence (UCAmI 2022)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 594)

Abstract

Open data and data sets are emerging in human activity recognition (HAR) because of their importance in application areas such as improving people's lives, enabling informed care decisions, solving real-world problems, and choosing the best HAR approaches. Curating and sharing open data sets remains challenging because the shared data often lacks metadata and a complete description. Properly curated data sets are easier to recognise, obtain and reuse, which helps HAR research progress. In this paper, we propose a conceptual framework that describes the open data set lifecycle in four phases: construction, sharing, finding, and using. We also explore open issues and challenges related to HAR data sets, drawing on the published literature. On this basis, we present an approach that automatically extracts metadata by web scraping the HAR data sets and then applying a natural language processing (NLP) pipeline to detect the metadata of each data set. Using the retrieved metadata, we show how comparisons can be performed under different scenarios, which can help evaluate data set quality and identify areas for improvement in data set curation. This research will assist the HAR research community in better understanding the open data set lifecycle and how data set quality can be improved.
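
The extract-then-compare idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the pipeline, so simple regular expressions stand in for the NLP step, inline HTML strings stand in for scraped data set pages, and the field names (`subjects`, `sampling_rate_hz`, `activities`), the patterns, and the completeness score are all hypothetical choices made for illustration.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from an HTML page, skipping script/style content."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def page_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical metadata fields; regexes are a stand-in for an NLP pipeline.
PATTERNS = {
    "subjects": re.compile(r"(\d+)\s+(?:subjects|participants|volunteers)", re.I),
    "sampling_rate_hz": re.compile(r"(\d+(?:\.\d+)?)\s*Hz", re.I),
    "activities": re.compile(r"(\d+)\s+(?:activities|activity classes)", re.I),
}


def extract_metadata(html):
    """Map each field to the first matching value in the page, or None."""
    text = page_text(html)
    found = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        found[field] = match.group(1) if match else None
    return found


def compare(datasets):
    """Report, per data set, the share of fields found and the missing ones."""
    report = {}
    for name, meta in datasets.items():
        missing = sorted(f for f, v in meta.items() if v is None)
        report[name] = {
            "completeness": 1 - len(missing) / len(meta),
            "missing": missing,
        }
    return report


# Invented example pages (not real data set descriptions).
PAGE_A = (
    "<html><body><h1>Set A</h1>"
    "<p>Recorded from 30 subjects at 50 Hz, covering 6 activities.</p>"
    "<script>var x = '99 Hz';</script></body></html>"
)
PAGE_B = "<html><body><p>Accelerometer data from 10 participants.</p></body></html>"

metadata = {"A": extract_metadata(PAGE_A), "B": extract_metadata(PAGE_B)}
report = compare(metadata)
```

In this sketch a data set page that omits fields scores lower on completeness and lists what is missing, which is one simple way a comparison could point to gaps in curation.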

Corresponding author

Correspondence to Gulzar Alam.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Alam, G., McChesney, I., Nicholl, P., Rafferty, J. (2023). An Approach to Extract and Compare Metadata of Human Activity Recognition (HAR) Data Sets. In: Bravo, J., Ochoa, S., Favela, J. (eds) Proceedings of the International Conference on Ubiquitous Computing & Ambient Intelligence (UCAmI 2022). UCAmI 2022. Lecture Notes in Networks and Systems, vol 594. Springer, Cham. https://doi.org/10.1007/978-3-031-21333-5_71
