Abstract
With rapid advances in sensor technologies and mobile computing, Mobile Crowd Sensing (MCS) has emerged as a new paradigm for collecting massive-scale, rich trajectory data. Nomadic sensors give people and objects the ability to report and share observations on their state, their behavior, and/or their surrounding environment. Processing and mining multi-source sensor data in MCS raises several challenges because of its multi-dimensional nature: the measured parameters (i.e., dimensions) may differ in quality, variability, and time scale. We consider the context of air quality MCS and focus on the task of mining the micro-environment from MCS data. Relating the measurements to their micro-environment is crucial for interpreting them and analysing the participant’s exposure properly. In this paper, we investigate the feasibility of recognizing a person’s micro-environment in an environmental MCS scenario. We propose a novel approach for learning and predicting users’ micro-environments from their trajectories enriched with environmental data, represented as multidimensional time series plus GPS tracks. We put forward a multi-view learning approach that we adapt to our context, and implement it along with other time series classification approaches. We then extend the proposed approach to a hybrid method that employs trajectory segmentation to bring together the best of both methods. We optimise the proposed approaches either by analysing the exact geolocation (which is privacy-invasive) or by simply applying some a priori rules (which is privacy-friendly). The experimental results, obtained on real MCS data, not only confirm the power of MCS and air quality (AQ) data in characterizing the micro-environment, but also show a moderate impact of integrating mobility data in this recognition.
Furthermore, during the training phase, multi-view learning performs similarly to the reference deep learning algorithm, without requiring specific hardware. However, when the models are applied to new data, the deep learning algorithm fails to outperform our proposed models.
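The abstract does not spell out the architecture, so the following is only a minimal sketch of one common form of multi-view learning, namely stacking, and not the authors' implementation. It assumes that each sensor dimension is treated as a separate view, that each view is summarised by two simple features (mean and standard deviation), and that a toy nearest-centroid classifier serves as both base and meta learner. The view names, class labels, and all numeric values are hypothetical synthetic data chosen for illustration.

```python
import math
import random
from statistics import mean, pstdev

VIEWS = ["no2", "temperature", "speed"]  # hypothetical sensor dimensions
# Synthetic (mean, std) per class and view; invented values, not real measurements
PROFILES = {
    "indoor": {"no2": (20, 3), "temperature": (21, 1), "speed": (0.2, 0.1)},
    "outdoor": {"no2": (45, 5), "temperature": (12, 2), "speed": (1.5, 0.5)},
}

def make_sample(label, rng, n=30):
    """Generate one synthetic multivariate segment: one series per view."""
    return {v: [rng.gauss(*PROFILES[label][v]) for _ in range(n)] for v in VIEWS}

def features(series):
    """Summarise one univariate series (one view) by its mean and std."""
    return (mean(series), pstdev(series))

class NearestCentroid:
    """Toy learner that scores classes by distance to the class centroid."""
    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = {}
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids_[c] = tuple(mean(col) for col in zip(*rows))
        return self

    def predict_proba(self, x):
        d = [math.dist(tuple(x), self.centroids_[c]) for c in self.classes_]
        # Softmax over negative distances, shifted by the minimum for stability
        scores = [math.exp(-(di - min(d))) for di in d]
        return [s / sum(scores) for s in scores]

    def predict(self, x):
        p = self.predict_proba(x)
        return self.classes_[p.index(max(p))]

def fit_multiview(samples, labels, views, n_folds=2):
    """Stacking: per-view base learners feed a meta learner.

    Meta features are out-of-fold class probabilities, so the meta learner
    never sees predictions made on data the base learner was trained on.
    Assumes every training fold contains all classes.
    """
    meta_X = [[] for _ in samples]
    bases = {}
    for v in views:
        feats = [features(s[v]) for s in samples]
        for fold in range(n_folds):
            train = [i for i in range(len(samples)) if i % n_folds != fold]
            held = [i for i in range(len(samples)) if i % n_folds == fold]
            base = NearestCentroid().fit([feats[i] for i in train],
                                         [labels[i] for i in train])
            for i in held:
                meta_X[i].extend(base.predict_proba(feats[i]))
        # Refit on all data for use at prediction time
        bases[v] = NearestCentroid().fit(feats, labels)
    meta = NearestCentroid().fit(meta_X, labels)
    return bases, meta

def predict_multiview(model, sample, views):
    """Concatenate per-view probabilities and let the meta learner decide."""
    bases, meta = model
    z = []
    for v in views:
        z.extend(bases[v].predict_proba(features(sample[v])))
    return meta.predict(z)

rng = random.Random(0)
labels = ["indoor" if i % 4 < 2 else "outdoor" for i in range(40)]
samples = [make_sample(label, rng) for label in labels]
model = fit_multiview(samples, labels, VIEWS)
print(predict_multiview(model, make_sample("outdoor", rng), VIEWS))
```

Keeping one learner per view lets each dimension be modelled at its own scale and variability, which is the issue the abstract raises for multi-source sensor data. A real pipeline would replace the toy features and classifiers with proper time series classifiers.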
Notes
This value has been chosen in accordance with the resolution adopted by Airparif (the agency in charge of AQ monitoring in the Paris Region, also part of the Polluscope consortium) in their simulation models.
Acknowledgements
This work was supported by the French National Research Agency (ANR) project Polluscope, funded under the grant agreement ANR-15-CE22-0018, by the H2020 EU project GO GREEN ROUTES, funded under the research and innovation programme H2020-EU.3.5.2, grant agreement No 869764, and by the DATAIA convergence institute project StreamOps, as part of the Programme d’Investissement d’Avenir, ANR-17-CONV-0003. Part of the equipment was funded by iDEX Paris-Saclay, in the framework of the IRS project ACE-ICSEN, and by the Communauté d’agglomération Versailles Grand Parc - VGP - (www.versaillesgrandparc.fr). We are thankful to VGP (Thomas Bonhoure) for facilitating the campaign. We would like to thank all the members of the Polluscope consortium who contributed in one way or another to this work: Salim Srairi and Jean-Marc Naude (CEREMA), who conducted the campaign; Boris Dessimond and Isabella Annesi-Maesano (Sorbonne University) for their contribution to the campaign; Valerie Gros and Nicolas Bonnaire (LSCE), and Anne Kauffman and Christophe Debert (Airparif) for their contribution to the periodic qualification of the sensors and their active involvement in the project. Finally, we would like to thank the participants, without whom this work would not have been possible, for their great effort in carrying the sensors.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
El Hafyani, H., Abboud, M., Zuo, J. et al. Learning the micro-environment from rich trajectories in the context of mobile crowd sensing. Geoinformatica 28, 177–220 (2024). https://doi.org/10.1007/s10707-022-00471-4
DOI: https://doi.org/10.1007/s10707-022-00471-4