Abstract
The paper introduces an efficient feature selection approach for multivariate time-series of heterogeneous sensor data within a pervasive computing scenario. An iterative filtering procedure is devised to reduce information redundancy, measured in terms of time-series cross-correlation. The algorithm is capable of identifying nonredundant sensor sources in an unsupervised fashion, even in the presence of a large proportion of noisy features. In particular, the proposed feature selection process does not require expert intervention to determine the number of selected features, which is a key advancement with respect to time-series filters in the literature. This characteristic of the proposed algorithm allows enriching learning systems in pervasive computing applications with a fully automated feature selection mechanism that can be triggered and performed at run time during system operation. A comparative experimental analysis on real-world data from three pervasive computing applications is provided, showing that the algorithm addresses major limitations of unsupervised filters in the literature when dealing with sensor time-series. Specifically, an assessment is presented both in terms of reduction of time-series redundancy and in terms of preservation of informative features with respect to associated supervised learning tasks.
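As a rough illustration of what redundancy filtering based on time-series cross-correlation can look like, the Python sketch below greedily keeps a sensor series only if its lagged cross-correlation with every series already kept stays below a threshold. This is not the algorithm proposed in the paper: the function names, the lag window `max_lag`, and the fixed `threshold` are all assumptions made for the example, and a hand-set threshold is precisely the kind of manual choice the proposed method is designed to avoid.

```python
from math import sin, sqrt
import random


def standardize(x):
    """Return x rescaled to zero mean and unit variance (all zeros if x is constant)."""
    n = len(x)
    mean = sum(x) / n
    std = sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in x]


def max_abs_xcorr(x, y, max_lag):
    """Largest absolute normalized cross-correlation over lags in [-max_lag, max_lag]."""
    x, y = standardize(x), standardize(y)
    n = len(x)
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            pairs = zip(x[lag:], y[:n - lag])
        else:
            pairs = zip(x[:n + lag], y[-lag:])
        best = max(best, abs(sum(a * b for a, b in pairs)) / n)
    return best


def greedy_redundancy_filter(series, threshold=0.9, max_lag=5):
    """Return indices of series kept after dropping those redundant with an earlier kept one.

    `threshold` and `max_lag` are illustrative, hand-picked values (assumptions).
    """
    kept = []
    for i, candidate in enumerate(series):
        if all(max_abs_xcorr(candidate, series[j], max_lag) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy data: a sine wave, the same wave delayed by two steps, and unrelated noise.
rng = random.Random(0)
s0 = [sin(0.2 * t) for t in range(200)]
s1 = s0[-2:] + s0[:-2]
s2 = [rng.gauss(0.0, 1.0) for _ in range(200)]
kept = greedy_redundancy_filter([s0, s1, s2])
print(kept)
```

In this toy setting the delayed copy is flagged as redundant with the original sine wave, while the noise series is retained because it is uncorrelated with it. Note that a purely redundancy-based filter of this kind keeps noisy features by construction.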
Notes
MATLAB code for ICF and CleVer available at www.di.unipi.it/~bacciu/icf.
Acknowledgments
This work is supported by the FP7 RUBICON Project (Contract No. 269914). The author would like to thank Claudio Gallicchio for providing part of the results on the Echo State Network experiment, as well as Filippo Barontini for the collection of the HAR dataset.
Cite this article
Bacciu, D. Unsupervised feature selection for sensor time-series in pervasive computing applications. Neural Comput & Applic 27, 1077–1091 (2016). https://doi.org/10.1007/s00521-015-1924-x
DOI: https://doi.org/10.1007/s00521-015-1924-x