Abstract
To address the problem of unsupervised outlier detection in wireless sensor networks, we develop an approach that (1) is flexible with respect to the outlier definition, (2) computes the result in-network to reduce both bandwidth and energy consumption, (3) uses only single-hop communication, thus permitting very simple node failure detection and message reliability assurance mechanisms (e.g., carrier-sense), and (4) seamlessly accommodates dynamic updates to data. We examine performance by simulation, using real sensor data streams. Our results demonstrate that our approach is accurate and imposes reasonable communication and power consumption demands.
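As an illustration of property (1), flexibility with respect to the outlier definition, the sketch below treats the definition as a pluggable ranking function. This is a minimal, centralized sketch under stated assumptions: the abstract does not give the actual definitions or the in-network protocol, so the names `knn_distance`, `mean_knn_distance`, and `top_n_outliers`, and the choice of nearest-neighbor distances as example rankings, are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence

Point = Sequence[float]
# A ranker scores the point at index i against the rest of the data set;
# a higher score means "more outlying". (Hypothetical interface.)
Ranker = Callable[[int, List[Point]], float]


def _sorted_distances(i: int, data: List[Point]) -> List[float]:
    """Ascending Euclidean distances from data[i] to every other point."""
    return sorted(math.dist(data[i], p) for j, p in enumerate(data) if j != i)


def knn_distance(k: int) -> Ranker:
    """Example definition: distance to the k-th nearest neighbor."""
    def rank(i: int, data: List[Point]) -> float:
        dists = _sorted_distances(i, data)
        return dists[k - 1] if len(dists) >= k else math.inf
    return rank


def mean_knn_distance(k: int) -> Ranker:
    """Example definition: average distance to the k nearest neighbors."""
    def rank(i: int, data: List[Point]) -> float:
        dists = _sorted_distances(i, data)[:k]
        return sum(dists) / len(dists) if dists else math.inf
    return rank


def top_n_outliers(data: List[Point], rank: Ranker, n: int) -> List[Point]:
    """Return the n points with the highest score under the chosen ranker."""
    scores = [rank(i, data) for i in range(len(data))]
    order = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    return [data[i] for i in order[:n]]


# Made-up one-dimensional readings, for illustration only.
readings = [(20.1,), (20.3,), (19.9,), (20.0,), (35.0,)]
print(top_n_outliers(readings, knn_distance(1), 1))
print(top_n_outliers(readings, mean_knn_distance(2), 1))
```

Swapping the ranker changes the outlier definition without touching the selection step. The sketch computes everything in one place; it does not model the single-hop, in-network computation the abstract describes.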
Cite this article
Branch, J.W., Giannella, C., Szymanski, B. et al. In-network outlier detection in wireless sensor networks. Knowl Inf Syst 34, 23–54 (2013). https://doi.org/10.1007/s10115-011-0474-5
DOI: https://doi.org/10.1007/s10115-011-0474-5