Earth Science Informatics, Volume 5, Issue 1, pp 43–59

On retrieving patterns in environmental sensor data

  • Md. Sumon Shahriar (email author)
  • Paulo de Souza
  • Greg Timms
Research Article

Abstract

As sensor networks are increasingly deployed for environmental monitoring, there is a growing need for systems and applications that manage, process and retrieve the massive amounts of data those networks generate. In this research, we investigate a query answering system based on pattern mining techniques, specifically for marine sensor data. We consider three applications of pattern mining: similar pattern search, predictive query and query by clustering. For pattern mining in query answering, we adopt the dynamic time warping (DTW) method as the similarity measure. We also propose a query relaxation approach that recommends that users change the parameters of a given query in order to get an answer. Finally, we show implementation results of pattern query answering in a marine sensor network deployed in the south-east of Tasmania, Australia. A pattern query answering system helps users access and discover knowledge from sensor data for decision-making purposes.
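
To make the two core ideas in the abstract concrete, the following is a minimal Python sketch of DTW-based similar pattern search with a simple form of query relaxation. It is an illustration under stated assumptions, not the authors' implementation: the function names (`dtw_distance`, `similar_patterns`, `relaxed_search`), the absolute-difference cost, the fixed-length sliding window, and the threshold-widening relaxation rule are all hypothetical choices made for this example.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def similar_patterns(query, series, threshold):
    """Return (start_index, distance) for each window of len(query) within threshold."""
    w = len(query)
    hits = []
    for start in range(len(series) - w + 1):
        dist = dtw_distance(query, series[start:start + w])
        if dist <= threshold:
            hits.append((start, dist))
    return hits


def relaxed_search(query, series, threshold, step=0.5, max_threshold=5.0):
    """Hypothetical relaxation rule: widen the threshold until some window matches.

    Returns the threshold that would be recommended to the user together with
    the matches found at that threshold, or (None, []) if nothing matches.
    """
    t = threshold
    while t <= max_threshold:
        hits = similar_patterns(query, series, t)
        if hits:
            return t, hits
        t += step
    return None, []
```

For example, `relaxed_search([1, 2, 3], [2, 3, 4], 1.0, step=1.0)` finds no match at the initial threshold and would suggest relaxing it to 2.0. A production system would likely add a warping-window constraint or lower bounding to avoid the quadratic cost per window; those are omitted here for brevity.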

Keywords

Environmental informatics, Information retrieval, Data mining, Marine sensor data

Notes

Acknowledgements

The Tasmanian ICT Centre is jointly funded by the Australian Government through the Intelligent Island Program and CSIRO. The Intelligent Island Program is administered by the Tasmanian Department of Economic Development, Tourism and the Arts. This research was conducted as part of the CSIRO Wealth from Oceans National Research Flagship and the Sensors and Sensor Networks Transformational Capability Platform (SSN-TCP). We thank Aidan O’Mara for providing improved prediction using clustering.

Supplementary material

12145_2012_95_MOESM1_ESM.zip (ZIP, 76.7 KB)

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Md. Sumon Shahriar (1, email author)
  • Paulo de Souza (1)
  • Greg Timms (1)

  1. Tasmanian ICT Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Hobart, Australia