Journal of Combinatorial Optimization, Volume 32, Issue 4, pp 1068–1088

Efficient accuracy evaluation for multi-modal sensed data

  • Yan Zhang
  • Hongzhi Wang
  • Hong Gao
  • Jianzhong Li


Data accuracy is an important aspect of sensed data quality, so evaluating the accuracy of sensed data is a necessary task in data quality management. However, to the best of our knowledge, neither a measure nor effective methods for accuracy evaluation have been proposed for multi-typed sensed data. To address this problem, we propose a systematic method. Using the mean squared error (MSE), a statistical measure of accuracy, we design an accuracy evaluation framework for multi-modal data. Within this framework, we classify data types into three categories and develop accuracy evaluation algorithms for each category, both in the presence and in the absence of true values. Extensive experimental results show the efficiency and effectiveness of the proposed framework and algorithms.
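As a point of reference for the measure named in the abstract, the following is a minimal sketch of the textbook MSE for numeric readings when true values are available: the mean of the squared differences between each reading and its true value. It is not the paper's algorithm. The function name `mean_squared_error` and the sample temperature readings are assumptions made for illustration, and the case where true values are absent is not covered here.

```python
from typing import Sequence


def mean_squared_error(observed: Sequence[float], true_values: Sequence[float]) -> float:
    """Return the mean of squared differences between readings and true values."""
    if len(observed) != len(true_values):
        raise ValueError("observed and true_values must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one reading is required")
    squared_errors = [(o - t) ** 2 for o, t in zip(observed, true_values)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical sensed temperatures and their reference (true) values.
readings = [20.0, 22.0, 19.0, 21.0]
reference = [21.0, 21.0, 21.0, 21.0]

# Squared errors are 1, 1, 4, 0, so the mean is 6 / 4.
print(mean_squared_error(readings, reference))  # prints 1.5
```

A lower value indicates readings that lie closer to the true values; a value of 0 means every reading matches its reference exactly.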


Data quality, Accuracy, Sensed data



This paper was partially supported by NGFR 973 Grant 2012CB316200 and NSFC Grant 61472099.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Harbin Institute of Technology, Harbin, China
