Abstract
Ranking in uncertain database environments has recently gained great importance. Many techniques have been introduced for ranking uncertain databases, and others for ranking distributed certain databases. Unfortunately, few techniques address ranking in distributed uncertain databases. This paper proposes a framework that improves ranking processing for databases that are both uncertain and distributed. Within the proposed framework, new communication- and computation-efficient algorithms are investigated for retrieving the top-k tuples from distributed sites. These algorithms are applied under tuple-level uncertainty. Their main concern is to reduce the number of communication rounds and the amount of data transmitted while still ranking efficiently. Experimental results show that both proposed algorithms substantially reduce communication cost. The results also indicate that the first algorithm is efficient when the number of sites is low, while the second performs better when the number of sites is higher.
Additional information
Communicated by A. Lotfi.
About this article
Cite this article
AbdulAzeem, Y.M., Eldesouky, A.I., Ali, H.A. et al. Ranking distributed database in tuple-level uncertainty. Soft Comput 19, 965–980 (2015). https://doi.org/10.1007/s00500-014-1306-9
DOI: https://doi.org/10.1007/s00500-014-1306-9