World Wide Web, Volume 21, Issue 2, pp 537–555

Top-K representative documents query over geo-textual data stream

  • Bin Wang
  • Rui Zhu
  • Xiaochun Yang
  • Guoren Wang
Article

Abstract

The increasing popularity of location-based social networks encourages more and more users to share their experiences, and these shared documents strongly influence customers' decisions when shopping, traveling, and so on. This paper studies the problem of top-K valuable document queries over geo-textual data streams. Many researchers have studied this problem, but existing approaches do not consider the reliability of documents, so unreliable documents may mislead customers into making poor decisions. In addition, they lack the ability to prune documents with low representativeness. To increase user satisfaction in recommendation systems, we propose a novel framework named PDS. It first employs an efficient machine learning technique, the extreme learning machine (ELM), to prune unreliable documents, and then uses a novel index named \(\mathcal{GH}\) to maintain the remaining documents. On the one hand, this index maintains a group of pruning values to filter out low-quality documents. On the other hand, it exploits the unique property of the sliding window to further improve the performance of PDS. Theoretical analysis and extensive experimental results demonstrate the effectiveness of the proposed algorithms.
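
The pipeline described in the abstract (filter unreliable documents, then maintain only documents that can still reach the top-K while the window slides) can be illustrated with a minimal Python sketch. This is not the PDS framework or the \(\mathcal{GH}\) index, whose details the abstract does not give. The names `Doc`, `SlidingTopK`, `score`, `reliability`, and `min_reliability` are hypothetical; the reliability value stands in for the output of a classifier such as ELM, and the pruning rule is the generic dominance property of sliding windows (a document outscored by K newer documents can never be reported before it expires), assumed here for illustration only.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Doc:
    doc_id: int
    score: float        # hypothetical representativeness score
    reliability: float  # hypothetical classifier output in [0, 1]


class SlidingTopK:
    """Count-based sliding window keeping only documents that can still be top-k."""

    def __init__(self, window_size: int, k: int, min_reliability: float = 0.5):
        self.window_size = window_size
        self.k = k
        self.min_reliability = min_reliability
        self.clock = 0        # number of arrivals seen so far
        self.candidates = []  # entries: [arrival_time, doc, dominated_count]

    def insert(self, doc: Doc) -> None:
        self.clock += 1
        # Expire candidates that have slid out of the window.
        oldest_valid = self.clock - self.window_size + 1
        self.candidates = [c for c in self.candidates if c[0] >= oldest_valid]
        # Assumed reliability filter: unreliable documents are never stored,
        # although they still occupy a slot in the window.
        if doc.reliability < self.min_reliability:
            return
        # Every older candidate with a strictly lower score is dominated
        # by the newcomer, which also expires later than it does.
        for c in self.candidates:
            if c[1].score < doc.score:
                c[2] += 1
        # A candidate dominated by k newer documents can never be in the top-k.
        self.candidates = [c for c in self.candidates if c[2] < self.k]
        self.candidates.append([self.clock, doc, 0])

    def top_k(self) -> List[Doc]:
        docs = (c[1] for c in self.candidates)
        return heapq.nlargest(self.k, docs, key=lambda d: d.score)


stream = [
    Doc(1, 0.9, 0.9),
    Doc(2, 0.8, 0.2),  # unreliable, filtered out
    Doc(3, 0.5, 0.8),
    Doc(4, 0.7, 0.9),
    Doc(5, 0.6, 0.9),
]
monitor = SlidingTopK(window_size=3, k=2)
for d in stream:
    monitor.insert(d)
result_ids = [d.doc_id for d in monitor.top_k()]
```

In this sketch the candidate list stays small because dominated documents are discarded as soon as K newer, higher-scoring documents arrive, rather than when they expire. A real index would replace the linear scans with a structure suited to the stream rate.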

Keywords

Documents, Geo-textual data stream, Top-k, ELM

Notes

Acknowledgments

This work is partially supported by the NSF of China for Outstanding Young Scholars under grant No. 61322208, the NSF of China under grant Nos. 61572122, 61272178, 61502317, U1401256, and the NSF of China for Key Program under grant No. 61532021. Bin Wang is the corresponding author.

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. School of Computer Science and Engineering, Northeastern University, Liaoning, China
