Detecting rumours with latency guarantees using massive streaming data

  • Regular Paper
  • The VLDB Journal

Abstract

Today’s social networks continuously generate massive streams of data, which provide a valuable starting point for the detection of rumours as soon as they start to propagate. However, rumour detection faces tight latency bounds, which cannot be met by contemporary algorithms, given the sheer volume of high-velocity streaming data emitted by social networks. Hence, in this paper, we argue for best-effort rumour detection that detects most rumours quickly rather than all rumours with a high delay. To this end, we combine techniques for efficient, graph-based matching of rumour patterns with effective load shedding that discards some of the input data while minimising the loss in accuracy. Experiments with large-scale real-world datasets illustrate the robustness of our approach in terms of runtime performance and detection accuracy under diverse streaming conditions.
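
The load-shedding idea in the abstract can be illustrated with a minimal sketch. It is not the algorithm from the paper: it only assumes that each incoming item carries some precomputed utility score and that processing one item has a fixed cost, so a latency bound translates into a per-batch capacity. The names `capacity_for` and `shed_load`, the utility scores, and the cost model are all hypothetical.

```python
import heapq
from typing import Iterable, List, Tuple

Item = Tuple[str, float]  # (item id, estimated utility); both are illustrative


def capacity_for(latency_bound_ms: int, cost_per_item_ms: int) -> int:
    """Number of items that fit into the latency bound (assumed fixed cost per item)."""
    if cost_per_item_ms <= 0:
        raise ValueError("cost_per_item_ms must be positive")
    return max(latency_bound_ms // cost_per_item_ms, 0)


def shed_load(batch: Iterable[Item], capacity: int) -> List[Item]:
    """Keep the `capacity` items with the highest utility and discard the rest."""
    if capacity <= 0:
        return []
    # nlargest returns the kept items sorted by utility, highest first.
    return heapq.nlargest(capacity, batch, key=lambda item: item[1])


batch = [("a", 0.9), ("b", 0.1), ("c", 0.5), ("d", 0.7)]
capacity = capacity_for(latency_bound_ms=30, cost_per_item_ms=10)
kept = shed_load(batch, capacity)
print(kept)
```

With a 30 ms bound and an assumed cost of 10 ms per item, three of the four items are kept and the lowest-utility item is discarded. How the utility of an item is actually estimated is the hard part, and it is not specified in the abstract.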

Notes

  1. https://cse.hkust.edu.hk/graphgen/.

Acknowledgements

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2019.323.

Author information

Corresponding author

Correspondence to Thanh Tam Nguyen.

About this article

Cite this article

Nguyen, T.T., Huynh, T.T., Yin, H. et al. Detecting rumours with latency guarantees using massive streaming data. The VLDB Journal 32, 369–387 (2023). https://doi.org/10.1007/s00778-022-00750-4

  • DOI: https://doi.org/10.1007/s00778-022-00750-4
