TKES: A Novel System for Extracting Trendy Keywords from Online News Sites

Journal of the Operations Research Society of China

Abstract

As the smart city trend, especially artificial intelligence, data science, and the Internet of Things, has attracted considerable attention, many researchers have created smart applications to improve people’s quality of life. Because automatically collecting and exploiting information is essential in the era of Industry 4.0, a variety of models have been proposed for data storage and efficient data mining. In this paper, we present the Trendy Keyword Extraction System (TKES), which is designed to extract trendy keywords from text streams. The system also supports storing, analyzing, and visualizing documents from text streams. It first automatically collects daily articles, then ranks keywords by their frequency of occurrence, and finally finds trendy keywords using the Burst Detection Algorithm proposed in this paper, which is based on the idea of Kleinberg. A burst is defined as a period of time during which a keyword is continuously and unusually popular in the text stream, and the identification of bursts is known as burst detection. The results of user requests can be displayed visually. Furthermore, we propose a method for finding a trendy keyword set, defined as a set of keywords that belong to the same burst. This work also describes the datasets used in our experiments and the processing speed tests of our two proposed algorithms.
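
To make the notion of a burst concrete, the following is a minimal Python sketch of a two-state automaton in the spirit of Kleinberg's model, applied to daily keyword counts. It is not the authors' Burst Detection Algorithm, whose details are not given in this abstract. The function names `detect_bursts` and `trendy_keyword_sets`, the parameters `s` and `gamma`, and the rule that keywords share a burst when their burst intervals are identical are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def detect_bursts(relevant, totals, s=2.0, gamma=1.0):
    """Return inclusive (start, end) index intervals where a keyword is bursty.

    relevant[t] = articles on day t containing the keyword (assumed input)
    totals[t]   = all articles collected on day t (assumed input)
    s           = how much higher the burst rate is than the base rate
    gamma       = penalty weight for entering the burst state
    """
    n = len(relevant)
    hits, docs = sum(relevant), sum(totals)
    # The rates below need 0 < p0 < 1; otherwise no burst is reported.
    if n == 0 or hits == 0 or hits >= docs:
        return []

    p0 = hits / docs              # base rate (state 0)
    p1 = min(s * p0, 0.9999)      # elevated rate (state 1), kept below 1
    probs = (p0, p1)
    up_cost = gamma * math.log(n)  # cost of moving from state 0 to state 1

    def fit_cost(p, r, d):
        # Negative binomial log-likelihood; the binomial coefficient is
        # identical for both states, so it is omitted.
        return -(r * math.log(p) + (d - r) * math.log(1 - p))

    # Viterbi search for the cheapest state sequence, starting in state 0.
    cost = [0.0, math.inf]
    back = []
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for state in (0, 1):
            candidates = [
                cost[prev] + (up_cost if prev == 0 and state == 1 else 0.0)
                for prev in (0, 1)
            ]
            best_prev = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[best_prev] + fit_cost(probs[state], r, d))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheaper final state to recover the state per day.
    state = 0 if cost[0] <= cost[1] else 1
    states = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        states.append(state)
    states.reverse()

    # Collect maximal runs of state 1 as bursts.
    bursts, start = [], None
    for t, st in enumerate(states):
        if st == 1 and start is None:
            start = t
        elif st == 0 and start is not None:
            bursts.append((start, t - 1))
            start = None
    if start is not None:
        bursts.append((start, n - 1))
    return bursts


def trendy_keyword_sets(bursts_by_keyword):
    """Group keywords whose burst intervals coincide (illustrative rule)."""
    groups = defaultdict(set)
    for keyword, bursts in bursts_by_keyword.items():
        for interval in bursts:
            groups[interval].add(keyword)
    return dict(groups)
```

For example, with 100 articles per day over ten days and keyword counts `[2, 2, 2, 2, 30, 30, 30, 2, 2, 2]`, `detect_bursts` returns `[(4, 6)]`. Raising `gamma` makes the sketch more reluctant to enter the burst state, and raising `s` requires a larger jump in frequency before a period counts as a burst.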

References

  1. Kleinberg, J.: Bursty and hierarchical structure in streams. Data Min. Knowl. Discov. 7(4), 373–397 (2003)

  2. Fung, G.P.C., Yu, J.X., Yu, P.S., Lu, H.: Parameter free bursty events detection in text streams. In: Proceedings of the 31st International Conference on Very Large Data Bases, pp. 181–192. ACM (2005)

  3. Wang, M., Madhyastha, T., Chan, N.H., Papadimitriou, S., Faloutsos, C.: Data mining meets performance evaluation: fast algorithms for modeling bursty traffic. In: Proceedings 18th International Conference on Data Engineering, pp. 507–516. IEEE (2002). https://doi.org/10.1109/ICDE.2002.994770

  4. Zhang, X.: Fast algorithms for burst detection. Ph.D. thesis, New York University, Graduate School of Arts and Science (2006)

  5. Neill, D.B., Moore, A.W.: A fast multi-resolution method for detection of significant spatial disease clusters. In: Advances in Neural Information Processing Systems, pp. 651–658. MIT Press (2004)

  6. Neill, D.B., Moore, A.W.: Anomalous spatial cluster detection. In: Proceedings of the KDD 2005 Workshop on Data Mining Methods for Anomaly Detection (2005)

  7. Neill, D.B., Moore, A.W., Pereira, F., Mitchell, T.M.: Detecting significant multidimensional spatial clusters. In: Advances in Neural Information Processing Systems, pp. 969–976 (2005)

  8. Saul, L.K., Weiss, Y., Bottou, L.: Advances in Neural Information Processing Systems 17 (2005)

  9. Thrun, S., Saul, L.K., Schölkopf, B.: Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference, vol. 16. MIT Press (2004)

  10. Bakkum, D.J., Radivojevic, M., Frey, U., Franke, F., Hierlemann, A., Takahashi, H.: Parameters for burst detection. Front. Comput. Neurosci. 7, 193 (2014)

  11. Wagenaar, D., DeMarse, T.B., Potter, S.M.: Meabench: a toolset for multi-electrode data acquisition and on-line analysis. In: Conference Proceedings. 2nd International IEEE EMBS Conference on Neural Engineering, 2005, pp. 518–521. IEEE (2005)

  12. Romsaiyud, W.: Detecting emergency events and geo-location awareness from Twitter streams. In: The International Conference on E-Technologies and Business on the Web (EBW2013), pp. 22–27 (2013)

  13. Weng, J., Lee, B.S.: Event detection in Twitter. In: Fifth International AAAI Conference on Weblogs and Social Media, pp. 17–21 (2011)

  14. Vlachos, M., Meek, C., Vagena, Z., Gunopulos, D.: Identifying similarities, periodicities and bursts for online search queries. In: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pp. 131–142 (2004). https://doi.org/10.1145/1007568.1007586

  15. Zhang, Y., Hua, W., Yuan, S.: Mapping the scientific research on open data: a bibliometric review. Learned Publ. 31(2), 95–106 (2018)

  16. Heydari, A., ali Tavakoli, M., Salim, N., Heydari, Z.: Detection of review spam: a survey. Expert Syst. Appl. 42(7), 3634–3642 (2015)

  17. Yamamoto, S., Wakabayashi, K., Kando, N., Satoh, T.: Twitter user tagging method based on burst time series. Int. J. Web Inf. Syst. 12(3), 292–311 (2016)

  18. Huyen, N.T.M., Roussanaly, A., Vinh, H.T., et al.: A hybrid approach to word segmentation of Vietnamese texts. In: International Conference on Language and Automata Theory and Applications, pp. 240–249. Springer, Berlin (2008)

  19. Hong, T.V.T., Do, P.: Developing a graph-based system for storing, exploiting and visualizing text stream. In: Proceedings of the 2nd International Conference on Machine Learning and Soft Computing, pp. 82–86 (2018). https://doi.org/10.1145/3184066.3184084

  20. Krishnamoorthy, M., Suresh, S., Alagappan, S., et al.: Deep learning techniques and optimization strategies in big data analytics: automated transfer learning of convolutional neural networks using enas algorithm. In: Deep Learning Techniques and Optimization Strategies in Big Data Analytics, pp. 142–153. IGI Global (2020)

  21. Vasant, P.: Intelligent Computing & Optimization, vol. 866. Springer, Berlin (2019)

Acknowledgements

We greatly appreciate the support of the ICO 2018. We would like to offer our special thanks to Lac Hong University, Thu Dau Mot University, and Vietnam National University.

Author information

Corresponding author

Correspondence to Tham Vo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

The work of Tham Vo is supported by Lac Hong University, and funded by Thu Dau Mot University (No. DT.20-031). The work of Phuc Do is funded by Vietnam National University, Ho Chi Minh City (No. DS2020-26-01).

About this article

Cite this article

Vo, T., Do, P. TKES: A Novel System for Extracting Trendy Keywords from Online News Sites. J. Oper. Res. Soc. China 10, 801–816 (2022). https://doi.org/10.1007/s40305-020-00327-4

  • DOI: https://doi.org/10.1007/s40305-020-00327-4
