Fuzzy Based Latent Dirichlet Allocation in Spatio-Temporal and Randomized Least Angle Regression

  • D. Nithya
  • S. Sivakumari
Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 39)

Abstract

With the rapid growth of social networks and web blogs, many news providers share their news articles on different web sites and web blogs. These platforms are also used to gather public opinion about the articles. Twitter is a popular microblogging service that acts as a medium for the public to share their thoughts. Our aim is to find Twitter data related to the news in web news articles and to use it to enhance the performance of Evolving Fuzzy System-Penguins Search Optimization Algorithm (EFS-PeSOA) based web news mining. In this paper, Latent Dirichlet Allocation (LDA) is used to model the topics within the text of tweets. A Twitter-specific tokenizer, a part-of-speech tagger, and the Snowball stemmer are used to generate terms from the Twitter data. The term frequency-inverse document frequency (tf-idf) of each term in the tweets is calculated along with that of the terms in the web news articles to create evolving fuzzy rules. The web news articles are then categorized based on these evolving fuzzy rules. To make the categorization of web news articles more efficient, a Spatio-Temporal Generalized Additive Model (STGM) is developed in which the spatial and temporal information of the tweets is considered for categorization. However, this generates many different terms from the tweets. A Randomized Least Angle Regression (RLAR) is therefore used to choose the most significant terms in the tweets, and only the tf-idf values of the selected terms are used in the EFS for categorization of web news articles.
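
As an illustration of the tf-idf weighting mentioned above, the following is a minimal sketch in Python. It assumes terms have already been tokenized and stemmed, and it uses one common variant (term frequency normalized by document length, natural-log inverse document frequency); the exact weighting used in the paper is not stated in the abstract. The function name tf_idf and the sample terms are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in documents:
        df.update(set(doc))

    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical pre-processed terms from two tweets and one news article.
docs = [
    ["election", "vote", "result"],
    ["match", "score", "result"],
    ["election", "campaign", "vote", "poll"],
]
weights = tf_idf(docs)
```

Terms that occur in few documents (such as "match" here) receive a higher weight than terms shared across documents, which is why tf-idf values are a natural input for term selection and rule creation.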

Keywords

  • Web news articles
  • Evolving Fuzzy System
  • Latent Dirichlet Allocation
  • Spatio-Temporal Generalized Additive Model
  • Randomized Least Angle Regression

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science and Engineering, School of Engineering, Avinashilingam Institute for Home Science and Higher Education for Women, Coimbatore, India