Cluster Computing, Volume 21, Issue 1, pp 967–975

A four-gram unified event model for web mining

  • Xinyao Zou


To improve the quality of web data mining algorithms, this paper summarizes the advantages and disadvantages of several web data source models: the web log, the application server log, the client-side log, the packet sniffer, and the 5-gram united events model (UEM5). Based on this analysis, a new 4-gram united events model (UEM4) is proposed. Simulation experiments were conducted to compare the performance of UEM4 with that of the web log and UEM5. The experimental results show that the web log has the worst session identification performance. UEM5 has high accuracy and the best online and offline performance, but it requires the application system to support session identification. UEM4 does not require the application system to support session identification, and it also achieves good session identification accuracy and performance. The model can therefore be used in e-commerce to provide high-quality data sources for web mining algorithms and to improve the quality of intelligent services.


Keywords: 4-Gram unified events model; Session identification; User session



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Mechanical and Electrical Department, Guangdong AIB Polytechnic College, Guangzhou, China
