Data Preprocessing in Web Usage Mining

Conference paper

Abstract

At present, research on Web usage mining focuses mainly on pattern discovery (including association rules, sequential patterns, etc.) and pattern analysis. Research on the main data source, namely the preprocessing of web logs, is comparatively rare. Because high-quality data substantially improves the precision of pattern mining, this paper studies web-log preprocessing and proposes a highly effective data preprocessing method.

Keywords

Client identification; Data cleaning; Path completion; Web usage mining

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. College of Information Engineering, Shandong Youth University of Political Science, Jinan, China
