Commercial Data Mining Software

  • Qingyu Zhang
  • Richard S. Segall
Chapter

Summary

This chapter discusses selected commercial software for data mining, supercomputing data mining, text mining, and web mining. The selected software packages are compared by their features and applied to available data sets. The data mining software covered is SAS Enterprise Miner, Megaputer PolyAnalyst 5.0, PASW (formerly SPSS Clementine), IBM Intelligent Miner, and BioDiscovery GeneSight. The supercomputing software is Avizo by Visualization Sciences Group and JMP Genomics from SAS Institute. The text mining software is SAS Text Miner and Megaputer PolyAnalyst 5.0. The web mining software is Megaputer PolyAnalyst and SPSS Clementine. Background on related literature and software is presented. Screen shots of each of the selected software packages are presented, as are conclusions and future directions.

Keywords

Data Mining; Screen Shot; Mining Software; Data Mining Software; Distribute Data Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgement

The authors would like to acknowledge the support provided by a 2009 Summer Faculty Research Grant awarded to them by the College of Business of Arkansas State University, without whose program and support this work could not have been done. The authors also want to acknowledge each of the software manufacturers for their support of this research.


Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Department of Computer and Info. Tech., Arkansas State University, Jonesboro, USA
