Computational Statistics, Volume 24, Issue 2, pp 225–232

Open-source machine learning: R meets Weka

Original Paper

Abstract

Two of the prime open-source environments available for machine/statistical learning in data mining and knowledge discovery are the software packages Weka and R, which have emerged from the machine learning and statistics communities, respectively. To make the different sets of tools from both environments available in a single unified system, an R package, RWeka, is suggested which interfaces Weka’s functionality to R. With only a thin layer of (mostly R) code, a set of general interface generators is provided which can set up interface functions with the usual “R look and feel”, re-using Weka’s standardized interface of learner classes (including classifiers, clusterers, associators, filters, loaders, savers, and stemmers) with associated methods.
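
The interface-generator idea in the abstract can be illustrated with a small, language-neutral sketch. The Python code below is not RWeka's code and does not call Weka; every name in it (`ZeroRLike`, `LEARNERS`, `FittedModel`, `make_classifier`) is hypothetical. It only shows the pattern: learner classes share one standardized interface, so a single generic generator can return a ready-to-use fitting function for any registered class name.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Type


class ZeroRLike:
    """Hypothetical toy learner exposing a standardized build/classify interface.

    It always predicts the most frequent label seen during training.
    """

    def build(self, rows: Sequence[dict], target: str) -> None:
        labels = [row[target] for row in rows]
        self.prediction = Counter(labels).most_common(1)[0][0]

    def classify(self, row: dict) -> str:
        return self.prediction


# Hypothetical registry mapping class names to learner classes.
LEARNERS: Dict[str, Type] = {"toy/rules/ZeroRLike": ZeroRLike}


class FittedModel:
    """Thin wrapper so every fitted learner offers the same predict() method."""

    def __init__(self, learner: object, target: str) -> None:
        self.learner = learner
        self.target = target

    def predict(self, rows: Sequence[dict]) -> List[str]:
        return [self.learner.classify(row) for row in rows]


def make_classifier(name: str) -> Callable[..., FittedModel]:
    """Interface generator: build a fitting function for the named class."""
    learner_class = LEARNERS[name]

    def fit(rows: Sequence[dict], target: str) -> FittedModel:
        learner = learner_class()
        learner.build(rows, target)
        return FittedModel(learner, target)

    # Name the generated function after the learner class it wraps.
    fit.__name__ = name.rsplit("/", 1)[-1]
    return fit


# Generate an interface function once, then use it like any other function.
zero_r = make_classifier("toy/rules/ZeroRLike")
data = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
]
model = zero_r(data, target="play")
predictions = model.predict([{"outlook": "sunny"}])
```

Because the generator relies only on the shared interface, supporting another learner in this sketch means adding one registry entry rather than writing a new wrapper by hand.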

Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  1. Department of Statistics and Mathematics, Wirtschaftsuniversität Wien, Vienna, Austria
  2. Institute for Tourism and Leisure Studies, Wirtschaftsuniversität Wien, Vienna, Austria
