Skip to main content

Weka-A Machine Learning Workbench for Data Mining


The Weka workbench is an organized collection of state-of-the-art machine learning algorithms and data preprocessing tools. The basic way of interacting with these methods is by invoking them from the command line. However, convenient interactive graphical user interfaces are provided for data exploration, for setting up large-scale experiments on distributed computing platforms, and for designing configurations for streamed data processing. These interfaces constitute an advanced environment for experimental data mining. The system is written in Java and distributed under the terms of the GNU General Public License.

Key words

  • machine learning software
  • Data Mining
  • data preprocessing
  • data visualization
  • extensible workbench

This is a preview of subscription content, access via your institution.

Buying options

USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
USD   299.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  • Bazzan, A. L., Engel, P. M., Schroeder, L. F., and da Silva, S. C. (2002). Automated annotation of keywords for proteins related to mycoplasmataceae using machine learning techniques. Bioinformatics, 18:35S–43S.

    Google Scholar 

  • Frank, E., Holmes, G., Kirkby, R., and Hall, M. 2002). Racing committees for large datasets. In Proceedings of the International Conference on Discovery Science, pages 153–164. Springer-Verlag.

    Google Scholar 

  • Frank, E., Paynter, G. W., Witten, I. H., Gutwin, C., and Nevill-Manning, C. G. (1999). Domain-specific keyphrase extraction. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, pages 668–673. Morgan Kaufmann.

    Google Scholar 

  • Holmes, G., Cunningham, S. J., Rue, B. D., and Bollen, F. (1998). Predicting apple bruising using machine learning. Acta Hort, 476:289–296.

    Google Scholar 

  • Holmes, G. and Hall, M. (2002). A development environment for predictive modelling in foods. International Journal of Food Microbiology, 73:351–362.

    CrossRef  Google Scholar 

  • Holmes, G., Kirkby, R., and Pfahringer, B. (2003). Mining data streams using option trees. Technical Report 08/03, Department of Computer Science, University of Waikato.

    Google Scholar 

  • Kusabs, N., Bollen, F., Trigg, L., Holmes, G., and Inglis, S. (1998). Objective measurement of mushroom quality. In Proc New Zealand Institute of Agricultural Science and the New Zealand Society for Horticultural Science Annual Convention, page 51.

    Google Scholar 

  • Li, J., Liu, H., Downing, J. R., Yeoh, A. E.-J., andWong, L. (2003). Simple rules underlying gene expression profiles of more than six subtypes of acute lymphoblastic leukemia (all) patients. Bioinformatics, 19:71–78.

    CrossRef  MATH  Google Scholar 

  • McQueen, R., Holmes, G., and Hunt, L. (1998). User satisfaction with machine learning as a data analysis method in agricultural research. New Zealand Journal of Agricultural Research, 41(4):577–584.

    CrossRef  Google Scholar 

  • Pedersen, T. (2002). Evaluating the effectiveness of ensembles of decision trees in disambiguating Senseval lexical samples. In Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions.

    Google Scholar 

  • Sauban, M. and Pfahringer, B. (2003). Text categorisation using document profiling. In Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, pages 411–422. Springer.

    Google Scholar 

  • Taylor, J., King, R. D., Altmann, T., and Fiehn, O. (2002). Application of metabolomics to plant genotype iscrimination using statistics and machine learning. Bioinformatics, 18:241S–248S.

    Google Scholar 

  • Tobler, J. B., Molla, M., Nuwaysir, E., Green, R., and Shavlik, J. (2002). Evaluating machine learning approaches for aiding probe selection for gene-expression arrays. Bioinformatics, 18:164S–171S.

    Google Scholar 

Download references


Many thanks to past and present members of the Waikato machine learning group and the many external contributors for all the work they have put into Weka.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Eibe Frank .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Frank, E. et al. (2009). Weka-A Machine Learning Workbench for Data Mining. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA.

Download citation

  • DOI:

  • Published:

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-09822-7

  • Online ISBN: 978-0-387-09823-4

  • eBook Packages: Computer ScienceComputer Science (R0)