Comparison of Machine Learning Methods for Solving the Problem of Wheat Seeds Classification by Yield Properties

  • METHODS
Russian Agricultural Sciences

Abstract

Data mining is increasingly used in agricultural production. This work presents, for the first time, the results of applying three machine learning methods (decision tree, support vector machine, and k-nearest neighbors) to the classification of wheat seeds by yield properties using bioelectric indicators of the seeds. The classifiers are evaluated by accuracy, confusion matrices, and cross-validation of training quality. In the comparison, the decision tree gave the best classification results. Its model is simple to understand and interpret, and it requires no additional data preparation: there is no need to normalize the data, add dummy variables, or delete records with missing values. On a sample with a noise component it reached a relatively high accuracy of 96%. The k-nearest neighbors method is also recommended for classifying seeds by yield properties, although it is less accurate than the decision tree: on the noisy sample its accuracy was 91%. The support vector machine is not a promising tool for this problem, although it is highly successful in other areas.
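
The evaluation steps named in the abstract (accuracy, confusion matrix, cross-validation) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It uses k-nearest neighbors only because that method is short enough to write from scratch, and it runs on synthetic two-feature data with Gaussian noise rather than the bioelectric measurements used in the study. All function names, class centers, and parameter values are assumptions made for illustration.

```python
import random
from collections import Counter


def make_noisy_samples(n_per_class=60, noise=0.5, seed=0):
    """Synthetic samples for two hypothetical yield classes (not real seed data)."""
    rng = random.Random(seed)
    centers = {0: (1.0, 1.0), 1: (3.0, 3.0)}  # assumed class centers
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append((cx + rng.gauss(0, noise), cy + rng.gauss(0, noise)))
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, point, k=5):
    """Majority vote among the k training points nearest to `point`."""
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], point)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def confusion_matrix(true_y, pred_y, n_classes=2):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_y, pred_y):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Share of samples that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def cross_validate(X, y, folds=5, k=5, seed=0):
    """Plain k-fold cross-validation; returns one confusion matrix per fold."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    matrices = []
    for f in range(folds):
        test_idx = set(indices[f::folds])  # every folds-th shuffled index
        train = [i for i in indices if i not in test_idx]
        test = [i for i in indices if i in test_idx]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        pred_y = [knn_predict(train_X, train_y, X[i], k) for i in test]
        matrices.append(confusion_matrix([y[i] for i in test], pred_y))
    return matrices


if __name__ == "__main__":
    X, y = make_noisy_samples()
    for fold, m in enumerate(cross_validate(X, y), start=1):
        print(f"fold {fold}: confusion matrix {m}, accuracy {accuracy(m):.2f}")
```

Reporting one confusion matrix per fold, rather than a single pooled accuracy, shows which class is being misassigned and how stable the result is across splits. The same harness could wrap a decision tree or a support vector machine in place of `knn_predict` to compare the three methods on equal terms.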

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by D.D. Baryshev, N.N. Barysheva, S.P. Pronin, and O.N. Nikol’skii. The first draft of the manuscript was written by N.N. Barysheva and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to D. D. Baryshev.

Ethics declarations

The authors declare that they have no conflict of interest. This article does not contain any studies involving animals or human participants performed by any of the authors.

About this article

Cite this article

Baryshev, D.D., Barysheva, N.N., Pronin, S.P. et al. Comparison of Machine Learning Methods for Solving the Problem of Wheat Seeds Classification by Yield Properties. Russ. Agricult. Sci. 46, 410–417 (2020). https://doi.org/10.3103/S1068367420040047

  • DOI: https://doi.org/10.3103/S1068367420040047
