
Image analysis and data mining techniques for classification of morphological and color features for seeds of the wild castor oil plant (Ricinus communis L.)


This study develops a classification process for castor seeds (Ricinus communis L.) that combines a precise image analysis technique with several data mining algorithms. Castor seed oil is in high demand in the pharmaceutical sector and has recently attracted the interest of biodiesel producers. However, few studies describe the physical characteristics of Ricinus communis seeds, so any advance in this field contributes to the design of technology that may raise oil production to industrial levels. This work aims not only to improve the understanding of the physical features of castor seed varieties, but also to provide key information for developing better castor seed oil extraction machines. Additionally, a novel methodology is proposed for studying castor seed accessions gathered from several geographical locations. In particular, an automatic and accurate image analysis technique was implemented to extract color and morphological features from the seeds. The data set was built from fifty samples per accession. Several classification experiments were then carried out with well-known data mining algorithms in order to group all samples. Experimental results showed that the studied seeds can be grouped into ten classes of similar seeds with high accuracy (greater than 95 %). Image analysis and data mining techniques proved to be efficient tools for seed classification, and the color and morphological data gathered are useful for the design of oil extraction equipment. In fact, the rate of correctly classified instances reached 100 %, with a computation time of 0.01 seconds.
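The two stages described in the abstract, feature extraction followed by classification, can be illustrated with a minimal sketch. It assumes each seed is already segmented into a binary mask with a matching RGB image. The abstract does not name the features or algorithms used, so the feature set (area, aspect ratio, mean color) and the nearest-centroid classifier below are hypothetical stand-ins, not the authors' method, and the toy data are invented for illustration.

```python
from math import dist
from statistics import mean


def seed_features(mask, rgb):
    """Return [area, aspect_ratio, mean_r, mean_g, mean_b] for one seed.

    mask: 2D list of 0/1 values (1 = seed pixel).
    rgb:  2D list of (r, g, b) tuples with the same shape as mask.
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no seed pixels")
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1   # bounding-box height in pixels
    width = max(cols) - min(cols) + 1    # bounding-box width in pixels
    colors = [rgb[r][c] for r, c in pixels]
    mean_rgb = [float(mean(channel)) for channel in zip(*colors)]
    aspect = max(height, width) / min(height, width)
    return [float(len(pixels)), aspect, *mean_rgb]


def fit_centroids(samples, labels):
    """Average the feature vectors of each class (hypothetical stand-in classifier)."""
    grouped = {}
    for x, y in zip(samples, labels):
        grouped.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*xs)] for y, xs in grouped.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


def toy_seed(height, width, color, size=8):
    """Build an invented rectangular 'seed' on a white background."""
    mask = [[1 if r < height and c < width else 0 for c in range(size)]
            for r in range(size)]
    rgb = [[color if mask[r][c] else (255, 255, 255) for c in range(size)]
           for r in range(size)]
    return mask, rgb


# Invented training data: accession "A" is dark and elongated, "B" light and round.
train = [
    (toy_seed(6, 3, (90, 40, 30)), "A"),
    (toy_seed(6, 4, (100, 50, 40)), "A"),
    (toy_seed(4, 4, (180, 150, 120)), "B"),
    (toy_seed(5, 5, (170, 140, 110)), "B"),
]
features = [seed_features(mask, rgb) for (mask, rgb), _ in train]
labels = [label for _, label in train]
centroids = fit_centroids(features, labels)

query = seed_features(*toy_seed(6, 3, (95, 45, 35)))
print(predict(centroids, query))
```

In this sketch the raw features are on different scales, so area and color dominate the distance. A real pipeline would likely normalize the features and evaluate several classifiers with cross-validation before reporting an accuracy figure.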






The authors thank the Mexican Council of Science and Technology (CONACYT, Mexico) for many years of support, and the Laboratory of Applied Technological Systems of the Telematic Engineering Department, Polytechnic University of Queretaro. In particular, J. D. Mosquera-Artamonov and J. F. Vasco-Leal thank CONACYT for their respective PhD scholarships.

Author information



Corresponding author

Correspondence to Cesar Isaza.

Additional information

This research was partially supported by a grant from the Secretaria de Educacion Publica (Nuevos PTC F-PROMEP-38/Rev-03–SEP-23-005) and by the Mexican Council of Science and Technology (CONACYT, Mexico).


About this article


Cite this article

Isaza, C., Anaya, K., de Paz, J.Z. et al. Image analysis and data mining techniques for classification of morphological and color features for seeds of the wild castor oil plant (Ricinus communis L.). Multimed Tools Appl 77, 2593–2610 (2018).



  • Ricinus communis
  • Seed characterization
  • Image analysis
  • Classification