
Probability on Graphical Structure: A Knowledge-Based Agricultural Case

Abstract

This paper presents a framework for estimating the causal relationships among eighteen features (related to product type and classification) in an agronomy study using Bayesian networks, a class of probabilistic graphical models. With these models, we aimed to classify and identify complaints arising from corn seed commercialization. Simulation studies were used to compare the two adopted structure-learning algorithms, K2 and PC, and their hybrid version. These studies indicate excellent classification performance when the network structure is known. In the estimated directed acyclic graph, three features (Brand, Germination percentage, and Amount of commercialized bags) emerged as factors influencing complaints about corn seed commercialization.
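
To make the classification idea concrete, the following is a minimal sketch of how a discrete Bayesian network with a known structure can classify a complaint by enumerating its joint distribution. It is not the authors' model: the three-node structure (Brand -> Germination, with both pointing to Complaint), the category labels, and every probability value are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy network: Brand -> Germination, and both -> Complaint.
# All probabilities below are invented for illustration, not taken from the paper.
P_BRAND = {"A": 0.6, "B": 0.4}
P_GERM_GIVEN_BRAND = {
    "A": {"high": 0.9, "low": 0.1},
    "B": {"high": 0.7, "low": 0.3},
}
P_COMPLAINT_GIVEN = {
    ("A", "high"): {"yes": 0.05, "no": 0.95},
    ("A", "low"): {"yes": 0.40, "no": 0.60},
    ("B", "high"): {"yes": 0.10, "no": 0.90},
    ("B", "low"): {"yes": 0.60, "no": 0.40},
}


def joint(brand, germ, complaint):
    """Joint probability from the DAG factorization: product of P(node | parents)."""
    return (
        P_BRAND[brand]
        * P_GERM_GIVEN_BRAND[brand][germ]
        * P_COMPLAINT_GIVEN[(brand, germ)][complaint]
    )


def posterior_complaint(evidence):
    """P(Complaint | evidence) by enumeration; evidence may fix 'brand' and/or 'germ'."""
    scores = {"yes": 0.0, "no": 0.0}
    for brand, germ, complaint in product(P_BRAND, ("high", "low"), ("yes", "no")):
        # Skip configurations that contradict the observed evidence.
        if evidence.get("brand", brand) != brand:
            continue
        if evidence.get("germ", germ) != germ:
            continue
        scores[complaint] += joint(brand, germ, complaint)
    total = sum(scores.values())
    return {label: value / total for label, value in scores.items()}


def classify(evidence):
    """Return the complaint label with the highest posterior probability."""
    post = posterior_complaint(evidence)
    return max(post, key=post.get)


if __name__ == "__main__":
    print(posterior_complaint({"germ": "low"}))
    print(classify({"brand": "A", "germ": "high"}))
```

Unobserved features are summed out, so the same function handles partial evidence. Enumeration is only practical for a handful of variables; a network with eighteen features would normally rely on a dedicated inference library.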

Acknowledgements

The authors are grateful for the funding partially provided by the Brazilian agencies CNPq, FAPESP and CAPES. Francisco Louzada and Pedro L. Ramos acknowledge support from the São Paulo State Research Foundation (FAPESP Processes 2013/07375-0 and 2017/25971-0, respectively). The authors are also thankful for the collaboration of Camila Sgarioni Ozelame, Herlisson Maciel Bezerra, Cláudio Luiz, Selegatto Filho, Leonardo Alves Miguel, Lucas Santos Nunez, Mariane Romildo dos Santos, Matheus de Oliveira Souza, and Paulo Lombardi.

Author information

Corresponding author

Correspondence to Diego Carvalho do Nascimento.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ianishi, P., Gonzatto Junior, O.A., Henriques, M.J. et al. Probability on Graphical Structure: A Knowledge-Based Agricultural Case. Ann. Data. Sci. 9, 327–345 (2022). https://doi.org/10.1007/s40745-020-00311-y
