Abstract
This paper presents a framework for estimating the causal relationships among eighteen features (related to product type and classification) in an agronomy study using Bayesian networks, a type of probabilistic graphical model. With this class of models, we aimed to classify and identify complaints arising from corn seed commercialization. Simulation studies were used to compare the two adopted algorithms, K2 and PC, and their hybrid version. These studies indicate excellent classification performance when the network structure is known. From the estimated directed acyclic graph, three features (Brand, Germination percentage, and Amount of commercialized bags) emerged as factors impacting complaints about corn seed commercialization.
Acknowledgements
The authors are grateful for the funding partially provided by the Brazilian agencies CNPq, FAPESP and CAPES. Francisco Louzada and Pedro L. Ramos acknowledge support from the São Paulo State Research Foundation (FAPESP Processes 2013/07375-0 and 2017/25971-0, respectively). The authors are also thankful for the collaboration of Camila Sgarioni Ozelame, Herlisson Maciel Bezerra, Cláudio Luiz, Selegatto Filho, Leonardo Alves Miguel, Lucas Santos Nunez, Mariane Romildo dos Santos, Matheus de Oliveira Souza, and Paulo Lombardi.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Ianishi, P., Gonzatto Junior, O.A., Henriques, M.J. et al. Probability on Graphical Structure: A Knowledge-Based Agricultural Case. Ann. Data. Sci. 9, 327–345 (2022). https://doi.org/10.1007/s40745-020-00311-y
DOI: https://doi.org/10.1007/s40745-020-00311-y