Impact of techniques to reduce error in high error rule-based expert system gradient descent networks


Machine learning systems offer the key capability to learn about their operating environment from the data that they are supplied. They can learn via supervised and unsupervised training, from system results during operations, or both. However, while machine learning systems can identify solutions to problems and questions, in many cases they cannot explain how they arrived at them. Moreover, they cannot guarantee that they have not relied upon confounding variables and other non-causal relationships. In some circumstances, learned behaviors may violate legal or ethical principles, such as rules regarding non-discrimination. In these and other cases, learned associations that hold in many, but not all, cases may result in critical system failures when the system processes exceptions to the learned behaviors. A machine learning system that applies gradient descent to expert system networks has been proposed as a solution to this. The expert system foundation means that the system can only learn across valid pathways, while the machine learning capabilities facilitate optimization via training and operational learning. While the initial results of this approach are promising, cases were noted in which networks were optimized into high error states and further optimization continued to increase the error level. This paper proposes and evaluates multiple techniques to handle these high error networks and improve system performance in these cases.
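The idea of restricting gradient descent to the valid pathways of an expert system network can be pictured with a small sketch. This is not the author's implementation: the two-input rule form, the convention that the two input weights sum to one, the `Rule` class, the `forward` and `train_step` functions, the learning rate, and the example facts are all assumptions made for illustration. The only point carried over from the abstract is that training adjusts weights on rules that already exist and never creates new connections.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: out = w_a * in_a + (1 - w_a) * in_b."""
    in_a: str     # name of the first input fact
    in_b: str     # name of the second input fact
    out: str      # name of the output fact
    w_a: float    # weight on in_a; in_b implicitly receives 1 - w_a


def forward(facts, rules):
    """Evaluate rules in order (assumed to be dependency order)."""
    values = dict(facts)
    for r in rules:
        values[r.out] = r.w_a * values[r.in_a] + (1.0 - r.w_a) * values[r.in_b]
    return values


def train_step(facts, rules, target_fact, target, lr=0.5):
    """One gradient descent step on 0.5 * (output - target)^2.

    Only the weights of existing rules change, so the set of
    pathways through the network stays fixed.
    """
    values = forward(facts, rules)
    error = values[target_fact] - target
    grad = {target_fact: error}  # gradient of the loss w.r.t. each fact value
    for r in reversed(rules):
        g_out = grad.get(r.out, 0.0)
        # d(out)/d(w_a) = in_a - in_b
        g_w = g_out * (values[r.in_a] - values[r.in_b])
        # Pass the gradient back to the input facts using the old weight.
        grad[r.in_a] = grad.get(r.in_a, 0.0) + g_out * r.w_a
        grad[r.in_b] = grad.get(r.in_b, 0.0) + g_out * (1.0 - r.w_a)
        # Keep the weight inside [0, 1] so the rule stays a weighted blend.
        r.w_a = min(1.0, max(0.0, r.w_a - lr * g_w))
    return abs(error)


# Illustrative network: (a, b) -> d, then (d, c) -> e.
facts = {"a": 1.0, "b": 0.0, "c": 0.5}
rules = [Rule("a", "b", "d", 0.5), Rule("d", "c", "e", 0.5)]
errors = [train_step(facts, rules, "e", 0.8) for _ in range(200)]
print(f"first error {errors[0]:.3f}, last error {errors[-1]:.6f}")
```

In this toy case the error falls steadily. The high error networks discussed in the abstract are those where the recorded error instead grows as training continues, which a list such as `errors` would make visible.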


Data Availability

It is planned that the data will be made publicly available through a data publication at a later date.

Code Availability

It is planned that the code will be made publicly available through a code publication at a later date.



Author information




This is a single-author manuscript. As such, the author was responsible for all areas of article development.

Corresponding author

Correspondence to Jeremy Straub.


About this article


Cite this article

Straub, J. Impact of techniques to reduce error in high error rule-based expert system gradient descent networks. J Intell Inf Syst (2021).

Keywords

  • Expert system
  • Error reduction
  • Machine learning
  • Gradient descent
  • Training
  • Backpropagation