Machine learning systems offer the key capability of learning about their operating environment from the data that they are supplied. They can learn via supervised and unsupervised training, from system results during operations, or both. However, while machine learning systems can identify solutions to problems and questions, in many cases they cannot explain how they arrived at them. Moreover, they cannot guarantee that they have not relied upon confounding variables and other non-causal relationships. In some circumstances, learned behaviors may violate legal or ethical principles, such as rules regarding non-discrimination. In these and other cases, learned associations that are true in many – but not all – cases may result in critical system failures when processing exceptions to the learned behaviors. A machine learning system that applies gradient descent to expert system networks has been proposed as a solution to these problems. The expert system foundation means that the system can only learn across valid pathways, while the machine learning capabilities facilitate optimization via training and operational learning. While the initial results of this approach are promising, cases were noted in which networks were optimized into high error states and in which further optimization continued to increase the error level. This paper proposes and evaluates multiple techniques to handle these high error networks and improve system performance in these cases.
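To make the idea of learning only across valid pathways concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a hypothetical rule form in which each rule combines two input facts into one output fact using a single weight `w_a` (with the second input weighted `1 - w_a`), a squared-error loss, and weights clipped to the range [0, 1]. The names `Rule`, `run`, and `train_step`, and the example facts, are illustrative assumptions. Training changes only the weights of existing rules; it never adds a connection between facts.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: out = w_a * in_a + (1 - w_a) * in_b."""
    in_a: str
    in_b: str
    out: str
    w_a: float


def run(rules, facts):
    """Evaluate the rules in order (assumed to be dependency order)."""
    values = dict(facts)
    for r in rules:
        values[r.out] = r.w_a * values[r.in_a] + (1.0 - r.w_a) * values[r.in_b]
    return values


def train_step(rules, facts, target_fact, target, lr=0.1):
    """One gradient-descent step on loss = 0.5 * (output - target)^2.

    Returns the loss measured before the weights are updated.
    """
    values = run(rules, facts)
    grads = {target_fact: values[target_fact] - target}
    # Walk the rules backwards so each output gradient is complete
    # before it is passed on to the rule's inputs.
    for r in reversed(rules):
        g = grads.get(r.out, 0.0)
        if g == 0.0:
            continue
        a, b = values[r.in_a], values[r.in_b]
        # Pass the gradient to the input facts using the current weight.
        grads[r.in_a] = grads.get(r.in_a, 0.0) + g * r.w_a
        grads[r.in_b] = grads.get(r.in_b, 0.0) + g * (1.0 - r.w_a)
        # d(out)/d(w_a) = a - b; keep the weight inside [0, 1].
        r.w_a = min(1.0, max(0.0, r.w_a - lr * g * (a - b)))
    return 0.5 * (values[target_fact] - target) ** 2


# Illustrative network: facts a and b feed d; facts d and c feed e.
rules = [Rule("a", "b", "d", 0.5), Rule("d", "c", "e", 0.5)]
facts = {"a": 1.0, "b": 0.0, "c": 0.5}
structure_before = [(r.in_a, r.in_b, r.out) for r in rules]

losses = [train_step(rules, facts, "e", 0.9, lr=0.5) for _ in range(200)]
structure_after = [(r.in_a, r.in_b, r.out) for r in rules]

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.6f}")
```

Because the rule list is fixed, the set of fact-to-fact connections is identical before and after training. This is the property the abstract describes: optimization adjusts how strongly existing rules contribute, but cannot introduce an association that the rule network does not already permit.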
It is planned that the data will be made publicly available through a data publication at a later date.
It is planned that the code will be made publicly available through a code publication at a later date.
Straub, J. Impact of techniques to reduce error in high error rule-based expert system gradient descent networks. J Intell Inf Syst (2021). https://doi.org/10.1007/s10844-021-00672-7
- Expert system
- Error reduction
- Machine learning
- Gradient descent