
Determining Optimal Multi-layer Perceptron Structure Using Linear Regression

  • Mohamed Lafif Tej
  • Stefan Holban
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 353)

Abstract

This paper presents a novel method for determining the optimal Multi-layer Perceptron structure using Linear Regression. By first clustering the dataset used to train the neural network, it is possible to define Multiple Linear Regression models that determine the network's architecture. Unlike other methods, this method works unsupervised and is more flexible across different types of datasets. The proposed method adapts to the complexity of the training dataset to provide the best results regardless of the dataset's size and type. The clustering algorithm used imposes a specific analysis of the training data, such as determining the distance measure, normalization, and clustering technique suited to the type of training dataset used.
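As a rough illustration of the pipeline the abstract describes, the sketch below clusters a training set, derives simple complexity features from the clustering, and fits a linear regression that maps those features to a hidden-layer size. The abstract does not specify the paper's regression variables or targets, so the features, the calibration data, and the helper cluster_summary below are illustrative assumptions, not the authors' formulation.

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.linear_model import LinearRegression

    def cluster_summary(X, n_clusters):
        # Normalize, cluster, and summarize the data: these three numbers
        # stand in for the dataset-complexity measures the abstract alludes
        # to (an assumption, since the paper's exact features are not given).
        Xs = StandardScaler().fit_transform(X)
        labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(Xs)
        sizes = np.bincount(labels)
        return np.array([n_clusters, sizes.std(), X.shape[1]])

    # Hypothetical calibration data: datasets whose workable hidden-layer
    # sizes are already known. Real calibration pairs would come from prior
    # experiments rather than the random data used here.
    rng = np.random.default_rng(0)
    features = np.array([cluster_summary(rng.random((200, d)), k)
                         for d, k in [(4, 3), (8, 5), (16, 8)]])
    known_hidden_sizes = np.array([6, 12, 24])  # assumed, for illustration

    reg = LinearRegression().fit(features, known_hidden_sizes)

    # Estimate a hidden-layer size for a new dataset from its cluster features.
    X_new = rng.random((300, 10))
    n_hidden = int(round(reg.predict(cluster_summary(X_new, 6).reshape(1, -1))[0]))
    print(n_hidden)

The key design point, consistent with the abstract, is that the regression inputs come from an unsupervised clustering step, so the architecture estimate adapts to the structure of the training data rather than to labels.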

Keywords

Multi-layer Perceptron · Linear regression · Clustering methods · Pattern recognition · Artificial neural network


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Faculty of Automation and Computers, Politehnica University of Timisoara, Timisoara, Romania
