Deep Learning Architecture for High-Level Feature Generation Using Stacked Auto Encoder for Business Intelligence

Chapter
Part of the Studies in Systems, Decision and Control book series (SSDC, volume 125)

Abstract

In the modern era, the rapid development and widespread use of digital technology generate a large amount of data in the digital space. Handling such data with conventional machine learning algorithms is difficult because of its heterogeneous nature and large size. Deep learning is an advancement in machine learning research that deals with such heterogeneous, large-scale data by extracting high-level representations through a hierarchical learning process. This paper proposes a novel multi-layer feature selection method in conjunction with a Stacked Auto-Encoder (SAE) to extract high-level features, or representations, from data and eliminate the lower-level ones. The proposed approach is validated on the Farm Ads dataset, and its results are compared with those of various conventional machine learning algorithms. On the given dataset, the proposed approach outperforms the conventional algorithms.
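
To make the described pipeline concrete, the sketch below illustrates greedy layer-wise training of a stacked auto-encoder and the use of its top-layer codes as input features for a conventional classifier. This is a minimal illustration under stated assumptions, not the authors' implementation: the layer sizes (256, 64), training schedule, and the random stand-in for the Farm Ads bag-of-words matrix are all placeholders, and the paper's multi-layer feature selection step, which prunes lower-level representations, is omitted for brevity.

```python
# Minimal sketch (not the authors' code) of SAE-based high-level feature
# generation: greedy layer-wise auto-encoder pretraining, then a
# conventional classifier on the learned codes. Layer sizes, epochs,
# learning rate, and the synthetic data are illustrative assumptions.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

def train_autoencoder_layer(x, hidden_dim, epochs=100, lr=1e-3):
    """Train one auto-encoder layer on x and return its encoder half."""
    encoder = nn.Sequential(nn.Linear(x.shape[1], hidden_dim), nn.Sigmoid())
    decoder = nn.Linear(hidden_dim, x.shape[1])
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(decoder(encoder(x)), x)  # reconstruction objective
        loss.backward()
        opt.step()
    return encoder

torch.manual_seed(0)
X = torch.rand(512, 1000)        # stand-in for Farm Ads term vectors
y = torch.randint(0, 2, (512,))  # stand-in binary ad labels

# Greedy layer-wise pretraining: each layer encodes the previous codes.
codes = X
for hidden in (256, 64):         # assumed layer sizes
    enc = train_autoencoder_layer(codes, hidden)
    with torch.no_grad():
        codes = enc(codes)

# A conventional classifier consumes the top-layer (high-level) features,
# mirroring the paper's comparison against standard ML algorithms.
clf = LogisticRegression(max_iter=1000).fit(codes.numpy(), y.numpy())
print("train accuracy:", clf.score(codes.numpy(), y.numpy()))
```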

Keywords

Business data · Stacked auto-encoder · Deep learning · Feature selection

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, India
