Hashing for Financial Credit Risk Analysis

  • Bernardete Ribeiro
  • Ning Chen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8835)


Hashing techniques have recently become a popular way to access complex content in large data sets. Given the volume of financial data produced today, binary embeddings are efficient tools for indexing large datasets for financial credit risk analysis. The rationale is to find a hash function such that data points that are similar in Euclidean space remain similar in Hamming space, which enables fast data retrieval. In this paper, we first use a semi-supervised hashing method that takes pairwise supervised information into account when constructing the weighted adjacency graph matrix needed to learn the binarised Laplacian Eigenmap. Second, we train a generalised regression neural network (GRNN) to learn the k-bit hash code. Third, the k-bit code for the test data is efficiently found in the recall phase. The results of hashing financial data show the applicability and advantages of the approach to credit risk assessment.
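
The three steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the pairwise supervision enters the graph, so the rule used here (strengthen edges between same-label points, cut edges between different-label points) is an assumption, as are the function names, the label convention `y = -1` for unlabelled points, the parameters `n_neighbors`, `boost` and `sigma`, the median thresholding of the embedding, and the synthetic two-cluster data. The GRNN is written in its usual kernel-regression form (a Gaussian-weighted average of the training targets).

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between rows of A and rows of B."""
    d = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d, 0.0)  # clip tiny negatives caused by round-off

def semi_supervised_affinity(X, y, n_neighbors=5, boost=1.0):
    """Weighted adjacency matrix: kNN heat-kernel graph plus pairwise labels.
    Assumed convention: y[i] = -1 means point i is unlabelled."""
    D = pairwise_sq_dists(X, X)
    W = np.exp(-D / np.median(D[D > 0]))
    np.fill_diagonal(W, 0.0)
    # keep the strongest neighbours of each point, then symmetrise
    idx = np.argsort(-W, axis=1)[:, :n_neighbors]
    mask = np.zeros(W.shape, dtype=bool)
    mask[np.arange(len(X))[:, None], idx] = True
    W = np.where(mask | mask.T, W, 0.0)
    # assumed supervision rule: boost same-label pairs, cut different-label pairs
    labelled = y >= 0
    both = labelled[:, None] & labelled[None, :]
    same = both & (y[:, None] == y[None, :])
    W[same] += boost
    W[both & ~same] = 0.0
    np.fill_diagonal(W, 0.0)
    return W

def binarised_laplacian_eigenmap(W, k):
    """k-bit codes from the k smallest non-trivial Laplacian eigenvectors."""
    deg = W.sum(1) + 1e-12
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, U = np.linalg.eigh(L_sym)              # eigenvalues in ascending order
    Y = d_inv_sqrt[:, None] * U[:, 1:k + 1]   # skip the first (trivial) eigenvector
    return (Y > np.median(Y, axis=0)).astype(np.uint8)  # threshold each bit at its median

def grnn_predict_codes(X_train, codes_train, X_test, sigma=1.0):
    """GRNN recall: Gaussian-weighted average of training codes, thresholded at 0.5."""
    D = pairwise_sq_dists(X_test, X_train)
    # subtracting the row minimum leaves the weighted average unchanged
    # but avoids underflow of every weight at once
    K = np.exp(-(D - D.min(1, keepdims=True)) / (2.0 * sigma ** 2))
    out = (K @ codes_train) / K.sum(1, keepdims=True)
    return (out >= 0.5).astype(np.uint8)

def hamming_neighbours(code, codes_train, n=5):
    """Indices of the n training codes closest to `code` in Hamming distance."""
    dist = (codes_train != code).sum(1)
    return np.argsort(dist, kind="stable")[:n]

# Synthetic stand-in for financial records: two Gaussian clusters, few labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 4)), rng.normal(4.0, 1.0, (30, 4))])
y = np.full(60, -1)
y[:5] = 0
y[30:35] = 1
W = semi_supervised_affinity(X, y)
train_codes = binarised_laplacian_eigenmap(W, k=8)
X_test = rng.normal(4.0, 1.0, (3, 4))
test_codes = grnn_predict_codes(X, train_codes, X_test, sigma=1.0)
nearest = hamming_neighbours(test_codes[0], train_codes, n=5)
```

Here the graph and its eigenvectors are computed once on the training set, while the GRNN supplies the out-of-sample extension, so a test record gets its code from a single kernel-weighted average rather than a new eigendecomposition. Retrieval then reduces to comparing short binary codes.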


Keywords: hashing method, financial credit risk, generalised regression neural network, k-bits hash code





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Bernardete Ribeiro (1)
  • Ning Chen (2)
  1. CISUC - Department of Informatics Engineering, University of Coimbra, Portugal
  2. GECAD, Instituto Superior de Engenharia do Porto, Portugal
