BHONEM: Binary High-Order Network Embedding Methods for Networked-Guarantee Loans

Abstract

Networked-guarantee loans raise concerns about systemic risk for the government and banks in China. Predicting the default of enterprise loans is a typical machine-learning classification problem, and the guarantee network makes it considerably harder to solve. A complex network is usually stored and represented as an adjacency matrix, which is high-dimensional and sparse, whereas machine-learning methods usually need low-dimensional, dense feature representations. In this paper, we therefore propose a binary high-order network embedding method to learn low-dimensional representations of a guarantee network. We first assign each vertex of this heterogeneous economic network a binary role (guarantor or guarantee), and then define high-order adjacency measures based on these roles and on economic domain knowledge. Next, we introduce a penalty parameter in the objective function to balance the importance of network structure and adjacency. We optimize the objective with negative-sampling-based gradient descent algorithms, which overcome the limitation of stochastic gradient descent on weighted edges without compromising efficiency. Finally, we test the proposed method on three real-world network datasets. The results show that it outperforms other state-of-the-art algorithms in both classification accuracy and robustness, especially on a guarantee network.
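To make the contrast between a sparse adjacency matrix and a dense low-dimensional embedding concrete, the following is a minimal sketch and not the BHONEM objective itself. It assumes a generic skip-gram-style loss with negative sampling, and it handles edge weights by sampling edges in proportion to their weight, a known technique from the LINE family that is swapped in here for illustration. The toy graph, the function names (`adjacency_matrix`, `train_embeddings`), the two role-specific embedding tables, and all hyperparameters are hypothetical.

```python
import numpy as np

def adjacency_matrix(edges, weights, n_nodes):
    """Dense adjacency matrix of a weighted, directed guarantee graph."""
    A = np.zeros((n_nodes, n_nodes))
    for (u, v), w in zip(edges, weights):
        A[u, v] = w  # u guarantees a loan of v (assumed direction)
    return A

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_embeddings(edges, weights, n_nodes, dim=4, n_neg=2,
                     steps=2000, lr=0.05, seed=0):
    """Toy negative-sampling SGD; an illustrative stand-in, not the paper's loss."""
    rng = np.random.default_rng(seed)
    # Assumption: one vector per node for each of the two roles.
    src = rng.normal(scale=0.1, size=(n_nodes, dim))  # node acting as guarantor
    dst = rng.normal(scale=0.1, size=(n_nodes, dim))  # node acting as guarantee
    probs = np.asarray(weights, dtype=float)
    probs /= probs.sum()
    for _ in range(steps):
        # Sample an edge with probability proportional to its weight, so the
        # gradient step itself never has to be scaled by the weight.
        u, v = edges[rng.choice(len(edges), p=probs)]
        # One positive target plus uniformly drawn negatives (a toy choice;
        # a negative may occasionally coincide with a true neighbor).
        targets = [(v, 1.0)]
        targets += [(int(k), 0.0) for k in rng.integers(0, n_nodes, size=n_neg)]
        grad_u = np.zeros(dim)
        for node, label in targets:
            g = sigmoid(src[u] @ dst[node]) - label  # logistic-loss gradient
            grad_u += g * dst[node]
            dst[node] -= lr * g * src[u]
        src[u] -= lr * grad_u
    return src, dst

# Hypothetical 5-enterprise guarantee network.
edges = [(0, 1), (0, 2), (1, 2), (3, 2), (3, 4)]
weights = [1.0, 2.0, 1.0, 3.0, 1.0]
A = adjacency_matrix(edges, weights, n_nodes=5)
src, dst = train_embeddings(edges, weights, n_nodes=5)
```

Here `A` has one row per enterprise and grows quadratically with the network, while `src` and `dst` keep a fixed number of columns (`dim`) regardless of network size. Concatenating a node's two role vectors would be one plausible way to build a feature vector for a downstream default classifier.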



Author information

Corresponding author

Correspondence to Li-Qing Zhang.

Electronic supplementary material

ESM 1

(PDF 88 kb)


Cite this article

Cheng, DW., Tu, Y., Ma, ZW. et al. BHONEM: Binary High-Order Network Embedding Methods for Networked-Guarantee Loans. J. Comput. Sci. Technol. 34, 657–669 (2019). https://doi.org/10.1007/s11390-019-1934-8


Keywords

  • networked-guarantee loan
  • high-order network embedding
  • representative learning
  • gradient descent