Networked-guarantee loans raise systemic-risk concerns for the government and banks in China. Predicting the default of enterprise loans is a typical machine-learning classification problem, and the networked guarantee makes it much harder to solve. A complex network is usually stored and represented as an adjacency matrix, which is high-dimensional and sparse, whereas machine-learning methods usually need low-dimensional dense feature representations. Therefore, in this paper, we propose a binary high-order network embedding method to learn low-dimensional representations of a guarantee network. We first assign the vertices of this heterogeneous economic network binary roles (guarantor and guarantee), and then define high-order adjacency measures based on these roles and economic domain knowledge. Afterwards, we design a penalty parameter in the objective function to balance the importance of network structure and adjacency. We optimize the objective with negative-sampling-based gradient descent algorithms, which overcome the limitation of stochastic gradient descent on weighted edges without compromising efficiency. Finally, we test the proposed method on three real-world network datasets. The results show that the method outperforms other state-of-the-art algorithms in both classification accuracy and robustness, especially on a guarantee network.
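The optimization step described in the abstract can be illustrated with a minimal sketch. This is not the BHONEM implementation: it omits the high-order adjacency measures and the penalty parameter, and it assumes that the weighted-edge issue is handled by sampling edges in proportion to their weights, which is one common approach that the abstract does not confirm. All names (`train_embeddings`, `src`, `dst`, the toy edge list) are hypothetical. Each vertex gets two vectors, one per role, and a directed guarantor-to-guarantee edge is trained against a few randomly drawn negative vertices.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_embeddings(edges, weights, n_nodes, dim=8, n_neg=3,
                     lr=0.1, steps=3000, seed=0):
    """Toy negative-sampling SGD on directed (guarantor, guarantee) edges.

    Assumption: edges are drawn with probability proportional to their
    weight, so each update treats the sampled edge as unweighted.
    """
    rng = np.random.default_rng(seed)
    src = rng.normal(scale=0.1, size=(n_nodes, dim))  # vectors for the guarantor role
    dst = rng.normal(scale=0.1, size=(n_nodes, dim))  # vectors for the guarantee role
    probs = np.asarray(weights, dtype=float)
    probs = probs / probs.sum()

    for _ in range(steps):
        u, v = edges[rng.choice(len(edges), p=probs)]
        # Negative vertices are drawn uniformly; the true target is skipped.
        negatives = [int(k) for k in rng.integers(0, n_nodes, size=n_neg) if k != v]
        grad_u = np.zeros(dim)
        for node, label in [(v, 1.0)] + [(k, 0.0) for k in negatives]:
            score = sigmoid(src[u] @ dst[node])
            g = lr * (label - score)   # gradient of the log-likelihood term
            grad_u += g * dst[node]    # accumulate before dst[node] changes
            dst[node] += g * src[u]
        src[u] += grad_u
    return src, dst


# Hypothetical toy guarantee network: four weighted guarantor -> guarantee edges.
toy_edges = [(0, 1), (2, 3), (4, 5), (6, 7)]
toy_weights = [1.0, 2.0, 3.0, 4.0]
src, dst = train_embeddings(toy_edges, toy_weights, n_nodes=40)
features = np.hstack([src, dst])  # dense per-vertex features for a downstream classifier
```

Concatenating the two role vectors gives each enterprise a low-dimensional dense feature vector that a standard classifier could consume in place of a sparse adjacency row. Sampling edges by weight keeps the step size uniform across updates, instead of scaling each gradient by a potentially very large or very small edge weight.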
Cite this article
Cheng, DW., Tu, Y., Ma, ZW. et al. BHONEM: Binary High-Order Network Embedding Methods for Networked-Guarantee Loans. J. Comput. Sci. Technol. 34, 657–669 (2019). https://doi.org/10.1007/s11390-019-1934-8
Keywords
- networked-guarantee loan
- high-order network embedding
- representative learning
- gradient descent