Heterogeneous Graph Representation for Text Mining

Shi, Chuan; Wang, Xiao; Yu, Philip S.

doi:10.1007/978-981-16-6166-2_8

Part of the book series: Artificial Intelligence: Foundations, Theory, and Algorithms (AIFTA)

Abstract

Heterogeneous graph representation techniques apply to many real-world tasks. Even natural language, which is usually modeled as sequential data, can be constructed as a heterogeneous graph that captures the complex interactions among the words, entities, topics, instances, and other components of a text. This chapter summarizes applications of heterogeneous graph representation to text mining. In particular, we introduce several heterogeneous graph based text mining methods: HGAT for short text classification, and GUND and GNewsRec for news recommendation. Methods in this area have two key components: constructing a heterogeneous graph from texts, and applying a heterogeneous graph representation algorithm to the task. We illustrate heterogeneous graph modeling for text mining from these two perspectives.
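
As a rough illustration of the first component, heterogeneous graph construction from texts, the following Python sketch links each document to its topics and entities through typed nodes. The function name `build_text_graph`, the `(type, id)` node encoding, and the toy inputs are assumptions made for illustration only; the methods covered in the chapter define their own construction procedures, which are not reproduced here.

```python
from collections import defaultdict


def build_text_graph(docs, doc_topics, doc_entities):
    """Build an undirected heterogeneous graph as a typed adjacency map.

    Nodes are (type, id) tuples with type in {"doc", "topic", "entity"}.
    All names and the node encoding are hypothetical, for illustration.
    """
    adj = defaultdict(set)
    for doc_id in docs:
        doc_node = ("doc", doc_id)
        adj[doc_node]  # touch the key so isolated documents still appear
        for topic in doc_topics.get(doc_id, []):
            topic_node = ("topic", topic)
            adj[doc_node].add(topic_node)
            adj[topic_node].add(doc_node)
        for entity in doc_entities.get(doc_id, []):
            entity_node = ("entity", entity)
            adj[doc_node].add(entity_node)
            adj[entity_node].add(doc_node)
    return dict(adj)


# Toy input: two short texts that share a topic.
graph = build_text_graph(
    docs=["d1", "d2"],
    doc_topics={"d1": ["sports"], "d2": ["sports"]},
    doc_entities={"d1": ["Messi"]},
)
```

In this toy graph the two documents are connected only through the shared topic node, which is the kind of indirect link that sparse short texts would otherwise lack.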

Notes

1. https://sobigdata.d4science.org/group/tagme/.
2. https://code.google.com/archive/p/word2vec/.
3. Here we follow the original naming in [18].
4. http://disi.unitn.it/moschitti/corpora.htm.
5. https://www.nltk.org/.
6. Here, we assume each news has only one topic, i.e., |Z(d)| = 1.
7. S(d) may contain duplicates if |U(d)| < L_u. If U(d) = ∅, then S(d) = ∅.
8. If the click history sequence length is less than l, it will be padded with zero embeddings.
9. http://reclab.idi.ntnu.no/dataset/.
10. sessionStart and sessionStop determine the session boundaries.
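
Notes 7 and 8 describe two fixed-size conventions that can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the helper names `sample_fixed_size` and `pad_history` are hypothetical, drawing with replacement is only one way to obtain the duplicates mentioned in note 7, and placing the zero embeddings before the real ones (and keeping the most recent clicks when truncating) is a choice the notes do not specify.

```python
import random


def sample_fixed_size(neighbors, size, rng=None):
    """Return a fixed-size sample S(d) from the neighbor set U(d).

    Hypothetical helper: `size` plays the role of L_u in note 7.
    """
    rng = rng or random.Random(0)
    neighbors = list(neighbors)
    if not neighbors:
        return []  # U(d) empty implies S(d) empty
    if len(neighbors) >= size:
        return rng.sample(neighbors, size)  # enough neighbors: no duplicates
    # Fewer than `size` neighbors: draw with replacement, so duplicates occur.
    return [rng.choice(neighbors) for _ in range(size)]


def pad_history(history, length, dim):
    """Pad a click history (list of embedding vectors) to `length` entries.

    Hypothetical helper: `length` plays the role of l in note 8 and must be
    positive. Zero vectors are prepended (an assumption); longer histories
    keep only the last `length` entries (also an assumption).
    """
    recent = history[-length:]
    padding = [[0.0] * dim for _ in range(length - len(recent))]
    return padding + recent
```

For example, `sample_fixed_size(["u1"], 3)` repeats the single neighbor three times, and `pad_history([[1.0, 2.0]], 3, 2)` places two zero vectors in front of the one real embedding.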

References

Aggarwal, C.C., Zhai, C.: A survey of text classification algorithms. In: Mining Text Data, pp. 163–222. Springer, Berlin (2012)
An, M., Wu, F., Wu, C., Zhang, K., Liu, Z., Xie, X.: Neural news recommendation with long- and short-term user representations. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 336–345 (2019)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(Jan), 993–1022 (2003)
Cheng, H.T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., Anderson, G., Corrado, G., Chai, W., Ispir, M., et al.: Wide and deep learning for recommender systems. In: Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (DLRS@RecSys), pp. 7–10 (2016)
Das, A.S., Datar, M., Garg, A., Rajaram, S.: Google news personalization: scalable online collaborative filtering. In: Proceedings of the 16th International Conference on World Wide Web (WWW), pp. 271–280 (2007)
De Francisci Morales, G., Gionis, A., Lucchese, C.: From chatter to headlines: harnessing the real-time web for personalized news recommendation. In: Proceedings of the fifth ACM International Conference on Web Search and Data Mining (WSDM), pp. 153–162 (2012)
Gulla, J.A., Zhang, L., Liu, P., Özgöbek, Ö., Su, X.: The Adressa dataset for news recommendation. In: Proceedings of the International Conference on Web Intelligence (ICWI), pp. 1042–1048 (2017)
Guo, H., Tang, R., Ye, Y., Li, Z., He, X.: DeepFM: a factorization-machine based neural network for CTR prediction. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), pp. 1725–1731 (2017)
Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS), pp. 1024–1034 (2017)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Hu, L., Li, C., Shi, C., Yang, C., Shao, C.: Graph neural news recommendation with long-term and short-term interest modeling. Inf. Process. Manage. 57(2), 102142 (2020)
Hu, L., Xu, S., Li, C., Yang, C., Shi, C., Duan, N., Xie, X., Zhou, M.: Graph neural news recommendation with unsupervised preference disentanglement. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 4255–4264 (2020)
Huang, P.S., He, X., Gao, J., Deng, L., Acero, A., Heck, L.: Learning deep structured semantic models for web search using clickthrough data. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM), pp. 2333–2338 (2013)
IJntema, W., Goossen, F., Frasincar, F., Hogenboom, F.: Ontology-based news recommendation. In: Proceedings of the 2010 EDBT/ICDT Workshops, p. 16 (2010)
Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746–1751 (2014)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: Proceedings of the International Conference on Learning Representations (ICLR) (2017)
Li, L., Wang, D., Li, T., Knox, D., Padmanabhan, B.: Scene: a scalable two-stage personalized news recommendation system. In: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 125–134 (2011)
Linmei, H., Yang, T., Shi, C., Ji, H., Li, X.: Heterogeneous graph attention networks for semi-supervised short text classification. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 4821–4830 (2019)
Liu, P., Qiu, X., Huang, X.: Recurrent neural network for text classification with multi-task learning. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI), pp. 2873–2879 (2016)
Liu, M., Wang, X., Nie, L., Tian, Q., Chen, B., Chua, T.S.: Cross-modal moment localization in videos. In: Proceedings of the 26th ACM International Conference on Multimedia (MM), pp. 843–851 (2018)
Ma, J., Cui, P., Kuang, K., Wang, X., Zhu, W.: Disentangled graph convolutional networks. In: International Conference on Machine Learning (ICML), pp. 4212–4221 (2019)
Meng, Y., Shen, J., Zhang, C., Han, J.: Weakly-supervised neural text classification. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM), pp. 983–992 (2018)
Newman, D., Smyth, P., Welling, M., Asuncion, A.U.: Distributed inference for latent Dirichlet allocation. In: Advances in Neural Information Processing Systems (NIPS), pp. 1081–1088 (2008)
Okura, S., Tagami, Y., Ono, S., Tajima, A.: Embedding-based news recommendation for millions of users. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 1933–1942 (2017)
Pang, B., Lee, L.: Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 115–124 (2005)
Phan, X.H., Nguyen, L.M., Horiguchi, S.: Learning to classify short and sparse text and web with hidden topics from large-scale data collections. In: Proceedings of the 17th International Conference on World Wide Web (WWW), pp. 91–100 (2008)
Rendle, S.: Factorization machines with LIBFM. ACM Trans. Intell. Syst. Technol. 3(3), 57 (2012)
Shimura, K., Li, J., Fukumoto, F.: HFT-CNN: learning hierarchical category structure for multi-label short text categorization. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 811–816. Brussels, Belgium (2018)
Sinha, K., Dong, Y., Cheung, J.C.K., Ruths, D.: A hierarchical neural attention-based text classifier. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 817–823. Brussels, Belgium (2018)
Song, G., Ye, Y., Du, X., Huang, X., Bie, S.: Short text classification: A survey. J. Multimedia 9(5), 635 (2014)
Tang, J., Qu, M., Mei, Q.: PTE: Predictive text embedding through large-scale heterogeneous text networks. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 1165–1174 (2015)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: Advances in Neural Information Processing Systems (NIPS), pp. 5998–6008 (2017)
Vitale, D., Ferragina, P., Scaiella, U.: Classification of short texts by deploying topical annotations. In: European Conference on Information Retrieval (ECIR), pp. 376–387 (2012)
Wang, C., Blei, D.M.: Collaborative topic modeling for recommending scientific articles. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 448–456 (2011)
Wang, S., Manning, C.D.: Baselines and bigrams: Simple, good sentiment and topic classification. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 90–94 (2012)
Wang, X., Chen, R., Jia, Y., Zhou, B.: Short text classification using Wikipedia concept based document representation. In: Proceedings of the 2013 International Conference on Information Technology and Applications (ICITA), pp. 471–474 (2013)
Wang, J., Wang, Z., Zhang, D., Yan, J.: Combining knowledge with deep convolutional neural networks for short text classification. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), pp. 2915–2921 (2017)
Wang, X., Yu, L., Ren, K., Tao, G., Zhang, W., Yu, Y., Wang, J.: Dynamic attention deep model for article recommendation by learning human editors’ demonstration. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 2051–2059 (2017)
Wang, H., Zhang, F., Xie, X., Guo, M.: DKN: Deep knowledge-aware network for news recommendation. In: Proceedings of the 2018 World Wide Web Conference (WWW), pp. 1835–1844 (2018)
Wang, H., Zhao, M., Xie, X., Li, W., Guo, M.: Knowledge graph convolutional networks for recommender systems. In: Proceedings of the World Wide Web (WWW), pp. 3307–3313 (2019)
Wang, X., He, X., Cao, Y., Liu, M., Chua, T.S.: KGAT: Knowledge graph attention network for recommendation. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 950–958 (2019)
Wang, X., He, X., Wang, M., Feng, F., Chua, T.S.: Neural graph collaborative filtering. In: Proceedings of the 42nd international ACM SIGIR conference on Research and development in Information Retrieval (SIGIR), pp. 165–174 (2019)
Wang, X., Ji, H., Shi, C., Wang, B., Ye, Y., Cui, P., Yu, P.S.: Heterogeneous graph attention network. In: The World Wide Web Conference (WWW), pp. 2022–2032 (2019)
Wu, C., Wu, F., An, M., Huang, J., Huang, Y., Xie, X.: NPA: Neural news recommendation with personalized attention. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 2576–2584 (2019)
Xue, H.J., Dai, X., Zhang, J., Huang, S., Chen, J.: Deep matrix factorization models for recommender systems. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), pp. 3203–3209 (2017)
Yang, C., Sun, M., Yi, X., Li, W.: Stylistic Chinese poetry generation via unsupervised style disentanglement. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 3960–3969 (2018)
Yang, T., Hu, L., Shi, C., Ji, H., Li, X., Nie, L.: HGAT: Heterogeneous graph attention networks for semi-supervised short text classification. ACM Trans. Inf. Syst. 39(3), 1–29 (2021)
Yao, L., Mao, C., Luo, Y.: Graph convolutional networks for text classification. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 7370–7377 (2019)
Zeng, J., Li, J., Song, Y., Gao, C., Lyu, M.R., King, I.: Topic memory networks for short text classification. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 3120–3131 (2018)
Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. Adv. Neural Inf. Proces. Syst. 28, 649–657 (2015)
Zhu, Q., Zhou, X., Song, Z., Tan, J., Guo, L.: DAN: Deep attention neural network for news recommendation. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), vol. 33, pp. 5973–5980 (2019)

Author information

Authors and Affiliations

Beijing University of Posts and Telecommunications, School of Computer Science, Beijing, China
Chuan Shi
Beijing University of Posts and Telecommunications, School of Computer Science, Beijing, China
Xiao Wang
University of Illinois at Chicago, Department of Computer Science, Chicago, USA
Philip S. Yu

Corresponding authors

Correspondence to Chuan Shi, Xiao Wang or Philip S. Yu.

About this chapter

Cite this chapter

Shi, C., Wang, X., Yu, P.S. (2022). Heterogeneous Graph Representation for Text Mining. In: Heterogeneous Graph Representation Learning and Applications. Artificial Intelligence: Foundations, Theory, and Algorithms. Springer, Singapore. https://doi.org/10.1007/978-981-16-6166-2_8

DOI: https://doi.org/10.1007/978-981-16-6166-2_8
Published: 05 November 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6165-5
Online ISBN: 978-981-16-6166-2
eBook Packages: Computer Science, Computer Science (R0)