Abstract
The attention mechanism is widely used in graph neural networks (GNNs) to improve performance. However, we argue that it breaks the prerequisite for a GNN model to attain maximum expressive power in distinguishing different graph structures. This paper theoretically analyzes the expressive power of attention-based GNN models on graphs with both node and edge features. Based on this analysis, we propose an enhanced graph attention network (EGAT) framework to address the problem. We add a degree-related scale term to the attention coefficients and adjust the message extraction function to enhance the expressive power, which is critical in the graph classification task. Furthermore, we introduce a virtual node connected to all nodes to augment the node representation update process with global information. To demonstrate the effectiveness of our EGAT framework, we first construct synthetic datasets to validate our theoretical proposal, and then apply EGAT to two Open Graph Benchmark (OGB) graph classification tasks to show empirically that our model also performs well in real applications.
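One common way to read the expressiveness claim is that softmax-normalized attention behaves like a weighted mean, so it may fail to separate neighborhoods that contain the same features in the same proportions but differ in size. The toy sketch below illustrates that reading with scalar features. The scoring function, the helper names, and the choice to scale by the plain neighbor count are all assumptions made for illustration; the abstract does not give the exact form of EGAT's degree-related scale term, so this is not the authors' formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(neighbor_feats, score_fn, degree_scale=False):
    """Softmax-attention weighted sum of scalar neighbor features.

    With degree_scale=True the result is multiplied by the neighbor count,
    a hypothetical stand-in for a degree-related scale term.
    """
    scores = [score_fn(h) for h in neighbor_feats]
    alphas = softmax(scores)
    out = sum(a * h for a, h in zip(alphas, neighbor_feats))
    if degree_scale:
        out *= len(neighbor_feats)
    return out


def score(h):
    # Toy attention score (assumption); any function of the feature works here.
    return 0.5 * h


# Two neighborhoods with the same feature proportions but different degree.
small = [1.0, 2.0]
large = [1.0, 2.0, 1.0, 2.0]

plain_small = attention_aggregate(small, score)
plain_large = attention_aggregate(large, score)
scaled_small = attention_aggregate(small, score, degree_scale=True)
scaled_large = attention_aggregate(large, score, degree_scale=True)

# Normalized attention cannot tell the two neighborhoods apart ...
print(math.isclose(plain_small, plain_large))
# ... while the degree-scaled variant can.
print(math.isclose(scaled_small, scaled_large))
```

In this sketch the unscaled aggregation returns the same value for both neighborhoods, while multiplying by the neighbor count doubles the output for the larger one and so separates them.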
Acknowledgement
The work is partly supported by the Delta Research Program.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tao, Y., Li, Y., Wu, Z. (2022). Revisiting Attention-Based Graph Neural Networks for Graph Classification. In: Rudolph, G., Kononova, A.V., Aguirre, H., Kerschke, P., Ochoa, G., Tušar, T. (eds) Parallel Problem Solving from Nature – PPSN XVII. PPSN 2022. Lecture Notes in Computer Science, vol 13398. Springer, Cham. https://doi.org/10.1007/978-3-031-14714-2_31
DOI: https://doi.org/10.1007/978-3-031-14714-2_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14713-5
Online ISBN: 978-3-031-14714-2
eBook Packages: Computer Science (R0)