
Integrating graph embedding and neural models for improving transition-based dependency parsing

  • Original Article
  • Neural Computing and Applications

Abstract

This paper introduces an effective method for improving dependency parsing based on a graph embedding model. The model helps extract local and global connectivity patterns between tokens, which allows neural network models to perform better on dependency parsing benchmarks. We propose to incorporate node embeddings trained by a graph embedding algorithm into a bidirectional recurrent neural network scheme. The new model outperforms a baseline that uses a state-of-the-art method on three dependency treebanks for both low-resource and high-resource natural languages, namely Indonesian, Vietnamese and English. We also show that the popular pretraining technique of BERT does not pick up the same kind of signal as graph embeddings. The new parser, together with all trained models, is made available under an open-source license, facilitating community engagement and the advancement of natural language processing research for two low-resource languages with around 300 million users worldwide in total.


Data availability

The datasets generated and/or analyzed during the current study are available in the Universal Dependencies repository: https://universaldependencies.org.

Notes

  1. https://github.com/phuonglh/jvl/, under the VLP/aep module.

  2. https://universaldependencies.org/.

  3. https://spacy.io/api/dependencyparser.

  4. In practice, the dimensionality usually ranges from a dozen to one thousand.

  5. https://www.internetworldstats.com/stats3.htm.

  6. Vietnamese Language and Speech Processing, http://vlsp.org.vn/.

  7. https://github.com/UniversalDependencies/UD_Vietnamese-VTB/.

  8. The shape dictionary of a word includes a dozen different word forms, such as number, date, allcaps, url, ...

  9. Kiperwasser and Goldberg [18] select the top three tokens on the stack and the first token on the buffer. They use the arc-hybrid system, whereas our work uses the arc-eager system; the sketch following these notes illustrates how such configuration features are embedded.

  10. The GSD treebank is about five times larger than the PUD or CSUI treebanks.

  11. All models are implemented in the Julia programming language using the Flux library (https://fluxml.ai).

  12. Recall that in the SOF variant, the 20 embedding vectors of the individual features are concatenated, resulting in an embedding dimension of \(20e\); see the sketch following these notes.

  13. Kiperwasser and Goldberg [18] evaluated their model on the English Penn Treebank corpus.

  14. This small dimension makes sense given that there are only 12 possible word shapes.

  15. These embedding dimensions have been tuned by Kiperwasser and Goldberg [18].

  16. We use the Julia package HypothesisTests to perform the statistical tests.

  17. More precisely, we use the model bert-uncased_L-12_H-768_A-12 which is publicly available.

References

  1. Alves M (1999) What’s so Chinese about Vietnamese? In: Proceedings of the Ninth Annual Meeting of the Southeast Asian Linguistics Society, pp 221–224, University of California, Berkeley, USA

  2. Baroni M, Lenci A (2010) Distributional memory: a general framework for corpus-based semantics. Comput Linguist 36(4):673–721

  3. Björkelund A, Falenska A, Yu X, Kuhn J (2017) IMS at the CoNLL 2017 UD shared task: CRFs and perceptrons meet neural networks. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp 40–51, Vancouver, Canada. Association for Computational Linguistics

  4. Bordes A, Usunier N, Garcia-Duran A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in Neural Information Processing Systems, vol 26. Curran Associates Inc, pp 1–9

  5. Buchholz S and Marsi E (2006) CoNLL-X shared task on multilingual dependency parsing. In: Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), pp 149–164, New York City. Association for Computational Linguistics

  6. Cavallari S, Cambria E, Cai H, Chang K, Zheng V (2019) Embedding both finite and infinite communities on graph. IEEE Comput Intell Mag 14(3):39–50

  7. Chen D, Manning C (2014) A fast and accurate dependency parser using neural networks. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp 740–750, Doha, Qatar. Association for Computational Linguistics

  8. Dang HV, Le-Hong P (2021) A combined syntactic-semantic embedding model based on lexicalized tree-adjoining grammar. Comput Speech Lang 68

  9. Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL, pp 1–16, Minnesota, USA

  10. Dozat T, Qi P, and Manning CD (2017) Stanford’s graph-based neural dependency parser at the CoNLL 2017 shared task. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp 20–30, Vancouver, Canada. Association for Computational Linguistics

  11. Dyer C, Ballesteros M, Ling W, Matthews A, and Smith NA (2015) Transition-based dependency parsing with stack long short-term memory. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp 334–343, Beijing, China. Association for Computational Linguistics

  12. Fernandez Astudillo R, Ballesteros M, Naseem T, Blodgett A, and Florian R (2020) Transition-based parsing with stack-transformers. In Findings of the Association for Computational Linguistics: EMNLP 2020, pp 1001–1007, Online. Association for Computational Linguistics

  13. Green N, Larasati SD, and Zabokrtsky Z (2012) Indonesian dependency treebank: annotation and parsing. In: Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation, pp 137–145, Bali, Indonesia. Faculty of Computer Science, Universitas Indonesia

  14. Harris ZS (1954) Distributional structure. Word 10(2–3):146–162

  15. Ji S, Pan S, Cambria E, Marttinen P, Yu PS (2022) A survey on knowledge graphs: representation, acquisition and applications. IEEE Trans Neural Netw Learn Syst 33(10):494–514

  16. Kingma DP and Ba J (2015) Adam: a method for stochastic optimization. In: Bengio Y and LeCun Y, eds, Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, pp 1–15, San Diego, CA, USA

  17. Kiperwasser E, Goldberg Y (2016) Easy-first dependency parsing with hierarchical tree LSTMs. Trans Assoc Comput Linguist 4:445–461

  18. Kiperwasser E, Goldberg Y (2016) Simple and accurate dependency parsing using bidirectional LSTM feature representations. Trans Assoc Comput Linguist 4:313–327

  19. Kolen JF, Kremer SC (2001) Gradient flow in recurrent nets: the difficulty of learning long-term dependencies, pp 237–243. IEEE

  20. Kondratyuk D, Straka M (2019) 75 languages, 1 model: parsing Universal Dependencies universally. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp 2779–2795, Hong Kong, China. Association for Computational Linguistics

  21. Kübler S, McDonald R, and Nivre J (2009) Dependency parsing. Morgan & Claypool Publishers

  22. Le P and Zuidema W (2014) The inside-outside recursive neural network model for dependency parsing. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp 729–739, Doha, Qatar. Association for Computational Linguistics

  23. Le-Hong P, Nguyen TMH, and Azim R (2012) Vietnamese parsing with an automatically extracted tree-adjoining grammar. In: Proceedings of the IEEE RIVF, pp 91–96, HCMC, Vietnam

  24. Le-Hong P, Roussanaly A, Nguyen T-M-H (2015) A syntactic component for Vietnamese language processing. J Lang Modell 3(1):145–184

  25. Lei T, Xin Y, Zhang Y, Barzilay R, and Jaakkola T (2014) Low-rank tensors for scoring dependency structures. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp 1381–1391, Baltimore, Maryland. Association for Computational Linguistics

  26. Levy O and Goldberg Y (2014) Dependency-based word embeddings. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp 302–308, Baltimore, Maryland. Association for Computational Linguistics

  27. Ling W, Tsvetkov Y, Amir S, Fermandez R, Dyer C, Black AW, Trancoso I, and Lin C-C (2015) Not all contexts are created equal: better word representations with variable attention. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp 1367–1372, Lisbon, Portugal. Association for Computational Linguistics

  28. Liu J and Zhang Y (2017) Encoder-decoder shift-reduce syntactic parsing. In: Proceedings of the 15th International Conference on Parsing Technologies, pp 105–114, Pisa, Italy. Association for Computational Linguistics

  29. McDonald R, Nivre J (2011) Analyzing and integrating dependency parsers. Comput Linguist 37(1):197–230

  30. McDonald R and Pereira F (2006) Online learning of approximate dependency parsing algorithms. In: Proceedings of EACL, pp 81–88, Trento, Italy

  31. McDonald R, Pereira F, Ribarov K, and Hajic J (2005) Non-projective dependency parsing using spanning tree algorithms. In: Proceedings of HLT-EMNLP, pp 522–530, Vancouver, Canada

  32. Nguyen TL, Ha ML, Nguyen VH, Nguyen TMH, and Le-Hong P (2013) Building a treebank for Vietnamese dependency parsing. In The 10th IEEE RIVF, pp 147–151, Hanoi, Vietnam. IEEE

  33. Nivre J (2003) An efficient algorithm for projective dependency parsing. In: Proceedings of the Eighth International Conference on Parsing Technologies, pp 149–160, Nancy, France

  34. Nivre J, de Marneffe M-C, Ginter F, Goldberg Y, Hajič J, Manning CD, McDonald R, Petrov S, Pyysalo S, Silveira N, Tsarfaty R, and Zeman D (2016) Universal Dependencies v1: a multilingual treebank collection. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pp 1659–1666, Portorož, Slovenia. European Language Resources Association (ELRA)

  35. Nivre J, Hall J, Kübler S, McDonald R, Nilsson J, Riedel S, and Yuret D (2007) The CoNLL 2007 shared task on dependency parsing. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp 915–932, Prague, Czech Republic. Association for Computational Linguistics

  36. Nivre J and McDonald R (2008) Integrating graph-based and transition-based dependency parsers. In: Proceedings of ACL-08, pp 950–958, Columbus, Ohio, USA. ACL

  37. Nivre J and Scholz M (2004) Deterministic dependency parsing of English text. In: Proceedings of COLING 2004, pp 1–7, Geneva, Switzerland

  38. Nivre J et al (2018) Universal Dependencies 2.2. LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University

  39. Lewis MP, Simons GF, Fennig CD (eds) (2014) Ethnologue: Languages of the World, 17th edn. SIL International, Dallas, Texas, USA

  40. Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, and Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of NAACL, pp 1–15, Louisiana, USA

  41. Sneddon JN (2004) The Indonesian language: its history and role in modern society. UNSW Press

  42. Turian J, Ratinov L, and Bengio Y (2010) Word representations: a simple and general method for semi-supervised learning. In: Proceedings of ACL, pp 384–394, Uppsala, Sweden

  43. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in Neural Information Processing Systems, vol 30. Curran Associates Inc

  44. Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, and Bengio Y (2018) Graph attention networks. In: Proceedings of the Sixth International Conference on Learning Representations (ICLR), pp 1–12, Vancouver, Canada

  45. Wilie B, Vincentio K, Winata GI, Cahyawijaya S, Li X, Lim ZY, Soleman S, Mahendra R, Fung P, Bahar S, and Purwarianti A (2020) IndoNLU: benchmark and resources for evaluating Indonesian natural language understanding. In: Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing. Association for Computational Linguistics

  46. Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov R, and Le QV (2019) XLNet: generalized autoregressive pretraining for language understanding. In: Proceedings of NeurIPS, pp 5754–5764

  47. Zeman D, Hajič J, Popel M, Potthast M, Straka M, Ginter F, Nivre, J, and Petrov S (2018) CoNLL 2018 shared task: multilingual parsing from raw text to Universal Dependencies. In: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp 1–21, Brussels, Belgium. Association for Computational Linguistics

  48. Zeman D, Popel M, Straka M, Hajič J, Nivre J, Ginter F, Luotolahti J, Pyysalo S, Petrov S, Potthast M, Tyers F, Badmaeva E, Gokirmak M, Nedoluzhko A, Cinková S, Hajič jr J, Hlaváčová J, Kettnerová V, Urešová Z, Kanerva J, Ojala S, Missilä A, Manning CD, Schuster S, Reddy S, Taji D, Habash N, Leung H, de Marneffe M-C, Sanguinetti M, Simi M, Kanayama H, de Paiva V, Droganova K, Martínez Alonso H, Çöltekin Ç, Sulubacak U, Uszkoreit H, Macketanz V, Burchardt A, Harris K, Marheinecke K, Rehm G, Kayadelen T, Attia M, Elkahky A, Yu Z, Pitler E, Lertpradit S, Mandl M, Kirchner J, Alcalde HF, Strnadová J, Banerjee E, Manurung R, Stella A, Shimada A, Kwak S, Mendonça G, Lando T, Nitisaroj R, Li J (2017) CoNLL 2017 shared task: multilingual parsing from raw text to Universal Dependencies. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp 1–19, Vancouver, Canada. Association for Computational Linguistics

  49. Zhang Y and Nivre J (2011) Transition-based dependency parsing with rich non-local features. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp 188–193, Portland, Oregon, USA. Association for Computational Linguistics

  50. Zhang Z, Liu S, Li M, Zhou M, and Chen E (2017) Stack-based multi-layer attention for transition-based dependency parsing. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp 1677–1682, Copenhagen, Denmark. Association for Computational Linguistics

  51. Zhu C, Qiu X, Chen X, and Huang X (2015) A re-ranking model for dependency parser with recursive convolutional neural network. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp 1159–1168, Beijing, China. Association for Computational Linguistics

Acknowledgements

This study was supported by the Vingroup Innovation Foundation (VINIF) under project code VINIF.2020.DA14.

Author information

Corresponding author

Correspondence to Phuong Le-Hong.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed consent

Informed consent was not required as no humans or animals were involved.

Human and animal rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Le-Hong, P., Cambria, E. Integrating graph embedding and neural models for improving transition-based dependency parsing. Neural Comput & Applic 36, 2999–3016 (2024). https://doi.org/10.1007/s00521-023-09223-3
