
Learning and evaluation of latent dependency forest models

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Latent dependency forest models (LDFMs) are a new type of probabilistic model with dynamic dependency structures over random variables. They are distinguished from other probabilistic models by the fact that no structure search is needed when learning them. However, parameter learning of LDFMs remains challenging because the partition function cannot be computed tractably. In this paper, we investigate and empirically compare several algorithms for learning the parameters of LDFMs that either approximate or ignore the partition function in the learning objective. Furthermore, we propose an approximate algorithm for estimating the partition function of an LDFM. Experimental results show that (1) our learning algorithms achieve better results than the previous LDFM learning algorithm, and (2) our partition function estimation algorithm is accurate.
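The abstract centers on the intractability of the partition function during parameter learning. As a rough illustration only (this is not the estimation algorithm proposed in the paper; the toy model, the uniform proposal, and all function names below are hypothetical), the following Python sketch contrasts an exact, exponential-time computation of a partition function for a small fully factorized binary model with a generic importance-sampling approximation of the kind that a learning objective can substitute for the exact sum.

import numpy as np

# Illustrative sketch only -- NOT the estimator proposed in the paper.
# It shows why the partition function Z = sum_x p_tilde(x) is expensive
# (the exact sum is exponential in the number of binary variables) and
# how a generic importance-sampling estimate can approximate it.

def unnormalized_prob(x, weights):
    # Hypothetical unnormalized score p_tilde(x) = exp(w . x) of a joint assignment x.
    return np.exp(np.dot(weights, x))

def exact_partition_function(weights, n_vars):
    # Exact Z by brute-force enumeration of all 2^n assignments: intractable for large n.
    z = 0.0
    for idx in range(2 ** n_vars):
        x = np.array([(idx >> i) & 1 for i in range(n_vars)], dtype=float)
        z += unnormalized_prob(x, weights)
    return z

def importance_sampled_partition_function(weights, n_vars, n_samples=20000, seed=0):
    # Z = E_q[ p_tilde(x) / q(x) ] with a uniform proposal q(x) = 2^(-n_vars).
    rng = np.random.default_rng(seed)
    samples = rng.integers(0, 2, size=(n_samples, n_vars)).astype(float)
    ratios = np.array([unnormalized_prob(x, weights) for x in samples]) * (2.0 ** n_vars)
    return ratios.mean()

if __name__ == "__main__":
    n_vars = 12
    weights = np.random.default_rng(1).normal(scale=0.5, size=n_vars)
    print("exact Z:     ", exact_partition_function(weights, n_vars))
    print("estimated Z: ", importance_sampled_partition_function(weights, n_vars))

With a uniform proposal the estimate is unbiased but can have high variance, which is one reason learning algorithms may prefer to approximate the partition function more carefully or to sidestep it in the objective, as the paper discusses.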



Funding

Funding was provided by the National Natural Science Foundation of China (Grant No. 61503248).

Author information


Corresponding author

Correspondence to Yong Jiang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Jiang, Y., Zhou, Y. & Tu, K. Learning and evaluation of latent dependency forest models. Neural Comput & Applic 31, 6795–6805 (2019). https://doi.org/10.1007/s00521-018-3504-3


