Block-Diagonal Approach to Non-Negative Factorization of Sparse Linguistic Matrices and Tensors of Extra-Large Dimension Using the Latent Dirichlet Distribution

Anisimov, A. V.; Marchenko, O. O.; Nasirov, E. M.

doi:10.1007/s10559-018-0087-z

CYBERNETICS
Published: 23 November 2018

Volume 54, pages 853–859, (2018)
Cybernetics and Systems Analysis

Abstract

This paper describes algorithms for non-negative factorization of sparse matrices and tensors, a technique widely used in artificial intelligence in general and in computational linguistics in particular. It proposes using the latent Dirichlet distribution to reduce matrices and tensors to block-diagonal form, which makes it possible to parallelize computations and accelerate non-negative factorization of linguistic matrices and tensors of extremely large dimension. The proposed model also allows existing models to be supplemented with new data without repeating the non-negative factorization of the entire very large tensor from scratch.
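
The block-diagonal idea can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which is not visible in this preview. It assumes that every row and column of the matrix has already been given a topic label (for instance, the most probable topic under a fitted latent Dirichlet model), and it uses the well-known multiplicative update rules for non-negative matrix factorization as a stand-in factorization step. The names `nmf`, `blockwise_nmf`, `row_topic`, and `col_topic` are hypothetical, and entries lying outside the diagonal blocks are simply ignored here.

```python
import random

def matmul(a, b):
    """Plain dense matrix product on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Multiplicative updates that reduce the squared error ||V - WH||^2."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def blockwise_nmf(v, row_topic, col_topic, rank):
    """Factorize each diagonal block (rows and columns sharing a topic) on its own.

    Assumption: row_topic and col_topic come from an external topic model.
    """
    factors = {}
    for t in sorted(set(row_topic) & set(col_topic)):
        rows = [i for i, rt in enumerate(row_topic) if rt == t]
        cols = [j for j, ct in enumerate(col_topic) if ct == t]
        block = [[v[i][j] for j in cols] for i in rows]
        factors[t] = (rows, cols, *nmf(block, rank))
    return factors

# Toy matrix whose two topic blocks are interleaved rather than contiguous.
v = [[2, 0, 1, 0],
     [0, 3, 0, 1],
     [4, 0, 2, 0],
     [0, 6, 0, 2]]
row_topic = [0, 1, 0, 1]
col_topic = [0, 1, 0, 1]
factors = blockwise_nmf(v, row_topic, col_topic, rank=1)
```

Because each block is factorized without reference to the others, the calls to `nmf` could be dispatched to separate workers, and a block that receives new data could be refactorized without touching the rest. Whether the paper handles off-block entries or new data in this way is not stated in the abstract.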

Author information

Authors and Affiliations

Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
A. V. Anisimov, O. O. Marchenko & E. M. Nasirov

Corresponding author

Correspondence to A. V. Anisimov.

Additional information

Translated from Kibernetika i Sistemnyi Analiz, No. 6, November–December, 2018, pp. 3–10.

About this article

Cite this article

Anisimov, A.V., Marchenko, O.O. & Nasirov, E.M. Block-Diagonal Approach to Non-Negative Factorization of Sparse Linguistic Matrices and Tensors of Extra-Large Dimension Using the Latent Dirichlet Distribution. Cybern Syst Anal 54, 853–859 (2018). https://doi.org/10.1007/s10559-018-0087-z

Received: 19 July 2018
Issue Date: November 2018