
A Matrix Factorization Framework for Jointly Analyzing Multiple Nonnegative Data Sources

Chapter in: Data Mining for Service

Part of the book series: Studies in Big Data (SBD, volume 3)

Abstract

Nonnegative matrix factorization (NMF) based methods provide one of the simplest and most effective approaches to text mining. However, their applicability is mainly limited to analyzing a single data source. In this chapter, we propose a novel joint matrix factorization framework which can jointly analyze multiple data sources by exploiting their shared and individual structures. The proposed framework is flexible enough to handle arbitrary sharing configurations encountered in real-world data. We derive an efficient algorithm for learning the factorization and show that its convergence is theoretically guaranteed. We demonstrate the utility and effectiveness of the proposed framework in two real-world applications: improving social media retrieval using auxiliary sources, and cross-social-media retrieval. Representing each social media source by its textual tags, we show for both applications that retrieval performance exceeds that of existing state-of-the-art techniques. The proposed solution provides a generic framework and is applicable in a wider data mining context wherever one needs to exploit the mutual and individual knowledge present across multiple data sources.


Notes

  1. http://www.blogger.com/

  2. http://www.flickr.com/services/api/

  3. http://code.google.com/apis/youtube/overview.html

  4. Fixed at 0.05 for selecting the tags with more than 5 % weight in a topic.

References

  1. Ando, R., Zhang, T.: A framework for learning predictive structures from multiple tasks and unlabeled data. J. Mach. Learn. Res. 6, 1817–1853 (2005)


  2. Baeza-Yates, R., Ribeiro-Neto, B., et al.: Modern Information Retrieval. Addison-Wesley, Reading (1999)


  3. Berry, M., Browne, M.: Email surveillance using non-negative matrix factorization. Comput. Math. Org. Theor. 11(3), 249–264 (2005)


  4. Cilibrasi, R., Vitanyi, P.: The Google similarity distance. IEEE Trans. Knowl. Data Eng. 19(3), 370–383 (2007)


  5. Golder, S., Huberman, B.: Usage patterns of collaborative tagging systems. J. Inf. Sci. 32(2), 198 (2006)


  6. Gu, Q., Zhou, J.: Learning the shared subspace for multi-task clustering and transductive transfer classification. In: 9th IEEE International Conference on Data Mining (ICDM'09), pp. 159–168. IEEE (2009)


  7. Gupta, S., Phung, D., Adams, B., Tran, T., Venkatesh, S.: Nonnegative shared subspace learning and its application to social media retrieval. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1169–1178. ACM (2010)


  8. Gupta, S., Phung, D., Adams, B., Venkatesh, S.: Regularized nonnegative shared subspace learning. Data Min. Knowl. Disc. 26(1), 57–97 (2011)


  9. Ji, S., Tang, L., Yu, S., Ye, J.: A shared-subspace learning framework for multi-label classification. ACM Trans. Knowl. Disc. Data 4(2), 1–29 (2010)


  10. Kankanhalli, M., Rui, Y.: Application potential of multimedia information retrieval. Proc. IEEE 96(4), 712–720 (2008)


  11. Lee, D., Seung, H.: Algorithms for non-negative matrix factorization. Adv. Neural Inf. Process. Syst. 13, 556–562 (2001)


  12. Lin, C.: Projected gradient methods for nonnegative matrix factorization. Neural Comput. 19(10), 2756–2779 (2007)


  13. Lin, Y., Sundaram, H., De Choudhury, M., Kelliher, A.: Temporal patterns in social media streams: theme discovery and evolution using joint analysis of content and context. In: IEEE International Conference on Multimedia and Expo, 2009: ICME 2009, pp. 1456–1459 (2009)


  14. Mardia, K.V., Bibby, J.M., Kent, J.T.: Multivariate Analysis. Academic Press, New York (1979)


  15. Marlow, C., Naaman, M., Boyd, D., Davis, M.: HT06, tagging paper, taxonomy, flickr, academic article, to read. In: Proceedings Hypertext’06, pp. 31–40 (2006)


  16. Shahnaz, F., Berry, M., Pauca, V., Plemmons, R.: Document clustering using nonnegative matrix factorization. Inf. Process. Manage. 42(2), 373–386 (2006)


  17. Si, S., Tao, D., Geng, B.: Bregman divergence based regularization for transfer subspace learning. IEEE Trans. Knowl. Data Eng. 22(7), 929–942 (2009)


  18. Sigurbjörnsson, B., Van Zwol, R.: Flickr tag recommendation based on collective knowledge. In: Proceeding of the 17th International Conference on World Wide Web, pp. 327–336. ACM, New York (2008)


  19. Xu, W., Liu, X., Gong, Y.: Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 267–273 (2003)


  20. Yan, R., Tesic, J., Smith, J.: Model-shared subspace boosting for multi-label classification. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 834–843. ACM (2007)


  21. Yang, Y., Xu, D., Nie, F., Luo, J., Zhuang, Y.: Ranking with local regression and global alignment for cross media retrieval. In: Proceedings of the 17th ACM International Conference on Multimedia, pp. 175–184. ACM (2009)


  22. Yi, Y., Zhuang, Y., Wu, F., Pan, Y.: Harmonizing hierarchical manifolds for multimedia document semantics understanding and cross-media retrieval. Language 1520, 9210 (2008)



Author information

Corresponding author

Correspondence to Sunil Kumar Gupta.


Appendix

1.1 Proof of Convergence

We prove the convergence of the multiplicative updates given by Eqs. (10) and (11). We avoid lengthy derivations and provide only a sketch of the proof. Following Ref. [11], an auxiliary function \(G(w,w^{t})\) is defined as an upper bound on the cost function \(J(w)\) that is tight at the current iterate, i.e. \(G(w,w^{t})\ge J(w)\) and \(G(w,w)=J(w)\). For our MS-NMF case, we prove the following lemma, extended from Ref. [11]:
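Such an auxiliary function guarantees convergence because the update \(w^{t+1}=\arg \min _{w}G(w,w^{t})\) makes the cost non-increasing, via the standard chain of inequalities from Ref. [11]:

$$ J(w^{t+1})\le G(w^{t+1},w^{t})\le G(w^{t},w^{t})=J(w^{t}) $$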

Lemma.

If \(\left( W_{\nu }\right) _{p}\) is the \(p\)th row of matrix \(W_{\nu }\), \(\nu \in S\left( n,i\right) \), and \(C\left( \left( W_{\nu }\right) _{p}\right) \) is the diagonal matrix whose \(\left( l,k\right) \)th element is

$$ C_{lk}\left( \left( W_{\nu }\right) _{p}\right) =\mathbf {1}_{l,k}\frac{\left( \sum \limits _{i\in \nu }\lambda _{i}H_{i,\nu }\left( \sum \limits _{u\in S\left( n,i\right) }H_{i,u}^{\mathsf {T}}\left( W_{u}\right) _{p}\right) \right) _{l}}{\left( W_{\nu }\right) _{pl}} $$

then

$$ G\left( \left( W_{\nu }\right) _{p},\left( W_{\nu }\right) _{p}^{t}\right) =J\left( \left( W_{\nu }\right) _{p}^{t}\right) +\left( \left( W_{\nu }\right) _{p}-\left( W_{\nu }\right) _{p}^{t}\right) ^{\mathsf {T}}\nabla _{\left( W_{\nu }\right) _{p}^{t}}J\left( \left( W_{\nu }\right) _{p}^{t}\right) \\ +\frac{1}{2}\left( \left( W_{\nu }\right) _{p}-\left( W_{\nu }\right) _{p}^{t}\right) ^{\mathsf {T}}C\left( \left( W_{\nu }\right) _{p}^{t}\right) \left( \left( W_{\nu }\right) _{p}-\left( W_{\nu }\right) _{p}^{t}\right) $$

is an auxiliary function for \(J\left( \left( W_{\nu }\right) _{p}\right) \), the cost function defined for the \(p\)th row of the data.

Proof.

The second derivative of \(J\left( \left( W_{\nu }\right) _{p}\right) \) is \(\nabla _{\left( W_{\nu }\right) _{p}}^{2}J\left( \left( W_{\nu }\right) _{p}\right) =\sum _{i\in \nu }\lambda _{i}H_{i,\nu } H_{i,\nu }^{\mathsf {T}}\). Comparing the expression for \(G\left( \left( W_{\nu }\right) _{p},\left( W_{\nu }\right) _{p}^{t}\right) \) in the lemma with the Taylor expansion of \(J\left( \left( W_{\nu }\right) _{p}\right) \) at \(\left( W_{\nu }\right) _{p}^{t}\) (which is exact, since \(J\) is quadratic in \(\left( W_{\nu }\right) _{p}\)), it can be seen that all we need to prove is the following

$$ \left( \left( W_{\nu }\right) _{p}-\left( W_{\nu }\right) _{p}^{t}\right) ^{\mathsf {T}}T_{W_{\nu }}\left( \left( W_{\nu }\right) _{p}-\left( W_{\nu }\right) _{p}^{t}\right) \ge 0 $$

where \(T_{W_{\nu }}\triangleq C\left( \left( W_{\nu }\right) _{p}^{t}\right) -\sum _{i\in \nu }\lambda _{i}H_{i,\nu }H_{i,\nu }^{\mathsf {T}}\). Similar to Ref. [11], instead of showing this directly, we show the positive semidefiniteness of the rescaled matrix \(M\) with elements

$$\begin{aligned} M_{lk}\left( \left( W_{\nu }\right) _{p}^{t}\right)&= \left( W_{\nu }\right) _{pl}^{t}\left( T_{W_{\nu }}\right) _{lk}\left( W_{\nu }\right) _{pk}^{t} \end{aligned}$$

which is equivalent, since the quadratic form above equals \(z^{\mathsf {T}}Mz\) for \(z_{l}=\left( \left( W_{\nu }\right) _{p}-\left( W_{\nu }\right) _{p}^{t}\right) _{l}/\left( W_{\nu }\right) _{pl}^{t}\). Thus we have to show that \(z^{\mathsf {T}}Mz\) is nonnegative for every \(z\). To avoid a lengthy derivation, we only show the main step here:

$$\begin{aligned} z^{\mathsf {T}}Mz&=\sum _{l,k}z_{l}\left( W_{\nu }\right) _{pl}^{t}\left( T_{W_{\nu }}\right) _{lk}\left( W_{\nu }\right) _{pk}^{t}z_{k}\\&=\sum _{l}z_{l}^{2}\left( W_{\nu }\right) _{pl}^{t}\left( \sum _{i\in \nu }\lambda _{i}H_{i,\nu }\sum _{u\in S\left( n,i\right) ,u\ne \nu }H_{i,u}^{\mathsf {T}}\left( W_{u}\right) _{p}\right) _{l}\\&\quad +\sum _{l,k}\left( W_{\nu }\right) _{pl}^{t}\left( \sum _{i\in \nu }\lambda _{i}\left( H_{i,\nu }H_{i,\nu }^{\mathsf {T}}\right) _{lk}\right) \left( W_{\nu }\right) _{pk}^{t}\frac{\left( z_{l}-z_{k}\right) ^{2}}{2}\ge 0 \end{aligned}$$

Both terms are nonnegative because every factor involved is nonnegative.

Minimizing \(G\left( \left( W_{\nu }\right) _{p},\left( W_{\nu }\right) _{p}^{t}\right) \) with respect to \(\left( W_{\nu }\right) _{p}\) at iteration \(t\), and comparing the resulting update with the gradient-descent update of Eq. (8), we obtain the step size \(\eta _{\left( W_{\nu }\right) _{lk}^{t}}\) given in Eq. (9).
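To make the multiplicative-update idea concrete, the following is a minimal Python sketch of the single-source Lee–Seung updates from Ref. [11], the special case that the MS-NMF updates of Eqs. (10) and (11) generalize to multiple sources with shared and individual subspaces. The function name, the Frobenius-norm objective, and the small stabilizing constant eps are illustrative assumptions, not the chapter's actual algorithm.

import numpy as np

def nmf_multiplicative(X, rank, n_iters=200, eps=1e-10):
    # Plain Lee-Seung multiplicative updates for X ~= W @ H under the
    # Frobenius norm; both factors stay nonnegative because each update
    # multiplies by a ratio of nonnegative terms.
    n, m = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # H <- H * (W^T X) / (W^T W H)
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # W <- W * (X H^T) / (W H H^T)
    return W, H

# Usage: factorize a small nonnegative matrix; the reconstruction error is
# non-increasing over iterations, mirroring the auxiliary-function argument.
X = np.random.default_rng(1).random((20, 30))
W, H = nmf_multiplicative(X, rank=5)
print(np.linalg.norm(X - W @ H))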


Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Gupta, S.K., Phung, D., Adams, B., Venkatesh, S. (2014). A Matrix Factorization Framework for Jointly Analyzing Multiple Nonnegative Data Sources. In: Yada, K. (ed.) Data Mining for Service. Studies in Big Data, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45252-9_10


  • DOI: https://doi.org/10.1007/978-3-642-45252-9_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45251-2

  • Online ISBN: 978-3-642-45252-9

