Joint Text Mining with Heterogeneous Data

Aggarwal, Charu C.

doi:10.1007/978-3-319-73531-3_8

Abstract

In Web and social media networks, text documents are often associated with nodes. For example, the Web can be viewed as a graph in which each node contains a Web page and connects to other nodes via hyperlinks. Similarly, a social network is a friendship graph of user-to-user linkages in which each node contains the textual posting activity of the user.

Notes

1. The libraries libFM and libMF are different.

Author information

Authors and Affiliations

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
Charu C. Aggarwal

About this chapter

Cite this chapter

Aggarwal, C.C. (2018). Joint Text Mining with Heterogeneous Data. In: Machine Learning for Text. Springer, Cham. https://doi.org/10.1007/978-3-319-73531-3_8

DOI: https://doi.org/10.1007/978-3-319-73531-3_8
Published: 20 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73530-6
Online ISBN: 978-3-319-73531-3
eBook Packages: Computer Science, Computer Science (R0)
