Joint Text Mining with Heterogeneous Data

Machine Learning for Text

Abstract

In Web and social media networks, text documents are often associated with nodes. For example, the Web can be viewed as a graph in which each node contains a Web page and also connects to other nodes via hyperlinks. Similarly, a social network is a friendship graph of user-to-user linkages in which each node contains the textual posting activity of the user.
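
The two structures described in the abstract can be sketched as a small graph whose nodes carry text. This is only an illustrative sketch, not the chapter's method: the names `TextNode`, `TextGraph`, and `term_counts`, and the toy pages and users, are hypothetical. It assumes hyperlinks are directed edges and friendships are undirected ones.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TextNode:
    """A node that carries its own text (a Web page or a user's posts)."""
    node_id: str
    text: str
    neighbors: set = field(default_factory=set)  # ids of linked nodes


class TextGraph:
    """Hypothetical container: a graph whose nodes each hold a document."""

    def __init__(self, directed):
        self.directed = directed
        self.nodes = {}

    def add_node(self, node_id, text):
        self.nodes[node_id] = TextNode(node_id, text)

    def add_edge(self, src, dst):
        # A hyperlink points one way; a friendship is stored in both directions.
        self.nodes[src].neighbors.add(dst)
        if not self.directed:
            self.nodes[dst].neighbors.add(src)

    def term_counts(self, node_id):
        # Simple whitespace bag-of-words for the text stored at one node.
        return Counter(self.nodes[node_id].text.lower().split())


# Toy Web graph: pages as nodes, hyperlinks as directed edges (made-up data).
web = TextGraph(directed=True)
web.add_node("page_a", "text mining with graphs")
web.add_node("page_b", "graphs and hyperlinks")
web.add_edge("page_a", "page_b")

# Toy social network: users as nodes, friendships as undirected edges.
social = TextGraph(directed=False)
social.add_node("alice", "posting about text mining")
social.add_node("bob", "posting about hiking")
social.add_edge("alice", "bob")
```

Keeping the text and the adjacency in one object makes it straightforward to read a node's content alongside the content of its neighbors, which is the kind of joint access that mining text together with link structure requires.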

Notes

  1. The libraries libFM and libMF are different.

Bibliography

  1. C. Aggarwal. Data mining: The textbook. Springer, 2015.

  2. C. Aggarwal. Recommender systems: The textbook. Springer, 2016.

  3. C. Aggarwal and N. Li. On node classification in dynamic content-based networks. SDM Conference, pp. 355–366, 2011.

  4. C. Aggarwal, Y. Xie, and P. Yu. On dynamic link inference in heterogeneous networks. SDM Conference, pp. 415–426, 2012.

  5. C. Aggarwal and C. Zhai. Mining text data. Springer, 2012.

  6. C. Aggarwal, Y. Zhao, and P. Yu. On the use of side information for mining text data. IEEE Transactions on Knowledge and Data Engineering, 26(6), pp. 1415–1429, 2014.

  7. L. Ballesteros and W. B. Croft. Dictionary methods for cross-lingual information retrieval. International Conference on Database and Expert Systems Applications, pp. 791–801, 1996.

  8. I. Bayer. Fastfm: a library for factorization machines. arXiv preprint arXiv:1505.00641, 2015. https://arxiv.org/pdf/1505.00641v2.pdf

  9. S. Chakrabarti, B. Dom, and P. Indyk. Enhanced hypertext categorization using hyperlinks. ACM SIGMOD Conference, pp. 307–318, 1998.

  10. W. Dai, Y. Chen, G. Xue, Q. Yang, and Y. Yu. Translated learning: Transfer learning across different feature spaces. NIPS Conference, pp. 353–360, 2008.

  11. H. Deng, B. Zhao, and J. Han. Collective topic modeling for heterogeneous networks. ACM SIGIR Conference, pp. 1109–1110, 2011.

  12. H. Deng, J. Han, B. Zhao, Y. Yu, and C. Lin. Probabilistic topic models with biased propagation on heterogeneous information networks. ACM KDD Conference, pp. 1271–1279, 2011.

  13. C. Freudenthaler, L. Schmidt-Thieme, and S. Rendle. Factorization machines: Factorized polynomial regression models. German-Polish Symposium on Data Analysis and Its Applications (GPSDAA), 2011. https://www.ismll.uni-hildesheim.de/pub/pdfs/FreudenthalerRendle_FactorizedPolynomialRegression.pdf

  14. L. Getoor, N. Friedman, D. Koller, and B. Taskar. Learning probabilistic models of link structure. Journal of Machine Learning Research, 3, pp. 679–707, 2002.

  15. D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7), pp. 1019–1031, 2007.

  16. Q. Mei, D. Cai, D. Zhang, and C. Zhai. Topic modeling with network regularization. World Wide Web Conference, pp. 101–110, 2008.

  17. A. K. Menon and C. Elkan. Link prediction via matrix factorization. Machine Learning and Knowledge Discovery in Databases, pp. 437–452, 2011.

  18. L. Michelbacher, F. Laws, B. Dorow, U. Heid, and H. Schütze. Building a cross-lingual relatedness thesaurus using a graph similarity measure. LREC, 2010.

  19. G. Qi, C. Aggarwal, and T. Huang. Towards semantic knowledge propagation from text corpus to web images. WWW Conference, pp. 297–306, 2011.

  20. G. Qi, C. Aggarwal, and T. Huang. Community detection with edge content in social media networks. ICDE Conference, pp. 534–545, 2012.

  21. S. Rendle. Factorization machines. IEEE ICDM Conference, pp. 995–1000, 2010.

  22. S. Rendle. Factorization machines with libfm. ACM Transactions on Intelligent Systems and Technology, 3(3), 57, 2012.

  23. P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Gallagher, and T. Eliassi-Rad. Collective classification in network data. AI Magazine, 29(3), p. 93, 2008.

  24. A. Singh and G. Gordon. A unified view of matrix factorization models. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 358–373, 2008.

  25. Y. Sun, J. Han, J. Gao, and Y. Yu. itopicmodel: Information network-integrated topic modeling. IEEE ICDM Conference, pp. 493–502, 2011.

  26. M. Tsai, C. Aggarwal, and T. Huang. Ranking in heterogeneous social media. WSDM Conference, pp. 613–622, 2014.

  27. A. Vinokourov, N. Cristianini, and J. Shawe-Taylor. Inferring a semantic representation of text via cross-language correlation analysis. NIPS Conference, pp. 1473–1480, 2002.

  28. H. Wang, H. Huang, F. Nie, and C. Ding. Cross-language Web page classification via dual knowledge transfer using nonnegative matrix tri-factorization. ACM SIGIR Conference, pp. 933–942, 2011.

  29. J. Yang, J. McAuley, and J. Leskovec. Community detection in networks with node attributes. IEEE ICDM Conference, pp. 1151–1156, 2013.

  30. Q. Yang, Y. Chen, G. Xue, W. Dai, and Y. Yu. Heterogeneous transfer learning for image clustering via the social web. Joint Conference of the ACL and Natural Language Processing of the AFNLP, pp. 1–9, 2009.

  31. T. Yang, R. Jin, Y. Chi, and S. Zhu. Combining link and content for community detection: a discriminative approach. ACM KDD Conference, pp. 927–936, 2009.

  32. Y. Zhou, H. Cheng, and J. X. Yu. Graph clustering based on structural/attribute similarities. Proceedings of the VLDB Endowment, 2(1), pp. 718–729, 2009.

  33. Y. Zhu, Y. Chen, Z. Lu, S. J. Pan, G. Xue, Y. Yu, and Q. Yang. Heterogeneous transfer learning for image classification. AAAI Conference, 2011.

  34. http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html

  35. http://www.cs.waikato.ac.nz/ml/weka/

  36. https://www.csie.ntu.edu.tw/~cjlin/libmf/

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Aggarwal, C.C. (2018). Joint Text Mining with Heterogeneous Data. In: Machine Learning for Text. Springer, Cham. https://doi.org/10.1007/978-3-319-73531-3_8

  • DOI: https://doi.org/10.1007/978-3-319-73531-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73530-6

  • Online ISBN: 978-3-319-73531-3

  • eBook Packages: Computer Science (R0)
