An Information Retrieval Approach for Finding Dependent Subspaces of Multiple Views

  • Ziyuan Lin
  • Jaakko Peltonen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10358)

Abstract

Finding relationships between multiple views of data is essential both in exploratory analysis and as pre-processing for predictive tasks. A prominent approach is to apply variants of Canonical Correlation Analysis (CCA), a classical method that seeks correlated components between views. Basic CCA is restricted to maximizing a simple dependency criterion, correlation, measured directly between data coordinates. We introduce a new method that finds dependent subspaces of views directly optimized for the data analysis task of neighbor retrieval between multiple views. We optimize mappings for each view, such as linear transformations, to maximize cross-view similarity between neighborhoods of data samples. The criterion arises directly from the well-defined retrieval task, detects nonlinear and local similarities, measures dependency of data relationships rather than only individual data coordinates, and is related to well-understood measures of information retrieval quality. In experiments, the proposed method outperforms alternatives in preserving cross-view neighborhood similarities and yields insights into local dependencies between multiple views.
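To make the idea of cross-view neighborhood similarity concrete, here is a minimal sketch in plain Python. The abstract does not state the cost function, so everything specific is an assumption made for illustration: the Gaussian neighbor probabilities with a fixed bandwidth `sigma`, the symmetrized Kullback-Leibler divergence used as the mismatch measure, the toy data, and all function names. The sketch only evaluates the mismatch for given linear mappings; it does not optimize them.

```python
import math

def project(X, W):
    """Apply a linear map to each row x of X, returning the rows x @ W."""
    return [[sum(x[k] * W[k][j] for k in range(len(x)))
             for j in range(len(W[0]))] for x in X]

def neighbor_probs(Z, sigma=1.0):
    """P[i][j]: probability of picking point j as a neighbor of point i.

    Assumption: Gaussian kernel on squared Euclidean distance, with the
    point itself excluded (P[i][i] = 0). Rows sum to 1. Widely separated
    points could underflow to a zero row total; this sketch does not
    guard against that.
    """
    n = len(Z)
    P = []
    for i in range(n):
        weights = [
            0.0 if i == j else
            math.exp(-sum((a - b) ** 2 for a, b in zip(Z[i], Z[j]))
                     / (2.0 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P

def cross_view_cost(P, Q, eps=1e-12):
    """Mean over points of KL(p_i || q_i) + KL(q_i || p_i).

    Assumption: a symmetrized KL divergence stands in for the paper's
    criterion. Lower values mean more similar neighborhoods.
    """
    n = len(P)
    cost = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            p, q = P[i][j] + eps, Q[i][j] + eps
            cost += (p - q) * math.log(p / q)
    return cost / n

# Toy data (hypothetical): four samples seen in two 2-D views. The first
# coordinate of view 1 and the second coordinate of view 2 carry the same
# signal; the remaining coordinates are unrelated.
X1 = [[0.0, 5.0], [1.0, -3.0], [2.0, 4.0], [3.0, -1.0]]
X2 = [[2.0, 0.0], [-4.0, 1.0], [7.0, 2.0], [-1.0, 3.0]]

# Mappings that select the shared coordinates versus the unrelated ones.
W1_good, W2_good = [[1.0], [0.0]], [[0.0], [1.0]]
W1_bad, W2_bad = [[0.0], [1.0]], [[1.0], [0.0]]

cost_good = cross_view_cost(neighbor_probs(project(X1, W1_good)),
                            neighbor_probs(project(X2, W2_good)))
cost_bad = cross_view_cost(neighbor_probs(project(X1, W1_bad)),
                           neighbor_probs(project(X2, W2_bad)))
print(cost_good, cost_bad)
```

In this toy setting the mappings that pick out the shared coordinates give identical neighborhoods in both views, so their mismatch is lower than that of the mappings that pick out the unrelated coordinates. A full method would search over the mappings to reduce such a mismatch, which this sketch leaves out.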

Notes

Acknowledgments

We acknowledge the computational resources provided by the Aalto Science-IT project. The authors belong to the Finnish CoE in Computational Inference Research (COIN). The work was supported in part by TEKES (Re:Know project) and by the Academy of Finland, decision numbers 252845, 256233, and 295694.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Faculty of Natural Sciences, University of Tampere, Tampere, Finland
  2. Department of Computer Science, Aalto University, Espoo, Finland
