Fast Manifold Landmarking Using Locality-Sensitive Hashing

Aye, Zay Maung Maung; Rubinstein, Benjamin I. P.; Ramamohanarao, Kotagiri

doi:10.1007/978-3-319-93040-4_36

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10939)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Manifold landmarks can approximately represent the low-dimensional nonlinear manifold structure embedded in a high-dimensional ambient feature space. Because many learning algorithms have complexity quadratic in the number of training samples, selecting a subset of samples as manifold landmarks has become an important issue for scalable learning. Unfortunately, state-of-the-art Gaussian process methods for selecting manifold landmarks are themselves not scalable to large datasets. To speed up the learning of manifold landmarks, the state-of-the-art approach uses stochastic gradient descent with uniformly selected minibatches, but this only goes part way to making manifold learning tractable. We propose two adaptive sample selection approaches for gradient-descent optimization, which can improve both accuracy and computational time. Our methods exploit the compatibility of locality-sensitive hashing (via LSH and DBH) and the manifold assumption, thereby limiting expensive optimization to relevant regions of the data. Landmarks selected by our methods achieve higher accuracy than those obtained by training the state-of-the-art learner with randomly selected minibatches. We also demonstrate that our methods can be used to find manifold landmarks without learning Gaussian processes at all, which leads to an orders-of-magnitude speedup with only a minimal decrease in accuracy.
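
As a rough illustration of limiting optimization to relevant regions of the data, the sketch below hashes points with a generic Gaussian random-projection LSH and draws a minibatch only from the bucket that a candidate landmark falls into. It is not the algorithm from the paper: the hash family, the function names (make_hash, build_table, local_minibatch), and the parameters (bucket_width, num_projections, batch_size) are all assumptions chosen for illustration, and the Gaussian process objective and the DBH variant are omitted.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, num_projections, bucket_width, rng):
    """Build a hash g(x) made of several h(x) = floor((a.x + b) / w).

    Each a is a Gaussian random vector and each b is uniform in [0, w).
    All names and parameters here are hypothetical, not from the paper.
    """
    projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                   for _ in range(num_projections)]
    offsets = [rng.uniform(0.0, bucket_width) for _ in range(num_projections)]

    def hash_point(x):
        return tuple(
            math.floor((sum(a_i * x_i for a_i, x_i in zip(a, x)) + b)
                       / bucket_width)
            for a, b in zip(projections, offsets)
        )

    return hash_point


def build_table(points, hash_point):
    """Map each hash key to the indices of the points that share it."""
    table = defaultdict(list)
    for idx, x in enumerate(points):
        table[hash_point(x)].append(idx)
    return table


def local_minibatch(landmark, table, hash_point, batch_size, rng):
    """Sample up to batch_size indices from the landmark's own bucket."""
    candidates = table.get(hash_point(landmark), [])
    if len(candidates) <= batch_size:
        return list(candidates)
    return rng.sample(candidates, batch_size)


# Toy data (assumed for the demo): two clusters in the plane.
rng = random.Random(0)
points = ([[rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)] for _ in range(50)]
          + [[rng.gauss(10.0, 0.1), rng.gauss(10.0, 0.1)] for _ in range(50)])

hash_point = make_hash(dim=2, num_projections=2, bucket_width=4.0, rng=rng)
table = build_table(points, hash_point)

# Treat the first point as the current landmark estimate.
batch = local_minibatch(points[0], table, hash_point, batch_size=8, rng=rng)
print(len(batch), "nearby indices selected for the next gradient step")
```

Under these assumptions, a gradient step would touch only the handful of points colliding with the landmark rather than a uniformly drawn minibatch. A wider bucket_width or fewer projections makes buckets larger and the selection less local.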

Acknowledgments

We acknowledge partial support from ARC DP150103710.

Author information

Authors and Affiliations

School of Computing and Information Systems, The University of Melbourne, Parkville, VIC, 3052, Australia
Zay Maung Maung Aye, Benjamin I. P. Rubinstein & Kotagiri Ramamohanarao

Corresponding author

Correspondence to Zay Maung Maung Aye.

Editor information

Editors and Affiliations

Deakin University, Geelong, Victoria, Australia
Dinh Phung
National Chiao Tung University, Hsinchu City, Taiwan
Vincent S. Tseng
Monash University, Clayton, Victoria, Australia
Geoffrey I. Webb
Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
Bao Ho
University of Melbourne, Melbourne, Victoria, Australia
Mohadeseh Ganji
University of Melbourne, Melbourne, Victoria, Australia
Lida Rashidi

Cite this paper

Aye, Z.M.M., Rubinstein, B.I.P., Ramamohanarao, K. (2018). Fast Manifold Landmarking Using Locality-Sensitive Hashing. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10939. Springer, Cham. https://doi.org/10.1007/978-3-319-93040-4_36

DOI: https://doi.org/10.1007/978-3-319-93040-4_36
Published: 17 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93039-8
Online ISBN: 978-3-319-93040-4
eBook Packages: Computer Science, Computer Science (R0)
