Fast Manifold Landmarking Using Locality-Sensitive Hashing
Manifold landmarks can approximately represent the low-dimensional nonlinear manifold structure embedded in a high-dimensional ambient feature space. Because many learning algorithms have complexity quadratic in the number of training samples, selecting a subset of samples as manifold landmarks has become an important issue for scalable learning. However, state-of-the-art Gaussian process methods for selecting manifold landmarks do not themselves scale to large datasets. To speed up landmark learning, the state-of-the-art approach uses stochastic gradient descent with uniformly sampled minibatches, but this only goes part of the way toward making manifold learning tractable. We propose two adaptive sample selection approaches for gradient-descent optimization, which can improve both accuracy and computational time. Our methods exploit the compatibility between locality-sensitive hashing (via LSH and DBH) and the manifold assumption, thereby limiting expensive optimization to relevant regions of the data. Landmarks selected by our methods achieve higher accuracy than those obtained by training the state-of-the-art learner with randomly selected minibatches. We also demonstrate that our methods can be used to find manifold landmarks without learning Gaussian processes at all, which yields an orders-of-magnitude speed-up with only a minimal decrease in accuracy.
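To make the idea of restricting optimization to relevant regions concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a p-stable (Gaussian projection) hash of the form h(x) = floor((a · x + b) / w) and draws a minibatch from the bucket that contains a query point, topping up with uniformly sampled points when the bucket is too small. The function names (`make_hash`, `build_buckets`, `local_minibatch`), the parameters `num_projections` and `bucket_width`, and the uniform fallback are all assumptions introduced for illustration.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, num_projections, bucket_width, rng):
    """Return a function mapping a point to a tuple of integer bucket ids.

    Each component is floor((a . x + b) / w) with a ~ N(0, I) and
    b ~ Uniform[0, w). All parameter names here are hypothetical.
    """
    projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                   for _ in range(num_projections)]
    offsets = [rng.uniform(0.0, bucket_width) for _ in range(num_projections)]

    def hash_point(x):
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / bucket_width)
            for a, b in zip(projections, offsets)
        )

    return hash_point


def build_buckets(points, hash_point):
    """Group point indices by hash key; every index lands in exactly one bucket."""
    buckets = defaultdict(list)
    for idx, x in enumerate(points):
        buckets[hash_point(x)].append(idx)
    return buckets


def local_minibatch(points, buckets, hash_point, query, batch_size, rng):
    """Sample a minibatch of indices from the bucket that the query hashes to.

    If the bucket holds fewer than batch_size points, the remainder is
    filled with uniformly sampled indices (an assumed fallback).
    """
    candidates = list(buckets.get(hash_point(query), []))
    if len(candidates) >= batch_size:
        return rng.sample(candidates, batch_size)
    in_bucket = set(candidates)
    others = [i for i in range(len(points)) if i not in in_bucket]
    needed = min(batch_size - len(candidates), len(others))
    return candidates + rng.sample(others, needed)
```

In this sketch, a gradient step for a given landmark would use `local_minibatch` with the current landmark position as `query`, so that most of the sampled points are likely to lie near the landmark. How well the buckets follow the manifold depends on `bucket_width` and `num_projections`, which would need tuning per dataset.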
We acknowledge partial support from ARC DP150103710.