On the Improvement of the Mapping Trustworthiness and Continuity of a Manifold Learning Model
Manifold learning methods model high-dimensional data through low-dimensional manifolds embedded in the observed data space. This simplification implies that they are prone to trustworthiness and continuity errors. Generative Topographic Mapping (GTM) is one such manifold learning method for multivariate data clustering and visualization, defined within a probabilistic framework. In its original formulation, GTM is optimized by minimizing an error that is a function of Euclidean distances, which makes it vulnerable to these errors, especially for datasets of convoluted geometry. Here, we modify GTM to penalize divergences between the Euclidean distances from the data points to the model prototypes and the corresponding geodesic distances along the manifold. Several experiments with artificial data show that this strategy improves the continuity and trustworthiness of the data representation generated by the model.
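To illustrate why Euclidean and geodesic distances can diverge on a curved manifold, the following minimal sketch approximates geodesic distances by shortest paths over a k-nearest-neighbour graph. This is a common approximation and not necessarily the procedure used in the paper; the function names (`knn_graph`, `graph_geodesic`), the semicircle data, and the choice `k=2` are assumptions made for illustration only, and the sketch measures distances between data points rather than between data points and GTM prototypes.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Indices of the k points closest to point i (excluding i itself).
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        for j in nearest:
            d = math.dist(points[i], points[j])
            graph[i][j] = d
            graph[j][i] = d
    return graph


def graph_geodesic(graph, source):
    """Dijkstra shortest-path lengths from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical data: 50 points sampled evenly along a unit semicircle.
points = [
    (math.cos(math.pi * t / 49), math.sin(math.pi * t / 49)) for t in range(50)
]

graph = knn_graph(points, k=2)
geodesic = graph_geodesic(graph, 0)[49]    # distance along the curve
euclidean = math.dist(points[0], points[49])  # straight line through the gap

print(f"Euclidean: {euclidean:.3f}, graph geodesic: {geodesic:.3f}")
```

For the two endpoints of the semicircle, the straight-line distance is the diameter, while the graph distance follows the arc and is therefore markedly longer. A penalty on this kind of divergence is what discourages the model from treating points as close when they are far apart along the manifold.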
Keywords: Geodesic Distance, Finite Mixture Model, Prototype Vector, Missing Data Imputation, Generative Topographic Mapping