Unsupervised group matching with application to cross-lingual topic matching without alignment information
Abstract
We propose a method for unsupervised group matching, the task of finding correspondences between groups across different domains without cross-domain similarity measurements or paired data. For example, the proposed method can match topic categories across languages without alignment information. The method treats each group as a probability distribution, which lets us handle the uncertainty that arises from limited data and incorporate higher-order information about the groups. Groups are matched by maximizing the dependence between distributions, measured with the Hilbert-Schmidt independence criterion (HSIC). Because kernel embeddings map distributions into a reproducing kernel Hilbert space (RKHS), the dependence can be computed without density estimation. Experiments on synthetic and real data sets, including an application to cross-lingual topic matching, demonstrate the effectiveness of the proposed method.
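The idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It rests on several assumptions that the abstract does not state: a Gaussian kernel with a hand-picked bandwidth, a linear kernel between empirical mean embeddings as the group-level kernel, the biased empirical HSIC estimator, and a brute-force search over permutations that is feasible only for a handful of groups. All function names (`gaussian_gram`, `group_kernel`, `hsic`, `match_groups`) are hypothetical.

```python
import itertools
import numpy as np

def gaussian_gram(a, b, sigma):
    """Gaussian kernel values between all rows of a and all rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def group_kernel(groups, sigma):
    """Inner products between the empirical kernel mean embeddings of groups.

    Entry (i, j) is the average kernel value over all sample pairs drawn
    from group i and group j, so no density estimate is needed.
    """
    n = len(groups)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = gaussian_gram(groups[i], groups[j], sigma).mean()
    return K

def hsic(K, L):
    """Biased empirical HSIC from two n-by-n kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def match_groups(groups_x, groups_y, sigma_x=1.5, sigma_y=1.5):
    """Return the permutation of Y groups that maximizes HSIC with X groups.

    perm[i] is the index of the Y group matched to X group i.
    Brute force over all permutations: an assumption made for clarity only.
    """
    K = group_kernel(groups_x, sigma_x)
    L = group_kernel(groups_y, sigma_y)
    best_perm, best_score = None, -np.inf
    for perm in itertools.permutations(range(len(groups_y))):
        p = list(perm)
        score = hsic(K, L[np.ix_(p, p)])  # reorder Y groups by perm
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score

# Synthetic example: four groups in domain X; domain Y holds independently
# sampled groups with the same layout, rotated, shifted, and shuffled.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0]])
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])

groups_x = [m + 0.1 * rng.standard_normal((100, 2)) for m in means]
true_perm = [2, 0, 3, 1]  # X group i corresponds to Y group true_perm[i]
groups_y = [None] * len(means)
for i, m in enumerate(means):
    samples = m + 0.1 * rng.standard_normal((100, 2))
    groups_y[true_perm[i]] = samples @ R.T + 5.0

perm, score = match_groups(groups_x, groups_y)
print("recovered matching:", perm)
```

The sketch never compares a sample from one domain with a sample from the other: each domain contributes only its own group-by-group kernel matrix, which is why no cross-domain similarity or paired data is required. For more than a few groups the factorial search would have to be replaced by an iterative assignment procedure.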
Keywords
Unsupervised object matching; Kernel embedding of distributions; Multilingual corpus analysis