Learning to rank with relational graph and pointwise constraint for cross-modal retrieval

Soft Computing

Abstract

Cross-modal retrieval (i.e., image–query–text or text–query–image) is an active research topic in multimedia information retrieval, but the heterogeneity gap between modalities poses a critical challenge for multimodal data. Some researchers treat cross-modal retrieval as a learning to rank task and measure the similarity between two modalities in a shared embedding subspace. However, most previous methods concentrate on constructing a discriminative objective function to optimize the common space and neglect the correlations within each single modality. In this paper, we approach cross-modal retrieval from the perspective of optimizing the ranking model, treat it as a listwise ranking problem, and propose a novel method called learning to rank with relational graph and pointwise constraint (\( {\text{LR}}^{2} {\text{GP}} \)). In \( {\text{LR}}^{2} {\text{GP}} \), we first propose a discriminative ranking model that exploits the relations within each single modality to improve ranking performance and thereby learn an optimal common embedding subspace. Then, a pointwise constraint is introduced in the low-dimensional embedding subspace to make up for the real loss in the training phase, since the listwise method only optimizes the latent permutation directly from an overall perspective. Finally, a dynamic interpolation algorithm, which gradually transitions from pointwise and pairwise to listwise learning, is adopted to fuse the loss functions in a reasonable way. Experiments on the Wikipedia and Pascal benchmark datasets demonstrate the effectiveness of the proposed method.
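
The idea of gradually shifting from a pointwise to a listwise objective can be illustrated with a small sketch. The abstract does not give the actual loss functions or the interpolation schedule, so everything below is an assumption made for illustration: the listwise term is a ListNet-style top-one cross-entropy, the pointwise term is a mean squared error, the schedule is linear in the epoch index, and the pairwise stage is omitted for brevity. The function names (`listwise_loss`, `pointwise_loss`, `interpolation_weight`, `combined_loss`) are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(values):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(values)
    log_total = math.log(sum(math.exp(v - peak) for v in values))
    return [v - peak - log_total for v in values]


def listwise_loss(scores, labels):
    """ListNet-style top-one cross-entropy (assumed stand-in for the listwise term)."""
    target = [math.exp(v) for v in log_softmax(labels)]
    log_predicted = log_softmax(scores)
    return -sum(t * lp for t, lp in zip(target, log_predicted))


def pointwise_loss(scores, labels):
    """Mean squared error between each score and its relevance label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def interpolation_weight(epoch, total_epochs):
    """Hypothetical linear schedule: 0.0 (pure pointwise) up to 1.0 (pure listwise)."""
    if total_epochs <= 1:
        return 1.0
    return min(1.0, max(0.0, epoch / (total_epochs - 1)))


def combined_loss(scores, labels, epoch, total_epochs):
    """Blend the pointwise and listwise losses with the epoch-dependent weight."""
    alpha = interpolation_weight(epoch, total_epochs)
    return (1.0 - alpha) * pointwise_loss(scores, labels) + alpha * listwise_loss(scores, labels)


if __name__ == "__main__":
    scores = [2.0, 0.5, -1.0]   # model scores for three candidate items
    labels = [1.0, 0.0, 0.0]    # relevance labels for the same items
    for epoch in range(5):
        print(epoch, combined_loss(scores, labels, epoch, 5))
```

Early epochs are dominated by the per-item error, which gives every candidate a direct training signal; later epochs are dominated by the list-level distribution over the whole ranking. A different schedule (for example, a stepwise or exponential one) would fit the same structure by changing only `interpolation_weight`.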

Acknowledgements

The authors would like to thank the anonymous reviewers and the editor for their very instructive suggestions, which greatly improved the quality of this paper.

Author information

Corresponding author

Correspondence to Qingzhen Xu.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by A.K. Sangaiah, H. Pham, M.-Y. Chen, H. Lu, F. Mercaldo.

This work was supported in part by the 13th Five-Year Plan for the Development of Philosophy and Social Sciences in Guangzhou (2018GZYB36), the Science Foundation of Guangdong Provincial Communications Department (2015-02-064), the National Natural Science Foundation of China (61402185), and the South China Normal University–Bluedon Information Security Technologies Co., Ltd. joint laboratory project (LD20170201).

About this article

Cite this article

Xu, Q., Li, M. & Yu, M. Learning to rank with relational graph and pointwise constraint for cross-modal retrieval. Soft Comput 23, 9413–9427 (2019). https://doi.org/10.1007/s00500-018-3608-9

  • DOI: https://doi.org/10.1007/s00500-018-3608-9
