The VLDB Journal, Volume 24, Issue 4, pp 519–536

Embedding-based subsequence matching with gaps–range–tolerances: a Query-By-Humming application

  • Alexios Kotsifakos
  • Isak Karlsson
  • Panagiotis Papapetrou
  • Vassilis Athitsos
  • Dimitrios Gunopulos
Regular Paper


We present a subsequence matching framework that allows for gaps in both query and target sequences, employs variable matching tolerance efficiently tuned for each query and target sequence, and constrains the maximum matching range. Using this framework, a dynamic programming method is proposed, called SMBGT, that, given a short query sequence Q and a large database, identifies in quadratic time the subsequence of the database that best matches Q. SMBGT is highly applicable to music retrieval. However, in Query-By-Humming applications, runtime is critical. Hence, we propose a novel embedding-based approach, called ISMBGT, for speeding up search under SMBGT. Using a set of reference sequences, ISMBGT maps both Q and each position of each database sequence into vectors. The database vectors closest to the query vector are identified, and SMBGT is then applied between Q and the subsequences that correspond to those database vectors. The key novelties of ISMBGT are that it does not require training, it is query sensitive, and it exploits the flexibility of SMBGT. We present an extensive experimental evaluation using synthetic and hummed queries on a large music database. Our findings show that ISMBGT can achieve speedups of up to an order of magnitude against brute-force search and over an order of magnitude against cDTW, while maintaining a retrieval accuracy very close to that of brute-force search.
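The filter-and-refine idea described above can be illustrated with a minimal sketch. This is not the authors' SMBGT or ISMBGT: as a stand-in for SMBGT it uses a plain tolerance-based longest-common-subsequence score (quadratic-time dynamic programming) without the gap and range constraints, and it embeds fixed-length database windows rather than every database position. The names `lcss_score`, `embed`, `filter_and_refine`, and the parameters `eps` and `k` are hypothetical and chosen only for illustration.

```python
def lcss_score(a, b, eps):
    """Longest common subsequence length of a and b, where two elements
    match if they differ by at most eps. Stand-in for SMBGT (assumption):
    no gap or range constraints are enforced here."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if abs(x - y) <= eps:
                cur.append(prev[j - 1] + 1)             # extend a match
            else:
                cur.append(max(prev[j], cur[j - 1]))    # skip one element
        prev = cur
    return prev[-1]


def embed(seq, references, eps):
    """Map a sequence to the vector of its scores against each reference."""
    return [lcss_score(seq, r, eps) for r in references]


def filter_and_refine(query, database, references, eps, k):
    """Return (score, start) of the best database window of len(query).

    Filter: rank windows by squared Euclidean distance between embeddings.
    Refine: run the quadratic-time score only on the k closest windows.
    In a real system the window embeddings would be computed offline."""
    w = len(query)
    qvec = embed(query, references, eps)
    candidates = []
    for start in range(len(database) - w + 1):
        wvec = embed(database[start:start + w], references, eps)
        dist = sum((p - q) ** 2 for p, q in zip(qvec, wvec))
        candidates.append((dist, start))
    candidates.sort()
    # Ties in score are broken in favour of the earliest start position.
    score, neg_start = max(
        (lcss_score(query, database[s:s + w], eps), -s)
        for _, s in candidates[:k]
    )
    return score, -neg_start


if __name__ == "__main__":
    # Toy pitch sequences (made-up values, not data from the paper).
    database = [60, 62, 64, 65, 67, 69, 71, 72, 60, 60, 60, 60]
    references = [[60, 62, 64], [67, 69, 71]]
    query = [64, 66, 67, 69]
    print(filter_and_refine(query, database, references, eps=1, k=3))  # (4, 2)
```

In this toy run only three of the nine candidate windows reach the refine step, and the window starting at position 2 matches all four query elements within the tolerance. How well such a filter preserves accuracy depends on the choice of reference sequences and on `k`, which the sketch leaves unspecified.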


Subsequence matching · Query-By-Humming · Indexing · Embeddings



The work of I. Karlsson and P. Papapetrou was supported in part by the project “High-Performance Data Mining for Drug Effect Detection”, funded by the Swedish Foundation for Strategic Research under grant IIS11-0053. The work of V. Athitsos was partially supported by National Science Foundation grants IIS-0812601, IIS-1055062, CNS-1059235, CNS-1035913, and CNS-1338118. Finally, the work of D. Gunopulos was partially supported by the FP7-ICT project INSIGHT and by the General Secretariat for Research and Technology ARISTEIA program project “MMD: Mining Mobility Data”.



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Alexios Kotsifakos (1)
  • Isak Karlsson (2)
  • Panagiotis Papapetrou (2)
  • Vassilis Athitsos (1)
  • Dimitrios Gunopulos (3)

  1. Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, USA
  2. Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden
  3. Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece
