Predicting SPARQL Query Performance

  • Rakebul Hasan
  • Fabien Gandon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8798)


We address the problem of predicting SPARQL query performance, using machine learning techniques to learn it from previously executed queries. We show how to model SPARQL queries as feature vectors, and use k-nearest neighbors regression and support vector regression (nu-SVR) to accurately predict SPARQL query execution time (\(R^2\) value of 0.98526).
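The pipeline described in the abstract (turn each query into a feature vector, regress execution time from previously executed queries, score with \(R^2\)) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the keyword-count features, the function names `query_features`, `knn_predict` and `r_squared`, and the choice of Euclidean distance are all assumptions made for the example, and only the k-nearest neighbors half is shown.

```python
import math
import re

# Hypothetical feature set (an assumption, not the paper's): counts of a few
# SPARQL keywords, plus the number of variable occurrences in the query.
KEYWORDS = ("OPTIONAL", "UNION", "FILTER", "DISTINCT", "ORDER BY", "LIMIT")


def query_features(query):
    """Map a SPARQL query string to a fixed-length numeric feature vector."""
    text = query.upper()
    features = [
        float(len(re.findall(r"\b" + re.escape(keyword) + r"\b", text)))
        for keyword in KEYWORDS
    ]
    # Last feature: how many times any variable (?name) appears.
    features.append(float(len(re.findall(r"\?\w+", query))))
    return features


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, x, k=3):
    """Predict a target as the mean target of the k nearest training vectors."""
    by_distance = sorted(
        (euclidean(tx, x), ty) for tx, ty in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(ty for _, ty in nearest) / len(nearest)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In use, `train_x` would hold the feature vectors of previously executed queries and `train_y` their measured execution times; a new query is predicted with `knn_predict(train_x, train_y, query_features(new_query))`, and `r_squared` compares predictions against measured times on held-out queries.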



This work is supported by the ANR CONTINT program under the Kolflow project (ANR-2010-CORD-021-02).



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. INRIA Sophia Antipolis, Wimmics, Sophia-Antipolis Cedex, France
