Adaptive Join Operator for Federated Queries over Linked Data Endpoints

  • Damla Oguz
  • Shaoyi Yin
  • Abdelkader Hameurlain
  • Belgin Ergenc
  • Oguz Dikenelli
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9809)


Traditional static query optimization is not adequate for query federation over linked data endpoints because data arrival rates are unpredictable and statistics are missing. In this paper, we propose an adaptive join operator for federated query processing that can change the join method during execution. Our approach always begins with symmetric hash join in order to produce the first result tuple as soon as possible, and switches to bind join when it estimates that bind join is more efficient than symmetric hash join for the rest of the process. We compare our approach with symmetric hash join and bind join. Performance evaluation shows that our approach provides optimal response time and adapts to different data arrival rates.
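
The switching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the operator from the paper: the cost comparison, the parameters `left_size`, `right_size_estimate`, `stream_cost` and `bind_cost`, and the callback `right_lookup` (standing in for a bound query sent to the second endpoint) are all assumptions made for illustration, and set semantics are assumed for the inputs.

```python
from collections import defaultdict


def adaptive_join(arrivals, left_size, right_size_estimate, right_lookup,
                  stream_cost=1.0, bind_cost=3.0):
    """Join tuples arriving as (side, key, value), with side 'L' or 'R'.

    Starts as a symmetric hash join. Once the whole left input has arrived,
    it compares two HYPOTHETICAL cost estimates after every arrival and may
    finish the remaining work with a bind join.
    """
    tables = {"L": defaultdict(set), "R": defaultdict(set)}
    seen = {"L": 0, "R": 0}
    emitted = set()  # guards against re-emitting results after the switch

    for side, key, value in arrivals:
        other = "R" if side == "L" else "L"
        tables[side][key].add(value)              # build into own hash table
        seen[side] += 1
        for match in tables[other].get(key, ()):  # probe the other table
            result = (key, value, match) if side == "L" else (key, match, value)
            if result not in emitted:
                emitted.add(result)
                yield result
        if seen["L"] == left_size:
            # Assumed cost model: keep streaming the rest of the right input
            # versus sending one bound lookup per distinct left join key.
            remaining = max(right_size_estimate - seen["R"], 0)
            if len(tables["L"]) * bind_cost < remaining * stream_cost:
                break                             # switch to bind join
    else:
        return                                    # finished as a pure symmetric hash join

    # Bind join phase: fetch only the right tuples that match known left keys.
    for key, left_values in tables["L"].items():
        for right_value in right_lookup(key):
            for left_value in left_values:
                result = (key, left_value, right_value)
                if result not in emitted:
                    emitted.add(result)
                    yield result


# Usage with toy data: the right source is much larger than the left one.
right_data = {"a": ["r1", "r2"], "b": ["r3"], "c": ["r4"]}
stream = [("L", "a", "l1"), ("R", "a", "r1"), ("L", "b", "l2"),
          ("R", "c", "r4"), ("R", "a", "r2"), ("R", "b", "r3")]
results = list(adaptive_join(stream, left_size=2, right_size_estimate=100,
                             right_lookup=lambda k: right_data.get(k, [])))
print(sorted(results))
```

In this sketch the first result is produced by the symmetric hash join phase as soon as a matching pair has arrived, and the decision to switch is re-evaluated after each arrival, so a slow or large right input makes the bind join more attractive over time.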


Keywords: Distributed query processing · Linked data · Query federation · Join methods · Adaptive query optimization



This work is partially supported by The Scientific and Technological Research Council of Turkey (TUBITAK).



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Damla Oguz (1, 2, 3)
  • Shaoyi Yin (2)
  • Abdelkader Hameurlain (2)
  • Belgin Ergenc (1)
  • Oguz Dikenelli (3)
  1. Department of Computer Engineering, Izmir Institute of Technology, Izmir, Turkey
  2. IRIT Laboratory, Paul Sabatier University, Toulouse, France
  3. Department of Computer Engineering, Ege University, Izmir, Turkey
