Abstract
DBpedia Live provides access to structured data extracted from Wikipedia in real time. A data stream generated from Wikipedia changes is loaded into the DBpedia RDF store as the changes occur. Applications can benefit by subscribing to the RDF update stream and receiving continuous results from DBpedia. Providing a continuous stream of updates for subscribed DBpedia queries is challenging because of the load it places on the RDF store.
In this paper, we propose an optimization approach for processing subscriptions to DBpedia Live. By monitoring the change data stream, query processing can be optimized to avoid the unnecessary load caused by continuous database polling. A query is re-processed only when the system detects a relation between an incoming change and that query, which then triggers its processing. We evaluated our approach using a recorded history of the DBpedia change stream, with the most frequent DBpedia SPARQL queries obtained from the query logs as subscriptions. A comparison with interval-based database polling shows a significant reduction in processing costs.
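The idea of re-processing a query only when an incoming change is related to it can be illustrated with a minimal sketch. This is not the matching method used in the paper, which the abstract does not specify; it assumes that each subscribed query is reduced to a set of triple patterns and that a change is "related" when a changed triple matches at least one pattern. The names `Triple`, `Pattern`, `matches`, and `affected_queries`, as well as the example subscriptions and the population value, are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class Triple(NamedTuple):
    """One changed RDF triple from the update stream (added or removed)."""
    s: str
    p: str
    o: str


class Pattern(NamedTuple):
    """One triple pattern of a subscribed query; None stands for a variable."""
    s: Optional[str]
    p: Optional[str]
    o: Optional[str]


def matches(pattern: Pattern, triple: Triple) -> bool:
    """A pattern matches when every non-variable term equals the triple's term."""
    return all(term is None or term == value
               for term, value in zip(pattern, triple))


def affected_queries(changes: Iterable[Triple],
                     subscriptions: Dict[str, List[Pattern]]) -> List[str]:
    """Return the names of subscriptions that should be re-evaluated."""
    changes = list(changes)
    return [name for name, patterns in subscriptions.items()
            if any(matches(pat, t) for pat in patterns for t in changes)]


# Hypothetical subscriptions, each reduced to its triple patterns.
subscriptions = {
    "berlin_population": [Pattern("dbr:Berlin", "dbo:populationTotal", None)],
    "all_birthplaces": [Pattern(None, "dbo:birthPlace", None)],
}

# Hypothetical changeset containing a single updated triple.
changes = [Triple("dbr:Berlin", "dbo:populationTotal", "3500000")]

print(affected_queries(changes, subscriptions))
```

With interval-based polling, both queries would be re-run on every tick; in this sketch only the subscription whose pattern matches the changed triple is selected for re-evaluation.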
Notes
- 1.
- 2. Linked Open Data http://linkeddata.org/.
- 3.
- 4. A triple store is a special kind of database system for storage of RDF triple data.
- 5.
- 6. RDF Stream Processing Community Group http://www.w3.org/community/rsp/.
- 7. Linked Open Data http://linkeddata.org/.
- 8. https://code.google.com/p/cqels/wiki/CQELS_language retrieved April 2014.
- 9. DBpedia Live Website http://live.dbpedia.org/changesets/.
- 10. Some special characters were included before “http://” of RDF resources that we had to remove.
- 11. DBpedia query logs downloaded from ftp://download.openlinksw.com/support/dbpedia/.
- 12.
Acknowledgements
This work has been partially supported by the “InnoProfile-Transfer Corporate Smart Content” project funded by the German Federal Ministry of Education and Research (BMBF) and the BMBF Innovation Initiative for the New German Länder - Entrepreneurial Regions.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Teymourian, K., Todor, A., Łukasiewicz, W., Paschke, A. (2015). Optimized Processing of Subscriptions to DBpedia Live. In: Abramowicz, W. (eds) Business Information Systems Workshops. BIS 2015. Lecture Notes in Business Information Processing, vol 228. Springer, Cham. https://doi.org/10.1007/978-3-319-26762-3_26
DOI: https://doi.org/10.1007/978-3-319-26762-3_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26761-6
Online ISBN: 978-3-319-26762-3
eBook Packages: Computer Science, Computer Science (R0)