RSPLab: RDF Stream Processing Benchmarking Made Easy
In Stream Reasoning (SR), empirical research on RDF Stream Processing (RSP) is attracting growing attention. The SR community proposed methodologies and benchmarks to investigate the RSP solution space and improve existing approaches. In this paper, we present RSPLab, an infrastructure that reduces the effort required to design and execute reproducible experiments, as well as to share their results. RSPLab integrates two existing RSP benchmarks (LSBench and CityBench) and two RSP engines (C-SPARQL engine and CQELS). It provides a programmatic environment to deploy RDF streams and RSP engines in the cloud, interact with them using TripleWave and RSP Services, continuously monitor their performance, and collect statistics. RSPLab is released as open source under the Apache 2.0 license.
Keywords: Semantic Web · Stream Reasoning · RDF Stream Processing · Benchmarking
In recent years, research on the Semantic Web and streaming data – Stream Reasoning (SR) – has grown steadily. The community has been investigating foundational research on algorithms for RDF Stream Processing (RSP), applied research with system architectures [3, 10] and, recently, empirical research on benchmarks [1, 5, 8, 11, 15] and evaluation methodologies [12, 14, 17].
Focusing on the latter two, the state of the art comprises RSP engine prototypes [3, 10] and benchmarks that address the different challenges the community investigated: query language expressive power, performance, correctness of results [5, 8], memory load and latency [1, 8]. This heterogeneity of benchmarks helps to explore the solution space, but hinders the systematic evaluation of RSP engines. Therefore, prior work proposed a requirement analysis for benchmarks and ranked existing benchmarks accordingly, as well as a framework for systematic and comparative RSP research. Despite these community efforts, the evaluation of RSP engines is still not systematic.
In this paper, we propose RSPLab, a cloud-ready open-source test driver to support empirical research for SR/RSP. RSPLab offers a programmatic environment to design and execute experiments. It uses Linked Data principles to publish RDF streams and a set of REST APIs to interact with RSP engines.
RSPLab continuously monitors the memory consumption and CPU load of the deployed RSP engines and persists the measurements in a time-series database. It allows users to estimate result correctness and maximum throughput post-hoc by collecting query results on reliable file storage. RSPLab provides real-time assisted data visualization by means of a dashboard. Finally, it allows users to publish experimental reports as linked data.
In this section, we present the requirements for an RSP test driver, describe the test driver architecture, and explain how RSPLab currently implements it.
(R.1) Benchmarks Independence. RSPLab must allow its users to integrate any benchmark, i.e. ontologies, streams, dataset and queries.
(R.2) Engine Independence. RSPLab must be agnostic to the RSP engine under test and must not be bound to any specific query language (QL).
(R.3) Minimal yet Extensible KPI Set. According to the state of the art [1, 14, 17], the KPI set must include at least query result correctness and throughput. However, the KPI set must be extensible to include KPIs that are measurable in specific implementations and deployments.
(R.4) Continuous Monitoring. RSPLab must enable the observation of the RSP engine's dynamics throughout the whole experiment execution.
(R.5) Error Minimization. RSPLab must minimize the experimental error, isolating each module to avoid resource contention.
(R.6) Ease of Deployment. RSPLab must be easy to deploy and it must simplify the deployment of the experiment modules, e.g. streams and engines.
(R.7) Ease of Execution. RSPLab must simplify access to the available resources, e.g. reusing existing benchmarks, and the execution of experiments.
(R.8) Repeatability. RSPLab must guarantee experiment repeatability under the specific settings.
(R.9) Data Analysis. RSPLab must render simple data analyses about the collected statistics and allow its users to perform custom ones.
(R.10) Data Publishing. RSPLab must simplify the publication of performance statistics, query results and experiment designs using Linked Data principles.
Architecture. Figure 1 presents the RSPLab architecture, which comprises four independent tiers: Streamer, Consumer, Collector and Controller. For each tier, it shows the logical submodules, e.g. a time-series database in the Collector, and it refers to the technologies involved in the current implementation, e.g. InfluxDB.
The Streamer, the data provisioning tier, publishes RDF streams from existing benchmarks (R.1). The Streamer can stream any (virtual) RDF dataset that has a temporal dimension. Published RDF streams are accessible from the web.
The Collector, the monitoring tier, comprises two submodules: (1) a monitoring system that, during the execution of experiments, continuously measures the performance statistics of any deployed module (R.4); (2) a time-series database to save the statistics and a persistent storage to save the query results (R.3).
The Controller, the control and analysis tier, allows the RSPLab user to control the other tiers. It allows users to design and execute experiments programmatically (R.7). It enables the verification of the results (R.8) through an assisted and customizable real-time data-analysis dashboard (R.9).
Streamer. This tier is implemented using a modified version of TripleWave that includes methods to register and start streams remotely. It includes synthetic RDF data from LSBench: we used the included data generator and loaded the generated data into a SPARQL endpoint to stream with TripleWave. It also includes data from CityBench: we exploited R2RML mappings to convert CSV data into RDF on demand. This tier is not limited to these benchmarks; streams from others can be added following TripleWave principles.
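To make the data-provisioning idea concrete, the sketch below assembles a TripleWave-style stream element: a timestamped RDF graph serialized as JSON-LD. This is a hedged illustration; the property names and context are illustrative, not the exact TripleWave vocabulary.

```python
import json
from datetime import datetime, timezone

def make_stream_item(subject, predicate, obj, generated_at=None):
    # Each stream element is a timestamped RDF graph serialized as JSON-LD;
    # the PROV property used for the timestamp is an illustrative choice.
    generated_at = generated_at or datetime.now(timezone.utc).isoformat()
    return {
        "@context": {"prov": "http://www.w3.org/ns/prov#"},
        "prov:generatedAtTime": generated_at,
        "@graph": [{"@id": subject, predicate: {"@id": obj}}],
    }

item = make_stream_item(
    "http://example.org/sensor/1",
    "http://example.org/observes",
    "http://example.org/road/42",
    generated_at="2017-05-01T10:00:00Z",
)
payload = json.dumps(item)  # what the Streamer would publish on the web
```

A stream is then just a sequence of such elements, each retrievable via HTTP, which is what makes the published RDF streams accessible from the web.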
Consumer. This tier uses the RSP Services, i.e. a set of REST methods that abstract from the RSP engine's query language syntax and semantics. The RSP Services generalize the processing model, enabling stream registration, query registration and result consumption. This tier includes, but is not limited to, the CQELS and C-SPARQL engines. Using the RSP Services, new RSP engines can be added to RSPLab.
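The engine-independence idea can be sketched as a thin client that only builds REST calls, leaving the query body opaque. The endpoint paths and payload keys below are assumptions for illustration, not the exact RSP Services routes.

```python
import json
from urllib.parse import urljoin

class RSPServicesClient:
    """Builds REST calls for an RSP Services endpoint.

    Endpoint paths and payload keys are illustrative assumptions,
    not the actual RSP Services API.
    """

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/") + "/"

    def register_stream_request(self, stream_id, source_url):
        # PUT a stream descriptor; the engine then subscribes to source_url.
        url = urljoin(self.base_url, f"streams/{stream_id}")
        return ("PUT", url, json.dumps({"streamIri": source_url}))

    def register_query_request(self, query_id, query_body):
        # POST the continuous query in the engine's own query language,
        # so the driver itself stays agnostic to the QL (R.2).
        url = urljoin(self.base_url, f"queries/{query_id}")
        return ("POST", url, query_body)

client = RSPServicesClient("http://localhost:8175")
method, url, body = client.register_stream_request(
    "AarhusTraffic", "http://streamer:4000/traffic")
```

Because the query body travels through untouched, the same client works for C-SPARQL, CQELS, or any engine wrapped by the RSP Services.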
Collector. This tier includes: (1) a distributed continuous monitoring system, called cAdvisor, that collects statistics about memory consumption and CPU load every 100 ms (R.3) for Docker containers; we target those running RSP engines, but any of RSPLab's components can be observed. (2) A time-series database, called InfluxDB, where we write the collected statistics. (3) A Python daemon, called RSPSink, that persists query results on a cloud file system (e.g., Amazon S3 or Azure Blob Storage), allowing users to verify correctness and estimate the engine's maximum throughput post-hoc.
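As a sketch of how the collected statistics can be pulled back out for post-hoc analysis, the snippet below builds an InfluxDB HTTP-API query URL over the per-container measurements. The measurement and tag names are assumptions about the cAdvisor schema, not the exact field names it writes.

```python
from urllib.parse import urlencode

def influx_query_url(host, database, container, minutes=10):
    # InfluxQL over the statistics written for one Docker container;
    # "cpu_usage_total" and "container_name" are assumed names.
    q = (
        'SELECT MEAN("value") FROM "cpu_usage_total" '
        f"WHERE \"container_name\" = '{container}' "
        f"AND time > now() - {minutes}m GROUP BY time(1s)"
    )
    return f"http://{host}:8086/query?" + urlencode({"db": database, "q": q})

url = influx_query_url("collector", "cadvisor", "csparql-engine")
# An HTTP GET on this URL would return the 1-second CPU averages
# for the C-SPARQL engine container over the last 10 minutes.
```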
Controller. This tier is implemented using iPython Notebooks. We developed an ad-hoc Python library that allows interaction with the whole environment. It includes wrappers for the RSP Services, the TripleWave APIs and the sinks. Thanks to these programmatic APIs, the RSPLab user can run TripleWave and RSP engine instances, execute experiments over them and analyze the results programmatically (R.7). Moreover, via Grafana, it provides an assisted data-visualization dashboard that reads data from InfluxDB, enabling real-time monitoring (R.9). Last but not least, the included library automatically generates experiment reports using the VoID vocabulary (R.10).
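A minimal sketch of such an automated report, emitting a few VoID triples in N-Triples syntax with the standard library only; the experiment IRI, title, and choice of properties are illustrative, not the library's actual output.

```python
def void_report(dataset_iri, title, triples_streamed):
    # Describe an experiment's dataset with VoID; the properties used
    # here (rdf:type, dct:title, void:triples) are an illustrative subset.
    rdf_type = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
    void = "http://rdfs.org/ns/void#"
    dct = "http://purl.org/dc/terms/"
    xsd_int = "http://www.w3.org/2001/XMLSchema#integer"
    return "\n".join([
        f"<{dataset_iri}> <{rdf_type}> <{void}Dataset> .",
        f'<{dataset_iri}> <{dct}title> "{title}" .',
        f'<{dataset_iri}> <{void}triples> "{triples_streamed}"^^<{xsd_int}> .',
    ])

report = void_report("http://example.org/experiment/1",
                     "C-SPARQL on the Aarhus Traffic stream", 182955)
```

Publishing such descriptions as linked data is what makes experiment reports discoverable and comparable across runs (R.10).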
[Fig. 2. The running experiment: Memory & CPU. Aarhus Traffic Data: 182955 and 158505.]
3 RSPLab In-Use
In this section, we show how to design and execute experiments and how to publish the results as linked data using RSPLab.
Experiment Execution. In RSP, the experimental workflow has a warm-up phase followed by an observation phase, because most transient behaviors occur during the engine warm-up and should not bias the performance measures [1, 11, 12].
Warm-Up. In this phase, RSPLab deploys the engine and the RDF streams. It registers the streams, the queries and the observers on the RSP engine subject of the evaluation. It sets up the sinks to persist the query results. By observing the engine's dynamics through the assisted dashboard (Grafana), it is possible to determine when the RSP engine is steady. Listing 1.2, lines 1 to 14, shows how this phase looks in RSPLab. Figure 2 shows how this phase impacts the system dynamics, approximately until 15:16.
Observe. In this phase, which usually has a fixed duration, the RSP engine is stable: it consumes the streams and answers the queries. The results and the performance statistics are persisted. When the time expires, everything is shut down. Listing 1.2, lines 15 to 24, shows how this phase looks in RSPLab. RSPLab also makes it possible to define more complex workflows that simulate real scenarios, e.g. adding/removing queries or tuning stream rates while observing the engine's response.
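The two-phase control flow can be sketched against a stand-in engine interface; the names below are illustrative, not the actual rsplib API or the contents of Listing 1.2.

```python
import time

class StubEngine:
    """Stand-in for an RSP Services wrapper; records calls to show control flow."""
    def __init__(self):
        self.log = []
    def register_stream(self, stream):
        self.log.append(("stream", stream))
    def register_query(self, query):
        self.log.append(("query", query))
    def register_observer(self, query, sink):
        self.log.append(("observer", query, sink))
    def unregister_all(self):
        self.log.append(("teardown",))

def run_experiment(engine, streams, queries, sink, observe_seconds=0):
    # Warm-up: deploy streams and queries, attach the result sinks,
    # then wait for the engine to reach a steady state.
    for stream in streams:
        engine.register_stream(stream)
    for query in queries:
        engine.register_query(query)
        engine.register_observer(query, sink)
    # Observe: fixed-duration window in which results and statistics
    # are persisted by the Collector.
    time.sleep(observe_seconds)
    # Teardown: shut everything down so the next run starts clean.
    engine.unregister_all()
    return engine.log

log = run_experiment(StubEngine(), ["AarhusTraffic"], ["Q1"], "s3://results")
```

The teardown step is what makes runs repeatable (R.8): each experiment starts from the same clean environment.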
4 Related Work
In this section, we compare RSPLab with existing research solutions from SR/RSP, Linked Data, and databases.
LSBench and CityBench [1, 11] propose two test drivers that push RDF streams to the RSP engine subject of the evaluation. Differently from RSPLab, they are not benchmark-independent (R.1): the test drivers are designed to work with the benchmark queries and to stream the benchmark data, and they do not guarantee error minimization by means of module isolation (R.5).
Heaven includes a test-bed proof of concept with an architecture similar to RSPLab's. However, Heaven does not include a programmatic environment that simplifies experiment execution (R.7), is not engine-independent (R.2), and its scope is limited to window-based, single-thread RSP engines. Like RSPLab, Heaven treats RSP engines as black boxes, but communication happens through a Java facade rather than a RESTful interface; therefore, Heaven constrains the RSP engine's processing model. It enables the analysis of performance dynamics, but it offers neither assisted data visualization (R.9) nor automated reporting (R.10).
LOD Lab aims at reducing the human cost of approach evaluation. It also supports data cleaning and simplifies dataset selection using metadata. However, RDF streams and RSP engine testing are not in its scope: LOD Lab does not offer a continuous monitoring system, but only addresses the problem of data provisioning. It provides a command-line interface to interact with it (R.6), but not a programmatic environment to control the experimental workflow (R.7).
OLTP-Bench is a universal benchmarking infrastructure for relational databases. Similarly to RSPLab, it supports deployment in a distributed environment (R.6) and it comes with assisted statistics visualization (R.9). However, it does not offer a programmatic environment to interact with the platform, execute experiments (R.7) and publish reports (R.10). OLTP-Bench includes a workload manager, but does not consider RDF streams. Moreover, it provides an SQL-dialect translation module, which is flexible enough in the SQL area but not in the SR/RSP one (R.2).
This paper presented RSPLab, a test driver for SR/RSP engines that can be deployed on the cloud. RSPLab integrates two existing RSP benchmarks (LSBench and CityBench) and two existing RSP engines (C-SPARQL engine and CQELS). We showed that it enables the design of experiments by means of a programmatic interface that allows deploying the environment, running experiments, measuring the performance, visualizing the results as reports, and cleaning up the environment to get ready for a new experiment.
Future work on RSPLab comprises: (i) the integration of all the existing RSP benchmark datasets and queries, i.e. SRBench and YABench; (ii) the integration of CSRBench's and YABench's oracles for correctness checking; (iii) the execution of existing benchmark experiments at scale and systematically; and, last but not least, (iv) the extension of the RSPLab APIs towards an RSP Library.
- 2. Balduini, M., Della Valle, E.: A RESTful interface for RDF stream processors. In: Proceedings of the ISWC 2013 Posters and Demonstrations Track, Sydney, pp. 209–212 (2013)
- 4. Boettiger, C.: An introduction to Docker for reproducible research. Oper. Syst. Rev. 49(1), 71–79 (2015). http://doi.acm.org/10.1145/2723872.2723882
- 5. Dell'Aglio, D., Calbimonte, J.-P., Balduini, M., Corcho, O., Della Valle, E.: On correctness in RDF stream processor benchmarking. In: Alani, H., et al. (eds.) ISWC 2013. LNCS, vol. 8219, pp. 326–342. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41338-4_21
- 7. Difallah, D.E., Pavlo, A., Curino, C., Cudré-Mauroux, P.: OLTP-Bench: an extensible testbed for benchmarking relational databases. PVLDB 7(4), 277–288 (2013)
- 8. Kolchin, M., Wetz, P., Kiesling, E., Tjoa, A.M.: YABench: a comprehensive framework for RDF stream processor correctness and performance assessment. In: Bozzon, A., Cudre-Maroux, P., Pautasso, C. (eds.) ICWE 2016. LNCS, vol. 9671, pp. 280–298. Springer, Cham (2016). doi:10.1007/978-3-319-38791-8_16
- 9. Mauri, A., Calbimonte, J.-P., Dell'Aglio, D., Balduini, M., Brambilla, M., Della Valle, E., Aberer, K.: TripleWave: spreading RDF streams on the web. In: Groth, P., et al. (eds.) ISWC 2016. LNCS, vol. 9982, pp. 140–149. Springer, Cham (2016). doi:10.1007/978-3-319-46547-0_15
- 10. Le-Phuoc, D., Dao-Tran, M., Xavier Parreira, J., Hauswirth, M.: A native and adaptive approach for unified processing of linked streams and linked data. In: Aroyo, L., et al. (eds.) ISWC 2011. LNCS, vol. 7031, pp. 370–388. Springer, Heidelberg (2011). doi:10.1007/978-3-642-25073-6_24
- 11. Le-Phuoc, D., Dao-Tran, M., Pham, M.-D., Boncz, P., Eiter, T., Fink, M.: Linked stream data processing engines: facts and figures. In: Cudré-Mauroux, P., et al. (eds.) ISWC 2012. LNCS, vol. 7650, pp. 300–312. Springer, Heidelberg (2012). doi:10.1007/978-3-642-35173-0_20
- 12. Ren, X., Khrouf, H., Kazi-Aoul, Z., Chabchoub, Y., Curé, O.: On measuring performances of C-SPARQL and CQELS. CoRR abs/1611.08269 (2016)
- 13. Rietveld, L., Beek, W., Schlobach, S.: LOD lab: experiments at LOD scale. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9367, pp. 339–355. Springer, Cham (2015). doi:10.1007/978-3-319-25010-6_23
- 14. Scharrenbach, T., Urbani, J., Margara, A., Della Valle, E., Bernstein, A.: Seven commandments for benchmarking semantic flow processing systems. In: Cimiano, P., et al. (eds.) ESWC 2013. LNCS, vol. 7882, pp. 305–319. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38288-8_21
- 15. Stupar, A., Michel, S.: SRbench – a benchmark for soundtrack recommendation systems. In: 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013, San Francisco, pp. 2285–2290 (2013)
- 16. Tommasini, R.: streamreasoning/rsplib: rsplib beta v0.2.4. https://doi.org/10.5281/zenodo.579659
- 17. Tommasini, R., Della Valle, E., Balduini, M., Dell'Aglio, D.: Heaven: a framework for systematic comparative research approach for RSP engines. In: Sack, H., et al. (eds.) ESWC 2016. LNCS, vol. 9678, pp. 250–265. Springer, Cham (2016). doi:10.1007/978-3-319-34129-3_16
- 18. Tommasini, R., Mauri, A.: streamreasoning/rsplab: RSPLab v0.9. https://doi.org/10.5281/zenodo.572320