COACT: a query interface language for collaborative databases

Abstract

Data curation activities in collaborative databases mandate that collaborators interact until they converge and agree on the content of their data. In a previous work, we presented a cloud-based collaborative database system that promotes and enables collaboration and data curation scenarios. Our system classifies different versions of a data item to either pending, approved, or rejected. The approval or rejection of a certain version is done by the database Principle Investigators (or PIs) based on its value. Our system also allows collaborators to view the status of each version and help PIs take decisions by providing feedback based on their experiments and/or opinions. Most importantly, our system provided mechanisms for history tracking of different versions to trace the modifications and approval/rejection done by both collaborators and PIs on different versions of a data item. We labeled our system as Update-Pending-Approval model (or UPA). In this paper, we describe a high level SQL query interface language for PIs and collaborators to interact with the UPA framework. We define a set of UPA keywords that are used as a part of the history tracking mechanism to select specific versions of a data item, and a set of UPA options that select specific versions based on possible future decisions of PIs. We implemented our query interface mechanism on top of Apache Phoenix, taking into consideration that the UPA system was implemented on top of Apache HBase. We test the performance of the UPA query language by executing several queries that contain different complexity levels and discuss their results.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9

References

  1. 1.

    Fagin, R.: On an authorization mechanism. ACM Trans. Database Syst. 3(3), 310–319 (1978)

    MathSciNet  Article  Google Scholar 

  2. 2.

    Griffiths, P.P., Wade, B.W.: An authorization mechanism for a relational database system. ACM TODS 1(3), 242–255 (1976)

    Article  Google Scholar 

  3. 3.

    Mershad, K., Malluhi, Q., Quzzani, M, Tang, M., Aref, A.: Approving updates in collaborative databases. In: Proceedings of the 3rd IEEE International Conference on Cloud Engineering. IC2E 15, March (2015)

  4. 4.

    Mershad, K., Malluhi, Q., Quzzani, M, Tang, M., Gribskov, M., Aref, A.: AUDIT: Approving and Tracking Updates with Dependencies in Collaborative Databases. Distributed and Parallel Databases. Springer, Berlin. https://doi.org/10.1007/s10619-017-7208-y (in press)

  5. 5.

    Apache HBase. [Online]. https://hbase.apache.org/

  6. 6.

    Rose, E., Segev, A.: TooSQL-a temporal object-oriented query language. In: Entity-Relationship Approach ER’93, pp. 122–136. Springer, Berlin (1994)

  7. 7.

    Snodgrass, R.: An overview of the temporal query language TQuel. University of Arizona, Department of Computer Science (1992)

  8. 8.

    Snodgrass, R.T.: The TSQL2 Temporal Query Language, vol. 330. Springer Science & Business Media, New York (2012)

    MATH  Google Scholar 

  9. 9.

    Jensen, C.S., Lomet, D.B.: Transaction timestamping in (temporal) databases. In: VLDB, 2001, pp. 441–450

  10. 10.

    Lomet, D., Barga, R., Mokbel, M.F., Shegalov, G., Wang, R., Zhu, Y.: Immortal DB: transaction time support for SQL server. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data. ACM, 2005, pp. 939–941 (2005 )

  11. 11.

    Lomet, D., Hong, M., Nehme, R., Zhang, R.: Transaction time indexing with version compression. Proc. VLDB Endow. 1(1), 870–881 (2008)

    Article  Google Scholar 

  12. 12.

    Abdessalem, T., Jomier, G.: VQL: a query language for multiversion databases. In: Database Programming Languages. Springer, Berlin, pp. 160–179 (1998)

  13. 13.

    Proll, S., Rauber, A.: Scalable data citation in dynamic, large databases: model and reference implementation. In: Big Data, 2013 IEEE International Conference on. IEEE, pp. 307–312 (2013)

  14. 14.

    Meimaris, M., Papastefanatos, G., Viglas, S., Stavrakas, Y., Pateritsas, C.: A query language for multi-version data web archives (2015). arXiv:1504.01891

  15. 15.

    SPARQL Query Language for RDF. [Online]. http://www.w3.org/TR/rdf-sparql-query/

  16. 16.

    Cypher—the Neo4j query Language. [Online]. http://www.neo4j.org/learn/cypher

  17. 17.

    He, H., Singh, A.K.: Graphs-at-a-time: query language and access methods for graph databases. In: Management of data, 2008 ACM SIGMOD International Conference on. ACM, pp. 405–418 (2008)

  18. 18.

    Hong, S., Chafi, H., Sedlar, E., Olukotun, K.: Green-Marl: a DSL for easy and efficient graph analysis. In: Proceedings of the 17\({{\rm th}}\) International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, pp. 349–362 (2012)

  19. 19.

    Jindal, A., Madden, S.: Graphiql: a graph intuitive query language for relational databases. In: Big Data, 2014 IEEE International Conference on IEEE, pp. 441–450 (2014)

  20. 20.

    Tinkerpop, Gremlin. [Online]. https://github.com/tinkerpop/gremlin/wiki

  21. 21.

    Apache hadoop. [Online]. http://hadoop.apache.org/

  22. 22.

    Hadoop distributed file system. [Online]. http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html

  23. 23.

    Apache hive. [Online]. https://hive.apache.org/

  24. 24.

    Cloudera impala. [Online]. http://impala.io/

  25. 25.

    Apache phoenix. [Online]. http://phoenix.apache.org/

  26. 26.

    Wikimedia downloads. [Online]. https://dumps.wikimedia.org/

  27. 27.

    How to calculate the record size of HBase? [Online]. http://prafull-blog.blogspot.com/2012/06/how-to-calculate-record-size-of-hbase.html/

Download references

Author information

Affiliations

Authors

Corresponding author

Correspondence to Khaleel Mershad.

Additional information

This publication was made possible by the support of an NPRP Grant 4-1534-1-247 from the the Qatar National Research Fund (a member of Qatar Foundation) and the National Science Foundation under Grants IIS-1117766 and IIS-0964639. The statements made herein are solely the responsibility of the authors.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Mershad, K., Malluhi, Q.M., Ouzzani, M. et al. COACT: a query interface language for collaborative databases. Distrib Parallel Databases 36, 121–151 (2018). https://doi.org/10.1007/s10619-017-7213-1

Download citation

Keywords

  • Collaborative databases
  • Update authorization
  • SQL
  • Query options
  • True values
  • False positives and negatives