Abstract
Database technology remains underused in science, especially in the long tail — the small labs and individual researchers that collectively produce the majority of scientific output. These researchers increasingly require iterative, ad hoc analysis over ad hoc databases but cannot individually invest in the computational and intellectual infrastructure required for state-of-the-art solutions.
We describe a new “delivery vector” for database technology called SQLShare that emphasizes ad hoc integration, query, sharing, and visualization over pre-defined schemas. To empower non-experts to write complex queries, we synthesize example queries from the data itself and explore limited English hints to augment the process. We integrate collaborative visualization via a web-based service called VizDeck that uses automated visualization techniques with a card game metaphor to allow creation of interactive visual dashboards in seconds with zero programming.
We present data on the initial uptake and usage of the system and report preliminary results from testing new features on the datasets collected during the initial pilot deployment. We conclude that the SQLShare system and associated services have the potential to increase uptake of relational database technology in the long tail of science.
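The idea of synthesizing example queries from the data itself can be illustrated with a minimal sketch. This is not the SQLShare implementation: it uses SQLite from the Python standard library as a stand-in backend, and the function name `synthesize_example_queries`, the `samples` table, and its columns are hypothetical names chosen for illustration. The sketch inspects each column and picks its most frequent value as a realistic literal for a starter query.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def synthesize_example_queries(conn, table):
    """Derive a few starter queries from a table's columns and values.

    Hypothetical helper for illustration only; it assumes a SQLite
    connection and makes no claim about how SQLShare does this.
    """
    cur = conn.cursor()
    t = quote_ident(table)
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in cur.execute(f"PRAGMA table_info({t})")]
    queries = [f"SELECT * FROM {t} LIMIT 10"]
    for name in columns:
        c = quote_ident(name)
        # Use the most frequent non-null value as a realistic literal.
        row = cur.execute(
            f"SELECT {c}, COUNT(*) AS n FROM {t} WHERE {c} IS NOT NULL "
            f"GROUP BY {c} ORDER BY n DESC, {c} LIMIT 1"
        ).fetchone()
        if row is None:
            continue  # column is empty or entirely NULL
        value = row[0]
        if isinstance(value, (int, float)):
            # Numeric column: suggest an aggregate and a range filter.
            queries.append(f"SELECT AVG({c}) FROM {t}")
            queries.append(f"SELECT * FROM {t} WHERE {c} >= {value}")
        else:
            # Text-like column: suggest an equality filter and a grouping.
            literal = str(value).replace("'", "''")
            queries.append(f"SELECT * FROM {t} WHERE {c} = '{literal}'")
            queries.append(f"SELECT {c}, COUNT(*) FROM {t} GROUP BY {c}")
    return queries


# Usage with a small, made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (station TEXT, depth_m REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [("P1", 5.0), ("P1", 10.0), ("P4", 5.0)],
)
examples = synthesize_example_queries(conn, "samples")
for query in examples:
    print(query)
```

Because the literals come from the uploaded data, every suggested query returns rows when run unchanged, which gives a non-expert a working starting point to edit rather than a blank query box.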
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Howe, B. et al. (2011). Database-as-a-Service for Long-Tail Science. In: Bayard Cushing, J., French, J., Bowers, S. (eds) Scientific and Statistical Database Management. SSDBM 2011. Lecture Notes in Computer Science, vol 6809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22351-8_31
DOI: https://doi.org/10.1007/978-3-642-22351-8_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22350-1
Online ISBN: 978-3-642-22351-8
eBook Packages: Computer Science (R0)