Abstract
In this paper we report on our experience of using Database Supported Haskell (DSH) to analyse the entire Wikipedia history. DSH is a novel high-level database query facility that allows queries on nested and ordered collections of data to be formulated and executed efficiently. DSH grew out of a research project on integrating database querying capabilities into high-level, general-purpose programming languages. Querying facilities embedded in general-purpose programming languages are an emerging trend: they are gradually replacing lower-level database languages such as SQL as the preferred means of querying large-scale database-resident data. We relate this new approach to the current practice, which integrates database queries into analysts’ workflows in a rather ad hoc fashion. This paper should interest early technology adopters looking for new database query languages and practitioners working on large-scale data analysis.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Giorgidze, G., Grust, T., Halatchliyski, I., Kummer, M. (2013). Analysing the Entire Wikipedia History with Database Supported Haskell. In: Sagonas, K. (eds) Practical Aspects of Declarative Languages. PADL 2013. Lecture Notes in Computer Science, vol 7752. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45284-0_2
DOI: https://doi.org/10.1007/978-3-642-45284-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45283-3
Online ISBN: 978-3-642-45284-0
eBook Packages: Computer Science (R0)