An Ecosystem for Linked Humanities Data

  • Rinke Hoekstra
  • Albert Meroño-Peñuela
  • Kathrin Dentler
  • Auke Rijpma
  • Richard Zijdeman
  • Ivo Zandhuis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9989)

Abstract

The main promise of the digital humanities is the ability to perform scholarly studies at a much broader scale, and in a much more reusable fashion. The key enabler for such studies is the availability of sufficiently well-described data. In the field of socio-economic history, data usually comes in tabular form. Existing efforts to curate and publish datasets take a top-down approach and focus on large collections. This paper presents QBer and the underlying structured data hub, which address the long tail of research data by catering for the needs of individual scholars. QBer allows researchers to publish their (small) datasets, link them to existing vocabularies and other datasets, and thereby contribute to a growing collection of interlinked datasets. We evaluate our first results by showing how the system facilitates three use cases in socio-economic history.

Keywords

Digital humanities, Structured data, Linked Data, QBer

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Rinke Hoekstra (1, 5)
  • Albert Meroño-Peñuela (1, 4)
  • Kathrin Dentler (1)
  • Auke Rijpma (2, 7)
  • Richard Zijdeman (2, 6)
  • Ivo Zandhuis (3)
  1. Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
  2. International Institute of Social History, KNAW, Amsterdam, Netherlands
  3. Ivo Zandhuis Research and Consultancy, Haarlem, Netherlands
  4. Data Archiving and Networked Services, KNAW, The Hague, Netherlands
  5. Faculty of Law, University of Amsterdam, Amsterdam, Netherlands
  6. University of Stirling, Stirling, UK
  7. Utrecht University, Utrecht, Netherlands
