Crowdsourcing and Scholarly Culture: Understanding Expertise in an Age of Popularism

  • Alan Dix
  • Rachel Cowgill
  • Christina Bashford
  • Simon McVeigh
  • Rupert Ridgewell
Chapter
Part of the Human–Computer Interaction Series book series (HCIS)

Abstract

The increasing volume of digital material available to the humanities creates clear potential for crowdsourcing. However, tasks in the digital humanities typically do not satisfy the standard requirement for decomposition into microtasks, each of which requires little expertise on the part of the worker and little context of the broader task. Instead, humanities tasks require scholarly knowledge to perform, and even where sub-tasks can be extracted, these often involve the broader context of the document or corpus from which they are extracted. That is, the tasks are macrotasks, resisting simple decomposition. Building on a case study from musicology, the In Concert project, we explore both the barriers to crowdsourcing in the creation of digital corpora and examples where elements of automatic processing or less-expert work are possible in a broader matrix that also includes expert microtasks and macrotasks. Crucially, we will see that the macrotask–microtask distinction is nuanced: it is often possible to create a partial decomposition into less-expert microtasks with residual expert macrotasks, and to do this in ways that preserve scholarly values.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Alan Dix (1, corresponding author)
  • Rachel Cowgill (2)
  • Christina Bashford (3)
  • Simon McVeigh (4)
  • Rupert Ridgewell (5)

  1. Swansea University, Swansea, UK
  2. School of Music, Humanities and Media, University of Huddersfield, Huddersfield, UK
  3. School of Music, University of Illinois at Urbana-Champaign, Champaign, USA
  4. Department of Music, Goldsmiths, University of London, London, UK
  5. British Library, London, UK
