Crowdsourcing and Scholarly Culture: Understanding Expertise in an Age of Popularism
The increasing volume of digital material available to the humanities creates clear potential for crowdsourcing. However, tasks in the digital humanities typically do not satisfy the standard requirement for decomposition into microtasks, each of which requires little expertise on the part of the worker and little context of the broader task. Instead, humanities tasks require scholarly knowledge to perform, and even where sub-tasks can be extracted, these often depend on the broader context of the document or corpus from which they are extracted. That is, the tasks are macrotasks, resisting simple decomposition. Building on a case study from musicology, the In Concert project, we will explore both the barriers to crowdsourcing in the creation of digital corpora and examples where elements of automatic processing or less-expert work are possible in a broader matrix that also includes expert microtasks and macrotasks. Crucially, we will see that the macrotask–microtask distinction is nuanced: it is often possible to create a partial decomposition into less-expert microtasks with residual expert macrotasks, and to do this in ways that preserve scholarly values.