Archiving Software Surrogates on the Web for Future Reference

  • Conference paper
Research and Advanced Technology for Digital Libraries (TPDL 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9819)

Abstract

Software has long been established as an essential part of the scientific process in mathematics and other disciplines. However, reliably referencing software in scientific publications remains challenging for several reasons. A crucial factor is that software evolves: its successive versions or states are difficult to capture over time. We propose to archive and reference surrogates instead, which can be found on the Web and reflect the actual software to a remarkable extent. Our study shows that about half of the software webpages are already archived, and almost all of the archived pages include some kind of documentation.

This work is partly funded by the German Research Council under FID Math and the European Research Council under ALEXANDRIA (ERC 339233).

Notes

  1. http://blog.wolfram.com/2007/09/25/arithmetic-is-hard-to-get-right [from 25/09/2007].

  2. https://support.microsoft.com/en-us/kb/943075 [from 09/10/2007].

  3. https://git-scm.com.

  4. http://www.software.ac.uk/.

  5. http://www.dcc.ac.uk/.

  6. http://www.sciforge-project.org/.

  7. http://mpc.zib.de/.

  8. http://fair-dom.org/.

  9. http://software.ac.uk/so-exactly-what-software-did-you-use.

  10. http://www.software.ac.uk/blog/2012-06-22-how-describe-software-you-used-your-research-top-ten-tips.

  11. http://www.re3data.org/.

  12. http://dataverse.org/.

  13. http://www.swmath.org.

  14. http://archive.org.

  15. http://www.zbmath.org.

  16. https://github.com/helgeho/Web2Warc (Last commit 73f0934 on Jan 29, 2016).

  17. https://archive.org/help/wayback_api.php.

  18. https://github.com/helgeho/ArchiveSpark (Last commit acc5a16 on Feb 17, 2016).

  19. This dataset has been provided to us by the Internet Archive in the context of ALEXANDRIA (http://alexandria-project.eu).

Author information

Corresponding author

Correspondence to Helge Holzmann.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Holzmann, H., Sperber, W., Runnwerth, M. (2016). Archiving Software Surrogates on the Web for Future Reference. In: Fuhr, N., Kovács, L., Risse, T., Nejdl, W. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2016. Lecture Notes in Computer Science, vol 9819. Springer, Cham. https://doi.org/10.1007/978-3-319-43997-6_17

  • DOI: https://doi.org/10.1007/978-3-319-43997-6_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43996-9

  • Online ISBN: 978-3-319-43997-6

  • eBook Packages: Computer Science, Computer Science (R0)
