Abstract
Software has long been an essential part of the scientific process in mathematics and other disciplines. However, reliably referencing software in scientific publications remains challenging. A crucial factor is that software evolves: its versions and states change over time and are difficult to capture. We propose to archive and reference surrogates instead, which can be found on the Web and reflect the actual software to a remarkable extent. Our study shows that about half of the webpages of software are already archived, and almost all of these include some kind of documentation.
This work is partly funded by the German Research Council under FID Math and the European Research Council under ALEXANDRIA (ERC 339233).
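As an illustration of what referencing an archived surrogate at a specific point in time could look like, the following minimal sketch builds a timestamped link to a web archive capture. It assumes the Internet Archive's Wayback Machine and its public URL pattern (`https://web.archive.org/web/<timestamp>/<url>`); the abstract does not name a specific archive or API, and the function name `wayback_reference` is hypothetical.

```python
from datetime import datetime

# Assumption: the Wayback Machine's public URL pattern. The paper's abstract
# does not prescribe a specific archive; this is only one possible choice.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_reference(url: str, when: datetime) -> str:
    """Build a link that asks the archive for the capture closest to `when`.

    The timestamp uses the 14-digit YYYYMMDDhhmmss form.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


if __name__ == "__main__":
    # Example: a page cited in the notes, dated 25/09/2007.
    print(
        wayback_reference(
            "http://blog.wolfram.com/2007/09/25/arithmetic-is-hard-to-get-right",
            datetime(2007, 9, 25),
        )
    )
```

A link of this form records both the surrogate's address and the date at which it was consulted, which a plain URL in a footnote cannot do. Whether a capture actually exists near that date has to be checked against the archive itself.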
Notes
- 1. http://blog.wolfram.com/2007/09/25/arithmetic-is-hard-to-get-right [from 25/09/2007].
- 2. https://support.microsoft.com/en-us/kb/943075 [from 09/10/2007].
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16. https://github.com/helgeho/Web2Warc (Last commit 73f0934 on Jan 29, 2016).
- 17.
- 18. https://github.com/helgeho/ArchiveSpark (Last commit acc5a16 on Feb 17, 2016).
- 19. This dataset has been provided to us by the Internet Archive in the context of ALEXANDRIA (http://alexandria-project.eu).
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Holzmann, H., Sperber, W., Runnwerth, M. (2016). Archiving Software Surrogates on the Web for Future Reference. In: Fuhr, N., Kovács, L., Risse, T., Nejdl, W. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2016. Lecture Notes in Computer Science, vol 9819. Springer, Cham. https://doi.org/10.1007/978-3-319-43997-6_17
DOI: https://doi.org/10.1007/978-3-319-43997-6_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43996-9
Online ISBN: 978-3-319-43997-6
eBook Packages: Computer Science (R0)