Measured in dollars per byte, the cost of data in some biological data sets exceeds that of “big science” data by several orders of magnitude. This somewhat pointless observation does at least underline the fact that biological databases are constructed and maintained with a great deal of human effort: they are curated. So what are the issues with curated data, and how well does current database technology serve them?
In this talk I shall describe some of the new challenges to database research that arise from curated databases and what my colleagues and I are doing to tackle them. They include annotation, data provenance, database archiving, data publishing and security. I shall also attempt to summarise the work of the recently formed Digital Curation Centre, which is concerned not only with these database-related issues but also with the larger problems of ensuring that our scientific and scholarly data is not only understandable by current users but also “curated” in the sense that it will remain usable in the future.