Aim and scope of CDD

Maintaining water quality and availability has become an environmental priority of many governments, including the Government of Canada (Environment Canada 2013). Nevertheless, various components of global change will have significant, long-lasting impacts on the availability, distribution and quality of surface waters. Adequate understanding of these impacts has yet to emerge, presenting challenges for development of the best adaptive strategies for water management at local, regional and global scales (International Joint Commission 2004). An effective approach for understanding the potential future trajectories of freshwater resources in the face of rapid and widespread environmental change is the application of paleolimnological methods, which use physical, chemical and biological indicators preserved in lake and river sediments to infer past ecosystem dynamics (Vincent 2009), and provide a historical perspective on present-day conditions. Of the many biological indicators of past environmental conditions that are used routinely in paleolimnology, diatoms (class Bacillariophyceae) are among the most useful.

The international paleolimnological research community generates large amounts of paleo-data that feed into knowledge-transfer opportunities, understanding of environmental processes, and policy decisions on water protection (Smol 2002). Very often, however, these data are not used to their full potential because they are not systematically managed or made readily accessible to stakeholders (Elger et al. 2014). So far, relatively few international paleolimnological initiatives have made data available via online databases [e.g. Diatom Paleolimnology Data Cooperative (http://diatom.acnatsci.org/dpdc), Diatoms of the United States (http://westerndiatoms.colorado.edu), European Diatom Database (http://craticula.ncl.ac.uk/Eddi/jsp/index.jsp)]. As a result, the field of paleolimnology still faces challenges in bridging the gap between fundamental research and applied management, conservation and restoration of aquatic resources (Kumagai and Vincent 2003; Saulnier-Talbot 2015). The research community therefore increasingly expects close attention to be given to data management and archiving, as noted by the recent Coalition on Publishing Data in the Earth and Space Sciences “Statement of Commitment” (COPDESS 2015).

The Circumpolar Diatom Database (CDD) primarily serves to facilitate analysis of biogeographic patterns and to define autecological information on diatom species found in the remote circumpolar region of the northern hemisphere, which remain poorly known. The CDD was initiated in 1995 in the Aquatic Paleoecology Laboratory (Centre d’études nordiques, Université Laval, Quebec City). Its main objective is to illustrate and interpret spatial and temporal changes in the distribution of diatoms in the northern circumpolar region, in relation to limnological, environmental and geographical characteristics. The specific objectives of this undertaking were to: (1) develop a database structure enhanced by a Geographic Information System (GIS) for easy consultation and visualization of limnological and paleolimnological data, (2) elaborate biogeographical classifications and patterns based on the presence and absence of specific diatoms in space and time, and (3) add to the knowledge of, and specify, the environmental and ecological preferences of the main diatom genera and specific taxa encountered in high-latitude regions, thereby complementing existing diatom databases for temperate and Antarctic regions (e.g. Canadian Diatom Database, European Diatom Database, Antarctic Freshwater Diatoms). Our aim was to develop the CDD as a multifunctional database that provides user-friendly access to data while offering advanced storage, management and sharing functionalities.

Geographic coverage of CDD

Many sites were sampled throughout the northern circumpolar region over the past 22 years, mostly as part of paleolimnological studies completed by members of the Aquatic Paleoecology Laboratory at the Centre for Northern Studies (Centre d’études nordiques, CEN), Université Laval (Quebec City, Canada). In the early stages of the development of the CDD, the limnological and biological data were incorporated into a database built with the software program FileMaker Pro. At the time, the aim of this simple database was to archive the data gathered by laboratory members and integrate them into a tool that was easy to consult. More recent advances in software and the development of powerful GIS now enable biogeographic research and the archiving of large amounts of environmental and species data in the newly designed CDD.

Recent data acquisition and incorporation tripled the amount of information contained within this latest version of the CDD. It now contains 572 sampling sites, 4014 diatom taxa, 40,114 occurrence records, 15 datasets and more than 15,000 limnological (lake water chemistry and physics) entries. Its geographic range extends over eight areas of the circumpolar Arctic, distributed across three continents: North America, Europe and Asia. These areas are Alaska, Yukon, Northwest Territories, Nunavut, northern Quebec (Nunavik), Labrador, northern Sweden and Siberia (Taymyr Peninsula, Lena Delta, Pechora River). Most, but not all, of the data are from studies published in the scientific literature, and all are made freely available via the CDD.

Structure and functioning of CDD

The CDD is an online, open-access database of diatom and associated ecological and paleolimnological data relevant to the study of global change. It contains data on regional surface-sample calibration sets (diatom counts, water chemistry, inference models-transfer functions), sediment cores (diatom counts, chronological information, and diatom-inferred characteristics), and other types of samples.

Based on a relational database model, the CDD is also designed to accommodate sediment core data obtained through paleolimnological research. As a result, its data can easily be used for spatial analysis, cartographic representation in a GIS or publication over the Internet. The establishment of a web-based user interface (http://www.cen.ulaval.ca/cdd) with search modules allows exploration of CDD functions in a simple and precise manner. The data in the database are linked in such a way that access to all information relevant to each taxon, lake or dataset is straightforward and user-friendly.
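To make the relational linkage between taxa, lakes and datasets more concrete, the following is a minimal sketch in Python using the standard `sqlite3` module. It is not the actual CDD schema: the table names (`dataset`, `site`, `taxon`, `occurrence`), the column names and the demo rows are all hypothetical and serve only to show how a single join can return every site record associated with one taxon.

```python
import sqlite3

# Hypothetical schema for illustration only; the real CDD tables may differ.
SCHEMA = """
CREATE TABLE dataset (dataset_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE site (
    site_id    INTEGER PRIMARY KEY,
    dataset_id INTEGER NOT NULL REFERENCES dataset(dataset_id),
    name       TEXT NOT NULL,
    latitude   REAL,
    longitude  REAL
);
CREATE TABLE taxon (
    taxon_id INTEGER PRIMARY KEY,
    genus    TEXT NOT NULL,
    species  TEXT NOT NULL,
    variety  TEXT
);
CREATE TABLE occurrence (
    site_id            INTEGER NOT NULL REFERENCES site(site_id),
    taxon_id           INTEGER NOT NULL REFERENCES taxon(taxon_id),
    relative_abundance REAL,
    PRIMARY KEY (site_id, taxon_id)
);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder (invented) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO dataset VALUES (1, 'Demo dataset')")
    conn.executemany(
        "INSERT INTO site VALUES (?, ?, ?, ?, ?)",
        [(1, 1, "Lake A", 58.5, -76.0), (2, 1, "Lake B", 61.2, -73.4)],
    )
    conn.executemany(
        "INSERT INTO taxon VALUES (?, ?, ?, ?)",
        [(1, "Genus1", "alpha", None), (2, "Genus2", "beta", None)],
    )
    conn.executemany(
        "INSERT INTO occurrence VALUES (?, ?, ?)",
        [(1, 1, 12.5), (2, 1, 3.0), (1, 2, 40.0)],
    )
    conn.commit()
    return conn


def sites_for_taxon(conn, genus, species):
    """Return (site name, latitude, longitude, abundance) for one taxon."""
    query = """
        SELECT s.name, s.latitude, s.longitude, o.relative_abundance
        FROM occurrence AS o
        JOIN site  AS s ON s.site_id  = o.site_id
        JOIN taxon AS t ON t.taxon_id = o.taxon_id
        WHERE t.genus = ? AND t.species = ?
        ORDER BY s.name
    """
    return conn.execute(query, (genus, species)).fetchall()
```

Calling `sites_for_taxon(build_demo_db(), "Genus1", "alpha")` returns one row per site where the placeholder taxon occurs. Keeping occurrences in their own table, keyed by site and taxon, is one common way to let the same records feed both taxon-centred and lake-centred views.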

The web interface is divided into three sections, allowing the user to search for data on (1) the biogeographic distribution of specific taxa, (2) datasets included in the database, and (3) contributors to the database. The first section allows an investigator to obtain a list of taxa indexed in the CDD by specifying search criteria. One can browse the complete list, search for specific species, or retrieve all occurrences of an individual taxon across all datasets (Fig. 1). The search can be by genus, species or subspecies (variety). From the result set, one can then access the taxon description, as well as the dataset in which the taxon was found, including its limnological data, taxon occurrences, water chemistry, environmental data and lake characteristics (Figs. 2, 3).

Fig. 1

Search tool for finding diatom taxa

Fig. 2

Example result set of taxa corresponding to search criteria. This list allows access to the taxon or dataset description

Fig. 3

Taxon description, exemplified by Amphipleura pellucida
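The taxon search described above (by genus, species or variety) can be sketched as a simple filter. This is only an illustration of the search logic, not the code behind the CDD web interface: the `Taxon` class, the `search_taxa` function and the choice of case-insensitive prefix matching are assumptions, and the taxon names in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Taxon:
    """Hypothetical taxon record with an optional variety."""
    genus: str
    species: str
    variety: Optional[str] = None


def _matches(value: Optional[str], wanted: Optional[str]) -> bool:
    """A criterion left as None matches anything; otherwise compare prefixes."""
    if wanted is None:
        return True
    return value is not None and value.lower().startswith(wanted.lower())


def search_taxa(
    taxa: List[Taxon],
    genus: Optional[str] = None,
    species: Optional[str] = None,
    variety: Optional[str] = None,
) -> List[Taxon]:
    """Return the taxa that satisfy every criterion that was supplied."""
    return [
        t
        for t in taxa
        if _matches(t.genus, genus)
        and _matches(t.species, species)
        and _matches(t.variety, variety)
    ]
```

For example, with `taxa = [Taxon("Genus1", "alpha"), Taxon("Genus1", "beta", "minor"), Taxon("Genus2", "alpha")]`, the call `search_taxa(taxa, genus="genus1")` returns the first two records, while `search_taxa(taxa)` with no criteria returns the complete list.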

The second section makes it possible to consult and access the list of datasets contained within the CDD. For each dataset, a brief description is supplied, along with the geographic coordinates of the study region, relevant publications in scientific journals and the limnological and environmental data (Fig. 4). Users can download zip files, each containing a “Water chemistry” and an “Abundance” data file, directly from the “Dataset” page of the CDD website using the “download data” button (http://www.cen.ulaval.ca/cdd/DatasetList.aspx). Finally, the third section holds a list of contributors who supplied data to the respective datasets.

Fig. 4

Example of dataset description
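Once a dataset archive has been downloaded, its two data files can be read programmatically. The sketch below assumes, purely for illustration, that the files inside the zip are UTF-8 CSV files with a header row; the actual file format, file names and column names of the CDD downloads are not specified here, so `water_chemistry.csv`, `abundance.csv` and their columns are hypothetical. The demo archive is built in memory so that the example runs without network access.

```python
import csv
import io
import zipfile


def read_dataset_zip(zip_bytes):
    """Return {member name: list of row dicts} for each CSV file in the archive."""
    tables = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


def make_demo_zip():
    """Build a small in-memory archive with invented file names and values."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("water_chemistry.csv", "site,pH\nLake A,6.8\n")
        archive.writestr(
            "abundance.csv", "site,taxon,percent\nLake A,Genus1 alpha,12.5\n"
        )
    return buffer.getvalue()
```

In practice, `zip_bytes` would be the content of a file saved from the “Dataset” page, for instance obtained with `open(path, "rb").read()`. Values are returned as strings, so numeric columns need to be converted before analysis.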

Quality control of diatom taxonomy

To ensure harmonization of diatom identifications among contributors (diatomists) to the CDD, and to improve comparisons between diatom studies and datasets, diatom inter-calibration and harmonization exercises were organized within a series of Arctic-Antarctic Diatom Symposia and workshops (AADS) that were initiated in 1991 (Hamilton 1994).

Future developments and outlook

Future development of the CDD will include the addition of a spatial representation and search engine of its contents to facilitate access to the data, in addition to increasing its visibility at an international scale and encouraging contributions. In the longer term, the CDD will be upgraded to accept data submission from external contributors via intranet for the easy upload of relevant data and metadata.

Furthermore, we will aim to develop a database structure enhanced by a GIS approach for easy consultation and visualization of limnological and paleolimnological data. Spatial representation of the database contents will facilitate access to the data, increase the visibility of the CDD at an international scale and encourage contributions.

Finally, the data in the CDD will be published via Nordicana D, a formatted, online, DOI-referenced data report series, to facilitate access to and referencing of the datasets. This will also allow recognition of the individual researchers who have contributed to the database, while ensuring wide dissemination of the information contained within the CDD.

Conclusions

The CDD was developed with the aim of offering researchers in the aquatic paleosciences and in diatom research a simple reference tool for easy acquisition of information on the spatial and temporal distribution and occurrence (biogeography) of diatom taxa from northern freshwater ecosystems and habitats. Beyond this, the CDD also serves as an information resource for limnologists, hydrologists, climate scientists and others interested in long-term aquatic ecosystem change and long-term data related to climate change and other global environmental issues.

The CDD aims to be an internationally recognized data acquisition reference for paleolimnologists, limnologists and diatomists. To ensure its success, interested members of the international scientific research community are encouraged to contribute to the project by sharing their data on diatom distribution and abundance resulting from research in the circumpolar regions, including diatom data from sediment cores and surface deposits. Contributions to the CDD will expand knowledge of the autecology and the spatial and temporal distributions of diatoms.

The Aquatic Paleoecology Laboratory will be grateful to receive and formally acknowledge your data contributions. Please contact us at cdd@cen.ulaval.ca, Laboratoire de Paléoécologie Aquatique, Centre d’études nordiques (CEN), Pavillon Abitibi-Price, 2405 rue de la Terrasse, Université Laval, Québec City (Canada), G1V 0A6 (Phone: 1-418-656-2131 +7006). We cordially invite you to cooperate in this effort, as the success of the CDD will largely depend on the voluntary submissions of the diatom paleoecology community!