Marine Biodiversity, Volume 41, Issue 1, pp 141-179

Predictions of 27 Arctic pelagic seabird distributions using public environmental variables, assessed with colony data: a first digital IPY and GBIF open access synthesis platform

  • Falk Huettmann (EWHALE lab, Institute of Arctic Biology, Biology & Wildlife Department, University of Alaska)
  • Yuri Artukhin (Laboratory of Ornithology, Kamchatka Branch of Pacific Inst. of Geography, Russian Academy of Science)
  • Olivier Gilg (Laboratoire Biogéosciences, UMR CNRS 5561, Equipe Ecologie Evolutive, Université de Bourgogne; Department of Biological and Environmental Sciences, Division of Population Biology, University of Helsinki)
  • Grant Humphries (EWHALE lab, Institute of Arctic Biology, Biology & Wildlife Department, University of Alaska)

We present a first compilation, quantification and summary of presence data for 27 seabird species north of the Arctic Circle (>66 degrees North latitude) during the ice-free period (summer). For species names, we use several taxonomically valid online databases and identifiers [Integrated Taxonomic Information System (ITIS), AviBase, 4-letter species codes of the American Ornithologists' Union (AOU), The British List 2000, taxonomic serial numbers (TSNs), World Register of Marine Species (WoRMS) and APHIA ID], allowing for a compatible taxonomic species cross-walk and subsequent applications, e.g., phylogenies. Using the RandomForest data mining and machine learning algorithm and 26 publicly available environmental Geographic Information Systems (GIS) layers, we built 27 predictive seabird models from public open access data archives such as the Global Biodiversity Information Facility (GBIF), the North Pacific Pelagic Seabird Database (NPPSD) and the PIROP database (in OBIS-SEAMAP). Model-prediction scenarios using pseudo-absence and expert-derived absence were run; aspatial and spatial model assessment metrics were applied. Further, we used an additional species model performance metric based on the best publicly available Arctic seabird colony location datasets, compiled by the authors from digital and literature sources. The models perform reasonably overall, ranging from poor (only a few coastal species with small sample sizes) to very high (many pelagic species). In compliance with data policies of the International Polar Year (IPY) and similar initiatives, data and models are documented with FGDC NBII metadata and are publicly available online for further improvement, sustainability applications, synergy, and intellectual explorations in times of a global biodiversity, ocean and Arctic crisis.
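
The abstract mentions pseudo-absence scenarios but does not say how the pseudo-absence points were drawn. As a rough illustration only, the following Python sketch shows one common way such background points could be generated: sample locations uniformly by area north of 66 degrees North and discard any that fall too close to a known presence record. The function names, the 50 km exclusion distance, the spherical-Earth radius and the example coordinates are all assumptions made for this sketch and are not taken from the article.

```python
import math
import random

ARCTIC_CIRCLE_LAT = 66.0   # degrees North; study-area threshold from the abstract
EARTH_RADIUS_KM = 6371.0   # assumed mean radius of a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def random_arctic_point(rng):
    """Draw one point uniformly by area from the cap north of 66 degrees N."""
    s_min = math.sin(math.radians(ARCTIC_CIRCLE_LAT))
    # Sampling sin(latitude) uniformly gives equal-area coverage of the cap.
    s = min(1.0, rng.uniform(s_min, 1.0))
    return math.degrees(math.asin(s)), rng.uniform(-180.0, 180.0)


def sample_pseudo_absences(presences, n, min_dist_km=50.0, seed=0,
                           max_tries=100_000):
    """Return up to n random points at least min_dist_km from every presence.

    `presences` is a list of (lat, lon) tuples. The exclusion distance is a
    hypothetical parameter chosen for illustration, not a value from the paper.
    """
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        tries += 1
        lat, lon = random_arctic_point(rng)
        if all(haversine_km(lat, lon, plat, plon) >= min_dist_km
               for plat, plon in presences):
            points.append((lat, lon))
    return points


# Made-up presence coordinates, for demonstration only.
example_presences = [(70.5, -150.0), (78.2, 15.6)]
example_absences = sample_pseudo_absences(example_presences, n=5)
```

Presence and pseudo-absence points produced this way would then be paired with the environmental layer values at each location and passed to a classifier; that step is omitted here because it depends on the GIS layers and the modelling software used.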


Keywords: Pelagic circumpolar seabird distribution; Open access online databases; GIS (Geographic Information System); Circumpolar seabird colonies; International Polar Year (IPY); Arctic biodiversity; Global Biodiversity Information Facility (GBIF); Data mining synthesis