For all calculations and graphics, we used the free statistical software environment R (R Core Team 2015).
Besides the calculation of the 19 BIOCLIM variables for Antarctica (see end of this section), this study focuses on two of the BIOCLIM variables, namely annual mean temperature (Bio 1) and annual precipitation (Bio 12). However, we decided not to use Bio 1 and Bio 12 themselves to define the climate zones. Instead, we calculated annual mean temperature and annual precipitation manually, so that the temperature data are based on monthly averages rather than on monthly minimum and maximum values (as Bio 1 is usually computed).
Data preparation
The following preliminary steps were required: (1) processing of the original data, (2) computation of annual 2009–2015 data, (3) coordinate transformation and reduction to terrestrial points, and (4) averaging of 2009–2015 data.
Processing of the original data
Climate data for Antarctica were downloaded from the AMPS; a repository of this model forecast output is the AMPS archive (Powers et al. 2012). Basic variables are available at http://polarmet.osu.edu/AMPS/. The database offers different domains with different image sections and horizontal resolutions. In this study, we analyzed three domains: domain 2 (d2, whole Antarctica), domain 5 (d5, Dry Valleys), and domain 6 (d6, Maritime Antarctica) (Fig. 1). Two parameters covering 7 years (2009–2015) were used, both in the form of three-hourly forecast data: temperature at the surface (in Kelvin) and 3 h accumulated precipitation (in kg/m²). The spatial resolutions as well as the image sections of the three domains differed within these 7 years (Table 1). Thus, to obtain uniform resolutions of 10 km (d2), 1 km (d5), and 3 km (d6), respectively, the data were interpolated (for details, see below). The years before 2009 were not taken into account because the resolution of 20 km (in d2) was insufficient for the tasks of this study.
Table 1 Differences in horizontal resolution and image section of the AMPS data used in this study
The calculated climate zones focus on the climatic requirements for terrestrial life, especially on the habitats of the more than 500 lecideoid lichen specimens from different ice-free areas all over the Antarctic continent and adjacent islands which are included in the overall project. Figure 2 shows a relief map with the location of the collection sites in d2 (a), d5 (b), and d6 (c).
Computation of annual 2009–2015 data
To compute the annual mean temperature and annual precipitation, monthly values were determined first. The whole data set of a single month (for example, 248 files for January, corresponding to eight three-hourly forecasts a day) was loaded in R (using nc_open() in the R package ncdf4; Pierce 2015), averaged (temperature data) or summed (precipitation data), and saved as new files. The 12 monthly files of a single year were then averaged (temperature) or summed (precipitation) to obtain annual data.
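The aggregation described above can be sketched in base R as follows. The array dimensions and the simulated values are hypothetical stand-ins for data that would in practice be read from the forecast files; only the averaging and summing logic is meant to mirror the text.

```r
# Hypothetical toy data: a 4 x 3 grid with one layer per three-hourly
# forecast in January (31 days x 8 forecasts = 248 time steps).
set.seed(1)
nx <- 4; ny <- 3; n_steps <- 31 * 8
temp_k  <- array(250 + rnorm(nx * ny * n_steps, sd = 3),
                 dim = c(nx, ny, n_steps))   # surface temperature (K)
prec_3h <- array(rexp(nx * ny * n_steps, rate = 20),
                 dim = c(nx, ny, n_steps))   # 3 h accumulated precipitation (kg/m^2)

# Monthly value per grid point: mean for temperature, sum for precipitation.
monthly_temp <- apply(temp_k,  c(1, 2), mean)
monthly_prec <- apply(prec_3h, c(1, 2), sum)

# Annual values from 12 monthly layers (here: January repeated, for illustration only).
monthly_temp_year <- array(rep(monthly_temp, 12), dim = c(nx, ny, 12))
monthly_prec_year <- array(rep(monthly_prec, 12), dim = c(nx, ny, 12))
annual_mean_temp <- apply(monthly_temp_year, c(1, 2), mean)
annual_prec      <- apply(monthly_prec_year, c(1, 2), sum)
```

Note that averaging the 12 monthly means weights every month equally, regardless of its number of days.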
In the original data for precipitation, negative values can occur, due to the fact that the data of AMPS arise from computations. Negative values were substituted by zeros in this study.
While computing the monthly data, we had to deal with missing values. Sometimes all eight forecasts of a day were missing, sometimes only a few of them. We decided to discard all data of an incomplete day and to replace its values with the average of the daily means of the previous and subsequent days, which does not distort the overall data. Table 2 summarizes the replaced data.
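The two clean-up rules (negative precipitation set to zero, incomplete days replaced by the mean of the neighbouring days) might look like this in base R. The numbers are invented, and the sketch assumes that the days before and after an incomplete day are themselves complete.

```r
# Rule 1: negative precipitation values are replaced by zero.
prec_3h <- c(0.20, -0.01, 0.00, 0.35, -0.002)   # hypothetical values (kg/m^2)
prec_3h <- pmax(prec_3h, 0)

# Rule 2: hypothetical temperature series for one grid point,
# 5 days (rows) x 8 three-hourly forecasts (columns).
temp_k <- matrix(c(
  rep(250, 8),
  rep(252, 8),
  c(251, NA, 253, NA, NA, 250, 249, 251),   # incomplete day
  rep(256, 8),
  rep(254, 8)
), nrow = 5, byrow = TRUE)

daily_mean <- rowMeans(temp_k)            # NA for every incomplete day
incomplete <- which(is.na(daily_mean))

for (d in incomplete) {
  # Assumption: days d - 1 and d + 1 exist and are complete.
  daily_mean[d] <- mean(daily_mean[c(d - 1, d + 1)])
}
```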
Table 2 Days whose original data were replaced by averaging daily means of the previous and subsequent days
Transformation to an orthogonal grid and reduction to terrestrial points
The AMPS data files contain not only terrestrial points of Antarctica but also points of the adjacent marine area (and ice shelves). Thus, the next step was to reduce the data to terrestrial points.
The country borders of Antarctica were downloaded from http://www.gadm.org/download. The coordinate reference system of this data set is WGS84 decimal degrees. The coordinates of the Antarctic borders were converted into projected Cartesian coordinates by applying a stereographic projection using the R function mapproject() in the package mapproj (McIlroy 2015) (Fig. 3).
The coordinate reference system of the AMPS data is also WGS84 decimal degrees. Applying the stereographic projection to the AMPS longitude and latitude data resulted in a spatial grid whose angles were not exactly orthogonal. An orthogonal grid, however, is required for using, for example, the R plotting function geom_tile() in the package ggplot2 (Wickham 2009). Thus, a grid was created which had the same coordinate minimum and maximum values and the same number of grid points as the previous grid, but with orthogonal angles. This was done for all three domains, in each case choosing the highest resolution (10 km for d2, 1 km for d5, and 3 km for d6). After reducing this new orthogonal grid (containing, for example, 417,582 data points in d2) to those points lying within terrestrial Antarctica (121,670 data points in d2), the annual temperature means and the annual precipitation on this grid were calculated by interpolation. This was done by ordinary kriging, using the R function krige() in the package gstat (Pebesma 2004). All data were thus given on the same grid. Any negative precipitation values were again replaced by zeros.
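The construction of the orthogonal grid can be illustrated with a base R sketch. The function stereo_south() is a hypothetical helper implementing a spherical south-polar stereographic projection on a unit sphere; it is a stand-in for, not a reproduction of, the scaling used by mapproject(). The longitude/latitude grid is invented, and "same number of grid points" is interpreted here as the same number of points per axis. The reduction to terrestrial points and the kriging step are not shown.

```r
# Hypothetical helper: spherical south-polar stereographic projection
# (unit sphere, projection plane tangent at the South Pole).
stereo_south <- function(lon_deg, lat_deg) {
  lon <- lon_deg * pi / 180
  lat <- lat_deg * pi / 180
  r   <- 2 * tan(pi / 4 + lat / 2)   # distance from the pole; 0 at -90 degrees
  data.frame(x = r * sin(lon), y = r * cos(lon))
}

# Invented model grid in decimal degrees (36 x 15 points).
ll <- expand.grid(lon = seq(-180, 170, by = 10),
                  lat = seq(-89, -61, by = 2))
xy <- stereo_south(ll$lon, ll$lat)   # projected, but not a regular grid

# Regular (orthogonal) grid with the same coordinate range and
# the same number of points per axis.
nx <- 36; ny <- 15
ortho <- expand.grid(
  x = seq(min(xy$x), max(xy$x), length.out = nx),
  y = seq(min(xy$y), max(xy$y), length.out = ny)
)
```

Values on the irregular projected points would then be interpolated to the rows of ortho, for instance by kriging as described in the text.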
Averaging of 2009–2015 data
Subsequently, the 2009–2015 data were averaged. Thus, for each of the three domains, we obtained two data sets: one of the annual mean temperature and one of the annual precipitation.
To evaluate our data, annual mean temperatures for 30 different climate stations were determined by interpolation and compared to data from the literature. With the exception of Hallett station, data were taken from the website of the Databank of Antarctic Surface Temperature and Pressure Data (Jones and Reid 2001) of the Carbon Dioxide Information Analysis Center (CDIAC). Hallett data are from the Latitudinal Gradient Project website (http://www.lgp.aq).
Correlation coefficients
To determine correlations between climate variables (annual mean temperature, annual precipitation, BIOCLIM variables) and geographical variables (latitude, elevation, distance to coast), several computations were necessary. For the correlation with latitude and elevation, the original latitude (not transformed by stereographic projection) and elevation data were interpolated to the orthogonal grid, as done before for temperature and precipitation data. The distance to the coast was calculated using the R function gDistance() in the package rgeos (Bivand and Rundel 2016). Correlations were calculated using cor.test() in the R package stats (R Core Team 2015), which also tests the null hypothesis that the correlation is zero.
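A minimal example of such a correlation test is given below. The data are simulated, and the default Pearson method of cor.test() is an assumption, since the method used is not stated in the text.

```r
# Simulated data: temperature decreasing with elevation plus noise.
set.seed(42)
n <- 200
elevation_m <- runif(n, 0, 4000)
temp_c <- -10 - 0.006 * elevation_m + rnorm(n, sd = 2)

# cor.test() returns the correlation estimate together with a test of
# the null hypothesis that the true correlation is zero.
ct <- cor.test(temp_c, elevation_m)   # method = "pearson" by default
ct$estimate   # correlation coefficient
ct$p.value    # p-value of the test against zero correlation
```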
Definition of climate zones
Separate zonings were constructed for temperature and precipitation. Subsequently, these two zonings were combined to define climate zones.
Definition of temperature zones
Initially, the temperature data of d2 was divided into three zones by standard k-means clustering using the R function kmeans() in the package stats (R Core Team 2015). Nearly all of our lichen samples happened to be located in the “warmest” third, only a few in the middle, and none in the coldest one. Thus, to get a finer partitioning of the temperature zone relevant for our studies, the warmest third was evenly divided into ten subzones. The warmest zone was assigned zone number 1, the coldest zone number 12. We partitioned the regions d5 and d6 using the same zoning (based on d2).
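The two-stage temperature zoning can be sketched as follows. The temperatures are simulated, and "evenly divided" is interpreted here as ten subzones of equal temperature width; an equal number of grid points per subzone would be another possible reading.

```r
set.seed(1)
# Simulated annual mean temperatures (deg C) for 900 grid points.
temp <- c(rnorm(300, -50, 3), rnorm(300, -30, 3), rnorm(300, -12, 4))

# Step 1: three clusters by k-means (several random starts for stability).
km <- kmeans(temp, centers = 3, nstart = 10)

# Rank clusters from warmest (1) to coldest (3) by their centres.
rank_of_cluster <- rank(-km$centers[, 1])
cluster_rank    <- rank_of_cluster[km$cluster]

# Step 2: split the warmest cluster into ten subzones of equal width
# (assumption, see lead-in).
warm   <- cluster_rank == 1
breaks <- seq(min(temp[warm]), max(temp[warm]), length.out = 11)
sub    <- cut(temp[warm], breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Zone numbers: 1 = warmest subzone, ..., 10 = coolest subzone,
# 11 = middle cluster, 12 = coldest cluster.
zone <- integer(length(temp))
zone[warm]              <- 11L - sub
zone[cluster_rank == 2] <- 11L
zone[cluster_rank == 3] <- 12L
```

Applying the same breaks and cluster boundaries to other domains would reproduce the approach of zoning d5 and d6 based on d2.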
Definition of precipitation zones
The precipitation zones were defined following the concept of Meigs (1953), who established the classification of hyperarid (<25 mm), arid (25–200 mm), and semiarid (200–500 mm) deserts. To obtain a finer partitioning, precipitation values above the semiarid range were divided approximately evenly into two zones. The zone with the highest precipitation was named A and the zone with the lowest precipitation, E. Regions d5 and d6 were partitioned based on the same precipitation ranges.
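As an illustration, the assignment of precipitation zones can be written with cut(). The precipitation values are invented. The threshold between zones B and A is not given in the text, so the value upper_split below is purely hypothetical, as is the treatment of values lying exactly on a boundary (assigned to the wetter zone here).

```r
# Invented annual precipitation values (mm).
prec <- c(10, 24.9, 25, 120, 200, 350, 500, 800, 1500)

# Hypothetical split of the range above 500 mm into two halves.
upper_split <- (500 + max(prec)) / 2   # 1000 for these toy data

# Intervals are closed on the left: [25, 200) is arid, [200, 500) semiarid, etc.
prec_zone <- cut(prec,
                 breaks = c(-Inf, 25, 200, 500, upper_split, Inf),
                 labels = c("E", "D", "C", "B", "A"),
                 right  = FALSE)
as.character(prec_zone)
```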
Definition of climate zones
Following the separate definition of temperature and precipitation zones, the two classifications were combined. As a direct consequence, each grid point of d2, d5, and d6 was assigned a number (from 1 to 12) and a letter (from A to E). The resulting scheme of theoretically 60 different climate zones may seem confusing at first glance; the dual nomenclature, however, allows for a straightforward, quick, and accurate interpretation.
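Combining the two classifications amounts to pasting the number and the letter together, as in this short sketch with invented zone assignments.

```r
# Invented zone assignments for four grid points.
temp_zone <- c(1L, 3L, 10L, 12L)
prec_zone <- factor(c("A", "C", "D", "E"), levels = c("A", "B", "C", "D", "E"))

# Dual label: temperature zone number followed by precipitation zone letter.
climate_zone <- paste0(temp_zone, prec_zone)

# Number of theoretically possible combinations: 12 x 5.
n_possible <- length(1:12) * nlevels(prec_zone)
```

A label such as 3C thus reads as temperature zone 3 combined with precipitation zone C.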
BIOCLIM variables
The 19 BIOCLIM variables were computed as established by Hijmans et al. (2005). In a first approach, we used all the variables for the definition of the climate zones. The inclusion of multiple variables led to zonings overlapping in single variables; for example, climate zone A with higher mean precipitation than zone B also included points with lower precipitation values. However, the zonings only differed in the mean values but not in the boundary points of a single BIOCLIM variable. Thus, in order to avoid problems caused by multicollinearity, we decided to reduce the zone definition to only two parameters, which allows easier comparison of locations.
The BIOCLIM variables were computed using the R function biovars() in the R package dismo (Hijmans et al. 2016). The computation requires monthly minimum and maximum temperature values as well as monthly precipitation data. While computing the monthly temperature means (see above), the monthly temperature minimum and maximum values were calculated in the same step. These monthly values were then averaged over the years 2009–2015. Precipitation data were calculated in a similar way.
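To make the difference between Bio 1 and a mean of monthly averages concrete, the following base R sketch computes Bio 1 and Bio 12 by hand for a single grid point. It assumes the usual BIOCLIM convention that the monthly mean temperature is approximated by the midpoint of the monthly minimum and maximum; all values are invented.

```r
# Invented monthly values for one grid point.
tmin <- c(-5, -8, -14, -20, -24, -26, -28, -28, -26, -20, -12, -6)  # deg C
tmax <- tmin + 8                                                     # deg C
prec <- rep(15, 12)                                                  # mm

# Bio 1: annual mean of the monthly midpoints (tmin + tmax) / 2 (assumed convention).
tavg <- (tmin + tmax) / 2
bio1 <- mean(tavg)

# Bio 12: annual precipitation as the sum of the monthly totals.
bio12 <- sum(prec)
```

An annual mean built from monthly averages of all three-hourly forecasts will generally differ from bio1 whenever the temperature distribution within a month is asymmetric.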
Testing the soundness of the newly calculated climate zones vs. three climate zones defined by heterogeneous data from literature
Ruprecht et al. (2012a) investigated the photobiont diversity and abundance of lecideoid lichens from several localities in continental and maritime Antarctica. Focusing on the photobiont diversity, they identified five major Trebouxia clades and correlated their occurrences with the climatological features of the sample sites. Based on heterogeneous information from the literature, they defined three different Antarctic climate types (dry and cold, intermediate, humid and relatively warm) and classified the Trebouxia clades according to them.
The same 98 Antarctic lichen samples used in the 2012 study were reclassified according to the newly generated climate zones. To achieve this, the Bio 1 and Bio 12 values were interpolated to the sample locations and subsequently assigned to the temperature and precipitation zones as described above. The samples belonged to five different Trebouxia species, namely Trebouxia jamesii (12 samples, 6 of them forming a subclade), Trebouxia sp. URa1 (8 samples), Trebouxia impressa (9 samples), Trebouxia URa2 (42 samples), and Trebouxia URa3 (26 samples).