Introduction

Target-based planning of protected areas is a widely used conservation approach (Possingham et al. 2006; Moilanen 2007), and developing protected area networks to represent biological features has become a central aim in conservation biogeography (Ladle and Whittaker 2011). Nonetheless, many protected areas have traditionally been designed to conserve particular landscapes from an aesthetic viewpoint and disregard biodiversity hotspots and/or endemicity (Joppa and Pfaff 2009). In addition, the spatial configuration of protected areas is often influenced by limited conservation efforts; the design of protected areas is achieved while minimizing socio-economic costs (Naidoo et al. 2006; Moilanen and Arponen 2011). Therefore, the resultant protected areas often do not reflect true biodiversity patterns (Scott et al. 2001; Tognelli et al. 2008; Albuquerque et al. 2013). A key step to fill in gaps between conservation targets and solutions is by assessing representativeness, which considers the properties of an existing network of protected areas and how well the area covers regional biodiversity (Kukkala and Moilanen 2013). This type of gap analysis can initially be used to improve spatial prioritization measures (Moilanen et al. 2009), design and implement new protected areas (Rodrigues et al. 2004; Soutullo et al. 2008), and/or replace underperforming protected areas (Fuller et al. 2010).

Any attempt to represent biodiversity features (e.g. species richness, species composition, endemicity, or phylogenetic diversity) within a protected area comes with an inherent assumption that the representativeness at the protected-area scale should mimic the mechanisms of biodiversity persistence at the regional scale. Under this assumption, the protected areas will be capable of safeguarding both the origins and maintenance of biodiversity. However, this assumption and its related conservation outcomes have not been fully scrutinized (Loucks et al. 2008), because of shortfalls of practicable information on biodiversity representation (Cantú-Salazar et al. 2013). Importantly, the locations of protected areas should be evaluated based on their ability to preserve the ecological and evolutionary processes that underpin biodiversity patterns (Ferrier et al. 2009). For example, areas with high species richness are considered to have functioned as centers of species accumulation in the past, as refugia during periods of extensive environmental change (Crisp et al. 2001). Likewise, high degrees of endemicity in current species distribution patterns are a result of in situ diversification through divergent selection or genetic drift, which are promoted by environmental heterogeneity, geographical isolation, and habitat stability (Sandel et al. 2011). Taking these processes into consideration when designing protected area networks increases the likelihood that biodiversity will persist long-term, even in the face of land-use and climate change (Sánchez-Fernández et al. 2013). The use of environmental factors to explain biodiversity patterns allows conservation planners to capture both ecological diversification and accumulation of taxa (Mittelbach et al. 2007). Therefore, understanding the drivers of biodiversity patterns is fundamentally important for developing systematic conservation plans that target critical areas from an ecological or evolutionary perspective.

The East Asian continental islands (including the Japanese and Ryukyu archipelagos) form one of the world’s 35 biodiversity hotspots, characterized by high plant diversity, endemism, and conservation threats (Mittermeier et al. 2011). The floras found on the islands that fringe East Asia (e.g. Japan) contain relictual survivors from plant communities that were once distributed throughout large sections of the Northern Hemisphere during much of the Tertiary (Latham and Ricklefs 1993; Milne and Abbott 2002). Kubota et al. (2015) showed that the diversity patterns of vascular plant species in Japan were shaped by dispersal limitation, in situ diversification, and species sorting relative to geohistorical factors and climate conditions (including Quaternary climate changes). They also claimed that the interplay between recent ecological and evolutionary processes played an important role in the formation of broad-scale patterns of plant species richness. Consequently, Japan is an ideal region in which to evaluate the locations of protected areas and spatial conservation prioritization measures from the perspective of the ecological and evolutionary processes that generate biodiversity patterns.

The consistency (or inconsistency) of the environmental factors that drive biodiversity patterns within a protected area and, more broadly, across the whole biogeographic region, may be used as a heuristic measure for evaluating process-based representativeness that contributes to improving the long-term effectiveness of conservation plans (Sánchez-Fernández et al. 2013; Albuquerque and Beier 2015). In Japan, plant diversity patterns are correlated with a variety of environmental factors that are a proxy for ecological and historical (evolutionary) processes: for example, current climate and edaphic factors drive species sorting; paleoclimatic and geological stability promote species accumulation; and geographical isolation is related to allopatric speciation (Kubota et al. 2015). Evaluating how well these processes (their related environmental gradients) are captured within the existing protected areas network is crucially important and can lead to development of adaptive conservation planning in the future under climate and/or landscape changes (Fuller et al. 2011; Batllori et al. 2014).

Therefore, we assessed the representativeness of climatic and geohistorical gradients associated with biodiversity processes as design criteria for protected areas that would facilitate biodiversity persistence in Japan, by using two datasets for species distribution (Kubota et al. 2015) and a phylogenetic tree (Kubota et al. 2016) that included 5565 known vascular plant species (more than 99% of all Japanese vascular plants). First, we examined how the distribution of each species is represented in the existing protected areas, and how the existing protected areas differ from the minimum set areas that represent all plant species. Second, we evaluated the relative importance of environmental factors, including geographical, socio-economic, climatic, edaphic, and eco-evolutionary variables, in explaining the spatial arrangement of protected areas. Third, we evaluated the representativeness of the environmental factors that were expected to influence biodiversity patterns, including taxonomic and phylogenetic attributes, within the existing protected area network. Finally, based on the understanding that geohistorical gradients can affect processes such as species sorting, accumulation, and in situ diversification, we discussed the importance of historical perspectives in conservation planning to ensure adequate representation of the ecological and evolutionary potential in the East Asian continental islands.

Materials and methods

Study site

The Japanese and Ryukyu archipelagos comprise a long chain of islands located off the eastern coast of Asia (Fig. 1). Mean annual temperatures range from −5.3 to 24.2 °C, and annual precipitation is between 650 and 4538 mm.

Fig. 1

Locations of protected areas in Japan. The gray lines within the land area indicate the boundaries of the 47 prefectures. Colors represent the rank of protection. Scatter diagram indicates total area (km²) and ratio of protected areas in each prefecture

The archipelagos are mostly covered by forest (66% of total land area), especially in mountainous areas. In lowland areas, most areas of vegetation have been converted to residential or industrial areas. At the beginning of the twentieth century, large portions of forest were nationalized to bring their management under central government control (Iwai 2002). Thus, spatial variation in national forest cover is a critical factor in systematic conservation planning in Japan.

The Japanese government established the national biodiversity strategy (the National Strategy for the Conservation and Sustainable Use of Biological Diversity 2012–2020) to achieve the Aichi targets (CBD 2011), and currently promotes action plans at the local governance level for 47 prefectures. Nonetheless, only 28 prefectures (60%) have enacted plans (see Table S1 in Electronic supplementary materials for the current status of biodiversity strategies at the prefectural level), and few prefectures have sought to prioritize conservation targets based on spatial biodiversity information. The implementation of these conservation plans does not reflect region-specific biodiversity patterns, e.g. vascular plant species richness, which is a biodiversity surrogate that we examine in this study (Fig. S1).

Protected area data

We created a dataset of existing protected areas based on Japan’s National Land Numerical Information (http://nlftp.mlit.go.jp/ksj-e/index.html) and the World Database on Protected Areas (http://www.protectedplanet.net) with 1-km grid resolution. To account for the differences in legal regulations between protected areas, we scrutinized each protected area by combining the relevant land use information and the International Union for Conservation of Nature (IUCN) conservation categories (Dudley 2008).

Although the IUCN categories provide a global standard for classification of protected areas (e.g. Soutullo et al. 2008), they do not necessarily represent conservation status, especially at fine-scale resolution (Jenkins and Joppa 2009). Thus, we first rasterized the polygon datasets for each protected area (including wilderness preservation areas, national parks, wildlife sanctuaries, natural habitat conservation areas, and national protected areas). Using the IUCN categories and Japan’s relevant legal and/or regulatory mechanisms, we then classified these protected areas into three ranks, based on their strictness of legal regulation (Table S2). A ‘high’ rank was given to areas where economic activities (e.g. logging) are strictly forbidden, a ‘medium’ rank was assigned to areas where public permission is required to undertake economic activities, and a ‘low’ rank was given to other areas. In total, these protected areas cover 40% of the land area of Japan (high = 2%, medium = 7%, low = 31%; Fig. 1). For each rank, we calculated the presence or absence of protected areas at the 1 km × 1 km grid level.

Plant distribution data

To identify the distribution patterns of Japanese vascular plant species, we used a published species distribution database (Kubota et al. 2015) that contains complete and partial range maps and point information (specimen records, local species checklists, and vegetation census results) and recompiled these distribution data for 5565 species at 1-km grid resolution (377589 cells). Then, we predicted the potential distributions of individual species using Maxent v. 3.3.3k (Phillips et al. 2006). For species distribution modeling, we used 31 environmental factors, including climatic, soil, geological, topographical, and geographical conditions, as the predictor variables (see Table S3 for the list of explanatory variables). In the modeling, we estimated habitat suitability using all land areas of Japan as the background. We assessed the performance of each model using the area under the receiver operating characteristic curve (AUC), and confirmed that all models had high training AUC values (0.94 on average; Fig. S2a). Because the training AUC favors over-fit models (Warren and Seifert 2011), for 100 randomly selected species we compared the test AUC, which was calculated using k-fold cross-validation (k = 5), with the original AUC corresponding to the training AUC in the default setting of Maxent. We confirmed that the original AUC values were consistent with the test AUC (Fig. S2b) and thus that over-fitting was not an issue. Then, we generated the dataset of binary predictive distributions for individual species using the sensitivity–specificity sum maximizer threshold (Jiménez-Valverde and Lobo 2007). Finally, to further reduce the commission error of the prediction, we omitted presence predictions that were located outside the latitudinal and longitudinal limits of the observations.
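
The thresholding step can be illustrated with a minimal sketch. It assumes a set of evaluation points labeled as presence (1) or background/absence (0) together with continuous suitability scores; the function name `max_sss_threshold` and the toy values are hypothetical and are not taken from the study's workflow.

```python
def max_sss_threshold(scores, labels):
    """Return (threshold, sensitivity + specificity) for the best threshold.

    scores: continuous suitability values, one per evaluation point
    labels: 1 for presence, 0 for background/absence (assumed to be available)
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_sum = None, -1.0
    for t in sorted(set(scores)):
        # A point is predicted present when its score is at least t.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        total = tp / n_pos + tn / n_neg  # sensitivity + specificity
        if total > best_sum:  # ties keep the lowest threshold
            best_t, best_sum = t, total
    return best_t, best_sum


# Toy data (hypothetical): eight evaluation points.
scores = [0.05, 0.10, 0.20, 0.35, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

threshold, sss = max_sss_threshold(scores, labels)
binary = [int(s >= threshold) for s in scores]  # binary predicted distribution
```

Keeping the lowest threshold among ties is an arbitrary choice made for this sketch; other tie-breaking rules are equally defensible.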

We identified Japanese endemic species based on information from the literature (Iwatsuki 1995). Kubota et al. (2015) provides a more detailed description of how these data were compiled (including the source references and species checklists), and a discussion of the spatial patterns of species richness.

Evolutionary distinctiveness of Japanese vascular plant species

To evaluate evolutionary distinctiveness (ED), we used published phylogenetic trees for Japanese ferns, trees, and herbs (Kubota et al. 2016), in which phylogenetic positions were determined by DNA barcoding data for ferns (597 species; Ebihara et al. 2010) and a published mega-phylogeny for seed plants (2095 species; Smith et al. 2011); the phylogenetic positions of species not included in those works but contained in other phylogenetic studies were manually resolved (1623 species); the remaining species were placed as basal polytomies within their respective genera (1231 species). Kubota et al. (2016) provides more detailed descriptions of the data sources and construction processes of the phylogenetic trees. ED at the species level was calculated by taking the sum of branch lengths divided by the number of species subtending the branch (Isaac et al. 2007). We obtained species-specific ED values for four taxonomic (or functional) groups with different evolutionary histories: ferns, gymnosperms, and angiosperm trees and herbs (Schneider et al. 2004; Petit and Hampe 2006). We defined the ED of the plant assemblages in each grid cell as the number of species belonging in the top fraction of the ED at the species level (Safi et al. 2013). We tried several thresholds (1–50%) for the ED and confirmed that ED patterns were generally consistent regardless of the threshold values. Therefore, we only displayed the results of the 5% threshold.
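
As an illustration of the ED calculation described above, the following sketch computes species-level ED on a toy tree and then counts the top-ED species in a cell. The tree, the helper names, and the use of a ceiling when converting the threshold fraction into a number of species are assumptions made for this example only.

```python
import math

# Toy rooted tree (hypothetical): node -> (parent, length of the branch above it).
tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.5),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def species_ed(tree):
    """Share each branch length equally among the tips that descend from it."""
    parents = {parent for parent, _ in tree.values()}
    tips = [node for node in tree if node not in parents]
    n_tips_below = {}
    for tip in tips:
        node = tip
        while node in tree:  # walk from the tip up to the root
            n_tips_below[node] = n_tips_below.get(node, 0) + 1
            node = tree[node][0]
    ed = {}
    for tip in tips:
        node, total = tip, 0.0
        while node in tree:
            parent, length = tree[node]
            total += length / n_tips_below[node]
            node = parent
        ed[tip] = total
    return ed


def top_ed_count(cell_species, ed, fraction):
    """Count the species in a cell that rank within the top `fraction` of ED."""
    n_top = math.ceil(fraction * len(ed))  # rounding rule is an assumption
    top = set(sorted(ed, key=ed.get, reverse=True)[:n_top])
    return len(top & set(cell_species))


ed = species_ed(tree)
cell_count = top_ed_count({"A", "B"}, ed, 0.5)
```

In this toy tree the species-level ED values sum to the total branch length of the tree, which is a convenient sanity check on the calculation.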

Environmental data

We compiled data on environmental variables that are potentially associated with spatial patterns of protected areas and/or biodiversity at 1-km grid resolution (Fig. S3). Latitude (°) and longitude (°) were represented by the central coordinate in each cell. Distance from coastline (km) was calculated as the shortest distance from the center of each cell to the coastline. We obtained elevation data at 250-m grid resolution and human footprint data at 1-km grid resolution from the Japanese National Land Numerical Information (http://nlftp.mlit.go.jp/ksj-e/) and the 2010 Population Census by the Statistics Bureau in Japan (http://e-stat.go.jp/SG2/eStatGIS/page/download.html). Then, we calculated the mean elevation (m) within each cell and relative coverage area of national forests. We calculated the human influence index (HII; Sanderson et al. 2003) using human population density, human access (via roads, railroads, and navigable bodies of water), artificial night-time lighting conditions, degree of urbanization, and land cover of agriculture. The minimum temperature of the coldest month (°C), snow depth (m), and annual mean of global solar irradiance (0.1 MJ m−2) were obtained from Mesh Climate Data 2000 (JMA 2002). Actual evapotranspiration and water balance were obtained from CGIAR-CSI (http://www.cgiar-csi.org/). Data on soil cation-exchange capacity (cmol + kg−1) and soil pH were obtained from SoilGrids (https://soilgrids.org/#/?layer=geonode:taxnwrb_250m).

We compiled data on paleoclimate and geological features related to historical habitat stability and geographical isolation that influence biodiversity patterns (Kubota et al. 2015). We obtained the temperature and precipitation of the last glacial maximum from Worldclim (http://www.worldclim.org) and calculated the Quaternary climatic stability as the differences in mean annual temperature and annual total precipitation between the last glacial maximum and the present day. We also calculated the coverage area of pyroclastic flows and lowland alluvial plains (which were exposed to marine transgressions or regressions) in each cell, which represent geological stability, based on information from the Geological Survey of Japan (https://www.gsj.jp/) and geographical data (http://nrb-www.mlit.go.jp/kokjo/inspect/inspect.html). In addition, we calculated the shortest distance from each cell to the closest edge of the Asian continent, which represents isolation (dispersal limitation) from the continental species sources that affect the endemicity of plant species assemblages in Japan.

Statistical analysis

In this study, we used a three-step assessment to evaluate the representativeness of the existing protected areas network by: (1) assessing the overlap of species distribution with existing protected areas and also comparing the protected areas with the species distributions captured by the minimum set areas; (2) identifying the environmental factors of protected areas arrangement; and (3) evaluating the representativeness of biodiversity drivers within the existing protected areas network. In these analyses, we used the existing protected areas with different protection levels in a hierarchical manner: high-ranked; high- and medium-ranked; and high-, medium-, and low-ranked.
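
The hierarchical use of protection levels, together with the species-coverage count of the first step, can be sketched as follows. The cell identifiers, species names, and function names are hypothetical toy inputs, not the study's data.

```python
# Hypothetical toy input: protection rank of each grid cell (None = unprotected).
cell_rank = {1: "high", 2: "medium", 3: "low", 4: None, 5: "medium"}

# Nested protection levels: each level adds the next lower rank.
LEVELS = {
    "H": {"high"},
    "H+M": {"high", "medium"},
    "H+M+L": {"high", "medium", "low"},
}

# Hypothetical occurrences: species -> set of occupied cells.
occurrences = {"sp1": {1, 4}, "sp2": {3}, "sp3": {4}}


def protected_cells(cell_rank, level):
    """Cells whose rank belongs to the given nested protection level."""
    return {cell for cell, rank in cell_rank.items() if rank in LEVELS[level]}


def covered_species(occurrences, cells):
    """Species that occur in at least one of the given cells."""
    return {sp for sp, occupied in occurrences.items() if occupied & cells}


summary = {
    level: len(covered_species(occurrences, protected_cells(cell_rank, level)))
    for level in LEVELS
}
```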

First, we counted the number of species covered or not covered by the existing protected area networks. To minimize the potential risk of misinterpretation resulting from omission and commission errors related to species distribution data, we conducted the analysis using both the raw dataset of species occurrence and the predicted data derived from the distribution modeling. In addition, we carried out an exact search for the minimum set, which was defined by the smallest number of grid cells that included all 5565 species at least once (i.e. the species set covering problem; Williams et al. 2005). In this computation, we applied exact algorithms (integer programming); thus, the minimum set is more accurate than heuristic methods (e.g. simulated annealing; Moilanen 2008; Beyer et al. 2016). We formulated this problem by introducing the following dichotomous decision variable and its corresponding vector:

$$x_{i} = \begin{cases} 1, & \text{if the}\;i\text{th cell is selected} \\ 0, & \text{otherwise} \end{cases}$$

The presence or absence of a species in a cell was expressed by the species coefficient as follows:

$$s_{i,j} = \begin{cases} 1, & \text{if the}\;j\text{th species exists in the}\;i\text{th cell} \\ 0, & \text{otherwise} \end{cases}$$

Thus, the objective function is expressed mathematically by:

$${\text{Minimize}}\;\sum_{i = 1}^{m} x_{i} ,$$
(1)

where m is the total number of cells. To ensure that each plant species was included at least once, in any cell, we used the following constraints:

$$\sum_{i = 1}^{m} s_{i,j} \cdot x_{i} \ge 1,\quad {\text{for}}\;{\text{all}}\;j.$$
(2)
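
A minimal sketch of this set-covering formulation on toy data is given below. It deliberately swaps in a brute-force search over cell combinations for the integer programming solver, which is only practical for a handful of cells; the cell and species names are hypothetical.

```python
from itertools import combinations


def minimum_sets(presence):
    """Return every smallest combination of cells that covers all species.

    presence: cell -> set of species present in that cell
    """
    cells = sorted(presence)
    all_species = set().union(*presence.values())
    for size in range(1, len(cells) + 1):
        # Constraint (2): every species must occur in at least one selected cell.
        optimal = [
            combo
            for combo in combinations(cells, size)
            if set().union(*(presence[c] for c in combo)) == all_species
        ]
        if optimal:  # Objective (1): the first feasible size is the minimum.
            return optimal
    return []


# Hypothetical toy data: five cells, four species.
presence = {
    "c1": {"sp1", "sp2"},
    "c2": {"sp2", "sp3"},
    "c3": {"sp3", "sp4"},
    "c4": {"sp1", "sp4"},
    "c5": {"sp2"},
}
solutions = minimum_sets(presence)
```

The toy instance has two equally small solutions, which mirrors the situation in which a minimum set problem has several optimal solutions.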

Minimum set problems often have several equally optimal solutions (Rodrigues et al. 1999). Therefore, we conducted an iterative process to seek a set of optimal solutions, while updating the problem by setting the obtained solution as an additional constraint, thus seeking an exact solution different from previous ones. We repeated this procedure until we obtained 300 solutions. In each iteration, we randomly shuffled the order of decision variables to search the entire set of optimal solutions evenly (Gaston et al. 2001). We conducted non-metric multidimensional scaling analysis using the 300 solutions and the 300 randomly created cell-sets and confirmed that the combination of cells selected as the minimum set was similar between the multiple optimal solutions (Fig. S4). For the minimum set areas identified, we counted the number of cells that overlapped with the existing protected areas for each protection level.
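
One common way to write such an exclusion constraint is a cut that forbids reselecting all cells of a previously obtained solution. This form is an assumption for illustration, since the exact constraint is not stated here. Writing S_k for the set of cells selected in the kth solution (a symbol introduced only for this example):

$$\sum_{i \in S_{k}} x_{i} \le \left| S_{k} \right| - 1,\quad {\text{for}}\;{\text{each}}\;{\text{previously}}\;{\text{obtained}}\;{\text{solution}}\;k.$$

Because all optimal solutions contain the same number of cells, a cut of this form removes only S_k itself from the set of optimal solutions and leaves every other optimal solution feasible.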

Second, we explored the explanatory factors for the spatial arrangements of the existing protected areas using the random forest method. We set the presence/absence of existing protected areas for each protection level (high-ranked only; high- and medium-ranked; and high-, medium-, and low-ranked) as the binary response variables. The explanatory variables included four geographic variables, latitude (Lat), longitude (Lon), elevation (Elev), and distance from coastline (Dcoast); two socio-economic variables, HII and national forest cover (NF); five climatic variables, absolute minimum temperature (AMT), actual evapotranspiration (AET), water balance (WB), global solar irradiance (Irr), and snow depth (Snow); two edaphic variables, soil pH (SpH) and soil cation exchange capacity (CEC); and twelve eco-evolutionary variables, species richness (SR), endemic species richness (ESR), and ED for each group (ferns, gymnosperms, and angiosperm trees and herbs; Fig. 2). In addition, the spatial variation of species composition (beta diversity) potentially affects the spatial arrangement of protected areas. For the four taxonomic (or functional) assemblages in all the grid cells, we computed the Sørensen-based multiple-site dissimilarity measure (β) of species composition between the surrounding eight cells (Baselga 2010). The names and abbreviations of variables are summarized in Table 1. The inter-variable correlation was moderate for the geographic and socio-economic variables, but relatively strong for the biodiversity features, especially between SR, ESR, and ED (Fig. S5).
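
The multiple-site Sørensen dissimilarity can be sketched as below, following the formula of Baselga (2010) as we understand it: pairwise counts of unshared species are summed over all site pairs and scaled by the species shared across sites. The function name is hypothetical, and which cells enter the neighborhood (for example, whether the focal cell is included) is left to the caller.

```python
from itertools import combinations


def multi_site_sorensen(sites):
    """Multiple-site Sorensen dissimilarity for a list of species sets."""
    total_richness = len(set().union(*sites))
    # Sum of site richness minus total richness: species shared across sites.
    shared_term = sum(len(s) for s in sites) - total_richness
    sum_min = sum_max = 0
    for a, b in combinations(sites, 2):
        only_a, only_b = len(a - b), len(b - a)  # species unique to each site
        sum_min += min(only_a, only_b)
        sum_max += max(only_a, only_b)
    denominator = 2 * shared_term + sum_min + sum_max
    return (sum_min + sum_max) / denominator if denominator else 0.0


# Hypothetical neighborhood of three cells.
beta = multi_site_sorensen([{"a", "b"}, {"b", "c"}, {"c", "d"}])
```

For two sites the expression reduces to the familiar pairwise Sørensen dissimilarity, which is a quick way to check an implementation.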

Fig. 2

Spatial patterns of plant diversity measures in Japan. Species richness (SR) of a ferns, b gymnosperms, c angiosperm trees, and d angiosperm herbs, and evolutionary distinctiveness (ED) of e ferns, f gymnosperms, g angiosperm trees, and h angiosperm herbs

Table 1 Complete name, abbreviation, and type of the variables used in the statistical analyses

In the random forest analysis, we generated 500 classification trees for each run. We used the frequency of the minority class as the size of bootstrap sampling to address any unbalanced frequencies between the classes (Chen et al. 2004). We calculated the relative importance of each explanatory variable based on increases in the out-of-bag error estimate, which allows relatively stable ranking of variables to be obtained even with strong correlations between the explanatory variables (Nicodemus 2011). We also checked the presence probability pattern of the existing protected areas along each environmental variable by plotting the partial dependence, which represents the dependence of the response variable on one explanatory variable after averaging out the effects of other explanatory variables (Cutler et al. 2007).
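
The idea behind a partial dependence curve can be sketched without any modeling library: one variable is fixed at each grid value in turn while all other variables keep their observed values, and the predictions are averaged. The predictor `toy_predict` below is a hypothetical stand-in for a fitted classifier, not a random forest, and its coefficients are invented for the example.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_predict(row):
    """Hypothetical protection probability: rises with Elev, falls with HII."""
    p = 0.1 + 0.0005 * row["Elev"] - 0.01 * row["HII"]
    return min(1.0, max(0.0, p))


# Hypothetical grid cells with two explanatory variables.
rows = [
    {"Elev": 100, "HII": 30},
    {"Elev": 800, "HII": 10},
    {"Elev": 1500, "HII": 2},
]
curve = partial_dependence(toy_predict, rows, "Elev", [0, 1000, 2000])
```

With a fitted model, `predict` would be replaced by a call that returns the predicted probability of protection for one grid cell.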

Finally, we evaluated the representativeness of biodiversity processes, i.e. how well the environmental drivers of plant diversity were captured within the existing protected areas network, or how far their distribution there deviated from that across Japan. To detect the predominant drivers of plant diversity at the 1 km × 1 km grid-cell level, we conducted multiple regression analysis using ten environmental factors: four climatic/edaphic factors, AMT, WB, CEC, and SpH; and six geohistorical factors, Quaternary changes in temperature (Qtc) and precipitation (Qpc) representing paleoclimatic stability, coverage areas of alluvial plains (Allu) and pyroclastic flows (Pyro) representing geological stability, and distance from the continent (Dcont) and elevation (Elev) representing geographical isolation of habitats. In the multiple regression model, these environmental factors were set as the explanatory variables, and SR, ESR, ED, and β of each taxonomic (or functional) group (ferns, gymnosperms, and angiosperm trees and herbs) were set as the response variables. Note that land area was included as a covariate in the regression analysis to remove coastline effects, and the variance inflation factors between the explanatory variables were sufficiently low (1.0–3.1) to indicate little influence of multicollinearity. The relative importance of environmental factors in explaining biodiversity measures was quantified by the coefficient of partial determination for each explanatory variable. Then, we compared the probability density distributions of the biodiversity drivers between the entirety of Japan and the existing protected areas under the three protection levels (high; high and medium; and high, medium and low). For each biodiversity driver, we estimated the kernel density distributions of the protected areas and the entirety of Japan, and tested the differences using a Kruskal–Wallis test (α = 0.05).
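
For reference, the coefficient of partial determination for the kth explanatory variable is commonly defined from the residual sums of squares of the full model and of the model refitted without that variable. The notation below is introduced here for illustration only and is not taken from the analysis itself:

$$R_{{\text{partial}},k}^{2} = \frac{{\text{SSE}}_{ - k} - {\text{SSE}}_{\text{full}}}{{\text{SSE}}_{ - k}},$$

where SSE_full is the residual sum of squares of the model containing all explanatory variables and SSE_(−k) is that of the model omitting variable k. The value is the proportion of the variation left unexplained by the other variables that is accounted for by adding variable k.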

The minimum set problem was solved as a 0–1 integer programming problem using Gurobi Optimizer v. 5.6 (http://www.gurobi.com/). The other analyses and all graphics were performed and produced using the R environment for statistical computing (R Development Core Team 2014) with the packages ‘vegan’ (Oksanen et al. 2012) to conduct non-metric multidimensional scaling; ‘betapart’ (Baselga and Orme 2012) to calculate beta diversity; ‘picante’ (Kembel et al. 2010) to edit the phylogeny and calculate ED; and ‘randomForest’ (Liaw and Wiener 2014) to undertake the random forest analysis.

Results

Spatial patterns of existing protected areas

For most vascular plant species, the potential distributions overlapped with the existing protected areas at the 1-km grid scale (Table 2); only ten species were not captured by any protected area. Based on the predicted distribution for all species, the minimum set of 252 grid cells was identified (Fig. 3). Nearly half of the minimum set cells were selected from outside of the existing protected areas (Table 2; Fig. 3). These trends were in general consistent with the results using the raw data of species occurrence (Table 2).

Table 2 Number of plant species and grid cells (1-km resolution) included in existing protected and unprotected areas
Fig. 3

Minimum set (Min. set) of cells that covers all vascular plant species in Japan. Using a linear integer programming model, exact optimal solutions for the minimum set areas were obtained at 1-km grid resolution. The existing protected areas (PA) for the three protection ranks are also shown: high (8468 cells), medium (27038 cells), and low (110181 cells). Yellow and red areas indicate the cells in the minimum set that did and did not overlap with the high-ranked protected areas (1134 and 19585 cells), respectively. Note that the minimum set areas include the 300 equivalent multiple solutions and are drawn at 10-km grid resolution for ease of visualization

The pattern of existing protected areas was correlated with the geographic, socio-economic, climatic, edaphic, and eco-evolutionary variables (Figs. 4, S6). Of these, the geographic variables explained the location of existing protected areas, especially for a higher level of protection including high- and medium-ranked areas, to the greatest extent: Dcoast was a predominant predictor, followed by NF, Elev, Lat, and HII (Fig. 4a, b). The climatic and edaphic variables were of subordinate importance, in which WB and AET were relatively predominant predictors. For the eco-evolutionary variables, β in particular showed a relatively higher importance for all protection levels, whereas ESR and ED had little explanatory power. The partial dependence of the existing protected areas on each explanatory variable exhibited nonlinear patterns (Figs. 5, S6). The partial dependence of biodiversity features showed bimodal patterns, except for those of gymnosperms (Fig. 5); the existing protected areas are located in areas with low and high biodiversity. Bimodal patterns of partial dependence were also observed for Lon, Lat, Dcoast, AMT, WB, AET, Irr, CEC, and SpH (Fig. S6); i.e. the existing protected areas are relatively scarce in the middle latitudinal and longitudinal regions and climatically and edaphically moderate conditions. A nearly monotonically increasing pattern of partial dependence was observed for NF, Snow, and Elev, whereas a decreasing pattern was observed for HII (Fig. S6).

Fig. 4

Relative importance of explanatory factors in the random forest models for explaining the spatial arrangement of existing protected areas at the 1 km × 1 km grid level: a high-ranked, b high- and medium-ranked, and c high-, medium-, and low-ranked. The explanatory factors were the geographical variables of latitude (Lat), longitude (Lon), elevation (Elev), and distance from coastline (Dcoast); the socio-economic variables of human influence index (HII) and national forest cover (NF); the climatic or edaphic variables of absolute minimum temperatures (AMT), water balance (WB), actual evapotranspiration (AET), global solar irradiance (Irr), snow depth (Snow), soil pH (SpH), and soil cation exchange capacity (CEC); and the eco-evolutionary variables of species richness (SR), endemic species richness (ESR), evolutionary distinctiveness (ED), and beta diversity (β) for each taxonomic/functional group, ferns, gymnosperms (gym), angiosperm trees (ang.t), and angiosperm herbs (ang.h)

Fig. 5

Partial dependence plot of eco-evolutionary variables in the random forest model to explain the spatial arrangement of existing protected areas at the 1 km × 1 km grid level: high-ranked (thick red line), high- and medium-ranked (solid black line), and high-, medium-, and low-ranked (dashed line). The eco-evolutionary variables were species richness (SR, a–d), endemic species richness (ESR, e–h), evolutionary distinctiveness (ED, i–l), and beta diversity (β, m–p) for the taxonomic (or functional) groups ferns, gymnosperms (gym), angiosperm trees (ang.t), and angiosperm herbs (ang.h). Note that the random forest model also included geographical, socio-economic, climatic, and edaphic variables. Partial dependence plots for those variables are provided in Fig. S6

Representativeness of biodiversity drivers in existing protected areas

SR, ESR, ED, and β showed geographical patterns across Japan (Fig. 2). AMT, Qtc, Qpc, Dcont, and Elev were predominant drivers of plant diversity: these environmental variables exhibited higher values of the coefficient of partial determination (Fig. 6). For most environmental variables, the patterns of the existing protected areas significantly differed from those of the entirety of Japan (P < 0.05). Especially in high-ranked protected areas, AMT and Elev greatly deviated from the patterns observed for the whole of Japan: cold-climate and high-elevation regions were disproportionately captured (Fig. 6). Qtc, Qpc, and Dcont also deviated substantially in the protected areas: paleoclimatically unstable regions, which were located at intermediate distances from the continent, were over-represented (Fig. 6).

Fig. 6

Representativeness of biodiversity drivers in protected areas networks. The patterns of environmental features (defined as biodiversity drivers) across the entirety of Japan and within the existing protected areas are assessed using the probability density distributions of ten environmental variables at the 1 km × 1 km grid-cell level: a absolute minimum temperatures (AMT), b water balance (WB), c soil cation exchange capacity (CEC), d soil pH (SpH), e Quaternary changes in temperature (Qtc), f Quaternary changes in precipitation (Qpc), g coverage area of pyroclastic flows (Pyro), h coverage area of alluvial plains (Allu), i shortest distance from the continent (Dcont), and j elevation (Elev). The existing protected areas were categorized into three ranks based on their protection levels: high (H), high and medium (H + M), and high, medium, and low (H + M + L). D1, D2, and D3 represent the Kruskal–Wallis statistics, which indicate the degree of difference between the probability density distributions for the entirety of Japan and the existing protected areas (H, H + M, H + M + L), respectively. Note that all of the differences were statistically significant (P < 0.05), except for D1 of Pyro. The colored matrix represents the relative importance of each environmental variable evaluated by the partial coefficient of determination in multiple regression analyses. In the regression model, the response variables were biodiversity indices of vascular plants: species richness (SR), endemic species richness (ESR), evolutionary distinctiveness (ED), and beta diversity (beta) for ferns, gymnosperms (gym), angiosperm trees (ang.t), and angiosperm herbs (ang.h). The explanatory variables were the ten environmental variables listed above

Discussion

Our analyses revealed the current status of the conservation strategies that underpin protected area designation in Japan. The existing protected areas in Japan at least partly overlap with the distribution ranges of a majority of vascular plant species, especially when the medium- and low-ranked protected areas are included (Table 2). However, caution is needed when interpreting the absolute numbers of covered or uncovered species, because such performance analyses are highly dependent on the quality and resolution of distribution data (Rondinini et al. 2006). Notably, the minimum set formulation with at least one occurrence per species identifies the smallest set of selection units as priority areas, i.e. the “minimum conservation target”, and may be inadequate for proposing new reserve networks that ensure the persistence of biodiversity in the context of metapopulation/metacommunity processes (dispersal, colonization, and extinction) between multiple populations or habitats (Cabeza and Moilanen 2001). Despite such methodological limitations, nearly half of the minimum set was selected from outside the existing protected area networks, indicating surprisingly low conservation efficiency. Furthermore, the locations of current protected areas, especially for high- and medium-ranked areas, were weakly associated with key biodiversity features of vascular plant species but were mainly associated with socio-economic and geographic factors. For example, protected areas situated in national forests at higher elevations could play a significant role in the conservation of isolated mountain habitats that contain a unique species composition or are characterized by compositional similarity with species assemblages in the surrounding areas. Conversely, protected areas are relatively scarce in areas of moderate and low elevation (except for coastal areas) and, consequently, are not representative of the broader geographical pattern of vascular plant biodiversity. These results indicate that Japan’s existing protected areas neither efficiently nor effectively conserve plant diversity associated with a variety of different habitat conditions or functional niches (Kubota et al. 2015; Shiono et al. 2015).
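The minimum set formulation discussed above (every species represented in at least one selected unit) can be sketched with a simple greedy heuristic in Python. This is an assumed stand-in for illustration only: it is not the prioritization method used in this study, it does not guarantee the true minimum, and the cell identifiers, species labels, and protected set are hypothetical.

```python
def greedy_minimum_set(occurrences):
    """Select cells until every species occurs in at least one selected cell.

    occurrences: dict mapping cell id -> set of species recorded in that cell.
    Greedy heuristic: repeatedly take the cell adding the most uncovered species.
    """
    uncovered = set().union(*occurrences.values())
    selected = []
    while uncovered:
        # sorted() makes tie-breaking deterministic (first cell id wins).
        best = max(sorted(occurrences), key=lambda c: len(occurrences[c] & uncovered))
        selected.append(best)
        uncovered -= occurrences[best]
    return selected

def share_outside(selected, protected):
    """Fraction of selected cells that lie outside the protected set."""
    outside = [c for c in selected if c not in protected]
    return len(outside) / len(selected)

# Hypothetical grid cells with the species recorded in each.
cells = {
    "c1": {"A", "B"},
    "c2": {"B", "C"},
    "c3": {"C"},
    "c4": {"D"},
}
protected_cells = {"c1", "c3"}  # assumed existing reserves

minimum_set = greedy_minimum_set(cells)
print(minimum_set)                                  # ['c1', 'c2', 'c4']
print(share_outside(minimum_set, protected_cells))  # 2 of 3 selected cells are unprotected
```

The second function mirrors the efficiency measure reported above: the proportion of the minimum set that falls outside the existing protected area network.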

Although criticizing the current design of Japan’s protected areas network may not be helpful, it is important to propose a conservation scheme based on scientific evidence that warrants future consideration. The current mismatch between actual biodiversity features and existing protected areas is not surprising but reflects the imperfect knowledge available for implementing conservation actions that are prone to compromises with economic demands (e.g. Balmford et al. 2001). In the case of biogeographic ignorance, it is extremely difficult to identify critical areas for biodiversity conservation (Riddle et al. 2011) and even harder to make a case for designation in economically productive areas such as lowland forests rich in valuable timber (Scott et al. 2001). Indeed, our finding that protected areas tend to be located at higher elevations (Fig. S6) is driven by the focus on ensuring economic activities in lowland areas. Such representation bias in the existing protected area networks has also been reported in previous studies (e.g. Batllori et al. 2014). In addition, the lack of phylogenetic information, i.e. Darwinian shortfalls, has recently become an issue for biodiversity conservation (Diniz-Filho et al. 2013). Carvalho et al. (2011) asserted that incorporating evolutionary processes into conservation planning can deliver fairly different spatial priorities compared with planning that is developed when only species representation is considered. Zupan et al. (2014) showed that the phylogenetic diversity of terrestrial vertebrates in Europe is protected unequally.

In Japan, a variety of historical effects associated with isolated and stable habitats, as well as energy availability, were involved in the formation of the endemic flora (Kubota et al. 2015, 2016). Therefore, this study focused on current climate and edaphic factors, paleoclimatic and geological stability, and geographical isolation, which control the species sorting, species accumulation, and speciation processes. Indeed, the dominant environmental drivers of biodiversity features (including ED) are not well represented in the existing protected areas. This finding indicates under-representativeness of the ecological and evolutionary processes that underpin biodiversity patterns within the existing protected areas network. The biased representation of the climatic gradient suggests that the existing protected areas network is not adequate to cover the shifts of biodiversity patterns in response to future climate change (Batllori et al. 2014). Moreover, poor representativeness regarding the gradient of paleoclimatic stability and the spectrum of geographical isolation, which are related to speciation and/or extinction, may pose a threat to the persistence of evolutionary hotspots that could secure the preservation of future generations of biodiversity (Klein et al. 2009). Our results demonstrate considerable room for improvement in the present conservation plans in Japan, especially to ensure adequate representation of areas capable of in situ diversification that responds to region-specific geohistories.

In 2010, the Convention on Biological Diversity adopted a new strategic plan to tackle biodiversity loss and set the Aichi targets. One of these targets was that each nation should commit to expanding the protected area coverage to 17% (CBD 2011). In Japan, effective (high-ranked) protected areas account for only 2% of terrestrial habitats (3.9% if all protected areas within the IUCN’s categories I and II are included). However, about 40% of all land area has already been designated as protected areas, including medium- and low-ranked areas, in which key environmental drivers of biodiversity features are captured relatively well (Fig. 6). This highlights the need for enforcement in the existing protected areas to promote effective protection or replacement of underperforming protected areas (Fuller et al. 2010), and for the establishment of additional protected areas in priority areas (Jenkins and Joppa 2009). Although the Japanese government promotes strategies and action plans, implementation differs considerably between local authorities (Table S1). Based on a spatial prioritization analysis of protected areas around the world, Pouzols et al. (2014) found that parochialism of action plans may increase the local benefits of conservation. As demonstrated in this study, however, implementation of inconsistent plans across numerous local authorities is not an efficient approach to securing the conservation of endangered or endemic biotas (Hunter and Hutchinson 1994).

In conclusion, our representativeness assessment provides a fundamental basis for systematic conservation planning to secure the long-term persistence of ecological and evolutionary potential through the processes of species sorting, species accumulation, and in situ diversification. Importantly, under-representativeness of historically stable areas, which may function as refugia in response to future climate change, highlights the importance of considering the historical perspective, as well as current climatic conditions, when designing and implementing protected areas (Batllori et al. 2014). Modifying the protected area network in Japan, one of the world’s biodiversity hotspots, to preserve the highly endemic biota by considering the long-term historical perspective will contribute to biodiversity conservation at the global scale. Further research into spatial conservation prioritization is needed, with a focus on methods that take into account both the spatial and temporal connectivity of environmental variables (Rose and Burton 2009) and the real-world complexity of cost layers (Lehtomäki and Moilanen 2013).