Introduction

Bitterlings (subfamily: Acheilognathinae) are small cyprinid fish, globally comprising approximately 40 species or subspecies (Froese and Pauly 2010), which are distributed in the temperate regions of Europe and East Asia, including Japan and Taiwan (Bânârescu 1990). The group is characterized by an unusual spawning relationship with freshwater mussels; they use only the interlamellar spaces of the paired inner and outer gills of living unionid freshwater mussels as a spawning substrate (Smith et al. 2004; Kitamura et al. 2009). Sixteen species or subspecies of bitterlings are present in Japan. Because most of these fishes are endemic (Arai and Akai 1988; Arai et al. 2007), Japan is regarded as a hotspot of bitterling diversity (Miyake et al. 2011). However, Japanese bitterlings are now under the threat of extinction because of anthropogenic effects (Katano and Mori 2005) and are consequently listed in the Japanese Species Red List (Ministry of the Environment 2007).

Six native bitterling fishes inhabit northern Kyushu, two of which are endemic subspecies of the region (Nagata 2005). These native fishes are Tanakia lanceolata, Tanakia limbata, Acheilognathus tabira nakamurae, Acheilognathus rhombeus, Rhodeus ocellatus kurumeus, and Rhodeus atremius atremius, all of which except A. rhombeus are designated as threatened wildlife in the Japanese Red List (Ministry of the Environment 2007). Two alien bitterling fishes, Acheilognathus cyanostigma and Rhodeus ocellatus ocellatus, have been observed within the range of these native species (Nakajima et al. 2008); the latter is ranked by the Ministry of the Environment (2010) as an “adventive species” that requires preventive measures or special care.

Recently, advances in statistics and geographic information systems (GIS) have contributed to the development of predictive modeling of species distributions (Guisan and Thuiller 2005; Kozak et al. 2008). These models usually produce predictions drawn on satellite images or maps by using the statistical correspondence between species occurrence data and environmental variables (Peterson and Vieglais 2001). In Japan, such species distribution models have recently been applied as a technique for biodiversity conservation (Kano et al. 2010; Sato et al. 2010; Fukuda et al. 2011). Despite their applicability, these techniques have never been used to study the distributions of bitterlings inhabiting Kyushu.

The goal of this study was therefore to develop predictive models for bitterling distributions in order to provide information helpful for conserving native bitterlings and for preventing the expansion of alien bitterling species. To accomplish this, the potential distributions of bitterlings inhabiting northern Kyushu, Japan, were predicted using generalized linear models (GLMs) that were selected based on the Akaike information criterion (AIC).

Materials and methods

Design. The study design included three steps: model development, model evaluation, and prediction of bitterling fish distributions. A biogeographical study divided the freshwater fish fauna in Kyushu into four areas (Watanabe 1998), of which the northern Kyushu area exhibited relatively high species richness. The fauna of Fukuoka Prefecture, which is located in the center of northern Kyushu, was classified into two areas: the western area including Chikuzen and Chikugo, and the eastern area including Chikuho and Buzen (Nakajima et al. 2006). With this pattern in mind, we chose northwestern (NW) Kyushu for model development and northeastern (NE) Kyushu for model evaluation (Fig. 1). First, each bitterling distribution model was created using both spatially explicit environmental data and bitterling occurrence data from 710 sites in the NW area. Second, model performance was evaluated based on the same type of data from 362 sites in the NE area. Finally, the potential distribution of each bitterling was predicted in northern Kyushu. For this purpose, environmental measurements gathered from 200 sites without bitterling catch data were prepared in addition to the NW and NE data (1,272 sites in total) and used for predicting the potential distributions of bitterling fishes.

Fig. 1

The 710 sites (solid squares) for model development, 362 sites (open circles) for model evaluation, and the additional 200 sites (open squares) for prediction of potential bitterling distributions in northern Kyushu, Japan. Squares and circles also indicate the sites located in northwestern and northeastern areas, respectively

Data collections. Bitterling catch data have been gathered since the 1990s in major rivers and agricultural canals in Kyushu, Japan. Of these, data from northern Kyushu included 84 river systems (NW: 51 systems; NE: 33 systems) that were used to predict bitterling distribution in this study. The details of the fish sampling methods are described in Fukuda et al. (2011).

Species or subspecies included in the analysis were six native bitterling fishes—Tanakia lanceolata (Tla), T. limbata (Tli), Acheilognathus tabira nakamurae (Atn), A. rhombeus (Ar), Rhodeus ocellatus kurumeus (Rok), and R. atremius atremius (Raa)—and an alien bitterling, R. ocellatus ocellatus (Roo). Since A. cyanostigma (Ac) inhabited only the Midori River system, the present data for this species were too sparse in our database (Table 1) to include in further analyses. Previous reports on distributions of mitochondrial DNA (mtDNA) from Rok and Roo in Kyushu have indicated that three types of populations exist: one with only Rok mtDNA, one with only Roo mtDNA, and one that contains a mixture of both mtDNAs (Miyake et al. 2008). Since several studies reported the hybridization between these subspecies (Nagata 1980, 2005; Nagata et al. 1996; Kawamura et al. 2001; Kawamura 2005), we treated the populations with only Rok mtDNA as Rok populations, and the other populations as Roo.

Table 1 Occurrence data for each native and alien bitterling species or subspecies in northwestern (NW) and northeastern (NE) Kyushu, Japan, and their status, such as “critically endangered (CR),” “endangered (EN),” and “near threatened (NT),” listed in the Japanese Red List (Ministry of the Environment 2007)

In this study, we measured eight environmental characteristics: length of main river (LMR, km), elevation (EL, m), river gradient (RG), river width (RW, m), number of river to canal connections (RC), canal network index (CNI), area of paddy fields (AP, km²), and residential area (RA, km²). Most LMR measurements were taken from online statistical records provided by local and Japanese governments (Fukuoka Prefecture 2007; Nagasaki Prefecture 2007; Saga Prefecture 2007; Kumamoto Prefecture 2008; Ministry of Land, Infrastructure, Transport and Tourism 2009). For the several rivers that lack statistical records of LMR, and for the seven other environmental characteristics, data were gathered from Digital Map 25,000 (Japan Map Center, Tokyo), edited by the Geographical Survey Institute, Japan. We overlaid digital elevation data on the Map Image and measured environmental characteristics for each site with the GIS software KASHIMIR 3D version 8.0.9 (http://www.kashmir3d.com/). We also used a web map system (ZENRIN DataCom Co. Ltd., http://www.its-mo.com/) for measuring RW, because some narrow rivers and canals could not be measured with the digital map. Details of the measurements of RC, CNI, AP, and RA are described in our previous report (Fukuda et al. 2011).

Fourth root transformations were performed on the environmental data, and a correlation matrix was prepared to analyze the variables for multicollinearity. Because EL and RG were highly correlated, as indicated by a Pearson's correlation coefficient greater than 0.7 (Table 2), EL was excluded to prevent multicollinearity between predictor variables from affecting the following analyses.
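The screening step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`fourth_root`, `pearson`, `correlated_pairs`) and the five-site measurements are hypothetical and are not taken from the study data; only the fourth-root transformation and the 0.7 threshold come from the text.

```python
import math

def fourth_root(values):
    """Fourth-root transform of non-negative measurements."""
    return [v ** 0.25 for v in values]

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlated_pairs(data, threshold=0.7):
    """Return (var_a, var_b, r) for every pair with |r| above the threshold."""
    transformed = {name: fourth_root(vals) for name, vals in data.items()}
    names = list(transformed)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(transformed[a], transformed[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs

# Hypothetical measurements for five sites (NOT data from the study).
sites = {
    "EL": [1.0, 16.0, 81.0, 256.0, 625.0],
    "RG": [0.0001, 0.0016, 0.0081, 0.0256, 0.0625],
    "RW": [30.0, 5.0, 22.0, 8.0, 15.0],
}
flagged = correlated_pairs(sites)
for a, b, r in flagged:
    print(f"{a} and {b}: r = {r:.3f}")
```

In this toy data set only the EL and RG pair is flagged, which mirrors the situation in which one of the two variables would be dropped before model fitting.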

Table 2 Correlation matrix of variables used during model selection (Pearson's correlation coefficient)

Data analyses. For model development, a GLM was used (McCullagh and Nelder 1989). The dependent variable was the presence/absence (1/0) of each bitterling species or subspecies at each site, and the predictor variables included LMR, RG, RW, RC, CNI, AP, and RA. A logistic regression was conducted for all possible sets of predictor variables, from a null model including no predictors to a full model including all predictors. The Akaike information criterion (AIC; Akaike 1974) was used for model selection; the model with the lowest AIC was defined as the best-fit model.
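The all-subsets procedure can be illustrated with a minimal sketch, assuming a plain Newton-Raphson fit of the logistic model written with the Python standard library only. It is not the software used in the study; the helper names (`solve`, `predict`, `fit_logistic`, `rank_models`) and the 12-site toy data are hypothetical. With the seven predictors listed above, the same loop would fit 2^7 = 128 candidate models.

```python
import math
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def predict(beta, x):
    """Probability of occurrence under the logit link."""
    return 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))

def fit_logistic(rows, y, iters=25):
    """Fit a binomial GLM by Newton-Raphson; return (coefficients, AIC)."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        p = [predict(beta, x) for x in X]
        grad = [sum((yi - pi) * x[j] for x, yi, pi in zip(X, y, p)) for j in range(k)]
        hess = [[sum(pi * (1 - pi) * x[i] * x[j] for x, pi in zip(X, p))
                 for j in range(k)] for i in range(k)]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    p = [predict(beta, x) for x in X]
    loglik = sum(math.log(pi) if yi else math.log(1 - pi) for yi, pi in zip(y, p))
    return beta, 2 * k - 2 * loglik              # AIC = 2k - 2 ln L

def rank_models(data, y):
    """Fit every subset of predictors and return (AIC, subset) sorted by AIC."""
    names = list(data)
    results = []
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            rows = [[data[v][i] for v in subset] for i in range(len(y))]
            _, aic = fit_logistic(rows, y)
            results.append((aic, subset))
    return sorted(results)

# Hypothetical values for 12 sites (NOT data from the study).
predictors = {
    "LMR": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0],
    "RC": [0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0],
}
presence = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]
ranked = rank_models(predictors, presence)
best_aic, best_subset = ranked[0]
print(best_subset, round(best_aic, 3))
```

The sketch assumes that presence and absence overlap along every predictor; with perfectly separated data the Newton-Raphson iterations would not converge, and a dedicated statistics package would be the safer choice.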

For model evaluation, a receiver-operating characteristic (ROC) analysis was performed as follows. First, we obtained a set of ROC curves for each bitterling fish by using the best GLMs and the NW data. Second, we obtained ROC curves by applying the same GLMs to the NE data. The areas under the ROC curve (AUCs) were compared between the NW and NE cases by using a Z statistic (Lamy et al. 2000). An AUC also reflects how well the model distinguishes between the presence and absence of a species: an AUC greater than 0.9 indicates high accuracy, whereas 0.7–0.9 indicates moderate accuracy and 0.5–0.7 indicates low accuracy (Akobeng 2007). We evaluated the predictive performance of each model according to these standards.
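A minimal sketch of this evaluation step is given here. The AUC is computed as the proportion of presence/absence site pairs that the model ranks correctly, and the accuracy classes follow the thresholds quoted above. The Z statistic is an assumption: it uses the Hanley and McNeil standard error for two independent samples, which may differ from the exact procedure of Lamy et al. (2000). All function names and the eight example scores are hypothetical.

```python
import math

def auc(scores, labels):
    """AUC as the share of presence/absence pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_se(a, n_pos, n_neg):
    """Hanley-McNeil standard error of an AUC (assumed here, see lead-in)."""
    q1 = a / (2 - a)
    q2 = 2 * a * a / (1 + a)
    var = (a * (1 - a) + (n_pos - 1) * (q1 - a * a)
           + (n_neg - 1) * (q2 - a * a)) / (n_pos * n_neg)
    return math.sqrt(var)

def z_between(a1, se1, a2, se2):
    """Z statistic for the difference between two independent AUCs."""
    return (a1 - a2) / math.sqrt(se1 ** 2 + se2 ** 2)

def accuracy_class(a):
    """Accuracy classes after Akobeng (2007); boundary handling is an assumption."""
    if a > 0.9:
        return "high"
    if a >= 0.7:
        return "moderate"
    if a >= 0.5:
        return "low"
    return "worse than chance"

# Hypothetical predicted probabilities and observations (NOT study data).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
a = auc(scores, labels)
se = auc_se(a, n_pos=4, n_neg=4)
print(a, accuracy_class(a), round(se, 4))
```

Under this formulation, an absolute Z value above 1.96 would indicate a difference between the two AUCs at the 5% level.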

ROC curves were also used to determine an optimal cutoff value for the discrimination of bitterling presence and absence (Akobeng 2007). The cutoff values were first calculated for each bitterling species or subspecies in the ROC analysis using NW data. These cutoff values were then used to convert the probability of occurrence calculated by the best models into presence or absence of each bitterling fish. In the following analyses, we excluded the prediction of Tli distribution because of the low prediction accuracy on the NE data. In addition, the distribution of Atn was not predicted in the NE area, because this subspecies does not occur in this region.
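The conversion from probabilities to presence or absence can be sketched as follows. The text does not state which criterion defined the optimal cutoff, so this example assumes the Youden index (sensitivity + specificity - 1), one common choice in ROC analysis. The function names and the six-site probabilities are hypothetical; summing the converted values across taxa gives the expected number of bitterlings per site.

```python
def youden_cutoff(probs, observed):
    """Return (cutoff, J) maximising sensitivity + specificity - 1 (assumed criterion)."""
    n_pos = sum(observed)
    n_neg = len(observed) - n_pos
    best_j, best_cut = -1.0, None
    for cut in sorted(set(probs)):
        tp = sum(1 for p, o in zip(probs, observed) if o == 1 and p >= cut)
        tn = sum(1 for p, o in zip(probs, observed) if o == 0 and p < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:                      # keep the lowest cutoff on ties
            best_j, best_cut = j, cut
    return best_cut, best_j

def to_presence(probs, cutoff):
    """Convert occurrence probabilities into presence (1) or absence (0)."""
    return [1 if p >= cutoff else 0 for p in probs]

def richness(presence_by_taxon):
    """Number of taxa predicted present at each site."""
    return [sum(site) for site in zip(*presence_by_taxon.values())]

# Hypothetical model outputs and observations for six sites (NOT study data).
probs = {
    "Tla": [0.05, 0.08, 0.12, 0.20, 0.30, 0.45],
    "Rok": [0.02, 0.40, 0.10, 0.35, 0.03, 0.50],
}
observed = {
    "Tla": [0, 0, 1, 0, 1, 1],
    "Rok": [0, 1, 0, 1, 0, 1],
}
cutoffs = {t: youden_cutoff(probs[t], observed[t])[0] for t in probs}
predicted = {t: to_presence(probs[t], cutoffs[t]) for t in probs}
expected_richness = richness(predicted)
print(cutoffs, expected_richness)
```

In practice the cutoffs would be derived once from the model development data and then applied unchanged to the evaluation and prediction sites.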

Results

Model development in NW. The best GLMs for predicting the distributions of bitterling species and subspecies are summarized in Table 3. The best models were selected with 4–6 variables, and mostly included LMR, RG, and CNI as explanatory variables. LMR had positive effects on the presence of all native fishes, whereas it had a negative effect on the presence of Roo. RG was negatively correlated with the presence of six fishes, the exception being Ar. CNI, AP, and RA were positively correlated with the presence of several fishes (CNI: Tla, Ar, Rok, Raa; AP: Tli, Atn, Roo; RA: Tla, Tli, Ar, Raa). The AUC values ranged from 0.753 to 0.927 (Fig. 2), indicating moderate to high accuracies of these models. The cutoff values were calculated as 0.1048 for Tla, 0.0906 for Tli, 0.0668 for Atn, 0.0739 for Ar, 0.1598 for Rok, 0.0632 for Roo, and 0.0994 for Raa.

Table 3 Summary of selected variables and the coefficient and standard error values of the best models for each bitterling
Fig. 2

The receiver-operating characteristic (ROC) curves for model development in the northwestern area (W) and evaluation in the northeastern area (E) of GLM predictions for the occurrence of Tanakia lanceolata (Tla), T. limbata (Tli), Acheilognathus tabira nakamurae (Atn), A. rhombeus (Ar), Rhodeus ocellatus kurumeus (Rok), R. atremius atremius (Raa), and R. ocellatus ocellatus (Roo)

Model evaluation in NE. The AUC values for the NE data of Ar, Rok, Roo, and Raa ranged from 0.759 to 0.851, indicating moderate accuracies of these models. These values did not differ significantly between the NW and NE regions (Z score: 1.102, 0.651, 1.149, and 1.493 for Ar, Rok, Roo, and Raa, respectively), indicating similar predictive ability in each region. For Tli, the AUC value of 0.543 indicated low accuracy, and the values differed significantly between the NW and NE regions (Z score: 3.742). Although the AUC of Tla also differed significantly between the NW and NE regions (Z score: 3.134), the AUC value obtained with the NE data still indicated moderate accuracy (0.761).

Prediction of bitterling distributions. The expected number of native bitterling species or subspecies based on the predictions of occurrence is shown for each site in Fig. 3. The sites with high-predicted bitterling richness were found frequently in the lower and middle reaches of large rivers such as the Onga, Rokkaku, Kase, Chikugo, Midori, and Kuma Rivers. A few rivers along the northern coast of the Sea of Ariake showed especially high potential as hotspots of bitterling diversity. There were also a few sites with high richness in the NE area. In the NW area, the average numbers of actual and predicted native bitterling fishes were 0.43 ± 0.91 and 1.30 ± 1.80, respectively, and the actual values were significantly smaller than those predicted (Mann–Whitney U test: z = −4.300, P < 0.01). In the NE area, the average actual numbers (0.39 ± 0.83) were also significantly smaller than the predicted values (0.99 ± 1.55, z = −8.246, P < 0.01).

Fig. 3

The expected number of native bitterling species and subspecies based on predicted distributions at 1,272 sites in northern Kyushu, Japan. This figure is based on predicted distributions of six native bitterling fishes in the northwestern area and four fishes, excluding Tanakia limbata and Acheilognathus tabira nakamurae, in the northeastern area. Squares and circles indicate the sites located in northwestern and northeastern areas, respectively

Actual and predicted distributions of Rok and Roo are shown in Fig. 4. On the one hand, the distribution of Roo clearly overlapped to a large extent with that of Rok. On the other hand, the predicted distribution for Roo was larger than the actual distribution, indicating the areas for potential expansion. It was also clear that the lower reaches of the Onga, Rokkaku, Kase, Chikugo, and Yabe Rivers yielded additional areas for potential distribution expansion by Roo.

Fig. 4

Map of actual and predicted distributions of Rhodeus ocellatus kurumeus (Rok) and R. ocellatus ocellatus (Roo) at 1,272 sites in northern Kyushu, Japan

Discussion

Model performance. The moderate to high accuracies in model development found in this study (AUC: 0.753–0.927 for NW data) are in accordance with 3 similar studies of native Japanese freshwater fishes, which yielded values of 0.878 for Misgurnus anguillicaudatus (Kano et al. 2010), 0.881 for Opsariichthys uncirostris uncirostris (Sato et al. 2010), and 0.880 for Pseudorasbora parva (Fukuda et al. 2011). Our study, however, differed from most of these in that it used data sets from different areas for model development and evaluation (NW versus NE data sets). In our study, the ROC curves differed significantly between the NW and NE data for Tla and Tli; however, only Tli indicated low accuracy in the NE area. Among previous studies of Japanese freshwater fishes, only the study of O. uncirostris uncirostris used different areas for model development and evaluation (Sato et al. 2010). In that study, the AUC value decreased from model development (0.881) to model evaluation (0.792). Therefore, excluding the model for Tli, the model performances for the 6 bitterlings in this study indicate reliable accuracies.

This decrease in accuracy between model development and evaluation could have occurred because of environmental differences over smaller spatial scales, such as instream and/or microhabitat characteristics, between these areas. In general, river environmental factors are viewed as a hierarchically arranged, nested series of units when examining fluvial ecosystems (Allan and Castillo 2007). In our study, distribution models were developed using GIS-derived (map measurement) data, which only apply to the meso- to macrohabitat scales of these hierarchical units. Therefore, our analyses might overlook smaller scale variations. Researchers in landscape ecology have found an answer to this problem (Turner et al. 2001) and suggested that proper selection of factors from hierarchical units can accommodate differently targeted scales of the analysis. As a comprehensive case study, Kano et al. (2010) selected 2 factors from both map and field measurements to accommodate the 2 scales of their analysis: a model obtained from the map measurement having the AUC value of 0.785 was used for predicting the distributions at a large scale, whereas a model from the field measurement having the AUC value of 0.881 was used in the prediction of potential habitats and/or microhabitats. The target scale of our study therefore most closely corresponds with the former case.

In contrast to most other species and subspecies in this study, the model for Tli showed a significant decrease in the AUC value between the NW and NE cases. A previous genetic study of T. limbata, however, indicated that different populations exist in NW and NE Kyushu (Hashiguchi et al. 2006). We therefore suggest that the difference in distribution patterns observed in our study may be related to the genetic differences between populations in these 2 areas. Further ecological field research is needed to elucidate distribution and genetic patterns within this species.

Predicting native bitterling distributions. All models for native bitterling fishes indicated positive effects of LMR in this study. Similarly, a previous study showed a positive relationship between river lengths flowing into Fukuoka Prefecture and the presence of the 5 native bitterling species or subspecies Tla, Tli, Ar, Rok, and Raa (Nakajima et al. 2006). Several fishes showed negative responses to RG and positive responses to CNI, AP, and RA, suggesting that most bitterlings inhabiting northern Kyushu often appeared in lowland and floodplain areas (Onikura et al. 2007; Nakajima et al. 2010). These results may be particularly important for the conservation of Tla and Raa, which were positively associated with both CNI and RA. These results also suggest that it is important to maintain the diversity of canal networks for the conservation of these species, especially when paddy fields are converted for residential use.

This study projected the potential hotspots for native bitterling diversity on a map. The actual distributions, however, were found to be significantly smaller than those predicted. As discussed previously, this gap between predicted and actual distributions should be interpreted with regard to the scales on which hierarchical units control the distributions. In our study, the occurrence of bitterlings was predicted by taking into account environmental conditions only on a large scale; information on a small scale was not considered. What can a model considering only large-scale factors contribute to the potential hotspot analysis for target species? Onikura and Inui (2011) addressed this question by discussing the applicability of large-scale prediction models to the natural restoration of freshwater organisms. The key points are summarized in Table 4. We use these ideal cases to explain the applicability of a large-scale model and its potential limitations.

Table 4 Ideal cases for the predicted and actual distributions (P presence, A absence) of a target species in view of habitat suitability and environmental conditions at different spatial scales

In case A, target sites maintain environmental conditions suitable for a target species at both large and small scales, and thus the target species is likely to be present in the sites. In such a case, the model should be able to predict the presence of the species. In case B, target sites have suitable conditions on a large scale, whereas the conditions on a small scale are insufficient. In this case, the model considering only large-scale factors would predict the fish to be present in sites where the fish are actually absent. This can result in a deterioration of predictive accuracy because of the over-prediction of species distributions. In cases C and D, the likelihood of false prediction by a large-scale model may be lower than for case B.

Given these patterns, it would be possible to restore or enhance sites classified as case B as bitterling hotspots by improving the environmental conditions at a small scale. This can be more practical and feasible than activities at a larger scale, which may require much higher costs and longer times. Can we suggest similar activities for the sites classified as cases C and D? In these cases, the environmental conditions at a larger scale are insufficient, and thus it would be difficult to perform restoration activities for the same reasons mentioned above. It might be possible to reconstruct suitable habitats for a species if we had better knowledge and a model that precisely represents the environmental requirements of the species. This, however, should be limited to sessile species that can complete their life cycles within a small spatial scale. Therefore, we conclude that the sites classified as case B are best suited as target sites for natural restoration. Based on the bitterling hotspots revealed in this study, we suggest targeting the sites classified as case B as restoration or mitigation sites for the target species. Further study focusing on a small spatial scale is necessary for a deeper understanding of target species, leading to better management strategies.

Potential distribution of and possible risk from an alien subspecies. The alien bitterling Roo was found to have a different distribution pattern from native bitterling fishes. For example, in all models of native fish distribution, LMR was selected as having a positive effect, whereas LMR had a negative effect on Roo. Native Roo is widely distributed in China, Taiwan, and the Korean Peninsula (Matsuzawa and Senou 2008), where rivers are larger than those in northern Kyushu. The distribution pattern of Roo in its native range might therefore appear to conflict with the pattern obtained for alien Roo in our analysis. An explanation for the differences in distribution patterns between native and alien Roo may lie in the presence of large populations of Rok inhabiting northern Kyushu. Rok populations in northern Kyushu mainly inhabit agricultural canals (Miyake et al. 2008), which are intricately connected and form numerous networks that irrigate paddy fields along the Sea of Ariake (Onikura et al. 2007). This subspecies is often one of the dominant fishes there (Onikura et al. 2007) and apparently forms large populations. In addition, Kawamura (2005) mentioned that the population in Kyushu has higher genetic diversity than those in Honshu because of differences in population size and habitat conditions. These Rok population attributes may function against the immigration and expansion of alien Roo in northern Kyushu. We therefore suggest that this is why the distribution of Roo showed negative responses to LMR in spite of its preference for large rivers. We also found similar effects from CNI (Roo: negative effect, Rok: positive effect). However, there also remains the possibility that the population and habitat area of Rok in northern Kyushu are too large for Roo mtDNA to be detected from limited samples.

From the present analyses, the predicted distribution map for Roo indicates parts of the lowland and floodplain along the Sea of Ariake and lower reaches of the Onga River system as potential distribution areas. These areas are mostly included within actual Rok distributions. Several studies report the hybridization between these subspecies (Nagata 1980, 2005; Nagata et al. 1996; Kawamura et al. 2001; Kawamura 2005), and Rok risks extinction through hybridization with Roo (Kawamura et al. 2001). Roo was also categorized as an invasive species in northern Kyushu using the Fish Invasiveness Scoring Kit (FISK; Copp et al. 2005a, b) in our previous study (Onikura et al. 2011). On the basis of our predictions, we suggest that stricter regulations and management strategies against Roo are needed to protect native bitterling species and monitor the expansion of this alien bitterling.