The global ecological niche of lumpfish (Cyclopterus lumpus) and predicted range shifts under climate change

Lumpfish are a commercially significant marine fish that are harvested in roe fisheries and used as cleaner fish in salmon farming; however, little is known of the environmental factors shaping the ecological niche of the species at global scale. As captive-reared lumpfish are sensitive to warm water, the geographic distribution of wild populations may change as sea temperatures rise under expected climate change. After investigating the ecological niche of the lumpfish using Species Distribution Models, we found that nitrate concentration, ice cover, diffuse attenuation, and temperature predicted the probability of lumpfish occurrence. By modelling distribution under expected climate change forecasts within a realistic scenario, we found reduced probability of lumpfish occurrence in areas that currently support roe harvest and the cleaner fish industry. Future conservation of the species and fisheries management should account for changes in lumpfish distribution as the range shifts northward.


Introduction
Climate change leads to shifts in species distributions as populations move from traditional ranges to occupy newer regions that better suit their ecological niche (Donelson et al., 2019). These shifts may affect how economically important species, including those used in fisheries and aquaculture, can be used in future (Froehlich et al., 2018). Investigating how environmental parameters, such as temperature, shape the current distribution of commercially important species can help predict future range shifts under expected climate change models. These forecasts provide critical insight for conservation management, and are vital for protecting heavily exploited species or those that are sensitive to particular environmental parameters. An estimated 50 million juveniles are reared in hatcheries each year to supply cleaner fish for salmon farms, with commercial supplies almost exclusively derived from wild-caught parent broodstock (Powell et al., 2018). Despite their economic significance, there are still knowledge gaps regarding the biogeography and the environmental niche of wild lumpfish, as most contemporary studies focus on optimizing conditions for aquaculture production of juveniles rather than the natural ecology or habitat preference of all life stages of the species (Garcia-Mayoral et al., 2016). Captive-reared lumpfish are sensitive to temperature and do not survive prolonged exposure above 15 °C, though they can withstand water flow up to 110 cm s⁻¹, and show hypoxic stress at less than 63% O₂ (Hvas et al., 2018; Remen et al., 2022). There are distinct genetic groups across the North Atlantic (Jónsdóttir et al., 2018; Pampoulie et al., 2014; Whittaker et al., 2018); however, it is unknown whether these groups show any distinction in ecological niche.
Declines in wild adult populations have been attributed to increasing harvest pressure. For instance, sex- and size-biased harvest is associated with skewed population dynamics in Canadian roe fisheries (Hoenig & Hewitt, 2005). Although lumpfish are classed as Near Threatened by the IUCN (Lorance et al., 2015), more recent analysis suggests the species should be considered Threatened in parts of the North Atlantic (Atkinson et al., 2017).
Species Distribution Models (SDMs) use the niche concept to describe the environmental conditions required by a species (Soberón, 2007; Peterson et al., 2011). In this method, the occurrences of a species are correlated with a set of habitat predictors using different techniques to investigate the factors responsible for observed distributions (Franklin, 2010). SDMs are increasingly applied to marine species to delimit conservation areas (Marshall et al., 2014), manage marine invasive species (Blanco et al., 2021) and estimate impacts of climate change on a species (Cristofari et al., 2018), though most SDMs are still applied to answer theoretical questions (Robinson et al., 2017). SDMs therefore provide valuable tools for analyzing broad-scale distributions of economically important fish stocks, such as lumpfish, to plan future conservation, inform development of aquaculture and fisheries (Rose et al., 2016; Wittmann et al., 2016; Oyinlola et al., 2020) and forewarn of potential consequences of climate change (Schickele et al., 2021). Accordingly, this study has two aims: (1) to identify the variables that shape the niche and current distribution of wild lumpfish, and (2) to assess the potential impact of rising sea temperature on lumpfish distribution given the species' known sensitivity to temperature.

Study area, occurrences and predictors
Geographical background delimitation is key for parameterization and evaluation of the SDM, and selecting biologically meaningful regions has been implemented in analyses on different species (Acevedo et al., 2012), including selecting biogeographical regions where multiple species share similar environmental adaptations (Barve et al., 2011). Therefore, our study area corresponds with the marine biogeographical regions where lumpfish are currently distributed as described by Costello et al. (2017). These integrated regions contain information on multiple environmental factors and heterogeneity related to the evolution of biota: NE Atlantic, Norwegian Sea, N American Boreal and NW North Atlantic, corresponding with realms 3, 4, 8 and 18 in Costello et al. (2017). The Baltic Sea was omitted from this macroecological study because it is composed of brackish water (Muus & Nielsen, 1999).
The occurrence records were collected from GBIF [https://doi.org/10.15468/dl.hjwebg (Fig. S1)], with 8414 records mainly corresponding to adult specimens. After curation, distribution records were extracted to a 100 × 100 km² grid resolution. This resolution has been considered sufficient to test major changes in distribution using climatic and nonclimatic predictors (Araújo & Guisan, 2006). Higher grid resolution has been used in SDMs applied to fish species in global models or those considering various oceans (Tittensor et al., 2010; Pompa et al., 2011), but we chose a lower resolution to capture potential variability at a larger scale (Austin & Van Niel, 2011).
Environmental variables were extracted from the Bio-ORACLE v2.0 database developed for ecological niche modelling (Assis et al., 2018; Tyberghein et al., 2012). Environmental data had a 5 arcmin resolution (i.e. 9.2 km at the equator) and included predictors related to sea temperature, salinity, ice concentrations or nutrients. Due to the lack of knowledge of the main drivers shaping lumpfish distribution, we considered the most widely available predictors at mean depth (Table 1). We accounted for collinearity by calculating the Variance Inflation Factor [VIF (Dormann et al., 2013)], based on the R-squared value of the regression of one variable against all other variables and defined as $\mathrm{VIF}_i = 1 / (1 - R_i^2)$, where $R_i^2$ is the R-squared of the regression of predictor $i$ against all other predictors, and excluded predictors with VIF > 10 (Table 1).
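The collinearity screening described above can be illustrated with a minimal sketch. It is written in Python with the standard library only, whereas the study itself was carried out in R, so it is not the authors' script. The function names (`solve`, `r_squared`, `vif`, `drop_collinear`) are hypothetical, and dropping the predictor with the largest VIF one at a time is an assumption: the text states only that predictors with VIF > 10 were excluded.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(y, others):
    """R^2 of an ordinary least-squares fit of y on the columns in `others`."""
    n = len(y)
    ybar = sum(y) / n
    yc = [v - ybar for v in y]
    cols = []
    for col in others:
        m = sum(col) / n
        cols.append([v - m for v in col])
    # Normal equations on centred data (centring absorbs the intercept).
    A = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(a * b for a, b in zip(ci, yc)) for ci in cols]
    beta = solve(A, rhs)
    fitted = [sum(bk * ck[i] for bk, ck in zip(beta, cols)) for i in range(n)]
    ss_res = sum((a - f) ** 2 for a, f in zip(yc, fitted))
    ss_tot = sum(a * a for a in yc)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """columns: dict of predictor name -> list of values. Returns name -> VIF."""
    out = {}
    for name, y in columns.items():
        others = [v for k, v in columns.items() if k != name]
        r2 = r_squared(y, others)
        out[name] = float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out


def drop_collinear(columns, threshold=10.0):
    """Assumed stepwise rule: drop the largest-VIF predictor until all VIF <= threshold."""
    cols = dict(columns)
    while len(cols) > 1:
        scores = vif(cols)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del cols[worst]
    return list(cols)
```

Called on a dictionary of predictor columns sampled at the same grid cells, `drop_collinear` returns the names of the predictors that survive the screening.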

Modeling approach
Sampling bias in occurrence data can result in overrepresentation of environmental predictors in more intensively surveyed areas (Phillips et al., 2009). Sampling effort can be included in models, but it is rarely quantified for larger regions (Anderson, 2012). To minimize sampling bias, we therefore applied geographic filtering, splitting the data into bins corresponding to the 2982 grids composing the study area (Gallardo & Aldridge, 2013).
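Geographic filtering of this kind can be sketched as follows. This is a hedged Python illustration, not the authors' R code: it assumes the occurrence coordinates have already been projected to a planar system measured in kilometres, and the function name `thin_to_grid` and the rule of keeping the first record per cell are hypothetical choices.

```python
import math


def thin_to_grid(records, cell_km=100.0):
    """Keep one presence per grid cell.

    records: iterable of (x_km, y_km) pairs in a projected coordinate system.
    Returns a dict mapping (column, row) cell indices to the retained record.
    """
    cells = {}
    for x, y in records:
        key = (math.floor(x / cell_km), math.floor(y / cell_km))
        # The first record seen in a cell is retained; later ones are ignored,
        # so heavily surveyed cells count once, like sparsely surveyed ones.
        cells.setdefault(key, (x, y))
    return cells
```

For example, `thin_to_grid([(10, 10), (90, 20), (150, 10)])` retains two records, because the first two fall in the same 100 km cell.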
We used an ensemble model to combine predictions from different models to minimize uncertainty from this selection technique (Marmion et al., 2009). For this, we used the biomod2 package in R (Thuiller et al., 2016), including Generalized Linear Models (Hosmer & Lemeshow, 2000), Generalized Additive Models (Hastie & Tibshirani, 1990), Random Forest (Cutler et al., 2007), Artificial Neural Networks (Manel et al., 1999) and Classification Tree Analysis (Breiman et al., 1984) as modelling algorithms. Each ensemble was calculated by averaging model predictions weighted by Receiver Operating Characteristic (ROC) and True Skill Statistic (TSS) scores (Hao et al., 2019). We randomly selected the pseudo-absences in the training and testing databases and ran the models 100 times with different combinations of presences and pseudo-absences (Barbet-Massin et al., 2012). We evaluated the models using 100 replications for cross-validation (Roberts et al., 2017), as independent data were unavailable (Araújo et al., 2005). Cross-validation is a common procedure that consists of training each model with a random selection of 70% of the presence-absence data, while the remaining 30% are used to test the model (Pearson, 2010). Model performance was assessed with the area under the ROC curve [AUC (Fielding & Bell, 1997)] and TSS, which are commonly used statistics for measuring accuracy of SDMs (Elith et al., 2006). SDMs perform better than random when AUC is more than 0.5 (Swets, 1988) and TSS is more than 0 (Allouche et al., 2006). Ensembles were created including only models whose AUC and TSS were both higher than 0.5, and the mean of the predicted suitability values for the 100 ensembles was plotted as current and forecasted suitability maps (see below). The importance of variables included in the models was assessed for each of the 100 repetitions, and the average was used as a final relevance value for each predictor of lumpfish distribution.
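The two evaluation statistics and the weighted averaging step can be sketched in a few lines. This Python illustration does not reproduce biomod2 internals: the function names are hypothetical, the TSS is computed at a fixed 0.5 threshold for simplicity (an assumption), and a single score per model is used as the weight.

```python
def auc(scores, labels):
    """Rank-based AUC: probability that a random presence outscores a random absence."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tss(scores, labels, threshold=0.5):
    """True Skill Statistic = sensitivity + specificity - 1 at a given threshold."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn) + tn / (tn + fp) - 1.0


def weighted_ensemble(predictions, weights, min_weight=0.5):
    """Weighted mean of per-model suitability, keeping models with weight > min_weight.

    predictions: list of per-model lists (one suitability value per grid cell).
    weights: one evaluation score (e.g. AUC or TSS) per model.
    """
    kept = [(p, w) for p, w in zip(predictions, weights) if w > min_weight]
    if not kept:
        raise ValueError("no model passed the quality threshold")
    total = sum(w for _, w in kept)
    n_cells = len(kept[0][0])
    return [sum(p[i] * w for p, w in kept) / total for i in range(n_cells)]
```

In this sketch, a model scoring 0.4 is excluded from the ensemble, while models scoring 0.9 and 0.6 contribute in a 3:2 ratio to each cell's suitability.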
We used the caret library version 6.0-80 (Kuhn et al., 2020) in R to create response curves for the 100 models, which were extracted and compiled to investigate the responses of lumpfish occurrence to the predictors.
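One common way to build a response curve is to vary a single predictor across its observed range while holding the others at their means, and to record the model's prediction at each step. The Python sketch below illustrates that idea only; it is not the caret routine used in the study, and `response_curve` and the `predict` callable are hypothetical names.

```python
def response_curve(predict, X, column, n_points=20):
    """Vary one predictor over its observed range, others fixed at their means.

    predict: callable taking one row (list of predictor values) and
             returning a suitability value.
    X: list of rows of predictor values used to define ranges and means.
    column: index of the predictor to vary.
    Returns a list of (predictor value, predicted suitability) pairs.
    """
    n_cols = len(X[0])
    means = [sum(row[j] for row in X) / len(X) for j in range(n_cols)]
    lo = min(row[column] for row in X)
    hi = max(row[column] for row in X)
    curve = []
    for k in range(n_points):
        value = lo + (hi - lo) * k / (n_points - 1)
        probe = means[:]  # copy so the other predictors stay at their means
        probe[column] = value
        curve.append((value, predict(probe)))
    return curve
```

Averaging such curves over repeated model runs gives a compiled response per predictor, in the spirit of the compilation described above.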
We projected the resulting model under the increasing temperatures expected under the Representative Concentration Pathway (RCP) 4.5 climate change scenario (Thomson et al., 2011) to investigate the possible future distribution of lumpfish in 2050 and evaluate the effect of rising temperature on lumpfish aquaculture and conservation. RCP 4.5, in which emissions reach their peak in 2040 and then decline, is considered the most likely baseline scenario and was therefore used to forecast lumpfish distribution under the most plausible conditions. The RCP 4.5 scenario was based on an average of atmosphere-ocean circulation models including the Community Climate System Model 4 (CCSM4), the Hadley Centre Global Environmental Model 2 (HadGEM2-ES) and the Model for Interdisciplinary Research on Climate 5 (MIROC5), as described in Assis et al. (2018).
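Averaging several circulation models amounts to a cell-wise mean across their output layers. The Python sketch below shows that step under simple assumptions: each layer is a flat list of values on a shared grid, and the function name `multi_model_mean` is hypothetical. It is not the procedure used to build the Bio-ORACLE layers.

```python
def multi_model_mean(layers):
    """Cell-wise mean across climate-model layers.

    layers: dict mapping a model name to a list of values, one per grid cell,
            with all lists covering the same cells in the same order.
    """
    grids = list(layers.values())
    n_cells = len(grids[0])
    if any(len(g) != n_cells for g in grids):
        raise ValueError("all layers must cover the same cells")
    return [sum(g[i] for g in grids) / len(grids) for i in range(n_cells)]
```

The averaged layer can then replace the present-day temperature predictor when the fitted ensemble is projected to 2050.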

Results
The discrimination capacity of the 100 models composing the final ensemble model had an average AUC of 0.956 and an average TSS of 0.790. The minimum discrimination obtained was 0.952 for the AUC and 0.758 for the TSS (Table S1). This indicated that all models included in the final ensemble had high discrimination and predictive ability.
Nitrate showed a negative association with lumpfish distribution, and lumpfish occurrence dramatically decreased when nitrate concentrations were above 15 nmol/m³. Ice cover was also negatively associated with lumpfish occurrence. The model showed lumpfish occurrence was lower in areas with brighter diffuse attenuation, reaching highest probability of occurrence in areas with 1.5% and then probability slowly decreased as light dissipation decreased (Fig. 2). As predicted, there was a negative relationship between temperature and lumpfish occurrence (Fig. 2). The temperature at mean depth of benthic areas showed a negative relationship with lumpfish suitability. Lumpfish preferred temperatures between 3 °C and 10 °C, with probability of occurrence sharply declining in areas with an average temperature of more than 10 °C (Fig. 2).
Fig. 1 Relative importance of environmental predictors over 100 averaged ensemble models of lumpfish occurrence
There was a slight decrease in occurrence with increased phytoplankton levels, and a positive association between cloud cover and lumpfish presence. Lumpfish were predominantly located in areas with oxygen levels between 200 and 400 mol/m³, with peak occurrence at 275 mol/m³. Extreme high and low levels of Photosynthetically Available Radiation (PAR) were negatively associated with lumpfish occurrence, with greatest occurrence found between 24 and 35 Einstein/m²/day. The models showed pH and current velocity had relatively low importance for lumpfish occurrence (Fig. 2).

Predicted distribution under climate change
The model predicted that currently suitable areas for lumpfish in the eastern Atlantic included the North Sea, Norwegian Sea, Barents Sea extending up to Svalbard, and the waters around the British Isles and Ireland (Fig. 3). Suitable areas identified toward the center of the range included waters surrounding Iceland and southern Greenland. In the western Atlantic, the model identified currently suitable areas as spanning most of the Canadian coast from New Brunswick in the south up to Newfoundland and Labrador in the north.
Under predicted climate change (RCP 4.5), suitability for lumpfish is forecast to decrease in the south of the range (Fig. 4). Our model forecasts that the highest reductions in the eastern Atlantic will occur around southern and western Ireland, areas between eastern Iceland, the Faroe Islands and northern Scotland, as well as parts of the Norwegian Sea (Fig. 5). The highest reductions in the western Atlantic are predicted around the coasts of New Brunswick and Nova Scotia. Suitability is predicted to increase in the north of the range, suggesting populations may shift to occupy new regions in the Barents Sea around Svalbard, coastal regions of northern Greenland, and into the Hudson Strait (Fig. 5).

Discussion
This study is the first to model the ecological niche of lumpfish across the Atlantic, and to assess the possible impacts of climate change on future population distribution of the species. We found that nitrate concentration was the most significant predictor of lumpfish distribution. Nitrate concentration is an important predictor for the distribution of other marine fishes across diverse environments (Macpherson, 2002; Mellin et al., 2010), as nitrogen availability is associated with nutrient flow and determines primary production. Areas with higher nitrate levels are therefore more likely to support primary producers such as phytoplankton, which will in turn support fish populations (Switzer et al., 2003). However, our model identified a negative relationship between nitrate concentration and the probability of lumpfish occurrence, which seems counterintuitive based on the mechanism described above (Chassot et al., 2007). This might indicate that other dynamics, such as temporal variability of primary productivity (Conti & Scardi, 2010) or the ratio of secondary to primary production, are more relevant for lumpfish distribution (Friedland et al., 2012). We found additional negative associations between lumpfish occurrence and phytoplankton abundance and diffuse attenuation, alongside a positive association between lumpfish occurrence and cloud cover. Taken together, these findings imply lumpfish prefer dark and/or deep environments. Indeed, tagging studies show adult lumpfish spend extended time at depths exceeding 400 m and exhibit regular vertical migrations through pelagic and demersal zones (Kennedy et al., 2015). These preferred habitats are not conducive to primary production via phytoplankton and may also have lower nitrate levels. The ratio of secondary to primary production may be particularly relevant given that lumpfish feed on jellyfish, crustaceans and small fish (Eliasen et al., 2018; Roy, 2021).
Probability of lumpfish occurrence was greater at temperatures below 10 °C and in areas with lower ice cover. This aligns with research from aquaculture facilities, which reports that juvenile lumpfish housed in conditions above 10 °C showed reduced breeding performance, abnormal swimming behaviour, and increased mortality rates (Hvas et al., 2018; Imsland et al., 2019; Pountney et al., 2020). As average sea temperatures rise, wild populations in the south of the range will be exposed to warmer conditions that challenge the species' upper thermal limit (Ern et al., 2016). Therefore, lumpfish distribution will likely be pushed northward in a manner similar to other marine ectotherms (Sunday et al., 2012). Rising sea temperatures will also decrease ice coverage in the north of the range (Serreze et al., 2007), making this area more suitable and pulling distribution northwards. The combined net effect of warming temperatures in the south and reducing ice coverage in the north may account for the poleward shift of lumpfish populations predicted by the SDM.
Roe fisheries target mature females as they return to coastal areas and spawn in spring and early summer. Greenland has one of the fastest growing lumpfish fisheries, with roe harvest focused predominantly along the western coast (Johannesson, 2006; Kennedy et al., 2018). The SDM forecasts reduced suitability for lumpfish in the west of Greenland, whereas suitability is predicted to increase in the south and east of Greenland, perhaps owing to polar currents bringing cold water to the region (Pampoulie et al., 2014). Commercial-scale roe harvest may therefore become unsustainable in the west, and fishing effort may shift to the southern and eastern coasts of the country.
Although the current distribution of lumpfish aligns with regions of intensive salmon aquaculture, thus allowing the species' use as cleaner fish to control parasitic lice in farms (Garcia de Leaniz et al., 2022), the SDM predicts declining suitability for lumpfish across these areas by 2050. The greatest declines are forecast in southwest Ireland, north Scotland, the Faroe Islands, and southeast Iceland. Given that warmer temperatures will increase the growth rates of lice and exacerbate infestation on salmon farms (Costello, 2006), the reduced effectiveness of lumpfish as a control agent in these areas may have significant impacts on the industry. Deploying lumpfish in farms located in regions with an average water temperature higher than 10 °C will yield increasingly ineffective parasite control and risk the health and welfare of lumpfish. Therefore, salmon farms intending to use lumpfish as cleaner fish for long-term parasite control may need to relocate to areas with cooler water temperatures. Alternatively, farms located in southern areas should focus on different methods of parasite control, including species of cleaner fishes that tolerate warmer temperatures, such as the ballan wrasse, Labrus bergylta (Ascanius, 1767) or goldsinny wrasse, Ctenolabrus rupestris (Linnaeus, 1758) in Europe and cunner wrasse, Tautogolabrus adspersus (Walbaum, 1792) in Canada (Whittaker et al., 2021; Yuen et al., 2019).
It should be noted that the lack of contemporary knowledge on wild lumpfish populations hinders the predictive ability of the SDM. A better understanding of the natural ecology of the species, including spatial and temporal variation in diet, predation threats, and niche distinction across the range, would increase the predictive power of future ecological models. This would ideally include information about dispersal, species interactions or ontogenetic niche shifts (Robinson et al., 2017). Demographic data (e.g. current reproduction, growth or mortality rates in the wild) would also contribute to the development of mechanistic SDMs (Fordham et al., 2013). Combined with environmental predictors, this will help design more effective conservation and management action for sustainable aquaculture (Maulu et al., 2021).

Conclusions
This study sheds light on the ecological niche of the lumpfish at global scale, identifying nitrate, ice cover, diffuse attenuation and temperature as major drivers of lumpfish distribution. Temperatures over 10 °C are detrimental for lumpfish, which are predicted to shift their distribution northwards as sea temperatures rise due to climate change. This holds significant consequences for aquaculture, fisheries and conservation of the species.

Author contributions
MR-R and BW conceived the idea. MR-R compiled, curated and analyzed the data. Both authors interpreted the outputs, contributed to manuscript writing, gave approval for publication and will correspond regarding the study.

Funding
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This study was supported by a "Make Our Planet Great Again" grant from Campus France and a Maria Zambrano grant from the Spanish Government to M.R-R.

Data availability
The data used is derived from public domain resources, but all data and R scripts are available on request from M.R.R.

Conflict of interest
The authors have no conflict of interest.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.