A decade of marine mammal acoustical presence and habitat preference in the Bering Sea

As Arctic seas rapidly change with increased ocean temperatures and decreased sea ice extent, traditional Arctic marine mammal distributions may be altered, and typically temperate marine mammal species may shift poleward. Extant and seasonal odontocete species on the continental shelves of the Bering and Chukchi Seas include killer whales (Orcinus orca), sperm whales (Physeter microcephalus), beluga whales (Delphiapterus leucas), harbor porpoises (Phocoena phocoena), and Dall’s porpoises (Phocoenoides dalli). Newly documented, typically temperate odontocete species include Risso’s dolphins (Grampus griseus) and Pacific white-sided dolphins (Lagenorhynchus obliquidens). Until recently, recording constraints limited sampling rates, preventing the acoustic detection of many of these high frequency-producing (> 22 kHz) species in the Arctic seas. Using one of the first long-term datasets to record frequencies up to 50 kHz in these waters, clicks, buzzes, and whistles have been detected, classified, and paired with environmental data to explore which variables best parameterize habitat preference. Typically temperate species were associated temporally with cold Bering Sea Climate Regimes in tandem with negative Pacific Decadal Oscillations. Typically Arctic species’ strongest explanatory variables for distribution were largely species and site specific. Regardless of species, however, the environmental cues (e.g. percent ice cover or zooplankton community structure) marine mammals use for locating viable habitat space are ones that will change as temperatures increase. This 10-year dataset documents the current state and tracks recent dynamics of odontocetes and their habitats along the Pacific Arctic Corridor to contribute to ongoing discussions about future Arctic conditions.


Introduction
Warming in the Arctic Ocean has been particularly significant in recent years compared to the past century. With warming rates three times that of the global rate, winter ice thickness has been reduced by as much as 0.75 m since 1965 and has declined 11.5% per decade since 1979 (Steele et al. 2008;Comiso and Hall 2014 adjust their distributions and migration behaviors concurrently with ice cover changes, such as ice retreating midwinter compared to being continuously present (Grebmeier and Dunton 2000;Miksis-Olds et al. 2013;Miksis-Olds and Madden 2014;Hauser et al. 2016). Therefore, the changes in general warming and retreating ice trends as reported by multiple studies (Comiso et al. 2008;Kaufman et al. 2009;Kwok et al. 2009) could cause shifts in additional Arctic species' distributions, including toothed whales (odontocetes).
For a more complete review of the three typically temperate species' historically documented habitats and how they have recently been acoustically observed in the Bering and Chukchi Seas, see Seger and Miksis-Olds (2019). Studies from previous decades have suggested surface temperatures as the potential driver for their changing distributions (Guiget and Pike 1965;Leatherwood et al. 1980;Dahlheim and Towell 1994;Salvadeo et al. 2010;Funk et al. 2011;Jefferson et al. 2014). The area over which temperature changes may affect both typically Arctic and temperate marine mammal species stretches beyond the Arctic Ocean and sub-Arctic seas because many species seasonally migrate from or naturally occupy temperate and tropical waters. Therefore, ocean basin wide phenomena will likely drive habitat preference and expansion. In addition to El Niño, shifts in the Pacific Decadal Oscillation (PDO) and the Bering Sea Climate Regime affect water temperatures and benthic versus pelagic-supported food webs along the Pacific Arctic Corridor (Hunt et al. 2002;Stauffer et al. 2015). Notable changes in environmental conditions during the timeframe when this dataset was collected include a shift from a cold Bering Sea Climate Regime to a warm regime in 2012 and a concurrent shift from a negative to positive PDO anomaly (Stauffer et al. 2015;Duffy-Anderson et al. 2017). During a cold regime, the Bering Sea is supported by a benthic biomass, compared to a pelagic biomass in warm regimes, providing a more direct (3-tiered instead of 5-tiered) food web for marine mammals (Hunt et al. 2002;Christianen et al. 2017). During a negative PDO index, warmer than average waters extend northwards throughout the Gulf of Alaska into the Aleutian Archipelago. These warmer waters may expand the desirable temperature range far enough North to entice marine mammals into the Bering Sea where a "Green Belt" of more plentiful food resources awaits (Okkonen et al. 2004). Determining whether or not marine mammals take advantage of the negative PDO occurring in tandem with the cold Bering Sea Climate Regime requires sufficiently long distribution time series to capture periods when this negative PDO/cold Regime combination are in or out of phase.
However, large-scale climatic events that affect water temperature are not the only cues odontocetes may use to locate a desirable habitat. While sperm whales have been shown to be more present in the high northern latitudes (outside of the Bering and Chukchi Seas) in the summers in off-shelf waters, or along thermal fronts of warm-core rings (Christensen et al. 1992;Mellinger et al. 2006;Waring et al. 2006;Griffin 2006), these cosmopolitan species prefer deep waters with weak thermoclines and strong haloclines in one area, but deep productive waters with cold surface temperatures in other areas, making their preferences difficult to pinpoint (Jaquet 1996;Embling 2008). The most recent studies to compare acoustic and visual surveys of sperms whales largely found them in the deep basin areas of the Bering and Chukchi Seas (Crance and Matuoka 2018;Crance et al. 2019).
Porpoises have been shown to prefer low tidal currents during spring tides (Embling 2008). In the Central-eastern and Southeastern Bering Sea, Dall's porpoises were mostly spotted along the 200 m isobath while harbor porpoises were mostly spotted along the 100-50 m isobaths (Moore et al. 2002).
Delphinid species in general are associated with deep thermoclines in Scottish waters (Embling 2008) and a couple Pacific white-sided dolphin sightings occurred very close to shore along the Aleutian archipelago both on the Gulf of Alaska and the Bering Sea sides of the islands (Moore et al. 2002). Visually, Risso's dolphins have not been documented above the 51st parallel more than once (Clark 1945), but they have been detected acoustically in recent years (Seger and Miksis-Olds 2019).
Beluga whales are the only ice-associated odontocete considered in this paper and are known to vary their preferences with season. They are more present in the Bering Sea in the fall and spring compared to the winter and summer (Crance and Berchok 2016), and they prefer slope and basin waters with moderate to heavy ice cover, except during the summer when they are most likely to be associated with low ice cover and most likely to avoid moderate ice cover (Moore et al. 2000).
Killer whales usually only occupy the Arctic in the summers, likely following their prey during reduced ice cover (Moore and Huntington 2008;Breed et al. 2017). These previous studies were mostly visual (vessel and aerial based) while a few were acoustic, and a variety of environmental variables were used as inputs to predict habitat preference. None paired their observations with zooplankton size class concentrations from co-located instruments which is a key contribution of this study.
Killer whales are not the only species for which prey is a key component of habitat preference. The base of the Arctic food chain-vertically migrating, sound-scattering layers of zooplankton-can be affected by lunar and solar light cycles and, through fish feeding on the layered zooplankton, affect delphinid behavior (Benoit-Bird et al. 2009). Odontocetes do not have the well-developed Jacobsen's organ that mysticetes 1 3 may use to "taste" surface lenses of freshwater from ice melt and then follow the low salinity "trails" to rich feeding areas (Stern 2009). But it is reasonable to assume that they use some sensory mechanism, like echolocation, to detect the ice edge and to gather information about open versus ice-covered waters. This assumption supports O' Corry-Crowe et al. (2016) finding of a significant association between atypical beluga whale migrations in the Norton and Kotzebue Sounds during an unusually early ice break-up in the northern Bering Sea during the spring of 1996. The time between when ice forms or retreats and when an animal senses the ice presence or absence creates a lead or a lag in how the animal will respond to occupying waters with or without ice cover. Days since ice has melted or formed, therefore, may hold some explanatory power to the extent of acoustic presence of a particular species in a given area.
The Arctic marine mammal distribution studies discussed above were historically performed using visual survey methods. The incorporation of some acoustical surveys has provided a more complete view of ecosystem dynamics because acoustic technology can collect data year-round during harsh weather conditions (Mellinger et al. 2007). These studies largely focused on low-frequency vocalizing baleen whales because, until recently, passive acoustic recordings were constrained (typically up to 44.1 kHz) by power consumption that increases linearly with sampling rate (Caldas- Morgan et al. 2015) and by storage capacity only being available in hard drives of a few GB (i.e. 1 min of a 16-bit, 44.1 kHz sampling rate on two channels requires 10 MB of storage space). Logistical constraints of recording in remote locations like the Arctic demand that 6-to 12-month autonomous deployments sample evenly over time within these power and storage limits. This has led to many acoustical studies of only species that vocalize below 22 kHz. New technological developments have expanded these storage and power limits, allowing researchers to use the acoustic signals of these high frequency producing odontocetes as a tool for measuring their vocal presence.
Two types of passive acoustic instruments have been used in this study to obtain data about the vocal presence of odontocetes along the Pacific Arctic Corridor. It is of significant note that passive acoustic monitoring methods can only verify animal presence; animal absence is not informed by lack of acoustic detection because the animals could be present and not vocalizing. The adaptive sampling technology in passive acoustic listeners (PALs) expanded recording possibilities to higher sampling rates (100 kHz) and advanced power-limited logistical requirements (Nystuen 1998). PALs are band-limited to an analysis upper limit of 50 kHz, but porpoises produce clicks with center frequencies between 130-140 kHz (harbor porpoise) (Miller and Wahlberg 2013) and 121-147 kHz (Dall's porpoise) (Kyhn et al. 2013); consequently, using a recorder with even higher sampling rates was desirable. C-PODs (Chelonia, Ltd.) are commercially available click logging devices that use omni-directional hydrophones over the 20-160 kHz bandwidth. They provide summary parameters of detected click trains instead of storing waveforms (Palmer et al. 2017). These two types of instruments combine to collect data to (a) perform an acoustical survey of relatively high frequency-producing (> 10 kHz) species and (b) investigate their seasonal distributions with respect to several environmental variables. This will contribute to the overall understanding of the Bering and Chukchi Seas' ecosystem response to various and changing environmental factors. Miksis-Olds et al. (2010) began collecting this acoustic dataset with PALs from three locations in the Bering Sea in 2007 and analyzed it for the vocal presence of relatively low-frequency (< 10 kHz) vocalizations. Miksis-Olds et al. (2013) showed that ice was the most predictive environmental variable for baleen whales and pinnipeds. At present, approximately a decade of recordings up to 50 kHz have been collected. This study focuses on seasonal and mostly non-ice-obligate species. It includes variables to parameterize habitat preference, such as six classes of concurrent acoustic backscatter measurements, four ice metrics, presence of other marine mammal species, sea surface temperature, and lunar and solar cycles.

Materials and methods
PALs and multi-frequency echosounder systems composed of Acoustic Water Column Profilers/Acoustic Zooplankton Fish Profilers (AWCPs/AZFPs: ASL Environmental Sciences, Inc., Victoria, BC) (Lemon et al. 2001) were deployed as part of larger vertical mooring assemblages by NOAA's Ecosystems and Fisheries-Oceanography Coordinated Investigations (Eco-FOCI) Program (https ://www.ecofo ci.noaa. gov) in the Bering Sea at three locations from Sept 2007 to Sept 2017 (Figs. 1,2). Their labels and GPS coordinates were: M2 at 56°52.202′ N, 164°03.935′ W; M5 at 59°54.646′ N, 171°43.854′ W; and M8 at 62°11.62′ N, 174°40.06′ W (Fig. 1). All sites were on the continental shelf approximately 70 m deep, and the PALs were suspended approximately 10 m above the ocean floor. The AWCP echosounder systems were placed two meters above the PALs. In 2016, C-PODs were added on all moorings, and the sensors previously deployed at the M8 site were shifted onto the Chukchi Sea (CH) mooring at 67°54.671′ N, 168°11.695′ W. The vocal presence of odontocete species in these audio recordings served as the response variables for all environmental modeling in this paper.

Passive and active acoustic data collection
Starting at midnight, the PALs would save up to twenty-one 4.5 s passive acoustic audio.wav recordings per day. PALs recorded on duty cycles of 0.75% and 3.75% depending on whether a signal of interest had been recorded in the previous file or not. Signals of interest included a 12-dB threshold exceedance, rain, and tones. Denes et al. (2014) provides additional information about this PAL adaptive subsampling technique. To determine whether the allotted daily audio recordings were being saturated early (thereby introducing a bias towards whichever animals were more active than others during time since midnight), the timestamps of all recordings were plotted against Julian day (Fig. 3). The PAL recordings were most even over all times of day throughout the years at site M2 and skewed towards mid-morning at site Median times of recordings are denoted by horizontal lines and labels. M2 was the site with the most even distribution of recordings over time, M5 was skewed slightly towards the morning, and CH was skewed heavily towards the morning. This illustrates an additional sampling bias between Bering and Chukchi Sea sites M5. One month of data at M8 tended to save more recordings from the first 2 h of each day than from other times of the day (orange dots, Fig. 3), thus heavily skewing the single year of data at M8 towards the beginning of each day. While data exist from all hours of the day at M8, any results where M8 is a significant explanatory variable should consider this early morning bias.
The C-POD logging interval was 1 min ON/4 min OFF (the closest possible duty cycle to the PALs'), its "limit on clicks logged per minute" was left at the default 4096, the high pass filter was set to 10 kHz, and it was set to not have "low" sensitivity. Additional information on the C-POD click detection thresholds and classification parameters using the KERNO and GENENC processors can be found at https ://www.chelo nia.co.uk/.
Acoustic Water Column Profilers/Acoustic Zooplankton Fish Profilers (AWCPs/AZFPs) actively transmitted pings and recorded backscatter for 5 min every half-hour, starting on the quarter hour, giving a 16.7% duty cycle. During each 5-min sampling period, acoustic backscatter measurements were recorded every 2 s with 20 cm range bins from approximately 0.75 m above the transducer face to the water surface (Miksis-Olds and Madden 2014). AWCPs/AZFPs are echosounder systems with sets of transducers at either 125, 200, and 460 kHz (AWCP older sensor versions) or 200, 455, or 775 kHz (AZFP newer sensor versions). Seger et al. (2016) provides more detailed deployment methods. Gaps in the data collection resulted from sensor malfunctions and mooring disruptions due to fisheries trawling periodically throughout the 10-year dataset.

Environmental data collection
Non-acoustic environmental data were obtained by accessing online databases. Ice data were obtained from the National Snow and Ice Data Center (NSIDC) which provides daily concentrations of ice from satellite data at a 12.5 km resolution. Using the "egg code" (https ://www. natic e.noaa.gov/produ cts/egg_code.html), concentrations expressed in tenths (C t ) were recorded as a percent. The "egg code" also expresses the stages of development (S a , S b , and S c ) as average thicknesses (cm) across the sampled area of ice. These three values were averaged to obtain a single value for ice thickness in the area around each site. Depending upon the time of year, the rate of sampling differed: sometimes sampling occurred daily, and at other times, gaps of at least 1 day existed. For samples not taken daily, values on either side of intermittent days were interpolated to obtain estimated measurements. Once days with and without ice were noted, the number of "days until next ice cover" and "days since last ice cover" were calculated to create two additional explanatory variables.
Sea surface temperatures were downloaded from NASA's Moderate Resolution Imaging Spectra-radiometer (MODIS)-Aqua Data website using daily 4 km resolution areas and the 4-µm nighttime measurements. Temperatures were then averaged over a 10 × 10 box size which is equivalent to a 40 × 40 km area (Moore et al. 2015). There were limited resolution choices available for download, but the 40 × 40 km area was the most appropriate choice given detection ranges of the species under consideration. Ambient sound levels recorded on the PALs ranged from 45 to 94 dB re 1 µPa 2 Hz −1 depending upon seasonal weather conditions (Miksis-Olds et al. 2013;Denes et al. 2014) and the PALs' 12-dB SNR filter pushes detectability received levels to 57-106 dB re 1 µPa 2 Hz −1 . Using the SONAR equation, spherical spreading, and published average rms source levels (SLs) for beluga whales (Le Bot et al. 2016), killer whales (Holt et al. 2009), sperm whales (Møhl et al. 2000), Risso's dolphins (Philips et al. 2003), and harbor porpoises (Villadsgaard et al. 2007), the lowest SL (beluga whales) in the quietest ambient conditions could be detected about 20 km away, which is analogous to the distance covered in a 40 × 40 km area centered on the mooring. In the loudest ambient conditions, four of the five species could have been detected out to 15 km away, which is within the bounds of the 40 × 40 km area of temperature measurements. Daily and 8-day temperature data were compared and determined to be similar in terms of gaps due to cloud cover. Any gaps between days with positive Celsius temperatures were interpolated. Gaps between days with negative Celsius temperatures were treated one of two ways: (1) if ice was not present, they were interpolated or (2) if ice was present, empty data cells were introduced into the spreadsheet since actual surface temperature below the ice was not measurable.
Lunar phase was quantified by the fraction of the moon illuminated at midnight in Alaska Standard Time from https ://aa.usno.navy.mil/data/docs/MoonF racti on.php. Day length was calculated as the number of minutes between sunrise and sunset from https ://aa.usno.navy.mil/data/ docs/Dur_OneYe ar.php. Factors such as water depth, shelf slope, distance from the shore, distance from a canyon, etc. are contained inherently in each site. Therefore, site as a categorical variable for environmental modeling purposes is considered a proxy for the combination of these variables.

PAL data processing
All available audio recordings were manually reviewed for marine mammal vocalizations in the 10-50 kHz range between Sept 21, 2007, and Sept 28, 2017. Ulysses software, a custom-made Matlab GUI (written by Dr. Aaron Thode and optimized by Dr. Jit Sarkar), was used for the manual analysis. Audio recordings were loaded into Ulysses and viewed as full 4.5 s clips with 90% overlap and a 1024 FFT size. Minimum dB and dB spread were adjusted as needed for ideal visibility, and target signals were expanded for further inspection when needed. Any recordings with only low-frequency vocalizing species, mooring self-noise, anthropogenic activity, and/or rain and ice sounds were tallied as zero vocal presence for delphinid species.
All suspected marine mammal clicks, trains, buzzes, whistles, etc. were classified (if possible) into the following groups (with corresponding scientific names and abbreviations): beluga whale (D. leucas, Dl), Risso's dolphin (G. griseus, Gg), Northern right whale dolphin (L. borealis, Lb), Pacific white-sided dolphin (L. obliquidens, Lo), killer whale (O. orca, Oo), and sperm whale (P. macrocephalus, Pm). If buzzes or clicks were not confidently classified as one of these species, they were classified as "unidentified odontocete" (Unid). Only vocalizations with reliable signalto-noise ratios (peak frequency manually verified as ~ 6-dB above background levels), were included. To be counted as "present" in an audio file, any vocalization needed to be clearly identifiable to species or to a unidentified odontocete group. Each species' presence was binary tallied as 1 for that 4.5-s file and non-presence was binary tallied as 0.

C-POD data processing
Data from the C-PODs were extracted from their memory cards and the "species encounter", or GENENC algorithm, was used in the C-POD software (Chelonia, Ltd.) to display all detections that were classified into the narrow band high frequency (NBHF) group at high and moderate sound pressure levels (SPLs). Using the KERNO classifier in the C-POD software yielded the same results as the GENENC algorithm. At a 10 ms viewing window, AWCP pings that had been misclassified as NBHF click trains were manually removed by changing them to the SONAR class. The number of actual NBHF clicks trains detected per day were then tallied. Tallies were converted to binary yes/no daily vocal presence. The possibility of discriminating Dall's porpoise from harbor porpoise detections using extracted click features was considered, but no ground-truth dataset existed for comparison. Therefore, the newer C-POD-F (which also stores waveforms) was deployed in 2019 (after analysis for 1 3 this manuscript was completed) in an attempt to more highly resolve the two species' presence in future work.

AWCP data processing
Data from the AWCPs were processed in and exported with EchoView (Myriax, Tasmania) and custom Matlab code was then used to estimate the community structure. Any contamination in the data was manually eliminated, and a surface line roughly 2 m below the visible air-surface interface in the EchoView echograms was manually defined. This 2 m buffer between the air and water column was important for minimizing surface bubble backscatter contributions to the biological measurements (Stauffer et al. 2015). The volume backscatter (200 kHz Sv in units m 2 m −3 ) was calculated from integrations in 24-h bins over the entire water column. Six community composition class sizes were estimated by integrating 30-min periods over the full water column depth and exported by EchoView into.csv files. The categories were defined by taking the difference of the volume backscatter from each of the three transducer frequencies in each 30-min period, otherwise known as "dB-difference values". Classifying these categories to corresponding zooplankton species was ground-truthed in previous studies by relating backscatter to species in related net tows (Miksis-Olds et al. 2013). The time series of averaged 30-min dB-difference values were then used to calculate the percent of time each day that each backscatter class was dominant in the water column immediately surrounding each mooring. (Stauffer et al. 2015). The methods here are consistent with previous work on some of the same instruments also deployed in the Bering Sea (Watkins and Brierley 2002;Reiss et al. 2008;DeRobertis et al. 2010).
The community composition size classes were defined as: 1 Small scatterers 1-5 mm long that are typically copepods, 2 Medium scatterers 5-15 mm long that are typically juvenile krill, chaetognaths, and amphipods, 3 Large scatterers 15-30 mm long that are typically adult euphausiids, 4 Weak resonant scatterers defined as less than a 3-dB increase in peak Sv at 200 kHz compared to background levels, 5 Strong resonant scatterers defined as greater than a 20-dB increase in peak Sv at 200 kHz compared to background levels, and 6 Unknown or unclassified resonant scatterers defined as having more than a 12-dB geometric distance between the three dB differences calculated for the aggregation and that of the theoretical scatterers from Miksis-Olds et al. (2013).
The backscatter categories are reliable for scatterers above 5 mm in length. Smaller zooplankton, such as neritic copepod species (Pseudocalanus spp., Acartia longiremis, Oithona spp. and Calanus) would have been captured in the small scatterer class only when present in very dense aggregations (Coyle and Pinchuk 2002). This analysis focused on the dominant daily scattering class to minimize the number of explanatory variables and confounding factors of interactions between scattering groups. When interpreting model results, we recognize that if a marine mammal species' presence was correlated with one scattering size class, it does not also imply that the species' presence was uncorrelated with all other size classes. Such nuances in the community structure of the water column were not able to be captured with this methodology.

Environmental modeling
The environmental data were used as explanatory variables in the modeling whereas the response variables were each species' daily vocal presence/non-presence (1 and 0, respectively) from the passive acoustic recordings. Using the AED package in R, and following examples from Zuur et al. (2009), homogeneity and collinearity between explanatory variables were tested by grouping all data by both year and site. To explore homogeneity, data were plotted, and it was noted whether outliers existed and whether spread was even. To explore collinearity, as well as to obtain insight into the relative strength of explanatory values, the pairs function was used with correlation panels (panel.cor.R) to make pairs plots. The resulting Pearson correlation coefficients were documented (Online Resource 1) and used to guide input to generalized additive models (GAMs). For example, sea surface temperature was a homogenous variable in that it had the same spread across years and sites with no outliers. It was collinear (Pearson's correlation coefficients in parentheses) with: Julian day (0.316), days until ice formation (0.525), day length (0.678), percent ice cover (0.447), and ice thickness (0.379). Because the number of years that recordings were made at each site is very different between M2/M5 and M8/CH, and because ice conditions varied greatly from M2 up to CH, collinearities were explored for each site, and the Pearson's correlation coefficients are presented in Online Resource 1. Collinearity between explanatory variables was defined as having a correlation coefficient greater than ± 0.3.
Tests for homogeneity showed that eight of the environmental variables were homogenous (asterisks in Table 1). The other variables were heterogenous insomuch that they either had unequal spreads when grouped by year or site and/or had outliers. Outliers were confirmed to not be errors in the dataset. Year, site, other species present, number of species present, and presence of each of the other species were factored. Environmental variables (ice cover and thickness, AWCP/AZFP outputs, and sea surface temperature) were centered and heterogenous variables were transformed to allow for better model convergence and interpretation. Transformation is the easiest option for dealing with heterogeneity (Zuur et al. 2009). The 200 kHz backscatter and ice thickness were divided by 10 while days since ice, days until ice, day length, and ice thickness were divided by 100 so the magnitudes of all variables were the same. Julian day was input using a cyclical spline to account for the fact that day 1 and day 365 are as similar to each other as all other pairs of subsequent days.
General additive models (GAMs) were used to include any potentially non-linear relationships between each explanatory variable and the presence of each species. ANOVAs were not explored because the response variables were all binomial. Generalized linear models (GLMs) were not used because they could not easily handle the cyclical spline needed in any model that included Julian Day. The gam function in R's mgcv (Wood 2011) package was used to construct models and create plots of the summaries. The gam function in the gam package (Hastie 2019) was used to explore different splines on each variable and their effects on the Akaike Information Criterion (AIC) values. The models with the best (smallest) AIC values in both packages contained the same explanatory variables and exhibited the same degrees of freedom needed for each spline. Ultimately, the mgcv package proved more intuitive because it includes an automatic estimation in the amount of smoothing needed for each variable and is easier to implement a cyclical spline (Zuur et al. 2009).
Model construction began with any two explanatory variables that had a Pearson's correlation coefficient of at least 1.0 at any site (Online Resource 1). AIC and deviance explained values were noted each time another explanatory variable with a Pearson's correlation coefficient of 1.0 or higher at any site was added. Binary and categorical variables were added as factored terms, and Julian Day was added with a cyclical spline. In this way, explanatory values were added one-by-one and remained in the model if AIC was better (lower) with its addition and if it was significant to a p-value of 0.05 or less. In the case that AIC was not improved by the addition of an explanatory variable significant to a p-value of at least 0.05, but a higher percentage of deviance was explained by its addition, that variable remained in the model. Any explanatory variable that was not significant to at least a p-value of 0.05 but did lower the AIC value by a whole number was retained.

Results
General additive models (GAMs) proved to be the most straight-forward method for modeling the presence of species in the Pacific Arctic Corridor given the various types of explanatory variables under consideration. GAMs were better suited than pairs plots to model binary and categorical variables as factored terms and they made it possible to include non-linear functions for continuous variables. The pairs plots provided GAM guidance in two ways. First, they provided Pearson's correlation coefficients between each species and its list of potential explanatory variables, thereby indicating which variables should be included in the GAM for that species. Second, pairs plots indicated which explanatory variables were collinear with each other, thereby indicating that care should be taken in model construction about including them together in a GAM or not. It should be noted that the majority of species and environmental variables had negative collinearity (Online Resource 1). Since all variables were referenced to the lowest value, a negative collinearity was interpreted as a higher likelihood of vocal presence of a species with a smaller value of the predictor variable. For example, more acoustic detections of species X are likely "early" in the year, during "fewer" days since ice had melted, during a "smaller" fraction of the moon, etc. Notable (high) Pearson's correlation coefficient values between species with each other's presence were (1) killer whales were more likely to be present when more and other species were also present during the same day, particularly sperm whales, and (2) Risso's dolphins were more likely present when Pacific white-sided dolphins were present. Notable (high) Pearson's correlation coefficient values between species presence and environmental variables were (1) killer whales were more likely present during longer days (more hours of daylight) and (2) presence of all species but sperm whales varied by year, particularly in the Chukchi Sea (which had the shortest dataset).
There were combinations of environmental variables that were collinear with each other and were considered during GAM construction. These included collinearity of year with site; backscatter levels at 200 kHz with medium and large sized crustaceans and strong and weak resonators; unclassified resonators with medium and large sized crustaceans and weak resonators; small, medium, and large sized crustaceans with each other; percent ice cover with year; percent ice cover with ice thickness (and both of these with Julian day, days since ice melt, and days until ice formation); day length with days since and until ice formation; and SST with Julian day and days until ice formation. Recall that, for the purpose of this paper, small and medium size classes have been correlated to crustacean-dominated communities using net tow studies in this area of the Bering Sea (Stauffer et al. 2015).
GAMs for each species reinforced most of the findings of the pairs plots. This fulfilled both aforementioned objectives: (a) performing an acoustical survey of relatively high frequency-producing (> 10 kHz) species across the Pacific Arctic Corridor and (b) establishing the best predictors of their acoustic detections.
Combining data from all four sites in each species' environmental modeling instead of using less data for individualsite GAMs illustrated some general relationships along the Pacific Arctic Corridor. Figure 4 provides an overview of all significant explanatory variables for each species and how those variables affected vocal presence. One can quickly see from this overview that the vocal presence of other species during the same day increased the likelihood that all but beluga whales would more likely be vocally present also. However, many species have one or two other species that they were less likely to be co-present with. For example, sperm whales and unidentified odontocetes were less likely to be present when killer whales were present. Also, in terms of environmental variables, sea surface temperature (SST) only affected the likelihood of vocal presence of killer whales and beluga whales, but in opposite ways. In fact, killer whales tended to be least present when SST was exactly 4 °C (they were more present when SST was both above and below 4 °C). Plots of the fitted function and pointwise standard errors for each variable, as well as the p-values and percent explained deviance, are discussed for each species.
The GAM for beluga whales (Fig. 5) included seven explanatory variables. Beluga whales were more likely to be present when other species were too, except for sperm whales. Beluga whale vocal presence was not steady across years-they were detected less in 2009, 2012, and 2016, and were detected the most in 2014. Beluga whales were most likely present when the sea surface temperature was near freezing and were particularly less likely to be detected once sea surface temperature reached 4 °C. Their vocal presence peaked in the winter, specifically around the beginning of February, and plummeted in the spring, specifically around mid-April. Finally, while they were detected more during more expansive ice cover, they were detected less around thick ice coverage.
The GAM for killer whales included nine explanatory variables (Fig. 6). They were most vocally present when other species, except for sperm whales and unidentified odontocetes, were also present. Killer whale vocal detection was highest at the beginning and end of the dataset-in 2008 and 2017-and detected least in 2016. Their vocal presence was higher across the Bering Sea compared to the Chukchi Sea, but it is important to recall the survey bias whereby data were only collected for 1 year at the CH site instead of for a decade, like at M5. Killer whale vocal presence was more likely during the warmest sea surface temperatures recorded, but least likely at 4 °C. Their vocal presence peaked in the fall, namely in November, and dipped around the summer solstice. Finally, killer whales were also detected more when days were longer and when there were fewer days until ice formation.
The GAM for sperm whales included nine explanatory variables (Fig. 7). They were more likely to be vocally present when other and more species were also present, except for beluga whales, killer whales, and unidentified odontocetes. Sperm whale vocal detection was highest in 2008 and lowest in 2016. Their vocal presence peaked in the fall, namely in November, and dipped around the summer solstice, but had three distinct peaks: April, August, and November. Finally, sperm whales were also more vocally present closer to (1) when ice had just melted and (2) when ice was about to form.
The GAM for unidentified odontocetes (unid) was the largest, including ten explanatory variables (Fig. 8). This complexity could be credited to the fact that the unid group was a broad-based category. It contains the clicks and whistles that did not fit in the other categories, so is probably a multi-species mix. This group was more likely to be vocally present when more species were also present, except for Pacific white-sided dolphins, killer whales, and sperm whales. Unidentified odontocete vocal detection was highest in 2008 and lowest in 2016. Their seasonal vocal presence had three peaks (April, August, and November) and three dips (March, June, and October), but increased generally from winter through fall. In terms of ice variables, unidentified odontocete vocal presence peaked about 150 days since ice melted. Also, they were more vocally present closer to when ice formed, but not when there was high percent ice cover. Finally, the unid group was more vocally present when the water column's zooplankton community was most dominated by strong resonant scatterers.
The GAM for Risso's dolphins included six explanatory variables (Fig. 9). Risso's dolphins were more likely to be vocally present when more species were present, namely Pacific white-sided dolphins, but not sperm whales and unidentified odontocetes. Finally, they were detected more in the Bering Sea than in the Chukchi Sea, but it is important to recall the survey bias whereby data were only collected for 1 year at the CH site instead of for a decade like at M5. The GAM for Pacific white-sided dolphins included five explanatory variables (Fig. 10). Pacific white-sided dolphins were more likely to be vocally present when more species were also present, namely Risso's dolphins, but not unidentified odontocetes. Their seasonal vocal detection was lowest at the end of April and peaked in late October. Finally, Pacific whitesided dolphins were detected more when the water column's zooplankton community was most dominated by unclassified resonant scatterers.
The GAM for narrow band high frequency (NBHF) porpoises (Dall's and harbor) had five explanatory variables (Fig. 11). Porpoises were more likely to be vocally present when one more species-the beluga whale-was also present. They were also more likely to be present during longer days late in the year around November and less likely to be present in the spring around April.

Discussion
After analyzing datasets from four sites up to 10 years, several important observations and trends of odontocete species' habitat preference along the Pacific Arctic Corridor were identified. This work serves as the only available acoustic presence baseline summary of echolocator distribution and range for future comparisons as environmental conditions change. Results from the collinearity and pairs plot analyses and GAMs are best interpreted as discoveries about which oceanographic features are most likely to predict habitat preferences for each species. There is value in identifying these relationships so limited monitoring resources can be focused in areas where those features are changing most rapidly for efficient and targeted study design in the future. Confounding variables are likely in two instances. First, Risso's and Pacific white-sided dolphins are significant explanatory variables for each other. The clicks from these two species are very similar to each other (Soldevilla et al. 2008) and were classified conservatively since they are not well-parameterized in the literature. Therefore, some clicks could have been miscategorized as each other or as unidentified odontocetes (Seger and Miksis-Olds 2019). The second confounding variable could be between NBHFs and beluga whales. The CPODs had a high-pass filter at 10 kHz but no associated waveforms with the clicks that their detection algorithm processed. As a result, the NBHF odontocete detections could have included clicks from beluga whales without any way to manually verify misclassifications.
A complex GAM with many highly significant explanatory variables and a majority of explained deviance is not necessarily an indication that additional new information was gleaned about a species-a simple, relatively less significant GAM may provide more information in some cases. For example, sperm whale clicks were easy to classify and were present all years at all sites. Their GAM had nine explanatory variables, six of which had p-values < 1.0e −10 , and 50% deviance explained. On the other hand, Risso's dolphin clicks were rarely detected. Their GAM had six variables, all with p-values of < 1.0e −10 , but with only 7.4% deviance explained. For sperm whales, these findings could reinforce the idea that more than a decade of data are needed to tease apart the complex relationships of this cosmopolitan species within the Pacific Arctic Corridor environment beyond water depth (Jaquet 1996;Crance and Matuoka 2018;Crance et al. 2019). In terms of Risso's dolphins, though, very few detections from a few years of data provide an important discovery that the species extends farther north than previously thought (Seger and Miksis-Olds 2019). Any indication of potential habitat preference for a typically temperate species in the Pacific Arctic Corridor, regardless of data paucity and a relatively weaker, simpler GAM, is new information. Now GAMs can be explored species-by-species to understand how the environmental variables may affect their distributions. For beluga whales, the GAM results support previous work highlighting how beluga whales are coupled to the ice (Asselin et al. 2012), since they were most present in the winter when SST was colder and ice was more expansive. This was particularly notable in that they were present every year at M5, except in 2016 when no ice formed over the mooring location. Surprisingly, the CH site had the most ice in 2016, yet very few belugas were acoustically detected there. A large number of killer whales were detected in the 2016 CH PAL recordings during even the thickest and most expansive ice cover, so the beluga whales may have been present-but-silent to avoid predation (Post 2017). Affinity for ice did not occur for killer whales at any of the other sites, suggesting that different orca ecotypes may be present at different sites in our study.
If ice were the only important variable, it would appear that beluga whales cannot inhabit an area void of ice for too long, thus shifting to areas where more ice cover was available. However, they also do not tend to prefer thick ice, particularly in the Chukchi Sea when it coincided with high killer whale vocal presence. This provides support to the interpretation that belugas may be present-but-silent to avoid predation by the Bigg's ('transient' T-type) ecotype which frequently occurs in shallow ice-covered waters in the western North Pacific and, as fish stocks diminish, are increasing their consumption of marine mammals (Filatova et al. 2019). A preference for moderate conditions where waters are the appropriate temperature for ice to form but not with so much ice as to limit the availability of breathing holes concurs with generally accepted ice-edge affinities of beluga whales reviewed by Asselin et al. (2012). This "ideal" type of partially icy habitat has been shifting further north over recent decades and is not as expansive as it once was. NSIDC, in October 2018, reported unusually warm waters and the third lowest average ice extent in the satellite record. Years 2019 and 2020 continue monthly recordsetting minimum ice extents. Results show that the M5 site was a typical location for beluga whales to inhabit until its complete lack of ice in 2016. Across the Pacific Arctic Corridor region, though, their vocal presence increased during the combination of a warm Bering Climate Regime and a positive anomaly in the PDO (Seger and Miksis-Olds 2019). The M8 and CH sites did not have equally long time series as M2 and M5 but are important sites for future observations in beluga whale vocal presence if the Arctic's ice extent continues to decrease annually, shifting prime beluga whale habitat space further poleward.
For killer whales, the GAM results tell a general story of two preferred temperature ranges (above and below 4 °C), a preference for longer days, but also a preference for waters where ice has recently formed late in the autumn. The collinearity results separated by site (Online Resource 1) show a high presence of killer whales at CH when ice cover was persistent and thick while those detected at M2 preferred open water. This study did not separate killer whale acoustic signals into ecotypes, but it is possible that the Bigg's ecotype that eats marine mammals may be more present at CH while the offshore fish-eating ecotype may be more present further south along the Pacific Arctic Corridor. Deecke et al. (2005) showed evidence that ecotypes rarely exist in combined groups. (The resident ecotype is not considered here because they are presumed to stay close to shore, preventing detection on any PAL at any site.) The two preferences for SST could be a division between the two seas whereby the killer whales in the Chukchi Sea prefer colder, icier waters later in the year, while the killer whales in the Bering Sea prefer warmer, ice-free waters around the summer solstice. There is higher confidence in Bering Sea data in these results since the CH dataset only contains 1 year of data. Regardless of site, survey effort bias, or potential ecotype separation, killer whale acoustic detections decreased between 2012 and 2014, which coincided with the combination of the warm Bering Sea Climate Regime and positive PDO. They were also detected the least in 2016 when no ice formed at M5.
Another complicating aspect of killer whale vocal activity is that orcas have been shown to modify their vocal behavior in the presence of prey (Deecke et al. 2005). It is not possible to know whether orcas were present-but-silent in the years they were vocally detected less often. Nor can we discern whether mammalian prey were also present-but-silent, and no data were gathered about the presence of fish prey species. This combination of potentially present-but-silent prey and predators could be explained three ways: (1) vocally active years could mean killer whales were not engaged in successful foraging of marine mammal prey, so they did not need to be stealthy and quiet while foraging for fish, or (2) vocally inactive years could mean killer whales were present and highly engaged in quiet foraging, or (3) killer whales were more absent from areas in less vocal years.
For sperm whales, GAM results provided some support for their cosmopolitan nature. They had three annual spikes in vocal presence (April, August, and November) which were about as distinct as the peaks in the GAM for unidentified odontocetes-the catch-all category. During data processing it was noted that sperm whales were one of the most vocally active species at all sites. In terms of their relationship with ice, sperm whales were more vocally active during days closer to when ice was about to form and further from when ice melted, which aligns with findings from Seger et al. (2016). In Seger et al. (2016), time series plots show sperm whale vocal presence beginning 3-4 months ahead of ice formation. The August peaks in sperm whale vocal presence in the GAM fell 3-4 months ahead of ice formation at CH, M8, and a few years at M5 whereas the November peaks fall 3-4 months ahead of ice formation at M2 and a few years at M5. It is possible that sperm whales moving through the Pacific Arctic Corridor follow the ice edge at a lead of about 3-4 months. Further evidence of this can be seen in the winter and spring of 2016 when the typical sperm whale presence did not occur at M5 when no ice formed (Seger et al. 2016). Current literature cites water depth as the only explanatory variable in sperm whale presence (Crance and Matuoka 2018;Crance et al. 2019), but the present results show that a 3-4 months lead in ice formation may be another important predictor variable of sperm whale presence.
The observation that sperm whales were more vocally present when beluga and killer whales and unidentified odontocetes were not present may (a) indicate they prefer habitats that are different than these three species, or (b) support the present-but-silent predator-prey relationship. Sperm whales could have actively hunted (click/echo-locate) while killer whales engaged in quiet foraging for mammalian prey like beluga whales and unidentified odontocetes which remained silent to prevent predation. The fact that sperm whales can also fall prey to killer whales (Jefferson et al. 1991) would support interpretation (a) over (b).
Model prediction was most complex for the unidentified odontocete category likely due to the fact that it was the broad-based category for clicks that were either not propagated well enough through the water column to be discernible for species identification or were truly unknown. Vocal presence of unidentified delphinids decreased over the sampling decade, but the largest drop occurred between 2008 and 2009. It slightly decreased during the 2012-2016 combination of a warm Bering Sea Climatic regime and positive PDO, then reached its nadir in 2016 when no ice formed at M5. In 2017, unidentified odontocete vocal presence returned as the Regime and PDO began to switch back to cold and negative, respectively.
Other environmental variables that predicted vocal presence for the unid group were a peak in presence about 150 days (5 months) since ice melted. This corresponds to their lowest vocal presence occurring around February and peaking around November closer to when ice begins to form and is thinner than during winter months. The higher likelihood of vocal presence when strong resonant scatterers dominate the water column may provide insight into food web functionality as the species that comprise the strong scatterers become better understood. Future advances in the ability to better classify clicks, buzzes, and whistles to species level will refine understanding of this group the most.
For the typically temperate species (Risso's dolphins, Pacific white-sided dolphins, and northern right whale dolphins), a lack of data prevented model fitting for only northern right whale dolphins. The presence of all three typically temperate species by site and year was explored more in depth by Seger and Miksis-Olds (2019) and again, the two ocean basin processes (the PDO and Bering Sea Climate Regime) appeared to work in tandem to facilitate their habitat expansions.
Risso's dolphin vocal presence was expected to be confounded by Pacific white-sided dolphin vocal presence because the peak and notch patterns in both species' clicks are not well documented throughout their ranges, nor are they easy to distinguish from one another. The vocal presence of Pacific white-sided dolphins as a significant explanatory variable for Risso's vocal presence was substantiated in the modeling results. Risso's dolphins were more to be likely present in the Bering Sea compared to the Chukchi Sea, which follows their known habitat niche and supports the northward habitat expansion trend from the Gulf of Alaska. It also reflects the survey effort bias between the CH site and the Bering Sea sites, so should be interpreted with caution. Risso's dolphins were most vocally present in 2010 and least present in 2016 when no ice formed at M5. A single cycle of boom and bust vocal presence can be seen in the GAM summary results (Fig. 9). The cycle lasted about 10 years: it increased from 2008 to a peak in 2010, then decreased into and through the 2014-2016 Bering Sea Climatic Regime/ PDO combination discussed previously, bottomed out in 2016, and increased to 2008 levels again in 2017. It is difficult to determine if this decadal cycle ever occurred previously yet went undetected. But monitoring for this cycle's occurrence in decades to come could determine whether and to what extent the Bering Sea Climatic Regime and the PDO affect Risso's dolphin habitat expansion or whether other oceanographic and ecological cycles are involved.
For Pacific white-sided dolphins, the modeling results also indicated that vocal presence of Risso's dolphins was a significant explanatory variable, supporting the confounding variable argument. The fact that their vocal presence was lowest in April when ice was usually present and was highest in October when ice was never present suggests that Pacific white-sided dolphins expand northward when the environmental conditions are most similar to the water columns further south in their typical niche and return south when conditions become too hostile. Finally, Pacific white-sided dolphin vocal presence coincided negatively with unclassified resonant scatterers dominating the water column, illustrating a need to better understand and track climate change driven shifts in food web dynamics to predict times and areas in the Pacific Arctic Corridor when and where various scattering size classes may dominate.
For narrow band high frequency (NBHF) porpoises, the modeling results from a single year of data from three sites (M2, M5, and CH) indicated that they are more likely present when days are longer, which is similar to killer whales. Their vocal presence peaked in November and was lowest in April. It is likely that this group represents only harbor porpoises in the Chukchi, but it could have included Dall's porpoises, as they are also a species with high potential for northward habitat expansion (Seger and Miksis-Olds 2019). Better technology, such as the C-POD-F which saves waveforms, will help separate these species at higher latitudes in the future.
The four variables that quantified ice presence occurred most often in the final models explaining marine mammal vocal presence. Since the ice extent is critically receding with climate change, the trends in presence of odontocetes along the Pacific Arctic Corridor are likely going to become exacerbated in the future. The fact that four of the five species' had final models containing year as a significant variable had their lowest vocal presence during 2016 when no ice formed at M5 is striking.
Plankton community structure drives the food web in the Pacific Arctic Corridor, and two model selections contained resonant scatterers as significant explanatory variables. Since ice and food availability are two potential drivers in habitat expansion for a few species, it is likely a driving mechanism for most species. If the effects of the PDO and the Bering Sea Climate Regime change with warmer Arctic temperatures and less ice cover, then many species could possibly shift northward in response, thereby restructuring the ecosystem at the top of the food web.

Future directions
It is of note that a few design features of the study could have skewed results and can be accounted for in future work. First, every sound produced by every odontocete is not yet known, so some recorded sounds are unable to be ascribed to a particular species. With this lack of omnipotent classification knowledge, it is not possible to determine how many species were wrongly placed into the unidentified odontocete category. This likely created confounding variables in the environmental models about the co-presence of some species, so using vocal presence data as indications of 1 3 behavioral associations between species should be cautionary. It is reasonable to expect that these models will change after classification research advances, making it possible to reassign unidentifiable sounds to their appropriate species classes.
Second, sites M8 and CH were each only sampled for a single year. These single-year datasets do provide insight into how much and in which ways a single site can change over a year, providing relative perspective to changes at the longer-monitored M2 and M5 sites. They do not, however, provide long enough datasets to track the hypothesized increase of odontocetes at more northern sites as a result of shifts away from the more southern sites. For the time being, relative presence at M2 and M5 is well established with several years of data, so future data from the northern sites can be compared to the "baseline" that M8 and CH datasets provide.
Third, the low duty cycle of the PALs created a potential under-sampling bias. If the maximum of 21 samples each day were recorded, that is only 1.5 min of daily surveillance. The likelihood of capturing an odontocete sound is small because the duty cycle of the recorders was only 0.75-3.75%, so where there were one or two recordings of any species on a single day, a recorder with a higher duty cycle could have captured more vocalizations. As batterypowered, passive acoustic monitoring technology advances and the Arctic becomes more accessible for deploying and retrieving gear more frequently, higher duty cycle studies to expand temporal coverage will be key.
The Chukchi Sea has undergone extensive visual surveying because of oil and gas development and consistent monitoring efforts from NOAA laboratories (Bakhmutov et al. 2009;Aerts et al. 2013;LGL 2014;Berchok et al. 2015). Visual survey efforts in the Bering Sea are lower by comparison. If visual survey efforts increase in the Bering Sea and are paired with acoustic sampling, such a dataset would help validate the vocal presence of Risso's dolphins, Pacific white-sided dolphins, and northern right whale dolphins that were sparsely collected in this study. This would fill an historical visual survey gap between the Aleutian archipelago and the Bering Strait that acoustics cannot close alone. It is especially important to initiate this future work over the next few years, as negative PDO values were consistently recorded in the second half of 2017 and for almost all of 2018 (Mantua et al. 1997; https ://www.ncdc.noaa.gov/telec onnec tions /pdo/data.csv).
Furthermore, a C-POD-F to collect waveforms of porpoise clicks deployed alongside a C-POD would assist in determining whether distinguishing Dall's and harbor porpoise clicks from each other in the 2016-2017 data are possible. Since harbor porpoises naturally inhabit the Chukchi, but only a handful of Dall's porpoises have been sighted there and recorded in the grey literature (Aerts et al. 2013), this distinction is key to determining whether a fourth species (Dall's) may be expanding its range northward.
If future acoustic data collection in the Chukchi Sea expands the time series of odontocete species' vocal presence, models should focus on significant explanatory power in the "year", "Julian day", and ice-related variables. If a species was vacating one site and going to another, the expectation is that later years at more southern sites would have a decrease in vocal presence while later years at northern sites would have an increase in vocal presence. Similarly, temporal lags within a single year would be seen progressively from south to north between less ice and vocal presence. These correlations would be important indications of northward habitat shifting by species found more often at M2 and M5 early in this dataset. In general, future environmental modeling work must be careful to interpret any results of increased vocal presence as habitat shifts by controlling for the possibility that more vocal presence could also indicate extant populations growing instead of recruiting more individuals from further south. When an understanding of how different vocalizations function within each species becomes possible through other studies, separating recordings into functional categories could even create species-specific distribution maps based on behavioral context throughout the Pacific Arctic Corridor.