Introduction

Urban environments are distributed worldwide and are becoming an important component of the world’s ecosystems (Grimm et al. 2008). Over the last hundred years, the human population has dramatically increased, and there is a strong tendency for people to move from rural to urban areas (Grimm et al. 2008; United Nations Population Division 2015). According to the United Nations, in 2014, about 56% of the human population inhabited urban areas, and it is expected that by 2050, 66% of the world’s population will live in cities (United Nations Population Division 2015). As a result, cities worldwide are growing in both number and size (number of inhabitants). For instance, the number of megacities (cities with more than 10 million inhabitants) increased from 10 in 1990 to 28 in 2014 (United Nations Population Division 2015). This urbanization trend requires habitat modification in order to allow more people to move into urban areas. In spite of the large negative effect that urbanization may impose on natural environments and wildlife (Grimm et al. 2008), cities house a variety of animal species that have successfully adapted to human-modified landscapes (Bonier et al. 2007; Marzluff 2017).

Increased urban ambient noise is one of the consequences of worldwide urbanization. Urban noise is mostly the outcome of an increase in road traffic, transportation activities (railway and air traffic), and resource extraction (Barber et al. 2010; Shannon et al. 2016), resulting in an increase in sound intensity, mainly at low frequencies (Brumm and Slabbekoorn 2005; Shannon et al. 2016). Urban noise impairs animal communication by masking the signals of animal species (Barber et al. 2010). This masking effect may limit the perception of sounds in communicative interactions (Barber et al. 2010; Brumm and Zollinger 2013; Shannon et al. 2016; Derryberry and Luther 2021), thereby affecting the behavioral responses of the receivers. Given that birds rely on song for communication in a variety of contexts, including mate attraction, territorial defense, parent–offspring communication, and group cohesion, among others (Catchpole and Slater 2008; Bradbury and Vehrencamp 2011), urban noise may limit information exchange (Brumm and Slabbekoorn 2005), affecting mate choice (Huet des Aunay et al. 2014), territorial defense (Mockford and Marshall 2009), and parent–offspring communication (Lucass et al. 2016), and thereby impacting fitness (Injaian et al. 2018; Mulholland et al. 2018), among other evolutionary and ecological processes.

Despite the negative effects that urban noise may impose on birds (Shannon et al. 2016; Derryberry and Luther 2021), over the last 20 years, we have learned that many songbird species are able to adjust their singing behavior and the acoustic characteristics of their song in order to overcome the impact of urban noise (Shannon et al. 2016; Derryberry and Luther 2021). Birds are capable of changing the time at which they sing in relation to rush hour, starting to sing earlier to avoid urban noise (Dorado-Correa et al. 2016). In addition, birds sing at higher frequency (Bermúdez-Cuamatzin et al. 2011), sing longer songs (Ríos-Chelén et al. 2013), and sing with higher amplitude (Brumm 2004) to reduce masking by low-frequency urban noise. Most of the knowledge on the impact of urban noise on birds comes from studies in the northern hemisphere (Shannon et al. 2016); however, as sound transmission may differ between temperate and tropical regions due to differences in habitat type (Bradbury and Vehrencamp 2011), it is important to study the impact of urban noise at different latitudes.

Song in birds is a complex behavior that results either from a combination of genetic inheritance and learning (known as vocal learning (Wright and Derryberry 2021)) or purely from genetic inheritance, with no learning involved (Kroodsma 2004). Although vocal learning can be defined in different ways, we will consider it as the production of signals that are modified as a result of experience, also known as vocal production learning (Janik and Slater 2000). For practical purposes, we will refer to this learning ability as the learning program. Three different groups of birds are known to learn their songs: hummingbirds (Apodiformes), parrots (Psittaciformes), and songbirds (Passeriformes, oscines) (Jarvis et al. 2014), suggesting independent evolution of vocal learning within birds (Jarvis et al. 2014; Jarvis 2019). Learning in general may play a role in adaptation and evolution (Dukas 2013); however, it is uncertain whether birdsong learning plays a similar role, for instance, in adaptation to urban environments. Therefore, it may be interesting to study the evolutionary and ecological role of vocal learning using a phylogenetic comparative method, which provides a powerful tool for comparing patterns of inter-specific variation while taking into account the shared evolutionary history of the species (Garamszegi 2014).

Perching birds (Aves: Passeriformes) include two different groups of species differing in their mechanism of song acquisition. While vocal learning is thought to be common in oscine species (Suborder Passeri), suboscine species (Suborder Tyranni) acquire their song via genetic inheritance (Kroodsma 2004; Jarvis et al. 2014). Oscine species are highly adaptable and seem to adjust to different environments, including noisy ones, by modifying song parameters (Shannon et al. 2016). On the other hand, suboscine species show little variation in acoustic structure across their distribution (Kroodsma 2004) and may face limitations for responding to urban noise (Ríos-Chelén et al. 2012), suggesting little or no adjustability. Given that suboscine species seem to be extremely limited in their ability to modify acoustic parameters, it would be interesting to understand how they cope with urban noise. Nevertheless, most of the studies on the impact of urban noise have been developed with oscine species (Shannon et al. 2016).

With the aim of understanding the role of birdsong-learning programs for adapting to urban environments, we performed a twofold approach, combining a field study together with a phylogenetic comparative analysis. While the field data offers information on the response, the second approach may provide insight into the evolutionary advantages of vocal learning for living in noisy areas. Integrating field and comparative analyses will thereby enable us to obtain a better understanding of the role of birdsong learning for adapting to urban habitats, from both an ecological and an evolutionary perspective. During the field study, we followed sympatric oscine and suboscine species inhabiting urban and rural areas in a Neotropical region. Birds were recorded to assess song spectral parameters and birdsong activity, as measures of singing behavior. Additionally, using songs from an acoustic library, we performed a phylogenetic comparative analysis using learning ability and spectral parameters as predictors and adaptation to the urban environment as the response. We proposed three hypotheses. First, given that vocal learning may confer adaptive advantages, oscine species may vary all characteristics of their song (singing activity and spectral and temporal parameters), under noisy conditions. Second, suboscine species that are in theory less adaptable would face limitations for changing spectral parameters, but may compensate by modifying birdsong activity in relation to noise. Finally, if vocal learning mediates evolution, similar to other learned traits (Dukas 2013), we hypothesized that vocal learning may serve as a preadaptation for inhabiting noisy urban environments.

Materials and methods

Two different datasets were collected and analyzed for the testing of our hypotheses. First, with a field study, data on song structure and birdsong activity was collected in six different localities (3 urban, 3 rural) between May and December 2015 (details below). Second, using data obtained from recordings provided by Macaulay Library (details below), a comparative analysis was performed using a phylogenetic correction.

Sampling approach

We pre-selected a pool of sympatric oscine and suboscine birds that simultaneously inhabit urban and rural areas and that were not too different in body size. Based on the known presence of the species of interest, six different areas (3 urban, 3 rural) were selected for the field study. Urban areas were located in the city of Medellin, north-western Colombia, and the rural localities were positioned in its surroundings (Fig. 1). One urban and one rural site were always sampled on consecutive days to reduce seasonal variation between the two categories. Urban localities were close to highways or streets with high traffic flow to ensure high urban noise. All urban and rural localities were located at similar altitudes (between 1400 and 1600 m.a.s.l.), to ensure similarity in habitat and bird species composition. All localities were within a 50-km range to avoid latitudinal effects. Data for bird song activity and acoustic structure were collected only during working days, between sunrise and ten o’clock in the morning, both in rural and urban localities.

Fig. 1 Map illustrating urban and rural localities for data collection. Colombia, located in northern South America, is shown in the upper right box, and Antioquia is shown in the bottom right box. The shadowed area in the main box corresponds to the urban area of Medellin, Antioquia. Dots represent localities. Since one rural locality was different for song activity and acoustic structure analyses, the map shows four rural localities

Noise measurements

Noise levels were measured as sound pressure levels at five different points in each locality every time the point was visited, following the protocol of Brumm (2004). Noise was measured on the dB(A) scale using a PCE-322A class II sound meter, with A-frequency weighting, automatic measurements in a range between 30 and 130 dB (re. 20 µPa), a frequency response between 31.5 Hz and 8 kHz, and fast response. The sound meter was held at about 1.5 m above ground level, pointing in five different directions (north, south, east, west, up) for 10 s each, recording one measurement per second, for a total of 50 measurements per observation point per hour (Supplementary file 1). Average noise per point was subsequently calculated to avoid bias from temporary noise sources, such as buses or airplanes passing by.
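The averaging step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function names and the example readings are hypothetical, and the text does not state whether the average was arithmetic or energetic, so both variants are shown.

```python
import math
from statistics import mean

def point_noise_level(readings_by_direction):
    """Arithmetic mean of all dB(A) readings taken at one observation point."""
    all_readings = [r for readings in readings_by_direction.values() for r in readings]
    return mean(all_readings)

def energetic_mean_db(levels):
    """Energy-based (logarithmic) mean of dB levels.

    Assumption: shown only as a common alternative; the text does not
    say which kind of average was computed.
    """
    return 10 * math.log10(mean(10 ** (level / 10) for level in levels))

# Hypothetical readings: five directions x ten 1-s readings = 50 values
example = {
    direction: [50.0 + i for i in range(10)]
    for direction in ("north", "south", "east", "west", "up")
}
average_at_point = point_noise_level(example)
```

Pooling all 50 readings before averaging means that a single loud event, such as a passing bus, contributes only a few of the values rather than dominating the estimate for that point.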

Acoustic structure

Twelve different species (six oscines: House wren (Troglodytes aedon), Black-billed thrush (Turdus ignobilis), Blue-grey tanager (Thraupis episcopus), Greyish saltator (Saltator coerulescens), Streaked saltator (Saltator striatipectus), Bananaquit (Coereba flaveola); and six suboscines: Great kiskadee (Pitangus sulphuratus), Tropical kingbird (Tyrannus melancholicus), Yellow-bellied elaenia (Elaenia flavogaster), Bar-crested antshrike (Thamnophilus multistratus), Pale-breasted spinetail (Synallaxis albescens), Azara’s spinetail (S. azarae)) were selected for studying birdsong acoustic structure under noisy conditions. A minimum of two and a maximum of five different individuals of each species were recorded for at least 15 min at each locality. A total of 104 individuals of oscine species (54 rural, 50 urban) and 103 individuals of suboscine species (56 rural, 47 urban) were included in the analysis. Individual birds were identified while singing, and recordings were opportunistically collected. Field recordings were obtained by hand using a Sennheiser Me67/K6 microphone attached to a Marantz PMD661 portable recorder, at an approximate distance of 15 m from each individual bird. Recordings were collected during the morning, starting at sunrise and ending when song activity decreased at each locality, typically around 10:00 in the morning. Singing birds were always observed throughout recordings, and the use of a directional microphone helped to obtain high-quality recordings, independent of noise levels.

Recordings were visually inspected, and based on the quality of the signal, five different strophes per individual were selected for analysis. Acoustic attributes were analyzed using Avisoft-SASLab Pro Software, v. 5.2.09 (Avisoft Bioacoustics, Berlin, Germany), with the following spectrogram parameters: Hamming window, FFT length 512, frame size 75%, and overlap 50%. Spectrum-based measurements were obtained using the automatic selection of the acoustic signals. For this, a threshold of −36 dB was set for the automatic segmentation of strophes (recognition of start and end) to ensure standard spectrograms based on the peak amplitude of each analyzed song. Following this procedure, automatic measurements of frequency (peak, maximum, minimum), bandwidth, and entropy were retrieved from Avisoft (Supplementary file 2), which avoids spurious results due to measurement artifacts (Zollinger et al. 2012; Brumm et al. 2017).

Birdsong activity

Eight different species (four oscine: Tr. aedon, Tu. ignobilis, Th. episcopus, S. coerulescens; four suboscine: P. sulphuratus, Ty. melancholicus, E. flavogaster, T. multistratus) were used for studying bird song activity at the urban–rural localities. Each locality was surveyed for 4 h, starting at sunrise, and over four consecutive days, or semi-consecutive days in case of rain or bad weather conditions. As with the previous sampling, one urban and one rural site were always sampled on consecutive days to reduce seasonal variation.

Five different independent points were randomly selected at each locality. Points were located with a distance of at least 300 m between them, to ensure independence of observations. A single observer (DM-A) registered birdsong activity of the chosen species at each point for 5 min every hour, and during 4 h by registering the number of strophes sung by individuals of the selected species. Every time a bird from the selected species sang at each of the points, the observer counted and registered the strophes (Supplementary file 1). The order for visiting the points was randomized in order to sample all points at different hours, thereby avoiding bias due to natural changes in song activity.

Phylogenetic signal and comparative analysis

A total of 145 passerine species (75 oscine, 70 suboscine) were selected for performing the comparative phylogenetic analysis. Macaulay Library provided field recordings for all species. Given that we obtained several recordings per species, recordings were selected based on the quality of the signal, with a preference for recordings from tropical countries.

Avisoft-SASLab Pro Software was used for the spectrographic analyses. Following the aforementioned procedure, spectrum-based automatic measurements were retrieved for frequency (maximum, minimum, peak), bandwidth, duration, and entropy parameters (Supplementary file 3). Recordings were normalized before analysis, with filtering unnecessary, since all recordings were of high quality.

Given that the bird species selected for analysis belong to different families, and the fact that we did not have complete phylogenies for all species, Bird Tree (Jetz et al. 2012) was used to construct a reliable phylogeny for all species. In the absence of a complete phylogenetic analysis, Bird Tree provides a standardized phylogeny with a robust and validated phylogenetic background (Rubolini et al. 2015). We used the phylogeny of orders proposed by Hackett (Hackett et al. 2008) as the backbone of our phylogeny and a total of 1000 different trees were obtained. From the set of 1000 trees, we obtained the maximum clade credibility (MCC), which allows us to estimate the effect of factors explaining the evolution of acoustic traits (Supplementary file 4). The MCC method identifies the single tree in the posterior sample with the largest sum (or alternatively, product) of posterior probabilities across its constituent bifurcations (Heled and Bouckaert 2013).

We performed two different phylogenetic comparative analyses. First, we calculated the phylogenetic signal for acoustic parameters (Münkemüller et al. 2012). This metric measures the tendency of evolutionarily related organisms to resemble each other (Blomberg et al. 2003; Münkemüller et al. 2012). We used the package phytools in R (Revell 2012) to estimate Pagel’s lambda (λ) (Pagel 1999) as the index of phylogenetic signal. We chose Pagel’s lambda because, of all indices, it has the smallest type I error (Münkemüller et al. 2012). This index varies between 0 and ∼1: a value of 0 indicates that the trait varies independently of the phylogeny (no phylogenetic signal), whereas a value of 1 indicates that the trait is distributed as expected under a Brownian motion model of evolution. Higher values indicate a larger phylogenetic signal, and the null hypothesis is that λ = 0. Second, we performed a phylogenetic generalized model to test for the role of the different acoustic traits and birdsong learning (oscines vs. suboscines) for adapting to urban environments. Details on this approach are provided below.
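To make the meaning of λ concrete, the sketch below shows the standard lambda transformation of a phylogenetic covariance matrix: off-diagonal elements (shared branch lengths) are multiplied by λ, while the diagonal is left unchanged. The function name and the three-species matrix are hypothetical and serve only as an illustration; the actual estimation in phytools fits λ by maximum likelihood and is not reproduced here.

```python
def lambda_transform(cov, lam):
    """Scale off-diagonal covariances by lam; keep the diagonal unchanged."""
    n = len(cov)
    return [
        [cov[i][j] if i == j else lam * cov[i][j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical covariance matrix for three species: A and B share 0.6 units
# of branch length, C shares none with either of them.
C = [
    [1.0, 0.6, 0.0],
    [0.6, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

brownian = lambda_transform(C, 1.0)   # unchanged: Brownian motion expectation
no_signal = lambda_transform(C, 0.0)  # species treated as independent
```

With λ = 1 the matrix is untouched, so trait covariance follows the tree exactly; with λ = 0 all covariances vanish and related species are no more similar than unrelated ones.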

Statistical analyses

We tested for differences in noise levels between urban and rural localities using data collected during song activity measurements. Noise data were checked for normality using a normal probability plot and looking at dispersal of the data. Since the data appeared to be normal, we performed a two-tailed t-test.

Differences in acoustic structure between urban and rural areas were tested using linear mixed effect models (LMM) built in the lme4 package (Bates et al. 2015) for R software for Mac (R version 4.2.1, R Development Core Team 2022). We aimed to compare acoustic variation of the same set of species that was simultaneously present in both urban and rural areas. First, we ran a principal component analysis (PCA) for dimensional reduction of the six acoustic variables retrieved from Avisoft. Data were scaled before analysis. As a result, we extracted two components with eigenvalues larger than 1, together representing 72.3% of the variance. According to the loadings, component 1 was more related to frequency parameters (peak (−0.95), minimum (−0.72), and maximum (−0.98)) and bandwidth (−0.80), and component 2 was more related to duration (0.61) and entropy (−0.72). Each principal component was used individually for analysis. We built models with the components (PC1, PC2) as individual response variables. Noise level, locality type (urban, rural), and learning program (oscine, suboscine) were used as fixed factors, and species was included as a random intercept. We ran all possible models and selected the best model based on AIC. Finally, the best models were run independently to obtain model estimates per factor. We also calculated confidence intervals (CI) by using the function confint in the lme4 package, with the “boot” method and 1000 simulations. Normality was checked with a histogram of residuals; all plots approximated a normal distribution. Finally, we compared acoustic characteristics between oscine and suboscine birds using a Wilcoxon rank sum test with continuity correction in R. We used this test because the variance was not homogeneous (Levene’s test, all P < 0.05).
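As a reading aid, the full mixed model described above could be written as follows. The notation is introduced here for illustration only and is an assumption: y is a principal component score (PC1 or PC2) for recording i of species j, the β terms are fixed-effect coefficients, and u is the species-level random intercept.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{noise}_{ij} + \beta_2\,\mathrm{urban}_{ij}
       + \beta_3\,\mathrm{suboscine}_{j} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under this formulation, the share of variance attributable to species is conventionally expressed as the intraclass correlation, ICC = σ_u² / (σ_u² + σ²); candidate models then correspond to subsets of the three fixed-effect terms.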

Species vary in abundance across locations, and these differences may be a confounding factor when analyzing birdsong activity (number of strophes per species/point). Therefore, activity data were transformed to indicate presence or absence (1, 0) at each hour and point per species to avoid bias. Generalized logistic regression models were built in R software for Mac using activity (presence or absence) as the response variable and noise levels, type of locality (urban, rural), learning ability (oscine, suboscine), and an interaction between learning ability and type of locality as factors in all possible combinations. A logistic model may provide a reliable estimation of the probability of finding a bird species active without the bias of species abundance. For both the birdsong activity and the acoustic structure analyses, AIC was used for model selection, and models with a delta AIC lower than two were considered the best models (Burnham and Anderson 2002).
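The presence/absence transformation described above can be sketched as follows. The function name, the key layout (species, point, hour), and the counts are hypothetical and only illustrate the idea; they are not taken from the study's data files.

```python
def to_presence(strophe_counts):
    """Map each strophe count to 1 (at least one strophe sung) or 0 (silent)."""
    return {key: int(count > 0) for key, count in strophe_counts.items()}

# Hypothetical counts keyed by (species, point, hour)
counts = {
    ("Troglodytes aedon", "P1", 1): 12,
    ("Troglodytes aedon", "P1", 2): 0,
    ("Pitangus sulphuratus", "P1", 1): 3,
}
presence = to_presence(counts)
```

After this step, a species that sang 12 strophes and one that sang 3 strophes in the same interval are both scored as active, so abundant species no longer carry more weight than scarce ones.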

Finally, we performed a comparative analysis for understanding the evolutionary role of vocal learning for living in urban environments. Given that species traits are not independent, due to a common evolutionary history (Felsenstein 1985), a phylogenetic generalized model (PGLM) was developed in the phylolm package built in R software (Ho and Ane 2014). We used the phyloglmstep function, which fits a phylogenetic logistic regression following Ives and Garland (2010) and performs stepwise model selection for phylogenetic generalized linear models, using the criterion −2*log-likelihood + k*npar, where npar is the number of estimated parameters and k = 2 for the usual AIC. This test aimed to see whether acoustic traits might predict adaptation to urban environments. Accordingly, bird species were assigned to urban or rural type based upon the literature. We assumed that a species was adapted to a particular habitat (urban or rural) if it predominantly lives there. If a species is observed equally in rural and urban areas, we set urban as the preferred habitat to denote adaptation to urban environments. Adaptation to urban habitats was considered a binary response variable, with 1 for urban and 0 for rural. Acoustic traits were log transformed to make them comparable and were used as factors. Additionally, since an ecological variable (foraging stratum) has been shown to play a role in living in urban areas (Cardoso 2014), it was also included in the model as a factor. Figures were made using ggplot2 in R (Wickham 2009).
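The selection criterion quoted above, and the delta AIC comparison used to identify the best models, can be illustrated with a short sketch. The function names, model labels, log-likelihoods, and parameter counts below are hypothetical; this is not the phylolm implementation, only the arithmetic behind it.

```python
def information_criterion(log_likelihood, npar, k=2.0):
    """Return -2*log-likelihood + k*npar; k = 2 gives the usual AIC."""
    return -2.0 * log_likelihood + k * npar

def best_models(aic_by_model, threshold=2.0):
    """Return the delta AIC of every model whose delta AIC is below threshold."""
    lowest = min(aic_by_model.values())
    return {
        name: aic - lowest
        for name, aic in aic_by_model.items()
        if aic - lowest < threshold
    }

# Hypothetical fits: (log-likelihood, number of estimated parameters)
fits = {
    "stratum": (-90.0, 2),
    "stratum + bandwidth": (-88.5, 3),
    "full": (-88.4, 6),
}
aics = {name: information_criterion(ll, npar) for name, (ll, npar) in fits.items()}
selected = best_models(aics)
```

In this made-up example the full model has the highest likelihood but is penalized for its six parameters, so only the two simpler models fall within two AIC units of the minimum.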

Results

Acoustic structure

Noise levels were significantly lower in rural areas, compared to urban localities (rural = 45.7 dB(A) ± 0.077 SE, urban = 55.8 dB(A) ± 0.085 SE, t29 = 9.38, P < 0.001). Acoustic structure showed significant variation among species examined in urban–rural plots (LMM). For frequency parameters (PC1), two models had delta AIC lower than 2. Both models included Group: suboscines (Table 1). Moreover, the type of locality (urban) was included in one of the two models (Table 1). Effects of both factors were relatively small; Group had a wide CI that overlapped zero in both models, while the type of locality showed a narrower CI that did not overlap zero. In addition, the intraclass correlation (ICC) showed that more than 95% of the variance was explained by the random factor (species). For PC2 (duration, entropy), two models had delta AIC lower than 2 (including a null model). Group: suboscines was the only factor included in the resulting model. The effect of the factor was relatively small, with a narrow CI that overlapped zero. Moreover, ICC suggested that 62% of the variance was explained by species membership. In conclusion, variation in frequency parameters (PC1) was best explained by the type of locality (urban) and species membership. Although group (suboscines) may play a role, this was not significant. In addition, duration and entropy seem to vary between oscine and suboscine birds. The best models are presented in Table 1.

Table 1 Best models explaining variation in acoustic structure in sympatric oscine and suboscine species inhabiting urban and rural localities

Since we implemented an LMM with a PCA including all acoustic parameters, we performed t-tests for each group (oscine/suboscine), comparing urban and rural localities to evaluate the differences for each group. In summary, suboscine species showed higher frequency parameters in urban areas, compared to rural localities (peak frequency: t = −4.8, P < 0.001; minimum frequency: t = −4.1, P < 0.001; maximum frequency: t = −6.4, P < 0.001; bandwidth: t = −3.4, P = 0.001, Fig. 2). In oscine species, only the minimum frequency showed a tendency to increase from rural to urban localities, and this tendency was not significant (peak frequency: t = 0.3, P = 0.76; minimum frequency: t = −1.85, P = 0.06; maximum frequency: t = −0.2, P = 0.83; bandwidth: t = 0.72, P = 0.47, Fig. 2).

Fig. 2 Acoustic structure of oscine (black) and suboscine (gray) birds from urban and rural localities. Panels depict average values ± SE for A peak frequency, B minimum frequency, C maximum frequency, and D bandwidth. Significant differences with a t-test are shown with “*”

Given that oscine birds did not seem to vary frequency parameters from rural to urban environments, we compared acoustic characteristics between oscine and suboscine birds with a Wilcoxon rank sum test with continuity correction. Oscine species sang with a higher peak (W = 293,211, P < 0.001) and maximum frequency (W = 373,755, P = 0.001) compared to suboscines; however, there was no difference in the minimum frequency (W = 339,519, P = 0.927). In addition, oscines were found to produce songs with greater bandwidth than suboscines: W = 409,925, P < 0.001. These differences in frequency between oscine and suboscine birds can also be seen in Fig. 2. Altogether, these results suggest that the variation in frequency parameters was best explained by the learning program (suboscine birds), which seemed to show differences in frequency parameters between urban and rural localities.

Birdsong activity

Variation in the presence/absence of bird song activity between urban and rural areas was best explained by two models, the first one including the learning program, the hour of recording, the type of locality (urban), and the interaction between learning program (suboscine) and type of locality (urban). The second model included the same variables plus noise (Table 2). Our results suggest that suboscine birds are less active in urban areas compared to rural localities. Furthermore, oscine species seem to be as active in urban as in rural areas.

Table 2 Best models explaining variation in birdsong activity in sympatric oscine and suboscine species inhabiting urban and rural localities

Phylogenetic signal and comparative analysis

Lambda (λ) values ranged from low and non-significant for stratum to medium and significant for bandwidth (stratum λ = 0, P = 1.0; bandwidth λ = 0.346, P = 0.03). Duration produced a medium phylogenetic signal, but it was not significant (λ = 0.272, P = 0.18). Finally, while the minimum frequency had a low and non-significant phylogenetic signal (λ = 0.06, P = 0.19), both peak and maximum frequency had a medium and significant lambda (peak λ = 0.139, P = 0.008; maximum λ = 0.161, P = 0.006).

The PGLM analysis indicated that four different models help to explain adaptation to urban environments. The best models are summarized in Table 3. The models indicated that living in urban areas was not significantly predicted by vocal learning. In addition, the best models consistently included foraging stratum and frequency parameters, which helped to predict living in urban areas (Table 3). In general, living in urban environments was associated with foraging in higher strata, higher minimum and maximum frequency, and broader bandwidth. Figure 3 depicts effects detected in the different models, and Supplementary Fig. 2 shows acoustic and ecological traits in urban and rural species.

Table 3 Best models explaining adaptation to urban environments in oscine and suboscine species
Fig. 3 Coefficients per factor with their corresponding 95% confidence intervals (CI) for the best models explaining adaptation to urban environments (delta AIC lower than 2). CI were obtained with 1000 bootstraps. Each module (1, 2, 3, 4) corresponds to a model described in Table 3

Discussion

We designed a field study in combination with a phylogenetic comparative analysis for testing three different hypotheses that may help to understand the role of vocal learning for adapting to urban environments. Assuming that bird species differing in their vocal learning program would differentially adjust to urban environments, we hypothesized first that learner species (oscines) would change the acoustic structure, contrary to non-learner birds (suboscines). Second, both non-learner and learner species may adjust song activity in urban environments. And third, the learning program may be an exaptation (preadaptation) that may favor colonization of urban environments.

Acoustic structure

Contrary to our expectations, the study revealed that oscine and suboscine birds differentially adjust the acoustic structure of their song in order to successfully inhabit urban environments, and both groups were capable of shifting frequency parameters in noisy localities. However, while oscine birds seemed to change only the minimum frequency, suboscines shifted the complete song upwards. The shift in minimum frequency in oscine birds was as predicted according to previous studies (Brumm and Zollinger 2013). Although we expected more variation in oscine birds, this result may be related to the higher frequencies observed in oscine birds both in urban and rural areas (Fig. 2). A recent analysis suggested that oscine birds sing at a higher frequency than suboscines (Mikula et al. 2021), which is in line with our results. Moreover, learning biases may also play a role (Williams and Lachlan 2022). Oscine species usually display more complex repertoires and this may help to increase frequency parameters (Lambrechts and Dhondt 1990), and wider bandwidth (Singh and Price 2015), as observed in our study.

In addition, changes in acoustic structure in suboscines were contrary to our predictions. Previous studies have shown that suboscine birds are capable of modulating their frequency by adjusting air sac pressure (Amador et al. 2008), suggesting that non-learner species may shift the frequency of the complete song upwards without changing their pattern, thereby supporting our finding. Given that the song of non-learner passerines is highly stereotyped and genetically determined, suboscines may be limited in their ability to adjust frequency parameters alone. However, by using differential air sac pressure, which may be the result of increasing song amplitude due to the Lombard effect (Brumm 2004; Zollinger et al. 2011; Brumm and Zollinger 2013), suboscines may be able to shift the frequency of the whole song, avoiding the masking effect of urban noise as a by-product of the increase in amplitude. This form of behavioral plasticity has already been described in some bird species (Brumm and Zollinger 2011), representing a shared trait in extant birds (Schuster et al. 2012). The often observed increase in frequency may also be explained by measurement errors due to uncalibrated recordings (Zollinger et al. 2012). However, our measurements were taken automatically and based on spectra, which precludes such artifacts (Brumm et al. 2017).

Suboscine birds may also be more flexible than previously thought. They exhibit within-species song variation (Riebel et al. 2005; Kroodsma et al. 2013) and have a higher dominant frequency in forest fragments closer to cities (Tolentino et al. 2018). Song in suboscine species is an innate behavior; hence, it is highly stereotypic between populations, and geographic variation of song is usually associated with genetic divergence. Our study suggests that even though urban songs are sung at a higher frequency, these songs highly resemble songs of the same species in rural areas (Supplementary Fig. 1). Perhaps this frequency shift of the full song is a response to override the masking effect of urban noise.

Song activity

We found partial support for our second hypothesis. Our results indicated that suboscine species decreased song activity in urban areas, compared to oscine birds, which did not seem to vary their singing activity between urban and rural areas. On the other hand, our data showed that all the bird species in this study seemed to decrease activity with the progress of the day, a phenomenon already observed in other species. Our study was performed between approximately 6:00 and 10:00 a.m., and this period coincides with the rush hour in urban environments, which may help to explain changes in behavior (Dorado-Correa et al. 2016). Behavioral changes have already been documented in bird species when facing urban noise (Halfwerk and Slabbekoorn 2009; Dorado-Correa et al. 2016), suggesting that a decrease in song activity may be an important strategy for avoiding the masking effect of city noise. So far, there are very few studies addressing the question of birdsong activity in suboscine birds. Ríos-Chelén et al. (2013) studied the vermilion flycatcher (Pyrocephalus rubinus) and showed a positive association between noise level and song duration; however, no behavioral data on singing activity were collected for the study. A more recent study with the same species observed that males started to sing earlier in highly urbanized environments (Sánchez-González et al. 2021), suggesting that urbanization per se, rather than noise or light, is more important for determining birdsong activity in suboscine birds. If this is the case in our study, suboscine birds may have started singing earlier, and accordingly, we may have observed less activity in urban environments during the studied hours, compared with rural localities. Another possible explanation is that a decrease in activity is an alternative behavioral strategy to avoid the masking effect of noise (Dorado-Correa et al. 2016).

Evolutionary role of song for adapting to urban noisy environments

Our results also showed that some of the frequency parameters appeared to have a phylogenetic signal. These acoustic parameters have been shown to be highly conserved among closely related species, and this result could be explained by the conservation of ancestral acoustic traits within groups (Wiens and Graham 2005). Moreover, given that frequency traits are considered an index signal regulated by body size (Bradbury and Vehrencamp 2011), our results could also be explained by convergence due to the evolution of body size together with body plan. A recent analysis at a global scale already suggested that variation in peak frequency is phylogenetically conserved (Mikula et al. 2021), which is consistent with our results. Our phylogenetic analysis showed that adaptation to urban environments is best explained by models including foraging stratum, average minimum and maximum frequency, and average bandwidth, but not by the learning program. Since both oscines and suboscines can live in cities, perhaps the learning program does not represent a barrier to colonizing urban environments. Remarkably, all models included the foraging stratum, frequency parameters (minimum, maximum), and bandwidth as factors that explained adaptation to urban environments. It seems that bird species that forage in higher strata (canopy) and have a higher frequency and broader bandwidth are better adapted to live in urban environments.

Previous studies have already suggested that bird populations living in cities sing at higher frequencies (Slabbekoorn 2013). In addition, a comparative analysis suggested that peak frequency helped to explain tolerance to urbanization (Cardoso 2014). Furthermore, different environmental factors (temperature, precipitation, vegetation cover) act as barriers that filter communities (Spasojevic et al. 2014). For bird species, it is well known that environmental filters influence the composition and structure of communities (Martin et al. 2018; de Souza Leite et al. 2022). In that respect, noise may act as a selection pressure that filters the community (Francis et al. 2011; Proppe et al. 2013; Aronson et al. 2016), limiting which species can live in noisy urban environments. Our study compared tropical species with different vocal learning programs in an evolutionary context, trying to explain why city birds sing at higher frequencies. If noise is filtering urban communities, at least for some species, singing at higher frequencies may have evolved prior to colonization of the urban environment, and not as a consequence of living there. Although singing at higher frequencies appears to be advantageous for communicating in cities, the phylogenetic signal in frequency traits suggests that higher frequency and broader bandwidth are not a response to urban noise, but an exaptation that facilitates colonization of urban environments. In addition, foraging in higher strata is very important for living in noisy cities but does not have a phylogenetic signal; it may instead be a response to the absence of understory caused by the urbanization process.

The study of acoustic traits and song behavior of sympatric oscine and suboscine species in relation to urban noise enabled us to compare the responses of both groups and to understand the role of birdsong-learning programs for living in noisy urban environments. We established that differences in vocal learning might not limit colonization of noisy areas. Although both oscine and suboscine birds displayed a flexible response in urban environments, they use different strategies. Oscine species shifted a single acoustic parameter, similar to previous studies, whereas suboscine species, which have a stereotypic song across populations, appeared to shift the complete song upwards, probably by adjusting air sac pressure while singing at a higher amplitude. This does not imply a change to the general template of their song; however, it may help them to avoid the masking effect of urban noise. In addition, suboscine species seemed to decrease song activity in the city, perhaps as a response to urban noise, as a result of singing before rush hour, or as a strategy to compensate for higher energy expenditure in cities. Moreover, due to ecological filtering by city noise, higher frequencies seem to be an exaptation that may help bird species to colonize urban environments, independent of the learning program. In conclusion, our study provided evidence that both oscine and suboscine species may display behavioral plasticity for communicating in noisy urban places. Moreover, we found evidence that passerine species use different strategies when facing noisy environments, depending on their vocal learning ability, suggesting that the birdsong-learning program does not determine whether birds can colonize cities, but rather how they cope with the masking effect of urban noise.