CT imaging and digital endocast reconstruction
Taxonomic sampling included the skulls of 80 specimens representing 51 species: 25 flying avians, 19 flightless avians, and 7 non-avian dinosaurs. The non-avian dinosaurs were Alioramus altai (IGM 100/1844), Khaan mckennai (IGM 100/973), Citipati osmolskae (IGM 100/978), Incisivosaurus gauthieri (IVPP V 13326), Zanabazar junior (IGM 100/1), an unnamed troodontid (IGM 100/1126), and Archaeopteryx lithographica (NMNH PV OR 37001). We CT scanned the specimens using a General Electric v|tome Phoenix Computed Tomography (CT) scanner (General Electric, Heidelberg, Germany) at the Microscopy and Imaging Facility at the AMNH or the high-resolution source at the University of Texas at Austin High-Resolution X-ray Computed Tomography Facility (Additional file 1: Tables S1 and S2). Raphus cucullatus (NHMUK PV A9040) was scanned at the NHMUK Imaging Facility. Scan data were imported into VGStudio MAX v2.2 (Volume Graphics GmbH, Heidelberg, Germany) to construct a 3-D model of the endocranial cavity (endocast) following the protocol set forth by Balanoff et al. Because the brain fills nearly the entire cranial cavity in birds [25, 26], its morphology is accurately reflected by an endocast [24, 27]. The 3-D models were smoothed in VGStudio MAX and exported as PLY files for virtual landmark placement in Landmark Editor v3.6.
Time-calibrated phylogenetic tree
A time-calibrated tree was used to perform comparative phylogenetic analyses on endocranial shape. For extant birds, we constructed a topology using the data provided online by Jetz et al. (birdtree.org), pruned to include only species sampled in the current study. The Jetz et al. trees were chosen for their dense species sampling: our dataset includes, in some cases, multiple very closely related species, and the increased tip sampling of these trees allowed us to retain all of them.
TreeAnnotator v1.8.1 was used to construct a maximum clade credibility tree from 1000 posterior trees based on the Hackett et al. tree backbone. Because some of the sampled taxa are not included in the Jetz et al. dataset, we substituted each of them for the tip label of its closest relative in that dataset. Thus, Alle alle, Chunga burmeisteri, Goura cristata, and Rollandia rolland were replaced by Pinguinus impennis, Pelecyornis australis, Raphus cucullatus, and Podilymbus gigas, respectively. Given the deep-time context of the analysis, we treated these extinct crown-group avians as existing in the Recent (i.e., 0 Ma) for the phylogeny. The dataset included Struthio molybdophanes and S. camelus; however, only the latter was present in the Jetz et al. dataset. Both species were included in the non-phylogenetically informed analyses (e.g. PCA and LDA), but only S. camelus was used for the construction of the time-calibrated tree and the phylogenetically informed analyses. This tree was used for the crown-group Aves dataset (Fig. 1). For the Coelurosaur dataset, we used Mesquite v3.02 to incorporate non-avian dinosaurs into the time-calibrated avian phylogeny based on the phylogenetic relationships proposed by Brusatte et al. Branch lengths were calculated for non-avian coelurosaurs based on the oldest fossil ages from the Paleobiology Database (http://www.paleobiodb.org) in sister subclades for each internal node (e.g., the species range for Incisivosaurus is 122.46–125.45 Ma and the oldest occurrence of Oviraptorosauria is 130.0 Ma, based on the first fossil occurrence of Caudipteryx zoui; subtracting the first occurrence age of Incisivosaurus, 125.45 Ma, from the latter gives 4.55 Myr, which is the branch length for Incisivosaurus). This combined extant and fossil tree was used for the Coelurosaur dataset (Fig. 1).
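The branch-length rule in the Incisivosaurus example can be written out as a short calculation. The sketch below is illustrative only and is not the Mesquite workflow used in the study; the function name and its arguments are hypothetical, and it assumes that the node age is set by the oldest occurrence in the sister subclades.

```python
def branch_length_myr(node_age_ma: float, taxon_first_occurrence_ma: float) -> float:
    """Branch length (Myr) as node age minus the taxon's first occurrence.

    Hypothetical helper: node_age_ma is taken to be the oldest fossil age
    among the sister subclades of the internal node (an assumption that
    mirrors the worked example in the text).
    """
    if taxon_first_occurrence_ma > node_age_ma:
        raise ValueError("a taxon cannot be older than the node it descends from")
    # Round to 2 decimals to suppress floating-point noise in the subtraction.
    return round(node_age_ma - taxon_first_occurrence_ma, 2)


# Ages from the text: Oviraptorosauria 130.0 Ma, Incisivosaurus 125.45 Ma.
incisivosaurus_branch = branch_length_myr(130.0, 125.45)
```

With the ages quoted above, this reproduces the 4.55 Myr branch length reported for Incisivosaurus.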
To collect landmark data, each 3-D endocranial reconstruction was imported into Landmark Editor for digital placement of 3-D Cartesian coordinate points. The relatively featureless surface of an endocast complicates landmark placement because it lacks clear surficial unions of tissues or other large features that help define discrete landmarks. We therefore placed evenly spaced 3-D semilandmarks on five sections of the endocast (i.e., left and right cerebrum, left and right optic lobes, and cerebellum). To do this, we first placed discrete Type I or Type II landmarks (sensu , also see ) along major endocranial features (e.g. the triple point between the optic lobe, cerebrum, and cerebellum). These landmarks were used to define patches in which sliding semilandmarks were placed on each brain section. With this approach, each patch comprises discrete landmarks anchoring the patch boundary, 3-D semilandmarks on lines between discrete landmarks along the surface of the endocast model, and 3-D surface semilandmarks within the patch. Patches of 6 × 6 sliding semilandmarks were used for each half of the cerebrum (Fig. 2, Additional file 1: Tables S3 and S4), patches of 4 × 4 semilandmarks were used for each optic lobe, and a single patch of 4 × 5 semilandmarks was used for the cerebellum. Where the edges of neighboring patches overlapped, we manually deleted the duplicated landmarks in R, leaving a total of 109 landmarks. We chose the number of semilandmarks for each patch so as to capture the shape of each lobe without oversampling it, which we confirmed using the function LaSEC in the LaMBDA R package. The raw landmark files are available on Dryad.
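The landmark bookkeeping implied by these patch sizes can be checked with a few lines of code. The sketch below is written in Python for illustration, whereas the study removed duplicates manually in R. The function names, the patch labels, and the coordinate tolerance are assumptions.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]

# Patch grid sizes as stated in the text (rows x columns of semilandmarks).
PATCH_GRIDS: Dict[str, Tuple[int, int]] = {
    "left cerebrum": (6, 6),
    "right cerebrum": (6, 6),
    "left optic lobe": (4, 4),
    "right optic lobe": (4, 4),
    "cerebellum": (4, 5),
}


def raw_landmark_count(grids: Dict[str, Tuple[int, int]]) -> int:
    """Total number of points before duplicates on shared edges are removed."""
    return sum(rows * cols for rows, cols in grids.values())


def drop_duplicate_landmarks(points: List[Point], tol: float = 1e-6) -> List[Point]:
    """Keep the first copy of each point; tol is an assumed coordinate tolerance."""
    kept: List[Point] = []
    for p in points:
        # A point is new if it differs from every kept point on some axis.
        if all(max(abs(a - b) for a, b in zip(p, q)) > tol for q in kept):
            kept.append(p)
    return kept


n_raw = raw_landmark_count(PATCH_GRIDS)  # 36 + 36 + 16 + 16 + 20
n_removed = n_raw - 109                  # 109 landmarks remain, per the text
```

Under these patch sizes the five patches contain 124 points in total, so a final count of 109 implies that 15 duplicated points were removed along shared patch edges.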
The landmark data were imported into R v3.1.1 and aligned using “geomorph” v3.0.3 with generalized Procrustes superimposition, with semilandmarks slid to minimize total bending energy [15, 16]. The aligned coordinates were imported into MorphoJ v1.06a as two datasets: 1) the full sample of non-avian and avian dinosaurs (N = 80; ‘Coelurosaur dataset’) and 2) a subsample comprising the crown-group birds (N = 73; ‘Aves dataset’). An analysis of the full sample indicated that landmark placement on Alioramus may have been incorrect, probably due to deformation and a lack of easily identifiable landmarks. We therefore created a third dataset excluding Alioramus (N = 79); all references below to the ‘Coelurosaur dataset’ are to this third dataset. Because endocasts are bilaterally symmetric, only the symmetric component of the shape data was used for further analyses. Species with multiple specimens were aligned with the pooled data and the mean shape was calculated for each species.
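The species-averaging step can be sketched as follows. This is a minimal illustration in Python rather than the MorphoJ procedure used in the study. It assumes that all configurations are already Procrustes-aligned and share the same landmark order, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
Config = Sequence[Point]


def species_mean_shapes(aligned: Sequence[Config],
                        species: Sequence[str]) -> Dict[str, List[Point]]:
    """Average aligned landmark configurations within each species.

    aligned[i] is the configuration of specimen i and species[i] its species
    name; every configuration must list the same landmarks in the same order.
    """
    groups: Dict[str, List[Config]] = defaultdict(list)
    for config, name in zip(aligned, species):
        groups[name].append(config)

    means: Dict[str, List[Point]] = {}
    for name, configs in groups.items():
        n = len(configs)
        # Landmark-by-landmark, coordinate-by-coordinate arithmetic mean.
        means[name] = [
            tuple(sum(c[i][k] for c in configs) / n for k in range(3))
            for i in range(len(configs[0]))
        ]
    return means
```

A species represented by a single specimen simply keeps that specimen's aligned configuration as its mean shape.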
We subjected each shape dataset to a principal components analysis (PCA) in MorphoJ. Plotting the scores associated with the first three PC axes creates a morphospace of the overall shape differences among specimens and identifies the morphological changes occurring along each axis. In addition to endocast images, ‘lollipop’ diagrams were used to visualize major shape changes occurring in morphospace along PC axes, where the vectors indicate the direction and magnitude of change from the mean shape. Using the “geomorph” R package, we also created a phylomorphospace based on the first two PC axes. The plot shows the correspondence of phylogenetic relatedness to morphological resemblance and can illustrate morphological innovation through the amount of morphospace explored by taxa. This analysis visualizes unequal magnitude of change per clade or branch and unequal morphological innovation by the geometry and relative length of the branches.
A dataset containing the PC axes encompassing 95% of the total shape variation was exported from MorphoJ and imported into R to run a linear discriminant analysis (LDA) using the “MASS” R package v7.3–45. For the Coelurosaur dataset, this corresponded to the first 17 PC axes. The LDA seeks to maximally separate the a priori locomotor groups (e.g. terrestrial, volant, secondarily flightless). If the LDA adequately separates the shape data into the a priori groups, this suggests that locomotor mode induces certain neuroanatomical shapes across taxonomic groups; if not, those groups are not supported by endocranial shape. Group membership was cross-validated to assess whether the LD axes correctly predict locomotory mode from endocranial shape. Two locomotor categories (‘proto-flyer’ and ‘swimming’) had single members and were therefore excluded from the dataset for the analysis. The data points for these members were subsequently projected onto the LDA morphospace using the ‘predict’ function in R.
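The choice of how many PC axes to retain follows from the cumulative proportion of variance. The sketch below illustrates that rule in Python; it is not the MorphoJ export itself, the function name is hypothetical, and the eigenvalues in the example are made up for illustration rather than taken from the study.

```python
from typing import Sequence


def n_axes_for_variance(eigenvalues: Sequence[float], threshold: float = 0.95) -> int:
    """Smallest number of leading PC axes whose summed variance reaches threshold.

    eigenvalues are the per-axis variances from a PCA; they are sorted in
    descending order here so the input order does not matter.
    """
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    cumulative = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(eigenvalues)


# Made-up eigenvalues (percent variance per axis), for illustration only.
example_axes = n_axes_for_variance([60, 25, 8, 4, 3])
```

In this toy case the first three axes explain 93% of the variance and the first four explain 97%, so four axes are retained.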
These analyses were run twice. The first analysis was run with crown-group avians (N = 73 specimens representing 44 species) and their locomotory mode (e.g. volant or flightless). Among crown-group birds, ‘flightlessness’ was defined as a complete loss of lift generation, i.e., a total inability to create sufficient lift to raise the body off the ground for any amount of time. Species meeting this definition, such as the ostrich (Struthio camelus), dodo (Raphus cucullatus), and kakapo (Strigops habroptilus), were labeled ‘flightless’ in these analyses. A penguin, Eudyptes chrysocome, is present in the dataset. Penguins use subaqueous flight to propel themselves through the water, and it was therefore placed in its own locomotion category (‘swimming’). As it was the only specimen in the ‘swimming’ locomotor category, it was removed from the dataset for the LDA. The first LDA was therefore performed with two a priori groups: ‘volant’ (N = 37 specimens, representing 25 species) and ‘flightless’ (N = 35 specimens, representing 18 species), for a total of 72 specimens and 43 species.
The second LDA contained the original set of crown-group birds and six non-avian dinosaurs (three oviraptorosaurs, two troodontids, and Archaeopteryx). The non-avian dinosaurs (excluding Archaeopteryx) were categorized as ‘terrestrial’ because their primary locomotory mode was not inherited from flying ancestors. The distinction here between ‘secondarily flightless’ and ‘terrestrial’ is important because ‘secondarily flightless’ indicates that the species evolved from flight-capable ancestors, whereas ‘terrestrial’ indicates their ancestors were never capable of volant activity. Although a penguin (Eudyptes chrysocome) is present in the dataset, it was again removed for this analysis. Similarly, because of continuing debate concerning the level of flying ability of Archaeopteryx [12, 42, 43], it was defined as a ‘proto-flyer’ and was the only member of that locomotor category. As such, Archaeopteryx was also removed from the dataset for the LDA. Therefore, this LDA was performed with three a priori groups: ‘terrestrial’ (N = 5 specimens, each representing a single species), ‘volant’ (N = 37 specimens, representing 25 species), and ‘flightless’ (N = 35 specimens, representing 18 species), for a total of 77 specimens and 48 species. Once the analysis was complete, the ‘predict’ function was used to place Archaeopteryx and Eudyptes into the LD morphospace.
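The removal of single-member categories before the LDA, and their later projection, amounts to splitting the sample by group size. The sketch below shows that split in Python for illustration only; the study fitted the model with “MASS” in R, and the function name is hypothetical. The group sizes are the specimen counts given in the text for the second LDA.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def split_singleton_groups(labels: Sequence[str]) -> Tuple[List[int], List[int]]:
    """Return (training indices, held-out indices).

    Specimens whose locomotor category has a single member are held out of
    the discriminant fit so they can be projected into the LD space afterwards.
    """
    counts = Counter(labels)
    train = [i for i, group in enumerate(labels) if counts[group] > 1]
    held_out = [i for i, group in enumerate(labels) if counts[group] == 1]
    return train, held_out


# Specimen-level labels for the second LDA, using the counts in the text.
labels = (["terrestrial"] * 5 + ["volant"] * 37 + ["flightless"] * 35
          + ["proto-flyer", "swimming"])
train_idx, held_out_idx = split_singleton_groups(labels)
```

Here 77 specimens remain for fitting, and the two held-out specimens (Archaeopteryx and Eudyptes) are the ones subsequently placed in the LD morphospace.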
To test explicitly whether locomotory mode and other factors, such as allometry and phylogenetic inertia, drive predictable changes to endocranial shape, we performed several regression analyses. First, regressions of the symmetric component of the shape data against log-transformed centroid size were run for each dataset in MorphoJ using 10,000 replicates to assess the allometric effect. Second, phylogenetically informed least-squares analyses of locomotory mode and shape data were performed using the ‘procD.pgls’ function in the “geomorph” R package. In addition, we evaluated phylogenetic signal using the ‘physignal’ function in the “geomorph” R package, and allometry using log-transformed centroid sizes exported from MorphoJ as a proxy for endocranial size [44, 45].
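Centroid size, the size proxy used in these regressions, is the square root of the summed squared distances of all landmarks from their centroid. The sketch below computes it in Python for illustration; the study took centroid sizes from MorphoJ. The function names are hypothetical, and the natural logarithm is an assumption because the text does not state the base of the log transformation.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def centroid_size(config: Sequence[Point]) -> float:
    """Square root of the summed squared distances of landmarks from their centroid."""
    n = len(config)
    centroid = [sum(p[k] for p in config) / n for k in range(3)]
    return math.sqrt(sum((p[k] - centroid[k]) ** 2 for p in config for k in range(3)))


def log_centroid_size(config: Sequence[Point]) -> float:
    """Log-transformed centroid size (natural log assumed; base not given in the text)."""
    return math.log(centroid_size(config))


# Toy configuration: corners of a unit square in the z = 0 plane.
unit_square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
cs = centroid_size(unit_square)
```

For the unit square, each corner lies at a squared distance of 0.5 from the centroid, so the centroid size is the square root of 2.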