Introduction

Figure 1 here arose simply as free play with the tools of geometric morphometrics (GMM). The data set comprises the familiar “Vilmann octagons” tracing around the midsagittal neurocrania of close-bred laboratory rats radiographed in the 1960s by the Danish anatomist Henning Vilmann at eight ages between 7 and 150 days and digitized some years later by the New York craniofacial biologist Melvin Moss. This version of the data is the one explored in my textbook of 2018: the subset of 18 animals with complete data (all eight landmarks) at all eight ages. The concern of Fig. 1 is the contrast of the Procrustes-averaged shapes for the age-7 and age-150 animals (only the averages, no consideration of covariances). The heavy lines are for the age-150 data subset; the light lines, the data from the animals at age 7 days (a configuration this paper will occasionally refer to as the “template”). All panels of the figure complicate the usual Procrustes plot of shape coordinate pairs by all or some of the segments connecting these coordinate pairs. In the figure’s left column, all \(8\cdot 7/2=28\) of the interlandmark segments have been drawn; in the right column, only the subset that are the reason for calling your attention to this figure. In the top row, those average locations correspond to the usual Procrustes-registered shape coordinates, partialling out only centering, size, and rotation. Panel (b) is limited just to the nine (out of 28) interlandmark segments from panel (a) that rotated either way by at least 8.6\(^\circ \) between age 7 and age 150 days according to this registration. (The figure label expresses this as “0.15 radians”—1 radian is the mathematician’s natural metric of angle, the angle of about \(57^\circ \) at which the extent of a circular arc is equal to the radius of the circle.) A pretty graphic, but it features too much overlay of signals to qualify as a legible pattern analysis.
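The selection rule behind panel (b) is easy to state in code. What follows is only a minimal sketch under stated assumptions: the function names are mine, the coordinates are a hypothetical four-landmark toy rather than the Vilmann averages, and both configurations are assumed to be already in a common registration.

```python
import math
from itertools import combinations

def segment_rotations(young, old):
    """Signed rotation (radians) of every interlandmark segment, young to old.

    young, old: lists of (x, y) landmarks in the same order and the same
    registration. Returns {(i, j): angle} with 1-based landmark numbers.
    """
    out = {}
    for i, j in combinations(range(len(young)), 2):
        ux, uy = young[j][0] - young[i][0], young[j][1] - young[i][1]
        vx, vy = old[j][0] - old[i][0], old[j][1] - old[i][1]
        # atan2(cross, dot) is the signed angle from u to v, in (-pi, pi];
        # positive values are counterclockwise.
        out[(i + 1, j + 1)] = math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
    return out

def rotating_subnetwork(young, old, threshold=0.15):
    """Keep only segments that rotate by at least `threshold` radians either way."""
    return {seg: a for seg, a in segment_rotations(young, old).items()
            if abs(a) >= threshold}

# Hypothetical toy data (NOT the Vilmann octagons): a unit square whose
# upper right corner is lifted, tilting the top edge.
young = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
old = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.2), (0.0, 1.0)]

rot = segment_rotations(young, old)      # 4 landmarks give 4*3/2 = 6 segments
kept = rotating_subnetwork(young, old)   # only the tilted top edge survives
print(sorted(kept), round(math.degrees(0.15), 1))
```

With eight landmarks the same loop visits all 28 segments. The threshold of 0.15 radian is the one quoted in the figure label; any other cutoff could be passed instead.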

Fig. 1

Unexpected pattern in the much-analyzed Vilmann data set of neurocranial octagons for growing laboratory rats. (upper left) Saturated network of interlandmark segments, Procrustes average shapes of the octagons at ages 7 days (light lines) and 150 days (heavy lines). Landmarks: Bas basion, Opi opisthion, IPP interparietal point, Lam lambda, Brg bregma, SES sphenoëthmoid synchondrosis, ISS intersphenoidal synchondrosis, SOS sphenoöccipital synchondrosis. (upper right) Subnetwork of segments rotating by at least 0.15 rad (\(8.6^\circ \)) over this age comparison. (Lower left) the same saturated network for the nonaffine component only of the same Procrustes shape coordinates, with landmark numbers. (Lower right) now that the uniform component of this shape coordinate space has been partialled out, there emerges a considerably simpler subnetwork, explicitly displaying the relative rotation of the anteriormost three landmarks with respect to the other five

Most of the clutter is due to the substantial change of aspect ratio (height-to-width ratio, obvious in the left column) that rotated both of the longer diagonals (Basion to Bregma, Lambda to SES) of the template with respect to the apparent principal axes (vertical and horizontal, in this registration) of that change. But that is not the problem here, where the apparent segmental rotations are independent of segment orientation but instead apparently a function mainly of position. So we want to adjust the graphical display in order to nullify the effects of just this specific known effect, the change of aspect ratio, in order to focus better on the relative rotation between ends of the organism that we are detecting in Fig. 1b. Fortunately, we already know how to remove this unwanted uniformity of relative vertical compression from our comparison: recourse to the “nonuniform” component of Procrustes shape space, complement to the subspace of uniform transformations (those that take all rectangles into parallelograms). The resulting plots are the pair in the bottom row.

It is no surprise that the diagram at lower left, panel (c), looks even more cluttered than panel (a), because now the calvarial roof, not just the cranial base, overlaps between the ages. But there is also a new signal once the diagram is edited to suppress all the segments that didn’t rotate much, a signal that seems not to have been anticipated in previously published analyses of these data. As panel (d) shows, 6 of the 28 possible segments rotate by more than 0.15 radian after standardizing this uniform aspect of the young-to-old comparison. And now the pattern is obvious. The four segments connecting the five landmarks at left (anatomical posterior, SOS around to Lam) are rotating clockwise (in this projection) over growth, while the three at the right, located anatomically anteriorly, are rotating counterclockwise, all this to a longitudinal arrangement (think of the centroid of the set of five, versus the centroid of the frontmost three) that isn’t rotating either way. This paper will refer to the segmented polygon SOS–Bas–Opi–IPP–Lam, the set of five landmarks at left in Fig. 1d, as the “posterior pentagon” and the remaining three, Brg–SES–ISS, as the “anterior triangle.”

Fig. 2

Two-point superpositions (Bookstein coordinates) of the Vilmann age-7 and age-150 average octagons for every possible baseline. Landmarks are numbered as in Fig. 1. Circled landmarks: ends of the baseline as registered to (0, 0) and (1, 0). Light lines, age-7 average; heavy lines, age-150 average

Fig. 3

Adapted from Bookstein (1991, Fig. 7.3.6)

Contrasting morphometric renderings for diverse transformations by thin-plate spline (lower row) of a variously oriented square (upper row). a Square to parallelogram, grid aligned with the edges of the square. b The same, grid now aligned with the square’s diagonals. c Square to trapezoid. d Almost the same, grid rotated \(45^\circ \): square to kite.

The opposition of rotations in panel (d) is consistent with a report using an alternative arithmetic of intersegment length-ratios. There is evidently shortening of upper calvarial anteroposterior length, Lam to Brg, relative to the central segment of the cranial base from ISS to SOS. There is no need to condition the finding as “relative to the sequestering of the uniform term,” because uniform transformations do not alter ratios of distances in the same direction, whether collinear or parallel.
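The claim about parallel distances can be checked in one line. As a sketch in notation that is mine rather than that of the cited sources, write a uniform transformation as \(x \mapsto Ax + b\) with \(A\) an invertible \(2\times 2\) matrix, and let two parallel segments have edge vectors \(v\) and \(cv\) for some scalar \(c \ne 0\). Then

\[ \frac{\lVert A(cv)\rVert }{\lVert Av\rVert } = \frac{|c|\,\lVert Av\rVert }{\lVert Av\rVert } = |c| = \frac{\lVert cv\rVert }{\lVert v\rVert }, \]

so the length ratio of the two segments is the same before and after the transformation. No such cancellation is available for segments in different directions, because \(\lVert Av\rVert /\lVert v\rVert \) generally depends on the direction of \(v\).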

This relative rotation, including that contrast of vertically aligned horizontal growth rates, the central cranial base versus the calvarial roof above it, is surely a feature of the 143-day change of form here. But where is it to be found in the GMM toolkit? Fig. 2 recovers exactly the same report from a quantitative style dating back more than 80 years prior to GMM, analysis via the coordinates that Francis Galton (1907) introduced for “classification of portraits” (see Footnote 1). Here I have diagrammed every possible two-point registration of these octagons (quantified only by their average coordinates as Moss originally digitized them). For each alternative baseline, the original Cartesian coordinate average configurations have been recentered, rotated and rescaled separately so that the first baseline point is at (0, 0) of a new coordinate system and the second is at (1, 0) in the same system (the two points circled in every panel of the figure). We have thereby altered every single step of the Procrustes toolkit (the centering, the rotating, the scaling) while eschewing any recourse to the thin-plate spline for separating out that uniform term. And yet ten of the panels clearly show the same phenomenon, the relative rotation between the anatomically posterior pentagon of landmarks and the anterior triangle. Whenever both ends of the baseline are in the same sector (here numbered [8, 1, 2, 3, 4] versus [5, 6, 7]), the rotation is clear in the behavior of the complementary sector. This is particularly evident in the analysis to baselines 5–6 (row 4 column 5), 5–7 (row 4 column 6), or 3–4 (row 3 column 2), where, regardless of any overall change of aspect ratio, the border of the octagon opposite the baseline appears to have radically shifted by a rotation with respect to that baseline. The disparity between ratios of change of length for segments ISS–SOS and Lam–Brg is clearest, perhaps, in the panel for that ISS–SOS baseline, fifth row, fourth column.
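The two-point registration just described amounts to a single complex division per landmark. The sketch below assumes nothing beyond the description above; the function names and the three-landmark example are hypothetical.

```python
from itertools import combinations

def two_point_coordinates(landmarks, i, j):
    """Register a configuration so landmark i sits at (0, 0) and j at (1, 0).

    landmarks: list of (x, y); i, j: 1-based numbers of the baseline ends.
    """
    z = [complex(x, y) for x, y in landmarks]
    origin = z[i - 1]
    unit = z[j - 1] - z[i - 1]
    if unit == 0:
        raise ValueError("baseline landmarks coincide")
    # Subtracting `origin` recenters; dividing by `unit` rotates and rescales.
    w = [(p - origin) / unit for p in z]
    return [(q.real, q.imag) for q in w]

def all_baselines(n):
    """Every unordered pair of landmark numbers 1..n."""
    return list(combinations(range(1, n + 1), 2))

# Hypothetical triangle, baseline from landmark 1 to landmark 2.
registered = two_point_coordinates([(1.0, 1.0), (3.0, 1.0), (1.0, 3.0)], 1, 2)
print(registered)
print(len(all_baselines(8)))   # one panel per baseline for an octagon
```

Applying `two_point_coordinates` separately to the age-7 and age-150 averages, once for each pair returned by `all_baselines(8)`, would reproduce the layout of a figure like Fig. 2.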

Such an analysis, both elegant and elementary, shares no arithmetic with the standard GMM toolkit of Procrustes registration and thin-plate splines. (For a good overview of computational aspects of that standard toolkit in a format suitable for routine biometric applications, see Claude (2008). While the term “geometric morphometrics” is not limited to Procrustes methods, the great majority of today’s GMM papers do indeed begin by a Procrustes transform of their landmark data.) The two-point registration is far older than that morphometric synthesis of the 1990s, older even than analysis by triangles (“tensor biometrics,” Bookstein, 1984) or by biorthogonal grids (Bookstein, 1978). Both of these versions involve attention to short or long transects of the form that intersect internally, where, by analogy with the change of form from a square to a rectangle, for one particular pair of directions (sides of the square) the ratios of change of distance are greatest or least and the angle of intersection is invariant at \(90^\circ ,\) while the ratio of change of the two distances at \(45^\circ \) to these directions (diagonals of the square) is unity and it is the change of their angle that is maximized. A closer inspection of the interlandmark-distance interpretation of Fig. 1d instead makes reference to distances that are parallel at some spacing (upper calvarial width versus lower), a change visible equally in the Procrustes fits and in the two-point versions, especially versions 7–8 (row 5 column 4) and 3–5 (row 3 column 3). The idea of examining ratios of parallel distances like these is already present in some much earlier applied treatises, such as Martin (1914).

For an intuitive understanding of what is going on here, turn back to the earliest textbook introduction of the thin-plate spline (Bookstein, 1991), where analyses like these, restricted to just a quadrilateral of landmarks, exemplify what I called “purely inhomogeneous transformations” there, meaning transformations without any uniform component. Figure 7.3.6 of that book displays, within the limits of the software tools of the time, the effect of rotating the starting grid on the graphs of this purely inhomogeneous component (here, the sole nonlinear component) of the deformations of a square that minimize net bending energy, that is, the now-ubiquitous thin-plate spline.

Figure 3 is a modification of that textbook figure intended to clarify the contrast of the different types of salience (length-ratios and rotations) for interesting pairs of segments. Its four columns prototype different types of transformation, each of a starting square of landmarks. In the top row are the starting squares, twice in Cartesian alignment with the page and twice at \(45^\circ \). Below are the corresponding analyses, enhanced by ordinary thin-plate splines that are not actually part of the arithmetical report. (But that spline has nothing to do with the analysis here, which deals only with the landmark positions per se, not any interstitial tissue. The quadratic extension to the interstitial rendering in Figs. 4, 5, 6, 7, 8, 9, 10, and 11 requires a minimum of six landmarks, not just these four; the further extension to a cubic fit in Figs. 17 and 19 requires at least ten.) In column (a) the square is transformed to a rhombus by rotating two of its edges without change in length. What changes is the angle between edges that meet at a corner. In column (b) the same transformation is applied to the square of landmarks at \(45^\circ \) (in other words, the grid has rotated with respect to the landmarks, the configuration of which has not changed in either row). Now the report is reversed: the greatest change is in the ratio of lengths of diagonals, while the angle between them is left invariant at \(90^\circ .\) This was also the case for the configuration in column (a), where it was confounded by the inconvenient orientation of the grid lines.
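The two complementary reports of columns (a) and (b) can be verified numerically. This is a hedged sketch with hypothetical names and an arbitrary tilt of \(20^\circ \); it constructs the rhombus directly rather than reproducing the figure.

```python
import math

def length(p, q):
    """Euclidean distance between points p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_between(p, q, r, s):
    """Unsigned angle in degrees between segment p->q and segment r->s."""
    ux, uy = q[0] - p[0], q[1] - p[1]
    vx, vy = s[0] - r[0], s[1] - r[1]
    return math.degrees(abs(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)))

def tilt_square(theta_deg):
    """Unit square ABCD with edges AD and BC rotated by theta, lengths kept."""
    t = math.radians(theta_deg)
    dx, dy = math.sin(t), math.cos(t)
    return [(0.0, 0.0), (1.0, 0.0), (1.0 + dx, dy), (dx, dy)]

A, B, C, D = tilt_square(20.0)

# Report in terms of edges, as in column (a): lengths fixed, corner angle changed.
edge_ratio = length(A, D) / length(A, B)
corner_angle = angle_between(A, B, A, D)

# Report in terms of diagonals, as in column (b): angle fixed, length ratio changed.
diag_ratio = length(A, C) / length(B, D)
diag_angle = angle_between(A, C, B, D)

print(edge_ratio, corner_angle, diag_ratio, diag_angle)
```

The same four landmark positions thus support two different verbal summaries, depending only on which pair of directions the grid is aligned with.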

The situation in Figs. 1d or 2 corresponds instead to the prototype in columns (c) or (d) of Fig. 3. The starting configuration is still the same square. But in column (c) the transformation changes the ratio of lengths of two edges that are parallel (horizontal in the figure), not perpendicular as in columns (a) or (b). It leaves unchanged the ratio of the other two edge lengths (the other pair of parallels in panel c1) but radically alters their angle. This is a transformation from a square to an isosceles trapezoid. The complementary transformation in column (d), which the geometer would call square-to-kite, leaves the diagonals unchanged in length and in angle while altering the relation between their midpoints. Now it is a different pair of paired edges whose length-ratio has not changed (the top and bottom V’s), and while the angles at the ends of the horizontal diagonal are hardly altered, those at the ends of the vertical diagonal are greatly changed, one increased and the other decreased. To repeat, these reports rely not at all on any curricular GMM technology, neither Procrustes nor thin-plate spline.

The aim of this paper is to push this insight as far as it can go while remaining elementary in its biomathematics. (For instance, its multivariate analysis is limited to the familiar setting of multiple regression.) In particular, every step of the usual Procrustes procedure (centering, rotating, rescaling) proves surprisingly difficult to justify in any context of biotheoretical or biomathematical inference. The scale-standardization step has recently come under criticism as unnecessary and indeed inappropriate, at least for growth studies (see, e.g., the defense of Boas coordinates in Bookstein, 2018, pp. 412–414, or Bookstein, 2021a). This essay argues for a similar rethinking of the centering and rotation, and then the replacement of the thin-plate spline interpolant of the resulting configurations by a different strategy that leaves unexplained residuals at every landmark individually in order to simplify the interpretation of the displayed grid as a whole, the “transformation grid” that has been highlighted as the true underlying topic ever since D’Arcy Thompson’s celebrated exposition of 1917. While the idea of two-point coordinates was originally Galton’s (1907), the idea at which the analysis here is aimed, the quadratic or cubic simplification of a growth gradient or a phylogenetic comparison, is only half as old: it is present in embryo in Peter Sneath’s underrated paper of 1967 on trend-surface analysis of D’Arcy Thompson’s transformation grids.

The core of the argument inheres in any of the next eight figures, which select eight interesting baselines from the 28 in Fig. 2 for expansion of the analysis to include an explicit quadratic regression of the averaged age-150 Cartesian coordinates against the same from the age-7 octagons. These analyses completely ignore the tools of standard GMM: there is no Procrustes centering, no scaling or reorientation beyond the (arbitrary) choice of baseline, and thin-plate splines are drawn only to be dismissed. What results, you will see, is nevertheless a coherent summary of this particular change of neurocranial form. A combined Fig. 12 arrays the eight separate summaries for a synthesis of their information content abstracted in Figs. 13 and 14. Following this exploration, a further analysis of some data from a study of cranial hominization my Vienna group published 20 years ago will consider some extensions of this approach, and a concluding Discussion will reflect on some implications of this seeming irrelevance of today’s conventional GMM toolkit for the explanatory purposes of evolutionary or developmental morphology.

Vilmann 7-to-150-Day Growth Analyzed Without Procrustes GMM

The recommended alternative to the Vilmann growth analysis of Fig. 1 may be narrated by extracting the common findings from a suite of separate analyses, one for each baseline in my selection; some of these baselines are transects of the octagon, while others lie along its circumference. The analyses to be surveyed are laid out in Figs. 4, 5, 6, 7, 8, 9, 10, and 11. Each of these eight composites offers four panels. They serve several different functions, as follows.

Fig. 4

This is the first of eight figures that all have the same four-panel format as applied to one of eight selected baselines from the array of 28 offered in Fig. 2. Top-row panels, left to right: actual change of averaged Cartesian coordinates, with thin-plate spline oriented to selected baseline; ordinary thin-plate spline of the quadratic fit to the age-150 average as regressed on first and second powers of the x- and y-coordinates of the template and also their product xy. Bottom row, left, the quadratic fit (not a spline) as a grid of its own. Solid circles, the observed age-150 average; open circles, predictions from this regression. Bottom right: restriction of the display list of grid vertices to the interior of the age-7 octagon as explained in the text. The baseline of this figure runs from Basion to Opisthion (landmark 1 to landmark 2)

At upper left in each of these eight alternatives is the conventional thin-plate spline, as offered in many software packages: the map that warps the averaged octagon of Cartesian coordinates of the age-7-day animals into the corresponding average at age 150 days while minimizing net bending energy (Bookstein, 1991) and exactly fitting all the landmark positions. (All eight of these panels will later be collected as part of the composite Fig. 12.) While all these are splined deformations of the same pair of landmark configurations, they are drawn to eight different coordinate grid orientations, namely, the eight choices of baseline indicated by the pair of circled points in these upper-left panels. This is a graphical demonstration of one main point of this paper, which corrects an error of D’Arcy Thompson’s, his failure to consider the specification of grid orientation per se. It quickly becomes clear that different grid orientations lead to rather different reports of the same deformation, even though that orientation itself is not any kind of objective biological parameter. (For instance, in one version of Procrustes analysis, the axes applied to a Procrustes mean are those of its own principal components; in the notation of Bookstein (2018, p. 409), this is the condition that all six rows of what is called the J-matrix there are orthonormal, solely for the purpose of simplifying the resulting algebra.) Then if a particular orientation is to be used for some graphical reportage, how is it to be specified or optimized? One can’t say “align the starting grid at \(XX^\circ \) to the orientation returned by the Procrustes software” when that Procrustes orientation itself has no referent upon the organism (after all, it is overwhelmingly determined by the list of landmarks involved in the configuration), that is, when it cannot be referred to anything qualitatively observable except in the exceptional case of a bilaterally symmetric form. As GMM findings need to be communicated in terms of the landmark coordinates that comprise the GMM data base, the easiest way to reference a grid orientation is by reference to some pair of landmarks. Figures 4, 5, 6, 7, 8, 9, 10, and 11 demonstrate this for a selection of eight different landmark pairs out of the 28 possible choices.
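For readers who have not seen the spline written out, here is a minimal sketch of the standard two-dimensional thin-plate spline interpolant with kernel \(U(r)=r^2\log r^2\). It is an illustration under assumptions, not the software used for the figures: the function names and the octagon coordinates are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def tps_kernel(r2):
    """U evaluated on squared distances: r^2 log r^2, with U(0) = 0."""
    out = np.zeros_like(r2)
    nz = r2 > 0
    out[nz] = r2[nz] * np.log(r2[nz])
    return out

def fit_tps(src, dst):
    """Solve for the spline that carries each src landmark exactly to dst.

    Returns (w, a): w is (n, 2) kernel weights, a is (3, 2) affine part
    with rows for the constant, x, and y terms.
    """
    n = len(src)
    d2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((n, 1)), src])
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = tps_kernel(d2)
    L[:n, n:] = P
    L[n:, :n] = P.T
    rhs = np.vstack([dst, np.zeros((3, 2))])
    params = np.linalg.solve(L, rhs)
    return params[:n], params[n:]

def apply_tps(src, w, a, pts):
    """Evaluate the fitted spline at arbitrary points, e.g. grid vertices."""
    pts = np.asarray(pts, dtype=float)
    d2 = ((pts[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    return tps_kernel(d2) @ w + np.hstack([np.ones((len(pts), 1)), pts]) @ a

# Hypothetical octagon and target (NOT the Vilmann averages).
src = np.array([[0, 0], [1, 0], [2, 1], [2, 2],
                [1, 3], [-1, 3], [-2, 2], [-1, 1]], dtype=float)
dst = src * [1.0, 0.8]
dst[2] += [0.0, 0.15]          # one landmark displaced on its own

w, a = fit_tps(src, dst)
refit = apply_tps(src, w, a, src)   # reproduces dst: no residuals at landmarks
```

Because the fit is an exact interpolation, rotating the grid of evaluation points changes only how the same map is drawn, which is the issue of grid orientation raised in the text.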

A second standardization has been applied in these upper panels even though it is explicit only in the borders of the graphics in the lower row: the reference landmark pair circled and named in the figure caption has been placed with one landmark at Cartesian (0, 0) and the other at Cartesian (1, 0), thereby specifying a unit of distance as well. This is, of course, the two-point coordinate system (Bookstein, 1986) already explored in Fig. 2. A specification of this type is needed in order to produce the representations in the upper-right panels of these same eight figures, which are thin-plate splines for a derived geometric mapping, the “growth fit.” Here the x-coordinate of the deformed grid is the predicted x-coordinate from the regression of the baseline-standardized octagon vertices of the 150-day average on both coordinates of the age-7 average, and also their squares and their crossproduct (i.e., a regression of each \(x_{150}\) on \((x_7, y_7, x_7^2, y_7^2, x_7y_7)\)) and likewise the \(y_{150}\)-coordinates. Each of these regressions involves five predictors, plus a constant, for only eight “cases” (the relevant coordinate, x or y, of the eight landmarks), and so has only two degrees of freedom for error; they are not really regressions, but rather almost interpolations, when the landmark count is so small. In these upper-right grids, the local bends visible in the upper-left panel (the bends actually pertinent to the pairing of those averaged landmark configurations) are clearly somewhat weakened, but still bends remain that are difficult to describe in words.
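The regression just described can be sketched in a few lines of NumPy. Everything named here is an assumption made for illustration: the function names, the hypothetical octagon (already in two-point coordinates to a 1–2 baseline), and the synthetic target, which is an exact quadratic map plus one local displacement.

```python
import numpy as np

def quadratic_design(xy):
    """Columns 1, x, y, x^2, y^2, xy for an (n, 2) array of template points."""
    xy = np.asarray(xy, dtype=float)
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack([np.ones(len(xy)), x, y, x**2, y**2, x * y])

def fit_quadratic_trend(template, target):
    """Least-squares coefficients, shape (6, 2): one column per target coordinate."""
    coef, *_ = np.linalg.lstsq(quadratic_design(template), target, rcond=None)
    return coef

def apply_quadratic_trend(coef, xy):
    """Predicted target positions for any template points, e.g. grid vertices."""
    return quadratic_design(xy) @ coef

# Hypothetical template octagon (NOT the Vilmann age-7 average).
template = np.array([[0, 0], [1, 0], [2, 1], [2, 2],
                     [1, 3], [-1, 3], [-2, 2], [-1, 1]], dtype=float)
x7, y7 = template[:, 0], template[:, 1]

# Synthetic target: a known quadratic trend that fixes the baseline ends ...
target = np.column_stack([x7 + 0.10 * x7 * y7,
                          0.80 * y7 + 0.05 * x7 * (x7 - 1.0)])
target[2] += [0.0, 0.15]       # ... plus a displacement local to landmark 3

coef = fit_quadratic_trend(template, target)
fitted = apply_quadratic_trend(coef, template)
residuals = target - fitted                  # generally nonzero at every landmark
dof = len(template) - coef.shape[0]          # 8 cases minus 6 coefficients
print(dof)
```

One caution follows directly from the algebra: if all eight template landmarks happened to lie on a single conic (a regular octagon on its circumscribed circle, for instance), the six columns of the design matrix would be linearly dependent and the fit would not be unique.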

At lower left in all eight of these figures is a more appropriate representation of the quadratic fit than the thin-plate spline (upper right) of the predicted values: the computed transforms of all the grid lines of the starting scene implied (but not drawn at upper left) as the squared grid over the age-7 configuration average, that is, centered and oriented according to the specified baseline pair. We see each regression explicitly now, as the deformed grid of all of its predicted values. (Like the panels above them, all eight of these versions are collected in Fig. 12.) The observed values for the age-150 landmarks in particular are again conveyed by the paired x- and y-coordinates of the filled dots, while the predictions from the age-7 data are the open circles nearby. Because this is a quadratic regression, every grid line is by algebraic necessity a parabola (although some curve so shallowly as to be indistinguishable from straight lines, we shall see). This is the graphical demonstration of the other principal point of the paper: the claim, in agreement with Sneath (1967), that transformation grids can be most effectively reported by intentionally simplified least-squares fits like this one.
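The algebraic necessity can be spelled out as follows, in coefficient names that are assumptions for this sketch only. Write the fitted trend as

\[ \hat{x} = a_0 + a_1x + a_2y + a_3x^2 + a_4y^2 + a_5xy, \qquad \hat{y} = b_0 + b_1x + b_2y + b_3x^2 + b_4y^2 + b_5xy. \]

Along the horizontal template grid line \(y = y_0\), parametrized by \(x = t\), the image is

\[ (\hat{x}, \hat{y})(t) = \bigl(a_0 + a_2y_0 + a_4y_0^2,\; b_0 + b_2y_0 + b_4y_0^2\bigr) + t\,\bigl(a_1 + a_5y_0,\; b_1 + b_5y_0\bigr) + t^2\,(a_3, b_3). \]

Each coordinate is a quadratic polynomial in \(t\), so the image is a parabola, degenerating to (part of) a straight line when \((a_3, b_3)\) vanishes or is parallel to the linear coefficient. In the case \((a_3, b_3) = (0, 0)\) every horizontal grid line maps to a straight line with direction \((a_1 + a_5y_0,\; b_1 + b_5y_0)\), which turns steadily as \(y_0\) varies: a family of lines that remain straight while rotating across the form. The same argument applies to the vertical grid lines with \((a_4, b_4)\) in place of \((a_3, b_3)\).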

Finally, the panel at lower right will restrict this grid to just the interior of the age-7 octagon from the upper-left panel. This is the only portion of the graphic deserving of empirical attention, as the extension of the thin-plate spline to the exterior is algebraically mandated to drop off quickly toward linearity (see Bookstein, 1991), while the corresponding extrapolations of polynomial fits weight the highest-order terms more and more highly and thus have no biological meaning. At the consequently enlarged scale of these restricted diagrams there is more room to show the residuals between the predicted and observed locations of every landmark, including the baseline pair itself. (Recall that there were no such residuals in the upper left panel—the thin-plate spline fits every designated landmark exactly, without error.) The eight examples of this truncated coordinate-grid deformation in Figs. 4, 5, 6, 7, 8, 9, 10, and 11 offer varying degrees of curving from side to side of the octagon along with varying degrees of rotation with respect to their parallels. In the ultimate analysis of this or any pair of landmark configurations, the preferred coordinate grid will be the one for which these grid lines lead to the simplest pattern report. Here, that will be the preponderance of nearly straight stretches within the organ’s outline, however they might rotate across the octagon. I return to this issue, the pursuit of simplicity of reporting, in the first part of my closing Discussion.
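Restricting the display list to the interior of the template octagon requires only a point-in-polygon test. The sketch below uses the familiar even-odd ray-casting rule; the function names, the hypothetical octagon, and the grid spacing are all assumptions for illustration.

```python
def inside_polygon(point, polygon):
    """Even-odd ray-casting test; polygon is a list of (x, y) vertices in order."""
    x, y = point
    inside = False
    n = len(polygon)
    for k in range(n):
        (x1, y1), (x2, y2) = polygon[k], polygon[(k + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:          # crossing lies to the right of the point
                inside = not inside
    return inside

def interior_grid(polygon, step):
    """Vertices of a square grid over the bounding box that fall inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    nx = int((max(xs) - min(xs)) / step) + 1
    ny = int((max(ys) - min(ys)) / step) + 1
    pts = []
    for i in range(nx):
        for j in range(ny):
            p = (min(xs) + i * step, min(ys) + j * step)
            if inside_polygon(p, polygon):
                pts.append(p)
    return pts

# Hypothetical template octagon (NOT the Vilmann age-7 average).
octagon = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (2.0, 2.0),
           (1.0, 3.0), (-1.0, 3.0), (-2.0, 2.0), (-1.0, 1.0)]
vertices = interior_grid(octagon, 0.5)
print(len(vertices))
```

Only these retained vertices would then be passed through the fitted trend for drawing. Points exactly on the boundary are classified by the convention of the ray-casting rule, which is harmless at the resolution of a plotted grid.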

Consider, then, the first figure in this series, Fig. 4, which is the analysis for a baseline from Basion to Opisthion, a short interlandmark segment contained entirely within the posterior pentagonal compartment of Fig. 1d and hence one that might enlighten us as to the rotation archived there. That rotation between anterior landmark triangle and posterior landmark pentagon is essentially the same as what is displayed in Fig. 1 for the conventional GMM approach and in Fig. 2 for the panel corresponding to this baseline (there, the panel in row 1, column 1); indeed it will be the same in all eight of the figures of this series. Here in Fig. 4, for the baseline Basion to Opisthion (axis of the midsagittal foramen magnum), the orientation of the form is rotated about \(130^\circ \) from the Procrustes convention in Fig. 1. The thin-plate spline (upper left panel) is interesting in that inside the posterior five-landmark component, SOS around to Lam, the interior as rendered by the spline appears to be nearly affine (all grid cells the same size and shape) except near IPP, and likewise nearly affine for the anterior component Brg–SES–ISS. The growth fit (upper right panel) apparently has pulled IPP to the left in this diagram. As Fig. 1 shows, and as has been exposited in earlier papers (e.g., Bookstein, 2017), this point participates in a specific focal process displacing it upward in the more realistic anatomical setting of Fig. 1. Thus the fit in this upper right panel of Fig. 4 does not show the deviation of change at IPP from change at its neighbors that is present in the actual data.

Either panel of the lower row shows how closely the fitted landmarks (open circles) track the averaged 150-day locations observed (the solid circles). The horizontal grid lines in the interior of the form (lower right panel) are mainly straight, while their orientation on this diagram is graded from top to bottom more smoothly than one would infer from the analogous diagram at upper left (the thin-plate spline based on the fully detailed data record, which, by design, is not conducive to any lower-dimensional summary). The steady rotation of this imputed grid line direction is complemented by a gentle curvature of the other grid line direction, a curvature that is not so apparent in the explicit thin-plate spline at upper left—the transformation that appears segmented there, one nearly linear system for the posterior pentagon and another for the anterior triangle, is smoothed by the quadratic regression into a continuous gradient from end to end of the template (top to bottom of the grid, in this coordinate system). The quadratic fit is not required to preserve the baseline length at 1.0—there is usually a regression residual at each end.

Figure 5 shows the same analysis for a different baseline, Basion to Interparietal point, from the same posterior pentagon. Again the quadratic fit (lower row) shows a substantial residual, this time at only one of the baseline points (IPP). In the deformed grid, both systems of lines are curved, a feature that makes interpretation more difficult.

Fig. 5

The same as Fig. 4 for a baseline from Basion to Interparietal point, landmark 1 to landmark 3

Figure 6 is the first to involve a cross-component baseline, Basion (from the posterior pentagon) to Bregma (from the anterior triangle). The starting grid has rotated roughly \(80^\circ \) from its position in the first of this series (i.e., the angle between segments Basion–Opisthion and Basion–Bregma in the age-7 average is about \(80^\circ \)). Again the panels in the lower row inform us that the initially vertical grid lines (lines along Lambda–ISS or IPP–SOS) are transformed by the quadratic fit into a pencil of nearly straight lines at varying orientations, while the lines of the originally orthogonal system are gently curved in a manner that will concern us in detail in Fig. 13. At neither of the baseline points is there any substantial fitting error of the quadratic regressions. The approximate uniformity of cell sizes across the trimmed grid at lower right here and in every other figure of this series assures us that the recourse to distances from the centroid in models of centric allometry, such as Bookstein (2021a), is a reasonable default. Indeed the separation between the actual age-150 centroid and the quadratic trend transform of the age-7 centroid is a mere 0.054 units in the scale of this figure.

Fig. 6

The same for a baseline from Basion to Bregma, landmark 1 to landmark 5

Figure 7, to a baseline from IPP to SOS, is very nearly the same analysis as in Fig. 6 inasmuch as the two baselines, IPP–SOS and Bas–Brg, are nearly at \(90^\circ \) in the age-7 template. The main difference is the substantial increase in fitting error, owing to the fact that landmark 3, IPP, is known to be strongly loaded on a special factor not shared with the rest of the configuration. Nevertheless, the grids of the lower row still greatly resemble those of the preceding figure, for the baseline at \(90^\circ \) to this one: lines parallel to IPP–SOS (here, the baseline) remain straight but rotate from end to end, while the orthogonals are gently curved.

Fig. 7

The same for a baseline from Interparietal point to Sphenoöccipital synchondrosis, landmark 3 to landmark 8. Each panel is roughly a \(90^\circ \) rotation of the corresponding panel in Fig. 6, having a baseline at about \(90^\circ \) to this one

Let us move more quickly through the remaining versions of this four-panel scheme. In Fig. 8, baseline Lambda–SOS, both systems of grid lines in either panel of the lower row are gently curved (although the rotation from end to end of the original octagon is as clear as if they had remained straight). The errors of fit at the baseline points are moderate in magnitude, partly because the fit at Lambda is distorted by the need to accommodate the deviation at IPP.

Fig. 8

The same for a baseline from Lambda to Sphenoöccipital synchondrosis, landmark 4 to landmark 8

Figure 9, for a baseline Bregma–SES within the anterior component of Fig. 1, displays gentle curves in both grid systems. Errors of the quadratic fit are again moderate, and the rotation so evident in Fig. 1d is very clear in spite of the curvature of these deformed grid lines. The baseline in Fig. 10, Bregma–ISS, has similar errors of fit and similar curving of the grid lines. Finally, Fig. 11, for an SES–Bas baseline, is roughly the \(90^\circ \) rotation of the analysis in Fig. 5, where the baseline (Bas–IPP) was roughly at \(90^\circ \) to the baseline here.

Fig. 9

The same for a baseline from Bregma to Sphenoëthmoid synchondrosis, landmark 5 to landmark 6

Fig. 10

The same for a baseline from Bregma to Intersphenoidal synchondrosis, landmark 5 to landmark 7

Fig. 11

The same for a baseline from Sphenoëthmoid synchondrosis to Basion, landmark 6 to landmark 1

Figure 12 summarizes all eight of these analyses in a way that permits some criteria of interpretability to emerge regarding replacement of the Procrustes rotation by a protocol more conducive to reportage: a protocol that associates the reorientation of specimens to the ultimate simplification of their deformation report by reference to the specific coordinate lines as deformed from the template’s square grid. We have seen that baseline analyses can sometimes come in pairs if the corresponding interlandmark segments themselves lie at approximately \(90^\circ \) in the template, and it is better if they run close to the centroid of the octagon. More subtly, morphological comparisons that can result in reports of relative rotations of parts of a landmark configuration may be diagrammed best not by a thin-plate spline but by a choice of a specific baseline that highlights the rotation in question, like Figs. 6 or 7 here, by leaving one set of grid lines straight even as they are rotated. (For instance, in Fig. 6, the panel at lower right is more interpretable than the panel at upper left, even though the information content is effectively the same.) The thin-plate renderings in Fig. 12, columns 1 and 3, all confirm the relative rotation detectable already in Fig. 1, but do not otherwise appear to offer much intuitive accessibility. By comparison, the quadratic-fit displays, columns 2 and 4, vary enough in their legibility that some are truly insightful. Those that seem most helpful are the pair of analyses in the second row, to baseline Bas–Brg or IPP–SOS (two directions that happen to be nearly perpendicular); these seem to be considerably better than the standard GMM analysis at showing potentially meaningful gradients for the growth process being visualized here.

Fig. 12
figure 12

Compendium of upper left and lower right panels of Figs. 4, 5, 6, 7, 8, 9, 10, and 11, based on eight of the 28 possible two-point baselines. Clearly some of these choices lead to simpler reports than others do. As the thin-plate spline is covariant with similarity transformations of its target, all the splines here (columns 1 and 3) are the same except for grid orientation and spacing. But the regressions associated with columns 2 and 4 weight different landmarks differently (in particular, weighting the two ends of the baseline not at all), so these grids can vary in more aspects than the baseline orientation per se

The analysis in Fig. 7 suggests a scenario I have highlighted in Fig. 13 by the simple trick of extending the domain of the quadratic fit beyond the bounds of the age-7 landmark configuration undergoing warping, as shown in the upper left panel. Here, the black dots are the age-7 configuration average, while the four open disks are the control points that will drive the bilinear map emerging as the interpretation of this scene as a whole (see Footnote 2). The upper right diagram here extends the earlier gridded transformation merely by evaluating it on the new coordinate domain to the left in the same template coordinate system, i.e. into the empty space some distance above the foramen magnum of these animals, where the horizontal grid lines of Fig. 7 appear to be converging. We see that the near-linearity of the transformation along the baseline and all the grid lines parallel to that direction persists quite far beyond the actual anatomical limits of the comparison. The apparent rotation suggested in Fig. 1d is embedded here in a larger system of reorientations that might be viewed as continuous rather than segmented, or, in a more suggestive language, graded rather than modular.

Fig. 13
figure 13

Graphical extension of the quadratic fit to the IPP–SOS baseline yields a striking reinterpretation of the growth phenomenon. Upper left, square Cartesian grid extended to the left over the baseline-registered template. Filled dots: age-7 average. Open circles: corners of the quadrilateral in the template coordinate system that will be subjected to a bilinear transformation. Upper right, corresponding version of the fitted quadratic trend from Fig. 7. Lower right: simplification of the fitted quadratic trend as a bilinear map sending all proportional transects of parallel sides of the starting quadrilateral (open circles, upper left) to the same proportional transects of the target quadrilateral (open circles in this lower right panel)

Even though they are rotating, this approximate straightness and even spacing of the deformed coordinate grid lines suggests the geometrical interpretation as a bilinear map. Figure 14 will show such an interpretation in a textbook pose; here at the bottom of Fig. 13 I visualize it in situ as a pattern of relatively rotating lines that are not parabolas but absolutely straight in both coordinate systems. Such a map requires two quadrilaterals of control points, as displayed as open circles both at upper left (for the starting square grid) and then for the target form (lower right panel). The match between the second and third panels of the figure pleases me greatly: indeed this particular six-parameter regression can be represented well by the three-parameter version abstracted here. The rotation of Fig. 1d is completely accounted for by the disproportion of lengthening between the calvarial roof and the cranial base, which remain approximately parallel over this growth interval. The appearance of curving in most of the grids in Figs. 4, 5, 6, 7, 8, 9, 10, and 11 is mainly, though not completely, an artifact of an inappropriate choice of coordinate system. Bookstein (2023) goes in great detail into an algebraic index of the adequacy of such a lower-dimensional summary; here it is enough to note that the fit is pretty good. (But ignore that strong impression of some sort of descriptive center at an unphysiological distance outside the actual calva. This is just an algebraic artifact of the angulation of sides of that target quadrilateral; depending on details of position, the point could be near \(-\infty ,\) near \(+\infty ,\) or anywhere in-between but well outside the organism.)

Fig. 14
figure 14

Two alternatives for column (d) of Fig. 3. The map at upper right in Fig. 13 more closely resembles the bilinear map (far right panel) than the projection map (central panel)

Figure 14 sketches two prototypical kinds of transformations of quadrilaterals, of which the preceding analysis is going to match one. Each is an alternative to the thin-plate spline of column (d) in Fig. 3; one will prove more realistic than the other for this paper’s examples. The more familiar map is the projection, central panel, that takes every straight line onto another straight line. But this mapping substantially alters the spacing of the points where these deformed grid lines meet the bounding kite. No such respacing appears in the extended quadratic fit itself, Fig. 13. An alternative better matching that observed quadratic fit is the family prototyped in the right-hand panel of the figure, the bilinear map that I discussed in considerable detail in Bookstein (1985). Bilinear maps (see Footnote 3) take one quadrilateral onto another as follows. Every point \((x,y)\) in the interior of the template quadrilateral is the intersection of two lines connecting opposite edges each of which divides that pair of edges in the same ratio. The map takes \((x,y)\) to the intersection of the two lines that divide the homologous pair of edges in the target in the same pair of ratios. For a warp of square onto kite this bilinear option can be written \((x,y)\to (x,y)+a(1+xy,1+xy)\) for some \(a\).
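The edge-ratio construction just described can be sketched numerically. The following is a minimal illustration, not the code behind Fig. 14: it assumes the template square has corners \((\pm 1,\pm 1)\), and the function names `bilinear_map` and `kite_formula`, the value of `a`, and the random sample points are all hypothetical. It checks that interpolating by matched edge ratios reproduces the square-to-kite formula \((x,y)\to (x,y)+a(1+xy,1+xy)\).

```python
import numpy as np

def bilinear_map(points, target_corners):
    """Map points of the square [-1, 1]^2 onto a target quadrilateral.

    target_corners lists the images of (-1,-1), (1,-1), (-1,1), (1,1),
    in that order (an ordering chosen for this sketch).
    """
    pts = np.asarray(points, dtype=float)
    p00, p10, p01, p11 = np.asarray(target_corners, dtype=float)
    # s is the ratio in which the point's "vertical" line divides the
    # bottom and top edges; t is the same for the left and right edges.
    s = ((pts[:, 0] + 1.0) / 2.0)[:, None]
    t = ((pts[:, 1] + 1.0) / 2.0)[:, None]
    # Intersection of the two lines dividing the target's homologous
    # edges in the same ratios (standard bilinear interpolation).
    return ((1 - s) * (1 - t) * p00 + s * (1 - t) * p10
            + (1 - s) * t * p01 + s * t * p11)

def kite_formula(points, a):
    """The square-to-kite special case (x, y) -> (x, y) + a(1 + xy, 1 + xy)."""
    pts = np.asarray(points, dtype=float)
    shift = a * (1.0 + pts[:, 0] * pts[:, 1])
    return pts + shift[:, None]

a = 0.25  # arbitrary illustrative value
# Images of the four corners under the kite formula: two corners are
# fixed, the other two slide along the diagonal y = x.
kite = [(-1 + 2 * a, -1 + 2 * a), (1, -1), (-1, 1), (1 + 2 * a, 1 + 2 * a)]

rng = np.random.default_rng(0)
sample = rng.uniform(-1, 1, size=(50, 2))  # made-up interior points
max_gap = np.max(np.abs(bilinear_map(sample, kite) - kite_formula(sample, a)))
print(max_gap)
```

For a template that is not a square, the ratios `s` and `t` would first have to be recovered by inverting the bilinear parametrization of the template quadrilateral, a step this sketch omits.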

The projection in Fig. 14 required the upper isosceles triangle of the template to be mapped into the space above the horizontal diagonal of the kite, entailing a considerable compression of its vertical coordinate; the bilinear transformation enforces much less compression here, at the cost of bending that horizontal diagonal over the course of the deformation. This attenuation of the variability of those ratios of area change seems to match the graphics of all the quadratic fits in Figs. 4, 5, 6, 7, 8, 9, 10, and 11 after an appropriate rotation. The projection map sends all straight lines to other straight lines; the bilinear map, in general, only the lines that join matched proportional aliquots from opposite edges. As graphed in Fig. 14, the deformations generate different warps (the dashed line) for the horizontal diameter of the starting diamond shape. The projection (middle panel) takes this curve to a straight line; the bilinear map (right-hand panel), to a parabola engendering a less extreme reduction of the template cells’ areas above this diameter. Owing to the shared symmetry axis of square and kite there is another set of straight lines within the grid in the rightmost panel—the verticals—but this third set is not present in the general case, hence the “bi” of “bilinear,” and so I have not drawn them here.
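The contrast between the two maps along the diamond's horizontal diameter can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the square has corners \((\pm 1,\pm 1)\), so that the "horizontal diameter" of the diamond is the diagonal \(y=-x\); the kite is the image of that square under the bilinear formula with an arbitrary \(a\); and the helper names are hypothetical. The projective map is computed by the standard four-point linear solve.

```python
import numpy as np

def homography(src, dst):
    """3x3 projective matrix sending four src points to four dst points
    (normalized so that the lower-right entry is 1)."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y]); rhs.append(v)
    h = np.linalg.solve(np.array(rows, float), np.array(rhs, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_projective(H, pts):
    pts = np.asarray(pts, float)
    q = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return q[:, :2] / q[:, 2:3]

def apply_bilinear_kite(pts, a):
    pts = np.asarray(pts, float)
    return pts + (a * (1 + pts[:, 0] * pts[:, 1]))[:, None]

def collinearity_defect(p):
    """Twice the area of the triangle on three points; zero if collinear."""
    (x0, y0), (x1, y1), (x2, y2) = p
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

a = 0.25  # arbitrary illustrative value
square = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
kite = [(-1 + 2 * a, -1 + 2 * a), (1, -1), (-1, 1), (1 + 2 * a, 1 + 2 * a)]
H = homography(square, kite)

# Two ends and the midpoint of the diameter joining the two fixed corners.
diagonal = [(-1, 1), (0, 0), (1, -1)]
proj_defect = collinearity_defect(apply_projective(H, diagonal))
bilin_defect = collinearity_defect(apply_bilinear_kite(diagonal, a))
print(proj_defect, bilin_defect)
```

Under these assumptions the projective image of the diameter stays straight, while the bilinear image of its midpoint is displaced to \((a,a)\), which is the parabolic bending described in the text.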

Returning one final time to the scheme in Fig. 1d, the decomposition of the neurocranial octagon into two nonoverlapping components, we see that the figure has indeed oversimplified the situation there. That the rotation of all edges of the posterior pentagon leaps to the viewer’s eye obscures the fact that all but one of these segments have changed their length. And likewise the anterior “triangle,” Brg–SES–ISS, does not rotate rigidly—its edge from SES to ISS shortens and also does not rotate as far as the other two. Any report focusing on the two “components” is deficient in failing to refer to the coordinate space in-between them, where the unconformity between anteroposterior changes of length along the cranial base versus along the calvarial roof seems better captured by the rotating lines of the Bas–Brg baseline and IPP-SOS baseline analyses (Fig. 12, row 2, columns 2 and 4) than by the irregularities of the corresponding thin-plate splines (columns 1 and 3).

An Example from Hominization of the Skull

The Vilmann analysis of the section “Vilmann 7-to-150-Day Growth Analyzed Without Procrustes GMM” exploited the best study design that experimental zoomorphology has to offer: a sample of close-bred animals imaged by identical machinery at a fixed sequence of developmental ages. (The identification of this research design as the summum bonum of laboratory evo–devo research is a century old—it dates from no later than Przibram, 1922.) Most of the data structures to which GMM has been applied are not so elegantly balanced. This paper’s final example is a pair of comparisons, each much more typical in its design, that share one 20-landmark configuration scheme. The data are a selection from the 29 forms analyzed in Chap. 4 of Weber and Bookstein (2011) that originated in computed midsagittal sections of a larger sample of CT scans digitized by Philipp Gunz (then at the University of Vienna) for the growth analysis in Bookstein et al. (2003). That original analysis explicitly relied upon the same GMM toolkit that is most commonly invoked today: Procrustes analysis, principal components of the resulting shape coordinates, and visualizations by thin-plate spline (see Footnote 4).

Of the specimens homologously digitized in 2003, most are Homo sapiens, while four are named specimens of H. neanderthalensis (Atapuerca, Kabwe, Guattari, and Petralona), and two are specimens of Pan, one of each sex. For the present reanalysis I have averaged the 18 adult sapiens (one of which, Mladeč, is an archaic specimen) and, as a separate group, the four neanderthals. As a third “group” (present for a didactic purpose, a comparison of comparisons) I selected the female adult chimpanzee, because the adult male shows even more of the heterochrony that will render my final figure so extreme in certain aspects of its geometry. Of course these samples are far more limited than any data resources that would be brought to bear on the same comparisons today. It would be unreasonable to claim that the computations to be reported presently are valid empirical findings; my purpose is instead to demonstrate a methodological alternative to Procrustes- and spline-based GMM.

The left panel of Fig. 15 names these twenty landmarks at their positions in the average of the H. sapiens sample in the original CT coordinates, which were not far from a Sella-Nasion orientation. In the right panel this configuration is supplemented by the configurations of the same twenty points for the female chimpanzee and also for the neanderthal average, all after the two-point transformation (Bookstein coordinates) that put all three ANS’s (of which two are group averages) at (0, 0) and all three internal Lambda’s at (1, 0). Evidently this coordinate system has been rotated, translated, and scaled from the panel at its left, but none of these steps proceeded by the Procrustes method.
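The two-point registration used here is simple enough to state in a few lines. The sketch below assumes 2D landmarks stored as rows of an array and treats each point as a complex number; the function name `bookstein_coordinates` and the four-point configuration are made up for illustration and are not the data of Fig. 15.

```python
import numpy as np

def bookstein_coordinates(landmarks, i_start, i_end):
    """Two-point registration: send landmark i_start to (0, 0) and
    landmark i_end to (1, 0) by translation, rotation, and scaling."""
    pts = np.asarray(landmarks, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]
    # Dividing by the baseline vector rotates and rescales in one step.
    w = (z - z[i_start]) / (z[i_end] - z[i_start])
    return np.column_stack([w.real, w.imag])

# Hypothetical configuration; rows 0 and 1 play the roles of the two
# baseline landmarks (ANS and LaI in the text's example).
config = np.array([[2.0, 1.0], [5.0, 5.0], [3.0, 4.0], [6.0, 2.0]])
registered = bookstein_coordinates(config, 0, 1)
print(registered)
```

Because the operation is a similarity transformation, ratios of interlandmark distances and all angles are unchanged; only position, orientation, and scale are standardized, and no least-squares superposition is involved.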

Fig. 15
figure 15

Landmark configurations for the hominization example, “An Example from Hominization of the Skull” section. (left) Abbreviated names of the twenty landmarks printed at the raw digitized coordinate averages of the adult sapiens subsample of Bookstein et al. (2003). Alv alveolare, inferior tip of the bony septum between the two maxillary central incisors, ANS anterior nasal spine, top of the spina nasalis anterior, Bas basion, midsagittal point on the anterior margin of the foramen magnum, BrE, BrI external and internal Bregma, outermost and innermost intersections of sagittal and coronal sutures, CaO canalis opticus intersection, intersection point of a chord connecting the two canalis opticus landmarks with the midsagittal plane, CrG crista galli, point at the posterior base of the crista galli, FCe foramen caecum, anterior margin of foramen caecum in the midsagittal plane, FoI fossa incisiva, midsagittal point on the posterior margin of the fossa incisiva, Gla glabella, most anterior point of the frontal in the midsagittal, InE, InI external and internal inion, most prominent projections of the occipital bone in the midsagittal, LaE, LaI external and internal lambda, outermost and innermost intersections of sagittal and lambdoidal sutures, Nas nasion, highest point on the nasal bones in the midsagittal plane, Opi opisthion, midsagittal point on the posterior margin of the foramen magnum, PNS posterior nasal spine, most posterior point of the spina nasalis, Rhi rhinion, lowest point of the internasal suture in the midsagittal plane, Sel sella turcica, top of dorsum sellae, Vmr vomer, sphenobasilar suture in the midsagittal plane. (right) Bookstein coordinates to an ANS–LaI baseline for the averaged adult H. sapiens and H. neanderthalensis samples and the single adult female chimpanzee. Landmarks are tracked by polylines (H. sapiens, H. neanderthalensis, Pan) to clarify their identifications

Consider first the analysis in Fig. 16, which in its design echoes three of the four panels of the Vilmann series, Figs. 4, 5, 6, 7, 8, 9, 10, and 11, but in this case only for one selected baseline, from ANS to LaI, as in the right panel of Fig. 15. (Analysis to a roughly perpendicular baseline, Opi–BrI, results in essentially the same diagrams.) The comparison in Fig. 16 is from the averaged points for H. sapiens in Fig. 15 to the averaged points for neanderthalensis. In both of these Homo averages (and also in the single female adult Pan specimen to come) the baseline crosses the cranial base near Sella roughly halfway along its length. The thin-plate spline deformation from the average of the eighteen humans to the average of the four neanderthals, upper left in the figure, shows the expected contrast of shrinking neurocranium and expanding splanchnocranium, particularly along the palate; the cranial base interposes itself as the so-called “hafting zone.” As the upper-right panel shows, this grid is tracked to some extent by the analogous grid for the fitted values of the same neanderthalensis landmarks from the quadratic regression on the sapiens coordinates. That quadratic regression, already demonstrated many times in the Vilmann example preceding, shows most of its failure of fit (discrepancies between the open circles and their filled neighbors in the lower-left panel) along that central separatrix, with a possible exception at lower right where the pairings of the two inions are rearranged in both separation and orientation. As Fig. 15 hinted, this rearrangement is due mainly to excessive variation at InE, external inion.

Fig. 16
figure 16

Three grid diagrams for the comparison of the averaged H. sapiens and H. neanderthalensis twenty-landmark configurations, to an ANS-LaI baseline. (upper left) Conventional thin-plate spline grid deforming the sapiens average to the neanderthalensis. (upper right) Thin-plate spline rendering of the deformation from the same averaged sapiens to the quadratic regression fits (regressions on first and second powers of the x- and y-coordinates and also their product xy) of the neanderthalensis configuration. (lower left) Explicit grid of that quadratic regression. Solid circles, observed averaged neanderthalensis two-point coordinates; open circles, fitted locations. This is even more bilinear than the growth fit of Vilmann neurocranial octagons in Fig. 7

The final quadratic trend grid, at lower left in Fig. 16, is strikingly different from the thin-plate spline of the same point loci (upper right). Indeed this grid for the fit looks remarkably like a rotation of the grid at right in Fig. 14, the bilinear transformation leaving two specific families of straight lines straight after the deformation, while their orientations rotate across the diagram. At this large scale, the comparison of midsagittal crania of these sister species is largely smooth—the points in the hafting zone differ hardly at all from their predicted locations under the quadratic analysis. In particular, the implication of modularity in the upper right panel is completely effaced in the quadratic fit grid at lower left, indicating instead an approximating spatial process that is homogeneously graded with no natural boundaries embryological or otherwise. The grading is consistent with the observation that relative to the face the neanderthal neurocranium is smaller than that of sapiens with some relative rotation as well.

Figure 17 analyzes the same comparison by a cubic fit instead of the quadratic fit in Fig. 16. (Specifically, this fit models each of the twenty x-coordinates of the H. neanderthalensis average and then each of its 20 y-coordinates as a linear combination of nine terms \(x_{sap},\) \(y_{sap},\) \(x_{sap}^2,\) \(y_{sap}^2,\) \(x_{sap}y_{sap},\) \(x_{sap}^3,\) \( y_{sap}^3,\) \( x_{sap}^2y_{sap},\) and \(x_{sap}y_{sap}^2.\) The quadratic regressions used only the first five of these predictors.) These cubic grids show bizarre behavior outside the limits of their driving data (the strange cusps already clear in Sneath’s examples of 1967), so as in the Vilmann exposition of “Vilmann 7-to-150-Day Growth Analyzed Without Procrustes GMM” section I extended the figure by one more panel, lower right, that trims the grid to just the interior of the region occupied by the actual target configuration (here the H. neanderthalensis average). The straight lines of the rendering in Fig. 16 now appear as S-curves across that same hafting zone, and of course the new fit, a regression on nine predictors, has to be closer than that in Fig. 16 based on only five of the nine. (Note that quadratic maps cannot generate points of inflection such as characterize these S-curves of cubic fits.) But the change of size-ratios between neurocranium and splanchnocranium remains clear, as does the directional extension along the palate and the relative rotations from anterior to posterior and from caudal to cranial.
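The five-term and nine-term regressions can be sketched as ordinary least squares on polynomial predictors. The block below is an illustration only. The landmark arrays are synthetic, the names `design` and `fit_trend` are hypothetical, and the omission of a constant term is an assumption made to match the predictor counts quoted in the text (five and nine per coordinate); a reader who prefers an intercept can append a column of ones.

```python
import numpy as np

def design(xy, degree):
    """Polynomial predictors in the template coordinates.
    degree 2: x, y, x^2, y^2, xy (five terms)
    degree 3: those five plus x^3, y^3, x^2 y, x y^2 (nine terms)
    No constant column is included (an assumption of this sketch)."""
    x, y = xy[:, 0], xy[:, 1]
    cols = [x, y, x**2, y**2, x * y]
    if degree == 3:
        cols += [x**3, y**3, x**2 * y, x * y**2]
    return np.column_stack(cols)

def fit_trend(template, target, degree):
    """Regress both target coordinates on the template predictors.
    Returns coefficients, fitted landmark positions, and the residual
    sum of squares over all landmarks and both coordinates."""
    X = design(template, degree)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    fitted = X @ coef
    return coef, fitted, float(np.sum((target - fitted) ** 2))

# Synthetic stand-ins for two 20-landmark configurations in two-point
# coordinates; these are NOT the sapiens or neanderthalensis averages.
rng = np.random.default_rng(1)
template = rng.uniform(-0.2, 1.2, size=(20, 2))
target = template + 0.1 * template**2 + rng.normal(scale=0.02, size=(20, 2))

coef2, fit2, rss2 = fit_trend(template, target, 2)
coef3, fit3, rss3 = fit_trend(template, target, 3)
print(coef2.shape, coef3.shape, rss2, rss3)
```

Because the quadratic predictors are a subset of the cubic ones, the cubic residual sum of squares can never exceed the quadratic one, which is the sense in which the nine-term fit "has to be closer." A transformation grid would be drawn by evaluating `design(grid_points, degree) @ coef` along the lines of a square grid laid over the template.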

Fig. 17
figure 17

The same for a cubic regression of the neanderthalensis coordinates, nine predictors instead of five. Upper left, upper right, and lower left panels as in Fig. 16. At lower right, an enlarged version of the fitted grid (lower left) as trimmed to the interior of the actual neanderthalensis average

The situation is quite different for the comparison of the H. sapiens average to our more distant relative, the female chimpanzee. The quadratic analysis analogous to Fig. 16 can be found in Fig. 18, but it no longer appears to look entirely like the bilinear map of Fig. 14. Instead we encounter a strong local feature of the transformation, the apparent rotation and flattening of the parietal region, that is seen in both of the thin-plate spline renderings of the top row (at left, for the actual shape coordinates; at right, for the quadratic fit) and likewise in the gridded representation of that quadratic fit at lower left. Strikingly, the residuals of this analysis seem no greater than those of the comparison of the sapiens sample with the neanderthals, Fig. 16, yet the flattening of the splines is clearly detected by this quadratic fit as well, which has so many fewer coefficients (and also a matrix inversion step of much lower rank, \(5\times 5\) instead of \(23\times 23\)). The bidirectional linearity of the lower left panel in Fig. 16 has certainly ceased to apply globally, while the hafting zone here seems still to be no sort of natural boundary between multiple modules. The deformation remains smoothly graded except locally, in the parietal region.
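The \(23\times 23\) figure presumably counts the 20 landmarks plus the three affine terms of the standard thin-plate spline system; that reading is an assumption of this note, as is the use of the common kernel \(U(r)=r^2\log r^2\). The sketch below, with a hypothetical function name and random placeholder landmarks, only assembles that bordered matrix to make the size comparison concrete.

```python
import numpy as np

def tps_system_matrix(landmarks):
    """Bordered matrix [[K, P], [P^T, 0]] of the 2D thin-plate spline,
    with K_ij = U(|p_i - p_j|), U(r) = r^2 log r^2, and P = [1, x, y]."""
    pts = np.asarray(landmarks, dtype=float)
    k = len(pts)
    d2 = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=-1)
    K = np.zeros_like(d2)
    nz = d2 > 0  # U(0) is taken as 0 on the diagonal
    K[nz] = d2[nz] * np.log(d2[nz])
    P = np.column_stack([np.ones(k), pts])
    L = np.zeros((k + 3, k + 3))
    L[:k, :k] = K
    L[:k, k:] = P
    L[k:, :k] = P.T
    return L

rng = np.random.default_rng(2)
landmarks = rng.uniform(0, 1, size=(20, 2))  # placeholder configuration
L = tps_system_matrix(landmarks)
print(L.shape)
```

By contrast, the quadratic trend requires only the \(5\times 5\) cross-product matrix of its five predictors, whatever the number of landmarks.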

Fig. 18
figure 18

The same as Fig. 16 for the comparison of the averaged H. sapiens to the single female chimpanzee in the data base of Bookstein et al. (2003)

Yet when we switch the algorithm from the quadratic (five-term) fit to the cubic (nine-term) fit, Fig. 19, nothing essential changes in the analysis as a result of these additional four degrees of freedom per coordinate. The thin-plate spline of the fitted points (upper right panel) is not much altered from that in the previous figure except in that same nonconforming parietal region, and while the cubic fit here leads to pathologies of the extrapolated grid at every corner of the original scheme (lower left panel), its restriction to the interior of the actual anatomy, lower right in the figure, shows grid lines that, ignoring their curvature, are actually well-aligned with those of the lower left panel in Fig. 16, the comparison from sapiens to neanderthalensis. We have thereby confirmed graphically that the shape difference in the parietal region is indeed local. Put this another way: in this warp analysis of the relation between the average H. sapiens 20-point configuration and the 20 points of the single female Pan, the 10-parameter quadratic fit (Fig. 18) and the 18-parameter cubic fit (Fig. 19) convey the same message: a relatively continuous gradient of deformation right across the hafting zone. And they agree, too, that the situation at the parietal (landmarks Opi through LaE) is not coherent with this large-scale gradient. From Bregma forward, the lower right panels in Figs. 17 and 19 differ mainly in the intensity of rotation of these gridline segments; but posterior to that arbitrary boundary the parietal landmarks participate in a reorganization that is incommensurate between the two comparisons.

Fig. 19
figure 19

The same as Fig. 18 for the comparison of H. sapiens to the female Pan using the cubic tools. The grid at lower left, for the cubic fit, is correctly drawn even though it looks like a whale

Thus we see again that, just as in the Vilmann growth example, an approach that eschews all of the standard Procrustes steps and also the usual thin-plate spline is capable of generating a better understanding of a morphological phenomenon, in this case a somewhat more complicated one, by polynomial fit instead of thin-plate interpolation.

Discussion

The Main Concern of GMM Ought to be the Transformation Grid Per Se

This focus was already clear from the earliest formal appearance of the concept in D’Arcy Thompson’s On Growth and Form (Thompson, 1917), where the review literature usually begins (even though portrait artists like Albrecht Dürer had thought about this much earlier). The endpoint of the method ought not to be statistical but instead graphical, and the derived report should be geometrical, not statistical, en route to an ultimately biophysical or otherwise morphodynamics-informed endpoint. Thompson (1961, p. 275) puts it this way: “The deformation of a complicated figure may be a phenomenon easy of comprehension, though the figure itself have to be left unanalyzed and undefined.” The main dilemmas in this tradition were already well-critiqued over the first six decades of its development as I reviewed them in Chap. 5 of Bookstein (1978). No matter how clearly defined the positions of individual landmark points might be, there was no complementary rhetoric for reporting meaningful features of the transformation grid that expressed comparisons of their configurations over meaningful biological contrasts. Of the classic expositions of this problem, the best remains that of Sneath (1967), a paper that struggled, ultimately unsuccessfully, to bring the algebra of landmark analysis (in that pre-spline era) into alignment with the reasoning of numerical taxonomy. Yet D’Arcy Thompson would have been delighted with the grid in Fig. 13, while presentations of the same information in Procrustes style, Fig. 1a, or spline-style, panels (a) of Figs. 4 through 11, would have been of no use to him at all. A more contemporary and quite distinct tradition of transformation studies approaches the problem via a calculus of diffeomorphisms (see, for example, Grenander & Miller, 2007), which makes no essential reference to landmarks at all, instead basing its computations on the full field of image contents, gray-scale or even colored, spanning the organ(s) of interest. The approach seems particularly helpful in neurological applications to imagery of the human brain. This contrasting method, however, is beyond the scope of my Procrustes critique here.

The analysis in Figs. 7 or 13 suggests renewing Thompson’s original concern in this domain, the interpretation of grids per se, via injecting a new theme into the discussion, an anatomical basis for orienting the starting grid on the template, that more intensively exploits the interaction between deformation graphics and the investigator’s prior awareness of how non-Cartesian coordinate systems themselves can vary in their visually dominant features. The biomathematics ought to begin, then, with a confluence of two insights: one, that some morphological domains might be amenable to some kind of functionally interpretable large-scale pattern analysis, and the other, an intuition about the geometrical language by which the pattern of interest might be quantified. For Henning Vilmann, this translation began with the knowledge that growth of rodent neurocrania is a plausible domain for morphometric exploration and that its midsagittal aspect bears enough information about growth and function to be worthy of geometrization not only in his own measurements of extent, nor the numerous intermediate multivariate investigations of this same data set (including several of my own), but also in the novelties of the section “Vilmann 7-to-150-Day Growth Analyzed Without Procrustes GMM.” But given these two axioms, an applied study would culminate in an exploration not of alternative statistics but of alternative graphics: a survey not of diverse linear combinations but of diverse grid renderings. Information about absolute scale change, where relevant (as in biomechanical aspects of interpretation), can be embedded in any of these grid figures by a simple magnification over the course of printing, or can be inscribed upon interlandmark segments or the line-elements of a transformation grid by aligned text. In this context of large-scale comparison, rotation is a tool of rendering clarification, not a nuisance variable of digitizing.

The quadratic regressions in Figs. 4 through 11 all used the same list of five predictors \(x, y, x^2, y^2, xy\). This consistency lets the renderings here, unlike the approach in the lower row of Fig. 1, preserve the uniform component of the transformation grid, where we can see how it interacts with these gradients of large but finite scale. But the directions corresponding to those two axes x and y vary from baseline to baseline, and the baseline points are not privileged by the regressions. Consequently the coordinates pinned by the two-point registration are not quite pinned by the regression—they are permitted to shift to some extent from solid to fitted circles in the grid figures here.

The resulting dataflow sheds new light on what we mean by “the best rotation” when, as in both of this paper’s examples, different parts of an organ appear to rotate relative to one another over a comparison of interest. The role of the multiple two-point registrations that this paper recommends as a substitute for the Procrustes algorithm is not itself a “finding” of any sort but merely a convenience, a simple way of regularizing the landmarks’ Cartesian coordinates in order that a selection of reasonable polynomial trends can be fitted, each in a reasonably equably weighted way. Their advantage is that unlike the case for the Procrustes method, there is more than one of them. The Procrustes approach optimizes a quantity (sums of squares of landmark shifts) that is irrelevant to the ultimate purpose of an evolutionary or developmental GMM analysis, which is not a minimized sum of squares or a singular-value decomposition or a classification but rather a plausible biological hypothesis for the observed form-differences, their causes, or their consequences for the organism.

Then the logic of the inference engine we need is not the operationalized Procrustes arithmetic itself, the least-squares fit to what is almost always a completely wrong model (the null model, a pure similarity transformation). Instead we need the logic of E. T. Jaynes’s approach to numerical inference (e.g., Jaynes, 2003): the explicit acknowledgement of what we do not know—what is missing from the list of data-driven constraints on some quantitative empirical inference. (I have recently reviewed this logic in the rather different context of paleoseismology, which is the history of great earthquakes—see Bookstein, 2021b.) What is missing from a Procrustes analysis is, among other things, the acknowledgement that choice of an orientation constraint affects the resulting report: what we seek is the orientation that will best clarify the final published diagram. Furthermore, regardless of this issue of orientation, in every GMM context we already know there is no “correct” registration, because there is no “correct” list of landmarks—in the presence of any regional rotation or rescaling, different lists of landmarks or semilandmarks lead to different Procrustes registrations, and the empirical report of a shape comparison must accommodate that specific form of ignorance. That is the whole purpose of the grids—to free our attention from the landmark data per se to the space in-between, which is where biological processes actually take place.

The particular protocol dictating the selection of orientations to be considered may be irrelevant to the quantitative morphological inference under study. (Recall that in this paper the two points fixed in the baseline registration are not fixed by the fitted trend—the registration is not an inferential component of the grid report at all.) Orientation may be specified as any interlandmark segment from the available pairings, or any homologous boundary alignment, or even a specific force vector such as a muscle load or gravitational vertical—or possibly all of these. Whatever the choices of orientation, the investigator of a global deformation is led to the approach here, which is the selection of at least one satisfactory such orientation as judged by the ultimate diagram at the end of the workflow. In 3D, one could proceed via an assortment of what might be called “baseplanes” by analogy with baselines—large landmark triangles passing near the centroid, similarly searching for clarity and redundancy. But in other contexts that issue of orientation may be quite relevant to the interpretation. The examples here have all dealt with global trends, but Figs. 18 and 19 hinted at a need for a deformation tool suitable for local features as well. Such a tool would likewise entail a rotation of the Cartesian coordinate system prior to grid computation, but in general a different one—see, for example, the model of the crease in Bookstein (2000) or Bookstein (2014, Fig. 7.19).

The choice of a specific Cartesian coordinate system for reporting a fitted polynomial grid, as explored in Figs. 4 through 11 and later at Figs. 16 through 19, combines two different quantifiable aspects of biometric reporting: simplicity of the deformed grid, and magnitude (and distribution) of the residuals at the landmarks of the target configuration. This paper, agreeing with D’Arcy Thompson, privileges the first of these purposes over the second, and it is appropriate to pause here to explain why. It is helpful to borrow a concept from psychometrics, the notion of a universe of items of which the measures of any psychological test, however long or short, constitute a finite subsample. The purpose of psychometric factor analysis is to explore the features of that universe by careful generalizations based in the available subsample of items. GMM offers a clear analogy here in the notion of a universe of possible landmark locations and an even larger universe of their arbitrary combination into finite configurations—lists of matching points across a sample of specimens, any count, any distribution of positions, where “matching” could refer to any operational characterization of the points by geometry or image content. Any specific configuration constitutes a finite, usually arbitrary sample of one single template geometry from this immense universe of possibilities. (And there are also other ways of approaching morphometrics than by the combinatorics of finite lists like these: on curving semilandmarks see, for instance, Srivastava and Klassen (2016), an approach that might prove better than GMM for applications in botany and perhaps in studies of planktonic foraminifera as well.) Thompson operationalized something like this when he arbitrarily drew his curves on line drawings without much respecting any individual pairing of points. Sneath dealt with it more carefully with his notion of “h-points,” homologous points whose precision was uncertain, but likewise he paid no attention to the uncertainty of those points per se.

In a context like this, with no grounded theory of the actual geometry of a GMM data set (however often we are well-informed about the specimen sampling aspects), any such notion as “accuracy” has to be very carefully formalized. (The notion of a formal “theory of data” is real; I have borrowed the phrase from the title of yet another psychometric resource, Coombs, 1964.) There is a branch of statistics that does so, the subdiscipline usually called “spatial statistics,” from which it is helpful to draw two concepts. One is the praxis of kriging, which is the prediction of quantitative properties anywhere on a map from its measured properties at any finite point sample. (As it happens, the conventional thin-plate spline is in fact an example of kriging: Kent & Mardia, 1994.) The other is the notion of a nugget effect, which is the component of variance of predictions like those that derives from the irreducible measurement error of the original data. (The literature of factor analysis has a similar formalism, the “unique variance” of any measurement that correlates with nothing else in the universe of alternative data.) A good reference on this general domain is Kent and Mardia (2022), and the ideas were already injected into the literature of biomedical image analysis by Mardia et al. (2006). The goal of figures like Figs. 7 or 19 here is analogous to the goal of any other biometric regression analysis: to accurately estimate underlying meaningful quantities, which means not to reproduce the actual data but to appropriately visualize the interpretable pattern driving it with due regard for possible errors exogenous to the theory under consideration. Back in geometric morphometrics, it is no more a disadvantage for there to be discrepancies between the filled dots and the open circles in Fig. 4 than for a regression line or curve to pass near but not exactly through a measured data point or for a highly intelligent teenager to miss one or two easy questions on the Scholastic Aptitude Test.
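The contrast between reproducing the data and approximating them can be made concrete with a minimal sketch of a two-dimensional thin-plate spline. Everything named here (`tps_kernel`, `fit_tps`, `apply_tps`, and the `nugget` parameter) is a hypothetical illustration rather than code from this paper or from any GMM package. The sketch assumes the usual kernel \(r^2 \log r\) and adds a constant to the diagonal of the kernel matrix, which plays a role loosely analogous to a nugget term: with `nugget=0` the map passes exactly through the target landmarks, while any positive value lets it pass near them instead.

```python
import numpy as np

def tps_kernel(r):
    """Thin-plate spline kernel U(r) = r^2 log r, with U(0) = 0."""
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def fit_tps(template, target, nugget=0.0):
    """Solve for spline weights and affine part (hypothetical helper).

    template, target: (k, 2) arrays of landmark coordinates.
    nugget: assumed diagonal term; 0 gives exact interpolation.
    """
    k = template.shape[0]
    r = np.linalg.norm(template[:, None, :] - template[None, :, :], axis=2)
    K = tps_kernel(r) + nugget * np.eye(k)
    P = np.column_stack([np.ones(k), template])
    L = np.zeros((k + 3, k + 3))
    L[:k, :k] = K
    L[:k, k:] = P
    L[k:, :k] = P.T
    rhs = np.vstack([target, np.zeros((3, 2))])
    sol = np.linalg.solve(L, rhs)
    return sol[:k], sol[k:]  # kernel weights (k, 2), affine coefficients (3, 2)

def apply_tps(template, weights, affine, points):
    """Evaluate the fitted map at arbitrary points of shape (n, 2)."""
    r = np.linalg.norm(points[:, None, :] - template[None, :, :], axis=2)
    P = np.column_stack([np.ones(len(points)), points])
    return tps_kernel(r) @ weights + P @ affine
```

Evaluating `apply_tps` at the template landmarks themselves returns the target configuration when `nugget=0` and a smoothed version of it otherwise, which is the sense in which the discrepancies between fitted and observed positions are a modeling choice rather than a defect.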

The psychometric problem most akin to this paper’s search for a good Cartesian coordinate rotation is the problem that that field actually refers to by the same word: the problem of factor rotation, nicely reviewed for biologists by Reyment and Jöreskog (1991). In Bookstein (2017) I tried to import one of the fundamental maneuvers of this field, varimax rotation, into GMM using these same Vilmann octagons as a demonstration data set. A varimax rotation attempts to maximize the variance of a set of correlations of items with factors—to send as many as possible of these correlations to \(\pm 1\). The equivalent for this paper is the attempt to minimize the curvature of the deformed grid lines, which ultimately derive from the coefficients of those fitted quadratic or cubic polynomials. Such an explicit optimization could well be the subject of research on its own.
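As a hedged illustration of what such an explicit optimization might look like, the sketch below fits a quadratic polynomial map by least squares and then scans candidate grid orientations for the one whose lines are bent least. The bending measure is an assumption made for this example only: it is the length of the second derivative of the fitted map along a line in direction \(\theta\), which for a quadratic map is the same everywhere in the plane. It is a crude proxy for curvature, not the criterion used for the figures in this paper, and the function names are hypothetical.

```python
import numpy as np

def quad_design(xy):
    """Design matrix with columns 1, x, y, x^2, xy, y^2."""
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def fit_quadratic(template, target):
    """Least-squares quadratic map from template to target; returns (6, 2)."""
    coef, *_ = np.linalg.lstsq(quad_design(template), target, rcond=None)
    return coef

def line_bending(coef, theta):
    """Length of the second derivative of the map along direction theta.

    The full quadratic fit does not depend on the choice of axes; only the
    family of grid lines that gets drawn changes with theta.
    """
    u = np.array([np.cos(theta), np.sin(theta)])
    bend = []
    for k in range(2):  # one Hessian per output coordinate
        H = np.array([[2 * coef[3, k], coef[4, k]],
                      [coef[4, k], 2 * coef[5, k]]])
        bend.append(u @ H @ u)
    return float(np.hypot(bend[0], bend[1]))

def straightest_direction(coef, n_angles=180):
    """Scan orientations in [0, pi) and return (angle, bending) at the minimum."""
    thetas = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    bends = np.array([line_bending(coef, t) for t in thetas])
    return float(thetas[np.argmin(bends)]), float(bends.min())
```

A grid search is used here instead of a closed-form solution so that the measure can be swapped for another one (for instance, one that weights the two families of grid lines differently) without changing the rest of the code.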

When GMM is being used as the source of hypotheses for subsequent biological exploration, rather than just a praxis for classification or facial recognition, its role is rather like a factor analysis, or a neural-net processor in some machine learning lab: a principled guess at some underlying cause of all the observable effects. Back in GMM, two well-known vicissitudes of the thin-plate spline, its tendency to extrapolate disproportions of small displacements interior to a configuration while relaxing toward linearity outside the convex hull of the configuration, count as bugs, not features—see the discussion in Bookstein (2021a). Correction of these two pervasive difficulties is one role of the polynomial grids explored in this article (for instance, with closely-spaced landmarks, the regressions attenuate adjacent discrepancies rather than disseminating them outward). It is not the error sum of squares of these regression residuals that matters for subsequent work in evo–devo, but their role in generating fertile new hypotheses that can follow from the simplified appearance of the corresponding grids. What leads us to prefer cubic (Fig. 19) to quadratic (Fig. 18) grids in the hominization example is not the net residual sum of squares of the polynomial fit, corrected for degrees of freedom, but instead the cogency of the observation that one fits the parietal bone so much better than the other.

The issue would be the same for interpretations in the language of modularity: not a matter of sums of squares pursuant to prior hypotheses, but a concern for the generation of new, better hypotheses that, e.g., search for evidence for the reality of boundaries between purported modules under various experimental designs. Because the interior of any non-nested module is at the same time a part of the exterior of every other module, one sees from the hominization example that the morphometric aspect of “modularity,” whatever its exact morphogenetic definition, is a matter not of landmark coordinates but of what happens to coordinate grid lines, especially in the vicinity of what would be claimed to be intermodule boundaries. Figures 17 and 19 confirm that, within the limits of these data resources (adult forms only, no growth series, a mere 20 landmarks), there is no graphical evidence for the cranial base as a separatrix between braincase and face, in spite of their obvious differences in function, but strong evidence for a separation of the whole anterior two-thirds of this landmark scheme from the five parietal landmarks, Opi through LaI and LaE, that so clearly seize control of the lower-right corner of the grids for either the quadratic fit (Fig. 18 lower left) or the cubic fit (Fig. 19 lower right) to the comparison across genera. While the empirical import of this second data example is obsolete, owing to advances in the accrual of samples of all these taxa, the practice whereby consideration of the transformation grids per se might shape inferences from landmark data about morphogenetic control processes ought to be transferred from the current GMM toolkit to these more integrated investigative tools along the lines of the examples here.

We Need to Broaden the Range of Ideas We Borrow from Geometry

A combination of two branches of geometry led us to the bilinear interpretation in Fig. 13 of the grid in Fig. 7, but this other toolkit is not among those currently being taught to biomathematicians. The kernel \(r^2~\log ~r\) of the thin-plate spline doesn’t much resemble the biological processes we are trying to understand, but the algebra of polynomial fits (here, mainly the specific appearance of bilinear maps leaving both pencils of coordinate lines straight and evenly spaced after deformation even as they rotate) does pick up much of the classic appearance of growth-gradients as laid out for analysis from Thompson on. More important than the extension of the idea of a coordinate system, though, is an extension of the domain of morphometric data to include empirical entities other than landmark points. The description of the grid in Fig. 13 makes no essential mention of any of the landmarks—the simple exegesis here (bilinear reorganization of that particular family of grid lines while remaining lines) pertains much more to the interior of this octagon (the directions of those transects across it, or, if you will, the pairing of points across the left and right sides of the outline in this orientation) than to any of its boundary delineation detail, even though that boundary is the sole data source for the example. Thus at root the finding exemplifies a language of intraorganismal matching, the pairing of points along a shared curve bounding some anatomical entity in section. Pairings like these are not like landmarks in any formal aspect.
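The straight-line property of bilinear maps invoked above can be verified in two lines. The coefficient names \(a_i, b_i\) are generic symbols assumed for this illustration, not notation taken from the figures. Write the map as

\[
(x,y)\mapsto \bigl(a_0+a_1x+a_2y+a_3xy,\;\; b_0+b_1x+b_2y+b_3xy\bigr).
\]

Along the grid line \(y=y_0\) this becomes

\[
x\mapsto \bigl(a_0+a_2y_0,\; b_0+b_2y_0\bigr) + x\,\bigl(a_1+a_3y_0,\; b_1+b_3y_0\bigr),
\]

which is linear in \(x\): the image is a straight line, and equally spaced points on the original line remain equally spaced. The direction vector \((a_1+a_3y_0,\, b_1+b_3y_0)\) depends on \(y_0\), so the images of different grid lines are in general rotated with respect to one another unless \(a_3=b_3=0\), in which case the map is affine. The same argument with the roles of \(x\) and \(y\) exchanged applies to the lines \(x=x_0\).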

So even though this paper’s first example began from a playful GMM-derived diagram, Fig. 1d, it ends up formalized in the rhetoric of a spatial extension (Fig. 13) unknown to GMM but accessible to any reader of Thompson’s chapter, as interpreted in Fig. 14 via a similar-looking figure suitable for some college-level geometry text. This logical sequence can be reversed: beginning from those same textbooks, to try finding biological examples that illustrate them. We are used to polar coordinates, for example (most recently in the study of centric allometry, Bookstein, 2021a), but what about bipolar coordinates or confocal coordinates (Bookstein, 1981, 1985) and other schemes that (literally) co-ordinate position with respect to two origins or two axial systems at the same time? The range of coordinate systems is vastly broader than the Cartesian on which today’s GMM automatically relies. My biorthogonal grids (Bookstein, 1978) already went beyond this possibility, though not in a statistically feasible way, via their formalism of one-axis and three-axis singularities corresponding to the “lemon” and “star” umbilics that are the topic of advanced treatises such as Koenderink (1990). From the earliest years of the twentieth century the mathematics of geometry has permitted us to talk about coordinates of many different extended structures: not just points, but lines, planes, circles, and many other formalisms. See, at first, (Hilbert & Cohn-Vossen, 1931/1952), and then, among the more contemporary surveys, (Porteous, 2001) or (Glaeser, 2012).

Thus the word “geometric” in the phrase “geometric morphometrics” needs to have its meaning broadened beyond the current focus on the multivariate-centered aspects of GMM or indeed any version based on analysis of landmark points as logically separate data elements. “Procrustes distance” between specimens, when computed as a minimizing sum of squared Cartesian coordinate differences, is just a theory-free proxy for the far more subtle and multifarious concept the biologist knows as the opposite of “similarity,” and today’s GMM treats Procrustes shape coordinates as just a list of Cartesian pairs (or triples) in their own coordinate space of position, without reference to any explicit features for describing how their interrelationships (e.g. the interlandmark segments of Fig. 1) actually change across a comparison of configurations. D’Arcy Thompson got this right all the way back in 1917: “This process of comparison,” he wrote (Thompson, 1961, p. 271), “recognizing in one form a definite permutation or deformation of another, apart altogether from a precise and adequate understanding of the original ‘type’ or standard of comparison, lies within the immediate province of mathematics.”

That geometry of “recognizing deformation” is not limited to the geometry of points referred individually to Cartesian axes. Thompson himself referred explicitly to the appearance of the deformed grid lines in his drawings. For the comparison to Mola, for instance, he wrote, “I have deformed [Diodon’s] vertical coordinates into a system of concentric circles, and its horizontal coordinates into a system of curves which, approximately and provisionally, are made to resemble a system of hyperbolas” (Thompson, 1961, p. 300). It is the configuration of these curves, not the landmarks on them, that is the bridge from arithmetic to understanding. In other words, the elementary language of deformation, the language by which we report morphological comparisons as deformations, must be based in a glossary of multiple elementary types of deformable image components, not disarticulated landmarks. The roster of these is broad indeed, including, among other options, the changes of point-pairs to other point-pairs at a different distance or direction that we already saw in Fig. 1, but also changes of triangles to other triangles, squares to any quadrilateral whether rectangle, parallelogram, trapezoid, or some other form, displacement of interior points with respect to an unchanging boundary, circles to ellipses, ellipses to any other simple closed curve, straight lines to other straight lines, lines to any other open curve, line-elements having an orientation in the small as well as a location (for a spline cognizant of this structure, see Bookstein & Green, 1993), or nearby pairs of parallel curves to any bent ribbon tracing the sequence of changes all along their shared length. All of these have appeared in biometric examples; to each corresponds a different geometric grammar for its reporting. For instance (in another acknowledgement of our sister discipline of neuromorphometrics), line elements per se summarize image data for the method known as diffusion tensor analysis that traces and summarizes patterns of wiring in the human brain.

As I hope you have already come to suspect from the figures in this paper, the thin-plate spline is not designed to be of any particular help in this matter. Its functional form is mainly a sum of terms \(r^2 \log r,\) where r is the distance from each grid point to each landmark of the template in turn, and so it has no machinery for collecting references to two or more landmarks at the same time, but must revert to the nonbiological symmetries of linear multivariate statistics for this purpose (so that the partial warps, for instance, are just a \((2k-4)-\)dimensional rotation of a configuration’s Cartesian coordinates however they were arrived at to that point, while the relative warps are just a different \((2k-4)-\)dimensional rotation of the same coordinates). No, the elements of a quantitative morphometric comparison in terms of deformation must be the whole coordinate systems of our deformation diagrams, and the features we extract must be features that refer to those deformed lines and areas, whether end to end or truncated to the vicinity of specific landmark subsets. Any geometric report qualified to drive a programme like Thompson’s aimed at simple descriptions of relationships among individually complicated specimens must begin with more complicated elementary entities than positions of discrete landmark points. A search for such explananda, beginning from the paired interlandmark segments in Fig. 1, leads immediately to the elementary aspects of this paper’s two examples, which make no reference to the formula \(r^2 \log r\) nor indeed any quantification beyond the squaring or cubing of coordinates and products of those powers that allows us to parameterize families of nearly parallel curves that began as parallel lines.
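For readers who want the two functional forms side by side, they can be written out as follows. The symbols \(w_i\), \(a_0, a_x, a_y\), \(c_{mn}\), and \(P_i\) are assumed here for illustration. Each output coordinate of a thin-plate spline fitted to template landmarks \(P_1,\ldots,P_k\) has the form

\[
f(x,y)=a_0+a_x x+a_y y+\sum_{i=1}^{k} w_i\,U(r_i),\qquad U(r)=r^2\log r,\qquad r_i=\lVert (x,y)-P_i\rVert ,
\]

so that every nonlinear term refers to exactly one landmark. Each output coordinate of a cubic polynomial fit is instead

\[
g(x,y)=\sum_{0\le m+n\le 3} c_{mn}\,x^m y^n ,
\]

in which no term refers to any individual landmark at all: the landmarks enter only through the least-squares estimation of the ten coefficients \(c_{mn}\).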

The Implications of a Diminished Role for the Existing Core of Geometric Morphometrics in Quantitative Morphology are Liberating

Via a new toolbox that intentionally discards Procrustes centering, Procrustes scaling, and Procrustes orientation, and that downplays the role of thin-plate splines (the whole core of today’s GMM as it has been conventionally black-boxed for routine use), we may be able to better achieve GMM’s principal declared purpose, the quantitative understanding of morphological variation and its causes or effects, by recourse to more diverse geometrical formalisms, some ancient and some relatively novel. This methodological possibility has several implications, some for actual analysis of morphologies and others for the methodological component of graduate curricula in the evo–devo sciences. The aspects of geometry that today’s GMM is accustomed to borrowing for its tools concentrate much too heavily on matrix algebra and linear multivariate analysis. As Peter Sneath suspected so long ago in his paper on trend-surface analysis, there are other geometric entities, such as those here dealing with quadratic and cubic bivariate polynomials, that speak more clearly to the investigator’s visual instincts, especially as regards phenomena of orientation. (Examine, for instance, panel 1d of Sneath (1967), which shows a relative rotation between face and braincase in the comparison of Homo to Pan similar to the one in Fig. 18 here, without, however, the optimization of coordinates that the “An Example from Hominization of the Skull” section exploited.) And far more objects can be assigned coordinates than discrete points (or semilandmarks) alone: grid lines, for instance, deserve coordinates of their own (Figs. 4 through 11) and also interlandmark segments (Fig. 1).
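Giving interlandmark segments a coordinate of their own can be as simple as recording each segment’s change of direction between two configurations, in the spirit of Fig. 1. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the two configurations are assumed to be already registered in whatever way the analyst has chosen (the angles are relative to that registration), and the default threshold of 0.15 radians is taken from the caption of Fig. 1.

```python
import math
from itertools import combinations

def segment_rotations(template, target):
    """Signed rotation (radians) of every interlandmark segment.

    template, target: sequences of (x, y) pairs in matching landmark order.
    Returns a dict mapping landmark index pairs (i, j), i < j, to the angle
    in (-pi, pi] from the template segment to the target segment.
    """
    rotations = {}
    for i, j in combinations(range(len(template)), 2):
        ux, uy = template[j][0] - template[i][0], template[j][1] - template[i][1]
        vx, vy = target[j][0] - target[i][0], target[j][1] - target[i][1]
        # atan2 of cross and dot products gives the signed angle between segments
        rotations[(i, j)] = math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
    return rotations

def rotating_segments(template, target, threshold=0.15):
    """Keep only the segments that rotate by at least the threshold either way."""
    return {pair: angle
            for pair, angle in segment_rotations(template, target).items()
            if abs(angle) >= threshold}
```

For an octagon of eight landmarks `segment_rotations` returns all 28 pairings, and `rotating_segments` extracts the subnetwork whose relative rotations exceed the chosen cutoff.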

Similarly, the way GMM relies on thin-plate splines for its published renderings exaggerates their importance for organismal biology. The spline is an interpolating map, whereas, in view of how colossally arbitrary our landmark lists actually are, biological interpretation is usually deeper and better when it goes via approximating maps instead. The actual role of interpolating splines in the research cycle, then, might be shifted well earlier, all the way back to before the final rendering style is chosen, in order to supply guidance about which geometrical languages should be exploited for the most effective dissemination. At that early stage, interpolating splines are good aids to the search for component processes that are primarily local, but are poor at the analogous global reports, which, as Sneath already knew in 1967, do better with polynomial analyses. Both possibilities might appear in a report, the way Fig. 4 ff. show both the thin-plate spline, which reveals the local change at IPP, and the quadratic grid, which summarizes the overall change of form so much better (in both contexts ignoring the Procrustes side of GMM in favor of the different optimization of orientation recommended here).

The finding in Fig. 1d should not have been new to this paper. In the many previous GMM investigations of the Vilmann data there should long since have been mention of rotations of subanatomies, a rhetoric that has been suppressed, perhaps unintentionally, by virtue of our current traditions of overly symmetric data summaries like Procrustes distance, principal component analysis and interpolating splines. It is time for the morphological side of biomathematics to return to its roots in biological geometry sensu lato (what might the organism’s function space “know” about its own form?) in order to rebuild the interplay between data and explanation using a much broader range of geometric formalisms than just “points” (or their “modules”) and “deformations.” The method of cubic regression, Figs. 17 and 19, is likewise not new; I copied it straight from Sneath (1967). The particularly careless way the Procrustes method dismisses orientation as just a nuisance variable has blinded our field to the possibility that relative intraspecimen orientations can be just as informative a channel of insight and explanation as relative extents (proportions). To restore and then extend this symmetry we need to abandon the standard Procrustes tool in favor of explorations that explicitly consider multiple orientations at the same time, just as studies of allometry have been considering multiple size measures since at least Blackith and Reyment (1971). More generally, to understand transformation grids we must extend our understanding of the sort of entities that can have coordinates from points to more extended structures. Only then can we trust our diagrams to provide straightforward practical summaries of the “blooming, buzzing confusion” (W. James) that is the spectrum of Darwinian phenomena we call evo–devo.