MetaPhenomics: quantifying the many ways plants respond to their abiotic environment, using light intensity as an example

Thousands of scientific papers have described how plants respond to different levels of a given environmental factor, for a wide variety of physiological processes and morphological, anatomical or chemical characteristics. There is a clear need to summarize this information in a structured and comparable way through meta-analysis. This paper describes how to use relative trait responses from many independent experiments to create generalized dose-response curves. By applying the same methodology to a wide range of plant traits, varying from the molecular to the whole-plant level, we can achieve an unprecedented view of the many ways in which plants are affected by and acclimate to their environment. We illustrate this approach, which we refer to as ‘MetaPhenomics’, with a variety of previously published and unpublished dose-response curves of the effect of light intensity on 25 plant traits. Furthermore, we discuss the need for, and the difficulties of, expanding this approach to the transcriptomics and metabolomics level, and show how the generalized dose-response curves can be used to improve simulation models as well as the communication between modelers and experimental plant biologists.

Words to this effect are popular starting sentences in scientific papers (e.g. Queitsch et al. 2000; Zhang and Friml 2020). To fully oversee its consequences, this general plant characteristic has to be coupled to another essential aspect, in which plants and animals also differ. Where body size of animals of a given age is often only marginally dependent on the external environment, variation is far more pronounced for plants: depending on environmental conditions, plant size can vary tremendously (Tardieu et al. 2017). In controlled experiments, the variation in biomass among equally-aged plants of different treatments may well be 3- to 10-fold, and sometimes more than 30-fold (Pons and Poorter 2014). In nature, over 100-fold differences in biomass can occur for even-aged plants, depending on site conditions (Portsmuth et al. 2005 vs. Ovington 1957; Lu et al. 2017 and Forrester et al. 2017). Although less variable than plant size, strong plasticity is also found for a diverse range of traits related to plant morphology, chemistry and physiology (Valladares and Niinemets 2008). To a certain extent these differences are merely physiological consequences of the environmental conditions: if light levels are low, the photosynthetic rates are necessarily also low. However, plants can also actively (re)program their development to acclimate to different levels of an environmental variable, by adjusting traits in a way that improves their performance under specific conditions as compared to when they had not reprogrammed themselves (Nicotra et al. 2010).
Analyzing the responses of plants to the range of environmental factors they experience is one of the main fields of focus of plant ecophysiology (Lambers and Oliveira 2019). An often-used experimental approach is to challenge seedlings or saplings for a period of time with two or more levels of a specific abiotic factor, such as light, water or nutrients. Subsequent plant measurements can have a focus on variables related to morphology and allocation, such as leaf size or thickness, chemical traits such as nitrogen or phosphorus concentration, physiological traits such as photosynthesis and transpiration, or variables describing growth and development, such as biomass or flowering time (Perez-Harguindeguy et al. 2016; Freschet et al. 2021). Over the last 30 years this approach has been extended by analyzing specific cellular messengers such as hormone or mRNA levels, and broad profiling of the transcriptome, proteome and metabolome (Sahoo et al. 2020).
Hundreds to thousands of such experimental studies on the environmental effects on plant growth and trait acclimation appear each year in the scientific literature, for a wide range of different species. The challenge for the scientific community is how to fruitfully handle and incorporate this enormous source of scientific data. Textbooks such as Lambers and Oliveira (2019) and Nobel (2020) or narrative reviews can help to structure this information to some extent. However, they will necessarily remain the author's personal impression of a field that gets more and more difficult to oversee, due to its breadth and the ever-increasing body of data. In this paper, we discuss how meta-analysis can help to digest this vast amount of information in a structured way. First, we focus on the need for generalized dose-response curves and explore some of the advantages and limitations of an approach we refer to as 'MetaPhenomics'. Second, we illustrate this methodology with 13 updated and 12 previously unpublished dose-response curves, focusing on the effects of light intensity on plants. We then go a step beyond and ask to what extent interaction between two or more environmental factors can be quantified. Finally, we discuss some possible options to expand this approach to the fields of molecular sciences and show how dose-response curves could be advantageously used for improving crop or ecosystem modeling.

Meta-analyses of plant responses to the environment
The need for generalization
Meta-analyses are quantitative analyses of a range of primary studies (Harrer et al. 2021). They were initially developed in the medical field, to evaluate results of various clinical trials. The integrative power of the meta-analytical approach has subsequently led to wide application in other biological disciplines (Hedges et al. 1999). Meta-analyses in the botanical field sometimes target the environmental response of one specific species (e.g. Ainsworth et al. 2002), but are generally broader: they often focus on a range of species with common characteristics (crop species, conifers; e.g. Kimball 2016) or plants investigated with a specific methodology (such as CO2 enrichment with FACE technology; Ainsworth and Long 2021). In the broadest sense, they may even target responses of hundreds of species (Liang et al. 2020). In almost all of these compilations, conditions among experiments will be variable: physiologists preferably study treatments in growth chambers where all other conditions are controlled, horticultural scientists predominantly use glasshouses that mimic horticultural practice, whereas agronomists and ecologists rely mostly on field studies. All of these scientists grow plants under a specific set of environmental conditions, yet try to unravel principles that are hopefully applicable to plants of more species, grown under a wider range of conditions. Experiments that have been carried out at a range of background conditions will likely allow for more general conclusions (Richter et al. 2009), and this applies even more strongly to meta-analyses in which a range of experiments is combined (Harrer et al. 2021).
An important requirement in science is to discuss results in relation to 'what is known already'. Citation of papers that confirm the results presented in a given paper helps to achieve a sense of generality. With the myriad of published papers, it is often not difficult to find one or more publications where similar results have been observed. If this happens not to be the case, then simple 'explanations' can be suggested for observed discrepancies: e.g., other experiments were done with another species, at a different growth stage, or in a different growth environment. However, it is not easy to reach firmer ground without a more systematic approach. Meta-analysis can help to judge how general an observed difference between two treatments, for example low and high light, is. At the same time, it makes it possible to test whether phylogeny (e.g. species from different families) or functional type of species (e.g. C3 vs. C4 plants) are relevant factors explaining variation in response among the range of compiled experiments.
There is another source of variation among experiments that is often not taken into account when comparing data. Using the example of light again, two experiments may show different phenotypic responses to light intensity (e.g. a strong positive effect vs. no effect, Fig. 1a). Where plant biologists often study the performance of a species like Arabidopsis thaliana at relatively low light intensities (say, 100 and 200 μmol m−2 s−1), agronomists may prefer to compare light effects on a given crop species at much higher light levels (say, 600 and 1200 μmol m−2 s−1), because that bears more relevance to field conditions. Therefore, it could well be that differential results between these two experiments for a given phenotypic trait are only found because the effects of light were studied at different and non-overlapping ranges of an overall non-linear dose-response curve (Fig. 1b). Consequently, it would be very helpful if meta-analyses focusing on the effects of a specific environmental factor on plants would include the actual quantitative levels of the environmental factor of interest. Not only that, rather than asking whether two specific levels of a given environmental factor have differential effects on the plant phenotype, it would be far more instructive to derive dose-response curves from these data, as they bear information over a wide range of levels and are therefore more informative to analyze and compare plant responses. The derivation of generalized dose-response curves by means of meta-analysis is the main focus of this paper.

Fig. 1 Hypothetical example of how contrasting results from two experiments (colour-coded orange and green) can be interpreted. Differences in the response of a given phenotypic trait Y to, for example, a low and high light intensity could be (a) due to a contrasting species or different growth facilities, or (b) due to rather different light levels used across experiments, with the two species actually following exactly the same dose-response curve
Compiling and scaling data
For the MetaPhenomics database we compile environmental and phenotypic data, mainly from published experiments, where plants were exposed (for most of their life) to different levels of a specific environmental factor. The choice of the measure for characterization of the environmental factor requires careful consideration. On the one hand, this measure should be sufficiently relevant and precise to adequately capture the plant's responses. On the other hand, it should not require far more detail than what is usually described in the literature, as this would result in the exclusion of too many experiments from the meta-analysis, making the results less broadly applicable (Harrer et al. 2021). Therefore, this choice is a balancing act between precision on the one hand and generality on the other. For example, in the case of light, 'photosynthetic photon flux density' (PPFD; μmol m−2 s−1) would seem a logical first choice, as it is widely used in the plant biology literature. However, this is problematic, as glasshouse and field experiments are carried out at PPFDs that fluctuate continuously within and among days. In many of those experiments authors report 'PPFD measured at 12 o'clock under clear sky' or 'percentage of full light measured under an overcast sky' to characterize light levels. However, such characterizations are not well-defined and not comparable across experiments, as the maximum light intensity varies with season, latitude, shade from surrounding trees or buildings and, in the case of glasshouses, with roof transparency. Moreover, both measures ignore the frequency of sunny and cloudy days and the duration of the light period. An alternative measure to characterize light intensity, which is applicable across all experimental platforms, is the Daily Light Integral (DLI, mol m−2 d−1), the flux of quanta integrated over the day and averaged over the experimental period. This measure has the additional advantage that many of the longer-term morphological and plant growth responses are known to be better correlated with DLI per se than with photon flux density at any moment in time or with the duration of the light period (Poorter and Van der Werf 1998; Kjaer and Ottosen 2011; Niinemets and Keenan 2012). However, using the average DLI over the experimental period will unavoidably miss out on details such as variability that may occur within the day between temporarily low-light and high-light periods, or similar variation among cloudy and sunny days during the experimental period (Wayne and Bazzaz 1993; Matsubara 2018). Another source of error, more specific to growth chambers, is that there is often moderate variation in light intensity depending on the horizontal position of a plant, but strong vertical variation within the growth chamber (Poorter et al. 2012a, 2012b). With some researchers measuring light intensity at pot level, others at plant height, and with most of these values determined only once during the experiment, those DLI values are also approximations of the actual light levels received by the plants.

As much as for the environmental characterization, there is uncertainty and/or variability in the determination of phenotypic traits. Part of this is random variation, due to well-known biological variability. Part is systematic, and may relate to the development of phenotypic traits with age or size, or to systematic differences among measurement procedures (Quentin et al. 2015). Additional difficulties are that experiments are often carried out with different species and under dissimilar environmental backgrounds, such as pot size, watering frequency and duration of the experiment. These all preclude direct absolute comparisons of data among experiments.
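As an illustration of the unit conversion behind the Daily Light Integral, the following minimal Python sketch converts PPFD values (μmol m−2 s−1) into DLI (mol m−2 d−1). The function names, and the assumption of either a constant PPFD during the light period or readings logged at a fixed interval, are illustrative choices and not part of the MetaPhenomics procedure.

```python
def dli_from_constant_ppfd(ppfd_umol_m2_s, photoperiod_h):
    """DLI (mol m-2 d-1) for a constant PPFD (umol m-2 s-1) during the light period."""
    seconds_of_light = photoperiod_h * 3600.0
    return ppfd_umol_m2_s * seconds_of_light / 1e6  # convert umol to mol


def dli_from_readings(ppfd_readings_umol_m2_s, interval_s):
    """DLI for one day from PPFD readings logged at a fixed interval (seconds).

    Assumption: each reading represents the whole interval it was logged for.
    """
    return sum(ppfd_readings_umol_m2_s) * interval_s / 1e6


def mean_dli(daily_dlis):
    """Average the daily integrals over the experimental period."""
    return sum(daily_dlis) / len(daily_dlis)


if __name__ == "__main__":
    # A growth chamber at 200 umol m-2 s-1 with a 16 h light period
    print(dli_from_constant_ppfd(200.0, 16.0))
```

Under these assumptions, a growth chamber running at 200 μmol m−2 s−1 for 16 h per day delivers 200 × 57,600 / 10^6 = 11.52 mol m−2 d−1.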
However, it is feasible to compare relative responses among experiments, by normalizing all phenotypic data within each experiment to the trait value observed at a predefined level of a given environmental factor (Poorter et al. 2010; see Box 1 for a summary of the methodological steps followed). The advantage of using a scaling approach is that it is very flexible, and works even for experiments with only two or three levels of a given environmental factor. Poorter et al. (2019), for example, normalized phenotypic data to a reference DLI level of 8 mol m−2 d−1. If an experiment contains this level as one of the treatments, the calculations are straightforward (see Fig. 2a, b, orange lines). If two DLI levels are applied where one level is below and the other above that predefined value, normalization can be achieved after interpolation (Fig. 2a, b; black lines). By applying this normalization, variation across species and experiments can largely be partialled out (Fig. 2c).

Fig. 2 … In cases where >100 data are available, points will not be grouped per 10, but per decile. The ratio of the fitted phenotypic values at a DLI of 50 and 1 (these points on the curve are indicated by black open squares) is called the plasticity index (PI) and has a value of 10.8 in this case. In case of negative trends, the ratio is inverted and multiplied by −1, to maintain the same size of scaling while clearly indicating the negative direction of response

Box 1
1. Compile data.
- Each trait entry Y in the database is the average value over one or more measurement days.
- Data are only considered for plants that had ample time to acclimate to their environment, which we define as being at least 2 weeks under those conditions and achieving preferably >80% of their biomass during the experimental treatment.
- Physiological measurements are considered for the vegetative and early flowering phase, vegetative biomass at the end of the growth experiment, generative traits at the end of the generative phase.
2. Double-check data for possible mistakes in numbers or units.
- Trait data are checked against the normal ranges (5th-95th percentiles) of the data already in the database.
3. Scaling data for each species in each experiment.
- For every trait and each species (or genotype) within a given experiment, calculate by means of interpolation what estimated value Y_R they have at the predefined reference level X_R of the environmental factor of interest.
- Scale all observed data for that species and experiment by dividing their trait values Y by Y_R.
- To constrain the weight of an individual experiment in the overall compilation, consider a maximum of 10 species and 3 genotypes per experiment. If selection is necessary, choose species in a way that maximises phylogenetic or ecotypic diversity.
4. Scale the trait data from experiments where the range of levels X for the environmental factor of interest did not contain X_R.
- Fit all scaled Y vs X data as calculated in point 3 by a smoothed regression.
- For each of the traits of species and experiments that did not include X_R, take the treatment level X_C which is closest to X_R, and consider what the average Y′-value is as given by the smoothed regression.
- Scale all other Y data in that experiment with respect to the Y-value at X_C and multiply all with the Y′-value.
- After the previous step, remove the (X_C, Y_C) data point of each experiment that did not contain X_R in its environmental range from the data, as it does not contain independent information anymore after the scaling.
5. Establish unsmoothed dose-response curves and normal ranges.
- Order all data points by their X-value and divide them in 10 decile groups.
- Calculate median values for X and Y for each decile group.
- 10th, 25th, 75th and 90th percentiles for Y in each decile group indicate the normal ranges to be expected.
6. Fit dose-response curves.
- Fit each of the following equations to the scaled data:
• No relationship: Y = a
• A linear relationship: Y = a + bX
• A saturating (monomolecular) curve: Y = a(1 − b·e^(−cX)), where parameter a reflects the asymptotic value, a and b co-determine the trait value at X = 0, and b and c co-determine how quickly saturation is reached.
• A quadratic equation with or without a local minimum or maximum: Y = a + bX + cX^2
- Test the most appropriate of these 4 equations by means of the Akaike Information Criterion.
7. Calculate Plasticity index (PI).
- Take the ratio of Y_H and Y_L, for a predetermined X_H and X_L, as calculated from the dose-response curve selected in step 6. For light intensity we chose X_H and X_L to be 50 and 1 mol m−2 d−1. If Y_H is smaller than Y_L, then calculate the inverse and multiply by −1, to indicate negative responses with increasing X.
8. Calculate the Consistency index (CI).
- For every species × experiment combination, subtract the phenotypic value observed at the lowest level of the environmental variable from the phenotypic value observed at the highest level of the environmental variable.
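The within-experiment scaling to a reference level can be sketched as follows. This is a minimal illustration only: it assumes linear interpolation between the two treatment levels that bracket the reference level (the text says only that interpolation is used), and the function names and example numbers are hypothetical.

```python
def interpolate_reference(levels, values, x_ref):
    """Estimate the trait value at x_ref from the bracketing treatment levels.

    Assumption: linear interpolation between neighbouring levels.
    """
    pairs = sorted(zip(levels, values))
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if x0 <= x_ref <= x1:
            if x1 == x0:  # replicated level that equals the reference
                return (y0 + y1) / 2.0
            fraction = (x_ref - x0) / (x1 - x0)
            return y0 + fraction * (y1 - y0)
    raise ValueError("reference level lies outside the range of this experiment")


def scale_experiment(levels, values, x_ref=8.0):
    """Return (level, scaled value) pairs, with the value at x_ref scaled to 1."""
    y_ref = interpolate_reference(levels, values, x_ref)
    return [(x, y / y_ref) for x, y in zip(levels, values)]


if __name__ == "__main__":
    # Hypothetical experiment with DLI treatments of 4 and 16 mol m-2 d-1
    print(scale_experiment([4.0, 16.0], [2.0, 5.0], x_ref=8.0))
```

Experiments whose range does not contain the reference level raise an error here; in the procedure of Box 1 they are linked to the other data in a separate step (point 4), which this sketch does not cover.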
Establishing dose-response curves
Having computed relative responses for a given trait in each species × experiment combination of the data compiled, the next step is to describe the relationship with the environmental factor of interest mathematically, by establishing the appropriate dose-response curve. This can be done by fitting one of a variety of functions. Of the four options we use, the null model is that the trait of interest Y is not affected by the level of the environmental factor X at all. The second is a linear relationship. Another frequently-occurring relationship is a saturating curve, which approaches a maximum or minimum. For this we use a monomolecular function with three parameters (France and Thornley 1984; Box 1). More rarely, dose-responses will show a quadratic relationship, with or without a local optimum or minimum. These curves are characterized by a 2nd-degree polynomial. Out of these four, the best-fitting curve is selected statistically. Based on the data and the selected equation, three descriptors of the established dose-response curve can be calculated:
• Plasticity Index (PI). This is the ratio of trait values at a predefined high and low value of the environmental factor of interest. In case of a ratio less than 1, the inverse is taken and multiplied by −1, to clarify that the relationship is negative while keeping plasticity values in the same range (>1). The advantage of the plasticity index is that the extent of plasticity for a wide range of dose-response curves for different traits or species groups can easily be compared.
• Consistency Index (CI). This value indicates in what percentage of the species × experiment combinations the plants exposed to the highest level of the environmental factor of interest have a higher value for the trait of interest than those exposed to the lowest level. Values close to 0% or 100% indicate a high consistency across experiments, whereas a value close to 50% indicates a highly variable response. The consistency index is particularly informative when discriminating traits that change marginally but do so in a very consistent manner from those that change marginally and variably.
• Reliability Index (RI). Based on the number of observations per trait, the number of species on which the observations are based, the range of the environmental factor of interest over which traits are present in the database, and the inverse of the variability around the fitted dose-response curve, the reliability of the dose-response curve is quantified on a scale from 1 to 10. The reliability index can be used to judge how much a dose-response curve could change when data of new experiments are included in the database.
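The first two descriptors are simple enough to express in a few lines. The sketch below follows the definitions given in the text and in Box 1 (including the rule that cases with a difference of exactly zero count for half); the function names and example values are illustrative. The Reliability Index is omitted because the text does not give its formula.

```python
def plasticity_index(y_high, y_low):
    """Ratio of fitted trait values at the high and low reference levels.

    Ratios below 1 are inverted and multiplied by -1, so that |PI| >= 1 and
    the sign indicates the direction of the response.
    """
    ratio = y_high / y_low
    return ratio if ratio >= 1.0 else -1.0 / ratio


def consistency_index(differences):
    """Percentage of species x experiment combinations with a positive response.

    `differences` holds, per combination, the trait value at the highest level
    minus the value at the lowest level; exact zeros count for half.
    """
    n_positive = sum(1 for d in differences if d > 0)
    n_zero = sum(1 for d in differences if d == 0)
    return 100.0 * (n_positive + 0.5 * n_zero) / len(differences)


if __name__ == "__main__":
    print(plasticity_index(10.8, 1.0))               # positive trend
    print(plasticity_index(1.0, 1.25))               # negative trend
    print(consistency_index([3.0, -1.0, 0.0, 2.0]))  # hypothetical differences
```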
The different steps to arrive at dose-response curves and their descriptors are described in more detail in Box 1.
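The selection among the four candidate equations can be sketched with NumPy as follows. Several choices here are assumptions made for illustration and are not taken from the paper: the least-squares form of the Akaike Information Criterion, AIC = n·ln(RSS/n) + 2k; fitting the saturating curve by scanning a grid of c values and solving for the remaining two parameters by linear least squares; and all function names.

```python
import numpy as np


def aic(rss, n, k):
    """Least-squares AIC (assumed form); the floor avoids log(0) for perfect fits."""
    return n * np.log(max(rss, 1e-12) / n) + 2 * k


def fit_saturating(x, y, c_grid):
    """Fit Y = a(1 - b*exp(-cX)) = a - (a*b)*exp(-cX) by scanning c.

    For a fixed c the model is linear in a and a*b, so each grid point
    only needs an ordinary least-squares solve.
    """
    best = None
    for c in c_grid:
        design = np.column_stack([np.ones_like(x), -np.exp(-c * x)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((design @ coef - y) ** 2))
        if best is None or rss < best[0]:
            best = (rss, coef[0], coef[1], c)
    return best  # (rss, a, a*b, c)


def select_curve(x, y, c_grid=None):
    """Return the name of the lowest-AIC model and a function predicting Y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if c_grid is None:
        c_grid = np.logspace(-3, 1, 200)  # hypothetical search range for c

    mean_y = y.mean()
    candidates = {
        "none": (aic(np.sum((y - mean_y) ** 2), n, 1),
                 lambda q, m=mean_y: np.full_like(np.asarray(q, dtype=float), m)),
    }
    for name, degree in (("linear", 1), ("quadratic", 2)):
        coef = np.polyfit(x, y, degree)
        rss = np.sum((np.polyval(coef, x) - y) ** 2)
        candidates[name] = (aic(rss, n, degree + 1),
                            lambda q, p=coef: np.polyval(p, np.asarray(q, dtype=float)))
    rss, a, ab, c = fit_saturating(x, y, c_grid)
    candidates["saturating"] = (
        aic(rss, n, 3),
        lambda q, a=a, ab=ab, c=c: a - ab * np.exp(-c * np.asarray(q, dtype=float)),
    )
    best_name = min(candidates, key=lambda key: candidates[key][0])
    return best_name, candidates[best_name][1]
```

The returned prediction function can be evaluated at X = 50 and X = 1 to obtain the two fitted values whose ratio defines the plasticity index. The grid scan keeps the sketch free of a nonlinear optimizer; a dedicated nonlinear least-squares routine would be a natural replacement.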

Data distribution
The MetaPhenomics approach is flexible and can accommodate information from both small- and large-scale experiments, carried out over both narrow and wide ranges of values for environmental factors of interest. But to what extent is such information available from the literature? Taking the example of light again, most experiments in growth chambers, or glasshouses outside the summer season, will achieve DLI levels that are at best intermediate as compared to those prevailing in the field during the growing season. Some experiments specifically focus on low-light acclimation (e.g. Bloor and Grubb 2003), or plant responses at high-light levels (e.g. Pendleton et al. 1967). Overall, the range of experimentally-applied light levels is wide, but the distribution is clearly skewed, with less information at high DLI levels (Fig. 3a). In the case of atmospheric CO2 experiments, the distribution of [CO2] applied is different. Most experiments so far have focused on the effect of future CO2 concentrations, using ambient CO2 as a control, and twice-ambient as a treatment. Consequently, there are clear peaks in the number of experiments carried out between 350 and 400 μmol mol−1 and between 700 and 800 μmol mol−1, with far less information outside these regions (Fig. 3b). Note that these distributions vary among traits, implying that dose-response curves for less-frequently measured traits may only be derived over a more limited range.

Box 1 (continued)
- Determine the % of cases in which the difference is positive, and add to that half of the % of cases in which the difference is exactly 0.
9. Evaluate differences in the dose-response curves between species groups.
- Carry out repeated bootstrapping for observations of each group of interest and calculate for each iteration the PI. Statistical evaluation can be obtained by evaluating the distribution of the calculated PI values for the different species groups.
10. More details.
- More specific details on test procedures can be found in the supplement of Poorter et al. (2022).

Plant Soil (2022) 476:421-454

The non-uniform distribution of experimental data for the environmental factor of concern has two consequences. Firstly, the reference value of the environmental factor that is chosen to determine the trait value applied for scaling within each experiment should preferably encompass as many experiments as possible. The best choice in the case of DLI is a value of around 8 mol m−2 d−1, as this yields a maximum of 81% of the cases where interpolation is possible (Fig. 3c). In the case of CO2, where almost all experiments use ambient values as 'control' and 1.5x or 2x that value as 'treatment', any value between 400 and 550 μmol mol−1 will imply that almost 100% of the experiments are amenable to scaling (Fig. 3d). Secondly, information at the outer ends of the curves is generally scarce, but highly relevant for establishing the dose-response curve over a wide range. Some experiments focus only on various low-light or high-light levels and do not contain 8 mol m−2 d−1. We therefore developed a procedure to link those data sets to all other scaled data, be it with a loss in the degrees of freedom (see point 4 in Box 1). Although this helps to add some additional data at the outer ends of the curves, data for these 'extreme' conditions remain limited. In the case of DLI, we were able, for most traits, to construct dose-response curves over a 50-fold range.

Although it intuitively makes sense to choose the level of the environmental factor used for scaling the trait values such that it is common to many experiments, it is still relevant to know how sensitive the resulting dose-response curve is to the reference level chosen. We therefore calculated the plasticity index (PI) of the observed dose-response curve, using Leaf Mass per Area (LMA) as an example, for which we took the ratio between the fitted LMA values at a DLI of 50 and 1 mol m−2 d−1, or a [CO2] of 1200 and 200 μmol mol−1. We did so for a wide range of reference values for the environmental factor of interest. As expected, the choice of an extreme level that is hardly contained in any experiment may yield a somewhat deviating estimate. However, over a wide range of values for DLI and CO2, the resulting Plasticity Index is stable, as illustrated in Fig. 3e, f.
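The reasoning behind the choice of reference level, namely picking the value that lies within the treatment range of as many experiments as possible, can be sketched as below. The experiment ranges in the example are hypothetical and the function names are illustrative.

```python
def interpolation_coverage(experiment_ranges, candidate_references):
    """Percentage of experiments whose (lowest, highest) level brackets each candidate."""
    n = len(experiment_ranges)
    return {
        ref: 100.0 * sum(1 for low, high in experiment_ranges if low <= ref <= high) / n
        for ref in candidate_references
    }


def best_reference(experiment_ranges, candidate_references):
    """Return the candidate with the highest coverage, plus all coverages."""
    coverage = interpolation_coverage(experiment_ranges, candidate_references)
    return max(coverage, key=coverage.get), coverage


if __name__ == "__main__":
    # Hypothetical (lowest, highest) DLI per experiment, in mol m-2 d-1
    ranges = [(2, 10), (5, 20), (8, 40), (1, 6)]
    print(best_reference(ranges, [4, 8, 30]))
```

Repeating the full scaling and curve fitting for each candidate reference value, and comparing the resulting plasticity indices, would correspond to the sensitivity check described in the text.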

Further analyses
As a first approximation, we assume that the data in the compilation follow a universal trend that is valid for all plant species and can be captured with one dose-response curve. This may often be sufficient. However, it cannot be excluded that different species groups have different dose-response curves. For example, photosynthetic responses to [CO2] are generally different for C3 and C4 species, and this may have consequences for many more traits. Similarly, species from low- and high-light environments or cold and warm habitats may have different optima for various traits. Fitting one dose-response curve through all those data could then easily lead to oversimplification. Therefore, it is good to take the additional step of checking whether species from different functional or phylogenetic groups show dose-response curves that deviate from the main trend, or have different plasticity (see point 9 in Box 1).
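A bootstrap comparison of plasticity between two species groups could look like the sketch below. To stay short it estimates the PI from an ordinary least-squares line rather than from the AIC-selected curve, and it summarizes each bootstrap distribution with a percentile interval; both are simplifying assumptions, as are the function names and the default reference levels of 50 and 1.

```python
import random


def linear_pi(points, x_high=50.0, x_low=1.0):
    """PI from a least-squares line through (X, scaled Y) points.

    The straight line is a stand-in for the selected dose-response curve.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = 0.0 if sxx == 0 else sum(
        (x - mean_x) * (y - mean_y) for x, y in points) / sxx
    intercept = mean_y - slope * mean_x
    y_high = intercept + slope * x_high
    y_low = intercept + slope * x_low
    if y_high <= 0 or y_low <= 0:
        raise ValueError("fitted trait values must be positive to form a ratio")
    ratio = y_high / y_low
    return ratio if ratio >= 1.0 else -1.0 / ratio


def bootstrap_pi(points, n_iter=1000, seed=0):
    """Resample observations with replacement and recompute the PI each time."""
    rng = random.Random(seed)
    return [linear_pi([rng.choice(points) for _ in points]) for _ in range(n_iter)]


def percentile_interval(values, lower=2.5, upper=97.5):
    """Simple percentile interval of a bootstrap distribution."""
    ordered = sorted(values)

    def pick(percent):
        return ordered[min(len(ordered) - 1, int(percent / 100.0 * len(ordered)))]

    return pick(lower), pick(upper)
```

If the percentile intervals of two groups do not overlap, that is one simple indication that their plasticity differs; note that because negative PIs jump from −1 to +1, distributions that straddle 'no response' need extra care.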
Another application of the dose-response curves is that normal ranges can be calculated: by ranking all data from a low to a high value for the environmental factor of interest, and then dividing them into ten equally-sized decile groups, we can not only calculate the median X (environmental factor) and Y (scaled trait), but also, for example, the 10th and 90th percentiles of the scaled trait in each of the ten decile groups. In this way we have the opportunity to check whether any specific experiment is indeed deviating from the majority of all other experiments, which could be an error, or an interesting case of a species that responds genuinely differently from the majority of plants.

Poorter et al. (2019) quantified the response of 70 plant traits employing dose-response curves, focusing on anatomy/morphology, chemical composition, and physiology of leaves, stems and roots, as well as growth/reproductive characteristics of whole plants. To illustrate the potential of generalized dose-response curves we first present here trait dependencies on Daily Light Integral (DLI) for 13 of the 70 previously published traits, using an extended data set containing >20% more experiments. We then present DLI dose-response curves for 12 other traits, which were not included in Poorter et al. (2019).

Dose-response curves for photosynthetic and growth parameters
It is well-known that photosynthetic capacity per unit leaf area (Phot/A_SL) increases with the light intensity plants experience during growth (Björkman and Holmgren 1966), which happens to occur in a saturating fashion (see Fig. 4a; also for other traits discussed). On average, the capacity more than doubles over the 1-50 mol m−2 d−1 trajectory (PI = 2.2). Leaf mass per area (LMA) increases even more strongly (PI = 2.7), with a high consistency index (99%), and so does the nitrogen content per unit leaf area (PI = 2.0; Poorter et al. 2019) as well as the amount or activity of the enzyme Rubisco expressed per unit area (PI = 4.5). Although it is clear that large changes occur in the N-allocation within the photosynthetic apparatus, the total organic N content per unit leaf dry mass ([Norg]_L) remains remarkably constant. The resulting PI has a value of −1.1, indicating that the N concentration may decrease marginally over the DLI range considered. Photosynthetic capacity per unit leaf dry mass (Phot/M_SL) decreases somewhat more strongly over the trajectory considered (PI = −1.2). Therefore, for this set of traits, light responses expressed on an area basis are all strong, but small or absent when expressed on a leaf dry mass basis. Clearly, the increased thickness of palisade and spongy parenchyma forms the main driver of the increased photosynthetic capacity.

How then does the actual performance of the leaves change with DLI? Next to leaf structure, photosynthetic compounds, and stomatal conductance, this is co-determined by the prevailing light intensity. Photosynthetic activity per unit leaf area under growth light conditions (Phot/A_GL) increases strongly, with a PI of 15.1 (Fig. 4a). This is the largest increase over the 1-50 mol m−2 d−1 range for all traits considered here. Most of the increase is the result of a direct effect of light intensity on photosynthetic rate. Part of the increase in Phot/A_GL, however, is enabled by the increase in photosynthetic capacity (Phot/A_SL) with DLI, which enables better exploitation of light at the high-intensity range. At the whole-plant level, a simple model to factorize growth is RGR = ULR × SLA × LMF (Evans 1972; Lambers and Poorter 1992), where RGR is the Relative Growth Rate, ULR the biomass increase per unit leaf area (Unit Leaf Rate), SLA the leaf area/leaf biomass ratio (Specific Leaf Area) and LMF the fraction of biomass invested in leaves (Leaf Mass Fraction).
Among the growth-related traits, ULR is the variable most strongly related to photosynthesis per unit leaf area, and increases over the light trajectory considered with a PI of 8.9 (Fig. 4b; also for the next traits discussed). However, that value is only a little more than half the increase in photosynthetic activity per unit area. This could be partly explained by the fact that photosynthesis is typically measured on the 'youngest fully-developed leaf' exposed to the prevailing light intensity. Many of the plant's other leaves are subject to self-shading and thus have lower photosynthetic rates, making whole-plant C-gain lower than estimated from these single-leaf measurements. Self-shading is more pronounced at high than at low DLI, due to larger plant size. A decrease in photosynthetic capacity in older and/or shaded leaves may also contribute. Furthermore, in field and glasshouse experiments, photosynthesis is typically measured at noon when light intensity is highest, which may also overestimate the daily C-gain differences between light treatments. Additionally, we anticipate an increased respiratory load, especially because the allocation to leaves and stems (PI = −1.2) decreases in favor of biomass allocation to the roots (PI = 1.6). A somewhat higher [C] in high-light plants (PI = 1.1; Poorter et al. 2019) may also contribute to the observed difference in PI between Phot/A_GL and ULR. Next to the biomass shift towards roots, there is also a decrease in SLA (inverse of LMA). Consequently, the increase in RGR is much more modest than the increases in photosynthesis or ULR. How this then results in changes in vegetative biomass will depend partly on the duration of growth and how plant size feeds back on the trajectory of growth stimulation. For the data compiled for these 610 experiments, the median response for vegetative biomass is almost 10-fold (PI = 9.8).
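Because RGR is written as a product of three components, the relative response of RGR between a high (H) and a low (L) light level factorizes in the same way. The relation below is a sketch of that reasoning; the subscripts H and L are notation introduced here for illustration, and the identity is exact only when all three components are determined on the same plants at the same two light levels.

```latex
\frac{\mathrm{RGR}_H}{\mathrm{RGR}_L}
  = \frac{\mathrm{ULR}_H}{\mathrm{ULR}_L}
    \cdot \frac{\mathrm{SLA}_H}{\mathrm{SLA}_L}
    \cdot \frac{\mathrm{LMF}_H}{\mathrm{LMF}_L},
\qquad
\frac{\mathrm{SLA}_H}{\mathrm{SLA}_L} = \frac{\mathrm{LMA}_L}{\mathrm{LMA}_H}
```

In terms of the plasticity index, a trait with a PI of −p contributes a factor 1/p to this product, which is why a strong increase in ULR can be largely offset by decreases in SLA and LMF. Since the PI values quoted in the text stem from separately fitted curves based on different sets of experiments, multiplying them gives only an approximate indication of the RGR response.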

Dose-response curves for 12 additional traits
In addition to the dose-response curves for 70 plant traits presented in Poorter et al. (2019), we have compiled data for 12 more traits, for which we present the response curves here (Fig. 5, Table 1; see also Suppl. Figs. S1-S12 for detailed graphs per trait). The first variable is the volumetric fraction of airspaces in the leaf (VoFrAs). This variable is not frequently reported, but there is a highly consistent decrease with increasing light intensity. Although a densely packed leaf will increase the photosynthetic machinery per unit leaf area, it may at the same time complicate the diffusion of CO2 from the stomata to the chloroplasts (Oguchi et al. 2018). Stem diameter (SteDia), generally measured at the base of the plant, or otherwise at breast height for trees, increases with light intensity in a saturating fashion, and with a very high consistency (Fig. 5b; CI = 98). Of all the morphological traits measured, plant height was on average one of the least affected by light (Poorter et al. 2019). Consequently, the slenderness index, the ratio between plant height and stem diameter, decreases with DLI (SleInd, Fig. 5c). We therefore presume that plants growing in the shade have a higher chance of mechanical failure (Peltola et al. 1999). [...] (Poorter et al. 2006) as well as the exchange of nitrate for soluble sugars (Blom-Zandstra et al. 1988) may all contribute to this increase in C/N ratio. Leaf phosphorus concentration decreases more (PI = −1.8; Poorter et al. 2019) than the leaf nitrogen concentration, and hence the N/P ratio of the leaves increases with increasing DLI (Fig. 5e, PI = 1.3), albeit with low consistency. This mirrors the effect of [CO2], where leaf N/P decreases with increasing CO2 levels (Poorter et al. 2022), also with relatively low consistency.
It would be interesting to test whether the opposing effects of light and [CO2] on the transpiration rate differentially affect mass flow around the roots, thereby affecting the uptake of nitrate more than that of phosphorus compounds. Leaf carotenoid concentration generally scales well with chlorophyll content, but using the investment in carotenoids relative to chlorophyll, we see that carotenoid presence is favored at higher light levels (Fig. 5f, PI = 1.6). This response is predominantly due to increases in the three carotenoids involved in the xanthophyll cycle, although lutein and β-carotene increase with DLI as well (Esteban et al. 2015).
Table 1 Summary of the dose-response curve analysis for 12 plant traits as dependent on the daily light integral (DLI) during growth. Columns 2 and 3 indicate the range of daily light integrals for which records are present in the database and the total number of observations (= number of averaged values per species and DLI over all experiments; rounded to the nearest 10). Column 4 shows the number of species for which we have observations for the various traits. The fit refers to the form of the dose-response curve. Fitted equations were either no relationship (−; $Y = a$, where Y is the scaled value of the phenotypic trait of interest and a is the overall average of Y values), linear (L; $Y = a + bX$, where X is the DLI), or saturating (S; $Y = a(1 - b\,e^{-cX})$). The Plasticity Index (PI) as used here is the fitted value at DLI = 50 divided by the fitted value at DLI = 1, with positive values indicating positive trends with DLI and negative values decreasing trends; bold numbers indicate a |PI| ≥ 2.0. The pseudo r² refers to the approximate fit of the selected equation. The Consistency Index refers to the percentage of all cases (species × experiment combinations) where the phenotypic value at the highest DLI was larger than at the lowest DLI, indicating the consistency of the response. Values close to 0 or 100 indicate highly consistent negative or positive responses, respectively. The next column shows the reliability index, based on the number of records in the database for that trait, the number of different species, the range of DLI levels at which it is measured and the average deviation from the median response, with a relative scale from 1 (low) to 10 (high reliability level). The last 3 columns give the values for parameters a, b and, if relevant, c for the equations mentioned above. Trait abbreviations: VoFrAs, Volumetric Fraction of Airspaces; SteDia, Stem Diameter; SleInd, Slenderness Index; C/N L, Carbon to Nitrogen ratio of the Leaves; N/P L, Nitrogen to Phosphorus ratio of the Leaves; Caro/Chl, carotenoid content per unit chlorophyll; ApQuYi, Apparent Quantum Yield; Phot/N GL, rate of Photosynthesis per unit leaf N as measured under Growth Light conditions; Δ13C, discrimination against 13C in the leaves of whole plant; iWUE, instantaneous Water Use Efficiency; Resp/A, leaf Respiration per unit leaf Area; TiToFl, Time To Flowering. The relative weight $w_i$ of the model selected by the AICc-test is given by: *, 0.70 < $w_i$ < 0.90; **, 0.90 < $w_i$ < 0.98; ***, $w_i$ > 0.98, but only indicated in case the Consistency Index is < 40% or > 60%.
We also determined response curves related to the physiology of the plants. The apparent quantum yield is the CO2 fixed per photon incident on a leaf, measured in the linear light-limited part of the photosynthesis-light response. Theoretically, we would not expect this variable to be affected by growth light conditions (Evans 1987), and indeed, taken over all experiments the apparent quantum yield remains virtually constant (ApQuYi, Fig. 5g). However, there is a remarkable amount of variability across experiments (Suppl. Fig. 7), probably reflecting the different ways the apparent quantum yield is calculated, in combination with the difficulty of measuring close-to-zero CO2 fluxes in leaf cuvettes that contain a small leaf area (Pons and Welschen 2002). Photosynthetic Nitrogen Use Efficiency, the rate of photosynthesis per unit leaf N determined under growth light intensities, increases with DLI, with a very high consistency (Phot/N GL; Fig. 5h). There is a slight but consistent decrease in the intercellular to ambient [CO2] ratio (ci/ca) of the leaves in plants grown at higher DLI (Poorter et al. 2019), and so we expect a long-term indicator of the ci/ca ratio, Δ13C, to decrease as well. This is indeed what happens, with high consistency (Fig. 5i). With the large increase in photosynthesis under growth light levels (PI = 15.1) and a 2.7-fold increase in stomatal conductance, we expected the intrinsic Water Use Efficiency (iWUE), the ratio of the two, to increase as well. This is indeed what is found (Fig. 5j), but the increase is less than the expected 6-fold increase calculated from the PIs of the components. We have as yet no explanation for this discrepancy.
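The expected iWUE response follows from dividing the PIs of its two components. The symbol $g_s$ for stomatal conductance is our notation, and the calculation assumes that the fold change of a ratio equals the ratio of the fold changes of numerator and denominator, which holds exactly only when both are measured on the same plants:

```latex
\[
\mathrm{PI}_{\mathrm{iWUE,\ expected}}
  = \frac{\mathrm{PI}_{\mathrm{Phot/A\ GL}}}{\mathrm{PI}_{g_s}}
  = \frac{15.1}{2.7} \approx 5.6
\]
```

This value of about 5.6 is the roughly 6-fold increase referred to above.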
Leaf respiration per unit leaf mass is slightly affected by the light level during growth (Poorter et al. 2019), but as LMA increases (Fig. 4a), we may expect respiration per unit leaf area to increase strongly with DLI. This happens to be the case, with a PI just slightly larger than the one for LMA (Resp/A; PI = 2.9; Fig. 5k). Also for this variable the CI is high. Finally, generative development is strongly retarded in low light, which shows up in a strongly increased time before plants flower (TiToFl; Fig. 5l). This is especially true for DLI levels lower than 10 mol m⁻² d⁻¹. Low-light plants are also much smaller in biomass. For some species, at least monocarpic perennials, it is known that flowering only occurs when plants reach a certain biomass (Klinkhamer et al. 1987; Pons and During 1987).
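The three curve forms and the two indices defined in the legend of Table 1 can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the analysis: the function names and all parameter values are hypothetical, and the sign convention for decreasing trends (negative reciprocal of the ratio) is our reading of the PI definition.

```python
import math


def no_relationship(x, a):
    """Fit '-': Y = a, independent of the DLI."""
    return a


def linear(x, a, b):
    """Fit 'L': Y = a + b*X, with X the DLI."""
    return a + b * x


def saturating(x, a, b, c):
    """Fit 'S': Y = a * (1 - b * exp(-c*X))."""
    return a * (1.0 - b * math.exp(-c * x))


def plasticity_index(curve, dli_low=1.0, dli_high=50.0):
    """Fitted value at dli_high divided by the fitted value at dli_low.

    Assumption: decreasing trends are reported as the negative reciprocal,
    so that a halving gives -2.0 rather than 0.5.
    """
    ratio = curve(dli_high) / curve(dli_low)
    return ratio if ratio >= 1.0 else -1.0 / ratio


def consistency_index(pairs):
    """Percentage of (lowest-DLI value, highest-DLI value) pairs that increase."""
    increases = sum(1 for low, high in pairs if high > low)
    return 100.0 * increases / len(pairs)


# Hypothetical parameter values and observations, for illustration only.
pi_up = plasticity_index(lambda x: saturating(x, a=1.0, b=0.8, c=0.1))
pi_down = plasticity_index(lambda x: linear(x, a=2.0, b=-0.02))
ci = consistency_index([(1.0, 2.0), (2.0, 1.0), (1.0, 3.0), (1.0, 1.5)])
print(round(pi_up, 2), round(pi_down, 2), ci)
```

Each entry in `pairs` stands for one species × experiment combination, so a CI near 100 or near 0 reflects a response that goes in the same direction in almost all cases.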

Further applications of MetaPhenomics
Interaction between environmental factors
So far, we have been able to construct dose-response curves for 12 abiotic environmental factors (Poorter et al. 2009, 2012a, 2012b). For most factors, such as light and CO2, it is relatively easy to objectively quantify the levels plants are exposed to. However, for others, notably nutrients and water, it is more complex, as the growth restriction imposed by these soil resources depends not only on the level or amount applied, but also on additional factors such as pot and plant size. An alternative way to express the strength of the environmental limitation could then be to use the biomass of low-resource plants relative to those growing at optimal conditions. Having established these dose-response curves, an interesting next step would be to calculate dose-response surfaces, where the combined effect of two environmental factors on plant traits is visualized. These dose-response surfaces are particularly interesting to analyze how strong the interaction between two environmental factors can be, and where in the environmental space the interactions occur. For example, is the relative response to environmental factor X 1 similar over a wide range of levels for environmental factor X 2 and vice versa? The Sprengel-Liebig Law of the Minimum assumes that plant growth is determined by only one environmental constraint at a time (Van der Ploeg et al. 1999). Assuming this would also be true for plant traits other than biomass, we would for each of them expect simple dose-response curves consisting of two parts: a relatively linear increase (or decrease) and a plateau. Dose-response surfaces would show similarly abrupt changes. However, exactly because of the acclimatory changes plants realize, such as a change in biomass allocation, two or more environmental factors can be co-limiting at the same time (Bloom et al. 1985; Gorban et al. 2011). Consequently, dose-response curves and surfaces will change smoothly rather than showing abrupt alterations. If interactions are largely absent, then the dose-response surface could simply be composed from the information in the two individual response curves.
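The two contrasting expectations can be made concrete with a small sketch. It assumes that 'no interaction' means that the scaled responses to the two factors multiply, which is one possible null model and our assumption here, and it represents the Law of the Minimum by linear-plateau curves. All function names, levels and parameter values are hypothetical.

```python
import math


def linear_plateau(x, x_sat):
    """Relative response rising linearly to a plateau of 1.0 at x_sat."""
    return min(1.0, x / x_sat)


def liebig_surface(levels1, levels2, x_sat1, x_sat2):
    """Law of the Minimum: only the most limiting factor sets the response."""
    return [[min(linear_plateau(x1, x_sat1), linear_plateau(x2, x_sat2))
             for x2 in levels2] for x1 in levels1]


def scaled_saturating(x, b, c, x_ref):
    """Saturating curve 1 - b*exp(-c*x), scaled to 1.0 at the reference level."""
    def shape(v):
        return 1.0 - b * math.exp(-c * v)
    return shape(x) / shape(x_ref)


def no_interaction_surface(levels1, levels2, curve1, curve2):
    """Assumed null model: scaled responses to the two factors multiply."""
    return [[curve1(x1) * curve2(x2) for x2 in levels2] for x1 in levels1]


# Hypothetical levels of two environmental factors X1 and X2.
levels1 = [5.0, 10.0, 20.0]
levels2 = [1.0, 2.0, 4.0]

abrupt = liebig_surface(levels1, levels2, x_sat1=10.0, x_sat2=2.0)
smooth = no_interaction_surface(
    levels1, levels2,
    lambda x: scaled_saturating(x, b=0.8, c=0.1, x_ref=10.0),
    lambda x: scaled_saturating(x, b=0.9, c=0.5, x_ref=2.0),
)
for row in smooth:
    print([round(value, 2) for value in row])
```

In the multiplicative surface the relative response to X1 is identical at every level of X2, so systematic deviations of observed data from such a surface would indicate where in the environmental space interactions occur.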
Two problems arise which hinder the construction of dose-response surfaces. First, this analysis requires experiments where a factorial combination of two environmental factors is studied. Although factorial experiments are not uncommon, they comprise less than 30% of the data in the MetaPhenomics database. Thus, construction of these surfaces has to be done with ~70% less data than are available for simple dose-response curves. A second challenge is that trait scaling now has to be carried out with respect to two environmental factors. The chance that the trait scaling value Y R for this combination of reference levels X 1R and X 2R can be obtained by interpolation is lower, and extrapolation is more complicated due to the 3-dimensional characteristic of the dose-response surfaces.

Dose-response curves for gene expression, enzyme activities and metabolites
In principle, environmentally-induced changes in the levels of specific mRNA transcripts, proteins or metabolites are not different from changes in any classical phenotypic trait. We therefore can see a clear future for the MetaPhenomics approach in these areas, although the sheer amount of information makes the analyses more challenging. An additional complication is that most experiments in this field focus on the short-term consequences of changing a specific environmental factor from level L 1 to L 2 , with measurements typically concentrating on the first 3-48 h after a shift (e.g. Liu et al. 2019). Often, many time-specific changes will occur over that period, on top of diurnal effects on gene expression. This makes it rather different from ecophysiological traits, where we sought to select experiments and harvests where plants had ample time to fully acclimate to the new environment. Bringing in time after a change as an additional factor in the analysis will allow for a more complete picture, but also make the calculations more complicated. The simplest first step would be to avoid the strong temporal fluctuations after a switch and focus on the transcriptome of plants that have fully acclimated to the new growth conditions. This kind of data, however, is very scarce in the literature (but see Walters 2005).
We carried out an experiment where A. thaliana plants were grown at five light intensities and sampled for RNA transcripts as well as capacities of various enzymes after plants had ample time to fully acclimate to their light environment. The experimental design allowed a first impression of a dose-response curve, with specifics of this experiment summarized in the legend of Fig. 6. The mRNA expression of early light-induced protein 1 and 2 strongly increased with light (Fig. 6a). They are thought to play a role in photoprotection. There was no change whatsoever in Rubisco activase and the gene encoding the small subunit of Rubisco. This may be surprising at first sight, as Rubisco strongly increases with DLI (Fig. 4a), but this is when expressed per unit leaf area. The increase on a dry mass basis is much smaller, due to the increase in leaf mass per area (LMA) and this measure may be more comparable to total mRNA. Expressions of elip2 and especially elip1 showed strong increases with increasing DLI, as is also observed during shorter-term high-light exposure (Huang et al. 2019). A consistent decrease was found for pal4, which encodes a protein involved in lignin synthesis. We actually expected mRNA levels for this protein to increase with light, as lignin concentrations generally increase with light levels (Waring et al. 1985;Niinemets and Kull 1998). The discrepancy could be due to the timing of expression, post-translational modification in enzyme levels, or degradation processes.
One aspect that deserves attention is that expression of a given gene is generally calculated relative to the expression of all other genes. Integrating such data into the MetaPhenomics approach means that two different normalization steps are in fact carried out, which may complicate the interpretation of the link between mRNA data and ecophysiological traits. It is in principle possible to calculate the absolute concentration of a given mRNA, but this requires the use of spikes (internal standards of synthetic RNA added at the start of RNA extraction), a practice that is still rarely used in the plant sciences (e.g., Belouah et al. 2019).
Other fields where meta-analyses of data could yield instructive dose-response curves include the activity of enzymes ('activome'), proteins in general (proteome) and metabolites (metabolome). Enzymes are major engines of cell metabolism, and the different chemical compounds produced may reflect the physiological status plants are in. So far, we have only been able to include Rubisco amount or activity as an important enzymatic factor in C-fixation, and chlorophyll, xanthophylls and other carotenoids, and soluble phenolics as relevant groups of specific metabolic compounds (Poorter et al. 2019, 2022). However, there is a wide range of enzymes and compounds that could be instructive for the physiological status of plants. For the same A. thaliana plants for which we showed some gene expression levels, we also measured various enzymes and metabolites, of which we show the capacity of NAD-dependent Malate Dehydrogenase (MDH) as an example. MDH, whose activity is much higher than that of other enzymes of the TCA cycle (Gibon et al. 2009), plays a central role in metabolism, i.e. in the assimilation of nitrogen (Hanning and Heldt 1993), in photorespiration (Journet et al. 1981), but also in cellular redox homeostasis (Scheibe 2004; Shameer et al. 2019). It seems logical that this activity would increase with increasing light intensity, since higher light generates more metabolic activity and growth, but also more reactive oxygen species. However, our expectation was not confirmed, as enzyme capacities expressed per unit dry mass decreased with light levels during growth (Fig. 6b).
Clearly, we need a more holistic understanding of changes in capacity and activity of enzyme levels and their products. With time, more and more datasets become available for an increasing diversity of environments and species. A problem of metabolome data is that they are almost always expressed semi-quantitatively, and making them interoperable via absolute quantification remains a considerable challenge (Ferreira et al. 2021; Røst et al. 2020).
Fig. 6 Responses of (a) relative mRNA levels and (b) the capacity of the malate dehydrogenase enzyme (MDH) as dependent on the daily light integral (DLI). Genes shown are early light-induced protein 1 (elip1) and 2 (elip2), Rubisco small subunit (rbcS), Rubisco activase (rca), phototropin 1 (phot1), and phenylalanine ammonia-lyase (pal4). All values are expressed relative to the total amount of mRNA expressed. Enzyme capacity of MDH is expressed per unit leaf fresh mass and dry mass, respectively. Data are for ...

The basis of normalization for plant processes and compounds
As plants or plant organs vary in size, it is common to normalize measured rates of physiological processes or chemical content by the size of the biological sample taken. However, there is a hidden problem here. Plant biologists studying photosynthesis generally consider leaf area as the 'logical' basis for normalization of photosynthetic and transpiration rates (Lloyd et al. 2013). Eco(physio)logists often express data on a dry mass basis, which helps to avoid variation due to time-dependent fluctuations in water availability, especially in the field. Cell biologists who grow their plants generally under controlled conditions express their data per unit dry mass, fresh mass or chlorophyll, depending on the nature of the study or sometimes on the lab's habits. For example dry mass is often used in water stress studies (e.g. Ahmadi and Baker 2001), whereas fresh mass or chlorophyll are preferred in other cases (e.g. Sicher and Bunce 1997). If data for different traits are normalized in different ways, then it is complicated to compare them. For example, the MDH capacity which did not follow our hypothesis when expressed per unit dry mass, confirms our hypothesis when data are expressed per unit fresh mass (Fig. 6b), simply because the water content per unit dry mass decreases strongly with increasing DLI (Poorter et al. 2019). Unfortunately, conversion factors are rarely reported in papers, as they are often not relevant for the research question of interest. However, without knowing the conversion factor between leaf area, dry mass, fresh mass and chlorophyll for a specific species in a specific experiment at a specific environmental level, these data cannot be matched with those from other reports, hindering reuse of data for purposes such as meta-analyses. 
To bridge the 'cultural' gaps among the different subdisciplines, and allow integration across fields, we strongly recommend that all plant biologists report the conversion factors among the four variables mentioned above as a standard routine in their papers. This should not just be a dutiful exercise. These conversion factors are relatively easy to obtain, and there is highly relevant insight to be gained from comparing physiological rates and amounts of chemical compounds on different bases (McMillen and McClendon 1983; Garnier et al. 1999; Terashima et al. 2005; Poorter et al. 2014).
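How much the basis of normalization matters can be shown with a short sketch. The two conversion functions below use leaf mass per area (LMA) and leaf dry matter content as conversion factors; the function names and all numbers are hypothetical and were chosen only to mimic the direction of the MDH result, not to reproduce the measured values.

```python
def area_to_dry_mass(value_per_area, lma):
    """Convert a value per m2 leaf area to a value per g dry mass.

    lma: leaf mass per area, in g dry mass per m2 leaf area.
    """
    return value_per_area / lma


def dry_to_fresh_mass(value_per_dry_mass, dry_matter_content):
    """Convert a value per g dry mass to a value per g fresh mass.

    dry_matter_content: g dry mass per g fresh mass.
    """
    return value_per_dry_mass * dry_matter_content


# Hypothetical enzyme capacities (arbitrary units per g dry mass) and
# dry matter contents for plants grown at low and at high DLI.
low_dli = {"capacity_per_dm": 10.0, "dmc": 0.10}
high_dli = {"capacity_per_dm": 8.0, "dmc": 0.16}

change_per_dm = high_dli["capacity_per_dm"] / low_dli["capacity_per_dm"]
change_per_fm = (
    dry_to_fresh_mass(high_dli["capacity_per_dm"], high_dli["dmc"])
    / dry_to_fresh_mass(low_dli["capacity_per_dm"], low_dli["dmc"])
)
print(change_per_dm, round(change_per_fm, 2))
```

In this example the capacity decreases with light on a dry mass basis but increases on a fresh mass basis, purely because the assumed dry matter content rises with DLI. Without the reported conversion factors, a reader could not tell these two outcomes apart.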

Use of dose-response curves in modeling
Modeling is a great way to integrate knowledge of different plant processes and is used advantageously to understand and forecast growth and productivity, both for crops (Keating et al. 2003) and worldwide vegetation (Keenan et al. 2021). Most of these models are based on a 'radiation use efficiency' (RUE), multiplied by the prevailing light intensity and some factor depending on temperature, or on Farquhar-von Caemmerer-Berry type equations to predict photosynthesis depending on light intensity and CO2 concentration (Boote et al. 2013). However, acclimation of plants to different levels of an environmental factor is generally not an intrinsic part of these simulation models. Nor is acclimation necessarily an integral part of ecosystem models that focus on global change, even though these models often attain a high level of complexity.
How could the present information on dose-response curves be used advantageously to improve plant growth models? We suggest two different options, using the light response of SLA as an example. First, acclimation could be explicitly simulated using the generalized dose-response curve as derived before. As an illustration we used an old crop model (SUCROS; Kropff et al. 1994), and either assumed a constant SLA, or allowed SLA to acclimate to light as derived from the generalized dose-response curve of MetaPhenomics (Fig. 7a). For simplicity, we only considered plant biomass 40 days after germination, when plants are still vegetative, and assumed a constant temperature. We then challenged the model with DLI levels between 8 and 50 mol m⁻² d⁻¹, admittedly a broader range than most crop plants would ever experience outside. For clarity, the data of this analysis were normalized for a DLI of 35 mol m⁻² d⁻¹, a typical light level crops experience under field conditions. As shown in Fig. 7b, total vegetative biomass after 40 days varied strongly with light, increasing 220-fold when SLA was kept constant. In contrast, variation was less than 50-fold when incorporating the 2-fold change in SLA into the model, illustrating the principle that acclimation in leaf morphology improves plant C-gain under low-light conditions, and presumably also plant fitness under these conditions. Considerable differences in output of a vegetation-climate model were also found when the well-known decrease in SLA with increasing atmospheric [CO2] was included in the simulations (Kovenock and Swann 2018).
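The principle of this first option can be illustrated with a toy calculation. The sketch below is not SUCROS and does not reproduce the numbers of Fig. 7b. It assumes exponential growth with RGR = ULR × SLA × LMF, a hypothetical saturating light response of ULR, a constant LMF, and an SLA that either stays at its value at the reference DLI or halves between a DLI of 1 and 50 following a power function. All function names and parameter values are our assumptions.

```python
import math

# Exponent chosen so that SLA halves between a DLI of 1 and 50 (assumption).
SLA_EXPONENT = -math.log(2.0) / math.log(50.0)


def sla(dli, acclimate, sla_ref=0.03, dli_ref=35.0):
    """Specific leaf area (m2 per g); constant, or acclimating to the DLI."""
    if not acclimate:
        return sla_ref
    return sla_ref * (dli / dli_ref) ** SLA_EXPONENT


def unit_leaf_rate(dli, ulr_max=12.0, half_saturation=20.0):
    """Hypothetical saturating light response of ULR (g per m2 per day)."""
    return ulr_max * dli / (dli + half_saturation)


def biomass(dli, acclimate, days=40.0, start_mass=0.01, lmf=0.5):
    """Plant mass (g) after exponential growth at RGR = ULR * SLA * LMF."""
    rgr = unit_leaf_rate(dli) * sla(dli, acclimate) * lmf
    return start_mass * math.exp(rgr * days)


def fold_range(acclimate, dli_low=8.0, dli_high=50.0):
    """Biomass at the highest DLI relative to biomass at the lowest DLI."""
    return biomass(dli_high, acclimate) / biomass(dli_low, acclimate)


print(round(fold_range(acclimate=False), 1), round(fold_range(acclimate=True), 1))
```

With these assumed parameters, the biomass range across light levels is clearly smaller when SLA acclimates, because the higher SLA at low DLI partly compensates for the lower ULR. The same mechanism underlies the difference between the two model runs described above, although the magnitudes depend entirely on the model and parameters used.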
The second way in which we see dose-response curves being used advantageously is in comparing the dose-response curves of different variables simulated in the model with those found for experimental plants or vegetation. Modelers need to keep their models simple and tractable, which makes them selective in the processes and detail included in their models. Experimentalists often have a different background, and find it difficult to understand what exactly is or is not included in the wide variety of simulation models and what consequences this has for the output of these models. A model that seeks to incorporate some form of acclimation at a tractable level is GECROS (Yin and Struik 2017). It includes photosynthesis and respiration as the main processes for growth. Biomass allocation is governed by the sugar allocation to shoots and roots that maximizes growth. Leaf area expansion at each time step is derived from the amount of sugar allocated to leaf growth, but co-limited by N availability for new growth. Again calculating model output for a wider range of light levels than ever anticipated, average SLA responded in a manner very similar to what was found in the MetaPhenomics analysis (Fig. 7c). The root mass fraction (RMF) in the model increased, as also found for experimental plants, but the change was stronger for the modeled plants (Fig. 7d). Modeled leaf N concentration was constant over a broad range of DLIs, but increased strongly below a DLI of 15 mol m⁻² d⁻¹ (Fig. 7e). Photosynthesis per unit total leaf area was less plastic in the model, which may also be because the experiments compiled in MetaPhenomics often measure the youngest full-grown and fully light-exposed leaf, whereas the model considers all leaves of a crop. Clearly, GECROS is not able to fully simulate crop response down to what are very low light levels for a crop (< 15 mol m⁻² d⁻¹).
However, it is able to show acclimation as an intrinsic property of the model much the way that plants actually acclimate to light. This illustrates that the comparison of model output with the dose-response curves from MetaPhenomics may provide a good focal point for communication between modelers and experimentalists.

Conclusions
a. In this paper we have shown the power of generalized dose-response curves in summarizing plant responses to the environment. Using a systematic approach across all kinds of subdisciplines, a quantitative and systematic overview of plant responses for many traits can be obtained from the literature.
b. The same information can be used to assess whether functional groups of species behave similarly or differently in their acclimation to a given environmental factor.
c. Although there are challenges based on different ways of normalization, the approach could also be advantageously used to describe mRNA, enzyme activities or metabolite concentrations. Another field of application is the inclusion of the derived dose-response curves in plant models, or in the communication between modelers and experimental biologists. Dose-response curves in MetaPhenomics could be exploited as a yardstick to guide future efforts to improve models.