INTRODUCTION

Terpenoids perform many important functions in plant metabolism: they participate in the protective and adaptive reactions of trees and in the hormonal regulation of growth, reproduction, and signaling (Pallardy, 2008; Pulido et al., 2012), and they are synthesized in the tissues of coniferous plants (Pentegova et al., 1987; Lamotkin et al., 2008). They are valuable medicinal and chemical raw materials for various sectors of the economy (Chernodubov and Deryuzhkin, 1990; Plemenkov, 2001; McCreath and Delgoda, 2016); monoterpenes make up the bulk of pine needle resin. Oxygen-containing terpenoids are often used in antibacterial, fungicidal, insecticidal, pesticidal, and anthelmintic preparations (Plemenkov, 2001). Their study is therefore of both scientific and practical interest. In particular, the study of the biochemical diversity of compounds and of the diversity of “chemotypes” in populations (Chudny and Prokazin, 1973; Yumadilov et al., 1991; Stepen’, 2000; Domrachev et al., 2011; Tarakanov et al., 2012; Tikhonova et al., 2012; Kuzmin et al., 2020) is an integral part of research on the study and conservation of the biodiversity and sustainability of natural ecosystems.

We previously identified the most closely related volatile components (Tikhonova et al., 2014). This study continues and develops that earlier work. Its purpose is to study the general correlation structure of the entire set of compounds and to test whether the analysis of static data can be used to study the formation and interconversion of terpenoids in population samples of Scots pine (Pinus sylvestris L.).

MATERIALS AND METHODS

The studies were carried out on Scots pine populations growing in the southern part of Central Siberia: P1, in the Balgazynskii forest in Tuva (51°10′ N, 95°5′ E), in a forb–grass pine forest (10P, II–III bonitet); and P2, in the environs of Shira village in Khakassia (54°24′ N, 89°59′ E), in a stony–lichen pine forest (7P2B1L, V–Va bonitet). At the beginning of October 2011, needles were collected from 10–20 shoots of the current growth year in the middle part of the crown, from four sides of each tree, in a representative sample of each population (100 and 75 trees, respectively). Samples were collected immediately before analysis and stored for a short time in special sealed test tubes in a refrigerator.

The component composition of volatile compounds in the needle samples was determined using an Agilent 5975C-7890A gas chromatograph–mass spectrometer (United States) with a HeadSpace Sampler G1888 headspace sampler. A 30-m HP-5 quartz column with an inner diameter of 0.25 mm (5%-diphenyl-95%-dimethylsiloxane copolymer) was used; helium served as the carrier gas. The column temperature was held at 50°C for 10 min and then raised to 200°C at a rate of 4°C/min. In the headspace sampler, the thermostat temperature was 100°C, the loop temperature was 110°C, the temperature of the HS interface was 115°C, and the sample holding time in the thermostat was 7 min. The evaporator temperature was 280°C; the ionization chamber temperature, 170°C; and the ionization energy, 70 eV. The components in the sample were identified using the AMDIS program, taking into account the absolute retention time, linear retention indices, and ion mass spectra, and comparing them with our own standards and information from the literature (Tkachev, 2008, etc.). The linear retention indices were calculated using the formula $J_x = J_n + 100k\,(t_{Rx} - t_{Rn})/(t_{R(n+k)} - t_{Rn})$, where $J_n = 100n$ is the retention index of the n-alkane containing n carbon atoms; $t_R$ is the absolute retention time; $t_{Rx}$ is the retention time of the component under study; and $t_{Rn}$ and $t_{R(n+k)}$ are the retention times of the nearest reference n-alkanes with n and n + k carbon atoms, respectively. The relative amount of a component with a content of at least 0.01% in the sample was calculated from the area of the peaks in the chromatogram; the sum of the peak areas (within linear retention indices of 700–1900) was taken as 100%; no correction factors were used. The quantitative fractions of components with overlapping peaks were calculated from the intensity of individual ions.
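The retention index formula given above can be illustrated with a short sketch. The function name, argument names, and the retention times in the usage line are hypothetical and serve only to show the arithmetic; they are not values from the study.

```python
def linear_retention_index(t_x, t_n, t_nk, n, k=1):
    """Linear retention index J_x of a component eluting between two n-alkanes.

    t_x  : retention time of the component under study
    t_n  : retention time of the reference n-alkane with n carbon atoms
    t_nk : retention time of the reference n-alkane with n + k carbon atoms
    """
    if t_nk <= t_n:
        raise ValueError("the heavier reference alkane must elute later")
    if not t_n <= t_x <= t_nk:
        raise ValueError("component must elute between the two reference alkanes")
    j_n = 100 * n  # retention index of the lighter reference alkane
    return j_n + 100 * k * (t_x - t_n) / (t_nk - t_n)


# Hypothetical retention times (min), chosen only to illustrate the formula.
print(linear_retention_index(t_x=12.5, t_n=12.0, t_nk=14.0, n=9))  # 925.0
```

A component eluting one quarter of the way between nonane and decane thus receives an index of 925, which falls inside the 700–1900 window used for peak summation.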

A total of 81 distinct peaks were detected, of which 34 components were identified. The rest are listed under serial numbers in order of their retention time on the chromatogram (Table 1). The percentage of each component in the sample was calculated. Twenty-two components were common to all trees of both populations. The variability in the content of components within populations was assessed by the coefficient of variation (CV, %).

Table 1. Component composition of volatile needle terpenoids in the studied populations of Scots pine

The data obtained were analyzed using correlation, cluster, and multivariate analyses; in the latter case, the principal component method was used (Efimov and Kovaleva, 2007). The 20 components that occur in samples with a frequency of less than 5% were excluded from the contingency analysis. Since the distribution of some components differed significantly from the normal one, the data were first normalized according to the formula $(x_{ij} - x_{\min})/(x_{\max} - x_{\min})$. The populations were compared with each other using the F-test in order to assess the repeatability of the results obtained on different samples of trees grown under different conditions. The significance of the correlation coefficients was assessed using the t-test (tr criterion).
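The normalization step can be sketched as follows. This is a minimal illustration, not the authors' software; the function name, the handling of constant components, and the sample values are assumptions.

```python
def min_max_normalize(values):
    """Rescale one component's contents to [0, 1]: (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # A constant component has no variability; returning zeros is an assumed convention.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical relative contents (%) of one component in five trees.
content = [2.0, 4.0, 6.0, 10.0, 3.0]
normalized = min_max_normalize(content)
print(normalized)  # [0.0, 0.25, 0.5, 1.0, 0.125]
```

The rescaling is linear, so it places all components on a common 0–1 scale without altering the Pearson correlation coefficients between them.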

RESULTS AND DISCUSSION

Provocation is sometimes used to study the diversity of natural terpenoids: plants are placed under stress, under which components that are rare in normal conditions are synthesized (Plemenkov, 2001). Among the samples we studied, the trees of the first population (P1) grew under the best, and those of the second (P2) under the worst, soil and climatic conditions (stony soils, a greater deficit of summer and winter precipitation, and strong drying winds). The bonitet of the stands served as an integrated indicator of the conditions.

It was found earlier that, in Scots pine, the content of lighter monoterpenes increases and the proportion of sesquiterpenes decreases as growth conditions deteriorate (Fuchsman et al., 1997); our studies confirm this. Table 1 shows that, with the same number of volatile needle components isolated in both populations, the total relative mass of monoterpenes was higher in the second population (79.0% in P1 and 89.5% in P2; the difference is significant at F = 10.56, P < 0.001), as were the relative content and diversity of the lighter compounds (for seven components out of the first 26). Thus, the share and diversity of lighter monoterpenes are higher under unfavorable growing conditions, whereas the share and diversity of heavier sesquiterpenes are higher under favorable conditions. This is readily explained by their functional significance: the former predominate in resin, and the latter belong to growth and allelopathic substances. The total relative content of monoterpenes in individual trees of the two populations shows low variability (4 and 9%, respectively), whereas the content of sesquiterpenes shows high variability (35 and 38%).

The literature on the population variability and heritability of terpene content in coniferous plants compares geographical populations of a species by the correlation coefficients between pairs of compounds drawn from only a small group (4–12) (Meier and Goggans, 1978; Yazdani et al., 1982; Hanover, 1992; Stepen’, 2000; Plyashechnik et al., 2011; Tarakanov et al., 2012). Recommendations on the use of statistical methods in chemistry note that correlation and regression coefficients often describe loose functional dependencies when a chemist faces more complex problems, including the identification of components that are difficult to analyze (Nalimov, 1960; Dörffel, 1994). Filling in the gaps in our earlier work (Tikhonova et al., 2014), we note that our conclusions about multiple correlations between the selected components and their structure, and our proposals for applying this information, turned out to be close to those of a number of studies using spatial modeling methods in theoretical histology (Savostyanov, 2005), combinatorial (algebraic) topology, the analysis of complex chemical and technological processes, quantum physics, programming, and automation (Azarov et al., 1975; Kafarov and Dorokhov, 1979; Hatcher, 2011). Identifying such systems makes it possible to decompose them reasonably into blocks and to indicate the most likely channels of their interaction. The specificity of their use in chemistry is that these blocks may not be delimited in real space; their topology is abstract. In this regard, a similar comparative analysis of correlations in closely related species also seems promising: from biology to chemistry, and vice versa.

After the exclusion of compounds rare for both populations (with a frequency of <5%), a correlation analysis of the variability in content was carried out for 59 components in the first population and 57 in the second. Only the closest relationships, with |r| ≥ 0.87, were taken into account. The threshold was chosen from the significance level of the correlation coefficient (p = 0.05–0.01) for the rarest compounds among those taken into account. In the first population, 58 compounds form a single strong correlation connection; 32 of them are included in the “core” of the connection (22 of them at r = 0.99–1.0). All of them are correlated with the indicator feature of the connection, no. 6 (an unidentified component). Some of them are joined by a further 26 compounds. Thus, only one of the compounds included in the analysis, β‑pinene, was not included in the connection. Most of the terpenoids within the “core” correlate closely not only with key component no. 6 but also with each other, each participating in 8 to 25 bonds; these include borneol, α-cubebene, copaene, cadinene, β-selinene, α-selinene, γ-muurolene, β‑cadinene, longifolene, and compounds nos. 13, 20, 21, 23, 24, 25, 30, 33, 75, 78, and 81. Only six terpenoids (tricyclene, camphene, sabinene, β-myrcene, ∆3-carene, and compound no. 79) have positive correlation coefficients with the key compound that forms the connection. Among the compounds not included in the “core” of the connection, the largest number of bonds (8–10) is formed by components nos. 10, 17, and 29 and β-copaene (no. 64). The main connections in both populations are arranged graphically in this way (Fig. 1). Because they are difficult to depict in three coordinates, the correlation structure of features is shown on a plane, where the same complementary component is repeated as many times as the number of bonds it forms with the “core of the connection.”
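The way bonds are counted at a fixed correlation threshold can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the contents are invented toy values rather than data from the study, and only the 0.87 threshold is taken from the text.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bonds_per_component(contents, threshold=0.87):
    """Count, for each component, the partners with |r| >= threshold."""
    bonds = {name: 0 for name in contents}
    for a, b in combinations(contents, 2):
        if abs(pearson(contents[a], contents[b])) >= threshold:
            bonds[a] += 1
            bonds[b] += 1
    return bonds


# Hypothetical contents of four components in five trees (NOT data from the study).
contents = {
    "no. 6": [1.0, 2.0, 3.0, 4.0, 5.0],
    "no. 24": [2.1, 3.9, 6.2, 8.0, 9.8],    # rises with no. 6
    "no. 80": [5.0, 4.1, 2.9, 2.0, 1.1],    # falls as no. 6 rises
    "β-pinene": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related to the others
}
bonds = bonds_per_component(contents)
isolated = [name for name, count in bonds.items() if count == 0]
print(bonds)
print(isolated)
```

In this toy set, three components are bonded to one another (positively or negatively), and one stays outside the connection. Components with the most bonds are candidates for the “core,” and a component with no bonds is excluded from the connection.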

Fig. 1. Correlative connections of highly volatile compounds of needles (numbers) of two population samplings of Scots pine, P1 and P2.

The connection of the second population sample included all 57 components with correlation coefficients r = 0.87–1.0. The “core” of the connection is formed by 31 compounds around the indicator feature, no. 80; it is supplemented by 26 compounds, 16 of which are associated with 8–25 compounds each (nos. 1, 2, 6, 12, 20, 22, 24, 62, 75, 78, ∆3-carene, α-terpinene, γ-terpinene, borneol, longifolene, and β-copaene). In the “core” of the P2 connection, a greater number of components are in feedback with the indicator feature, sesquiterpene no. 80: compounds nos. 4, 5, 7, 30, and 31; α-pinene (no. 36); α-cubebene (no. 55); caryophyllene (no. 59); elemene (no. 70); and compound no. 62.

Despite the general similarity in the structure, completeness, and uniqueness of the connection, the first population stands out for its simpler construction: a tighter correlation within the “core” of the connection (three closely related groups: compounds nos. 20–33; borneol–caryophyllene; sesquiterpenes nos. 65–80) and fewer “conglomerates” outside the core (four). Moreover, all five compounds within the “core” that form the largest number of bonds (17–31), nos. 6, 24, 25, 33, and 80, are closely and directly correlated with each other. The correlation connection in the second population, by contrast, was distinguished by a weak correlation between the components within the “core” and a closer correlation between the components of the “core” and its complementary compounds. Compounds nos. 2, 12, 20, and 24, which form the largest number of bonds (18–30), turned out to be outside the “core” of the connection and were connected neither with each other nor with key component no. 80. We can therefore conclude that, compared with the first population, Scots pine in the second population shows a certain decentralization of the correlation structure of volatile terpenoids, while the common connection is preserved. Topologically, the P1 connection forms a pyramid with a vertex (compound no. 6) and a quadrangular base that includes 31 compounds; four large and several smaller groups of compounds are attached to most of them (Tikhonova et al., 2014). The P2 connection is a two-part pyramid with a vertex (compound no. 80), an intermediate quadrangular base (nos. 20, 30, 72, 73), within which 30 compounds of the “core” are grouped, and an octagonal base of the second part (nos. 1, 2, 12, 22, 24, 75, borneol, β-copaene); on the sides it is “fixed” by several smaller groups, the key ones being sabinene, α-terpinene, γ-terpinene, longifolene, and compounds nos. 6 and 78, with 8–10 correlations each. It should be noted that correlation structures in which all features are included in one connection can rarely be observed when studying the morphological and anatomical features of trees, and then only under the worst growth conditions (Tikhonova, 2005), since genotypic correlations are stronger than phenotypic ones (Rostova, 2002).

To take weaker relationships into account, we repeated the correlation analysis, limiting the set to the compounds most frequently found in the populations (50 in P1 and 43 in P2). Here we took into account correlation coefficients ≥0.51–0.62, which were significant for the corresponding samples. The analysis again produced similar structures with the same structural features and revealed the second key feature of the connections listed above, owing to which the connection acquired the shape of an octahedron or a mapped cone (Fig. 2). In P1, 48 components form the connection (the “core” includes 38 more closely related compounds); in P2, these are 43 and 32 compounds, respectively. The two key features, which have the same number of connections (one of the lightest components, no. 4, and the heaviest, no. 80), are directly correlated with each other (r = 0.94). Interestingly, more components of the connection show high correlation coefficients with the total content of mono- or sesquiterpenes in the first population (22) than in the second (10). This is apparently a consequence of the individual variability in component composition, which is four times higher in the second population: the coefficient of variation of the mass ratio of monoterpenes to sesquiterpenes is 48% in P1 and 207% in P2, owing to which the correlation coefficient between these indicators weakens from –0.92 in P1 to –0.45 in P2.

Fig. 2. Correlation structure of the relative content for the most frequently found volatile needle compounds (by numbers) in Scots pine populations P1 and P2 (compounds that show conjugation with the total content of mono- and sesquiterpenes are in circles).

Using multivariate data analysis, eight and nine principal components, explaining 82 and 81% of the total variance of features, respectively, were identified in the two populations. The first four principal components, with a total variance of 64 and 54% in samples P1 and P2, respectively, can be interpreted as factors of the relationship between three groups of compounds, I–III. This is also confirmed by the analysis of the correlation matrices: group II comprises the monoterpenes from tricyclene to terpinolene in P1 and to oxygen-containing borneol in P2 (compounds nos. 34–48 and 34–51, respectively, including the markers α-pinene and ∆3-carene); in total content, they are inversely correlated with the lighter compounds nos. 1–33 (group I) (r = –0.86 in both samples) and with sesquiterpenes (group III) (r = –0.91 in P1 and r = –0.35 in P2). Compounds of group II are distinguished by a smaller number of bonds with other components and by the stability of bonds within the group, for example, between ∆3-carene, α-pinene, sabinene, α-terpinene, γ-terpinene, and terpinolene (and borneol in the second population), which explain 11% of the variance in the third principal component in both populations. A typical feature of bicyclic monoterpenes (thujene, α-pinene, camphene, ∆3-carene, borneol, and bornyl acetate) is known to be their capacity for profound structural changes. For example, α-pinene is easily converted into most of the compounds of this group in the presence of organic acids (Markevich et al., 1987). Significant correlations between compounds in group II, as well as their interconversions (Degenhardt et al., 2010), have also been shown for other coniferous species (Plyashechnik et al., 2011). An inverse relationship between ∆3-carene and α-pinene has been found in many studies of pine terpenoids (Chernodubov and Lamotkin, 1990; Stepen’, 2000; Tarakanov et al., 2012).
The absolute value of this correlation coefficient in the studied populations increases as tree growth conditions deteriorate (from r = –0.61 in P1 to r = –0.90 in P2; both values are significant, t = 7.50 and 17.64, p < 0.0001). Sesquiterpenes from bicycloelemene upward (components nos. 54–80, group III) are positively correlated with each other and with the light terpenoids (up to compound no. 24, group I) (r = 0.59 in P1). In the first and second principal components, these correlations explain 46% of the total variability of characters in P1 and 34% in P2. The second group (II) of monoterpenes (ranked by the molecular mass of components) prevails among all terpenoids in both populations (their total mass is 74 and 83%, respectively) and is negatively correlated with the neighboring groups (multiple correlation coefficient rxyz = –0.99), while the components within each group are generally positively correlated. It can therefore be assumed that (1) the processes of group synthesis of compounds (three large groups and many small subgroups) are launched simultaneously in coniferous plants and (2) each group can serve as a supplier for the neighboring group. Given the closer relationships, however, it is most likely that the components of the second group of monoterpenes (nos. 34–51) are converted into sesquiterpenes or lighter terpenoids. Evidently, plants have mechanisms that save a great deal of time in forming the necessary sets of volatile compounds and allow a quick response to external or internal stimuli. This process is not chaotic; it has a stable structure in which three interacting “blocks” are identified. Smaller structures similar to the general one are found inside the “blocks,” and there are more of them in the sample from the second population: in P1 there are three directly related components characterized by a wide and complete coverage of the entire available spectrum of compounds (nos. 6, 24, 80); in P2 there are nine of them, and they are not interconnected (nos. 2, 6, 12, 20, 22, 24, 51, 64, and 80).
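The significance values quoted in this paragraph for the ∆3-carene–α-pinene correlation can be checked with a short sketch. It assumes the standard Student's t statistic for a Pearson coefficient, t = |r|·sqrt(n – 2)/sqrt(1 – r²), and the sample sizes from Materials and Methods (100 and 75 trees). The text does not state that this exact formula was used, and the function name is hypothetical.

```python
from math import sqrt


def t_statistic(r, n):
    """Student's t for a Pearson coefficient r estimated from n trees (n - 2 d.f.)."""
    return abs(r) * sqrt(n - 2) / sqrt(1 - r * r)


# Rounded r values as reported in the text; n from Materials and Methods.
t_p1 = t_statistic(-0.61, 100)
t_p2 = t_statistic(-0.90, 75)
print(round(t_p1, 2), round(t_p2, 2))
```

With the rounded coefficients, the sketch reproduces the reported t = 17.64 for P2 and gives about 7.6 for P1, close to the reported 7.50; a small difference of this kind is what rounding r to two decimals would be expected to produce.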

It should be noted that the “cluster-like” or “umbrella” structure of the main correlation connection that we observed is quite consistent with some known features of terpenoid biogenesis: the isoprene rule for the synthesis of terpenoid groups through the formation of intermediate compounds (Poltavchenko and Rudakov, 1973; McGarvey and Croteau, 1995; Plemenkov, 2001); the phenomenon of the “germacrene tree,” a large group of modifications of one compound (Tkachev, 2008, p. 141); and the established synthesis of multiple products (more than half of all mono- and sesquiterpenes) by the same terpene synthases owing to the peculiarities of the enzyme protein structure (Degenhardt et al., 2010). We found correlations not only between large groups of compounds that differ in the number of isoprene units (C5H8) (two isoprene units form monoterpenes, and three form sesquiterpenes) or between isomers of one compound, but also within the groups of monoterpenes and sesquiterpenes. These correlations indicate various interconversion reactions within these groups or competition between them for one substrate (negative connections), as well as the simultaneous synthesis (positive connections) of a large number of similar sets of compounds in different individuals of the population, which is confirmed by the high correlation coefficients within the connection. On the one hand, the absence of β-pinene from the connection of the first population can be explained by its formation in chloroplasts, where the alternative “methylerythritol 4-phosphate” pathway of terpene biogenesis proceeds (the main “mevalonate” pathway occurs on the outer membrane of mitochondria and the nucleus, in the cytosol, peroxisomes, and endoplasmic reticulum) (Meier and Goggans, 1978; Pallardy, 2008; Degenhardt et al., 2010; Pulido et al., 2012).
On the other hand, it can be explained by its sensitivity to the conditions of spectrometric analysis and its partial conversion to α-pinene (Tkachev, 2008). Published experiments demonstrate that the synthesis of the compounds we assign to the second group is coupled with illumination and photosynthesis, on the basis of which it was suggested that these compounds are synthesized not in mitochondria but in chloroplasts (Loreto et al., 1996; Degenhardt et al., 2010); i.e., group II may be spatially separated from groups I and III. This is also indirectly confirmed by data on the dependence of monoterpene content on the transparency of the atmosphere (Kolomiets et al., 2019). In our opinion, the formation of terpenes from photosynthesis products can also occur in mitochondria; their localization in chloroplasts has not yet been proven. The increasing role of the alternative “methylerythritol 4-phosphate” pathway and the participation of a larger number of bypass routes for the synthesis and transformation of compounds can probably also explain the structural features of the correlation connection in the P2 population. Apparently, the nine most conjugated components in P2 are primarily involved in reactions leading to the appearance of many other terpenoids, and the composition of terpenoids is replenished on their basis. Thus, the time needed to form the components necessary for the life of trees under the given growth conditions can be reduced. A similar analysis carried out on the same samples of trees at the beginning of the growing season, or under other environmental conditions, could possibly provide information on other components.

Cluster analysis was used to assess the structure of the individual diversity of populations by the relative content and diversity of the three groups of volatile needle compounds (I–III). The samples are divided into four clusters in P1 and three in P2, at Eij = 0.97–0.42 (P1) and 1.08–0.65 (P2). A small part of the samples represents trees with a high diversity and content of rare terpenoids of groups I and III: 19 and 6% of trees in P1 and P2, respectively (Fig. 3, Table 2). The next two clusters, which include the majority of the samples, are characterized by a gradual decrease in the mass and number of components in these groups. The last (fourth) cluster in P1 is formed by 11% of trees with a low relative mass of the components of groups I and III but with the highest indicators of their diversity. When the P1 sample was limited to the trees of clusters 1 and 4, which have the most complete composition of terpenoids, the correlation coefficients between the three groups of compounds increased in absolute value to rI, II = –0.89, rII, III = –0.96, and rI, III = 0.67 compared with the values given above.
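The grouping of trees by Euclidean distance can be sketched as follows. The text does not name the linkage rule, so average linkage is an assumption here; the function name, the cut distance, and the six toy trees are also hypothetical and are not data from the study.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def agglomerate(points, cut):
    """Merge the closest clusters (average linkage, an assumption) until the
    closest remaining pair is farther apart than `cut`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of clusters a and b.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(linkage(a, b), ia, ib)
                 for ia, a in enumerate(clusters)
                 for ib, b in enumerate(clusters) if ia < ib]
        d, ia, ib = min(pairs)
        if d > cut:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]  # ib > ia, so index ia is unaffected
    return clusters


# Hypothetical normalized contents of groups I, II, and III for six trees.
trees = [
    (0.10, 0.80, 0.10), (0.12, 0.78, 0.10), (0.11, 0.79, 0.10),
    (0.20, 0.55, 0.25), (0.22, 0.53, 0.25),
    (0.05, 0.90, 0.05),
]
clusters = agglomerate(trees, cut=0.1)
print(clusters)
```

With these toy values, the trees fall into three clusters: three similar trees, a pair richer in groups I and III, and a single tree dominated by group II. Raising or lowering the cut distance changes the number of clusters, in the same way that the dendrogram level chosen determines the number of clusters per population.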

Fig. 3. Dendrogram of the similarity of individuals within the P1 and P2 samplings according to the ratio of groups of volatile components of tree needles: trees are along the x-axis and the Euclidean distance is along the y-axis.

Table 2. Brief description of clusters in two populations of Scots pine

CONCLUSIONS

The results of this study indicate a close conjugation of the entire composition of volatile terpenoids in Scots pine needles and, accordingly, of the processes of their mutual transformations. The terpenoids form a single strong correlation connection in the form of an octahedron with two key components at the “input” and “output,” within which there are groups of more closely related components that correlate with each other. The links between them weaken under unfavorable growth conditions; the correlation structure changes, but the unity of the connection is preserved. At the same time, the number of compounds forming bonds across the entire spectrum of terpenoids increases several times. Presumably, reactions leading to the appearance of many other terpenoids necessary for the life of trees occur with their participation. Once their molecular structure is clarified, these components, which we have not identified, can be used as indicators of the involvement of certain biochemical reactions in the adaptation processes of trees.

Three stable “blocks” were identified within the connections of both populations using the method of principal components: I, light components up to tricyclene; II, monoterpenoids from tricyclene to oxygen-containing borneol; III, sesquiterpenoids. Smaller structures similar to the general one are found inside the “blocks.” Group II terpenes are negatively correlated with the components of adjacent groups. The feedbacks between them, as well as the correlations of terpenes within group II, explain 54–64% of the variance in the content of sets of components within the samples. Features of the correlation structure of traits indicate that coniferous plants can simultaneously trigger the processes of group synthesis of compounds. The bicyclic components of the second group of monoterpenes (thujene, α-pinene, ∆3-carene, camphene, etc.) apparently serve as a substrate for the formation of compounds of adjacent groups.

A small part of the studied samples is represented by trees with a high diversity and content of rarer terpenoids. The correlation coefficients within this subsample of trees are higher than in the general sample. A close correlation (|r| = 0.91–0.95) of ∆3-carene with α-pinene (negative), as well as with γ-terpinene and terpinolene (positive), was shown. It should be noted that the “core” of the correlation connection includes ∆3-carene in the first population and α-pinene in the second. The latter correlates with the total content of the three compounds listed in both samples (r = –0.58 and –0.82). Therefore, it is proposed to check the levels of heritability of a larger number of monoterpenes of group II, of the total content of components included in groups I, II, and III of terpenoids, and of their ratio in order to extend the list of traits that are selectively significant for Scots pine.

Since not all compounds are identified by mass spectrometry, certain important intermediate components may have escaped the attention of researchers, and the structure of some compounds remains unknown, only the general structure, the ratio of terpenoid groups, and individual correlations can be considered using these methods and the proposed approach. Thus, not only can the methods of chemical analysis be applied in biology; the methods of population biology, including interspecies comparisons in phylogenetic series, can also be useful to chemists. A general idea of the structure of biochemical features is important for a better understanding of the object of research and for clarifying the direction of search, for example, in the development of similar structures for the artificial synthesis of new compounds or their mixtures.