Introduction

Cyto-nuclear incompatibility (CNI) is the (partial) failure or breakdown in communication between nuclear and organellar genomes. It occurs when populations derived from a single ancestor, having become separated in space and time, undergo secondary contact. Such populations may have acquired mutations independently from each other, creating possible reproductive barriers. This is referred to as the Bateson-Dobzhansky-Muller (BDM) model of speciation (Bateson 1909; Dobzhansky 1936; Müller 1942) and is thought to underlie the occurrence of CNI.

CNI can be caused by nuclear mismatch with mitochondria (mCNI) as well as with chloroplasts (pCNI). Whereas mCNI manifests itself as dwarf growth or (partial) male sterility (Schnable and Wise 1998), pCNI occurs as bleaching of the leaves (chlorosis), a regularly occurring phenomenon in F1 hybrids of interspecific crosses in, for instance, Pelargonium (Geraniaceae) (Baur 1909; Horn 1994; Breman et al. 2020). Angiosperm-wide, CNI has so far been reported from at least 14 genera (Greiner et al. 2011). Sharbrough et al. (2022) investigated CNI in allopolyploids and how interactions between nuclear and cytoplasmic genes are coordinated across angiosperms. In cases of biparental inheritance of organelles (mainly plastids), the presence of a plastid incompatible with one of the parental donor genomes will induce signaling mismatches between nuclear genomic genes controlling plastid expression (Postel and Touzet 2020; Canonge et al. 2021; Qin et al. 2021; Forsythe et al. 2021) and the plastome. The regulation and expression of organelles is coordinated by the nuclear genome upon light-activation of tissue, which triggers five different pathways that co-interact with the nucleus and chloroplast in ways that are not fully understood (Zoschke and Bock 2018). For chloroplasts, the so-called anterograde (from nucleus to organelle) signals consist of nuclear-encoded proteins (Tadini et al. 2020) that initially target the rpoB subunit of the plastid-encoded RNA polymerase (PEP) complex, which then initiates expression of plastome-encoded genes (Börner et al. 2015). The PEP complex consists of four subunits called α, β, β’ and β’’, respectively encoded by the rpoA, rpoB, rpoC1, and rpoC2 genes (Börner et al. 2015). In Pelargonium sect. Ciconium (and Geraniaceae in general), rpo gene sequences were found to be highly variable in length (Guisinger et al. 2008; Breman 2021) and showed signs of strong positive selection.
Generally, PEP is involved in the initiation of plastome transcription and (almost) solely responsible for the expression of rRNA and the photosystem genes psbA and psbB (Demarsy et al. 2006), as well as of tRNA (Williams-Carrier et al. 2014). All other plastid genes are, at least partially, expressed via the nuclear-encoded (RNA) polymerase (NEP) (Demarsy et al. 2006; Palomar et al. 2022). In contrast to the plastids, plant mitochondria do not encode their own polymerase and mitochondrial genes are expressed via the same NEP as used for a subset of genes in the plastids, as well as by a dedicated nuclear-encoded, mitochondrially targeted NEP (Zoschke and Bock 2018).

The commonly known 'garden geranium' (P. × hortorum) is the product of interspecific hybridization between two species from P. sect. Ciconium (Sweet) Harvey (1860: 298), P. inquinans and P. zonale (James et al. 2004), and represents a suitable model for studying CNI as it exhibits both CNI and biparental inheritance. Its origins date back to the 19th century, when intense hybridization efforts were undertaken from the early 1800s onwards (e.g. Sweet 1820-1830), especially in Victorian England. Early breeders noticed the frequent occurrence of chlorosis in offspring of species from P. section Ciconium (hereafter referred to as ‘Ciconium’), and later it was established that aberrant chloroplast inheritance was causal (Baur 1909). Establishing interspecific hybrids between several species of Pelargonium is relatively easy (Horn 1994; Breman et al. 2020), making Pelargonium an attractive model genus for studying CNI.

Biparental inheritance of plastids appears to be not as uncommon across angiosperms as usually considered (Greiner et al. 2015). However, it appears to be particularly common in Pelargonium, especially in the Ciconium clade (Baur 1909; Metzlaff et al. 1981; Guo and Hu 1995; Weihe et al. 2009; Apitz et al. 2013). All 17 species in this clade display the ability to transmit (and accept) plastids from either parent (Breman et al. 2020). Given the ubiquity of cytoplasmic biparental inheritance in Ciconium, it could be more widespread throughout the genus but this has not been studied yet. The level of chlorosis in interspecific offspring in this clade was found to be dependent on the specific plastome/nuclear genome combination (Tilney-Basset 1984; Tilney-Basset et al. 1992; Breman et al. 2020).

The plastomes of Pelargonium species are structurally rearranged (Röschenbleck et al. 2017) when compared to the generally conserved angiosperm plastome structure (Wicke et al. 2011). This is probably due to the frequent occurrence of small and middle-sized repeats, which can act as sites for non-homologous rearrangement (see Ruhlman and Jansen 2018 for a review of Geraniaceae plastome properties). In addition, Pelargonium plastomes are among the largest known for angiosperms (e.g. ~275 kb for P. × hortorum, Chumley et al. 2006; Weng et al. 2012, 2017). Combined with the unusual length variation in rpo gene sequences outlined above, these unique features make Ciconium plastomes an interesting test case for studying the genetic and molecular basis of the observed pCNI in hybrid offspring. We therefore study the inheritance of chloroplasts in interspecific Pelargonium crosses for the entire clade and relate chlorotic phenotypes to the unusually high PEP structural variation encountered in section Ciconium, also at the protein-structure level. Having the complete complement of species for a clade allows us to compare all extant structural variants and evaluate the differences in detail.

Given their roles in expression regulation by the nucleus and involvement in plastid gene transcription (Demarsy et al. 2006), we consider the rpo genes as relevant in explaining chlorosis. We therefore explore the possible effects of the rpoB, rpoC1 and rpoC2 length and sequence variation in Ciconium species on their occurrence of CNI. We do that by describing and matching peptide physico-chemical properties and modelled peptide structures in the Ciconium species compared. We then explore whether correlation exists between peptide structure and chlorosis phenotypes and speculate what the structural effects could be of the observed sequence variation.

Materials and methods

Plant material, DNA extraction and sequencing

Plant material was obtained from previous research (Breman et al. 2021) or collected from herbaria and living collections (Table 1). Plant DNA was extracted using an adjusted CTAB protocol (Bakker et al. 1998) followed by RNase treatment. The obtained DNA extracts were sent to Novogene Inc. (Cambridge and Hong Kong) for Illumina HiSeq sequencing. Read libraries were generated from 1.0 μg genomic DNA using the NEBNext DNA Library Prep Kit following the manufacturer’s protocols, with genomic DNA randomly fragmented by shearing to ∼350 bp. Fragments were subsequently subjected to end polishing, A-tailing, and ligation to the NEBNext adapter for Illumina HiSeq sequencing (Illumina, Inc. San Diego, USA) (Breman et al. 2021) with an average coverage of 0.5–1 X. Throughout the text we use four-letter acronyms for each accession used; see Table 1 for the corresponding species names.

Table 1 Plant materials used in this study, along with herbarium voucher information

Establishment of four F1 crossing series

Methods for generating and establishing F1 hybrids in this study were described previously (Breman et al. 2020) for the domesticated P. × hortorum crossing series. We generated three additional crossing series, using three wild species that represent the phylogenetic diversity across the Ciconium clade (van de Kerke et al. 2019), with which we crossed all available section members (Table 1). The three wild species used were P. barklyi (‘BARK’), P. multibracteatum (‘MULT’) and P. acetosum (‘ACET’), which were placed in different clades in our repeatome-based phylogenetic tree (Breman et al. 2021). A visual overview of the total of four crossing series (including P. × hortorum) is given in Fig. 1. Fifteen additional interspecific crosses from other, incomplete, crossing series (ALCH4X × BARK, ALCH4X × FRUT, ALCH4X × YEME, ARID × QUIN, FRUT × ACET, FRUT × BARK, INQU × TONG, PELT × ACET, PELT × ALCH, PELT × QUIN, QUIN × ARID, TONG × ACET, YEME × ALCH4X, ZONA × MULT, ZONA × QUIN) were also analyzed. We used embryo rescue (ER) for all F1’s to maximize the number of offspring for evaluation of chlorosis phenotypes, thus circumventing hybrid incompatibilities caused by failure of endosperm development during normal seed development.

Fig. 1

Experimental setup for the four comprehensive Ciconium crossing experiments. The four species indicated on the left were selected as mother plants. The P. × hortorum series is from Breman et al. (2020). The paternal accessions are listed to the right and their respective floral and leaf phenotypes are shown below the species names. The paternal accessions are arranged according to decreasing phylogenetic distances relative to P. × hortorum based on a repeatome-based phylogenetic analysis by Breman et al. (2021). The empty black squares indicate the mother plant in the series

Assessing phenotypic effects of CNI

Phenotypes reflecting severity of plastome-induced CNI (‘pCNI’) are listed in Tables 2 and 3. They are distinguished here according to the following syndromes: I) ‘green’—no virescence or chlorosis observed; II) ‘near green’—virescence occurs under extreme physiological conditions; III) ‘mildly chlorotic’ —plants were always chlorotic, but never lethal; IV) ‘severely chlorotic’—plants were always chlorotic and easily turned yellow or lethal; and V) ‘lethal’—plants were always yellow or white. In the case of uniparental inheritance of organelles, the correlation of phenotype with genotype is straightforward. In an equal F1 nuclear genomic background, a direct connection can be established between the observed pCNI phenotype and the ‘responsible’ chloroplast genotype in an F1. In the case of variegated offspring, which we suspected contained chloroplasts inherited from both parents, we tested, when possible, green and white parts of leaves separately.

Table 2 Categories of pCNI
Table 3 Summary genotypes and phenotypes of F1 interspecific crosses from Pelargonium section Ciconium

Flow cytometry

The average total genomic content per cell (2C value expressed in pg) was determined using flow cytometry (Iribov SBW, the Netherlands) for all 19 accessions. As a reference for the size estimates, we used P. × hortorum PEZ-BD8517 with known ploidy (2×) and total genome size (2C = 2.33 pg). The measurements were done on freshly collected leaf material using a Partec CA-II flow cytometer (De Laat et al. 1987). Nuclei were stained with a High-Resolution Kit (Partec).
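The conversion from fluorescence peaks to a 2C value can be sketched as follows. This is a minimal Python illustration that assumes the standard ratio method with a reference of known genome size; the function name and the peak values are hypothetical, and only the reference value of 2.33 pg is taken from the text.

```python
def estimate_2c(sample_peak, reference_peak, reference_2c_pg=2.33):
    """Estimate a sample 2C value (pg) from mean fluorescence peak positions.

    Assumes the usual ratio method: genome size scales linearly with the
    fluorescence of stained nuclei relative to a reference of known size.
    """
    if reference_peak <= 0:
        raise ValueError("reference peak must be positive")
    return (sample_peak / reference_peak) * reference_2c_pg


# Hypothetical peak positions (arbitrary channel units), not measured data.
example = estimate_2c(sample_peak=200.0, reference_peak=100.0)
print(f"Estimated 2C: {example:.2f} pg")
```

A sample peak at twice the reference position would thus be read as roughly double the reference genome size, which is how polyploid offspring show up in such measurements.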

Plastome assembly

Plastomes were assembled with GetOrganelle (Jin et al. 2019) using default settings, except for the assumed insert size, which we set to 350. Contigs were visualized and assessed using Bandage (Wick et al. 2015), and final contigs were concatenated using MEGA7 (Kumar et al. 2016) and subsequently aligned in a multiple sequence alignment (MSA) with all accessions, using MAFFT (Katoh et al. 2019).

Organellar genotyping using PCR markers

We used diagnostic PCR to genotype the inherited chloroplasts following the methods and primers developed in previous research (Breman et al. 2020), with the aim of tracking the types of plastome inherited across the generations.

Ciconium rpo sequence variation

We assessed sequence variation in the rpoB, rpoC1 and rpoC2 genes. Where we encountered length variation (>5 aa residues) in exons, we explored its functional relevance by checking codons, translating the sequence to amino acids and determining physico-chemical properties for the inferred peptides (Table 4). The R-package ‘Peptides’ (Osorio et al. 2015) was used to calculate the weight (Da), grand average of hydrophobicity (GRAVY) index, aliphatic index (‘aI’), iso-electric point of zero charge (‘IEPoZC’) (pH) and net charge at pH 7 (C) of each amino acid sequence, over both the full-length exon and the variable regions.
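Two of these indices have simple closed-form definitions, which the sketch below illustrates. It is written in Python for illustration only and is not the R code used in this study. It assumes the Kyte-Doolittle hydropathy scale for GRAVY and the Ikai (1980) formula for the aliphatic index; the function names and example peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter code).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index after Ikai (1980), using mole percentages of A, V, I, L."""
    seq = seq.upper()
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical toy peptides, not sequences from this study.
print(round(gravy("TEY"), 3))
print(round(aliphatic_index("AVIL"), 1))
```

Net charge and iso-electric point additionally depend on the chosen pKa scale, so they are left out of this sketch.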

Table 4 rpoB peptide physico-chemical properties of the aa sequence across the full length

Three-dimensional (3D) homology models of Ciconium rpoB and rpoC1 were generated using the SWISS-MODEL protein homology modeling server (Bertoni et al. 2017; Bienert et al. 2017; Waterhouse et al. 2018; Studer et al. 2020, 2021). As a template, we used the crystal structure of Thermus thermophilus transcription initiation complex (TIC) (PDB ID: 4g7h, Zhang et al. 2012) which is the PEP bacterial homolog.

The PyMOL software (https://www.pymol.org/) was used to visualize the 3D homology model, compare it to the T. thermophilus transcription initiation complex, calculate distances, and prepare Figs. 4 and 5. The quality of the homology models was assessed using the GMQE and QMEAN scoring functions (Biasini et al. 2014).

Correlation of phenotype with rpo types

The correlation (R2) between rpo genotypes and observed leaf phenotypes was assessed by performing a linear regression of the observed phenotype in each F1 cross on the paternal/maternal difference in each of four physico-chemical properties of the rpo aa-sequences. The four properties analyzed are: GRAVY, aI, IEPoZC and net charge at pH 7. The analyses were performed on rpoB and rpoC1 separately as well as on rpoB+C1 for the two most complete crossing series (the F1’s of the P. × hortorum × Ciconium series and of the P. multibracteatum × Ciconium series). The results are listed in Table 5.
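The regression step can be sketched as follows. This minimal Python example assumes that the pCNI classes I to V are coded as the integers 1 to 5 and that each cross contributes one property difference between the parental peptides; both the coding and the numbers are hypothetical and serve only to show how an R2 value is obtained from a simple linear regression.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    For one predictor with an intercept, R2 equals the squared Pearson
    correlation between x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    return (sxy ** 2) / (sxx * syy)


# Hypothetical data: net-charge difference between parental rpoB peptides
# versus pCNI class of the F1 (assumed coding: class I = 1 ... class V = 5).
charge_difference = [0.5, 3.0, 6.5, 9.0, 17.0]
pcni_class = [1, 2, 2, 4, 5]
print(round(r_squared(charge_difference, pcni_class), 3))
```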

Results

Confirmed F1 hybrids

To compare the effects of different chloroplasts on chlorosis, we created a total of 30 verified F1 interspecific hybrids (see Table 3) over four crossing seasons, by crossing P. acetosum, P. barklyi and P. multibracteatum with all other available Ciconium species. In addition, crosses involving P. × hortorum were available from our previous study (Breman et al. 2020), together with the genotypes of each offspring from that study. Using embryo rescue, we obtained offspring for nearly all interspecific crosses for which fruit and seed set was observed, and we observed all modes of plastid inheritance (i.e. paternal, maternal and biparental) (Table 3). In most cases, at least some individuals of a progeny died quickly after germination or transplantation to the greenhouse. This was especially the case for MULT × ZONA, MULT × ACET, ACET × ZONA and BARK × QUIN, from which a maximum of one, but usually no plants at all, survived transplantation to the greenhouse.

For the P. acetosum series, we obtained six F1 interspecific hybrids (see Table 3). Two were from reciprocal crosses (MULT × ACET and PELT × ACET) and these were chlorotic and sterile (pCNI class III). Progeny of the cross ACET × ZONA was always lethal (pCNI and mCNI class V), as the plants never flowered. ACET × TONG, ACET × FRUT and ACET × INQU yielded plants that were green or near green and were partially male-fertile (pCNI class II).

For the P. barklyi series, we obtained four verified F1 plants (BARK × FRUT, BARK × MULT, and BARK × QUIN). For BARK × INQU, we were unable to verify the status based on phenotype and it will not be considered further here. The three remaining crosses were chlorotic and sterile dwarfs (pCNI class III). Four more were obtained from reciprocal crosses (FRUT × BARK, MULT × BARK, HORT × BARK, ALCH4x × BARK). In addition, BARK × HORT was lethal as the plant was white, did not survive outside the laboratory and never flowered (pCNI class V). Two plants (BARK × MULT, and ALCH4x × BARK) were chlorotic and infertile (pCNI class III). One (BARK × FRUT) was severely chlorotic and flowered only once with sterile flowers, and the plants were dwarfs (pCNI class IV).

For the P. multibracteatum series, we obtained eight F1 hybrids (see Table 3). Two of these (MULT × ALCH, MULT × QUIN) were green or near green and partially fertile (pCNI class II). Two (F1 MULT × ARID, MULT × BARK) were chlorotic and infertile (pCNI class III). Three were severely chlorotic and flowering was never observed (F1 MULT × ACET and MULT × ZONA, pCNI IV, mCNI V). Pelargonium multibracteatum crosses with INQU, FRUT and TONG did not yield a single plant despite two full seasons of crossing attempts. Finally, P. multibracteatum crosses with ACET or ZONA rarely yielded progeny, and when they did the plants nearly always carried the MULT plastid. We found one case where the F1 MULT × ACET also carried the ACET plastid, alongside that from MULT. From the other crosses we obtained a further 15 F1’s, using additional Ciconium species as parents. From these series, the crosses of P. yemenense sp. nov. or P. alchemilloides (4x) with other accessions (P. barklyi and P. frutetorum) stand out, as they were performed using parental accessions with different ploidy levels, resulting in polyploid offspring (see Fig. 2). We found maternally, paternally and biparentally inherited plastids across all F1 offspring (Table 3), and in some cases clear differences exist with respect to the plastid type inherited and the resulting chlorosis in the offspring. All chlorosis- and fertility-related phenotypes for the four crossing series are displayed in Table 3. All individual plants, together with their chlorosis phenotypes and plastid genotypes, are listed in Additional file 1 (see below).

Fig. 2

Spiderweb diagram displaying Cx-values for the F1 Pelargonium multibracteatum crossing series showing parental and F1 hybrid genome sizes. The vertical axis displays the obtained Cx-values. The polyploids are indicated with ϕ. Species and accessions are more or less phylogenetically arranged. Two parental species flank each F1 hybrid (i.e., a P. multibracteatum parent crossed with another species from P. sect Ciconium)

Table 5 Correlation of phenotype and rpo physico-chemical properties by linear regression

Flow cytometry

In order to check for polyploidy, we performed flow cytometry on F1 hybrids. The results show that F1 hybrids have a C-value that is intermediate between the two parents (Fig. 2). The most striking result is the frequency of polyploids, especially for the MULT-series (Fig. 2) and the other crosses (see Additional file 1). We were unable to obtain flow cytometry readings for all verified hybrids because some of these never made it past the embryo rescue stage or died quickly after germination or transplantation to the greenhouse, which left us with insufficient material for analysis.

Plastome-typing

We performed PCRs to determine the plastome types inherited, and our results indicated that plastomes were inherited from either parent (Table 3).

For the P. multibracteatum series the results were conflicting. The individual plant from which the DNA was extracted and the plastome assembled tested as expected, but other individuals from the same P. multibracteatum donor population also contained a non-multibracteatum genotype.

Ciconium rpo sequence variation and physico-chemical properties

To determine the level and type of variation in rpo genes we aligned the rpoB, rpoC1 and rpoC2 sequences for all accessions. From this alignment we noted three regions containing significant variation (Fig. 3a-c), both in length and in amino acid composition. These ‘variable indel’ (vind) regions will be referred to as the ‘rpoB-vind’, ‘rpoC1-vind1’ and ‘rpoC1-vind2’ regions. The alignment for rpoC2 showed length variation of one to three aa residues at nine different sites (not shown); these sites are not considered ‘vind’ regions here.
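How length-variable regions can be flagged in an alignment is sketched below. This Python example is an illustration under stated assumptions rather than the procedure used here: it treats every maximal run of alignment columns containing at least one gap as a candidate region and reports it when the accessions differ by more than five residues in that run, matching the >5 aa criterion from the methods. The function name, accession labels and toy sequences are hypothetical.

```python
def variable_indel_regions(alignment, threshold=5):
    """Return (start, end, length_difference) for gap-containing column runs.

    `alignment` maps accession names to aligned sequences of equal length,
    with '-' as the gap character. Positions are 0-based and `end` is
    exclusive. A run is reported when the number of residues it contains
    differs between accessions by more than `threshold`.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have equal length")

    # Mark columns in which at least one accession has a gap.
    gappy = [any(alignment[name][i] == "-" for name in names)
             for i in range(length)]

    regions = []
    start = None
    for i, flag in enumerate(gappy + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            counts = [sum(c != "-" for c in alignment[name][start:i])
                      for name in names]
            difference = max(counts) - min(counts)
            if difference > threshold:
                regions.append((start, i, difference))
            start = None
    return regions


# Hypothetical toy alignment, not data from this study.
toy = {
    "ACC1": "MKTTEYTEYTEYLV",
    "ACC2": "MKT---------LV",
    "ACC3": "MKTTEY------LV",
}
print(variable_indel_regions(toy))
```

With the default threshold, sites that vary by only one to three residues, as described for rpoC2, would not be reported.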

Fig. 3

Amino acid sequence alignment of Ciconium rpo-vind regions. rpoB-vind (a), rpoC1-vind1 (b) and rpoC1-vind2 (c). Numbering indicates amino acid position in the exon. Color codes for amino acids are listed at the right of the figure (a). The regions are flanked by non-length variable regions (not shown)

A striking feature in the Ciconium rpoB genes is the occurrence of insertions of three amino acids, threonine (T), glutamic acid (E) and tyrosine (Y), in various combinations. These ‘TEY’ motif insertions are largest in P. aridum (14 amino acids in total) and absent in P. inquinans, P. acetosum, P. × hortorum, P. × salmoneum and P. frutetorum (Fig. 3a). For rpoC1-vind1, insertions consist of serine (S) and glycine (G) in ZONA, ACRA and TONG (Fig. 3b). For rpoC1-vind2, length variation is caused by the insertion of E, T and arginine (R) at this site (Fig. 3c).

To assess what consequences these variations may have on protein function, we calculated some of their physico-chemical properties: weight (Da), GRAVY index, ‘aI’, ‘IEPoZC’ (pH) and net charge at pH 7 (C) (Table 4). Across the 20 rpoB and rpoC1 sequences, the ‘IEPoZC’ and the net charge at pH 7 show the most striking differences between accessions. The highest net charge for rpoB is found in P. acetosum (21.5 C) and the lowest in P. aridum (4.5 C), a difference of 17.0 C. Remarkably, for these two species the pattern is reversed for rpoC1: P. acetosum has the lowest value (23.9), whereas P. aridum has the second highest (40.8). The corresponding iso-electric points also differ the most between these two accessions. For rpoB, P. acetosum has the highest IEPoZC (pH 9.3), whereas for rpoC1 its IEPoZC (pH 9.7) is the lowest of all accessions. For P. aridum, the IEPoZC is at pH 8.0 for rpoB and at pH 10.3 for rpoC1.

The P. acetosum peptides were selected as examples of our modelling because of the high number of amino acid changes and presence of a large deletion in rpoB and a large insertion in rpoC1 vind1 compared to other Ciconium species (Fig. 3a, b, c). The T. thermophilus rpoB and rpoC1 were identified as the best template available because these showed the highest sequence overlap (98.5% and 95.5%) and sequence identity (37.86% and 60%) with P. acetosum rpoB and rpoC1. For rpoC2, only partial templates covering the N-terminal part of the sequence were available (sequence overlap 28%, sequence identity 40.16%). Therefore, we could model the first 331 residues only. Based on that, all rpoC2 variants yielded a structurally similar model.

Structure modelling

The P. acetosum homology models (Figs. 4a and 5a) for rpoB, rpoC1 and rpoC2 yielded QMEAN scores of -2.74, 0.63 and 0.45 and GMQE scores of 0.67, 0.53 and 0.16, respectively. The values for rpoB and rpoC1 are considered ‘reasonable’ in terms of reliability of the model, but for rpoC2 they are low.

Fig. 4

Thermus thermophilus Transcription Initiation Complex (TIC, PDB ID: 4g7h) and 3D homology model of Pelargonium acetosum rpoB. (a) The full T. thermophilus transcription initiation complex. The four separately encoded parts are indicated with colouring, with yellow denoting subunit α, blue subunit β, teal subunit β’ and red subunit β’’ (although in the bacterial model this is encoded in a single gene together with β’). The white arrow indicates the rpoB-vind homologue; (b) The plastid-encoded RNA polymerase (PEP) β subunit overlaid with the P. acetosum model in gold. Arrows indicate regions unique to the bacterial model when compared to P. acetosum (grey arrow) and the rpoB-vind region (white arrow); (c) the P. acetosum PEP β subunit with the rpoB-vind region indicated in purple (and white arrow), template RNA indicated in green and the sigma factor in light blue; (d/e) zoom-in on the P. acetosum PEP β subunit region of interest (ROI) and the distances (in Å) of interaction between template ROI and RNA/sigma factor (based on the bacterial model). Distances >20 Å are not indicated. Line colors correspond to the colors on the scale

Fig. 5

rpoC1-vind structural interactions with DNA/RNA and σ-factor. (a) The full Thermus thermophilus transcription initiation complex with the P. acetosum model overlaid. The four separately encoded parts of the transcription initiation complex are colored independently, with yellow denoting subunit α, blue subunit β, teal subunit β’ and red subunit β’’ (although in the bacterial model this is encoded in one gene together with β’). The white arrows indicate structures unique to rpoC1-vind. Template DNA/RNA is shown in orange-green and the σ-factor in light blue. Estimated physical distances are in Å; (b) The PEP β’ subunit overlaid with the P. acetosum model in white (bacterial part in grey, which in part consists of β’’). Arrows indicate the unique rpoC1-vind1 (yellow) and rpoC1-vind2 (purple) regions for the Ciconium model; (c) Zoom-in on the P. acetosum PEP β’ subunit with the rpoC1-vind1 region in yellow; (d) Zoom-in on the P. acetosum PEP β’ subunit rpoC1-vind2 region of interaction between template ROI and RNA/sigma factor (based on the bacterial model)

The 3D model of the T. thermophilus PEP homolog (PDB ID: 4g7h) showed that the rpoB-vind, rpoC1-vind1 and rpoC1-vind2 regions are not in contact with the other subunits but are in close vicinity of the sigma factor (rpoC1-vind1) and template DNA/RNA (rpoB-vind and rpoC1-vind2) (Figs. 4d, e and 5c, d). Given the QMEAN and GMQE values for the bacterial and the P. acetosum rpoB and rpoC1 models, the general structure observed in the bacterial polymerase is probably conserved in the P. acetosum PEP, at least for these two subunits.

The rpoB model showed a high structural similarity with its bacterial homologue, except for two unique features. The first one is a region that consists of three long helices in T. thermophilus but of two shorter helices in P. acetosum. In the P. acetosum rpoB model, this region is located at the extreme opposite of the rpoB-vind region (Fig. 4b) and will not be further discussed. The second one is the rpoB-vind region itself, where Ciconium either possesses a longer alpha helix or an unordered region, and near which T. thermophilus possesses a unique beta-sheet (Fig. 4b). According to this model, the P. acetosum rpoB-vind region consists of two alpha helices connected by an unordered region (Fig. 4c, d). In summary, our modelling of all Ciconium rpoB sequences resulted in four rpoB structural variants (Fig. 4a, b, c, d, e): structure a was observed for ELON, ACET, FRUT, INQU, SALM and HORT; structure b for ALCH2x & 4x, ARID, BARK, INSU, OMAN, QUIN, SOMA and ZONA; structure c for MULT,YEME, RANU, ARTI, ACRA and TONG; and structure d for PELT only.

Similarly, for rpoC1, three regions are variable between the plant and bacterial models. One is a region far removed from the rpoC1-vind1 and rpoC1-vind2 regions (not shown), where the plant protein contains two alpha helices, whereas the bacterial model has two beta sheets. The other two are the rpoC1-vind1 and rpoC1-vind2 regions (see Fig. 5). Modelling of these regions for all Ciconium sequences revealed that the rpoC1-vind1 region is defined by two types of unordered regions, i.e. a long and a short one, as a direct result of the sequence length variation. Modelling of rpoC1-vind2 resulted in three alternative structures: a single-loop alpha helix flanked by two unordered regions (INQU, HORT, SALM and FRUT), two short beta sheets (ACET only), and an unordered region at this site (all other accessions). For details see Additional file 2.

In our homology-based model the distances between the rpoB/C1-vind regions and the σ-factor/template DNA/RNA range from 6-10 Å for rpoB (Fig. 4d), 12-18 Å for rpoC1-vind1 (Fig. 5c) and 11-14 Å for rpoC1-vind2 (Fig. 5d).
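Such distances are Euclidean distances between atom coordinates in the model. The measurements reported here were made in PyMOL; the Python sketch below only illustrates the underlying calculation, taking the shortest distance between any atom of one region and any atom of another. The function name and all coordinates are hypothetical.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Shortest Euclidean distance (in the units of the input, e.g. Å)
    between any atom in atoms_a and any atom in atoms_b.

    Each atom is an (x, y, z) coordinate tuple.
    """
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


# Hypothetical coordinates in Å, not taken from the homology models.
vind_region = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
template_strand = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
print(min_distance(vind_region, template_strand))
```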

Besides rpoB and rpoC1, the PEP complex in plants consists of two more subunits: rpoA and rpoC2 (or α and β’’). The gene encoding rpoA is located in the repeat-rich region of the Pelargonium plastome and could not be assembled (Ruhlman et al. 2018). For rpoC2, the (low) sequence variation is discussed above.

Correlation of leaf phenotype with rpo types

There is no strong correlation between the observed phenotype in a cross and the physico-chemical properties of the rpoB and rpoC1 peptides (see Table 5). The maximum correlation was found for the differences in net charge (R2 0.66) and in IEPoZC (R2 0.61) in the P. × hortorum crossing series for rpoB and rpoC1 sequences combined. For the MULT series, correlation was also highest for the IEPoZC and net charge. For the IEPoZC, it was highest when rpoB and rpoC1 were combined (R2 0.5096), whereas for the net charge it was highest for the rpoB peptide alone (R2 0.4972). All correlation values and graphs are displayed in the supplementary materials (Additional file 3).

Discussion

To study Ciconium CNI in detail, we generated a total of 30 verified F1 interspecific hybrids over four crossing seasons. Given the high incidence with which nearly every wild species used in our study transmits plastids to the next generation, we conclude that biparental inheritance of chloroplasts occurs widely in Ciconium and that this is a property of the parent species, not an artifact of hybridization (Tilney-Basset et al. 1984, 1989a; Breman et al. 2020). We then explored the highly variable Pelargonium sect. Ciconium rpo genes and PEP protein structure and found that their sequence variation (which is actually higher than that found across angiosperms) leads to protein structural variation. We hypothesize that PEP structural variation in P. sect. Ciconium results in evolutionarily unstable structures, which in turn may result in a more error-prone process of transcription. We modelled the PEP structures as they occur in Ciconium and found that the major structural differences occur at sites that are in close contact with template dsDNA or processed ssRNA (Figs. 4 and 5). Of particular interest is the absence of an α-helix or extended unordered region in the rpoB-vind region, as well as the presence of an α-helix in rpoC1-vind2, in the P. × hortorum/inquinans/frutetorum PEP structures. The plastomes of these species consistently lead to more lethality in our crossing series than any other plastome type. We hypothesize that PEP is not only evolutionarily unstable but has also co-evolved with the nuclear-encoded sigma factors (Zhang et al. 2013; Postel et al. 2022, for an example in Silene) and possibly with other nuclear-encoded organelle management genes such as those encoding PPR proteins, to the point where the sigma factors from other species cannot interact properly with the different PEP subunits, leading to impaired function or even total cessation of development. The P. acetosum plastome is often lethal in crossings (its crossing series in our study were not successful), probably due to similar causes. The P. acetosum PEP also lacks the α-helix or extended unordered region in the rpoB-vind region and, uniquely, contains two beta sheets in rpoC1-vind2 (Fig. 5d). The P. × hortorum crossing series was more successful in terms of established F1 plants, probably because P. × hortorum is itself a hybrid that has already gone through several cycles of selection for, among other traits, the ability to yield a green and robust plant. Further support for this comes from the fact that crosses with the main ancestor of P. × hortorum (P. inquinans) were also not that successful. Handling of template DNA/RNA by the PEP enzyme is dependent on refined and highly localized charge variations (Zhang et al. 2012; Sutherland and Murakami 2018). The local physico-chemical effect of a single aa change can already alter the way DNA enters the PEP complex (or the way RNA is exported), and the Ciconium sequences have many such changes. The fact that DNA/RNA handling sites appear to be affected may explain the occurrence of repeats and possibly also leads to selection pressure on promoter sites in the plastome (not shown).

We hypothesized that this variability might explain CNI as measured by chlorosis in Pelargonium F1 interspecific hybrids. However, we reject the hypothesis that rpo gene variation alone can predict chlorosis in P. sect. Ciconium interspecific crosses: the correlations were at best ~60% (see Table 5 and below).

Asymmetric inheritance due to lethal and non-lethal CNI

Chlorotic effects of different plastids on offspring are often asymmetric, meaning that reciprocal crosses usually differ in the transmission of pCNI to F1 offspring. Furthermore, different chloroplast types induce different pCNI in crosses with equal nuclear genomic backgrounds. It is hard to explain the differences in chlorosis among F1 hybrids when considering only the physico-chemical properties or sequences. If the nuclear genomic background is equal, why does one chloroplast type perform better in this background than another? Yet we sometimes see pronounced differences in phenotypes (e.g. the P. × hortorum/P. inquinans type chloroplast almost always yields a more chlorotic F1 plant). We therefore investigated the variation of the PEP protein structure in P. sect. Ciconium species.

Evolutionarily unstable PEP structures in P. sect. Ciconium

Plastid-encoded polymerase in plants is not well studied; often a bacterial model is assumed (Escherichia coli or Thermus thermophilus) because of the chloroplasts’ cyanobacterial origins (Cavalier-Smith 1982). Given that three of the four plant subunits that make up PEP (rpoB or β, rpoC1 or β’ and rpoC2 or β’’) have bacterial homologues, this still allows us to deduce functional regions that may be affected by the changes. The rpoB and rpoC1 peptides are, when in the PEP complex, responsible for transcription (Sutherland and Murakami 2018). As stated above, the variable regions of Ciconium PEP (and genus-wide, Zhang et al. 2015) lie in parts of the subunits near the point where template dsDNA is taken into the enzyme (rpoC1-vind1) (Saecker et al. 2011; Sutherland and Murakami 2018). Here the unit interacts with the σ-factor and newly opened ssDNA (rpoB-vind; see Fig. 4) (Saecker et al. 2011; Sutherland and Murakami 2018). Finally, rpoC1-vind2 is located near sites that interact with the σ-factor. We propose, based on our modelling, that PEP variation in Pelargonium affects the way template DNA is ‘handled’ during transcription and the way RNA is ‘exported’.

The PEP variants may also differ in their ability to correct mistakes, or may produce slippage during transcription, and this may explain the abundance of repeat-rich regions in Pelargonium plastomes. Water availability, which is frequently limited in the natural range of Ciconium species, determines pH in the plant cells. Changes in pH would alter the physico-chemical dynamics of the many proteins involved and therefore necessitate changes in the way the chloroplast is expressed, because water is essential to photosynthesis. We hypothesize that the ubiquitous occurrence of biparental inheritance of chloroplasts is an evolutionary adaptation to ‘cope’ with the potential for high plastome variation brought on by the variation in the PEP enzyme (although the reverse could also be hypothesized; Robin van Velzen pers. comm.).

RpoA and rpoC2

rpoC2 is the part of the enzyme needed for proper folding of the PEP complex (Igloi and Kussel 1992; Sutherland and Murakami 2018). Changes in the sequence of this gene may represent adjustments to the sequence changes in rpoB and rpoC1, ensuring a properly folded structure.

We were unable to reliably assemble rpoA from our Illumina data, as it is distributed among several contigs in the repeat-rich region of the plastome. It is still considered to form a functional part of the enzyme in Pelargonium (Blazier et al. 2016). In addition, rpoA has no functional homologue in the bacterial model. The plant rpoA subunit has diverged so much from its bacterial ancestor that, when transplanted into a bacterium, it no longer functioned in the polymerase, whereas this is no problem for the other subunits (Suzuki and Maliga 2000). Given the high sequence variation in Pelargonium (Weng et al. 2016) and especially Ciconium rRNA sequences (compared to all other angiosperms, Breman et al. in prep.), we suspect that rpoA is also highly variable and may contain structural variants as well. The few Ciconium rpoA sequences available from GenBank all appear to be functioning operons (Blazier et al. 2016), but do not align well at either the nucleotide or the amino acid level. Further studies using long-read sequencing technologies should be conducted to get a better picture of rpoA gene structure.

Lack of correlation between phenotype and rpoB and rpoC1 physico-chemical properties

We compared phenotypic effects against two operons of one gene only and could not find a significant correlation (Table 5). While PEP is essential for gene expression in the chloroplast and appears to be under positive selection in Ciconium (Breman et al. 2021), it is not the only gene under positive selection (Breman et al. 2021a; Breman 2021b; Weng et al. 2016). Therefore, there may be other plastid-encoded genes that are improperly expressed, such as uS19c or the genes encoding ribosomal RNA (rrn23), as these are also highly diversified in Ciconium (Breman et al. 2021b). Given the critical importance of the ribosome for proper function of the chloroplast (Tiller et al. 2014 and references therein), the differences between the genes encoding the various ribosomal elements (proteins and rRNA) may well also play a role in the occurrence of chlorosis in F1 interspecific hybrids of Ciconium.

Another cause for the lack of correlation observed between leaf phenotypes and rpo may be that the peptides have more than one function. Part of their function is maintaining structure, which may favor amino acid changes throughout the sequence. Given that our approach targets the entire sequence, this may result in ‘noise’ from these other constraints.
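As a purely illustrative sketch of the kind of comparison discussed in this section, the Python snippet below computes a simple physico-chemical distance between two aligned peptides and correlates such distances with chlorosis scores. Everything in it is an assumption made for illustration: the choice of the Kyte-Doolittle hydropathy scale as the physico-chemical property, the function names `hydropathy_distance` and `pearson_r`, the optional `region` argument, and any sequences or scores passed to it. It is not the procedure used to produce Table 5.

```python
from math import sqrt

# Kyte-Doolittle hydropathy index. Using this scale is an assumption made for
# illustration; the distance measure used in the study may differ.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_distance(seq_a, seq_b, region=None):
    """Mean absolute hydropathy difference between two aligned peptides.

    Alignment columns with a gap ('-') or an unknown residue in either
    sequence are skipped. `region` is an optional (start, end) pair of
    0-based alignment columns (end exclusive), e.g. a variable region.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    start, end = region if region is not None else (0, len(seq_a))
    diffs = [
        abs(KD[a] - KD[b])
        for a, b in zip(seq_a[start:end], seq_b[start:end])
        if a in KD and b in KD
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

Under these assumptions, one would compute `hydropathy_distance` for the plastid types involved in each cross and pass the resulting distances, together with the corresponding chlorosis scores, to `pearson_r`. Supplying `region` restricts the distance to a chosen stretch of the alignment, which is one conceivable way to reduce the whole-sequence ‘noise’ mentioned above.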

The nuclear genomic perspective

Other factors also influence the expression and regulation of PEP and thereby the chloroplast (Siniauskaya et al. 2016), including the sigma (σ) factors (Zhang et al. 2015), PPR genes (Wang et al. 2021) and Whirly genes (Maréchal et al. 2009; Isemer et al. 2012). The PPR genes seem to be reduced in number in P. × hortorum when compared with other angiosperms (Zhang et al. 2013) and may be interesting candidates for further research in Pelargonium and CNI. Previous crossing experiments demonstrated clearly that nuclear genomic factors play a role in expression and regulation of chloroplasts (Tilney-Bassett et al. 1989b, 1992; Breman et al. 2020) and these must be taken into consideration when studying CNI. The same holds for the ‘interactome’ (Westrich et al. 2021), which is the set of mainly nuclear-encoded proteins that fulfill numerous roles during assembly and maturation of PEP (Shikanai and Fujii 2013). Mitochondrial expression and regulation are managed by the nucleus as well, given the presence of mitochondrially targeted nuclear-encoded polymerases, but the extent of coevolution remains untested, at least for Pelargonium.

Our 3D homology model of P. acetosum rpo and its comparison with the T. thermophilus PEP allow the rpo-vind regions to be studied in a broader context. Our comparisons seem to indicate that this region is located on the surface of the protein and interacts with both the target DNA and the σ-factor (Figs. 4 and 5). Substitutions, as well as the occurrence of indels in Ciconium rpoB, are reminiscent of what has been recorded for Escherichia coli, where it was demonstrated that changes in the transcription initiation (TI) complex may result in arrested transcription and subsequently in double-strand breaks (Dutta et al. 2011), although the exact nature of the change in the sequence may be different in plants when compared to the bacterial homologue. Nevertheless, the changes in Ciconium rpoB sequence may be affecting the ‘cross-talk’ between DNA repair, replication and transcription as well. Interestingly, the changes in the TI complex were found to be associated with changes in ribosomal activity in E. coli (Dutta et al. 2011), to compensate for changes in transcription speed.

Changes in cross-talk, in addition to the previously detected changes in the replication, recombination and repair (RRR) machinery in Geraniaceae (Zhang et al. 2016), may partially explain the numerous indels and rearrangements found in the Pelargonium (and Geraniaceae) plastomes (Röschenbleck et al. 2017; Ruhlman et al. 2018). Changes that affect the transcriptional process in particular are implicated in genomic disruption (Kim and Jinks-Robertson 2012; Sebastian and Oberdoerffer 2017) and should be considered in future studies of plastome evolution. Changes in the ribosomes of Ciconium were hypothesized in other research (Breman et al. 2021), in which the rRNA backbone of the large subunit and two ribosomal proteins were modelled.

Biparental inheritance of organelles and speciation

Plastids are undeveloped during and directly after fertilization and seed development, whereas the mitochondria are active during these phases. Thus, any mCNI effect would be stronger than pCNI at crucial early developmental stages. This would explain the high number of aborted embryos and empty seeds found on all our F1 plants (not shown; for examples see Breman et al. 2020). Given that the mother plant is ‘responsible’ for supplying energy to the development of the seeds, it is logical that there is a strong maternal bias. However, plastids are introduced to the embryo via pollen (Kuroiwa et al. 1992, 1993) and are sorted out, or expressed/developed incompletely in Pelargonium, early in development (Kirk and Tilney-Bassett 1967; Weihe et al. 2009). Thus, they can be present in all tissues early in development (Guo and Hu 1995). Whereas mCNI hardly plays a role in the post-seedling phase of the plant, pCNI is considered to determine further survival during the vegetative phase of life. Our results lend support to the idea that biparental inheritance of organelles could provide an ‘escape’ from CNI (Barnard-Kubow et al. 2017). They also support the hypothesis that organellar changes, resulting in CNI, have a profound influence on speciation (Greiner et al. 2013; Barnard-Kubow et al. 2016). Further support for these two hypotheses comes from the fact that second-generation plants segregate again for chlorosis with only one plastid type present, showing that selection for organelle management and expression genes acts immediately after the first generation of hybridization (Barnard-Kubow et al. 2016; Breman et al. 2020).

Possible effects of biparental transmission on phylogeny reconstruction

A preference for one plastid type, as well as preferential backcrossing with one of the parents (introgression) after a historical hybridization event, could explain the problematic position of taxa in Pelargonium phylogenetic trees due to conflict between plastid and nuclear genomic markers. For instance, the four-petalled Clade A species P. nanum, which is currently not assigned to a section (Röschenbleck et al. 2014), was suspected to be an ancient hybrid species because of its unique floral morphology and its ‘single branch’ status in current phylogenies (van de Kerke et al. 2019). It was found to be sister to clade A2 (Bakker et al. 1999) and its inclusion in a Bayesian phylogeny reconstruction appeared to prevent the Markov chains used from converging (Jones et al. 2008). Other cases can be seen in P. sect. Hoarea, where the occurrence of ‘non-monophyletic species’ has been attributed to ‘chloroplast capture’ (Bakker et al. 2005). Such taxa would have retained the chloroplast of one species, while displaying the morphology and nuclear genomic type of another. Further testing of such incongruences could be done by using more markers from the nuclear genomes. For instance, the repeatome appears promising as a source of phylogenetic markers (Dodsworth et al. 2015; Vitales et al. 2020; Breman et al. 2021) as it provides resolution at a low taxonomic level and provides a genome-wide overview represented by the most abundant parts of the non-coding DNA (repeats). Naturally occurring hybrids in Pelargonium are rarely found (pers. comm. Powrie, Kirstenbosch RSA), but not unheard of (Knuth et al. 1912; van der Walt et al. 1990). This is logical given the reduced fitness characteristic of most hybrid offspring, which will result in lower chances of surviving to the reproductive life stage, as is supported by results from our experiments.
However, our study also shows that species are highly compatible, as we obtained many (~30) interspecific crosses, some of which produce fully green and fertile offspring. We therefore do not exclude that hybridization plays an additional, minor role in Pelargonium speciation (see above for P. nanum, and the allopolyploids P. quercetorum, P. endlicherianum and possibly P. caylae from Madagascar, which could represent a hybrid species). Two cases of possible natural hybrids from P. sect. Ciconium are known. The first is a herbarium specimen of a wild hybrid between P. peltatum and P. alchemilloides at RBGE (M. Gibby pers. comm., FCB pers. observ.); the second is P. × salmoneum (from our own collections). The morphology of P. × salmoneum is intermediate between the hypothesized parental species. The phylogenetic position based on the repeatome is also intermediate between the supposed parents (Breman et al. 2021). Pelargonium × salmoneum is a fully fertile, green plant that segregates for numerous traits such as plant size, flower and leaf shape, indicating it is not a ‘stable’ species (yet), but a hybrid. We propose that P. × salmoneum, irrespective of whether it arose naturally or was the result of human crossing activities, is a genuine interspecific hybrid with fitness comparable to that of either of its proposed parents.

Other examples of chlorosis linked to rpo types

Few other examples exist in which rpo genotypes are used to explain chlorosis in F1 interspecific hybrids. In Zantedeschia, however, for which four plastomes recently became available (He et al. 2020), a comparison of the rpo types with the known occurrence of chlorosis in interspecific crosses (RCS pers. obs.) shows that chlorotic phenotypes also increase with increased physico-chemical distance. Interestingly, Zantedeschia shares a number of characters with Pelargonium. Its species are relatively easy to cross (RCS pers. comm.), it has biparental inheritance of plastids (Yao et al. 1994) and nuclear genomic alleles have been implicated in explaining the observed patterns of chlorosis in the interspecific crosses as well (Yao et al. 2000; Snijder et al. 2007).

Conclusions

With current efforts underway to control photosynthesis more precisely (Teeuwen et al. 2022), the role PEP plays in chloroplast expression cannot be ignored. Knowledge of structural variants of PEP and their functional impact in the plastid environment may contribute to engineering PEP to function under different conditions such as heat or water stress.