A 47-bp fragment contains all of the sequences required for suspensor transcription
We used the Scarlet Runner Bean G564 gene to identify cis-regulatory elements required for suspensor transcription. Previously, we showed that a gain-of-function (GOF) construct containing a 54-bp fragment of the G564 upstream region can program suspensor-specific transcription when fused to a Cauliflower Mosaic Virus (CaMV) 35S minimal promoter/GUS vector and introduced into tobacco plants (Kawashima et al. 2009). This 54-bp fragment contains the previously identified 10-bp motif (5′-GAAAAGCGAA-3′), the 10-bp-like motif (5′-GAAAAACGAA-3′), and the Region 2 motif, which contains the sequence 5′-TTGGT-3′ (Fig. 1b) (Kawashima et al. 2009). To identify the minimal sequence required for G564 suspensor transcription, we deleted the distal 7-bp of the 54-bp fragment. The resulting 47-bp fragment produced strong suspensor GUS activity under the direction of the CaMV 35S minimal promoter in transgenic tobacco embryos, similar to that observed for the 54-bp fragment (Fig. 1b). By contrast, the CaMV 35S minimal promoter/GUS construct alone produced no detectable suspensor GUS activity. Thus, the 47-bp fragment is the minimal cis-regulatory module that can activate G564 suspensor transcription.
A third 10-bp sequence is required for suspensor transcription
Within the 54-bp G564 fragment, a 10-bp sequence (−836/−827) lies between the 10-bp motif and the 3′ end of the fragment, leaving enough space for another suspensor cis-regulatory element (Fig. 1b). To identify additional suspensor cis-regulatory sequences, we generated GOF constructs in which mutated versions of the 54-bp fragment were fused to the CaMV 35S minimal promoter/GUS vector, and introduced these constructs into tobacco plants. A GOF construct in which the 10-bp sequence at −836/−827 (5′-GAAAACCACA-3′) was mutated [m54(45–54)] showed no GUS activity, indicating that there is an additional motif that is required for suspensor transcription (Fig. 2a). Because this 10-bp sequence at −836/−827 is similar to the 10-bp and 10-bp-like motifs, differing from each at three positions, we have designated it the 10-bp-related motif. Therefore, three 10-bp sequences in the 54-bp fragment are required for suspensor transcription: the 10-bp (−846/−837), 10-bp-like (−873/−864) and 10-bp-related (−836/−827) motifs.
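The similarity among the three 10-bp sequences can be checked with a simple position-by-position comparison. The sketch below is illustrative only: the motif sequences are those quoted in the text, whereas the function name `mismatch_positions` and the dictionary layout are our own choices.

```python
from itertools import combinations

# Motif sequences as quoted in the text (5' to 3').
MOTIFS = {
    "10-bp": "GAAAAGCGAA",
    "10-bp-like": "GAAAAACGAA",
    "10-bp-related": "GAAAACCACA",
}


def mismatch_positions(a: str, b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# Compare every pair of motifs and report where they differ.
for (name_a, seq_a), (name_b, seq_b) in combinations(MOTIFS.items(), 2):
    diffs = mismatch_positions(seq_a, seq_b)
    print(f"{name_a} vs {name_b}: {len(diffs)} mismatch(es) at positions {diffs}")
```

Under this comparison the 10-bp and 10-bp-like motifs differ at one position, and the 10-bp-related motif differs from each of the other two at three positions.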
To determine the core sequence of the 10-bp/10-bp-like/10-bp-related motifs, we took the 10-bp motif as a representative of these three motifs and mutated the first 3-bp, the second 3-bp and the last 4-bp of the 10-bp motif within the 54-bp fragment (Fig. 2b). Although the 10-bp motif is known to tolerate up to three mismatches without affecting the level of suspensor transcription (Kawashima et al. 2009), we hypothesized that the mismatches would not be tolerated if they were clustered together rather than spread throughout the motif. GUS activity in the suspensor was significantly decreased when either the first or the second 3-bp were mutated [m54(35–37) and m54(38–40)] and completely abolished when the last 4-bp were mutated [m54(41–44)] (Fig. 2b). These results demonstrate that at least one nucleotide in each of these three regions is critical for the function of the 10-bp motif.
Previously, we generated a consensus sequence for the 10-bp motif from divergent 10-bp sequences known to function as the 10-bp motif (Fig. 2c) (Kawashima et al. 2009). Each position of the 10-bp motif can tolerate a mismatch without affecting the level of suspensor transcription. All of the functional 10-bp sequences contain no more than three mismatches relative to the 10-bp motif sequence of 5′-GAAAAGCGAA-3′ (Kawashima et al. 2009), and this sequence represents the nucleotide most commonly found at each position of the motif among all of the sequences that function as the 10-bp motif. Figure 2b shows that three mismatches cannot be tolerated if they are clustered in one part of the 10-bp motif. Taken together, these results define a consensus for the 10-bp/10-bp-like/10-bp-related motifs of 5′-GAAAAGCGAA-3′ with up to three non-adjacent mismatches.
The complete sequence of the Region 2 motif was identified
We previously identified the partial Region 2 motif sequence as 5′-TTGGT-3′ through sequence homology between the 150-bp G564 GUS-positive repeats (first, second and fourth repeats) and the GUS-negative fifth repeat (Fig. 1a) (Kawashima et al. 2009). The Region 2 motif was shown to be functional in the 54-bp fragment by mutating the central guanine at position 29 of the 54-bp fragment to adenine, which rendered it inactive (Kawashima et al. 2009). To test functionally which nucleotides in Region 2 are required for suspensor transcription, we generated mutations across the Region 2 motif either 2-bp or 3-bp at a time and transferred GOF constructs containing these mutant motifs into transgenic tobacco plants (Fig. 3a). Transversional mutagenesis was used to impose the biggest change on the sequence; however, transversional mutation of this GT-rich motif produced another GT-rich sequence. Consequently, constructs were also made using adenine substitutions (Fig. 3a).
Mutation of the 5′ TT using either mutagenesis strategy [m54(27–28) and m54(27–28a)] resulted in significantly decreased suspensor GUS activity. Mutation of the central GT to TG [m54(30–31)] also resulted in significantly decreased suspensor GUS activity. By contrast, mutation of GT to AA [m54(30–31a)], creating the sequence 5′-TTGAAAAT-3′, did not alter GUS activity. Finally, mutation of the 3′ AAT [m54(32–34)] abolished GUS activity, demonstrating that the Region 2 motif is longer than previously predicted. Taken together, these results show that the important nucleotides in the Region 2 motif are 5′-TTGGTAAT-3′, and that 5′-TTGAAAAT-3′ can also function as the Region 2 motif (Fig. 3a).
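The two substitution strategies can be illustrated with a small helper that rewrites a window of the Region 2 motif by fragment position. This is a sketch under stated assumptions: the transversion table (A↔C, G↔T) is inferred only from the GT-to-TG change reported for m54(30–31), the start position of 27 is inferred from the m54(27–28) label, and the function name `mutate` is hypothetical.

```python
REGION2 = "TTGGTAAT"  # Region 2 motif as it occurs in the 54-bp fragment
REGION2_START = 27    # assumed fragment position of the first T, per m54(27-28)

# Assumed transversion scheme (A<->C, G<->T); consistent with GT -> TG in the
# text, but the full scheme is not stated in the source.
TRANSVERSION = str.maketrans("ACGT", "CATG")


def mutate(seq, seq_start, first, last, replacement=None):
    """Rewrite fragment positions first..last (1-based, inclusive) of seq.

    With no replacement given, the window is transverted; otherwise it is
    overwritten with the replacement, which must have the same length.
    """
    i, j = first - seq_start, last - seq_start + 1
    if i < 0 or j > len(seq) or i >= j:
        raise ValueError("window lies outside the sequence")
    window = seq[i:j]
    new = window.translate(TRANSVERSION) if replacement is None else replacement
    if len(new) != len(window):
        raise ValueError("replacement must match the window length")
    return seq[:i] + new + seq[j:]


# m54(30-31): transversion of the central GT.
print(mutate(REGION2, REGION2_START, 30, 31))
# m54(30-31a): adenine substitution of the same two positions.
print(mutate(REGION2, REGION2_START, 30, 31, "AA"))
```

The adenine substitution reproduces the 5′-TTGAAAAT-3′ sequence described above, which is a quick way to confirm that the position labels and the motif sequence are being read consistently.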
We identified a consensus for the Region 2 motif by comparing Region 2 motif sequences within all 150-bp G564 repeats that we showed previously can drive GUS gene expression in the suspensor of transgenic tobacco embryos (first, second and fourth repeats) (Kawashima et al. 2009). No other sequence element(s) in the 150-bp repeats can compensate for the function of the Region 2 motif, as was shown by a natural point mutation in the fifth repeat that causes a loss of suspensor GUS activity (Fig. 1a) (Kawashima et al. 2009). The sequence of the Region 2 motif in the second repeat and the 54-bp fragment within the fourth repeat is 5′-TTGGTAAT-3′, whereas the Region 2 sequence in the first repeat is 5′-TTGGGAAT-3′ (Fig. 3b). We combined this information with the knowledge that 5′-TTGAAAAT-3′ can also function as the Region 2 motif (Fig. 3a), generating a sequence logo of 5′-TTG(A/G)(A/G/T)AAT-3′ for the Region 2 motif (Fig. 3c). This consensus sequence is representative of the Region 2 motif in each 150-bp G564 upstream repeat, with the exception of the fifth repeat.
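The way the three functional Region 2 sequences combine into the bracketed consensus can be sketched as a column-by-column union of the observed nucleotides. This is only an illustration of the notation: a sequence logo would normally also weight each nucleotide by its frequency, which this sketch does not attempt, and the function name `consensus` is our own.

```python
def consensus(sequences):
    """Collapse aligned, equal-length sequences into bracketed consensus notation.

    Invariant columns are written as a single letter; variable columns are
    written as the sorted alternatives, e.g. (A/G).
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    parts = []
    for column in zip(*sequences):
        bases = sorted(set(column))
        parts.append(bases[0] if len(bases) == 1 else "(" + "/".join(bases) + ")")
    return "".join(parts)


# Functional Region 2 sequences quoted in the text.
region2_sequences = [
    "TTGGTAAT",  # second repeat and the 54-bp fragment (fourth repeat)
    "TTGGGAAT",  # first repeat
    "TTGAAAAT",  # adenine-substitution mutant m54(30-31a) that retained activity
]
print(consensus(region2_sequences))  # TTG(A/G)(A/G/T)AAT
```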
A fifth motif is required for suspensor transcription
Within the 54-bp fragment, a 9-bp interval separates the 10-bp-like motif from the Region 2 motif, providing enough space for a fifth cis-regulatory element (Fig. 1b). To determine whether there is a fifth suspensor cis-regulatory element, we mutated the sequence 5′-GAGTTAC-3′ in the region between the 10-bp-like and Region 2 motifs in the 54-bp fragment (Fig. 4a). Mutation of 5′-GAGTTAC-3′ [m54(18–24)] abolished suspensor GUS activity completely, indicating the presence of a fifth DNA control element required for suspensor transcription.
To define the length of the Fifth motif, we generated constructs by mutating the 9-bp between the 10-bp-like motif and Region 2 motif 3-bp at a time (Fig. 4b). Mutation of 5′-GAG-3′ [m54(18–20)] and 5′-TTA-3′ [m54(21–23)] caused a significant decrease in GUS activity, indicating that these sequences are part of the Fifth motif. However, mutation of the next 3-bp, 5′-CTA-3′ [m54(24–26)], had no effect on GUS activity. Therefore, the final cytosine mutated in the m54(18–24) construct was not required for suspensor transcription. Taken together, these results showed that the important nucleotides in the Fifth motif are 5′-GAGTTA-3′.
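The position labels used for the mutant constructs and the upstream coordinates quoted for the motifs can be reconciled with a short bookkeeping sketch. Everything here is inferred from numbers given in the text rather than read from Fig. 1b: the offset of −881 follows from pairing the 10-bp motif at −846/−837 with mutants m54(35–37) through m54(41–44), and the element boundaries follow from the m54 labels. The names `ELEMENTS`, `OFFSET` and `check_layout` are hypothetical.

```python
# Assumed mapping: upstream coordinate = fragment position + OFFSET.
OFFSET = -881

# (name, first position, last position, sequence) for the 47-bp fragment,
# i.e. the 54-bp fragment without its distal 7 bp. Inferred from the text.
ELEMENTS = [
    ("10-bp-like", 8, 17, "GAAAAACGAA"),
    ("Fifth", 18, 23, "GAGTTA"),
    ("spacer", 24, 26, "CTA"),
    ("Region 2", 27, 34, "TTGGTAAT"),
    ("10-bp", 35, 44, "GAAAAGCGAA"),
    ("10-bp-related", 45, 54, "GAAAACCACA"),
]


def check_layout(elements):
    """Check that elements abut without gaps or overlaps; return the joined sequence."""
    expected = elements[0][1]
    parts = []
    for name, first, last, seq in elements:
        if first != expected:
            raise ValueError(f"gap or overlap before {name}")
        if last - first + 1 != len(seq):
            raise ValueError(f"length mismatch for {name}")
        parts.append(seq)
        expected = last + 1
    return "".join(parts)


fragment_47 = check_layout(ELEMENTS)
for name, first, last, _ in ELEMENTS:
    print(f"{name:<14} m54({first}-{last})  {first + OFFSET}/{last + OFFSET}")
print(len(fragment_47), "bp")
```

If the inferred boundaries are right, the five motifs and the 3-bp spacer tile positions 8 to 54 exactly, which accounts for all 47 bp of the minimal fragment.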
We determined a consensus sequence for the Fifth motif through comparison of the Fifth motif sequence in GUS-positive 150-bp repeats (first, second and fourth) (Fig. 4c) (Kawashima et al. 2009). No other sequence in the 150-bp repeats closely resembles the Fifth motif, so we assume that the sequences in the position of the Fifth motif in the first and second repeats are functional. The sequence of the Fifth motif in the 54-bp fragment within the fourth repeat is 5′-GAGTTA-3′. The sequence in the position of the Fifth motif in the first and second repeats is 5′-AAGTTA-3′. Based on these results, we constructed a sequence logo for the Fifth motif with a consensus sequence of 5′-(A/G)AGTTA-3′ (Fig. 4d).
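For readers who want to search other sequences with a bracketed consensus such as 5′-(A/G)AGTTA-3′, one option is to translate it into a regular expression. The sketch below assumes the parenthesis-and-slash notation used in this section; the helper name `consensus_to_regex` is hypothetical, and a match indicates sequence similarity only, not demonstrated function.

```python
import re


def consensus_to_regex(consensus: str) -> str:
    """Convert bracketed alternatives such as '(A/G)' into character classes such as '[AG]'."""
    return re.sub(
        r"\(([ACGT](?:/[ACGT])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


FIFTH = re.compile(consensus_to_regex("(A/G)AGTTA"))

# Sequences quoted in the text at the Fifth motif position.
for candidate in ("GAGTTA", "AAGTTA"):
    print(candidate, bool(FIFTH.fullmatch(candidate)))

# Locate the motif within the 9-bp interval between the flanking motifs.
match = FIFTH.search("GAGTTACTA")
print(match.span() if match else None)
```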
The 150-bp tandem repeats and suspensor motifs are conserved in the Common Bean G564 upstream region
The recent release of the Common Bean (Phaseolus vulgaris) genome (Schmutz et al. 2014) provides an opportunity to compare G564 expression and promoter sequences in two related bean species with giant suspensors (Fig. 5a) (Henry and Goldberg 2015) that diverged ~2 million years ago (mya) (Delgado-Salinas et al. 2006). We used Illumina sequencing technology to profile the mRNA populations of laser-capture microdissected embryo proper and suspensor regions from Scarlet Runner Bean and Common Bean globular-stage embryos, and mapped RNA-Seq reads to the sequenced Common Bean genome (GEO accession GSE57537). Common Bean G564 mRNA was up-regulated ~140-fold in the globular-stage suspensor relative to the embryo proper, similar to what we observed for Scarlet Runner Bean (Fig. 5b) (Weterings et al. 2001; Henry and Goldberg 2015). We compared the G564 upstream region in Scarlet Runner Bean and Common Bean, and found that the Common Bean G564 promoter also contains the five 150-bp tandem repeats (Fig. 5c–e). These tandem promoter repeats are not found in the upstream region of G564-related genes in soybean or Arabidopsis, which diverged from the Scarlet Runner Bean 19 and 120 mya, respectively (Lavin et al. 2005). Therefore, the 150-bp repeats originated before Scarlet Runner Bean and Common Bean diverged from their common ancestor, but after the divergence of these bean species from their soybean (Glycine max) relative within the Legume family.
Sequences nearly identical to the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs were found in each Common Bean 150-bp repeat (Fig. 5e). The one difference between the species was in the fifth repeat: the Common Bean Region 2 sequence matches the functional motif, whereas the Scarlet Runner Bean fifth repeat carries a natural mutation in Region 2 (see bold A in Fig. 5e) that makes it non-functional (Kawashima et al. 2009). The suspensor motifs identified in the Common Bean G564 upstream region are most likely functional because the G564 promoter sequences and suspensor-specific expression patterns are nearly indistinguishable between the two species (Fig. 5b, e).