Introduction

Rice is one of the most important crops for humans and is recognized as the best model plant of monocot crops (Khush 2005). The International Rice Genome Sequencing Project (2005) completed sequencing of the entire rice genome in 2005, and from the total number of expressed loci and unmapped rice full-length cDNA (FL-cDNA) clones the Rice Annotation Project (2007) estimated the gene number of rice to be ∼32 000. Among these genes, 28 540 are candidates for protein-coding genes. About one-fourth (7189 loci) of the potential open-reading frames (ORFs) have had functions assigned to them by BLASTX searches (Rice Annotation Project 2007). Appropriate biological experiments are needed to uncover the functions of the remaining genes.

To accelerate the identification and characterization of rice genes with so far unknown functions, various rice resources, including sets of backcross inbred lines (BILs), nearly isogenic lines (NILs), and chromosome segment substitution lines (CSSLs), have been produced and are publicly available for mapping and the subsequent map-based cloning of agronomically important genes and quantitative trait loci (QTLs) (Lin et al. 1998; Ebitani et al. 2005). Infrastructure for rice bioinformatics (reviewed in Sasaki et al. 2005), including such integrated databases as RAP-DB (Ohyanagi et al. 2006; Rice Annotation Project 2007), the TIGR Rice Genome Annotation Database (Ouyang et al. 2007), BGI-RIS (Zhao et al. 2004), and MOsDB (Karlowski et al. 2003), has been developed and is publicly available. The rice FL-cDNAs, clustered into 28 469 non-redundant clones (Rice Full-Length cDNA Consortium 2003), and the rice proteome database (Komatsu 2005) also have great potential to promote gene discovery and the elucidation of gene function.

For the large-scale identification of rice gene functions, several systems and tools to generate loss-of-function mutants have been developed (Hirochika et al. 2004; An et al. 2005a). Databases compiled for transferred DNA (T-DNA) insertion lines (An et al. 2003; Chen et al. 2003; Ryu et al. 2004; Sallaud et al. 2004) and for populations insertion-tagged by the retrotransposon Tos17 (Miyao et al. 2003, 2007) or the transposable element Ds (Upadhyaya et al. 2002; Kolesnik et al. 2004; van Enckevort et al. 2005; Park et al. 2007) have been constructed. These loss-of-function resources and databases are of great importance for tagging and identifying non-redundant genes in the rice genome. However, 29% of rice genes are estimated to exist as clustered and redundant gene families (IRGSP 2005), so it is difficult to find loss-of-function phenotypes for these duplicated genes. Moreover, when gene knockouts confer embryonic lethality or severe developmental defects upon the plants, it is impossible to observe the mutant phenotypes corresponding to the mutated genes.

To solve such problems, activation tagging, a gain-of-function strategy, has been adopted. This technique involves the introduction of a transcriptional enhancer at random into a plant genome to activate transcription of the gene(s) adjacent to the enhancer (Jeong et al. 2002, 2006; Nakazawa et al. 2003; Hsing et al. 2007; Mori et al. 2007). Although this system is effective in changing the transcription level of the gene(s), the transcriptional enhancer often affects the expression of multiple genes, resulting in complex phenotypes. Moreover, T-DNA insertions preferentially occur in gene-rich regions and are not distributed uniformly throughout the entire rice genome (An et al. 2003; Chen et al. 2003; Sallaud et al. 2004).

Taking advantage of the availability of a large collection of Arabidopsis FL-cDNA clones (Seki et al. 2002), Ichikawa et al. (2006) recently developed an alternative gain-of-function approach for the systematic elucidation of gene functions in Arabidopsis—the so-called Full-length cDNA Over-eXpressor gene-hunting system (FOX-hunting system). This system uses ectopic overexpression of a single, or limited numbers of, FL-cDNA(s) in individual transgenic plants. It generates large numbers of dominant mutations, enabling the comprehensive characterization of novel and important traits and identification of the causal genes.

The FOX-hunting technology is in principle applicable not only to Arabidopsis but also to any other plant for which Agrobacterium-mediated transformation is available. A large collection of FL-cDNA clones from Arabidopsis, rice, or any other source can be ectopically overexpressed to cause changes in phenotype. As one of the most important and most intensively studied crops, rice is the most desirable plant for application of the FOX-hunting technique, because a recently developed rapid and highly efficient transformation system for rice (Toki et al. 2006), a large population of rice FL-cDNA clones (Rice Full-Length cDNA Consortium 2003), and the genome sequence information (IRGSP 2005) are available.

We have developed a FOX-hunting system in rice for the systematic functional analysis of rice genes. We have generated approximately 12 000 FOX-rice lines. By genomic PCR analysis, 81.4% of the FOX-rice lines so far analyzed have shown amplification of single fragments containing FL-cDNAs. Sequence analysis of the cDNA-containing fragments revealed that 5462 independent FL-cDNAs were inserted in 8225 FOX-rice lines carrying identified FL-cDNAs. Phenotypic observation of the FOX-rice plants revealed that approximately 16.6% of the tested lines showed altered phenotypes such as in cell and tissue proliferation, organ morphology, plant height, growth habit, heading date, and seed fertility. Here, we present information on extremely dwarfed mutants produced by overexpressing a novel gibberellin 2-oxidase gene. We also give details of several morphological mutants and the genes responsible. Our collection of FOX-rice lines should be a valuable tool for analyzing rice gene functions at the genome-wide scale and for efficiently identifying agronomically important genes.

Materials and methods

Plant materials, growth conditions, and rice transformation

Rice plants (Oryza sativa L. ‘Nipponbare’) were grown in growth chambers under white fluorescent light [a cycle of 14-h light (28°C)/10-h dark (25°C)] at 60% relative humidity. Rice transformation was performed as described (Toki et al. 2006).

Plasmid construction

To make a double-stranded SfiI linker with XbaI-cohesive ends on both sides, 1 pmol each of phosphorylated SfiI-linker A (5′-CTAGACCTGCAGGGGCCAAATCGGCCGAGCTCGAATTCGTCGACCTCGAGGGCCATAAGGGCC-3′, SfiI sites underlined) and phosphorylated SfiI-linker B (5′-CTAGAGGCCCTTATGGCCCTCGAGGTCGACGAATTCGAGCTCGGCCGATTTGGCCCCTGCAGGT-3′, SfiI sites underlined) were mixed and annealed at room temperature for 1 h. The resultant linker was then inserted into the XbaI site in the pBI-based plant transformation vector pBIG2113-S (Fujita et al. 2007) to generate pBIG2113-SfiIL. The maize Ubiquitin-1 promoter (PUbi-1) was excised by using HindIII and SbfI from pUbi2.0/HSb; it carried a 2.0-kb HindIII-SbfI fragment derived from the 5′-upstream region of the maize Ubiquitin-1 gene (Christensen et al. 1992). The promoter was then inserted into the same restriction sites of pBIG2113-SfiIL to construct pBIGS-PUbi-SfiL. A 35S-hpt-Tnos [a hygromycin phosphotransferase (hpt) gene governed by cauliflower mosaic virus 35S (35S) promoter and the nopaline synthase terminator (Tnos)] fragment was excised with SpeI and KpnI from pSKHyg [a derivative of pBluescript SK(+) bearing a 2.2-kb HindIII–EcoRI cassette that consisted of 35S-hpt-Tnos] and ligated at compatible XbaI and KpnI sites of pLITMUS28 (New England Biolabs, Beverly, MA, USA) to generate pLIT-35SHyg. Then the 35S-hpt fragment was excised from pLIT-35SHyg with HindIII, MfeI, and ScaI, and the resulting two fragments (a 0.5-kb HindIII–MfeI and a 1.3-kb MfeI–ScaI fragments) were simultaneously ligated at the HindIII and ScaI sites of pBIGS-PUbi-SfiL to construct the pRiceFOX vector.
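As an illustrative check of the linker design, the following minimal Python sketch scans SfiI-linker A for the SfiI recognition pattern (GGCCNNNNNGGCC) and reports the five-base spacer of each site. The sequence is copied from the paragraph above; the function name find_sfii_sites is hypothetical and is not part of the original methods.

```python
import re

# SfiI-linker A as given in the text (5'->3').
LINKER_A = "CTAGACCTGCAGGGGCCAAATCGGCCGAGCTCGAATTCGTCGACCTCGAGGGCCATAAGGGCC"

# SfiI recognizes GGCCNNNNNGGCC; the lookahead also reports overlapping sites.
SFII_PATTERN = re.compile(r"(?=(GGCC([ACGT]{5})GGCC))")


def find_sfii_sites(sequence):
    """Return (0-based start, full site, 5-base spacer) for each SfiI site."""
    return [
        (match.start(), match.group(1), match.group(2))
        for match in SFII_PATTERN.finditer(sequence.upper())
    ]


sites = find_sfii_sites(LINKER_A)
for start, site, spacer in sites:
    print(f"SfiI site at position {start}: {site} (spacer {spacer})")
```

The two sites found in linker A carry different spacer sequences, which is what allows the two SfiI sites of the vector to yield different cohesive ends for directional cloning.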

The pRiceFOX and FL-cDNA-inserted pRiceFOX plasmids were transferred into Agrobacterium tumefaciens strain EHA105 (Hood et al. 1993) by electroporation.

Construction of rice-FOX Agrobacterium library

A rice FL-cDNA expression library was made according to the method of Ichikawa et al. (2006). In brief, a mixture of 13 980 rice FL-cDNAs that were cloned in SfiI sites of the lambda-FLC-1-B vector (Rice Full-Length cDNA Consortium 2003) was digested with SfiI and ligated, in the forward orientation, into the compatible SfiI sites on a binary vector, pRiceFOX (see Results section for details on the vector). The ligated mixture was transformed into E. coli to produce a rice FL-cDNA library. Then, Agrobacterium EHA105 was transformed with a plasmid mixture extracted from the cDNA E. coli library. The resulting transformants were collected to construct the rice FL-cDNA Agrobacterium library to be introduced into rice.

PCR analysis of transgenes

A MagAttract 96 DNA Plant Kit (QIAGEN, Hilden, Germany) was used according to the protocol provided by the manufacturer to extract genomic DNA from the leaf blades of each transgenic plant grown for 2 weeks after transfer to soil. Transgenes inserted into the genome of FOX-rice lines were amplified by PCR by using one of three thermostable DNA polymerases, TaKaRa Ex Taq polymerase (Takara Bio, Ohtsu, Japan), TaKaRa LA Taq with GC Buffer (Takara Bio), or PrimeSTAR HS DNA Polymerase (Takara Bio); one of two forward primers located on PUbi-1, FOX5-1 (5′-CTTTGGGGAATCCTGGGATGGCTCTAGCCGTTCCGCAGACGGGA-3′) or FOX5-2 (5′-AGCCCTGCCTTCATACGCTATTTATTTGCTTGGTACTGTTTC-3′); and a reverse primer located on Tnos, FOX3 (5′-GAAACTTTATTGCCAAATGTTTGAACGATCGGGGAAATTCGAG-3′) (Fig. 1A). The conditions for PCR with TaKaRa Ex Taq were 30 s at 94°C for denaturation, followed by 33 cycles of 10 s at 98°C for denaturation, 30 s at 58 to 62°C for annealing, and 5 min at 72°C for elongation. Those for PCR with TaKaRa LA Taq or PrimeSTAR HS were 1 min at 94°C for denaturation, followed by 33 cycles of 10 s at 98°C for denaturation and 5 min at 68°C for annealing and elongation. The PCR products were checked for size and number by electrophoresis and subjected to DNA sequence analysis by the direct-sequencing method with the FOX5-2 and/or FOX3 primer(s) in an ABI3100 or ABI3730LX sequencer (Applied Biosystems, Foster City, CA, USA). Sequenced FL-cDNAs were identified by using the Knowledge-based Oryza Molecular biological Encyclopedia (KOME, http://www.cdna01.dna.affrc.go.jp/cDNA/; Rice Full-Length cDNA Consortium 2003) and the Rice Annotation Project Database (RAP-DB, http://www.rapdb.dna.affrc.go.jp/; Ohyanagi et al. 2006; Rice Annotation Project 2007).

Fig. 1

Transformation of rice with pRiceFOX, a binary Ti plasmid vector for overexpressing FL-cDNAs in plants. (A) Schematic structure of the pRiceFOX plasmid. The expression vector pRiceFOX allows unidirectional cloning of individual rice FL-cDNAs from RIKEN by using two different SfiI sites [SfiI(A) and SfiI(B)]. P35S, CaMV 35S promoter; PUbi-1, maize Ubiquitin-1 promoter; Tg7 and Tnos, polyadenylation signals from gene 7 and nopaline synthase (nos) gene in the T-DNA, respectively; hpt, hygromycin resistance gene; LB, left border; RB, right border. Arrows: directions of transcription; arrowheads: positions of the primers for genomic PCR. (B) Genomic Southern blot analysis of transgenic rice plants (T0 generation) transformed with the empty pRiceFOX vector. Transgenic plants were cultured and regenerated in the presence of hygromycin at 30 mg/L (Hyg 30) or 50 mg/L (Hyg 50). HindIII-digested DNAs were separated by agarose gel electrophoresis and hybridized with the probe specific for the hpt gene, indicated in (A)

The Tnos fragment was similarly amplified by using the forward primer N5 (5′-GAGCTCGAATTTCCCCGATCGTTCAAAC-3′) and the reverse primer N3 (5′-CCCGATCTAGTAACATAGATGACACCGC-3′) (Fig. 1A). The conditions for PCR with PrimeSTAR HS were 1 min at 94°C for denaturation, followed by 30 cycles of 10 s at 98°C for denaturation, 5 s at 60°C for annealing, and 30 s at 72°C for elongation. The presence and size of each PCR product were checked by electrophoresis.

Southern blot analysis

Southern blot hybridization was performed according to standard protocols (Sambrook et al. 1989). Genomic DNA was extracted from the leaf blades of each transgenic plant with a DNeasy Plant Mini Kit (QIAGEN) according to the protocol provided by the manufacturer. Ten micrograms of rice genomic DNA was digested with HindIII and fractionated in a 0.8% agarose gel. An hpt gene fragment was PCR-amplified with a 5′ primer (5′-CCGATTCCGGAAGTGCTTGAC-3′) and a 3′ primer (5′-TGGGAATCCCCGAACATCGCC-3′), with pRiceFOX as the template. The hpt fragment was labeled by using the ECL Direct Nucleic Acid Labeling and Detection System (GE Healthcare Bio-Sciences, Piscataway, NJ, USA) according to the procedure recommended by the manufacturer.

RNA preparation and RT-PCR

An RNeasy Plant Mini Kit (QIAGEN) was used according to the manufacturer’s instructions to extract total RNA from the leaf blades of each transgenic plant grown for 2 weeks after transfer to soil. First-strand cDNAs were synthesized from each RNA preparation (1 μg/reaction) with an oligo(dT) primer by using an ExScript RT reagent Kit (Takara Bio) in a total volume of 20 μL, in accordance with the manufacturer’s instructions.

The specific sequences of each of the primer pairs used in semi-quantitative and quantitative reverse transcription PCR (RT-PCR) are listed in Table S1. Primer pairs amplifying cDNAs for Actin1 (AK100267; 5′-CTTCATAGGAATGGAAGCTGCGGGTA-3′ and 5′-TTCCTGTGCACAATGGATGG-3′) and UBQ5 (AK061988; 5′-ACCACTTCGACCGCCACTACT-3′ and 5′-ACGCCTAAGCCTGCTGGTT-3′) were used as loading controls for semi-quantitative RT-PCR. The Actin1 primers were also used as internal controls for normalization of the quantitative RT-PCR reaction (Jain et al. 2006).

Semi-quantitative RT-PCR was performed with 1 μL of template cDNA per 50-μL reaction with TaKaRa Ex Taq for 30 s at 94°C, followed by 28 cycles of 10 s at 98°C, 30 s at 60°C, and 30 s at 72°C. Quantitative RT-PCR was performed with a Thermal Cycler Dice Real Time System (Takara Bio) using SYBR Premix Ex Taq (Takara Bio) and 1 μL of template cDNA per 25-μL reaction, in accordance with the manufacturer’s instructions. The threshold cycle (Ct) was calculated automatically by the analysis software that accompanied the system. The expression level, normalized to that of the endogenous control gene (Actin1), was calculated by relating the measured Ct to a standard curve obtained by diluting PCR-amplified DNA of known concentration. The resulting RT-PCR products were resolved by agarose gel electrophoresis followed by staining with ethidium bromide.
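The standard-curve normalization described above can be sketched as follows. This is a minimal illustration, not the analysis software supplied with the instrument: it assumes a linear relation between Ct and log10 of the template quantity, and all function names and numerical values are hypothetical.

```python
def fit_standard_curve(log10_quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(cts)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the template quantity."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, control_ct, target_curve, control_curve):
    """Target quantity divided by the quantity of the internal control."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    control_qty = quantity_from_ct(control_ct, *control_curve)
    return target_qty / control_qty


# Hypothetical dilution series: log10(copies) and the Ct measured at each step.
curve = fit_standard_curve([1, 2, 3, 4], [30.0, 27.0, 24.0, 21.0])

# Hypothetical sample: transgene Ct of 21 and Actin1 Ct of 27 on the same curve.
ratio = normalized_expression(21.0, 27.0, curve, curve)
print(f"slope={curve[0]:.2f}, intercept={curve[1]:.2f}, ratio={ratio:.1f}")
```

In practice a separate standard curve would be fitted for each amplicon; the same curve is reused here only to keep the example short.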

Results

Construction of a binary vector for overexpression of rice FL-cDNAs in rice

For the Arabidopsis FOX-hunting system, Ichikawa et al. (2006) constructed the binary plasmid pBIG2113SF from the pBI-based plasmid pBIG2113N. By the insertion of individual FL-cDNAs in forward orientation between two SfiI sites producing different cohesive ends in the pBIG2113SF vector, each FL-cDNA was placed under the transcriptional control of the 35S promoter. For FOX-hunting in rice, we constructed the binary vector pRiceFOX (Fig. 1A) from pBIG2113-S (Fujita et al. 2007), a derivative of pBIG2113N. Because PUbi-1 is stronger than the 35S promoter in rice cells and is highly expressed in various rice tissues (Cornejo et al. 1993), we exchanged the 35S promoter in pBIG2113N with PUbi-1. We also replaced the nopaline synthase promoter of the marker gene, hpt, with 35S for more efficient selection of transformed calluses (Dekeyser et al. 1989).

pRiceFOX was transformed into rice according to the method of Toki et al. (2006) by using two selection media, containing hygromycin B (Hyg) at 30 or 50 mg/L. The transformation efficiency was 38.9% in the former medium (Hyg30 selection group) and 26.8% in the latter (Hyg50). Ten plants were randomly chosen from each group and analyzed by Southern blotting with the hpt fragment as a probe (Fig. 1A). Single fragments were detected in 5 out of 10 plants on Hyg30 (average T-DNA copy number per plant: 1.7) and in 4 out of 10 plants on Hyg50 (2.4) (Fig. 1B). These results imply that the use of higher concentrations of Hyg may lead to the generation of smaller numbers of transgenic rice plants with greater T-DNA copy numbers in their genomes. On the basis of these data, and with a view to economy, we thereafter conducted the rice transformation with 30 mg/L of Hyg.

Construction of a FL-cDNA expression library in Agrobacterium containing 13 980 rice FL-cDNAs

Rice FL-cDNA clones were originally collected and sequenced by two groups, the Laboratory of Genome Sequencing and Analysis Group of the Foundation for Advancement of International Science (FAIS) in Tsukuba, and the Laboratory for Genome Exploration Research Group, the Genomic Sciences Center, Institute of Physical and Chemical Research (RIKEN) in Yokohama. They used different vector systems for the construction of cDNA libraries (Rice Full-Length cDNA Consortium 2003).

We aliquoted 13 980 independent rice FL-cDNAs from RIKEN in nearly equal amounts to generate a normalized rice FL-cDNA mixture. After construction of the rice FL-cDNA overexpression library in agrobacteria (rice-FOX Agrobacterium library), plasmid DNAs were isolated from randomly chosen Agrobacterium colonies. Digestion of plasmids from 100 colonies with SfiI followed by assessment by agarose gel electrophoresis revealed that 96 colonies contained cDNA fragments and only four colonies contained no cDNA inserts (data not shown). Three out of the 96 colonies contained two different cDNA fragments, and seven colonies contained three different cDNAs. Since SfiI-digestion of the pRiceFOX vector and the RIKEN-cDNA clones generated different, but intercompatible, cohesive ends, it is probable that three different SfiI-digested cDNA fragments were inserted into the SfiI-digested pRiceFOX vector simultaneously.

Generation and PCR analysis of FOX-rice lines

FOX-rice lines were generated by transforming rice with the rice-FOX Agrobacterium library. To date we have produced about 12 000 independent FOX-rice plants (T0 generation) with Hyg resistance and, after they have differentiated roots, transplanted them into soil.

To identify integrated cDNAs in these lines, we have analyzed 10 219 FOX-rice lines by PCR, using their genomic DNAs as templates and T-DNA-specific primers (FOX5-1/FOX5-2 and FOX3, as shown in “Materials and Methods” and Fig. 1A). Figure 2A is a typical example of the PCR products from 21 plants from randomly selected FOX-rice lines. The sizes of the introduced rice FL-cDNAs were variable, ranging from 1.1 to 3.4 kb. Among the 10 219 lines analyzed by PCR, 332 possessed a fragment of the same size as that from lines transformed with the empty vector (“Empty vector” in Fig. 2B, C). The PCR products from 8322 lines (81.4% of the FOX-rice lines analyzed) showed single bands after electrophoresis (“Single fragment” in Fig. 2B, C), and those from 293 lines showed multiple fragments (“Multiple fragments” in Fig. 2B, C). The remainder, 1272 lines, gave no fragment amplification, suggesting that only the hpt marker gene was integrated into their genomes, without the accompanying FL-cDNA(s). To test this possibility, we performed PCR with a pair of specific primers for the Tnos sequence (N5 and N3 in Fig. 1A). The Tnos fragment (“No amplification/Tnos+” in Fig. 2B, C) was amplified from 254 of the 1272 lines. This indicates that the region downstream of the cDNA cloning site was present in these lines but that the cDNA fragments were difficult to amplify. Rice exon sequences exhibit higher GC contents (54.2%) (IRGSP 2005) than those from dicot species such as Arabidopsis (44.1%) (Arabidopsis Genome Initiative 2000). This tendency could have caused the difficulty in amplification of target fragments, even with the use of PCR conditions for templates with high GC contents (see “Materials and Methods” section for details). In contrast, no Tnos fragment was amplified from 1018 lines (“No amplification/Tnos–” in Fig. 2B, C). This result suggested that only the hpt gene was integrated into the rice genome, whereas the FL-cDNA to be located at the LB (left border) side of the T-DNA was not. Thus, the subtotal of FOX-rice plants without cDNA integration was 1350 (13.2%) (“Empty vector” plus “No amplification/Tnos–” in Fig. 2B, C).
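The category counts reported above can be cross-checked with a short tally. The sketch below uses only the numbers given in the text; the variable and function names are illustrative.

```python
# Line counts reported in the text for the FOX-rice lines analyzed by genomic PCR.
pcr_categories = {
    "Empty vector": 332,
    "Single fragment": 8322,
    "Multiple fragments": 293,
    "No amplification/Tnos+": 254,
    "No amplification/Tnos-": 1018,
}

total_lines = sum(pcr_categories.values())


def percent(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Lines judged to carry no cDNA: empty-vector-sized band, or no Tnos signal.
without_cdna = (pcr_categories["Empty vector"]
                + pcr_categories["No amplification/Tnos-"])

print(f"Total lines analyzed: {total_lines}")
for name, count in pcr_categories.items():
    print(f"{name}: {count} ({percent(count, total_lines)}%)")
print(f"Without cDNA integration: {without_cdna} "
      f"({percent(without_cdna, total_lines)}%)")
```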

Fig. 2

PCR analysis of transgenes in FOX-rice lines. (A) Example of size distribution of the rice FL-cDNAs integrated in FOX-rice lines transformed with the rice-FOX Agrobacterium library. PCR-amplified fragments including FL-cDNA(s) were electrophoresed. In lane 4, two bands were amplified. Lane M, 1-kb DNA size markers. (B, C) Summaries of PCR analysis of transgenes. Genomic PCR data from 10 219 lines were compiled to show the patterns and numbers of cDNA-containing fragment(s) in the transgenic population. See text (Results section) for further details

Among the 8615 FOX lines showing PCR-amplification of a single or multiple cDNA fragment(s) (Fig. 2C), amplification of one, two, three, four, and five fragments was detected in 8322, 277, 14, 1, and 1 plants, respectively. On average, 1.04 FL-cDNAs were integrated into each diploid genome. This is smaller than the number of T-DNA copies inserted into pRiceFOX-transformed rice (Fig. 1B). Fifteen FOX-rice lines amplifying single cDNA fragments were chosen randomly and subjected to DNA gel-blot analysis. Approximately 2.1 copies of T-DNA inserts per plant were observed (Fig. S1). The results indicate the frequent occurrence of multiple insertions of single T-DNAs into the host genome. In other words, approximately two copies of a T-DNA were inserted into each diploid genome in our FOX-rice plants (T0 generation). A similar tendency was also observed in the Arabidopsis FOX-hunting system (Ichikawa et al. 2006).
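The average of 1.04 FL-cDNAs per line follows directly from the fragment counts given above, as the following short calculation illustrates (variable names are illustrative).

```python
# Number of FOX lines (values) showing a given number of amplified
# cDNA fragments (keys), as reported in the text.
lines_by_fragment_count = {1: 8322, 2: 277, 3: 14, 4: 1, 5: 1}

lines_with_cdna = sum(lines_by_fragment_count.values())
total_fragments = sum(n * count for n, count in lines_by_fragment_count.items())
mean_fragments_per_line = total_fragments / lines_with_cdna

print(f"{total_fragments} fragments in {lines_with_cdna} lines: "
      f"{mean_fragments_per_line:.2f} FL-cDNAs per line")
```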

Size distribution of FL-cDNAs inserted into the genomes of FOX-rice lines

To examine further the features of the FL-cDNAs integrated into our FOX-rice lines, we randomly selected 238 FOX-rice lines that had been generated in the early experiments and showed a single cDNA integration in each genome, and we checked whether the size distribution of their FL-cDNAs was biased. We could not find any notable difference in the range of size distributions (Fig. 3) or in the average size of FL-cDNAs between the 238 FOX-rice lines (average insert size: 1.66 kb) and the original RIKEN rice FL-cDNA library comprising 16 897 clones (1.99 kb).

Fig. 3

Size distribution of the rice FL-cDNAs integrated into 238 FOX-rice plants (black bars), compared with 16 897 RIKEN rice FL-cDNAs cloned in the lambda-FLC vector (white bars)

Sequence analysis of FL-cDNAs integrated into FOX-rice lines

All the amplified PCR fragments bearing the integrated FL-cDNA(s) from the 8615 FOX-rice lines (Fig. 2C) were subjected to sequence analysis. Thus far, cDNA sequences from 8225 of the 8615 FOX lines have been successfully read, and database searches of the sequence data using KOME and RAP-DB identified 5462 independent FL-cDNAs. Of these 5462 FL-cDNAs, 2090 appeared twice or more.

Expression levels of transgenes in FOX-rice lines

To examine whether expression of transgenes was elevated in the FOX-rice lines, 24 transgenic lines were randomly selected (line names: CO004 to CP076 in Table S1). Single FL-cDNAs were inserted in 23 of these 24 lines; two different cDNAs were inserted in the remaining line, designated CO130 (Table S1). Expression of these 25 genes was examined by semi-quantitative RT-PCR analysis (Fig. 4) with individual gene-specific primers (Table S1). Expression of 22 genes was clearly enhanced in the corresponding FOX lines. The RT-PCR products for the transgenes in lines CO100 and CO120 (AK071882 and AK067076, respectively) and that for one of the two transgenes in line CO130 (AK069185, labeled “CO130-2” in Fig. 4) showed faint bands. We therefore performed quantitative real-time RT-PCR analysis of these three genes. The transcript level of AK071882 integrated in line CO100 was approximately 370 times higher than that in the control line. In line CO120, AK067076 transcripts accumulated to approximately 400% of the level in the control line. However, there was no obvious difference in the expression level of AK069185 between CO130 and the control. These results demonstrate that most of the transgenes were overexpressed in our FOX-rice lines.

Fig. 4

Expression of transgenes in FOX-rice lines. Total RNA preparations were isolated from leaf blades of 24 independent transgenic lines and subjected to RT-PCR analysis. Each transgene was amplified with the gene-specific primers listed in Table S1. RT-PCR products were resolved in 2.5% agarose gel. In FOX line CO130, two FL-cDNAs were integrated (indicated as CO130-1 and CO130-2). For individual transgenic lines, upper panels represent the expression levels of introduced FL-cDNAs, and lower panels indicate those of internal Actin1 used for loading adjustment. F, FOX-rice lines; C, control lines transformed with the empty pRiceFOX vector

Observation of phenotypes in the FOX-rice lines

Miyao et al. (2007) adopted 53 phenotype descriptors belonging to 12 classes for the classification of visible phenotypes in the Tos17 insertion lines. In addition to these 53 descriptors, we added 8 possible phenotypes (3 classes) related to the characteristics of calluses, regenerants, and roots. Using the resulting 61 phenotype descriptors (15 classes), we observed and classified the visible phenotypes of the T0 plants of 9021 FOX-rice lines. Our classification revealed that 1496 out of the 9021 lines (16.6%) had altered phenotypes (Table S2).

Several FL-cDNAs conferred interesting and remarkable phenotypes upon the corresponding FOX-rice lines (Fig. 5 and Table 1). To confirm the reproducibility of the altered phenotypes, those cDNAs were individually subcloned into the pRiceFOX vector and then reintroduced into rice. The phenotypes of these lines (retransformants) are summarized in Table 1. All the reintroduced lines showed the same phenotypes as those of the original FOX lines, except for line BL276, which showed a partial difference in the retransformants (Table 1). Line BL276 had drooping leaves in the early and middle stages of its growth, and its plant height was greater than that of control plants at the maturing stage. BL276-RE, the reintroduced lines bearing the same cDNA as that integrated in line BL276, also exhibited drooping leaves, but the retransformant plants were no taller than the wild-type plants. Furthermore, the BL276-RE lines were pale green, whereas the original BL276 line was a normal green.

Fig. 5

Representative phenotypes of FOX-rice lines. Line names are indicated in the corresponding photographs. T0 plants were grown for 7 weeks (A) or 14 weeks (B, C, F) on soil after transfer from hormone-free medium, or for 2 weeks on hormone-free medium after transfer from regeneration medium (E). (D) T1 seeds were sown and grown on callus-induction medium containing 2,4-dichlorophenoxyacetic acid at 2 mg/L (Toki et al. 2006). Bars = 10 cm in A, C and F, 1 cm in E, and 0.1 cm in D

Table 1 Reproducibility by retransformation of altered phenotypes appearing in the FOX-rice lines

Accordingly, it is likely that the position or pattern of integration of the T-DNA insert(s) into the genome of the original FOX line (BL276) affected the expression of neighboring genes that could be involved in plant growth. Nevertheless, the “drooping leaf” phenotype was reproduced in both the BL276 and the BL276-RE lines.

These results indicate that the various phenotypes appearing in the FOX-rice lines were brought about by the individual introduction of a variety of FL-cDNAs under the control of the maize Ubi-1 promoter.

In the course of production of the FOX-rice lines, albino plants occasionally appeared. We randomly chose 12 albino lines and individually reintroduced their transgenes. None of the reintroduced lines was albino or pale green (data not shown). Thus, we concluded that most albino phenotypes in our FOX-rice lines were caused by other factors, such as somatic variation(s) during tissue culture, rather than by overexpression of the transgenes.

The weak growth and lethality found in line BD108 were reproducibly observed in the reintroduced lines. Line BD108 grew weakly and died 7 weeks after transfer to soil. A rice FL-cDNA, AK071961, was integrated into this line. The deduced protein encoded by AK071961 was annotated in RAP-DB as a phospholipid/glycerol acyltransferase. Since rice growth at the early stage is especially affected by changing environmental conditions, it was not certain that the phenotypes observed in line BD108 were caused by the overexpression of AK071961. To test this, we reintroduced the AK071961 FL-cDNA into wild-type rice, obtained 13 independent AK071961-FOX transgenic lines, and then observed their growth after transfer to soil. All of them grew weakly compared with control plants and died 4 to 11 weeks after transfer to soil (Fig. 6A, B). Total RNA was extracted from four independent AK071961-FOX lines, and semi-quantitative RT-PCR analysis was performed. Although variable lifetimes were observed, the expression levels of AK071961 in all four AK071961-FOX lines were similar (Fig. 6C). The final plant heights and lifetimes of the 13 AK071961-FOX lines were largely dependent on their initial plant heights (Fig. 6B). These data strongly suggest that overexpression of AK071961 caused the growth defect and death in the transgenic rice plants.

Fig. 6

Altered phenotypes related to the AK071961-FOX retransformed lines. (A) Growth of T0 plants of the AK071961-FOX retransformed lines grown for 4 weeks on soil after transfer from hormone-free medium. Bar = 10 cm. (B) Growth curves of T0 plants of the AK071961-FOX lines (shown as “BD108-#”) and control lines transformed with empty pRiceFOX vector (shown as “Vector-#”). BD108-22, the most vigorous plant among the AK071961-FOX lines, died 11 weeks after transfer to soil, but all the control plants grew normally. (C) Semi-quantitative RT-PCR analysis of T0 plants of the AK071961-FOX transgenic lines. Upper panel shows the transcript levels of AK071961 cDNA. Lower panel represents those of Actin1 used for loading adjustment

Characterization of three super-dwarf FOX lines

Three FOX-rice lines (AC214, AG244, and AM249) showed an extremely dwarfed phenotype. Plant height in these lines was less than 10 cm even 120 days after transfer to soil, whereas plants in the vector-control lines exceeded 90 cm at maturity. The leaf blades of the dwarf FOX plants were darker green, shorter, and wider than those of the control plants, a phenotype typical of GA-deficient dwarf rice (Fig. 7A). Although control plants flowered approximately 90 to 100 days after transfer to soil, the FOX plants had not formed floral organs even 150 days after transfer to soil. Sequence analysis of the cDNA-containing fragments obtained by genomic PCR clearly demonstrated that the same rice FL-cDNA, AK101758, was integrated into all three lines. The gene was highly expressed in line AC214 (Fig. 7B). A RAP-DB and KOME database search revealed that the ORF encoded by AK101758 was annotated as a gibberellin 2-oxidase (GA2ox).

Fig. 7

Super-dwarf phenotypes appeared in the FOX line AC214. (A) Growth of 14-week-old T0 plants of the control line (left) and the AC214 line (right). Upper right photograph is a magnified view of the AC214 plant. (B) Semi-quantitative RT-PCR analysis of the AC214 plant (T0). Upper panel shows the transcript levels of AK101758 cDNA. Lower panel indicates those of UBQ5 used as an internal control. (C) A retransformant with AK101758 was sprayed with water (−GA3) or 10 mM GA3 (+GA3), and was photographed 0 (0 d) and 8 days (8 d) after spraying. The super-dwarf phenotype of the AK101758-FOX retransformant was rescued by the GA3 treatment. Bars = 5 cm

The phenotypes of the three super-dwarf FOX lines described above are similar to that of transgenic rice in which the OsGA2ox1 gene is ectopically expressed (Sakamoto et al. 2001). To further confirm that overexpression of AK101758 confers a super-dwarf phenotype, we fused the cDNA 3′-downstream of the PUbi-1 constitutive promoter in the pRiceFOX vector and introduced it into wild-type rice. All primary transformants (12 independent lines) showed the same dwarf phenotype as observed in the three FOX lines. To determine whether the dwarf phenotype was due to reduced levels of active gibberellins (GAs), 10 mM GA3 was applied once to the retransformed plants 14 days after transfer to soil. The seedlings elongated gradually after GA3 treatment (Fig. 7C), indicating that overexpression of the OsGA2ox FL-cDNA in rice reduces biologically active, endogenous GAs to levels that cannot sustain normal plant growth.

Quantification of GAs in the dwarf mutants and the enzyme activity of the AK101758 gene product will be reported elsewhere.

Discussion

We produced approximately 12 000 transgenic rice lines by introducing the rice-FOX Agrobacterium library, which comprises a maximum of 13 980 rice FL-cDNAs from RIKEN (Rice Full-Length cDNA Consortium 2003). Single cDNAs were inserted in 81.4% of the 10 219 FOX-rice lines analyzed by genomic PCR (Fig. 2B, C), and the average number of T-DNA copies per plant was estimated at 2.1 (Fig. S1). The size distribution pattern and average size of FL-cDNAs inserted into the genomes of 238 randomly chosen FOX-rice lines were very similar to those of the original RIKEN rice FL-cDNA library (Fig. 3). As we expected, a variety of transgenes were overexpressed in the corresponding FOX lines (Fig. 4). These features of our collection of FOX-rice lines indicate that our approach should be useful for the systematic and genome-wide gain-of-function analysis of rice genes.

We analyzed the sequences of the cDNA-containing fragments generated by genomic PCR of 10 219 FOX-rice plants, and we identified 5462 independent rice FL-cDNAs, corresponding to 39.1% of the 13 980 rice cDNAs in the FOX Agrobacterium library. The number of rice genes is estimated to be ∼32 000 (Rice Annotation Project 2007). This means that we have introduced only 17.1% of them into the genomes of 10 219 FOX plants.

To increase the number of introduced cDNAs, the simplest strategy will be to produce more FOX-rice lines by using the overexpression library bearing at most 13 980 RIKEN FL-cDNAs. As described in the Results section, we identified 5462 independent cDNAs integrated in the 8225 FOX plants for which we could read the transgene sequences. If we apply these values to the equation of Clarke and Carbon (1976), which is an approximation of the binomial distribution, about 6930 FOX lines are needed to obtain 5462 of the 13 980 cDNAs. Similarly, to obtain a population of FOX lines with 10 000 independent FL-cDNA insertions using our Agrobacterium library, we would need to generate approximately 17 560 FOX plants in total, and ∼64 380 plants to cover 99% (13 840) of the FL-cDNAs. The estimated number of lines (6930) is about 84% of the number actually required in our experiments (8225 lines). One plausible reason for this discrepancy is redundancy in the rice-FOX Agrobacterium library. In fact, 2090 of the 5462 cDNAs appeared twice or more. If we assume that the appearance of cDNAs can be approximated by a binomial distribution, 1653 FL-cDNAs would theoretically appear twice or more among the 8225 lines. The higher frequency of cDNA reiteration in our FOX lines might be partly due to bias introduced during construction of the rice-FOX Agrobacterium library. However, the size distribution and average size of the FL-cDNAs inserted in 238 lines obtained from the early experiments were similar to those of the RIKEN rice FL-cDNA library (Fig. 3). Accordingly, the bias, if any, may not be a serious contributor to the reiteration.
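The estimates in the preceding paragraph can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that every cDNA is equally represented in a library of 13 980 clones and that each sequenced line contributes exactly one cDNA, which simplifies the actual situation. The Clarke and Carbon relation is taken in its usual form, N = ln(1 − P)/ln(1 − f), where P is the desired fraction of the library and f = 1/13 980.

```python
import math


def lines_needed(target_fraction: float, library_size: int) -> float:
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - f), with f = 1 / library_size.

    Assumes (for illustration) that all cDNAs are equally represented.
    """
    f = 1.0 / library_size
    return math.log(1.0 - target_fraction) / math.log(1.0 - f)


def expected_reiterated(n_lines: int, library_size: int) -> float:
    """Expected number of cDNAs recovered two or more times.

    Each cDNA is drawn Binomial(n_lines, 1 / library_size) times under the
    equal-representation assumption; we subtract the probabilities of
    zero and of exactly one appearance.
    """
    f = 1.0 / library_size
    p_zero = (1.0 - f) ** n_lines
    p_one = n_lines * f * (1.0 - f) ** (n_lines - 1)
    return library_size * (1.0 - p_zero - p_one)


if __name__ == "__main__":
    LIBRARY = 13_980   # maximum number of RIKEN FL-cDNAs in the library
    SEQUENCED = 8_225  # FOX lines whose transgene sequences were read

    # Targets: the 5462 cDNAs actually found, 10 000 cDNAs, and 99% coverage.
    for label, fraction in (
        ("5462 cDNAs", 5462 / LIBRARY),
        ("10 000 cDNAs", 10_000 / LIBRARY),
        ("99% coverage", 0.99),
    ):
        print(f"{label}: ~{lines_needed(fraction, LIBRARY):.0f} lines")

    print(
        "cDNAs expected twice or more: "
        f"~{expected_reiterated(SEQUENCED, LIBRARY):.0f}"
    )
```

Under these assumptions the script gives values close to those quoted in the text. The observed reiteration (2090 cDNAs) is clearly higher than the idealized expectation, which is the discrepancy discussed here.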

Second, the growth rates of Agrobacterium cells harboring different cDNAs may not be uniform. Agrobacterium cells carrying a PUbi-1:gusA chimeric reporter gene stain blue when treated with X-glucuronide (Nakamura et al. unpublished observation), indicating that the maize promoter is active and can drive the expression of rice FL-cDNAs in Agrobacterium. Such expression may affect the growth of some subpopulations of the agrobacteria. The third possibility is the effect of ectopic overexpression of various cDNAs in the rice cells. In general, the expression of genes for transcription factors, signal transducers, etc. is kept at low levels, with tissue and temporal specificity, and is often inducible by biotic or abiotic factors (Cheong et al. 2002; Rabbani et al. 2003; Eulgem 2005). We used PUbi-1 for the ectopic and constitutive expression of RIKEN FL-cDNAs in rice. Because the transcription activity of PUbi-1 is high (Cornejo et al. 1993), if the abovementioned genes were to become constitutively active, we would expect to see altered or exaggerated phenotypes, which could give us useful insights for deducing the functions of the genes responsible (i.e., this is an advantage of FOX-hunting). In contrast, such overexpression may sometimes become toxic, conferring critical and lethal effects on the growth of rice (Yamamoto et al. 2007). The latter effect could be considered a disadvantage of FOX-hunting, as it would reduce the total number of FL-cDNAs that can be recovered in the rice genome.

An alternative strategy to increase the number of introduced cDNAs in FOX-rice plants would be to use another population of FL-cDNAs from FAIS (Rice Full-Length cDNA Consortium 2003) for the construction of novel rice-FOX Agrobacterium libraries. We recently prepared six Agrobacterium expression libraries, each containing 273 to 4240 normalized FAIS FL-cDNAs (13 823 in total). These novel FOX Agrobacterium libraries have been used to transform rice.

In 10.0% of the transformed rice lines analyzed by genomic PCR (“No amplification/Tnos–” in Fig. 2B, C), no cDNA was integrated. In pRiceFOX, a pBI-based binary vector, FL-cDNA cassettes are cloned next to the left border (LB) of the T-DNA, whereas the chimeric hpt gene is positioned in the vicinity of the right border (RB). VirD2 protein covalently binds to the 5′ end of the single-stranded T-DNA at the RB and protects it from exonucleolytic degradation (Dürrenberger et al. 1989; Jasper et al. 1994). Accordingly, transgenes adjacent to the RB are more accurately integrated into host genomes than those at the LB (Gelvin 2000). Incomplete insertion of the LB region of the T-DNA into the host genome could therefore prevent integration of the FL-cDNAs. To circumvent this disadvantage of our current system, we made a novel binary plasmid whose T-DNA region carries the promoter from the rice Actin1 gene (McElroy et al. 1990), which drives overexpression of cDNAs placed 3′-downstream of it next to the RB, with the hpt gene positioned on the LB side (Nakamura et al. unpublished results). This vector was used for the construction of the Agrobacterium expression libraries carrying FAIS FL-cDNAs.

We monitored the visible phenotypes in individual FOX lines, and we took photographs 2, 7, and 14 weeks after the lines were transplanted to soil. These phenotypic data, together with the sequence information on the FL-cDNAs integrated into the FOX-rice lines, have been compiled into a database (DFR, for “A Database of FOX-Rice Lines”), which will be made available to plant science researchers through our web site in the future. For convenience, this database will be linked to other databases for rice functional genomics, such as the Rice Tos17 Insertion Mutant Database (http://www.tos.nias.affrc.go.jp/~miyao/pub/tos17/index.html.en; Miyao et al. 2003, 2007), KOME (Rice Full-Length cDNA Consortium 2003), and RAP-DB (Ohyanagi et al. 2006; Rice Annotation Project 2007).

Phenotype databases of rice have been developed by several groups, e.g., the International Rice Information System (IRIS; Wu et al. 2005), Oryzabase (Kurata and Yamazaki 2006), Rice Mutant Database (RMD; Zhang et al. 2006), and Rice Tos17 Insertion Mutant Database (Miyao et al. 2007). In addition to these useful databases, our database, DFR, is expected to provide valuable information to help researchers uncover rice gene functions.

We found that 16.6% of the observed FOX-rice plants [9021 lines (T0 generation)] showed altered phenotypes (Table S2). This rate was much higher than that of the activation tagging population (Jeong et al. 2002). Among the 9021 FOX lines, 104 plants (1.15%) showed weak growth. Growth of transgenic regenerants often suffers from the effects of environmental factors such as humidity, temperature, macro- and micro-nutrient deficiencies or excesses, and microbes (pathogens), especially in the initial growth stage after transfer to soil. Alternatively, in the course of tissue culture for selection and regeneration of transgenic plants, a variety of both epigenetic (e.g., DNA methylation, gene silencing, activation of retrotransposons and transposable elements) and genetic (e.g., nucleotide substitutions, deletions, insertions, rearrangements) changes could occur in the plant genome (Kaeppler et al. 2000; Cheng et al. 2006; Noro et al. 2007). These factors, i.e., abiotic and biotic factors and tissue-culture–induced variations, may influence to varying degrees the growth and consequently the phenotypes of FOX plants, and may somewhat inflate the rate of phenotype alteration in our FOX lines.

However, one FOX line (BD108) overexpressing AK071961 cDNA, provisionally annotated as encoding a phospholipid/glycerol acyltransferase, showed weak growth and early death (Fig. 6). This phenotype was reproducible in the retransformed plants, indicating that overexpression of AK071961 was responsible for the mutant phenotype. A BLAST search using the deduced protein sequence encoded by AK071961 as a query showed that the AK071961 protein was similar to Arabidopsis membrane-bound glycerol-3-phosphate acyltransferases (AtGPATs). Zheng et al. (2003) identified seven AtGPATs and observed knockout phenotypes of the AtGPAT1 gene. The AK071961 protein contained all four of the previously defined acyltransferase domains common to the seven AtGPATs. The AtGPAT1-knockout mutants showed no growth or developmental defects at the vegetative stages but had severely reduced male fertility and altered development of the tapetal cells. Deficiency in AtGPAT1 caused changes in the composition of several fatty acids in flower tissues and seeds. Although the putative AK071961 protein was most similar to AtGPAT2, and neither the knockout nor the overexpression phenotypes of AtGPAT2 have been reported yet, it is likely that the ectopic overexpression of AK071961 altered fatty acid composition, and that these changes weakened the FOX-rice plants and eventually killed them on soil. Kachroo et al. (2003, 2004) demonstrated that a loss-of-function mutation in the ACT1 gene in Arabidopsis, which encodes the soluble chloroplastic enzyme glycerol-3-phosphate acyltransferase, reversed the salicylic-acid- and jasmonic-acid (JA)-mediated defense phenotypes of the ssi2 mutant (SSI2 encodes a stearoyl–acyl carrier protein desaturase) by increasing oleic acid levels. Although the AK071961 protein, unlike the ACT1 protein, contains a conserved transmembrane domain, it may be speculated that alterations in fatty acid composition caused by the overexpression of AK071961 affected the actions of fatty-acid–derived signaling molecules such as JA and caused the severe growth defect in the FOX-rice lines. To test these working hypotheses, we need at least to determine whether fatty acid composition is altered in the AK071961-FOX lines.

In the activation tagging of rice, 35S enhancers have been shown to enhance endogenous gene expression without altering the expression patterns of most T-DNA-tagged genes in transgenic rice (Jeong et al. 2002, 2006). In the FOX-hunting system of rice, the maize Ubi-1 promoter directs ectopic overexpression of attached transgenes of any origin in transgenic plants. A typical example was found in the GA2ox-FOX lines, which showed an extremely dwarf phenotype with dark-green color. Such an exaggerated phenotype is a good indicator of the fundamental gene function(s). However, phenotypes caused by ectopic overexpression are not necessarily related to the native functions of the transgenes. Accordingly, care should be taken in deducing whether or not an observed phenotype is related to the native function of each transgene. To make such deductions accurately, the search for, and use of, loss-of-function mutants such as Tos17 insertion lines (Miyao et al. 2003, 2007) and T-DNA insertion lines (An et al. 2005b) would be of great use. If such gene-knockout lines are not available, the production of transgenic lines with targeted interruption of gene expression using RNAi (Miki and Shimamoto 2004), or of lines carrying a dominant chimeric repressor to suppress the targets of a given transcription factor (Hiratsu et al. 2003), would be effective. Appropriate combinations of gain-of-function resources, such as the FOX-rice plants described here, and loss-of-function resources could be powerful tools to promote functional genomics in rice. From another perspective, we may be able to utilize the potentially useful traits that appear in FOX-rice plants in molecular breeding programs for crop improvement, even when the traits are brought about not by simple exaggeration of native gene function(s) but by ectopic overexpression of the genes responsible.

Here, we describe only the visible phenotypes in our FOX-rice lines. It will be difficult to screen for FOX lines exhibiting agriculturally important traits (e.g., biotic and abiotic stress tolerance) under ordinary growth conditions. Using T1 progeny seeds of FOX lines, screening for genes that confer resistance to blast fungus, salt tolerance, cadmium tolerance and/or accumulation, and tolerance to cold temperature and submergence is in progress, in collaboration with other researchers. These time-consuming screens are of great importance for gene discovery and for the development of plant materials for crop improvement. Of similar importance are our efforts to generate the resources (rice cDNA expression libraries, FOX-rice plants and the progeny seeds) and the information on the resources (database of the FOX-rice lines) for the acceleration of rice gene-hunting.