Introduction

Plant cell wall characteristics strongly affect the availability of lignocellulosic-derived sugar for fermentation and are a major factor affecting cost and efficiency of biomass conversion to biofuels, due to the challenges of pretreatment steps [8, 34, 43]. Arabidopsis, as the primary model plant, has provided a research platform for important discoveries of genes and gene functions associated with primary and secondary cell wall biosynthesis. The genomic tools available for Arabidopsis have also been used to identify genes involved in xylem formation for application in understanding wood formation (e.g., [44, 68]). Nevertheless, it is not clear whether Arabidopsis will provide all the tools necessary for an expanded repertoire of agronomic traits of value in crop species. For instance, previous genetic analysis and transcript profiling studies suggest a role for specific fasciclin-like genes in both primary and secondary wall formation [39, 46, 55]. However, many fasciclins that are highly expressed during formation of cellulose-rich tension wood in Populus spp. appear to lack orthologs in Arabidopsis [2, 39].

Legumes have many traits that make them attractive bioenergy crops, especially as components of mixed grass swards or in crop rotations with maize. Alfalfa (Medicago sativa) is a potential bioenergy legume that fixes atmospheric nitrogen and produces leaf and stem coproducts: the leaf meal for livestock feed [14] and dried stems for conversion to syngas [15] and/or fermentation to ethanol [12]. A perennial crop with high biomass yields, alfalfa is the fourth most widely grown crop in the USA [5]. Nevertheless, studying alfalfa is challenging because it is a cross-pollinated autotetraploid, with complex segregation and inheritance patterns. Because of its ease of genetic manipulation and small genome size, barrel medic (Medicago truncatula) has become a model species for genomic studies of the Fabaceae, including alfalfa. In contrast to alfalfa, M. truncatula is a lesser-grown annual, diploid, and self-pollinating species. Comparative mapping among many legumes has shown a high degree of conservation of gene content and gene arrangement [9, 70], as well as a very high degree of DNA sequence homology between alfalfa and M. truncatula [60].

In previous research, four M. truncatula accessions and two alfalfa genotypes were evaluated for stem tissue morphology and cell wall characteristics to ascertain whether M. truncatula displays comparable diversity in stem cell wall traits to alfalfa [54]. One obvious morphological difference between M. truncatula and alfalfa plants relates to their stem growth habit. Perennial alfalfa plants each year produce erect stems, while the annual barrel medic forms decumbent stems. Nevertheless, cross sections of M. truncatula and alfalfa stems showed similar patterns of tissue differentiation and growth [54]. During primary growth in alfalfa, deposition of nonlignified primary walls predominates in elongating stem internodes proximal to the apical meristem. In older stem internodes of alfalfa undergoing secondary growth, synthesis of lignin- and cellulose-rich secondary wall predominates due to deposition of secondary tissues by vascular cambium [18]. During the postelongation phase, xylem vessel element and fiber cells develop lignified primary and thickened, lignified secondary walls soon after differentiation from the cambium. Phloem fiber cells also develop a thickened cellulose-rich secondary wall, but only the primary wall of phloem fibers lignifies [18]. Similarly, the range of stem cell wall composition and content among M. truncatula accessions was found to resemble that of alfalfa [54]. Statistically significant differences in cell wall composition among the four M. truncatula accessions tested indicate that naturally occurring variation in M. truncatula may be a rich resource for discovering mechanisms regulating cell wall biosynthesis. Overall, previously published results suggest that analysis of plant cell wall traits in alfalfa and other legumes would be facilitated by evaluation of M. truncatula, with well-developed genetic and genomic resources [10, 64, 67].

In Arabidopsis, secondary cell wall formation was shown to increase with increasing distance from the shoot apical meristem toward the base of the inflorescence stem [62]. Sampling of stem segments along this developmental gradient has been instrumental in uncovering plant genes responsible for cell wall biogenesis and control in Arabidopsis [7, 17]. Prassionos et al. [49] used a similar approach for sampling stem segments of hybrid aspen for transcriptome profiling. Transcript analysis of woody plants has unveiled genes involved in lignin, pectin, and cellulose biosynthesis [29, 49]. Macroarray analysis of different plant organs and stem segments has also been used to profile transcript expression patterns of cell walls in maize [26]. These efforts have uncovered many cell wall-associated genes that have putative functions in the phenylpropanoid pathway, several transcription factor (TF) gene families, cell death proteins, and transporters, among others. Additionally, proteome analysis of plant cell walls has allowed the identification of cell wall-localized proteins that have not been previously identified using transcript profiling [35, 65].

The Affymetrix Medicago genome array [1], which contains more than 52,000 probe sets from barrel medic and alfalfa, has been instrumental in the identification of biologically meaningful gene expression patterns in M. truncatula [3, 31, 60] and M. sativa [60]. In this study, we used the Affymetrix Medicago array for a genome-wide expression study in young (elongating) and old (postelongation) stem segments of the M. truncatula accessions A17 and DZA315.16 (hereafter referred to as DZA) and alfalfa clones 252 and 1283. These germplasms were chosen because they express divergent cell wall composition. Identification of differential expression profiles between stem developmental stages was instrumental in identifying genes with putative functions in primary and secondary cell wall biosynthesis and growth in the model legume and cultivated alfalfa.

Methods

Plant Culture

Alfalfa and M. truncatula plants were grown in a greenhouse and in controlled growth chambers, respectively. Alfalfa clones 252 and 1283, which have been identified with consistent differences in stem cell wall cellulose and Klason lignin concentrations (Lamb and Jung, unpublished), were propagated from vegetative cuttings and grown in plastic pots (10 × 10 × 10 cm) containing soil/sand (1:1; v/v) in a greenhouse. When plants reached the full flower stage of development, alfalfa plants were cut back by removing the aerial herbage at 2-cm cutting height. Plants were allowed to regrow for approximately 6 weeks after cutting until they developed multiple stems. Stem segments were sampled at the late bud stage of development as described below. There were three replicates with 16 plants in each replicate. Plants were watered daily with tap water and fertilized weekly with water-soluble fertilizer (20:10:20; N/P/K).

For the M. truncatula experiment, seeds of M. truncatula A17 and DZA were scarified with sandpaper and pregerminated in Petri plates on moist Whatman filter paper for 3 days at 4°C and then moved to room temperature for 24 h. Germinated seeds with approximately equal radicle lengths were planted in pots (10 × 10 × 10 cm) containing Metro-mix 200 (Sun Gro Horticulture, Bellevue, WA, USA) and were grown in a growth chamber (light intensity of 300 μmol m−2 s−1, temperature cycle of 25°C and 21°C, light and dark, with a 16-h photoperiod). One week after planting, seedlings were thinned to a single plant in each pot. Plants were watered with tap water as needed and fertilized weekly with water-soluble fertilizer (20:10:20; N/P/K). Stem tissues were collected at 8 weeks after planting, when plants had developed multiple stems (three to four stems on each plant) with approximately eight to ten internodes per stem. There were three biological replicates with 21 pots in each replicate.

Stem Tissue Harvest

The transition from elongating to postelongation stage of stem internode development is easily identifiable in both M. truncatula and alfalfa by differences in pliability and suppleness of stem internodes. Stiff internodes are located lower on the stem axis, and very pliable internodes are located near the tops of the stems. The internode in transition between these two developmental stages was identified, excised, and discarded. For microarray analysis, two internodes located immediately above (young stem segments, elongating) and below (old stem segments, postelongation) the transition internode were harvested. In general, young stem segments used for microarray analysis consisted of stem segments of the first and second internodes from the shoot apical meristem. Older stem segments generally contained the fifth and sixth internodes from the shoot apical meristem. Stem segments were immediately frozen in liquid nitrogen and stored at −80°C for subsequent RNA extraction. The remaining stem portions from each alfalfa plant (stem segments below the postelongation stem segment) were immediately collected and dried at 60°C for determination of cell wall composition [66].

RNA Extraction and GeneChip Hybridization

Approximately 150 mg of stem tissue ground in liquid nitrogen was used for total RNA extraction using TRIZOL reagent (Invitrogen, Carlsbad, CA, USA) following the manufacturer’s instructions. During the RNA extraction, contaminating genomic DNA was removed by incubating samples with RQ1 DNase following standard procedures suggested by the supplier (Promega, Madison, WI, USA). Ten micrograms of total RNA was used to produce biotin-labeled cRNA using Affymetrix kits following the manufacturer’s suggested procedures for eukaryotic reactions (Affymetrix, Santa Clara, CA, USA). Fifteen micrograms of biotin-labeled cRNA, fragmented as suggested by Affymetrix, was hybridized to the GeneChip® Medicago Genome Array. The integrity and quality of total RNA and fragmented biotin-labeled cRNA were verified using the Agilent 2100 Bioanalyzer RNA 6000 Nano LabChip (Agilent Technologies, Santa Clara, CA, USA). GeneChips were hybridized, washed, stained, and scanned as previously described [60].

Microarray Data Analysis

In all of the data analyses, gene expression signals corresponding to the bacterial microsymbiont probe sets were excluded. Gene expression values were calculated with the robust multi-array average [33] using quantile normalization, as provided with the Genedata Expressionist Pro version 4.5 (Genedata, San Francisco, CA, USA). Presence or absence calls of expression data for each probe set were made using MAS5 [42]. Principal components analysis (PCA) was initially used to evaluate gene expression patterns between young and old stem internodes of M. truncatula and alfalfa. PCA was conducted using the Genedata Expressionist Pro version 4.5 platform (Genedata).
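The quantile normalization step used within the robust multi-array average can be illustrated with a minimal sketch. It is not the Genedata implementation: the function name `quantile_normalize` and the toy intensities are assumptions made for illustration, ties are broken by probe order rather than averaged, and the background correction and probe summarization steps of the full procedure are omitted.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length lists of intensities (one list per array).

    Hypothetical helper for illustration only; ties are broken by probe
    order instead of being averaged.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probe sets")

    # Sort each array, then average across arrays at every rank position.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n_probes)
    ]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            # Each probe receives the mean intensity of its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy intensities for two arrays and three probe sets (made-up values).
example = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization, every array shares the same distribution of values, so differences between arrays reflect changes in rank rather than overall signal intensity.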

Statistical analysis of the stem microarray data between young and old stem segments was based on the t test (p < 0.05) using the GeneSpring Expression analysis software version 7.3 (Agilent Technologies). Because the Medicago genome array contains a large number of probe sets, p values were adjusted for multiple testing to control the occurrence of false positives [56]. For this, the Benjamini and Hochberg false discovery rate [4] was applied using options in the GeneSpring Expression analysis software (Agilent Technologies), and the corrected values were designated as the q values. Heatmaps and expression data clustering were also generated using GeneSpring Expression analysis software (Agilent Technologies). For clustering analysis, the average linkage clustering algorithm and the upregulated correlation similarity measure were used as provided in GeneSpring software (Agilent Technologies).
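As an illustration of the Benjamini and Hochberg adjustment, the following minimal sketch converts a list of p values into q values. It is a generic version of the procedure and not the GeneSpring code; the function name `benjamini_hochberg` and the example p values are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q values) in input order.

    Hypothetical helper for illustration only.
    """
    m = len(p_values)
    # Probe indices ordered from smallest to largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that q values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Made-up p values for four probe sets.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [i for i, value in enumerate(q) if value < 0.05]
```

A probe set would then be called differentially expressed when its q value falls below the chosen threshold, here taken as 0.05 to match the t test cutoff stated above.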

To identify potentially coexpressed probe sets, Pearson correlation coefficients were calculated between each probe set and the CESA and COBL4 genes to find those showing coregulated expression patterns. For coexpression analysis, 60 publicly available Medicago GeneChip data sets collected from several organs and tissues of A17 plants [3, 31], as well as the six GeneChip data sets from young and old stem segments of A17 in this study, were analyzed using GeneSpring Expression analysis software (Agilent Technologies).
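The coexpression screen can be sketched as a ranking of probe sets by their Pearson correlation with a bait profile such as a CESA or COBL4 probe set. This is a minimal sketch under stated assumptions: the function names, the probe identifiers, the expression values, and the correlation threshold of 0.9 are all hypothetical, since the text does not state the cutoff that was applied.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a flat profile
    return sxy / math.sqrt(sxx * syy)


def coexpressed_with(expression, bait, min_r=0.9):
    """Rank probe sets whose correlation with the bait is at least min_r.

    The threshold min_r is an assumed value, not one taken from the study.
    """
    bait_profile = expression[bait]
    hits = []
    for probe, profile in expression.items():
        if probe == bait:
            continue
        r = pearson(bait_profile, profile)
        if r >= min_r:  # NaN compares False, so flat profiles are skipped
            hits.append((probe, r))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Made-up expression values across four samples for hypothetical probe sets.
toy = {
    "bait_probe": [1.0, 2.0, 3.0, 4.0],
    "probe_up": [2.0, 4.0, 6.0, 8.0],
    "probe_down": [4.0, 3.0, 2.0, 1.0],
    "probe_flat": [5.0, 5.0, 5.0, 5.0],
}
hits = coexpressed_with(toy, "bait_probe")
```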

The significantly differentially expressed and uniquely expressed probe sets were categorized into putative functional categories using GeneBins, an online bioinformatics tool for classifying probe sets of the Medicago chip [24] (http://bioinfoserver.rsbs.anu.edu.au/utils/GeneBins). Functional classifications of Medicago probe sets were further refined by homology searches using the predicted protein sequences of the Medicago chip as query sequences to perform BLASTX with an E value cutoff of 10⁻¹⁰ against plant cell wall protein families at the Purdue University cell wall genomics site (http://cellwall.genomics.purdue.edu/) and the cell wall navigator at the University of California, Riverside [23] (http://bioweb.ucr.edu/Cellwall/index.pl). For putative transcription factors, a similar sequence homology search using BLASTX with an E value cutoff of 10⁻¹⁰ was also performed against the database of Arabidopsis transcription factors [27] (http://datf.cbi.pku.edu.cn). Graphical displays of cellular function and regulation overviews of the microarray data were generated using the MapMan software [61] as adapted for the Medicago genome array [66]. All microarray data in this study have been deposited in the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/) under platform number GPL4652.
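Applying the E value cutoff to homology search results can be sketched as follows. The sketch assumes that hits are available in the standard 12-column tab-delimited BLAST output, in which the E value is the eleventh column; the text does not state which output format was used, and the function name and the identifiers in the example are hypothetical.

```python
def best_hits(lines, max_evalue=1e-10):
    """Keep the lowest-E-value hit per query among hits at or below the cutoff.

    Assumes 12-column tab-delimited BLAST output (E value in column 11).
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue > max_evalue:
            continue  # hit does not pass the E value cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Made-up hits for two hypothetical query probe sets.
example_lines = [
    "query1\tsubjectA\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t200",
    "query1\tsubjectB\t70.0\t100\t30\t0\t1\t100\t1\t100\t1e-12\t100",
    "query2\tsubjectC\t40.0\t50\t30\t0\t1\t50\t1\t50\t1e-5\t50",
]
annotations = best_hits(example_lines)
```

In this toy example, only the first query retains an annotation, because the single hit for the second query has an E value above the cutoff.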

Computational and Phylogenetic Analysis

Predicted Medicago fasciclin gene sequences (MtFLAs) were downloaded from Medicago BAC sequences (www.tigr.org/tigr-scripts/medicago/IMGAG/imgag_annotator.pl?). Arabidopsis and poplar fasciclin gene sequences used for comparison were obtained from GenBank. Multiple sequence alignments were performed using ClustalW, and phylogenetic trees were constructed from amino acid sequence alignments of the full-length predicted protein sequences using PHYLIP software [20]. Presence and location of signal peptide cleavage sites in MtFLA amino acid sequences were predicted using the SignalP 3.0 server at http://www.cbs.dtu.dk/services/SignalP/. The presence of a fasciclin-like domain was predicted using InterProScan (http://www.ebi.ac.uk/Tools/InterProScan).

Results and Discussion

Chemical Composition of Alfalfa and M. truncatula Stems

Alfalfa clones 252 and 1283 were identified as part of a long-term breeding program for stem quality traits (Lamb and Jung, unpublished). Chemical composition data from plants grown in field plots over several growing seasons showed that stems of alfalfa clone 252 on average showed consistently higher cellulose (302 ± 3 g kg−1 dry matter (DM)) and Klason lignin (165 ± 1 g kg−1 DM) concentration than stems of alfalfa clone 1283 (cellulose 273 ± 2 g kg−1 DM and Klason lignin 144 ± 1 g kg−1 DM; Lamb and Jung, unpublished). Greenhouse grown mature stem tissues of these clones collected at the same time as the stem samples for the current microarray analysis also showed significant differences for almost all chemical composition variables evaluated [66]. As expected, stems of alfalfa clone 252 displayed significantly higher cellulose and Klason lignin concentration than stems of alfalfa clone 1283. Significant variations were also observed in cell wall uronic acids, arabinose, galactose, and rhamnose concentrations between the two alfalfa clones [66]. In alfalfa, these cell wall monosaccharides were previously shown to be the primary components of pectin [28], indicating that pectin content was considerably higher in stems of clone 1283 compared to stems of clone 252.

The two M. truncatula germplasms used in this study differ significantly in both stem structure and cell wall composition. Stems of A17 showed significantly higher cellulose (421 g kg−1 cell wall) and hemicellulose (154 g kg−1 cell wall) content than stems of DZA (cellulose = 387 g kg−1 cell wall and hemicellulose = 141 g kg−1 cell wall), while DZA stems showed significantly higher pectin content than A17 stems [54]. M. truncatula A17 displayed significantly longer internodes than did DZA [54], consistent with the recent report by Julier et al. [37], who also noted that DZA plants had thicker stems than A17 plants. In contrast, there was comparable Klason lignin content between A17 and DZA stems [54].

These four germplasms were used for identifying robust expression patterns in cell wall-related genes occurring in Medicago stems regardless of variation in cell wall content or stem growth patterns. The study also will serve as a baseline for later identification of genes underlying biological variation in cell wall synthesis.

Overview of Microarray Results

The Affymetrix Medicago genome array was utilized for global transcript profiling of young (elongating) and old (postelongation) stem segments collected from the alfalfa and M. truncatula plants described above. The alfalfa plants were greenhouse grown, while the M. truncatula plants were grown in growth chambers. Consequently, the stem microarray data were analyzed separately for alfalfa and M. truncatula. Signal intensity values were assessed for variability of gene expression data among the three biological replicates of each genotype. Correlation coefficients among the three biological replicates were very high, ranging from 0.94 to 0.99, providing adequate statistical power for identification of differentially expressed probe sets between young and old stem segments of alfalfa and M. truncatula.

Principal component analysis of the stem microarray data indicated that all three biological replicates of each genotype of M. truncatula and alfalfa clustered tightly together (Fig. 1a, b), substantiating the very high correlation coefficients seen among the three biological replicates. In M. truncatula data, the first two principal components accounted for approximately 85% of the total gene expression variation. The second principal component (PCA 2) separated two distinct clusters of data: young versus old stem segments (Fig. 1a), indicating this component represented gene expression variation based on stem developmental stages, regardless of the M. truncatula accessions used in the study. In alfalfa microarray data, a large proportion (82%) of the gene expression variation was also explained by the first two principal components. PCA 2 explained nearly 3% of the total gene expression variation. Along PCA 2, there were four identifiable clusters of gene expression data (Fig. 1b), suggesting that PCA 2 is a measure of gene expression variation between alfalfa clones as well as stem developmental stages.

Fig. 1

Principal component analyses (PCA) of genome-wide gene expression data from Medicago stem segments. Figures represent data clustering along the first two principal components for stem microarray data for (a) M. truncatula and (b) alfalfa. The percentages show gene expression variation explained by each principal component

The number of probe sets detected in stem segments of the two M. truncatula accessions was considerably more than the number of probe sets detected in stem segments of the two alfalfa clones. The percentages of detected probe sets in stem segments of DZA and A17 were 45% and 47% of all the probe sets on the chip, respectively (Supplemental Table S1). An average of 34% of the probe sets on the chip produced signal intensity values with stem samples from the two alfalfa clones (Supplemental Table S1). In our previous work, an average of 46% and 54% of the Medicago probe sets on the chip produced present calls when hybridized with mRNA from M. truncatula first trifoliate and young roots, respectively [60]. A lower percentage, 41% and 44%, of detected probe sets was observed for developmentally comparable alfalfa first trifoliate and young roots, respectively [60]. These results are consistent with the fact that the majority (96%) of the target probe sets were designed from nucleotide sequence information of M. truncatula A17 [1]. Only 4% of the Medicago probe sets were based on nucleotide sequences from alfalfa cDNA libraries.

Hundreds of Differentially Expressed Probe Sets in Young and Old Stem Segments Show Similar Patterns of Expression in M. truncatula and Alfalfa

There was a very high degree of overlap between the probe sets detected in young and old stem segments; approximately 98% of the total probe sets detected in each Medicago germplasm were expressed in both young and old stem segments (Supplemental Table S1). A t test (p < 0.05) in combination with the Benjamini and Hochberg false discovery rate [4] was employed for each Medicago germplasm to identify probe sets with significant transcript differences between young and old stem segments. Consistent with the total number of hybridizing probe sets, the numbers of probe sets significantly differentially expressed between young and old stem segments in both M. truncatula accessions were considerably higher than the numbers of probe sets significantly differentially expressed between young and old stem segments in the two alfalfa clones. Approximately 5,117 and 6,638 probe sets showed significantly different expression patterns between young and old stem segments of DZA and A17, respectively. Of the differentially expressed probe sets in A17 and DZA, 2,629 probe sets were common to both accessions (Fig. 2; Supplemental Table S2A). Of these, approximately 42% of differentially regulated probe sets showed increased transcript accumulation in old stem segments of A17 and DZA. A further 73 probe sets were uniquely expressed in old stem segments of both A17 and DZA, while 67 probe sets were uniquely expressed in young stem segments of both A17 and DZA.

Fig. 2

Venn diagram summarizing the number of significantly different probe sets between young and old stem segments of M. truncatula and alfalfa. To correct for occurrence of false positives in the t test (p < 0.05), multiple testing corrections were applied on the analysis [56] using the Benjamini and Hochberg false discovery rate [4]

For the alfalfa clones, approximately 1,384 and 383 probe sets were significantly differentially expressed between young and old stem segments of 252 and 1283, respectively. Of the differentially expressed probe sets in the alfalfa clones, 119 probe sets were common to both 252 and 1283 (Fig. 2; Supplemental Table S2B). Approximately 13% of the differentially expressed probe sets showed significantly more transcript accumulation in old stem segments of the alfalfa clones. A further four probe sets were uniquely expressed in old stem segments, while 65 probe sets were uniquely expressed in young stem segments of both 252 and 1283. Overall, 52 probe sets were differentially expressed between stem segments across all four Medicago germplasms evaluated.
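The overlap counts summarized in the Venn diagram amount to set operations on the lists of significant probe sets from each germplasm. The following minimal sketch illustrates the idea; the function name `venn_counts` and the probe identifiers are hypothetical, and the actual comparison in this study was made within the analysis software.

```python
def venn_counts(list_a, list_b):
    """Count probe sets unique to each list and shared by both lists."""
    a, b = set(list_a), set(list_b)
    return {
        "only_a": len(a - b),   # significant only in the first germplasm
        "shared": len(a & b),   # significant in both germplasms
        "only_b": len(b - a),   # significant only in the second germplasm
    }


# Made-up lists of significant probe sets for two germplasms.
counts = venn_counts(
    ["probe1", "probe2", "probe3"],
    ["probe2", "probe3", "probe4", "probe5"],
)
```

Extending the comparison to all four germplasms only requires intersecting the four sets in turn.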

Functional Classification of Differentially Expressed Probe Sets and Visual Display Using MapMan Software

The differentially expressed probe sets between young and old stem segments of M. truncatula and alfalfa were assigned to functional categories as described in the “Methods”. Functional classification of the 2,629 differentially expressed probe sets in young and old stem segments of the two M. truncatula accessions showed that numerous genes have predicted roles in transcriptional regulation and signal transduction (8%), primary and secondary metabolism (19%), as well as protein modification and degradation (10%). A similar functional approach was used to classify the 119 differentially regulated probe sets in stem segments of alfalfa clones 252 and 1283. The largest putative functional categories included transcriptional regulation and signal transduction (9%), primary and secondary metabolism (26%), as well as enzyme families (14%) and transport function (8%).

To gain an overview of cellular and metabolic functional categories, the transcriptional profiles of the 2,629 differentially regulated probe sets in M. truncatula stem segments were visually displayed using MapMan software [61], as modified recently for the Medicago GeneChip [66]. The overview of cellular functions presented in Supplemental Figure S1 showed that probe sets with significantly higher transcript accumulation in young stem segments were largely categorized in DNA repair and synthesis, cell division, cell cycle and cell organization, hormone-related signaling, and enzyme family classes. On the other hand, many probe sets with significantly higher transcript abundance in old stem segments of M. truncatula have predicted function in regulation of transcription, hormone-related signaling, protein modification, and degradation (Supplemental Fig. S1).

An overview of differentially regulated probe sets representing metabolism and regulatory functions in M. truncatula is presented in Fig. 3a, b. Genes with higher transcript abundance in young stem segments include many implicated in lipid and flavonoid metabolism and cell wall degradation and modification, among other metabolic classes. In contrast, genes with significantly higher transcript abundance in old stem segments of M. truncatula were implicated in cellulose synthesis, regulation of transcription (transcription factors), signaling (receptor kinases and auxin-mediated signaling), protein modification, and protein degradation (Fig. 3a, b).

Fig. 3

Functional overview of significantly differentially expressed probe sets in young versus old stem segments of M. truncatula. Figures depict MapMan software overview [61] for visualizing differential transcript abundance of probe sets with cellular functions associated with (a) primary and secondary metabolism and (b) regulatory and signal transduction functions. Expression ratios expressed as old/young stem internodes were log2 transformed and are shown as red and blue squares representing probe sets with upregulated expression in young and old stem segments, respectively
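The log2-transformed old/young expression ratio described in the caption can be illustrated with a minimal sketch. It assumes linear-scale signal intensities and averaging of replicates before the ratio is taken; neither detail is stated in the text, and the function name and the values are hypothetical.

```python
import math


def log2_ratio(old_replicates, young_replicates):
    """Return log2(mean old signal / mean young signal).

    Assumes linear-scale intensities; averaging replicates before taking
    the ratio is an illustrative choice, not a documented one.
    """
    mean_old = sum(old_replicates) / len(old_replicates)
    mean_young = sum(young_replicates) / len(young_replicates)
    return math.log2(mean_old / mean_young)


# Made-up signal intensities for three biological replicates per stage.
higher_in_old = log2_ratio([800.0, 800.0, 800.0], [100.0, 100.0, 100.0])
higher_in_young = log2_ratio([50.0, 50.0, 50.0], [200.0, 200.0, 200.0])
```

A positive value indicates higher transcript abundance in old stem segments and a negative value indicates higher abundance in young stem segments. If expression values are already on a log2 scale, the ratio reduces to a difference of means.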

Genes Associated with Wall Modification Show Enhanced Expression in Young Stem Segments of Medicago

Cell wall-related genes that were differentially expressed between young and old stem segments were identified. Those genes with significantly higher transcript abundance in young stem segments of M. truncatula and alfalfa are presented in Tables 1 and 2, respectively. In young stem segments, most upregulated and/or preferentially expressed cell wall-related genes include expansins, beta-galactosidase, glycosyl hydrolase, xyloglucan endotransglucosylase/hydrolase (XET/XTH), proline-rich proteins, and fasciclin-like arabinogalactan proteins (AGP), among others. Many of the cell wall-related genes encode wall modifying proteins that play important roles in the relaxation of the rigid primary cell wall to allow elongation and extension during plant growth [11, 21, 22, 48]. We observed upregulated expression of two probe sets encoding XTHs, together with a uniquely expressed XET probe set (Mtr.45463.1.S1_at) in young stem segments of M. truncatula. There were also four expansin-like probe sets (Mtr.37590.1.S1_s_at, Mtr.9830.1.S1_at, Mtr.6653.1.S1_s_at, and Msa.1714.1.S1_at) that showed upregulated expression in young stem segments of M. truncatula. Expansins are thought to be distributed primarily over the expanding cell wall and function as cell wall loosening proteins [48]. Our observations for the increased transcript accumulation of expansins, XTHs, and XET in young stem segments are consistent with the view that these cell wall proteins are expected to be predominantly active in primary walls of elongating tissues during plant growth [11, 21, 22, 48].

Table 1 Cell wall-related genes upregulated in young stem segments of M. truncatula A17 and DZA
Table 2 Cell wall-related genes upregulated in young stem segments of alfalfa clones 252 and 1283

A number of probe sets that were upregulated in young stem segments of M. truncatula and alfalfa belong to genes encoding hydrolytic enzyme classes that appear to be involved in cell wall breakdown. Such genes include several homologs of glucanases, glucosyl hydrolases, and galactosidases, whose functions are largely related to cell wall expansion through hydrolysis of the pectin matrix (Tables 1 and 2).

Expression Patterns of Selected Marker Genes for Secondary Cell Wall Deposition Support Our Stem Segments Sampling Approach

Approximately 42% of the differentially expressed probe sets showed significantly upregulated expression in old stem segments of M. truncatula. Many genes that were upregulated in old stem segments showed similarities with proteins associated with secondary cell walls including cellulose synthases (CESA), COBRA-like protein 4 precursor (COBL4), never in mitosis gene A (NIMA)-related protein kinase, peroxidase, 4-coumarate:CoA ligase, cytochrome P450, cinnamyl alcohol dehydrogenase, and some fasciclin-like AGPs (Tables 3 and 4). COBL4, some members of the CESA family, as well as selected lignin biosynthesis genes, are known marker genes for secondary wall biosynthesis in plants [6, 7, 46]. Our transcript profiling results coincide well with upregulated expression patterns of such secondary cell wall marker genes in old stem segments and appear to provide strong support for our stem segment sampling approach for studying cell wall genomics in Medicago. In most previously reported studies using different stem segments, transcript abundance of secondary cell wall marker genes including those involved in lignification was the highest in older stem segments and showed a decreased pattern of expression in stem segments near the apical meristem [7, 17, 49].

Table 3 Top 30 upregulated cell wall-related genes in old stem segments of M. truncatula A17 and DZA
Table 4 Cell wall-related genes upregulated in old stem segments of alfalfa clones 252 or 1283

In Arabidopsis, the extracellular glycosylphosphatidyl inositol (GPI)-anchored protein COBL4 appears to participate in cell expansion and is required for cellulose biosynthesis in secondary walls [6, 7, 52, 53]. Mutation of the Arabidopsis COBL4 gene (At5g15630) by T-DNA insertion resulted in plants showing a moderate irregular xylem (irx6) phenotype with significantly reduced cellulose levels and stem strength, so that mutant plants had easily broken stems [7]. A rice mutant described as brittle culm1 (bc1) was found to be a functional ortholog of the AtCOBL4 gene, as mutations in the rice gene resulted in reduced cell wall thickness affecting the mechanical strength of rice plants [40]. Here, a Medicago homolog of the COBL4 gene (Mtr.5947.1.S1_at) showed more than 23- and 50-fold higher transcript abundance in old stem segments of DZA and A17, respectively, compared to young stem segments (Table 3). This probe set was also upregulated in old stem segments of both alfalfa clones.

The Arabidopsis genome contains a superfamily of approximately 41 predicted CESA-like genes [30, 50], and the differential expression pattern of CESA genes in primary versus secondary cell walls is well documented. Based on genetic experiments and gene-expression analyses, three Arabidopsis CESA genes (AtCesA1, AtCesA3, and AtCesA6) typify primary walls and are coexpressed during primary cell wall formation [47], while three other CESA genes (AtCesA4/IRX5, AtCesA7/IRX3, and AtCesA8/IRX1) are involved in cellulose biosynthesis in secondary cell walls [47, 58, 59, 62]. In Arabidopsis, mutations of certain CESA genes resulted in the collapse of the secondary cell wall of xylem (irregular xylem), indicating that CESA genes are required for biosynthesis of secondary cell walls [62]. The Medicago chip contains approximately 28 probe sets encoding CESA genes. One CESA probe set (Mtr.33499.1.S1_at) was uniquely expressed in old stem segments of both A17 and DZA, while 15 other CESA probe sets showed particularly high transcript abundance in old stem segments of A17 and DZA (Fig. 4). The remaining Medicago CESA probe sets showed increased transcript abundance in young stem segments.

Fig. 4

Cellulose synthases show differential expression patterns in stem segments of M. truncatula. The heat map shows ratio of signal intensity values in old stem segments of each Medicago ecotype relative to signal intensity values in young stem segments of the same ecotype. Red indicates upregulated expression, green indicates downregulation, and yellow indicates no change in expression profiles compared to young internodes

Differentially Expressed Transcription Factors and Signal Transduction Genes Suggest Transcriptional Control of Stem Development and Growth in Medicago

Transcription factors are key global regulators of gene expression and are known to play critical roles in many biological processes, including the regulation of cell wall development in plants. In Arabidopsis, TF-encoding genes make up approximately 6% (about 1,800) of the total number of genes including about 72 WRKY family genes, more than 600 zinc finger proteins, and 199 MYB and MYB-related transcription factors [19, 27, 51, 57]. Sequencing of the M. truncatula genome is in progress. Using BLAST analysis of the available M. truncatula genome sequencing data, Udvardi et al. [63] identified about 1,084 TF genes. We found that the Affymetrix Medicago chip contains approximately 1,870 probe sets that by amino acid homology could be classified as putative TFs by established criteria (http://datf.cbi.pku.edu.cn/). Approximately 113 putative TF probe sets were significantly differentially expressed in young versus old stem segments of both A17 and DZA; approximately 65% of these TFs showed increased transcript abundance in old stem segments of M. truncatula. The differentially expressed TF probe sets in M. truncatula stems represented 35 TF families (Supplemental Table S3). In alfalfa, approximately five putative TF probe sets were differentially expressed in young versus old stem segments of both 252 and 1283. Differentially regulated TF probe sets in alfalfa stems represented bHLH (Mtr.20533.1.S1_at and Mtr.33785.1.S1_at), CAMTA (Mtr.42126.1.S1_at), and APETALA2/ethylene-responsive element binding protein family (AP2/EREBP; Mtr.2744.1.S1_at and Mtr.41294.1.S1_at) TF families.

The list of differentially expressed TF families in M. truncatula stem segments includes several plant-specific TF families, among them the AP2/EREBP, auxin/indole-3-acetic acid (Aux/IAA), auxin-responsive factor (ARF), GRAS, NAC, and WRKY families. The precise contribution of the differentially regulated TFs to modulating cell wall biosynthesis in Medicago stems remains to be determined, and some or all of them may have important roles in other developmental processes within Medicago stems. Nevertheless, our transcript profiling results were consistent with the lists of putative wall-associated TFs identified by transcript profiling in several plant species [reviewed by 13, 69]. For instance, the Aux/IAA genes form a plant-specific TF gene family that participates in auxin-regulated transcriptional control of gene expression [38, 45]. Wall synthesis during plant development and growth was shown to be influenced by endogenous levels of hormones, and additional modifications can be induced by biotic or abiotic stresses [32]. Auxin triggers a specific signal transduction pathway that influences apical dominance, vascular tissue development, cell elongation, and tissue patterning. Expression of Aux/IAA genes is auxin inducible, which is expected to provide a negative-feedback loop for auxin responses through the formation of homo- and heterodimers with other Aux/IAAs or with other TFs such as ARF proteins. Several ARF probe sets were differentially regulated in Medicago stem segments.

With regard to signal transduction, at least 4% of the differentially regulated probe sets in young and old stem segments of M. truncatula and alfalfa germplasm represent genes that are implicated in signal transduction cascades. The differentially regulated signaling genes include homologs of receptor-like protein kinases (RLKs), several GPI-anchored proteins of unknown function, many Ser/Thr protein kinases/phosphatases, and genes with interacting domains such as leucine-rich repeat (LRR)-containing protein kinases, enzyme inhibitors, and proteases. RLKs and LRR domain-containing RLKs have been implicated in many plant developmental processes, including the control of fiber development in cotton [41].

Identification of Additional Candidate Genes for Secondary Cell Walls Through Analysis of Coexpressed Probe Sets with Selected Marker Genes

To further identify candidate genes that may have roles associated with secondary cell wall deposition in Medicago, we extended our analysis to a large collection of publicly available M. truncatula microarray data and examined coexpression patterns using marker genes. Similar coexpression approaches were recently used successfully in Arabidopsis to identify genes required for cellulose synthesis in primary or secondary cell walls [7, 46]. Marker genes associated with secondary cell wall biosynthesis in Arabidopsis that were used for coexpression analysis included three CESA genes (AtCesA4/IRX5, AtCesA7/IRX3, and AtCesA8/IRX1) and COBL4 [6, 46, 59]. We used the expression profiles of a CESA gene (represented by Mtr.5123.1.S1_at) and of a COBL4 homolog (represented by Mtr.5947.1.S1_at) as reference probe sets to identify coregulated genes in the publicly available M. truncatula microarray data. These two probe sets were selected as reference points for our analysis because (a) Mtr.5123.1.S1_at was the probe set whose deduced protein sequence showed the greatest similarity (approximately 80% amino acid sequence identity) to AtCesA8, which is known to be required for cellulose synthesis during secondary cell wall formation [58]; (b) Mtr.5947.1.S1_at was the probe set whose deduced protein sequence showed the greatest similarity (approximately 76% amino acid sequence identity) to AtCOBL4, which in prior studies was found to be required for secondary cell wall synthesis and was among the genes that showed a high level of coexpression with secondary cell wall-associated CESA genes [7, 46]; and (c) these two probe sets were among those with the highest expression ratios in old stem segments of Medicago.

Our analysis revealed 213 probe sets that were coexpressed (R² ≥ 0.7) with the COBL4 and/or CESA reference probe sets (Supplemental Table S4). These included probe sets with putative functions associated with cellulose synthesis; cell wall structural components, including fasciclin-like AGPs, laccases, and peroxidases; and putative signaling components and regulators, such as no apical meristem (NAM)-like (NAC-like), WRKY, and MYB family transcription factors as well as several receptor-like protein kinases (Table 5). Although the coexpressed probe sets implicated in signal response and transcriptional regulation have not been investigated experimentally, many of the coexpressed genes are potentially novel secondary cell wall-associated genes in Medicago. Our list of coexpressed genes also included genes previously reported to be coregulated with secondary cell wall formation [7, 46]. For example, several Medicago fasciclins (MtFLA) were coexpressed with both the marker CESA and COBL4 probe sets (Table 5). In previous studies, two Arabidopsis genes, AtFLA11 (At5g03170) and AtFLA12 (At5g60490), were among those that were most highly coexpressed with AtCesA genes required for secondary cell wall formation [7, 46]. Arabidopsis contains at least 21 fasciclin-like genes [36], and a mutation in the fasciclin domain of one of the fasciclin-like genes (AtFLA4) resulted in aberrant cell expansion in Arabidopsis [55]. Plant fasciclins are a subgroup of AGPs that contain unique fasciclin domains that also occur in proteins from bacteria, mammals, sea urchins, and yeast and are thought to be involved in cell adhesion [16, 36]. Fasciclin-like AGPs were highly abundant in Populus spp. with strong preferential expression in xylem cells during formation of cellulose-rich tension wood [2, 39]. Nevertheless, many poplar fasciclins appear to lack orthologs in the Arabidopsis genome [2, 39].
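The coexpression filter described above can be illustrated with a minimal sketch. It assumes that R² is the squared Pearson correlation between a probe set's expression profile and a reference profile across experiments, which the text does not state explicitly. The expression values, the probe names probe_A and probe_B, and the function names are hypothetical and serve only to show the thresholding step.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def coexpressed(profiles, reference_ids, threshold=0.7):
    """Return probe sets whose best R^2 against any reference is >= threshold."""
    hits = {}
    for probe_id, profile in profiles.items():
        if probe_id in reference_ids:
            continue  # do not report the references themselves
        best_r2 = max(pearson_r(profile, profiles[ref]) ** 2
                      for ref in reference_ids)
        if best_r2 >= threshold:
            hits[probe_id] = best_r2
    return hits


# Hypothetical toy profiles (five experiments each); not data from the study.
profiles = {
    "Mtr.5123.1.S1_at": [1.0, 2.0, 3.0, 4.0, 5.0],  # CESA reference
    "Mtr.5947.1.S1_at": [1.1, 2.1, 2.9, 4.2, 4.8],  # COBL4 reference
    "probe_A": [2.0, 4.1, 6.0, 7.9, 10.0],           # tracks the references
    "probe_B": [3.0, 1.0, 4.0, 1.0, 5.0],            # unrelated pattern
}
references = ["Mtr.5123.1.S1_at", "Mtr.5947.1.S1_at"]
hits = coexpressed(profiles, references)
```

Because R² discards the sign of the correlation, a strongly anti-correlated probe set would also pass this filter; an analysis interested only in positive coexpression would additionally require r > 0.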

Table 5 List of most highly coexpressed probe sets for COBL4 and CESA genes using publicly available M. truncatula A17 microarray data

The Medicago chip contains approximately 17 probe sets representing fasciclin-like genes (MtFLAs). Of these, 11 probe sets were designed from sequences based on gene predictions of the M. truncatula genome sequencing project. The coding sequences of the MtFLAs for which complete sequence information is available range in size from 931 to 1,971 nucleotides. Our analysis showed that many of the MtFLAs appear to share several features: (a) an N-terminal signal peptide sequence, (b) 5′ and 3′ untranslated regions, (c) a fasciclin domain, and (d) absence of introns. The exception was a gene represented by probe set Mtr.48645.1.S1_at that displayed two fasciclin domains along the coding sequence and contained a single intron of 673 nucleotides. Proteins encoded by some of the MtFLAs also contained one or two transmembrane domains at their N- and/or C-terminal regions.

In an attempt to construct an MtFLA expression atlas, the tissue-specific and developmentally regulated expression patterns of all 17 MtFLAs on the chip were evaluated using the publicly available Medicago microarray data [3, 31] and data from this study. Three expression clusters were evident (Fig. 5). The first expression cluster comprised nine MtFLA probe sets that showed particularly high transcript abundance (more than tenfold difference) in old stem segments compared to young stem segments. Interestingly, these probe sets showed highly coregulated expression patterns with COBL4 and CESA genes (Table 5) and also showed very strong gene expression in petioles, stems, and young roots of 4-week-old A17 plants (Fig. 5). Very strong gene expression by members of this cluster was also observed in young nodules of A17 roots at 4 days postinoculation (pi) with Sinorhizobium meliloti, presumably because the young nodule samples also contained some root tissues attached to the nodules. However, very low expression was observed in nodule samples at 10 and 14 days pi, as well as in mature nodules. The second expression cluster comprised three MtFLA probe sets (Mtr.17362.1.S1_at, Mtr.50904.S1_at, Mtr.48994.S1_at) that showed 1.5- to twofold more transcript abundance in old stem segments compared to young stem segments. These probe sets also showed enhanced gene expression in petioles and stems of 4-week-old A17 plants. The third expression cluster consisted of those MtFLAs (Mtr.17392.1.S1_at, Mtr.37610.1.S1_s_at, Mtr.42768.1.S1_at, Mtr.42950.1.S1_at, and Mtr.48645.1.S1_at) with considerably downregulated expression in old stem segments, indicating enhanced transcript abundance predominantly in young stem segments of A17. Members of this expression cluster also showed downregulated expression in roots, petioles, and stems of 4-week-old A17 plants.
These results are consistent with previous genetic analysis and transcript profiling studies that suggested a role for fasciclin-like AGPs in both primary and secondary wall formation [39, 46, 55]. However, there is limited information on the functional importance of fasciclins in plants.
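The three expression clusters can be illustrated by binning probe sets on their old-to-young stem signal ratio, as in the following minimal sketch. The cutoffs restate the fold changes quoted in the text (more than tenfold, 1.5- to twofold, and downregulated); the study does not say the clusters were defined by fixed cutoffs, so this binning rule, the function name, and the signal values are assumptions made purely for illustration.

```python
def classify_fla(old_signal, young_signal):
    """Assign a probe set to an expression cluster from its old/young ratio.

    The thresholds mirror the fold changes quoted in the text; treating
    them as hard cutoffs is an assumption, not the authors' procedure.
    """
    ratio = old_signal / young_signal
    if ratio > 10.0:
        return "cluster 1"   # strongly upregulated in old stem segments
    if 1.5 <= ratio <= 2.0:
        return "cluster 2"   # modestly upregulated in old stem segments
    if ratio < 1.0:
        return "cluster 3"   # more abundant in young stem segments
    return "unassigned"      # falls between the ranges given in the text


# Hypothetical signal intensities as (old stem, young stem) pairs.
examples = {
    "probe_high": (1200.0, 100.0),
    "probe_modest": (180.0, 100.0),
    "probe_down": (40.0, 100.0),
    "probe_gap": (500.0, 100.0),
}
clusters = {name: classify_fla(old, young)
            for name, (old, young) in examples.items()}
```

A ratio such as 5 falls outside all three described ranges and is reported as unassigned, which makes explicit that the fold changes quoted in the text do not cover every possible value.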

Fig. 5

Transcript atlas of Medicago fasciclins (MtFLAs). Results include publicly available M. truncatula A17 microarray data as described in methods. The heat map represents ratio of gene expression in each organ relative to signal intensity values in young stem segments of A17. Red indicates upregulated expression, green indicates downregulation, and yellow indicates no change in expression profiles compared to young internodes

The phylogenetic position of the MtFLAs was compared with that of FLA homologs in the poplar and Arabidopsis genomes. We performed multiple sequence alignments of the deduced amino acid sequences of the 11 MtFLAs for which a complete coding sequence is available, along with the amino acid sequences of Arabidopsis and poplar fasciclins. Amino acid sequences of known classical AGPs were included for comparison in phylogenetic tree construction. In the phylogenetic tree, classical AGPs formed a tight cluster that was distinct from FLAs, while FLAs were clustered in about five subclades (Supplemental Figure S2). In two subclades, which contained one or no Arabidopsis FLAs, numerous MtFLAs were clustered very closely with poplar fasciclins (Supplemental Figure S2).

Conclusions

As an experimental system, the genus Medicago combines the advantages of a model plant with ample genetic and genomic resources (M. truncatula) and a cultivated crop (M. sativa) that is economically important as a forage plant, improves soil fertility, and has potential as a bioenergy feedstock. The large number of differentially regulated cell wall-related genes in M. truncatula will be valuable for exploring genetic systems controlling primary and secondary cell wall deposition in plants and may facilitate gene discovery and improvement of biomass production and quality traits in cultivated alfalfa or related dicots.

The phylogenetic grouping of the Eudicots shows that the eurosid I clade, in addition to Fabales (legumes), contains the Malpighiales (where poplar and willow are placed), among many other woody species [25]. By contrast, Arabidopsis, in the Brassicales lineage, is found in a different branch within the Rosids, the eurosid II clade. Beyond legumes, M. truncatula may have utility as a genetic model for cell wall development in closely related woody dicots and offers a new approach to study an expanded repertoire of agronomic traits of value in other crops. Multiple model plants and a systems approach will be needed to decipher the regulation of cell wall biogenesis in plants. It is probable that M. truncatula could help to fill the knowledge gap concerning divergent genes that cannot be addressed in Arabidopsis. Moreover, M. truncatula, a short-lived annual, will facilitate experimental analysis of gene function that would be difficult in woody dicots, including poplar, because of their long generation times.