Toward better annotation in plant metabolomics: isolation and structure elucidation of 36 specialized metabolites from Oryza sativa (rice) by using MS/MS and NMR analyses

Metabolomics plays an important role in phytochemical genomics and crop breeding; however, metabolite annotation is a significant bottleneck in metabolomic studies. In particular, in liquid chromatography–mass spectrometry (MS)-based metabolomics, which has become a routine technology for the profiling of plant-specialized metabolites, a substantial number of metabolites detected as MS peaks are still not assigned properly to a single metabolite. Oryza sativa (rice) is one of the most important staple crops in the world. In the present study, we isolated and elucidated the structures of specialized metabolites from rice by using MS/MS and NMR. Thirty-six compounds, including five new flavonoids and eight rare flavonolignan isomers, were isolated from the rice leaves. The MS/MS spectral data of the isolated compounds, with a detailed interpretation of MS fragmentation data, will facilitate metabolite annotation of the related phytochemicals by enriching the public mass spectral data depositories, including the plant-specific MS/MS-based database, ReSpect.

Currently, metabolite identification is the major bottleneck in metabolomic studies (Matsuda et al. 2009; Dunn et al. 2012). It is estimated that there are between 200,000 and 1,000,000 different metabolites in the plant kingdom (Dixon and Strack 2003; Afendi et al. 2012). Many plant secondary metabolites have been demonstrated to have 'specialized' roles of adaptive significance in protection against predators and microbial infection. These metabolites have therefore recently been termed 'specialized metabolites', which also avoids the impression, conveyed by the word 'secondary', that they are less important than 'primary' metabolites (Pichersky and Lewinsohn 2011; Saito 2013). Most specialized metabolites remain unidentified, and many known metabolites are commercially unavailable. In untargeted metabolite profiling, most metabolites cannot be confidently identified due to the lack of authentic standards. These metabolites are often putatively annotated by comparing their accurate mass and MS/MS patterns with those in the literature or databases. However, the MS/MS spectra of plant-specialized metabolites in databases are especially limited, and many more MS/MS spectra must be acquired to develop such databases. Isomers of many metabolites show similar chromatographic or mass spectrometric characteristics; therefore, substantial numbers of metabolites detected as MS peaks have not been unambiguously assigned to a single metabolite in MS-based metabolite profiling (Matsuda et al. 2009; Lei et al. 2011). Nuclear magnetic resonance (NMR) is a very powerful method for structural analysis, especially for stereoisomers. Hence, purification and structural elucidation of (un)known metabolites by combining a variety of spectroscopic methods such as MS/MS and NMR are useful for unambiguous identification of (un)known phytochemicals in plant metabolomics (Nakabayashi et al. 2009; Van der Hooft et al. 2013).
To enable better annotation in plant metabolomics, we aimed to isolate and identify specialized metabolites from model plants, such as Arabidopsis thaliana (Nakabayashi et al. 2009), by using MS/MS and NMR methods. Recently, metabolome studies of Oryza sativa (rice), one of the most important staple crops worldwide, have attracted increasing attention (Kusano et al. 2007; Suzuki et al. 2009; Calingacion et al. 2011; Redestig et al. 2011; Matsuda et al. 2012; Chen et al. 2013; Jung et al. 2013). Therefore, it is important to enrich the libraries of standard compounds and reference MS/MS spectra for specialized metabolites of rice. Habataki (an indica variety) is an elite, high-yielding Japanese cultivar. Previous studies have indicated that rice leaves contain various flavonoids and that Habataki produces high levels of a flavonoid C-glycoside (apigenin-6,8-di-C-α-L-arabinoside) owing to a genetic polymorphism. Unequivocal structures of such metabolites are useful for understanding gene-to-metabolite correlations (Matsuda et al. 2012). In the present study, we isolated and identified specialized metabolites from rice leaves (cultivar Habataki). On the basis of the accurate mass of the precursor ion and the fragmentation patterns of collision-induced dissociation (CID) MS/MS, together with NMR spectra, 36 compounds, including five new flavonoids, were isolated from rice leaves and assigned. Most of the isolated compounds were flavonoid glycosides with tricin, apigenin, and chrysoeriol as the aglycones. The MS/MS data have been uploaded to the ReSpect database (http://spectra.psc.riken.jp), which will support metabolomic studies of rice and its related species and facilitate the annotation of plant metabolites (Sawada et al. 2012).

Isolation of specialized metabolites
The leaf powder of rice (90 g) was extracted with 90 % methanol as described in a previous study (Matsuda et al. 2012). The extract was suspended in water and partitioned into hexane and water layers. The water layer was subjected to ODS column chromatography and eluted with CH3OH-H2O (0:100 → 100:0 v/v; containing 0.05 % formic acid) to afford nine fractions (Fr.1-9). These fractions were purified using semipreparative HPLC performed under the following conditions: column, Cadenza CD-C18 or Unison UK-C18 (Imtakt), 150 × 10 mm i.d.; particle size, 3 μm; solvents, water and methanol or acetonitrile, containing 0.1 % v/v formic acid; and flow rate, 3.0 mL/min. The following compounds were obtained: 1 (4.52 mg), 2 (12.69 mg), 3 (2.07 mg), 4 (2.57 mg), 5 (1.15 mg), 6 (0.94 mg), 7 (1.51 mg), 8 (1.23 mg), 9 (0.71 mg), 10 [...]
MS detection was performed on a Waters Xevo G2 QTOF mass spectrometer with an electrospray ionization (ESI) interface (Waters). Full scan mass spectra were recorded over the range m/z 50-1,500. Nitrogen was used as the nebulizer and auxiliary gas; argon was used as the collision gas. The ESI source was operated in positive and negative ionization modes with a capillary voltage of 3 kV, sampling cone voltage of 25 V, cone gas flow of 50 L/h, desolvation gas flow of 800 L/h, desolvation temperature of 450 °C, source temperature of 120 °C, and CID energy ramped from 10 to 50 eV. Tandem MS analysis was performed using fast data directed analysis (FastDDA), which provides rapid, automated MS/MS data acquisition for targeted qualitative analyses. Data acquisition and processing were performed with the MassLynx 4.1 software.

NMR analysis
The NMR spectra were recorded on a Bruker 600 MHz spectrometer with a DCH CryoProbe (Bruker BioSpin GmbH, Rheinstetten, Germany). One-dimensional (1D) 1H-NMR spectra were acquired with 64 or 128 scans at a receiver gain of 11.3 using standard pulse sequences. 1D 13C-NMR and two-dimensional (2D) NMR experiments, namely 1H-1H correlation spectroscopy (COSY), 1H-13C heteronuclear single quantum coherence spectroscopy (HSQC), and 1H-13C heteronuclear multiple bond connectivity spectroscopy (HMBC), were performed using standard pulse sequences. Dimethylsulfoxide-d6 or methanol-d4 was used as the solvent, and tetramethylsilane (TMS) was used as an internal standard. The samples were placed in 5 mm Shigemi micro NMR tubes (DMS-005B and MMS-005B; Shigemi, Tokyo). NMR data were acquired and processed with the TopSpin software (Bruker BioSpin GmbH, Rheinstetten, Germany).

Data upload
All data acquired by LC-QTOF-MS/MS were uploaded to DROP Met in PRIMe (http://prime.psc.riken.jp/) and are freely available.

Results and discussion
In this study, to improve the level of metabolite annotation available to the metabolomics research community, we focused on flavonoids and flavonolignans, which initial LC-MS experiments indicated to be the representative detectable metabolites, and selected them for isolation and structure elucidation. Thirty-six compounds, including five new flavonoids (6-9 and 24), were isolated from the leaves of rice and assigned using MS/MS and NMR methods (Fig. 1). To our knowledge, this is the first time that 18 of the known compounds (4, 5, 12, 13, 17-23, 29-33, 35, and 36) have been isolated from rice leaves. These 36 compounds have been assigned in the LC-PDA chromatogram of the rice leaf extract (Supplementary Figure S1). Herein, we report the structural elucidation of the new flavonoids and an analysis of the MS/MS fragmentation patterns of the isolated compounds by using high-resolution QTOF mass spectrometry with an ESI source. In the MS/MS analysis, the ramped collision energy mode was used to obtain a combined spectrum from fragments detected at various collision energies (Matsuda et al. 2009) because the fragmentation patterns observed in MS/MS spectra depend on many factors, including the mass spectrometer and its operating conditions, especially the collision energy. In addition, the structures of the known compounds were identified by 1H- and 13C-NMR analyses.

Structure elucidation of new compounds 6-9 and 24
Compound 6 was obtained as a yellow amorphous powder. The molecular formula of compound 6 was established as [...] (Table 1). Furthermore, in combination with the 13C-NMR and 2D NMR (COSY, HSQC, and HMBC) spectra, these data indicated that compound 6 was a tricin glycoside. In addition, in the HMBC spectrum, the anomeric proton signals at δ 5.33 (H-1″) and 4.48 (H-1‴) showed long-range correlations with the carbon signals at δ 162.5 (C-7) and 82.5 (C-2″), respectively, suggesting that the glucuronosyl moiety was located at C-7 of the aglycone and that the glucose was located at C-2 of the glucuronosyl moiety (Fig. 2). Based on these findings, compound 6 was assigned as tricin 7-O-(2″-[...]
[...], corresponding to the loss of malonyl and hexose groups. The 1H-NMR spectrum of compound 7 indicated an A2-type aromatic proton signal at δ 7.37 (2H, s); three aromatic proton signals at δ 6.45 (1H, brs), 6.73 (1H, brs), and 7.06 (1H, s); two methoxy proton signals at δ 3.89 (6H, s); and a sugar anomeric proton signal at δ 5.10 (1H, d, J = 7.4 Hz) (Table 1). These data, together with the 13C-NMR and 2D NMR (COSY, HSQC, and HMBC) spectra, indicated that compound 7 was a tricin malonylglucopyranoside. Furthermore, in the HMBC spectrum, the anomeric proton signal at δ 5.10 (H-1″) showed a long-range correlation with the carbon signal at δ 162.7 (C-7), suggesting that the glucose was located at C-7. The sugar proton signal at δ 4.15 (H-6″) showed a correlation with the carbon signal at δ 167.4 (C-1‴), suggesting that the malonyl moiety was located at C-6 of the glucose (Fig. 2). Thus, compound 7 was assigned as tricin 7-O-(6″-O-malonyl)-β-D-glucopyranoside.
[...], representing the loss of sinapoyl and hexose groups. The fragment ion of the sinapoyl moiety at m/z 207 was also observed (Cuyckens and Claeys 2004).
The 1H-NMR spectrum of compound 8 indicated an A2-type aromatic proton signal at δ 7.28 (2H, s); meta-coupled proton signals at δ 6.52 (1H, d, J = 1.9 Hz) and 6.89 (1H, d, J = 1.9 Hz); an aromatic proton signal at δ 6.96 (1H, s); two methoxy proton signals at δ 3.88 (6H, s); and an anomeric proton signal at δ 5.15 (1H, d, J = 7.3 Hz), which were similar to those of compounds 6 and 7 (Table 1). In addition, we observed an A2-type aromatic proton signal at δ 6.80 (2H, s); two methoxy proton signals at δ 3.71 (6H, s); and two olefinic proton signals at δ 6.44 (1H, d, J = 15.9 Hz) and 7.47 (1H, d, J = 15.9 Hz), suggesting the presence of a trans-sinapoyl moiety. Furthermore, in the HMBC spectrum, the anomeric proton signal at δ 5.15 (H-1″) showed a long-range correlation with the carbon signal at δ 162.6 (C-7), suggesting that the glucose was located at C-7. The sugar proton signal at δ 4.10 (H-6″) showed a correlation with the carbon signal at δ 166.2 (C-9‴), suggesting that the sinapoyl moiety was located at C-6 of the glucose (Fig. 2). Thus, compound 8 was assigned as tricin 7-O-(6″-(E)-sinapoyl)-β-D-glucopyranoside.
Compound 9 was obtained as a yellow amorphous powder. The molecular formula of compound 9 was established as C34[...] The product ion at m/z 539 was formed by the loss of glucose and a water molecule from the precursor ion at m/z 719. In addition, a major product ion was observed at m/z 209, which was formed by the loss of a water molecule from the syringylglyceryl moiety. The 1H-NMR spectrum of compound 9 indicated an A2-type aromatic proton signal at δ 7.26 (2H, s); meta-coupled proton signals at δ 6.10 (1H, d, J = 2.0 Hz) and 6.34 (1H, d, J = 2.0 Hz); an aromatic proton signal at δ 6.64 (1H, s); two methoxy proton signals at δ 3.96 (6H, s); and an anomeric proton signal at δ 4.57 (1H, d, J = 7.5 Hz). Moreover, the 1H-NMR spectrum of compound 9 was similar to that of compound 15, except for an A2-type aromatic proton signal at δ 6.81 (2H, s) and six proton signals at δ 3.84 (6H, s), corresponding to the two methoxyl groups of the syringylglyceryl moiety (Table 2). Furthermore, in combination with the 13C-NMR and 2D NMR (COSY, HSQC, and HMBC) spectra, these data indicated that compound 9 was a flavonolignan glycoside with tricin as the aglycone. In addition, the coupling constant J(H-7″, H-8″) was 5.5 Hz, suggesting that compound 9 was of the threo type because the coupling constant between the adjacent protons of the threo form is known to be larger than that of the erythro form (Bouaziz et al. 2002). To determine the absolute configuration of the syringylglyceryl and guaiacylglyceryl moieties of flavonolignans 9-17, we measured the circular dichroism (CD) spectra. However, these compounds did not exhibit Cotton effects, presumably due to conformational mobility (Wenzig et al. 2005). Furthermore, in the HMBC spectrum, the anomeric proton signal at δ 4.57 (H-1‴) showed a long-range correlation with the carbon signal at δ 82.0 (C-7″), suggesting that the glucose was located at C-7″. The syringylglyceryl proton signal at δ 4.55 (H-8″) showed a correlation with the carbon signal at δ 140.7 (C-4′), suggesting that the syringylglyceryl moiety was located at C-4′ (Fig. 2). Thus, compound 9 was assigned as tricin 4′-O-(threo-β-syringylglyceryl) ether 7″-O-β-D-glucopyranoside.
Compound 24 was obtained as a yellow amorphous powder. The molecular formula was found to be C[...] (Table 2). Furthermore, in combination with the 13C-NMR and 2D NMR (COSY, HSQC, and HMBC) spectra, these data indicated that compound 24 was a luteolin glucopyranosyl-arabinoside. The relatively large coupling constants of the anomeric protons suggested that the glucose had the β configuration and the arabinose had the α configuration (Xie et al. 2003). In addition, in the HMBC spectrum, the anomeric proton signals at δ 4.58 (H-1″) and 4.19 (H-1‴) showed long-range correlations with the carbon signals at δ 108.3 (C-6) and 78.7 (C-2″), respectively, suggesting that the arabinosyl moiety was located at C-6 of the aglycone and the glucosyl moiety at C-2 of the arabinose (Fig. 2). Based on these data, compound 24 was assigned as luteolin 6-C-(2″-O-β-D-glucopyranosyl)-α-L-arabinoside.

Flavonoids
The MS spectra of compound 1 in the positive and negative ionization modes showed a protonated molecular ion at m/z 331 and a deprotonated molecular ion at m/z 329, respectively. The MS/MS spectra of compounds 2-5 and 10-17 in the [...] (1) (Victoire et al. 1988).

Phenylpropanoids and salicylic acid glycoside
The MS spectra of compounds 29, 30, 31, 32, and 33 in the negative ionization mode showed precursor ions at m/z 443, 355, 385, 337, and 367, respectively. The MS/MS spectra of compounds 29, 30, and 33 gave the same characteristic fragment ion at m/z 193 [ferulic acid − H]−, indicating the presence of a feruloyl moiety in these compounds. Similarly, in the MS/MS spectra of compounds 31 and 32, fragment ions of sinapic acid and coumaric acid were observed at m/z 223 and m/z 163, respectively. On comparing the 1H- and 13C-NMR spectral data with those in the literature, these compounds were assigned as 1,3-O-diferuloylglycerol (29) (Luo et al. 2012), 1-O-feruloyl-β-D-glucose (30) (Miyake et al. 2007), 1-O-sinapoyl-β-D-glucose (31) (Miyake et al. 2007), 3-O-p-coumaroylquinic acid (32) (Ma et al. 2007), and 3-O-feruloylquinic acid (33) (Ida et al. 1994). The MS spectrum of compound 34 in the negative ionization mode showed a deprotonated molecular ion at m/z 299. The MS/MS spectrum of the precursor ion at m/z 299 gave a major fragment ion at m/z 137 [(M − H) − 162]−, suggesting the presence of a hexose group. Compound 34 was assigned as salicylic acid 2-O-β-D-glucopyranoside (Grynkiewicz et al. 1993) by comparing the 1H- and 13C-NMR spectral data with those in the literature.
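The nominal precursor and fragment values quoted above can be cross-checked with simple mass arithmetic. The following is a minimal sketch, not part of the original workflow: the molecular formulas are the standard ones for the named compounds and are an assumption here (they are not stated in the text), and the function names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u)

# Assumed molecular formulas (not given in the text) as (C, H, O) counts.
FORMULAS = {
    29: (23, 24, 9),   # 1,3-O-diferuloylglycerol
    30: (16, 20, 9),   # 1-O-feruloyl-beta-D-glucose
    31: (17, 22, 10),  # 1-O-sinapoyl-beta-D-glucose
    32: (16, 18, 8),   # 3-O-p-coumaroylquinic acid
    33: (17, 20, 9),   # 3-O-feruloylquinic acid
    34: (13, 16, 8),   # salicylic acid 2-O-beta-D-glucopyranoside
}

def mono_mass(c, h, o):
    """Monoisotopic mass of a neutral CcHhOo species."""
    return c * MONO["C"] + h * MONO["H"] + o * MONO["O"]

def deprotonated_mz(c, h, o):
    """m/z of the singly charged [M - H]- ion."""
    return mono_mass(c, h, o) - PROTON

# Neutral loss of a hexose residue (anhydrohexose, C6H10O5; nominal 162).
HEXOSE_LOSS = mono_mass(6, 10, 5)

precursors = {n: deprotonated_mz(*f) for n, f in FORMULAS.items()}
salicylate_fragment = precursors[34] - HEXOSE_LOSS

for n, mz in precursors.items():
    print(n, round(mz, 4))
print("34 after hexose loss:", round(salicylate_fragment, 4))
```

Rounded to integers, the computed values reproduce the precursor ions listed for compounds 29-34 and the m/z 137 fragment of compound 34.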

Alkaloids
The MS spectra of compound 35 in the positive and negative ionization modes showed precursor ions at m/z 190 and 188, respectively. The MS/MS spectrum of the precursor ion at m/z 190 produced major fragment ions at m/z 172 and 144. This compound was assigned as kynurenic acid (35) (Beretta et al. 2007) by comparing the MS/MS and 1H- and 13C-NMR spectral data with those in the literature. The MS spectra of compound 36 in the positive and negative ionization modes showed a protonated molecular ion at m/z 217 and a deprotonated molecular ion at m/z 215, respectively. On comparing the 1H-NMR spectral data with those in the literature, compound 36 was assigned as lycoperodine-1 (Yahara et al. 2004).

MS/MS data acquisition of isolated compounds
Certain classes of specialized metabolites with similar structures in plants show characteristic fragments or neutral losses in their MS/MS spectra. Flavonoids, a major class of plant-specialized metabolites, include subclasses such as flavonol, flavone, flavan-3-ol, isoflavone, and anthocyanin. Many flavonoids are positional isomers or homologues, which have a basic C6-C3-C6 skeleton, with two aromatic rings linked by a three-carbon chain (Dixon and Steele 1999). Flavonoids are commonly present as O- or C-glycosides. The flavonoid O-glycosides usually have sugar moieties bonded to the 4′-, 3-, and 7-hydroxyl groups of the aglycone. The flavonoid C-glycosides have sugar substituents directly linked to the aglycone by C-C bonds; the C-6 and C-8 positions are the common locations in C-glycosides. The flavonoid O,C-glycosides have sugar moieties linked to the hydroxyl group of the aglycone or of the C-glycosyl residue. Numerous flavonoid glycosides have been identified or characterized using the LC-MS approach (Cuyckens and Claeys 2004; de Rijke et al. 2006; Farag et al. 2007; Kachlicki et al. 2008; Van der Hooft et al. 2012; Wojakowska et al. 2013). To aid in the annotation of phytochemicals, we report the characteristic MS/MS fragmentation patterns of the isolated compounds.
[...] Figure S2). These results suggested that the glucose at the 5-position was lost more readily than that at the 7-position. Our results are in agreement with earlier studies on luteolin 5-O-glucoside and luteolin 7-O-glucoside (Grayer et al. 2000).
The MS/MS spectrum of tricin 7-O-rutinoside (4) in the positive ionization mode showed a major fragment ion at m/z 493 [(M + H) − 146]+ (Y1+), which was formed by the loss of rhamnose, whereas tricin 7-O-neohesperidoside (5) produced only a very low abundance of the Y1+ ion. Compounds 4 and 5 both showed aglycone fragment ions at m/z 331 (Y0+) as the base peak, which were formed by the loss of the rutinose and neohesperidose moieties, respectively. These results indicated that the Y0+/Y1+ ratio was higher for 1→2 linked neohesperidose [rhamnosyl (1→2)-glucose] than for 1→6 linked rutinose [rhamnosyl (1→6)-glucose] in the positive ionization mode (Ma et al. 2001). However, in the negative ionization mode, compounds 4 and 5 showed the aglycone fragment ions at m/z 329 (Y0−), and the fragment ions (Y1−) formed by the loss of rhamnose were not observed. Compound 4 produced a relatively higher level of the aglycone fragment ion than compound 5, suggesting that the rutinose was more readily lost than the neohesperidose in the negative ionization mode (Supplementary Figure S3).
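The Y0+/Y1+ comparison described above can be expressed as a small helper. This is a sketch under stated assumptions: a spectrum is represented as a dictionary of nominal m/z to relative intensity, the function name is hypothetical, and the intensities below are placeholders chosen only to illustrate the trend, not measured values from this study.

```python
import math

Y1_MZ = 493  # [(M + H) - 146]+, loss of rhamnose (value from the text)
Y0_MZ = 331  # protonated tricin aglycone (value from the text)

def y0_y1_ratio(spectrum, y0_mz=Y0_MZ, y1_mz=Y1_MZ):
    """Return intensity(Y0) / intensity(Y1).

    Returns infinity when Y1 is absent but Y0 is present,
    and NaN when neither ion is present.
    """
    y0 = spectrum.get(y0_mz, 0.0)
    y1 = spectrum.get(y1_mz, 0.0)
    if y1 == 0.0:
        return math.inf if y0 > 0.0 else math.nan
    return y0 / y1

# Placeholder relative intensities (hypothetical, for illustration only).
rutinoside_like = {331: 100.0, 493: 40.0}       # prominent Y1 ion
neohesperidoside_like = {331: 100.0, 493: 2.0}  # very weak Y1 ion

print(y0_y1_ratio(rutinoside_like))
print(y0_y1_ratio(neohesperidoside_like))
```

A higher ratio would point toward the 1→2 linked disaccharide in the positive ionization mode, although any numerical threshold would have to be calibrated against authentic standards measured under the same conditions.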
The [...] Figure S4). These results suggested that the fragment ions at m/z 509 were characteristic of flavonolignans 14, 15, and 16, which have a glucose located at the 7″- or 9″-position of the guaiacylglyceryl group. In the negative ionization mode, flavonolignans 10-16 also showed similar fragmentation patterns, with neutral losses of the guaiacylglyceryl and glucose groups (Supplementary Figure S4).

Fragmentation of flavonoid C-glycosides and O,C-glycosides
In the MS/MS spectra, the fragmentation patterns of C-glycosides differ from those of O-glycosides; loss of water molecules and cross-ring cleavages of sugar residues are characteristic of C-glycosides, whereas the neutral loss of a sugar moiety can be observed in O,C-glycosides (Cuyckens and Claeys 2004). The MS/MS spectra of the flavonoid C-glycosides apigenin 6-C-α-L-arabinosyl-8-C-β-L-arabinoside (20) and chrysoeriol 6-C-α-L-arabinosyl-8-C-β-L-arabinoside (21) in the positive ionization mode showed the loss of one, two, and three water molecules from the precursor ions at m/z 535 and 565, leading to product ions at m/z 517, 499, and 481 for compound 20 and at m/z 547, 529, and 511 for compound 21. The cross-ring cleavage of the sugar residue of the C-glycoside yielded many characteristic product ions, such as m/z 445 ([...] and Guttman 2010). In the negative ionization mode, the MS/MS spectra showed fewer but characteristic product ions, such as m/z 473 (0.3X−), 443 (0.2X−), 383 (0.3X− − 90 or 0.2X− − 60), and 353 (0.2X− − 90) for compound 20 and m/z 503 (0.3X−), 473 (0.2X−), 413 (0.3X− − 90 or 0.2X− − 60), and 383 (0.2X− − 90) for compound 21 (Supplementary Figure S5). These results suggested that the neutral losses of 90 Da (0.2X) and 60 Da (0.3X) are characteristic of flavonoid C-pentosides (Vukics and Guttman 2010). The MS/MS spectra of the flavonoid O,C-glycosides isoscoparin 2″-O-(6‴-(E)-feruloyl)-glucopyranoside (25), isoscoparin 2″-O-(6‴-(E)-p-coumaroyl)-glucopyranoside (26), isovitexin 2″-O-(6‴-(E)-feruloyl)-glucopyranoside (27), and isovitexin 2″-O-(6‴-(E)-p-coumaroyl)-glucopyranoside (28) in the positive ionization mode showed fragment ions of the C-glycoside at m/z 463 and 433, which were formed by the neutral loss of glucose and the acyl substituent (feruloyl or coumaroyl moiety).
Fragment ions of the feruloyl moiety at m/z 177 and the coumaroyl moiety at m/z 147 were also observed (Fig. 4 and Supplementary Figure S6). Compounds 25 and 26 showed fragment ions at m/z 445, 427, and 409, which were formed by the loss of water molecules from the C-glycoside fragment ion at m/z 463. Compounds 25 and 26 also gave fragment ions at m/z 397 (2.3X+ − 2H2O), 367 (0.4X+ − 2H2O), and 343 (0.2X+), which were formed by cross-ring cleavage of the sugar residue of the C-glycoside fragment at m/z 463 (Vukics and Guttman 2010). Compounds 27 and 28 showed similar fragmentation patterns due to the loss of water molecules and cross-ring cleavages of sugar residues from the C-glycoside fragment at m/z 433 (Supplementary Figure S6). However, the MS/MS spectra of compounds 25-28 in the negative ionization mode showed fragmentation patterns different from those in the positive ionization mode (Supplementary Figure S7). Product ions at m/z 623 and 593 were formed by the loss of the feruloyl or coumaroyl moiety. C-glycoside fragment ions at m/z 443 and 413 were formed by the neutral loss of glucose and a water molecule from the ions at m/z 623 and 593, respectively. The MS/MS spectra also showed ferulic acid and coumaric acid ions at m/z 193 and 163, respectively. The major fragment ions at m/z 323 and 293 (0.2X−) were formed by cross-ring cleavages of the sugar residues of the C-glycoside fragments at m/z 443 and 413, respectively, which are similar to those observed in the positive ionization mode. These results suggested that the neutral loss of 120 Da (0.2X) is characteristic of flavonoid C-hexosides (Waridel et al. 2001; Vukics and Guttman 2010).
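The diagnostic neutral losses discussed in this section (successive water losses, 60 and 90 for C-pentosides, and 120 for C-hexosides) can be combined into a crude screening rule. The sketch below is illustrative only: it works on nominal masses, the function names and the decision rule are simplifications introduced here rather than the authors' procedure, and the negative-mode precursors at m/z 533 and 563 are inferred from the positive-mode precursors at m/z 535 and 565. Real data would require accurate masses with a tolerance window.

```python
# Nominal neutral losses named in the text.
WATER = 18
PENTOSIDE_LOSSES = {60, 90}  # 0.3X and 0.2X cross-ring cleavages
HEXOSIDE_LOSS = 120          # 0.2X cross-ring cleavage

def neutral_losses(precursor, fragments):
    """Nominal mass differences between a precursor and its fragments."""
    return {precursor - f for f in fragments if f < precursor}

def water_loss_series(precursor, n=3):
    """Expected m/z values after loss of 1..n water molecules."""
    return [precursor - k * WATER for k in range(1, n + 1)]

def classify_c_glycoside(precursor, fragments):
    """Crude label based only on the diagnostic losses quoted in the text."""
    losses = neutral_losses(precursor, fragments)
    if HEXOSIDE_LOSS in losses:
        return "C-hexoside-like"
    if PENTOSIDE_LOSSES <= losses:
        return "C-pentoside-like"
    return "undetermined"

# Positive mode, compound 20: water losses from m/z 535.
print(water_loss_series(535))
# Negative mode, compound 20 (precursor inferred as m/z 533).
print(classify_c_glycoside(533, [473, 443, 383, 353]))
# Negative mode, C-glycoside fragment of compounds 25 and 26.
print(classify_c_glycoside(443, [323]))
```

The rule deliberately checks the hexoside loss first; a spectrum in which two cross-ring cleavages add up to 120 would be mislabeled, so the output should be treated as a hint for manual inspection rather than an assignment.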

Concluding remarks
Metabolomics aims to identify and quantify all the metabolites in biological samples. The LC-MS/MS approach can generate structural information from precursor and product ions, which can be combined with NMR data for unambiguous identification of (un)known phytochemicals. Using this strategy, 36 compounds, including five new flavonoids and eight rare flavonolignan isomers, were isolated and identified from rice. The unique MS/MS fragmentation patterns of flavonoid O-glycosides, C-glycosides, and O,C-glycosides will facilitate annotation of these plant-specialized metabolites in future studies. Moreover, isolation and structure elucidation of metabolites can enhance the understanding of gene-to-metabolite correlations in phytochemical genomics studies (Nakabayashi et al. 2009; Saito 2013) by integrating metabolomic information with genomic information. Unequivocal structures of metabolites are also useful for metabolome quantitative trait loci (mQTL) analysis (Matsuda et al. 2012) and genome-wide association studies (GWAS) (Yonemaru et al. 2012). The genomic regions and genes potentially responsible for the biosynthesis of specialized metabolites can be revealed by mQTL analysis (Matsuda et al. 2012). The obtained compounds and their MS/MS spectra can be used not only for metabolite annotation but also to investigate the relationships between gene expression and metabolite accumulation in rice and other plant metabolic systems.