
Tailoring high-density oligonucleotide arrays for transcript profiling of different Arabidopsis thaliana accessions using a sequence-based approach

  • Original Article
Plant Cell Reports

Abstract

Key message

Excluding polymorphic probes from GeneChip® transcript profiling experiments via a sequence-based approach results in improved detection of differentially expressed genes in developing seeds of Arabidopsis thaliana accessions Col-0 and C24.

Abstract

GeneChip® arrays represent a powerful tool for transcript profiling experiments. The ATH1 GeneChip® has been designed based on the sequence of the Arabidopsis thaliana reference genome Col-0; hence, the features on the array exactly match the sequences of Col-0 transcripts. In contrast, transcripts of other A. thaliana accessions or related species may show nucleotide differences and/or insertions/deletions when compared to the corresponding Col-0 transcripts. Comparisons of transcript abundance involving different A. thaliana accessions or related species may therefore be compromised for a certain number of transcripts. To tackle this limitation, a sequence-based strategy was developed: only features on the array that were identical in sequence for the specimens to be compared were considered for transcript profiling. The impact of the proposed strategy was evaluated for transcript profiles that were established for developing seeds of A. thaliana accessions Col-0 and C24.
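The core filtering idea, keeping only array features whose sequence is identical in all accessions being compared, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `shared_perfect_probes` and the probe identifiers are hypothetical, and each input set is assumed to already hold the probes that align uniquely and perfectly to one accession.

```python
def shared_perfect_probes(perfect_by_accession):
    """Return probe IDs that match perfectly in every accession.

    perfect_by_accession maps an accession name to the set of probe IDs
    assumed to have a unique perfect match in that accession.
    """
    sets = list(perfect_by_accession.values())
    if not sets:
        return set()
    # A probe is retained only if it is present in every accession's set.
    return set.intersection(*sets)


# Toy data: the probe IDs are made up for illustration.
perfect = {
    "Col-0": {"p1", "p2", "p3", "p4"},
    "C24": {"p1", "p3", "p5"},
}
kept = shared_perfect_probes(perfect)
```

Expression values would then be summarised from the retained probes only, so that a sequence polymorphism in one accession cannot masquerade as a difference in transcript abundance.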


Abbreviations

DAF: Days after flower opening

eQTL: Expression quantitative trait loci

GEO: Gene Expression Omnibus

GO: Gene Ontology

Indel: Insertion/deletion


Acknowledgements

The authors thank Dmitri Pescianschi for support during the custom C++ script development, Dr. Ruslana Radchuk and Dr. Yusheng Zhao for advice on the statistical analysis, and Jahnavi Koppolu for her contribution to the analysis of plant phenotypes. Kristin Langanke is acknowledged for skillful technical assistance.

Author information

Corresponding author

Correspondence to Renate Schmidt.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

299_2017_2157_MOESM1_ESM.xlsx

Supplementary Table 1: Alignment of ATH1 probe sequences to the Col-0 and C24 genomes. Column “Hit note” summarises the results of sequence alignments with Bowtie; all hits containing up to three mismatches but no indels were considered. Column “Best hit note” gives only the results of the best alignments for each probe. PM, unique perfect match; MisM, unique match containing between one and three mismatches but no indel(s); NoM, no match or matches that contain indels and/or more than three mismatches; MM, multiple matches. (XLSX 19938 kb)
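The four labels used in this table can be illustrated with a short Python sketch. It is one possible reading of the definitions above, not the authors' script: the `Hit` record and the `classify` function are hypothetical, and treating “Best hit note” as "keep only the hits with the fewest mismatches" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment of a probe to a genome (hypothetical record)."""
    mismatches: int
    has_indel: bool = False


def classify(hits, best_only=False, max_mismatches=3):
    """Label one probe as PM, MisM, NoM or MM from its alignment hits."""
    # Only gap-free hits with at most `max_mismatches` mismatches count.
    valid = [h for h in hits
             if not h.has_indel and h.mismatches <= max_mismatches]
    if not valid:
        return "NoM"
    if best_only:
        # Assumed reading of "Best hit note": keep the fewest-mismatch hits.
        best = min(h.mismatches for h in valid)
        valid = [h for h in valid if h.mismatches == best]
    if len(valid) > 1:
        return "MM"
    return "PM" if valid[0].mismatches == 0 else "MisM"
```

Under this reading, a probe with one perfect hit and one two-mismatch hit is labelled MM when all hits are considered and PM when only the best hit is considered, which is why the two columns can disagree for the same probe.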

299_2017_2157_MOESM2_ESM.xlsx

Supplementary Table 2: Alignment of ATH1 probe sequences to the A. thaliana Col-0 representative transcripts – A data set. PM, unique perfect match; PM_ID, different probes of the same probe set reveal unique perfect matches to various genes; NoM, no match or matches that contain indels and/or mismatches; MM_PM, multiple perfect matches. Only array elements that revealed perfect sequence alignments were considered for the assignment of probe sets of the ATH1 array to TAIR10 representative gene models. At least 75% of all oligonucleotides belonging to a particular probe set had to show perfect matches to the + strand of a particular TAIR10 representative gene model in order to assign the gene model to this probe set. (XLSX 16680 kb)
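The 75% assignment rule described for this table can be sketched as follows. The function name `assign_gene_model`, its input layout and the gene model identifier are assumptions made for illustration; the source states only the threshold and the requirement for perfect matches to the + strand.

```python
from collections import Counter


def assign_gene_model(probe_matches, min_fraction=0.75):
    """Return the gene model supported by enough probes, or None.

    probe_matches holds one entry per probe of the probe set: a gene model
    ID if the probe has a perfect match to the + strand of that model,
    otherwise None.
    """
    if not probe_matches:
        return None
    counts = Counter(g for g in probe_matches if g is not None)
    if not counts:
        return None
    gene, n = counts.most_common(1)[0]
    # The fraction is taken over all probes in the set, matched or not.
    return gene if n / len(probe_matches) >= min_fraction else None


# Toy probe set of 11 probes; "AT1G01010" is used as a placeholder ID.
example = ["AT1G01010"] * 9 + [None] * 2
assigned = assign_gene_model(example)
```

Because the threshold exceeds one half, at most one gene model can ever qualify for a given probe set, so no tie-breaking is needed in this sketch.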

299_2017_2157_MOESM3_ESM.xlsx

Supplementary Table 3: Alignment of ATH1 probe sequences to the A. thaliana Col-0 representative transcripts – R data set. PM, unique perfect match; PM_ID, different probes of the same probe set reveal unique perfect matches to various genes. (XLSX 19420 kb)

299_2017_2157_MOESM4_ESM.xlsx

Supplementary Table 4: Probe sets showing significantly different hybridisation intensities in the A and R data sets. DAF, days after flower opening; FDR, false discovery rate (p value after Benjamini-Hochberg multiple testing correction, <0.05). (XLSX 200 kb)

299_2017_2157_MOESM5_ESM.xlsx

Supplementary Table 5: Probe sets showing significantly different expression in the A and R data sets. DAF, days after flower opening; FDR, false discovery rate (p value after Benjamini-Hochberg multiple testing correction, <0.05). (XLSX 1682 kb)
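For readers unfamiliar with the Benjamini-Hochberg correction named in Supplementary Tables 4 and 5, the following Python sketch computes adjusted p values by the standard step-up procedure. It illustrates the published method in general terms and is not the authors' analysis code; the function name and the toy p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values, made up for illustration.
pvals = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(pvals)
significant = [a < 0.05 for a in adj]
```

A probe set would be reported as significant when its adjusted p value falls below the 0.05 threshold given in the table legends.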

About this article

Cite this article

Boudichevskaia, A., Cao, H.X. & Schmidt, R. Tailoring high-density oligonucleotide arrays for transcript profiling of different Arabidopsis thaliana accessions using a sequence-based approach. Plant Cell Rep 36, 1323–1332 (2017). https://doi.org/10.1007/s00299-017-2157-5

  • DOI: https://doi.org/10.1007/s00299-017-2157-5
