A Symmetric Length-Aware Enrichment Test

Manescu, David; Keich, Uri

doi:10.1007/978-3-319-16706-0_23

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 9029))

Included in the following conference series:

International Conference on Research in Computational Molecular Biology

Abstract

Young et al. [14] showed that, because of gene length bias, the popular Fisher's Exact Test should not be used to study the association between a group of differentially expressed (DE) genes and a specific Gene Ontology (GO) category. Instead, they suggest a test in which one conditions on the genes in the GO category and draws the pseudo-DE genes according to a length-dependent distribution. The same model was presented in a different context by Kazemian et al. [8], who went on to offer a dynamic programming (DP) algorithm that computes the significance of the proposed test exactly. Here we point out that, while valid, the test proposed by these authors is no longer symmetric in the way Fisher's Exact Test is: one gets different answers depending on whether one conditions on the observed GO category or on the DE set. As an alternative, we offer a symmetric generalization of Fisher's Exact Test and provide efficient algorithms to evaluate its significance.
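
To make the symmetry property concrete, here is a minimal Python sketch of the classical (length-unaware) one-sided Fisher's Exact Test. It is an illustration only, not the authors' length-aware test or their algorithms. The function name, argument names, and the toy counts are assumptions introduced for this example.

```python
from math import comb


def fisher_enrichment_pvalue(n_total, n_category, n_de, n_overlap):
    """One-sided Fisher's Exact Test p-value, P(X >= n_overlap).

    X is hypergeometric: the number of category genes among n_de genes
    drawn uniformly, without replacement, from n_total genes.
    All names here are hypothetical, chosen for this illustration.
    """
    upper = min(n_category, n_de)  # largest possible overlap
    denom = comb(n_total, n_de)    # number of equally likely draws
    tail = sum(
        comb(n_category, x) * comb(n_total - n_category, n_de - x)
        for x in range(n_overlap, upper + 1)
    )
    return tail / denom


# Toy counts (made up): 10 genes, 4 in the GO category, 3 DE, 2 in both.
p_given_category = fisher_enrichment_pvalue(10, 4, 3, 2)  # fix category, draw DE set
p_given_de_set = fisher_enrichment_pvalue(10, 3, 4, 2)    # fix DE set, draw category
print(p_given_category, p_given_de_set)
```

Swapping the roles of the GO category and the DE set leaves the p-value unchanged, which is the symmetry that the abstract says is lost once the pseudo-DE genes are drawn from a length-dependent distribution.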

References

1. Agresti, A.: A survey of exact inference for contingency tables. Statistical Science 7, 131–153 (1992)
2. Butler, R.W.: Saddlepoint Approximations with Applications. Cambridge University Press, Cambridge (2007)
3. Cleveland, W.S., Devlin, S.J.: Locally-weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association 83, 596–610 (1988)
4. The Gene Ontology Consortium: Gene ontology: tool for the unification of biology. Nature Genetics 25, 25–29 (2000)
5. Cowell, W.R. (ed.): Sources and Development of Mathematical Software. Prentice-Hall Series in Computational Mathematics. Prentice-Hall, Upper Saddle River, NJ, USA (1984)
6. Fisher, R.A.: Statistical Methods for Research Workers, 14th edn. Oliver & Boyd, London (1970)
7. Jones, E., Oliphant, T., Peterson, P., et al.: SciPy: Open source scientific tools for Python (2001)
8. Kazemian, M., Zhu, Q., Halfon, M.S., Sinha, S.: Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison. Nucleic Acids Research 39(22), 9463–9472 (2011)
9. Nieduszynski, C.A., Hiraga, S., Ak, P., Benham, C.J., Donaldson, A.D.: OriDB: a DNA replication origin database. Nucleic Acids Research 35(Database issue), D40–D46 (2007)
10. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2006). ISBN 3-900051-07-0
11. Scannell, D.R., Zill, O.A., Rokas, A., Payen, C., Dunham, M.J., Eisen, M.B., Rine, J., Johnston, M., Hittinger, C.T.: The awesome power of yeast evolutionary genetics: New genome sequences and strain resources for the Saccharomyces sensu stricto genus. G3 (Bethesda) 1(1), 11–25 (2011)
12. Skovgaard, I.M.: Saddlepoint expansions for conditional distributions. J. Appl. Prob. 24, 875–887 (1987)
13. Wallenius, K.T.: Biased sampling: the non-central hypergeometric probability distribution. PhD thesis, Stanford University (1963)
14. Young, M.D., Wakefield, M.J., Smyth, G.K., Oshlack, A.: Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biology 11, R14 (2010)

Author information

Authors and Affiliations

School of Mathematics and Statistics, University of Sydney, Sydney, Australia
David Manescu & Uri Keich

Corresponding author

Correspondence to Uri Keich.

Editor information

Editors and Affiliations

National Center of Biotechnology Information, Bethesda, Maryland, USA
Teresa M. Przytycka

Cite this paper

Manescu, D., Keich, U. (2015). A Symmetric Length-Aware Enrichment Test. In: Przytycka, T. (ed.) Research in Computational Molecular Biology. RECOMB 2015. Lecture Notes in Computer Science, vol 9029. Springer, Cham. https://doi.org/10.1007/978-3-319-16706-0_23

DOI: https://doi.org/10.1007/978-3-319-16706-0_23
Published: 26 March 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16705-3
Online ISBN: 978-3-319-16706-0
eBook Packages: Computer Science, Computer Science (R0)
