Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 93)

Abstract

Models that rely exclusively on the Markov property, usually known as finite-context models, can model DNA sequences without relying on mechanisms that directly exploit exact and approximate repeats. These models provide probability estimates that depend on the recent past of the sequence and have been used for data compression. In this paper, we investigate some properties of finite-context models and use these properties to improve compression. The results are presented using the human genome as an example.
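To illustrate what "probability estimates that depend on the recent past" can mean in practice, here is a minimal sketch of an adaptive order-k finite-context model that estimates how many bits an ideal arithmetic coder would spend on a DNA string. It is not the method of the paper: the function name `fcm_code_length`, the default order of 2, and the additive smoothing parameter `alpha` are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: sequence contains only these four symbols


def fcm_code_length(seq, order=2, alpha=1.0):
    """Estimate the bits needed to code `seq` with an adaptive
    order-`order` finite-context model (hypothetical helper).

    Each symbol's probability is estimated from the counts of symbols
    previously seen after the same context (the preceding `order`
    symbols), smoothed by adding `alpha` to every count.
    """
    # counts[context][symbol] = times `symbol` followed `context` so far
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))
    total_bits = 0.0
    for i, symbol in enumerate(seq):
        # The first few symbols have a shorter context; that is fine,
        # shorter strings are simply distinct dictionary keys.
        context = seq[max(0, i - order):i]
        ctx_counts = counts[context]
        total = sum(ctx_counts.values())
        p = (ctx_counts[symbol] + alpha) / (total + alpha * len(ALPHABET))
        total_bits += -math.log2(p)   # ideal code length for this symbol
        ctx_counts[symbol] += 1       # update the model after coding
    return total_bits


if __name__ == "__main__":
    seq = "ACGT" * 50
    bits = fcm_code_length(seq, order=2)
    print(f"{bits / len(seq):.3f} bits per base")
```

Because the counts are updated only after each symbol is coded, a decoder could rebuild the same model step by step. A sequence with strong local regularity should cost fewer than the 2 bits per base of a uniform code, while a sequence with no context dependence should stay close to 2.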

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pratas, D., Pinho, A.J. (2011). Compressing the Human Genome Using Exclusively Markov Models. In: Rocha, M.P., Rodríguez, J.M.C., Fdez-Riverola, F., Valencia, A. (eds) 5th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2011). Advances in Intelligent and Soft Computing, vol 93. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19914-1_29

  • DOI: https://doi.org/10.1007/978-3-642-19914-1_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19913-4

  • Online ISBN: 978-3-642-19914-1

  • eBook Packages: Engineering, Engineering (R0)
