Describing the Local Structure of Sequence Graphs

  • Conference paper
Algorithms for Computational Biology (AlCoB 2017)

Abstract

Analysis of genetic variation using graph structures is an emerging paradigm of genomics. However, defining genetic sites on sequence graphs remains an open problem. Paten’s invention of the ultrabubble and snarl, special subgraphs of sequence graphs which can be identified with efficient algorithms, represents an important first step toward segregating graphs into genetic sites. We extend the theory of ultrabubbles to a special subclass where every detail of the ultrabubble can be described as a series and parallel arrangement of genetic sites. We furthermore introduce the concept of bundle structures, which allows us to recognize the graph motifs created by additional combinations of variation in the graph, including but not limited to runs of abutting single nucleotide variants. We demonstrate linear-time identification of bundles in a bidirected graph. These two advances build on initial work on ultrabubbles in bidirected graphs, and define a more granular concept of genetic site.

Notes

  1. It is on bidirected graphs.

Acknowledgements

Y.R. is supported by a Howard Hughes Medical Institute Medical Research Fellowship. This work was also supported by the National Human Genome Research Institute of the National Institutes of Health under Award Number 5U54HG007990 and by grants from the W.M. Keck Foundation and the Simons Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We thank Wolfgang Beyer for his visualizations of 1000 Genomes data in a variation graph.

Author information

Corresponding author

Correspondence to Benedict Paten.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Rosen, Y., Eizenga, J., Paten, B. (2017). Describing the Local Structure of Sequence Graphs. In: Figueiredo, D., Martín-Vide, C., Pratas, D., Vega-Rodríguez, M. (eds) Algorithms for Computational Biology. AlCoB 2017. Lecture Notes in Computer Science, vol 10252. Springer, Cham. https://doi.org/10.1007/978-3-319-58163-7_2

  • DOI: https://doi.org/10.1007/978-3-319-58163-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58162-0

  • Online ISBN: 978-3-319-58163-7

  • eBook Packages: Computer Science (R0)
