Information-Theoretic Inference of an Optimal Dictionary of Protein Supersecondary Structures

  • Protocol
Protein Supersecondary Structures

Abstract

We recently developed an unsupervised Bayesian inference methodology to automatically infer a dictionary of protein supersecondary structures (Subramanian et al., IEEE data compression conference proceedings (DCC), 340–349, 2017). Specifically, this methodology uses the information-theoretic framework of the minimum message length (MML) criterion for hypothesis selection (Wallace, Statistical and inductive inference by minimum message length, Springer Science & Business Media, New York, 2005). The best dictionary of supersecondary structures is the one that yields the most (lossless) compression of the source collection of folding patterns, which are represented as tableaux: matrix representations that capture the essence of protein folding patterns (Lesk, J Mol Graph. 13:159–164, 1995). This book chapter outlines our MML methodology for inferring the supersecondary structure dictionary. The inferred dictionary is available at http://lcb.infotech.monash.edu.au/proteinConcepts/scop100/dictionary.html.
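The selection principle described in the abstract can be illustrated with a toy two-part message: the bits needed to state a candidate dictionary plus the bits needed to encode the data given that dictionary. The sketch below is not the chapter's method. It assumes a plain string as a stand-in for tableaux, a greedy longest-match parse, uniform codes throughout, and a hypothetical `max_len` bound on entry length; the costs of stating the dictionary size and the sequence length are ignored. All function names are invented for illustration.

```python
import math

def dictionary_cost(dictionary, alphabet_size, max_len=8):
    """First part of the message: bits to state the dictionary (toy encoding)."""
    bits = 0.0
    for entry in dictionary:
        bits += math.log2(max_len)                      # entry length, uniform over 1..max_len
        bits += len(entry) * math.log2(alphabet_size)   # each symbol of the entry
    return bits

def greedy_parse(sequence, dictionary):
    """Split the sequence into dictionary entries (longest first) or single symbols."""
    entries = sorted(dictionary, key=len, reverse=True)
    items, i = [], 0
    while i < len(sequence):
        for entry in entries:
            if sequence.startswith(entry, i):
                items.append(entry)
                i += len(entry)
                break
        else:
            # No entry matches here: fall back to a single symbol.
            items.append(sequence[i])
            i += 1
    return items

def data_cost(sequence, dictionary, alphabet):
    """Second part: bits to encode the parsed data with a uniform code."""
    n_codewords = len(alphabet) + len(dictionary)
    return len(greedy_parse(sequence, dictionary)) * math.log2(n_codewords)

def message_length(sequence, dictionary, alphabet):
    """Total two-part message length in bits."""
    return (dictionary_cost(dictionary, len(alphabet))
            + data_cost(sequence, dictionary, alphabet))

# Toy data and hand-picked candidate dictionaries (assumptions for illustration).
alphabet = "ABCD"
sequence = "ABC" * 20
candidates = [(), ("AB",), ("ABC",), ("ABC", "BCA", "CAB")]

# Pick the candidate with the shortest total message.
best = min(candidates, key=lambda d: message_length(sequence, d, alphabet))
for d in candidates:
    print(d, round(message_length(sequence, d, alphabet), 2))
print("selected:", best)
```

In this toy setting the single-entry dictionary `("ABC",)` is selected: the empty dictionary compresses nothing, `("AB",)` captures only part of the repeat, and the three-entry candidate pays for two entries that the parse never uses. This trade-off between model cost and data cost is what a two-part message length criterion formalises.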

References

  1. Lesk AM (1995) Systematic representation of protein folding patterns. J Mol Graph 13:159–164

  2. Konagurthu AS, Lesk AM, Allison L (2012) Minimum message length inference of secondary structure from protein coordinate data. Bioinformatics 28(12):i97–i105

  3. Subramanian R, Allison L, Stuckey PJ, Garcia De La Banda M, Abramson D, Lesk AM, Konagurthu AS (2017) Statistical compression of protein folding patterns for inference of recurrent substructural themes. In: IEEE data compression conference proceedings (DCC), pp 340–349

  4. Fox NK, Brenner SE, Chandonia JM (2013) SCOPe: Structural Classification of Proteins—extended, integrating SCOP and ASTRAL data and classification of new structures. Nucleic Acids Res 42(D1):D304–D309

  5. Kamat AP, Lesk AM (2007) Contact patterns between helices and strands of sheet define protein folding patterns. Proteins 66:869–876

  6. Konagurthu AS, Lesk AM (2010) Cataloging topologies of protein folding patterns. J Mol Recognit 23(2):253–257

  7. Konagurthu AS, Stuckey PJ, Lesk AM (2008) Structural search and retrieval using a tableau representation of protein folding patterns. Bioinformatics 24(5):645–651

  8. Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423 and 623–656

  9. Wallace CS (2005) Statistical and inductive inference by minimum message length. Springer Science & Business Media, New York

  10. Allison L (2018) Coding Ockham’s Razor. Springer, Cham

Author information

Corresponding author

Correspondence to Arun S. Konagurthu.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Konagurthu, A.S. et al. (2019). Information-Theoretic Inference of an Optimal Dictionary of Protein Supersecondary Structures. In: Kister, A. (eds) Protein Supersecondary Structures. Methods in Molecular Biology, vol 1958. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-9161-7_6

  • DOI: https://doi.org/10.1007/978-1-4939-9161-7_6

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-9160-0

  • Online ISBN: 978-1-4939-9161-7

  • eBook Packages: Springer Protocols
