
Managing Large SNP Datasets with SNPpy

Genome-Wide Association Studies and Genomic Prediction

Part of the book series: Methods in Molecular Biology (MIMB, volume 1019)

Abstract

Managing SNP datasets in a relational database has significant advantages over alternative methods: the database itself can perform data validation, and the SQL query language can be used to export data. SNPpy is a Python program that uses the PostgreSQL database and the SQLAlchemy Python library to automate SNP data management. This chapter shows how to use SNPpy to store and manage large datasets.
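The two advantages named in the abstract, validation by the database and export through SQL, can be illustrated with a minimal sketch. It is not SNPpy's code or schema: to stay runnable without a database server it uses Python's built-in sqlite3 module in place of PostgreSQL and SQLAlchemy, and every table name, column name, identifier, and genotype coding below is a hypothetical choice made for illustration.

```python
import sqlite3

# Hypothetical schema for illustration only; not the schema SNPpy uses.
SCHEMA = """
CREATE TABLE snp (
    id         INTEGER PRIMARY KEY,
    rsid       TEXT    NOT NULL UNIQUE,
    chromosome TEXT    NOT NULL,
    position   INTEGER NOT NULL CHECK (position > 0)
);
CREATE TABLE sample (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE genotype (
    sample_id   INTEGER NOT NULL REFERENCES sample(id),
    snp_id      INTEGER NOT NULL REFERENCES snp(id),
    allele_call TEXT    NOT NULL
                CHECK (allele_call IN ('AA', 'AB', 'BB', 'NC')),
    PRIMARY KEY (sample_id, snp_id)
);
"""


def build_demo_db():
    """Create an in-memory database holding a few made-up records."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    with conn:
        conn.executemany(
            "INSERT INTO snp (id, rsid, chromosome, position) VALUES (?, ?, ?, ?)",
            [(1, "rs_demo_1", "1", 1000), (2, "rs_demo_2", "1", 2000)],
        )
        conn.executemany(
            "INSERT INTO sample (id, name) VALUES (?, ?)",
            [(1, "S1"), (2, "S2"), (3, "S3")],
        )
        conn.executemany(
            "INSERT INTO genotype (sample_id, snp_id, allele_call) VALUES (?, ?, ?)",
            [(1, 1, "AA"), (1, 2, "AB"), (2, 1, "BB"), (2, 2, "NC")],
        )
    return conn


def try_insert_genotype(conn, sample_id, snp_id, allele_call):
    """Insert one genotype; return False if the database rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO genotype (sample_id, snp_id, allele_call) "
                "VALUES (?, ?, ?)",
                (sample_id, snp_id, allele_call),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised for CHECK, FOREIGN KEY, and PRIMARY KEY violations.
        return False


def export_genotypes(conn, chromosome):
    """Return (sample, rsid, call) rows for one chromosome via a SQL join."""
    query = """
        SELECT sample.name, snp.rsid, genotype.allele_call
        FROM genotype
        JOIN sample ON sample.id = genotype.sample_id
        JOIN snp    ON snp.id    = genotype.snp_id
        WHERE snp.chromosome = ?
        ORDER BY sample.name, snp.position
    """
    return conn.execute(query, (chromosome,)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(try_insert_genotype(db, 3, 1, "ZZ"))  # invalid genotype code
    for row in export_genotypes(db, "1"):
        print(row)
```

In this sketch the constraints, rather than application code, reject an unknown genotype code, a reference to a nonexistent sample, and a duplicate call for the same sample and SNP. The export is a single join that can be filtered and ordered without touching the loading code.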


References

  1. Purcell S et al (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81(3):559–575

  2. Aulchenko YS et al (2007) GenABEL: an R library for genome-wide association analysis. Bioinformatics 23(10):1294–1296

  3. Mitha F, Herodotou H, Borisov N, Jiang C, Yoder J, Owzar K (2011) SNPpy - database management for SNP data from genome wide association studies. PLoS ONE 6(10):e24982, DOI 10.1371/journal.pone.0024982, URL http://dx.doi.org/10.1371%2Fjournal.pone.0024982

Acknowledgements

This article is adapted from [3], the original SNPpy project paper. The author wishes to thank his coauthors on that paper, namely Herodotos Herodotou, Nedyalko Borisov, Chen Jiang, Josh Yoder, and Kouros Owzar. The author also wishes to thank the PostgreSQL community, specifically the denizens of the postgresql-general and postgresql-performance mailing lists, and particularly the Freenode IRC channel #postgresql. The people who contributed advice and suggestions are too numerous to list in full, but specific mention goes to Andrew Gierth (RhodiumToad), #postgresql’s resident expert on everything PostgreSQL related, Erikjan Rijkers (breinbaas), Jeff Trout (threshar), David Fetter (davidfetter), Jon T Erdman (StuckMojo), David Blewett (BlueAidan), depesz, Casey Allen Shobe (Raptelan), Chua Khee Chin (merlin83), Marko Tiikkaja (johto), Robert Haas, and Robert Schnabel. The author also wishes to thank Michael Bayer, the author of SQLAlchemy, who has been extremely helpful in answering numerous queries on the SQLAlchemy user mailing list. Finally, the author wishes to thank the members of the StackExchange question-answer sites, especially stackoverflow.com, unix.stackexchange.com, and tex.stackexchange.com, for answering many questions in connection with this project. The members of tex.stackexchange.com were particularly helpful with regard to LaTeX issues.

Copyright information

© 2013 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Mitha, F. (2013). Managing Large SNP Datasets with SNPpy. In: Gondro, C., van der Werf, J., Hayes, B. (eds) Genome-Wide Association Studies and Genomic Prediction. Methods in Molecular Biology, vol 1019. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-447-0_4

  • DOI: https://doi.org/10.1007/978-1-62703-447-0_4

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-446-3

  • Online ISBN: 978-1-62703-447-0

  • eBook Packages: Springer Protocols
