Allen, H.L. et al (2010) Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, no. 7321: 832–838.
Altschul, S.F. et al (1990) Basic local alignment search tool. Journal of Molecular Biology 215: 403–410.
Anson, E. and Myers, E. (1999) Algorithms for whole genome shotgun sequencing. In: Proceedings of RECOMB’99, Lyon, pp. 1–9.
Belzer, J. et al (eds.) (1978) Encyclopedia of Computer Science and Technology. Vol. 10. Linear and Matrix Algebra to Microorganisms. New York: Marcel Dekker.
Bisciglia, C. (2009) Analyzing human genomes with Apache Hadoop. Weblog, 15 October, Cloudera. http://blog.cloudera.com/blog/2009/10/analyzing-human-genomes-with-hadoop/, accessed 27 May 2015.
Bowker, G. (2006) Memory Practices in the Sciences. Cambridge: MIT Press.
Bowker, G. and Star, S.L. (1999) Sorting Things Out: Classification and its Consequences. Cambridge: MIT Press.
Boyd, D. and Crawford, K. (2012) Critical questions for big data. Information, Communication & Society 15(5): 662–679.
Brin, S. and Page, L. (2000) The anatomy of a large-scale hypertextual web search engine. Computer Science Department, Stanford University. http://infolab.stanford.edu/pub/papers/google.pdf, accessed 27 May 2015.
Brust, A. (2012) Cloudera and Mount Sinai: The structure of a big data revolution? ZDNet, 6 July. http://www.zdnet.com/article/cloudera-and-mount-sinai-the-structure-of-a-big-data-revolution/, accessed 27 May 2015.
Burrows, M. and Wheeler, D.J. (1994) A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation. http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.html, accessed 27 May 2015.
Carr, D.F. (2006) How Google Works: The Google File System. Baseline, 6 July. http://www.baselinemag.com/c/a/Infrastructure/How-Google-Works-1/4, accessed 27 May 2015.
Celera (2000) Celera Genomics to Acquire Paracel Inc. Press release, 20 March. https://www.celera.com/celera/pr_1056568938, accessed 18 September 2015.
Dalton, C. and Thatcher, J. (2014) What does a critical data studies look like, and why do we care? Seven points for a critical approach to big data. Society and Space. http://societyandspace.com/material/commentaries/craig-dalton-and-jim-thatcher-what-does-a-critical-data-studies-look-like-and-why-do-we-care-seven-points-for-a-critical-approach-to-big-data/#comments, accessed 23 September 2015.
Daly, A.K. (2010) Genome-wide association studies in pharmacogenomics. Nature Reviews Genetics 11: 241–246.
Dean, J. and Ghemawat, S. (2004) MapReduce: Simplified data processing on large clusters. Google Research Publications (appeared in OSDI’04: Sixth Symposium on Operating System Design and Implementation, San Francisco, California, December 2004). http://static.googleusercontent.com/media/research.google.com/es/us/archive/mapreduce-osdi04.pdf, accessed 27 May 2015.
Delcher, A.L. et al (1999) Alignment of whole genomes. Nucleic Acids Research 27(11): 2369–76.
Dourish, P. (2014) No SQL: The shifting materialities of database technology. Computational Culture: A Journal of Software. http://computationalculture.net/article/no-sql-the-shifting-materialities-of-database-technology, accessed 18 September 2015.
Eisen, M. (2012) Blinded by big science. Weblog entry, 10 September. www.michaeleisen.org/blog/?p=1179, accessed 23 September 2015.
ENCODE at UCSC (2012) ENCODE experiment matrix. http://genome.ucsc.edu/ENCODE/dataMatrix/encodeDataMatrixHuman.html, accessed 27 May 2015.
Ferragina, P. and Manzini, G. (2000) Opportunistic data structures with applications. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science, pp. 390–398. IEEE.
Garland, A. (2015) Ex Machina (film). Writer and director: Alex Garland.
Gitelman, L., ed. (2013) Raw Data is an Oxymoron. Cambridge: MIT Press.
Gonnella, G. and Kurtz, S. (2012) Readjoiner: A fast and memory efficient string graph-based sequence assembler. BMC Bioinformatics 13(1): 1–19.
Gusfield, D. (1997) Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge.
Harris, D. (2012) Better medicine, brought to you by big data. GigaOm, 15 July. https://gigaom.com/2012/07/15/better-medicine-brought-to-you-by-big-data/, accessed 27 May 2015.
Hazelhurst, S. and Lipták, Z. (2011) KABOOM! A new suffix array based algorithm for clustering expression data. Bioinformatics 27(24): 3348–55.
Hebbring, S.J. (2014) The challenges, advantages and future of phenome-wide association studies. Immunology 141(2): 157–65.
Hernandez, D. (2013) Data crunchers ditch Hadoop for homegrown software. Wired, 20 February. http://www.wired.com/2013/02/genetic-data-glut/, accessed 27 May 2015.
Ilie, L. et al (2011) HiTEC: Accurate error correction in high-throughput sequencing data. Bioinformatics 27(3): 295–302.
Illumina (2013) An introduction to next-generation sequencing technology. http://res.illumina.com/documents/products/illumina_sequencing_introduction.pdf, accessed 27 May 2015.
Kay, L.E. (2000) Who Wrote the Book of Life? A History of the Genetic Code. Stanford University Press.
Kielbasa, S.M. et al (2011) Adaptive seeds tame genomic sequence comparison. Genome Research 21: 487–93.
Kirschenbaum, M. (2007) Mechanisms: New Media and the Forensic Imagination. Cambridge, MA: MIT Press.
Kitchin, R. (2014) The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. SAGE Publications.
Knuth, D.E. (1973) The Art of Computer Programming, Volume 3, “Sorting and Searching.” Addison-Wesley, Redwood City.
Koboldt, D.C. et al (2013) The next-generation sequencing revolution and its impact on genomics. Cell 155(1): 27–38.
Kurtz, S. et al (2008) A new method to compute k-mer frequencies and its application to annotate large plant genomes. BMC Genomics 9(1): 1–18.
Langmead, B. et al (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10: R25.
Levy, S. (2011) In the Plex: How Google Thinks, Works, and Shapes Our Lives. Simon & Schuster, New York.
Li, H. and Homer, N. (2010) A survey of sequence alignment algorithms for next-generation sequencing. Briefings in Bioinformatics 11(5): 473–483.
Lohr, S. (2015) On the case at Mount Sinai, It’s Dr. Data. New York Times, 7 March, BU1.
Luhn, H.P. (1958) A business intelligence system. IBM Journal of Research and Development 2(4): 314.
Mackenzie, A. (2012) More parts than elements: How databases multiply. Environment and Planning D: Society and Space 30: 335–350.
Mackenzie, A. (2015b) Machine learning and genomic dimensionality. In: S. Richardson and H. Stevens (eds.) Postgenomics: Perspectives on Biology After the Genome. Durham and London: Duke University Press, pp. 73–102.
Mackenzie, A. et al (2015) Post-archival genomics and the bulk logistics of DNA sequences. Biosocieties 11(1): 82–105.
Manber, U. and Myers, E. (1990) Suffix arrays: a new method for on-line string searches. In: Proceedings of the 1st Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 319–327.
Manolio, T.A. et al (2009) Finding the missing heritability of complex diseases. Nature 461, no. 7265: 747–753.
Manovich, L. (1999) Database as a symbolic form. Millennium Film Journal 34 (Fall).
Manovich, L. (2014) Software Takes Command. Bloomsbury Academic, London.
M'Charek, A. (2005) The Human Genome Diversity Project: An Ethnography of Scientific Practice. Cambridge, UK: Cambridge University Press.
Metz, C. (2011) How Yahoo spawned Hadoop, the future of big data. Wired, 18 October. http://www.wired.com/2011/10/how-yahoo-spawned-hadoop/, accessed 27 May 2015.
Myers, E. et al (2000) Whole-genome assembly of Drosophila. Science 287: 2196–2204.
NextBio (2012) NextBio and Intel collaborate to optimize the Hadoop stack and advance big data technologies in genomics. Press release, 11 July. http://www.nextbio.com/b/corp/pressReleases.nb#pr40, accessed 27 May 2015.
Pasquale, F. (2015) The Black Box Society: The Secret Algorithms That Control Money and Information. Cambridge and London: Harvard University Press.
Patel, C.J. et al (2010) An Environment-Wide Association Study (EWAS) on Type 2 Diabetes Mellitus. PLoS One DOI:10.1371/journal.pone.0010746.
Pollack, A. (2000) Technology; Supercomputers Track Human Genome. New York Times, 28 August.
Rose, N. (2007) The Politics of Life Itself: Biomedicine, Power, and Subjectivity in the Twenty-First Century. Princeton: Princeton University Press.
Ruppert, E. et al (2015) Socializing big data: From concept to practice. CRESC Working Paper No. 138, The University of Manchester and Open University.
Schatz, M. (2009) Cloudburst: Highly sensitive read mapping with MapReduce. Bioinformatics 25(11): 1363–1369.
Schneier, B. (2015) Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World. New York: Norton.
Science (2001) Epigenetics. Science, special issue, 293, no. 5532: 1001–1208.
Shendure, J. and Ji, H. (2008) Next-generation DNA sequencing. Nature Biotechnology 26: 1135–45.
Silverman, J. (2015) Terms of Service: Social Media and the Price of Constant Connection. New York: Harper.
Smith, B.C. (1998) On the Origin of Objects. MIT Press, Cambridge.
Stein, R. A. (2008) Next-generation sequencing update. Genetic Engineering & Biotechnology News 28(15), 1 September. http://www.genengnews.com/gen-articles/next-generation-sequencing-update/2584/, accessed 27 May 2015.
Stevens, H. (2011a) Coding Sequences: A History of Sequence Comparison Algorithms as a Scientific Instrument. Perspectives on Science 19(3): 263–299.
Stevens, H. (2011b) On the means of bioproduction: Bioinformatics and how to make knowledge in a high-throughput genomics laboratory. Biosocieties 6(2): 217–242.
Stevens, H. (2013) Life Out of Sequence: A Data-Driven History of Bioinformatics. Chicago: University of Chicago Press.
Sutton, G.G. et al (1995) TIGR assembler: a new tool for assembling large shotgun sequencing projects. Genome Science & Technology 1(1): 9–19.
Taylor, R.C. (2010) An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics. BMC Bioinformatics 11(Suppl 12): S1.
Thacker, E. (2005) The Global Genome: Biotechnology, Politics, and Culture. Cambridge: MIT Press.
Thomas, U.G. (2012) Google works with ISB to evaluate life sciences as application area for new cloud infrastructure. Genomeweb, 20 July. https://www.genomeweb.com/informatics/google-works-isb-evaluate-life-sciences-application-area-new-cloud-infrastructur, accessed 27 May 2015.
Vaidhyanathan, S. (2011) The Googlization of Everything (And Why We Should Worry). Berkeley: University of California Press.
Venter, J.C. et al (2001) The Sequence of the Human Genome. Science 291, no. 5507: 1304–1351.
Visscher, P.M. et al (2012a) Evidence-based psychiatric genetics, AKA the false dichotomy between the common and rare variant hypotheses. Molecular Psychiatry 17, no. 5: 474–485.
Visscher, P.M. et al (2012b) Five years of GWAS discovery. American Journal of Human Genetics 90, no. 1: 7–24.
Wojcicki, A. et al (2012) Deleterious Me: Whole Genome Sequencing, 23andMe, and the Crowd-Sourced Health Care Revolution. Science and Democracy Lecture Series, Harvard Kennedy School, 18 April. Available at https://vimeo.com/40657814.
Zhang, J. et al (2011) The impact of next-generation sequencing on genomics. Journal of Genetics and Genomics 38(3): 95–109.