Big Biomedical Data Engineering

  • Chapter
  • In: Principles of Data Science

Abstract

Big Data, meaning massive volumes of data, is both a popular buzzword and a paradigm with the potential to transform any data-intensive field. Adopting Big Data technology gives an organization a new direction, and it offers biomedical data engineering a vision of its own. Numerous data-intensive fields already rely on Big Data technology to pursue their goals. Big Data plays a crucial role in Big Biomedical Data Engineering (BBDE): the sheer volume of biomedical data poses a dilemma for analysis, diagnosis, and prediction, and large-scale medical data cannot be stored and processed without Big Data technology. Deploying that technology can therefore change the game for biomedical engineering. This chapter explores the role of Big Data in biomedical data engineering and its storage dilemma.

Corresponding author

Correspondence to Ripon Patgiri.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Patgiri, R., Nayak, S. (2020). Big Biomedical Data Engineering. In: Arabnia, H.R., Daimi, K., Stahlbock, R., Soviany, C., Heilig, L., Brüssau, K. (eds) Principles of Data Science. Transactions on Computational Science and Computational Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-43981-1_3

  • DOI: https://doi.org/10.1007/978-3-030-43981-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-43980-4

  • Online ISBN: 978-3-030-43981-1

  • eBook Packages: Engineering, Engineering (R0)
