Dynamic Data Driven Application Systems for Identification of Biomarkers in DNA Methylation

Abstract

The term ‘epigenetic’ refers to all heritable alterations in gene function that occur without any change in the deoxyribonucleic acid (DNA) sequence. Epigenetic modifications play a crucial role in the development and differentiation of various diseases, including cancer. One specific epigenetic alteration that has garnered a great deal of attention is DNA methylation, i.e., the addition of a methyl group to cytosine. Recent studies have shown that different tumor types have distinct methylation profiles. Identifying the idiosyncratic DNA methylation profiles of different tumor types and subtypes can provide invaluable insights for accurate diagnosis, early detection, and tailoring of cancer treatment. In this study, our goal is to identify the informative genes (biomarkers) whose change in methylation level correlates with a specific cancer type or subtype. To achieve this goal, we propose a novel high-dimensional learning framework, inspired by the dynamic data driven application systems (DDDAS) paradigm, to identify the biomarkers, determine the outlier(s), and improve the quality of the resultant disease detection. The proposed framework starts with a principal component analysis (PCA), followed by hierarchical clustering (HCL) of the observations and determination of the informative genes based on the HCL predictions. The capabilities and performance of the proposed framework are demonstrated using a lung cancer DNA methylation dataset stored in GEO DataSets. The preliminary results demonstrate that our framework outperforms conventional clustering algorithms with embedded dimension reduction methods in identifying informative genes and outliers and in removing their contaminating effects, at a reasonable computational cost.
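The PCA, HCL, and gene-selection sequence described above can be illustrated with a minimal sketch. This is not the chapter's implementation: the synthetic data, the average-linkage rule, the gene-scoring formula, and all function names (`pca_scores`, `average_linkage`, `rank_genes`) are assumptions made for illustration, and the outlier-detection and dynamic feedback steps are omitted.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the leading principal components."""
    Xc = X - X.mean(axis=0)                      # center each gene
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def average_linkage(Z, n_clusters=2):
    """Naive agglomerative clustering (average linkage) on rows of Z."""
    clusters = [[i] for i in range(len(Z))]
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    labels = np.empty(len(Z), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

def rank_genes(X, labels):
    """Order genes by the gap between the two cluster means,
    scaled by each gene's overall standard deviation (assumed score)."""
    a, b = X[labels == 0], X[labels == 1]
    score = np.abs(a.mean(axis=0) - b.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    return np.argsort(score)[::-1]

# Synthetic methylation levels in [0, 1]: 40 samples, 50 genes (hypothetical data).
rng = np.random.default_rng(0)
n_per_group, n_genes = 20, 50
X = rng.uniform(0.4, 0.6, size=(2 * n_per_group, n_genes))
informative = [3, 17, 42]                        # genes that differ by group
X[:n_per_group, informative] -= 0.3
X[n_per_group:, informative] += 0.3

labels = average_linkage(pca_scores(X, n_components=2), n_clusters=2)
top_genes = rank_genes(X, labels)[:3]
print("cluster labels:", labels)
print("top-ranked genes:", sorted(top_genes.tolist()))
```

Clustering in the reduced PCA space rather than on all genes is what keeps the distance computation cheap when the number of genes far exceeds the number of samples. The quadratic merge loop here is only practical for small sample counts; a library routine would be the usual choice on real data.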

Keywords

  • Dynamic data driven application systems
  • DNA methylation
  • Hierarchical clustering
  • Principal component analysis
  • Outlier detection

  • Chapter length: 21 pages

Acknowledgements

This project is supported by the AFOSR DDDAS.

Author information

Corresponding author

Correspondence to Nurcin Celik.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Damgacioglu, H., Celik, E., Yuan, C., Celik, N. (2022). Dynamic Data Driven Application Systems for Identification of Biomarkers in DNA Methylation. In: Blasch, E.P., Darema, F., Ravela, S., Aved, A.J. (eds) Handbook of Dynamic Data Driven Applications Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-74568-4_12

  • DOI: https://doi.org/10.1007/978-3-030-74568-4_12

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-74567-7

  • Online ISBN: 978-3-030-74568-4

  • eBook Packages: Computer Science, Computer Science (R0)