Abstract
The term ‘epigenetic’ refers to all heritable alterations in gene function that occur without any change in the deoxyribonucleic acid (DNA) sequence. Epigenetic modifications play a crucial role in the development and differentiation of various diseases, including cancer. One epigenetic alteration that has garnered a great deal of attention is DNA methylation, i.e., the addition of a methyl group to cytosine. Recent studies have shown that different tumor types have distinct methylation profiles. Identifying the idiosyncratic DNA methylation profiles of different tumor types and subtypes can provide invaluable insights for accurate diagnosis, early detection, and tailoring of cancer treatment. In this study, our goal is to identify the informative genes (biomarkers) whose changes in methylation level correlate with a specific cancer type or subtype. To achieve this goal, we propose a novel high-dimensional learning framework, inspired by the dynamic data driven application systems (DDDAS) paradigm, to identify the biomarkers, determine the outlier(s), and improve the quality of the resultant disease detection. The proposed framework starts with a principal component analysis (PCA), followed by hierarchical clustering (HCL) of the observations and determination of informative genes based on the HCL predictions. The capabilities and performance of the proposed framework are demonstrated using a DNA methylation dataset on lung cancer stored in Gene Expression Omnibus (GEO) DataSets. The preliminary results demonstrate that our framework outperforms conventional clustering algorithms with embedded dimension reduction methods in identifying informative genes and outliers and in removing their contaminating effects, at a reasonable computational cost.
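The abstract outlines the pipeline only at a high level (PCA, then hierarchical clustering of observations, then selection of informative genes from the cluster predictions). The sketch below illustrates that sequence under stated assumptions and is not the authors' implementation: the synthetic data, the average-linkage criterion, the two-cluster setting, and the standardized mean-difference score used to rank genes are all hypothetical choices made for illustration, and the outlier-handling and dynamic feedback steps are omitted.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project samples (rows of X) onto the top principal components."""
    Xc = X - X.mean(axis=0)                      # center each gene
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def average_linkage(Z, n_clusters):
    """Naive agglomerative clustering with average linkage (assumed criterion)."""
    clusters = [[i] for i in range(len(Z))]
    # Pairwise Euclidean distances between all samples in PCA space.
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    labels = np.empty(len(Z), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

def rank_genes(X, labels):
    """Rank genes by a standardized mean difference between two clusters
    (a hypothetical score, not taken from the chapter)."""
    g0, g1 = X[labels == 0], X[labels == 1]
    diff = np.abs(g0.mean(axis=0) - g1.mean(axis=0))
    spread = np.sqrt(0.5 * (g0.var(axis=0) + g1.var(axis=0))) + 1e-12
    return np.argsort(-(diff / spread))

# Synthetic methylation levels: 20 samples x 50 genes, values near 0.4.
# Genes 0-9 are shifted upward in the last 10 samples (assumed "informative").
rng = np.random.default_rng(0)
X = rng.normal(0.4, 0.05, size=(20, 50))
X[10:, :10] += 0.3

scores = pca_scores(X, n_components=2)
labels = average_linkage(scores, n_clusters=2)
top = rank_genes(X, labels)[:10]
print("cluster labels:", labels)
print("top-ranked genes:", sorted(top.tolist()))
```

Clustering in the reduced PCA space rather than on all genes keeps the distance computations cheap when the number of genes far exceeds the number of samples; the genes are then ranked in the original space so that the result stays interpretable as a list of candidate biomarkers.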
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Damgacioglu, H., Celik, E., Yuan, C., Celik, N. (2018). Dynamic Data Driven Application Systems for Identification of Biomarkers in DNA Methylation. In: Blasch, E., Ravela, S., Aved, A. (eds) Handbook of Dynamic Data Driven Applications Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-95504-9_12
DOI: https://doi.org/10.1007/978-3-319-95504-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95503-2
Online ISBN: 978-3-319-95504-9
eBook Packages: Computer Science, Computer Science (R0)