Abstract
DNA methylation regulates the transcription of genes without changing their coding sequences. It plays a vital role in embryogenesis and tumorigenesis. To gain more insight into how this epigenetic mechanism works in human cells, we apply two popular data mining techniques, Rough Sets and Decision Trees, to uncover the logical rules of DNA methylation. Our results show that the Rough Sets method generates and uses fewer rules to fully separate the methylation dataset, whereas the Decision Trees method relies on more rules but involves fewer decision variables to do the same task. We also find that some gene promoters are highly comethylated, providing evidence that genes are highly interactive epigenetically in human cells.
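As a rough illustration of what a comethylation analysis can look like, the sketch below flags promoter pairs whose methylation levels rise and fall together across samples. The abstract does not state which comethylation measure was used, so Pearson correlation, the 0.9 threshold, the gene names, and the data values are all assumptions made for this example only.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: methylation level (0 to 1) of each gene promoter
# across five samples. Gene names and values are invented for illustration.
methylation = {
    "GENE_A": [0.9, 0.8, 0.1, 0.2, 0.9],
    "GENE_B": [0.8, 0.9, 0.2, 0.1, 0.8],
    "GENE_C": [0.1, 0.9, 0.8, 0.2, 0.3],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def comethylated_pairs(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with correlation >= threshold.

    The threshold is an assumed cut-off, not a value taken from the paper.
    """
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if r >= threshold:
            pairs.append((g1, g2, r))
    return pairs


pairs = comethylated_pairs(methylation)
for g1, g2, r in pairs:
    print(f"{g1} and {g2} look comethylated (r = {r:.2f})")
```

On this toy input only the GENE_A and GENE_B pair passes the threshold; a real analysis would also need to account for sample size and multiple testing.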
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ben, N., Yang, Q., Li, J., Chi-keung, S., Pal, S. (2007). Discovering Patterns of DNA Methylation: Rule Mining with Rough Sets and Decision Trees, and Comethylation Analysis. In: Ghosh, A., De, R.K., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2007. Lecture Notes in Computer Science, vol 4815. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77046-6_48
DOI: https://doi.org/10.1007/978-3-540-77046-6_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77045-9
Online ISBN: 978-3-540-77046-6
eBook Packages: Computer Science (R0)