Discovering Patterns of DNA Methylation: Rule Mining with Rough Sets and Decision Trees, and Comethylation Analysis
DNA methylation regulates the transcription of genes without changing their coding sequences. It plays a vital role in embryogenesis and tumorigenesis. To gain more insight into how this epigenetic mechanism works in human cells, we apply two popular data mining techniques, Rough Sets and Decision Trees, to uncover the logical rules of DNA methylation. Our results show that the Rough Sets method generates and uses fewer rules to fully separate the methylation dataset, whereas the Decision Trees method relies on more rules but involves fewer decision variables to accomplish the same task. We also find that some gene promoters are highly comethylated, which provides evidence that genes interact strongly at the epigenetic level in human cells.
Keywords: Embryonic Stem Cell; Human Embryonic Stem Cell; Methylation Profile; Logical Rule; Decision Tree Method
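The abstract does not state how comethylation between promoters is measured. As a minimal sketch, assuming it is scored as the Pearson correlation between two promoters' methylation levels across samples, the idea could be illustrated as follows. The function names, the 0.9 threshold, and the toy values are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; treated as 0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def comethylated_pairs(profiles, threshold=0.9):
    """Return (promoter_a, promoter_b, r) for pairs with r >= threshold.

    `profiles` maps a promoter name to its methylation levels, one value
    per sample, with samples in the same order for every promoter.
    The threshold is an assumed cutoff, not a value from the paper.
    """
    pairs = []
    for (name_a, a), (name_b, b) in combinations(sorted(profiles.items()), 2):
        r = pearson(a, b)
        if r >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Made-up methylation levels (0 = unmethylated, 1 = fully methylated).
toy_profiles = {
    "promoter_A": [0.9, 0.8, 0.1, 0.2],
    "promoter_B": [0.8, 0.9, 0.2, 0.1],
    "promoter_C": [0.1, 0.2, 0.9, 0.8],
}
result = comethylated_pairs(toy_profiles)
```

In this toy data only `promoter_A` and `promoter_B` rise and fall together across samples, so they are the only pair reported. A real analysis would also need to account for sample size and multiple testing before calling a pair comethylated.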