Discovering Motif-Based Association Rules in a Set of DNA Sequences
Measuring the similarity between DNA sequences is an important problem in bioinformatics. The traditional approach uses pairwise alignment based on dynamic programming to measure the similarity between two sequences. This method does not work well on large data sets. In this paper, we treat motifs like the phrases of a document and use text mining techniques to find the frequent motifs, the maximal frequent motifs, and the motif-based association rules in a group of genes.
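The idea of treating motifs as phrases can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that a motif is any fixed-length substring (k-mer), that the support of a motif is the fraction of sequences containing it, and that rules relate only pairs of motifs. The function names and the parameters `k`, `min_support`, and `min_confidence` are hypothetical, and maximal frequent motifs are not covered.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of length-k substrings (candidate motifs) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def frequent_motifs(seqs, k, min_support):
    """Map each frequent k-mer to its support.

    Support is the fraction of sequences that contain the k-mer at least
    once (an assumption made for this sketch).
    """
    counts = {}
    for s in seqs:
        for m in kmers(s, k):
            counts[m] = counts.get(m, 0) + 1
    n = len(seqs)
    return {m: c / n for m, c in counts.items() if c / n >= min_support}


def association_rules(seqs, k, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for motif pairs."""
    freq = frequent_motifs(seqs, k, min_support)
    motif_sets = [kmers(s, k) for s in seqs]
    n = len(seqs)
    rules = []
    for a, b in combinations(sorted(freq), 2):
        # Joint support: fraction of sequences containing both motifs.
        both = sum(1 for ms in motif_sets if a in ms and b in ms) / n
        if both < min_support:
            continue
        # Test the rule in both directions: x -> y.
        for x, y in ((a, b), (b, a)):
            confidence = both / freq[x]
            if confidence >= min_confidence:
                rules.append((x, y, both, confidence))
    return rules


if __name__ == "__main__":
    genes = ["ACGT", "ACGA", "TTAC"]  # toy data, not from the paper
    for rule in association_rules(genes, k=2, min_support=0.6, min_confidence=0.9):
        print(rule)
```

On the toy input, only `AC` and `CG` reach the support threshold, and the single rule that survives is `CG -> AC` with confidence 1.0, since every sequence containing `CG` also contains `AC`.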