Volume 14, Issue 2, pp 122-131

A neural network based multi-classifier system for gene identification in DNA sequences

Abstract

The paper presents a neural network based multi-classifier system for the identification of Escherichia coli promoter sequences in strings of DNA. As each gene in DNA is preceded by a promoter sequence, the successful location of an E. coli promoter leads to the identification of the corresponding E. coli gene in the DNA sequence. A set of 324 known E. coli promoters and a set of 429 known non-promoter sequences were encoded using four different encoding methods. The encoded sequences were then used to train four different neural networks, one per encoding. The classification results of the four individual neural networks were then combined through an aggregation function based on a variation of the logarithmic opinion pool method. The weights of this function were determined by a genetic algorithm. The multi-classifier system was then tested on 159 known promoter sequences and 171 non-promoter sequences not contained in the training set. The results show that the same data set, when presented to neural networks in different encoded forms, can yield slightly different classification results. They also show that when the opinions of several classifiers on the same input data are integrated within a multi-classifier system, the combined results are better than those of the individual neural networks. The multi-classifier system outperforms other prediction systems for E. coli promoters developed so far.
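To make the aggregation step concrete, here is a minimal Python sketch of a standard weighted logarithmic opinion pool for a two-class (promoter / non-promoter) decision. It is an illustration only: the abstract says the authors used a variation of this method and tuned the weights with a genetic algorithm, neither of which is reproduced here. The function name `log_opinion_pool`, the `eps` clipping constant, and the example weights and probabilities are all assumptions chosen for illustration, not values from the paper.

```python
import math


def log_opinion_pool(probs, weights, eps=1e-12):
    """Combine per-classifier promoter probabilities with a weighted
    logarithmic opinion pool (a normalized weighted geometric mean).

    probs   -- probability of "promoter" from each classifier (hypothetical inputs)
    weights -- one non-negative weight per classifier (hand-picked here;
               the paper determines them with a genetic algorithm)
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")

    # Weighted sum of log-probabilities for each class; clip to avoid log(0).
    log_pos = sum(w * math.log(max(p, eps)) for p, w in zip(probs, weights))
    log_neg = sum(w * math.log(max(1.0 - p, eps)) for p, w in zip(probs, weights))

    # Normalize over the two classes, subtracting the max for numerical stability.
    m = max(log_pos, log_neg)
    pos = math.exp(log_pos - m)
    neg = math.exp(log_neg - m)
    return pos / (pos + neg)


# Hypothetical outputs of four networks for one sequence, with equal weights.
combined = log_opinion_pool([0.9, 0.8, 0.7, 0.6], [0.25, 0.25, 0.25, 0.25])
is_promoter = combined > 0.5
```

With all weight placed on a single classifier, the pool returns that classifier's probability unchanged, which is one way to see how the weights trade off the four networks against each other.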