
A novel approach for predicting DNA splice junctions using hybrid machine learning algorithms

  • Focus
Soft Computing

Abstract

Accurate identification of splice junctions in a DNA sequence is an active area of research. Knowing where splice junctions occur provides valuable information about the internal genomic structure of a sequence and aids its deeper analysis and interpretation. The major problems faced during gene analysis are the diversity, complexity and uncertain nature of DNA sequences. The application of computational techniques based on machine learning algorithms to this problem has attracted enormous attention in the last few decades. In this study, hybrid machine learning ensemble approaches are developed that address the splice junction problem more effectively. Multiple classifier systems consisting of random subspace, rotation forest and boosting methods are implemented and validated on a real genome sequence dataset. A novel feature selection technique based on attribute correlation estimation using a best-first search strategy is proposed. The average prediction accuracy achieved in identifying splice junctions is more than 98%. All computations are performed with a 95% confidence interval. The results presented in this study are superior to those of the state-of-the-art approaches in the literature. This work strengthens the viability of extending and applying machine learning models to similar problems.
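The correlation-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it assumes the standard correlation-based merit k·r_cf / sqrt(k + k(k−1)·r_ff), uses Pearson correlation on numerically encoded attributes, and replaces best-first search with a plain greedy forward search (best-first additionally keeps a queue of unexpanded subsets so that it can backtrack). The function names and the toy data are hypothetical.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(subset, columns, target):
    """Merit of a feature subset: high feature-class correlation, low feature-feature correlation."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = mean(abs(pearson(columns[j], target)) for j in subset)
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = mean(abs(pearson(columns[a], columns[b])) for a, b in pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, target):
    """Greedy forward search (a simplification of best-first): add features while merit improves."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        candidate = max(sorted(remaining),
                        key=lambda j: cfs_merit(selected + [j], columns, target))
        score = cfs_merit(selected + [candidate], columns, target)
        if score <= best:
            break  # no candidate improves the merit, so stop
        selected.append(candidate)
        remaining.remove(candidate)
        best = score
    return selected, best


# Hypothetical toy data: column 0 predicts the class, column 1 duplicates it,
# column 2 is unrelated to the class.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
selected, merit = greedy_cfs(columns, target)
print(selected, merit)
```

On the toy data the redundant copy and the uninformative column are both left out, because neither raises the merit of the subset already selected. For nominal attributes such as nucleotides, an entropy-based measure (for example symmetrical uncertainty) would be a more natural choice than Pearson correlation.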

Author information

Corresponding author

Correspondence to Indrajit Mandal.

Additional information

Communicated by E. Lughofer.

About this article

Cite this article

Mandal, I. A novel approach for predicting DNA splice junctions using hybrid machine learning algorithms. Soft Comput 19, 3431–3444 (2015). https://doi.org/10.1007/s00500-014-1550-z
