An Efficient Framework for Prediction in Healthcare Data Using Soft Computing Techniques

  • Conference paper
Advances in Computing and Communications (ACC 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 192)

Abstract

Healthcare organizations aim to derive valuable insights by applying data mining and soft computing techniques to the vast data stores accumulated over the years. This data, however, may contain missing, incorrect and, most often, incomplete instances, which can have a detrimental effect on predictive analytics of healthcare data. Preprocessing this data, specifically imputing missing values, is a challenge for reliable modeling. This work presents a novel preprocessing phase with missing value imputation for both numerical and categorical data. It uses a hybrid combination of Classification and Regression Trees (CART) and Genetic Algorithms to impute missing continuous values, and Self Organizing Feature Maps (SOFM) to impute categorical values. Further, Artificial Neural Networks (ANN) are used to validate the improved accuracy of prediction after imputation. To evaluate this model, we use the PIMA Indians Diabetes Data set (PIDD) and the Mammographic Mass Data (MMD). The accuracy of the proposed model, which emphasizes a preprocessing phase, is shown to be superior to that of existing techniques. This approach is simple, easy to implement and practically reliable.
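
To make the idea of tree-based imputation of a continuous value concrete, the following is a minimal sketch in Python, not the method of the paper. It assumes a single numeric predictor and fits a depth-one regression tree (a CART-style stump) on the complete rows, then fills each missing value with the mean of the leaf the row falls into. The Genetic Algorithm and SOFM stages described in the abstract are omitted. The function names, the column names `age` and `glucose`, and the toy values are all hypothetical and are not taken from PIDD or MMD.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (the regression impurity used by CART)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(xs, ys):
    """Return (threshold, left_mean, right_mean) for the single best split on x."""
    pairs = sorted(zip(xs, ys))
    best = None  # (total_sse, threshold, left_mean, right_mean)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[0]:
            best = (total, threshold, mean(left), mean(right))
    if best is None:
        # All x values are identical: fall back to the global mean.
        m = mean(ys)
        return (float("inf"), m, m)
    return best[1:]


def impute(rows, predictor, target):
    """Return copies of rows with missing (None) target values filled by the stump."""
    complete = [r for r in rows if r[target] is not None]
    threshold, left_mean, right_mean = fit_stump(
        [r[predictor] for r in complete],
        [r[target] for r in complete],
    )
    filled = []
    for r in rows:
        r = dict(r)  # copy so the input rows stay unchanged
        if r[target] is None:
            r[target] = left_mean if r[predictor] <= threshold else right_mean
        filled.append(r)
    return filled


# Hypothetical toy data; None marks a missing value.
rows = [
    {"age": 22, "glucose": 90.0},
    {"age": 25, "glucose": 94.0},
    {"age": 30, "glucose": None},
    {"age": 50, "glucose": 140.0},
    {"age": 55, "glucose": 150.0},
    {"age": 60, "glucose": None},
]
filled = impute(rows, "age", "glucose")
print([r["glucose"] for r in filled])
```

On this toy input the best split falls between ages 25 and 50, so the row with age 30 receives the mean of the younger group and the row with age 60 the mean of the older group. A full CART model would split recursively over several predictors; the stump is kept here only for brevity.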

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bhat, V.H., Rao, P.G., Krishna, S., Shenoy, P.D., Venugopal, K.R., Patnaik, L.M. (2011). An Efficient Framework for Prediction in Healthcare Data Using Soft Computing Techniques. In: Abraham, A., Mauri, J.L., Buford, J.F., Suzuki, J., Thampi, S.M. (eds) Advances in Computing and Communications. ACC 2011. Communications in Computer and Information Science, vol 192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22720-2_55

  • DOI: https://doi.org/10.1007/978-3-642-22720-2_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22719-6

  • Online ISBN: 978-3-642-22720-2

  • eBook Packages: Computer Science, Computer Science (R0)
