Knowledge and Information Systems, Volume 34, Issue 3, pp 521–546

Decision rules extraction from data stream in the presence of changing context for diabetes treatment

Open Access
Regular Paper

Abstract

Knowledge extraction is an important element of e-Health systems. In this paper, we introduce a new method for decision rule extraction, called Graph-based Rules Inducer, to support the medical interview in diabetes treatment. The emphasis is on the ability to track changes in a hidden context, where the context is understood as the set of all factors affecting the patient's condition. To follow context changes, the proposed algorithm implements a forgetting mechanism controlled by a forgetting factor. Moreover, a graph representation is used to aggregate data, and the search space is limited to protect against overfitting. We demonstrate the advantages of our approach over other methods through an empirical study on the Electricity benchmark data set in a classification task. Subsequently, our method is applied in diabetes treatment as a tool supporting medical interviews.
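
The idea of a forgetting factor can be illustrated with a small sketch. The code below is not the Graph-based Rules Inducer itself, whose details are not given in this abstract; it is a generic exponential-forgetting counter, assuming that evidence for each (attribute, value, class) triple is stored as a weight and that every weight is multiplied by the forgetting factor before a new observation is added. The class name `ForgettingCounter`, its methods, and the attribute and label names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ForgettingCounter:
    """Weighted (attribute, value, label) counts with exponential forgetting.

    Hypothetical illustration only; not the algorithm proposed in the paper.
    """

    def __init__(self, forgetting_factor=0.9):
        if not 0.0 < forgetting_factor <= 1.0:
            raise ValueError("forgetting_factor must lie in (0, 1]")
        self.forgetting_factor = forgetting_factor
        self.weights = defaultdict(float)

    def update(self, features, label):
        """Decay all stored evidence, then add the new observation."""
        for key in self.weights:
            self.weights[key] *= self.forgetting_factor
        for name, value in features.items():
            self.weights[(name, value, label)] += 1.0

    def predict(self, features, labels):
        """Return the label with the largest accumulated weight."""
        def score(label):
            return sum(
                self.weights.get((name, value, label), 0.0)
                for name, value in features.items()
            )
        return max(labels, key=score)


# Assumed toy stream: the context changes after the third observation.
stream = [({"glucose": "high"}, "A")] * 3 + [({"glucose": "high"}, "B")] * 2

forgetful = ForgettingCounter(forgetting_factor=0.5)
static = ForgettingCounter(forgetting_factor=1.0)  # factor 1.0 disables forgetting
for features, label in stream:
    forgetful.update(features, label)
    static.update(features, label)

print(forgetful.predict({"glucose": "high"}, ["A", "B"]))
print(static.predict({"glucose": "high"}, ["A", "B"]))
```

With a factor below 1, older observations lose influence geometrically, so the counter follows the more recent label after the change; with a factor of 1.0 all observations count equally and the earlier majority still dominates.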

Keywords

Decision rules; Forgetting; Incremental learning; Hidden context; Diabetes

Copyright information

© The Author(s) 2012

Authors and Affiliations

  1. Institute of Computer Science, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland