Combining Different Data Mining Techniques to Improve Data Analysis

  • Sergio Greco
  • Elio Masciari
  • Luigi Pontieri
Conference paper
Part of the Advances in Soft Computing book series (AINSC, volume 7)

Abstract

In this paper we propose the combined use of different methods to improve the data analysis process. This is achieved by combining inductive and deductive techniques. Inductive techniques are used to generate hypotheses from data, whereas deductive techniques are used to derive knowledge and to verify hypotheses. To guide users through the analysis process, we have developed a system that integrates deductive tools, data mining tools (such as classification algorithms and feature selection algorithms), visualization tools, and tools for the easy manipulation of data sets. The system is currently used in a large project whose aim is to integrate information sources containing data on the socio-economic aspects of Calabria and to analyze the integrated data. Several experiments on socio-economic indicators of Calabrian cities have shown that the combined use of different techniques improves both the comprehensibility and the accuracy of models.



Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Sergio Greco (1, 2)
  • Elio Masciari (1, 2)
  • Luigi Pontieri (1, 2)
  1. DEIS, Università della Calabria, Rende, Italy
  2. ISI-CNR, Rende, Italy
