
LEFT–Logical Expressions Feature Transformation: A Framework for Transformation of Symbolic Features

  • Mehreen Saeed
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7368)

Abstract

The accuracy of a classifier relies heavily on the encoding and representation of its input data. Many machine learning algorithms require input vectors composed of numeric values to which arithmetic and comparison operators can be applied. However, many real-life applications collect symbolic, or 'nominal', data, for which these operators are not available. This paper presents a framework called logical expression feature transformation (LEFT), which maps symbolic attributes to a continuous domain for further processing by a learning machine. It is a generic method that can be used with any suitable clustering method and any appropriate distance metric. The proposed method was tested on synthetic and real-life datasets. The results show that the framework not only achieves dimensionality reduction but also improves the accuracy of a classifier.
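
The abstract does not spell out how LEFT works, so the following is not the paper's algorithm. It is only a minimal sketch of the general idea the abstract names: cluster symbolic records with some distance metric, then represent each record by numeric values derived from the clusters. The choice of Hamming distance, the k-modes-style clustering, the first-k seeding, and all function names and sample records are assumptions made for illustration.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two symbolic records differ."""
    return sum(x != y for x, y in zip(a, b))

def mode_of(records):
    """Column-wise most frequent symbol (ties go to the first one seen)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))

def cluster(records, k, iters=10):
    """Tiny k-modes-style clustering (assumed, not from the paper).

    The first k distinct records seed the cluster modes.
    """
    modes = list(dict.fromkeys(records))[:k]
    for _ in range(iters):
        groups = [[] for _ in modes]
        for r in records:
            nearest = min(range(len(modes)), key=lambda i: hamming(r, modes[i]))
            groups[nearest].append(r)
        # Keep the old mode if a cluster ends up empty.
        new_modes = [mode_of(g) if g else m for g, m in zip(groups, modes)]
        if new_modes == modes:
            break
        modes = new_modes
    return modes

def transform(record, modes):
    """Map a symbolic record to normalized distances to each cluster mode."""
    return [hamming(record, m) / len(record) for m in modes]

# Hypothetical symbolic records: (colour, shape, size).
records = [
    ("red", "round", "small"),
    ("blue", "square", "large"),
    ("red", "round", "large"),
    ("blue", "square", "small"),
]
modes = cluster(records, k=2)
features = [transform(r, modes) for r in records]
```

In this sketch three symbolic attributes become two continuous ones, one per cluster, which illustrates how a transformation of this kind can also reduce dimensionality. Any other clustering routine or distance function could be substituted for `cluster` and `hamming`.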

Keywords

Feature Vector; Symbolic Data; Logical Expression; Binary Encode; Breast Cancer Dataset
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Mehreen Saeed
    • 1
  1. Department of Computer Science, FAST National University of Computer and Emerging Sciences, Pakistan
