Cost-Sensitive Feature Selection on Heterogeneous Data

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Abstract

Evaluation functions, which measure the quality of features, strongly influence feature selection algorithms in data mining and knowledge discovery. However, existing evaluation functions often measure candidate features inadequately on cost-sensitive heterogeneous data. To address this problem, we first propose an entropy-based evaluation function for measuring uncertainty in heterogeneous data. To further evaluate the quality of candidate features, we propose a multi-criteria evaluation function, which seeks candidate features with minimal total cost that preserve the same information as the whole feature set. On this basis, we develop a cost-sensitive feature selection algorithm for heterogeneous data. Experimental results show that, compared with existing feature selection algorithms, the proposed algorithm finds a feature subset more efficiently without losing classification performance.
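
The abstract does not give the paper's definitions, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes a neighborhood-based conditional entropy for mixed data (categorical features compared by equality, numeric features by a tolerance `delta`) and a greedy search that ranks features by entropy reduction per unit cost, stopping once the selected subset carries the same entropy as the full feature set. The names `neighborhood`, `conditional_entropy`, `greedy_select`, the parameter `delta`, the cost values, and the toy data are all hypothetical.

```python
import math


def neighborhood(data, i, features, numeric, delta):
    """Indices of samples indistinguishable from sample i on `features`.

    Assumption: categorical values must match exactly; numeric values
    must lie within `delta` of each other.
    """
    result = []
    for j, row in enumerate(data):
        similar = True
        for a in features:
            if a in numeric:
                if abs(row[a] - data[i][a]) > delta:
                    similar = False
                    break
            elif row[a] != data[i][a]:
                similar = False
                break
        if similar:
            result.append(j)
    return result


def conditional_entropy(data, labels, features, numeric, delta):
    """Neighborhood conditional entropy of the labels given `features`."""
    total = 0.0
    for i in range(len(data)):
        nb = neighborhood(data, i, features, numeric, delta)
        same = sum(1 for j in nb if labels[j] == labels[i])
        # Sample i is always in its own neighborhood, so same >= 1.
        total -= math.log2(same / len(nb))
    return total / len(data)


def greedy_select(data, labels, costs, numeric, delta=0.1, eps=1e-12):
    """Add the feature with the best entropy reduction per unit cost
    until the subset matches the entropy of the full feature set."""
    all_features = list(costs)
    target = conditional_entropy(data, labels, all_features, numeric, delta)
    selected = []
    current = conditional_entropy(data, labels, selected, numeric, delta)
    while current > target + eps and len(selected) < len(all_features):
        remaining = [a for a in all_features if a not in selected]
        best = max(
            remaining,
            key=lambda a: (
                current
                - conditional_entropy(data, labels, selected + [a], numeric, delta)
            ) / costs[a],
        )
        selected.append(best)
        current = conditional_entropy(data, labels, selected, numeric, delta)
    return selected


# Toy data (hypothetical): one cheap categorical feature, one expensive
# numeric feature, and one uninformative categorical feature.
toy_data = [
    {"color": "red", "size": 0.10, "noise": "a"},
    {"color": "red", "size": 0.15, "noise": "b"},
    {"color": "blue", "size": 0.80, "noise": "a"},
    {"color": "blue", "size": 0.85, "noise": "b"},
]
toy_labels = [0, 0, 1, 1]
toy_costs = {"color": 1.0, "size": 5.0, "noise": 1.0}
toy_numeric = {"size"}

print(greedy_select(toy_data, toy_labels, toy_costs, toy_numeric))  # -> ['color']
```

In this toy example both `color` and `size` separate the classes fully, but `color` is cheaper, so the cost-weighted criterion picks it alone. A ratio criterion is only one way to combine information and cost; the paper's multi-criteria evaluation function may differ.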

Corresponding author

Correspondence to Wenhao Shu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Qian, W., Shu, W., Yang, J., Wang, Y. (2015). Cost-Sensitive Feature Selection on Heterogeneous Data. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_31

  • DOI: https://doi.org/10.1007/978-3-319-18032-8_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18031-1

  • Online ISBN: 978-3-319-18032-8

  • eBook Packages: Computer Science, Computer Science (R0)
