Streamwise feature selection: a rough set method

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Traditional feature selection methods assume that the entire input feature set is available from the beginning. However, streaming features (SF) are an integral part of many real-world applications. In this scenario, the number of training examples is fixed while the number of features grows over time as new features stream in. A critical challenge for streamwise feature selection (SFS) is that the entire feature set is unavailable before learning starts. Several efforts have been made to address the SFS problem; however, they all need some prior knowledge about the entire feature set. In this paper, the SFS problem is considered from the rough sets (RS) perspective. The main motivation is that RS-based data mining does not require any domain knowledge other than the given dataset. The proposed method uses the significance analysis concepts of RS theory to control the unknown feature space in SFS problems. The algorithm is evaluated extensively on several high-dimensional datasets in terms of compactness, classification accuracy, and running time. Experimental results demonstrate that the algorithm achieves better results than existing SFS algorithms.
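
To make the rough-set vocabulary in the abstract concrete, the following is a minimal sketch of the standard dependency degree (the fraction of examples in the positive region) and of feature significance (the drop in dependency when a feature is removed). The streaming loop built on top of them is an assumption made for illustration only; it is not the algorithm proposed in the article, whose details are not part of this page. The names `dependency`, `significance`, and `streamwise_select`, and the toy data, are hypothetical.

```python
from collections import defaultdict


def dependency(rows, labels, features):
    """Rough-set dependency degree of the labels on `features`.

    Returns the fraction of rows that fall in the positive region, i.e.
    in an equivalence class whose rows all carry the same label.
    """
    if not rows:
        return 0.0
    # Group row indices by their values on the chosen features.
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[f] for f in features)].append(i)
    # Count the rows of every label-pure equivalence class.
    positive = sum(
        len(idx) for idx in classes.values()
        if len({labels[i] for i in idx}) == 1
    )
    return positive / len(rows)


def significance(rows, labels, selected, feature):
    """Drop in dependency when `feature` is removed from `selected`."""
    rest = [f for f in selected if f != feature]
    return dependency(rows, labels, selected) - dependency(rows, labels, rest)


def streamwise_select(rows, labels, feature_stream):
    """Illustrative streaming rule (an assumption, not the paper's method).

    A newly arrived feature is kept only if it raises the dependency degree;
    afterwards, features whose significance has dropped to zero are pruned.
    """
    selected = []
    for f in feature_stream:
        candidate = selected + [f]
        if dependency(rows, labels, candidate) > dependency(rows, labels, selected):
            selected = candidate
            for g in list(selected):
                if len(selected) > 1 and significance(rows, labels, selected, g) <= 0.0:
                    selected.remove(g)
    return selected


# Toy data: four examples, four discrete features (columns 0..3).
# Column 2 duplicates column 0; column 3 alone determines the label.
rows = [
    (0, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 1, 1),
    (1, 1, 1, 0),
]
labels = [0, 0, 1, 0]

print(streamwise_select(rows, labels, [2, 0, 1]))  # [2, 1]
print(streamwise_select(rows, labels, [2, 3]))     # [3]
```

In the first stream, feature 0 is rejected because it adds nothing beyond its duplicate, feature 2. In the second, feature 2 is pruned once feature 3 arrives and makes it redundant. Only the examples seen so far and the features already selected are consulted, so no knowledge of future features is needed.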


Author information

Corresponding author

Correspondence to Sadegh Eskandari.

About this article

Cite this article

Javidi, M.M., Eskandari, S. Streamwise feature selection: a rough set method. Int. J. Mach. Learn. & Cyber. 9, 667–676 (2018). https://doi.org/10.1007/s13042-016-0595-y

  • DOI: https://doi.org/10.1007/s13042-016-0595-y
