Attribute Reduction for Massive Data Based on Rough Set Theory and MapReduce

  • Yong Yang
  • Zhengrong Chen
  • Zhu Liang
  • Guoyin Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6401)


Data processing and knowledge discovery for massive data have long been active topics in data mining, and with the arrival of the cloud computing era, data mining for massive data is becoming a prominent research topic. In this paper, attribute reduction for massive data based on rough set theory is studied. The MapReduce parallel programming model is introduced and combined with a rough set attribute reduction algorithm, and a parallel attribute reduction algorithm based on MapReduce is proposed. Experimental results show that the proposed method is more efficient for massive data mining than the traditional method, and that it is an effective method for data mining on a cloud computing platform.
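The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a rough set reduction step might be phrased in map/reduce terms. It assumes a decision table held in memory, a positive-region measure, and a simple greedy search; all function names and the sample data are hypothetical, and the map and reduce phases are simulated in a single process rather than run on a real cluster.

```python
from collections import defaultdict

# Hypothetical decision table: (condition attribute values, decision value).
EXAMPLE_ROWS = [
    ({"a": 0, "b": 0, "c": 0}, "no"),
    ({"a": 0, "b": 1, "c": 0}, "yes"),
    ({"a": 1, "b": 0, "c": 1}, "no"),
    ({"a": 1, "b": 1, "c": 1}, "yes"),
]


def map_phase(rows, attrs):
    """Map step: emit (equivalence-class key, decision) for every row.

    The key is the tuple of the row's values on the chosen attributes, so
    rows that are indiscernible on `attrs` share the same key.
    """
    for condition, decision in rows:
        yield tuple(condition[a] for a in attrs), decision


def reduce_phase(pairs):
    """Reduce step: per key, count the rows and collect the decisions seen."""
    groups = defaultdict(lambda: [0, set()])
    for key, decision in pairs:
        groups[key][0] += 1
        groups[key][1].add(decision)
    return groups


def positive_region_size(rows, attrs):
    """Number of rows in equivalence classes that have a single decision."""
    groups = reduce_phase(map_phase(rows, attrs))
    return sum(count for count, decisions in groups.values()
               if len(decisions) == 1)


def greedy_reduct(rows, all_attrs):
    """Greedily add attributes until the full positive region is reached.

    This is an assumed heuristic for illustration, not the paper's method.
    """
    target = positive_region_size(rows, all_attrs)
    reduct = []
    while positive_region_size(rows, reduct) < target:
        candidates = [a for a in all_attrs if a not in reduct]
        best = max(candidates,
                   key=lambda a: positive_region_size(rows, reduct + [a]))
        reduct.append(best)
    return reduct


if __name__ == "__main__":
    print(greedy_reduct(EXAMPLE_ROWS, ["a", "b", "c"]))
```

In this toy table only attribute `b` separates the two decisions, so the greedy search stops after selecting it. In a real MapReduce deployment the grouping by key would be handled by the framework's shuffle stage rather than by an in-memory dictionary.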


Attribute reduction, rough set, MapReduce





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yong Yang
    • 1
  • Zhengrong Chen
    • 1
  • Zhu Liang
    • 1
  • Guoyin Wang
    • 1
  1. Institute of Computer Science & Technology, Chongqing University of Posts and Telecommunications, Chongqing, P.R. China
