Concept Discovery from Un-Constrained Distributed Context

  • Conference paper
Big Data Analytics (BDA 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9498)

Abstract

Formal Concept Analysis (FCA) is a method for analysing a data set consisting of a binary relation matrix between objects and their attributes, in order to discover concepts that describe a special kind of relationship between a set of attributes and a set of objects. These concepts are related to each other and are arranged in a hierarchy. FCA finds application in several areas, including data mining, machine learning and the semantic web.
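To make the object-attribute relationship concrete, the following is a minimal sketch in plain Python of the two standard FCA derivation operators. The toy context and the names `CONTEXT`, `extent_of`, `intent_of` and `is_concept` are illustrative assumptions, not taken from the paper. A pair (objects, attributes) is treated as a concept when each set is exactly the derivation of the other.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical toy context: each object maps to the set of attributes it has.
CONTEXT: Dict[str, Set[str]] = {
    "o1": {"a", "b"},
    "o2": {"a", "b", "c"},
    "o3": {"a", "c"},
}


def extent_of(attrs: Iterable[str], context: Dict[str, Set[str]]) -> FrozenSet[str]:
    """Return the objects that have every attribute in attrs."""
    wanted = set(attrs)
    return frozenset(obj for obj, has in context.items() if wanted <= has)


def intent_of(objs: Iterable[str], context: Dict[str, Set[str]]) -> FrozenSet[str]:
    """Return the attributes shared by every object in objs."""
    # Start from all attributes seen in the context and intersect downwards.
    shared = set().union(*context.values())
    for obj in objs:
        shared &= context[obj]
    return frozenset(shared)


def is_concept(objs: Iterable[str], attrs: Iterable[str],
               context: Dict[str, Set[str]]) -> bool:
    """A pair is a formal concept when each side is the derivation of the other."""
    objs, attrs = frozenset(objs), frozenset(attrs)
    return extent_of(attrs, context) == objs and intent_of(objs, context) == attrs
```

In this toy context, the attribute set {b} has extent {o1, o2}, but those two objects also share a, so ({o1, o2}, {b}) is not a concept, whereas ({o1, o2}, {a, b}) is.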

A few iterative MapReduce-based algorithms have been proposed to mine concepts from a given data set. These algorithms either copy the entire data set (context) to each node or partition it in a specific manner. They assume that all attributes are known a priori and are ordered, and they iterate based on this ordering of attributes. In some applications these assumptions limit the scalability of the algorithms.

In this paper, we present a concept mining algorithm that does not assume a priori knowledge of all attributes and permits the context to be distributed across nodes in an arbitrary manner. Our algorithm uses the Apache Spark framework to discover and eliminate redundant concepts in each iteration. When we aggregate data on a per-attribute basis, we order the attributes by the number of objects containing them. Our method relies on finding extents for combinations of attributes of a particular size (say k). An extent that is not regenerated by attribute combinations of size k + 1 corresponds to a valid concept. All concepts with a particular intent size k are saved in one Resilient Distributed Dataset (RDD). We have tested our algorithm on two data sets and compared its performance with that of an earlier algorithm.
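The level-wise rule described in the abstract can be sketched on a single machine as follows. This is only an illustration under stated assumptions: it uses plain Python dictionaries where the paper uses Spark RDDs, it enumerates every attribute combination by brute force with no pruning, and the ascending direction of the support ordering is a guess. The names `CONTEXT`, `extents_for_size` and `mine_concepts` are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Concept = Tuple[FrozenSet[str], FrozenSet[str]]  # (extent, intent)

# Hypothetical toy context: each object maps to the set of attributes it has.
CONTEXT: Dict[str, Set[str]] = {
    "o1": {"a", "b"},
    "o2": {"a", "b", "c"},
    "o3": {"a", "c"},
}


def extents_for_size(context: Dict[str, Set[str]], attributes: List[str],
                     k: int) -> Dict[FrozenSet[str], FrozenSet[str]]:
    """Map each extent to one attribute combination of size k that generates it."""
    found: Dict[FrozenSet[str], FrozenSet[str]] = {}
    for combo in combinations(attributes, k):
        wanted = set(combo)
        extent = frozenset(obj for obj, has in context.items() if wanted <= has)
        found.setdefault(extent, frozenset(combo))
    return found


def mine_concepts(context: Dict[str, Set[str]]) -> Dict[int, List[Concept]]:
    """Group concepts by intent size k, keeping extents not regenerated at size k + 1."""
    # Order attributes by the number of objects containing them
    # (ascending order is an assumption; ties are broken by name).
    support: Dict[str, int] = {}
    for has in context.values():
        for attr in has:
            support[attr] = support.get(attr, 0) + 1
    attributes = sorted(support, key=lambda attr: (support[attr], attr))

    concepts_by_size: Dict[int, List[Concept]] = {}
    current = extents_for_size(context, attributes, 0)
    for k in range(len(attributes) + 1):
        # combinations() yields nothing once k + 1 exceeds the attribute count.
        following = extents_for_size(context, attributes, k + 1)
        kept = [(extent, intent) for extent, intent in current.items()
                if extent not in following]
        if kept:
            concepts_by_size[k] = kept
        current = following
    return concepts_by_size
```

The rule works because an extent generated by a k-sized attribute set reappears at size k + 1 exactly when some further attribute is shared by all of its objects, that is, when the generating set is not yet the full intent. Calling `mine_concepts(CONTEXT)` returns one list per intent size, mirroring the abstract's one-RDD-per-k grouping.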

Acknowledgments

The authors would like to thank Dr. Sriram Kailasam, Assistant Professor at IIT Mandi, for his help in improving the manuscript.

Author information

Corresponding author

Correspondence to Vishal Goel.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Goel, V., Chaudhary, B.D. (2015). Concept Discovery from Un-Constrained Distributed Context. In: Kumar, N., Bhatnagar, V. (eds) Big Data Analytics. BDA 2015. Lecture Notes in Computer Science, vol 9498. Springer, Cham. https://doi.org/10.1007/978-3-319-27057-9_11

  • DOI: https://doi.org/10.1007/978-3-319-27057-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27056-2

  • Online ISBN: 978-3-319-27057-9

  • eBook Packages: Computer Science, Computer Science (R0)
