Abstract
Formal Concept Analysis (FCA) is a method for analysing a data set that consists of a binary relation matrix between objects and their attributes. It discovers concepts, each of which describes a special kind of relationship between a set of attributes and a set of objects. These concepts are related to each other and are arranged in a hierarchy. FCA finds application in several areas, including data mining, machine learning and the semantic web.
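As an illustration of the relationship a concept captures, the following minimal Python sketch uses the standard FCA definition: a pair (objects, attributes) is a concept when the attributes are exactly those shared by all the objects, and the objects are exactly those having all the attributes. The toy context and the helper names (`common_attributes`, `common_objects`, `is_concept`) are assumptions made for this example and do not come from the paper.

```python
# Toy formal context: each object maps to the set of attributes it has.
# The data and all function names here are hypothetical, for illustration only.
context = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
}


def common_attributes(ctx, objects):
    """Attributes shared by every object in `objects`."""
    all_attrs = set().union(*ctx.values())
    return frozenset(a for a in all_attrs if all(a in ctx[o] for o in objects))


def common_objects(ctx, attrs):
    """Objects that have every attribute in `attrs`."""
    return frozenset(o for o, has in ctx.items() if set(attrs) <= has)


def is_concept(ctx, objects, attrs):
    """True when each side of the pair determines the other exactly."""
    return (common_attributes(ctx, objects) == frozenset(attrs)
            and common_objects(ctx, attrs) == frozenset(objects))


# {o1, o3} share exactly {a, b}, and only o1 and o3 have both a and b.
print(is_concept(context, {"o1", "o3"}, {"a", "b"}))
# {a} alone is not an intent: every object with a also has b.
print(is_concept(context, {"o1", "o3"}, {"a"}))
```

In FCA terminology the object set of a concept is its extent and the attribute set is its intent.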
A few iterative MapReduce-based algorithms have been proposed to mine concepts from a given data set. These algorithms either copy the entire data set (context) onto each node or partition it in a specific manner. They assume that all attributes are known a priori and are ordered, and they iterate based on that ordering. In some applications these assumptions limit the scalability of the algorithms.
In this paper, we present a concept mining algorithm that does not assume a priori knowledge of all attributes and permits the context to be distributed across nodes in an arbitrary manner. Our algorithm uses the Apache Spark framework to discover and eliminate redundant concepts in each iteration. When we aggregate data on an attribute basis, we order the attributes by the number of objects containing them. Our method relies on finding extents for combinations of attributes of a particular size k. An extent that is not regenerated by any attribute combination of size k + 1 corresponds to a valid concept. All concepts with a particular intent size k are saved in one Resilient Distributed Dataset (RDD). We have tested our algorithm on two data sets and compared its performance with that of an earlier algorithm.
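The level-wise rule stated above (an extent produced at size k that is not regenerated at size k + 1 is a concept with intent size k) can be sketched on a single machine as follows. This is a brute-force, in-memory illustration in plain Python under our own assumptions. It is not the authors' Spark implementation, it omits the attribute ordering and the distribution across nodes, and the toy context and function names are hypothetical.

```python
from itertools import combinations

# Toy formal context (hypothetical data): object -> set of its attributes.
context = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
}


def extent(ctx, attrs):
    """Objects that have every attribute in `attrs`."""
    return frozenset(o for o, has in ctx.items() if set(attrs) <= has)


def extents_of_size(ctx, attributes, k):
    """Map each extent to an attribute combination of size k that generates it."""
    return {extent(ctx, combo): frozenset(combo)
            for combo in combinations(attributes, k)}


def concepts_by_level(ctx):
    """Enumerate (extent, intent) pairs level by level over intent size k."""
    attributes = sorted(set().union(*ctx.values()))
    concepts = []
    current = extents_of_size(ctx, attributes, 0)
    for k in range(len(attributes) + 1):
        # Beyond the number of attributes there are no combinations,
        # so the last level's extents are never regenerated.
        following = extents_of_size(ctx, attributes, k + 1)
        for ext, intent in current.items():
            if ext not in following:  # not regenerated at size k + 1
                concepts.append((ext, intent))
        current = following
    return concepts


for ext, intent in concepts_by_level(context):
    print(sorted(ext), sorted(intent))
```

The rule works because an attribute set that is not closed can always be extended by one more attribute without changing its extent, so only closed attribute sets (intents) survive the comparison with the next level. Enumerating every combination is exponential in the number of attributes, so this sketch is only suitable for very small contexts.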
Acknowledgments
The authors would like to thank Dr. Sriram Kailasam, Assistant Professor at IIT Mandi, for his help in improving the manuscript.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Goel, V., Chaudhary, B.D. (2015). Concept Discovery from Un-Constrained Distributed Context. In: Kumar, N., Bhatnagar, V. (eds) Big Data Analytics. BDA 2015. Lecture Notes in Computer Science, vol 9498. Springer, Cham. https://doi.org/10.1007/978-3-319-27057-9_11
DOI: https://doi.org/10.1007/978-3-319-27057-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27056-2
Online ISBN: 978-3-319-27057-9
eBook Packages: Computer Science, Computer Science (R0)