Towards a Scalable Approach for Mining Frequent Patterns from the Linked Open Data Cloud

  • Rajesh Mahule
  • O. P. Vyas
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 27)

Abstract

In recent years, the linked data principles have become one of the prominent ways to interlink and publish datasets on the web, turning the web into a big data store. Data published in RDF form and available as open data on the web opens up a new dimension for discovering knowledge from heterogeneous sources. The major problems with linked open data are its heterogeneity and massive volume, along with the preprocessing required for its consumption. The massive volume also runs up against the high memory requirements of the data structures used by mining methods, in addition to the overheads of the mining process itself. This paper proposes to extract and store the RDF dumps available for the source data from the linked open data cloud, which can then be retrieved and put in a format suitable for mining, and suggests the applicability of an efficient method to generate frequent patterns from these huge volumes of data without any constraint on memory requirements.
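
As a rough illustration of the pipeline the abstract outlines (RDF triples put into a format for mining, then mined for frequent patterns), the sketch below groups triples by subject into transactions and counts itemsets. The toy triples, the subject-as-transaction grouping, and the function names are all assumptions made for illustration. The counting is a brute-force, in-memory pass, not the memory-efficient method the paper suggests.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy triples (subject, predicate, object); not taken from the paper.
TRIPLES = [
    ("paper1", "type", "Article"),
    ("paper1", "venue", "ISWC"),
    ("paper1", "year", "2011"),
    ("paper2", "type", "Article"),
    ("paper2", "venue", "ISWC"),
    ("paper3", "type", "Article"),
    ("paper3", "venue", "VLDB"),
]


def triples_to_transactions(triples):
    """Group triples by subject; each predicate=object pair becomes one item."""
    transactions = defaultdict(set)
    for subject, predicate, obj in triples:
        transactions[subject].add(f"{predicate}={obj}")
    return dict(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets up to max_size by brute force; keep those meeting min_support."""
    counts = Counter()
    for items in transactions.values():
        for size in range(1, max_size + 1):
            # Sorting gives each itemset one canonical tuple form.
            for itemset in combinations(sorted(items), size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


transactions = triples_to_transactions(TRIPLES)
patterns = frequent_itemsets(transactions, min_support=2)
for itemset, support in sorted(patterns.items()):
    print(itemset, support)
```

Enumerating every combination grows quickly with transaction width, which is the kind of memory and processing overhead that motivates the tree-based, pattern-growth approaches in the frequent pattern mining literature.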

Keywords

Linked Data Mining, Data Mining, Semantic Web Data Mining, RDF Data Mining

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Information Technology, Indian Institute of Information Technology, Allahabad, India
