Exploration of Document Classification with Linked Data and PageRank

Conference paper

DOI: 10.1007/978-3-319-01571-2_6

Part of the Studies in Computational Intelligence book series (SCI, volume 511)
Cite this paper as:
Dostal M., Nykl M., Ježek K. (2014) Exploration of Document Classification with Linked Data and PageRank. In: Zavoral F., Jung J., Badica C. (eds) Intelligent Distributed Computing VII. Studies in Computational Intelligence, vol 511. Springer, Cham

Abstract

We present a new approach to document classification that uses Linked Data and PageRank. Our research focuses on classification methods enhanced by semantic information, which can be obtained from an ontology or from Linked Data; in our case, DBpedia served as the Linked Data source. Because feature selection is semantically based, the resulting features are in a human-readable form that non-expert users can recognize and understand. PageRank is used during the feature selection and generation phase to expand basic features into more general representatives, so both feature selection and PageRank processing rely on network relations obtained from Linked Data. The discovered features can be used by standard classification algorithms. We present promising results showing that this approach can be applied easily to two different datasets.
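To make the idea of expanding basic features into more general representatives concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy directed graph whose edges point from a resource to a broader category (loosely in the spirit of DBpedia category links), and it uses a seed-biased ("personalized") PageRank computed by plain power iteration. The node names, the edge list, the damping factor, and the helper names `personalized_pagerank` and `expand_features` are all hypothetical.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank whose restarts land only on the seed nodes.

    Assumption: every seed appears in at least one edge.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    # Restart distribution: uniform over the seeds, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)

    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            targets = out_links.get(n)
            if targets:
                # Split this node's damped rank evenly over its out-links.
                share = damping * rank[n] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: hand its damped rank back to the seeds.
                for s in nodes:
                    new_rank[s] += damping * rank[n] * restart[s]
        rank = new_rank
    return rank


def expand_features(edges, seeds, top_k=2):
    """Return the top_k highest-ranked non-seed nodes as generalized features."""
    rank = personalized_pagerank(edges, seeds)
    candidates = [n for n in rank if n not in seeds]
    candidates.sort(key=lambda n: rank[n], reverse=True)
    return candidates[:top_k]


# Hypothetical toy graph: (resource, broader category) pairs.
edges = [
    ("Ice_hockey", "Team_sports"),
    ("Football", "Team_sports"),
    ("Tennis", "Racquet_sports"),
    ("Team_sports", "Sports"),
    ("Racquet_sports", "Sports"),
]
# Basic features assumed to have been recognized in a document.
seeds = {"Ice_hockey", "Football"}

general_features = expand_features(edges, seeds, top_k=2)
print(general_features)
```

In this toy graph the categories reachable from the seed features accumulate rank, while the unrelated branch (`Tennis`, `Racquet_sports`) receives none, which illustrates how network relations could promote broader, human-readable features. How the paper actually builds its graph and selects features is not specified in the abstract.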


Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic
  2. Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia, Pilsen, Czech Republic
