SAPOP: Semiautomatic Framework for Practical Ontology Population from Structured Knowledge Bases

  • Xinruo Sun
  • Haofen Wang
  • Yong Yu
Conference paper
Part of the Springer Proceedings in Complexity book series (SPCOM)


The Semantic Web is evolving quickly, and many theories and tools already exist for modeling various kinds of semantics with ontologies. However, once an organization has modeled the ontology structure, the ontology must still be filled with instances and relationships to be practical. This process, ontology population, can be hard because it faces a cold-start problem. On the other hand, potential instances may already exist as structured data in Linked Open Data (LOD), online encyclopedias, or corporate databases. We hypothesize that these instances, along with their related features, can substantially alleviate the cold-start problem, and we present a practical framework to verify this hypothesis. In this framework, a semiautomated seed discovery method first bootstraps the population. Semi-supervised learning methods then refine and expand the seed instances. Finally, the population quality is verified using an effective evaluation process.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. APEX Data & Knowledge Management Lab, Shanghai Jiao Tong University, Shanghai, China
