Multi-source knowledge integration based on machine learning algorithms for domain ontology


This paper proposes a new approach to building domain ontologies automatically with machine learning algorithms, and uses it to build a large-scale e-Gov ontology. The advent of the knowledge graph era places higher demands on semantic search and analysis. Traditional manual ontology construction depends on domain experts, so building a large-scale ontology by hand consumes considerable time and resources, and the scale of the resulting ontology remains limited. The proposed approach compensates for the thesaurus's limited description of the semantic relations between terms, and it uses the massive knowledge in online encyclopedias together with a typical machine learning similarity algorithm to fill the domain ontology automatically. In this way the strengths of the two knowledge sources are combined and the system as a whole benefits. Ultimately, this may provide a foundation for constructing knowledge graphs and for semantic-oriented applications.
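The abstract does not say which similarity algorithm is used or how the two sources are represented, so the following is only a minimal sketch under stated assumptions: the thesaurus is taken to be a mapping from each term to its broader term, the encyclopedia a mapping from entry titles to attribute dictionaries, and similarity is cosine similarity over character bigrams. The names `bigrams`, `cosine`, `fill_ontology` and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def bigrams(term: str) -> Counter:
    """Character-bigram counts for a term (an assumed feature choice)."""
    t = term.lower().replace(" ", "")
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fill_ontology(thesaurus, encyclopedia, threshold=0.5):
    """Attach each encyclopedia entry's attributes to its most similar thesaurus concept.

    thesaurus:    {term: broader_term_or_None}   (assumed input shape)
    encyclopedia: {entry_title: {attribute: value}}   (assumed input shape)
    """
    # Start from the thesaurus hierarchy; attributes are filled in below.
    ontology = {c: {"broader": b, "attributes": {}} for c, b in thesaurus.items()}
    for entry, attrs in encyclopedia.items():
        entry_vec = bigrams(entry)
        # Pick the thesaurus concept with the highest similarity to this entry.
        best, score = max(
            ((c, cosine(entry_vec, bigrams(c))) for c in thesaurus),
            key=lambda pair: pair[1],
            default=(None, 0.0),
        )
        # Entries that match no concept well enough are skipped.
        if best is not None and score >= threshold:
            ontology[best]["attributes"].update(attrs)
    return ontology


# Toy data, invented purely for illustration.
thesaurus = {"tax policy": "public finance", "public finance": None}
encyclopedia = {
    "Tax Policy": {"issued by": "Ministry of Finance"},
    "weather": {"type": "climate"},
}
ontology = fill_ontology(thesaurus, encyclopedia)
```

Here the thesaurus supplies the concept hierarchy and the encyclopedia supplies attributes, which mirrors the division of labour the abstract describes. The threshold keeps unrelated entries such as "weather" out of the ontology. A real system would likely need a stronger similarity measure than surface bigrams.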







This work was supported by the Scientific Research Project of Beijing Municipal Education Commission (General Social Science Project) and the Youth Excellent Teachers Grant of Capital University of Economics and Business (No. 23491854840429).

Author information

Correspondence to Ting Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Wang, T., Gu, H., Wu, Z. et al. Multi-source knowledge integration based on machine learning algorithms for domain ontology. Neural Comput & Applic 32, 235–245 (2020).



  • Domain ontology
  • Thesaurus
  • Online encyclopedia
  • Similarity computing