Abstract
This study concerns the construction of a domain ontology from web tables in a specific domain. An ontology defines the common terms and their meanings (concepts) within a context, so only meaningful tables are of concern here. A meaningful table is composed of a head and a body, formatted in rows and columns, and the head abstracts the meaning expressed in the body. To obtain a framework for table information extraction, this study therefore extracts from the head, as prerequisite work, the structural semantics, that is, the domain ontology that frames web-table information. We propose a method for automatically extracting a domain ontology using the structural and semantic characteristics of the web-table head. Construction of the domain ontology proceeds in two steps: (a) extracting a table schema, as a pseudo-ontology, from each table in the same domain and (b) constructing the domain ontology by combining the extracted table schemata. The schemata are combined through splitting and clustering, using (a) statistical information and (b) heuristics based on the structural and semantic characteristics of the web-table head.
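The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' method. It assumes that the table head is marked up with <th> cells, that a "schema" is simply the list of normalized head labels, and that the "statistical information" is the share of tables in which a label occurs. The names HeadParser, extract_schema, combine_schemata and min_support are hypothetical, and the splitting, clustering and heuristic stages are not modelled.

```python
from collections import Counter
from html.parser import HTMLParser


class HeadParser(HTMLParser):
    """Collects the text of <th> cells (assumption: <th> marks the table head)."""

    def __init__(self):
        super().__init__()
        self.labels = []
        self._in_th = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "th":
            self._in_th = True
            self._buf = []

    def handle_data(self, data):
        if self._in_th:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "th" and self._in_th:
            # Normalize whitespace and case so "Title" and "title" match.
            label = " ".join("".join(self._buf).split()).lower()
            if label:
                self.labels.append(label)
            self._in_th = False


def extract_schema(table_html):
    """Step (a), simplified: head labels of one table as a pseudo-ontology."""
    parser = HeadParser()
    parser.feed(table_html)
    parser.close()
    return parser.labels


def combine_schemata(schemata, min_support=0.5):
    """Step (b), simplified: keep labels found in at least min_support of the tables."""
    counts = Counter(label for schema in schemata for label in set(schema))
    threshold = min_support * len(schemata)
    return sorted(label for label, n in counts.items() if n >= threshold)


# Hypothetical tables from one domain (book listings).
tables = [
    "<table><tr><th>Title</th><th>Price</th></tr><tr><td>A</td><td>10</td></tr></table>",
    "<table><tr><th>title</th><th>Author</th></tr><tr><td>B</td><td>C</td></tr></table>",
    "<table><tr><th>Title</th><th>Price</th><th>ISBN</th></tr></table>",
]
schemata = [extract_schema(t) for t in tables]
domain_concepts = combine_schemata(schemata)
```

With these three toy tables, only the labels shared by at least half of the tables survive the combination step. Real web tables often mark heads without <th>, which is why head detection is treated as a separate problem in the literature.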
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Jung, Sw., Kang, My., Kwon, Hc. (2007). Constructing Domain Ontology Using Structural and Semantic Characteristics of Web-Table Head. In: Okuno, H.G., Ali, M. (eds) New Trends in Applied Artificial Intelligence. IEA/AIE 2007. Lecture Notes in Computer Science, vol 4570. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73325-6_66
DOI: https://doi.org/10.1007/978-3-540-73325-6_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73322-5
Online ISBN: 978-3-540-73325-6
eBook Packages: Computer Science, Computer Science (R0)