Abstract
The objective of this research is to present an innovative technique for extracting and presenting knowledge in construction documents. A construction project can generate a huge number of documents, such as contracts, correspondence, meeting minutes, and quality and safety reports. Traditional document management methods cannot automatically process the information within these documents. Natural language processing is a promising tool for improving information extraction and knowledge management. In this article, we use a conditional random field model to extract domain terms from construction documents. Based on the extraction results, we transform the contract into a knowledge graph. We then visualize the knowledge graphs and uncover some tacit knowledge.
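The pipeline described above (sequence labeling, term extraction, graph construction) can be illustrated with a minimal sketch. It is not the authors' implementation: the B/I/O tagging scheme, the clause-level co-occurrence rule for edges, the function names `decode_bio` and `build_graph`, and the English sample clauses are all assumptions made for illustration. The tags are written by hand to stand in for the output of a trained conditional random field tagger, and tokens are joined with spaces, which would need to change for Chinese text.

```python
from collections import defaultdict
from itertools import combinations


def decode_bio(tokens, tags):
    """Collect terms from BIO tags: B starts a term, I continues it, O ends it."""
    terms, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                terms.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            if current:
                terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


def build_graph(clauses):
    """Link every pair of distinct terms that appear in the same clause."""
    graph = defaultdict(set)
    for tokens, tags in clauses:
        terms = sorted(set(decode_bio(tokens, tags)))
        for a, b in combinations(terms, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical, hand-tagged clauses standing in for CRF tagger output.
clauses = [
    (["the", "contractor", "shall", "submit", "the", "payment", "application"],
     ["O", "B", "O", "O", "O", "B", "I"]),
    (["the", "employer", "shall", "approve", "the", "payment", "application"],
     ["O", "B", "O", "O", "O", "B", "I"]),
]

graph = build_graph(clauses)
# Degree (number of neighbors) is one simple way to rank terms in the graph.
degree = {term: len(neighbors) for term, neighbors in graph.items()}
```

In this toy graph, "payment application" is connected to both "contractor" and "employer", so it has the highest degree. A hub of this kind is the sort of pattern that graph visualization could bring to the surface, although the measures actually used in the study are not stated in the abstract.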
Acknowledgements
This study is supported by the Fundamental Research Funds for the Central Universities, China.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, Q., Hong, Z., Su, X. (2021). Content Analysis Based on Knowledge Graph: A Practice on Chinese Construction Contracts. In: Ye, G., Yuan, H., Zuo, J. (eds) Proceedings of the 24th International Symposium on Advancement of Construction Management and Real Estate. CRIOCM 2019. Springer, Singapore. https://doi.org/10.1007/978-981-15-8892-1_59
DOI: https://doi.org/10.1007/978-981-15-8892-1_59
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8891-4
Online ISBN: 978-981-15-8892-1
eBook Packages: Business and Management (R0)