Content Analysis Based on Knowledge Graph: A Practice on Chinese Construction Contracts

  • Conference paper
Proceedings of the 24th International Symposium on Advancement of Construction Management and Real Estate (CRIOCM 2019)

Abstract

The objective of this research is to present an innovative technique for extracting and presenting knowledge in construction documents. A construction project can generate a huge number of documents, such as contracts, correspondence, meeting minutes, and quality and safety reports. Traditional document management methods cannot automatically process the information within these documents. Natural language processing is a promising tool for improving information extraction and knowledge management. In this article, we use a conditional random field model to extract domain terms from construction documents. Based on the extraction results, we convert the contract into a knowledge graph. We then visualize the resulting knowledge graph and uncover some tacit knowledge.
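As a rough illustration of the pipeline the abstract outlines, the sketch below takes token sequences that a sequence tagger (such as a conditional random field) has already labelled, collects the labelled spans into domain terms, and links terms that appear in the same clause. The abstract does not specify the tag scheme, the features, or how graph edges are defined, so the B/I/O labels, the co-occurrence rule, the function names, and the sample clauses are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def bio_to_terms(tokens, tags):
    """Collect token spans labelled B/I (assumed tag scheme) into terms."""
    terms, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new term starts; close any term that was still open.
            if current:
                terms.append("".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open span: close any open term.
            if current:
                terms.append("".join(current))
            current = []
    if current:
        terms.append("".join(current))
    return terms


def build_graph(clauses):
    """Link terms that co-occur in a clause (an assumed edge rule).

    Returns a dict mapping a sorted (term, term) pair to the number of
    clauses in which both terms appear.
    """
    graph = defaultdict(int)
    for tokens, tags in clauses:
        terms = sorted(set(bio_to_terms(tokens, tags)))
        for a, b in combinations(terms, 2):
            graph[(a, b)] += 1
    return dict(graph)


# Hypothetical, hand-labelled clauses standing in for tagger output.
# 承包人 = contractor, 发包人 = employer, 进度计划 = schedule plan.
clauses = [
    (["承包", "人", "应", "提交", "进度", "计划"], ["B", "I", "O", "O", "B", "I"]),
    (["发包", "人", "审批", "进度", "计划"], ["B", "I", "O", "B", "I"]),
]
graph = build_graph(clauses)
for (a, b), weight in graph.items():
    print(a, b, weight)
```

In this toy input, the term 进度计划 is linked to both parties, which is the kind of relationship a visualized graph can surface. A real system would replace the hand-written labels with the output of a trained tagger and would likely use typed relations rather than plain co-occurrence.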

Acknowledgements

This study is supported by the Fundamental Research Funds for the Central Universities, China.

Author information

Corresponding author

Correspondence to Xing Su.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, Q., Hong, Z., Su, X. (2021). Content Analysis Based on Knowledge Graph: A Practice on Chinese Construction Contracts. In: Ye, G., Yuan, H., Zuo, J. (eds) Proceedings of the 24th International Symposium on Advancement of Construction Management and Real Estate. CRIOCM 2019. Springer, Singapore. https://doi.org/10.1007/978-981-15-8892-1_59
