Abstract
Acquiring timely and accurate information about host countries is crucial to the successful and profitable delivery of international construction projects. This information, however, commonly exists as unstructured text such as news articles and reports, which calls for text mining. The aim of this research is to develop a prototype construction document management system for global contracts that provides user-needed information in a timely manner. The system, named UNI (User Needed Information)-Tacit, collects text data containing recent information on the global construction market using a web crawling algorithm, automatically tags each document with its most representative keywords based on Natural Language Processing, and visualizes the results as word clouds. The developed system, whose usefulness was validated through a survey, is expected to help collect and organize the latest issues in the global construction market, giving decision makers a better understanding of target countries.
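The automatic tagging step can be illustrated with a minimal sketch. The abstract does not state how keyword representativeness is scored, so TF-IDF weighting is assumed here purely for illustration; the function names, the stop-word list, and the sample documents are hypothetical and are not taken from UNI-Tacit.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not from the paper).
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "for", "and", "to", "is", "are", "by", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tag_documents(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    tags = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Relative term frequency times log inverse document frequency.
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        tags.append(ranked[:top_k])
    return tags


if __name__ == "__main__":
    sample = [
        "Vietnam highway tender delayed by flooding",
        "Saudi metro tender awarded",
        "Vietnam port expansion approved",
    ]
    for doc, doc_tags in zip(sample, tag_documents(sample, top_k=2)):
        print(doc, "->", doc_tags)
```

Under this assumption, terms shared by every document score zero and drop out of the tags, while the per-term scores could also serve as the weights that size words in a word cloud.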
Cite this article
Moon, S., Shin, Y., Hwang, BG. et al. Document Management System Using Text Mining for Information Acquisition of International Construction. KSCE J Civ Eng 22, 4791–4798 (2018). https://doi.org/10.1007/s12205-018-1528-y
DOI: https://doi.org/10.1007/s12205-018-1528-y