Segregating Discourse Segments from Engineering Documents for Knowledge Acquisition

  • Conference paper

Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 442)


The broader goal of the research described here is to automatically acquire diagnostic knowledge from documents in the domain of manual and mechanical assembly of aircraft structures. These documents are treated as a discourse that experts use to communicate with others, which makes it possible to apply discourse analysis to enable machine understanding of the text. The research challenge addressed in this paper is to identify documents, or sections of documents, that are potential sources of knowledge. In a subsequent step, domain knowledge will be extracted from these segments. The segmentation task requires partitioning the document into relevant segments and understanding the context of each segment. In discourse analysis, a discourse is divided into segments by means of indicative clauses called cue phrases, which signal changes in the discourse context. In formal documents, however, such language may not be used. Hence, the use of a domain-specific ontology and an assembly process model is proposed to segregate chunks of the text based on a local context. Elements of the ontology and model, together with their related terms, serve as indicators of the current context of a segment and of changes in context between segments. Local contexts are aggregated over increasingly larger segments to determine whether the document (or portions of it) pertains to the topic of interest, namely assembly. Knowledge acquired in this way can be reused at any stage of a product's lifecycle.
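The segmentation idea outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the function names, the rule that starts a new segment when no concept is shared, and the relevance threshold are all assumptions made purely for illustration.

```python
import re

# Hypothetical mini-ontology mapping concepts to related terms.
# Illustrative only; not the ontology or process model used in the paper.
ONTOLOGY = {
    "assembly_process": {"assemble", "fasten", "rivet", "drill", "join"},
    "part": {"rib", "spar", "skin", "panel", "bracket"},
    "tool": {"jig", "fixture", "clamp", "wrench"},
}


def local_context(sentence):
    """Return the ontology concepts whose terms occur in the sentence.

    Uses exact lowercase word matching (no stemming), an assumed simplification.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return {concept for concept, terms in ONTOLOGY.items() if words & terms}


def segment(sentences):
    """Group consecutive sentences that share at least one concept.

    A new segment starts when the sentence's local context has no concept in
    common with the running context of the current segment (assumed rule).
    Consecutive sentences with no ontology terms are grouped together.
    """
    segments = []
    for sentence in sentences:
        ctx = local_context(sentence)
        if segments:
            prev = segments[-1]["context"]
            same_context = bool(ctx & prev) or (not ctx and not prev)
        else:
            same_context = False
        if same_context:
            segments[-1]["sentences"].append(sentence)
            segments[-1]["context"] |= ctx
        else:
            segments.append({"sentences": [sentence], "context": set(ctx)})
    return segments


def is_about_assembly(segments, threshold=0.5):
    """Aggregate local contexts: is enough of the text in ontology-related segments?

    The threshold is an arbitrary illustrative value.
    """
    total = sum(len(seg["sentences"]) for seg in segments)
    relevant = sum(len(seg["sentences"]) for seg in segments if seg["context"])
    return total > 0 and relevant / total >= threshold


if __name__ == "__main__":
    text = [
        "Clamp the rib in the jig.",
        "Drill and rivet the skin to the rib.",
        "The meeting is scheduled for Monday.",
        "Lunch will be provided.",
    ]
    for seg in segment(text):
        print(sorted(seg["context"]), seg["sentences"])
```

In this sketch the first two sentences fall into one segment because both mention a part, while the last two contain no ontology terms and form a separate segment with an empty context. A real system would need term normalisation and a far richer ontology and process model.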


  • Knowledge acquisition
  • Mechanical assembly
  • Discourse analysis
  • Segmentation



Copyright information

© 2014 IFIP International Federation for Information Processing

About this paper

Cite this paper

N., M., Gurumoorthy, B., Chakrabarti, A. (2014). Segregating Discourse Segments from Engineering Documents for Knowledge Acquisition. In: Fukuda, S., Bernard, A., Gurumoorthy, B., Bouras, A. (eds) Product Lifecycle Management for a Global Market. PLM 2014. IFIP Advances in Information and Communication Technology, vol 442. Springer, Berlin, Heidelberg.

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45936-2

  • Online ISBN: 978-3-662-45937-9

  • eBook Packages: Computer Science (R0)