EDCMS: A content management system for engineering documents

  • Shaofeng Liu
  • Chris McMahon
  • Mansur Darlington
  • Steve Culley
  • Peter Wild


Engineers often need to find specific pieces of information by sifting through long engineering documents, which is a tiring and time-consuming job. To address this issue, researchers are increasingly devoting attention to new ways of helping information users, including engineers, to access and retrieve document content. The research reported in this paper explores how to use the key technologies of document decomposition (the study of document structure), document mark-up (with eXtensible Mark-up Language (XML), HyperText Mark-up Language (HTML), and Scalable Vector Graphics (SVG)), and a facetted classification mechanism. Document content extraction is implemented in Java. The Engineering Document Content Management System (EDCMS) developed in this research demonstrates that information providers can make document content more accessible to information users, including engineers.

The main features of the EDCMS are:

1) The EDCMS enables users, especially engineers, to access and retrieve information at the content level rather than the document level. In other words, it provides the specific pieces of information that answer particular questions, so that engineers do not need to sift through a whole document to obtain the piece of information they require.

2) Users can access engineering document content through both the data and the metadata of a document.

3) Users can use the EDCMS to access and retrieve content objects, i.e. text, images and graphics (including engineering drawings) via multiple views and at different granularities based on decomposition schemes.
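
The features above can be illustrated with a minimal Java sketch. It is not the EDCMS implementation: the mark-up vocabulary (`document`, `chapter`, `section`, `paragraph`, `figure`), the `topic` attribute standing in for a classification facet, and the class and method names are all assumptions made for illustration. The sketch uses only the standard Java DOM API to show content-level access through metadata and at two granularities (section and paragraph), plus retrieval of links to graphics.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContentRetrievalSketch {

    // Hypothetical mark-up: element and attribute names are illustrative only.
    static final String SAMPLE =
          "<document title='Sample textbook'>"
        + "<chapter id='c1'>"
        + "<section id='s1' topic='geometric-modelling'>"
        + "<paragraph>Wireframe models store edges and vertices.</paragraph>"
        + "<figure href='fig1.svg'/>"
        + "</section>"
        + "<section id='s2' topic='process-planning'>"
        + "<paragraph>Process plans list machining operations.</paragraph>"
        + "</section>"
        + "</chapter>"
        + "</document>";

    // Parses an XML string into a DOM tree.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Coarse granularity: ids of the sections whose metadata matches the topic.
    static List<String> sectionIdsByTopic(Document doc, String topic) {
        List<String> ids = new ArrayList<>();
        NodeList sections = doc.getElementsByTagName("section");
        for (int i = 0; i < sections.getLength(); i++) {
            Element section = (Element) sections.item(i);
            if (topic.equals(section.getAttribute("topic"))) {
                ids.add(section.getAttribute("id"));
            }
        }
        return ids;
    }

    // Fine granularity: text of each paragraph inside the matching sections.
    static List<String> paragraphsByTopic(Document doc, String topic) {
        List<String> texts = new ArrayList<>();
        NodeList sections = doc.getElementsByTagName("section");
        for (int i = 0; i < sections.getLength(); i++) {
            Element section = (Element) sections.item(i);
            if (!topic.equals(section.getAttribute("topic"))) {
                continue;
            }
            NodeList paragraphs = section.getElementsByTagName("paragraph");
            for (int j = 0; j < paragraphs.getLength(); j++) {
                texts.add(paragraphs.item(j).getTextContent());
            }
        }
        return texts;
    }

    // Graphics content objects: links to the figures referenced in the document.
    static List<String> figureLinks(Document doc) {
        List<String> links = new ArrayList<>();
        NodeList figures = doc.getElementsByTagName("figure");
        for (int i = 0; i < figures.getLength(); i++) {
            links.add(((Element) figures.item(i)).getAttribute("href"));
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(sectionIdsByTopic(doc, "geometric-modelling"));
        System.out.println(paragraphsByTopic(doc, "process-planning"));
        System.out.println(figureLinks(doc));
    }
}
```

In this sketch the same marked-up source answers a query at section level or at paragraph level, which is the sense in which retrieval granularity depends on how the document has been decomposed.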

Experiments with the EDCMS have been conducted on semi-structured documents, a CADCAM textbook, and a set of project posters in the Engineering Design domain. Experimental results show that the system gives information users an effective means of accessing document content.


Keywords: Document content management, engineering design, decomposition schemes, document mark-up, facetted classification





Copyright information

© Institute of Automation, Chinese Academy of Sciences 2007

Authors and Affiliations

  • Shaofeng Liu (1)
  • Chris McMahon (1)
  • Mansur Darlington (1)
  • Steve Culley (1)
  • Peter Wild (1)

  1. Department of Mechanical Engineering, University of Bath, Bath, UK
