Extraction of Structured Rules from Web Pages and Maintenance of Mutual Consistency: XRML Approach

  • Juyoung Kang
  • Jae Kyu Lee
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2876)


Web pages present valuable knowledge for human comprehension in the form of text, tables, and mathematical notation. However, extracting structured rules from Web pages and keeping them up to date are not easy tasks. To tackle these problems, we adopt the eXtensible Rule Markup Language (XRML) framework. RIML (Rule Identification Markup Language) and RSML (Rule Structure Markup Language) are two XRML-compliant representations for this purpose. RIML identifies the implicit rules in Web pages, possibly drawing on multiple pages to form a rule or rule group. RSML specifies the complete rule structure to be processed by software agents or expert systems.

In this study, we cover natural text, tables, and implicit numeric functions embedded in text. To this end, we define the tags necessary for rule extraction and maintenance in XRML; typical ones include tags for rule grouping, tabular rules, numeric operators, and functions. The rule acquisition process consists of rule base design, rule identification with RIML, and rule structuring with RSML. We also describe the maintenance process for revisions that may occur either in Web pages or in structured rules. The approach is demonstrated with a shipping cost comparison among electronic bookstores.
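As a rough illustration of what turning a tabular shipping policy into structured rules might look like, the following Python sketch converts table rows into IF-THEN rules with a numeric cost function and compares two stores. It is not XRML, RIML, or RSML syntax, and it is not the authors' implementation. The names `Rule`, `rules_from_table`, `shipping_cost`, and `cheapest`, the store labels, and all fee values are hypothetical and invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Facts = Dict[str, float]

# Hypothetical row layout: (min_total, max_total, base_fee, per_item_fee)
Row = Tuple[float, float, float, float]


@dataclass
class Rule:
    """One structured IF-THEN rule: a condition plus a numeric cost function."""
    name: str
    condition: Callable[[Facts], bool]
    cost: Callable[[Facts], float]


def rules_from_table(store: str, rows: List[Row]) -> List[Rule]:
    """Turn each table row into one rule (an assumed, simplified mapping)."""
    rules = []
    for i, (lo, hi, base, per_item) in enumerate(rows):
        rules.append(Rule(
            name=f"{store}-row{i}",
            # Default arguments bind the current row's values to each lambda.
            condition=lambda f, lo=lo, hi=hi: lo <= f["order_total"] < hi,
            cost=lambda f, base=base, per_item=per_item: base + per_item * f["items"],
        ))
    return rules


def shipping_cost(rules: List[Rule], facts: Facts) -> float:
    """Apply the first rule whose condition holds for the given facts."""
    for rule in rules:
        if rule.condition(facts):
            return rule.cost(facts)
    raise ValueError("no rule matches the given facts")


# Invented illustrative policies, not taken from any real bookstore.
STORES = {
    "A": rules_from_table("A", [(0, 50, 3.0, 1.0), (50, float("inf"), 0.0, 0.0)]),
    "B": rules_from_table("B", [(0, float("inf"), 2.0, 0.5)]),
}


def cheapest(facts: Facts) -> str:
    """Return the label of the store with the lowest shipping cost."""
    costs = {name: shipping_cost(rules, facts) for name, rules in STORES.items()}
    return min(costs, key=costs.get)


if __name__ == "__main__":
    print(cheapest({"order_total": 30, "items": 2}))
    print(cheapest({"order_total": 60, "items": 2}))
```

Keeping each table row as a separate rule mirrors the idea that a revision to one row of a Web page should map to a revision of one structured rule, which is the kind of consistency the maintenance process is concerned with.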


Keywords: Rule Base; Structure Rule; Knowledge Engineer; Rule Extraction; Delivery Cost
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Juyoung Kang (1)
  • Jae Kyu Lee (1)
  1. Graduate School of Management, Korea Advanced Institute of Science and Technology, Seoul, Korea
