Tree-Structured Conditional Random Fields for Semantic Annotation

  • Jie Tang
  • Mingcai Hong
  • Juanzi Li
  • Bangyong Liang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4273)


A large volume of web content needs to be annotated with ontologies, a task called semantic annotation. Our empirical study shows that strong dependencies exist across different types of information: identifying one kind of information can help identify another. Conditional Random Fields (CRFs) are the state-of-the-art approach for modeling such dependencies to improve annotation. However, because information on a Web page is not necessarily laid out linearly, existing linear-chain CRFs are limited for semantic annotation. This paper addresses semantic annotation on hierarchically dependent data (hierarchical semantic annotation). We propose a Tree-structured Conditional Random Field (TCRF) model to better incorporate dependencies across hierarchically laid-out information, and we present methods for model-parameter estimation and annotation in TCRFs. Experimental results indicate that the proposed TCRFs can significantly outperform the existing linear-chain CRF model on hierarchical semantic annotation.
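To make the contrast with a linear chain concrete, the following is a minimal sketch of exact inference on a tree-shaped model. It is not the authors' TCRF, whose structure and features the abstract does not specify. It assumes a pure tree with parent-child edges only, and the toy tree `CHILDREN`, the label set `LABELS`, and the scoring functions `node_score` and `edge_score` are all hypothetical. It computes the log-partition function with one upward message pass and checks the result against brute-force enumeration.

```python
import itertools
import math

# Hypothetical toy tree: node -> list of children (parent-child edges only).
CHILDREN = {0: [1, 2], 1: [3], 2: [], 3: []}
LABELS = [0, 1]  # two hypothetical annotation labels


def node_score(node, label):
    # Illustrative log-potential for a node taking a label (made-up numbers).
    return 0.1 * (node + 1) * (1 if label == 1 else -1)


def edge_score(parent_label, child_label):
    # Illustrative log-potential rewarding parent-child label agreement.
    return 0.5 if parent_label == child_label else -0.5


def upward_message(node):
    """Return {label: log of the summed exp-scores of the subtree at node}."""
    child_msgs = [upward_message(child) for child in CHILDREN[node]]
    msg = {}
    for label in LABELS:
        total = node_score(node, label)
        for child_msg in child_msgs:
            # Sum out the child's label, given this node's label.
            total += math.log(sum(
                math.exp(edge_score(label, cl) + child_msg[cl])
                for cl in LABELS))
        msg[label] = total
    return msg


def log_partition(root=0):
    """Log-partition function via a single leaves-to-root pass."""
    root_msg = upward_message(root)
    return math.log(sum(math.exp(v) for v in root_msg.values()))


def brute_force_log_partition():
    """Same quantity by enumerating every joint labeling (for checking)."""
    nodes = sorted(CHILDREN)
    total = 0.0
    for assignment in itertools.product(LABELS, repeat=len(nodes)):
        y = dict(zip(nodes, assignment))
        score = sum(node_score(n, y[n]) for n in nodes)
        score += sum(edge_score(y[p], y[c])
                     for p in nodes for c in CHILDREN[p])
        total += math.exp(score)
    return math.log(total)
```

On a tree without cycles this pass is exact and touches each edge once. A model that adds further edges (for example between siblings) would contain cycles and would generally need approximate inference instead.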


Keywords: Support Vector Machine; Hidden Markov Model; Text Line; Conditional Random Field; Semantic Annotation



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jie Tang (1)
  • Mingcai Hong (1)
  • Juanzi Li (1)
  • Bangyong Liang (2)
  1. Department of Computer Science, Tsinghua University, Beijing, China
  2. NEC Labs China, Beijing, China
