The research and realization about automatic abstracting based on text clustering and natural language understanding

  • Research Article
Frontiers of Electrical and Electronic Engineering in China

Abstract

This article explores a method for automatic abstracting based on text clustering and natural language understanding, aimed at overcoming the shortcomings of some current methods. Because it makes use of text clustering, the method can produce abstracts of multiple documents. The twice word segmentation algorithm, which is based on the title and the first sentences of paragraphs, is investigated; its precision and recall are both above 95%. An automatic abstracting system named TCAAS is implemented for the specific domain of plastics, and the precision and recall of its multi-document automatic abstracting are both above 75%. The experiments also show that the method is a feasible basis for developing a domain-specific automatic abstracting system and that it merits further in-depth study.
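The abstract reports precision and recall for word segmentation but does not describe the evaluation protocol. As a minimal sketch, assuming the standard definitions (precision = correct / predicted, recall = correct / reference) and assuming segments are compared as character spans, the two figures could be computed as follows. The function names `to_spans` and `precision_recall` are illustrative and do not come from the article.

```python
def to_spans(tokens):
    """Convert a list of tokens into (start, end) character spans."""
    spans, start = [], 0
    for tok in tokens:
        spans.append((start, start + len(tok)))
        start += len(tok)
    return spans


def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted vs. reference items.

    Assumption: an item counts as correct only if it matches a
    reference item exactly.
    """
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Toy example: the reference splits "abcdef" into three words,
# while the hypothetical segmenter output merges the last two.
reference = to_spans(["ab", "cd", "ef"])
predicted = to_spans(["ab", "cdef"])
p, r = precision_recall(predicted, reference)
```

In this toy case only one of the two predicted spans matches the reference, so precision is 1/2 and recall is 1/3. The same function applies to abstracting if the items are, for example, the sentences selected for the abstract, although the article may have used a different unit of comparison.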



Author information

Corresponding author

Correspondence to Guo Qing-lin.

Additional information

Translated from Transactions of Beijing Institute of Technology (Natural Science Edition), 2005, 25(8): 705–709 (in Chinese)

About this article

Cite this article

Guo, Ql., Fan, Xz. & Liu, Ca. The research and realization about automatic abstracting based on text clustering and natural language understanding. Front. Electr. Electron. Eng. China 1, 460–464 (2006). https://doi.org/10.1007/s11460-006-0088-y

  • DOI: https://doi.org/10.1007/s11460-006-0088-y
