Extractive Single Document Summarization via Multi-feature Combination and Sentence Compression

Liu, Maofu; Yu, Yan; Qi, Qiaosong; Hu, Huijun; Ren, Han

doi:10.1007/978-3-319-73618-1_70


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10619)

Included in the following conference series:

National CCF Conference on Natural Language Processing and Chinese Computing


Abstract

In this paper, we extract and generate a short summary of a news article within a length limit of 60 Chinese characters. First, we preprocess the article by segmenting it into sentences and words, and then extract four kinds of central words from the parse tree to form a keyword dictionary. Next, four features, namely sentence weight, sentence similarity, sentence position, and sentence length, are used to measure the significance of each sentence. Finally, we extract the two sentences with the highest significance scores and compress them to obtain the summary of each news article. The compression step analyzes the grammatical elements of the original sentences to generate compression rules and trims syntactic elements according to their parse trees. The evaluation results show that our system is efficient in Chinese news summarization.
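The scoring-and-selection step described in the abstract can be illustrated with a minimal Python sketch. The abstract names the four features but does not define them or say how they are combined, so everything concrete here is an assumption made for illustration: the keyword-ratio definition of sentence weight, Jaccard overlap as the similarity measure, the linear position decay, the `ideal_chars` target length, the weighted-sum combination, and the weight values. The function names are hypothetical, and the parse-tree compression step is not shown because it depends on a parser and on rules the abstract does not specify.

```python
from __future__ import annotations


def keyword_weight(tokens: list[str], keywords: set[str]) -> float:
    """Assumed 'sentence weight': fraction of tokens in the keyword dictionary."""
    if not tokens:
        return 0.0
    return sum(token in keywords for token in tokens) / len(tokens)


def jaccard(a: list[str], b: list[str]) -> float:
    """Token-set overlap between two sentences (assumed similarity measure)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def mean_similarity(index: int, sentences: list[list[str]]) -> float:
    """Average similarity of one sentence to every other sentence."""
    others = [s for i, s in enumerate(sentences) if i != index]
    if not others:
        return 0.0
    return sum(jaccard(sentences[index], s) for s in others) / len(others)


def position_score(index: int, total: int) -> float:
    """Assumed position feature: earlier sentences score higher."""
    return 1.0 - index / total


def length_score(tokens: list[str], ideal_chars: int = 30) -> float:
    """Assumed length feature: penalize distance from a target character count."""
    n_chars = sum(len(token) for token in tokens)
    return max(0.0, 1.0 - abs(n_chars - ideal_chars) / ideal_chars)


def score_sentences(
    sentences: list[list[str]],
    keywords: set[str],
    weights: tuple[float, float, float, float] = (0.4, 0.3, 0.2, 0.1),
) -> list[float]:
    """Combine the four features as a weighted sum (weights are illustrative)."""
    w_kw, w_sim, w_pos, w_len = weights
    total = len(sentences)
    return [
        w_kw * keyword_weight(tokens, keywords)
        + w_sim * mean_similarity(i, sentences)
        + w_pos * position_score(i, total)
        + w_len * length_score(tokens)
        for i, tokens in enumerate(sentences)
    ]


def select_top(sentences: list[list[str]], keywords: set[str], k: int = 2) -> list[int]:
    """Return indices of the k highest-scoring sentences, best first."""
    scores = score_sentences(sentences, keywords)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

With `k=2`, `select_top` mirrors the abstract's choice of two sentences in descending order of significance. In a fuller pipeline, the selected sentences would then be passed to the compression step so that the result fits the 60-character limit.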



Acknowledgments

The work presented in this paper is partially supported by the Major Projects of the National Social Science Foundation of China (No. 11&ZD189), the Natural Science Foundation of China (No. 61402341), the Planning Foundation of the Wuhan Science and Technology Bureau (No. 2016060101010047), and the Open Foundation of Hubei Province Key Laboratory (No. 2016znss05A).

Author information

Authors and Affiliations

College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, China
Maofu Liu, Yan Yu, Qiaosong Qi & Huijun Hu
Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan University of Science and Technology, Wuhan, 430065, China
Maofu Liu, Yan Yu, Qiaosong Qi & Huijun Hu
Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou, 510006, China
Han Ren


Corresponding author

Correspondence to Han Ren.

Editor information

Editors and Affiliations

Fudan University, Shanghai, China
Xuanjing Huang
Singapore Management University, Singapore, Singapore
Jing Jiang
Peking University, Beijing, China
Dongyan Zhao
Peking University, Beijing, China
Yansong Feng
Soochow University, Suzhou, China
Yu Hong


About this paper

Cite this paper

Liu, M., Yu, Y., Qi, Q., Hu, H., Ren, H. (2018). Extractive Single Document Summarization via Multi-feature Combination and Sentence Compression. In: Huang, X., Jiang, J., Zhao, D., Feng, Y., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2017. Lecture Notes in Computer Science, vol 10619. Springer, Cham. https://doi.org/10.1007/978-3-319-73618-1_70


DOI: https://doi.org/10.1007/978-3-319-73618-1_70
Published: 05 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73617-4
Online ISBN: 978-3-319-73618-1
eBook Packages: Computer Science, Computer Science (R0)
