Automatic Control and Computer Sciences, Volume 41, Issue 3, pp 132–140

Summarization of text-based documents with a determination of latent topical sections and information-rich sentences

  • R. M. Alguliev
  • R. M. Alyguliev
Article

Abstract

A method is proposed for summarizing text-based documents. The method discovers latent topical sections and information-rich sentences. Its basis, the clustering of sentences, is formulated mathematically as a quadratic integer programming problem. An algorithm is developed that determines the optimal number of clusters to a specified precision. The synthesis of a neural network for solving the integer quadratic programming problem is described.
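
As a rough illustration of what a quadratic integer programming view of sentence clustering can look like, the sketch below uses a generic textbook-style formulation, which is an assumption and not necessarily the one in the article. Binary variables x[i][c] mark whether sentence i belongs to cluster c, and the objective sums pairwise dissimilarities d[i][j] * x[i][c] * x[j][c] inside each cluster. The names `cluster_cost`, `solve_by_enumeration`, and `TOY_DISSIMILARITY` are hypothetical, the dissimilarity values are invented toy data, and exhaustive enumeration stands in for the neural network solver mentioned in the abstract.

```python
from itertools import product

# Invented toy data: pairwise dissimilarities between four sentences.
# Sentences 0 and 1 are close to each other, as are sentences 2 and 3.
TOY_DISSIMILARITY = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.90, 0.20, 0.00],
]


def cluster_cost(x, d):
    """Quadratic objective: sum of d[i][j] * x[i][c] * x[j][c]
    over all clusters c and all sentence pairs i < j."""
    n, k = len(x), len(x[0])
    return sum(
        d[i][j] * x[i][c] * x[j][c]
        for c in range(k)
        for i in range(n)
        for j in range(i + 1, n)
    )


def solve_by_enumeration(d, k):
    """Try every assignment of n sentences to k non-empty clusters.
    Only practical for tiny n; this is not the solver from the article."""
    n = len(d)
    best_cost, best_labels = float("inf"), None
    for labels in product(range(k), repeat=n):
        if len(set(labels)) < k:  # constraint: every cluster is non-empty
            continue
        # Binary assignment matrix: each row has exactly one 1.
        x = [[1 if labels[i] == c else 0 for c in range(k)] for i in range(n)]
        cost = cluster_cost(x, d)
        if cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_labels, best_cost


if __name__ == "__main__":
    labels, cost = solve_by_enumeration(TOY_DISSIMILARITY, k=2)
    print(labels, round(cost, 3))
```

On the toy data, the cheapest assignment groups sentences 0 and 1 together and sentences 2 and 3 together. The search space grows as k to the power n, which is one reason a dedicated solver is needed for real documents.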

Key words

summarization, clustering, optimal number of clusters, information-rich sentence, neural networks

Copyright information

© Allerton Press, Inc. 2007

Authors and Affiliations

  • R. M. Alguliev (1)
  • R. M. Alyguliev (1)
  1. Institute of Information Technologies, National Academy of Sciences of Azerbaijan, Baku, Azerbaijan
