Experiments in Newswire Summarisation

  • Stuart Mackie
  • Richard McCreadie
  • Craig Macdonald
  • Iadh Ounis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9626)

Abstract

In this paper, we investigate extractive multi-document summarisation algorithms over newswire corpora. Examining recent findings, baseline algorithms, and state-of-the-art systems is pertinent given the current research interest in event tracking and summarisation. We first reproduce previous findings from the literature, validating that automatic summarisation evaluation is a useful proxy for manual evaluation, and validating that several state-of-the-art systems with similar automatic evaluation scores create different summaries from one another. Following this verification of previous findings, we then reimplement various baseline and state-of-the-art summarisation algorithms, and make several observations from our experiments. Our findings include: an optimised Lead baseline; indication that several standard baselines may be weak; evidence that the standard baselines can be improved; results showing that the most effective improved baselines are not statistically significantly less effective than the current state-of-the-art systems; and finally, observations that manually optimising the choice of anti-redundancy components, per topic, can lead to improvements in summarisation effectiveness.
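
To make two of the terms above concrete, the following is a minimal sketch of a Lead-style baseline combined with a cosine-similarity anti-redundancy filter. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names, the round-robin ordering over documents, the 100-word budget and the 0.5 similarity threshold are all hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lead_summary(documents, max_words=100, sim_threshold=0.5):
    """Select leading sentences, skipping near-duplicates.

    documents: list of documents, each a list of sentence strings.
    max_words and sim_threshold are illustrative values (assumptions),
    not settings reported in the paper.
    """
    summary, vectors, length = [], [], 0
    # Assumed ordering: first sentence of every document, then second, etc.
    depth = max((len(d) for d in documents), default=0)
    for i in range(depth):
        for doc in documents:
            if i >= len(doc):
                continue
            sentence = doc[i]
            tokens = tokenize(sentence)
            # Skip empty sentences and sentences that would exceed the budget.
            if not tokens or length + len(tokens) > max_words:
                continue
            vec = Counter(tokens)
            # Anti-redundancy: reject sentences too similar to a selected one.
            if any(cosine(vec, v) >= sim_threshold for v in vectors):
                continue
            summary.append(sentence)
            vectors.append(vec)
            length += len(tokens)
    return summary


if __name__ == "__main__":
    docs = [
        ["A storm hit the coast on Monday.", "Thousands lost power."],
        ["A storm hit the coast on Monday morning.", "Rescue teams were deployed."],
    ]
    for line in lead_summary(docs):
        print(line)
```

Swapping the similarity function or the threshold changes which sentences are treated as redundant, which is the kind of component choice the abstract reports can affect summarisation effectiveness on a per-topic basis.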

Keywords

Entropy 

Notes

Acknowledgements

Mackie acknowledges the support of EPSRC Doctoral Training grant 1509226. McCreadie, Macdonald and Ounis acknowledge the support of EC SUPER project (FP7-606853).

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Stuart Mackie (1)
  • Richard McCreadie (1)
  • Craig Macdonald (1)
  • Iadh Ounis (1)
  1. School of Computing Science, University of Glasgow, Glasgow, UK
