Microblog Contextualization: Advantages and Limitations of a Multi-sentence Compression Approach

  • Elvys Linhares Pontes
  • Stéphane Huet
  • Juan-Manuel Torres-Moreno
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11018)


The content analysis task of the MC2 CLEF 2017 lab aims to generate short summaries in four languages to contextualize microblogs. This paper analyzes the challenges of this task and details the advantages and limitations of our approach, which relies on cross-lingual compressive text summarization. We split the task into several subtasks in order to discuss their setup. In addition, we suggest an evaluation protocol to reduce the bias of the current metrics toward extractive approaches.


Keywords: Microblog contextualization, Multi-sentence compression, Word embedding, Wikipedia



This work was partially financed by the European Project CHISTERA-AMIS ANR-15-CHR2-0001.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Elvys Linhares Pontes (1, 2, 3)
  • Stéphane Huet (1)
  • Juan-Manuel Torres-Moreno (1, 2, 3)

  1. LIA, Université d’Avignon et des Pays de Vaucluse, Avignon, France
  2. Polytechnique Montréal, Montréal, Canada
  3. Université du Québec à Montréal, Montréal, Canada
