
TAGNet: a tiny answer-guided network for conversational question generation

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Conversational Question Generation (CQG) aims to generate conversational questions from a given passage and the conversation history. Previous work on CQG presumes that the answer is a contiguous span of the passage and generates a question targeting it. However, this assumption limits the application scenarios, because answers in practical conversations are usually abstractive free-form text rather than extractive spans. In addition, most state-of-the-art CQG systems are built on pretrained language models with hundreds of millions of parameters, which poses challenges for real-life applications due to latency and capacity constraints. To address these problems, we introduce the Tiny Answer-Guided Network (TAGNet), a CQG model built on lightweight Bi-LSTM modules. TAGNet takes the target answer as an explicit input, which interacts with the passage and conversation history in the encoder and guides question generation through a gated attention mechanism in the decoder. In addition, we distill knowledge from a larger pretrained language model into our smaller network to trade off performance against efficiency. Experimental results show that TAGNet achieves performance comparable to large pretrained language models (retaining \(95.9\%\) of the teacher's performance) while using \(5.7\times\) fewer parameters and running \(10.4\times\) faster at inference. TAGNet also outperforms the previous best-performing model of similar parameter size by a large margin, and further analysis shows that it generates more answer-specific conversational questions.
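To make the answer-guided decoding concrete, the sketch below shows one way a gated attention mechanism can let a pooled answer representation modulate the decoder's context vector. This is an illustration under stated assumptions, not the authors' implementation: the module name `GatedAnswerAttention`, the tensor shapes, and the pooled `answer_summary` input are all hypothetical.

```python
# Hypothetical sketch of an answer-guided gated attention step in the decoder.
# Names, shapes, and wiring are illustrative assumptions, not TAGNet's actual code.
import torch
import torch.nn as nn


class GatedAnswerAttention(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.attn = nn.Linear(2 * hidden_size, hidden_size)   # scores decoder state against encoder states
        self.v = nn.Linear(hidden_size, 1, bias=False)
        self.gate = nn.Linear(3 * hidden_size, hidden_size)   # answer-conditioned gate

    def forward(self, dec_state, enc_states, answer_summary):
        # dec_state:      (batch, hidden)          current decoder hidden state
        # enc_states:     (batch, src_len, hidden) encoder outputs for passage + history
        # answer_summary: (batch, hidden)          pooled representation of the target answer
        src_len = enc_states.size(1)
        query = dec_state.unsqueeze(1).expand(-1, src_len, -1)
        scores = self.v(torch.tanh(self.attn(torch.cat([query, enc_states], dim=-1))))
        weights = torch.softmax(scores, dim=1)                 # (batch, src_len, 1)
        context = (weights * enc_states).sum(dim=1)            # (batch, hidden)

        # Gate the attended context with the answer summary so that answer-relevant
        # information is emphasized when predicting the next question token.
        g = torch.sigmoid(self.gate(torch.cat([dec_state, context, answer_summary], dim=-1)))
        return g * context + (1.0 - g) * answer_summary


# Example usage with made-up sizes:
# layer = GatedAnswerAttention(hidden_size=256)
# out = layer(torch.randn(2, 256), torch.randn(2, 40, 256), torch.randn(2, 256))
```

The gate learns, per dimension, how much of the attended passage/history context versus the answer representation to pass on at each decoding step; the paper's actual formulation may differ.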


Data Availability Statement

The CoQA dataset supporting Tables 1–8 is available at https://stanfordnlp.github.io/coqa. The QuAC dataset supporting Table 9 is available at https://quac.ai/.

Notes

  1. B for the first token of the rationale sentence, I for the remaining tokens in that sentence, and O for all other tokens (see the toy example after these notes).

  2. This model achieves an F1 score of 74.4 on the CoQA development set.
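As a toy illustration of the B/I/O rationale-tagging scheme described in note 1 (the passage, tokenization, and rationale span below are invented for this example):

```python
# Toy example of the B/I/O rationale tagging described in note 1.
# The passage, tokenization, and rationale span are made up for illustration.
passage = ["The", "cat", "sat", "on", "the", "mat", ".", "It", "was", "tired", "."]
rationale = (7, 11)  # token span of the rationale sentence: "It was tired ."

tags = []
for i, _ in enumerate(passage):
    if i == rationale[0]:
        tags.append("B")   # first token of the rationale sentence
    elif rationale[0] < i < rationale[1]:
        tags.append("I")   # other tokens inside the rationale sentence
    else:
        tags.append("O")   # everything else
print(list(zip(passage, tags)))
```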


Acknowledgements

We thank the anonymous reviewers for their insightful feedback, which helped improve the paper. The research in this article is supported by the National Key Research and Development Project (2021YFF0901600), the National Science Foundation of China (U22B2059, 61976073, 62276083), the Shenzhen Foundational Research Funding (JCYJ20200109113441941), the Project of State Key Laboratory of Communication Content Cognition (A02101), and the Major Key Project of PCL (PCL2021A06). Ming Liu is the corresponding author.

Author information

Corresponding author

Correspondence to Ming Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, Z., Zhu, H., Liu, M. et al. TAGNet: a tiny answer-guided network for conversational question generation. Int. J. Mach. Learn. & Cyber. 14, 1921–1932 (2023). https://doi.org/10.1007/s13042-022-01737-x

