
A Framework for Math Word Problem Solving Based on Pre-training Models and Spatial Optimization Strategies

  • Conference paper
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1682)


Abstract

Automatic Math Word Problem (MWP) solving plays an important role in AI tutoring; it aims to generate the corresponding math expression and answer from the text of an MWP. To improve the applicability of MWP solving models, we optimize two aspects. First, to address the weak linguistic representation of RNN encoders, which limits the accuracy of MWP solvers, we propose using Bidirectional Encoder Representations from Transformers (BERT) as the encoder and combining it with a Transformer decoder to form the model framework. On the Math23K dataset this model reaches 82.6% accuracy, about 8% higher than GTS. However, pre-trained models tend to be large, which hinders deployment on a web server. We therefore propose a knowledge distillation strategy that integrates the teacher model's evaluation: by letting a student model patiently learn from and imitate the teacher through multi-layer distillation, the BERT-based model is compressed into a shallow student model. The student achieves 76.3% accuracy on Math23K while weighing only 0.61 times as much as the teacher and improving prediction speed by 1.71 times.
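The paper's implementation is not reproduced on this page. As a rough illustration only, the following PyTorch sketch shows the kind of framework the abstract describes: a pre-trained Chinese BERT encoder feeding a standard Transformer decoder that emits math-expression tokens. The class name, hyperparameters, and expression-vocabulary handling are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
from transformers import BertModel


class BertToExpression(nn.Module):
    """Sketch: BERT encoder + Transformer decoder that generates expression tokens."""

    def __init__(self, expr_vocab_size, d_model=768, nhead=8, num_decoder_layers=4):
        super().__init__()
        # Pre-trained Chinese BERT checkpoint (see the note below for the URL).
        self.encoder = BertModel.from_pretrained("bert-base-chinese")
        layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_decoder_layers)
        self.expr_embed = nn.Embedding(expr_vocab_size, d_model)
        self.out_proj = nn.Linear(d_model, expr_vocab_size)

    def forward(self, input_ids, attention_mask, expr_ids):
        # Encode the problem text; memory has shape (batch, seq_len, d_model).
        memory = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Decode expression tokens with teacher forcing and a causal mask.
        tgt = self.expr_embed(expr_ids)
        seq_len = expr_ids.size(1)
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=expr_ids.device),
            diagonal=1)
        hidden = self.decoder(tgt, memory, tgt_mask=causal_mask,
                              memory_key_padding_mask=~attention_mask.bool())
        return self.out_proj(hidden)  # logits over the expression vocabulary
```

At inference time such a decoder would generate the expression autoregressively (greedy or beam search) rather than with the teacher forcing shown here.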
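Likewise, here is a minimal sketch of a multi-layer ("patient") distillation objective in the spirit of the strategy described above, combining a hard-label loss, a softened-teacher KL term, and layer-wise hidden-state matching. The temperature, loss weights, and layer mapping are illustrative assumptions rather than the paper's settings.

```python
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits,
                      student_hiddens, teacher_hiddens,
                      labels, layer_map,
                      temperature=2.0, alpha=0.5, beta=0.1):
    vocab = student_logits.size(-1)
    # Hard-label loss against the ground-truth expression tokens.
    ce = F.cross_entropy(student_logits.view(-1, vocab), labels.view(-1))
    # Soft-label loss: the student imitates the teacher's softened distribution.
    kd = F.kl_div(F.log_softmax(student_logits / temperature, dim=-1),
                  F.softmax(teacher_logits / temperature, dim=-1),
                  reduction="batchmean") * temperature ** 2
    # "Patient" term: match selected intermediate hidden states layer by layer.
    pt = sum(F.mse_loss(student_hiddens[s], teacher_hiddens[t])
             for s, t in layer_map) / len(layer_map)
    return (1.0 - alpha) * ce + alpha * kd + beta * pt
```

With a 12-layer BERT teacher and, say, a 4-layer student, layer_map could pair student layers (0, 1, 2, 3) with teacher layers (2, 5, 8, 11); the mapping actually used in the paper is not given on this page.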

This work is partially supported by the National Natural Science Foundation (Project No. 62177015).


Notes

  1. https://huggingface.co/bert-base-chinese.


Author information

Corresponding author

Correspondence to Jing Xiao.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Fan, W., Xiao, J., Cao, Y. (2023). A Framework for Math Word Problem Solving Based on Pre-training Models and Spatial Optimization Strategies. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2022. Communications in Computer and Information Science, vol 1682. Springer, Singapore. https://doi.org/10.1007/978-981-99-2385-4_37


  • DOI: https://doi.org/10.1007/978-981-99-2385-4_37

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-2384-7

  • Online ISBN: 978-981-99-2385-4

  • eBook Packages: Computer Science, Computer Science (R0)
