
Ner4Opt: Named Entity Recognition for Optimization Modelling from Natural Language

  • Conference paper
  • In: Integration of Constraint Programming, Artificial Intelligence, and Operations Research (CPAIOR 2023)

Abstract

Solving combinatorial optimization problems follows a two-stage, model-and-run approach. First, a user formulates the problem at hand as an optimization model; then, given the model, a solver finds the solution. While optimization technology has enjoyed tremendous theoretical and practical advances, this overall process has remained the same for decades, and transforming problem descriptions into optimization models remains a barrier to entry. To relieve users of the cognitive task of modeling, we study named entity recognition to capture components of optimization models, such as the objective, variables, and constraints, from free-form natural-language text, and coin this problem Ner4Opt. We show how to solve Ner4Opt using both classical techniques based on morphological and grammatical properties and modern methods that leverage pre-trained large language models and fine-tune transformer architectures on optimization-specific corpora. For best performance, we present a hybridization of the two, combined with feature engineering and data augmentation, to exploit the language of optimization problems. We improve over the state of the art on annotated linear programming word problems, identify several next steps, and discuss important open problems toward automated modeling.
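To make the task concrete, the following is a minimal, purely illustrative sketch of the "classical" end of the spectrum described in the abstract: a rule-based tagger that marks objective-direction cues, constraint-direction phrases, and numeric limits with BIO labels. The label names (`OBJ_DIR`, `CONST_DIR`, `LIMIT`), the cue lists, and the tokenizer are assumptions chosen for illustration, not the paper's actual annotation scheme or method.

```python
import re
from typing import List, Tuple

# Hypothetical label vocabulary for illustration only; the actual
# Ner4Opt annotation scheme may differ.
OBJ_DIR = {"maximize", "minimize", "maximise", "minimise"}
CONST_DIR_BIGRAMS = {("at", "most"), ("at", "least")}

def tag_tokens(text: str) -> List[Tuple[str, str]]:
    """Assign BIO labels to tokens using simple lexical rules."""
    # Tokenize into lowercase words and numbers.
    tokens = re.findall(r"[a-z]+|\d+(?:\.\d+)?", text.lower())
    labels = ["O"] * len(tokens)
    for i, tok in enumerate(tokens):
        if tok in OBJ_DIR:
            labels[i] = "B-OBJ_DIR"    # objective-direction cue word
        elif re.fullmatch(r"\d+(?:\.\d+)?", tok):
            labels[i] = "B-LIMIT"      # numeric bound candidate
    # Two-token constraint-direction phrases such as "at most".
    for i in range(len(tokens) - 1):
        if (tokens[i], tokens[i + 1]) in CONST_DIR_BIGRAMS:
            labels[i], labels[i + 1] = "B-CONST_DIR", "I-CONST_DIR"
    return list(zip(tokens, labels))

print(tag_tokens("Maximize profit while working at most 40 hours."))
```

A real system, as the abstract notes, would combine such hand-crafted morphological and grammatical features with a transformer fine-tuned on optimization-specific corpora rather than rely on fixed word lists.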


Notes

  1. https://huggingface.co/spaces/skadio/ner4opt
  2. https://github.com/open-optimization
  3. https://github.com/nl4opt
  4. https://github.com/nl4opt/nl4opt-subtask1-baseline
  5. https://github.com/TeamHG-Memex/sklearn-crfsuite



Author information


Correspondence to Serdar Kadıoğlu.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Dakle, P.P. et al. (2023). Ner4Opt: Named Entity Recognition for Optimization Modelling from Natural Language. In: Cire, A.A. (eds) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2023. Lecture Notes in Computer Science, vol 13884. Springer, Cham. https://doi.org/10.1007/978-3-031-33271-5_20

  • DOI: https://doi.org/10.1007/978-3-031-33271-5_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-33270-8

  • Online ISBN: 978-3-031-33271-5

  • eBook Packages: Computer Science (R0)
