Abstract
Solving combinatorial optimization problems follows a two-stage, model-and-run approach: first, a user formulates the problem at hand as an optimization model, and then, given the model, a solver finds the solution. While optimization technology has enjoyed tremendous theoretical and practical advances, this overall process has remained the same for decades, and to date, transforming problem descriptions into optimization models remains a barrier to entry. To relieve users of the cognitive task of modeling, we study named entity recognition for capturing components of optimization models, such as the objective, variables, and constraints, from free-form natural language text, and coin this problem Ner4Opt. We show how to solve Ner4Opt using both classical techniques based on morphological and grammatical properties and modern methods that leverage pre-trained large language models and fine-tune transformer architectures on optimization-specific corpora. For best performance, we present a hybridization of the two, combined with feature engineering and data augmentation that exploit the language of optimization problems. We improve over the state of the art on annotated linear programming word problems, identify several next steps, and discuss important open problems toward automated modeling.
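To make the task concrete, the sketch below tags a toy linear programming word problem with a few illustrative entity types (objective direction, constraint direction, numeric limit) using simple keyword and regex rules. This is a minimal, hypothetical illustration of the kind of spans Ner4Opt targets, not the authors' method or label set; the classical and transformer-based approaches in the paper go well beyond such rules.

```python
import re

# Illustrative entity labels inspired by the Ner4Opt task description;
# the exact annotation scheme used in the paper may differ.
PATTERNS = {
    "OBJ_DIR": r"\b(maximi[sz]e|minimi[sz]e)\b",          # objective direction
    "CONST_DIR": r"\b(at most|at least|no more than)\b",  # constraint direction
    "LIMIT": r"\b\d+(?:\.\d+)?\b",                        # numeric limit
}

def tag(text):
    """Return (span_text, label) pairs found by keyword/regex rules,
    ordered by their position in the text."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((match.start(), match.group(0), label))
    return [(span, label) for _, span, label in sorted(spans)]

example = ("A factory wants to maximize profit. It can produce "
           "at most 40 chairs per day.")
print(tag(example))
```

A rule-based tagger like this serves only as a baseline: it misses context-dependent entities such as variable and objective names ("chairs", "profit"), which is precisely where morphological features and fine-tuned transformers earn their keep.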
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Dakle, P.P. et al. (2023). Ner4Opt: Named Entity Recognition for Optimization Modelling from Natural Language. In: Cire, A.A. (eds) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2023. Lecture Notes in Computer Science, vol 13884. Springer, Cham. https://doi.org/10.1007/978-3-031-33271-5_20
Print ISBN: 978-3-031-33270-8
Online ISBN: 978-3-031-33271-5