Strong Mixed-Integer Programming Formulations for Trained Neural Networks

  • Ross Anderson
  • Joey Huchette
  • Christian Tjandraatmadja
  • Juan Pablo Vielma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11480)


We present an ideal mixed-integer programming (MIP) formulation for a rectified linear unit (ReLU) appearing in a trained neural network. Our formulation requires a single binary variable and no additional continuous variables beyond the input and output variables of the ReLU. We contrast it with an ideal “extended” formulation with a linear number of additional continuous variables, derived through standard techniques. An apparent drawback of our formulation is that it requires an exponential number of inequality constraints, but we provide a routine to separate the inequalities in linear time. We also prove that these exponentially-many constraints are facet-defining under mild conditions. Finally, we study network verification problems and observe that dynamically separating from the exponential inequalities (1) is much more computationally efficient and scalable than the extended formulation, (2) decreases the solve time of a state-of-the-art MIP solver by a factor of 7 on smaller instances, and (3) nearly matches the dual bounds of a state-of-the-art MIP solver on harder instances, after just a few rounds of separation and in orders of magnitude less time.
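To illustrate the linear-time separation claim, the sketch below treats a single ReLU y = max(0, wᵀx + b) over box bounds L ≤ x ≤ U. The exponential inequality family indexed by subsets I of the inputs decomposes term-by-term, so the most violated member can be found by an independent per-coordinate comparison. This is a hypothetical pure-Python illustration (the function name and exact notation are ours, not the authors' code; the constraint form follows the authors' full-length preprint and should be checked against it):

```python
def most_violated_cut(w, b, L, U, x, y, z):
    """Greedy linear-time separation over the exponential family
        y <= sum_{i in I} w_i * (x_i - lo_i * (1 - z))
             + (b + sum_{i not in I} w_i * hi_i) * z,
    one inequality per subset I of the inputs. Here lo_i / hi_i denote
    the bound values that minimize / maximize w_i * x_i over [L_i, U_i].
    Because the right-hand side separates across coordinates, the most
    violated inequality is found by picking, for each i independently,
    whichever of the two terms is smaller.

    Returns (chosen subset I, cut right-hand side, violation y - rhs);
    a positive violation means the point (x, y, z) violates the cut.
    """
    rhs = b * z
    subset = []
    for i, wi in enumerate(w):
        lo, hi = (L[i], U[i]) if wi >= 0 else (U[i], L[i])
        term_in = wi * (x[i] - lo * (1.0 - z))  # contribution if i is in I
        term_out = wi * hi * z                  # contribution if i is not in I
        if term_in < term_out:
            subset.append(i)
            rhs += term_in
        else:
            rhs += term_out
    return subset, rhs, y - rhs
```

For example, with w = (1, 1), b = 0, box [-1, 1]², and the fractional point x = (1, -1), z = 0.5, y = 1, the routine selects I = {1} and returns the cut y ≤ 0, which this point violates even though it satisfies the usual big-M relaxation; this is the kind of strengthening the abstract refers to.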


Keywords: Mixed-integer programming · Formulations · Deep learning



Acknowledgments

The authors gratefully acknowledge Yeesian Ng and Ondřej Sýkora for many discussions on the topic of this paper, and for their work on the development of the tf.opt package used in the computational experiments.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Ross Anderson (1)
  • Joey Huchette (1)
  • Christian Tjandraatmadja (1)
  • Juan Pablo Vielma (2)

  1. Google Research, Cambridge, USA
  2. MIT, Cambridge, USA
