Abstract
We propose and analyze a new Markov Chain Monte Carlo algorithm that generates a uniform sample over full- and non-full-dimensional polytopes. This algorithm, termed “Matrix Hit and Run” (MHAR), is a modification of the Hit-and-Run framework. For a polytope in \(\mathbb {R}^n\) defined by m linear constraints, MHAR in the regime \(n^{1+\frac{1}{3}} \ll m\) has a lower asymptotic cost per sample in terms of soft-O notation (\(\mathcal {O}^*\)) than do existing sampling algorithms after a warm start. MHAR is designed to take advantage of matrix multiplication routines that require fewer computational and memory resources. Our tests show this implementation to be substantially faster than the hitandrun R package, especially in higher dimensions. Finally, we provide a Python library based on PyTorch and a Colab notebook with the implementation ready for deployment on architectures with a GPU or only a CPU.
Data availability and materials
The code for replicating the experiments is available in the GitHub repository https://github.com/uumami/mhar_pytorch and can be run on the free online Colab platform. The authors also created a library for testing, available at https://github.com/uumami/mhar.
Code availability
Code: https://github.com/uumami/mhar_pytorch. Python library: https://pypi.org/project/mhar/. Library code: https://github.com/uumami/mhar.
Notes
The prefix “M” in reality stands for mentat, a type of human in Frank Herbert’s Dune series who could simultaneously see the multiple probable paths the future may take.
PyTorch Lightning batch size finder: https://pytorch-lightning.readthedocs.io/en/stable/advanced/training_tricks.html.
References
Chay SC, Fardo RD, Mazumdar M (1975) On using the Box-Muller transformation with multiplicative congruential pseudo-random number generators. J R Stat Soc Series C (Appl Stat) 24(1):132–135. https://doi.org/10.2307/2346711
Chen Y, Dwivedi R, Wainwright MJ, Yu B (2017) Vaidya walk: A sampling algorithm based on the volumetric barrier. In: 2017 55th Annual allerton conference on communication, control, and computing (allerton), IEEE, pp 1220–1227, https://doi.org/10.1109/ALLERTON.2017.8262876
Chen Y, Dwivedi R, Wainwright MJ, Yu B (2018) Fast MCMC sampling algorithms on polytopes. J Mach Learn Res 19(55):1–86
Chrzeszczyk A, Chrzeszczyk J (2013) Matrix computations on the GPU: CUBLAS and MAGMA by example. Accessed: 28 May 2023
Cid GM, Montiel LV (2019) Negociaciones de máxima probabilidad para juegos cooperativos con fines comerciales [Maximum-probability negotiations for cooperative games with commercial purposes]. Revista Mexicana de Economía y Finanzas 14(2):245–259. https://doi.org/10.21919/remef.v14i2.382
Cormen TH, Leiserson C, Rivest R, Stein C (2009) Introduction to algorithms, 3rd edn. MIT Press, Cambridge, pp 827–831
Dubhir T, Mishra M, Singhal R (2021) Benchmarking of quantization libraries in popular frameworks. In: 2021 IEEE international conference on big data (big data), IEEE, pp 3050–3055, https://doi.org/10.1109/BigData52589.2021.9671500
Emiris IZ, Fisikopoulos V (2014) Efficient random-walk methods for approximating polytope volume. In: Proceedings of the thirtieth annual symposium on computational geometry, association for computing machinery, New York, NY, USA, pp 318–327, https://doi.org/10.1145/2582112.2582133
Feldman J, Wainwright MJ, Karger DR (2005) Using linear programming to decode binary linear codes. IEEE Trans Inf Theory 51(3):954–972. https://doi.org/10.1109/TIT.2004.842696
Friedman JH, Rafsky LC (1979) Multivariate generalizations of the Wald–Wolfowitz and Smirnov two-sample tests. Ann Stat 7(4):697–717. https://doi.org/10.1214/aos/1176344722
Geyer CJ (1992) Practical Markov chain Monte Carlo. Stat Sci 7(4):473–483. https://doi.org/10.1214/ss/1177011137
Gustafson A, Narayanan H (2022) John’s walk. Adv Appl Probab 55:1–19. https://doi.org/10.1017/apr.2022.34
Huang J, Yu CD, van de Geijn RA (2018) Implementing Strassen’s algorithm with CUTLASS on NVIDIA Volta GPUs. Tech. rep., The University of Texas at Austin, https://doi.org/10.48550/arXiv.1808.07984
Huang KL, Mehrotra S (2015) An empirical evaluation of a walk-relax-round heuristic for mixed integer convex programs. Comput Optim Appl 60(3):559–585. https://doi.org/10.1007/s10589-014-9693-5
Kannan R, Narayanan H (2013) Random walks on polytopes and an affine interior point method for linear programming. Math Oper Res 37(1):1–20. https://doi.org/10.1287/moor.1110.0519
Kapfer SC, Krauth W (2013) Sampling from a polytope and hard-disk Monte Carlo. J Phys: Conf Ser 454(1):012031. https://doi.org/10.1088/1742-6596/454/1/012031
Kimm H, Paik I, Kimm H (2021) Performance comparison of TPU, GPU, CPU on Google Colaboratory over distributed deep learning. In: 2021 IEEE 14th International symposium on embedded multicore/many-core systems-on-chip (MCSoC), IEEE, pp 312–319, https://doi.org/10.1109/MCSoC51149.2021.00053
Knight PA (1995) Fast rectangular matrix multiplication and QR decomposition. Linear Algebra Appl 221:69–81. https://doi.org/10.1016/0024-3795(93)00230-W
Lai PW, Arafat H, Elango V, Sadayappan P (2013) Accelerating Strassen-Winograd’s matrix multiplication algorithm on GPUs. In: 20th Annual international conference on high performance computing, pp 139–148, https://doi.org/10.1109/HiPC.2013.6799109
Lawrence J (1991) Polytope volume computation. Math Comput 57(195):259–271. https://doi.org/10.1090/S0025-5718-1991-1079024-2
Le Gall F (2014) Powers of tensors and fast matrix multiplication. In: Proceedings of the 39th international symposium on symbolic and algebraic computation, association for computing machinery, New York, NY, USA, ISSAC ’14, p 296-303, https://doi.org/10.1145/2608628.2608664
Lee YT, Vempala SS (2018) Convergence rate of Riemannian Hamiltonian Monte Carlo and faster polytope volume computation. In: Proceedings of the 50th annual ACM SIGACT symposium on theory of computing, association for computing machinery, New York, NY, USA, STOC 2018, pp 1115–1121, https://doi.org/10.1145/3188745.3188774
Lee YT, Vempala SS (2022) Geodesic walks in polytopes. SIAM J Comput 51(2):400–488. https://doi.org/10.1137/17M1145999
Li J, Ranka S, Sahni S (2011) Strassen’s matrix multiplication on GPUs. In: 2011 IEEE 17th international conference on parallel and distributed systems, pp 157–164, https://doi.org/10.1109/ICPADS.2011.130
Lovász L (1999) Hit-and-Run mixes fast. Math Program 86(3):443–461. https://doi.org/10.1007/s101070050099
Lovász L, Simonovits M (1993) Random walks in a convex body and an improved volume algorithm. Random Struct Algor 4(4):359–412. https://doi.org/10.1002/rsa.3240040402
Ma YA, Chen Y, Jin C, Flammarion N, Jordan MI (2019) Sampling can be faster than optimization. Proc Natl Acad Sci 116(42):20881–20885. https://doi.org/10.1073/pnas.1820003116
Marsaglia G (1972) Choosing a point from the surface of a sphere. Ann Math Stat 43(2):645–646. https://doi.org/10.1214/aoms/1177692644
Matteucci M, Veldkamp BP (2013) On the use of MCMC computerized adaptive testing with empirical prior information to improve efficiency. Stat Methods Appl 22(2):243–267. https://doi.org/10.1007/s10260-012-0216-1
Mittal S, Vaishay S (2019) A survey of techniques for optimizing deep learning on GPUs. J Syst Architect 99:101635. https://doi.org/10.1016/j.sysarc.2019.101635
Montiel LV, Bickel EJ (2012) A simulation-based approach to decision making with partial information. Decis Anal 9(4):329–347. https://doi.org/10.1287/deca.1120.0252
Montiel LV, Bickel EJ (2013) Approximating joint probability distributions given partial information. Decis Anal 10(1):26–41. https://doi.org/10.1287/deca.1120.0261
Montiel LV, Bickel EJ (2013) Generating a random collection of discrete joint probability distributions subject to partial information. Methodol Comput Appl Probab 15(4):951–967. https://doi.org/10.1007/s11009-012-9292-9
Montiel LV, Bickel EJ (2014) A generalized sampling approach for multilinear utility functions given partial preference information. Decis Anal 11(3):147–170. https://doi.org/10.1287/deca.2014.0296
Nikolić GS, Dimitrijević BR, Nikolić TR, Stojcev MK (2022) A survey of three types of processing units: CPU, GPU and TPU. In: 2022 57th international scientific conference on information, communication and energy systems and technologies (ICEST), IEEE, pp 1–6, https://doi.org/10.1109/ICEST55168.2022.9828625
Paszke A, et al. (2019) Pytorch: An imperative style, high-performance deep learning library. In: Advances in neural information processing systems 32, Curran Associates, Inc., Vancouver, pp 8026–8037, https://doi.org/10.48550/arXiv.1912.01703
Press WH, Teukolsky SA, Vetterling WT, Flannery BP (2007) Numerical recipes 3rd edition: the art of scientific computing, 3rd edn. Cambridge University Press, Cambridge
Rubin DB (1981) The Bayesian bootstrap. Ann Stat 9(1):130–134. https://doi.org/10.1214/aos/1176345338
Sharkawi SS, Chochia GA (2020) Communication protocol optimization for enhanced GPU performance. IBM J Res Dev 64(3/4):9–1. https://doi.org/10.1147/JRD.2020.2967311
Smith RL (1984) Efficient Monte Carlo procedures for generating points uniformly distributed over bounded regions. Oper Res 32(6):1296–1308. https://doi.org/10.1287/opre.32.6.1296
Smith RL (1996) The Hit-and-Run sampler: a globally reaching Markov chain sampler for generating arbitrary multivariate distributions. In: Proceedings winter simulation conference, pp 260–264, https://doi.org/10.1145/256562.256619
Tervonen T, van Valkenhoef G, Basturk N, Postmus D (2013) Hit-and-Run enables efficient weight generation for simulation-based multiple criteria decision analysis. Eur J Oper Res 224(3):168–184. https://doi.org/10.1016/j.ejor.2012.08.026
Vempala S, Bertsimas D (2004) Solving convex programs by random walks. J ACM 51(4):540–556. https://doi.org/10.1145/1008731.1008733
Wang N, Choi J, Brand D, Chen CY, Gopalakrishnan K (2018) Training deep neural networks with 8-bit floating point numbers. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in Neural Information Processing Systems, Curran Associates, Inc., Vancouver, vol 31, pp 1–10
Acknowledgements
Mario Vazquez Corte wants to acknowledge CONACYT and ITAM for the support provided in the completion of his academic work. He also acknowledges Dr. Fernando Esponda, Dr. Jose Octavio Gutierrez, and Dr. Rodolfo Conde for their support and insight in the development of this work, and expresses his special gratitude to Saul Caballero, Daniel Guerrero, Alfredo Carrillo, Jesus Ledezma, and Erick Palacios Moreno for their invaluable feedback during this process. Dr. Montiel thanks, solely and exclusively, two anonymous reviewers whose comments helped improve this manuscript.
Funding
The research was conducted without external funding.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors state that there are no conflicts or competing interests.
Consent to participate
No human subjects were involved in experiments requiring consent to participate.
Consent for publication
Both authors express consent to publish this article.
Ethical approval
We do not perform any actions or experiments that require ethics approval.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A. Mathematical proofs of lemmas and theorems
Lemma Appendix A.1
If \(m_E < n,\) then the complexity of calculating \(P_{\Delta ^E}\) is \(\mathcal {O}(m_E^{\omega -2}n^2).\)
Proof
Computing \(P_{\Delta ^E}\) requires three matrix multiplications, one matrix-to-matrix subtraction, and one matrix inversion of \((A^EA'^E)\). The number of operations needed to calculate the inverse matrix depends on the algorithm used for matrix multiplication (Cormen et al. 2009). The total number of operations for computing \(P_{\Delta ^E}\) is the sum of the following:
1. Obtain \((A^EA'^E)\) in \(\mathcal {O}(\mu _{A^E, A'^E})=\mathcal {O}(\mu (m_E,n,m_E)) =\mathcal {O}(m_E^{\omega -1}n)\) operations.
2. Find the inverse \((A^EA'^E)^{-1}\) in \(\mathcal {O}(m_E^{\omega })\), since \((A^EA'^E)^{-1}\) has dimension \(m_E \times m_E\).
3. Multiply \( A'^E(A^EA'^E)^{-1}\) in \(\mathcal {O}(\mu _{ A'^E,(A^EA'^E)^{-1}})=\mathcal {O}(\mu (n,m_E,m_E)) =\mathcal {O}(m_E^{\omega -1}n)\).
4. Calculate \(A'^E(A^EA'^E)^{-1}A^E\) in \(\mathcal {O}(\mu _{ A'^E(A^EA'^E)^{-1},A^E})=\mathcal {O}(\mu (n,m_E,n)) =\mathcal {O}(m_E^{\omega -2}n^2)\).
5. Subtract \( I - A'^E(A^EA'^E)^{-1}A^E\) in \(\mathcal {O}(n^2)\).
These sum to \(2 \times \mathcal {O}(m_E^{\omega -1}n) + \mathcal {O}(m_E^{\omega }) + \mathcal {O}(m_E^{\omega -2}n^2)+\mathcal {O}(n^2)\). Since \(m_E < n\), the third term dominates the others. Hence the complexity of calculating \(P_{\Delta ^E}\) is \( \mathcal {O}(\mu _{A'^E(A^EA'^E)^{-1},A^E})=\mathcal {O}(m_E^{\omega -2}n^2)\). \(\square \)
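The five steps above map directly onto dense tensor operations. The following is a minimal PyTorch sketch (an illustration under our own naming, not the authors' library code), assuming \(A^E\) is stored as a tensor A_E of shape \((m_E, n)\) with full row rank:

```python
import torch

def projection_matrix(A_E: torch.Tensor) -> torch.Tensor:
    """Sketch of P = I - A'^E (A^E A'^E)^{-1} A^E (Lemma A.1).

    A_E: the (m_E x n) equality matrix A^E, with m_E < n and full row
    rank, so that A^E A'^E is invertible.
    """
    n = A_E.shape[1]
    G = A_E @ A_E.T                       # step 1: A^E A'^E, (m_E x m_E)
    G_inv = torch.linalg.inv(G)           # step 2: matrix inversion
    M = A_E.T @ G_inv                     # step 3: A'^E (A^E A'^E)^{-1}
    K = M @ A_E                           # step 4: rank-m_E part of the projector
    I = torch.eye(n, dtype=A_E.dtype, device=A_E.device)
    return I - K                          # step 5: subtraction
```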
Lemma Appendix A.2
The cost per iteration of HAR for \(0 \le m_E\) is \(\mathcal {O}(\max \{ m_In,m_E^{\omega -2}n^2\}).\)
Proof
As seen in Algorithm 1, the only difference between the full and non-full-dimensional cases is the projection step \(P_{\Delta ^E}h=d\). Then, the cost per iteration is defined by the larger of the original cost per iteration \(\mathcal {O}(m_In)\) of HAR for \(m_E=0\), and the extra cost induced by the projection when \(m_E>0\).
Because \(P_{\Delta ^E}\) has dimension \(n \times n \) and h is an \(n \times 1\) vector, \(\mu _{P_{\Delta ^E},h}=n^2\) and the complexity is \(\mathcal {O}(n^2)\). By Lemma 3.1, finding \(P_{\Delta ^E}\) has an asymptotic complexity of \(\mathcal {O}(m_E^{\omega -2}n^2)\). Therefore, the cost of projecting h at each iteration is \( \mathcal {O}(n^2) + \mathcal {O}(m_E^{\omega -2}n^2) = \mathcal {O}(m_E^{\omega -2}n^2)\), since \(m_E>0\). Therefore, the cost per iteration for \(m_E >0\) is \(\mathcal {O}(\max \{ m_In,m_E^{\omega -2}n^2\})\). If \(m_E=0\), then \(\max \{ m_In,m_E^{\omega -2}n^2\}\) reduces to \(\max \{m_In,0\}=m_In\) and the cost per iteration is \(\mathcal {O}(m_In)\). \(\square \)
Lemma Appendix A.3
The complexity of generating matrix D in MHAR given \(P_{\Delta ^E}\) and \(\max \{m_I, n\} \le z\) is \(\mathcal {O}(nz)\) if \(m_E=0,\) and \(\mathcal {O}(n^{\omega -1}z)\) if \(m_E>0.\)
Proof
Generating H has complexity \(\mathcal {O}(nz)\) using the Box-Muller method. If \(m_E=0\), then \(D=H\), for a total asymptotic cost of \(\mathcal {O}(nz)\). If \(m_E>0\), then \(D=P_{\Delta ^E}H\), which adds a multiplication cost of \(\mathcal {O}(\mu _{P_{\Delta ^E},H})=\mathcal {O}(n^{\omega -1}z)\), given \(\max \{m_I, n\} \le z\). Since \(\mathcal {O}(nz) \subseteq \mathcal {O}(n^{\omega -1}z)\), the total cost of computing D for \(m_E>0\) is bounded by \(\mathcal {O}(n^{\omega -1}z)\). \(\square \)
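As a companion sketch (again with hypothetical names, consistent with the previous one), the two cases of this lemma reduce to a Gaussian draw plus an optional projection. The columns need no normalization because each one only determines a line, which is invariant to the column's scale:

```python
from typing import Optional
import torch

def direction_matrix(n: int, z: int, P: Optional[torch.Tensor] = None) -> torch.Tensor:
    """Sketch of generating D (Lemma A.3): H has i.i.d. N(0,1) entries."""
    H = torch.randn(n, z)                 # O(nz): Box-Muller-type Gaussian draws
    if P is None:                         # m_E = 0: D = H
        return H
    return P @ H                          # m_E > 0: D = P H, O(n^{omega-1} z)
```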
Lemma Appendix A.4
The complexity of generating all line sets \(\{L^k\}_{k=1}^{z}\) in MHAR given D, X, and \(\max \{m_I, n\} \le z\) is bounded by \(\mathcal {O}(m_In^{\omega -2}z)\) if \(n \le m_I,\) and by \(\mathcal {O}(m_I^{\omega -2}nz)\) otherwise.
Proof
All \(\Lambda ^k\)s can be obtained as follows:
1. Obtain matrix \(A^IX\) in \(\mathcal {O}(\mu _{A^I,X})\) operations. This is \(\mathcal {O}(m_In^{\omega -2}z)\) if \(n \le m_I\), and \(\mathcal {O}(m_I^{\omega -2}nz)\) otherwise.
2. Compute \(B^I - A^IX\), where \(B^I=(b^I|...|b^I) \in \mathbb {R}^{m_I \times z}\), which takes \(\mathcal {O}(m_Iz)\) operations.
3. Calculate \(A^ID\), which is bounded by \(\mathcal {O}(\mu _{A^I,D})\): \(\mathcal {O}(m_In^{\omega -2}z)\) if \(n \le m_I\), and \(\mathcal {O}(m_I^{\omega -2}nz)\) otherwise.
4. Divide \(\frac{B^I - A^IX}{A^ID}\) (entry-wise) to obtain all \(\lambda ^k_i\). All the necessary point-wise operations for this calculation have a combined order of \(\mathcal {O}(m_Iz)\).
5. For each \(k \in \{1,...,z\}\), determine which coefficients \(a_i^I d^k\) are positive or negative, which takes \(\mathcal {O}(m_Iz)\).
6. For each \(k \in \{1,...,z\}\), find the endpoints \(\lambda _{min}^k=\max \ \{\lambda _i^k \ | \ a_i^I d^k < 0\}\) and \( \lambda _{max}^k=\min \ \{\lambda _i^k \ | \ a_i^I d^k > 0\}\), which can be done in \(\mathcal {O}(m_Iz)\).
This procedure constructs all the intervals \(\Lambda ^k=(\lambda _{\min }^k, \lambda _{\max }^k)\). The complexity of this operation is dominated by \(\mathcal {O}(\mu _{A^I,X}) = \mathcal {O}(\mu _{A^I,D})\). Hence, the complexity of finding all line sets is bounded by \(\mathcal {O}(m_In^{\omega -2}z)\) if \(n \le m_I\), and by \(\mathcal {O}(m_I^{\omega -2}nz)\) otherwise. \(\square \)
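Steps 1 through 6 vectorize over all z chains at once. A minimal PyTorch sketch (our own illustration; the function and variable names are hypothetical), assuming X lies strictly inside the polytope and the directions are Gaussian, so that \(a_i^I d^k \ne 0\) almost surely:

```python
import torch

def line_sets(A_I: torch.Tensor, b_I: torch.Tensor,
              X: torch.Tensor, D: torch.Tensor):
    """Sketch of Lemma A.4: endpoints of Lambda^k for all z chains.

    A_I: (m_I x n), b_I: (m_I,), X and D: (n x z).
    """
    num = b_I.unsqueeze(1) - A_I @ X      # steps 1-2: B^I - A^I X, (m_I x z)
    den = A_I @ D                         # step 3: A^I D, (m_I x z)
    lam = num / den                       # step 4: entry-wise lambda_i^k
    pos, neg = den > 0, den < 0           # step 5: signs of a_i^I d^k
    # step 6: max over rows with negative denominator, min over positive ones
    lam_min = torch.where(neg, lam, torch.full_like(lam, -float("inf"))).amax(dim=0)
    lam_max = torch.where(pos, lam, torch.full_like(lam, float("inf"))).amin(dim=0)
    return lam_min, lam_max               # each of shape (z,)
```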
Lemma Appendix A.5
Sampling z new points given \(\{\Lambda ^k\}_{k=1}^z\) has complexity \(\mathcal {O}(zn).\)
Proof
Selecting a random \(\theta ^k \in \Lambda ^k\) takes \(\mathcal {O}(1)\). Sampling a new point \(x^k_{t,j+1} = x^k_{t,j} + \theta ^k d^k_{t,j}\) has complexity \(\mathcal {O}(n)\) because it requires n scalar multiplications and n additions. Then, sampling all new \(x_{t,j+1}^k\) points is bounded by \(\mathcal {O}(zn)\). \(\square \)
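The corresponding sketch (hypothetical names, consistent with the previous ones) draws one uniform \(\theta ^k\) per chain and moves all z points in a single broadcast:

```python
import torch

def step(X: torch.Tensor, D: torch.Tensor,
         lam_min: torch.Tensor, lam_max: torch.Tensor) -> torch.Tensor:
    """Sketch of Lemma A.5: x^k <- x^k + theta^k d^k for all z chains."""
    theta = lam_min + (lam_max - lam_min) * torch.rand_like(lam_min)  # O(1) per chain
    return X + D * theta.unsqueeze(0)     # O(n) per chain, O(zn) in total
```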
Lemma Appendix A.6
Assume \(m_E = 0,\) \(\max \{n,m\} < z,\) and \(n \le m_I.\) Then, the cost per iteration of MHAR is \(\mathcal {O}(m_In^{\omega -2}z), \) which is the number of operations needed for finding all line sets \(\{L^k\}_{k=1}^z.\)
Proof
First, we enumerate the cost of each step of the iteration for \(m_E=0\) and \(n \le m_I\) if \(\max \{n,m\} < z\):
1. By Lemma 3.1, generating \(P_{\Delta ^E}\) is bounded by \(\mathcal {O}(1)\).
2. By Lemma 4.1, generating D is bounded by \(\mathcal {O}(nz)\).
3. By Lemma 4.2, generating \(\{L^k\}_{k=1}^z\) for \(n \le m_I\) is bounded by \(\mathcal {O}(m_In^{\omega -2}z)\).
4. By Lemma 4.3, generating all new \(x_{t,j+1}^k\) is bounded by \(\mathcal {O}(zn)\).
By hypothesis, \(0<n\le m_I\). Then, \(nz \le m_Iz < m_In^{\omega -2}z\), because \(\omega \in (2,3]\). Therefore, \(\mathcal {O}(1) \subseteq \mathcal {O}(nz) \subseteq \mathcal {O}(m_In^{\omega -2}z)\), where the first term is the complexity of finding the projection matrix (omitted for \(m_E=0\)), the second one bounds generating D and sampling new points, and the third one is the asymptotic cost of finding all line sets \(\{L^k\}_{k=1}^z\). \(\square \)
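Putting the pieces together, one MHAR iteration for this case (\(m_E=0\), \(n \le m_I\)) is just the composition of the sketches above. The example below is again illustrative; it assumes the hypothetical direction_matrix, line_sets, and step functions defined earlier, and runs z parallel chains on the unit hypercube, for which \(m_I = 2n\):

```python
import torch

# Unit hypercube 0 <= x <= 1 written as A^I x <= b^I, so m_I = 2n and m_E = 0.
n, z = 50, 1000                           # dimension and number of parallel chains
A_I = torch.cat([torch.eye(n), -torch.eye(n)])    # (2n x n)
b_I = torch.cat([torch.ones(n), torch.zeros(n)])  # (2n,)
X = torch.full((n, z), 0.5)               # start all chains at the center

for _ in range(100):                      # padding (burn-in) iterations
    D = direction_matrix(n, z)            # Lemma A.3 (m_E = 0, no projection)
    lam_min, lam_max = line_sets(A_I, b_I, X, D)  # Lemma A.4
    X = step(X, D, lam_min, lam_max)      # Lemma A.5
```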
Lemma Appendix A.7
Assume \(m_E = 0,\) \(\max \{n,m\} < z,\) and \(n > m_I.\) Then, the cost per iteration of MHAR is \(\mathcal {O}(nm_I^{\omega -2}z),\) which is the number of operations needed for finding all line sets \(\{L^k\}_{k=1}^z.\)
Proof
As in the proof of Lemma 4.4, the costs of the projection matrix, of generating D, and of sampling all new \(x_{t,j+1}^k\) points are unchanged, since \(m_E=0\); the only change comes from Lemma 4.2, by which the cost of finding all line sets \(\{L^k\}_{k=1}^z\) for \(n > m_I\) is \(\mathcal {O}(nm_I^{\omega -2}z)\). By hypothesis, \(0<m_I\) and \(\max \{n,m\} < z\), thus \(nz \le nm_I^{\omega -2}z\). Therefore, \(\mathcal {O}(1) \subseteq \mathcal {O}(nz) \subseteq \mathcal {O}(nm_I^{\omega -2}z)\), where the third term is the cost of finding all line sets \(\{L^k\}_{k=1}^z\). \(\square \)
Lemma Appendix A.8
Assume \(m_E < n\) and \(\max \{m,n\} < z.\) Then, the cost of calculating the projection matrix \(P_{\Delta ^E}\) is bounded by the cost of generating D.
Proof
By hypothesis \(m_E < n \), implying that \(m_E^{\omega -2}n^2 < n^{\omega -2}n^2 = n^{\omega }\). Because \(n<z\), \(n^{\omega } = n^{\omega -1}n <n^{\omega -1}z \). Combining both inequalities yields \(m_E^{\omega -2}n^2< n^{\omega }< n^{\omega -1}z\). Therefore, \(\mathcal {O}(m_E^{\omega -2}n^2) \subseteq \mathcal {O}(n^{\omega -1}z)\), where the first term is the complexity of computing \(P_{\Delta ^E}\) (by Lemma 3.1), and the second term is the complexity of projecting H in order to obtain D (by Lemma 4.1). \(\square \)
Lemma Appendix A.9
Assume \(m_E>0,\) \(\max \{n,m\} < z,\) and \(n \le m_I.\) Then, the cost per iteration of MHAR is \(\mathcal {O}(m_In^{\omega -2}z),\) which is the number of operations needed for finding all line sets \(\{L^k\}_{k=1}^z.\)
Proof
First, we enumerate the cost of each step of the iteration for \(m_E>0\), \(n \le m_I\), and \(\max \{n,m\} < z\):
1. By Lemma 3.1, generating \(P_{\Delta ^E}\) is bounded by \(\mathcal {O}(m_E^{\omega -2}n^2)\).
2. By Lemma 4.1, generating D is bounded by \(\mathcal {O}(n^{\omega -1}z)\).
3. By Lemma 4.2, generating \(\{L^k\}_{k=1}^z\) for \(n \le m_I\) is bounded by \(\mathcal {O}(m_In^{\omega -2}z)\).
4. By Lemma 4.3, generating all new \(x_{t,j+1}^k\) is bounded by \(\mathcal {O}(zn)\).
Using Lemma 4.7, the Big-O term for finding \(P_{\Delta ^E}\) (step 1) is bounded by the term for generating D (step 2). Because \(n \le m_I\), \(n^{\omega -1}z=n^{\omega -2}nz \le n^{\omega -2}m_Iz\). Therefore, \(\mathcal {O}(m_E^{\omega -2}n^2)\subseteq \mathcal {O}(n^{\omega -1}z) \subseteq \mathcal {O}(m_In^{\omega -2}z)\), which are the respective costs of steps 1, 2, and 3. Furthermore, \(nz \le n^{\omega -2}m_Iz\), implying that step 4 is also bounded by step 3 in terms of complexity. Hence all the operations above are bounded by the term \(\mathcal {O}(m_In^{\omega -2}z)\), which is the asymptotic complexity of finding all line sets \(\{L^k\}_{k=1}^z\). \(\square \)
Lemma Appendix A.10
Assume \(m_E>0,\) \(\max \{n,m\} < z,\) and \(n>m_I.\) Then, the cost per iteration of MHAR is \(\mathcal {O}(n^{\omega -1}z),\) which is the number of operations needed for generating D.
Proof
As in the proof of Lemma 4.8, the costs of the projection matrix, of generating D, and of sampling all new \(x_{t,j+1}^k\) points are unchanged, since \(m_E>0\); the only change comes from Lemma 4.2, by which the cost of finding all line sets \(\{L^k\}_{k=1}^z\) for \(n > m_I\) is \(\mathcal {O}(nm_I^{\omega -2}z)\).
By Lemma 4.7, the Big-O term for finding \(P_{\Delta ^E}\) is bounded by the term for generating D. Because \(n>m_I\), \(m_I^{\omega -2}nz < n^{\omega -2}nz=n^{\omega -1}z\). Therefore, \(\mathcal {O}(m_E^{\omega -2}n^2) \subseteq \mathcal {O}(nm_I^{\omega -2}z) \subseteq \mathcal {O}(n^{\omega -1}z)\), which are the respective costs of the projection matrix, finding all line sets, and generating D. Furthermore, \(nz \le n^{\omega -2}nz=n^{\omega -1}z\), so the cost of sampling all new \(x_{t,j+1}^k\) points is also bounded by the cost of generating D. This implies that all the operations above are bounded by \(\mathcal {O}(n^{\omega -1}z)\), the cost of generating D. \(\square \)
Lemma Appendix A.11
For \(\max \{n,m\} < z\) and the regime \(n \ll m,\) MHAR has a lower cost per sample than does John’s walk after proper pre-processing, a warm start, and ignoring logarithmic and error terms.
Proof
Given proper pre-processing, \(n \ll m\), and \(\max \{n,m\} < z\), then MHAR’s cost per sample is \(\mathcal {O}^*(mn^{\omega + 1})\), and that for John’s walk is \(\mathcal {O}(mn^{11} + n^{15})\). Note that \(mn^{\omega + 1}\in \mathcal {O}(mn^{11} + n^{15})\). Therefore, when ignoring the logarithmic and error terms, MHAR has a lower cost per sample. \(\square \)
Lemma Appendix A.12
For \(\max \{n,m\} < z\) and the regime \(n \ll m,\) MHAR has a lower cost per sample than does the John walk after proper pre-processing, warm start, and ignoring logarithmic and error terms.
Proof
From proper pre-processing, \(n \ll m\), and \(\max \{n,m\} < z\), MHAR’s cost per sample is \(\mathcal {O}^*(mn^{\omega + 1})\) and that for the John walk is \(\mathcal {O}(mn^{\omega + \frac{3}{2}})\). Note that \(mn^{\omega + 1} \in \mathcal {O}(mn^{\omega + \frac{3}{2}})\). Therefore, when ignoring the logarithmic and error terms, MHAR has a lower cost per sample. \(\square \)
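To make the gap concrete (a worked example of ours, not part of the original proofs): the ratio of the two bounds is \(mn^{\omega + \frac{3}{2}} / (mn^{\omega + 1}) = n^{\frac{1}{2}}\), so in soft-O terms MHAR saves a factor of \(\sqrt{n}\) per sample relative to the John walk; at \(n = 10^4\) this is a factor of 100. The gap relative to John’s walk in Lemma A.11 is polynomially larger still, since \(mn^{11}/(mn^{\omega +1}) = n^{10-\omega }\).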
Appendix B. Additional optimal expansion experiments
Here we present the results for different expansion parameters using 10 MHAR runs for each dimension (25, 50, 100, 500) on simplices and hypercubes. Figures 8, 9, 10 and 11 show the box plots for simplices, while Figs. 12, 13, 14 and 15 show the box plots for hypercubes.
The box in each box plot shows the 25th, 50th, and 75th percentiles. The dots mark the outliers, and the upper and lower whiskers mark the maximum and minimum values excluding outliers.
Appendix C. Additional performance experiments
Here we present additional experiments on the performance of MHAR. Table 5 reports the running times and the average number of sampled points per second for the best values of z for each combination of figure and dimension. Each experiment was run 10 times. Table 5 shows that the average number of samples per second decreases in higher dimensions, as expected from the curse of dimensionality; even so, MHAR maintains a high sampling rate.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Corte, M.V., Montiel, L.V. Novel matrix hit and run for sampling polytopes and its GPU implementation. Comput Stat (2023). https://doi.org/10.1007/s00180-023-01411-y
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s00180-023-01411-y