
Novel matrix hit and run for sampling polytopes and its GPU implementation


Abstract

We propose and analyze a new Markov Chain Monte Carlo algorithm that generates a uniform sample over full- and non-full-dimensional polytopes. This algorithm, termed “Matrix Hit and Run” (MHAR), is a modification of the Hit and Run framework. For a polytope in \(\mathbb {R}^n\) defined by m linear constraints, MHAR in the regime \(n^{1+\frac{1}{3}} \ll m\) has a lower asymptotic cost per sample, in soft-O notation (\(\mathcal {O}^*\)), than existing sampling algorithms after a warm start. MHAR is designed to take advantage of matrix multiplication routines that require fewer computational and memory resources. Our tests show this implementation to be substantially faster than the hitandrun R package, especially in higher dimensions. Finally, we provide a Python library based on PyTorch and a Colab notebook with the implementation ready for deployment on architectures with a GPU or just a CPU.


Data availability and materials

The code for replicating the experiments is available in the GitHub repository https://github.com/uumami/mhar_pytorch and can be run on the free online Colab platform. The authors also created a library for testing, available at https://github.com/uumami/mhar.

Code availability

Code: https://github.com/uumami/mhar_pytorch. Python library: https://pypi.org/project/mhar/. Library code: https://github.com/uumami/mhar.

Notes

  1. The prefix “M” actually stands for mentat, a type of human in Frank Herbert’s Dune series who can simultaneously see the multiple probable paths the future may take.

  2. PyTorch Lightning batch size finder: https://pytorch-lightning.readthedocs.io/en/stable/advanced/training_tricks.html.


Acknowledgements

Mario Vazquez Corte wants to acknowledge CONACYT and ITAM for the support provided in the completion of his academic work. He also acknowledges Dr. Fernando Eponda, Dr. Jose Octavio Gutierrez, and Dr. Rodolfo Conde for their support and insight in the development of this work, and wants to express his special gratitude to Saul Caballero, Daniel Guerrero, Alfredo Carrillo, Jesus Ledezma, and Erick Palacios Moreno for their invaluable feedback during this process. Dr. Montiel thanks, uniquely and exclusively, two anonymous reviewers whose comments helped us improve this manuscript.

Funding

The research was conducted without external funding.

Author information


Corresponding author

Correspondence to Mario Vazquez Corte.

Ethics declarations

Conflict of interest

The authors state that there are no conflicts or competing interests.

Consent to participate

No people were involved in experiments that required consent from subjects.

Consent for publication

Both authors express consent to publish this article.

Ethical approval

We did not perform any actions or experiments that require ethics approval.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A. Mathematical proofs of lemmas and theorems

Lemma Appendix A.1

If \(m_E < n,\) then the complexity of calculating \(P_{\Delta ^E}\) is \(\mathcal {O}(m_E^{\omega -2}n^2).\)

Proof

Computing \(P_{\Delta ^E}\) requires three matrix multiplications, one matrix subtraction, and one matrix inversion of \((A^EA'^E)\). The number of operations needed to calculate the inverse matrix depends on the algorithm used for matrix multiplication (Cormen et al. 2009). The number of operations for computing \(P_{\Delta ^E}\) is the sum of the following:

  1. Obtain \((A^EA'^E)\) in \(\mathcal {O}(\mu _{A^E, A'^E})=\mathcal {O}(\mu (m_E,n,m_E)) =\mathcal {O}(m_E^{\omega -1}n)\) operations.

  2. Find the inverse \((A^EA'^E)^{-1}\) in \(\mathcal {O}(m_E^{\omega })\), since \((A^EA'^E)^{-1}\) has dimension \(m_E \times m_E\).

  3. Multiply \(A'^E(A^EA'^E)^{-1}\) in \(\mathcal {O}(\mu _{A'^E,(A^EA'^E)^{-1}})=\mathcal {O}(\mu (n,m_E,m_E)) =\mathcal {O}(m_E^{\omega -1}n)\).

  4. Calculate \(A'^E(A^EA'^E)^{-1}A^E\) in \(\mathcal {O}(\mu _{A'^E(A^EA'^E)^{-1},A^E})=\mathcal {O}(\mu (n,m_E,n)) =\mathcal {O}(m_E^{\omega -2}n^2)\).

  5. Subtract \(I - A'^E(A^EA'^E)^{-1}A^E\) in \(\mathcal {O}(n^2)\).

These sum to \(2 \times \mathcal {O}(m_E^{\omega -1}n) + \mathcal {O}(m_E^{\omega }) + \mathcal {O}(m_E^{\omega -2}n^2)+\mathcal {O}(n^2)\). Since \(m_E < n\), the term \(\mathcal {O}(\mu _{A'^E(A^EA'^E)^{-1},A^E})=\mathcal {O}(m_E^{\omega -2}n^2)\) dominates the others. Hence the complexity of calculating \(P_{\Delta ^E}\) is \(\mathcal {O}(m_E^{\omega -2}n^2)\). \(\square \)
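
As a concrete illustration, the following PyTorch sketch computes \(P_{\Delta ^E}=I - A'^E(A^EA'^E)^{-1}A^E\) by the five steps above. The function and variable names are ours for illustration; this is not the API of the mhar library.

    import torch

    def projection_matrix(A_E: torch.Tensor) -> torch.Tensor:
        # A_E has shape (m_E, n), with m_E < n and full row rank assumed.
        n = A_E.shape[1]
        gram = A_E @ A_E.T                    # step 1: A^E A'^E, shape (m_E, m_E)
        gram_inv = torch.linalg.inv(gram)     # step 2: inversion in O(m_E^omega)
        proj = A_E.T @ gram_inv @ A_E         # steps 3 and 4: A'^E (A^E A'^E)^{-1} A^E
        return torch.eye(n, dtype=A_E.dtype) - proj  # step 5: subtraction in O(n^2)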

Lemma Appendix A.2

The cost per iteration of HAR for \(0 \le m_E\) is \(\mathcal {O}(\max \{ m_In,m_E^{\omega -2}n^2\}).\)

Proof

As seen in Algorithm 1, the only difference between the full and non-full-dimensional cases is the projection step \(P_{\Delta ^E}h=d\). Then, the cost per iteration is defined by the larger of the original cost per iteration \(\mathcal {O}(m_In)\) of HAR for \(m_E=0\), and the extra cost induced by the projection when \(m_E>0\).

Because \(P_{\Delta ^E}\) has dimension \(n \times n\) and h is an \(n \times 1\) vector, \(\mu _{P_{\Delta ^E},h}=n^2\) and the complexity is \(\mathcal {O}(n^2)\). By Lemma 3.1, finding \(P_{\Delta ^E}\) has an asymptotic complexity of \(\mathcal {O}(m_E^{\omega -2}n^2)\). Therefore, the cost of projecting h at each iteration is \(\mathcal {O}(n^2) + \mathcal {O}(m_E^{\omega -2}n^2) = \mathcal {O}(m_E^{\omega -2}n^2)\), since \(m_E>0\). Hence, the cost per iteration for \(m_E >0\) is \(\mathcal {O}(\max \{ m_In, m_E^{\omega -2}n^2\})\). If \(m_E=0\), then \(\max \{ m_In, m_E^{\omega -2}n^2\}\) equals \(\max \{m_In,0\}=m_In\) and the cost per iteration is \(\mathcal {O}(m_In)\). \(\square \)

Lemma Appendix A.3

The complexity of generating matrix D in MHAR given \(P_{\Delta ^E}\) and \(\max \{m_I, n\} \le z\) is \(\mathcal {O}(nz)\) if \(m_E=0,\) and \(\mathcal {O}(n^{\omega -1}z)\) if \(m_E>0.\)

Proof

Generating H has complexity \(\mathcal {O}(nz)\) using the Box-Muller method. If \(m_E=0\), then \(D=H\), implying a total asymptotic cost of \(\mathcal {O}(nz)\). If \(m_E>0\), then \(D=P_{\Delta ^E}H\), which adds the multiplication cost \(\mathcal {O}(\mu _{P_{\Delta ^E},H})=\mathcal {O}(n^{\omega -1}z)\), given that \(\max \{m_I, n\} \le z\). Since \(\mathcal {O}(n^{\omega -1}z)\) bounds \(\mathcal {O}(nz)\), the total cost of computing D for \(m_E>0\) is \(\mathcal {O}(n^{\omega -1}z)\). \(\square \)
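
A minimal sketch of this step, reusing the hypothetical projection_matrix helper above. Note that torch.randn does not necessarily use Box-Muller internally, but it yields the same i.i.d. standard-normal entries.

    import torch
    from typing import Optional

    def direction_matrix(n: int, z: int, P: Optional[torch.Tensor] = None,
                         device: Optional[torch.device] = None) -> torch.Tensor:
        H = torch.randn(n, z, device=device)  # Gaussian matrix H, O(nz)
        if P is None:                         # m_E = 0: D = H
            return H
        return P @ H                          # m_E > 0: D = P H, O(n^{omega-1} z)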

Lemma Appendix A.4

The complexity of generating all line sets \(\{L^k\}_{k=1}^{z}\) in MHAR given D, X, and \(\max \{m_I, n\} \le z\) is bounded by \(\mathcal {O}(m_In^{\omega -2}z)\) if \(n \le m_I,\) and by \(\mathcal {O}(m_I^{\omega -2}nz)\) otherwise.

Proof

All \(\Lambda ^k\)s can be obtained as follows:

  1. Obtain the matrix \(A^IX\) in \(\mathcal {O}(\mu _{A^I,X})\). This is done in \(\mathcal {O}(m_In^{\omega -2}z)\) if \(n \le m_I\), and in \(\mathcal {O}(m_I^{\omega -2}nz)\) otherwise.

  2. Compute \(B^I - A^IX\), where \(B^I=(b^I|\ldots |b^I) \in \mathbb {R}^{m_I \times z}\), which takes \(\mathcal {O}(m_Iz)\) operations.

  3. Calculate \(A^ID\), which is bounded by \(\mathcal {O}(\mu _{A^I,D})\); this is done in \(\mathcal {O}(m_In^{\omega -2}z)\) if \(n \le m_I\), and in \(\mathcal {O}(m_I^{\omega -2}nz)\) otherwise.

  4. Divide \(\frac{B^I - A^IX}{A^ID}\) (entry-wise) to obtain all \(\lambda ^k_i\). All the necessary point-wise operations for this calculation have a combined order of \(\mathcal {O}(m_Iz)\).

  5. For each \(k \in \{1,\ldots ,z\}\), determine which coefficients \(a_i^I d^k\) are positive or negative, which takes \(\mathcal {O}(m_Iz)\).

  6. For each \(k \in \{1,\ldots ,z\}\), find the interval endpoints \(\lambda _{\min }^k=\max \ \{\lambda _i^k \ | \ a_i^I d^k < 0\}\) and \(\lambda _{\max }^k=\min \ \{\lambda _i^k \ | \ a_i^I d^k > 0\}\), which can be done in \(\mathcal {O}(m_Iz)\).

This procedure constructs all the intervals \(\Lambda ^k=(\lambda _{\min }^k, \lambda _{\max }^k)\). The complexity of this operation is bounded by \(\mathcal {O}(\mu _{A^I,X}) = \mathcal {O}(\mu _{A^I,D})\). Hence, the complexity of finding all line sets is bounded by \(\mathcal {O}(m_In^{\omega -2}z)\) if \(n \le m_I\), and by \(\mathcal {O}(m_I^{\omega -2}nz)\) otherwise. \(\square \)
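
A hedged, batched PyTorch version of this procedure follows; here A_I is \(m_I \times n\), b_I has length \(m_I\), and X and D are \(n \times z\). The eps tolerance used to classify the signs of \(a_i^I d^k\) is our own choice, not part of the paper.

    import torch

    def line_sets(A_I, b_I, X, D, eps=1e-12):
        num = b_I.unsqueeze(1) - A_I @ X   # B^I - A^I X, shape (m_I, z)
        den = A_I @ D                      # A^I D, shape (m_I, z)
        lam = num / den                    # entry-wise lambda_i^k
        neg_inf = torch.full_like(lam, float("-inf"))
        pos_inf = torch.full_like(lam, float("inf"))
        # lambda_min^k = max over rows with a_i^I d^k < 0;
        # lambda_max^k = min over rows with a_i^I d^k > 0
        lam_min = torch.where(den < -eps, lam, neg_inf).max(dim=0).values
        lam_max = torch.where(den > eps, lam, pos_inf).min(dim=0).values
        return lam_min, lam_max            # Lambda^k = (lambda_min^k, lambda_max^k)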

Lemma Appendix A.5

Sampling z new points given \(\{\Lambda ^k\}_{k=1}^z\) has complexity \(\mathcal {O}(zn).\)

Proof

Selecting a random \(\theta ^k \in \Lambda ^k\) takes \(\mathcal {O}(1)\). Sampling a new point \(x^k_{t,j+1} = x^k_{t,j} + \theta ^k d^k_{t,j}\) has complexity \(\mathcal {O}(n)\) because it requires n scalar multiplications and n sums. Then, sampling all new \(x_{t,j+1}^k\) points is bounded by \(\mathcal {O}(zn)\). \(\square \)
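
The corresponding step as a sketch, drawing one uniform \(\theta ^k\) per walk and updating all z points at once; again, the names are illustrative.

    import torch

    def take_step(X, D, lam_min, lam_max):
        z = X.shape[1]
        # theta^k ~ Uniform(Lambda^k), O(1) per walk
        theta = lam_min + (lam_max - lam_min) * torch.rand(z, device=X.device)
        return X + D * theta  # n multiplications and n sums per walk: O(nz) total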

Lemma Appendix A.6

Assume \(m_E = 0,\) \(\max \{n,m\} < z,\) and \(n \le m_I.\) Then, the cost per iteration of MHAR is \(\mathcal {O}(m_In^{\omega -2}z), \) which is the number of operations needed for finding all line sets \(\{L^k\}_{k=1}^z.\)

Proof

First, we enumerate the cost of each step of the iteration for \(m_E=0\), \(n \le m_I\), and \(\max \{n,m\} < z\):

  1. By Lemma 3.1, generating \(P_{\Delta ^E}\) is bounded by \(\mathcal {O}(1)\).

  2. By Lemma 4.1, generating D is bounded by \(\mathcal {O}(nz)\).

  3. By Lemma 4.2, generating \(\{L^k\}_{k=1}^z\) for \(n \le m_I\) is bounded by \(\mathcal {O}(m_In^{\omega -2}z)\).

  4. By Lemma 4.3, generating all new \(x_{t,j+1}^k\) is bounded by \(\mathcal {O}(zn)\).

By hypothesis, \(0<n\le m_I\). Then, \(nz \le m_Iz < m_In^{\omega -2}z\), because \(\omega \in (2,3]\). Therefore, \(\mathcal {O}(1) \subseteq \mathcal {O}(nz) \subseteq \mathcal {O}(m_In^{\omega -2}z)\), where the first term is the complexity of finding the projection matrix (omitted for \(m_E=0\)), the second one bounds generating D and sampling new points, and the third one is the asymptotic cost of finding all line sets \(\{L^k\}_{k=1}^z\). \(\square \)

Lemma Appendix A.7

Assume \(m_E = 0,\) \(\max \{n,m\} < z,\) and \(n > m_I.\) Then, the cost per iteration of MHAR is \(\mathcal {O}(nm_I^{\omega -2}z),\) which is the number of operations needed for finding all line sets \(\{L^k\}_{k=1}^z.\)

Proof

As in the proof of Lemma 4.4, the complexity of computing the projection matrix, generating D, and sampling all new \(x_{t,j+1}^k\) points remains the same under \(m_E=0\) and \(n > m_I\). Hence, the only change comes from Lemma 4.2, in which the cost of finding all line sets \(\{L^k\}_{k=1}^z\) for \(n > m_I\) is \(\mathcal {O}(nm_I^{\omega -2}z)\). By hypothesis, \(0<m_I\) and \(\max \{n,m\} < z\), thus \(nz < nm_I^{\omega -2}z\). Therefore, \(\mathcal {O}(1) \subseteq \mathcal {O}(nz) \subseteq \mathcal {O}(nm_I^{\omega -2}z)\), where the third term is the cost of finding all line sets \(\{L^k\}_{k=1}^z\). \(\square \)

Lemma Appendix A.8

Assume \(m_E < n\) and \(\max \{m,n\} < z.\) Then, the cost of calculating the projection matrix \(P_{\Delta ^E}\) is bounded by the cost of generating D.

Proof

By hypothesis \(m_E < n \), implying that \(m_E^{\omega -2}n^2 < n^{\omega -2}n^2 = n^{\omega }\). Because \(n<z\), \(n^{\omega } = n^{\omega -1}n <n^{\omega -1}z \). Combining both inequalities yields \(m_E^{\omega -2}n^2< n^{\omega }< n^{\omega -1}z\). Therefore, \(\mathcal {O}(m_E^{\omega -2}n^2) \subseteq \mathcal {O}(n^{\omega -1}z)\), where the first term is the complexity of computing \(P_{\Delta ^E}\) (by Lemma 3.1), and the second term is the complexity of projecting H in order to obtain D (by Lemma 4.1). \(\square \)

Lemma Appendix A.9

Assume \(m_E>0,\) \(\max \{n,m\} < z,\) and \(n \le m_I.\) Then, the cost per iteration of MHAR is \(\mathcal {O}(m_In^{\omega -2}z),\) which is the number of operations needed for finding all line sets \(\{L^k\}_{k=1}^z.\)

Proof

First, we enumerate the cost of each step of the iteration for \(m_E>0\), \(n \le m_I\), and \(\max \{n,m\} < z\):

  1. By Lemma 3.1, generating \(P_{\Delta ^E}\) is bounded by \(\mathcal {O}(m_E^{\omega -2}n^2)\).

  2. By Lemma 4.1, generating D is bounded by \(\mathcal {O}(n^{\omega -1}z)\).

  3. By Lemma 4.2, generating \(\{L^k\}_{k=1}^z\) for \(n \le m_I\) is bounded by \(\mathcal {O}(m_In^{\omega -2}z)\).

  4. By Lemma 4.3, generating all new \(x_{t,j+1}^k\) is bounded by \(\mathcal {O}(zn)\).

Using Lemma 4.7, the Big-O term for finding \(P_{\Delta ^E}\) (step 1) is bounded by that of generating D (step 2). Because \(n\le m_I\), \(n^{\omega -1}z=n^{\omega -2}nz\le n^{\omega -2}m_Iz\). Therefore, \(\mathcal {O}(m_E^{\omega -2}n^2)\subseteq \mathcal {O}(n^{\omega -1}z) \subseteq \mathcal {O}(m_In^{\omega -2}z)\), which are the respective costs of steps 1, 2, and 3. Furthermore, \(nz \le n^{\omega -2}m_Iz\), implying that step 4 is also bounded by step 3 in terms of complexity. This implies that all the operations above are bounded by the term \(\mathcal {O}(m_In^{\omega -2}z)\), which is the asymptotic complexity of finding all line sets \(\{L^k\}_{k=1}^z\). \(\square \)

Lemma Appendix A.10

Assume \(m_E>0,\) \(\max \{n,m\} < z,\) and \(n>m_I.\) Then, the cost per iteration of MHAR is \(\mathcal {O}(n^{\omega -1}z),\) which is the number of operations needed for generating D.

Proof

As in the proof of Lemma 4.8, the cost of computing the projection matrix, generating D, and sampling all new \(x_{t,j+1}^k\) points remains the same under \(m_E>0\) and \(n > m_I\). Hence, the only change comes from Lemma 4.2, in which the cost of finding all line sets \(\{L^k\}_{k=1}^z\) for \(n > m_I\) is \(\mathcal {O}(nm_I^{\omega -2}z)\).

By Lemma 4.7, the Big-O term for finding \(P_{\Delta ^E}\) is bounded by the term for generating D. Because \(n>m_I\), \(m_I^{\omega -2}nz < n^{\omega -2}nz=n^{\omega -1}z\). Therefore, \(\mathcal {O}(m_E^{\omega -2}n^2) \subseteq \mathcal {O}(n^{\omega -1}z)\) and \(\mathcal {O}(nm_I^{\omega -2}z) \subseteq \mathcal {O}(n^{\omega -1}z)\); that is, the costs of computing the projection matrix and of finding all line sets are both bounded by the cost of generating D. Furthermore, \(nz \le n^{\omega -2}nz=n^{\omega -1}z\), implying that the cost of sampling all new \(x_{t,j+1}^k\) is also bounded by the cost of generating D. This implies that all the operations above are bounded by \(\mathcal {O}(n^{\omega -1}z)\). \(\square \)

Lemma Appendix A.11

For \(\max \{n,m\} < z\) and \(n \ll m,\) MHAR has a lower cost per sample than does John’s walk after proper pre-processing, warm start, and ignoring the logarithmic and error terms.

Proof

Given proper pre-processing, \(n \ll m\), and \(\max \{n,m\} < z\), MHAR’s cost per sample is \(\mathcal {O}^*(mn^{\omega + 1})\), while that for John’s walk is \(\mathcal {O}(mn^{11} + n^{15})\). Note that \(mn^{\omega + 1}\in \mathcal {O}(mn^{11} + n^{15})\). Therefore, when ignoring the logarithmic and error terms, MHAR has a lower cost per sample. \(\square \)

Lemma Appendix A.12

For \(\max \{n,m\} < z\) and the regime \(n \ll m,\) MHAR has a lower cost per sample than does the John walk after proper pre-processing, warm start, and ignoring logarithmic and error terms.

Proof

From proper pre-processing, \(n \ll m\), and \(\max \{n,m\} < z\), MHAR’s cost per sample is \(\mathcal {O}^*(mn^{\omega + 1})\) and that for the John walk is \(\mathcal {O}(mn^{\omega + \frac{3}{2}})\). Note that \(mn^{\omega + 1} \in \mathcal {O}(mn^{\omega + \frac{3}{2}})\). Therefore, when ignoring the logarithmic and error terms, MHAR has a lower cost per sample. \(\square \)
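
Putting Lemmas 4.1, 4.2, and 4.3 together, one MHAR iteration for the full-dimensional case (\(m_E=0\)) can be sketched by composing the hypothetical helpers introduced above. This mirrors the cost accounting of the lemmas; it is not the library’s implementation.

    def mhar_iteration(A_I, b_I, X):
        # m_E = 0, so D = H and no projection matrix is needed.
        D = direction_matrix(A_I.shape[1], X.shape[1], device=X.device)  # Lemma 4.1
        lam_min, lam_max = line_sets(A_I, b_I, X, D)                     # Lemma 4.2
        return take_step(X, D, lam_min, lam_max)                         # Lemma 4.3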

Appendix B. Additional optimal expansion experiments

Here we present the results for different expansion parameters using 10 MHAR runs for each dimension (25, 50, 100, 500) on simplices and hypercubes. Figures 8, 9, 10 and 11 show the box-plots for simplices, while Figs. 12, 13, 14 and 15 show the box-plots for hypercubes.

The box in each box-plot marks the 25th, 50th, and 75th percentiles. The dots mark outliers, and the upper and lower whiskers mark the maximum and minimum values excluding outliers.

Fig. 8: Box-plots for simplices in dimension 25 comparing expansion behavior for different values of the expansion hyper-parameter z

Fig. 9: Box-plots for simplices in dimension 50 comparing expansion behavior for different values of the expansion hyper-parameter z

Fig. 10: Box-plots for simplices in dimension 100 comparing expansion behavior for different values of the expansion hyper-parameter z

Fig. 11: Box-plots for simplices in dimension 500 comparing expansion behavior for different values of the expansion hyper-parameter z

Fig. 12: Box-plots for hypercubes in dimension 25 comparing expansion behavior for different values of the expansion hyper-parameter z

Fig. 13: Box-plots for hypercubes in dimension 50 comparing expansion behavior for different values of the expansion hyper-parameter z

Fig. 14: Box-plots for hypercubes in dimension 100 comparing expansion behavior for different values of the expansion hyper-parameter z

Fig. 15: Box-plots for hypercubes in dimension 500 comparing expansion behavior for different values of the expansion hyper-parameter z

Appendix C. Additional performance experiments

Here we present additional experiments on the performance of MHAR. Table 5 reports the running times and the average number of sampled points per second for the best values of z for each combination of figure and dimension. For each combination, we conducted the experiment 10 times. Table 5 shows that the average samples per second are lower in higher dimensions, due to the curse of dimensionality; nevertheless, MHAR maintains a high sampling throughput.

Table 5: Samples per second of MHAR
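
For readers who want to produce a samples-per-second figure of this kind, a rough timing sketch using the hypothetical helpers from Appendix A is shown below; absolute numbers depend on the hardware, the polytope, and z.

    import time
    import torch

    def samples_per_second(A_I, b_I, X, iters=100):
        start = time.perf_counter()
        for _ in range(iters):
            X = mhar_iteration(A_I, b_I, X)
        if X.is_cuda:
            torch.cuda.synchronize()  # wait for queued GPU kernels before stopping the clock
        return iters * X.shape[1] / (time.perf_counter() - start)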

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.



Cite this article

Corte, M.V., Montiel, L.V. Novel matrix hit and run for sampling polytopes and its GPU implementation. Comput Stat (2023). https://doi.org/10.1007/s00180-023-01411-y

