Siegel’s Lemma Is Sharp

Beck, József

doi:10.1007/978-3-319-44479-6_8

Abstract

Siegel’s Lemma is concerned with finding a “small” nontrivial integer solution of a large system of homogeneous linear equations with integer coefficients, where the number of variables substantially exceeds the number of equations (for example, n equations and N variables with N ≥ 2n), and “small” means small in the maximum norm. Siegel’s Lemma is a clever application of the Pigeonhole Principle, and it is a pure existence argument. Although essentially combinatorial, Siegel’s Lemma is a key tool in transcendental number theory and diophantine approximation. David Masser (a leading expert in transcendental number theory) asked whether Siegel’s Lemma is best possible. Here we prove that the so-called “Third Version of Siegel’s Lemma” is best possible apart from an absolute constant factor. In other words, we show that no other argument can beat the Pigeonhole Principle proof of Siegel’s Lemma (apart from an absolute constant factor). To prove this, we combine a concentration inequality (i.e., Fourier analysis) with combinatorics.

References

1. N. Alon, D.N. Kozlov, Coins with arbitrary weights. J. Algorithms 25, 162–176 (1997)
2. N. Alon, V.H. Vu, Anti-Hadamard matrices, coin weighing, threshold gates and indecomposable hypergraphs. J. Comb. Theory Ser. A 79, 133–160 (1997)
3. A. Baker, Transcendental Number Theory (Cambridge University Press, Cambridge, 1975)
4. E. Bombieri, W. Gubler, Heights in Diophantine Geometry. New Mathematical Monographs, vol. 4 (Cambridge University Press, Cambridge, 2006)
5. E. Bombieri, J.D. Vaaler, On Siegel’s lemma. Invent. Math. 73(1), 11–32 (1983)
6. P. Erdős, On a lemma of Littlewood and Offord. Bull. Am. Math. Soc. 51, 898–902 (1945)
7. G. Halász, Estimates for the concentration function of combinatorial number theory and probability. Period. Math. Hung. 8, 197–211 (1977)
8. A.M. Macbeath, On measure of sum-sets, II. The sum-theorem for the torus. Proc. Camb. Philos. Soc. 49, 40–43 (1953)
9. W.M. Schmidt, Diophantine Approximations and Diophantine Equations. Lecture Notes in Mathematics, vol. 1467 (Springer, Berlin, 1991)
10. J. Spencer, Six standard deviations suffice. Trans. Am. Math. Soc. 289, 679–706 (1985)
11. J.D. Vaaler, The best constant in Siegel’s lemma. Monatsh. Math. 140(1), 71–89 (2003)
12. J.D. Vaaler, A.J. van der Poorten, Bounds for solutions of systems of linear equations. Bull. Aust. Math. Soc. 25, 125–132 (1982)

Acknowledgements

I am grateful to David Masser (Basel) who called my attention to the problem.

Author information

Authors and Affiliations

Mathematics Department, Rutgers University, New Brunswick, NJ, USA
József Beck

Corresponding author

Correspondence to József Beck.

Editor information

Editors and Affiliations

Department of Applied Mathematics, Charles University, Praha, Czech Republic
Martin Loebl
Computer Science Institute of Charles University, Charles University, Praha, Czech Republic
Jaroslav Nešetřil
School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia, USA
Robin Thomas

Appendix: Proof of the Third Version of Siegel’s Lemma

We combine the usual pigeonhole principle argument with probability theory; in particular, we borrow some ideas from the paper of Spencer [10]. We use the following variant of the large deviation theorem in probability theory.

Bernstein’s inequality. Let $Z_{1}, Z_{2}, \ldots, Z_{N}$ be independent real-valued random variables with zero expectation $\mathbf{E}Z_{j} = 0$ and $\vert Z_{j}\vert \leq M$, 1 ≤ j ≤ N. Then, for all τ > 0,

$$\displaystyle{\Pr \left [\left \vert \sum _{j=1}^{N}Z_{ j}\right \vert \geq \tau \right ] \leq 2\exp \left (-{ \tau ^{2}/2 \over \left (\sum _{j=1}^{N}\mathbf{E}Z_{j}^{2}\right ) + (\tau M/3)}\right ).}$$

Consider now the i-th row of the homogeneous linear system

$$\displaystyle{\sum _{j=1}^{N}d_{ i,j}x_{j} = 0,\mathrm{\ \ where\ \ }\vert d_{i,j}\vert \leq A.}$$

For a positive integer B ≥ 1 and an integer j in 1 ≤ j ≤ N, let $X_{j}$ denote the random variable with $\Pr [X_{j} = b] ={ 1 \over 2B+1}$, where b runs over the integers −B, −B + 1, −B + 2, …, B. Moreover, assume that $X_{1}, X_{2}, \ldots, X_{N}$ are independent. We apply Bernstein’s inequality with $Z_{j} = d_{i,j}X_{j}$, $\tau =\lambda \sqrt{N}AB$ and M = AB, noting that $\mathbf{E}Z_{j}^{2} \leq A^{2}B^{2}$:

$$\displaystyle{\Pr \left [\left \vert \sum _{j=1}^{N}Z_{ j}\right \vert \geq \lambda \sqrt{N}AB\right ] \leq 2\exp \left (-{ \lambda ^{2}NA^{2}B^{2}/2 \over \left (NA^{2}B^{2}\right ) + \left (\lambda \sqrt{N}A^{2}B^{2}/3\right )}\right ) =}$$

$$\displaystyle{ = 2\exp \left (-{ \lambda ^{2} \over 2 + 2N^{-1/2}\lambda /3}\right ). }$$

(109)
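As a sanity check on (109), the following minimal sketch compares the bound with the exact tail probability for one small row. The row `row`, the values of `A` and `B`, and the function names are hypothetical choices made for illustration; they do not come from the text.

```python
import math

def bernstein_bound(lam: float, N: int) -> float:
    """Right-hand side of (109): 2*exp(-lam^2 / (2 + 2*lam/(3*sqrt(N))))."""
    return 2.0 * math.exp(-lam ** 2 / (2.0 + 2.0 * lam / (3.0 * math.sqrt(N))))

def exact_tail(row, B, threshold):
    """Exact Pr[|sum_j row[j]*X_j| >= threshold] for independent X_j
    uniform on {-B, ..., B}, computed by convolving one variable at a time."""
    p = 1.0 / (2 * B + 1)
    dist = {0: 1.0}  # maps an attained partial sum to its probability
    for d in row:
        nxt = {}
        for s, q in dist.items():
            for b in range(-B, B + 1):
                nxt[s + d * b] = nxt.get(s + d * b, 0.0) + q * p
        dist = nxt
    return sum(q for s, q in dist.items() if abs(s) >= threshold)

# Hypothetical example row: N = 8 coefficients with |d| <= A = 3, and B = 2.
row, A, B = [3, -2, 1, 3, -3, 2, 0, 1], 3, 2
N = len(row)
for lam in (0.5, 1.0, 1.5, 2.0):
    tail = exact_tail(row, B, lam * math.sqrt(N) * A * B)
    print(f"lambda={lam}: exact tail={tail:.4f}, bound={bernstein_bound(lam, N):.4f}")
```

For small λ the bound exceeds 1 and is vacuous; it only becomes informative once λ is moderately large, which is why the proof works with λ = 6h.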

Let

$$\displaystyle{ \lambda _{h} = 6h,\ \ 1 \leq h \leq \log n. }$$

(110)

Then by (109), for every 1 ≤ h ≤ log n,

$$\displaystyle{\Pr \left [\left \vert \sum _{j=1}^{N}d_{ i,j}X_{j}\right \vert \geq 6h\sqrt{N}AB\right ] \leq 2\exp \left (-{ 36h^{2} \over 2 + (12N^{-1/2}h/3)}\right ) \leq }$$

$$\displaystyle{ \leq 2\exp \left (-{ 36h^{2} \over 2h + (12h/3)}\right ) = 2e^{-6h}. }$$

(111)

For every integer h in 1 ≤ h ≤ log n, define the random variable

$$\displaystyle{ Y _{h} = \left \vert \left \{i \in \{ 1,2,\ldots,n\}:\ \left \vert \sum _{j=1}^{N}d_{ i,j}X_{j}\right \vert \geq 6h\sqrt{N}AB\right \}\right \vert. }$$

(112)

By using (111) and the linearity of expectation, we obtain the following upper bound for the expected value of $Y_{h}$:

$$\displaystyle{ \mathbf{E}Y _{h} \leq 2e^{-6h}n,\ 1 \leq h \leq \log n. }$$

(113)

Since the random variable $Y_{h}$ has non-negative values, we can use the simple Markov inequality, which states that for any random variable Y with non-negative values and finite expectation

$$\displaystyle{\Pr [Y \geq a] \leq { \mathbf{E}Y \over a} \mathrm{\ \ for\ any\ }a> 0.}$$

Applying Markov’s inequality together with (113), we have

$$\displaystyle{\Pr \left [Y _{h} \geq 2h(h + 1) \cdot 2e^{-6h}n\right ] \leq { 1 \over 2h(h + 1)},\ 1 \leq h \leq \log n.}$$

Using the telescoping sum

$$\displaystyle{\sum _{h=1}^{\infty }{ 1 \over h(h + 1)} =\sum _{ h=1}^{\infty }\left ({1 \over h} -{ 1 \over h + 1}\right ) = 1,}$$

we obtain that

$$\displaystyle{\sum _{1\leq h\leq \log n}{ 1 \over 2h(h + 1)} <{ 1 \over 2},}$$

so, with probability greater than 1∕2 we have

$$\displaystyle{ Y _{h} <2h(h + 1) \cdot 2e^{-6h}n\mathrm{\ \ for\ every\ }1 \leq h \leq \log n. }$$

(114)
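The union-bound budget behind (114) can be checked in exact rational arithmetic. The sketch below is illustrative only; the function name `failure_budget` is a hypothetical helper, not notation from the text.

```python
from fractions import Fraction

def failure_budget(H: int) -> Fraction:
    """Sum over 1 <= h <= H of 1/(2h(h+1)): the union bound on the
    probability that at least one Y_h exceeds its Markov threshold."""
    return sum((Fraction(1, 2 * h * (h + 1)) for h in range(1, H + 1)), Fraction(0))

for H in (1, 2, 5, 50):
    total = failure_budget(H)
    # Telescoping gives the closed form H / (2(H+1)), which is always below 1/2.
    print(H, total, total < Fraction(1, 2))
```

Because the partial sums equal H∕(2(H + 1)), the total failure probability stays strictly below 1∕2 for every cutoff H, which is all that (114) requires.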

Inequality (114) means that, with probability greater than 1∕2, the number of row-sums $\sum _{j=1}^{N}d_{i,j}X_{j}$ that have absolute value $\geq 6h\sqrt{N}AB$ is less than

$$\displaystyle{2h(h + 1) \cdot 2e^{-6h}n\mathrm{\ \ for\ every\ }1 \leq h \leq \log n.}$$

It follows that, with probability greater than 1∕2, the number of row-sums $\sum _{j=1}^{N}d_{i,j}X_{j}$ that have absolute value between $6h\sqrt{N}AB$ and $6(h + 1)\sqrt{N}AB$ is less than

$$\displaystyle{ 2h(h + 1) \cdot 2e^{-6h}n\mathrm{\ \ for\ every\ }1 \leq h \leq \log n, }$$

(115)

and there is no row-sum with absolute value $\geq 6\log n\sqrt{N}AB$. In the last step we used the fact that

$$\displaystyle{2h(h + 1) \cdot 2e^{-6h}n <1\mathrm{\ with\ }h = \lfloor \log n\rfloor }$$

(lower integral part).

Write (see (115))

$$\displaystyle{ k_{h} = \lfloor 2h(h + 1) \cdot 2e^{-6h}n\rfloor,\ \ 1 \leq h \leq \log n, }$$

(116)

and

$$\displaystyle{ k_{0} = n -\sum _{1\leq h\leq \log n}k_{h}. }$$

(117)
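To get a feel for the sizes in (116) and (117), here is a minimal sketch that computes the parameters for a given n. It assumes that log denotes the natural logarithm and that h ranges up to ⌊log n⌋; the function name `level_counts` is hypothetical.

```python
import math

def level_counts(n: int):
    """Return (k_0, [k_1, ..., k_H]) as in (116)-(117).

    Assumption: log is the natural logarithm and H = floor(log n).
    """
    H = math.floor(math.log(n))
    ks = [math.floor(2 * h * (h + 1) * 2 * math.exp(-6 * h) * n)
          for h in range(1, H + 1)]
    return n - sum(ks), ks

k0, ks = level_counts(1000)
print(k0, ks)
```

For n = 1000 this gives k_1 = 19, all other k_h equal to 0, and k_0 = 981: almost all rows fall into the lowest range, and the last k_h is always 0, in line with the remark that no row-sum reaches the top level.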

The total number of row-sum vectors (with n coordinates) satisfying (115) can be estimated from above by using the parameters $k_{h}$ in (116)–(117) as follows:

$$\displaystyle{ \leq { n\choose k_{0}}\left (2 \cdot 6\sqrt{N}AB + 1\right )^{k_{0} }\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (2 \cdot 6(h + 1)\sqrt{N}AB + 1\right )^{k_{h} }\right ). }$$

(118)

On the other hand, “with probability greater than 1∕2” means more than

$$\displaystyle{{ 1 \over 2}(2B + 1)^{N} }$$

(119)

possible vectors v ∈ {−B, −B + 1, −B + 2, …, B}^N.

So, if (119) is greater than or equal to (118), then the Pigeonhole Principle applies, and there exist two different vectors

$$\displaystyle{\mathbf{v}_{1},\mathbf{v}_{2} \in \{-B,-B + 1,-B + 2,\ldots,B\}^{N}}$$

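such that they generate the same row-sum vector, i.e., $D\mathbf{v}_{1} = D\mathbf{v}_{2}$. Then $\mathbf{x} = \mathbf{v}_{1} -\mathbf{v}_{2}$ is a nonzero integer vector that satisfies the homogeneous linear system $D\mathbf{x} = \mathbf{0}$ with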

$$\displaystyle{ \max _{1\leq j\leq N}\vert x_{j}\vert \leq 2B. }$$

(120)
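The pigeonhole step can be made concrete by brute force on a tiny system. This is only a sketch: the matrix `D`, the value of `B`, and the function name are hypothetical, and exhaustive enumeration is feasible only for very small N.

```python
from itertools import product

def pigeonhole_solution(D, B):
    """Search {-B, ..., B}^N for two vectors with the same image under D
    and return their difference, a nonzero solution of D x = 0."""
    N = len(D[0])
    seen = {}  # image D v -> first vector v that produced it
    for v in product(range(-B, B + 1), repeat=N):
        image = tuple(sum(d * x for d, x in zip(row, v)) for row in D)
        if image in seen:
            return tuple(a - b for a, b in zip(v, seen[image]))
        seen[image] = v
    return None  # no collision: B was too small for the pigeonhole count

# Hypothetical 2 x 4 system. With B = 2 there are 5^4 = 625 vectors but at
# most 29 * 21 = 609 possible images, so a collision is guaranteed.
D = [[1, 2, -1, 3], [2, -1, 1, 1]]
B = 2
x = pigeonhole_solution(D, B)
print(x)
```

The returned vector is a difference of two box vectors, so each coordinate has absolute value at most 2B, matching (120).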

The rest is routine estimation. Clearly

$$\displaystyle{{n\choose k_{0}}\left (2 \cdot 6\sqrt{N}AB + 1\right )^{k_{0} }\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (2 \cdot 6(h + 1)\sqrt{N}AB + 1\right )^{k_{h} }\right ) \leq }$$

$$\displaystyle{\leq { n\choose k_{0}}\left (13\sqrt{N}AB\right )^{k_{0} }\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (13(h + 1)\sqrt{N}AB\right )^{k_{h} }\right ) =}$$

$$\displaystyle{ = \left (13\sqrt{N}AB\right )^{n}{n\choose k_{ 0}}\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (h + 1\right )^{k_{h} }\right ). }$$

(121)

By using s! ≥ (s∕e)^s and (116), we have

$$\displaystyle{{n\choose k_{0}}\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (h + 1\right )^{k_{h} }\right ) \leq { n\choose k_{0}}\prod _{1\leq h\leq \log n}\left (e^{6h}(h + 1)\right )^{k_{h} } \leq }$$

$$\displaystyle{\leq { n\choose k_{0}}\prod _{1\leq h\leq \log n}e^{7hk_{h} } ={ n\choose k_{0}}\exp \left (\sum _{1\leq h\leq \log n}7hk_{h}\right ) \leq }$$

$$\displaystyle{\leq { n\choose k_{0}}\exp \left (\sum _{1\leq h\leq \log n}7h \cdot 4h(h + 1)e^{-6h}n\right ) \leq }$$

$$\displaystyle{ \leq { n\choose k_{0}}\exp \left (28n\sum _{h=1}^{\infty }h^{2}(h + 1)e^{-6h}\right ) \leq 2^{n}e^{n}. }$$

(122)
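The last step of (122) rests on the series $\sum _{h\geq 1}h^{2}(h + 1)e^{-6h}$ being very small. A quick numerical evaluation, given as a sketch with a hypothetical helper name, shows how much room there is; the factor 28 = 7 · 4 used below is the coefficient that the bound $k_{h} \leq 4h(h + 1)e^{-6h}n$ from (116) produces.

```python
import math

def series_value(terms: int = 60) -> float:
    """Partial sum of sum_{h>=1} h^2 (h+1) e^{-6h}.
    The terms decay like e^{-6h}, so 60 terms are far more than enough."""
    return sum(h * h * (h + 1) * math.exp(-6 * h) for h in range(1, terms + 1))

S = series_value()
print(f"series      = {S:.6f}")
print(f"28 * series = {28 * S:.4f}")  # exponent per unit of n; must stay below 1
```

The series is dominated by its first term $2e^{-6}$, so the exponent in (122) is a small fraction of n and the bound by $e^{n}$ is far from tight.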

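The series in (122) sums to less than 0.006, so the exponential factor there is in fact at most $e^{n/5}$; this sharper form is needed, because $13 \cdot 2 \cdot e \approx 70.7$ slightly exceeds 70. Using it in (121), we have

$$\displaystyle{{n\choose k_{0}}\left (2 \cdot 6\sqrt{N}AB + 1\right )^{k_{0} }\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (2 \cdot 6(h + 1)\sqrt{N}AB + 1\right )^{k_{h} }\right ) \leq }$$

$$\displaystyle{ \leq \left (13\sqrt{N}AB\right )^{n}2^{n}e^{n/5} <\left (70\sqrt{N}AB\right )^{n}. }$$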

(123)

Combining (118), (119) and (123), it suffices to guarantee the inequality

$$\displaystyle{{ 1 \over 2}(2B + 1)^{N} \geq \left (70\sqrt{N}AB\right )^{n}. }$$

(124)

Inequality (124) clearly holds with

$$\displaystyle{ 2B = \left \lfloor \left (70\sqrt{N}A\right )^{n/(N-n)}\right \rfloor, }$$

(125)

and combining (120) with (125), we conclude

$$\displaystyle{\max _{1\leq j\leq N}\vert x_{j}\vert \leq \left (70\sqrt{N}A\right )^{n/(N-n)},}$$

completing the proof of the Third Version of Siegel’s Lemma. □
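The choice (125) and the sufficient condition (124) can be checked numerically for concrete parameters. The sketch below makes one assumption that the text leaves implicit: B is taken as ⌊T⌋ rounded down to an even number and halved, where $T = (70\sqrt{N}A)^{n/(N-n)}$, so that B is an integer with 2B ≤ T. The function name `siegel_box` is hypothetical.

```python
import math

def siegel_box(N: int, n: int, A: int):
    """Pick B as in (125) and test (124) in logarithmic form.

    Assumption: B = floor(T) // 2 with T = (70*sqrt(N)*A)^(n/(N-n)),
    so that B is an integer and 2B <= T. Requires N > n.
    """
    T = (70 * math.sqrt(N) * A) ** (n / (N - n))
    B = math.floor(T) // 2
    lhs = N * math.log(2 * B + 1) - math.log(2)    # log of (1/2)(2B+1)^N
    rhs = n * math.log(70 * math.sqrt(N) * A * B)  # log of (70 sqrt(N) A B)^n
    return B, lhs >= rhs

print(siegel_box(16, 8, 1))
```

With N = 16, n = 8 and A = 1 the exponent n∕(N − n) equals 1, so T = 280 and B = 140, and (124) holds with room to spare; the resulting solution satisfies max |x_j| ≤ 2B = 280.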

Cite this chapter

Beck, J. (2017). Siegel’s Lemma Is Sharp. In: Loebl, M., Nešetřil, J., Thomas, R. (eds) A Journey Through Discrete Mathematics. Springer, Cham. https://doi.org/10.1007/978-3-319-44479-6_8

DOI: https://doi.org/10.1007/978-3-319-44479-6_8
Published: 06 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44478-9
Online ISBN: 978-3-319-44479-6
eBook Packages: Mathematics and Statistics (R0)
