Siegel’s Lemma Is Sharp

A Journey Through Discrete Mathematics

Abstract

Siegel’s Lemma is concerned with finding a “small” nontrivial integer solution of a large system of homogeneous linear equations with integer coefficients, where the number of variables substantially exceeds the number of equations (for example, n equations and N variables with N ≥ 2n), and “small” means small in the maximum norm. Siegel’s Lemma is a clever application of the Pigeonhole Principle, and it is a pure existence argument. Although basically combinatorial, Siegel’s Lemma is a key tool in transcendental number theory and diophantine approximation. David Masser (a leading expert in transcendental number theory) asked whether Siegel’s Lemma is best possible. Here we prove that the so-called “Third Version of Siegel’s Lemma” is best possible apart from an absolute constant factor. In other words, we show that no other argument can beat the Pigeonhole Principle proof of Siegel’s Lemma (apart from an absolute constant factor). To prove this, we combine a concentration inequality (i.e., Fourier analysis) with combinatorics.

References

  1. N. Alon, D.N. Kozlov, Coins with arbitrary weights. J. Algorithms 25, 162–176 (1997)

  2. N. Alon, V.H. Vu, Anti-Hadamard matrices, coin weighing, threshold gates and indecomposable hypergraphs. J. Comb. Theory Ser. A 79, 133–160 (1997)

  3. A. Baker, Transcendental Number Theory (Cambridge University Press, Cambridge, 1975)

  4. E. Bombieri, W. Gubler, Heights in Diophantine Geometry. New Mathematical Monographs, vol. 4 (Cambridge University Press, Cambridge, 2006)

  5. E. Bombieri, J.D. Vaaler, On Siegel’s lemma. Invent. Math. 73(1), 11–32 (1983)

  6. P. Erdős, On a lemma of Littlewood and Offord. Bull. Am. Math. Soc. 51, 898–902 (1945)

  7. G. Halász, Estimates for the concentration function of combinatorial number theory and probability. Period. Math. Hung. 8, 197–211 (1977)

  8. A.M. Macbeath, On measure of sum-sets, II. The sum-theorem for the torus. Proc. Camb. Philos. Soc. 49, 40–43 (1953)

  9. W.M. Schmidt, Diophantine Approximations and Diophantine Equations. Lecture Notes in Mathematics, vol. 1467 (Springer, Berlin, 1991)

  10. J. Spencer, Six standard deviations suffice. Trans. Am. Math. Soc. 289, 679–706 (1985)

  11. J.D. Vaaler, The best constant in Siegel’s lemma. Monatsh. Math. 140(1), 71–89 (2003)

  12. J.D. Vaaler, A.J. van der Poorten, Bounds for solutions of systems of linear equations. Bull. Aust. Math. Soc. 25, 125–132 (1982)

Acknowledgements

I am grateful to David Masser (Basel) who called my attention to the problem.

Author information

Correspondence to József Beck.

Appendix: Proof of the Third Version of Siegel’s Lemma

We combine the usual pigeonhole principle argument with probability theory; in particular, we borrow some ideas from the paper of Spencer [10]. We use the following variant of the large deviation theorem in probability theory.

Bernstein’s inequality. Let \(Z_{1},Z_{2},\ldots,Z_{N}\) be real-valued independent random variables with zero expectation \(\mathbf{E}Z_{j} = 0\) and \(\vert Z_{j}\vert \leq M\), \(1 \leq j \leq N\). Then, for all τ > 0,

$$\displaystyle{\Pr \left [\left \vert \sum _{j=1}^{N}Z_{ j}\right \vert \geq \tau \right ] \leq 2\exp \left (-{ \tau ^{2}/2 \over \left (\sum _{j=1}^{N}\mathbf{E}Z_{j}^{2}\right ) + (\tau M/3)}\right ).}$$

Consider now the i-th row of the homogeneous linear system

$$\displaystyle{\sum _{j=1}^{N}d_{ i,j}x_{j} = 0,\mathrm{\ \ where\ \ }\vert d_{i,j}\vert \leq A.}$$

For an integer B ≥ 1 and an integer j with \(1 \leq j \leq N\), let \(X_{j}\) denote the random variable with \(\Pr [X_{j} = b] ={ 1 \over 2B+1}\), where b runs over the integers \(-B,-B + 1,-B + 2,\ldots,B\). Moreover, assume that \(X_{1},X_{2},\ldots,X_{N}\) are independent. We apply Bernstein’s inequality with \(Z_{j} = d_{i,j}X_{j}\), \(\tau =\lambda \sqrt{N}AB\) and M = AB. Since \(\mathbf{E}Z_{j} = 0\), \(\vert Z_{j}\vert \leq AB\) and \(\mathbf{E}Z_{j}^{2} \leq A^{2}B^{2}\), we obtain

$$\displaystyle{\Pr \left [\left \vert \sum _{j=1}^{N}Z_{ j}\right \vert \geq \lambda \sqrt{N}AB\right ] \leq 2\exp \left (-{ \lambda ^{2}NA^{2}B^{2}/2 \over \left (NA^{2}B^{2}\right ) + \left (\lambda \sqrt{N}A^{2}B^{2}/3\right )}\right ) =}$$
$$\displaystyle{ = 2\exp \left (-{ \lambda ^{2} \over 2 + 2N^{-1/2}\lambda /3}\right ). }$$
(109)
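The tail estimate (109) can be sanity-checked by simulation. The following is a minimal sketch, not part of the proof: the example row `d_row`, the parameters `A`, `B`, and the helper names `bound_109` and `empirical_tail` are all illustrative assumptions. It draws the uniform variables \(X_{j}\) and compares the observed frequency of large row-sums with the right-hand side of (109).

```python
import math
import random


def bound_109(lam, N):
    """Right-hand side of (109): 2 exp(-lam^2 / (2 + 2 lam / (3 sqrt(N))))."""
    return 2.0 * math.exp(-lam ** 2 / (2.0 + 2.0 * lam / (3.0 * math.sqrt(N))))


def empirical_tail(d_row, A, B, lam, trials, seed=0):
    """Fraction of trials in which |sum_j d_j X_j| >= lam * sqrt(N) * A * B,
    where each X_j is uniform on the integers -B, ..., B."""
    rng = random.Random(seed)
    N = len(d_row)
    threshold = lam * math.sqrt(N) * A * B
    hits = 0
    for _ in range(trials):
        row_sum = sum(d * rng.randint(-B, B) for d in d_row)
        if abs(row_sum) >= threshold:
            hits += 1
    return hits / trials


# Hypothetical example row with |d_j| <= A = 3 and N = 40 variables.
A, B = 3, 4
d_row = [3, -3, 2, -1, 3, 0, -2, 3, 1, -3] * 4
for lam in (0.5, 1.0, 2.0, 3.0):
    observed = empirical_tail(d_row, A, B, lam, trials=2000)
    print(f"lambda={lam}: observed {observed:.4f}, bound {bound_109(lam, len(d_row)):.4f}")
```

For small λ the bound exceeds 1 and is vacuous; it only becomes informative once λ is a few units large, which is the regime used below.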

Let

$$\displaystyle{ \lambda _{h} = 6h,\ \ 1 \leq h \leq \log n. }$$
(110)

Then by (109), for every 1 ≤ h ≤ log n,

$$\displaystyle{\Pr \left [\left \vert \sum _{j=1}^{N}d_{ i,j}X_{j}\right \vert \geq 6h\sqrt{N}AB\right ] \leq 2\exp \left (-{ 36h^{2} \over 2 + (12N^{-1/2}h/3)}\right ) \leq }$$
$$\displaystyle{ \leq 2\exp \left (-{ 36h^{2} \over 2h + (12h/3)}\right ) = 2e^{-6h}. }$$
(111)
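The second inequality in (111) uses only h ≥ 1 and N ≥ 1: the denominator \(2 + 4hN^{-1/2}\) is at most 2h + 4h = 6h. A short numerical sketch of this step follows; the function names and the tested ranges are illustrative assumptions.

```python
import math


def exponent_109(h, N):
    """Exponent lam^2 / (2 + 2 lam / (3 sqrt(N))) from (109) with lam = 6h."""
    lam = 6 * h
    return lam ** 2 / (2 + 2 * lam / (3 * math.sqrt(N)))


def check_111(max_h, Ns):
    """True if the exponent is at least 6h for all tested h and N,
    i.e. if the bound 2 exp(-exponent) is at most 2 e^{-6h}."""
    return all(
        exponent_109(h, N) >= 6 * h - 1e-12  # tolerance for floating point
        for N in Ns
        for h in range(1, max_h + 1)
    )


print(check_111(max_h=50, Ns=[1, 2, 10, 1000, 10 ** 6]))
```

The case h = 1, N = 1 is the only one where the two exponents coincide; in all other cases the inequality has room to spare.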

For every integer h with 1 ≤ h ≤ log n, define the random variable

$$\displaystyle{ Y _{h} = \left \vert \left \{i \in \{ 1,2,\ldots,n\}:\ \left \vert \sum _{j=1}^{N}d_{ i,j}X_{j}\right \vert \geq 6h\sqrt{N}AB\right \}\right \vert. }$$
(112)

By (111) and the linearity of expectation, we obtain the following upper bound for the expected value of \(Y_{h}\):

$$\displaystyle{ \mathbf{E}Y _{h} \leq 2e^{-6h}n,\ 1 \leq h \leq \log n. }$$
(113)

Since the random variable Y h has non-negative values, we can use the simple Markov inequality stating that for any random variable Y with non-negative values and finite expectation

$$\displaystyle{\Pr [Y \geq a] \leq { \mathbf{E}Y \over a} \mathrm{\ \ for\ any\ }a> 0.}$$

Applying Markov’s inequality with \(a = 2h(h + 1) \cdot 2e^{-6h}n\) and using (113), we have

$$\displaystyle{\Pr \left [Y _{h} \geq 2h(h + 1) \cdot 2e^{-6h}n\right ] \leq { 1 \over 2h(h + 1)},\ 1 \leq h \leq \log n.}$$

Using the telescoping sum

$$\displaystyle{\sum _{h=1}^{\infty }{ 1 \over h(h + 1)} =\sum _{ h=1}^{\infty }\left ({1 \over h} -{ 1 \over h + 1}\right ) = 1,}$$

we obtain that

$$\displaystyle{\sum _{1\leq h\leq \log n}{ 1 \over 2h(h + 1)} <{ 1 \over 2},}$$

so, with probability greater than 1∕2 we have

$$\displaystyle{ Y _{h} <2h(h + 1) \cdot 2e^{-6h}n\mathrm{\ \ for\ every\ }1 \leq h \leq \log n. }$$
(114)
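The union bound behind (114) rests on the telescoping sum: every partial sum of \(1/(2h(h + 1))\) equals \(H/(2(H + 1))\), which is strictly below 1∕2. The sketch below checks this with exact rational arithmetic; the helper name `partial_sum` is an illustrative assumption.

```python
from fractions import Fraction


def partial_sum(H):
    """Exact value of sum_{h=1}^{H} 1 / (2 h (h + 1))."""
    return sum(Fraction(1, 2 * h * (h + 1)) for h in range(1, H + 1))


for H in (1, 2, 5, 13):
    s = partial_sum(H)
    # Telescoping gives (1/2) * (1 - 1/(H + 1)) = H / (2 (H + 1)).
    print(H, s, s == Fraction(H, 2 * (H + 1)))
```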

(114) means that, with probability greater than 1∕2, the number of row-sums \(\sum _{j=1}^{N}d_{i,j}X_{j}\) that have absolute value \(\geq 6h\sqrt{N}AB\) is less than

$$\displaystyle{2h(h + 1) \cdot 2e^{-6h}n\mathrm{\ \ for\ every\ }1 \leq h \leq \log n.}$$

It follows that, with probability greater than 1∕2, the number of row-sums \(\sum _{j=1}^{N}d_{i,j}X_{j}\) that have absolute value between \(6h\sqrt{N}AB\) and \(6(h + 1)\sqrt{N}AB\) is less than

$$\displaystyle{ 2h(h + 1) \cdot 2e^{-6h}n\mathrm{\ \ for\ every\ }1 \leq h \leq \log n, }$$
(115)

and there is no row-sum with absolute value \(\geq 6\log n\sqrt{N}AB\). In the last step we used the fact that

$$\displaystyle{2h(h + 1) \cdot 2e^{-6h}n <1\mathrm{\ with\ }h = \lfloor \log n\rfloor }$$

(lower integral part).

Write (see (115))

$$\displaystyle{ k_{h} = \lfloor 2h(h + 1) \cdot 2e^{-6h}n\rfloor,\ \ 1 \leq h \leq \log n, }$$
(116)

and

$$\displaystyle{ k_{0} = n -\sum _{1\leq h\leq \log n}k_{h}. }$$
(117)
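To get a feeling for the sizes in (116) and (117), the following sketch computes the parameters \(k_{h}\) for a given n. It assumes that log denotes the natural logarithm and that h ranges over the integers up to ⌊log n⌋; the function name `bucket_sizes` is illustrative.

```python
import math


def bucket_sizes(n):
    """Return the list [k_0, k_1, ..., k_H] from (116)-(117), H = floor(log n)."""
    H = math.floor(math.log(n))
    k = [0] * (H + 1)
    for h in range(1, H + 1):
        k[h] = math.floor(2 * h * (h + 1) * 2 * math.exp(-6 * h) * n)
    k[0] = n - sum(k[1:])  # all remaining rows fall into the first bucket
    return k


k = bucket_sizes(10 ** 6)
print(k[:4], sum(k))
```

Because of the factor \(e^{-6h}\), almost all rows are counted by \(k_{0}\), and \(k_{h}\) vanishes already for small h.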

The total number of row-sum vectors (with n coordinates) satisfying (115) can be estimated from above by using the parameters \(k_{h}\) in (116)–(117) as follows:

$$\displaystyle{ \leq { n\choose k_{0}}\left (2 \cdot 6\sqrt{N}AB + 1\right )^{k_{0} }\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (2 \cdot 6(h + 1)\sqrt{N}AB + 1\right )^{k_{h} }\right ). }$$
(118)

On the other hand, “with probability greater than 1∕2” means more than

$$\displaystyle{{ 1 \over 2}(2B + 1)^{N} }$$
(119)

possible vectors \(\mathbf{v} \in \{-B,-B + 1,-B + 2,\ldots,B\}^{N}\).

So, if (119) is greater than or equal to (118), then the Pigeonhole Principle applies, and there exist two different vectors

$$\displaystyle{\mathbf{v}_{1},\mathbf{v}_{2} \in \{-B,-B + 1,-B + 2,\ldots,B\}^{N}}$$

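such that they generate the same row-sum vector, i.e., \(D\mathbf{v}_{1} = D\mathbf{v}_{2}\). Then the nonzero vector \(\mathbf{x} = \mathbf{v}_{1} -\mathbf{v}_{2}\) satisfies the homogeneous linear system \(D\mathbf{x} = \mathbf{0}\) with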

$$\displaystyle{ \max _{1\leq j\leq N}\vert x_{j}\vert \leq 2B. }$$
(120)
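The pigeonhole step can be made concrete for very small systems. The sketch below is a brute-force illustration only: it enumerates all vectors in \(\{-B,\ldots,B\}^{N}\) rather than the "more than half" selected by the probabilistic argument, and the matrix `D` and the name `pigeonhole_solution` are hypothetical. It stops at the first collision \(D\mathbf{v}_{1} = D\mathbf{v}_{2}\) and returns the difference.

```python
from itertools import product


def pigeonhole_solution(D, B):
    """Search {-B, ..., B}^N for two vectors with equal image under D and
    return their difference, a nonzero x with D x = 0 and max |x_j| <= 2B.
    Returns None if all images are distinct."""
    N = len(D[0])
    seen = {}  # image D v  ->  first vector v producing it
    for v in product(range(-B, B + 1), repeat=N):
        image = tuple(sum(d * x for d, x in zip(row, v)) for row in D)
        if image in seen:
            u = seen[image]
            return tuple(a - b for a, b in zip(v, u))
        seen[image] = v
    return None


# Hypothetical 2 x 6 system: 3^6 = 729 vectors but at most 19 * 17 = 323 images.
D = [[1, 2, -3, 1, 0, 2],
     [2, -1, 1, 0, 3, -1]]
x = pigeonhole_solution(D, B=1)
print(x, [sum(d * xi for d, xi in zip(row, x)) for row in D])
```

The enumeration is exponential in N, so this is only meant to illustrate the existence argument, not to compute solutions of realistic size.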

The rest is routine estimation. Clearly

$$\displaystyle{{n\choose k_{0}}\left (2 \cdot 6\sqrt{N}AB + 1\right )^{k_{0} }\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (2 \cdot 6(h + 1)\sqrt{N}AB + 1\right )^{k_{h} }\right ) \leq }$$
$$\displaystyle{\leq { n\choose k_{0}}\left (13\sqrt{N}AB\right )^{k_{0} }\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (13(h + 1)\sqrt{N}AB\right )^{k_{h} }\right ) =}$$
$$\displaystyle{ = \left (13\sqrt{N}AB\right )^{n}{n\choose k_{ 0}}\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (h + 1\right )^{k_{h} }\right ). }$$
(121)

By using \(s! \geq (s/e)^{s}\) and (116), we have

$$\displaystyle{{n\choose k_{0}}\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (h + 1\right )^{k_{h} }\right ) \leq { n\choose k_{0}}\prod _{1\leq h\leq \log n}\left (e^{6h}(h + 1)\right )^{k_{h} } \leq }$$
$$\displaystyle{\leq { n\choose k_{0}}\prod _{1\leq h\leq \log n}e^{7hk_{h} } ={ n\choose k_{0}}\exp \left (\sum _{1\leq h\leq \log n}7hk_{h}\right ) \leq }$$
$$\displaystyle{\leq { n\choose k_{0}}\exp \left (\sum _{1\leq h\leq \log n}7h \cdot 4h(h + 1)e^{-6h}n\right ) \leq }$$
$$\displaystyle{ \leq { n\choose k_{0}}\exp \left (28n\sum _{h=1}^{\infty }h^{2}(h + 1)e^{-6h}\right ) \leq 2^{n}e^{n/2}, }$$
(122)

where we used \(k_{h} \leq 4h(h + 1)e^{-6h}n\), \({n\choose k_{0}} \leq 2^{n}\) and \(28\sum _{h=1}^{\infty }h^{2}(h + 1)e^{-6h} <1/2\). Using (122) in (121), and noting that \(13 \cdot 2 \cdot e^{1/2} <70\), we have

$$\displaystyle{{n\choose k_{0}}\left (2 \cdot 6\sqrt{N}AB + 1\right )^{k_{0} }\prod _{1\leq h\leq \log n}\left ({n\choose k_{h}}\left (2 \cdot 6(h + 1)\sqrt{N}AB + 1\right )^{k_{h} }\right ) \leq }$$
$$\displaystyle{ \leq \left (13\sqrt{N}AB\right )^{n}2^{n}e^{n/2} <\left (70\sqrt{N}AB\right )^{n}. }$$
(123)
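The constants in (122) and (123) can be checked numerically. The sketch below sums the rapidly converging series \(\sum _{h\geq 1}h^{2}(h + 1)e^{-6h}\) and uses the bound \(k_{h} \leq 4h(h + 1)e^{-6h}n\) implied by (116), so that the exponent in (122) is at most 28n times the series. It then evaluates the resulting constant in front of \(\sqrt{N}AB\). The names `series_sum`, `exponent_per_row` and `constant` are illustrative assumptions.

```python
import math


def series_sum(terms=60):
    """Partial sum of sum_{h>=1} h^2 (h + 1) e^{-6h}; terms decay like e^{-6h},
    so 60 terms are far more than needed for double precision."""
    return sum(h * h * (h + 1) * math.exp(-6 * h) for h in range(1, terms + 1))


S = series_sum()
exponent_per_row = 28 * S  # from 7h * k_h with k_h <= 4 h (h + 1) e^{-6h} n
constant = 13 * 2 * math.exp(exponent_per_row)  # 13 from (121), 2 from binom(n, k_0) <= 2^n

print(f"series sum       = {S:.6f}")
print(f"28 * series sum  = {exponent_per_row:.4f}")
print(f"13 * 2 * e^(...) = {constant:.2f}")
```

The series is dominated by its first term \(2e^{-6}\), so the exponential factor contributes very little and the constant stays well below 70.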

Combining (118), (119) and (123), it suffices to guarantee the inequality

$$\displaystyle{{ 1 \over 2}(2B + 1)^{N} \geq \left (70\sqrt{N}AB\right )^{n}. }$$
(124)

Inequality (124) clearly holds with

$$\displaystyle{ 2B = \left \lfloor \left (70\sqrt{N}A\right )^{n/(N-n)}\right \rfloor, }$$
(125)

and combining (120) with (125), we conclude

$$\displaystyle{\max _{1\leq j\leq N}\vert x_{j}\vert \leq \left (70\sqrt{N}A\right )^{n/(N-n)},}$$

completing the proof of the Third Version of Siegel’s Lemma. □
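For small parameters, the sufficient condition (124) can be explored with exact integer arithmetic. The sketch below does not use the explicit choice (125); instead it searches for the smallest integer B ≥ 1 satisfying (124) and compares 2B with the bound \(\left (70\sqrt{N}A\right )^{n/(N-n)}\). The function names and the sample parameters n = 2, N = 4, A = 1 are illustrative assumptions.

```python
import math


def satisfies_124(B, n, N, A):
    """Check (1/2)(2B+1)^N >= (70 sqrt(N) A B)^n exactly by squaring both sides:
    (2B+1)^(2N) >= 4 * (4900 * N * A^2 * B^2)^n."""
    return (2 * B + 1) ** (2 * N) >= 4 * (4900 * N * A * A * B * B) ** n


def smallest_B(n, N, A):
    """Smallest integer B >= 1 satisfying (124); terminates because N > n."""
    if N <= n:
        raise ValueError("need more variables than equations (N > n)")
    B = 1
    while not satisfies_124(B, n, N, A):
        B += 1
    return B


n, N, A = 2, 4, 1
B = smallest_B(n, N, A)
bound = (70 * math.sqrt(N) * A) ** (n / (N - n))
print(f"smallest B = {B}, 2B = {2 * B}, bound = {bound:.2f}")
```

The linear search is only practical when the bound itself is small, for example when N ≥ 2n and A is modest.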

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Beck, J. (2017). Siegel’s Lemma Is Sharp. In: Loebl, M., Nešetřil, J., Thomas, R. (eds) A Journey Through Discrete Mathematics. Springer, Cham. https://doi.org/10.1007/978-3-319-44479-6_8
