
Rigorous Foundations for Dual Attacks in Coding Theory

Conference paper. Theory of Cryptography (TCC 2023).

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14372).

Abstract

Dual attacks aiming at decoding generic linear codes have recently been found to outperform, for certain parameter ranges, information set decoding techniques, which have been for 60 years the dominant tool for solving this problem and for choosing the parameters of code-based cryptosystems. However, the complexity analysis of these dual attacks relies on some unproven assumptions that are not even fully backed up by experimental evidence. These dual attacks can actually be viewed as the code-based analogue of dual attacks in lattice-based cryptography. There too, dual attacks have in recent years been found to be strong competitors to primal attacks, and a controversy has emerged over whether similar heuristics, for instance on the independence of certain random variables, really hold. We show that dual attacks in coding theory can be studied by first providing a simple alternative expression of the fundamental quantity used in these attacks. We then show that this expression can be analyzed without relying on any independence assumption whatsoever. This study leads us to discover that there is indeed a problem with the latest and most powerful dual attack proposed in [CDMT22]: for the parameters chosen in this algorithm, false candidates are produced which are not predicted by the analysis given there, which relies on independence assumptions. We then suggest a slight modification of this algorithm consisting in a further verification step, analyze it thoroughly, provide experimental evidence that our analysis is accurate, and show that the complexity claims made in [CDMT22] are indeed valid for this modified algorithm. This approach provides a simple methodology for studying dual attacks rigorously, which could turn out to be useful for further developing the subject.


Notes

  1. It is decreasing in the interval \([0, n/2-\sqrt{w(n-w)}]\), and even if there are fluctuations beyond this point (the polynomial has zeros), it behaves like \(\mathrm{poly}(n)\, \sin(\alpha)\, e^{\beta n}\) with an exponent \(\beta\) which is decreasing in \(t=|\mathbf{e}|\); Proposition 2 shows that there are nearby weights \(t'\) and \(w'\) for \(t\) and \(w\) respectively for which the exponential term captures the behavior of \(K_{w'}(t')\).

  2. https://github.com/tillich/RLPNdecoding/blob/master/supplementaryMaterial/RLPN_Dumer89.csv.

  3. https://github.com/meyer-hilfiger/Rigorous-Foundations-for-Dual-Attacks-in-Coding-Theory.

  4. https://github.com/meyer-hilfiger/Rigorous-Foundations-for-Dual-Attacks-in-Coding-Theory.

References

  1. Albrecht, M.R.: On dual lattice attacks against small-secret LWE and parameter choices in HElib and SEAL. In: Coron, J.-S., Nielsen, J.B. (eds.) EUROCRYPT 2017. LNCS, vol. 10211, pp. 103–129. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56614-6_4

  2. Barg, A.: Complexity issues in coding theory. Electronic Colloquium on Computational Complexity, October 1997

  3. Blinovsky, V., Erez, U., Litsyn, S.: Weight distribution moments of random linear/coset codes. Des. Codes Crypt. 57, 127–138 (2010)

  4. Becker, A., Joux, A., May, A., Meurer, A.: Decoding random binary linear codes in \(2^{n/20}\): how 1+1 = 0 improves information set decoding. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012. LNCS, vol. 7237, pp. 520–536. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-29011-4_31

  5. Bernstein, D.J., Lange, T., Peters, C.: Smaller decoding exponents: ball-collision decoding. In: Rogaway, P. (ed.) CRYPTO 2011. LNCS, vol. 6841, pp. 743–760. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22792-9_42

  6. Both, L., May, A.: Optimizing BJMM with nearest neighbors: full decoding in \(2^{2n/21}\) and McEliece security. In: WCC Workshop on Coding and Cryptography, September 2017

  7. Carrier, K., Debris-Alazard, T., Meyer-Hilfiger, C., Tillich, J.P.: Statistical decoding 2.0: reducing decoding to LPN. In: Agrawal, S., Lin, D. (eds.) ASIACRYPT 2022. LNCS, vol. 13794, pp. 477–507. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-22972-5_17

  8. Ducas, L., Pulles, L.N.: Does the dual-sieve attack on learning with errors even work? IACR Cryptol. ePrint Arch., p. 302 (2023)

  9. Debris-Alazard, T., Tillich, J.P.: Statistical decoding. In: Proceedings of the IEEE International Symposium on Information Theory - ISIT 2017, pp. 1798–1802, Aachen, Germany, June 2017

  10. Debris-Alazard, T., Tillich, J.P.: Statistical decoding. Preprint, January 2017. arXiv:1701.07416

  11. Dumer, I.: On syndrome decoding of linear codes. In: Proceedings of the 9th All-Union Symposium on Redundancy in Information Systems, abstracts of papers (in Russian), Part 2, pp. 157–159, Leningrad (1986)

  12. Dumer, I.: On minimum distance decoding of linear codes. In: Proceedings of the 5th Joint Soviet-Swedish International Workshop on Information Theory, pp. 50–52, Moscow (1991)

  13. Espitau, T., Joux, A., Kharchenko, N.: On a dual/hybrid approach to small secret LWE. In: Bhargavan, K., Oswald, E., Prabhakaran, M. (eds.) INDOCRYPT 2020. LNCS, vol. 12578, pp. 440–462. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-65277-7_20

  14. Finiasz, M., Sendrier, N.: Security bounds for the design of code-based cryptosystems. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 88–105. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10366-7_6

  15. Guo, Q., Johansson, T.: Faster dual lattice attacks for solving LWE with applications to CRYSTALS. In: Tibouchi, M., Wang, H. (eds.) ASIACRYPT 2021. LNCS, vol. 13093, pp. 33–62. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-92068-5_2

  16. Jabri, A.A.: A statistical decoding algorithm for general linear block codes. In: Honary, B. (ed.) Cryptography and Coding 2001. LNCS, vol. 2260, pp. 1–8. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45325-3_1

  17. Kirshner, N., Samorodnitsky, A.: A moment ratio bound for polynomials and some extremal properties of Krawchouk polynomials and Hamming spheres. IEEE Trans. Inform. Theory 67(6), 3509–3541 (2021)

  18. Linial, N., Mosheiff, J.: On the weight distribution of random binary linear codes. Random Struct. Algorithms 56, 5–36 (2019)

  19. MATZOV: Report on the security of LWE: improved dual lattice attack, April 2022

  20. May, A., Meurer, A., Thomae, E.: Decoding random linear codes in \(\tilde{\mathcal{O}}(2^{0.054n})\). In: Lee, D.H., Wang, X. (eds.) ASIACRYPT 2011. LNCS, vol. 7073, pp. 107–124. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25385-0_6

  21. May, A., Ozerov, I.: On computing nearest neighbors with applications to decoding of binary linear codes. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015. LNCS, vol. 9056, pp. 203–228. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46800-5_9

  22. Micciancio, D., Regev, O.: Lattice-based cryptography. In: Bernstein, D.J., Buchmann, J., Dahmen, E. (eds.) Post-Quantum Cryptography, pp. 147–191. Springer, Berlin, Heidelberg (2009). https://doi.org/10.1007/978-3-540-88702-7_5

  23. MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error-Correcting Codes, 5th edn. North-Holland, Amsterdam (1986)

  24. Overbeck, R.: Statistical decoding revisited. In: Batten, L.M., Safavi-Naini, R. (eds.) ACISP 2006. LNCS, vol. 4058, pp. 283–294. Springer, Heidelberg (2006). https://doi.org/10.1007/11780656_24

  25. Prange, E.: The use of information sets in decoding cyclic codes. IRE Trans. Inf. Theory 8(5), 5–9 (1962)

  26. Stern, J.: A method for finding codewords of small weight. In: Cohen, G., Wolfmann, J. (eds.) Coding Theory 1988. LNCS, vol. 388, pp. 106–113. Springer, Heidelberg (1989). https://doi.org/10.1007/BFb0019850

  27. van Lint, J.H.: Introduction to Coding Theory, 3rd edn. Graduate Texts in Mathematics. Springer, Berlin, Heidelberg (1999). https://doi.org/10.1007/978-3-642-58575-3


Acknowledgement

We would like to express our thanks to Thomas Debris-Alazard for insightful discussions. The work of C. Meyer-Hilfiger was funded by the French Agence de l’innovation de défense and by Inria. The work of J-P. Tillich was funded by the French Agence Nationale de la Recherche through the France 2030 ANR project ANR-22-PETQ-0008 PQ-TLS.

Author information

Correspondence to Charles Meyer-Hilfiger.

Appendices

A Complexity of [Dum86] to Compute Low-Weight Parity-Checks

We give here the asymptotic complexity of one of the methods devised in [CDMT22, §4.1] to produce low-weight parity-checks.

Proposition 7

(Asymptotic complexity of Dumer’s [Dum86] method to compute low-weight parity-checks). Let \( R \triangleq \lim_{n \rightarrow \infty} \frac{k}{n} \) and \( \omega \triangleq \lim_{n \rightarrow \infty} \frac{w}{n} \). The asymptotic time and space complexities of Dumer’s [Dum86] method to compute and store \(2^{n \left( h_2(\omega) - R + o(1) \right)}\) parity-checks are in \(2^{n (\alpha + o(1))}\), where \( \alpha \triangleq \max\left( \frac{h_2(\omega)}{2}, \, h_2(\omega) - R \right) \).
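For concreteness, the exponent of Proposition 7 is easy to evaluate numerically. The following is a minimal sketch (our own code, not from the paper); `h2` denotes the binary entropy function and the example values of \(R\) and \(\omega\) are arbitrary toy choices.

```python
from math import log2

def h2(x: float) -> float:
    """Binary entropy function h_2(x) (in bits); 0 at the endpoints."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def dumer_parity_check_exponent(R: float, omega: float) -> float:
    """Exponent alpha of Proposition 7: computing and storing
    2^(n(h2(omega)-R+o(1))) parity-checks costs 2^(n(alpha+o(1)))."""
    return max(h2(omega) / 2, h2(omega) - R)

# Toy example: rate-0.5 code, parity-checks of relative weight 0.15.
print(dumer_parity_check_exponent(0.5, 0.15))
```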

B [Dum91] ISD Decoder

Proposition 8

(Asymptotic time complexity of the ISD decoder of [Dum91]). Let \( R \triangleq \lim_{n \rightarrow \infty} \frac{k}{n} \), \( \tau \triangleq \lim_{n \rightarrow \infty} \frac{t}{n} \), and suppose that \(\tau \le h_2^{-1}(1-R)\). Let \(\ell\) and \(w\) be two (implicit) parameters of the algorithm and define \( \lambda \triangleq \lim_{n \rightarrow \infty} \frac{\ell}{n} \), \( \omega \triangleq \lim_{n \rightarrow \infty} \frac{w}{n} \). The time and space complexities of the [Dum91] decoder to decode a word at distance \(t\) in an \([n,k]\) linear code are given by \(2^{n(\alpha + o(1))}\) and \(2^{n(\beta + o(1))}\) respectively, where

$$\begin{aligned} \alpha &\triangleq \pi + \max\left( \frac{R+\lambda}{2} h_2\left( \frac{\omega}{R+\lambda} \right), \, (R+\lambda)\, h_2\left( \frac{\omega}{R+\lambda} \right) - \lambda \right), \qquad &(B.1) \\ \pi &\triangleq h_2(\tau) - (1-R-\lambda)\, h_2\left( \frac{\tau-\omega}{1-R-\lambda} \right) - (R+\lambda)\, h_2\left( \frac{\omega}{R+\lambda} \right), \qquad &(B.2) \\ \beta &\triangleq \frac{R+\lambda}{2} h_2\left( \frac{\omega}{R+\lambda} \right). \qquad &(B.3) \end{aligned}$$

Moreover, \(\lambda\) and \(\omega\) must verify the following constraints:

$$ 0 \le \lambda \le 1-R, \qquad \max(R+\lambda+\tau-1, \, 0) \le \omega \le \min(\tau, \, R+\lambda). $$
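The exponent of Proposition 8 has to be optimized over the free parameters \(\lambda\) and \(\omega\). Below is a coarse grid-search sketch (our own code, with toy inputs; a serious optimization would use a much finer search):

```python
from math import log2

def h2(x: float) -> float:
    """Binary entropy function; 0 at the endpoints."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def dumer_isd_time_exponent(R: float, tau: float, steps: int = 300) -> float:
    """Coarse grid search of the time exponent alpha of Proposition 8
    over the free parameters (lambda, omega)."""
    best = float("inf")
    for i in range(steps + 1):
        lam = (1 - R) * i / steps
        lo = max(R + lam + tau - 1.0, 0.0)
        hi = min(tau, R + lam)
        if lo > hi or R + lam <= 0.0:
            continue
        for j in range(steps + 1):
            om = lo + (hi - lo) * j / steps
            e1 = h2(om / (R + lam))            # entropy of the split on P
            rest = 1 - R - lam                 # remaining positions
            e2 = h2((tau - om) / rest) if rest > 1e-12 else 0.0
            pi = h2(tau) - rest * e2 - (R + lam) * e1            # (B.2)
            alpha = pi + max((R + lam) / 2 * e1,
                             (R + lam) * e1 - lam)               # (B.1)
            best = min(best, alpha)
    return best

# Toy example: rate 0.5, decoding at relative distance 0.11 (near GV).
print(round(dumer_isd_time_exponent(0.5, 0.11), 4))
```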

C Proofs and Results Corresponding to Section 5

C.1 Proof of Proposition 4

The aim of this subsection is to prove

Proposition 4

(Correctness). After at most \(N_{\mathrm{iter}} = \omega\left( \frac{\binom{n}{t}}{\binom{s}{t-u}\binom{n-s}{u}} \right)\) executions of the outer loop (Line 2) of Algorithm 4.1, the algorithm outputs, with probability \(1-o(1)\) over the choices of \(\mathscr{C}\) and \(\mathscr{H}\), an \(\mathbf{e} \in \mathbb{F}_2^n\) of weight \(t\) such that \(\mathbf{y} - \mathbf{e} \in \mathscr{C}\).

It is readily seen that when \(N_{\mathrm{iter}} = \omega\left( \frac{\binom{n}{t}}{\binom{s}{t-u}\binom{n-s}{u}} \right)\) there exists, with probability \(1-o(1)\), an iteration such that \(|\mathbf{e}_{\mathscr{N}}| = u\). Let us consider such an iteration and show that \(\mathbf{e}_{\mathscr{P}} \in \mathcal{S}\) with high probability. Recall that

$$\mathcal{S} \triangleq \left\{ \mathbf{x} \in \mathcal{S}_{t-u}^s \, : \, \widehat{f_{\mathbf{y},\mathscr{H}}}(\mathbf{x}) > \frac{\delta}{2} N \right\}$$

and that from Fact 7 we have

$$\widehat{f_{\mathbf{y},\mathscr{H}}}(\mathbf{x}) = N \, \mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}}\left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right).$$

Now, using the fact that

$$ \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{e}_{\mathscr{P}}, \mathbf{h}_{\mathscr{P}} \rangle = \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle, $$

to lower bound the probability that \(\mathbf{e}_{\mathscr{P}}\) belongs to \(\mathcal{S}\) we only have to lower bound the term

$$ \mathbb{P}_{\mathscr{C},\mathscr{H}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right) > \frac{\delta}{2} \right). $$

The only known results we have regarding the bias of \(\langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle\) are for \(\mathbf{h}\) sampled in \(\widetilde{\mathscr{H}}\) (see [CDMT22, Prop. 3.1] and Proposition 1). But, because \(\mathscr{H}\) is a random subset of \(N\) elements of the set \(\widetilde{\mathscr{H}}\) (where \(N\) is lower bounded in Parameter Constraint 6), the distribution of \(\mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right)\) is relatively close to the distribution of \(\mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right)\). Namely, we have:

Lemma 2

For any constant \(c > 0\):

$$ \mathbb{P}_{\mathscr{C},\mathscr{H}}\left( \left| \mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right) - \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right) \right| \ge \delta c \right) \le 2^{-\omega(n)}. $$

Proof

The proof is available in the eprint version of the paper.    \(\square \)

Thus, as a corollary, we get the following lower bound on our probability:

Corollary 2

We have that

$$ \mathbb{P}_{\mathscr{C},\mathscr{H}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right) > \frac{\delta}{2} \right) \ge \left( 1 - e^{-\omega(n)} \right) \mathbb{P}_{\mathscr{C}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right) > \frac{\delta}{1.5} \right). $$

Now, recall that we supposed that the iteration considered is such that \(|\mathbf{e}_{\mathscr{N}}| = u\). Moreover, from Condition (5.7) in Parameter Constraint 6 we have \(N = \omega\left( \frac{n}{\delta^2} \right)\). Thus, a direct application of [CDMT22, Prop. 3.1] gives us that, with probability \(1-o(1)\) over the choice of \(\mathscr{C}\), we have

$$\mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right) = \delta(1 + o(1)),$$

and thus

$$\mathbb{P}_{\mathscr{C}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{e}_{\mathscr{N}}, \mathbf{h}_{\mathscr{N}} \rangle \right) > \frac{\delta}{1.5} \right) = 1 - o(1).$$

This concludes the proof that \(\mathbf{e}_{\mathscr{P}}\) belongs to \(\mathcal{S}\) with probability \(1-o(1)\), which in turn proves the correctness statement of Proposition 4.
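As an aside, the concentration phenomenon behind Lemma 2 and Corollary 2 is easy to observe numerically. The following toy simulation (our own code, parameters arbitrary) subsamples \(N\) elements from a large \(\pm 1\) population of bias \(\delta\), playing the roles of \(\mathscr{H}\) and \(\widetilde{\mathscr{H}}\), and compares the empirical biases:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for Lemma 2 (all numbers are ours, for illustration only):
M = 1_000_000                     # size of the population, plays the role of H~
delta = 0.01                      # population bias
N = 50 * round(1 / delta**2)      # subsample size, of order omega(1/delta^2)

population = rng.choice([1.0, -1.0], size=M, p=[(1 + delta) / 2, (1 - delta) / 2])
subsample = rng.choice(population, size=N, replace=False)

print("population bias:", population.mean())   # ~ delta
print("subsample  bias:", subsample.mean())    # deviates from it by o(delta) w.h.p.
```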

C.2 A Simple Lower Bound

It turns out that we can easily compute a lower bound on the size of \(\mathcal{S}\) using Lemma 1 together with a slight adaptation of [CDMT22, Prop. 3.1]. Indeed, recall that in Lemma 1 we proved that for a parity-check \(\mathbf{h}\):

$$\begin{aligned} \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle &= \langle (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}}, \, \mathbf{h}_{\mathscr{N}} \rangle \qquad &(C.1) \\ &= \langle (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} + \mathbf{c}^{\mathscr{N}}, \, \mathbf{h}_{\mathscr{N}} \rangle, \quad \forall \mathbf{c}^{\mathscr{N}} \in \mathscr{C}^{\mathscr{N}}, \qquad &(C.2) \end{aligned}$$

where in the last line we used the fact that \(\mathbf{h}_{\mathscr{N}} \in \left( \mathscr{C}^{\mathscr{N}} \right)^{\perp}\). Thus, if there exists \(\mathbf{c}^{\mathscr{N}} \in \mathscr{C}^{\mathscr{N}}\) such that

$$ \left| (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} + \mathbf{c}^{\mathscr{N}} \right| \le u, \qquad (C.3) $$

then with high probability

$$ \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}}\left( \langle (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} + \mathbf{c}^{\mathscr{N}}, \, \mathbf{h}_{\mathscr{N}} \rangle \right) \ge \delta(1 + o(1)). $$

Indeed, from Parameter Constraint 6 we have \(N = \omega\left( \frac{n}{\delta^2} \right)\), and, because \(K_w^{(n-s)}\) is decreasing in this range, we can use a slight adaptation of [CDMT22, Prop. 3.1] to show this point. Thus, with high probability, \(\mathbf{x} \in \mathcal{S}\). We can lower bound the number of \(\mathbf{x}\) verifying Condition (C.3) by counting the number of words of weight at most \(u\) in the following code:

$$ \left\{ (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} + \mathbf{c}^{\mathscr{N}} \, : \, (\mathbf{x}, \mathbf{c}^{\mathscr{N}}) \in \mathcal{S}_{t-u}^s \times \mathscr{C}^{\mathscr{N}} \right\}. $$

This is a random code of length \(n-s\) and of size at most \(\left| \mathcal{S}_{t-u}^s \right| \left| \mathscr{C}^{\mathscr{N}} \right|\). Thus we expect that there are at most

$$ \widetilde{O}\left( \left| \mathcal{S}_{t-u}^s \right| \left| \mathscr{C}^{\mathscr{N}} \right| \frac{\binom{n-s}{u}}{2^{n-s}} \right) = \widetilde{O}\left( \binom{s}{t-u} \frac{\binom{n-s}{u}}{2^{n-k}} \right) $$

words of weight at most \(u\) in this code, giving us a lower bound on the expected size of \(\mathcal{S}\). This lower bound actually matches, up to polynomial factors, the upper bound appearing in Proposition 5.
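Numerically, the dominant term of this bound is straightforward to evaluate. A small sketch (our code; the parameters below are small illustrative values, not attack parameters) printing the base-2 logarithm of \(\binom{s}{t-u}\binom{n-s}{u}/2^{n-k}\):

```python
from math import comb, log2

def expected_candidates_log2(n: int, k: int, s: int, t: int, u: int) -> float:
    """log2 of binom(s,t-u) * binom(n-s,u) / 2^(n-k), the dominant term of
    the bound on the expected size of S; polynomial factors are dropped."""
    return log2(comb(s, t - u)) + log2(comb(n - s, u)) - (n - k)

print(expected_candidates_log2(n=100, k=50, s=20, t=14, u=4))
```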

C.3 Proof of Proposition 5

Let us first recall Proposition 5.

Proposition 5

We have under Assumption 8 that

$$ \mathbb{E}_{\mathscr{C},\mathscr{H}} \left( \left| \mathcal{S} \setminus \{\mathbf{e}_{\mathscr{P}}\} \right| \right) = \widetilde{O}\left( \binom{s}{t-u} \frac{\binom{n-s}{u}}{2^{n-k}} \right). \qquad (5.10) $$

To ease our analysis, we will suppose in the following that the predicate \(P(\widetilde{\mathscr{H}})\) defined as

$$ P(\widetilde{\mathscr{H}}) \, : \, \left| \widetilde{\mathscr{H}} \right| \ge \frac{\mathbb{E}_{\mathscr{C}}\left( \left| \widetilde{\mathscr{H}} \right| \right)}{2} \qquad (C.4) $$

is true, and we will only compute the value of \(\mathbb{E}_{\mathscr{C},\mathscr{H}}\left( \left| \mathcal{S} \right| \, \middle| \, P(\widetilde{\mathscr{H}}) \right)\). We can show using the Bienaymé-Tchebychev inequality that \(P(\widetilde{\mathscr{H}})\) is true with probability \(1 - o(1)\). However, contrary to the earlier supposition that \(\mathscr{C}_{\mathscr{P}}\) has full rank \(s\), in general we have no way of verifying in polynomial time whether \(P(\widetilde{\mathscr{H}})\) is true, so we cannot simply restart the algorithm from Line 2 when it is not. The strategy we adopt is to bound the complexity of each iteration of corrected RLPN regardless of the value of the predicate \(P(\widetilde{\mathscr{H}})\); this is done by discarding the iterations for which the size of \(\mathcal{S}\) is greater than a certain threshold (Line 7 of Algorithm 4.1). The correctness of our algorithm is not impacted, as \(P(\widetilde{\mathscr{H}})\) holds with probability \(1-o(1)\) and the threshold is chosen such that, when \(P(\widetilde{\mathscr{H}})\) holds, the set \(\mathcal{S}\) meets the threshold with probability \(1-o(1)\). More specifically, the threshold \(N_{\mathrm{candi}}^{\mathrm{max}}\) is chosen as

$$ n \, \mathbb{E}_{\mathscr{C},\mathscr{H}}\left( \left| \mathcal{S} \right| \, \middle| \, P(\widetilde{\mathscr{H}}) \right), $$

and we can show using Markov's inequality that this threshold is met with probability \(1-o(1)\) for the iterations such that \(P(\widetilde{\mathscr{H}})\) is true. Thus, in what follows we will only compute the value of \(\mathbb{E}_{\mathscr{C},\mathscr{H}}\left( \left| \mathcal{S} \right| \, \middle| \, P(\widetilde{\mathscr{H}}) \right)\) and, to simplify notation, we write it simply as \(\mathbb{E}_{\mathscr{C},\mathscr{H}}\left( \left| \mathcal{S} \right| \right)\). We are now ready to prove Proposition 5.

Step 1. By linearity of expectation, and since the probability that \(\mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) > \frac{\delta}{2}\) does not depend on \(\mathbf{x}\) as long as \(\mathbf{x} \ne \mathbf{e}_{\mathscr{P}}\), it is readily seen that we have

Fact 9

$$ \mathbb{E}_{\mathscr{C},\mathscr{H}} \left( \left| \mathcal{S} \setminus \{\mathbf{e}_{\mathscr{P}}\} \right| \right) \le \binom{s}{t-u} \mathbb{P}_{\mathscr{C},\mathscr{H}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) > \frac{\delta}{2} \right). $$

The only known results we have regarding the bias are for \(\mathbf{h}\) sampled in \(\widetilde{\mathscr{H}}\) (see [CDMT22, Prop. 3.1] and Proposition 1). But we have the following slight generalization of Lemma 2, which essentially tells us that the distribution of the bias when \(\mathbf{h}\) is sampled in \(\mathscr{H}\) is close to that when \(\mathbf{h}\) is sampled in \(\widetilde{\mathscr{H}}\).

Lemma 3

For any constant \(c > 0\):

$$ \mathbb{P}_{\mathscr{C},\mathscr{H}}\left( \left| \mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) - \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) \right| \ge \delta c \right) \le 2^{-\omega(n)}. $$

As a direct corollary, we get the following bound:

Corollary 3

We have that

$$ \mathbb{P}_{\mathscr{C},\mathscr{H}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) > \frac{\delta}{2} \right) = O\left( \mathbb{P}_{\mathscr{C}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) > \frac{\delta}{4} \right) \right). $$

Step 2. We are thus now interested in upper bounding \(\mathbb{P}_{\mathscr{C}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) > \frac{\delta}{4} \right)\). A first step is given in Lemma 4 below. This lemma uses the fact that, by Proposition 1, we can write the former bias as

$$ \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) = \frac{1}{2^{k-s} \left| \widetilde{\mathscr{H}} \right|} \sum_{i=0}^{n-s} N_i\left( \mathscr{C}^{\mathscr{N}} + (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} \right) K_w^{(n-s)}(i), $$

where \(N_i(\cdot)\) denotes the number of words of weight \(i\) in a code, namely:

$$ N_i\left( \mathscr{C}^{\mathscr{N}} + (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} \right) \triangleq \left| \left( \mathscr{C}^{\mathscr{N}} + (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} \right) \bigcap \mathcal{S}_i^{n-s} \right|. $$

We define, for simplicity:

Notation 10

$$\begin{aligned} N_i &\triangleq N_i\left( \mathscr{C}^{\mathscr{N}} + (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} \right), \qquad &(C.5) \\ \bar{N_i} &\triangleq \mathbb{E}_{\mathscr{C}}\left( N_i \right). \qquad &(C.6) \end{aligned}$$

Recall that \(\mathscr{C}^{\mathscr{N}} + (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}}\) is a coset of the \([n-s, k-s]\) linear code \(\mathscr{C}^{\mathscr{N}}\), thus it is readily seen that we have

Fact 11

[Bar97, Lem. 1.1, §1.3]

$$\begin{aligned} \bar{N_i} &= \frac{\binom{n-s}{i}}{2^{n-k}}, \qquad &(C.7) \\ \mathbf{Var}_{\mathscr{C}}\left( N_i \right) &\le \frac{\binom{n-s}{i}}{2^{n-k}}. \qquad &(C.8) \end{aligned}$$

The following lemma essentially says that, to bound the tail distribution of the bias, it suffices to study the dominant term in the previous sum. The key trick is to use the orthogonality of Krawtchouk polynomials with respect to the measure \(\mu(i) = \binom{n-s}{i}\), which allows us to replace \(N_i\) by the centered quantity \(N_i - \bar{N_i}\) in our expressions.
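Krawtchouk polynomials and this orthogonality relation are central to the next proof; here is a short self-contained sketch (our own code) evaluating \(K_w^{(n)}\) directly from its defining sum and checking numerically the identity \(\sum_i \binom{n}{i} K_w^{(n)}(i) = 0\) used below:

```python
from math import comb

def krawtchouk(n: int, w: int, x: int) -> int:
    """Binary Krawtchouk polynomial K_w^{(n)}(x) = sum_j (-1)^j C(x,j) C(n-x,w-j)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, w - j) for j in range(w + 1))

# Orthogonality against the constant polynomial K_0 = 1 with respect to the
# measure mu(i) = C(n,i): the sum vanishes for every w >= 1.
n = 30
for w in range(1, 6):
    assert sum(comb(n, i) * krawtchouk(n, w, i) for i in range(n + 1)) == 0
print("orthogonality identity verified for n =", n)
```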

Lemma 4

We have that

$$ \mathbb{P}_{\mathscr{C}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) > \frac{\delta}{4} \right) \le n \max_{i = 0, \dots, n-s} \mathbb{P}_{\mathscr{C}}\left( \left| N_i - \bar{N_i} \right| > \left| \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right| \frac{1}{8(n-s+1)} \right). $$

Proof

By Proposition 1 we derive that

$$ \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) = \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \sum_{i=0}^{n-s} N_i \, K_w^{(n-s)}(i). $$

From the orthogonality of Krawtchouk polynomials relative to the measure \(\mu(i) = \binom{n-s}{i}\) [MS86, Ch. 5, §7, Theorem 16] we have

$$ \sum_{i=0}^{n-s} \binom{n-s}{i} K_w^{(n-s)}(i) = 0, $$

and thus, together with Fact 11,

$$ \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \sum_{i=0}^{n-s} \bar{N_i} \, K_w^{(n-s)}(i) = 0. $$

Therefore,

$$ \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \sum_{i=0}^{n-s} N_i \, K_w^{(n-s)}(i) = \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \sum_{i=0}^{n-s} \left( N_i - \bar{N_i} \right) K_w^{(n-s)}(i). \qquad (C.9) $$

Moreover, the event

$$ \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \sum_{i=0}^{n-s} \left( N_i - \bar{N_i} \right) K_w^{(n-s)}(i) > \frac{\delta}{4} $$

implies that there exists \(i \in \llbracket 0, n-s \rrbracket\) such that

$$ \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \left( N_i - \bar{N_i} \right) K_w^{(n-s)}(i) > \frac{\delta}{4(n-s+1)}. \qquad (C.10) $$

Thus we get:

$$\begin{aligned} &\mathbb{P}_{\mathscr{C}}\left( \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \sum_{i=0}^{n-s} \left( N_i - \bar{N_i} \right) K_w^{(n-s)}(i) > \frac{\delta}{4} \right) \\ &\quad\le \mathbb{P}_{\mathscr{C}}\left( \bigvee_{i=0}^{n-s} \left( \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \left( N_i - \bar{N_i} \right) K_w^{(n-s)}(i) > \frac{\delta}{4(n-s+1)} \right) \right) \quad \text{(using (C.10))} \\ &\quad\le \sum_{i=0}^{n-s} \mathbb{P}_{\mathscr{C}}\left( \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \left( N_i - \bar{N_i} \right) K_w^{(n-s)}(i) > \frac{\delta}{4(n-s+1)} \right) \quad \text{(union bound)} \\ &\quad\le (n-s+1) \max_{i = 0, \dots, n-s} \mathbb{P}_{\mathscr{C}}\left( \frac{1}{2^{k-s} |\widetilde{\mathscr{H}}|} \left( N_i - \bar{N_i} \right) K_w^{(n-s)}(i) > \frac{\delta}{4(n-s+1)} \right) \\ &\quad\le (n-s+1) \max_{i = 0, \dots, n-s} \mathbb{P}_{\mathscr{C}}\left( \left| N_i - \bar{N_i} \right| > \left| \frac{\delta \, 2^{k-s} \, |\widetilde{\mathscr{H}}|}{K_w^{(n-s)}(i)} \right| \frac{1}{4(n-s+1)} \right). \end{aligned}$$

Now we get our result using the fact that \(\delta = \frac{K_w^{(n-s)}(u)}{\binom{n-s}{w}}\) (Equation (5.3) of Parameter Constraint 6), together with the fact that we supposed in (C.4) that \(|\widetilde{\mathscr{H}}| \ge \frac{1}{2} \mathbb{E}_{\mathscr{C}}\left( |\widetilde{\mathscr{H}}| \right)\) and that from Fact 5 we have \(\mathbb{E}_{\mathscr{C}}\left( |\widetilde{\mathscr{H}}| \right) = \frac{\binom{n-s}{w}}{2^{k-s}}\).    \(\square\)

Step 3. In this step we want to upper bound

$$ \mathbb{P}_{\mathscr{C}}\left( \left| N_i - \bar{N_i} \right| > \left| \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right| \frac{1}{8(n-s+1)} \right). $$

First we give a useful lemma, which essentially tells us that the right-hand term in this probability is always greater than \(\sqrt{\bar{N_i}}\), and hence than \(\sqrt{\mathbf{Var}_{\mathscr{C}}(N_i)}\).

Lemma 5

We have that

$$ \bar{N_i} \, f(n) < \left( \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right)^2. $$

Proof

The proof is available in the eprint version of the paper.    \(\square \)

We want to obtain an exponential bound on the previous probability

$$ \mathbb{P}_{\mathscr{C}}\left( \left| N_i - \bar{N_i} \right| > \left| \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right| \frac{1}{8(n-s+1)} \right). $$

But very few results are known about the \(N_i\)'s, or, more generally, about the weight distribution of a random affine code. The first two moments of \(N_i\) are known by Fact 11. Some higher moments are studied in [LM19, BEL10] but, in general, there are no known expressions for all the higher moments of the weight distribution of a random linear code. Furthermore, to our knowledge, no exponential tail bound on \(N_i - \bar{N_i}\) is known. Thus, we are left to bound the previous probability using only the expected value and the variance of \(N_i\), by the Bienaymé-Tchebychev inequality (the best bound one can get for a generic random variable using only its first two moments), which is given by:

$$ \mathbb{P}_{\mathscr{C}}\left( \left| N_i - \bar{N_i} \right| > p(n) \sqrt{\mathbf{Var}(N_i)} \right) \le \frac{1}{p(n)^2}. $$

Now the problem is that Lemma 5 is tight for some \(i \approx \frac{n-s}{2}\); namely, we have

$$ \bar{N_i} \, f(n) = \mathrm{poly}(n) \left( \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right)^2. \qquad (C.11) $$

Thus, if \(f\) is big enough, we have

$$ \left| \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right| \frac{1}{8(n-s+1)} = \mathrm{poly}(n) \sqrt{\mathbf{Var}(N_i)}. $$

As a result, we can only get a polynomial bound from the Bienaymé-Tchebychev inequality. That Equation (C.11) holds can be seen from the fact that Krawtchouk polynomials attain their \(\ell_2\) norm (with respect to the measure \(\mu(i) = \frac{\binom{n-s}{i}}{2^{n-s}}\)) up to a polynomial factor for certain \(i\) close to \((n-s)/2\) [KS21, Prop. 2.15]; namely, we have

$$ \left( K_w^{(n-s)}(i) \right)^2 \binom{n-s}{i} = \mathrm{poly}(n) \binom{n-s}{w} 2^{n-s}. $$

All in all, to be able to use tighter bounds, we choose in Assumption 8 to model the weight distributions \(N_i\) as Poisson variables of parameters

$$ \mathbb{E}(N_i) = \frac{\binom{n-s}{i}}{2^{n-k}}. $$

We ran extensive experiments showing that the distribution of the bias remains unchanged if we replace the weight distribution \(N_i\) by this model; see Sect. D for the experimental results. We are now ready to give the two following tail bounds for \(N_i - \bar{N_i}\). We first give a bound for the \(i\)'s that are relatively small compared to \(u\), namely when \(i < u + O(\log n)\) (which corresponds to the case \(\mathrm{poly}(n) \, K_w^{(n-s)}(u) < K_w^{(n-s)}(i)\)). Then, we prove a second bound, using our model, for the \(i\)'s that are relatively big compared to \(u\) (which corresponds to the case \(\mathrm{poly}(n) \, K_w^{(n-s)}(u) > K_w^{(n-s)}(i)\)).
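The gap between the two regimes can be illustrated by a quick numeric comparison (toy numbers ours): for a rare weight class, the exact Poisson tail is exponentially smaller than what Bienaymé-Tchebychev guarantees from the first two moments alone.

```python
from math import exp, factorial

def poisson_tail(mu: float, x: int) -> float:
    """P(N >= x) for N ~ Poisson(mu), by direct summation."""
    return 1.0 - sum(exp(-mu) * mu**j / factorial(j) for j in range(x))

mu, x = 0.01, 5                  # mean of a rare weight class, deviation threshold
chebyshev = mu / (x - mu) ** 2   # Var(N) = mu for a Poisson variable
print("Bienayme-Tchebychev bound:", chebyshev)          # ~ 4e-4, polynomially small
print("exact Poisson tail       :", poisson_tail(mu, x))  # ~ 8e-13
```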

Lemma 6

Define \(\epsilon \triangleq 1/4\). We have:

$$ \text{If} \quad \left| \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right| \le n^{2+\epsilon} \quad \text{then} \quad \mathbb{P}_{\mathscr{C}}\left( \left| N_i - \bar{N_i} \right| > \left| \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right| \frac{1}{8(n-s+1)} \right) \le \frac{\binom{n-s}{i}}{2^{n-k}}. \qquad (C.12) $$

Moreover, under Assumption 8 we have:

$$ \text{If} \quad \left| \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right| > n^{2+\epsilon} \quad \text{then} \quad \mathbb{P}_{\mathscr{C}}\left( \left| N_i - \bar{N_i} \right| > \left| \frac{K_w^{(n-s)}(u)}{K_w^{(n-s)}(i)} \right| \frac{1}{8(n-s+1)} \right) \le 2^{-\omega(n)}. \qquad (C.13) $$

Proof

The proof is available in the eprint version of the paper.    \(\square\)

Lemma 7

We have under Assumption 8 that:

$$ \mathbb{P}_{\mathscr{C}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}} \left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) > \frac{\delta}{4} \right) \le \widetilde{O}\left( \frac{\binom{n-s}{u}}{2^{n-k}} \right). \qquad (C.14) $$

Proof

The proof is available in the eprint version of the paper.    \(\square \)

Thus, Fact 9 together with Lemma 7 proves Proposition 5, which gives the expected size of \(\mathcal{S}\).

C.4 Further Results on the Complexity of Algorithm 4.1

Let us give here the complexity of Algorithm 4.1 when the support is split in two parts \({\mathscr {P}}\) and \({\mathscr {N}}\).

Proposition 9

The expected space and time complexities of the “2-split” version of Algorithm 4.1 to decode an \([n,k]\) linear code at distance \(t\) are given by

$$\begin{aligned} &\textbf{Time: } \;\; \widetilde{O}\left( \frac{T_{\mathrm{eq}} + T_{\mathrm{FFT}}}{P_{\mathrm{succ}}} \right) + \widetilde{O}\left( \max\left( 1, \frac{S}{P_{\mathrm{succ}}} \right) T'(n-s, k-s, u) \right), \\ &\textbf{Space: } \;\; O\left( S_{\mathrm{eq}} + 2^s + S' \right), \end{aligned}$$

where

  • \(P_{\mathrm{succ}} = \frac{\binom{s}{t-u}\binom{n-s}{u}}{\binom{n}{t}}\) is the probability over \(\mathscr{N}\) that \(|\mathbf{e}_{\mathscr{N}}| = u\),

  • \(S_{\mathrm{eq}}, T_{\mathrm{eq}}\) are respectively the space and time complexities of the routine used to compute \(N\) parity-checks of weight \(w\) on \(\mathscr{N}\),

  • \(T_{\mathrm{FFT}} = O(2^s)\) is the time complexity of the fast Fourier transform,

  • \(S = \widetilde{O}\left( \binom{s}{t-u} \frac{\binom{n-s}{u}}{2^{n-k}} \right)\) is the average number of candidates in the set \(\mathcal{S}\),

  • \(S', T'\) are respectively the space and time complexities of the routine used to decode an \([n-s, k-s]\) code at distance \(u\),

and where \(N\), the number of LPN samples, and the parameters \(s\), \(u\) and \(w\) are such that (see the sketch after this list):

$$ N < \frac{\binom{n-s}{w}}{2^{k-s}} \quad \text{and} \quad N = \omega\left( n^5 \left( \frac{\binom{n-s}{w}}{K_w^{(n-s)}(u)} \right)^2 \right). $$
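These two constraints are easy to evaluate for concrete parameters. Below is a toy checker (our own code; the \(\omega(\cdot)\) is asymptotic, so we merely print both sides of the window that \(\log_2 N\) must fall in, and at such small sizes the window may well be empty):

```python
from math import comb, log2

def krawtchouk(n: int, w: int, x: int) -> int:
    """Binary Krawtchouk polynomial K_w^{(n)}(x)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, w - j) for j in range(w + 1))

def log2_window_for_N(n: int, k: int, s: int, u: int, w: int):
    """Return (lower, upper): roughly, we need lower < log2(N) < upper."""
    upper = log2(comb(n - s, w)) - (k - s)
    lower = 5 * log2(n) + 2 * (log2(comb(n - s, w))
                               - log2(abs(krawtchouk(n - s, w, u))))
    return lower, upper

# Small illustrative parameters (ours), not genuine attack parameters.
print(log2_window_for_N(n=60, k=30, s=10, u=5, w=6))
```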

Notice that the only change in the complexity of Algorithm 4.1, compared to the original RLPN algorithm (Proposition 3.10 of [CDMT22]), is the added term

$$ \widetilde{O}\left( \max\left( 1, \frac{S}{P_{\mathrm{succ}}} \right) T'(n-s, k-s, u) \right). $$

We now give the asymptotic complexity of the corrected RLPN algorithm. Note that the techniques used to compute low-weight parity-checks [Dum86, CDMT22] in fact compute all the parity-checks of weight \(w\) on \(\mathscr{N}\). Thus, to simplify the next proposition, we simply replace the parameter \(N\) by the expected number of parity-checks of weight \(w\) on \(\mathscr{N}\), that is \(\frac{\binom{n-s}{w}}{2^{k-s}}\).

Proposition 10

Define

$$ R \triangleq \lim_{n \rightarrow \infty} \frac{k}{n}, \quad \tau \triangleq \lim_{n \rightarrow \infty} \frac{t}{n}, \quad \sigma \triangleq \lim_{n \rightarrow \infty} \frac{s}{n}, \quad \omega \triangleq \lim_{n \rightarrow \infty} \frac{w}{n}, \quad \mu \triangleq \lim_{n \rightarrow \infty} \frac{u}{n}. $$

The number of LPN samples \(N\) is chosen to be equal to the total number of parity-checks of weight \(w\) on \(\mathscr{N}\), namely \(N = 2^{n(\nu_{\mathrm{eq}} + o(1))}\) where \(\nu_{\mathrm{eq}} \triangleq (1-\sigma)\, h_2\left( \frac{\omega}{1-\sigma} \right) - (R - \sigma)\).

Then, the time complexity of the RLPN decoder to decode an \([n,k]\) linear code at distance \(t\) is given by \(2^{n(\alpha + o(1))}\) and the space complexity is \(2^{n(\alpha_{\mathrm{space}} + o(1))}\), where

[the expressions for the exponents \(\alpha\) and \(\alpha_{\mathrm{space}}\) are given in the eprint version of the paper]

And where

  • the time complexity of the routine used to compute all parity-checks of relative weight \(\tau'\) of a code of rate \(R'\) and length \(n'\) is given by \(2^{n' \gamma(R', \tau')}\), and its space complexity by \(2^{n' \gamma_{\mathrm{space}}(R', \tau')}\);

  • the time complexity of the routine used to decode a code of rate \(R'\) and length \(n'\) at relative distance \(\tau'\) is given by \(2^{n' \alpha'(R', \tau')}\), and its space complexity by \(2^{n' \alpha'_{\mathrm{space}}(R', \tau')}\).

Moreover, \(\sigma\), \(\mu\) and \(\omega\) are non-negative and such that

$$ \sigma \le R, \quad \tau - \sigma \le \mu \le \tau, \quad \omega \le 1 - \sigma, $$
$$ \nu_{\mathrm{eq}} \ge 2\, (1-\sigma) \left( h_2\left( \frac{\omega}{1-\sigma} \right) - \widetilde{\kappa}\left( \frac{\omega}{1-\sigma}, \frac{\mu}{1-\sigma} \right) \right), $$

where \(\widetilde{\kappa}\) is the function defined in Proposition 2.

The only added term here, compared to the original RLPN asymptotic complexity exponent, is

$$ \max(\chi + \pi, \, 0) + (1-\sigma)\, \alpha'\left( \frac{R-\sigma}{1-\sigma}, \frac{\mu}{1-\sigma} \right). $$

For simplicity, we choose here the [Dum91] ISD decoder as the routine solving the relevant decoding problem; thus, we simply replace \(\alpha'\) in Proposition 10 by the asymptotic complexity of the [Dum91] ISD decoder given in Proposition 8 of Sect. B. One could use [Dum86] to compute parity-checks, and thus replace \(\gamma\) by the exponent given in Proposition 7, or use more involved methods as described in [CDMT22, §5].

Fig. 1. Expected size of the set \(\{ \mathbf{x} \in \mathcal{S}_{t-u}^s \, : \, \widehat{f_{\mathbf{y},\mathscr{H}}}(\mathbf{x}) \ge T \}\) as a function of \(T\), for two different parameters. Three curves are compared: (i) experimentally in corrected RLPN; (ii) theoretically under the independent Poisson model (D.1); (iii) theoretically under the LPN model, that is, supposing that the LPN samples produced by RLPN follow exactly the framework of Problem 2. We define \(\widehat{f}(GV_1) \triangleq N - 2\, d_{\mathrm{GV}}\left( N, \log_2 \binom{s}{t-u} \right)\), corresponding to the highest theoretical Fourier coefficient under the LPN model.
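For reference, \(\widehat{f}(GV_1)\) can be computed as follows. This is our own sketch, in which we assume \(d_{\mathrm{GV}}(n,k)\) is the usual Gilbert-Varshamov distance, i.e. the largest \(d\) with \(\sum_{i<d} \binom{n}{i} \le 2^{n-k}\) (toy parameters ours):

```python
from math import comb, log2

def d_gv(n: int, k: float) -> int:
    """Gilbert-Varshamov distance: largest d with sum_{i<d} C(n,i) <= 2^(n-k).
    We assume this is the convention intended in the caption."""
    total, d = 0, 0
    while d <= n and log2(total + comb(n, d)) <= n - k:
        total += comb(n, d)
        d += 1
    return d

def f_hat_gv1(N: int, s: int, t: int, u: int) -> float:
    """N - 2 d_GV(N, log2 C(s, t-u)): the highest theoretical Fourier
    coefficient under the LPN model, as defined in the caption above."""
    return N - 2 * d_gv(N, log2(comb(s, t - u)))

print(f_hat_gv1(N=2048, s=20, t=14, u=4))
```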

D Experimental Results Regarding the Poisson Model

In this section we give experimental evidence for our claims that rely on the Poisson Model 8. Specifically, we show that the experimental distribution of the bias of \(\langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle\) coincides with the theoretical distribution of the random variable \(X\) obtained by replacing the \(N_i\)'s in Proposition 3,

$$ \mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}}\left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) = \frac{1}{2^{k-s} \left| \widetilde{\mathscr{H}} \right|} \sum_{i=0}^{n-s} N_i\left( \mathscr{C}^{\mathscr{N}} + (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} \right) K_w^{(n-s)}(i), $$

by independent Poisson variables \(\widetilde{N_i}\) of parameter \(\mathbb{E}_{\mathscr{C}}\left( N_i\left( \mathscr{C}^{\mathscr{N}} + (\mathbf{x} + \mathbf{e}_{\mathscr{P}})\mathbf{R} + \mathbf{e}_{\mathscr{N}} \right) \right) = \frac{\binom{n-s}{i}}{2^{n-k}}\). That is,

$$ X \triangleq \frac{1}{2^{k-s} \left| \widetilde{\mathscr{H}} \right|} \sum_{i=0}^{n-s} \widetilde{N_i} \, K_w^{(n-s)}(i). \qquad (D.1) $$

The independence assumption on the \(\widetilde{N_i}\)'s is made only so that the distribution of \(X\) can be computed numerically.

Now, as far as we are aware, there is no simple way to derive a closed expression for the probability distribution of \(\mathrm{bias}_{\mathbf{h} \hookleftarrow \widetilde{\mathscr{H}}}\left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right)\) and \(\mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}}\left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right)\) under this model. As such, we simply computed it experimentally using a Monte Carlo method.
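A minimal sketch of such a Monte Carlo computation under the independent Poisson model (our own code, toy parameters; following Fact 5, \(|\widetilde{\mathscr{H}}|\) is replaced by its expectation \(\binom{n-s}{w}/2^{k-s}\), so the normalization \(2^{k-s}|\widetilde{\mathscr{H}}|\) becomes \(\binom{n-s}{w}\)):

```python
from math import comb
import numpy as np

def krawtchouk(n: int, w: int, x: int) -> int:
    """Binary Krawtchouk polynomial K_w^{(n)}(x)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, w - j) for j in range(w + 1))

n, k, s, w, u = 60, 30, 10, 6, 5          # toy parameters (ours)
rng = np.random.default_rng(1)

lam = np.array([comb(n - s, i) / 2 ** (n - k) for i in range(n - s + 1)])
K = np.array([krawtchouk(n - s, w, i) for i in range(n - s + 1)], dtype=float)

# 100000 samples of X from (D.1): independent Poisson weight distributions,
# combined with the Krawtchouk weights and normalized by C(n-s, w).
X = rng.poisson(lam=lam, size=(100_000, n - s + 1)) @ K / comb(n - s, w)

delta = krawtchouk(n - s, w, u) / comb(n - s, w)
print("delta           :", delta)
print("P(X > delta / 4):", (X > delta / 4).mean())
```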

As a side note, so that our quantities can be intuitively interpreted as outputs of the corrected RLPN algorithm, we in fact compare the distribution of the bias normalized by a factor \(\binom{s}{t-u}\); indeed, we have:

Fact 12

If \(\mathbf{y}\) is a random word of \(\mathbb{F}_2^n\), then for any real \(T\) we have

$$ \mathbb{E}_{\mathscr{C}}\left( \left| \left\{ \mathbf{x} \in \mathcal{S}_{t-u}^s \, : \, \widehat{f_{\mathbf{y},\mathscr{H}}}(\mathbf{x}) \ge T \right\} \right| \right) = \binom{s}{t-u} \, \mathbb{P}_{\mathscr{C},\mathscr{H}}\left( \mathrm{bias}_{\mathbf{h} \hookleftarrow \mathscr{H}}\left( \langle \mathbf{y}, \mathbf{h} \rangle + \langle \mathbf{x}, \mathbf{h}_{\mathscr{P}} \rangle \right) \ge T \right). $$

Some experimental results are summarized in Fig. 1. More figures can be found on the supplementary material GitHub page (see Footnote 4).

Note that the parameters considered are such that we obtain unusually large Fourier coefficients compared to what we would expect if the original LPN model of [CDMT22] were to hold. Indeed, if the LPN model of [CDMT22] held, the full red curve would roughly match the green dash-dotted one. However, as [CDMT22, §3.4] already noticed, and as we also observe here, this is not the case. The dashed blue curve represents the tail coefficients given by Fact 12 under the independent Poisson model; we see that our model predicts the experimental distribution of the Fourier coefficients very well.

Copyright information

© 2023 International Association for Cryptologic Research

Cite this paper

Meyer-Hilfiger, C., Tillich, JP. (2023). Rigorous Foundations for Dual Attacks in Coding Theory. In: Rothblum, G., Wee, H. (eds) Theory of Cryptography. TCC 2023. Lecture Notes in Computer Science, vol 14372. Springer, Cham. https://doi.org/10.1007/978-3-031-48624-1_1

  • DOI: https://doi.org/10.1007/978-3-031-48624-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48623-4

  • Online ISBN: 978-3-031-48624-1