Quantum-safe cryptography: crossroads of coding theory and cryptography

We present an overview of quantum-safe cryptography (QSC) with a focus on post-quantum cryptography (PQC) and information-theoretic security. From a cryptographic point of view, lattice- and code-based schemes are among the most promising PQC solutions. Both approaches are based on the hardness of decoding problems of linear codes with different metrics. From an information-theoretic point of view, lattices and linear codes can be constructed to achieve certain secrecy quantities for wiretap channels, which is intrinsically both classical- and quantum-safe. Historically, coding theory and cryptography have been intimately connected since Shannon's pioneering studies, but the two areas later diverged. QSC offers an opportunity to rebuild the synergy of the two areas, hopefully leading to further development beyond the NIST PQC standardization process. In this paper, we provide a survey of lattice and code designs that are believed to be quantum-safe in the areas of cryptography and coding theory. The interplay and similarities between the two areas are discussed. We also summarize our understanding of, and prospects for, future research after the NIST PQC standardization process.


Introduction
Cryptography is a subject of long history, playing a crucial role in human society ever since the dawn of civilization. Claude Shannon's 1949 paper "Communication theory of secrecy systems", based on his classified 1945 memorandum "A mathematical theory of cryptography", is widely regarded as the beginning of modern cryptography as a science. A year earlier, in 1948, Shannon had published another landmark paper, "A mathematical theory of communication", founding the area of information theory. As noted by Shannon himself, cryptography and information theory are very close together. Cryptography is the science of concealing information, while information theory is primarily concerned with transmitting information. Thus the two subjects can benefit from each other. In fact, the insights Shannon obtained from the study of cryptography were key to his development of information theory.
The two subjects share many common concepts and formulations, and it seemed they were to live happily ever after, until the rise of public-key cryptography. In 1976, Diffie and Hellman published the famous paper "New directions in cryptography" in IEEE Transactions on Information Theory. Ironically, this paper triggered the decline of the influence of information theory on cryptography, whereas cryptography as a stand-alone subject has grown tremendously in the past few decades. Nevertheless, decoding problems remain at the heart of post-quantum proposals. Consider the noisy linear system y = Ax + e, where A is a (public) matrix, x is a vector and e is noise; the problem is to recover x. This is also known as learning with errors (LWE) in lattice-based cryptography. Since maximum-likelihood decoding is computationally hard for random A, y is pseudorandom and can be used to hide a message m. Therefore, the ciphertext is given by y′ = Ax + e + Enc(m), where Enc(·) represents an error-correcting code to cope with noise, and informally, x can be thought of as a secret key. A legitimate user with access to the key will be able to decode the message m after subtracting out Ax from y′, but an adversary without access to the key cannot distinguish the ciphertext y′ from a random string. Note the two-fold role of noise here: it is added to hide the message, but it also causes decoding errors, prompting the use of error-correcting codes. Another popular problem in code- and lattice-based cryptography is syndrome decoding: upon receiving a vector y, one computes the syndrome s = Ay; the problem is to find a low-weight vector x satisfying the syndrome equation Ax = s. This is also known as the inhomogeneous short integer solution (ISIS) problem in lattice-based cryptography. Clearly, LWE and ISIS are inspired by their counterparts in coding theory.
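As a toy numerical illustration of these two decoding problems (a sketch with insecure parameters; all names here are ours, not from any standard):

```python
import numpy as np

rng = np.random.default_rng(0)
q, n, m = 97, 4, 8            # toy parameters, far too small for any security

# LWE / noisy decoding: y = A x + e (mod q); recovering x is hard for large n
A = rng.integers(0, q, size=(m, n))
x = rng.integers(0, q, size=n)
e = rng.integers(-2, 3, size=m)            # small noise hides the codeword A x
y = (A @ x + e) % q

# Syndrome decoding / ISIS: given H and s = H w (mod q), find a LOW-WEIGHT w;
# here we plant such a sparse solution to construct a solvable instance
H = rng.integers(0, q, size=(n, m))
w = np.zeros(m, dtype=int)
w[[1, 5]] = 1
s = (H @ w) % q
```

For real security, dimensions are in the hundreds and the noise distribution is chosen carefully; the toy instance above only makes the algebra concrete.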
In this paper, we present a coherent view of cryptography and coding theory in the context of quantum-safe cryptography, with the bridge built by lattices and codes. Quantum-safe cryptographic primitives and channel coding schemes are reviewed, and standard designs are described. Especially for post-quantum cryptography (PQC), which has developed rapidly in the last decade, we give our understanding of the interdisciplinary "genes" shared by cryptography (especially lattice- and code-based primitives) and coding theory. We also discuss what can be projected into future research from the reintegration of the two areas.
Roadmap. The rest of the paper is organized as follows. In Section 2, an overview of the NIST standardization process is given. In Section 3, public-key encryption/key encapsulation mechanisms and signature schemes based on hard lattice problems are introduced; a key technique of lattice cryptography, lattice Gaussian sampling, is also reviewed. Section 4 presents a review of code-based cryptography. An interplay between lattices and codes can be found in Section 5. Section 6 presents information-theoretic security, which remains secure in the post-quantum age, and Section 7 summarizes the paper.

Overview of NIST standardization
It is complicated and controversial to answer when a full-fledged quantum computer will be available, because current quantum machines operating on a few dozen quantum bits are far from doing anything dazzling. As the world's tech giants (Intel, Google, IBM, etc.) continue to hit new milestones in quantum computing, and some governments support the research strategically and financially, the reality is that the quantum revolution is happening right now, and we must stay ahead of the curve in light of the approaching quantum era. One area of urgent importance is the migration to post-quantum cryptography. In the past few years, industry and standards organizations have started their own activities in this field. In December 2016, NIST (USA) announced a call for proposals for quantum-resistant algorithms, including public-key encryption (PKE)/key encapsulation mechanisms (KEM) and digital signatures. While symmetric cryptography (e.g., the Advanced Encryption Standard (AES), Secure Hash Algorithm 2 (SHA-2)) is regarded as quantum-safe, since the impact of quantum attacks can be effectively mitigated by increasing key sizes, the severe impact is mainly on asymmetric cryptography such as RSA (Rivest-Shamir-Adleman), Diffie-Hellman, and elliptic-curve cryptography. The focus of NIST PQC standardization is on public-key solutions, including digital signatures and public-key encryption/key establishment.
Cryptographic researchers and practitioners from over 25 countries actively contributed to NIST's standardization process. By December 2017, NIST had received 69 valid submissions for the first-round evaluation. These submissions use techniques from a number of different mathematical families, including lattices, error-correcting codes, multivariate equations, hash functions, elliptic curves, and others. In January 2019 the field was cut to 26 candidates for the second round, and in July 2020 the program proceeded to the third (and apparently final) round, announcing a shortlist of 15 candidates: 7 finalists and 8 alternate candidates. Identified by their underlying hard problems, 7 schemes are built on lattice cryptography, 3 on code-based cryptography, 2 on multivariate methods, 1 on hash functions, and 1 on isogenies of elliptic curves 1). As shown in Table 1, the vast majority of these successful candidates (10 out of 15) are based on lattices (7) and codes (3).
Making decisions on which proposals to adopt and standardize is complicated, as every proposal has its pros and cons. For every submission, NIST investigates and assesses its security strength from both theoretical and practical aspects. In the context of public-key encryption/key encapsulation mechanisms, NIST intends to standardize schemes that achieve IND-CCA2 (indistinguishability under adaptive chosen ciphertext attack) security. But if only ephemeral PKE/KEM is considered, a security guarantee under chosen plaintext attack (i.e., IND-CPA security) suffices. In the case of digital signature schemes, proposals should provide existential unforgeability under adaptive chosen message attacks (i.e., EUF-CMA security). NIST categorizes five security levels, I through V, each defining a security threshold: breaking a proposed primitive of a given level should consume at least as much computational resource as breaking an existing NIST standard in symmetric cryptography. Computational resources are measured by a variety of metrics (e.g., the number of classical elementary operations, quantum circuit size). Additional security factors NIST considers include resistance to side-channel attacks, perfect forward secrecy, and resistance to multi-key attacks.
While the security of a PQC proposal is obviously the factor referees care about the most, NIST's competition also scopes out complexity and compatibility. Tradeoffs have to be made between security, performance, and complexity after a comprehensive study of every submission. As indicated in NIST's document on submission requirements and evaluation criteria, NIST's PQC standardization is run like, but should not be treated as, a competition. It is acknowledged that different schemes may suit different scenarios and different platforms. Rather than concluding that one scheme is better than another, NIST aims to encourage discussion and evaluation of the proposals, which informs the decisions for standardization.

1) Picnic is a signature scheme in none of the above categories because it does not rely on number-theoretic or structured hardness assumptions. Instead, it is designed using zero-knowledge proofs and symmetric primitives.

(Figure 1 parameters: m, n, q; error distribution χ on Z^m.)

3 Lattice-based cryptography

An overview
A large family of lattice-based PKE/KEMs is constructed from the LWE problem [1] and its variants, including ring learning with errors (ring-LWE or RLWE), module-LWE (MLWE), and learning with rounding (LWR).

Definition 1 (LWE sample and LWE distribution). An LWE sample is defined as (A, As + e mod q), where A is uniform in Z_q^{m×n}, the secret s is uniform in Z_q^n, and the error term e is drawn from some distribution ψ over Z^m. The LWE sample follows the LWE distribution A_{s,ψ}.
Observing arbitrarily many samples (A, As + e), the search-LWE problem is to find the secret vector s, while decision-LWE is to distinguish A_{s,ψ} from the uniform distribution with non-negligible advantage. The hardness of LWE relies on the worst-case approximate shortest independent vectors problem (SIVP_γ) and the decisional approximate shortest vector problem (GapSVP_γ). From a coding perspective, an LWE sample (A, b = As + e) can be viewed as a random lattice code with additive errors: the vector b is relatively close to a point of the LWE lattice

Λ_q(A) := {v ∈ Z^m : v ≡ As (mod q) for some s ∈ Z_q^n},

so the search-LWE problem can be viewed as bounded distance decoding (BDD) of the lattice Λ_q(A) in the average case. A typical LWE-based public-key encryption scheme is given in Figure 1. At the decryption step on Alice's side, the residual error is small enough, for an appropriate choice of ψ, that the plaintext can be recovered almost surely. Encrypting a single bit in a typical LWE-based PKE requires O(n^2) scalar operations, and the sizes of the public key and the ciphertext are O(n^2) and O(n), respectively. The security of LWE-based PKE schemes (e.g., Frodo and Lizard) relies upon the worst-case hardness of approximate SIVP. These schemes provide good resilience against quantum attacks but suffer from large key sizes and high computational complexity.
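To make the BDD view concrete, here is a minimal sketch of Regev-style single-bit encryption (toy parameters chosen only so that the residual error stays below q/4; not a real scheme):

```python
import numpy as np

rng = np.random.default_rng(1)
q, n, m = 257, 6, 8            # toy parameters, purely illustrative

# Key generation: public (A, b = A s + e mod q), secret s
A = rng.integers(0, q, size=(m, n))
s = rng.integers(0, q, size=n)
e = rng.integers(-1, 2, size=m)            # ternary noise; |e.r| stays < q/4
b = (A @ s + e) % q

def encrypt(bit):
    r = rng.integers(0, 2, size=m)         # random 0/1 combination of rows
    u = (A.T @ r) % q
    c = (b @ r + bit * (q // 2)) % q       # bit encoded at 0 or q/2
    return u, c

def decrypt(u, c):
    v = (c - s @ u) % q                    # = bit*(q//2) + small error
    return int(min(v, q - v) > q // 4)     # nearest of {0, q/2} decides the bit

u, c = encrypt(1)
```

The decryption step is exactly a bounded-distance decision: v lands near 0 or near q/2, and the small residual error is rounded away.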
The first step towards a more efficient LWE via structured lattices is the ring learning with errors problem, as described in Definition 2. Observing arbitrarily many RLWE samples, the search-RLWE problem is to recover s, and its decision version is to distinguish an RLWE distribution from a uniform one.
(Figure 2 parameters: R, n, q; error distribution χ on R_q := R/qR; Alice and Bob derive shared secret = hash(µ).) The cyclotomic ring admits quite fast and elegant ring operations via an FFT-like technique, the number theoretic transform (NTT), and consequently enhances efficiency and compactness in applications.
Definition 2 (RLWE sample and RLWE distribution). Define the cyclotomic ring R := Z[x]/(1 + x^n) and denote by R_q := R/qR the quotient ring with an integer prime modulus q ≡ 1 mod 2n. Let · denote polynomial multiplication. For a secret s ∈ R_q and some error distribution ψ over R, an RLWE sample (a, a · s + e) ∈ R_q × R_q is derived by drawing a from R_q uniformly at random and drawing e from ψ. An RLWE sample follows the RLWE distribution A_{s,ψ}.
Compared with Definition 2, a more rigorous definition of RLWE is given in [2], formulating an RLWE sample as (a, a · s + e mod qR^∨), where a is uniformly drawn from the ring of integers R modulo q, s is drawn from the dual ring R_q^∨, and e adheres to some distribution ψ over K_R (K is the cyclotomic number field associated with the ring R, and K_R := K ⊗ R is its tensor product with the real numbers). Compact and efficient public-key cryptosystems based on RLWE were designed in [3], though they are somewhat removed from real-world protocols due to the choice of ψ and the use of the fractional ideal R^∨ (except in the 2-power cyclotomic setting, where R^∨ = (1/n)R is simply a scaling of R). Peikert took one more step, developing these cryptosystems into "drop-in" components for Internet protocols [4]. In his work, R^∨ is translated to R carefully, without distorting the canonical geometry, and a reconciliation method is employed for key agreement. A consensus technique similar to reconciliation can be found in another PQC candidate, key consensus from lattice (KCL). Other NIST candidates, e.g., NewHope, HILA5, and lattice-based cryptography (LAC), adopt error-correcting codes for key agreement/encryption.
An RLWE-based KEM scheme is given in Figure 2, where the hash(·) function yields an n-bit shared secret given an n-bit secret µ as input. To encrypt an n-bit message or share an n-bit secret, the public key and ciphertext sizes are only O(n). Moreover, the polynomial multiplication a · s in the ring setting can be performed with O(n log n) scalar operations using the NTT. Although none of the RLWE-based PKE/KEMs were selected as NIST third-round finalists in the fierce competition, we cannot deny their great potential in future research and standardization, owing to their attractive features, especially their efficiency.
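The wrap-around rule behind the ring R_q = Z_q[x]/(x^n + 1) can be made explicit with a schoolbook sketch (the NTT computes the same product in O(n log n); this O(n^2) version is for illustration only):

```python
import numpy as np

def negacyclic_mul(a, b, q):
    """Schoolbook product in R_q = Z_q[x]/(x^n + 1), coefficient vectors a, b."""
    n = len(a)
    res = np.zeros(n, dtype=np.int64)
    for i in range(n):
        for j in range(n):
            if i + j < n:
                res[i + j] += a[i] * b[j]
            else:
                res[i + j - n] -= a[i] * b[j]   # x^n = -1: wrap with sign flip
    return res % q

# x * x = x^2 = -1 in Z_17[x]/(x^2 + 1)
prod = negacyclic_mul(np.array([0, 1]), np.array([0, 1]), 17)
```

The negacyclic sign flip is what distinguishes multiplication modulo x^n + 1 from an ordinary cyclic convolution.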

Error correction coding
Since we expect to give some flavor of coding in PQC, it is worthwhile revisiting the error-correction problem in lattice-based PKE/KEM. Error-correcting codes appear in several PKE/KEM proposals, e.g., repetition codes in NewHope, BCH codes in LAC, and another polynomial code named XEf in Round5. This "unusual design", as remarked in NIST's report, is employed to fix decryption failures caused by a small residual error term after decryption. Attacks exploiting the decryption failure rate (DFR) put some schemes under threat, because the residual error term may be correlated with certain ciphertexts and even with specific secret terms. Once the underlying relation is learned by attackers, security is impacted. Attacks of this type already exist. For example, "failure boosting" and "directional failure boosting", proposed by D'Anvers et al. [5, 6], can be used to find ciphertexts that are more likely to trigger decryption failures. They verified these attacks on basic versions of ring/module-LWE/LWR PKE/KEMs with parameterizations comparable to NIST candidates, e.g., NTRUEncrypt, KYBER, and SABER. The security is impacted assuming unlimited decryption queries are allowed. Another attack, proposed by Guo et al. [7], also exploits the relation between "weak" ciphertexts and secret keys of a certain Hamming-weight pattern.
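To give a flavor of the error-correction step these schemes perform, here is a repetition-style sketch (toy parameters; the codes actually used in NewHope, LAC and Round5 differ in detail):

```python
import numpy as np

q, r = 3329, 4   # toy modulus and repetition factor (illustrative values)

def encode(bits):
    # each key bit becomes r coordinates at 0 (bit 0) or q//2 (bit 1)
    return np.repeat(np.array(bits) * (q // 2), r)

def decode(v):
    # per coordinate: closer to 0 or to q/2? then majority vote per group of r
    v = np.asarray(v) % q
    dist_to_zero = np.minimum(v, q - v)
    votes = (dist_to_zero > q // 4).reshape(-1, r)
    return (votes.sum(axis=1) > r // 2).astype(int).tolist()

# a residual error corrupts two coordinates badly, yet the vote recovers the bits
noise = np.array([50, -60, 900, 10, 5, -900, 40, 0])
noisy = encode([1, 0]) + noise
```

Even though two coordinates are pushed past the decision threshold, the majority vote within each group of r coordinates absorbs the failures, which is exactly the role these codes play against the residual error term.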
Beyond the error-correcting codes used in NIST submissions, academic research [8] has applied modern error-correcting codes and their soft-decision decoding, e.g., low-density parity-check (LDPC) codes, to lattice cryptography. Soft-decision decoding algorithms generally give much better error-correcting capability but, to the best of our knowledge, assume independent channel models. This assumption is common in communication systems, whereas in RLWE-based cryptography it does not hold. As shown in Figure 2, the residual error term e · t − e′ · s + e′′ has correlated coordinates, because the polynomial multiplications over R_q introduce dependency between the coordinates; this becomes obvious if the residual error term is written in vector form. In [8], this dependency is assumed to be negligible. However, D'Anvers et al. [9] analyzed the impact of the dependency, showing that the DFR is underestimated (and the security consequently overestimated) when error-correcting codes are used. They also proposed a relaxed independence assumption, under which they gave a new DFR estimation for LAC. A drawback of this relaxed assumption is that it only works with schemes like LAC that draw secret and error terms from a centered binomial distribution over {−1, 0, 1}. When a genuine discrete Gaussian is used, or when the binomial distribution is wide, their method becomes computationally infeasible.
In [10], polar codes are employed for error correction in RLWE-based PKE. The error-dependency issue is addressed using the canonical embedding: the polynomial multiplication of ring elements is converted to coordinate-wise multiplication of vectors under the canonical embedding, leaving the residual error term with independent, identically distributed coordinates. Moreover, some knowledge about the residual error is actually available to the decoder in RLWE-based PKE. By analogy with channel coding, the authors call this knowledge channel state information (CSI), which can be exploited to improve decoding performance. Compared with BCH and LDPC codes, polar codes are friendlier to constant-time implementation because both encoding and decoding can be realized by a butterfly circuit with quasi-linear complexity O(N log N).

Signatures
In the early stage of lattice-based cryptography, signature schemes following the Goldreich-Goldwasser-Halevi (GGH) strategy suffered from the vulnerability that each signature leaks information about the signer's secret key (i.e., the signer's secret lattice basis). This property was exploited in Nguyen and Regev's attack of "learning a parallelepiped": the shape of the parallelepiped defined by the secret basis can be learned by using a number of signature samples to perform blind source separation. The NIST digital signature schemes based on lattice-based cryptography (LBC) fall into two categories: (a) the Gentry-Peikert-Vaikuntanathan (GPV) framework with trapdoor sampling, and (b) signatures using the Fiat-Shamir transform.
Gentry et al. [11] first constructed a type of "trapdoor" cryptographic tool based on the (inhomogeneous) short integer solution (SIS) problem, which can be reduced to standard worst-case lattice problems. Such trapdoor constructions immediately found applications in cryptography, including digital signatures. Before giving an instantiation of the GPV signature using SIS, we first review SIS, as described in Definition 3.

Definition 3 (SIS). Given a uniformly random matrix A ∈ Z_q^{n×m}, find a nonzero integer vector z ∈ Z^m of norm ‖z‖ ⩽ β such that f_A(z) := Az = 0 ∈ Z_q^n.

For m = poly(n), β > 0 and q ⩾ β · poly(n), solving the above SIS problem is as hard as GapSVP_γ and SIVP_γ. Assuming the hardness of SIS, the function f_A(z) is collision resistant (and one-way), since a collision f_A(z_1) = f_A(z_2) with z_1 ≠ z_2 yields a short nonzero solution z_1 − z_2 to SIS. (Figure 3 parameters: m, n, q, s; η(L): smoothing parameter of L^⊥(A).)

The q-ary m-dimensional integer lattice L^⊥(A) and its coset L^⊥_u(A) are defined as

L^⊥(A) := {z ∈ Z^m : Az = 0 mod q} and L^⊥_u(A) := {z ∈ Z^m : Az = u mod q},

respectively, where A serves as the public key and the secret key is a good basis S ∈ Z^{m×m} of L^⊥(A). A concrete instantiation of the GPV signature using the q-ary integer lattice and preimage sampling is given in Figure 3.
• At the key generation step, the signer has a uniformly random public matrix A ∈ Z_q^{n×m} and a secret matrix S ∈ Z^{m×m} such that AS = 0 mod q. The secret S is a good basis of the lattice L^⊥(A), with small coefficients and good orthogonality. A specific way to construct such A and S, called the "gadget" construction, can be found in [12].
• The signer hashes the message m into a vector u ∈ Z_q^n. A (not necessarily short) solution x to Ax = u can be derived using Gaussian elimination. Using the good basis S, the signer derives the signature y by sampling from the lattice Gaussian distribution (LGD) D_{L^⊥_u(A),s} over the coset L^⊥_u(A) of L^⊥(A), with standard deviation s larger than the smoothing parameter η(L). The signer accepts the signature y if ‖y‖ ⩽ s√m.
• On the verifier's side, a signature y is considered valid if Ay = u and ‖y‖ ⩽ s√m. Otherwise, y is invalid.
The above hash-and-sign signature is unforgeable because an adversary who knows only A and u cannot produce a valid y, assuming the hardness of SIS. Notably, in the GPV framework, a trapdoor sampler that yields a smaller standard deviation enables stronger security, while Klein's and Peikert's algorithms do not work with arbitrarily small standard deviations. Another lattice sampling algorithm, based on Markov chain Monte Carlo, supports arbitrary standard deviations and therefore enjoys higher security, at the cost of longer running time [13, 14].
Falcon adapts NTRU lattices to the GPV framework, while another third-round signature candidate, CRYSTALS-DILITHIUM, employs the Fiat-Shamir transform and rejection sampling; its security relies on the hardness of MLWE and module-SIS. Digital signatures using Fiat-Shamir's technique were first proposed by Lyubashevsky [15]; rejection sampling was later adapted and improved in [16] and in the signature scheme BLISS [17]. As a high-level description, the signer has a secret key S ∈ Z_q^{m×n} with small coefficients and a public key consisting of A ∈ Z_q^{n×m} and the matrix T = AS. The signer first picks a vector y from a normal distribution N(0, σ²I_{m×m}), calculates c from Ay, and then computes z = Sc + y, which adheres to N(Sc, σ²I_{m×m}). Rejection sampling is applied to shape the distribution of z, removing the shift by −Sc. In this manner, the signature is statistically independent of the secret S and therefore leaks almost no information about S. The signer outputs (z, c) as the signature, and verification checks that the norm of z is short and that Ay = Az − Tc.

Remark 1. For some lattice-based cryptosystems (and some code-based ones in the sequel), we observe links between encryption and channel coding, between signatures and source coding, and between identity-based encryption (IBE) and joint source-channel coding.
• Taking LWE as an example, encryption maps the plaintext to a codeword of a random linear code disrupted by noise of a certain distribution. The search-LWE problem is similar to the maximum-likelihood decoding problem in the Euclidean metric. Another example is that the ISIS problem can be seen as syndrome decoding. Similar links can be found in code-based encryption schemes in the Hamming or rank metric in the sequel.
• For lattice-based digital signatures, the signature is either drawn from a certain distribution (e.g., a discrete Gaussian over a lattice coset) or randomized to hide the secret (e.g., Lyubashevsky's scheme without trapdoor). We can view this as producing an unbiased distribution, while source coding, on the contrary, removes the randomness of a certain unbiased distribution. This relation is reflected by the Knuth-Yao sampling and Huffman source coding techniques, both of which employ a binary tree, but for opposite purposes.
• In some sense, we can find a link between the IBE from GPV [11] and joint source-channel coding. The GPV IBE can be seen as the "dual" version of Regev's LWE cryptosystem. The master public key of the authority is a matrix A, and its trapdoor is the master secret key. The public key of a user is the concatenation of A and a hash u of the user's unique identity (e.g., email address). The user's secret key x is a preimage of u = Ax. As introduced earlier in this section, the vector x is derived by trapdoor sampling from the distribution D_{L^⊥_u(A),s}, which can be seen as the "reverse process" of source coding. What follows is the standard encryption and decryption of an LWE-based scheme, which is analogous to the channel coding problem.
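Returning to the Fiat-Shamir signatures above, the effect of rejection sampling can be seen in one dimension (an illustrative sketch, not the exact BLISS procedure; `shift` plays the role of the secret-dependent term Sc):

```python
import math
import random

random.seed(2)
sigma, shift = 10.0, 3.0         # "shift" stands in for the secret term Sc
M = math.exp(1)                   # rejection constant bounding the density ratio

def sign_once():
    """Draw z = shift + y but keep it only with probability
    target(z) / (M * proposal(z)), so accepted z follow N(0, sigma^2)."""
    while True:
        y = random.gauss(0.0, sigma)
        z = shift + y                     # raw z ~ N(shift, sigma^2): leaks shift
        log_ratio = (-z * z + (z - shift) ** 2) / (2 * sigma * sigma)
        if random.random() < min(1.0, math.exp(log_ratio) / M):
            return z

samples = [sign_once() for _ in range(20000)]
mean = sum(samples) / len(samples)        # ~0: the shift has been erased
```

The raw z is centered at the secret shift, but the accepted z is centered at zero, which is exactly why the published signature leaks almost nothing about S.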

Lattice Gaussian sampling
Definition 4 (LGD). Let Λ be an n-dimensional lattice in R^n and let ρ_{c,s}(x) = exp(−π‖x − c‖²/s²) be a spherical Gaussian function centered at x = c with Gaussian parameter s. The Gaussian distribution over the lattice Λ is defined as

D_{Λ,c,s}(x) = ρ_{c,s}(x) / ρ_{c,s}(Λ) for x ∈ Λ, where ρ_{c,s}(Λ) = Σ_{z∈Λ} ρ_{c,s}(z).

We write D_{Λ,s}(x) for short when c = 0. Lattice Gaussian sampling (LGS) has become a fundamental tool in LBC. Typically, the task of LGS is to draw vectors from an LGD, i.e., a discretized Gaussian distribution over a lattice. In an LGD, every lattice point is assigned a probability, and a lattice point closer to the center of the distribution naturally receives a larger sampling probability. The probabilities of the lattice points are controlled by the standard deviation of the LGD: a small standard deviation gives a sharp distribution, and vice versa. An example of an LGD over the Z² lattice is illustrated in Figure 4.
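For concreteness, the one-dimensional case D_{Z,s} can be tabulated directly (a small sketch; the infinite support is truncated at a finite tail, which is a negligible approximation for these parameters):

```python
import math

def discrete_gaussian_pmf(s, tail=50):
    """Tabulate D_{Z,s}(x) = rho_s(x) / rho_s(Z), rho_s(x) = exp(-pi x^2 / s^2),
    truncating the (negligible) mass beyond |x| > tail."""
    support = range(-tail, tail + 1)
    rho = {x: math.exp(-math.pi * x * x / (s * s)) for x in support}
    total = sum(rho.values())
    return {x: p / total for x, p in rho.items()}

pmf = discrete_gaussian_pmf(s=3.0)
```

The table is symmetric about 0 and peaks there, matching the intuition that points closer to the center carry larger probability.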
LGD is frequently used in lattice cryptosystems, especially lattice signatures. A special case of LGS is integer Gaussian sampling, which is deemed the fundamental building block of LBC. However, inappropriate LGS sometimes causes efficiency and security issues. Many lattice-based proposals manage to avoid lattice/integer Gaussian sampling operations: e.g., CRYSTALS-DILITHIUM uses a uniform distribution instead, while SABER and KYBER use binomial distributions. Nonetheless, LGS, especially integer Gaussian sampling, cannot be completely avoided.
Well-studied LGS algorithms include Klein's sampler, a randomized version of Babai's nearest plane algorithm, and Peikert's parallelizable sampler [18]. Falcon, a NIST third-round finalist, employs a fast Fourier orthogonalization method [19] to construct a trapdoor sampler for NTRU lattices, combining the quality of Klein's sampler with the efficiency of Peikert's. A "gadget"-based trapdoor was devised in [12], enabling efficient trapdoor sampling.
Another line of research focuses on integer Gaussian sampling, though it is an old topic. Traditional solutions include the cumulative distribution table (CDT) sampler, the Knuth-Yao sampler, and the discrete Ziggurat sampler. Recent research [20, 21] suggests an integer Gaussian sampler that works in a two-phase manner: a base sampler (e.g., Knuth-Yao, CDT) produces and stores many samples from an integer Gaussian distribution with small parameters; the target distribution is then shaped from the stored samples by a constant-time convolution/expansion method.
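A minimal sketch of the CDT approach (illustrative only; production samplers use fixed-point probability tables and constant-time scans rather than floating point and `bisect`):

```python
import bisect
import math
import random

random.seed(3)

def build_cdt(s, tail=12):
    """Cumulative table over magnitudes 0..tail of D_{Z,s} (sign drawn later)."""
    rho = [math.exp(-math.pi * x * x / (s * s)) for x in range(tail + 1)]
    total = rho[0] + 2.0 * sum(rho[1:])
    probs = [rho[0] / total] + [2.0 * r / total for r in rho[1:]]
    cdf, acc = [], 0.0
    for p in probs:
        acc += p
        cdf.append(acc)
    return cdf

def cdt_sample(cdf):
    """Binary-search the table, then attach a random sign to nonzero magnitudes."""
    x = bisect.bisect_right(cdf, random.random())
    return x if x == 0 or random.random() < 0.5 else -x

cdf = build_cdt(s=3.0)
samples = [cdt_sample(cdf) for _ in range(5000)]
```

The precomputed table turns each sample into one uniform draw plus a table lookup, which is why CDT is a popular base sampler in the two-phase designs mentioned above.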
Remark 2. From a coding perspective, the Knuth-Yao sampler employs a binary search tree analogous to the Huffman coding tree, in which a parent node splits into two child nodes according to occurrence frequency. The difference between the two is that the binary tree constructed by Knuth-Yao is for distribution generation, while that of Huffman coding is for lossless data compression. This implies that sampling can be viewed as the reverse of data compression; related work on sampling from the integers using polar source coding was proposed in [22]. It features asymptotic information-theoretic optimality: producing the desired distribution with the optimal amount of randomness.

Code-based cryptography
Code-based cryptography employs coding theory and hard decoding problems to build primitives such as encryption schemes, one-way functions, digital signatures, and key exchange 2). Generally speaking, code-based KEMs require more bandwidth and offer slower key generation than lattice-based ones. Driven by the NIST process, code-based cryptography has overcome the limitations of the classic McEliece cryptosystem, and efficiency enhancements have been made to reduce the key size and to accelerate key generation using structured codes (e.g., cyclic codes). Moreover, a specialty of code-based cryptography is the strong confidence in its security. While lattice-based cryptography emerged in the last decade, code-based cryptography originated from the scheme proposed by McEliece in 1978 [23], whose "dual" version was proposed by Niederreiter in 1986 [24]. Over 40 years of effort has been put into the cryptanalysis of McEliece cryptosystems, with little if any significant and practical outcome.
The idea of McEliece's scheme is to use an apparently random generator matrix of an error-correcting code (a random binary Goppa code) as a public key to encrypt a message, to which t random bit errors are also added. The code is chosen to be able to correct up to t errors; thus, legitimate users who know a fast decoding algorithm for the code, as the private key, can recover the plaintext. The security of McEliece's scheme relies on the following two computational assumptions:
• decoding a random linear code in the fixed-error model is hard on average;
• the generator matrix (public key) is hard to distinguish from a random matrix.
The first problem is proven to be nondeterministic polynomial time (NP)-complete [25] and is believed to be hard on average, while the second problem is more open. The McEliece encryption scheme is described in Figure 5.
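The whole scheme can be sketched end-to-end with a toy [7,4] Hamming code standing in for the Goppa code (completely insecure; the scrambler S and permutation P here are fixed for readability rather than drawn at random):

```python
import numpy as np

# [7,4] Hamming code stands in for the Goppa code: G generates, H checks,
# and a syndrome lookup corrects any single bit error (t = 1)
G = np.array([[1,0,0,0,0,1,1],
              [0,1,0,0,1,0,1],
              [0,0,1,0,1,1,0],
              [0,0,0,1,1,1,1]])
H = np.array([[0,1,1,1,1,0,0],
              [1,0,1,1,0,1,0],
              [1,1,0,1,0,0,1]])
syndrome_to_pos = {tuple(H[:, i]): i for i in range(7)}

S = np.array([[1,1,0,0],
              [0,1,0,0],
              [0,0,1,1],
              [0,0,0,1]])           # invertible scrambler; its own inverse mod 2
perm = np.array([2,0,3,6,1,5,4])    # secret column permutation P
inv_perm = np.argsort(perm)
G_pub = ((S @ G) % 2)[:, perm]      # public key: looks like a random code

def encrypt(msg, err_pos):
    """c = msg * G_pub + e, with one deliberate bit error at err_pos."""
    e = np.zeros(7, dtype=int)
    e[err_pos] = 1
    return (msg @ G_pub + e) % 2

def decrypt(c):
    v = c[inv_perm]                     # undo P: v = msg*S*G + permuted error
    syn = tuple((H @ v) % 2)
    if syn in syndrome_to_pos:
        v[syndrome_to_pos[syn]] ^= 1    # correct the single error
    return (v[:4] @ S) % 2              # G is systematic; unscramble (S^-1 = S)

m = np.array([1, 0, 1, 1])
c = encrypt(m, err_pos=5)
```

Real parameters use Goppa codes with n in the thousands and t in the dozens; the toy code only illustrates the scramble-permute-add-errors pipeline and its trapdoor decoding.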
Among NIST third-round finalists, Classic McEliece is the only code-based KEM. It is the dual version of the original McEliece scheme, with moderate efficiency improvements but no security compromise. The major changes include the migration to Niederreiter's dual variant and refinements of parameters to keep up with increased computing power. Due to its early invention, Classic McEliece is the most studied of all NIST candidates and is thus better understood than other schemes. Furthermore, Classic McEliece enables an efficient and straightforward conversion of a one-way-CPA PKE into an IND-CCA2 KEM. Besides, it can be configured to match all five NIST security levels. The main drawback is its extremely large public keys, ranging from 250 KB for NIST security level 1 up to 1.3 MB for level 5.

Quasi-cyclic code-based public-key encryption/KEM
The main reason for the extremely large public key of Classic McEliece is that one has to store the whole generator matrix of the code. To reduce the key size, one approach is to choose a proper code family whose generator matrix can be expressed more compactly. In [26], it was proposed to use quasi-cyclic codes. A code spanned by a block-circulant matrix is called quasi-cyclic. A circulant matrix is a square matrix in which all row vectors are composed of the same elements and each row vector is rotated one element to the right relative to the preceding row vector. A block-circulant matrix is formed of blocks of circulant square matrices. The generator matrix of a quasi-cyclic code is completely defined by its first row; thus, the public key size of quasi-cyclic code-based cryptographic schemes can be greatly reduced. Similar to the original McEliece scheme, the security proof of the quasi-cyclic code-based McEliece scheme relies on the following two computational assumptions:
• generic decoding of a random quasi-cyclic code is hard;
• the generator matrix (public key) is hard to distinguish from a random block-circulant matrix.
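The compression argument is easy to see in code (a minimal sketch): only the first row needs to be stored or transmitted, while the full n × n matrix can be regenerated on demand.

```python
import numpy as np

def circulant(first_row):
    """Expand a first row into the full circulant matrix:
    row i is the first row rotated i positions to the right."""
    n = len(first_row)
    return np.array([np.roll(first_row, i) for i in range(n)])

row = np.array([1, 0, 1, 1, 0])   # all that must be stored: n entries
C = circulant(row)                 # the n x n matrix it implicitly defines
```

Storage drops from n² entries to n per circulant block, which is exactly the key-size saving quasi-cyclic schemes exploit.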
Since quasi-cyclic codes are more structured, the code family must be chosen carefully, otherwise, some types of attacks are possible.
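The compactness argument is easy to see in code: a circulant block is fully reconstructed from its first row, so only that row needs to be published. A minimal sketch (the example row is arbitrary):

```python
import numpy as np

def circulant(first_row):
    """Circulant matrix: each row is the previous row rotated one to the right."""
    n = len(first_row)
    return np.array([np.roll(first_row, i) for i in range(n)])

# A quasi-cyclic public key stores only the first rows of its circulant
# blocks: n bits per n x n block instead of n^2 bits.
row = np.array([1, 0, 1, 1, 0], dtype=np.uint8)
C = circulant(row)
```

For an n × n block this shrinks storage from n² entries to n, which is the source of the key-size savings of HQC and BIKE relative to classic McEliece.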
Recently, quasi-cyclic moderate density parity-check (QC-MDPC) codes [27] have attracted research interest in the cryptography community. MDPC codes are variants of the well-known LDPC codes with moderately sparse parity-check matrices. Specifically, the rows of the parity-check matrix of an MDPC code have length n and Hamming weight of order √n. The denser parity-check matrix degrades the error-correcting performance. However, in cryptography, we are not interested in correcting as many errors as possible, but only as many as the scheme requires. The most important reason for considering moderately sparse parity-check matrices is to avoid certain types of attacks. When LDPC codes are used, low-weight parity-check rows can be seen as dual codewords: if one can search for low-weight dual codewords to rebuild a sparse parity-check matrix, which makes the code easily decodable, the scheme may be effectively attacked. It is worth mentioning that Guo et al. [28] also presented an attack against the QC-MDPC McEliece encryption scheme.
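MDPC (and LDPC) codes are typically decoded by iterative bit flipping. The following toy sketch illustrates the idea on a small Hamming parity-check matrix, standing in for a genuine sparse MDPC matrix; the matrix and error pattern are purely illustrative:

```python
import numpy as np

def bit_flip_decode(H, y, max_iter=20):
    """Toy bit-flipping decoder: repeatedly flip the bits that participate
    in the largest number of unsatisfied parity checks."""
    y = y.copy()
    for _ in range(max_iter):
        s = H @ y % 2                    # syndrome
        if not s.any():
            break                        # all parity checks satisfied
        upc = H[s == 1].sum(axis=0)      # unsatisfied-check count per bit
        y[upc == upc.max()] ^= 1         # flip the most suspicious bits
    return y

# Demo on the [7,4] Hamming code: one flipped bit of the all-zero codeword.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)
noisy = np.zeros(7, dtype=np.uint8)
noisy[2] ^= 1
decoded = bit_flip_decode(H, noisy)
```

Real QC-MDPC decoders (as in BIKE) use tuned flipping thresholds, since the decoding failure rate directly affects both performance and security against reaction attacks such as [28].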
Among NIST third-round candidates, two quasi-cyclic code-based schemes were selected as alternates, namely Hamming quasi-cyclic (HQC) and bit flipping key encapsulation (BIKE). HQC is based on BCH codes, while BIKE is based on QC-MDPC codes. Both schemes have parameters targeting NIST levels 1, 3, and 5, with much smaller public key sizes than classic McEliece. The public-key encryption algorithm of HQC is illustrated in Figure 6, where the matrix G is publicly known, the public key is (h, s), and the secret key is (x, y).

Rank metric-based public-key encryption/KEM
According to their underlying hard problems, lattice-based and code-based cryptography can both be interpreted as distance-based cryptography: the former is built on the Euclidean metric and the latter on the Hamming metric. Different metrics have different properties, resulting in respective advantages and drawbacks. Rank metric codes are a special type of linear error-correcting codes that use the rank metric instead of the Hamming metric. Loo-Keng Hua first introduced the rank metric in 1951, and Delsarte later studied it as a distance (the rank distance) and constructed optimal matrix codes in bilinear form. In 1985, Gabidulin proposed vector-form rank codes as well as efficient encoding and decoding algorithms. Since then, rank codes have been used in many applications, such as communication and cryptography.

[Figure 6 parameters: k, n, w; S_w^n(F_2) := {x ∈ F_2^n : wt(x) = w}; wt(·): Hamming weight; a ⊙ b: the vector a multiplied by the circulant matrix generated by b.]

[Figure 7 parameters: n, k, m, w, and P ∈ F_q[X]; S_w^n(F_{q^m}) := {x ∈ F_{q^m}^n : rank(x) = w}; rank(·): rank weight.]

Let u = (u_1, . . . , u_n), v = (v_1, . . . , v_n) ∈ F_{q^m}^n be two length-n vectors, where q is a prime, m is a positive integer, and F_{q^m} is the finite field with q^m elements. The rank weight rank(u) of u is defined as the dimension of the F_q-subspace generated by {u_1, . . . , u_n}, and the rank distance rd(u, v) between u and v is defined as rank(u − v). An [n, k] rank code C over F_{q^m} is a k-dimensional subspace of F_{q^m}^n equipped with the rank metric. When using rank codes in cryptography, the main problem is the generalization of the generic decoding problem from the Hamming metric to the rank metric. Specifically, let H be an (n − k) × n matrix over F_{q^m} with k ≤ n, s ∈ F_{q^m}^{n−k}, and r an integer; the problem is to find x such that rank(x) = r and Hx^T = s. This problem has been proved NP-hard under a randomized reduction [29].
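With q = 2, the rank weight defined above can be computed by expanding each coordinate of u over an F_2-basis of F_{2^m} and taking the GF(2)-rank of the resulting bit-vectors. A minimal sketch (the vector u and its basis expansion are illustrative):

```python
def gf2_rank(vectors):
    """Rank over GF(2) of integers viewed as bit-vectors (XOR basis)."""
    basis = {}                      # leading bit position -> basis vector
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]
    return len(basis)

# u in F_{2^4}^4, each coordinate written as the integer whose bits are its
# F_2 coefficients: u = (x+1, x^2+1, x^2+x, x+1).
u = [0b0011, 0b0101, 0b0110, 0b0011]
rank_weight = gf2_rank(u)           # dimension of the F_2-span of the coordinates
```

Here the Hamming weight of u is 4 (all coordinates nonzero) while the rank weight is only 2, since the four coordinates span a 2-dimensional F_2-subspace; this gap between the two metrics is what rank-based schemes exploit.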
Rank quasi-cyclic (RQC), a NIST second-round candidate, is based on Gabidulin codes and has an even shorter public key than HQC and BIKE. Unlike McEliece, whose security relies on the hardness of the syndrome decoding problem and the Goppa code distinguishing problem, the security of RQC relies solely on the decisional version of syndrome decoding of quasi-cyclic codes. Moreover, the family of Gabidulin codes has zero decoding failure probability under deterministic decoding. The main drawback of RQC is that its security analysis under algebraic attacks needs more time to mature [30]. Besides, the decoding complexity of Gabidulin codes, approximately O(n²), is much higher than that of its Hamming-metric counterparts. The public-key encryption algorithm of RQC is illustrated in Figure 7, where G is the generator matrix of a Gabidulin code C and can be represented by its first row g. The public key is (g, h, s) and the secret key is (x, y).

Code-based signatures
While code-based cryptography provides promising PKE/KEM solutions for standardization in the NIST competition, code-based signatures are in a less advantageous position. Attempts to devise code-based signature schemes meeting both efficiency and security requirements have been unsuccessful, and none of the existing code-based signatures in the literature is currently under consideration for standardisation in the NIST competition. Similar to lattice signatures, code-based signatures fall into two categories: the hash-and-sign paradigm using a trapdoor, and the Fiat-Shamir paradigm.
In the first approach, one hashes the message to a syndrome and finds a preimage, subject to some constraints (e.g., on Hamming or rank distance), as the signature. Roughly speaking, a one-way function is devised based on the hardness of syndrome decoding of random linear codes. The preimage can be derived using the trapdoor but is hard to recover from the public key and signature alone due to the one-wayness. Typical signatures of this type include CFS [31], WAVE [32], and RankSign [33]. CFS exploits the decoding capability of high-rate Goppa codes, for which a non-negligible fraction of syndromes can be decoded to the nearest codeword. However, CFS is impractical in terms of efficiency and security. One drawback is that its performance scales poorly with the security level: achieving 128 bits of classical security requires a public key of several gigabytes and a signature generation time of several seconds. Besides, for high-rate Goppa codes, the public key was found to be distinguishable from a random matrix [34]. A similar distinguisher also exists for the rank-based signature RankSign [35]. The hidden structure serving as the trapdoor has its pros and cons: on the one hand, it enables efficient preimage computation; on the other hand, there might exist unknown structural attacks that recover the hidden structure and break the scheme.
The other signature paradigm, based on the Fiat-Shamir transform and zero-knowledge identification protocols, circumvents the above problem. It does not employ any hidden structure: the public key is exactly the parity-check matrix of the underlying linear code, and no syndrome decoding is involved. Lattice signature schemes of this type, proposed by Lyubashevsky, are introduced in Subsection 3.2, where the signature is randomized in the Euclidean metric to hide the secret key. When this approach is adapted to code-based schemes, randomizing the signatures in the Hamming or rank metric is nontrivial, because one has to consider the whole codeword rather than independent coordinates of a lattice vector. Persichetti [36] proposed a one-time signature; however, this scheme was later shown to suffer from statistical attacks, and the secret key can be recovered from a single signature [37]. Other signatures of this type include the NIST first-round candidate RaCoSS [38] and Durandal [39], for which leakage of secret information remains a common issue to solve.

The interplay between lattices and codes
In addition to the aforementioned KEMs, coding theory has much to contribute to lattice cryptography. For instance, Regev's LWE problem [1], an average-case hard problem on lattices, can be equivalently presented as the problem of decoding random linear codes, so a complexity (security) reduction on one side implies security guarantees on the other. In addition, lattice-reduction-aided cryptanalysis also benefits considerably from refining its algorithms from the perspective of structured codes. This section examines these two fronts in detail.

On designing hard problems: LWE and random codes
Lattice problems have become popular in quantum resistant cryptographic schemes. The classic computational problems on lattices are the following.
• Shortest vector problem (SVP). Given a lattice Λ, find the shortest nonzero vector in Λ.
• Closest vector problem (CVP). Given a lattice Λ and a query point y, find the closest vector to y in Λ.
• Shortest independent vectors problem (SIVP). Given an n-dimensional lattice Λ, find n linearly independent lattice vectors such that the length of the longest one is minimized.
• Bounded distance decoding (BDD). Given a lattice Λ and a query point y that is promised to be not too far from Λ, solve the resulting CVP instance.
The hardness of these lattice problems can be analyzed using complexity theory, and they have been proved computationally intractable [40]. In the early days of lattice-based cryptography, schemes were built directly from these hard problems and later migrated to their approximate versions. There is a tight connection between the hardness of a problem and the security of a cryptographic scheme built on it. Nevertheless, modern lattice-based cryptographic schemes are mostly designed from average-case hard problems rather than the above worst-case hard problems. Worst-case hardness means that some (worst-case) instance of the problem is hard to solve, whereas average-case hardness means that a randomly drawn instance of the problem is computationally hard to solve. Although in the early days many believed that NP-hard problems would be the perfect basis for cryptographic schemes, not all NP-hard problems are hard on average. It is thus crucial to design average-case hard problems that also feature proofs of worst-case hardness.

LWE
The randomly constructed LWE problem is a popular average-case problem. As introduced in Section 3, the search version of LWE is to find s ∈ Z_q^n given (A, b = As + e mod q), where A ∈ Z_q^{m×n} is uniformly random and e ∈ Z^m is a small noise vector.

Construction A
In coding theory, construction A is a method for generating a lattice by "lifting" a linear code to the Euclidean space. Let C = C[n, k] ⊆ Z_q^n be a linear code of dimension k and length n, where q is a prime. The lattice Λ constructed from the code C via construction A is defined by

Λ_A(C) = φ(C) + qZ^n = {x ∈ Z^n : (x mod q) ∈ φ(C)},

where φ : Z_q^n → R^n is the embedding function which maps a vector over Z_q to its real-valued version. Many properties of construction A lattices can be related to the properties of their underlying codes.
LWE can be interpreted as decoding over construction A lattices. As the public matrix A defines a random code C, As mod q is the codeword generated by the message s. Thus the problem of recovering s from the noisy observation b ∈ Z_q^m is in essence the decoding of a random linear code.
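This interpretation is easy to demonstrate numerically. A small sketch with toy parameters (q, n, m and the noise range are illustrative, far below cryptographic sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
q, n, m = 97, 8, 16

A = rng.integers(0, q, size=(m, n))   # public matrix = generator of a random code
s = rng.integers(0, q, size=n)        # secret
e = rng.integers(-2, 3, size=m)       # small noise
b = (A @ s + e) % q                   # LWE sample

# b is a noisy codeword of the random q-ary code {A x mod q : x in Z_q^n};
# equivalently, a point close to the construction-A lattice of that code.
codeword = (A @ s) % q
```

Recovering s from (A, b) is exactly decoding the random code generated by A under small noise, which is the bridge between the LWE formulation and the coding-theoretic one.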

On the decoding of hard problems: lattice reduction and algebraic codes
Both algebraic and non-algebraic lattice problems are presumed hard both classically and quantumly. To evaluate the bit-level security of cryptographic schemes based on LWE, SIS, ring-LWE, ring-SIS, module-LWE, and module-SIS, a major approach is lattice-reduction-based cryptanalysis [41]. Lattice reduction is to find, from a given input basis, a basis consisting of short and nearly orthogonal vectors. Its applications include not only cryptanalysis but also information theory (e.g., designing the network coding coefficients in compute-and-forward [42]) and wireless communications (e.g., lattice-reduction-aided MIMO detection/precoding [43]). This subsection reviews lattice reduction attacks for analyzing the actual complexity of non-algebraic and algebraic lattice problems, where algebraic lattice reduction is examined from the perspective of algebraic codes. Specifically, an essential question in the LBC community is whether the algebraic structure of the lattices induced by ring-LWE, ring-SIS, module-LWE, and module-SIS may downgrade the security level of the associated cryptographic schemes. While the cryptanalytic state of the art is largely to unfold the ring/module structure, algebraic lattice coding theory shows that small polynomial speed-ups can always be guaranteed.
The LLL algorithm runs in polynomial time, but the first vector of the output basis approximates the shortest vector of the lattice only within an exponential approximation bound. Block-wise generalisations of LLL include BKZ, whose complete analysis had been open until very recently [51], Schnorr's algorithm, the transference algorithm by Gama et al., and Gama-Nguyen's slide algorithm (see [41] for a survey containing the analysis of the latter three). With the help of a given subroutine for solving SVP in lattices of dimension at most the block size, block-wise generalisations achieve better provable approximation bounds than LLL. Moreover, the choice of the block size offers flexibility for fine-tuning the algorithms to different application settings. When the block size β is logarithmic in the dimension n of the lattice, these algorithms remain polynomial-time and the approximation bounds remain exponential. For applications in cryptography, the block size is set to be linear in the dimension of the lattice, in which case these algorithms are only known to run in exponential time while the approximation bounds are polynomial in the dimension of the lattice. Gama-Nguyen's slide algorithm has recently been revisited [52] with an improved approximation bound of roughly γ′^((n−β)/(β−1)), where γ′ denotes the approximation bound of the given subroutine that solves SVP in lattices of dimension at most β.
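As a concrete reference point, the following is a minimal textbook LLL implementation over exact rationals. This is a sketch for tiny dimensions only; production cryptanalysis uses floating-point LLL/BKZ implementations such as fplll, and the example basis is arbitrary:

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(B):
    """Exact Gram-Schmidt orthogonalisation over the rationals."""
    Bs, mu = [], []
    for i in range(len(B)):
        bs = [Fraction(x) for x in B[i]]
        row = []
        for j in range(i):
            m = dot(B[i], Bs[j]) / dot(Bs[j], Bs[j])
            row.append(m)
            bs = [x - m * y for x, y in zip(bs, Bs[j])]
        Bs.append(bs)
        mu.append(row)
    return Bs, mu

def lll(B, delta=Fraction(3, 4)):
    """Textbook LLL reduction of an integer basis (list of integer vectors)."""
    B = [list(b) for b in B]
    k = 1
    while k < len(B):
        _, mu = gram_schmidt(B)
        for j in range(k - 1, -1, -1):          # size-reduce b_k against b_j
            c = round(mu[k][j])
            if c:
                B[k] = [x - c * y for x, y in zip(B[k], B[j])]
        Bs, mu = gram_schmidt(B)
        # Lovasz condition decides whether to advance or swap
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1]):
            k += 1
        else:
            B[k - 1], B[k] = B[k], B[k - 1]
            k = max(k - 1, 1)
    return B

basis = [[201, 37], [1648, 297]]
reduced = lll(basis)
```

Row operations and swaps preserve the lattice, so the reduced basis generates the same lattice (same determinant up to sign) while its first vector is provably short.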

Algebraic variants
To achieve lower storage and communication costs, many lattice-based cryptographic schemes rely on algebraic lattices, e.g., those of ring-LWE and ring-SIS. These lattices can be classified as O_K-lattices, also referred to as O_K-modules, where O_K denotes the ring of integers of a number field K. Based on the tower decomposition of fields shown in Figure 8, O_K-lattices can be unfolded into modules of different degrees. By transforming the O_K-modules into Z-modules, conventional lattice reduction algorithms can be readily applied, but this treatment fails to exploit the algebraic structure of these lattices.
The second type of reduction algorithm transforms the O_K-modules into Z[ξ]-modules [53,54], where Z[ξ] ≅ Z² (as a Z-module) is the ring of integers of an imaginary quadratic field. These algorithms were originally motivated by requirements in coding theory and communications: if signal constellations and codes carved from Z[ξ] are used at the transmitter side [55-57], then the decoding tasks at the receiver require performing algebraic lattice reduction over Z[ξ]-lattices. It has been shown that Z[ξ]-lattice reduction algorithms are approximately two times faster than their Z-lattice counterparts.
A more advanced treatment is to transform the O_K-modules into high-degree modules and to design LLL/BKZ reduction algorithms directly for these modules. This line of work includes Napias's extension of LLL to lattices over Euclidean rings [58], Fieker and Pohst's LLL over Dedekind domains [59], and Kim and Lee's LLL for arbitrary Euclidean domains [60].
This line of investigation has culminated in breakthrough studies on module lattice reduction [61,62], where the lattices are R-modules M for a general extension ring R of Z (usually the ring of integers of an extension field of Q). These results were developed with cryptographic applications in mind and are naturally comparable with the block-wise generalisations of LLL (in particular, when the block size is linear in the dimension of the lattice). For module lattices, the block size β is always a multiple of the degree d = [R : Z], for the obvious reason that a rank-one sub-module (also called an ideal) of M corresponds to a d-dimensional Z-lattice. The reduction uses an SVP solver for sub-modules of rank β/d (β/d ≥ 2) to solve SVP in an R-module of rank n/d. The achieved approximation bound, expressed in terms of γ′ (the approximation bound of the given subroutine that solves SVP for R-module lattices of rank at most β/d), shows that SVP solvers for R-module lattices of dimension β are almost as good as SVP solvers for general lattices in R-module lattice reduction.

Information-theoretic security
In this section, we present information-theoretic security as a means to establish secure networks and communication systems. Since information-theoretic security does not rely on computational hardness at all, it is also quantum-safe. The starting point of information-theoretic security is generally attributed to Shannon's work on so-called perfect secrecy. However, compared with mainstream encryption algorithms, information-theoretic security was long regarded as no more than a beautiful yet impractical theoretical construct, because such security is based on the characteristics of the communication channel rather than on mathematical operations assumed to be hard to compute. With the development of capacity-approaching codes such as turbo codes, LDPC codes, and polar codes, information theorists now have practical methods to cope with the unpredictable disturbance over the communication channel, and the field has gradually stepped out of the shadow of cryptography. The potential of information-theoretic security to strengthen the security of the physical layer has become increasingly clear.

It is important to point out the principal differences between classical cryptography and information-theoretic security, which may help one select the appropriate method in practice. In general, the security of classical cryptography, such as public-key cryptography, rests on the conjecture that certain one-way functions are hard to invert, meaning that attackers cannot break the cryptosystem with efficient algorithms and limited computational resources. Security based on computational hardness cannot be guaranteed indefinitely from a mathematical perspective, as computing power continues to increase at a very fast pace; a typical example is that quantum algorithms capable of solving the prime factorization problem may soon become practical. In addition, when it comes to comparing the strengths of different cipher systems, no precise metrics are available.
For information-theoretic security, in contrast, no computational restrictions are placed on the eavesdropper, and very precise metrics, such as the information leakage to the eavesdropper, can be evaluated to measure the strength of security. Moreover, the system architecture for information-theoretic security is essentially compatible with that of communication systems, making it possible to provide an additional layer of security to existing communication networks without any changes to their infrastructure.
In the following, we focus on the wiretap channel model and briefly show how lattices and linear codes manage to achieve the secrecy capacity inherently associated with such a model under an information-theoretic secrecy requirement. The design of secure channel codes for information-theoretic security dates back to Wyner's study of the wiretap channel model [63]. As shown in Figure 9, a wiretap channel is a broadcast channel where one of the receivers is legitimate and the other is treated as an adversary. The channel between the transmitter (Alice) and the legitimate receiver (Bob) is called the main channel, and the one between Alice and the eavesdropper (Eve) is called the wiretapper's channel. For transmission, both the confidential message M and the auxiliary message M′ are encoded into the codeword X^n. The outputs of the main channel and the wiretapper's channel are denoted by Y^n and Z^n, respectively. The reliability requirement says that M should be recovered correctly on Bob's side, and the security requirement is related to the measure of information leakage. When the average information leakage vanishes asymptotically, i.e., (1/n) I(M; Z^n) → 0, the notion is usually called the weak secrecy condition [63]. A stronger condition requires the total information leakage I(M; Z^n) itself to vanish as n increases, which is called the strong secrecy condition [64]. The secrecy capacity is defined as the maximum rate of M for which the reliability requirement and the security requirement are both satisfied. Interestingly, the secrecy capacity remains unchanged when the weak secrecy condition is replaced by the strong secrecy condition [64].
When both the main channel and the wiretapper's channel are symmetric, and the latter is degraded with respect to the former, the secrecy capacity is given by C_b − C_e, where C_b and C_e denote the capacities of the main channel and the wiretapper's channel, respectively. In this case, the design of secure channel codes turns out to be closely related to that of capacity-achieving codes. One may construct two capacity-achieving codes C_b and C_e for the main channel and the wiretapper's channel, respectively. To fulfill the reliability and security conditions, C_e carries the auxiliary message M′, and the confidential message M is encoded into the quotient group C_b/C_e. Since both codes are capacity-achieving, their rates approach C_b and C_e, respectively, and the secrecy capacity C_b − C_e is therefore achieved.
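The quotient-group (coset) encoding can be sketched with small binary codes. Below, the [7,4] Hamming code plays the role of C_b and a 3-dimensional subcode plays C_e; the choice of codes is purely illustrative and has no security properties:

```python
import numpy as np

# C_b: [7,4] Hamming code. Its first generator row carries the secret bit M;
# the remaining rows generate the subcode C_e used for randomisation, so the
# secret selects a coset in the quotient C_b / C_e.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)
G_m, G_e = G[:1], G[1:]

def encode(m, r):
    """Secret bits m plus random bits r -> a codeword of C_b."""
    return (m @ G_m + r @ G_e) % 2

rng = np.random.default_rng(1)
m = np.array([1], dtype=np.uint8)
x1 = encode(m, rng.integers(0, 2, size=3).astype(np.uint8))
x2 = encode(m, rng.integers(0, 2, size=3).astype(np.uint8))
# Two encodings of the same secret differ by a word of C_e: same coset.
diff = (x1 + x2) % 2
```

All codewords carrying the same secret lie in one coset of C_e, so an eavesdropper who can only distinguish cosets up to C_e learns nothing about M.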

Lattices
In particular, for the Gaussian wiretap channel, where both the main channel and the wiretapper's channel are AWGN (additive white Gaussian noise) channels, we can construct two capacity-achieving lattice codes accordingly for the two channels. More explicitly, we construct two nested lattices Λ_b and Λ_e for Bob and Eve respectively, where Λ_b is AWGN-good [65] and Λ_e is secrecy-good [66]. To satisfy the power constraint, we then assign a Gaussian distribution to Λ_b and Λ_e simultaneously. As can be seen from Figure 10, when the two lattices are nested, i.e., Λ_e is a sub-lattice of Λ_b, it is convenient to shape the two lattices simultaneously, because a Gaussian distribution defined over a sub-lattice is still a lattice Gaussian distribution (LGD). It is worth noting that the lattice-based secure coding scheme can even achieve so-called semantic security [66], namely the scheme is secure for an arbitrarily distributed confidential message. A polar-lattice based implementation of the scheme in [66] can be found in [67].

AWGN-good lattices
The AWGN-goodness of a lattice can be regarded as the maximum efficiency of the lattice volume in resisting Gaussian noise. Suppose W is an n-dimensional zero-mean white Gaussian vector. The probability density function of W depends only on the Euclidean norm ‖W‖. By the law of large numbers (LLN), the normalized squared norm (1/n)‖W‖² converges in probability to the Gaussian noise variance σ², which means that a zero-mean white Gaussian vector tends to be uniformly distributed over a spherical shell of radius √(nσ²). For a high-dimensional sphere, most of the volume concentrates near the surface, and the probability that W escapes such a sphere vanishes. Therefore, to guarantee a vanishing error probability, the volume of a lattice only needs to be barely larger than the volume of an n-dimensional ball of radius √(nσ²), which is approximately (2πeσ²)^(n/2). Let V(Λ) denote the volume of a lattice Λ; AWGN-goodness establishes a lower bound on the normalized volume of Λ,

V(Λ)^(2/n) / σ² > 2πe,    (2)

for any desired error probability P_e → 0. Loeliger proved the existence of AWGN-good lattices in [68] using random linear codes over F_q. This construction, which directly lifts a q-ary non-binary linear code to the Euclidean space, is the construction A introduced in Section 5. The decoding of such construction A lattices is closely related to that of the embedded q-ary non-binary codes, which is more complex than decoding binary linear codes. For this reason, a more practical method of constructing AWGN-good lattices is attributed to Forney et al. [69]; it utilizes a multi-level structure and a series of nested binary linear codes and is referred to as construction D. Forney et al. showed that construction D lattices are AWGN-good, or equivalently sphere-bound-achieving, when the underlying binary codes are capacity-achieving at each level.
Following this line, we can use capacity-achieving codes such as polar codes [70] and LDPC codes [71] to construct such construction D lattices.
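The concentration argument above is easy to verify numerically. A small sketch (dimension and σ are arbitrary) checks both the LLN statement and the (2πeσ²)^(n/2) approximation of the ball volume:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 10_000, 1.5

# LLN: the normalized squared norm of white Gaussian noise concentrates at
# sigma^2, i.e. the noise lives on a shell of radius sqrt(n * sigma^2).
w = rng.normal(0.0, sigma, size=n)
avg_sq_norm = np.mean(w ** 2)

# Log-volume of the n-ball of radius sqrt(n * sigma^2),
# pi^(n/2) r^n / Gamma(n/2 + 1), versus the (2 pi e sigma^2)^(n/2)
# approximation used in the sphere bound; they agree to first order in n.
log_ball = (n / 2) * math.log(math.pi * n * sigma ** 2) - math.lgamma(n / 2 + 1)
log_approx = (n / 2) * math.log(2 * math.pi * math.e * sigma ** 2)
```

The per-dimension gap between the exact log-volume and the approximation is O((log n)/n), which is why the sphere bound is stated with (2πeσ²)^(n/2).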

Secrecy-good lattices
Unlike the aforementioned AWGN-good lattices, which are constructed to establish reliable transmission, the so-called secrecy-good lattices [66] are designed to prevent information leakage to adversarial parties in a communication system. For secrecy-good lattices, the inequality (2) is reversed so as to make it difficult to decode the confidential message correctly. This can be done by letting the rate of the binary code at each level of construction D slightly exceed the channel capacity, making the resulting lattice denser so that V(Λ)^(2/n)/σ² becomes smaller than 2πe. By Shannon's channel coding theorem, it is then impossible to recover the correct lattice point from the received signal, since the transmission rate is above the channel capacity. A more rigorous proof in [66] shows that secrecy-good lattices are capable of achieving strong secrecy.

LDPC codes for BEC wiretap channels
LDPC codes are famous for their capacity-approaching performance on many communication channels. However, LDPC codes have seen only limited success as wiretap codes. When the main channel V is noiseless and the wiretapper's channel W is a binary erasure channel (BEC), LDPC codes for this BEC wiretap channel were presented in [72,73]. In particular, the authors of [73] generalized the link between capacity-approaching codes and the weak secrecy capacity: using capacity-achieving codes for the wiretapper's channel is a sufficient condition for weak secrecy. This viewpoint provided a clear code construction method for secure communication across arbitrary wiretap channels. They then used this idea to construct the first secrecy-capacity-achieving LDPC codes, in terms of weak secrecy, for a wiretap channel with noiseless V and BEC W under belief propagation (BP) decoding. Later, Ref. [74] proved that the same construction can be used to guarantee strong secrecy at lower rates. A similar construction based on two-edge-type LDPC codes was proposed in [75] for the BEC wiretap channel in which V is no longer noiseless. Unfortunately, general LDPC codes do not have the capacity-achieving property for binary memoryless symmetric channels (BMSCs) other than BECs. Therefore, the coset coding scheme using general LDPC codes cannot achieve the secrecy capacity when the wiretapper's channel is not a BEC.
In this case, spatially coupled LDPC (SC-LDPC) codes, which provably achieve the capacity of general BMSCs, provide a promising approach. In [76], a coset coding scheme based on regular two-edge-type SC-LDPC codes was proposed for a BEC wiretap channel where the main channel is also a BEC. It is shown that the whole rate-equivocation region of such a BEC wiretap channel can be achieved by this scheme under the weak secrecy condition. Since SC-LDPC codes are universally capacity-achieving, it is also conjectured that this construction is optimal for the class of wiretap channels in which both the main channel and the wiretapper's channel are BMSCs and the wiretapper's channel is physically degraded with respect to the main channel.

Polar codes for degraded wiretap channels
Compared with LDPC codes, polar codes provide a more powerful approach to designing wiretap codes, since they are capacity-achieving for general BMS channels, not just for BECs. Recently there has been much interest in the design of wiretap codes based on polar codes. For example, polar codes are employed to build secure schemes for the degraded wiretap setting with BMS channels in [77], but only weak secrecy is guaranteed. In [78], it was shown that, with a minor modification of the original design, polar codes achieve strong secrecy (and also semantic security) for degraded wiretap channels; unfortunately, reliability over the main channel could not be guaranteed in the non-degraded case. In [79], a multi-block polar coding scheme was proposed to solve this reliability problem, under the condition that the number of blocks is sufficiently large. In the meantime, a similar multi-block coding scheme was discussed in [80]; it also achieves the secrecy capacity under the strong secrecy condition and guarantees reliability for the legitimate receiver. However, Ref. [80] only proved the existence of such a coding scheme, and thus it might be computationally hard to find its explicit structure. We now briefly introduce the idea of polar wiretap coding. The construction of a polar code consists of an information set I, which carries the message bits to be transmitted, and a frozen set F, which is fixed to values known in advance. When a channel W is degraded with respect to another channel V, their information sets satisfy I_W ⊆ I_V [81, Lemma 1.8]. The indices 1, . . . , n can then be divided into three sets: I_V^c, I_W, and I_V \ I_W. The wiretap coding scheme assigns to these three sets, respectively, frozen bits (known to both Bob and Eve prior to transmission), random bits (to confuse the eavesdropper), and message bits.
Due to the capacity-achieving property of polar codes, the secrecy capacity is achieved as lim_{n→∞} |I_V \ I_W| / n = C_b − C_e. The reliability for Bob is also guaranteed by standard polar decoding, and the weak secrecy can be proved using Fano's inequality.
For strong secrecy and non-degraded wiretap channels, the above wiretap coding scheme needs to be modified. In these cases, the inclusion relation of information sets in the degraded wiretap channel does not hold, and we have to partition the index set {1, . . . , n} into four sets, one of which is unreliable for Bob but reliable for Eve. This becomes problematic since bits in this set need to be known by Bob but kept secret from Eve. Fortunately, this problem can be solved by a multi-block technique proposed in [79] to achieve reliability and strong secrecy simultaneously. The idea is to allocate some reliable and secure bits in the current block for the bits in the problematic set of the next block. Details of this scheme can be found in [79].
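For the degraded case described two paragraphs above, the set partition can be computed explicitly for erasure channels, where the Bhattacharyya parameter of a BEC evolves as z → 2z − z² and z → z² under the two polarization transforms. A small sketch (block length, erasure rates, and threshold are illustrative):

```python
def bec_bhattacharyya(eps, n):
    """Bhattacharyya parameters of the n polarized subchannels of BEC(eps)."""
    z = [eps]
    while len(z) < n:
        z = [v for t in z for v in (2 * t - t * t, t * t)]
    return z

n, thr = 4096, 1e-3
Zb = bec_bhattacharyya(0.3, n)   # main channel: BEC(0.3)
Ze = bec_bhattacharyya(0.5, n)   # wiretapper's channel: BEC(0.5), degraded
I_V = {i for i in range(n) if Zb[i] < thr}   # reliable for Bob
I_W = {i for i in range(n) if Ze[i] < thr}   # reliable for Eve -> random bits
secret_rate = len(I_V - I_W) / n             # message set I_V \ I_W
```

Since both polarization maps are monotone, degradation gives I_W ⊆ I_V exactly, and the fraction |I_V \ I_W|/n approaches the secrecy capacity C_b − C_e = 0.7 − 0.5 = 0.2 as n grows.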

Practical instances
The above coding techniques for achieving the secrecy capacity of wiretap channels assume infinite block length. In real-world designs, engineers prefer the bit error rate (BER) as a measure of security: the legitimate recipient on the main channel should have an overwhelmingly smaller BER than the eavesdropper on the wiretap channel to achieve secure transmission. The security gap, a practical metric, is defined by S_sg = SNR_th(P_e^B) − SNR_th(P_e^E), where SNR_th(P_e^B) is the signal-to-noise ratio (SNR) threshold for Bob to reliably recover the message at a BER of P_e^B, and SNR_th(P_e^E) is the threshold below which Eve operates at a BER close to 0.5 and cannot extract useful information about the message from the received signal. In other words, the security gap defines the minimum SNR advantage the main channel must have over the wiretap channel to achieve reliable and secure transmission.
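As a toy illustration of the security gap, one can compute the two SNR thresholds for uncoded BPSK over AWGN from the closed-form BER Q(√(2E_b/N_0)); real designs would of course use the BER curves of the actual coded system, and the target BERs below are illustrative:

```python
import math

def ber_bpsk(snr_db):
    """Uncoded BPSK over AWGN: BER = Q(sqrt(2 Eb/N0)) = 0.5 erfc(sqrt(Eb/N0))."""
    return 0.5 * math.erfc(math.sqrt(10 ** (snr_db / 10)))

def snr_threshold(target_ber):
    """SNR (dB) at which the BER drops to the target, found by bisection."""
    lo, hi = -20.0, 20.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if ber_bpsk(mid) > target_ber:
            lo = mid
        else:
            hi = mid
    return hi

# Bob must decode reliably (BER 1e-5); Eve should sit at a BER near 0.5.
gap_db = snr_threshold(1e-5) - snr_threshold(0.4)
```

A sharper BER-SNR falloff, e.g. from a stronger code, moves the two thresholds closer together and thus shrinks the gap, which is exactly the design goal discussed next.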
Considering a typical BER-SNR curve illustrating the decoding performance of a given code, a sharper falloff yields a smaller security gap. Both theorists and practitioners have contributed to reducing the security gap. One exploration in that direction is the punctured LDPC codes of [82] for the Gaussian wiretap channel, where the punctured bits are used to transmit the secret information. As a result, the eavesdropper's BER deteriorates towards 0.5 faster as the SNR decreases than it would without puncturing. The security gap required to reach a given level of security can be further reduced with scrambling [83].
When the wiretap channel is of the same or even better quality than the main channel, a feedback mechanism, the automatic repeat-request (ARQ) protocol, has been used to achieve reliability and security [83]. Another secure transmission method based on ARQ can be found in [84], where a code-hopping scheme [85] with a single-use parity-check matrix is employed for encoding and ARQ is used for synchronization.

Open problems
The NIST standardization process marks the beginning of a paradigm shift to PQC, not the end. We offer a perspective on the future of this exciting area.
• What will post-NIST PQC look like, i.e., PQC beyond the NIST process? Traditionally, cryptography was also known as secure communication. Can we merge communication and cryptography as envisioned by Shannon himself? As cryptography plays an increasingly important role in secure computation, can we develop a unified theory of communication, computation, and security? Information theory, so fruitful for information-theoretic security, has been of little use for computational security; can algorithmic information theory make a difference? In data science and artificial intelligence, the privacy and security of data have become central concerns, so PQC will find broad applications in these areas.
• In PQC, cryptosystems based on linear codes (e.g., LWE) appear to be the most successful. The analogy between methods used in coding theory and in cryptography (e.g., Construction A, ideal lattices) is especially striking; PQC would greatly benefit from the synergy of the two fields. Going further, why restrict ourselves to linear codes? Linear codes are arguably the simplest algebraic structure from a mathematical point of view. What about nonlinear codes and more sophisticated algebraic structures, such as those from algebraic geometry?
• Decoding random linear codes, regardless of the metric, is believed to be hard even for quantum computers. But the supporting evidence is thin: it rests merely on the fact that no devastating quantum attacks on codes and lattices are known today. More research on the quantum hardness of coding problems is required.
• More interaction between coding theory and cryptography is needed. As discussed in Remark 1, encryption is analogous to channel coding, some lattice signatures can be viewed as the "reverse process" of source coding, and there is also a link between IBE and joint source-channel coding. Lattice- and code-based cryptosystems, especially the latter, require powerful error-correcting codes with extremely low decoding error rates and high decoding speed. The intricate channel models of these cryptosystems, as opposed to the standard i.i.d. model, invite new methods for designing good error-correcting codes.
• So far, quantum computing is destructive to cryptography. Can we use quantum computers constructively in cryptography? That is, can we use the computational power of quantum computers to design new cryptosystems that are also resilient to quantum attacks?
• Even classically, the security of PQC is not well understood. Problems of algebraic lattices, such as lattice reduction, deserve a thorough investigation. Similar techniques for codes are needed as well.
• Crypto conferences are apparently dominated by a small number of countries. The community should become more open and diverse, not only in terms of people but also in terms of disciplines. PQC is a chance to attract people from different countries and different disciplines.