Non-Commutative Ring Learning With Errors From Cyclic Algebras

The Learning with Errors (LWE) problem is the fundamental backbone of modern lattice-based cryptography, allowing one to establish cryptography on the hardness of well-studied computational problems. However, schemes based on LWE are often impractical, so Ring LWE was introduced as a form of "structured" LWE, trading off a hard-to-quantify loss of security for an increase in efficiency by working over a well-chosen ring. Another popular variant, Module LWE, generalizes this exchange by implementing a module structure over a ring. In this work, we introduce a novel variant of LWE over cyclic algebras (CLWE) to replicate the addition of the ring structure taking LWE to Ring LWE by adding cyclic structure to Module LWE. The proposed construction is both more efficient than Module LWE and conjecturally more secure than Ring LWE, the best of both worlds. We show that the security reductions expected for an LWE problem hold, namely a reduction from certain structured lattice problems to the hardness of the decision variant of the CLWE problem. As a contribution of theoretical interest, we view CLWE as the first variant of Ring LWE which supports non-commutative multiplication operations. This ring structure compares favorably with Module LWE, and naturally allows a larger message space for error correction coding.


I. INTRODUCTION
With the predicted advent of quantum computers compromising the bulk of existent cryptographic constructions, lattice based cryptography has emerged as a promising foundation for long term security. In particular, the Learning with Errors (henceforth LWE) problem introduced in [1], as well as its variants over rings (RLWE) [2] and modules (MLWE) [3], provides a natural intermediate step to base cryptographic hardness on lattice short vector problems in a post quantum setting. Indeed, second round submissions to the NIST post quantum standardisation process such as NewHope [4] and KYBER [5] rely on the hardness of LWE variants. Cryptography based on the classical LWE problem is typically somewhat impractical, in part due to large key sizes. To solve this, the ring variant was introduced as a way to provide extra structure in LWE to trade a potential loss of security for an increase in efficiency. MLWE generalizes ring and classical LWE, providing a smoother transition between security and efficiency than the binary option presented by ring or classical LWE. The flexibility of MLWE is highly desirable in practice, as demonstrated by third-round NIST finalists KYBER and SABER, both based on MLWE [6].
C. Grover is with the Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK (e-mail: c.grover15@imperial.ac.uk).
C. Ling is with the Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK (e-mail: cling@ieee.org).
R. Vehkalahti is with the Department of Communications and Networking, Aalto University, Espoo, FI-02150, Finland (e-mail: roope.vehkalahti@aalto.fi).
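Conceptually, one may view all these problems as variations on a single problem. The (search) LWE problem tasks a solver with recovering a secret vector $\mathbf{s} \in \mathbb{Z}_q^n$ from a collection of pairs $(\mathbf{a}_i, b_i = \langle \mathbf{a}_i, \mathbf{s} \rangle + e_i)$, where $\langle \cdot, \cdot \rangle$ denotes the inner product, each $\mathbf{a}_i \in \mathbb{Z}_q^n$ is uniformly random and the $e_i$'s are small random errors. In practice, we view this collection of equations in matrix-vector form:
$$A \cdot \mathbf{s} + \mathbf{e} = \mathbf{b},$$
where the rows of $A$ are the $\mathbf{a}_i$, all operations and entries are over $\mathbb{Z}_q$, and the challenge is to recover $\mathbf{s}$ from $A, \mathbf{b}$. A popular ring variant replaces $A, \mathbf{s}, \mathbf{e}$ with elements $a, s, e$ from the ring $R_q := \mathbb{Z}_q[x]/(x^n+1)$, requiring the solver to obtain $s$ from samples $a_i \cdot s + e_i$. For power-of-two $n$ this can be expressed in matrix-vector form by considering the matrix $\mathrm{rot}(a)$, the negacyclic matrix obtained from the coefficients of $a$. Explicitly, for $a = a_0 + a_1 x + \ldots + a_{n-1} x^{n-1}$ and bold faced letters denoting coefficient vectors, a sample from the RLWE distribution takes the form:
$$\mathrm{rot}(a) \cdot \mathbf{s} + \mathbf{e} = \begin{pmatrix} a_0 & -a_{n-1} & \cdots & -a_1 \\ a_1 & a_0 & \cdots & -a_2 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1} & a_{n-2} & \cdots & a_0 \end{pmatrix} \begin{pmatrix} s_0 \\ s_1 \\ \vdots \\ s_{n-1} \end{pmatrix} + \begin{pmatrix} e_0 \\ e_1 \\ \vdots \\ e_{n-1} \end{pmatrix} = \mathbf{b},$$
where once again operations and entries are over $\mathbb{Z}_q$. This is exactly a structured version of the classical LWE problem, where the uniformly random matrix $A$ has been replaced by the negacyclic matrix $\mathrm{rot}(a)$. Of course, this should be an easier problem to solve, yet no substantial progress has been made in using the structure of $\mathrm{rot}(a)$ to solve the problem efficiently.
We can extend this matrix-vector view to MLWE as well. An MLWE instance takes place in a module $M$ of dimension $d$ over $R_q$, such that a solver has to recover $\mathbf{s} \in M$ from a collection of pairs $(\mathbf{a}_i, \langle \mathbf{a}_i, \mathbf{s} \rangle + e_i)$ where $\mathbf{a}_i$ is a uniformly random element of $M$ and each $e_i$ is a small random element of $R_q$. A collection of such pairs can be viewed as $A\mathbf{s} + \mathbf{e} = \mathbf{b}$, where the ambient space $\mathbb{Z}_q$ has been replaced by $R_q$, e.g. with $d$ samples:
$$\begin{pmatrix} a_{1,1} & \cdots & a_{1,d} \\ \vdots & \ddots & \vdots \\ a_{d,1} & \cdots & a_{d,d} \end{pmatrix} \begin{pmatrix} s_1 \\ \vdots \\ s_d \end{pmatrix} + \begin{pmatrix} e_1 \\ \vdots \\ e_d \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_d \end{pmatrix},$$
where all operations are over $R_q$ and each $a_{i,j}$ is uniformly random.
Of course, we could extend this to have operations over $\mathbb{Z}_q$ by applying the $\mathrm{rot}(\cdot)$ operation coordinatewise, to obtain a structured LWE instance in dimension $nd$.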
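The identification of an RLWE sample with a structured LWE sample can be checked numerically. The following is a minimal sketch, assuming toy parameters ($n = 4$, $q = 17$) and hypothetical helper names (`polymul_negacyclic`, `rot`, `matvec`) chosen purely for illustration; it is not an implementation of any scheme discussed here.

```python
def polymul_negacyclic(a, s, q):
    """Multiply a(x) * s(x) in Z_q[x]/(x^n + 1); coefficients are listed low degree first."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                out[k] = (out[k] + a[i] * s[j]) % q
            else:
                # x^n = -1, so a term that wraps around picks up a minus sign
                out[k - n] = (out[k - n] - a[i] * s[j]) % q
    return out


def rot(a, q):
    """Negacyclic matrix of a: column j holds the coefficients of x^j * a(x)."""
    n = len(a)
    cols = []
    col = [c % q for c in a]
    for _ in range(n):
        cols.append(col)
        # multiply by x: shift down and negate the coefficient that wraps around
        col = [(-col[-1]) % q] + col[:-1]
    # transpose the list of columns into a list of rows
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def matvec(M, v, q):
    """Matrix-vector product over Z_q."""
    return [sum(m * x for m, x in zip(row, v)) % q for row in M]


q = 17                # toy modulus (assumption, far too small for security)
a = [3, 1, 4, 1]      # public ring element
s = [2, 7, 1, 8]      # secret
e = [1, 0, 16, 1]     # "small" errors; 16 represents -1 mod 17

b_ring = [(c + ei) % q for c, ei in zip(polymul_negacyclic(a, s, q), e)]
b_matrix = [(c + ei) % q for c, ei in zip(matvec(rot(a, q), s, q), e)]
```

Both `b_ring` and `b_matrix` describe the same sample, and only the first column of `rot(a, q)` needs to be stored.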
An advantage of these structured matrices is that they allow for streamlined storage and operations. For example, storing a uniformly random matrix $A$ requires one to store all $n^2$ of its entries, but $\mathrm{rot}(a)$ requires a factor $n$ less memory since one need only store its first column. Equivalently, one RLWE sample generates $n$ LWE samples while reducing the storage space and key sizes. Multiplication can also be sped up by using the Chinese Remainder Theorem (CRT) or other techniques.
This concept of improving efficiency by adding structure motivates this work: can we perform an analog of the transformation taking an LWE matrix $A$ to an RLWE matrix $\mathrm{rot}(a)$ for the module $M$? We solve this by constructing a new variant of the LWE problem over a certain non-commutative space known as a cyclic algebra. In recent years, cyclic algebras have received significant attention in the field of coding theory (see e.g. [7], [8], [9]) due to the particular nature of the matrix lattices they induce, and we view them as a suitable option for defining an LWE problem over a non-commutative ring. Though some efforts have been made to construct non-commutative LWE problems, for example [10], [11], the majority of non-commutative cryptography has relied on group theoretic constructions, whose underlying hard problems are often less robust than those of lattice cryptography. Somewhat informally, for a cyclic algebra $\mathcal{A}$ and well chosen parameters there exists an automorphism $\theta$ of $R_q$ and a $\gamma \in R_q$ such that an LWE style sample $a \cdot s + e$ over $\mathcal{A}$ can be written in matrix-vector form as $\phi(a) \cdot \mathbf{s} + \mathbf{e} = \mathbf{b}$, where $\phi(a)$ is a structured matrix determined by $a$, $\theta$ and $\gamma$, and all entries and operations are now over $R_q$. Though more complex than the transformation taking LWE to RLWE, this fulfills our goal of providing a structured version of MLWE, since we have replaced the uniformly random matrix $A$ over $R_q$ with the structured matrix $\phi(a)$, which requires a factor of $d$ less storage. Of course, by applying the $\mathrm{rot}(\cdot)$ operation coordinatewise, one can extend this to a high dimensional version of the LWE problem, now with two sets of structure lying on top of each other.

A. Contributions and Methodology
The main novel contribution of this work is a definition of Cyclic Algebra LWE (CLWE), together with justifications for its construction and a polynomial time reduction from short vector problems over matrix lattices induced by ideals in a cyclic algebra to CLWE, establishing its security on the assumption that such problems are hard. As in [2], the reduction bases the security of CLWE on short vector problems over ideal lattices in $\mathcal{A}$; similarly to ideal lattices in $K$, these have some extra underlying structure that might make computational problems easier. However, we leave the relative complexity of these problems as an open area of investigation.
Overall we consider it plausible that LWE in cyclic algebras could be both more efficient than MLWE and more secure than RLWE in a quantum setting. CLWE represents a middle ground between RLWE and MLWE, with the salient feature of its non-commutative ring structure. A cyclic algebra is equipped with a proper ring multiplication which preserves the dimension of the lattice. This is in sharp contrast to MLWE, which only supports scalar multiplication, and to RLWE, whose multiplication is commutative. Specifically, we consider the following advantages of our CLWE construction:
• Efficiency. CLWE can be seen as a structured variant of MLWE. Assuming for simplicity that the public key in LWE based schemes is a sample $(A, \mathbf{b})$, a public key generated as $A = \mathrm{rot}(\phi(a))$ requires only as much storage as that of an equivalent dimension RLWE public key. Multiplication in cyclic algebras can be implemented over a product of skew polynomial rings following a CRT-style decomposition (see Appendix G), for which well known fast algorithms, such as those of [13] and [14], can be applied to compute the operation $A \cdot \mathbf{s}$ more efficiently in the case where $A = \phi(a)$ than in the module case where $A$ is uniform.
• Security. Following recent works on quantum attacks on related ideal lattice problems (e.g. [15], [16], [17], [18] amongst others), we observe that the non-commutativity of multiplication in cyclic algebras may be viewed as a security advantage. This is because the Hidden Subgroup Problem (HSP), an integral part of the majority of algorithms using quantum computing to gain an advantage over classical computation, requires that the underlying group, in this case the unit group of $\mathcal{O}_K$, be commutative (see e.g. [19]), which is untrue for a non-commutative algebra. We conjecture that the security level is higher than RLWE, but welcome further cryptanalysis. We actively avoid known attacks on previous attempts to create structured MLWE (see Section III-B).
• Decryption failure rates. The scalar multiplication of MLWE is dimension-lossy. In other words, the message space of MLWE is restricted to $R_q$, whose dimension is smaller than that of the module lattice. This leaves less room for error correction coding in MLWE-based schemes (e.g., a KYBER instance for a key size of 256 works within $R_q$ of dimension 256). This limitation of MLWE appears to be fundamental, due to its module structure. In contrast, the dimension of the message space of CLWE is that of the (non-commutative) ring, which is higher by a factor of $d$. Thus, it accommodates better error correction coding (see Section V-B), which matters because low decryption failure rates are desired under chosen ciphertext attacks (CCA). Even trivial repetition coding can dramatically reduce decryption failure rates (e.g., NewHope).
• Functionality. We view the ring structure of CLWE as a major advantage over MLWE, which opens up the prospect of extra functionality. For example, since operations are composable and non-commutative, one could hope to construct FHE in this non-commutative ring. We leave this frontier open for separate work.

B. Related Work and Organization
This work is related to a number of different areas: lattice-based cryptography, information theory and number theory.
In lattice-based cryptography, an alternative construction for structured module LWE, called multivariate-RLWE, was presented in [20], [21], which takes the tensor product of two (or more) number fields in order to provide a structured module matrix. However, an efficient implementation of [20] was attacked in [22], together with a warning about taking care when putting structure on a module. In short, [22] attacks certain instances of multivariate-RLWE by providing a homomorphism to some underlying subfield $K$, dramatically reducing the dimension of the lattice problem to be attacked. Fortunately for this work, a somewhat technical condition on the choice of $\gamma$ known as the non-norm condition precludes the existence of such a homomorphism reducing the dimension of CLWE (see Section III-B). It is worth pointing out that their problem has been addressed in [21], and in fact this fix looks somewhat like our non-norm condition (e.g., unlike the original version, full rank is maintained in [21]).
This paper is inspired by the abundant literature of space-time coding based on cyclic division algebras (see the monographs [8], [23] and references therein). On a high level, our construction is reminiscent of multi-block space-time codes [24], [25], rather than single-block codes [26], [27], with the caveat of scaling up the number of blocks to make the codes practically undecodable. In the context of space-time coding, our construction generalizes [25] and offers greater flexibility in the code parameters (the number of blocks vs. the number of antennas). Multi-block space-time codes have been used in [9] to achieve information-theoretic security over wiretap channels, as opposed to the computational security in the classic cryptographic setting of this paper. Maximal orders were shown in [28], [7] to be advantageous over the so-called natural orders; both types of orders play a crucial role in this paper.
There is a major difference between the roles of cyclic algebras in coding and cryptography, though: the primary concern for coding is the non-vanishing determinant (NVD), while the non-commutative ring structure becomes crucial for cryptography. For efficient multiplication of elements in a cyclic algebra, we heavily rely on the CRT technique of [29]; a similar technique has been used in lattice index codes [30], [31].
We present two approaches (subfields and compositum fields) to the construction of novel cyclic division algebras, which enlarge the pool of algebras and may find other applications. Specifically, our proof that the natural order of the family of cyclic division algebras constructed in Section III-C1 (including those in [25]) is in fact maximal, is an original contribution.
The rest of this paper is organized as follows. In Section II we provide necessary background material on lattices, number fields, and cyclic algebras. In Section III we provide a definition and discussion of CLWE, together with novel constructions of cyclic division algebras for the CLWE problem. In Section IV we provide a reduction from structured lattice problems to search CLWE, as well as a search-worst case decision reduction for CLWE. In Section V we show a sample CLWE cryptosystem and provide an estimate of its asymptotic operation complexity. Finally, the paper is concluded in Section VI with a discussion of open problems. For a smooth flow of the main text, certain proofs, sideline discussions and technical details are deferred to appendices.

A. Lattices
A lattice is a discrete additive subgroup of a vector space $V$. If $V$ has dimension $n$, a lattice $L$ can be viewed as the set of all integer linear combinations of a set of linearly independent vectors $B = \{b_1, \ldots, b_k\}$, that is, $L = \{\sum_{i=1}^{k} z_i b_i : z_i \in \mathbb{Z}\}$. If $k = n$ we call the lattice full-rank, and we will only consider lattices of full rank. We can extend this notion of lattices to matrix spaces by stacking the columns of a matrix. We recall two standard lattice definitions.
Definition 1. Given a lattice $L$ in a space $V$ endowed with a norm $\|\cdot\|$, the minimum distance of $L$ is defined as $\lambda_1(L) = \min_{v \in L \setminus \{0\}} \|v\|$. Similarly, $\lambda_n(L)$ is the minimum length of a set of $n$ linearly independent vectors of $L$, where the length of a set of vectors $\{x_1, \ldots, x_n\}$ is defined as $\max_i \|x_i\|$.
Definition 2. Given a lattice $L \subset V$, where $V$ is endowed with an inner product $\langle \cdot, \cdot \rangle$, the dual lattice $L^*$ is defined as $L^* = \{v \in V : \langle L, v \rangle \subset \mathbb{Z}\}$.
We can use the Gaussian function $\rho_r$ to define the spherical Gaussian distribution $D_r$ over $V$, which outputs $v$ with probability proportional to $\rho_r(v)$. Similarly, we can sample an elliptical Gaussian $D_{\mathbf{r}}$ in a basis $b_1, \ldots, b_n$ of $V$, for $\mathbf{r} = (r_1, \ldots, r_n)$ a vector of positive reals, by sampling $x_1, \ldots, x_n$ independently from the one dimensional Gaussian distributions $D_{r_i}$ and outputting $\sum_{i=1}^{n} x_i b_i$. When sampling a Gaussian over a lattice $L$ we will use the discrete form of the Gaussian distribution. We define the distribution $D_{L,r}$ over $L$ by outputting $x$ with probability $\rho_r(x)/\rho_r(L)$ for each $x \in L$. This version of the discrete Gaussian is centered at 0, which in general need not be the case.
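The two sampling procedures above can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian function is taken to be the convention $\rho_r(x) = \exp(-\pi \|x\|^2 / r^2)$ common in the LWE literature (the definition is not reproduced in the text above), the discrete Gaussian is shown over the toy lattice $\mathbb{Z}$ with a truncated tail, and all function names are hypothetical.

```python
import math
import random


def rho(x, r):
    """Gaussian function rho_r(x) = exp(-pi * ||x||^2 / r^2) (assumed convention)."""
    return math.exp(-math.pi * sum(c * c for c in x) / r ** 2)


def sample_elliptical(basis, rs, rng):
    """Sample x_i from the one-dimensional D_{r_i} and output sum_i x_i * b_i.

    Under the convention above, D_r is a normal distribution with
    standard deviation r / sqrt(2 * pi).
    """
    coeffs = [rng.gauss(0.0, r / math.sqrt(2 * math.pi)) for r in rs]
    dim = len(basis[0])
    return [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(dim)]


def discrete_gaussian_pmf(r, tail=12):
    """Probabilities rho_r(z) / rho_r(Z) over the lattice Z.

    The infinite sum rho_r(Z) is approximated by truncating to |z| <= tail * r,
    which is an approximation made only for this sketch.
    """
    bound = math.ceil(tail * r)
    weights = {z: rho([z], r) for z in range(-bound, bound + 1)}
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}


rng = random.Random(0)
basis = [[1.0, 0.0], [1.0, 1.0]]          # a basis b_1, b_2 of R^2
point = sample_elliptical(basis, [1.0, 3.0], rng)
pmf = discrete_gaussian_pmf(2.0)
```

The resulting `pmf` is symmetric around 0, reflecting that this discrete Gaussian is centered at the origin.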
An important lattice quantity, known as the smoothing parameter, was introduced in [32]. The motivation for the name is provided by Lemma 1 following the definition.
The following is a special case of [32], Lemma 4.1.
Lemma 1. For a lattice $L$ over $\mathbb{R}^n$, $\varepsilon > 0$, $r \geq \eta_\varepsilon(L)$, and $x \in \mathbb{R}^n$, the statistical distance between $(D_r + x) \bmod L$ and the uniform distribution modulo $L$ is bounded above by $\varepsilon/2$.
We introduce well known lemmas used to relate the smoothing parameter to standard lattice properties. The first comes from [33], the second from [34].

C. Algebraic Number Theory
Definition 5. A number field K is a finite degree extension of the rationals Q. Typically, we define a number field by adjoining some algebraic element α ∈ C and set K = Q(α). The degree of K refers to its degree as a field extension.
To define a cyclic algebra, we will need to take an additional extension of K. In particular, we will need the extension to be Galois over K, defined as follows.
Definition 6. Let L/K be an extension of number fields of dimension d. The Galois group of L over K is the group Aut(L/K) of automorphisms of L that fix K. We say that the extension is Galois if the subfield of L fixed by Aut(L/K) is exactly K.
We define a cyclic Galois extension $L/K$ to be a Galois extension such that the Galois group of $L$ over $K$ is the cyclic group generated by some element $\theta$ of order $d := [L : K]$. Finally, we require the ring of integers of a number field.
Definition 7. Given a number field $K$, its ring of integers $\mathcal{O}_K$ is the ring consisting of those elements of $K$ whose minimal polynomial over $\mathbb{Q}$ lies in $\mathbb{Z}[x]$.
It is easy to check that if $L/K$ is an extension of number fields then $\mathcal{O}_K \subseteq \mathcal{O}_L$.
1) The Canonical Embedding: Let $K = \mathbb{Q}(\alpha)$ be a number field of degree $n$. It is a well known fact that there are exactly $n$ distinct ring embeddings $\sigma_i : K \to \mathbb{C}$. These embeddings correspond to the $n$ distinct injective ring homomorphisms mapping $\alpha$ to the roots of its minimal polynomial $f$. We split these embeddings and say that there are $r_1$ real embeddings (whose images lie in $\mathbb{R}$) and $r_2$ conjugate pairs of complex embeddings (the complex embeddings come in pairs since complex roots of $f$ occur in conjugate pairs), such that $r_1 + 2r_2 = n$. The standard convention is to order the embeddings such that the $r_1$ real embeddings come first and the complex embeddings are arranged such that $\sigma_{r_1+r_2+j} = \overline{\sigma_{r_1+j}}$ for $1 \leq j \leq r_2$.
Definition 8. Let $K = \mathbb{Q}(\alpha)$ be a number field of degree $n = r_1 + 2r_2$. The canonical embedding $\sigma$ is the ring homomorphism $\sigma : K \to \mathbb{R}^{r_1} \times \mathbb{C}^{2r_2}$ defined by $\sigma(x) = (\sigma_1(x), \ldots, \sigma_n(x))$.
Formally, $\sigma$ maps into the space
$$H = \{(x_1, \ldots, x_n) \in \mathbb{R}^{r_1} \times \mathbb{C}^{2r_2} : x_{r_1+r_2+j} = \overline{x_{r_1+j}}, \ 1 \leq j \leq r_2\},$$
which is isomorphic to $\mathbb{R}^n$ as an inner product space.
We can equip $H$ with the orthonormal basis $\{h_j\}_{j=1}^{n}$, where $h_j = e_j$ for $1 \leq j \leq r_1$, and $h_j = \frac{1}{\sqrt{2}}(e_j + e_{j+r_2})$, $h_{j+r_2} = \frac{i}{\sqrt{2}}(e_j - e_{j+r_2})$ for $r_1 < j \leq r_1 + r_2$, with $e_j$ denoting the standard basis vectors of $\mathbb{C}^n$, and use the well defined $\ell_p$ norm induced by viewing $H$ as a subset of $\mathbb{C}^n$. Observe that multiplication in $K$ maps to coordinatewise multiplication in $H$. The $\ell_2$ norm on $H$ allows us to efficiently sample a Gaussian distribution $D_{\mathbf{r}}$ over $K$ by sampling such a Gaussian coordinatewise over $H$, although technically this distribution is over the field tensor product $K_{\mathbb{R}} = K \otimes_{\mathbb{Q}} \mathbb{R} \cong H$. Furthermore, it satisfies the property that for any $x \in K_{\mathbb{R}}$ we have the equality of distributions $x \cdot D_{\mathbf{r}}$ and $D_{\mathbf{r}'}$, where $r'_i = r_i \cdot |\sigma_i(x)|$. When we have an extension of number fields $L/K$ we will denote their respective canonical embeddings $\sigma_L$ and $\sigma_K$ as maps into $H_L$ and $H_K$ to avoid confusion.
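The claim that multiplication in $K$ becomes coordinatewise multiplication under the canonical embedding can be checked numerically. The sketch below assumes, purely for illustration, the field $K = \mathbb{Q}(\zeta_8)$ defined by $f(x) = x^4 + 1$ (so $r_1 = 0$, $r_2 = 2$); the helper names are hypothetical, and floating point arithmetic is used, so equality holds only up to rounding.

```python
import cmath

N = 4  # degree of K = Q(zeta_8), defined by f(x) = x^4 + 1 (illustrative choice)

# Roots of f, ordered so that sigma_{r2 + j} is the complex conjugate of sigma_j.
ANGLES = [cmath.pi / 4, 3 * cmath.pi / 4, -cmath.pi / 4, -3 * cmath.pi / 4]
ROOTS = [cmath.exp(1j * t) for t in ANGLES]


def sigma(coeffs):
    """Canonical embedding: evaluate the polynomial representative at every root of f."""
    return [sum(c * r ** k for k, c in enumerate(coeffs)) for r in ROOTS]


def mul_mod_f(a, b):
    """Product in Q[x]/(x^4 + 1), using the relation x^4 = -1."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
            else:
                out[i + j - N] -= ai * bj
    return out


a = [1, 2, 0, -1]
b = [3, 0, 1, 2]

embedded_product = sigma(mul_mod_f(a, b))
product_of_embeddings = [x * y for x, y in zip(sigma(a), sigma(b))]
max_gap = max(abs(x - y) for x, y in zip(embedded_product, product_of_embeddings))
```

Here `max_gap` is zero up to floating point error, and the last two coordinates of `sigma(a)` are the conjugates of the first two, so the image lies in $H$.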
2) Relative Embeddings: In the case of an extension $L$ of a number field $K$ it is sometimes more convenient to apply a different order on its embeddings induced by extending embeddings of $K$ to those of $L$. Given a tower $L/K/\mathbb{Q}$ where $K$ has degree $n$ and $L$ has degree $d$ over $K$, there are precisely $n$ embeddings $\sigma_1, \ldots, \sigma_n$ of $K$ into $\mathbb{C}$. Assuming $L/\mathbb{Q}$ is Galois, each of these can be extended to an embedding $\alpha_i : L \to L$ such that $\alpha_i|_K = \sigma_i$. However, these extensions are not unique, and it is easy to see that there are $[L : K] = d$ choices for each $\alpha_i$. In particular, in the case where $L/K$ is a cyclic extension with Galois group generated by $\theta$ it holds that the composite automorphisms $\alpha_i \circ \theta^j(\cdot)$, $1 \leq j \leq d$, run through the $d$ choices of $\alpha_i$. Hence for a fixed choice of $\alpha_1, \ldots, \alpha_n$ the $nd$ automorphisms of $L$ can each be uniquely represented by some $\alpha_i \circ \theta^j(\cdot)$, which we denote by $\alpha_i^j(\cdot)$, $1 \leq i \leq n$, $1 \leq j \leq d$. Given the usual ordering of embeddings of $K$ this induces two systematic orderings on the embeddings of $L$ by running through either the $i$ or $j$ coordinates first.

D. Cyclic Algebras
Definition 9. Let $K$ be a number field of degree $n$, and let $L$ be a Galois extension of $K$ of degree $d$ such that the Galois group of $L$ over $K$ is cyclic of order $d$, $\mathrm{Gal}(L/K) = \langle \theta \rangle$. For non-zero $\gamma \in K$ we define the resulting cyclic algebra
$$\mathcal{A} = (L/K, \theta, \gamma) := L \oplus uL \oplus \cdots \oplus u^{d-1}L,$$
where $\oplus$ denotes the direct sum, and $u \in \mathcal{A}$ is some auxiliary generating element of $\mathcal{A}$ satisfying the additional relations $xu = u\theta(x)$ for all $x \in L$ and $u^d = \gamma$. We will call $d$ the degree of the algebra $\mathcal{A}$. We call such an algebra a division algebra if every non-zero element $a \in \mathcal{A}$ has an inverse $a^{-1} \in \mathcal{A}$ such that $aa^{-1} = 1$.
The relations among K, L and A are illustrated in Fig. 1.

Fig. 1: Structure of a cyclic algebra.
Since $\theta$ fixes $K$, the center of the cyclic algebra is precisely $K$. Oftentimes the condition $\gamma \in K$ is replaced by the stronger condition $\gamma \in \mathcal{O}_K$, and we will use this condition in our work to guarantee the existence of a certain subring known as the natural order. Note that the division property does not hold for arbitrary $\gamma$, and such algebras are not always easy to construct, which we will discuss later in this section.
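We present a matrix representation of elements of $\mathcal{A}$ which proves useful for computing multiplication in cyclic algebras. We can naturally view an element $a = \sum_{i=0}^{d-1} u^i a_i \in \mathcal{A}$ as a $d$-dimensional vector $\mathrm{Vec}(a) = (a_0, a_1, \ldots, a_{d-1})^T$ over $L$, in which case we can view left multiplication of elements as matrix-vector operations. This is done by defining the map $\phi : \mathcal{A} \to M_{d \times d}(L)$,
$$\phi(a) = \begin{pmatrix} a_0 & \gamma\theta(a_{d-1}) & \gamma\theta^2(a_{d-2}) & \cdots & \gamma\theta^{d-1}(a_1) \\ a_1 & \theta(a_0) & \gamma\theta^2(a_{d-1}) & \cdots & \gamma\theta^{d-1}(a_2) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{d-1} & \theta(a_{d-2}) & \theta^2(a_{d-3}) & \cdots & \theta^{d-1}(a_0) \end{pmatrix}.$$
We call this mapping a left regular representation of $\mathcal{A}$, because it holds for any $a, b \in \mathcal{A}$ that $\phi(a)\mathrm{Vec}(b) = \mathrm{Vec}(ab)$, and that $\phi(ab) = \phi(a) \cdot \phi(b)$. In the case where $\mathcal{A}$ is a division algebra it follows that each $\phi(a)$ with $a \neq 0$ is an invertible matrix. Since $\theta$ is well defined on $L_{\mathbb{R}}$ we abuse notation and extend this map to $\bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$. We derive lattices from subrings of a cyclic algebra by vectorising their images under $\phi$.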
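The two properties of the left regular representation can be verified on a small instance. In the sketch below, entry $(k, j)$ of $\phi(a)$ is $\theta^j(a_{(k-j) \bmod d})$, multiplied by $\gamma$ when $j > k$, which follows directly from the relations $xu = u\theta(x)$ and $u^d = \gamma$. The concrete algebra ($K = \mathbb{Q}$, $L = \mathbb{Q}(i)$, $\theta$ complex conjugation, $\gamma = -1$, $d = 2$, i.e. Hamilton's quaternions over $\mathbb{Q}$) is an assumption made for illustration and is not one of the algebras constructed in this paper; the function names are likewise hypothetical.

```python
D = 2        # degree of the algebra (toy choice)
GAMMA = -1   # u^D = GAMMA


def theta_pow(x, j):
    """Apply theta j times; here theta is complex conjugation, which has order 2."""
    return x.conjugate() if j % 2 else x


def mul(a, b):
    """Product in A for a = sum_i u^i a_i, using a_i u^j = u^j theta^j(a_i) and u^D = GAMMA."""
    out = [0j] * D
    for i in range(D):
        for j in range(D):
            term = theta_pow(a[i], j) * b[j]
            k = i + j
            if k >= D:
                # u^(i+j) = GAMMA * u^(i+j-D)
                k -= D
                term *= GAMMA
            out[k] += term
    return out


def phi(a):
    """Left regular representation: entry (k, j) is theta^j(a_{k-j mod D}), times GAMMA if j > k."""
    return [[(GAMMA if j > k else 1) * theta_pow(a[(k - j) % D], j)
             for j in range(D)] for k in range(D)]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(M, P):
    return [[sum(M[i][k] * P[k][j] for k in range(D)) for j in range(D)]
            for i in range(D)]


a = [1 + 2j, 3 - 1j]   # a = a_0 + u * a_1 with Gaussian-integer coefficients
b = [2 - 1j, 0 + 4j]

vec_ab = mul(a, b)
det_phi_a = phi(a)[0][0] * phi(a)[1][1] - phi(a)[0][1] * phi(a)[1][0]
```

For this toy algebra the determinant of $\phi(a)$ equals $|a_0|^2 + |a_1|^2$, which is non-zero for every non-zero $a$, matching the invertibility statement for division algebras.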
Definition 10. Let $\mathcal{A} = (L/K, \theta, \gamma)$ be a cyclic division algebra. A $\mathbb{Z}$-order $\Lambda$ in $\mathcal{A}$ is a finitely generated $\mathbb{Z}$-module such that $\Lambda \cdot \mathbb{Q} = \mathcal{A}$ and that $\Lambda$ is a subring of $\mathcal{A}$ with the same identity element as $\mathcal{A}$. We call $\Lambda$ maximal if there is no $\mathbb{Z}$-order $\Gamma$ such that $\Lambda \subsetneq \Gamma \subsetneq \mathcal{A}$. Here, $\Lambda \cdot \mathbb{Q}$ denotes the $\mathbb{Q}$-span of $\Lambda$. Since we are only concerned with $\mathbb{Z}$-orders in this paper, we will just refer to them as orders.
Example 1. The ring of integers $\mathcal{O}_K$ of a number field $K$ is the unique maximal order of $K$. In the case of cyclic algebras a maximal order is not necessarily unique.
An order of particular interest that we will use in our LWE construction is known as the natural order, defined as $\Lambda := \bigoplus_{i=0}^{d-1} u^i \mathcal{O}_L$. Unlike in the case of $\mathcal{O}_K$, this order is not necessarily maximal (however, we are going to work with natural orders that are also maximal). Note that in order for $\Lambda$ to be closed under multiplication the element $\gamma$ must lie in $\mathcal{O}_K$.
1) Non-Norm Condition: It is not a priori obvious whether well-defined cyclic algebras or orders actually exist. As observed earlier, the existence of $\gamma$ enforcing the division algebra condition is a key component in constructing such objects. Fortunately, it is sufficient for $\gamma$ to satisfy the so called 'non-norm condition' [7].
Proposition 1. The cyclic algebra $\mathcal{A} = (L/K, \theta, \gamma)$ of degree $d$ is a division algebra if and only if none of the elements $\gamma^t$, $1 \leq t \leq d - 1$, appears in $N_{L/K}(L)$, where $N_{L/K}$ denotes the relative norm from $L$ to $K$.
In other words, this condition states that the lowest power of $\gamma$ that is the norm of some element of $L$ is $\gamma^d$.
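As a minimal worked instance of the non-norm condition, consider the following toy choice of parameters, which is an assumption made here for illustration and is not one of the algebras constructed in this paper: $K = \mathbb{Q}$, $L = \mathbb{Q}(i)$, $\theta$ complex conjugation and $\gamma = -1$.

```latex
\begin{align*}
K &= \mathbb{Q}, \quad L = \mathbb{Q}(i), \quad \theta(a + bi) = a - bi, \quad d = 2, \quad \gamma = -1, \\
N_{L/K}(a + bi) &= (a + bi)\,\theta(a + bi) = a^2 + b^2 \geq 0
\quad \Longrightarrow \quad \gamma^{1} = -1 \notin N_{L/K}(L).
\end{align*}
```

Since $d - 1 = 1$, only $t = 1$ has to be checked, so this algebra is a division algebra; it is the algebra of Hamilton's quaternions over $\mathbb{Q}$. Note also that $\gamma^2 = 1 = N_{L/K}(1)$, so the lowest power of $\gamma$ that is a norm is indeed $\gamma^d$.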
2) Order Ideals: Analogous to the use of $\mathcal{O}_K$ ideals in RLWE, we will be interested in ideals of an order $\Lambda$ of a cyclic division algebra $\mathcal{A}$. Although $\Lambda$ is a ring, it is non-commutative, so there are three types of ideals. A left (respectively right) ideal $\mathcal{I}$ of $\Lambda$ is an additive subgroup of $\Lambda$ such that for any $i \in \mathcal{I}$, $r \in \Lambda$, we have $r \cdot i \in \mathcal{I}$ (respectively $i \cdot r \in \mathcal{I}$). A two-sided ideal of $\Lambda$ is an additive subgroup that is closed under left and right scaling by $\Lambda$, i.e. a right ideal that is also a left ideal. The sum and product of two ideals $\mathcal{I}, \mathcal{J}$ are defined as usual: $\mathcal{I} + \mathcal{J} = \{i + j : i \in \mathcal{I}, j \in \mathcal{J}\}$ and $\mathcal{I} \cdot \mathcal{J} = \{\sum_{l=1}^{m} i_l \cdot j_l : i_l \in \mathcal{I}, j_l \in \mathcal{J}, m \in \mathbb{N}\}$. In the case of two-sided ideals we have the standard notion of a fractional ideal: $\mathcal{I}$ is a fractional ideal of $\Lambda$ if $c\mathcal{I} = \mathcal{J}$ for a two-sided ideal $\mathcal{J}$ and some $c \in K$. In the rest of this paper, a (fractional or integral) ideal is always restricted to be two-sided, unless otherwise stated.
We remark that the structure of the collection of two-sided ideals of the natural order is not as simple as those of $\mathcal{O}_K$, or indeed those of an arbitrary maximal order. In a maximal order, the group of two-sided ideals is a free abelian group generated by the prime (i.e. maximal) ideals [35, Theorem 22.10], from which one can deduce obvious definitions of inverse and coprime ideals. For a general order $\Lambda$, we define its prime ideals as its maximal two-sided ideals, and the inverse of an ideal $\mathcal{I} \subset \Lambda$ is $\mathcal{I}^{-1} := \{x \in \mathcal{A} : \mathcal{I} \cdot x \cdot \mathcal{I} \subset \mathcal{I}\}$, which lines up with the expected definition in the two-sided case (i.e. $\mathcal{I} \cdot \mathcal{I}^{-1} = \mathcal{I}^{-1} \cdot \mathcal{I} = \Lambda$).
For the case of the natural order we do not have such a well-behaved ideal group, but a nice exposition is given in [29, Section 3]. In particular, for a two-sided ideal $\mathcal{I} \subset \Lambda$, $\mathcal{I} \cap \mathcal{O}_K$ is an ideal of $\mathcal{O}_K$. For an ideal $I \subset \mathcal{O}_K$, $(I \cdot \Lambda) \cap \mathcal{O}_K = I$, from which it follows that this intersection map is a surjection onto the ideals of $\mathcal{O}_K$. However, it is not in general an injection since several ideals of $\Lambda$ may have the same intersection with $\mathcal{O}_K$. Since the ideals of $\Lambda$ do not in general form a finitely generated abelian group, we define two ideals $\mathcal{I}, \mathcal{J}$ of $\Lambda$ to be coprime if $\mathcal{I} + \mathcal{J} = \Lambda$.
Nonetheless, since the orders to be constructed in Section III-C1 are both natural and maximal, it will always hold for a two-sided ideal $\mathcal{I}$ that $\mathcal{I} \cdot \mathcal{I}^{-1} = \mathcal{I}^{-1} \cdot \mathcal{I} = \Lambda$ and $(\mathcal{I}^{-1})^{-1} = \mathcal{I}$. These properties will be required in the proofs of Lemmas 6 and 7.
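3) Some Useful Ideals: For an order $\Lambda$ we define the codifferent ideal
$$\Lambda^\vee = \{x \in \mathcal{A} : \mathrm{Tr}(x\Lambda) \subset \mathbb{Z}\},$$
where $\mathrm{Tr}$ refers to the reduced trace, defined as $\mathrm{Tr}(a) := \mathrm{Tr}_{K/\mathbb{Q}}(\mathrm{Trace}(\phi(a)))$. Similarly, for an ideal $\mathcal{I}$ we define the dual ideal
$$\mathcal{I}^\vee = \{x \in \mathcal{A} : \mathrm{Tr}(x\mathcal{I}) \subset \mathbb{Z}\}.$$
Since the matrix trace satisfies $\mathrm{Trace}(AB) = \mathrm{Trace}(BA)$, this definition is two-sided. Note that the codifferent ideal and a general dual ideal may be fractional ideals rather than full ideals, and they satisfy the equality $\mathcal{I}^\vee = \Lambda^\vee \cdot \mathcal{I}^{-1}$ for any ideal $\mathcal{I}$. We will also be interested in principal ideals, but must take more care with these than in commutative settings. For a central element $t \in K$, we can define simply $\langle t \rangle = t \cdot \Lambda$, the set of elements of $\Lambda$ divisible by $t$. However, for a general $t$ that does not lie in the center of $\Lambda$ we need the slightly more complex definition
$$\langle t \rangle = \Lambda \cdot t \cdot \Lambda = \Big\{\sum_{i} a_i \, t \, b_i : a_i, b_i \in \Lambda\Big\},$$
where the sums are finite, which can easily be seen to be a two-sided ideal, and moreover the smallest one that contains $t$.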
4) Orders and Ideals as Integer Lattices: Any order $\Lambda$ of a cyclic algebra $\mathcal{A} = (L/K, \theta, \gamma)$ has dimension $nd^2$ over $\mathbb{Z}$ and thus generates a lattice of dimension $nd^2$ over $\mathbb{Z}$. We will consider the following representation of these lattices, which extends naturally to ideals of orders as well. Consider an element $x = \sum_{i=0}^{d-1} u^i x_i \in \Lambda$. We can consider $x$ as a vector over $H_L$ of dimension $d$ by $\sigma_{\mathcal{A}}(x) := (\sigma_L(x_0), \sigma_L(x_1), \ldots, \sigma_L(x_{d-1}))$. Then, the collection $\sigma_{\mathcal{A}}(\Lambda)$ forms an integer lattice of dimension $nd^2$. We will refer to this representation as the "module representation" and will sometimes double index the element $x$, denoting by $x_{i,j}$ the embedding $\sigma_j(x_i)$, and extend this notation in the obvious manner to the space $\bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$. Though this representation is conceptually simple, we remark that it has some drawbacks in the case where $|\sigma_i(\gamma)| \neq 1$ for some $i$ when considering sizes of lattice elements; we will choose $\gamma$ carefully in our constructions to remove this issue.
5) Gaussian Distributions Over Cyclic Algebras: As in (R)LWE, we will need to sample Gaussian distributions over our ambient space in certain norms. In the case of RLWE, the continuous Gaussians are sampled in $K_{\mathbb{R}} \cong H$. Since a cyclic algebra $\mathcal{A}$ can be viewed as a $d$-dimensional vector space over $L$, we use the representation from the previous subsection and sample our error distributions over $\bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$, which has the same structure as a vector space as $H_L^d$.
For simplicity we restrict ourselves to the case when $|\sigma_i(\gamma)| = 1$ for each $i$. Although this is a strong condition on $\gamma$, it holds in the case where $\gamma$ is a root of unity, which we will enforce later. Otherwise, in order to maintain a norm that is sub-multiplicative the norm and shape of $\gamma$ must be considered. Explicitly, we just consider the norm of an element of $\mathcal{A}$ to be equal to the norm of the corresponding module element in $L^d$ of dimension $nd^2$ used in [3], i.e.
$$\|x\| := \Big(\sum_{i=0}^{d-1} \|\sigma_L(x_i)\|^2\Big)^{1/2}.$$
It is straightforward to check that this is indeed a norm in the case where $|\sigma_i(\gamma)| = 1$ for each $i$, since $\gamma$ is fixed under $\theta$ and multiplying by $\gamma$ does not change the norm of an entry of $\sigma_L$. It is clear that this norm extends to any $y \in \bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$ in a natural manner. Now that we have defined a norm, it is easy to define a Gaussian distribution $D_r$ on $\mathcal{A}$, or its discrete analogue on $\Lambda$, by sampling over the module $L_{\mathbb{R}}^d$.
6) The CRT: In this subsection we state the CRT for order ideals, and deduce some important consequences. We note that the following lemmas are merely adaptations of those in [2, Section 2.3.8] extended to the case of cyclic algebras. The first is just the CRT.
We call a CRT basis for a set of coprime order ideals $\mathcal{I}_1, \ldots, \mathcal{I}_r$ a basis $C = \{c_1, \ldots, c_r\}$ of elements of $\Lambda$ satisfying $c_i = 1 \bmod \mathcal{I}_i$ and $c_i = 0 \bmod \mathcal{I}_j$ for $i \neq j$.
Lemma 5. Given pairwise coprime ideals $\mathcal{I}_1, \ldots, \mathcal{I}_r$ of an order $\Lambda$, there is a deterministic polynomial time algorithm that outputs a CRT basis $c_1, \ldots, c_r \in \Lambda$ for those ideals.
The proof is the same as in the ring case [2, Lemma 2.13]. Using Lemma 5 we can efficiently invert the natural CRT isomorphism. Given $a = (a_1, \ldots, a_r) \in \bigoplus_{i=1}^{r} (\Lambda/\mathcal{I}_i)$, it can be easily checked that its inverse is $b = \sum_{i=1}^{r} a_i c_i \bmod \mathcal{I}$. The next two lemmas will be required later to construct an efficiently invertible bijection between quotient spaces $\mathcal{I}/\langle q \rangle \cdot \mathcal{I}$ and $\Lambda/\langle q \rangle$.
Lemma 6. Assume that $q$ is unramified in $L$. Let $\mathcal{I}$ be an ideal of the natural order $\Lambda$, which is assumed to be maximal, and let $\mathcal{J} = \langle q \rangle = q \cdot \Lambda$, where $q$ is a prime integer and $q\mathcal{O}_K = \prod_{i=1}^{r} \mathfrak{q}_i$ is its decomposition into prime ideals in $\mathcal{O}_K$. Assume $\gamma \notin \mathfrak{q}_i$ for each $i$. Then, there exists an element $t \in \mathcal{I} \cap \mathcal{O}_K$ such that the ideal $t \cdot \mathcal{I}^{-1} \subset \Lambda$ is coprime to $\mathcal{J}$, and we can compute such a $t$ efficiently given $\mathcal{I}$ and the prime factorization of $\mathcal{J}$.
Remark 1. The condition on $\gamma$ will be immaterial in our use case, since when $\gamma$ is a unit the only $\mathcal{O}_K$ ideal that contains $\gamma$ is $\mathcal{O}_K$ itself.
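Proof. For an ideal $\mathcal{I}$ of $\Lambda$ denote by $I$ its intersection with $K$, which is a non-trivial ideal of $\mathcal{O}_K$ (see [29, Section 3]), and similarly denote by $J$ the intersection of $\mathcal{J}$ with $K$. We apply the corresponding [2, Lemma 2.14] to obtain $t \in I$ such that $t \cdot I^{-1}$ and $J$ are coprime as ideals of $\mathcal{O}_K$ and $t \in I \setminus \bigcup_{i=1}^{r} \mathfrak{q}_i \cdot I$. Assume, for a contradiction, that $t \cdot \mathcal{I}^{-1} + \mathcal{J} \neq \Lambda$, i.e. the ideals are not coprime. Then, there is some maximal ideal $\mathcal{M}$ of $\Lambda$ containing $t \cdot \mathcal{I}^{-1}$ and $\mathcal{J}$. Since $q$ is unramified in $L$ and $\gamma \notin \mathfrak{q}_i$, by [29, Propositions 1 and 4], this ideal must be one of the ideals $\mathfrak{q}_i \cdot \Lambda$ since it contains $\mathcal{J}$. Then $t \cdot \mathcal{I}^{-1} \subset \mathfrak{q}_i \cdot \Lambda$ and consequently $t \in \mathfrak{q}_i \cdot \mathcal{I}$ because $\mathcal{I} \cdot \mathcal{I}^{-1} = \Lambda$ in a maximal order. Since $t$ and $\mathfrak{q}_i$ are central it follows that $t \in \mathfrak{q}_i \cdot I$, a contradiction.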
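The notion of a CRT basis and the inversion of the CRT isomorphism described before Lemma 6 can be illustrated in the simplest commutative setting. The sketch below replaces the order $\Lambda$ by $\mathbb{Z}$ and the coprime ideals by the ideals generated by pairwise coprime integers; this substitution is an assumption made purely for illustration, and the function names are hypothetical. It does not implement the algorithm of Lemma 5 for orders in cyclic algebras.

```python
from math import prod


def crt_basis(moduli):
    """Return c_1, ..., c_r with c_i = 1 mod m_i and c_i = 0 mod m_j for j != i.

    Toy analogue over Z: the moduli are assumed to be pairwise coprime integers.
    """
    total = prod(moduli)
    basis = []
    for m in moduli:
        cofactor = total // m                 # divisible by every other modulus
        basis.append(cofactor * pow(cofactor, -1, m))  # rescale to be 1 mod m
    return basis


def crt_invert(residues, moduli):
    """Invert the CRT map: b = sum_i a_i * c_i mod (m_1 * ... * m_r)."""
    basis = crt_basis(moduli)
    return sum(a * c for a, c in zip(residues, basis)) % prod(moduli)


moduli = [3, 5, 7]
basis = crt_basis(moduli)
b = crt_invert([2, 3, 2], moduli)
```

With these moduli the basis is `[70, 21, 15]` and `b` is 23, which indeed reduces to 2, 3 and 2 modulo 3, 5 and 7 respectively.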
The next lemma will be the one we use in our reduction. As in RLWE, in practice we are interested in the case where $J = \langle q \rangle$ for a prime integer $q$ and $P = \Lambda^\vee$. We will use the familiar notation $I_q := I/q \cdot I$ for an ideal $I$ and $q \in \mathbb{Z}$ throughout the paper.
Lemma 7. Let $\Lambda$, $\gamma$ and $q$ be as in Lemma 6. Let $I$, $J$ be ideals of $\Lambda$, with $t \in I \cap \mathcal{O}_K$ chosen as above such that $t \cdot I^{-1}$ and $J$ are coprime as ideals, and let $P$ denote an arbitrary fractional ideal of $\Lambda$. Then, the function $\chi_t : A \to A$ defined as $\chi_t(x) = t \cdot x$ induces a module isomorphism from $P/J \cdot P$ to $I \cdot P/I \cdot J \cdot P$. Furthermore, in the case $J = \langle q \rangle$ for a prime integer $q$ we can efficiently compute the inverse.
Proof. The proof is similar to that of [2]. Since t lies in the center of Λ it is clear that multiplication by t induces a module homomorphism. Given the map χ t : P → I ·P/I ·J ·P and j ∈ J ·P, χ t (j) = t·j ∈ I ·J ·P, so it is clear that J ·P is in the kernel of this map. Conversely, if χ t (x) = 0 then t · x ∈ I · J · P, from which it follows that I −1 · t · x ⊂ J · P. From the definition of coprime, t · I −1 + J = Λ, from which it follows that there exists a ∈ t · I −1 , b ∈ J such that a + b = 1.
Since a · x, b · x ∈ J · P it follows that x ∈ J · P, from which injectivity follows immediately.
To demonstrate efficient invertibility, we must work slightly harder. Now let $J = \langle q \rangle$. Compute $t$ as in Lemma 6 and observe that the bijection $\chi_t : \Lambda_q \to I_q$ is an additive homomorphism. Thus, it suffices to compute the inverse of all elements of a $\mathbb{Z}$-basis of $I_q$, since then any element can be inverted by computing its representation in this basis and inverting that. We construct such a basis as follows. First, choose $n^2 \cdot d^4$ elements $x_i$, $i = 1, \dots, n^2 \cdot d^4$, from $\Lambda_q$ uniformly at random and compute $y_i = \chi_t(x_i)$ for each $i$. It follows that each $y_i$ is a uniformly random element of $I_q$. Then, with high probability the $y_i$ form a spanning set of $I_q$ (see the following lemma), which we can reduce to a $\mathbb{Z}$-basis $y_1, \dots, y_{n \cdot d^2}$. This basis satisfies the desired property that each element has a known inverse. If this algorithm fails (i.e. there is no suitable basis $y_1, \dots, y_{n \cdot d^2}$), we repeat, choosing a fresh set of elements $x_1, \dots, x_{n^2 \cdot d^4}$ until we succeed.
Lemma 8. Given a set of $n^2 \cdot d^4$ independent and uniformly random elements $\Xi \subset \mathbb{Z}_q^{n \cdot d^2}$, the probability that $\Xi$ contains no set of $n \cdot d^2$ linearly independent vectors (over $\mathbb{Z}$) is exponentially small in $d$.
This lemma is a straightforward adaptation of Corollary 3.16 of [1].
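The inversion strategy in the proof of Lemma 7 can be sketched in a simplified model. Here we assume, purely for illustration, that $\chi_t$ is given as an invertible linear map on $\mathbb{Z}_q^N$ for a prime $q$ (a hypothetical matrix `T` stands in for multiplication by $t$), and we sample $N^2$ random inputs as in the text. The function name `invert_via_random_samples` is not from the paper.

```python
import random

def invert_via_random_samples(chi, N, q, num_samples, rng):
    """Build chi^{-1} for an additive bijection chi on Z_q^N (q prime).

    Returns a function computing chi^{-1}, or None if the sampled images
    do not span Z_q^N, in which case the caller retries with fresh samples.
    """
    rows = []
    for _ in range(num_samples):
        x = [rng.randrange(q) for _ in range(N)]
        rows.append(chi(x) + x)      # augmented row [y | x] with y = chi(x)
    pivots = []
    for col in range(N):             # Gauss-Jordan elimination on the y-part
        piv = next((r for r in rows if r[col] != 0), None)
        if piv is None:
            return None              # images fail to span: report failure
        rows.remove(piv)
        inv = pow(piv[col], -1, q)
        piv = [(v * inv) % q for v in piv]

        def reduce(r, piv=piv, col=col):
            return [(a - r[col] * b) % q for a, b in zip(r, piv)]

        rows = [reduce(r) for r in rows]
        pivots = [reduce(r) for r in pivots] + [piv]
    preimages = [r[N:] for r in pivots]   # preimages[i] = chi^{-1}(e_i)

    def inverse(y):
        return [sum(y[i] * preimages[i][j] for i in range(N)) % q for j in range(N)]

    return inverse

# Hypothetical stand-in for multiplication by t: an invertible matrix mod 7.
T = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]
q, N = 7, 3

def chi(x):
    return [sum(T[i][j] * x[j] for j in range(N)) % q for i in range(N)]

rng = random.Random(0)
inverse = None
while inverse is None:               # repeat with fresh samples on failure
    inverse = invert_via_random_samples(chi, N, q, N * N, rng)
```

Row operations on the augmented rows preserve the relation $y = \chi_t(x)$ because the map is additive, which is why a basis with known preimages suffices.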

E. Lattice Problems
Computational problems on lattices represent the foundations of the security of (R)LWE, and will do so for our Cyclic LWE as well. The standard lattice problems are as follows.
Definition 11. Let $\|\cdot\|$ be some norm on $\mathbb{R}^n$ and let $\xi \geq 1$. Then the approximate Shortest Vector Problem (SVP$_\xi$) on input a lattice $\mathcal{L}$ is to find some non-zero vector $x \in \mathcal{L}$ such that $\|x\| \leq \xi \cdot \lambda_1(\mathcal{L})$.
Definition 12. Let $\|\cdot\|$ be some norm on $\mathbb{R}^n$ and let $\xi \geq 1$. Then the (approximate) Shortest Independent Vectors Problem (SIVP$_\xi$) on input a lattice $\mathcal{L}$ is to find $n$ linearly independent non-zero vectors $x_1, \dots, x_n \in \mathcal{L}$ such that $\max_i \|x_i\| \leq \xi \cdot \lambda_n(\mathcal{L})$.
Definition 13. Let $\|\cdot\|$ be some norm on $\mathbb{R}^n$, let $\mathcal{L}$ be a lattice, and let $d < \lambda_1(\mathcal{L})/2$. Then the Bounded Distance Decoding problem (BDD$_{\mathcal{L},d}$) on input $y = x + e$ for $x \in \mathcal{L}$ and $\|e\| \leq d$ is to compute $x$, or equivalently $e$.
The above problems are all well investigated, and believed to be sufficiently hard to base postquantum cryptographic security on; there are no known algorithms for any of these problems (for suitable parameters) running in polynomial time in dimension n.
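To make Definitions 11 and 13 concrete, the following sketch brute-forces $\lambda_1$ and a BDD instance for a two-dimensional lattice in the Euclidean norm. The basis, the search bound and the function names are assumptions chosen for illustration; exhaustive search of this kind is only feasible in tiny dimensions and says nothing about cryptographic parameters.

```python
import itertools
import math

def lattice_points(basis, bound):
    """Enumerate lattice points with integer coefficients in [-bound, bound]."""
    dim = len(basis[0])
    for coeffs in itertools.product(range(-bound, bound + 1), repeat=len(basis)):
        yield tuple(sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(dim))

def lambda_1(basis, bound=5):
    """Length of a shortest non-zero vector found within the search bound."""
    return min(math.hypot(*v) for v in lattice_points(basis, bound) if any(v))

def bdd(basis, y, bound=5):
    """Closest enumerated lattice point to y (unique if the error is < lambda_1 / 2)."""
    return min(lattice_points(basis, bound), key=lambda v: math.dist(v, y))

basis = [(5, 1), (3, 2)]
lam = lambda_1(basis)                 # attained by (5, 1) - (3, 2) = (2, -1)
decoded = bdd(basis, (11.4, 4.7))     # (11, 5) perturbed by an error of length 0.5
```

Since the error has length 0.5, which is below $\lambda_1/2$, the decoded point is the unique closest lattice vector.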
Unfortunately, these problems are not directly suitable for CLWE, where we will be interested in their adaptations to lattices generated by order ideals, similarly to how ideal lattices are used in the ring case. Specifically, we consider the same problems on the lattices that these ideals induce under the map $\sigma_A(\cdot)$. So, SVP becomes:
Definition 14. Let $A$ be a cyclic algebra and let $I$ be some (possibly fractional) ideal of the natural order $\Lambda$. Then, for an approximation factor $\xi \geq 1$, the $A$-SVP$_\xi$ problem is to find a non-zero element $a \in I$ such that $|a| := \|\sigma_A(a)\|_2 \leq \xi \cdot \lambda_1(I)$, where as usual $\lambda_1(I)$ denotes the minimal length of non-zero elements of $I$ in the given norm.
Remark 2. When we use these problems in our security reductions, we will assume that the ideals are in fact integral ideals (i.e. we exclude properly fractional ideals). Observe that this may be done without loss of generality, since solving the $A$-SVP problem on the fractional ideal $I$ may be done by solving it on the integral ideal $cI$ (where $c \in K$ is an element such that $cI$ is integral) and rescaling the solution.
Essentially we have a specialized version of the SVP problem; we must find an element of I with minimal norm (up to approximation factor) in the ideal I. The extension of SIVP to A-SIVP is analogous, but since we consider our objects as Z-lattices we require the independent 'vectors' a 1 , ..., a r to be linearly independent over Z. For BDD, we need a suitable ambient space, and use the following definition.
Definition 15. Let $A$ be a cyclic algebra, let $I$ be some (possibly fractional) ideal of a maximal $\mathbb{Z}$-order $\Lambda$, and let $\delta < \lambda_1(I)/2$. Then the $A$-BDD$_{I,\delta}$ problem, on input $y = x + e$ for $x \in I$ and $e \in \bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$ satisfying $|e| \leq \delta$, is to compute $x$.

F. The Learning With Errors Problem
We will briefly recall the initial Learning With Errors (LWE) problem here; in Section III we will extend it to cyclic algebras. The problem comes in two forms; search and decision, both of which are based on the LWE distribution. Let n and q be positive integers, and let α > 0 be some error parameter. Define T := R/Z, the unit torus.
Definition 16. For a secret $s \in \mathbb{Z}_q^n$, a sample $(a, b) \leftarrow A_{s,\alpha}$ is taken by sampling a uniformly random vector $a \in \mathbb{Z}_q^n$ and $e \leftarrow D_\alpha$, and outputting $(a, b) = (a, \langle a, s \rangle/q + e \bmod \mathbb{Z})$.
Given the above distribution, the LWE problem comes in two forms.
Definition 17. The search LWE problem is to recover $s$ from a collection of samples from $A_{s,\alpha}$. The decision LWE problem, on input a collection of samples in $\mathbb{Z}_q^n \times \mathbb{T}$, is to decide whether they are uniform samples or were taken from $A_{s,\alpha}$ for some secret $s$, provided the samples were taken from one of these distributions.
Typically, the number of samples provided in each of these problems depends on the application. Since the decision problem has a probabilistic element, we will be interested in the advantage of the algorithms that solve it, which is defined as the difference between their acceptance probabilities on samples from an LWE distribution $A_{s,\alpha}$ and from the uniform distribution. In practice, the decision problem is of more interest in cryptography.
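A minimal sketch of sampling from the LWE distribution of Definition 16 follows. It assumes that $D_\alpha$ is the continuous Gaussian of standard deviation $\alpha/\sqrt{2\pi}$ (the normalisation used in [1]; the paper's convention may differ by a constant), and the parameters $n = 8$, $q = 97$, $\alpha = 0.01$ are toy values with no security meaning.

```python
import math
import random

def lwe_sample(s, q, alpha, rng):
    """Return (a, b) with b = <a, s>/q + e mod 1 on the torus R/Z."""
    a = [rng.randrange(q) for _ in s]
    # Assumption: D_alpha has standard deviation alpha / sqrt(2*pi).
    e = rng.gauss(0.0, alpha / math.sqrt(2 * math.pi))
    b = (sum(ai * si for ai, si in zip(a, s)) / q + e) % 1.0
    return a, b

def noise_of(sample, s, q):
    """Distance of b - <a, s>/q from the nearest integer (needs the secret)."""
    a, b = sample
    r = (b - sum(ai * si for ai, si in zip(a, s)) / q) % 1.0
    return min(r, 1.0 - r)

rng = random.Random(1)
n, q, alpha = 8, 97, 0.01
secret = [rng.randrange(q) for _ in range(n)]
samples = [lwe_sample(secret, q, alpha, rng) for _ in range(100)]
largest_noise = max(noise_of(smp, secret, q) for smp in samples)
```

With the secret in hand the noise is visibly small, whereas for uniform pairs the same quantity would be spread over $[0, 1/2]$; a decision algorithm must detect this difference without knowing $s$.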
We will not define the popular extensions of these problems to number fields or modules, known as Ring-LWE and Module-LWE, but the unfamiliar reader may find details in [2] and [3] respectively, both of which we reference frequently in this work.

III. THE CLWE PROBLEM
In this section we present the general definition of CLWE together with justifications for choices made in the definition, as well as constructions of specific algebras to use. We will save the security properties for Section IV-A.
Definition 18. Let $A = (L/K, \theta, \gamma)$ be a cyclic algebra with order $\Lambda$. For an error distribution $\psi$ over $\bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$, an integer modulus $q \geq 2$, and a secret $s \in \Lambda^\vee_q$, a sample from the CLWE distribution $\Pi_{q,s,\psi}$ is obtained by sampling $a \leftarrow \Lambda_q$ uniformly at random, $e \leftarrow \psi$, and outputting $(a, b) = (a, (a \cdot s)/q + e \bmod \Lambda^\vee)$.
Remark 3. Unlike in commutative spaces, the order of multiplication of $a$ and $s$ is important; our choice is $(a \cdot s)$, but similar security properties would hold if one took $(s \cdot a)$ instead. Also observe that our modulo reduction in the second coordinate of the pair is well defined, since $(a \cdot s) \in \Lambda^\vee_q$.
As usual, the associated CLWE problem will come in search and decision variants.
Definition 19. Let Π q,s,ψ be a CLWE distribution for parameters q ≥ 2, s ∈ Λ ∨ q , and error distribution ψ. Then, the search CLWE problem, which we denote by CLWE q,s,ψ , is to recover s ∈ Λ ∨ q from a collection of independent samples from Π q,s,ψ .
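We do not state the number of samples allowed for this (or the next) problem, as typically it depends on the application.
Definition 20. Let $\Upsilon$ be some distribution on a family of error distributions over $\bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$, and let $U_\Lambda$ denote the uniform distribution on $\Lambda_q \times (\bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}})/\Lambda^\vee$. Then, the decision CLWE problem, written D-CLWE$_{q,\Upsilon}$, is, on input a collection of independent samples from either $\Pi_{q,s,\psi}$ for a random choice of $(s, \psi) \leftarrow U(\Lambda^\vee_q) \times \Upsilon$ or from $U_\Lambda$, to decide which is the case with non-negligible advantage.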
A. Discussions
1) Relation to Module-LWE: First, we explain why we choose the order of multiplication $a \cdot s$. As discussed in the introduction, the transformation from a (primal) RLWE sample to $n$ related LWE samples provides our motivation. Here, one RLWE sample $a \cdot s + e$, where $a, s, e \in R_q \cong \mathbb{Z}_q[x]/(x^n + 1)$, generates $n$ LWE samples by considering the multiplication operation as $As + e$, where $A := \mathrm{rot}(a)$ is a negacyclic matrix. For appropriate choices of error distributions, this is precisely $n$ LWE samples with the exception that there is some structure in the matrix $A$. By ordering the multiplication $a \cdot s$, we get a similar transform from CLWE to MLWE. Assuming for now that we have a discretized form of CLWE, and recalling that left multiplication by $a$ is represented by a matrix $\phi(a)$ (see [29]), we transform a CLWE sample $a \cdot s + e$ into matrix-vector form to get $\phi(a) \cdot s + e$, where $s$ and $e$ are vectors of dimension $d$ over $\mathcal{O}_L/q\mathcal{O}_L$. Setting $A = \phi(a)$, one can see that for appropriate choices of error distribution this is similar to $d$ samples from the MLWE distribution with some additional structure in the matrix $A$, as intended.
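The RLWE-to-LWE step described above can be checked numerically. The sketch below builds the negacyclic matrix $\mathrm{rot}(a)$ for $R_q = \mathbb{Z}_q[x]/(x^n + 1)$ and compares the matrix-vector product with direct polynomial multiplication. The toy parameters $n = 4$, $q = 17$ and the function names are illustrative assumptions; the analogous matrix $\phi(a)$ for cyclic algebras is not implemented here.

```python
def rot(a, q):
    """Negacyclic matrix of a: column j holds the coefficients of a * x^j mod (x^n + 1)."""
    n = len(a)
    A = [[0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            k = i + j
            if k < n:
                A[k][j] = a[i] % q
            else:
                A[k - n][j] = (-a[i]) % q   # x^n = -1 introduces the sign
    return A

def negacyclic_mul(a, s, q):
    """Schoolbook product a * s in Z_q[x]/(x^n + 1)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * s[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * s[j]) % q
    return c

def mat_vec(A, s, q):
    return [sum(x * y for x, y in zip(row, s)) % q for row in A]

q = 17
a, s = [1, 2, 3, 4], [5, 6, 7, 8]
as_matrix = mat_vec(rot(a, q), s, q)
as_poly = negacyclic_mul(a, s, q)
```

Each row of `rot(a, q)` paired with the matching coordinate of $a \cdot s + e$ plays the role of one LWE sample, with the rows related to one another by the negacyclic structure.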
2) The Natural Order vs. Maximal Order: We consider Λ the natural order or a maximal order. The natural order is simple to construct and represent, whereas finding a maximal order is computationally slow. Additionally, the natural order is somewhat orthogonal, in the sense that it has the same span in each u i coordinate independently of the other coordinates. This is advantageous when considering the relation to MLWE, where the module is always taken to be the full module O d K . As mentioned above, two-sided ideals in a maximal order form a free abelian group, which is not necessarily the case in the natural order. Further, as lattices, a maximal order gives denser sphere packing than the natural order, since the latter is a sublattice. Fortunately, we will construct in Section III-C1 cyclic algebras whose natural order is also maximal, thus enjoying both the simplicity of the natural order and the convenience of a maximal order.
Example 2. The quaternion algebra over $\mathbb{Q}$ is defined by $\mathbb{H} = \{x + yj : x, y \in \mathbb{Q}(i)\}$, with the usual relations $i^2 = j^2 = -1$ and $ij = -ji$. It can be seen as a cyclic division algebra $(\mathbb{Q}(i)/\mathbb{Q}, \overline{(\cdot)}, -1)$, where $\overline{(\cdot)}$ denotes complex conjugation and $-1$ is a non-norm element. A quaternion has matrix representation

The maximal Hurwitz order is given by
It is easy to check that, as Z-lattices of dimension 4, the Lipschitz order is a sublattice of the Hurwitz order, of index 2.
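The claims of Example 2 can be sanity-checked with a short computation. The sketch below assumes standard conventions that may differ from the example's own displays by a transpose or sign: a quaternion $x + yj$ is stored as the pair $(x, y)$, it is represented by the matrix with rows $(x, y)$ and $(-\bar{y}, \bar{x})$, the Lipschitz order has $\mathbb{Z}$-basis $\{1, i, j, k\}$, and the Hurwitz order has $\mathbb{Z}$-basis $\{1, i, j, (1 + i + j + k)/2\}$.

```python
from fractions import Fraction

def qmul(p, r):
    """Multiply quaternions stored as pairs (x, y) meaning x + y*j, using j*z = conj(z)*j."""
    (x, y), (x2, y2) = p, r
    return (x * x2 - y * y2.conjugate(), x * y2 + y * x2.conjugate())

def matrix(p):
    """Assumed 2x2 complex representation of x + y*j."""
    x, y = p
    return [[x, y], [-y.conjugate(), x.conjugate()]]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

i_unit, j_unit = (1j, 0j), (0j, 1 + 0j)
ij, ji = qmul(i_unit, j_unit), qmul(j_unit, i_unit)   # k and -k: multiplication order matters

p, r = (complex(1, 2), complex(3, -1)), (complex(0, 1), complex(2, 2))
multiplicative = matmul(matrix(p), matrix(r)) == matrix(qmul(p, r))

# Hurwitz basis in coordinates (1, i, j, k); the matrix is lower triangular,
# so its determinant (the covolume relative to Lipschitz) is the diagonal product.
hurwitz_basis = [
    [Fraction(1), 0, 0, 0],
    [0, Fraction(1), 0, 0],
    [0, 0, Fraction(1), 0],
    [Fraction(1, 2)] * 4,
]
covolume = Fraction(1)
for row in range(4):
    covolume *= hurwitz_basis[row][row]
index = Fraction(1) / covolume        # index of the Lipschitz order in the Hurwitz order
```

The check that $ij \neq ji$ also illustrates why the order of multiplication in a CLWE sample has to be fixed.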

3) A Pair of Number Fields:
In MLWE, we are free to choose the dimension of our module over the underlying number field K. However, in the cyclic algebra case we are restricted to cases where we can find L, K, and γ such that A = (L/K, θ, γ) is well defined. From a theoretical standpoint it is not immediately clear whether we want to consider asymptotic security in terms of n or d, but following our motivation from MLWE we suggest that n is likely the suitable choice since the module dimension d is typically small in applications using MLWE, whereas the dimension of the underlying field K is large. However, there seems to be no a priori reason why with the right techniques one could not consider both n and d asymptotically; the only case a cyclic algebra precludes is high dimensional MLWE over a low dimension number field L, because the parameter d occurs in both the module and field dimension.

B. Evading BCV Style Attacks
In our CLWE construction we have enforced that $\gamma$ is selected so that $A$ is a division algebra. We do this to avoid attacks in the style of [22] on the m-RLWE protocol. For $m = 2$, the m-RLWE protocol of [20] can be considered as a structured variant of MLWE, where the matrix $A$ in the operation $As + e$ is a negacyclic matrix over some ring $R_q$. More explicitly, 2-RLWE considers the tensor product of two fields $K = K_1 \otimes K_2$ and runs the LWE assumption in the ring of integers $R_q$. The example use case given in [20] considers power-of-two cyclotomics $K_1$, $K_2$ defined by the polynomials $x^{k_1} + 1$ and $y^{k_2} + 1$ respectively, claiming that the resulting problem in $R_q = \mathbb{Z}_q[x, y]/(x^{k_1} + 1, y^{k_2} + 1)$ effectively corresponds to an RLWE problem of dimension $k_1 \cdot k_2$ due to an obvious homomorphism between $K$ and the two-power cyclotomic field $L$ of degree $k_1 \cdot k_2$. The problem also represents a structured MLWE instance over $\mathbb{Z}_q[x]/(x^{k_1} + 1)$ of dimension $k_2$. However, the observation of [22] is that there is a smaller field $K'$ containing $K_1$ such that there is a homomorphism from $K$ into $K'$ with a well defined image for $y$. This is because the roots of distinct two-power cyclotomic polynomials are algebraically related. For example, in the case $k_1 = 8$, $k_2 = 4$, it is clear that the map taking $y$ to $x^2$ and fixing $K_1$ is a well defined homomorphism from $K$ to $K_1$. Using this homomorphism, [22] simplifies the problem of solving one 2-RLWE instance by considering it as four RLWE instances in dimension $k_1$ rather than one instance in dimension $k_1 \cdot k_2$, essentially removing the module dimension $k_2$ from the problem.
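The well-definedness of the map $y \mapsto x^2$ in the case $k_1 = 8$, $k_2 = 4$ amounts to checking that the defining relation $y^{k_2} + 1$ is sent to zero in $\mathbb{Z}_q[x]/(x^{k_1} + 1)$. The sketch below verifies this for the toy modulus $q = 17$; the modulus and the function names are assumptions for illustration, and the attack of [22] itself is not implemented.

```python
def negacyclic_mul(a, b, q):
    """Product of a and b in Z_q[x]/(x^n + 1), with n = len(a) = len(b)."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                c[k] = (c[k] + ai * bj) % q
            else:
                c[k - n] = (c[k - n] - ai * bj) % q
    return c

def image_of_defining_relation(k1, k2, exponent, q):
    """Image of y^k2 + 1 under y -> x^exponent, reduced mod (x^k1 + 1, q)."""
    y_image = [0] * k1
    y_image[exponent] = 1
    acc = [1] + [0] * (k1 - 1)
    for _ in range(k2):
        acc = negacyclic_mul(acc, y_image, q)
    acc[0] = (acc[0] + 1) % q
    return acc

consistent = image_of_defining_relation(8, 4, 2, 17)     # (x^2)^4 + 1 = x^8 + 1 = 0
inconsistent = image_of_defining_relation(8, 2, 2, 17)   # (x^2)^2 + 1 = x^4 + 1 != 0
```

The second call is included only as a contrast: the same substitution is not a homomorphism when the relation on $y$ does not map to zero.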
We argue that the non-norm condition of $\gamma$ precludes the existence of a homomorphism removing the module structure by taking a well defined cyclic algebra $A = (L/K, \theta, \gamma)$ to a smaller subfield containing $K$. We restrict our search to maximal subfields of $A$, since any subfield is contained in at least one maximal subfield. It is a well known result on division algebras that any maximal subfield $E$ of $A$ contains $K$ and satisfies $[E : K] = d$, and that in the case of a cyclic division algebra $A$ there is a choice of $u \in A$ such that the cyclic algebra $A' := \bigoplus_j u^j E$ is isomorphic to $A$ (see Section 15.1, Proposition a of [36]). Assume, for a contradiction, that we had such a homomorphism $\chi : A \to L$, where without loss of generality we assume the maximal subfield is $L$ by the aforementioned proposition. Since $L$ is Galois, the restriction of $\chi$ to $L$ is an automorphism of $L$. It is clear that $\chi$ must agree on conjugates, since for any $\ell \in L$ the elements $\ell$ and $\theta(\ell)$ are conjugate by $u$ in $A$ while the image of $\chi$ is commutative, so that $\chi(\theta(\ell)) = \chi(\ell)$. However, this contradicts $\chi$ being injective on $L$ and it follows that no such homomorphism exists. Hence we conclude that the attack style of [22] does not threaten our algebraic structure.
On the other hand, Appendix A shows that if γ violates the non-norm condition, then those instances of the CLWE problem are potentially vulnerable. To sum up, the non-norm condition is crucial to the hardness of the CLWE problem.

C. Concrete Algebras for CLWE
In order to apply the CLWE assumption in a practical cryptosystem one must choose a concrete algebra as an ambient space. More generally, we are interested in finding families of algebras suitable for CLWE that allow for asymptotic analysis and varied security levels. Our search for algebras is motivated by the restrictions and conditions discussed in the previous section. In particular, we are interested in cyclic division algebras satisfying the following properties:
• The non-norm element $\gamma$ must lie in $\mathcal{O}_K$ to keep the natural order closed under multiplication, and should satisfy $|\gamma| = 1$ in order to maintain both the coordinatewise independence and sub-multiplicative properties of the norm.
• The dimension n := [K : Q] of the division algebra should be large and the degree d := [L : K] should be small. This is to maintain the analogy with structured MLWE (the degree corresponds to the module rank) and follows from the search-decision reduction, which takes time polynomial in n but not in d.
• The base field K should be cyclotomic and q should split completely in K. This is also a result of the methodology of the search-decision reduction, which uses the well understood factorization of q in O K . In addition, since the bulk of lattice based cryptography is done over cyclotomic fields, we consider algebras which are small extensions of these as somewhat natural. We observe that an improved proof of decision security may allow this point to be dropped, whereas the other two points feel more integral.
Although significant effort has been expended by coding theorists to construct cyclic division algebras satisfying a variety of conditions, such as in [7] or [25], we find ourselves with a fairly unique set of restrictions. In particular, for reasons relating to desired applications, the majority of algebras used in coding theory are either of small total dimension or have small [K : Q] and scale asymptotically in [L : K]. Since we are interested in scaling up K asymptotically, we will have to build novel algebras satisfying the above requirements ourselves. We will, however, make heavy use of the following theorem as an intermediate step.
In what follows, $\zeta_m$ denotes a primitive $m$-th root of unity, where $\varphi(m) = n$ is the degree of the base field $K = \mathbb{Q}(\zeta_m)$.
Theorem 1 ([25]). Let $m = p^a$ be a prime power and let $K = \mathbb{Q}(\zeta_m)$. Then, there exist infinitely many cyclic Galois extensions $L/K$ of degree $m$ such that $\zeta_m^i$ is not a norm of $L/K$ for $0 < i < m$.
We remark that the theorem is effective in the sense that it provides an explicit description of $L$, and we provide a summary of the recipe for constructing $L$. The crucial aspect of its construction is that $L$ is a subfield of some cyclotomic extension of $K$, namely $K(\zeta_q)$ for an auxiliary prime $q$, but we present its full description for completeness.
First, find some prime $q$ such that $q \equiv 1 \bmod p^a$ but $q \not\equiv 1 \bmod p^{a+1}$, so that $p^a$ is the highest power of $p$ dividing $q - 1$. Set $M = K(\zeta_q)$ so that by coprimality $M = \mathbb{Q}(\zeta_{mq})$. Then $\mathrm{Gal}(M/K)$ is a cyclic group of order $q - 1$ generated by some automorphism $\sigma$. Denote by $L$ the subfield of $M$ fixed by $\sigma^m$. Then $[L : K] = m$ by the fundamental theorem of Galois theory and the extension is both cyclic and Galois. Finally, localization theory is used to show that the powers of $\zeta_m$ are not norms in this extension. In this way, the theorem constructs $L$ explicitly.
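The first step of this recipe, finding the auxiliary prime, is easy to carry out by direct search. The sketch below returns the smallest prime congruent to 1 modulo $p^a$ but not modulo $p^{a+1}$; the function names are illustrative assumptions, and trial division is used only because the examples are tiny. The later steps (fixing the subfield and verifying the non-norm property) are not implemented.

```python
def is_prime(n):
    """Trial division; adequate only for small illustrative inputs."""
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True

def auxiliary_prime(p, a):
    """Smallest prime q with q = 1 mod p^a and q != 1 mod p^(a+1)."""
    m = p ** a
    candidate = m + 1
    while True:
        if (candidate - 1) % (p * m) != 0 and is_prime(candidate):
            return candidate
        candidate += m                 # stay in the residue class 1 mod p^a

q_for_m8 = auxiliary_prime(2, 3)       # m = 8
q_for_m9 = auxiliary_prime(3, 2)       # m = 9
```

For $m = 8$ the prime 17 is skipped because $16 \mid 17 - 1$, which would violate the requirement that $p^a$ be the exact power of $p$ dividing $q - 1$.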
The feature of this theorem that interests us is that it allows us to scale $K$ asymptotically, but this comes with the drawback that $L$ has very high degree, i.e., it only permits a degree-$m$ extension $L$ of a degree-$\varphi(m)$ base field $K$. We present a new method that uses this theorem as a starting point to construct good algebras satisfying our restrictions. More precisely, our construction will begin with Theorem 1 and then use elementary methods from Galois theory to build more favourable fields.
1) Constructions Using Subfields: We squash the field L from Theorem 1 to a subfield M of small index over the base K satisfying the necessary properties to generate a cyclic algebra.  Both groups in the quotient are cyclic, and so Gal(M/K) is cyclic with some generator θ. Furthermore, this isomorphism also allows us to deduce |Gal(M/K)| = d. We where the first equality follows from x ∈ M and the second since the norm is multiplicative. L does not contain any power of ζ m except ζ m m = 1 since ζ m is a non-norm element in L/K, so it follows that m|(m/d)i and so d|i. From this we conclude that ζ m , ζ 2 m , . . . , ζ d−1 m do not lie in M and so ζ m satisfies the non-norm condition.
Remark 5. We presented the proof in the above form for ease of legibility, but it is straightforward to extend the argument in the final paragraph to show that $\zeta_m^{jd+1}$ satisfies the non-norm condition for any $j = 0, 1, \dots$
This is an effective construction that allows us to build cyclic division algebras of the form $A = (M/K, \theta, \gamma)$ where $|\gamma| = 1$, $K$ is an arbitrary prime power cyclotomic, and $M$ is an extension of $K$ with degree divisible by the prime $p$. Given the absolute value of $\gamma$, we view these algebras as essentially the best possible, at least for the case where $K$ is a prime-power cyclotomic.
As discussed in Section III-A, the natural order is not necessarily a maximal order. Nevertheless, the following theorem shows that the specific family of algebras we have constructed in Section III-C1 represents a lucky case (its proof is given in Appendix C). This makes our constructed family of algebras very attractive, as it enjoys both the simplicity of the natural order and the nice property of a maximal order.
Remark 6. In the context of multiblock space-time coding [25], the construction of Theorem 1 allows for a space-time code for m antennas and ϕ(m) blocks, i.e., a relatively small number of blocks. With our new construction (Theorem 2), any number ϕ(mk), k ∈ N, of blocks becomes possible. Further, using a maximal order leads to optimum coding gains; it was not realized in [25] that the natural order from Theorem 1 is actually maximal.

D. Sample Parameters
Now that we have discussed our techniques for constructing suitable number fields we proceed to demonstrate that these methods are able to attain cryptographically relevant dimensions. In this section, we present a small selection of proof-of-concept dimensions in Table I, where we take our motivation for choices of dimension from KYBER and NewHope, since they are the successful second round NIST candidates whose methods are most similar to our own. Thus we aim for dimensions in the region between 512 and 1024, the dimensions proposed for both NewHope and KYBER (which also achieves dimension 768). Of course, these schemes are restricted to having power-of-two ring dimension $n$ and so their choices of dimension may not be optimal in general, but FrodoKEM [37], a plain LWE scheme, suggests dimensions in around the same range, specifically 640, 976, and 1344, so we consider dimensions in this region a sensible starting point. Corresponding to KYBER and other MLWE based schemes we will set a small 'module' rank $d := [A : L]$. We are constrained in our choice of fields by the fact that $d$ appears as a square in the total dimension $N = nd^2$, but for the most part we are able to work around this problem.
[Table I: Parameters of Cyclic Algebras. The subfield method is given in Section III-C, while the compositum method is given in Appendix D. Column headings include Method and Center.]
1) Two-Power Cyclotomic K: We begin with straightforward cases where we can apply Theorem 2 immediately to obtain fields in suitable dimensions. Let $K$ be a two-power cyclotomic field, $K = \mathbb{Q}(\zeta_{2^k})$, with dimension $n := 2^{k-1}$. Since the rank $d = [L : K] = [A : L]$ is a small power of two, the dimension $n$ of $K$ will be dictated by the choice of module rank $d$. We construct rank 2 and 4 examples as follows. To obtain algebras in dimension 512 simply pick $K$ with dimension $n/2$, e.g. $\mathbb{Q}(\zeta_{256})$ and $\mathbb{Q}(\zeta_{64})$ respectively. In all cases, Theorem 2 lets us pick the non-norm element $\gamma$ as a root of unity.
2) Three-Power Cyclotomic K: Since $3 \nmid 1024$, one cannot achieve algebras in dimension 1024 with a 3-power cyclotomic center and instead we set about searching for algebras of nearby dimensions. Although we are unable to build fields in this case with dimension around 1024, we can get close to the more lightweight cryptographic dimension of 512 used in schemes targeting a lower security level. Recall that if $K = \mathbb{Q}(\zeta_{3^k})$ then $K$ has dimension $n := \varphi(3^k) = 2 \cdot 3^{k-1}$. Again, the module rank is a power of 3 and the choice of module rank will define the choice of $n$.
• For d = 9 we have [A : K] = 81. To achieve the same total dimensions we take small base fields K = Q(ζ 9 ) and Q(ζ 27 ) respectively.
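The dimension bookkeeping in these two subsections follows from $N = nd^2$ with $n = \varphi(m)$ for a center $K = \mathbb{Q}(\zeta_m)$. The sketch below tabulates a few cases. The centers $\mathbb{Q}(\zeta_{256})$, $\mathbb{Q}(\zeta_{64})$, $\mathbb{Q}(\zeta_9)$ and $\mathbb{Q}(\zeta_{27})$ are named in the text; the centers $\mathbb{Q}(\zeta_{512})$ and $\mathbb{Q}(\zeta_{128})$ for dimension 1024 are an inference from the remark about halving $n$, and the function names are illustrative.

```python
def euler_phi(m):
    """Euler's totient via trial factorisation."""
    phi, rest, p = m, m, 2
    while p * p <= rest:
        if rest % p == 0:
            phi -= phi // p
            while rest % p == 0:
                rest //= p
        p += 1
    if rest > 1:
        phi -= phi // rest
    return phi

def total_dimension(m, d):
    """N = n * d^2 for center Q(zeta_m) of degree n = phi(m) and rank d."""
    return euler_phi(m) * d * d

two_power = {
    (512, 2): total_dimension(512, 2),   # inferred center for N = 1024
    (128, 4): total_dimension(128, 4),   # inferred center for N = 1024
    (256, 2): total_dimension(256, 2),
    (64, 4): total_dimension(64, 4),
}
three_power = {
    (9, 9): total_dimension(9, 9),
    (27, 9): total_dimension(27, 9),
}
```

Because $d$ enters squared, raising the rank from 2 to 4 forces the center's conductor down by a factor of four to keep $N$ fixed, which is the constraint mentioned above.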

3) Fields Using Compositum Techniques:
The algebras with prime-power cyclotomic centers of the previous subsections use the field construction technique of Theorem 2, and as such they are restricted to algebras whose dimension $N$ is of the form $p^k(p - 1)$ for a prime $p$ and integer $k$. In Appendix D, we present another method of constructing algebras using compositum fields that allows us to target dimensions not achievable in this setting. The bottom three algebras of dimensions 576, 768 and 1152 in Table I are obtained with this method.

E. Extensions Where q Splits Completely
All suggested algebras in the previous section satisfy the conditions required for our chosen norm $\|\sigma_A(x)\|_2$ to be well-defined. In particular, they have root of unity non-norm $\gamma$ and $K$ is cyclotomic. Because any prime $q \equiv 1 \bmod m$ splits completely in $\mathbb{Q}(\zeta_m)$, it is straightforward to find $q$ which splits completely in $\mathcal{O}_K$.
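The splitting criterion can be observed directly: a prime $q$ not dividing $m$ splits completely in $\mathbb{Q}(\zeta_m)$ exactly when the $m$-th cyclotomic polynomial has $\varphi(m)$ roots modulo $q$, i.e. when $\mathbb{Z}_q$ contains $\varphi(m)$ primitive $m$-th roots of unity. The sketch below counts these roots by exhaustive search for $m = 16$; the function names and the small primes are illustrative assumptions.

```python
def prime_factors(m):
    """Set of prime divisors of m, by trial division."""
    factors, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors

def primitive_roots_of_unity(m, q):
    """Elements of Z_q of multiplicative order exactly m."""
    return [x for x in range(1, q)
            if pow(x, m, q) == 1
            and all(pow(x, m // p, q) != 1 for p in prime_factors(m))]

roots_mod_17 = primitive_roots_of_unity(16, 17)   # 17 = 1 mod 16: full splitting
roots_mod_97 = primitive_roots_of_unity(16, 97)   # 97 = 1 mod 16: full splitting
roots_mod_7 = primitive_roots_of_unity(16, 7)     # 7 != 1 mod 16: no linear factors
```

For $m = 16$ the cyclotomic polynomial is $x^8 + 1$, so each element of `roots_mod_17` is a root of $x^8 + 1$ modulo 17.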
Later in this paper, in order to enable efficient multiplication algorithms, it will turn out that it is convenient to have a modulus q that splits completely into a product of prime ideals in both O K and O L . Recall Lemmas 6 and 7 also require q be unramified in L. An appeal to Chebotarev's Density Theorem suggests that a proportion of 1/d of the primes q that split completely in K also do so in L. In cases where d is small this suggests that finding such primes should not prove too arduous; but since cryptosystems require specific parameters rather than density arguments, we provide constructions satisfying the requisite conditions on q in Appendix E.

IV. SECURITY PROOF
The 'standard' security reductions used in [1] and [2] firstly reduce certain lattice problems to search LWE and RLWE, then establish hardness of the decision problem via a search-decision reduction. This proof follows a sequence of shorter reductions as shown in Fig. 3.
The reduction from the approximate SVP to the search LWE problem implies that search LWE is at least as hard as approximate SVP. It can be explained as follows: first, the approximate SVP is reduced to the problem of sampling a discrete Gaussian of narrow variance over a lattice, where intuitively sampling from a sufficiently narrow Gaussian should output a vector whose norm is reasonably short compared to the first minimum. Then, a quantum algorithm reduces the problem of sampling from a narrow Gaussian to that of solving the BDD problem on the lattice. Finally, a transformation maps an instance of the BDD problem to an appropriate instance of the LWE problem, reducing the BDD problem to that of search LWE.
For applications in cryptography, the hardness of the decision problem is preferred to that of the search problem. Assuming that the decision problem is hard implies that LWE samples are computationally indistinguishable from uniform, so intuitively an LWE sample can be used to hide a message m as an element of Z n q by adding it to b. Using similar machinery, we reduce a BDD problem to search CLWE using the same method as in [2]. The methodology of their search-decision reduction is an adaptation of that of Regev's, which relies on guessing each coordinate of the secret s separately. The adaptation to the ring case instead guesses the coordinate of the secret ring element s modulo a suitable collection of ideals p i such that guessing s mod p i O ∨ K requires only a polynomial number of guesses, from which s is recovered using the CRT. We apply a similar method in suitable subrings to deduce the hardness of our decision problem. The main technical novelty is to deal with non-commutativity in the proof.
For the remainder of this paper, we will always be working in an extension of number fields $L/K$ with $[L : K] = d$ and $[K : \mathbb{Q}] = n$. Recall from the motivation of structured MLWE and the sample algebras given that in practice we seek asymptotic security in $n$, since the parameter $d$ corresponds to the typically small module dimension.

A. Hardness of Search CLWE
Definition 21. We define the family of error distributions $\Sigma_\alpha$ as the set of all Gaussian distributions $D_\Sigma$ over $\bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$ whose covariance matrix arises as that of the error distribution in Lemma 11.
This is the family of error distributions we will claim hardness of search CLWE for; although specifying this family of matrices precisely is not simple, we demonstrate how the error is obtained in the BDD transformation step. For now, we remark that it is a Gaussian distribution whose marginals are Gaussian with variance at most $\alpha$.
In the following theorem we denote by A−DGS ξ the problem of sampling a discrete Gaussian D I,ξ , where I is some ideal of the order Λ.
Theorem 4. Let A be a cyclic division algebra over a number field L with center K and natural, maximal order Λ with |γ| = 1. Let α = α(n) ∈ (0, 1) and q = q(n) ≥ 2, unramified in L, be parameters such that α · q ≥ ω(1). Then, there is a polynomial-time quantum reduction from A-DGS ξ to search CLWE q,Σα for any From this we deduce the following corollary, similarly to [3], since the lattice structure of our algebra is merely a special case of their modules. We denote by N the total dimension of A, N := nd 2 . Corollary 1. Let A, Λ, α and q be as above. Then, there is a polynomial-time quantum reduction from A-SIVP ξ to search CLWE q,Σα for any The following theorem is our analogy of Lemma 4.10 of [3].
Theorem 5. Given an oracle that solves CLWE$_{q,\Sigma_\alpha}$ for input $\alpha \in (0, 1)$, an integer $q \geq 2$, an ideal $I$, and polynomially many samples from the discrete Gaussian $D_{I,r}$, there exists an efficient quantum algorithm that outputs an independent sample from $D_{I,r'}$ for a narrower parameter $r' < r$.
We can then prove Theorem 4 in the standard iterative manner; for a very large value of $r$, e.g. $r \geq 2^{2N} \lambda_N(I)$, start by sampling classically from $D_{I,r}$. Then apply the above algorithm to obtain a polynomial number of samples from $D_{I,r'}$. Repeating this step gives samples from progressively narrower distributions, until we arrive at the desired Gaussian parameter $s \geq \xi$.
In order to classically sample the initial collection of Gaussian samples, we use the standard Lemma 3.2 of [1] to sample D I,r on the module representation d−1 i=0 u i L R . As usual, we obtain Theorem 5 in two steps, first the main reduction of Lemma 11, then the following quantum step adapted from [1]. We use a form of A−BDD L,δ from [3] where we bound the offset in the norm where σ denotes the canonical embedding of L.

Lemma 9. There is an efficient quantum algorithm that, given any $N = n \cdot d^2$ dimensional lattice $\mathcal{L} := \sigma_A(I)$ for some ideal $I$, a real $\delta < \lambda_1(\mathcal{L}^*)/(2\sqrt{2nd})$, and an oracle that solves $A$-BDD$_{\mathcal{L}^*,\delta}$ with all but negligible probability, outputs an independent sample from a discrete Gaussian distribution over $\mathcal{L}$.
For the reduction of BDD to search CLWE, we begin with the cyclic algebra analogue of the BDD-to-LWE samples transformation from Section 4 of [2]. As is standard for LWE security, we use the following 'modulo q' definition of BDD:
Definition 22. For any $q \geq 2$ the $qA$-BDD$_{I,\delta}$ problem is as follows: given an instance of the $A$-BDD$_{I,\delta}$ problem $y = x + e$ with solution $x \in I$ and error $e \in \bigoplus_{i=0}^{d-1} u^i L_{\mathbb{R}}$ satisfying $\|e\|_{2,\infty} \leq \delta$, output $x \bmod qI$.
We use (a special case of) Lemma 3.5 from [1], which lifts immediately since it is lattice preserving.
Lemma 10. For any $q \geq 2$ there is a deterministic polynomial time reduction from $A$-BDD$_{I,\delta}$ to $qA$-BDD$_{I,\delta}$.
We now present an algorithm which transforms qA-BDD samples to CLWE samples given some additional Gaussian samples. The algorithm is the same in spirit as Lemma 4.7 of [2], but has some technical differences induced by the structure of cyclic algebras.
Lemma 11. Let $A$ be as in Theorem 4. There is a probabilistic polynomial time algorithm that on input a prime integer $q \geq 2$, a fractional ideal $I^\vee \subset \Lambda$, a $qA$-BDD$_{\mathcal{L},\delta}$ instance $y = x + e$ with $\delta = \alpha q \cdot \omega(\sqrt{\log(nd)})/(\sqrt{2nd} \cdot r)$ and $x \in I^\vee$, a parameter $r \geq \sqrt{2} q \cdot \eta(I)$, and samples from the discrete Gaussian $D_{I,r'}$ with $r' \geq r$, outputs samples that are within negligible statistical distance of the CLWE distribution $\Pi_{q,s,\Sigma}$ for a secret $s = \chi_t(x \bmod qI^\vee) \in \Lambda^\vee_q$, where $\chi_t$ is as in Lemma 7 and $\Sigma$ is an error distribution such that, in the case where $|\gamma| = 1$, the resulting error has marginal distribution in its $(i, j)$-th coordinate that is Gaussian with parameter $r_{i,j} \leq \alpha$.
Proof. The proof will be in two parts: first, we will describe the algorithm, then we will prove correctness. Recall that in the definition of CLWE, a sample is of the form $(a, b) = (a, (a \cdot s)/q + e \bmod \Lambda^\vee)$, where $e$ is taken from an error distribution $\psi \in \Sigma_\alpha$.
Begin by computing an element t ∈ I such that I −1 · t and q are coprime using Lemma 6. We can now create a sample from the CLWE distribution as follows: take an element z ← D I,r from the Gaussian samples, and compute the pair
$$(a, b) = \left( \chi_t^{-1}(z \bmod qI),\ (z \cdot y)/q + e' \bmod \Lambda^{\vee} \right),$$
where e′ ← D α/ √ 2 . We now claim that these samples are within negligible statistical distance of the CLWE distribution and that s is uniformly random.
First we show that a ∈ Λ q is statistically close to uniform. By assumption, r ≥ q · η(I) and so by appealing to Lemma 1 it can be seen that any value z mod qI is obtained with probability in the interval [(1−ε)/(1+ε), 1] · β for some positive β, from which it follows immediately that the statistical distance between z mod qI and the uniform distribution is bounded above by 2ε. Since χ t of Lemma 7 and its inverse are both bijections, we conclude that a = χ −1 t (z mod qI) is within statistical distance 2ε of the uniform distribution over Λ q .
Now we must show that b is of the form (a · s)/q + e″, for some suitable error e″ and a uniformly random s, where we condition on some fixed value of a. By construction, b := (z · y)/q + e′ mod Λ ∨ = (z · x)/q + (z · e)/q + e′ mod Λ ∨ , so since z = t · a mod Λ ∨ q and t lies in the center of A it follows that (z · x)/q = (a · t · x)/q = (a · s)/q mod Λ ∨ for s := χ t (x mod qI ∨ ). It follows that s is uniformly random over Λ ∨ q as long as x is uniform over I ∨ , since χ t is a bijection.
Finally it is left to show that, conditioned on a fixed value of a, the marginal distribution of the (i, j)th coordinate of the error term e″ = (z · e)/q + e′ is negligibly close to that specified by Σ. Writing z = Σ_j u^j z_j and e = Σ_k u^k e_k and using the relation xu = uθ(x) for x ∈ L, we can explicitly calculate the error as
$$e'' = \frac{1}{q} \sum_{j,k=0}^{d-1} u^{j+k} \left(1 - (1-\gamma)\mathbb{1}_{j+k \ge d}\right) \theta^{k}(z_j)\, e_k + e',$$
where the sum j + k is taken modulo d and the function (1 − (1 − γ)1 j+k≥d ) is 1 if j + k < d and γ otherwise. Since |γ| = 1 and z ← D I,r is spherically distributed, it follows that multiplying by γ and applying the permutation of j coordinates induced by θ does not change the distribution of z i,j . Hence, each marginal distribution may be analyzed independently as in the case of MLWE, and the result follows using the analysis of the error from Lemma 4.15 of [3].
Though we do not specify the covariance of Σ, one can see that each entry of σ A (z) appears in σ A (e″) exactly d times, and so by symmetry each element of σ A (e″) has non-zero correlation with at most d^2 other entries. Hence, a proportion of at most nd^4/(n^2 d^4) = 1/n of entries of Σ are non-zero.

B. Search To Decision Reduction
In this section we will show that the hardness of decision CLWE follows from that of the search problem. Once again, we will follow a combination of the expositions of [2] and [3] for the ring and module cases, making necessary changes for the structure of cyclic algebras. We will make heavy use of the following CRT style decomposition, a rephrasing of [29,Lemma 4].
Lemma 12. Let Λ be the natural order of a cyclic algebra A = (L/K, θ, γ) and let I be an ideal of O K which splits completely as I = q 1 ...q n as an ideal of O K . Then, we have the isomorphism Of course, this is not a true CRT decomposition, because we are considering ideals of O K rather than those of Λ. In the case where γ is a unit, Λ ∨ = i u i O ∨ L and the above lemma is also valid in the case where each instance of O L and Λ are replaced with their respective duals.
As in [2], our reduction will be limited to certain choices of algebras. The above lemma considers the splitting of the ideal I as an ideal of the base field K. Setting I = ⟨q⟩, the ideal generated by the modulus q, we will consider cases where q splits completely in the base field. Now consider the family of algebras A in Section III-C and let K = Q(ζ p a ) have dimension n. It follows that if q ≡ 1 mod p a then q splits completely into a product of prime ideals q 1 , ..., q n as an ideal of O K . Hence, we obtain the decomposition
$$\Lambda_q \cong R_1 \times \cdots \times R_n,$$
where R i is as in Lemma 12. Also as in [2], we see no way to avoid randomizing the error distribution in the resulting decision problem. Further, we require that an oracle for D-CLWE q,Υα on an algebra A = (L/K, θ, γ) is also an oracle for the decision problem on any algebra A′ = (L/K, θ, γ′) over the same number fields L, K and some other root of unity γ′ ∈ O K . Intuitively this implies that for fixed L and K as in Section III-C the hardness of the D-CLWE problem is invariant under the choice of root of unity γ, and will be required for Lemma 15. This is because there exist efficient, easy-to-compute isomorphisms sending A to A′, which we will define shortly.
The main theorem of this section is Theorem 6; we emphasize that our algorithm is only intended to be efficient in the dimension n of the base field K, since we expect to fix d as a small constant in practice. We will prove Theorem 6 in the usual manner: first we show that it is sufficient to recover the value of s ∈ Λ ∨ /qΛ ∨ in one of the rings R i (Lemma 13). Then, we use a hybrid distribution to define a decision problem in R i , for which we demonstrate a search to decision reduction (Lemma 14). We then use a hybrid argument to conclude the proof (Lemma 16).
1) CLWE in R i : In this section we will abuse notation and denote by s mod R i the value of s ∈ Λ ∨ /qΛ ∨ in the R i coordinate under the isomorphism of Lemma 12.
Definition 23. The R i −CLWE q,Σα problem is to find the value s mod R i given access to the CLWE distribution Π q,s,Σ for some arbitrary Σ ∈ Σ α .
In the following lemmata we make use of the automorphisms of K coordinatewise on the rings R i . Since K is a Galois extension of Q and q splits completely, it follows that the automorphisms σ i of K act transitively on the ideals q i . We demonstrate how to extend these to functions on A. First, extend these automorphisms to automorphisms α i of L in some arbitrary manner. Then, we can extend these to isomorphisms α i : A → A′, with A′ = (L/K, θ, γ′), which agree with α i on L and send u to u′ with u′^d = α i (γ) and xu′ = u′θ(x) for x ∈ L. By the construction of K from [25], α i (γ) is a non-norm element since it is some primitive n th root of unity, and so it is easy to check that this A′ is a well defined division algebra and that α i is indeed an isomorphism which sends A to A′. Furthermore, it fixes the family of error distributions Σ α . This is because each component of z · e + e′ is defined coordinatewise over the d copies of L R in the module representation of A, and since α i induces the same permutation of the entries of the canonical embedding of L in each coordinate as an automorphism of L, it fixes the family of choices for each of z, e, e′; hence, since α i is an isomorphism, the family of distributions of z · e + e′ is fixed. It follows that the extended α i maps the R i −CLWE q,Σα problem in A to the same problem in A′, and moreover that this map preserves Λ ∨ and the CRT style decomposition (Lemma 12) of Λ ∨ q by sending R i to some R j , where j depends on the choice of σ i . We are now ready for the first step of our reduction.
Lemma 13. There is a deterministic polynomial time reduction from CLWE q,Σ to R i −CLWE q,Σ .
Proof. Let O i be an oracle for the R i −CLWE q,Σ problem. Since Lemma 12 defines an isomorphism, it is sufficient to use O i to solve R j −CLWE q,Σ for each j. Let α j/i be an extension of the automorphism of K mapping q j to q i , which exists by transitivity. Then, given a sample (a, b) ← Π q,s,Σ , we construct the sample (α j/i (a), α j/i (b)). Since Λ q and Λ ∨ q are fixed by each α j/i , the resulting pair is a valid CLWE sample in A′ = (L/K, θ, α j/i (γ)); feeding these samples into O i outputs a value t j mod R i .
We claim α −1 j/i (t j ) = s mod R j . Since α j/i is an automorphism, each sample (a, b) is mapped to a new CLWE sample (α j/i (a), α j/i (a · s/q + e) mod Λ ∨ ) in the new algebra A′. We may write the second coordinate as α j/i (a) · α j/i (s)/q + α j/i (e) mod Λ ∨ . Since our automorphisms fix our family of error distributions and map the uniform distribution to the uniform distribution, it follows that this is a valid CLWE instance with secret α j/i (s) and error distribution Σ′. Hence, O i outputs t j = α j/i (s) mod R i , from which we recover α −1 j/i (t j ) = s mod R j , as required.
2) Hybrid CLWE and Search-Decision: For this section we must introduce the cyclic algebra analog of the Hybrid LWE distribution used in [2]; we use the decomposition into the rings R i rather than the CRT.
Definition 24. For a secret s ∈ Λ ∨ q , distribution Σ over ⊕_j u^j L_R , and i ∈ [n], we define a sample from the distribution Π i q,s,Σ over Λ q × (⊕_{j=0}^{d−1} u^j L_R )/Λ ∨ by taking (a, b) ← Π q,s,Σ and h ∈ Λ ∨ q which is uniformly random and independent mod R j for j ≤ i and 0 mod R j for j > i, and outputting (a, b + h/q). If i = 0, we define Π 0 q,s,Σ = Π q,s,Σ .
Using this distribution we define a worst-case decision problem relative to one R i and reduce it to the search problem R i −CLWE.
Definition 25. For i ∈ [n] and a family of distributions Σ α , the W-D-CLWE i q,Σα problem is defined as the problem of finding j given access to Π j q,s,Σ for j ∈ {i − 1, i} and valid CLWE secret and error distribution s, Σ.
For a technical reason in the following proof, we restrict our secret s so that s mod R i lies in a set G i with the property that g ≠ h ∈ G i implies g − h is an invertible element. Applying this restriction for each i places s ∈ G for a set G = G 1 × · · · × G n of size |G| = ∏_i |G i |. We will call such a set G a pairwise different set. We need to guarantee that there exist sufficiently large choices of G. It is not difficult to see that the maximal set sizes are |G i | = q^d and |G| = q^{nd} , because any set of matrices in M d×d (F q ) of size at least q^d + 1 contains two matrices with the same first row, whose difference is therefore not invertible. Constructions of such maximal sets G are given in Appendix F.
Lemma 14. Assuming s ∈ G, there is a probabilistic polynomial-time reduction from R i −CLWE q,s,Σα to W-D-CLWE i q,Σ for any i ∈ [n].
Proof. We follow the standard search-decision methodology of guessing the value of the secret mod R i and then modifying the samples so that the decision oracle tells us whether or not our guess was correct. Note that there are only |G i | possible values of s mod R i , which is bounded above by q^{d^2} , polynomial in n, and so we may efficiently enumerate over the possible values. We define the transform which takes a value g ∈ Λ ∨ q and maps Π q,s,Σ to Π i−1 q,s,Σ if g = s mod R i or Π i q,s,Σ otherwise as follows. On input a CLWE sample (a, b) ← Π q,s,Σ , output the pair
$$(a', b') = \left( a + v,\ b + (h + v \cdot g)/q \right),$$
where v ∈ Λ q is uniformly random mod R i and 0 mod R j for j ≠ i, and h ∈ Λ ∨ q is uniformly random and independent mod R j for j < i and 0 on the other R j . It is clear that a′ is still uniformly distributed on Λ q , so we are left to show b′ is correctly distributed. For a fixed value of a′, we write
$$b' = \left( a' \cdot s + h + v \cdot (g - s) \right)/q + e,$$
where e is still drawn from Σ. If g = s mod R i , then v(g − s) = 0 mod R i , and so the distribution of the pair (a′, b′) is precisely Π i−1 q,s,Σ . Otherwise, v(g − s) is uniformly random mod R i by assumption on G and 0 mod the other R j , and so letting h′ = h + v(g − s) we see that the distribution of (a′, b′) is precisely Π i q,s,Σ .
Remark 7. This is the only stage of the proof which enforces that the asymptotic complexity scales only with n and not with d, since we are forced to guess all of s mod R i at once.
Since the above reduction is secret preserving the required decision oracle for W-D-CLWE i q,Σα has the additional restriction that s ∈ G, but for the purposes of the rest of our proof it will be more convenient to have access to an oracle solving the at least as hard problem where s is arbitrary. Additionally, in practical applications we will use the decision problem for arbitrary s, so we see no benefit of the tighter reduction where s is restricted.
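To make the notion of a pairwise different set concrete, the following Python sketch builds one candidate G i for a single component R i ≅ M d×d (F q ) with d = 2 and checks the defining property by brute force. The construction used here (multiplication matrices of the field with q^2 elements) is a standard one that we assume purely for illustration; it is not necessarily the construction of Appendix F. The modulus q = 7, the non-residue 3, and all function names are hypothetical toy choices.

```python
from itertools import product

def field_embedding_set(q, nonresidue):
    """All matrices [[a, nonresidue*b], [b, a]] over F_q (hypothetical toy G_i).

    These are the multiplication matrices of a + b*sqrt(nonresidue), so the set
    is closed under subtraction and every non-zero member is invertible.
    """
    return [((a, (nonresidue * b) % q), (b, a))
            for a, b in product(range(q), repeat=2)]

def det2(m, q):
    """Determinant of a 2 x 2 matrix modulo q."""
    (a, b), (c, d) = m
    return (a * d - b * c) % q

def is_pairwise_different(mats, q):
    """True if g - h is invertible mod q for every pair g != h in mats."""
    for idx, g in enumerate(mats):
        for h in mats[idx + 1:]:
            diff = tuple(tuple((x - y) % q for x, y in zip(rg, rh))
                         for rg, rh in zip(g, h))
            if det2(diff, q) == 0:
                return False
    return True

q, nonresidue = 7, 3  # toy parameters: 3 is not a square modulo 7
G_i = field_embedding_set(q, nonresidue)
print(len(G_i), is_pairwise_different(G_i, q))
```

In this toy case the set has q^2 = 49 elements, which matches the maximal size q^d for d = 2; adding any further matrix would repeat a first row and break the property.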
3) Worst-Case to Average-Case Decision Reduction: Now that we have removed the restriction that s ∈ G, we are able to follow the skeleton of the RLWE search-decision reduction of [2] more liberally.
Definition 26. The error distribution Υ α on the family of possible error distributions is sampled by choosing an error distribution Σ ← Σ α and adding it to D r , where each r i := α((n · d^2)^{1/4} · √y i ) for y 1 , ..., y n·d^2 sampled from Γ(2, 1).

Definition 27. For i ∈ [n] and a distribution Υ α over possible error distributions, an algorithm solves the D-CLWE i q,Υα problem if, with non-negligible probability over the choice of pairs (s, Σ) ← U (Λ ∨ q ) × Υ α , it has a non-negligible difference in acceptance probability on inputs from Π i q,s,Σ and Π i−1 q,s,Σ .
This is the average case decision problem relative to R i ; in our worst-case to average-case reduction we will need to randomize the choice of error distribution, which we do by sampling from Υ α . Proof. Since the definition of Υ α is a distribution over the family of distributions obtained by sampling from Σ α and adding an elliptical Gaussian, the proof is the same as Lemma 5.12 of [2], except we replace each instance of mod q i R ∨ with mod R i and each instance of R q with Λ q .
Remark 8. This choice of Υ α means that our decision problem is closer to diagonal than the corresponding search problem! In fact, if one increased the elliptical error in the decision problem, one could 'flood out' the non-diagonal entries of the covariance matrix, leading to elliptical error which is easier to handle in practice.
Finally, we use a hybrid argument. We must first show that Π n q,s,Σ is uniformly random given Σ sampled from Υ α , but again this follows the same method as the ring case, except we must replace their use of Lemma 1 by [38, Lemma 2.4].
Lemma 16. Let Υ α be as above and let s ∈ Λ ∨ q . Then given an oracle O which solves the D-CLWE q,Υα problem there exists an efficient algorithm that solves D-CLWE i q,Υα for some i ∈ [n] using O.
Proof. The proof is identical to the ring case, Lemma 5.14 of [2], except that the indexing set Z * m is replaced by [n].
Denote by CLWE q,Σα,G the search CLWE problem where s ∈ G for arbitrary fixed G ⊂ Λ ∨ q . To sum up, we have obtained the main result of this section:
Theorem 6. Let Λ be the natural order of a cyclic algebra A = (L/K, θ, γ), q ∈ poly(n) and assume that α·q ≥ η ε (Λ ∨ ) for a negligible ε = ε(n). Then, there is a probabilistic reduction from CLWE q,Σα,G for any pairwise different G ⊂ Λ ∨ q to D-CLWE q,Υα which runs in time polynomial in n.

C. Summary of Security Proof
There are certain technicalities and subtleties in our security proof, which we briefly summarize as follows.
The hardness of Search CLWE in Section IV-A requires a natural, maximal order Λ. Nonetheless, Lemma 11 (due to Lemmas 6 and 7) is the only stage of the proof that assumes such a natural, maximal order. An improved proof technique may be able to drop this assumption (e.g., to use the natural order). The search to decision reduction in Section IV-B requires a natural order Λ, due to the CRT decomposition of Lemma 12. A better version of CRT may extend the reduction to a maximal order. Fortunately, the orders we take from Section III-C1 are both natural and maximal, thereby meeting these requirements. The requirement of unramified q in Theorem 4 (due to Lemma 6) is minimal: for the algebras of Theorem 2, the only unsuitable primes are the p and q used in the construction (cf. Section III-C).
Lemma 14 in Section IV-B2 enforces that s lies in a pairwise different set G. It is the only stage of the proof which requires such a set. We emphasize that our reduction takes the search CLWE problem where s ∈ G for arbitrary fixed G to the decision CLWE problem for arbitrary secret s. In other words, we claim hardness for the full decision problem, based on hardness of a restricted search problem. Also, our reduction implies that the decision problem is as hard as the search problem for the hardest choice of G. See Appendix F for more details.
Remark 9. The so-called normal form is used de facto in LWE-based cryptography. We note that the normal form reduction is agnostic to the secret space G. More precisely, the secret s ∈ G is completely cancelled in the transformation and replaced by a new secret s′ over the entire space (see the proof of Lemma 18 in Section V-A). Therefore, the secret space in the normal form of CLWE is the entire space after all.
In practice, it may be a concern for the security of CLWE if these reductions were the best possible (e.g. if decision CLWE were polynomial-time equivalent to the restricted search problem, rather than at least as hard). In any case, our secret space is still exponentially large in n.

V. CLWE IN CRYPTOGRAPHY
In this section we present a proof of concept cryptosystem using CLWE. To demonstrate our comparison against MLWE our scheme will closely resemble the typical 'compact' LWE cryptography schemes over modules, in particular KYBER (see [5]), although it is likely that an adaptation of Regev style encryption from [1] would suit CLWE as well.

A. Making CLWE Suitable For Cryptography: Normal Form
We implicitly use some standard LWE facts: firstly, we discretize our error distribution e to Λ ∨ q ; discretizing does not reduce security since an attacker may always discretize the samples themselves. Secondly, we can 'tweak' the problem so that e, s ∈ Λ q . Fortunately, in the case where γ is a unit, Λ ∨ = ⊕_i u^i O ∨ L and so this tweak is precisely multiplying on the right by the tweak factor taking O ∨ L to O L (see e.g. [39]). Finally, we require hardness of a 'normal' form for the CLWE distribution, where s is sampled from the same distribution as the noise e.
We require two facts for our proof: firstly, given that q splits completely in K the ring Λ q is isomorphic to the direct product of n full matrix algebras over M d×d (F q ), which can be seen by appealing to the CRT-style decomposition of Lemma 12 and Wedderburn's Theorem as in [29,Propositions 1 and 4]. Secondly, we require that a non-negligible fraction in n of elements of Λ q are invertible, which follows for fixed, small, d and q ∈ poly(n) from this direct product decomposition. Otherwise, our proof follows the outline for that of plain LWE from [40]. Given these two facts, we proceed with showing that the normal form of the CLWE distribution is as hard as the case of taking the secret uniformly at random.
Lemma 17. For a fixed d and q ≥ (n + 1), a non-negligible proportion of elements of Λ q are invertible.
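Proof. Following the decomposition of Lemma 12 and Wedderburn's Theorem, it is sufficient to show that a non-negligible proportion of elements of
$$M_{d \times d}(\mathbb{F}_q) \times \cdots \times M_{d \times d}(\mathbb{F}_q)$$
are invertible, where there are n copies of M d×d (F q ). The proportion of invertible elements of M d×d (F q ) is precisely
$$\prod_{i=1}^{d} \left(1 - q^{-i}\right) \ge \left(1 - \tfrac{1}{q}\right)^{d},$$
from which it follows that the total fraction of invertible elements in Λ q is at least ((1 − 1/q)^d)^n . By assumption, q ≥ n + 1, and so
$$\left(1 - \tfrac{1}{q}\right)^{nd} \ge \left(1 - \tfrac{1}{n+1}\right)^{nd} = \left(1 + \tfrac{1}{n}\right)^{-nd} \ge e^{-d},$$
as required.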
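As a quick numerical sanity check of this counting argument, the following Python sketch evaluates the standard closed form for the proportion of invertible d × d matrices over F q , namely the product of (1 − q^{−i}) for i = 1, ..., d, and compares the n-fold product with the bound e^{−d}. The parameter values n = 256, d = 4, q = 257 and the function names are toy assumptions chosen only for illustration.

```python
import math
from fractions import Fraction

def invertible_fraction(d, q):
    """Exact proportion of invertible d x d matrices over F_q."""
    frac = Fraction(1)
    for i in range(1, d + 1):
        frac *= 1 - Fraction(1, q ** i)
    return frac

def total_invertible_fraction(n, d, q):
    """Proportion of invertible elements in a product of n copies of M_d(F_q)."""
    return invertible_fraction(d, q) ** n

n, d = 256, 4   # toy sizes (assumption, not parameters from the text)
q = 257         # a prime with q >= n + 1
frac = total_invertible_fraction(n, d, q)
# The counting argument predicts that this fraction is at least e^{-d}.
print(float(frac) >= math.exp(-d))
```

Exact rational arithmetic is used so that the comparison is not affected by rounding in the intermediate product.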
Remark 10. This lower bound of e^{−d} means that the normal form reduction will be asymptotic in n but only valid for fixed d. However, as d increases the proportion of invertible elements of Λ q is bounded below by (1 − 1/q)^{nd} , and so the reduction would be efficient in d in the case where one enforced a relation on q and d, such as q ≥ nd + 1, or more succinctly q ≥ N .
Lemma 18. There is a probabilistic polynomial time reduction from the CLWE problem with uniformly random secret s, possibly over a limited secret space G, and error distribution χ to the CLWE problem with secret s ← χ.
Proof. It is sufficient to show that there is an efficient transformation taking samples with secret s to samples with some new secret s′ taken from χ. Sample pairs (a, b) ← Π q,s,χ until a pair (a 1 , b 1 := a 1 · s + e 1 ) such that a 1 is invertible in Λ q is obtained. Since a non-negligible fraction of elements of Λ q are invertible by Lemma 17, this step takes only polynomial time. Now, given a pair (a i , b i ) ← Π q,s,χ , we obtain a sample from the CLWE distribution Π q,e 1 ,χ by outputting
$$(a_i', b_i') = \left( -a_i \cdot a_1^{-1},\ b_i - a_i \cdot a_1^{-1} \cdot b_1 \right).$$
Indeed, substituting b 1 = a 1 · s + e 1 and b i = a i · s + e i gives b′ i = −a i · a 1 −1 · e 1 + e i = a′ i · e 1 + e i , and a′ i is uniformly random because a 1 is invertible, and so (a′ i , b′ i ) is a valid CLWE sample with secret e 1 and error distribution χ. Relabelling e 1 as s′ completes the proof.
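The normal form transformation can be sketched in Python on a toy model. We assume, for illustration only, that a single component of Λ q is modelled by 2 × 2 matrices over F q with q = 17, that samples have the unscaled form b = a · s + e, and that the map used is (a i , b i ) → (−a i · a 1 ^{−1}, b i − a i · a 1 ^{−1} · b 1 ), which is one transformation consistent with the statement of the lemma. All helper names are hypothetical, and the error sampler is a placeholder rather than the distribution χ.

```python
import random

def mat_mul(A, B, q):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % q for j in range(n)]
            for i in range(n)]

def mat_add(A, B, q):
    return [[(x + y) % q for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_neg(A, q):
    return [[(-x) % q for x in row] for row in A]

def mat_inv(A, q):
    """Gauss-Jordan inverse modulo a prime q; returns None if A is singular."""
    n = len(A)
    M = [[x % q for x in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, q)  # modular inverse (Python 3.8+)
        M[col] = [(x * inv) % q for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % q for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def sample(s, q, d, rng):
    """Toy sample b = a*s + e with uniform a and small e (entries in {-1, 0, 1})."""
    a = [[rng.randrange(q) for _ in range(d)] for _ in range(d)]
    e = [[rng.choice((-1, 0, 1)) for _ in range(d)] for _ in range(d)]
    return a, mat_add(mat_mul(a, s, q), e, q), e

def to_normal_form(a1_inv, b1, a_i, b_i, q):
    """Map (a_i, b_i) with secret s to a sample whose secret is e_1."""
    t = mat_mul(a_i, a1_inv, q)  # a_i * a_1^{-1}
    return mat_neg(t, q), mat_add(b_i, mat_neg(mat_mul(t, b1, q), q), q)

def demo(seed, q=17, d=2):
    rng = random.Random(seed)
    s = [[rng.randrange(q) for _ in range(d)] for _ in range(d)]
    while True:  # resample until a_1 is invertible
        a1, b1, e1 = sample(s, q, d, rng)
        a1_inv = mat_inv(a1, q)
        if a1_inv is not None:
            break
    a_i, b_i, e_i = sample(s, q, d, rng)
    a_new, b_new = to_normal_form(a1_inv, b1, a_i, b_i, q)
    # The new sample should satisfy b_new = a_new * e_1 + e_i (mod q).
    return b_new == mat_add(mat_mul(a_new, e1, q), e_i, q)
```

Calling `demo(seed)` for any seed checks that the original secret s cancels and that the transformed sample has secret e 1 . Note that the order of the factors matters, since the multiplication is non-commutative.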

B. Sample Cryptosystem
Our scheme is parameterized by an algebra A := (L/K, θ, γ), where A is as in Section III-C, an error distribution Σ, and a prime modulus q ≡ 1 mod m (recall K = Q(ζ m )) which is completely split in L. We will denote with bold faced letters the vector form of an element of Λ q , e.g. if a = a 0 + ua 1 + ... + u^{d−1} a d−1 then a = (a 0 , a 1 , ..., a d−1 ). We note that O L /qO L has a polynomial representation of dimension n · d, and so we encode our message m ∈ {0, 1}^{n·d^2} as an entry of Λ q , as a vector m of d {0, 1} polynomials. The scheme proceeds as follows:
• Alice generates a CLWE sample (a, b := a · s + e), where a ∈ Λ q is uniformly random and e ← Σ, and outputs the public key (a, b).
• To encrypt m ∈ {0, 1}^{n·d^2} , Bob samples t, e 1 , e 2 ← Σ and outputs u := φ(a)^T t + e 1 , v := φ(b)^T t + e 2 + (q/2) · m.
• To decrypt, Alice computes c = v − φ(s)^T u and recovers each coordinate of m by rounding the corresponding entry of c to 0 or q/2 and outputting 0 or 1 respectively.

Remark 11. There are two benefits of instantiating this scheme in the cyclic algebra setting rather than over modules as in [5], both following from the matrix embedding φ. Firstly, in the module setting Alice must publish a matrix A in her key, whereas here the single element a suffices because φ(a) generates the matrix; this saves a factor of d in the size of the public key. Secondly, by extending b to φ(b) we are able to increase the dimension of v, and correspondingly increase the size of the message by a factor of d.
Example 3. Recall our explicit algebras from Section III-C. Without considering streamlined implementation for specific NIST submissions, we will pick toy comparison parameters for equivalent module based systems and ring based schemes, e.g. KYBER and NewHope. For the module case, consider a module of dimension 4 over a ring L of dimension 256, with 2-power cyclotomic base field [K : Q] = 64. Our public key (a, b) requires storing only 8 elements of R q = O L /q · O L rather than 20 in the form (A, b). Our message consists of 1024 bits, corresponding to the total dimension of the algebra, rather than the module version's 256, which corresponds to the field dimension; if the private key size is 256, our CLWE scheme allows a rate-1/4 binary error correction code, while KYBER does not. Our ciphertext sizes are the same. As far as the modulus q is concerned, we find q = 3329 splits completely in a quartic cyclic extension L of K. This matches the modulus q used in KYBER. Overall this represents a noteworthy gain in key and message size without loss in efficiency. For the ring case, consider an instantiation of NewHope in dimension 1024. Both public keys are in the form (a, s) and so require equivalent levels of storage (8 elements of a field of dimension 256 or 2 in dimension 1024), and the same phenomenon is true of ciphertext sizes and message length. However, a larger modulus q = 12289 is used in NewHope. Hence, we hope to gain in security without losing much efficiency.
Before considering security and correctness we need a somewhat technical lemma allowing the use of the matrix transpose operation. Essentially, it states that if the CLWE problem is hard in an algebra A, then for a, s, e ∈ Λ q , the equation φ(a)^T s + e is a valid CLWE instance in some other algebra A′ for which the CLWE problem is still hard.
Lemma 19. Let A = (L/K, θ, γ) be a cyclic division algebra with matrix embedding φ(a) and natural order Λ. Then there exists another cyclic algebra A′ = (L/K, θ, γ −1 ) with matrix embedding φ′(a′) and natural order Λ′ such that for a ∈ A there exists a′ ∈ Λ′ satisfying φ(a)^T = φ′(a′). Moreover, A′ still satisfies the division algebra condition, and Λ q and Λ′ q are canonically isomorphic as additive groups.
Proof. The fact that A′ is still a division algebra follows from the non-norm property on γ and the fact that N L/K (L × ) is a multiplicative group. Λ q and Λ′ q are additively isomorphic because both algebras share the same underlying fields and γ, γ −1 are both units of O L . Since the first row of φ(a) is precisely (x 0 , γθ(x d−1 ), γθ^2 (x d−2 ), . . . , γθ^{d−1} (x 1 )), by setting a′ = x 0 + uγθ(x d−1 ) + · · · + u^{d−1} γθ^{d−1} (x 1 ) and observing that θ^d is the identity it is easy to check that φ(a)^T = φ′(a′).
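The transposition identity can be checked numerically in the smallest case d = 2. The following Python sketch assumes a toy algebra that is not taken from Section III-C: L = Q(i), K = Q, θ equal to complex conjugation, and γ = −1, which gives Hamilton's quaternions. The matrix layout follows the first-row description in the proof above; all function names are hypothetical.

```python
def theta(x):
    """Generator of Gal(Q(i)/Q): complex conjugation."""
    return x.conjugate()

def phi(a, gamma):
    """Left regular representation of a = x0 + u*x1 in (Q(i)/Q, theta, gamma)."""
    x0, x1 = a
    return [[x0, gamma * theta(x1)],
            [x1, theta(x0)]]

def mul(a, b, gamma):
    """Product in the algebra, using x*u = u*theta(x) and u^2 = gamma."""
    (x0, x1), (y0, y1) = a, b
    return (x0 * y0 + gamma * theta(x1) * y1, theta(x0) * y1 + x1 * y0)

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def transpose_partner(a, gamma):
    """Coordinates of a' in A' = (L/K, theta, 1/gamma) with phi'(a') = phi(a)^T."""
    x0, x1 = a
    return (x0, gamma * theta(x1))

gamma = -1  # toy non-norm unit (assumption): yields Hamilton's quaternions
a = (complex(1, 2), complex(3, -1))
b = (complex(0, 1), complex(2, 5))
# phi is multiplicative: phi(a*b) == phi(a) * phi(b)
print(phi(mul(a, b, gamma), gamma) == mat_mul(phi(a, gamma), phi(b, gamma)))
# Transposing phi(a) gives the embedding of a' in the algebra with 1/gamma
print(transpose(phi(a, gamma)) == phi(transpose_partner(a, gamma), 1 / gamma))
```

Small integer coordinates are used so that the complex arithmetic is exact and the equality tests are meaningful.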
The proofs of correctness and security are similar in spirit to those of other compact LWE schemes such as e.g. NewHope [4] or KYBER [5]. We proceed with a somewhat informal security argument.
Lemma 20. The defined scheme is IND-CPA secure under the assumption that the decision CLWE q,Υ problem is hard.
Proof. The goal of an IND-CPA adversary is to distinguish, with non-negligible advantage, between encryptions of two plaintexts m 0 , m 1 . The challenger chooses i ∈ {0, 1} uniformly at random and encrypts m i as (u, v). By the assumption that the decision CLWE problem is hard, the adversary cannot distinguish between the case where b = a · s + e and the case where it is replaced by a uniformly random b′, so we replace the challenge ciphertext v with v′ by replacing b with b′. Setting v″ := v′ − (q/2) · m i , it follows by Lemma 19 that u, v″ represent two samples from a valid CLWE distribution with secret t, and so the adversary cannot distinguish them from uniform with non-negligible advantage. Hence, the adversary cannot distinguish v″, and hence v′, from uniform with non-negligible advantage and so cannot guess i with non-negligible advantage.
Finally, we demonstrate conditions on the error term for the scheme to be correct.

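Lemma 21. The defined scheme is correct as long as the ∞ norm of e′ = (φ(e)^T t + e 2 − φ(s)^T e 1 ) is less than q/4, where the ∞ norm is over the vector of all polynomial coefficients of each u^i entry of e′, of dimension n · d^2 .
Proof. To decrypt, Alice computes v − φ(s)^T u and recovers m by rounding. Since b = a · s + e and φ is multiplicative, φ(b)^T = φ(s)^T φ(a)^T + φ(e)^T , and so
$$v - \varphi(s)^T u = \varphi(b)^T t + e_2 + \tfrac{q}{2} \cdot m - \varphi(s)^T \left( \varphi(a)^T t + e_1 \right) = \tfrac{q}{2} \cdot m + e',$$
from which the result follows immediately.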
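The rounding step can be illustrated coefficient by coefficient. The Python sketch below is a minimal model under stated assumptions: a bit is encoded as bit · ⌊q/2⌋ (the rounding of q/2 to an integer is our assumption), the modulus is q = 3329 as in Example 3, and the function names are hypothetical. Because q/2 is rounded down, the toy check uses noise of magnitude below q/4 − 1 rather than q/4.

```python
def encode_bit(bit, q):
    """Encode one message bit as 0 or floor(q/2) modulo q (assumed rounding)."""
    return (bit * (q // 2)) % q

def decode_coeff(c, q):
    """Return 1 if c mod q is closer to q/2 than to 0, and 0 otherwise."""
    c %= q
    return 1 if q < 4 * c < 3 * q else 0

q = 3329
noise_bound = q // 4 - 1  # 831, safely below q/4 for this toy encoding
ok = all(decode_coeff(encode_bit(bit, q) + noise, q) == bit
         for bit in (0, 1)
         for noise in range(-noise_bound, noise_bound + 1))
print(ok)
```

Noise beyond the bound can flip a bit: for instance, a coefficient encoding 0 with noise 900 is decoded as 1 in this model.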
We note that the error term e will be unsurprising to those familiar with LWE based cryptography. Although we do not provide concrete correctness estimations, the error parameters for our decision reduction are equivalent to those of MLWE up to some small covariance terms. We do not expect this covariance to greatly affect the distribution of the error and thus for equivalent parameter choices we expect a similarly small probability of decryption failure.

C. Operational Complexity in Cyclic Algebras
In the previous subsection we showed that the CLWE problem can be used to construct a standard LWE based cryptosystem. Assuming that parameters across all variants of the LWE assumption are roughly equivalent, the CLWE problem supports key and message sizes as advantageous as those of the RLWE problem, and better than those of the module case. Along with storage considerations, another important facet of the ambient space in LWE cryptography is the efficiency of operations. Here, we will construct algorithms and consider the asymptotic complexity of multiplication in a cyclic algebra in order to compare it to the ring and module variants. Since in practice we consider operations modulo some prime q, addition in rings, modules, and cyclic algebras can be considered as addition in vector spaces over Z q , which has complexity dominated by that of multiplication.
Consequently, we only concern ourselves with a comparison of the cost of computing the multiplication operation As in the three cases. In order to keep our comparison consistent, we let N denote the total dimension of the underlying LWE instance. In the ring case, N denotes the ring dimension; in the module case, N = nd, where n denotes the ring dimension and d the module rank; in the cyclic algebra case N = nd^2 , where the ring dimension is nd and the algebra has 'module' rank d. However, since it will be important later we remark here that the cyclotomic part of the ring will be of dimension n rather than nd. The three cases can be considered as follows:
• In the ring case, the operation As over Z q is a representation of the ring operation a · s in R q ∼ = Z q [X]/(X^N + 1). Using the CRT decomposition in dimension N of [41], this operation is decomposed into coordinatewise multiplication in a vector of dimension N over Z q , following which the decomposition is reversed to recover a · s. The complexity of this technique is dominated by that of the CRT decomposition, which takes time O(N log N ), although the coordinatewise multiplication also requires time O(N ).
• In the module case, A is a d×d matrix over R q . In this case, one can compute As by applying the CRT in dimension n coordinatewise on A and s. This requires d 2 + d applications of the CRT, for a total asymptotic complexity of O(d 2 n log n) = O(N d log(N/d)). Again, this hides a coordinatewise multiplication step which takes time O(N d) in this setting.
• In the cyclic algebra case, A is a matrix in the shape φ(a), where φ(a) is the left regular representation of a ∈ Λ q . We estimate the complexity of the operation φ(a) · s in Appendix G. Explicitly, our algorithm has complexity O(N log(N/d 2 ))+O(N d ω−2 ) in the case where q splits completely in L, with ω ∈ [2, 2.373] denoting the exponent of matrix multiplication. The latter term corresponds to the cost of multiplication in our analog of the finite fields used in the CRT method for RLWE. We see that cyclic algebras compare favourably with modules for multiplication in the same dimension N , depending on the exact relationship between log d 2 and d ω−2 . Since d is likely to be fixed while n scales up, we expect that the O(N log N ) term will dominate the complexity. Nonetheless, we include the second term in our results to quantify our claims. The second term O(N d ω−2 ) becomes O(N d) with naive matrix multiplication instead of the algorithms of [13], yet its overall multiplication complexity is still lower than that of module multiplication in the same dimension.
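To get a feel for how the three asymptotic expressions compare, the following Python sketch evaluates them at a common total dimension N. This is only a rough illustration: hidden constants are ignored, base-2 logarithms are an arbitrary choice, and the values N = 1024 and d = 4 are toy parameters rather than recommendations. The function names are hypothetical.

```python
import math

def ring_cost(N):
    """Leading term N log N for multiplication in the ring case."""
    return N * math.log2(N)

def module_cost(N, d):
    """Leading term N d log(N/d) for multiplication in the module case."""
    return N * d * math.log2(N / d)

def cyclic_cost(N, d, omega=2.373):
    """N log(N/d^2) + N d^(omega-2) for multiplication in the cyclic algebra case."""
    return N * math.log2(N / d ** 2) + N * d ** (omega - 2)

N, d = 1024, 4
print("ring:           ", ring_cost(N))
print("module:         ", module_cost(N, d))
print("cyclic (fast):  ", cyclic_cost(N, d))
print("cyclic (naive): ", cyclic_cost(N, d, omega=3.0))
```

At these toy values the cyclic algebra estimate stays below the module estimate even with naive matrix multiplication (omega = 3), in line with the comparison made above.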

VI. CONCLUSIONS AND FUTURE WORK
The primary goal of this work is the introduction of the Learning with Errors problem over Cyclic Algebras, CLWE, adding to the family of available LWE assumptions for use in cryptography. To this end, the central pillars of an LWE problem are provided for the cyclic algebra case. First, in order to provide a foundation for the construction, the notion of lattices derived from ideals of the natural order of a cyclic algebra is applied in cryptography for the first time. Then, in Section III, the CLWE problem is formally introduced, following which explicit algebras are provided with dimensions and structure appropriate for cryptographic use. Then, in Section IV, the usual LWE security reductions are established in the CLWE case: namely, the problem of solving short vector problems on order-ideal lattices is reduced to the search CLWE problem, and then a variant of the search CLWE problem where the secret is restricted to a fixed, well constructed subset of its usual space is reduced to the decision CLWE problem. Under plausible assumptions on this restricted search problem, combining these two reductions gives the necessary security grounding for CLWE based cryptography, which is that samples from the CLWE distribution appear pseudorandom to an onlooker with no knowledge of the secret s. Finally, in Section V, the necessary steps are taken to mold the CLWE problem into a practical format for cryptography. A normal form reduction is shown and a sample cryptosystem in this form is provided. Additionally, the complexity of operations in CLWE cryptography is compared to that of RLWE and MLWE based schemes.
Cyclic algebras exhibit substantial novel structures within lattice-based cryptography, and discovering use cases for these previously unseen features represents an exciting area of future research. We outline a few directions of future research in the following.
From a theoretical standpoint, the most pressing question to be solved about CLWE is whether or not the search and decision problems are polynomial time equivalent, or instead whether the hardness of the decision variant can be based directly on hard lattice problems via some other technique. In this work, the hardness of the decision problem for arbitrary secret is shown to derive from the assumed hardness of a variant of the search problem where the secret is restricted to lie in any so-called pairwise different set G. Although this substantially lowers the size of the secret space, the resulting secret space is still far too large to exhaustively search. Furthermore, the decision problem is as hard as the search problem for the hardest choice of the set G, precluding particularly easy cases. Nonetheless, this does not establish the formal hardness of the decision CLWE problem based on the lattice problems of Section II-E. The reduction fails to permit arbitrary secret since the decomposition into matrix rings of Lemma 12 results in a problem that cannot be 'guessed' effectively, since the oracle does not necessarily accept inputs as valid when the guess is wrong.
Another method of establishing the hardness of decision RLWE that is not shown for CLWE in this work is a direct to decision reduction, which more generally represents a security proof for the decision problem that holds for wider classes of cyclic division algebras than those of Section IV-B. The direct to decision reduction of [34] is the only security reduction for RLWE which establishes the hardness of the decision problem without enforcing that K is a cyclotomic field within which q splits completely, as in the search-decision reduction of [2] and the presented analog for CLWE. Dropping this restriction, and hence widening the possible choices of cyclic algebras supporting the hardness of the decision problem, would provide larger design space for CLWE based cryptography.
As for another direction of future work, we view a drawback of our work to be that we are restricted to certain instances of cyclic algebras. Although in practice most cryptography would use a fixed choice of algebra, this is a function of our methods and may be possible to remove. Additionally, showing the aforementioned direct-to-decision reduction may generalize the choice of algebras.
Finally, this work is focused on the theoretical construction of a non-commutative Ring-LWE assumption, and we leave practical analysis and implementation of cryptography based on CLWE as further research.

ACKNOWLEDGMENT
The authors would like to thank Jyrki Lahtonen, Damien Stehle and Martin Albrecht for helpful discussions. They are also grateful to Andrew Mendelsohn for finding the prime q = 3329 in Example 3.

APPENDIX A ATTACKING NON-DIVISION ALGEBRAS
In Section III-A, the condition that γ is a non-norm element of L/K is required in order to stop parallelizing attacks in the style of [22] from applying to the CLWE problem. Thus, γ is chosen so that $\gamma^i$ is not in the norm group of L into K for $i = 1, 2, \dots, d-1$. Here, we demonstrate that picking a γ that violates this condition leads to potentially vulnerable instances of the CLWE problem. We will need the following lemma, a rephrasing of [35, Theorem 30.4].
Remark 12. If $\gamma \in O_K$ is a unit then all isomorphisms of this lemma hold when replacing L and K with $O_L$ and $O_K$ respectively. The first and third can be seen by examining the proofs in [35]; the first is a re-indexing of u coordinates of A, and the third simply sends u to βu. The second requires a little more work. We map A to $\mathrm{Hom}_K(L, L)$ by sending u to θ and x ∈ L to the K-homomorphism on L defined by multiplying by x. Finally, we appeal to the standard isomorphism between $\mathrm{Hom}_K(L, L)$ and $M_{d\times d}(K)$, which preserves integral elements as long as there exists an integral basis of L over K. We discuss the details of this last part later, because we also require it to preserve a notion of smallness.
Armed with this lemma, we demonstrate potential weaknesses of choosing γ poorly. Let A = (L/K, θ, γ) be a cyclic algebra where γ lies in the norm group $N_{L/K}(L^\times)$ (and still lies in $O_K$); later we will generalize our argument to the case where some power of γ less than d is a norm instead. Consider the primal CLWE instance $(a, a \cdot s + e) \in \Lambda_q \times \Lambda_q$, where a, s are uniform and e ← χ is drawn from an error distribution of Gaussian shape. Applying Lemma 22 transforms our sample into one over $M_{d\times d}(O_{K_q}) \times M_{d\times d}(O_{K_q})$. That is, we construct a sample of the form $(A, A \cdot S + E)$, where $A, S, E \in M_{d\times d}(O_{K_q})$. Since isomorphisms are bijections, A and S are uniformly random matrices. Assume for the time being that the isomorphisms are also smallness preserving, so that if e is a small element of $\Lambda_q$ then the corresponding matrix E will have entries that are small elements of $O_K$. Let $s_i$, $e_i$ denote the i-th columns of S and E respectively. Then, for each i the pair $(A, A s_i + e_i)$ constitutes d samples from the MLWE distribution in dimension n and rank d. That is, the single CLWE sample provides a collection of d samples from d instances of the MLWE distribution with different secrets $s_1, \dots, s_d$, where each set of samples shares the same uniformly random matrix A. Since the difficulty of LWE problems is assumed to be superlinear in dimension N, solving d instances of the MLWE problem in dimension n and module rank d is easier than solving a single instance in dimension nd and rank d, the targeted dimension of our CLWE problem, which is essentially the parallelizing argument of the attack of [22] on m-RLWE. Furthermore, the matrix A being common to each set of samples potentially weakens the resulting MLWE instances. Thus, assuming that $e_i$ is suitably distributed, it is clear that choosing a γ that is the norm of an element of L compromises security.
We are left to consider the distribution of the error matrix E. In order to understand this, we must discuss the proof of Lemma 22 further. Let $\gamma = N_{L/K}(\beta)$, so that the isomorphism mapping A to $A' = (L/K, \theta, 1)$ fixes L and sends u to uβ. Following the proof of Theorem 2 we see that the γ which are both roots of unity and norm elements are precisely norms of some other root of unity. Hence, β is a root of unity and this isomorphism maps a Gaussian distribution on A to a Gaussian distribution on A'.
The isomorphism mapping A to M d×d (K) begins with a mapping from A to Hom K (L, L) that sends x ∈ L to the multiplication function f x (y) = xy for y ∈ L and sends u to θ. Then, it applies the well known isomorphism sending Hom K (L, L) to M d×d (K), which can be defined constructively as follows: • Fix a K-basis { 1 , . . . , d } of L over K. • Define f j : K → L as f j (x) = j x, a mapping onto the j coordinate of the basis. • Let π j : L → K denote the projection map onto the j sending π j ( d i=1 Since it permits an arbitrary choice of K-basis, this isomorphism is non-unique. Furthermore, an attacker trying to apply this isomorphism would be able to use their choice of basis and still compute the isomorphism efficiently. We are interested in the image of a Gaussian sample e ∈ Λ q under this isomorphism, with e = d−1 i=0 u i e i , having each e i sampled independently from a discrete Gaussian over O Lq , being sent to ψ e = d−1 i=0 e i σ i . Correspondingly, the i, j coordinate of the matrix E = ∆(e) is For the j th column of E (the error vector in the set of d MLWE samples with secret s i ), the error is precisely the i coordinate of d−1 k=0 e k θ k ( j ). Now the distribution of the error in each collection of MLWE samples depends on the properties of the chosen basis. Since the e k are independent Gaussian samples from L, j is fixed and θ represents a permutation of the canonical embedding coordinates of L elements. Hence, d−1 k=0 e k θ k ( j ) is an elliptical Gaussian with n blocks of d different parameters. Furthermore, if { 1 , . . . , d } is a cyclic basis then, since the distribution of σ L ( k ) 2 is independent of k, the projection π i ( d−1 k=0 e k θ k ( j )) follows an elliptical Gaussian. In addition, these coordinates are not independent and are potentially highly correlated.
The end result of this exposition is that, depending on the properties of the cyclic bases of L/K and given the choice of γ as a norm element, from a single CLWE instance we can construct d parallel copies of d MLWE instances in dimension n and rank d with correlated error. These correlated instances of the MLWE problem are plausibly substantially easier than the claimed security of the CLWE instance, which is that it is roughly as hard as an MLWE instance in the same dimension nd and rank d. Of course, the error distributions in the underlying MLWE instances are non-standard and we have not presented a concrete attack on them. Instead, we believe this discussion is sufficient to persuade the unconvinced reader that solving the CLWE problem with norm element γ can be simplified by some parallelization into MLWE instances, and thus we should stick to our specification that γ is a non-norm element.
In the above exposition we restricted ourselves to cases where γ is a norm, but the definition of the non-norm condition also precludes γ as valid if and only if $\gamma^i$ is a norm for some i < d that is coprime with d (see [7]). However, we have previously assumed that the hardness of the CLWE problem was independent of the choice of primitive n-th root. In the constructions of Theorem 2, γ is an n-th root of unity and d divides the prime power n, so if i is coprime with d then i is also coprime with n and so $\gamma^i$ is a primitive root which defines a cyclic algebra in which the CLWE problem can be parallelized. Thus, we conclude that γ must satisfy the non-norm condition rather than just itself not be a norm. Independently, a recent work [21] revisiting m-RLWE observes that the underlying property causing the attacks of [22] on the original instantiations was the presence of zero-divisors in the ambient space. In our case, zero-divisors exist in a cyclic algebra if and only if the non-norm condition is not satisfied, so their argument should preclude not just the γ that are themselves norms but also all γ which fail the non-norm condition.

APPENDIX B IMPOSSIBLE ALGEBRAS
We show that certain algebras that would otherwise be what we are looking for do not exist under our restrictions. As discussed above we would like to begin with a base field that is cyclotomic, $K = \mathbb{Q}(\zeta_m)$ for integer m, and proceed to fix some low degree cyclic Galois extension L/K and non-norm element $\gamma \in O_K$ with |γ| = 1, e.g. γ is a root of unity. Given these restrictions and the shape of lattice cryptography, the most natural fields to look for are low degree extensions of two-power cyclotomics, e.g. $m = 2^k$. Unfortunately, we are able to prove the non-existence of a large class of such extensions.
Proof. Since L/K is a Galois extension of degree p, the relative norm map $N_{L/K}(\cdot)$ induces the map $x \mapsto x^p$ on elements $x \in K^\times$. Let $1 \le i \le m-1$ be an integer; we will prove the theorem by finding $1 \le j \le m-1$ such that $N_{L/K}(\zeta_m^j) = \zeta_m^i$. Since $\zeta_m$ and its powers lie in K, the relative norm map takes $\zeta_m^j$ to $\zeta_m^{jp}$ and we are left to solve the congruence $jp \equiv i \bmod m$. By assumption, gcd(m, p) = 1 and so p is invertible modulo m. Denoting this inverse $p^{-1}$ and letting $j = p^{-1} i \bmod m$ it is easy to see that $jp \equiv i p^{-1} p \equiv i \bmod m$. The theorem statement follows immediately.
This theorem precludes the existence of a very large class of cyclic division algebras with cyclotomic base field. In particular, if the degree [L : K] is coprime with m then we cannot simultaneously require that γ is integral with |γ| = 1 and that K is cyclotomic. We draw attention to the specific classes whose non-existence we are interested in: in an ideal world we might instantiate CLWE with $K = \mathbb{Q}(\zeta_{2^k})$ and [L : K] = d for arbitrary small integer d corresponding to the module rank, which in practice is likely to be at most, say, 5. However, as a result of Theorem 7 we know that d cannot be coprime with $2^k$ and must be even in order to permit a suitable γ, from which it follows that we cannot have d = 3, 5.
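The congruence at the heart of the proof can be checked numerically. The following is a minimal sketch, assuming the toy parameters m = 512 and p = 3 (chosen here for illustration only); the helper name `norm_preimage_exponent` is hypothetical. It computes the exponent j with jp ≡ i mod m, so that the norm of the j-th power of the m-th root of unity is its i-th power.

```python
from math import gcd


def norm_preimage_exponent(i: int, p: int, m: int) -> int:
    """Return j with j*p = i (mod m), so that N(zeta_m^j) = zeta_m^(j*p) = zeta_m^i."""
    if gcd(p, m) != 1:
        raise ValueError("p must be coprime with m for the inverse to exist")
    return (pow(p, -1, m) * i) % m  # modular inverse via pow needs Python 3.8+


# Toy parameters (assumed for illustration): m = 2^9 and a cubic extension, p = 3.
m, p = 512, 3
for i in range(1, m):
    j = norm_preimage_exponent(i, p, m)
    assert (j * p) % m == i  # every zeta_m^i is the norm of zeta_m^j

print(norm_preimage_exponent(1, p, m))  # 171, since 3 * 171 = 513 = 1 (mod 512)
```

When p shares a factor with m (for example p = 2 and m = 512) the inverse does not exist and the helper raises, which matches the observation that only degrees d not coprime with m can admit a non-norm root of unity.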

APPENDIX C PROOF OF THEOREM 3
Before proving Theorem 3 we need some additional concepts and a lemma. Given a K-central division algebra A and some $O_K$-order Λ in it, the $O_K$-discriminant of Λ, $d(\Lambda/O_K)$, is a certain ideal in $O_K$ [35, p.126]. While A has many maximal orders they all share the same discriminant, which is called the discriminant of the algebra, $d_A$. The key fact about discriminants we need is that an order Λ is maximal if and only if its discriminant equals $d_A$.
We will now use the notation of Section III-C. According to [25] the field L and therefore also its subfield M are subfields of $\mathbb{Q}(\zeta_m, \zeta_q)$, where $m = p^a$ and $q \neq p$ is some large prime. Let $n = \varphi(m) = p^{a-1}(p-1)$. Furthermore it is known that q splits completely in the field $K = \mathbb{Q}(\zeta_m)$. Let us now denote by $qO_K = \mathfrak{q}_1 \cdots \mathfrak{q}_n$ the prime ideal decomposition of q in K. We then have the following result.
Lemma 23. Let $(M/K, \theta, \zeta_m)$ be an index d division algebra of Theorem 2 and let Λ be the corresponding natural order. Then we have that $d(\Lambda/O_K) = (\mathfrak{q}_1 \cdots \mathfrak{q}_n)^{d(d-1)}$.
Proof. According to [7, Lemma 5.4] we have that $d(\Lambda/O_K) = d(M/K)^d \, \zeta_m^{d(d-1)}$, where d(M/K) is the relative number field discriminant of the extension M/K. In order to find the discriminant of the natural order, it is now enough to find d(M/K). By the basic theory of cyclotomic fields we know that $\mathbb{Q}(\zeta_m, \zeta_q) = \mathbb{Q}(\zeta_{mq})$. We also know that the only ramified primes in the extension $\mathbb{Q}(\zeta_{mq})/\mathbb{Q}$ are p and q and their ramification indices are $e_1 = n$ and $e_2 = q - 1$, respectively. Furthermore, the ramification index of p in the extension $\mathbb{Q}(\zeta_m)/\mathbb{Q}$ is $e_1$.
As ramification indices are multiplicative in towers of extensions we can deduce that the only primes that are possibly ramified in the extension $\mathbb{Q}(\zeta_{mq})/\mathbb{Q}(\zeta_m)$ are those that lie above q in the ring $O_K$. As q is not ramified in $\mathbb{Q}(\zeta_m)$, we get again by the multiplicativity of the ramification indices that all the primes $\mathfrak{q}_i$ are totally ramified in the extension $\mathbb{Q}(\zeta_{mq})/\mathbb{Q}(\zeta_m)$. Therefore they are also totally ramified in the extension $M/\mathbb{Q}(\zeta_m)$. Because q does not divide d the prime ideals $\mathfrak{q}_i$ are tamely ramified. Dedekind's discriminant theorem now implies that $d(M/K) = (\mathfrak{q}_1 \cdots \mathfrak{q}_n)^{d-1}$. Since $\zeta_m$ is a unit, the claim follows.
Now we are ready to prove that the natural order in Theorem 3 is actually maximal.
Proof. The proof is based on the result in [35] that states that an order is maximal if and only if it has the same discriminant as the algebra. According to Lemma 23 we have that $d(\Lambda/O_K) = (\mathfrak{q}_1 \cdots \mathfrak{q}_n)^{d(d-1)}$. According to [35] the discriminant of the maximal order will always divide the discriminant of the natural order. Hence we know that the only prime ideals that can possibly divide the discriminant of the maximal order are the $\mathfrak{q}_i$. Let us now assume that $\mathfrak{Q}_i$ is a prime ideal above $\mathfrak{q}_i$ in L. By abusing notation we will denote by $L_{\mathfrak{q}_i}$ the $\mathfrak{Q}_i$-adic completion of L and in the same way the respective completion $M_{\mathfrak{q}_i}$. Following the proof of [25, Theorem 4] we can see that the authors actually prove that $\zeta_m$ is a non-norm element in the extension $L_{\mathfrak{q}_i}/K_{\mathfrak{q}_i}$ for each prime ideal $\mathfrak{q}_i$. Using the same proof as in Theorem 2 we can now see that $\zeta_m$ is a non-norm element in the extensions $M_{\mathfrak{q}_i}/K_{\mathfrak{q}_i}$, for all i. According to [35, Theorem 30.8] we have $A \otimes_K K_{\mathfrak{q}_i} \cong (M_{\mathfrak{q}_i}/K_{\mathfrak{q}_i}, \theta', \zeta_m)$, where θ' naturally extends θ.
As $\zeta_m$ is a non-norm element, $(M_{\mathfrak{q}_i}/K_{\mathfrak{q}_i}, \theta', \zeta_m)$ is an index d division algebra. By definition of the local index we can see that the local indices $m_{\mathfrak{q}_i}$ are d for all $\mathfrak{q}_i$. We now know that the $\mathfrak{q}_i$ are the only possible primes dividing the discriminant and that their local indices are d. According to [35, Theorem 32.1] the discriminant of the algebra A is $d_A = (\mathfrak{q}_1 \cdots \mathfrak{q}_n)^{d(d-1)} = d(\Lambda/O_K)$, completing the proof.

APPENDIX D CONSTRUCTIONS USING COMPOSITUM FIELDS
Our other method for constructing suitable extensions starts from extensions which are nearly what we are looking for and applies field compositums (cf. [35, Chapter 30]). We recommend this method to build on top of fields constructed using either Theorem 1 or Theorem 2. Say we have a Galois field extension L'/K' with non-norm element $\gamma \in O_{K'}$ whose Galois group is cyclic of degree d. Let F be some other Galois number field with $F \cap L' = \mathbb{Q}$. Then $\mathrm{Gal}(L'F/K'F) \cong \mathrm{Gal}(L'/K')$ and γ is a non-norm element in L'F/K'F. Relabelling this extension as L/K and letting θ denote the cyclic generator of the Galois group gives a cyclic field extension with non-norm γ such that [L : K] = d and $[K : \mathbb{Q}] = [K' : \mathbb{Q}] \cdot [F : \mathbb{Q}]$. The relations among these fields are illustrated in Fig. 4(a).
One can generalize this method to the case where the base field cannot be written conveniently as a compositum of two fields. Let L'/K' be a cyclic Galois extension of degree d with non-norm element γ and let K be another Galois number field which contains K'. Then KL'/K is a cyclic Galois extension of degree k for some k dividing d, and in particular if $K \cap L' = K'$ then k = d since the fields are linearly disjoint above K'. See Fig. 4(b) for the relations among these fields.
We give example algebras of dimensions 576, 768 and 1152 in Table I with less restrictive dimension using field compositum techniques. We propose two alternative ways of applying the field compositums of Fig. 4(a). The first uses Theorem 2 to make an algebra which already has large dimension by selecting a large center K' and small extension L', then composes a small field F onto K' and L' to tweak the total dimension. The second creates algebras by selecting small fields L' and K' using Theorem 1 and composing both with a large field F. We begin with an example of the first method that achieves dimension 768. Let L' be a degree two extension of the field $K' = \mathbb{Q}(\zeta_{64})$ chosen by Theorem 2 with non-norm root of unity γ, so that the corresponding algebra A' has dimension 128. Compose both L' and K' with the field $F = \mathbb{Q}(\zeta_9)$, denoting the compositums by L and K respectively. Then γ is still a non-norm element in the extension L/K, a degree two extension that is cyclic and Galois, and the algebra A = (L/K, θ, γ) is a cyclic algebra of dimension 6 × 128 = 768, as required. We observe that here the center K corresponds to the fields with fast operations used in [42].
Our final method of composing large degree fields onto small degree extensions is aimed at targeting odd module ranks. Begin by choosing the desired module rank d as a (likely small) odd prime. Then set $K' = \mathbb{Q}(\zeta_d)$ and pick L' as a cyclic Galois extension of K' in which the d-th root of unity is a non-norm element using Theorem 1. Let $F := \mathbb{Q}(\zeta_{2^k})$ and again let L and K denote its compositum with L' and K' respectively. Then A = (L/K, θ, γ) is a cyclic algebra with $n := [K : \mathbb{Q}] = (d-1)2^{k-1}$ and d = [L : K] a small prime. The form of the total dimension $N = d^2(d-1)2^{k-1}$ constrains our choice of dimension, but for examples of cryptographically relevant sizes with d = 3 one can consider setting k = 6 or k = 7 to achieve dimension N = 576 or N = 1152 respectively. If one required additional flexibility of dimension one could also consider increasing d or replacing the power-of-two cyclotomic field with any cyclotomic field whose intersection with $\mathbb{Q}(\zeta_d)$ is precisely $\mathbb{Q}$. This method comes with the subtle drawback that the module rank d is also present in the dimension of the base field K, which precludes the case where one wants a large module rank and a small center. On the other hand, since such cases are excluded in our security proof we view this drawback as minor.
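The dimension counts quoted for the compositum constructions can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the helper names `euler_phi` and `algebra_dimension` are hypothetical, and it assumes the cyclotomic fields being composed are linearly disjoint, so that the degree of the center is the product of the totients.

```python
def euler_phi(n: int) -> int:
    """Euler's totient, i.e. the degree of the n-th cyclotomic field over Q."""
    result, k, p = n, n, 2
    while p * p <= k:
        if k % p == 0:
            while k % p == 0:
                k //= p
            result -= result // p
        p += 1
    if k > 1:
        result -= result // k
    return result


def algebra_dimension(center_degree: int, d: int) -> int:
    """Dimension over Q of a cyclic algebra with [K:Q] = center_degree and [L:K] = d."""
    return center_degree * d * d


# First method: center Q(zeta_64) composed with F = Q(zeta_9), module rank d = 2.
first = algebra_dimension(euler_phi(64) * euler_phi(9), 2)

# Odd-rank method: center Q(zeta_d) composed with Q(zeta_{2^k}) for d = 3.
odd = {k: algebra_dimension(euler_phi(3) * euler_phi(2 ** k), 3) for k in (6, 7)}

print(first, odd)  # 768 {6: 576, 7: 1152}
```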
We would like q to be of roughly appropriate cryptographic size (say between 3000 and 15000 as a soft estimate, once again presuming parameters similar to those of NewHope or KYBER). Having q split completely in L is not as straightforward as in K because L is not a cyclotomic field, so we return to our examination of the proof of Theorem 1. Recall that in this proof the extension field L is a subfield of $K(\zeta_{mq'})$ for some prime integer q' satisfying $q' \equiv 1 \bmod m$ and, for $m = p^a$, $p^{a+1}$ does not divide q' − 1. That is, $p^a$ is the highest power of p that divides q' − 1. We have several methods to ensure that q splits completely in L, of which we start with the most naive.
1) Naive Method: For our general method we rely on the following fact: if $\mathfrak{q}_i$ is an ideal of $O_K$ which splits completely in an extension M/K then it splits completely in any intermediate field L of M/K. As it is conceptually simpler to apply this idea to the integer q than to the $O_K$-ideals $\mathfrak{q}_i$ we use a simpler statement, that if q splits completely in some M containing L then it splits completely in L. This gives us an easy way to find some q that splits completely by examining a cyclotomic field that contains L: let $K = \mathbb{Q}(\zeta_m)$ and let $M = K(\zeta_{q'})$. Then since $q' \equiv 1 \bmod m$ it follows that $M = \mathbb{Q}(\zeta_{mq'})$. Thus q splits completely in M if and only if $q \equiv 1 \bmod mq'$ and consequently splits completely in our extension L if $q \equiv 1 \bmod mq'$. Since there are infinitely many primes equal to 1 mod mq' this recipe always provides a prime q that splits completely in L. The upside of this method is that it is both very general and simple, since all candidate fields L we construct are contained in a larger cyclotomic field. Theoretically, this method can be extended to any abelian extension of $\mathbb{Q}$ using the partial converse of the Kronecker-Weber Theorem. However, using the Kronecker-Weber Theorem constructively is not as straightforward as picking q' as in the proof of Theorem 1, so this extension to general abelian L is slightly contrived.
The downside to this method is that it often seems to result in unrealistically large q. Since $q' \equiv 1 \bmod m$ and $q' \not\equiv 1 \bmod p^{a+1}$, q' must be chosen carefully and there are not many 'small' primes satisfying these conditions. For example, in our quadratic extension case with m = 512 the smallest prime that is 1 mod m but not 1 mod 2m is q' = 7681. The smallest q which is 1 mod (512 · 7681) has to be bigger than 512 · 7681 = 3932672, which is inappropriately large for lattice cryptography. Of course, one could be lucky here and have much smaller q for different choices of L and K, but in general we regard this as a theoretical result rather than a practical method. Even for smaller 2-power cases such as m = 128 one must set q' = 641, which leads to a smallest valid prime of q = 820481.
Remarkably, the situation is much better in the cubic case; $K = \mathbb{Q}(\zeta_{81})$ gives q' = 163 as a suitable prime and q = 26407 still splits completely. This is perhaps slightly too large, but certainly not so much so that it is completely impractical. Nonetheless, we move on to a better method for quadratic cases.
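The naive recipe can be sketched as a short search. This is an illustrative sketch rather than a recommended parameter generator: the function names are hypothetical, primality is tested by plain trial division, and the auxiliary prime is written `q_aux` in place of q'.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for the small sizes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True


def smallest_auxiliary_prime(p: int, a: int) -> int:
    """Smallest prime q' with q' = 1 (mod p^a) and q' != 1 (mod p^(a+1))."""
    m = p ** a
    candidate = m + 1
    while not (is_prime(candidate) and (candidate - 1) % (p * m) != 0):
        candidate += m
    return candidate


def smallest_split_prime(m: int, q_aux: int) -> int:
    """Smallest prime q = 1 (mod m * q'), which splits completely in Q(zeta_{m q'})."""
    step = m * q_aux
    candidate = step + 1
    while not is_prime(candidate):
        candidate += step
    return candidate


# The three cases discussed in the text: m = 512, m = 128 and m = 81.
for p, a in ((2, 9), (2, 7), (3, 4)):
    m = p ** a
    q_aux = smallest_auxiliary_prime(p, a)
    print(m, q_aux, smallest_split_prime(m, q_aux))
```

For m = 81 the search returns q' = 163 and q = 26407; for the power-of-two cases the resulting q necessarily exceeds m · q', which illustrates why this method is regarded as theoretical.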
2) Quadratic Case: In the case where L/K (with $K = \mathbb{Q}(\zeta_{512})$) is a quadratic extension we are able to choose substantially smaller q by examining the unique quadratic subfield of $E' := \mathbb{Q}(\zeta_{q'})$. We rewrite M as the compositum of E' and K, and observe that since our chosen L contains K our method of choosing L as a subfield of M allows us to write L = EK for a subfield E of E'. In the case where L is a degree two extension of K we know that E is a quadratic field, and since E' is a prime cyclotomic field we have an explicit description for its unique quadratic subfield E; namely that $E = \mathbb{Q}(\sqrt{q'})$ if $q' \equiv 1 \bmod 4$ and $E = \mathbb{Q}(\sqrt{-q'})$ if $q' \equiv 3 \bmod 4$. It is a standard fact that the discriminant $d_E$ of E is q' if $q' \equiv 1 \bmod 4$ and −q' otherwise. Finally, we know that a prime q splits completely in E if and only if the congruence $d_E \equiv x^2 \bmod q$ has a solution, i.e. if $d_E$ is a square mod q. Plugging in the prime numbers q = 12289 and q' = 7681 that are common in cryptography we see that $q' \equiv 1 \bmod 4$ and that $7681 \equiv 3788^2 \bmod 12289$, so that q = 12289 splits completely in E, K, and thus L, as required. Since this prime is explicitly the prime used in NewHope for all parameter sets we view this method as a substantial improvement on the previous technique.
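The quadratic-case check is easy to reproduce. The following minimal sketch uses Euler's criterion to test whether the discriminant of E is a square modulo q; the helper name `is_square_mod_prime` is hypothetical and the variable `q_aux` stands for q'.

```python
def is_square_mod_prime(a: int, q: int) -> bool:
    """Euler's criterion: a nonzero residue a is a square mod an odd prime q
    exactly when a^((q-1)/2) = 1 (mod q)."""
    a %= q
    return a != 0 and pow(a, (q - 1) // 2, q) == 1


q, q_aux, m = 12289, 7681, 512

# Discriminant of the quadratic subfield E of Q(zeta_{q'}).
d_E = q_aux if q_aux % 4 == 1 else -q_aux

assert q % m == 1                   # q splits completely in K = Q(zeta_512)
assert is_square_mod_prime(d_E, q)  # q splits completely in E
assert pow(3788, 2, q) == q_aux     # the explicit square root quoted in the text
```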
3) Quartic Fields: Again, we use the method of describing L as a compositum MK/K. Now, M will be a quartic subfield of the field $\mathbb{Q}(\zeta_{q'})$ and one can establish the linearly disjoint nature of M and K required to express L as this compositum by e.g. examining their discriminants: since K is a power-of-two cyclotomic field the only prime appearing in its discriminant is 2, and since M is a subfield of $\mathbb{Q}(\zeta_{q'})$ the only prime in its discriminant is q'. Since they have coprime discriminants they are linearly disjoint, and since ramified primes are factors of the discriminant we have a relatively easy way to discount q being ramified ($q \neq 2, q'$), so the remaining case to concern ourselves with is q being inert.
Since the discriminants are coprime we have a method for explicitly describing the integral basis of L = M K; the integral basis for K is clear, and an integral basis for M in fixed dimension can be computed relatively easily since it has degree 4. Then, the product of their integral bases is an integral basis for L. Now one only needs to check whether q splits completely in M , since splitting in K is well understood. We are unable to provide a general method for finding such q, but an easy computation reveals that for q = 10753 and K = Q(ζ 256 ) there is a quartic field M such that q splits completely in M and K and hence L. Since we have a relatively small range in which we wish to place q and M has low degree we do not consider the cost of this search as a large drawback since it can be done efficiently on computational software such as SAGE or PARI.
Remark 13. In fact, this quartic method can be applied to other instances where we do not have an explicit description of the subfields of $K(\zeta_{q'})$ which have degree d over K: define the families of q which split completely in K, then check whether those q split completely in L using computational software. Since $q \equiv 1 \bmod m$ and m is relatively large, there will not be many q to check of appropriate size for lattice cryptography, and so we conclude that this method is sufficient for fixed choices of fields L, K for which a satisfactory q exists.

APPENDIX F RESTRICTING THE SECRET SPACE
In Lemma 14 we need to use a fact that is implicit in the search-decision reduction of [2]: for uniformly random $v \in R_i$ and an incorrect guess g of the secret s modulo $R_i$, the distribution of v(g − s) is uniformly random. In the ring and module cases, the secret space is decomposed into a direct product of finite fields, so it is clear that v(g − s) is uniformly random in each finite field for $g \neq s$.
In our case, an appeal to Wedderburn's theorem demonstrates that, since for our parameter choices each $R_i$ is a central simple algebra over $\mathbb{F}_q$, it is isomorphic to the matrix ring $M_{d\times d}(\mathbb{F}_q)$, for which it is not true in general that v(g − s) is uniformly random for $g \neq s$; in fact, it is uniformly random if and only if g − s is invertible. Thus we restrict our secret s so that s mod $R_i$ lies in a set $G_i$ with the property that $g \neq h \in G_i$ implies g − h is an invertible matrix. Applying this restriction for each i places s ∈ G for a set $G = G_1 \times \cdots \times G_n$ of size $|G| = \prod_i |G_i|$. Now, an incorrect guess $g \in G_i$ of s mod $R_i$ results in a distribution of v(g − s) which is uniformly random mod $R_i$. We will call such a set G a pairwise difference set.
We also need to guarantee that there exist sufficiently large choices of G. A simple method for constructing a valid $G_i$ is by fixing some arbitrary embedding β of $\mathbb{F}_{q^d}$ into $M_{d\times d}(\mathbb{F}_q)$ and letting $G_i$ equal the image of this embedding, such that $|G_i| = q^d$ and $|G| = q^{nd}$. Indeed, a $G_i$ constructed in this way is maximal because any set of matrices in $M_{d\times d}(\mathbb{F}_q)$ of size at least $q^d + 1$ contains two matrices with the same first row, whose difference is therefore non-invertible.
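A toy instance makes the field-embedding construction concrete. The sketch below assumes the illustrative parameters q = 3 and d = 2 and the irreducible polynomial x^2 + 1 over the field with three elements (none of which come from the text); it embeds the field with nine elements into 2 × 2 matrices via multiplication matrices and checks that all pairwise differences are invertible.

```python
from itertools import product

q, d = 3, 2  # toy parameters, assumed for illustration


def embed(a: int, b: int):
    """Matrix of multiplication by a + b*x on F_3[x]/(x^2 + 1) in the basis (1, x)."""
    return ((a % q, (-b) % q), (b % q, a % q))


def det_mod_q(M) -> int:
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % q


def subtract(X, Y):
    return tuple(tuple((x - y) % q for x, y in zip(rx, ry)) for rx, ry in zip(X, Y))


G = [embed(a, b) for a, b in product(range(q), repeat=d)]

# The image has q^d elements and every pairwise difference is invertible.
assert len(set(G)) == q ** d
assert all(det_mod_q(subtract(X, Y)) != 0 for X in G for Y in G if X != Y)

# All first rows are distinct, so no further matrix can be added (pigeonhole).
assert len({M[0] for M in G}) == q ** d
```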
There are a number of choices of embedding β, and thus set $G_i$, equal to the number of irreducible polynomials of degree d in $\mathbb{F}_q[x]$, which can be calculated by the necklace polynomial and in general will vastly exceed q. We make clear that our reduction will take the decision CLWE problem for arbitrary secret s to the search CLWE problem where s ∈ G for arbitrary fixed G, which we denote by $\mathrm{CLWE}_{q,\Sigma_\alpha,G}$. Thus, our reduction states that the decision problem is as hard as the search problem for the hardest choice of G, precluding obvious attacks on the unique case where $G = O_{L_q}^{\vee}$ and the CLWE problem with s ∈ G corresponds to d parallel copies in L of the RLWE problem. For a general set G, s ∈ G will not provide parallelization since the elements of G need not have the property of L that they are entirely contained in one u coordinate of A. Additionally, even though elements of G constructed this way commute with one another, they do not lie in the center of Λ and the multiplication a · s in the CLWE instance will not be a commutative operation.
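The count of monic irreducible polynomials mentioned above is given by the necklace polynomial, which is straightforward to evaluate. The sketch below is a plain Möbius-sum implementation with hypothetical helper names; the modulus 3329 is used purely as an example of a cryptographically sized prime.

```python
def mobius(n: int) -> int:
    """Moebius function computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result
        p += 1
    return -result if n > 1 else result


def count_monic_irreducibles(q: int, d: int) -> int:
    """Necklace polynomial: (1/d) * sum over k | d of mu(k) * q^(d/k)."""
    total = sum(mobius(k) * q ** (d // k) for k in range(1, d + 1) if d % k == 0)
    return total // d


print(count_monic_irreducibles(3, 2))     # 3: x^2+1, x^2+x+2, x^2+2x+2
print(count_monic_irreducibles(3329, 2))  # (q^2 - q) / 2, far larger than q
```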
Of course, fixing a G of size $q^{nd}$ restricts the size of the secret space by a factor of $q^{nd}/q^{nd^2}$, a substantial loss in size even for fixed, small d. For concrete parameter settings, this may result in a much easier problem, but asymptotically it is still exponential in n and thus establishes a suitable hardness property for decision CLWE. Of course, attacks based on exhaustive search are unlikely to represent the best attacks on the CLWE problem, so this may or may not substantially aid an attacker in practice.
In fact, there is no a priori reason why G i should be a field, or even closed under multiplication. For example, fixing a pair of invertible matrices M 1 , M 2 and replacing G i with M 1 · G i · M 2 = {M 1 XM 2 |X ∈ G i } results in a new set of size q d whose pairwise differences are all invertible but is not multiplicatively closed in general. Although the field embedding technique is perhaps the most elegant way of building G i , and certainly the most constructive, it may transpire that taking s from some set with less algebraic structure is advantageous in terms of the hardness of the resulting search problem. One can also construct the valid set G i + X by adding a fixed matrix X to each element of G i , but this technique is somewhat constrained by the fact that LWE samples are additive in the secret s (e.g. one could just add a·X into the second coordinate of the resulting samples).
Although this restriction is not ideal, we have a remark about the implications on the security of the CLWE problem. Restricting the secret space in (R)LWE problems is not an uncommon idea: ternary secrets, where each coordinate of s lies in {−1, 0, 1}, are used in the NIST candidate LAC [43] amongst others, and security whilst restricting the secret to orders or subfields is discussed in [44], and to other K-lattices in [45]. Overall, we suspect that the decision CLWE problem is polynomial time equivalent to the search CLWE problem without restriction on s, in particular when the number of samples is small as in our applications in Section V, and that the restriction is a function of our reduction technique rather than some causal property of the CLWE distribution. For the purposes of constructing a cryptosystem, we assume that this reduction implies that the decision CLWE problem is hard.

APPENDIX G ESTIMATING THE MULTIPLICATION COMPLEXITY
The overall flow to compute the multiplication is depicted in Fig. 5, which is explained in detail in the sequel.

A. Algorithm for Multiplication in Cyclic Algebras
We recall some details necessary to understand our multiplication algorithm. Recall that in the explicit constructions of Theorem 2 the base field K is cyclotomic and q is a prime integer chosen so that q splits completely in $O_K$ as $qO_K = \mathfrak{q}_1 \cdots \mathfrak{q}_n$, where n is the dimension of K as an extension of $\mathbb{Q}$. Furthermore, the degree of L over K is a typically small d. Then, following the CRT-like decomposition of Lemma 12 we write $\Lambda_q \cong R_1 \times \cdots \times R_n$. We will show that each $R_i$ is a skew polynomial ring over $\mathbb{Z}_q$, and in particular a skew polynomial ring for which we can apply the algorithms of [13] to compute multiplication independently in each $R_i$ in $O(d^{\omega})$ operations in $\mathbb{Z}_q$, which output elements whose u coordinates are $O_{K_q}$-linear combinations of the elements of some arbitrary normal basis for $O_{L_q}$ over $O_{K_q}$. We remark that the representation as a skew polynomial ring need not contradict the fact that we viewed the rings $R_i$ as matrix rings in Section IV-B, since computing matrix multiplication can be reduced to the problem of computing multiplication of skew polynomials (see [13]). Since ω ≤ 2.373 and we can compute the multiplication in each $R_i$ in parallel, this leads to a complexity of approximately $O(N d^{0.373})$. However, we must also compute the complexity of the splitting isomorphism. Note that [CLB17] is referred to as [13].

B. The Rings R i
In order to apply the algorithm of [13], we must confirm that each R i satisfies the following conditions: • R i is the quotient of a skew polynomial ring with center O K /q i by a polynomial in the form X d − γ.
The first of the conditions follows immediately from the definitions of a skew polynomial ring and a cyclic algebra. The veracity of the latter conditions will depend on how the prime ideal $\mathfrak{q}_i$ of $O_K$ splits in $O_L$ as $\mathfrak{q}_i O_L$. Since $\mathfrak{q}_i$ is prime in K and L/K is Galois, we know that $\mathfrak{q}_i O_L = \prod_{j=1}^{g} \mathfrak{q}_{i,j}^{e}$ for some prime ideals $\mathfrak{q}_{i,j}$ in $O_L$ and integers e, g satisfying efg = [L : K] = d, where f denotes the inertial degree. Assuming that L is constructed as a subfield of a cyclotomic field as in [25], it is a Galois number field and it follows that each $\mathfrak{q}_i$ splits with the same e, f, and g. Furthermore, since they are coprime as ideals of $O_K$, their factorizations in L are disjoint. Thus, we are left to consider three cases.
We first consider the case where each $\mathfrak{q}_i\mathcal{O}_L$ remains prime in $\mathcal{O}_L$. It follows that $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ is a finite field. In this case it is easy to see that $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ is a finite field extension of $\mathcal{O}_K/\mathfrak{q}_i \cong \mathbb{F}_q$ and consequently, because the norm map is surjective over finite field extensions, that $\gamma$ is a norm. Here it is clear that the algorithms of [13] can be applied.
The second case we consider is $g = d$, $e = f = 1$. Now each $\mathfrak{q}_i\mathcal{O}_L$ splits completely in $\mathcal{O}_L$ into a product of prime ideals $\mathfrak{q}_{i,1} \cdots \mathfrak{q}_{i,d}$. By the CRT we have
\[ \mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L \cong \prod_{j=1}^{d} \mathcal{O}_L/\mathfrak{q}_{i,j}, \]
where each $\mathcal{O}_L/\mathfrak{q}_{i,j} \cong \mathbb{F}_q$, and it follows that $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ is an étale $\mathcal{O}_K/\mathfrak{q}_i$-algebra. We are left to show that $\gamma$ is a norm, which we show via the stronger condition that the norm map in this extension is surjective. By the CRT, $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ is isomorphic to a direct product of $d$ copies of $\mathbb{F}_q$. Since the embeddings of $L$ cyclically permute the ideal factors of $\mathfrak{q}_i$, it follows that the relative norm of an element $(x_1, \dots, x_d) \in \prod_{j=1}^{d} \mathcal{O}_L/\mathfrak{q}_{i,j}$ is precisely $\prod_{k=1}^{d} x_k \bmod q$. It is easy to see that this norm is surjective (because any $x \in \mathbb{F}_q$ is the norm of e.g. $(1, 1, \dots, x)$) and now once again we can apply the multiplication algorithms of [13].
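The norm computation in the completely split case can be checked numerically. The following is a minimal sketch, assuming toy parameters `q = 17` and `d = 4` (chosen here purely for illustration, not taken from the construction) and modelling $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ as tuples in $\mathbb{F}_q^d$ on which the Galois generator acts by a cyclic shift. All function names are hypothetical.

```python
from math import prod

q, d = 17, 4  # hypothetical toy parameters, for illustration only


def theta(x):
    """Galois generator modelled as a cyclic shift of the d coordinates."""
    return x[-1:] + x[:-1]


def norm_via_galois(x):
    """Coordinatewise product of all d Galois conjugates of x."""
    acc, y = (1,) * d, x
    for _ in range(d):
        acc = tuple(a * b % q for a, b in zip(acc, y))
        y = theta(y)
    return acc


def relative_norm(x):
    """The same norm written as a scalar: the product of the coordinates mod q."""
    return prod(x) % q


def preimage(y):
    """An explicit element of norm y, namely (1, ..., 1, y)."""
    return (1,) * (d - 1) + (y % q,)


# Every element of F_q is hit by the norm map.
norms_hit = {relative_norm(preimage(y)) for y in range(q)}
```

The product of all conjugates is a constant vector, i.e. the image of a scalar in $\mathbb{F}_q$, which is why the norm can be read as a single product of coordinates.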
Intermediate cases, where $\mathfrak{q}_i$ splits into a product of prime ideals with the same norm such that $e = 1$, $fg = d$, can be handled using a straightforward combination of these two methods.
The final case to consider is the ramified case, when $e \neq 1$. Now the factorization of $\mathfrak{q}_i\mathcal{O}_L$ contains some power $\mathfrak{p}_i^{e_i}$ of a prime $\mathcal{O}_L$-ideal $\mathfrak{p}_i$. In this case, we are not able to verify that the necessary conditions for the algorithms of [13] hold. However, we observe that the ideal $\langle q \rangle$ ramifies in $\mathcal{O}_L$ if and only if $q$ divides the discriminant of $\mathcal{O}_L$. Since only a finite number of primes divide this discriminant, we restrict ourselves to considering the cases where $q$ does not ramify. We emphasize that in the main cases of interest, where $K$ is the $m$th cyclotomic field with $m$ having small divisors and $[L : K]$ is small, it is particularly unlikely that the large modulus $q$ typical in cryptography divides the discriminant of $L$. Indeed, when we pick $L$ as a subfield of $K(\zeta_s)$ for some large prime integer $s$ using the techniques of [25] as in Theorem 2, it is easy to quantify which primes potentially ramify for a fixed choice of fields: either $s$ or the primes smaller than or equal to the divisors of $m$. As an easy example, the modulus $q = 12289$ does not ramify in the example algebras given in Section III-D achieving dimension 1024.

C. Complexity of the CRT Style Isomorphism
We have shown that we may apply the algorithms of [13] to compute the multiplication operation in each $R_i$ in complexity $O(d^{\omega})$. We are left to consider the complexity of the isomorphism defined by Lemma 12 generating the rings $R_i$. Essentially, this operation is a coordinatewise split of the $u$ coordinates of $\Lambda_q = \bigoplus_{j=0}^{d-1} u^j \mathcal{O}_{L_q}$, where each entry is split into its mod $\mathfrak{q}_i\mathcal{O}_L$ parts. That is, the isomorphism maps
\[ \sum_{j=0}^{d-1} u^j x_j \;\longmapsto\; \Big( \sum_{j=0}^{d-1} u^j \left( x_j \bmod \mathfrak{q}_i\mathcal{O}_L \right) \Big)_{i=1}^{n}. \]
Splitting one element of $\mathcal{O}_K$ can be done in time $O(n \log n)$ using the CRT algorithm of [41] when $K$ is a cyclotomic field of degree $n$. However, $L$ is not a cyclotomic field, but instead a small degree $d$ cyclic extension of a cyclotomic field. Furthermore, we are trying to split the elements of $L$ modulo ideals of $K$ extended to those of $L$. We do not know of an existing general, efficient way of doing this. The naive estimate for an optimal method would take time $O(nd \log(nd))$, where $nd$ is the degree of $L$, but we suspect something this efficient is impossible. We have to perform $d$ such splits, which would result in a total complexity of $O(nd^2 \log(nd)) = O(N \log(N/d))$. Note that this compares relatively closely with the $O(N d^{0.373})$ claimed for the multiplication step. Since these steps are sequential rather than parallel, which of them dominates the asymptotic complexity would depend on the exact relationship between $n$ and $d$, but the result is an operational complexity essentially equivalent to that of the ring variant.
Of course, the discussion of the previous paragraph relies on our implausibly low estimate of $O(nd \log(nd))$ for the complexity of the CRT split, and so we do not claim such efficiency. Instead, we present techniques in the following sections to work around the problem of splitting the $L$ part modulo the $K$ ideals in the factorization of $q$. Our methods are particularly efficient in the case where $q$ splits completely in $L$, but can be generalized to arbitrary splitting at only a small cost.

D. Fast Cryptography When q Splits Completely in L
We consider an explicit method for implementing fast cryptography in the special case where the ideal $\langle q \rangle$ splits completely in $\mathcal{O}_L$. By construction, $q\mathcal{O}_K = \prod_i \mathfrak{q}_i$ in $\mathcal{O}_K$, so in this case we have
\[ q\mathcal{O}_L = \prod_{i=1}^{n} \prod_{j=1}^{d} \mathfrak{q}_{i,j}. \]
We recall some facts about the extension $\mathcal{O}_{L_q}$ of $\mathcal{O}_{K_q}$. It is clear that the extension is cyclic of degree $d$, with Galois group generated by $\theta$. By the CRT,
\[ \mathcal{O}_{L_q} \cong \prod_{i=1}^{n} \prod_{j=1}^{d} \mathcal{O}_L/\mathfrak{q}_{i,j} \cong \prod_{i=1}^{n} \mathbb{F}_q^{d}, \]
where operations on the finite field products are applied coordinatewise. We represent the CRT decomposition of $\mathcal{O}_{L_q}$ as $(\mathbb{F}_q^d)^n$, where each copy of $\mathbb{F}_q^d$ corresponds to the extension $\prod_j \mathcal{O}_L/\mathfrak{q}_{i,j}$ of $\mathcal{O}_K/\mathfrak{q}_i$. In the finite field representation of $\prod_j \mathcal{O}_L/\mathfrak{q}_{i,j}$, the elements of $\mathcal{O}_K/\mathfrak{q}_i$ embed as elements of $\mathbb{F}_q^d$ with the same entry in each coordinate, e.g. $(x, x, \dots, x)$, corresponding to scalars over $(\mathbb{F}_q)^d$. This can be seen from the following argument: for $k \in \mathcal{O}_K$, $k = x \bmod \mathfrak{q}_i$ implies $k - x \in \mathfrak{q}_i$. Then it follows that $k - x \in \mathfrak{q}_{i,j}$ and thus $k = x \bmod \mathfrak{q}_{i,j}$ for each $j$.

Furthermore there is a simple, explicit description of the action of $\theta$ in this representation: since $\theta$ cyclically shifts the ideals in the factorization of $\mathfrak{q}_i$, one can order each copy of $\mathbb{F}_q^d$ so that the action of $\theta$ on $(\mathbb{F}_q^d)^n$ is a cyclic shift of the coordinates of each of the $n$ copies of $\mathbb{F}_q^d$ concurrently. We exhibit this with a trivial example: set $d = 3$, $n = 2$. Then the action of $\theta$ on $(\mathbb{F}_q^3)^2$ is
\[ \theta\big( (x_1, x_2, x_3), (y_1, y_2, y_3) \big) = \big( (x_3, x_1, x_2), (y_3, y_1, y_2) \big). \]
Over $\mathbb{F}_q$, each copy of $\mathbb{F}_q^d$ has the standard basis $e_1, \dots, e_d$, where $e_i = (0, \dots, 1, \dots, 0)$ denotes the $i$th element of the standard basis of dimension $d$. Furthermore, this basis is orthonormal in the sense that $e_i \cdot e_j = e_i$ for $i = j$ and $0$ otherwise, and cyclic in the sense that $\theta(e_i) = e_{i+1}$ with indices taken modulo $d$ (i.e. normal), since the Galois group $\langle \theta \rangle$ of $L$ over $K$ permutes the factors $\mathfrak{q}_{i,j}$ of $\mathfrak{q}_i\mathcal{O}_L$ for each $i$. Because the CRT splits $\mathcal{O}_{L_q}$ into a direct product within which operations are computed coordinatewise, we can extend this to a basis of $\mathcal{O}_{L_q}$ over $\mathcal{O}_{K_q}$ in the finite field representation by concatenating $n$ copies of this basis together, denoting by $e_i^n$ the vector $(e_i, e_i, \dots, e_i)$ of dimension $nd$. This basis is still cyclic, with $\theta$ operating independently on each of the $n$ copies of $\mathbb{F}_q^d$ and hence the $n$ copies of $e_i$. Concatenating the bases in this way also preserves the orthonormal property.

Denote the above basis by $\ell_1, \dots, \ell_d$. Recall that the CRT-like decomposition of Lemma 12 splits each $u$ coordinate, an element of $\mathcal{O}_{L_q}$, into its mod $\mathfrak{q}_i\mathcal{O}_L$ parts. However, we already know the mod $\mathfrak{q}_i\mathcal{O}_L$ parts of each $\ell_j$ by construction. So, if we store elements of $\mathcal{O}_{L_q}$ as $\ell = \sum_{j=1}^{d} \ell_j k_j$ for $k_j \in \mathcal{O}_{K_q}$, we can split $\ell$ into its $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ components in time $O(d \cdot n \log n)$ as long as the $k_j$ elements are stored in the polynomial representation of $\mathcal{O}_{K_q}$. Consequently, we can perform the CRT style decomposition of an element in $\Lambda_q$ whose $u$ coordinates are stored in this manner in time $O(d^2 \cdot n \log n) = O(N \log(N/d^2))$.
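The cyclic, orthonormal basis and the resulting "read-off" splitting can be illustrated with a toy model. This is a minimal sketch under stated assumptions: elements of $\mathcal{O}_{L_q}$ are modelled directly as vectors in $(\mathbb{F}_q^d)^n$, elements of $\mathcal{O}_{K_q}$ are given by their $n$ CRT residues (so the $O(n \log n)$ center CRT is assumed to have been applied already), and the parameters `q = 7`, `n = 2`, `d = 3` and all function names are hypothetical.

```python
q, n, d = 7, 2, 3  # hypothetical toy parameters, for illustration only


def mul(x, y):
    """Coordinatewise product in (F_q^d)^n."""
    return tuple(a * b % q for a, b in zip(x, y))


def add(x, y):
    """Coordinatewise sum in (F_q^d)^n."""
    return tuple((a + b) % q for a, b in zip(x, y))


def theta(x):
    """Cyclically shift each of the n blocks of length d by one position."""
    out = []
    for i in range(n):
        block = x[i * d:(i + 1) * d]
        out.extend(block[-1:] + block[:-1])
    return tuple(out)


def basis_vector(j):
    """ell_j (0-indexed): n concatenated copies of the standard basis vector e_j."""
    e = tuple(1 if t == j else 0 for t in range(d))
    return e * n


def embed_center(k):
    """Embed a center element given by its n residues: each residue is repeated d times."""
    return tuple(r % q for r in k for _ in range(d))


basis = [basis_vector(j) for j in range(d)]


def combine(ks):
    """Form ell = sum_j ell_j * k_j from d center elements given by their residues."""
    total = (0,) * (n * d)
    for ell_j, k in zip(basis, ks):
        total = add(total, mul(ell_j, embed_center(k)))
    return total


def split(ell):
    """Return the n components of ell, one block of length d per prime q_i."""
    return [ell[i * d:(i + 1) * d] for i in range(n)]


ks = [(1, 2), (3, 4), (5, 6)]  # residues of k_1, k_2, k_3 modulo q_1 and q_2
ell = combine(ks)
```

In this model the component of `ell` at the $i$th prime is just the list of $i$th residues of the `k_j`, so no arithmetic beyond the center CRT is needed.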
Now we see a way to achieve fast multiplication in $\Lambda_q$. We are required to perform the CRT in each of the $d$ $u$ coordinates, after which we can plug the rings $R_i$ into the fast multiplication algorithm of [13]. Since the CRT is an isomorphism and we know the image of $\ell_i$ under the CRT, this reduces to $d$ copies of the CRT in $\mathcal{O}_K$, each with complexity $O(dn \log n)$, and therefore a total multiplication complexity of $O(N \log(N/d^2)) + O(N d^{\omega - 2})$. However, this algorithm comes with complications associated with the chosen representation of elements of $\mathcal{O}_{L_q}$, which we handle in the next section.

1) Handling Elements in the Representation:
To use the above multiplication algorithms in the scheme of Section V-B we need to be able to store the elements compactly and sample the elements efficiently. Storing elements in this form turns out to be straightforward: each $\mathcal{O}_{L_q}$ element requires storing $d$ elements of $\mathcal{O}_{K_q}$. An element of $\Lambda_q$ is $d$ elements of $\mathcal{O}_{L_q}$, so in total we store $d^2$ elements of $\mathcal{O}_{K_q}$, corresponding to one element of dimension $N = nd^2$, which is equivalent to storing $d$ elements of dimension $nd$.
We now discuss how to efficiently sample elements of $\Lambda_q$ according to an appropriate error distribution. Recall from the security reduction of Section III that the error distributions we recommend in practice are spherical or elliptical Gaussians in the coordinates of the embedding $\sigma_A$. We sample using the following result.

Proof. Recall that in the case where $K$ is a prime power cyclotomic the power basis is a rotation and a scaling of the canonical basis (see e.g. [46]), so a discrete Gaussian in the polynomial representation corresponds to a discrete Gaussian in the canonical basis as well. Order the canonical embedding of $\mathcal{O}_L$ such that elements of $\mathcal{O}_K$ embed as vectors of $n$ blocks of length $d$ that are constant within each block, e.g. $k_1 = (k_{1,1}, k_{1,1}, \dots, k_{1,1}, k_{1,2}, \dots, k_{1,n})$, where each entry $k_{i,j}$ of $k_i$ appears $d$ times. Since the $\ell_i$ form a cyclic basis, in each $d$-block the entries of $\ell_{i+1}$ are just a cyclic shift of those of $\ell_i$. For a fixed choice of basis the distribution in each $d$-block of $\ell$ is independent, because the $k_{i,j}$ are sampled independently from a spherical Gaussian. So we can consider one $d$-block of $\ell$ at a time, and write the $d$-block of $\ell_1$ as $a_1, \dots, a_d$. Since multiplication in the canonical embedding is coordinatewise and the $\ell_i$ form a cyclic basis, the first block of $\ell = \sum_i \ell_i k_i$ can be written as a matrix-vector product $A\mathbf{k}$, where the $i$th column of $A$ is the first $d$-block of $\ell_i$ (a cyclic shift of $(a_1, \dots, a_d)^T$) and $\mathbf{k} = (k_{1,1}, \dots, k_{d,1})^T$. The vector $\mathbf{k}$ is a Gaussian of parameter $r$, so $A\mathbf{k}$ has a Gaussian distribution with covariance matrix $r \cdot AA^{\dagger}$ by e.g. [9, Lemma 2.5], and if this is diagonal and constant on the leading diagonal then we are done. Due to the structure of the canonical embedding and how we picked our basis in the $\mathcal{O}_L/\langle q \rangle$ representation, we have that $a_i = \theta^i(a_1)$, and that for $i \neq j$, $\theta^i(a_1) \cdot \theta^j(a_1) = 0 \bmod q$. It follows that the off-diagonal entries of $AA^{\dagger}$ are $0$ (since a product being $0$ is preserved under representations) and the diagonal entries are $\sum_{i=1}^{d} |a_i|^2$, where $|\cdot|$ denotes the absolute value.
Hence, the first $d$-block of $\ell$ is a spherical Gaussian distribution, and since this analysis holds for any block it follows that each block of $\ell$ is a spherical Gaussian. One also needs to show that the Gaussian distribution has the same variance in each block, but this follows from the fact that the $K$-embeddings permute the mod $\mathfrak{q}_i$ values and fix the $\ell_2$ norm of $K_{\mathbb{R}}$. Explicitly, by construction each $K$-embedding modulo $q$ can be extended 'identically' onto $\mathcal{O}_L \bmod q$ in a way that fixes each $\ell_i$, so they must have the same set of values in each block (this would not be the case if we considered their norm in a global sense, and the restriction modulo $q$ is strictly necessary).
Note that the statement does not define the resulting parameter of the Gaussian outputting $\ell$, but the proof allows one to compute this: say each $k_i$ was chosen from a discrete Gaussian of parameter $r$. Then each element of $\ell$ has parameter $\sum_i |a_i|^2 \cdot r$. Computing $\sum_i |a_i|^2$ is a one-time cost for a fixed choice of $\ell_1, \dots, \ell_d$, so one can sample the required Gaussian over $\mathcal{O}_{L_q}$ of parameter $r'$ by sampling from the discrete Gaussian over $\mathcal{O}_{K_q}$ of parameter $r = r' / \sum_i |a_i|^2$. Finally, to sample elements of $\Lambda_q$ we merely sample each $u$ coordinate independently according to the above technique. If we wanted to use this method in the cryptosystem of Section V-B to attain efficient operations then we would sample and store all elements using this representation over the cyclic basis $\ell_1, \dots, \ell_d$.
Unfortunately, we are unable to generalize this theorem to the case where $\mathfrak{q}_i$ remains prime, or even to intermediate cases. In this case, there exist cyclic bases of $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ over $\mathcal{O}_K/\mathfrak{q}_i$, but since $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ is a finite field and thus has no zero-divisors the cyclic bases are not orthogonal. Consequently, the matrix $A$ does not in general give a diagonal $AA^T$ and thus the distribution of $A\mathbf{k}$ has several potentially large covariance terms. If one were able to tolerate the covariance, the method could be extended to this case. It is also possible that a cyclic basis satisfying the condition that $AA^T$ is diagonal may exist for certain choices of field, but we were not able to find such a family of fields. We note that this question can be asked as a more generic question about finite fields: let $F = \mathbb{F}_{q^d}$ be a finite field with $d > 1$ and let $\theta$ denote the Frobenius automorphism of $F$. Does there exist a cyclic basis $b_1, \dots, b_d$ with $b_j = \theta^j(b_1)$ for $F$ over $\mathbb{F}_q$ satisfying $\sum_{i=0}^{d-1} \theta^i\big(b_1 \cdot \theta^{j-k}(b_1)\big) = 0$ for all $j \neq k$ less than $d$? Here $j$ and $k$ correspond to the $(j,k)$th entry of $AA^T$. We were unable to come up with a basis satisfying this condition, but neither can we show that no such basis exists.
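Example 4. We exhibit an example of the basis $\ell_1, \ell_2$ in the simplest setting, that of a degree 2 extension of $\mathbb{Q}$. Let $L = \mathbb{Q}(i)$, with ring of integers $\mathcal{O}_L = \mathbb{Z}[i]$, and consider the ideal $\langle 5 \rangle$ of $\mathcal{O}_L$. The integer $5$ factorizes in $\mathcal{O}_L$ as $5 = (2 + i)(2 - i)$, and it is clear that $\langle 5 \rangle = \langle 2 + i \rangle \cdot \langle 2 - i \rangle$ is a decomposition into a product of prime ideals.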
Using the notation $\mathfrak{q}_1 := \langle 2 + i \rangle$, $\mathfrak{q}_2 := \langle 2 - i \rangle$, it is easy to check that $2 + i = -1 \bmod \mathfrak{q}_2$ and thus $-(2 + i) = -2 - i$ is a valid choice for $\ell_1$. Similarly, $-(2 - i) = -2 + i$ is an appropriate choice for $\ell_2$. Correspondingly, the distribution obtained by sampling $k_1, k_2 \leftarrow D_r$, the discrete Gaussian of parameter $r$ over $\mathbb{Z}_5$, and outputting $k_1 \cdot (-2 - i) + k_2 \cdot (-2 + i)$ is a discrete Gaussian over $\mathcal{O}_L \bmod \langle 5 \rangle$. Furthermore, to multiply two elements $k = k_1 \ell_1 + k_2 \ell_2$ and $g = g_1 \ell_1 + g_2 \ell_2$ modulo $\langle 5 \rangle$ one outputs $kg = (k_1 g_1 \bmod 5) \cdot \ell_1 + (k_2 g_2 \bmod 5) \cdot \ell_2$, at a cost of two operations in $\mathbb{Z}_5$. Performing the $\mathcal{O}_L \bmod \langle 5 \rangle$ CRT on each $u$ coordinate of an element of the resulting natural order $\Lambda_5$ can be done by merely reading off the $d^2 = 4$ values of $k_i$, with no additional computation.
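The claims of Example 4 are small enough to verify exhaustively. The following is a minimal sketch, assuming elements of $\mathbb{Z}[i]/\langle 5 \rangle$ are stored as pairs `(a, b)` meaning $a + bi$ with entries in $\mathbb{Z}_5$; the helper names are hypothetical.

```python
Q = 5


def gmul(x, y):
    """Multiply a + bi and c + di in Z[i] / <5>."""
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % Q, (a * d + b * c) % Q)


def gadd(x, y):
    """Add two elements of Z[i] / <5>."""
    return ((x[0] + y[0]) % Q, (x[1] + y[1]) % Q)


def scal(k, x):
    """Multiply an element of Z[i] / <5> by the scalar k in Z_5."""
    return ((k * x[0]) % Q, (k * x[1]) % Q)


ELL1 = (-2 % Q, -1 % Q)  # -2 - i
ELL2 = (-2 % Q, 1)       # -2 + i


def from_coords(k1, k2):
    """The element k1 * ell_1 + k2 * ell_2."""
    return gadd(scal(k1, ELL1), scal(k2, ELL2))


def mul_coords(k, g):
    """Coordinatewise product in the (ell_1, ell_2) basis: two operations in Z_5."""
    return ((k[0] * g[0]) % Q, (k[1] * g[1]) % Q)


pairs = [(a, b) for a in range(Q) for b in range(Q)]
# Multiplying in Z[i]/<5> agrees with multiplying coordinates in the idempotent basis.
agrees = all(
    gmul(from_coords(*k), from_coords(*g)) == from_coords(*mul_coords(k, g))
    for k in pairs for g in pairs
)
```

The check passes because $\ell_1$ and $\ell_2$ are orthogonal idempotents modulo 5 that sum to 1.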
Furthermore, this is an example where the techniques of our next section may be advantageous. We will generalize the multiplication and CRT technique so that one is free to use any basis of $\mathcal{O}_L$ over $\mathbb{Z}$, for example the basis $\{1, i\}$. In this basis it is particularly easy to sample a discrete Gaussian in the polynomial representation of $\mathcal{O}_L \bmod \langle 5 \rangle \cong \mathbb{Z}_5[x]/(x^2 + 1)$, but the resulting multiplication operation and CRT decomposition are not coordinatewise in the basis, and so a small amount of efficiency is lost at a gain in the parameter of the Gaussian. Specifically, to compute the CRT on an element $k = k_1 + k_2 \cdot i$, one has to precompute the values $i = -2 \bmod \mathfrak{q}_1$, $i = 2 \bmod \mathfrak{q}_2$ and output $(k_1 - 2k_2 \bmod \mathfrak{q}_1,\; k_1 + 2k_2 \bmod \mathfrak{q}_2)$, which requires additional operations over $\mathbb{Z}_5$.
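For comparison, the CRT in the power basis $\{1, i\}$ can be sketched as follows. This assumes the same toy setting $\mathbb{Z}[i]/\langle 5 \rangle$, with the images $i \equiv -2 \bmod \langle 2 + i \rangle$ and $i \equiv 2 \bmod \langle 2 - i \rangle$ precomputed, so that $k_1 + k_2 i$ maps to $(k_1 - 2k_2,\ k_1 + 2k_2)$ modulo 5. The function names are hypothetical.

```python
Q = 5
I_MOD_Q1 = -2 % Q  # image of i modulo <2 + i>
I_MOD_Q2 = 2       # image of i modulo <2 - i>


def crt_split(k1, k2):
    """Residues of k1 + k2*i modulo <2 + i> and <2 - i>."""
    return ((k1 + I_MOD_Q1 * k2) % Q, (k1 + I_MOD_Q2 * k2) % Q)


def crt_merge(r1, r2):
    """Inverse map: the unique k1 + k2*i mod 5 with the given residues."""
    return ((-2 * (r1 + r2)) % Q, (r1 - r2) % Q)


def mul_power_basis(x, y):
    """Multiply a + bi and c + di in Z_5[x] / (x^2 + 1)."""
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % Q, (a * d + b * c) % Q)


pairs = [(a, b) for a in range(Q) for b in range(Q)]
round_trip = all(crt_merge(*crt_split(*k)) == k for k in pairs)
# The split turns multiplication in the power basis into coordinatewise products.
homomorphic = all(
    crt_split(*mul_power_basis(x, y))
    == tuple(a * b % Q for a, b in zip(crt_split(*x), crt_split(*y)))
    for x in pairs for y in pairs
)
```

Unlike the idempotent basis, each split here costs a multiplication and two additions in $\mathbb{Z}_5$, which is the extra work the text refers to.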

E. Generalizing to non-Split q and Arbitrary Bases
In order to construct the cyclic, orthonormal basis of Theorem 8, the previous section requires that $q$ be completely split in both $K$ and $L$. However, it is possible to drop the splitting condition in $L$ and obtain fast multiplication algorithms in the general case at only a small loss of efficiency. We demonstrate the technique in this section and then briefly describe cases where a general algorithm may be superior to the one requiring that $q$ splits by discussing alternatives to Theorem 8. Observe that, regardless of the prime ideal decomposition of each $\mathfrak{q}_i\mathcal{O}_L$, under the CRT decomposition the quotient ring $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ is a vector space of dimension $d$ over $\mathbb{F}_q \cong \mathcal{O}_K/\mathfrak{q}_i$. Consequently, an arbitrary $\mathcal{O}_{K_q}$-basis $\ell_1, \dots, \ell_d$ of $\mathcal{O}_{L_q}$ can be decomposed into $n$ bases $\ell_j = (\ell_{1,j}, \dots, \ell_{n,j})$ so that each collection $\ell_{i,1}, \dots, \ell_{i,d}$ of $\mathfrak{q}_i\mathcal{O}_L$ parts is a vector space basis of dimension $d$ over $\mathcal{O}_K/\mathfrak{q}_i$. Indeed, in the split case we constructed each $\ell_i$ in this manner. Armed with this knowledge, we adapt the multiplication algorithm as follows.
Choose an arbitrary integral $\mathcal{O}_K$-basis $\ell_1, \dots, \ell_d$ of $\mathcal{O}_L$. As a precomputation phase, compute and store the images $\ell_j \bmod \mathfrak{q}_i\mathcal{O}_L$ for each $i$ and $j$. The CRT-like decomposition of Lemma 12 splits each of the $u$ coordinates of an element of $\Lambda_q$, an element of $\mathcal{O}_{L_q}$, into its mod $\mathfrak{q}_i\mathcal{O}_L$ parts. Once again, we suggest an algorithm where elements of $\mathcal{O}_{L_q}$ are stored in the form $\ell = \sum_{j=1}^{d} \ell_j k_j$ for $k_j \in \mathcal{O}_{K_q}$, i.e. as $K$-combinations of this basis. We split $\ell \in \mathcal{O}_{L_q}$ into its $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ components in time $O(d \cdot n \log n)$, since
\[ \ell \bmod \mathfrak{q}_i\mathcal{O}_L = \sum_{j=1}^{d} \left( \ell_j \bmod \mathfrak{q}_i\mathcal{O}_L \right) \left( k_j \bmod \mathfrak{q}_i \right), \]
where each $k_j \bmod \mathfrak{q}_i$ can be computed in time $O(n \log n)$ by the $K$-CRT and each $\ell_j \bmod \mathfrak{q}_i\mathcal{O}_L$ was computed in the precomputation phase. Consequently, we can perform the CRT style decomposition of an element in $\Lambda_q$ whose $u$ coordinates are all stored in this manner in time $O(d^2 \cdot n \log n)$, since we must split $d^2$ elements of $\mathcal{O}_K$. This decomposition complexity is the same as in the previous case where $q$ splits completely. Following this, each ring $R_i$ can be plugged in to the algorithm of [13] to compute the multiplication in time $O(N d^{\omega - 2})$. However, since the $\ell_i$ do not correspond to a standard orthonormal basis we incur an extra cost when reversing this transformation. Namely, each of the $u$ coordinates of each ring $R_i$ is output by the algorithm of [13] as an element of $\mathcal{O}_L \bmod \mathfrak{q}_i\mathcal{O}_L$ expressed in an arbitrary normal basis. Before reversing the decomposition we must allow for the complexity of expressing each element of the output in the basis obtained from the images of $\ell_1, \dots, \ell_d \bmod \mathfrak{q}_i\mathcal{O}_L$, as this basis was not necessarily normal. Since $\mathcal{O}_L \bmod \mathfrak{q}_i\mathcal{O}_L$ is a vector space of dimension $d$ over $\mathbb{F}_q$ this can be done via a precomputed change of basis matrix over $\mathbb{F}_q$ in time $O(d^{\omega})$, and since there are $n$ rings with $d$ coordinates each the complexity of computing this on every coordinate is $O(nd^{\omega + 1})$. The resulting multiplication algorithm has total complexity $O(N \log(N/d^2)) + O(N d^{\omega - 1})$.
While this represents only a minor asymptotic loss, especially since we expect the first term to dominate the complexity, it is likely that in practice the extra step required to recover the basis representation would cause a tangible slowdown.
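The precomputed change of basis at a single prime $\mathfrak{q}_i$ can be sketched as plain linear algebra over $\mathbb{F}_q$. This is a minimal sketch, not the algorithm of [13]: it assumes the images of the chosen basis modulo $\mathfrak{q}_i\mathcal{O}_L$ have already been written as the columns of a $d \times d$ matrix `B` in some fixed reference basis, and it uses schoolbook matrix-vector products rather than fast matrix multiplication. The function names are hypothetical, and the sample matrix is the one for the basis $\{1, i\}$ of $\mathbb{Z}[i]$ modulo 5.

```python
def mat_vec(M, v, q):
    """Matrix-vector product over F_q."""
    return [sum(m * x for m, x in zip(row, v)) % q for row in M]


def mat_inv(M, q):
    """Inverse of a square matrix over F_q (q prime) by Gauss-Jordan elimination."""
    d = len(M)
    aug = [[x % q for x in row] + [int(r == c) for c in range(d)]
           for r, row in enumerate(M)]
    for col in range(d):
        # Find a row with a non-zero pivot; a basis matrix always has one.
        pivot = next(r for r in range(col, d) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, q)  # modular inverse, Python 3.8+
        aug[col] = [x * inv % q for x in aug[col]]
        for r in range(d):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % q for x, y in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


q = 5
# Columns are the images of 1 and i in (Z[i]/<2+i>) x (Z[i]/<2-i>): 1 -> (1, 1), i -> (3, 2).
B = [[1, 3],
     [1, 2]]
B_inv = mat_inv(B, q)  # precomputed once per prime

coords = [2, 1]                      # the element 2 + i in the basis {1, i}
residues = mat_vec(B, coords, q)     # forward split into the reference basis
recovered = mat_vec(B_inv, residues, q)  # reverse step after multiplication
```

Only `mat_vec` runs per element; `mat_inv` belongs to the precomputation phase.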
An unfortunate issue with this technique is that by replacing the orthonormal basis with an arbitrary basis we have lost Theorem 8 and thus the efficient method for sampling a discrete Gaussian in the representation $\ell = \sum_j \ell_j k_j$. However, this generalization allows for the use of an arbitrary basis $\ell_1, \dots, \ell_d$, unlike in the split case in which we chose a specific basis. Since we require that elements of $\Lambda_q$ are input into the algorithm with $u$ coordinates in the form $\sum_j \ell_j k_j$, this algorithm can be combined with the cryptosystem of Section V-B in the case where there is a basis $g_1, \dots, g_d$ of $\mathcal{O}_{L_q}$ over $\mathcal{O}_{K_q}$ in which one can compute the representation $\ell = \sum_j g_j k_j$ particularly efficiently. This is because one can just sample from the usual Gaussian distribution over the polynomial basis of $\mathcal{O}_{L_q}$, compute its representation as $\ell = \sum_j g_j k_j$, and then apply the multiplication algorithm in this form. More generally, the flexible choice of basis allows both for non-split $q$ and for a user to choose their favourite $\mathcal{O}_L$ basis properties, such as a normal basis or a basis consisting of small elements. We remark that it is likely possible to construct a pair of fields $L/K$ that allow for a basis $\ell_1, \dots, \ell_d$ permitting a fast algorithm transforming from the polynomial representation of $\mathcal{O}_L$ to the representation $\sum_i \ell_i k_i$ with each $k_i$ in polynomial representation, which would allow one to bypass the complications of sampling Gaussian distributions by just sampling in $\mathcal{O}_L$ directly.

F. Generalizing to Other Centers
In the exposition of the previous section we required that $q$ splits completely in the center $K$. This corresponds to the requirement in the ring and module cases that $q$ splits completely in the field $K$, which allows the use of the NTT to compute multiplications over a direct product of finite fields. However, there has been recent progress in loosening this requirement for the NTT and allowing the modulus $q$ to be $1 \bmod n$ rather than $1 \bmod m$, where as usual $K$ is the $m$th cyclotomic field of degree $n$. For example, in the second round specification of KYBER [5] $q$ is set as $3329$ and $n = 256$, yet it still supports efficient NTT based multiplication. In such cases, $q$ is 'well' split but not completely split, and the fast NTT operations use the method of [42], where $q$ splits into some product of prime ideals $\mathfrak{q}_i$ whose norms can be small powers of $q$.
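The splitting behaviour described above can be checked with a few lines of arithmetic. The sketch below relies on the standard fact that a prime $q$ not dividing $m$ has inertial degree in the $m$th cyclotomic field equal to the multiplicative order of $q$ modulo $m$. It assumes $m = 512$ for the KYBER ring (the cyclotomic index of $x^{256} + 1$), and uses $q = 12289$ with $m = 2048$ only as a comparison case of complete splitting. The function names are hypothetical.

```python
from math import gcd


def multiplicative_order(q, m):
    """Smallest f >= 1 with q^f = 1 mod m; requires gcd(q, m) = 1."""
    if gcd(q, m) != 1:
        raise ValueError("q must be coprime to m (q would ramify otherwise)")
    if m == 1:
        return 1
    order, power = 1, q % m
    while power != 1:
        power = power * q % m
        order += 1
    return order


def euler_phi(m):
    """Degree n of the m-th cyclotomic field (naive count, fine for small m)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def splitting_type(q, m):
    """Return (g, f): q splits into g primes of norm q^f in the m-th cyclotomic field."""
    f = multiplicative_order(q, m)
    return euler_phi(m) // f, f


kyber = splitting_type(3329, 512)        # q = 1 mod 256 but not 1 mod 512
full_split = splitting_type(12289, 2048)  # q = 1 mod 2048
```

Under these assumptions the KYBER modulus gives 128 prime ideals of norm $q^2$, matching the description of a 'well' split but not completely split modulus.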
We observe that our methods can be partially generalized to this case in the following manner. Say $q\mathcal{O}_K = \prod_i \mathfrak{q}_i$ is a decomposition into prime ideals in $\mathcal{O}_K$ and there exists an efficient algorithm for fast multiplication in $\mathcal{O}_{K_q}$. We can replace our condition that $q$ splits completely in $\mathcal{O}_L$ with the condition that each ideal $\mathfrak{q}_i$ in the $\mathcal{O}_K$-factorization of $q$ splits completely into a product of $d$ prime ideals $\mathfrak{q}_i\mathcal{O}_L = \prod_{j=1}^{d} \mathfrak{q}_{i,j}$ in $\mathcal{O}_L$ of the same norm. Then, we can replicate the method of Section G-D to find a cyclic, orthonormal basis $e_1, \dots, e_d$ of $\mathcal{O}_L/\mathfrak{q}_i\mathcal{O}_L$ over $\mathcal{O}_K/\mathfrak{q}_i$ and concatenate together the bases for each $i$ to make the cyclic, orthonormal basis $\ell_1, \dots, \ell_d$ of $\mathcal{O}_{L_q}$ over $\mathcal{O}_{K_q}$. Since the basis is orthonormal, if $\ell = \sum_i \ell_i k_i$ and $g = \sum_i \ell_i g_i$ with each $k_i, g_i \in \mathcal{O}_{K_q}$, then
\[ \ell g = \sum_{i=1}^{d} \ell_i k_i g_i. \]
Since the basis is cyclic and $\theta$ fixes $\mathcal{O}_{K_q}$,
\[ \theta(\ell) = \sum_{i=1}^{d} \ell_i k_{i-1}, \]
where we define $k_0 := k_d$. Now we are able to use existing fast multiplication algorithms in $\mathcal{O}_{K_q}$ to compute operations in $\mathcal{O}_{L_q}$ by expressing elements in this basis. Represent each $x = \sum_{i=0}^{d-1} u^i x_i \in \Lambda_q$ by expressing each $x_i \in \mathcal{O}_{L_q}$ in the $\ell_j$ basis. Then, to multiply $x$ and $y$ in $\Lambda_q$ one only has to compute multiplications in $\mathcal{O}_{K_q}$, since the operations required are just computing the non-commutative relation $\ell u = u\theta(\ell)$, which merely permutes the $k_i$ using $\theta$, and computing multiplication and addition, which can be done coordinatewise in the orthonormal $\ell_i$ basis. Each $L$ multiplication requires $d$ multiplications in $K$, and each $u$ coordinate of $\Lambda$ requires $d$ multiplications in $L$.
Consequently, naive multiplication in $\Lambda_q$ takes $d^3$ instances of the efficient $\mathcal{O}_{K_q}$-multiplication algorithm we have access to. For specific $K$-multiplication algorithms it is likely that this process can be streamlined; the intention of this section is merely to demonstrate that one can build efficient $\Lambda_q$ operations from more general efficient operations over the center in the same manner that the techniques of Section G-D used the CRT method.
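The naive $d^3$ multiplication can be sketched in a toy model. This is a minimal sketch under stated assumptions: the center $\mathcal{O}_{K_q}$ is replaced by a single copy of $\mathbb{F}_q$ (one CRT slot), each $u$ coordinate is stored by its $d$ coefficients in the cyclic, orthonormal basis, $\theta$ acts as a cyclic shift of those coefficients, and the rules $\ell u = u\theta(\ell)$ and $u^d = \gamma$ give $(u^i a)(u^j b) = u^{i+j}\theta^j(a)b$. The parameters `q = 7`, `d = 3`, `gamma = 3` and the function names are hypothetical.

```python
q, d, gamma = 7, 3, 3  # hypothetical toy parameters, for illustration only


def theta(v, times=1):
    """Apply theta `times` times: cyclic shift of coefficients in the cyclic basis."""
    t = times % d
    return v[-t:] + v[:-t] if t else list(v)


def alg_mul(x, y):
    """Multiply x = sum_i u^i x_i and y = sum_j u^j y_j.

    Uses (u^i a)(u^j b) = u^(i+j) theta^j(a) b together with u^d = gamma.
    Costs d^2 coordinate pairs times d center multiplications each.
    """
    out = [[0] * d for _ in range(d)]
    for i, a in enumerate(x):
        for j, b in enumerate(y):
            # Coordinatewise product in the orthonormal basis.
            term = [s * t % q for s, t in zip(theta(a, j), b)]
            if i + j < d:
                idx, scale = i + j, 1
            else:
                idx, scale = i + j - d, gamma  # reduce u^(i+j) using u^d = gamma
            out[idx] = [(o + scale * t) % q for o, t in zip(out[idx], term)]
    return out


zero, one = [0] * d, [1] * d
u = [zero, one, zero]                 # the element u
ell = [[1, 2, 3], zero, zero]         # an element of O_{L_q} in the u^0 coordinate
x = [[1, 2, 3], [4, 5, 6], [0, 1, 2]]
y = [[2, 0, 1], [3, 3, 1], [6, 5, 4]]
z = [[1, 1, 0], [0, 2, 5], [4, 0, 3]]
```

In this model `alg_mul(ell, u)` places the shifted coefficients of `ell` in the $u^1$ coordinate, which is exactly the relation $\ell u = u\theta(\ell)$, and cubing `u` returns the scalar `gamma`.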