1 Introduction

Today minimal or no security is typically provided to low-end low-cost wireless devices such as sensors or RFID tags in the conventional belief that the information they gather is of little concern to attackers [45]. However, case studies have shown that a compromised sensor can be used as a stepping stone to mount an attack on a wireless network [37]. For example, in the attack described in [37], wireless tire pressure sensors were hacked and used to access the automotive system.

Future wireless networks are expected to support security-critical services related to industrial automation, traffic safety, smart transport, smart grid, e-health, etc. The value of the information to which low-end devices will have access via future wireless networks is expected to be much greater than it is today, hence the incentives for attackers will increase [17]. As processing power and connectivity become cheaper, the cost of performing an attack drops. The damage caused by an individual actor may not be limited to a business or reputation, but could have a severe impact on public safety, national economy, and national security.

Many low-cost wireless devices work under severe resource constraints such as limited battery and computing power, little memory, and insufficient bandwidth. These devices must dedicate most of their available resources to executing core application functionality and have few resources left for implementing security. To satisfy their constraints, it might be necessary to reuse existing functions, e.g. by combining coding techniques (scrambling, checksums, forward error correction (FEC)) with cryptographic techniques (encryption, integrity protection). In particular, functional similarities between error detection and data integrity protection can be exploited to combine these functions into one.

Clearly, data integrity protection can be implemented by using some n-bit message authentication code, e.g. keyed hash message authentication code (HMAC) [4] or cipher block chaining message authentication code (CBC-MAC) [7], on top of an error-detecting code, e.g. n-bit cyclic redundancy check (CRC). However, such an approach expands the message by n bits and requires a separate encoding/decoding engine which is more complex than the CRC encoding/decoding engine.

On the other hand, if we simply replace an n-bit CRC with an n-bit HMAC or CBC-MAC, we cannot guarantee the detection of the same type of random errors as the CRC. For example, the detection of n-bit burst errors cannot be guaranteed. This may have a negative impact on the reliability of communication links. Only if we make the conventional CRC cryptographically secure, can we assure a certain level of security without sacrificing reliability.

The latter motivated the development of cryptographically secure CRCs. The core idea is to make the CRC generator polynomial variable and secret. The CRC presented by Krawczyk [27] is based on irreducible generator polynomials. The approach described in [15] uses a product of irreducible polynomials. The CRC proposed in [14] uses generator polynomials of type (1 + x)p(x), where p(x) is a primitive polynomial. In all three cases, testing for irreducibility or primitivity is required, which is either time or memory consuming. Selecting an irreducible degree-n polynomial at random requires either selecting a degree-n polynomial at random (O(n) time) and running a test for irreducibility (\(\Omega(n^{3})\) time [21]), or selecting a polynomial at random from a database of irreducible degree-n polynomials (roughly \(2^{n}/n\) space). Note that the irreducibility test has to be done during key agreement, i.e. it incurs a delay before the communication can start. Therefore, it is desirable to minimize the time spent on it.

In this paper, we present a cryptographically secure CRC based on any randomly selected generator polynomial, with no requirements on irreducibility. We provide a detailed quantitative analysis of the achieved security as a function of message and CRC sizes. To the best of our knowledge, no security analysis for the general case of reducible polynomials has been made so far. This might be due to the fact that the evaluation involves estimating the maximum number of reducible polynomials which can be constructed from any multiset of irreducible polynomials of a given size, which is a non-trivial task.

The paper is organized as follows. Section 2 gives a background on hash functions and describes the basics of CRC codes. In Section 3, we introduce two new families of cryptographically secure CRC hash functions. Section 4 analyzes error-detecting capabilities of hash families. In Section 5, we present the security analysis of hash families. Section 6 shows experimental results. Section 7 describes related work. Section 8 concludes the paper and discusses open problems.

2 Preliminaries

2.1 Notation

Throughout the paper, we associate each binary string \(L \in \{0,1\}^{l}\) representing an l-bit binary message with a polynomial L(x) over the Galois field of order 2, GF(2), so that the coefficients of L(x) correspond to the bits of L. We use deg(L) to denote the degree of the polynomial L(x).

We use \(a \in_{R} A\) to indicate that the element a is taken uniformly at random from the set A. The term \(\text{Pr}_{h}[E(h)]\) is used to denote the probability of the event E(h) where \(h \in_{R} H\) and H is a set of hash functions, usually implicit.

2.2 Hash functions

In this section we describe properties of hash functions which are used in the sequel.

Definition 1

[27] An (l,n)-family of hash functions H is a set of functions h that map the set of binary strings of length l into the set of binary strings of length n.

The hash functions considered in this paper are linear relative to the exclusive-OR operation. This property simplifies their analysis.

Definition 2

[27] A family of hash functions H is ⊕-linear if, for all messages \(L_{1}\) and \(L_{2}\) and for all \(h \in H\),

$$h(L_{1} \oplus L_{2}) = h(L_{1}) \oplus h(L_{2}), $$

where “ ⊕” is the bitwise exclusive-OR (XOR).

Another important property of some hash functions is the ability to map elements into their images in a balanced way.

Definition 3

[27] A family of hash functions H is 𝜖-balanced if

$$\forall L \not= 0, \forall a \in \{0,1\}^{n}: \text{Pr}_{h}[h(L) = a] \leq \epsilon. $$

2.3 Message authentication

A message authentication algorithm accepts as input a secret key and a message to be authenticated and outputs an authentication tag. The tag protects both, message data integrity and message authenticity.

It is known that hash functions can be combined with one-time pads to construct strong authentication algorithms [46]. In this case, the secret key consists of the description of a particular hash function \(h \in_{R} H\) drawn randomly from an (l,n)-family of hash functions H and a random pad \(s \in_{R} \{0,1\}^{n}\).

In the definition below, it is assumed that the adversary knows the family of hash functions H, but not the particular value of h or the pad s. As mentioned in [27], the name “otp”-secure is intended to stress the importance of the one-time pad for the security of the authentication scheme.

Definition 4

[27] A family of hash functions H is 𝜖-otp-secure if, for any message L, no adversary that has L and its hash tag t = h(L) ⊕ s, where \(h \in_{R} H\) and \(s \in_{R} \{0,1\}^{n}\), can find \(L^{\prime} \neq L\) and \(t^{\prime} = h(L^{\prime}) \oplus s\) with probability larger than 𝜖.

Note that the success probability of an adversary that can modify a single transmitted message remains 𝜖 even if the adversary has access to more than one pair of messages and tags, provided that truly random pads s are used for computing the tags and these pads are changed for every message [27]. If this holds, then the authentication tags look completely random and therefore leak no information on the value of h. If the adversary is able to modify k of the transmitted messages, then the success probability is at most k 𝜖 [27].

In most practical applications, pseudo-random pads generated from a secret seed shared by the communicating parties rather than truly random pads of the size of the hash output are used. In this case, the unconditional security of the authentication scheme in Definition 4 reduces to the security of the pseudo-random generator producing the pads and the computational power of the adversary is assumed to be bounded depending on the security model of the pseudo-random generator [27].

The following theorem characterizes 𝜖-otp-secure families of hash functions.

Theorem 5

[27] A necessary and sufficient condition for a family of hash functions H to be 𝜖-otp-secure is that

$$\forall L_{1} \not= L_{2}, \forall a \in \{0,1\}^{n}: \text{Pr}_{h}[h(L_{1}) \oplus h(L_{2}) = a] \leq \epsilon. $$

For linear families of hash functions, Theorem 5 implies the following result.

Theorem 6

[27] If H is ⊕-linear, then H is 𝜖-otp-secure if and only if H is 𝜖-balanced.

We use Theorem 6 as the main step in proving the security of the presented authentication scheme.

2.4 Cyclic redundancy check

A cyclic redundancy check (CRC) is widely used for protecting data communication or storage against random errors [34]. Many wireless communication standards use CRC. For example, IEEE 802.15.4 standard uses 16-bit CRC [24], LTE uses 24, 16 and 8-bit CRCs [1], and GSM uses 40-bit CRC [19].

To perform n-bit CRC encoding, a message polynomial L(x) is multiplied by \(x^{n}\) and then reduced modulo a generator polynomial g(x) of degree n. The coefficients of the resulting polynomial

$$r(x) = L(x) \cdot x^{n} ~\text{mod}~ g(x) $$

represent the check bits of the CRC. These check bits are added to \(L(x) \cdot x^{n}\) to get the resulting CRC codeword \(L(x) \cdot x^{n} + r(x)\).

The CRC decoding is usually done by dividing the received message polynomial modulo the generator polynomial g(x) and comparing the coefficients of the resulting remainder to the received CRC check bits [36]. A disagreement indicates an error. It is well-known that if the generator polynomial is an irreducible polynomial of degree n, then the resulting CRC detects all burst errors of length n or less [34].

The CRC encoding and decoding can be efficiently implemented using a linear feedback shift register (LFSR) [23] having g(x) as its connection polynomial. There are many efficient techniques for speeding up the computation of CRC [32, 33, 38].

Traditional CRCs are good at detecting random errors. However, they are not suitable for detecting malicious errors. An adversary who knows the generator polynomial g(x) may simply substitute the original message L(x) by another message \(L^{\prime}(x)\), encode \(L^{\prime}(x)\) as usual into the codeword \(L^{\prime}(x) \cdot x^{n} + r^{\prime}(x)\), where \(r^{\prime}(x) = L^{\prime}(x) \cdot x^{n} ~\text{mod}~ g(x)\), and then submit the resulting codeword. The receiver will not be able to distinguish the codeword \(L^{\prime}(x) \cdot x^{n} + r^{\prime}(x)\) from a codeword received from a legitimate sender.
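The encoding, decoding and substitution attack described in this section can be illustrated with a minimal Python sketch. It assumes polynomials over GF(2) are stored as integers (bit i is the coefficient of \(x^{i}\)); the helper names `poly_mod`, `crc_encode` and `crc_check` and the example generator polynomial are hypothetical choices made for illustration only.

```python
def poly_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x) over GF(2); g must be non-zero."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        # Cancel the leading term of a(x) with a shifted copy of g(x).
        a ^= g << (a.bit_length() - 1 - dg)
    return a


def crc_encode(msg: int, g: int) -> int:
    """Return the codeword L(x) * x^n + r(x), where n = deg(g)."""
    n = g.bit_length() - 1
    shifted = msg << n
    return shifted ^ poly_mod(shifted, g)


def crc_check(codeword: int, g: int) -> bool:
    """A codeword is accepted if it is divisible by g(x)."""
    return poly_mod(codeword, g) == 0


G = 0b1011                      # x^3 + x + 1, assumed to be publicly known
codeword = crc_encode(0b1101, G)

# A random single-bit error is detected ...
assert not crc_check(codeword ^ 1, G)

# ... but an adversary who knows G can re-encode a substituted message.
forged = crc_encode(0b0110, G)
assert crc_check(forged, G) and forged != codeword
```

The forged codeword passes the check because the check depends only on the public polynomial, which is the weakness that a secret generator polynomial is meant to remove.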

3 Two families of cryptographically secure CRC hash functions

In this section, we define two new families of cryptographically secure CRC-based hash functions.

Definition 7 (Family \(H_{R}\))

For any binary message L of length l and any polynomial g(x) of degree n over GF(2), a hash function \(h_{g}(L)\) is defined as the coefficients of the polynomial

$$h_{g}(L) = L(x) \cdot x^{n} ~\text{mod}~ g(x). $$

The (l,n)-family \(H_{R}\) is the set of all hash functions \(h_{g}\), \(H_{R} = \{h_{g} : \{0,1\}^{l} \rightarrow \{0,1\}^{n}\}\).

Since each degree-n polynomial over GF(2) defines one member of the family \(H_{R}\) and there are \(2^{n}\) degree-n polynomials over GF(2), the size of the family \(H_{R}\) is \(2^{n}\).

To authenticate a message L using the hash function family \(H_{R}\), a sender computes the authentication tag t as

$$ t = h_{g}(L) \oplus s, $$
(1)

where \(h_{g} \in_{R} H_{R}\) and \(s \in_{R} \{0,1\}^{n}\), appends t to L, and transmits the message and the appended tag. A receiver authenticates a received message \(L^{\prime}\) (potentially different from L) by re-computing the tag for \(L^{\prime}\) and comparing the received and the re-computed tags. A disagreement implies an error.

Note that the modification of the linear hash function to the affine one is necessary to prevent an attacker from injecting all-0 messages. Without such a modification, the hash value of an all-0 message would always be 0, independently of the polynomial g(x). The reader familiar with e.g. the UIA2 MAC of the 3G standard will recognize this type of construction. In that case, the encryption pad s is generated by the SNOW3G stream cipher [18].
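The tag computation in (1) can be sketched as follows. This is an illustrative sketch only: the function names `random_generator`, `make_tag` and `verify` are hypothetical, polynomials are stored as Python integers (bit i is the coefficient of \(x^{i}\)), and the pad is drawn directly from `secrets` rather than from a stream cipher. A fresh pad is assumed for every message.

```python
import secrets


def poly_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x) over GF(2); g must be non-zero."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a


def random_generator(n: int, constant_term: bool = False) -> int:
    """Uniformly random degree-n polynomial; optionally force a constant term of 1."""
    g = (1 << n) | secrets.randbits(n)
    return g | 1 if constant_term else g


def make_tag(msg: int, g: int, s: int) -> int:
    """t = (L(x) * x^n mod g(x)) XOR s, with n = deg(g)."""
    n = g.bit_length() - 1
    return poly_mod(msg << n, g) ^ s


def verify(msg: int, tag: int, g: int, s: int) -> bool:
    """The receiver recomputes the tag and compares."""
    return make_tag(msg, g, s) == tag


n = 16
g = random_generator(n)          # secret generator polynomial
s = secrets.randbits(n)          # one-time pad, fresh for each message
t = make_tag(0xBEEF, g, s)
assert verify(0xBEEF, t, g, s)

# Without the pad, the all-zero message would always hash to 0;
# with the pad, its tag is simply s, which the adversary does not know.
assert make_tag(0, g, s) == s
```

Passing `constant_term=True` gives a generator polynomial with a non-zero constant term, chosen uniformly among the \(2^{n-1}\) such polynomials.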

We also consider separately a special case of Definition 7 in which the generator polynomial has a non-zero constant term. This case is particularly interesting because, as we show in the next section, CRCs based on such polynomials detect the same type of burst errors as CRCs based on irreducible polynomials.

Definition 8 (Family \(H_{RC}\))

For any binary message L of length l and for any polynomial q(x) of degree n over GF(2) with a non-zero constant term, a hash function \(h_{q}(L)\) is defined as the coefficients of the polynomial

$$h_{q}(L) = L(x) \cdot x^{n} ~\text{mod}~ q(x). $$

The (l,n)-family \(H_{RC}\) is the set of all hash functions \(h_{q}\), \(H_{RC} = \{h_{q} : \{0,1\}^{l} \rightarrow \{0,1\}^{n}\}\).

Since each degree-n polynomial over GF(2) with a non-zero constant term defines one member of the family \(H_{RC}\) and there are \(2^{n-1}\) degree-n polynomials over GF(2) which have a non-zero constant term, the size of the family \(H_{RC}\) is \(2^{n-1}\).

Similarly to the family \(H_{R}\), the authentication tag for the family \(H_{RC}\) is computed as

$$ t = h_{q}(L) \oplus s, $$
(2)

where \(h_{q} \in_{R} H_{RC}\) and \(s \in_{R} \{0,1\}^{n}\).

The computation of CRCs defined above is based on the same operation of polynomial modular division as the traditional CRCs except that, in our case, the generator polynomial has to be changed to appear random to an adversary. Therefore, an LFSR implementing encoding and decoding for the cryptographic CRC needs re-programmable connections. Techniques for implementing re-programmable LFSRs are known [9]. Re-programmable LFSRs are used, for example, in applications which support multiple CRC standards.

Note that restricting generator polynomials to polynomials with non-zero constant terms does not complicate the implementation of CRC encoding and decoding in any way. The only difference is that, for polynomials with non-zero constant terms, the LFSR connection corresponding to the constant-one term of the polynomial is made fixed rather than programmable.

4 Analysis of error-detecting capabilities

It is well-known that a CRC based on an irreducible generator polynomial of degree n detects all burst errors of length n or less [34]. Next, we show that a cryptographically secure CRC based on a reducible generator polynomial of degree n with a non-zero constant term detects the same type of errors.

Theorem 9

A CRC based on a reducible generator polynomial of degree n > 1 with a non-zero constant term detects the same type of burst errors as a CRC based on an irreducible generator polynomial of degree n.

Proof

Let L be an l-bit message and let the CRC check bits be computed according to Definition 8 using a reducible degree-n generator polynomial q(x) with a non-zero constant term. Any k-bit burst error e, 0 < k ≤ n, can be described by a polynomial of type

$$ e(x) = x^{j} \cdot f(x) $$
(3)

where

$$f(x) = x^{k-i-1} + x^{k-i-2} + {\ldots} + x + 1, $$

for i ∈{0,1,…,k − 1} and j ∈{0,1,…,l + i}.

The error e is not detected by the CRC if and only if e(x) is divisible by the generator polynomial q(x). Since q(x) has a non-zero constant term, x does not divide q(x), so clearly \(gcd(x^{j},q(x)) = 1\). So, e(x) is divisible by q(x) if and only if f(x) is divisible by q(x). However, this is not possible since f(x) is non-zero and deg(f) < n. Therefore, a CRC based on a reducible degree-n generator polynomial with a non-zero constant term detects all burst errors of length n or less. □

Theorem 9 shows that, from the point of view of detecting burst errors, no advantage is lost if an irreducible polynomial is replaced by a reducible polynomial with a non-zero constant term.
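For small parameters, the statement of Theorem 9 can be checked exhaustively. The sketch below is an illustration under stated assumptions: polynomials are stored as integers, a burst of length k is taken to be any error pattern confined to k consecutive positions whose first and last positions are in error (a common definition that covers the patterns used in the proof), and the function name `undetected_bursts` is hypothetical.

```python
def poly_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x) over GF(2); g must be non-zero."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a


def undetected_bursts(q: int, l: int) -> list:
    """Burst errors of length <= deg(q) in an (l + n)-bit codeword that q(x) divides."""
    n = q.bit_length() - 1
    missed = []
    for k in range(1, n + 1):
        if k == 1:
            patterns = [1]
        else:
            # First and last bit of the k-bit window are set, the middle is arbitrary.
            patterns = [(1 << (k - 1)) | (mid << 1) | 1 for mid in range(1 << (k - 2))]
        for b in patterns:
            for j in range(l + n - k + 1):
                if poly_mod(b << j, q) == 0:   # error is a multiple of q(x): undetected
                    missed.append(b << j)
    return missed


n, l = 4, 6
# Every degree-4 polynomial with a non-zero constant term, reducible or not.
for mid in range(1 << (n - 1)):
    q = (1 << n) | (mid << 1) | 1
    assert undetected_bursts(q, l) == []

# Counter-example without a constant term: x^4 + x^2 = x^2 (x^2 + 1)
# misses the 3-bit burst x^2 (x^2 + 1).
assert undetected_bursts(0b10100, l) != []
```

The counter-example shows why the non-zero constant term matters: once x divides the generator polynomial, the shift \(x^{j}\) can supply the missing factor.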

5 Security analysis

In this section, we analyze the security of the new families of hash functions. We assume a typical setting in which the sender and the receiver transmit messages over an insecure channel where messages can be maliciously modified [43]. The sender and the receiver share a secret key which is unknown to the adversary. In our case, the key consists of the description of a particular generator polynomial g(x) (respectively, q(x)) drawn randomly from the set of all possible degree-n polynomials (respectively, all possible degree-n polynomials with a non-zero constant term) over GF(2) and a random pad \(s \in_{R} \{0,1\}^{n}\).

In order to prove the security of the hash function families \(H_{R}\) and \(H_{RC}\) when used in the message authentication schemes (1) and (2), respectively, in this section we show that these families are 𝜖-otp-secure with 𝜖 being exponentially small in the length of the hash value. By Definition 4, if a hash family is 𝜖-otp-secure, then the success probability for an adversary to modify a message is at most 𝜖. Throughout the paper, when we say “attack success probability”, we mean the probability that an adversary can successfully modify a message according to the scenario described by Definition 4.

In the following section we quantify 𝜖 for the hash function families \(H_{R}\) and \(H_{RC}\).

5.1 Quantifying attack success probability

An adversary can successfully replace a message and tag pair (L,t) by another pair \((L^{\prime},t^{\prime})\), \(L^{\prime} \neq L\), only if for the hash function \(h \in_{R} H\) and pad \(s \in_{R} \{0,1\}^{n}\) used by the communicating parties it holds that t = h(L) ⊕ s and \(t^{\prime} = h(L^{\prime}) \oplus s\), or equivalently \(t \oplus t^{\prime} = h(L) \oplus h(L^{\prime})\) [27]. Thus, the success probability of the adversary is bounded by

$$\max\limits_{L,L^{\prime},a} \text{Pr}_{h}[h(L) \oplus h(L^{\prime}) = a] $$

where \(a = t \oplus t^{\prime}\). By Theorem 6, for linear hash functions the above condition can be simplified to

$$\max\limits_{L,a} \text{Pr}_{h}[h(L) = a] $$

for all L ≠ 0. Let us analyze how this probability can be maximized for the hash function families \(H_{R}\) and \(H_{RC}\).

By Definition 7, for the hash function family \(H_{R}\), the success probability is proportional to the number of degree-n polynomials, g(x), that divide the polynomial \(L(x) \cdot x^{n} - a(x)\). So, in order to find 𝜖 for \(H_{R}\), we need to estimate the maximum number of distinct degree-n polynomials that can be constructed from the irreducible factors of a degree-(n + l) polynomial.

Similarly, by Definition 8, for the hash function family \(H_{RC}\), the success probability is proportional to the number of degree-n polynomials with a non-zero constant term, q(x), that divide the polynomial \(L(x) \cdot x^{n} - a(x)\). So, in this case we need to find the maximum number of distinct degree-n polynomials with a non-zero constant term that can be constructed from the irreducible factors of a degree-(n + l) polynomial. Note that all irreducible factors of a polynomial with a non-zero constant term have a non-zero constant term.

We start with the case of the hash function family \(H_{R}\). Let P be a multiset of irreducible polynomials over GF(2). In a multiset, the same element may occur more than once. By mult(p) we denote the number of occurrences of a polynomial p in P. By size(P) we denote the sum of degrees of all elements of P. For example, for \(P = \{x, x, x+1, x^{2}+x+1\}\), mult(x) = 2 and size(P) = 5.

Let N(n;P) denote the number of distinct degree-n polynomials over GF(2) which can be constructed by multiplying a subset of the elements of P. For example, if \(P = \{x, x, x+1, x^{2}+x+1\}\) and n = 2, we can construct \(x^{2}\), \(x(x+1)\) and \(x^{2}+x+1\), so N(2;P) = 3. N(0;P) is defined to be 1 for any P. Since each polynomial has a unique factorization into irreducible polynomials, N(n;P) can be computed by counting the number of distinct combinations of elements of P whose degrees sum up to n. We address this problem in Section 5.2.

For a given n, we want to find a multiset of irreducible polynomials \(P_{max}\) such that

$$N(n;P_{max}) = \max\limits_{\forall P: size(P) = size(P_{max})} N(n;P). $$

If \(P_{max}\) is known, we can quantify the attack success probability for the hash function family \(H_{R}\) as follows.

Theorem 10

For any values of l and n, the (l,n)-family of hash functions \(H_{R}\) is \(\epsilon_{1}\)-otp-secure for

$$ \epsilon_{1} \leq \frac{N(n;P_{max})}{2^{n}}, $$
(4)

where \(size(P_{max}) = l + n\).

Proof

By Theorem 6, a family of hash functions is 𝜖-otp-secure if it is ⊕-linear and 𝜖-balanced. The family of hash functions \(H_{R}\) is ⊕-linear because for all messages \(L_{1}\) and \(L_{2}\) and for all \(h_{g} \in H_{R}\), we have \(h_{g}(L_{1} \oplus L_{2}) = h_{g}(L_{1}) \oplus h_{g}(L_{2})\).

To show that the family \(H_{R}\) is also 𝜖-balanced, we observe that, on one hand, for any degree-n polynomial g(x) over GF(2), any non-zero message L of length l and any string a of length n, \(h_{g}(L) = a\) if and only if \(L(x) \cdot x^{n} ~\text{mod}~ g(x) = a(x)\). On the other hand, \(L(x) \cdot x^{n} ~\text{mod}~ g(x) = a(x)\) if and only if g(x) divides \(L(x) \cdot x^{n} - a(x)\).

Let \(f(x) = L(x) \cdot x^{n} - a(x)\). Obviously, f(x) is a non-zero polynomial of degree less than or equal to l + n, and g(x) is a polynomial of degree n which divides f(x). On one hand, there are at most \(N(n;P_{max})\) hash functions in the family \(H_{R}\) that map L into a, because \(N(n;P_{max})\) is the maximum number of distinct degree-n polynomials which can be constructed from the irreducible factors of any degree-(n + l) polynomial. On the other hand, the family \(H_{R}\) consists of \(2^{n}\) elements (the number of degree-n polynomials over GF(2)). Therefore

$$\text{Pr}_{h}[h_{g}(L)=a] \leq \frac{N(n;P_{max})}{2^{n}}. $$
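For very small l and n, the two properties used in this proof can be checked by brute force. The sketch below is only an illustration: it enumerates all \(2^{n}\) generator polynomials (stored as integers) and all non-zero l-bit messages, and the function names `h` and `max_preimage_count` are hypothetical.

```python
def poly_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x) over GF(2); g must be non-zero."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a


def h(msg: int, g: int) -> int:
    """h_g(L) = L(x) * x^n mod g(x), with n = deg(g)."""
    return poly_mod(msg << (g.bit_length() - 1), g)


def max_preimage_count(l: int, n: int) -> int:
    """Max over L != 0 and a of the number of degree-n g(x) with h_g(L) = a."""
    best = 0
    for msg in range(1, 1 << l):
        counts = {}
        for g in range(1 << n, 1 << (n + 1)):      # all degree-n polynomials
            a = h(msg, g)
            counts[a] = counts.get(a, 0) + 1
        best = max(best, max(counts.values()))
    return best


l, n = 2, 2

# XOR-linearity holds for every generator polynomial and every message pair.
for g in range(1 << n, 1 << (n + 1)):
    for m1 in range(1 << l):
        for m2 in range(1 << l):
            assert h(m1 ^ m2, g) == h(m1, g) ^ h(m2, g)

# Empirical balance parameter: worst-case count divided by the family size 2^n.
epsilon_empirical = max_preimage_count(l, n) / 2 ** n
```

For l = n = 2, the worst-case count can be compared against \(N(2;P_{max}) = 3\), the value obtained for size 4 in the worked example of Section 5.3; the empirical count does not exceed it.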

In a similar way we can quantify the attack success probability for the hash function family H R C .

Let \(P^{*}\) be a multiset of irreducible polynomials with a non-zero constant term over GF(2). For a given n, let \(P^{*}_{max}\) be a multiset of irreducible polynomials with a non-zero constant term such that

$$N(n;P^{*}_{max}) \geq N(n;P^{*}) $$

for any other multiset \(P^{*}\) with \(size(P^{*}) = size(P^{*}_{max})\).

Theorem 11

For any values of l and n, the (l,n)-family of hash functions \(H_{RC}\) is \(\epsilon_{2}\)-otp-secure for

$$ \epsilon_{2} \leq \frac{N(n;P^{*}_{max})}{2^{n-1}}, $$
(5)

where \(size(P^{*}_{max}) = l+n\).

Proof

Similar to the proof of Theorem 10. □

In the following subsections we show how to compute N(n;P) and \(N(n;P^{*})\).

5.2 Number of polynomials which can be constructed from a given set of irreducible polynomials

Let \(I_{i}\) be the number of distinct irreducible polynomials of degree i over GF(2). It is well-known how to compute \(I_{i}\) [30].

Let \(p_{i,j}\) be the jth irreducible polynomial of degree i, for all \(j \in \{1,2,\ldots,I_{i}\}\). Note that for our purpose we only need to enumerate all irreducible polynomials of a given degree. The order in which they are assigned the index j is not significant. So, whether we assign \(p_{1,1} = x\) and \(p_{1,2} = x + 1\) or vice versa does not change the presented results.

As we mentioned in the previous section, for a given n and a given multiset of irreducible polynomials P, the number of distinct degree-n polynomials which can be constructed by multiplying a subset of the elements of P, N(n;P), can be computed by counting the number of distinct combinations of elements of P whose degrees sum up to n.

As an example, consider a multiset P which contains five copies of the polynomial \(p_{1,1} = x\), five copies of the polynomial \(p_{1,2} = x + 1\) and two copies of the polynomial \(p_{2,1} = x^{2} + x + 1\). Let n = 5. Then, the following 12 distinct polynomials can be constructed from P:

$$\begin{array}{c} x^{5}, x^{4} (x+1), x^{3}(x+1)^{2}, x^{2}(x+1)^{3}, x(x+1)^{4}, (x+1)^{5}\\ x^{3}(x^{2}+x+1), x^{2}(x+1)(x^{2}+x+1), x(x+1)^{2}(x^{2}+x+1), (x+1)^{3}(x^{2}+x+1)\\ x(x^{2}+x+1)^{2}, (x+1)(x^{2}+x+1)^{2}. \end{array} $$

So, N(5;P) = 12.
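The count in this example can be reproduced by directly multiplying out all sub-multisets of P. The following sketch is a brute-force illustration, not an efficient method: polynomials are stored as integers (bit i is the coefficient of \(x^{i}\)), and the names `poly_mul` and `constructible` are hypothetical.

```python
from itertools import product


def poly_mul(a: int, b: int) -> int:
    """Product of a(x) and b(x) over GF(2) (carry-less multiplication)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def constructible(n: int, mults: dict) -> set:
    """Distinct degree-n products of sub-multisets of P.

    mults maps each irreducible polynomial (as an int) to its multiplicity in P.
    """
    polys = list(mults)
    found = set()
    for exps in product(*(range(mults[p] + 1) for p in polys)):
        f = 1
        for p, e in zip(polys, exps):
            for _ in range(e):
                f = poly_mul(f, p)
        if f.bit_length() - 1 == n:
            found.add(f)
    return found


X, X_PLUS_1, X2_X_1 = 0b10, 0b11, 0b111

# Five copies of x, five copies of x + 1, two copies of x^2 + x + 1, n = 5.
assert len(constructible(5, {X: 5, X_PLUS_1: 5, X2_X_1: 2})) == 12

# The smaller example P = {x, x, x + 1, x^2 + x + 1} with n = 2.
assert constructible(2, {X: 2, X_PLUS_1: 1, X2_X_1: 1}) == {0b100, 0b110, 0b111}
```

Because factorization into irreducible polynomials is unique, collecting the products in a set does not merge any two different combinations; the set is used only to make that property visible in the code.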

Next, we show that N(n;P) can be computed using a recurrence relation given by the following Lemma. It is obvious that copies of an element p beyond the first \(\lfloor \frac {n}{\text {deg}(p)} \rfloor \) do not contribute any new polynomials of degree n. For this reason the index m in the Lemma is limited by \(\lfloor \frac {n}{\text {deg}(p)} \rfloor \).

Lemma 12

For any multiset of irreducible polynomials P, any irreducible polynomial p ∉ P of degree deg (p) ≤ n, and any m such that \(1 \leq m \leq \lfloor \frac {n}{\text {deg}(p)} \rfloor \) , it holds that

$$N(n;P \cup \{p^{m}\}) = \sum\limits_{i=0}^{m} N(n-i \cdot \text{deg}(p); P), $$

where \(\{p^{m}\}\) denotes a multiset containing m copies of the element p.

Proof

By induction on m.

  1.

    Base case: m = 1. We need to prove that

    $$N(n;P \cup \{p\}) = N(n;P) + N(n-\text{deg}(p);P). $$

    By subtracting N(n;P) from both sides we get

    $$N(n;P \cup \{p\}) - N(n;P) = N(n-\text{deg}(p);P). $$

    The left-hand side is the difference between the number of distinct degree-n polynomials which can be constructed from the elements of P ∪{p} and the number of distinct degree-n polynomials which can be constructed from the elements of P. This difference is equal to the number of distinct degree-n polynomials which contain p as a factor with the multiplicity exactly one. Removing factor p from each of such polynomials yields all possible distinct polynomials of degree n −deg(p) which can be constructed from the elements of P, i.e. the right-hand side N(n −deg(p);P).

  2.

    Inductive step: Assume the statement holds for m. Next we prove that it holds for m + 1, i.e. that

    $$\begin{array}{ll} N(n;P \cup \{p^{m+1}\}) & = \displaystyle{\sum\limits_{i=0}^{m+1} N(n-i \cdot \text{deg}(p); P)}\\ & = N(n;P \cup \{p^{m}\}) + N(n-(m+1) \cdot \text{deg}(p); P) \end{array} $$

    where \(2 \leq m+1 \leq \lfloor \frac {n}{\text {deg}(p)} \rfloor \).

    By subtracting \(N(n;P \cup \{p^{m}\})\) from both sides we get

    $$N(n;P \cup \{p^{m+1}\}) - N(n;P \cup \{p^{m}\}) = N(n-(m+1) \cdot \text{deg}(p);P). $$

    The left-hand side is the difference between the number of distinct degree-n polynomials which can be constructed from the elements of \(P \cup \{p^{m+1}\}\) and the number of distinct degree-n polynomials which can be constructed from the elements of \(P \cup \{p^{m}\}\). The former accounts for factorizations which contain p with multiplicity from 0 to m + 1. The latter accounts for all factorizations which contain p with multiplicity from 0 to m. Therefore, the difference is equal to the number of distinct degree-n polynomials which contain p as a factor with multiplicity exactly m + 1. Removing the factor p with multiplicity m + 1 from each such polynomial yields all possible distinct polynomials of degree n − (m + 1) ⋅ deg(p) which can be constructed from the elements of P, i.e. the right-hand side N(n − (m + 1) ⋅ deg(p);P). □
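The recurrence of Lemma 12 translates almost directly into a recursive counting routine. The sketch below is one possible reading of the lemma, assuming that P is described only by the degree and multiplicity of each distinct irreducible polynomial (which is all that N(n;P) depends on); the function name `count_polys` is hypothetical.

```python
def count_polys(n: int, factors: list) -> int:
    """N(n;P) for P given as a list of (degree, multiplicity) pairs,
    one pair per distinct irreducible polynomial."""
    if n < 0:
        return 0
    if not factors:
        # The empty product has degree 0, so N(0; {}) = 1 and N(n; {}) = 0 otherwise.
        return 1 if n == 0 else 0
    (deg, mult), rest = factors[0], factors[1:]
    m = min(mult, n // deg)            # copies beyond floor(n / deg) cannot be used
    # Lemma 12: sum over the number i of copies of this polynomial that are used.
    return sum(count_polys(n - i * deg, rest) for i in range(m + 1))


# Five copies of x, five copies of x + 1, two copies of x^2 + x + 1, n = 5.
assert count_polys(5, [(1, 5), (1, 5), (2, 2)]) == 12

# P = {x, x, x + 1, x^2 + x + 1}, n = 2.
assert count_polys(2, [(1, 2), (1, 1), (2, 1)]) == 3
```

Peeling off one distinct irreducible polynomial per recursion level keeps the routine short; for large n and size(P), memoizing on the pair (n, position in the list) would avoid recomputing shared sub-problems.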

Finally, we derive a general formula for N(n;P). In the derivations below we denote by \(P_{d}\) a multiset of irreducible polynomials in which the maximum degree of elements is d. To unify the notation, we allow multiplicities of elements of P to be 0. In this way, any \(P_{d}\) can be uniquely represented by the vector of multiplicities of its elements

$$(m_{1,1},\ldots,m_{1,I_{1}},m_{2,1},\ldots,m_{2,I_{2}},\ldots,m_{d,1},\ldots,m_{d,I_{d}}), $$

where \(m_{i,j} = mult(p_{i,j})\) for all \(i \in \{1,2,\ldots,d\}\) and \(j \in \{1,2,\ldots,I_{i}\}\).

There are two irreducible polynomials of degree 1. It is easy to see that

$$N(n;P_{1}) = \left\{ \begin{array}{l} min(m_{1,1},n) + min(m_{1,2},n)-n+1, \text{if}~ m_{1,1} + m_{1,2} \geq n\\ 0, ~~~ \text{otherwise} \end{array} \right. $$

There is only one irreducible polynomial of degree 2. From Lemma 12, we can conclude that

$$N(n;P_{2}) = N(n;P_{1}) + N(n-2;P_{1}) + N(n-4;P_{1}) +\ldots+ N(n-2 \cdot min\left( m_{2,1},\left\lfloor \frac{n}{2} \right\rfloor\right);P_{1}) $$

or

$$N(n;P_{2}) = \sum\limits_{i_{2,1}=0}^{min(m_{2,1},\lfloor \frac{n}{2} \rfloor)} N(n-2 i_{2,1};P_{1}) $$
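The closed form for \(N(n;P_{1})\) and the sum for \(N(n;P_{2})\) can be checked numerically against the earlier example with five copies of x, five copies of x + 1 and two copies of \(x^{2}+x+1\). The function names `n_p1` and `n_p2` below are hypothetical, and the guard for negative n is an added assumption that does not arise within the summation bounds.

```python
def n_p1(n: int, m11: int, m12: int) -> int:
    """N(n;P_1): degree-n products of at most m11 copies of x and m12 copies of x + 1."""
    if n < 0 or m11 + m12 < n:
        return 0
    return min(m11, n) + min(m12, n) - n + 1


def n_p2(n: int, m11: int, m12: int, m21: int) -> int:
    """N(n;P_2): additionally allow up to m21 copies of x^2 + x + 1."""
    upper = min(m21, n // 2)
    return sum(n_p1(n - 2 * i, m11, m12) for i in range(upper + 1))


# N(5;P_1) = 6, N(3;P_1) = 4, N(1;P_1) = 2, which sum to the 12 polynomials listed earlier.
assert [n_p1(k, 5, 5) for k in (5, 3, 1)] == [6, 4, 2]
assert n_p2(5, 5, 5, 2) == 12
```

The three terms correspond to using zero, one or two copies of the degree-2 polynomial, matching the three rows of the list of 12 polynomials.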

It is straightforward to extend the derivations above to the following result.

Theorem 13

For d = 1

$$N(n;P_{1}) = \left\{ \begin{array}{l} min(m_{1,1},n) + min(m_{1,2},n)-n+1, \text{if}~ m_{1,1} + m_{1,2} \geq n\\ 0, ~~~ \text{otherwise} \end{array} \right. $$

and for d > 1

$$ N(n;P_{d}) = \sum\limits_{i_{d,1}=0}^{A_{d,1}} \sum\limits_{i_{d,2}=0}^{A_{d,2}} \!\!\ldots\!\! \sum\limits_{i_{d,I_{d}}=0}^{A_{d,I_{d}}} \!\!\ldots\!\! \sum\limits_{i_{2,1}=0}^{A_{2,1}} N\left( n- \sum\limits_{h=2}^{d} h \cdot \sum\limits_{j=1}^{I_{h}} i_{h,j}; P_{1}\right) $$
(6)

where

$$\begin{array}{l} A_{d,1} = min\left( \lfloor \frac{n}{d} \rfloor, m_{d,1}\right)\\ A_{d,2} = min\left( \left\lfloor \frac{n-d \cdot i_{d,1}}{d} \right\rfloor, m_{d,2}\right)\\ \ldots\\[-2mm] A_{d,I_{d}} = min(\lfloor \frac{n-d \displaystyle{\small \sum\limits_{j=1}^{I_{d}-1} i_{d,j}}}{d} \rfloor, m_{d,I_{d}})\\ \ldots\\ A_{2,1} = min\left( \left\lfloor \frac{n-S(d:3)}{2} \right\rfloor, m_{2,1}\right);\\ \end{array} $$

where \(S(d:i) = \displaystyle { \sum\limits_{r=i}^{d} \left (r \cdot \sum\limits_{j=1}^{I_{r}} i_{r,j} \right )}\).

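All the results derived above also apply to the case of \(P^{*}\) being a multiset of irreducible polynomials with a non-zero constant term except that, in Theorem 13, \(N(n;P^{*}_{1})\) reduces to the following expression, where \(p_{1,2} = x + 1\) is the only irreducible polynomial of degree 1 with a non-zero constant term: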

$$N(n;P^{*}_{1}) = \left\{ \begin{array}{ll} 1, & \text{if}~ m_{1,2} \geq n\\ 0, & \text{otherwise.} \end{array} \right. $$

Lastly, we would like to point out the relation between the problem we addressed in this section and restricted colored integer partitions. The number N(n;P) is equal to the number of colored partitions of the integer n into arbitrarily many parts such that the integer i may occur in f(i) different colors (f(i) corresponds to the number of polynomials of degree i in P) and the number of occurrences of the integer i with a color c ≤ f(i) in the partition is at most m(i,c) (m(i,c) corresponds to the multiplicity of the polynomial with the degree i and color c in P). A lot of work has been done on k-colored partitions, in which parts may appear in k different colors, see for example [8], or a survey [3]. The generalization of k-colored partitions in which at most j colors may appear for a given part size has been recently presented in [26]. However, we are not aware of any work addressing the specific case of this paper in which the integer i may occur in f(i) different colors and the number of occurrences of the integer i with a color c ≤ f(i) in the partition is at most m(i,c). We only know the work of Eger [16] on S-restricted f-colored integer compositions (where the order of parts is significant) in which all parts lie within a subset S of nonnegative integers and each integer i ∈ S may take on f(i) different colors.

5.3 Computing \(N(n;P_{max})\)

Theorem 13 shows us how to compute N(n;P) for a given n and P. Next we need to find a vector of multiplicities which maximizes N(n;P) for a given n and size(P). In this section, we derive some properties which allow us to guide and bound the search.

Property 14

For any n > 0, there exists \(P_{max}\) such that an irreducible polynomial \(p_{i}\) with \(\text{deg}(p_{i}) = i\) is contained in \(P_{max}\) only if each irreducible polynomial \(p_{j}\) with \(\text{deg}(p_{j}) = j\), 1 ≤ j < i, is contained in \(P_{max}\) at least once.

Proof

Suppose that \(p_{i} \in P_{max}\) and \(p_{j} \not\in P_{max}\) for some j < i. Then we can replace \(P_{max}\) by \(P^{\prime}\) such that

$$P^{\prime} = (P_{max} - \{p_{i}\}^{mult(p_{i})}) \cup \{p_{j}\}^{mult(p_{i})} \cup \{p_{i-j}\}^{mult(p_{i})} $$

where \(p_{i-j}\) is any irreducible polynomial of degree i − j. Obviously, \(size(P^{\prime}) = size(P_{max})\). Furthermore, for any polynomial of degree n constructed from the elements of \(P_{max}\) which contains \({p_{i}^{k}}\) as a factor, we can replace \({p_{i}^{k}}\) by \({p_{j}^{k}} \cdot p_{i-j}^{k}\), for any \(1 \leq k \leq mult(p_{i})\). Since \({p_{j}^{k}} \not \in P_{max}\), this implies that \(N(n;P^{\prime}) \geq N(n;P_{max})\). □

For \(P_{max}\) satisfying the condition of Property 14, we can derive a rough upper bound on the maximum degree of polynomials contained in \(P_{max}\) by computing the smallest integer d satisfying

$$ size(P_{max}) \leq I_{1} + 2 I_{2} + 3 I_{3} + {\ldots} + d I_{d}. $$
(7)

We can reduce the search space for \(P_{max}\) by first deriving an upper bound on d using (7) and then removing from consideration the multisets P which do not satisfy the condition of Property 14. We can also take into account that the order of elements of the same degree in a multiset does not matter.

Property 15

For any two multisets P and \(P^{\prime}\) with \(size(P) = size(P^{\prime})\) which are equivalent up to a permutation of elements of the same degree, \(N(n;P) = N(n;P^{\prime})\).

As an example, suppose that n = 2 and size(P) = 4. From 4 ≤ 2 + 2 ⋅ 1 we get d = 2. There are four possible candidates for \(P_{max}\), defined by the following vectors of multiplicities \((m_{1,1}, m_{1,2}, m_{2,1})\):

$$(2,2,0), (2,0,1), (0,2,1), (1,1,1). $$

Recall that elements p with \(mult(p) > \lfloor \frac {n}{\text {deg}(p)} \rfloor \) do not contribute to new constructions of polynomials of degree n, therefore vectors (4,0,0), (0,4,0), (3,1,0), (1,3,0), and (0,0,2) are not included in the list.

By applying Properties 14 and 15, we can reduce the set of candidates for \(P_{max}\) to two:

$$(2,2,0), (1,1,1). $$

Now, by using Theorem 13, we can compute \(N(2;P_{1}) = 3\) for \(P_{1} = \{p_{1,1},p_{1,1},p_{1,2},p_{1,2}\}\) and \(N(2;P_{2}) = 2\) for \(P_{2} = \{p_{1,1},p_{1,2},p_{2,1}\}\). We can see that \(P_{max} = P_{1}\).
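
The example can be reproduced with a short script. This is a sketch under stated assumptions rather than the procedure of Theorem 13. It counts N(n;P) by brute force as the number of ways to pick at most mult(p) copies of each element so that the degrees sum to n, which relies on unique factorization. It also assumes that size(P) is the sum of the degrees of the elements counted with multiplicity, as the candidate vectors above suggest. All function names are hypothetical.

```python
from itertools import product

# Irreducible polynomials over GF(2) used in the example:
# two of degree 1 (x, x+1) and one of degree 2 (x^2+x+1).
IRREDUCIBLES_PER_DEGREE = {1: 2, 2: 1}
DEGREES = [d for d, count in sorted(IRREDUCIBLES_PER_DEGREE.items())
           for _ in range(count)]


def count_constructions(n, mults):
    """Brute-force N(n;P): selections of factors whose degrees sum to n."""
    ways = [1] + [0] * n
    for deg, mult in zip(DEGREES, mults):
        updated = [0] * (n + 1)
        for total, w in enumerate(ways):
            for k in range(mult + 1):
                if total + k * deg > n:
                    break
                updated[total + k * deg] += w
        ways = updated
    return ways[n]


def candidates(n, size):
    """Multiplicity vectors with mult(p) <= floor(n/deg(p)) and the given size."""
    ranges = [range(n // deg + 1) for deg in DEGREES]
    return [m for m in product(*ranges)
            if sum(k * deg for k, deg in zip(m, DEGREES)) == size]


def satisfies_property_14(mults):
    """Every irreducible of degree below the largest used degree must occur."""
    used = [deg for deg, k in zip(DEGREES, mults) if k > 0]
    return all(k > 0 for deg, k in zip(DEGREES, mults)
               if used and deg < max(used))


def canonical(mults):
    """Property 15: order of multiplicities within one degree is irrelevant."""
    return tuple(
        tuple(sorted((k for deg, k in zip(DEGREES, mults) if deg == d),
                     reverse=True))
        for d in sorted(IRREDUCIBLES_PER_DEGREE))


def search(n, size):
    seen, scores = set(), {}
    for m in candidates(n, size):
        if satisfies_property_14(m) and canonical(m) not in seen:
            seen.add(canonical(m))
            scores[m] = count_constructions(n, m)
    return scores


scores = search(2, 4)
print(scores)                       # {(1, 1, 1): 2, (2, 2, 0): 3}
print(max(scores, key=scores.get))  # (2, 2, 0)
```

For larger n, the dictionary of irreducible counts would need to be extended up to the bound d obtained from (7).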

Finally, in order to compute N(n;P) for large n and size(P), Lemma 12 can be used to decompose the problem into two smaller sub-problems. The decomposition can be recursively applied until the problem size is sufficiently reduced.

6 Experimental results

Using the approach described above, we computed \(N(n;P_{max})\) for CRC lengths n = 32, 48, 64, 96 and 128 bits and message lengths l = 32, 64, 128 and 256 bits. The resulting upper bounds \(\epsilon_{1}\) and \(\epsilon_{2}\) on attack success probabilities, computed using (4) and (5), are shown in Table 1 in the logarithmic form \(-\log_{2}(\epsilon_{i})\). The 7th column shows the upper bound \(\epsilon_{3}\) on the attack success probability of the cryptographically secure CRC of Krawczyk [27], given by \(\epsilon_{3} \leq (n+l)/2^{n-1}\). Columns 4, 6 and 8 show the fraction \(\frac {-\log _{2}(\epsilon _{i})}{n}\) reflecting the efficiency of \(\epsilon_{i}\) with respect to the optimum probability \(1/2^{n}\), for i ∈ {1,2,3}.
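
As an illustration of how the last two columns are defined, the sketch below evaluates the bound \(\epsilon_{3} \leq (n+l)/2^{n-1}\) in logarithmic form together with the efficiency fraction. It only evaluates the stated formula; the printed values are not copied from Table 1, and the function name `krawczyk_security_bits` is hypothetical.

```python
import math


def krawczyk_security_bits(n: int, l: int) -> float:
    """-log2 of the bound (n + l) / 2^(n-1) for CRC length n and message length l."""
    return (n - 1) - math.log2(n + l)


for n in (32, 48, 64, 96, 128):
    for l in (32, 64, 128, 256):
        bits = krawczyk_security_bits(n, l)
        print(f"n={n:3d} l={l:3d}  -log2(eps3)={bits:7.2f}  efficiency={bits / n:.3f}")
```

For example, n = 32 and l = 32 give −log2(𝜖 3) = 31 − 6 = 25, i.e. an efficiency of 25/32 ≈ 0.78.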

Table 1 Comparison of attack success probabilities for three types of generator polynomials

As we can see, the case of random polynomials with non-zero constant terms (column 5) has a smaller attack success probability than the case of random polynomials (column 3). The former case is also preferable from the point of view of correcting burst errors. We can also see from the table that the presented method is particularly suitable for the authentication of short messages.

7 Related work

A lot of work has been done on message authentication codes in the past; see [39] for an excellent survey. The security of several types of MACs, including HMAC [5], CBC-MAC [7] and XOR-MAC [6], has been quantitatively analyzed.

Unconditionally secure message authentication codes were pioneered by Gilbert et al. [22] and their theoretical basis was developed by Simmons [40].

Wegman and Carter [46] showed that hash functions can be combined with one-time pads to construct strong authentication algorithms. Their approach was further developed by Brassard [11], Desmedt [13] and Krawczyk [27].

Stinson [41] introduced the notion of “almost strongly universal hash families”, which made it possible to considerably reduce the key size of unconditionally secure MACs. For more details on universal hashing, see [42]. Black et al. showed that universal hash families can be used to construct efficient computationally secure MACs, e.g. UMAC [10].

Various techniques for cryptographic checksums and MACs based on stream ciphers have been proposed, including Lai et al. [28], Taylor [44], Johansson [25] and [2]. In these techniques, a new hash function from a hash family is produced for every message by using the pseudo-random generator of a stream cipher. In the scheme presented in this paper, as well as in the method of Krawczyk [27], the same hash function can be re-used for multiple messages. Only the random pad which is used for the encryption of the hash values needs to be updated for each message.

Rabin [35] was the first to use CRCs in the cryptographic context, for the fingerprinting of information. However, in his scheme the modular division by the generator polynomial is applied directly to a message, without first shifting the message n bit positions to the left. As a result, Rabin’s scheme is insecure for message authentication even if the fingerprint is encrypted using a perfect one-time pad [27]. For example, if some of the least significant bits of the message are flipped together with the corresponding bits of the encrypted authentication tag, the change will go undetected by the fingerprint.

Krawczyk [27] proved that the inclusion of the n-bit shift into Rabin’s scheme [35] makes the scheme secure for message authentication provided that the tag is encrypted using a one-time pad. He showed that the probability of breaking the resulting authentication scheme is \(\epsilon \leq \frac {l+n}{2^{n-1}}\), where n is CRC length and l is message length.
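
The effect of the n-bit shift can be seen on a toy instance. The sketch below is an illustration only, not an implementation from [27] or [35]. It encodes polynomials over GF(2) as Python integers and uses the degree-8 irreducible polynomial \(x^{8}+x^{4}+x^{3}+x+1\) as the generator. The message, pad and error pattern are arbitrary, and the function names are hypothetical.

```python
def poly_mod(a: int, p: int) -> int:
    """Remainder of a(x) mod p(x) over GF(2); bit i is the coefficient of x^i."""
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        a ^= p << (a.bit_length() - 1 - deg_p)
    return a


def tag(message: int, p: int, pad: int, shift: bool) -> int:
    """CRC-style tag encrypted with a one-time pad, with or without the n-bit shift."""
    n = p.bit_length() - 1
    if shift:
        message <<= n  # multiply the message polynomial by x^n
    return poly_mod(message, p) ^ pad


P = 0x11B           # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)
MESSAGE = 0xC0FFEE  # arbitrary message
PAD = 0x5A          # arbitrary one-time pad, unknown to the adversary
ERROR = 0x01        # flip the least significant bit


def forgery_accepted(shift: bool) -> bool:
    """Flip the same low-order bits in the message and in the encrypted tag."""
    original_tag = tag(MESSAGE, P, PAD, shift)
    forged_message = MESSAGE ^ ERROR
    forged_tag = original_tag ^ ERROR
    return tag(forged_message, P, PAD, shift) == forged_tag


print(forgery_accepted(shift=False))  # True: the forgery passes without the shift
print(forgery_accepted(shift=True))   # False: the shift makes this forgery fail
```

Without the shift, an error pattern of degree below n is its own remainder, so it cancels in the tag. With the shift, the error is first multiplied by \(x^{n}\), and its remainder no longer equals the flipped bits.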

In [15] Krawczyk’s approach was extended to the case when a product of k irreducible polynomials is used to generate the CRC. The attack success probability of such an authentication scheme is \(\epsilon \leq \frac {(l+n)^{k}}{2^{n-k}}\).

In [14] generator polynomials of type (1 + x)p(x), where p(x) is a primitive polynomial, are used to generate the CRC. Such CRCs are able to detect all double-bit errors in a message, which is of importance for systems using Turbo codes, including LTE. The attack success probability in this case is \(\epsilon \leq \frac {l+n-1}{2^{n-2}}\).

Krawczyk also developed another interesting family of hash functions based on Toeplitz hashing, in which the columns of a matrix are formed by the consecutive states of an LFSR [27]. Such a method has a lower hashing and authentication strength compared to the approach based on a random matrix, namely \(\epsilon \leq \frac {l}{2^{n-1}}\), where n is CRC length and l is message length; however, its implementation cost is much smaller.

Apart from CRC, other error detecting/correcting codes were also proposed for message authentication. MACs based on BCH and Reed-Solomon error-correcting codes were presented in [29].

8 Conclusion

In this paper, we introduced two new families of cryptographically secure hash functions based on CRCs. Similarly to previously proposed cryptographically secure CRC-based hash families, the presented ones enable combining the detection of random and malicious errors without increasing bandwidth. They detect the same type of burst errors as cryptographically non-secure CRCs based on irreducible generator polynomials. They retain most of the encoding and decoding implementation simplicity of cryptographically non-secure CRCs except that the LFSR implementing the division modulo generator polynomial needs to have re-programmable feedback connections. The main advantage of the presented CRCs over the previously proposed ones is that the irreducibility testing, which is either time or memory consuming, can be omitted.

However, using random polynomials as generator polynomials for the CRC gives an adversary a higher chance of breaking authentication. We provided a detailed quantitative analysis of the achieved security as a function of message and CRC lengths and showed that the presented authentication scheme is particularly suitable for short messages. Short messages (a few bytes to a few tens of bytes) are expected to be dominant in machine-to-machine (M2M) communication. Since the presented method provides some level of integrity protection almost for free, it might be quite useful for resource-constrained M2M devices.

Note that in our attack scenario it is assumed that an adversary gets access to a message and its authentication tag. Other attack scenarios are also possible; for example, an adversary may also have access to a verification oracle. In this case any cryptographic CRC, including the presented one, is susceptible to Ferguson’s attack [20, 31], which reveals the polynomial used for generating the CRC with probability \(2^{-n}\), where n is the polynomial degree. Access to an oracle is a reasonable assumption, for example, in a multicast (Footnote 3). Therefore, we do not recommend the use of cryptographic CRCs with short generator polynomials.

In current wireless standard message formats, two separate fields are typically used for the protection against random and malicious errors. These fields may be located on different layers, e.g. in LTE the CRC is located at the physical (PHY) layer while the message authentication code is located at the packet data convergence protocol (PDCP) layer. A good strategy might be to combine these two fields into one at the PHY layer and use a cryptographic CRC for the protection against both types of errors. Future work involves investigating the implications of such a merge for security and coverage.