1 Introduction

The demand for reliable memory solutions, in particular for non-volatile memories such as flash memory and phase-change memories (PCMs), is steadily increasing across many applications. These multi-level devices provide permanent storage and a rapidly extendable capacity. Recently developed devices exploit an increased number of cell levels while the physical size of the cells is decreased at the same time. Therefore, coding and signal processing solutions are essential to overcome reliability issues. The key characteristic of PCM cells is that they can switch between two main states: an amorphous state and a crystalline state. PCM cells may become defect (also called stuck) [8, 13, 17, 20] if they fail to switch their states, which occasionally happens due to the repeated heating and cooling processes of the cells; such cells can then only hold a single phase [8, 20]. In multi-level PCM cells, this failure may occur in either of the extreme states or in the partially programmable crystalline states.

The work [29] investigates codes that mask so-called partially stuck (partially defective) cells, i.e., cells which cannot use all levels. For multi-level PCMs, the case in which the partially stuck level \(s=1\) is particularly important since this means that a cell can reach all crystalline sub-states, but cannot reach the amorphous state.

Figure 1 depicts the general idea of reliable and (partially) defective memory cells. It shows two different cell level representations: Representation 1 forms the binary extension field \({\mathbb {F}}_{2^2}\) and Representation 2 forms the set of integers modulo \(q=4\), i.e., \({\mathbb {Z}}/4{\mathbb {Z}}\).

Fig. 1: Illustration of reliable and (partially) defective memory cells. In this figure, there are \(n=5\) cells with \(q=4\) possible levels. The cell levels \(\in {\mathbb {F}}_{2^2}\) are mapped to (0, 1, \(\alpha \) or \(1+\alpha \)) shown in Representation 1 or \(\in {\mathbb {Z}}/4{\mathbb {Z}}\) are mapped to (0, 1, 2 or 3) shown in Representation 2. Case (A) illustrates fully reliable cells which can store any of the four values in both representations. In the stuck scenario, shown in Case (B), the defective cells can store only the exact stuck level s. Case (C) is more flexible (partially defective scenario): partially stuck cells at level \(s \ge 1\) can store level s or higher.

1.1 Related Work

Coding for memories with stuck cells, also known as defect-correcting codes for memories with defects, dates back to the 1970s, cf. the work by Kuznetsov and Tsybakov [15]. They proposed binary defect-correcting codes, in the finite-length and asymptotic regimes, whose required redundancy is at least the number of defects. Later works [2,3,4,5,6,7, 12, 14, 16, 19, 25,26,27] investigated the problem of defective cells under various aspects: binary and non-binary codes, defect-correcting coding only as well as combined error-and-defect-correcting coding, and finite-length and asymptotic analyses.

In binary defect-correcting coding models, e.g. [2,3,4,5, 16, 19, 26], the authors dealt with masking stuck cells without considering additional substitution errors. In these studies, it is unclear whether the proposed constructions are optimal in terms of their required redundancy. The works [12, 14, 27] considered masking stuck memory cells while simultaneously correcting potential random errors. In [12], so-called partitioned cyclic codes and partitioned BCH codes were proposed for this task.

The asymptotic model of stuck-cell-masking codes also received considerable attention in the previously mentioned papers. Moreover, there is work devoted to asymptotically optimal codes for a fixed number of defects [6] or for a number of defects proportional to the codeword length [7]. The proposed constructions, for example [7, Section 4] and its extended version in [7, Section 5] that can additionally correct substitution errors, show that u check symbols are sufficient for masking u defects. However, they use codes with a property that is not well studied in coding theory. Therefore, we do not dwell on [6] and [7]; moreover, our goal is to obtain code constructions for finite code length n.

The recent work [29] considers partially stuck memory cells (see Figure 1, Case (C)) and improves upon the redundancy necessary for masking compared to all prior works on classical stuck cells. However, the paper does not consider error correction in addition to masking.

1.2 Our contribution

In this paper, we consider the problem of combined error correction and masking of partially stuck cells. Compared to the conventional stuck-cell case in [12], we reduce the redundancy necessary for masking, similar to the results in [29], and reduce it even further compared to [29, Construction 5].

If cells are partially stuck at level 1, we can simply use a \((q-1)\)-ary error-correcting code, as mentioned in [29, Section III]. However, this approach may require too much redundancy if cells are partially stuck at levels other than 1. For instance, using \((q-s)\)-ary codes for \( 2 \le s \le q-1\) reduces the cardinality of the code considerably because excluding s of the available q levels is quite expensive. Further, for relatively few partially stuck-at-1 cells, even a \((q-1)\)-ary error-correcting code is not competitive with our constructions (cf. Figure 6). Therefore, considering more sophisticated coding schemes is favorable.

We provide code constructions for any number of partially stuck cells; see Table 3 for an overview of our constructions and their required redundancies. For the error-free case, where only masking is necessary, our redundancies coincide with those from [29] or are even smaller.

Our paper also investigates a technique where the encoder, after a first masking step, introduces errors at some partially stuck positions of a codeword in order to satisfy the stuck-at constraints. The decoder uses part of the error-correcting capability to correct these introduced errors.

We also derive bounds on our code constructions, namely Singleton-type, sphere-packing-type, and Gilbert-Varshamov-type bounds. We provide a numerical analysis comparing our code constructions and the derived bounds with trivial codes and known limits.

Our focus is on long codes over small alphabets, i.e., the code length n is larger than the field size q. Otherwise, one could instead mask by a code of length \(n<q\) (by using, e.g., [24]).

The remainder of the paper is arranged as follows. In Sect. 2, we provide notations and define the models of joint errors and partially defective cells examined in this study. Our code constructions, along with their encoding and decoding algorithms, are presented in Sects. 3 and 4. Section 5 generalizes the previous constructions to mask partially stuck cells at arbitrary levels and additionally correct errors. Section 6 investigates trading error-correction capability for the ability to mask more partially stuck cells. Upper and lower bounds on our constructions are derived in Sects. 7 and 8, respectively. In Sect. 9, we provide numerical and analytical comparisons. Finally, Sect. 10 concludes this work.

2 Preliminaries

2.1 Notations

For a prime power q, let \({\mathbb {F}}_q\) denote the finite field of order q and \({{\mathbb {F}}}_q[x]\) be the set of all univariate polynomials with coefficients in \({\mathbb {F}}_q\). For \(g,f \in {\mathbb {Z}}_{> 0}\), denote \([f] = \{0,1, \dots , f-1\}\) and \([g,f]=\{g,g+1,\dots ,f-1\}\).

As usual, an \([n,k,d]_q\) code is a linear code over \({\mathbb {F}}_q\) of length n, dimension k and minimum (Hamming) distance d. The (Hamming) weight \({{\,\textrm{wt}\,}}(\varvec{x})\) of a vector \(\varvec{x}\in {\mathbb {F}}_q^n\) equals its number of non-zero entries.

In order to simplify notation, we sometimes identify \(x\in {\mathbb {F}}_q\) with the number of field elements smaller than x, that is, with the integer \(q-|\{ y\in {\mathbb {F}}_q\mid y\ge x\}|\). The meaning of x will be clear from the context. Figure 1 depicts the two representations that are equivalent in this sense. Finally, we denote the q-ary entropy function by \(h_q\), that is

$$\begin{aligned} h_q(0)=0, \quad h_q(1)=\log _q(q-1), \quad \text{ and } \quad h_q(x)= x\log _q(q-1) -x\log _q(x) -(1-x)\log _q(1-x) \text{ for } 0< x < 1. \end{aligned}$$
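For numerical evaluations, \(h_q\) can be computed directly from this definition; the following short Python function is a minimal sketch (assuming a real argument \(0\le x\le 1\) and an integer \(q\ge 2\)).

```python
import math

def h_q(x: float, q: int) -> float:
    """q-ary entropy function h_q(x) for 0 <= x <= 1 and q >= 2."""
    if x == 0:
        return 0.0
    if x == 1:
        return math.log(q - 1, q)
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))
```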

2.2 Definitions

2.2.1 Defect and partially defect cells

A cell is called defect (stuck-at level s) if it can only store the value s. A cell is called partially defect (partially-stuck-at level s) if it can only store values which are at least s. Note that a cell that is partially defect at level 0 is a non-defect cell which can store any of the q levels, and a cell that is partially defect at level \(q-1\) is a (fully) defect cell.

2.2.2 \((n,M)_q\) \((\varSigma , t)\) PSMC

For \(\varSigma \subset {\mathbb {F}}^n_q\) and non-negative integer t, a q-ary (\(\varSigma \), t)-partially-stuck-at-masking code \(\mathcal {C}\) of length n and size M is a coding scheme consisting of a message set \({\mathcal {M}}\) of size M, an encoder \({\mathcal {E}}\) and a decoder \({\mathcal {D}}\) satisfying:

1.

    The encoder \({\mathcal {E}}\) is a mapping from \({\mathcal {M}}\times \varSigma \) to \({\mathbb {F}}_q^n\) such that

    $$\begin{aligned} \text{ for } \text{ each } (\varvec{m},\varvec{s})\in {\mathcal {M}} \times \varSigma , \quad {\mathcal {E}}(\varvec{m},\varvec{s}) \ge \varvec{s}, \end{aligned}$$
2.

    For each \((\varvec{m},\varvec{s})\in {\mathcal {M}}\times \varSigma \) and each \(\varvec{e}\in {\mathbb {F}}_q^n\) such that

    $$\begin{aligned} {{\,\textrm{wt}\,}}({\varvec{e}})\le t \text{ and } {\mathcal {E}}(\varvec{m},\varvec{s})+\varvec{e}\ge \varvec{s}, \end{aligned}$$

    it holds that

    $$\begin{aligned} {\mathcal {D}}({\mathcal {E}}(\varvec{m},\varvec{s}) + \varvec{e}) = \varvec{m}. \end{aligned}$$

2.2.3 \((n,M)_q (u,1,t)\) PSMC

A q-ary (u, 1, t) PSMC of length n and cardinality M is a q-ary \((\varSigma ,t)\) PSMC of length n and size M where

$$\begin{aligned} \varSigma = \{ \varvec{s}\in \ \{0,1\}^n \mid {{\,\textrm{wt}\,}}(\varvec{s}) \le u \}. \end{aligned}$$

In this special case, the partially stuck-at condition means that the output of the encoder is non-zero at each position of the support \(\varvec{\phi }\) of \(\varvec{s}\).
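To make the two defining conditions concrete, the following Python sketch checks them exhaustively for a given encoder/decoder pair, with cell levels represented as integers modulo q (one of the two representations of Figure 1). The function signature and the brute-force search are illustrative assumptions, feasible only for very small parameters, and are not part of the formal definition.

```python
from itertools import combinations, product
from typing import Callable, Sequence

def satisfies_psmc_conditions(encode: Callable, decode: Callable,
                              messages: Sequence, sigma: Sequence,
                              q: int, n: int, t: int) -> bool:
    """Exhaustively check conditions 1 and 2 of the (Sigma, t) PSMC definition
    for a given encoder/decoder pair (only feasible for tiny parameters)."""
    for m in messages:
        for s in sigma:
            c = encode(m, s)
            # Condition 1: the encoder output respects the partially-stuck levels.
            if any(c[i] < s[i] for i in range(n)):
                return False
            # Condition 2: decoding succeeds for every error of weight <= t
            # whose result still respects the partially-stuck levels.
            for wt in range(t + 1):
                for pos in combinations(range(n), wt):
                    for vals in product(range(1, q), repeat=wt):
                        y = list(c)
                        for i, v in zip(pos, vals):
                            y[i] = (y[i] + v) % q
                        if all(y[i] >= s[i] for i in range(n)) and decode(y) != m:
                            return False
    return True
```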

3 Code construction for masking up to \(q-1\) partially stuck-at-1 cells

3.1 Code construction

In this section, we present a coding scheme over \({\mathbb {F}}_q\) that can mask up to \(q-1\) partially stuck-at-1 cells and additionally correct errors. We adapt the construction from [29], which allows masking up to \(q-1\) partially stuck-at-1 cells (\(s_i =1\) for all i) with only a single redundancy symbol, but cannot correct any substitution errors.

Construction 1

Assume that there is an \([n, k, d]_q\) code \(\mathcal {C}\) with a \(k \times n\) generator matrix of the form

$$\begin{aligned} \varvec{G} = \begin{bmatrix} \varvec{G}_1\\ \varvec{G}_0 \end{bmatrix} =\begin{bmatrix} \varvec{0}_{(k-1) \times 1} & \varvec{I}_{k-1} & \varvec{P}_{(k-1) \times (n-k)} \\ 1 & \varvec{1}_{k-1} & \varvec{1}_{n-k} \end{bmatrix}, \end{aligned}$$

where \(\varvec{I}_{k-1}\) is the \((k-1) \times (k-1)\) identity matrix, \(\varvec{P} \in {\mathbb {F}}^{(k-1)\times (n-k)}_q\), and \(\varvec{1}_{\ell }\) is the all-one vector of length \(\ell \). From the code \(\mathcal {C}\), a PSMC can be obtained, whose encoder and decoder are shown in Algorithm 1 and Algorithm 2.

Theorem 1

The coding scheme in Construction 1 is a (\(q-1,1,\lfloor \frac{d-1}{2}\rfloor \)) PSMC of length n and cardinality \(q^{k-1}\).

Algorithm 1: Encoding

Algorithm 2: Decoding

Proof

To mask the partially stuck-at-1 positions, the codeword has to fulfill:

$$\begin{aligned} c_i\ge 1, \text{ for } \text{ all } i \in \varvec{\phi }. \end{aligned}$$
(1)

Since \(\mid \varvec{\phi }\mid \;< q\), there is at least one value \(v \in {\mathbb {F}}_q\) such that \(w_i \not = v\), \( \text{ for } \text{ all } i \in \varvec{\phi }\). Thus, \({c}_i=(w_i - v)\not = 0\) and (1) is satisfied.

The decoder (Algorithm 2) gets \(\varvec{y}\), which is \(\varvec{c}\) corrupted by at most \(\lfloor \frac{d-1}{2}\rfloor \) substitution errors. The decoder of \({\mathcal {C}}\) can correct these errors and obtain \(\varvec{c}\).

Due to the structure of \(\varvec{G}\), the first position of \(\varvec{c}\) equals \(-v\). Hence, we can compute \(\hat{\varvec{w}} = \varvec{w}\) (cf. Algorithm 2) and \({{\hat{\varvec{m}}}}=\varvec{m}\). \(\square \)
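The proof translates into a simple procedure. The following Python sketch illustrates the encoder and decoder of Construction 1 over \({\mathbb {Z}}/q{\mathbb {Z}}\) for a prime q; the generator rows `G1` (the upper part of \(\varvec{G}\)) and the black-box bounded-distance decoder `decode_to_codeword` are assumptions made for illustration and do not reproduce the exact steps of Algorithms 1 and 2.

```python
import numpy as np

def encode_construction1(m, phi, G1, q):
    """Construction 1 encoder (sketch): m in F_q^{k-1}, phi are the
    partially-stuck-at-1 positions (|phi| <= q-1), and G1 is the upper
    (k-1) x n part of G whose first column is zero."""
    w = (np.asarray(m) @ np.asarray(G1)) % q              # w_0 = 0
    forbidden = {int(w[i]) for i in phi}
    v = next(x for x in range(q) if x not in forbidden)   # exists since |phi| < q
    return (w - v) % q                                    # c_i != 0 for i in phi, c_0 = -v

def decode_construction1(y, G1, q, decode_to_codeword):
    """Construction 1 decoder (sketch): correct substitution errors with a
    bounded-distance decoder of C, recover v from the first position, undo
    the masking shift and read off the message from the systematic part."""
    c = np.asarray(decode_to_codeword(y)) % q
    v = int((-c[0]) % q)
    w = (c + v) % q
    k_minus_1 = np.asarray(G1).shape[0]
    return w[1:k_minus_1 + 1]     # G1 = [0 | I_{k-1} | P]
```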

Corollary 1

If there is an \([n,k,d]_q\) code containing a word of weight n, then there is a q-ary \((q-1,1,\lfloor \frac{d-1}{2}\rfloor )\) PSMC of length n and size \(q^{k-1}\).

3.2 Comparison to the conventional stuck-cell scenario

Theorem 1 combines [12, Theorem 1] and [29, Theorem 4] to provide a code construction that can mask partially stuck cells and correct errors. The required redundancy is a single symbol for masking plus the redundancy for the code generated by the upper part of \(\varvec{G}\), needed for the error correction. In comparison, [12, Theorem 1] requires at least

$$\begin{aligned} \min \{ n-k \, : \, \exists \, [n,k,d]_q \text { code with } d > u\} \ge u \end{aligned}$$

redundancy symbols to mask u stuck cells, where the inequality follows directly from the Singleton bound.

Throughout this work, we use q-ary cyclic codes, in particular BCH codes, to describe our coding methods and to show some examples. In the following, we present Tables 1 and 2 to compare ternary cyclic codes of length \(n= 8\) for masking partially stuck cells to masking stuck cells [12], both with error correction. Here, the defining set, denoted by \(D_c\), is the set of indices of zeros of the code, that is, \(\varvec{c}\) is a codeword if and only if \(\varvec{c}(\alpha ^b)=0\) for all b in \(D_c\), where \(\alpha \) is a primitive \(n\)-th root of unity in an extension field of \({\mathbb {F}}_q\). These code classes are practical and allow for explicit constructions. The minimum distance d can be bounded from below by the BCH bound, which is the number of consecutive elements in \(D_c\) plus one, or by more involved bounds such as the Hartmann-Tzeng bound [11] or the Roos bound [21].

The tables show that masking partially stuck cells requires less redundancy than masking stuck cells, both with and without additional error correction. The reason is that there is only one forbidden value in each partially stuck-at-1 cell, while there are \(q-1\) forbidden values in each stuck-at cell.

3.3 Remarks on construction 1

Remark 1

The special case of Theorem 1 with \(n<q\) was used in [24] for constructing a \((q-1)\)-ary error-correcting code from a q-ary Reed-Solomon code, which can be of interest if \(q-1\) is not a prime power.

Remark 2

The code constructions in Theorem 1 also work over the ring of integers modulo q (\({\mathbb {Z}}/q{\mathbb {Z}}\)) in which q is not necessarily a prime power, similar to the construction for \(u< q\) in [29].

Remark 3

According to [29, Construction 3], it is possible to further decrease the required redundancy for masking u partially stuck-at-1 cells to \(1-\log _q\lfloor \frac{q}{u+1}\rfloor \). We can use the same strategy here. Let \(z= \lfloor \frac{q}{u+1} \rfloor \). We choose disjoint sets \(A_1,A_2,\dots ,A_z\) of size \(u+1\) in \({\mathbb {F}}_q\). As additional information, the encoder picks \(j \in \{1,2,\dots ,z\}\) and, in Step 2 of Algorithm 1, selects v from \(A_j\). Since the decoder recovers v, it can obtain j as well.
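A minimal Python sketch of this idea, with cell levels represented as integers modulo q and the blocks \(A_j\) chosen (as an assumption for illustration) to be consecutive integer ranges, is given below.

```python
def pick_v_with_side_info(j, w, phi, q, u):
    """Pick the masking value v from the block A_j = {(j-1)(u+1), ..., j(u+1)-1}.
    Since |A_j| = u + 1 > |phi|, some value of A_j differs from all w_i, i in phi."""
    forbidden = {w[i] for i in phi}
    return next(x for x in range((j - 1) * (u + 1), j * (u + 1)) if x not in forbidden)

def recover_side_info(v, u):
    """The decoder recovers j from v because the blocks A_j are disjoint."""
    return v // (u + 1) + 1
```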

Table 1 Ternary Codes for Partially Stuck-at-1 Memory Cell for \(n=8\)
Table 2 Ternary Codes for Stuck-at Memory [12] for \(n=8\)

4 Code constructions for masking more than \(q- 1\) partially stuck-at-1 cells

The masking technique in the previous section only guarantees successful masking of up to \(q-1\) partially stuck-at-1 cells. In this section, we present two code constructions for simultaneous masking and error correction when \(q \le u < n\). One is based on the masking-only construction in [29, Construction 4] and the other on [29, Section VI]; both are able to mask \(u\ge q\) partially stuck positions, but cannot correct any errors. We generalize these constructions to be able to cope with errors. The latter construction may lead to larger code dimensions for a given pair (u, t), in a similar fashion as [29, Construction 5] improves upon [29, Construction 4]. Further, for \(t=0\) it achieves larger code sizes than [29, Construction 5] if the all-one word is in the code.

4.1 Code construction over \({\mathbb {F}}_q\) for masking up to \( q+d_0-3\) partially stuck cells

We recall that [29, Construction 4] can mask more than \(q-1\) partially stuck-at-1 cells and is a generalization of the all-one vector construction [29, Theorem 4]. Hence, replacing the \(\varvec{1}_n\) vector in Theorem 1 by a parity-check matrix as in [29, Construction 4] allows masking of q or more partially stuck-at-1 cells and correcting t errors.

Construction 2

Suppose that there is an \([n, k, d ]_q\) code \(\mathcal {C}\) with a \(k \times n\) generator matrix of the following form:

$$\begin{aligned} \varvec{G} = \begin{bmatrix} \varvec{G}_{1} \\ \varvec{H}_0\end{bmatrix} \end{aligned}$$

where \(\varvec{H}_0 \in {\mathbb {F}}_q^{l\times n}\) is a parity-check matrix of an \([n,n-l,d_0]_q\) code \(\mathcal {C}_0\). From the code \(\mathcal {C}\), a PSMC can be obtained, whose encoder and decoder are shown in Algorithm 3 and Algorithm 4.

Theorem 2

The coding scheme in Construction 2 is a (\(d_0+q-3,1,\lfloor \frac{d-1}{2}\rfloor \)) PSMC of length n and cardinality \(q^{k-l}\).

Algorithm 3: Encoding

Algorithm 4: Decoding

Proof

For the masking part, the proof is a simple modification of [29, Theorem 7]. In Sect. 5, we give a full proof of Proposition 2, which generalizes Construction 2. The error correction part of the proof follows the proof of Theorem 1. \(\square \)

The gain of Theorem 2 in the number of partially stuck cells that can be masked comes at the cost of larger redundancy. However, the redundancy is still smaller than that of the construction for masking stuck-at cells and correcting errors in [12]. In particular, let \(\mathcal {C}\) be an \([n,k,d\ge 2t+1]_q\) code containing an \([n,l]_q\) subcode \(\mathcal {C}_0\) for which \(\mathcal {C}_0^{\perp }\) has minimum distance \(d_0\). With Theorem 2, we obtain a \((d_0+q-3,1,\lfloor \frac{d-1}{2}\rfloor )\) PSMC of length n and cardinality \(q^{k-l}\). The construction in [12] yields a coding scheme of equal cardinality that masks up to \(d_0-1\) fully stuck cells and corrects \(\lfloor \frac{d-1}{2}\rfloor \) errors, since the minimum distance \(d_1\) defined in [12] for the error-correction capability of the code is d in our notation. Hence, exactly \(q-2\) more partially-stuck-at-1 cells than classical stuck cells can be masked.

Example 1

We apply Construction 2 to mask up to \(u = 4\) partially stuck cells over \({\mathbb {F}}_4\) with \(\varvec{m}\in {\mathbb {F}}_{4}^9\). Let \(\alpha \) be a primitive element in \({\mathbb {F}}_{16}\) and let \(\mathcal {C}\) be the \([15,12,3]_4\) code with zeros \(\alpha ^0\) and \(\alpha ^1\). Let the [15, 3] subcode \(\mathcal {C}_0\) of \(\mathcal {C}\) be the BCH code with zeros \(\{ \alpha ^i\mid 0\le i\le 14\}{\setminus }\{ \alpha ^5,\alpha ^6,\alpha ^9\}\). As \(\mathcal {C}_0^{\perp }\) is equivalent to the \([15,12,3]_4\) code with zeros \(\alpha ^5,\alpha ^6,\alpha ^9\), it has minimum distance \(d_0=3\). Hence, we obtain a (4, 1, 1) PSMC of cardinality \(4^{9}\). \(\square \)

4.2 Code construction over \({\mathbb {F}}_{2^\mu }\) for masking up to \(2^{\mu -1}(d_0+1)-1\) partially stuck cells

We generalize [29, Section VI] to be able to cope with errors. Unlike [29, Section VI], which can be applied for any prime power q, the following code construction works over a finite field \({\mathbb {F}}_{q}\) with \(q= 2^{\mu }\), i.e., it describes a \(2^{\mu }\)-ary partially-stuck-cell code construction. The reason is that the binary subfield subcodes required in this construction do not form linear subspaces for codes over arbitrary prime powers q. We denote by \(\beta _0=1,\beta _1,\dots ,\beta _{\mu -1}\) a basis of \({\mathbb {F}}_{2^\mu }\) over \({\mathbb {F}}_{2}\). That is, any element \(a \in {\mathbb {F}}_{2^\mu }\) can be uniquely represented as \(a=\sum _{i=0}^{\mu -1} a_i \beta _i\) where \(a_i \in {\mathbb {F}}_2\) for all i. In particular, \(a \in {\mathbb {F}}_2\) if and only if \(a_1=\dots =a_{\mu -1}=0\). This is a crucial property of \({\mathbb {F}}_{2^\mu }\) that we will use in Construction 3.
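The following small Python sketch illustrates this representation for \({\mathbb {F}}_{2^\mu }\) elements stored as \(\mu \)-bit integers with respect to a polynomial basis \(\beta _i=\alpha ^i\) (an assumed, common choice): addition is coordinate-wise XOR, and an element lies in \({\mathbb {F}}_2\) exactly when all higher coordinates vanish.

```python
MU = 2   # F_{2^2} = F_4; elements are stored as MU-bit integers w.r.t. beta_i = alpha^i

def add(a: int, b: int) -> int:
    """Addition in F_{2^mu} is coordinate-wise over F_2, i.e. bitwise XOR."""
    return a ^ b

def in_subfield_f2(a: int) -> bool:
    """a lies in F_2 iff a_1 = ... = a_{mu-1} = 0, i.e. only the beta_0-coordinate may be nonzero."""
    return a < 2

# Example in F_4: alpha + (alpha + 1) = 1, which lies in F_2.
assert add(0b10, 0b11) == 0b01 and in_subfield_f2(add(0b10, 0b11))
```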

Construction 3

Let \(\mu >1\). Suppose \(\varvec{G}\) is a \(k\times n\) generator matrix of an \([n,k,d]\) \(_{2^\mu }\) code \(\mathcal {C}\) of the form

$$\begin{aligned} \varvec{G}=\left[ \begin{array}{c} \varvec{H}_0 \\ \varvec{G}_1 \\ \varvec{x} \end{array} \right] \end{aligned}$$
(2)

where

1.

    \(\varvec{H}_0\in {\mathbb {F}}_2^{l\times n}\) is a parity check matrix of an \([n,n-l,d_0]\) \(_{2}\) code \(\mathcal {C}_0\),

2.

    \(\varvec{G}_1 \in {\mathbb {F}}_{2^\mu }^{(k-l-1)\times n}\),

3.

    \(\varvec{x}\in {\mathbb {F}}^{1 \times n}_{2^\mu }\) has Hamming weight n.

From the code \(\mathcal {C}\), a PSMC can be obtained, whose encoder and decoder are shown in Algorithm 5 and Algorithm 6.

Algorithm 5: Encoding (\(\varvec{m}; \varvec{m^\prime }; \varvec{\phi }\))

Algorithm 6: Decoding

Theorem 3

The coding scheme in Construction 3 is a \(2^{\mu }\)-ary \((2^{\mu -1}d_0-1,1,\lfloor \frac{d-1}{2}\rfloor )\) PSMC of length n and cardinality \(2^{\mu (k-l-1)}2^{l(\mu -1)}\).

Proof

Let \(\varvec{\phi }\subset [n]\) have size \(u\le 2^{\mu -1}d_0-1\).

We first show the existence of z from Step 1. For each \(i\in \varvec{\phi }\), we have that \(\varvec{x}_i\ne 0\), so there are exactly two elements \(z\in {\mathbb {F}}_{2^{\mu }}\) such that \((\varvec{m}' \cdot \varvec{H}_0 + \varvec{m}\cdot \varvec{G}_1)_i + z\varvec{x}_i\in {\mathbb {F}}_2\). As a result,

$$\begin{aligned} 2u= 2|\varvec{\phi }| = \mid \{(i,z)\in \varvec{\phi }\times {\mathbb {F}}_{2^{\mu }} \mid (\varvec{m}' \cdot \varvec{H}_0 + \varvec{m}\cdot \varvec{G}_1)_i + z\varvec{x}_i \in {\mathbb {F}}_2\} \mid . \end{aligned}$$

Since there are \(2^{\mu }\) choices for z and \(2u < 2^{\mu }d_0\), averaging over z shows that there is a \(z\in {\mathbb {F}}_{2^{\mu }}\) for which at most \(d_0-1\) positions \(i\in \varvec{\phi }\) satisfy \((\varvec{m}' \cdot \varvec{H}_0 + \varvec{m}\cdot \varvec{G}_1)_i + z\varvec{x}_i \in {\mathbb {F}}_2\), i.e., the condition in Step 1 is satisfied.

As \(\varvec{H}_0\) is the parity check matrix of a code with minimum distance \(d_0\), any \(d_0-1\) columns of \(\varvec{H}_0\) are independent, so an appropriate \(\varvec{\gamma }\) exists. Now we show that \(c_i\ne 0\) for all \(i\in \varvec{\phi }\). Indeed, if \(\varvec{w}_i\not \in {\mathbb {F}}_2\), then \(\varvec{c}_i= \varvec{w}_i + ({\varvec{\gamma }}\varvec{H}_0)_i \in \{\varvec{w}_i,\varvec{w}_i+1\}\), so \(\varvec{c}_i\not \in {\mathbb {F}}_2\). By Step 2 in Algorithm 5, for \(\varvec{w}_i\in {\mathbb {F}}_2\), we have that \(\varvec{c}_i=1\). Hence, for all \(i\in \varvec{\phi }\), \(\varvec{c}_i\) is either 1 or is in \({\mathbb {F}}_{2^\mu }\setminus {\mathbb {F}}_2\), i.e., \(c_i\ne 0\).

Decoding: As \(\varvec{c}\in \mathcal {C}\), \(\hat{\varvec{c}}=\varvec{c}\). As \(\varvec{G}\) has full rank, and

$$\begin{aligned} \varvec{c}= (\varvec{m}'+\varvec{\gamma }) \varvec{H}_0 + \varvec{m}\varvec{G}_1 + z\varvec{x}, \end{aligned}$$

it holds that \(\varvec{a}=\hat{\varvec{m}'} + \hat{\varvec{\gamma }}\), \({\hat{\varvec{m}}}=\varvec{m}\) and \({\hat{z}}=z\). As \(\hat{\varvec{m}'}\in {\mathcal {F}}^{l}\), where \({\mathcal {F}}\subset {\mathbb {F}}_{2^\mu }\) denotes a set of coset representatives of \({\mathbb {F}}_2\) in \({\mathbb {F}}_{2^\mu }\) used by the encoder in Algorithm 5, and \(\hat{\varvec{\gamma }}\in {\mathbb {F}}_2^{l}\), we can retrieve \(\hat{\varvec{m}'}=\varvec{m}'\) from \(\varvec{a} =\hat{\varvec{m}'} + \hat{\varvec{\gamma }}\). \(\square \)

We next show two minor extensions of Theorem 3 for the special case that \(\varvec{x}\) is the all-one vector.

Proposition 1

If \(\varvec{x}\) is the all-one vector in Theorem 3, then the coding scheme in Construction 3 can be modified to produce a \(2^{\mu }\)-ary \((2^{\mu -1}d_0-1,1,\lfloor \frac{d-1}{2}\rfloor )\) PSMC of length n and cardinality \(2\times 2^{\mu (k-l-1)}2^{l(\mu -1)}\).

Proof

For \(\varvec{x}=\varvec{1}\), if \(\varvec{m}' \varvec{H}_0+\varvec{m}\varvec{G}_1+z\varvec{1}\) has at most \(d_0-1\) binary entries, then so does \(\varvec{m}' \varvec{H}_0+\varvec{m}\varvec{G}_1 +(z+1)\varvec{1}\). Hence, there is a \(z_0\in {\mathcal {F}}\) such that \(\varvec{w}+z_0\varvec{1}\) has at most \(d_0-1\) binary entries, and we can encode

$$\begin{aligned} \varvec{w}=\varvec{m}' \varvec{H}_0 + \varvec{m}\varvec{G}_1 + (z_0+\zeta )\varvec{1}, \end{aligned}$$

where \(\zeta \in \{0,1\}\) is an additional message bit so that the cardinality from Theorem 3 is doubled. As \(z_0\in {\mathcal {F}}\) and \(\zeta \in \{0,1\}\), the pair \((z_0,\zeta )\) can be retrieved from \(z_0+\zeta \).\(\square \)

Construction 3.A

(Extension of Construction 3) Let \(\varvec{G}\) be a \(k\times n\) generator matrix of an \([n,k,d]\) \(_{2^\mu }\) code \(\mathcal {C}\) of the form

$$\begin{aligned} \varvec{G}=\left[ \begin{array}{c} \varvec{H}_0 \\ \varvec{G}_1 \\ \varvec{1} \end{array} \right] \text{, } \text{ where } \end{aligned}$$
1)

    \(\varvec{1}\) is the all-one vector of length n

2)

\(\varvec{G}_1\in {\mathbb {F}}_q^{(k-l-1)\times n}\)

3)

    \( \begin{bmatrix} \varvec{H}_0 \\ \varvec{1} \end{bmatrix}\) is the parity-check matrix of an \([n,n-l-1,d_e]_2\) code.

Theorem 4.A

If the conditions of Construction 3.A hold, then Construction 3 can be modified to produce a \(2^{\mu }\)-ary \((2^{\mu -1}d_e-1,1,\lfloor \frac{d-1}{2}\rfloor )\) PSMC of length n and cardinality \(2^{\mu (k-l-1)}2^{l(\mu -1)}\).

Proof

In Step 1 of Algorithm 5, the encoder determines z such that \(\mid \{i\in \varvec{\phi } \mid w_i\in {\mathbb {F}}_2\}\mid \;\;\le d_e-1\); the existence of such a z is proved as in the proof of Theorem 3. Next, the encoder determines \(\varvec{\gamma }\in \{0,1\}^{l}\) and \(\gamma _0\in \{0,1\}\) such that

$$\begin{aligned} \varvec{v}= ( \varvec{\gamma },\gamma _0) \cdot \begin{bmatrix} \varvec{H}_0 \\ \varvec{1} \end{bmatrix} \end{aligned}$$

is such that \(v_i=1-w_i\) for all \(i\in \varvec{\phi }\) for which \(w_i\in \{0,1\}\). The encoding output \(\varvec{c}=\varvec{v}+\varvec{w}= (\varvec{m}'+\varvec{\gamma })\varvec{H}_0 + \varvec{m}\varvec{G}_1 + (z+\gamma _0)\varvec{1}\) thus is in \(\mathcal {C}\) and has no zeros in the positions of \(\varvec{\phi }\).

In decoding, from \(\varvec{c}\) both \((\varvec{m}'+\varvec{\gamma })\) and \(\varvec{m}\) can be retrieved, and so, as \(\varvec{m}'\in {\mathcal {F}}^{l}\) and \(\varvec{\gamma }\in \{0,1\}^{l}\), \(\varvec{m}'\) can be retrieved as well.\(\square \)

Proposition 1 doubles the size of the PSMC as compared to Theorem 3 (by using \(\zeta \) as an additional message bit), while masking the same number of partially stuck-at cells and correcting the same number of substitution errors. Theorem 4.A, as compared to Theorem 3, results in a PSMC of the same size and error-correction capability, but increases the number of cells that can be masked from \(2^{\mu -1}d_0-1\) to \(2^{\mu -1}d_e-1\). If \(d_0\) is odd, then this increment is at least \(2^{\mu -1}\).

Now, we show an example using nested BCH codes, allowing to store more symbols compared to Theorem 3 for the same code parameters.

Example 2

Let \(\alpha \) be a primitive \(15^{th}\) root of unity in \({\mathbb {F}}_{16}\), and let \(\mathcal {C}\) be the \([15,12,3]_4\) BCH code with zeros \(\alpha ^5\), \(\alpha ^6\) and \(\alpha ^9\). Let the \([15,4]_2\) subcode \(\mathcal {C}^\perp _0\) of \(\mathcal {C}\) be defined as

$$\begin{aligned} \mathcal {C}^\perp _0 = \bigg \{(x_0,\ldots ,x_{14})\in {\mathbb {F}}_2^{15}\; \Big |\; \sum _{i=0}^{14} x_i\alpha ^{ij}=0 \text{ for } j\in \{0,\ldots ,14\} \setminus \{7,11,13,14\}\bigg \}. \end{aligned}$$

As \(\varvec{1}\in \mathcal {C}\backslash \mathcal {C}^\perp _0\), the code \(\mathcal {C}\) has a generator matrix of the form given in Construction 3 with \(\varvec{x}=\varvec{1}\), where \(\varvec{H}_0\) is a generator matrix for \(\mathcal {C}^\perp _0\) and \(\varvec{G}_1\) has \(12-4-1=7\) rows. The code \(\mathcal {C}_0=(\mathcal {C}^{\perp }_0)^{\perp }\) is equivalent to the \([15,11,3]_2\) Hamming code, so \(d_0=3\).

We stipulate that \(\alpha ^4=\alpha +1\) to obtain an explicit generator matrix \(\varvec{G}^\prime \), where \({\mathbb {F}}_4\) has elements \(\{0,1,\omega ,\omega ^2\}\) with \(\omega =\alpha ^5\). Note that the top row of \(\varvec{H}_0\) corresponds to the generator polynomial of \(\mathcal {C}_0^{\perp }\), and the top row of \(\varvec{G}_1\) corresponds to the coefficients of \((x+\alpha ^5)(x+\alpha ^6)(x+\alpha ^9)\), which is the generator polynomial of \(\mathcal {C}\). Application of Proposition 1 yields a (5, 1, 1) PSMC over \({\mathbb {F}}_{2^2}\) of length 15 and size \(2 \times 4^{7}2^{4}= 2^{19}\), whereas application of Construction 3.A gives a (7, 1, 1) PSMC over \({\mathbb {F}}_{2^2}\) with cardinality \(2^{2(7+4)-4}=2^{18}\). Note that application of Construction 3 yields a (5, 1, 1) PSMC over \({\mathbb {F}}_{2^2}\) of length 15 and size \(4^72^4=2^{18}\).

Finally, we note that application of Theorem 2 to \(\mathcal {C}\) yields a (4, 1, 1) PSMC of size \(4^{8}\), which has worse parameters than the three PSMCs mentioned before. \(\square \)

Example 2 clearly shows that for the same code parameters, Construction 3, Proposition 1 and Construction 3.A significantly improve upon Construction 2.

Remark 4

For masking only, choose \(n-k =0\) in Construction 3 and therefore,

$$\begin{aligned} \varvec{G}_1= \begin{bmatrix}\varvec{0}_{(n-l-1) \times (l+1)}&\varvec{I}_{(n-l-1)} \end{bmatrix}, \end{aligned}$$

and we can store \(n-l-1\) information symbols. Thus, Proposition 1 for masking only improves upon [29, Construction 5]. For example, if \(l=4\), then \(n-l-1= 10\) in [29, Example 7] and the size of the code is \(2^{2(n-l-1)}\cdot 2^l= 2^{24} \), while \(n-l-1= 10\) in Proposition 1 for \(\varvec{x}=\varvec{1}\) and the cardinality is \(2\cdot 2^{2n-l-2} = 2^{25}\).

We summarize in Table 3 our constructions and compare them with some of the previous works, namely with the construction for masking classical stuck cells in [12] and constructions for partially stuck cells without errors in [29]. Table 3 clearly shows that more information can be stored with partially stuck-at errors than with classical stuck cells.

Table 3 Comparison among [12, 29] and this work. We denote by d the minimum distance required to correct errors and by \(d_0\) the minimum distance required to mask (partially) stuck cells. The positive integer \(\mu > 1\) is defined in Construction 3. Other notation: see Sect. 2.1

5 Generalization of the constructions to arbitrary partially defective levels

So far, we have considered the important case \(s_i =1\) for all \(i \in \varvec{\phi }\). In this section, we present error-correcting and masking code constructions that can mask partially stuck cells at any levels \(\varvec{s} \in {\mathbb {F}}^n_q\) with \({{\,\textrm{wt}\,}}(\varvec{s}) \le |\,\varvec{\phi }\,|\) and additionally correct errors.

We fix throughout the paper a total ordering “\(\ge \)” of the elements of \({\mathbb {F}}_q\) such that \(a \ge 1 \ge 0\) for all \(a \in {\mathbb {F}}_q{\setminus } \{0\}\). So 0 is the smallest element in \({\mathbb {F}}_q\), and 1 is the next smallest element in \({\mathbb {F}}_q\). We extend the ordering on \({\mathbb {F}}_q\) to \({\mathbb {F}}_q^n\): for \(\varvec{x}=(x_0,\ldots ,x_{n-1})\in {\mathbb {F}}_q^n\) and \(\varvec{y}=(y_0,\ldots , y_{n-1})\in {\mathbb {F}}_q^n\), we say that \(\varvec{x}\ge \varvec{y}\) if and only if \(x_i\ge y_i\) for all \(i\in [n]\).

5.1 Generalization of Theorem 1

Here, we give only the main theorem without stating the exact encoding and decoding processes, because they follow directly from Construction 1.

Theorem 4

(Generalization of Theorem 1) Let \(\varSigma = \{ \varvec{s}\in {\mathbb {F}}_q^n \mid \sum _{i=0}^{n-1} s_i\le q-1\}\). Assume there is an \([n, k, d]_q\) code \(\mathcal {C}\) with a generator matrix as specified in Construction 1. Then there exists a \((\varSigma ,\lfloor \frac{d-1}{2}\rfloor )\) PSMC over \({\mathbb {F}}_q\) of length n and cardinality \(q^{k-1}\).

Proof

We follow the generalization to masking arbitrary partially-stuck-at levels in [29, Theorem 10]. Hence, for \(\varvec{s} \in \varSigma \), we modify Step 2 in Algorithm 1 such that \(w_i-v\ge s_i\) for all \(i\in [n]\). Such a v exists since each cell partially stuck at level \(s_i\) excludes \(s_i\) values for v, and \(\sum _{i=0}^{n-1} s_i < q\). The rest of the encoding steps and the decoding process are analogous to Algorithms 1 and 2. As the output of the encoding process is a codeword, we can correct \(\lfloor \frac{d-1}{2}\rfloor \) errors.\(\square \)
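A minimal Python sketch of the modified Step 2, again over \({\mathbb {Z}}/q{\mathbb {Z}}\) with the natural integer ordering, is given below; the representation of field elements as integers is an assumption made for illustration.

```python
def choose_masking_value(w, s, q):
    """Modified Step 2: find v such that (w_i - v) mod q >= s_i for all i.
    Each cell with level s_i forbids exactly s_i values of v, so a valid v
    exists whenever sum(s) <= q - 1."""
    for v in range(q):
        if all((w[i] - v) % q >= s[i] for i in range(len(w))):
            return v
    raise ValueError("no valid v: the partially-stuck levels sum to more than q - 1")
```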

5.2 Generalization of construction 2

In the following, we generalize Construction 2 to arbitrary partially-stuck-at levels \(\varvec{s}\).

Proposition 2

Let

$$\begin{aligned} \varSigma = \left\{ \varvec{s}\in {\mathbb {F}}_q^n \; \Big | \; \min \bigg \{ \sum _{i\in \varPsi } s_i \; \big | \; \varPsi \subseteq [n], \mid \! \varPsi \! \mid =n-d_0+2\bigg \} \le q-1 \right\} \end{aligned}$$

then the coding scheme in Construction 2 can be modified to produce a \((\varSigma ,\lfloor \frac{d-1}{2}\rfloor )\) PSMC of length n and size \(q^{k-l}\).

Proof

To avoid cumbersome notation, we assume without loss of generality that \(s_0\ge s_1 \ge \cdots \ge s_{n-1}\). As the \(d_0-2\) leftmost columns of \(\varvec{H}_0\) are independent, there is an invertible \(\varvec{T}\in {\mathbb {F}}_q^{l\times l}\) such that the matrix \(\varvec{Y}=\varvec{T}\varvec{H}_0\) has the form

$$\begin{aligned} \varvec{Y}= \begin{bmatrix} I_{d_0-2} & \varvec{A}\\ \varvec{0} & \varvec{B}\end{bmatrix}, \end{aligned}$$

where \(I_{d_0-2}\) is the identity matrix of size \(d_0-2\), \(\varvec{0}\) denotes the \((l-d_0+2)\times (d_0-2)\) all-zero matrix, \(\varvec{A}\in {\mathbb {F}}_q^{(d_0-2)\times (n-d_0+2)}\) and \(\varvec{B}\in {\mathbb {F}}_q^{(l-d_0+2)\times (n-d_0+2)}\). As \(\varvec{T}\) is invertible, and any \(d_0-1\) columns of \(\varvec{H}_0\) are independent, any \(d_0-1\) columns of \(\varvec{Y}\) are independent as well.

For \(0\le i\le l-1\), we define

$$\begin{aligned} L_i = \{ j \in [n] \mid Y_{i,j}\ne 0 \text{ and } Y_{m,j}=0 \text{ for } m>i\}. \end{aligned}$$
(3)

Clearly, \(L_0,\ldots , L_{l-1}\) are pairwise disjoint. Moreover, for each \(j\in \{d_0-2,d_0-1,\ldots ,n-1\}\), column j of \(\varvec{Y}\) is independent from the \((d_0-2)\) leftmost columns of \(\varvec{Y}\), and so there is an \(i\ge d_0-2\) such that \(Y_{i,j}\ne 0\). Consequently,

$$\begin{aligned} \bigcup _{i=d_0-2}^{l-1} L_i = \{d_0-2,\ldots , n-1\}. \end{aligned}$$
(4)

By combining (4) and the form of \(\varvec{Y}\), we infer that

$$\begin{aligned} L_k=\{k\} \text{ for } \text{ all } k\in [d_0-2]. \end{aligned}$$
(5)

Let \(\varvec{w}\in {\mathbb {F}}_q^n\) be the vector to be masked, i.e. the vector after Step 1 in Algorithm 3. The encoder successively determines the coefficients \(z_0,\ldots ,z_{l-1}\) of \(\varvec{z}\in {\mathbb {F}}_q^l\) such that \(\varvec{w}+\varvec{z}\varvec{Y}\ge \varvec{s}\), as follows.

For \(j\in [d_0-2]\), the encoder sets \(z_j=s_j-w_j\).

Now let \(d_0-2\le i \le l-1\) and assume that \(z_0, \ldots , z_{i-1}\) have been obtained such that

$$\begin{aligned} w_j + \sum _{k=0}^{i-1} z_k Y_{k,j} \ge s_j \text{ for } \text{ all } j\in \bigcup _{k=0}^{i-1} L_k. \end{aligned}$$
(6)

It follows from combination of (5) and the choice of \(z_0,\ldots , z_{d_0-3}\) that (6) is satisfied for \(i=d_0-2\).

For each \(j\in L_i\), we define \(F_j\) as

$$\begin{aligned} F_j = \bigg \{ x\in {\mathbb {F}}_q \; \Big | \; w_j + \sum _{k=0}^{i-1} z_k Y_{k,j} + xY_{i,j} < s_j \bigg \}. \end{aligned}$$

Clearly, \(|F_j|=s_j\) as \(Y_{i,j}\ne 0\), and so

$$\begin{aligned} \Big |\bigcup _{j\in L_i} F_j\Big | \le \sum _{j\in L_i} \big |F_j\big | = \sum _{j\in L_i} s_j \le \sum _{j=d_0-2}^{n-1} s_j \le q-1, \end{aligned}$$

where the last inequality follows from the assumption of \(\varSigma \) in the proposition statement and the ordering of the components of \(\varvec{s}\). Hence, \(\bigcup _{j\in L_i} F_j \ne {\mathbb {F}}_q\). The encoder chooses \(z_i\in {\mathbb {F}}_q\setminus \bigcup _{j\in L_i} F_j\). We claim that

$$\begin{aligned} w_j + \sum _{k=0}^i z_k Y_{k,j} \ge s_j \text{ for } \text{ all } j\in \bigcup _{k=0}^{i} L_k. \end{aligned}$$
(7)

For \(j\in L_i\), (7) follows from the definition of \(F_j\). For \(j\in \bigcup _{k=0}^{i-1} L_k\), (7) follows from (6) and the fact that \(Y_{i,j}=0\).

By using induction on i, we infer that

$$\begin{aligned} w_j + \sum _{k=0}^{l-1} z_k Y_{k,j} \ge s_j \text{ for } \text{ all } j\in \cup _{k=0}^{l-1}L_k = [n]. \end{aligned}$$
(8)

That is, with \(\varvec{z}=(z_0,\ldots ,z_{l-1})\), we have that \(\varvec{w}+\varvec{z}\varvec{Y}\ge \varvec{s}\). As \(\varvec{Y}=\varvec{T}\varvec{H}_0\), it follows that \(\tilde{\varvec{z}}:=\varvec{z}\varvec{T}\) satisfies

$$\begin{aligned} \varvec{w} + \tilde{\varvec{z}}\varvec{H}_0 \ge \varvec{s}. \end{aligned}$$

The decoding process remains as in Algorithm 4.\(\square \)

We give an alternative non-constructive proof for Proposition 2 in Appendix A.
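The encoding step in the proof of Proposition 2 is constructive. The following Python sketch (for a prime q, so that \({\mathbb {Z}}/q{\mathbb {Z}}\) is the field, and with the integer ordering) determines the coefficients \(z_0,\ldots ,z_{l-1}\) successively; the variable names are illustrative, and `Y` is assumed to already have the form required by the proof (after multiplication by \(\varvec{T}\)), with the coordinates sorted such that \(s_0\ge \cdots \ge s_{n-1}\).

```python
def mask_with_H0(w, s, Y, q, d0):
    """Successive choice of z_0, ..., z_{l-1} (proof of Proposition 2) such that
    w + z*Y >= s componentwise over Z/qZ. Assumes the s_i on the n - d0 + 2
    rightmost positions sum to at most q - 1."""
    l, n = len(Y), len(Y[0])
    # L_i: positions whose last nonzero entry in the corresponding column of Y is in row i.
    L = [[j for j in range(n)
          if Y[i][j] % q != 0 and all(Y[m][j] % q == 0 for m in range(i + 1, l))]
         for i in range(l)]
    z = [0] * l
    for j in range(d0 - 2):                  # set the d0 - 2 "largest" cells exactly to s_j
        z[j] = (s[j] - w[j]) % q
    for i in range(d0 - 2, l):
        partial = [sum(z[k] * Y[k][j] for k in range(i)) for j in range(n)]
        forbidden = {x for j in L[i] for x in range(q)
                     if (w[j] + partial[j] + x * Y[i][j]) % q < s[j]}
        z[i] = next(x for x in range(q) if x not in forbidden)
    return z        # the masking vector for H_0 is then z * T
```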

Remark 5

The proof of Proposition 2 shows that \((d_0-2)\) cells can be set to any desired value, while the remaining \((n-d_0+2)\) cells can be made to satisfy the partial stuck-at conditions, provided that the sum of the stuck-at levels in these (\(n-d_0+2\)) cells is less than q.

Example 3

Let \(\mathcal {C}\) be a \([13,10,3]_3\) code with a generator matrix \(\varvec{G}=\begin{bmatrix}\varvec{G}_1\\ \varvec{H}_0\end{bmatrix}\) as in Construction 2, and let \(\mathcal {C}_0\) be the code with parity-check matrix \(\varvec{H}_0\in {\mathbb {F}}_3^{3\times 13}\). Let \(\varvec{s}\in {\mathbb {F}}_3^{13}\) be the vector of partially-stuck-at levels with \(s_0=2\), \(s_5=s_6=1\), and \(s_i=0\) otherwise. Notations are as introduced in Construction 2 and Proposition 2.

Observe that \(\mathcal {C}_0\) has minimum distance \(d_0=3\). Since \(\varvec{H}_0\) already has the form of \(\varvec{Y}\) required in Proposition 2, we can take \(\varvec{T}\) to be the identity matrix and \(\varvec{Y}=\varvec{H}_0\).

For a random message vector \(\varvec{m} = \left( \begin{array}{rrrrrrr} 1&1&1&2&1&1&0 \end{array}\right) \in {\mathbb {F}}^{7}_3\),

$$\begin{aligned}\varvec{w} = \varvec{m} \varvec{G}_1 = \left( \begin{array}{rrrrrrrrrrrrr} 1&1&1&2&1&1&0&0&0&0&1&0&2 \end{array}\right) \in {\mathbb {F}}^{13}_3. \end{aligned}$$

To mask the \(d_0-2\) largest entries of the vector \(\varvec{s}\) (corresponding to the \((d_0-2)\) leftmost columns of \(\varvec{Y}\)), namely \(s_0=2\), the encoder sets \(z_j =s_j -w_j\) for \(j \in [d_0-2]\). Thus, \( z_0 = 2-1 = 1 \). Note that

$$\begin{aligned} \varvec{w} + \left( \begin{array}{rrrrrrrrrrrrrrrr} z_0&0&0 \end{array}\right) \varvec{H}_0= \left( \begin{array}{rrrrrrrrrrrrr} 2&1&2&0&1&2&0&1&1&1&1&1&0 \end{array}\right) . \end{aligned}$$

Next, the encoder determines \(z_1\) and \(z_2\) so as to mask in positions 5 and 6. As \(\varvec{H}_0\) has non-zero entries in the bottom row of columns 5 and 6, we have that \(L_1=\emptyset \) and \(L_2=\{5,6\}\). The encoder can thus take any value for \(z_1\), so let us say it takes 0. The coefficient \(z_2\) is chosen in such a way that

$$\begin{aligned} \varvec{w} + \left( \begin{array}{rrrrrrrrrrrrrrrr} z_0&0&0 \end{array}\right) \varvec{H}_0 + \left( \begin{array}{rrrrrrrrrrrrrrrr} 0&0&z_2 \end{array}\right) \varvec{H}_0 \ge \varvec{s}. \end{aligned}$$

This inequality is satisfied if and only if \(2+z_2\ne 0\) (position 5) and \(0+z_2\ne 0\) (position 6), so if and only if \(z_2=2\).

Note that, in this example, there are \({{\,\textrm{wt}\,}}(\varvec{s}) = 3\) partially stuck cells. As \(\mathcal {C}\) has minimum distance 3, it can correct a single error. \(\square \)

Corollary 2

(Generalization of Theorem 2) Let \(s\in {\mathbb {F}}_q\) and let

$$\begin{aligned} \varSigma =\Big \{ \varvec{s}\in {\mathbb {F}}_q^n\mid {{\,\textrm{wt}\,}}(\varvec{s})\le d_0+\Big \lceil \frac{q}{s}\Big \rceil - 3 \text{ and } \max \{s_i\mid i\in [n]\}\le s \Big \}. \end{aligned}$$

The coding scheme in Construction 2 is a \((\varSigma ,\lfloor \frac{d-1}{2}\rfloor )\) PSMC scheme of length n and size \(q^{k-l}\).

Proof

Let \(\varvec{s}\in \varSigma \) have weight \(u\le d_0+\lceil \frac{q}{s}\rceil -3\). Let \(\varPsi \subseteq [n]\) of size \(n-d_0+2\) be such that the number of non-zero components of \(\varvec{s}\) in \([n]\setminus \varPsi \) equals \(\min (d_0-2,u)\). Then \(\varvec{s}\) has \(u-\min (d_0-2,u)\) non-zero components in \(\varPsi \). As a consequence, if \(u\le d_0-2\), then \(\sum _{i\in \varPsi } s_i=0\), and if \(u>d_0-2\), then

$$\begin{aligned} \sum _{i\in \varPsi } s_i \le s(u-d_0+2) \le s(\lceil \frac{q}{s}\rceil -1) < s(\frac{q}{s}+1 - 1) = q. \end{aligned}$$

Hence in both cases, \(\sum _{i\in \varPsi } s_i \le q-1\). The corollary thus follows from Proposition 2. In particular, if \(s=1\), the corollary agrees with Theorem 2.\(\square \)

We do not generalize Construction 3 as it is tailored to the special case where \(s_i =1 \text{ for } \text{ all } i \in \varvec{\phi }\).

6 Trading partially stuck cells with errors

In the constructions shown so far, the encoder output \(\varvec{c}\) is a word from an error-correcting code \(\mathcal {C}\). If \(\varvec{c}\) does not satisfy the partially-stuck-at conditions in j positions, the encoder can modify it in these j positions to obtain a word \(\varvec{c}'=\varvec{c}+\varvec{e}'\) satisfying the partially-stuck-at constraints, where \({{\,\textrm{wt}\,}}(\varvec{e}')=j\). If \(\mathcal {C}\) can correct t errors, then it is still possible to correct \(t-j\) errors in \(\varvec{c}'\). This observation was also made in [12, Theorem 1]. The above reasoning shows that the following proposition holds.

Proposition 3

If there is an \((n,M)_q (u,1,t)\) PSMC, then for any j with \(0\le j\le t\), there is an \((n,M)_q (u+j,1,t-j)\) PSMC.

In the remainder of this section, we generalize the above proposition to general \(\varSigma \) (Theorem 5). We also provide variations on the idea of the encoder introducing errors to the result of a first encoding step in order that the final encoder output satisfies the partially-stuck-at conditions.
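The trading step itself is straightforward; the following Python sketch (an illustration, not one of the paper's algorithms) forces the partially-stuck-at-1 conditions after a first encoding step and counts the errors introduced this way.

```python
def force_partially_stuck_at_1(c, phi):
    """Set c_i to a nonzero level at every partially-stuck-at-1 position where
    c_i = 0. Returns the modified word and the number j of introduced errors;
    a code correcting t errors then still corrects t - j substitution errors."""
    c = list(c)
    j = 0
    for i in phi:
        if c[i] == 0:
            c[i] = 1          # any value >= 1 satisfies the constraint
            j += 1
    return c, j
```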

Theorem 5

(Partial Masking PSMC) Let \(\varSigma \subset {\mathbb {F}}_q^n\), and assume that there exists an \((n,M)_q (\varSigma ,t)\) PSMC \(\mathcal {C}\). For any \(j\in [t]\), there exists an \((n,M)_q (\varSigma ^{(j)},t-j)\) PSMC \(\mathcal {C}_j\), where

$$\begin{aligned} \varSigma ^{(j)}= \Big \{\varvec{s'}\in {\mathbb {F}}_q^n\mid \exists {\varvec{s}\in \varSigma } \left[ d(\varvec{s},\varvec{s'})\le j \text{ and } \varvec{s'}\ge \varvec{s} \right] \Big \}. \end{aligned}$$
Algorithm 7: Encoding

Algorithm 8: Decoding

Proof

Let the encoder \({\mathcal {E}}_j\) and the decoder \({\mathcal {D}}_j\) for \(\mathcal {C}_j\) be Algorithm 7 and Algorithm 8, respectively. By definition, \(\varvec{c'}\ge \varvec{s'}\). Moreover, if \(s_i=s'_i\), then \(c_i\ge s_i=s'_i\), so \(c_i=c'_i\). As a result, \(d(\varvec{c},\varvec{c'})\le j\).

In Algorithm 8, the decoder \({\mathcal {D}}\) of \(\mathcal {C}\) is directly used for decoding \(\mathcal {C}_j\). As \(\varvec{y}\ge \varvec{s'}\), surely \(\varvec{y}\ge \varvec{s}\). Moreover, we can write \(\varvec{y}=\varvec{c} + (\varvec{c'}-\varvec{c}+\varvec{e})\). As shown above, wt\((\varvec{c'}-\varvec{c})\le j\), and so \({{\,\textrm{wt}\,}}(\varvec{c}-\varvec{c}'+\varvec{e})\le t.\) As a consequence, \({\mathcal {D}}(\varvec{y})=\varvec{m}\).\(\square \)

We can improve on Theorem 5 for Construction 3, as shown in Lemma 1.

Lemma 1

Given an \([n,k,d]_q\) code as defined in Construction 3, then for any j such that \(0\le j\le \lfloor \frac{d-1}{2}\rfloor \), there is a \(2^{\mu }\)-ary \((2^{\mu -1}(d_0+j)-1,1,\lfloor \frac{d-1}{2}\rfloor -j)\) PSMC of length n and size \(q^{k-l-1}\).

Proof

Let \(\varvec{\phi }\subset [n]\) have size \(u\le 2^{\mu -1}(d_0+j)-1\). We use the notation from Algorithm 5. After Step 1, \(\varvec{w}\) has at most \(u_0=\lfloor \frac{2u}{2^{\mu }}\rfloor \le d_0+j-1\) binary entries in the positions from \(\varvec{\phi }\). After Step 2, at least \(d_0-1\) of these entries in \(\varvec{c}\) differ from 0. By setting the at most j other binary entries in the positions from \(\varvec{\phi }\) equal to 1, the encoder introduces at most j errors and guarantees that the partially-stuck-at conditions are satisfied.\(\square \)

In Lemma 2, we use another approach for introducing errors in order to satisfy the stuck-at conditions.

Lemma 2

Given an \([n,k,d]_q\) code containing a word of weight n, for any j with \(0\le j\le \lfloor \frac{d-1}{2}\rfloor \), there is a q-ary \((q-1+qj,1,\lfloor \frac{d-1}{2}\rfloor -j)\) PSMC of length n and size \(q^{k-1}\).

Proof

We use the notation from Construction 1.

Let \(\varvec{\phi }\subset [n]\) have size \(u \le q-1+qj\). Let \(\varvec{x}\) be a codeword of weight n. For each \(i\in \varvec{\phi }\), there is exactly one \(v\in {\mathbb {F}}_q\) such that \(w_i+vx_i=0\), and so

$$\begin{aligned} \sum _{v\in {\mathbb {F}}_q} \mid \{ i\in \varvec{\phi }\mid w_i+vx_i=0\}\mid \; = u. \end{aligned}$$

As a consequence, there is \(v\in {\mathbb {F}}_q\) such that \(\varvec{c}=\varvec{w}+v\varvec{x}\) has at most \(\lfloor \frac{u}{q}\rfloor \le j\) entries in \(\varvec{\phi }\) equal to zero. By setting these entries of \(\varvec{c}\) to a non-zero value, the encoder introduces at most j errors. As \(\mathcal {C}\) can correct up to \(\lfloor \frac{d-1}{2}\rfloor \) errors, it can correct these j errors and additionally up to \(\lfloor \frac{d-1}{2}\rfloor -j\) substitution errors.\(\square \)

Example 4

Consider a \([15,9,5]_4\) code \(\mathcal {C}\) containing the all-one word, e.g. the BCH code with zeroes \(\alpha ,\alpha ^2,\alpha ^3\), where \(\alpha \) is a primitive element in \({\mathbb {F}}_{16}\). Let \(u\le 7\) and \(t=1\). We use the all-one word for partial masking, ensuring that 0 occurs in at most \(\lfloor \frac{u}{4}\rfloor \le 1\) position indexed by \(\varvec{\phi }\). We set the codeword value in this position to 1, introducing one error. We can correct this introduced error and one additional random error as \(\mathcal {C}\) has minimum distance 5. Hence, we have obtained a 4-ary (7, 1, 1) PSMC of length 15 and cardinality \(4^8\). \(\square \)

We show in Example 5 how applying Lemma 2 for Construction 3 outperforms Lemma 1.

Example 5

Let \(d_0=3\), \(u=15\) and \(q=2^2\), let \(\alpha \) be a primitive element in \({\mathbb {F}}_{4}\), and take \(\varvec{x}= \varvec{1}\). Assume a vector \(\varvec{w}^{(\varvec{\phi })}\), restricted to the partially stuck positions, is given. Choosing \(z=1+\alpha \) minimizes the number of binary values in \(\varvec{w}^{(\varvec{\phi })}\).

Following Step 2 in Algorithm 5, and since \(d_0 = 3\), we can mask at most \(d_0-1\) of the binary values in the vector \(\varvec{w}^{(\varvec{\phi })}\), which leaves us, in this example, with at most \(\lfloor \frac{2u}{2^2}\rfloor -d_0+1 = 5\) zeros that remain unmasked.

However, applying Lemma 2 instead to Construction 3 gives a better result. Choosing \(\varvec{\gamma } = \varvec{0}\) in Step 2 of Algorithm 5, we obtain \(\varvec{c}^{(\varvec{\phi })} = \varvec{w}^{(\varvec{\phi })}\) with \(\lfloor \frac{u}{q}\rfloor =3\) zeros in the partially stuck positions, which we can directly trade against error-correction capability. \(\square \)

Remark 6

As the code from Construction 3 has a word of weight n, Lemma 2 implies the existence of a (u, 1, t) PSMC of cardinality \(q^{k-1}\) under the condition that \(2(t+\lfloor \frac{u}{q}\rfloor ) < d\). Lemma 1 shows the existence of a (u, 1, t) PSMC of smaller cardinality, viz. \(q^{k-l}\), under the condition that \(2\big (t+\max (0,\lfloor \frac{2u}{2^{\mu }}\rfloor -d_0+1)\big ) < d\). As a consequence, Lemma 1 can only improve on Lemma 2 if \(d_0-1> \lfloor \frac{2u}{2^{\mu }}\rfloor - \lfloor \frac{u}{2^{\mu }}\rfloor \).

We can generalize Lemma 2 as follows.

Lemma 3

Given an \([n,k,d]_q\) code containing a word of weight n. Let \(0\le j\le \lfloor \frac{d-1}{2}\rfloor \), and let

$$\begin{aligned} \varSigma = \bigg \{ \varvec{s}\in {\mathbb {F}}_q^n \; \Big | \; \sum _i s_i \le q-1+qj\bigg \}. \end{aligned}$$

There is a q-ary (\(\varSigma ,\lfloor \frac{d-1}{2}\rfloor -j)\) PSMC of length n and size \(q^{k-1}\).

Proof

We use the notation from Theorem 4. For simplicity, we assume that the code contains the all-one word. We wish to choose the multiplier \(v\in {\mathbb {F}}_q\) such that \(\varvec{c}=\varvec{w}-v\cdot \varvec{1}\) satisfies \(c_i\ge s_i\) for as many indices i as possible. For each index i, there are \(q-s_i\) values for v such that this inequality is met. Hence, there is a \(v\in {\mathbb {F}}_q\) such that \(c_i\ge s_i\) for at least \(\lceil \frac{1}{q} \sum _i(q-s_i)\rceil = n-\lfloor \frac{1}{q}\sum _i s_i\rfloor \) indices i. The encoder thus needs to introduce errors only in the at most \(\lfloor \frac{1}{q} \sum _i s_i \rfloor \) positions for which the inequality is not satisfied. \(\square \)

Lemma 4

Assume there exists a matrix as in Proposition 2. Let \(0\le j\le \lfloor \frac{d-1}{2}\rfloor \), and let

$$\begin{aligned} \varSigma =\bigg \{ \varvec{s}\in {\mathbb {F}}_q^n \; \Big | \; \exists \, \varPsi \subseteq [n],\ |\varPsi |=n-d_0+2: \; \sum _{i\in \varPsi } s_i \le q-1+qj \bigg \}. \end{aligned}$$

Then there exists a q-ary \((\varSigma ,\lfloor \frac{d-1}{2}\rfloor -j)\) PSMC of length n and size \(q^{k-l}\).

Proof

Let \(\varvec{s}\in \varSigma \). In order to simplify notation, we assume without loss of generality that \(\sum _{i=d_0-2}^{n-1} s_i \le q-1+qj\). We use the same argument as in the alternative proof of Proposition 2 (see Appendix A). Clearly,

$$\begin{aligned} n-d_0+2- \Bigg \lfloor \frac{1}{q} \sum _{i=d_0-2}^{n-1}s_i\Bigg \rfloor \ge n-d_0+2-j. \end{aligned}$$

So we infer that for at least \(n-j\) indices \(i\in [n]\),

$$\begin{aligned} w_i+ \Big ( (\varvec{z},\varvec{\eta }) \varvec{T}\varvec{H}_0\Big )_i \ge s_i. \end{aligned}$$

\(\square \)

Remark 7

The proof of Lemma 4 shows that the encoder output in fact can be made equal to \(\varvec{s}\) in the \(d_0-2\) largest entries of \(\varvec{s}\). In fact, it shows that the scheme allows for masking \(d_0-2\) stuck-at errors, masking partial stuck errors in the remaining cells, and correcting \(\lfloor \frac{d-1}{2}\rfloor -j\) substitution errors, provided that the sum of the partially stuck-at levels in the \(n-d_0+2\) remaining cells is less than \((j+1)q\).

7 Upper bounds on PSMC codes

The output of an encoder has restrictions on the values in the partially-stuck-at cells; in the other cells, it can attain all values. So the set of all encoder outputs is a poly-alphabetic code [23]. To be more precise, the following proposition holds.

Proposition 4

Let \(\mathcal {C}\) be an \((n,M)_q (\varSigma ,t)\) partially-stuck-at-masking code with encoder \({\mathcal {E}}\). For any \(\varvec{s}\in \varSigma \), let

$$\begin{aligned} \mathcal {C}_{\varvec{s}}=\{ {\mathcal {E}}(\varvec{m},\varvec{s}) \mid \varvec{m}\in {\mathcal {M}}\}. \end{aligned}$$

Then \(\mathcal {C}_{\varvec{s}}\) is a code with minimum distance at least \(2t+1\) and \(|{\mathcal {M}}|\) words, and

$$\begin{aligned} \mathcal {C}_{\varvec{s}} \subset Q_0\times Q_1 \times \cdots \times Q_{n-1}, \text{ where } \end{aligned}$$

\(Q_i = \{ x\in {\mathbb {F}}_q\mid x\ge s_i\}\).

Proof

By our error model, errors in stuck-at cells result in values still satisfying the stuck-at constraints. Therefore, t errors can be corrected if and only if \(\mathcal {C}_{\varvec{s}}\) has minimum Hamming distance at least \(2t+1\). The rest of the proposition is obvious. \(\square \)

As a result of Proposition 4, upper bounds on the size of poly-alphabetic codes [23] are also upper bounds on the size of partially-stuck-at codes. Hence, we give the following corollaries to state Singleton-type and sphere-packing-type bounds for error-correcting and partially-stuck-at-masking codes.

Corollary 3

(Singleton-type bound) Let \(\mathcal {C}\) be a q-ary \((\varSigma ,t)\) PSMC of length n and size M. Then for any \(\varvec{s}=(s_0,\ldots ,s_{n-1})\in \varSigma \),

$$\begin{aligned} M \le \min \bigg \{ \prod _{j\in J}(q-s_j) \; \Big | \; J\subset [n], |J|=n-2t\bigg \}. \end{aligned}$$

Proof

Combination of Proposition 4 and [23, Theorem 2]. \(\square \)

Corollary 4

(Sphere-packing-type bound) Let \(\mathcal {C}\) be a q-ary \((\varSigma ,t)\) PSMC of length n and size M. Then for any \(\varvec{s}=(s_0,\ldots ,s_{n-1})\in \varSigma \)

$$\begin{aligned} M \le \frac{\prod _{i=0}^{n-1}(q-s_i)}{V^{(b)}_t}, \end{aligned}$$

where \(V^{(b)}_t\), the volume of a ball of radius t, satisfies

$$\begin{aligned} V^{(b)}_t=\sum _{r=0}^t V^{(s)}_r, \end{aligned}$$

where the volume \(V^{(s)}_r\) of the sphere with radius r is given by

$$\begin{aligned} V^{(s)}_0 = 1, \qquad V^{(s)}_r = \sum _{1\le i_1< \cdots < i_r\le n} (q-1-s_{i_1})\cdots (q-1-s_{i_r}). \end{aligned}$$

Proof

Combination of Proposition 4 and [23, Theorem 3]. \(\square \)
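The bound of Corollary 4 is easy to evaluate numerically. The following Python sketch computes the ball volume \(V^{(b)}_t\) and the resulting upper bound on M for a given vector s of partially-stuck-at levels; it is a direct transcription of the formulas above.

```python
from itertools import combinations
from math import prod

def sphere_packing_bound(s, q, t):
    """Sphere-packing-type upper bound (Corollary 4) on the size M of a
    (Sigma, t) PSMC for a given vector s of partially-stuck-at levels.
    Exhaustive over position subsets, so intended for moderate n and t."""
    n = len(s)

    def sphere_volume(r):
        # V^{(s)}_r: sum over all r-subsets of positions of prod(q - 1 - s_i).
        return sum(prod(q - 1 - s[i] for i in idx) for idx in combinations(range(n), r))

    ball_volume = sum(sphere_volume(r) for r in range(t + 1))   # V^{(b)}_t
    return prod(q - s_i for s_i in s) // ball_volume
```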

Remark 8

The difference between poly-alphabetic codes and partially-stuck-at-masking codes is that in the former, the positions of stuck-at cells and the corresponding levels are known to both encoder and decoder, whereas in the latter, this information is known to the encoder only.

Fig. 2: Sphere-packing bounds: comparison of the number k of information symbols for the masking-only bound (“only partially stuck cells [29, Theorem 2]”) and our sphere-packing-like bounds (“errors and partially stuck cells”). The classical sphere-packing bound (“only errors”) can be read off at \(u=0\) from our sphere-packing-like bound curves. The chosen parameters are \(\mu =5\), \(q =3\), and \(n = (q^\mu -1)/(q-1)\).

Figure 2 compares our derived sphere-packing-like bound with the number of storable information symbols k for a completely reliable memory (no partially stuck cells and no errors, which can be read off at \(u=0\) on the solid line) and with the upper bound on the cardinality of a masking-only PSMC (only partially stuck cells, no errors) derived in [29], depicted by the solid curve. At \(u=0\), the derived sphere-packing-type bound (dotted and dash-dotted curves) matches the usual sphere-packing bound (“only errors”). The more partially-stuck-at cells u there are, the less information can be stored, since only \(q-1\) levels can be utilized in these cells; hence, the dotted and dash-dotted curves decline as u grows. On the other hand, the more errors there are (e.g., \(t=25\) for the dash-dotted curve), the higher the overall required redundancy and the less information can be stored.

8 Gilbert–Varshamov (GV) bound

8.1 Gilbert–Varshamov (GV) bound: finite length

We have provided various constructions of (u, 1, t) PSMCs based on q-ary t-error-correcting codes with additional properties. In this section, we first employ GV-like techniques to show the existence of (u, 1, t) PSMCs. Next, we study the asymptotics of the resulting GV bounds.

We start with a somewhat refined version of the Gilbert bound, which should be well known, but for which we did not find an explicit reference.

Lemma 5

Let q be a prime power, and assume there is an \([n,s]_q\) code \(\mathcal {C}_s\) with minimum distance at least d. If \(k\ge s\) is such that

$$\begin{aligned} \sum _{i=0}^{d-1} {n\atopwithdelims ()i} (q-1)^i < q^{n-k+1}, \end{aligned}$$

then there is an \([n,k]_q\) code \(\mathcal {C}_k\) with minimum distance at least d that has \(\mathcal {C}_s\) as a subcode.

Proof

By induction on k. For \(k=s\), the statement is obvious. Now let \(\kappa \ge s\) and let \(\mathcal {C}_{\kappa }\) be an \([n,\kappa ]_q\) code with minimum distance at least d that has \(\mathcal {C}_s\) as a subcode. If \(q^{\kappa }\sum _{i=0}^{d-1} {n\atopwithdelims ()i}(q-1)^i < q^n\), then the balls of radius \(d-1\) centered at the words of \(\mathcal {C}_{\kappa }\) do not cover \({\mathbb {F}}_q^n\), so there is a word \(\varvec{x}\) at distance at least d from all words in \(\mathcal {C}_{\kappa }\). As shown in the proof of [28, Theorem 5.1.8], the \([n,\kappa +1]_q\) code \(\mathcal {C}_{\kappa +1}\) spanned by \(\mathcal {C}_{\kappa }\) and \(\varvec{x}\) has minimum distance at least d. \(\square \)

8.1.1 Finite GV bound based on Lemma 2

Theorem 6

Let q be a prime power. Let n, k, t, u be non-negative integers such that

$$\begin{aligned} \sum _{i=0}^{2(t+\lfloor \frac{u}{q}\rfloor )} {n\atopwithdelims ()i}(q-1)^i < q^{n-k+1}. \end{aligned}$$

Then there exists a q-ary (u, 1, t) PSMC of length n and size \(q^{k-1}\).

Proof

Let \(\mathcal {C}_1\) be the \([n,1,n]_q\) code generated by the all-one word. Lemma 5 implies that there is an \([n,k]_q\) code \(\mathcal {C}_k\) with minimum distance at least \(2(t+\lfloor \frac{u}{q}\rfloor )+1\) that contains the all-one word. Lemma 2 shows that \(\mathcal {C}_k\) can be used to construct a PSMC with the claimed parameters. \(\square \)
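For concreteness, the following small Python sketch (ours, with illustrative parameter values that are not taken from the paper) evaluates the condition of Theorem 6 numerically and returns the largest dimension k that it guarantees.

```python
# A minimal sketch (assumed example parameters): find the largest k
# satisfying the condition of Theorem 6.
from math import comb

def vol(n, r, q):
    """Volume of a q-ary Hamming ball of radius r in F_q^n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

def max_k_theorem6(n, q, u, t):
    """Largest k with sum_{i=0}^{2(t + floor(u/q))} C(n,i)(q-1)^i < q^(n-k+1)."""
    v = vol(n, 2 * (t + u // q), q)
    k = 0
    while k < n and v < q ** (n - (k + 1) + 1):
        k += 1
    return k

n, q, u, t = 200, 8, 10, 5          # assumed example values
k = max_k_theorem6(n, q, u, t)
print(k, (k - 1) / n)               # guaranteed dimension and the rate (k-1)/n
```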

Remark 9

The GV bound from Theorem 1 is a special case of Theorem 6 for \(u \le q-1\).

8.1.2 Finite GV bound based on Construction 2

Lemma 6

Let q be a prime power, and let \(1\le k <n\). Let \(E\subset {\mathbb {F}}_q^n{\setminus }\{0\}\). The fraction of \([n,k]_q\) codes with non-empty intersection with E is less than \(|E|/q^{n-k}\).

Proof

Let \(\mathcal {C}\) be the set of all \([n,k]_q\) codes. Obviously,

$$\begin{aligned} \Big |\big \{C\in \mathcal {C}\mid C\cap E \ne \emptyset \big \}\Big | \le \sum _{C\in \mathcal {C}: C\cap E\ne \emptyset }\big |C\cap E\big | = \sum _{C\in \mathcal {C}} \big |C\cap E\big |. \end{aligned}$$

It follows from [18, Lemma 3] that

$$\begin{aligned} \frac{1}{\mid \mathcal {C}\mid } \sum _{C\in \mathcal {C}} \big | \; C\cap E \;\big | = \frac{q^k-1}{q^n-1}|E| < \frac{|E|}{q^{n-k}}. \end{aligned}$$

\(\square \)

Remark 10

If E has the additional property that \(\lambda \varvec{e}\in E\) for all \(\varvec{e}\in E\) and \(\lambda \in {\mathbb {F}}_q{\setminus }\{0\}\), then the upper bound in Lemma 6 can be reduced to \(|E|/(q-1)q^{n-k}\).

Lemma 7

Let k, n, d, and \(d^{\perp }\) be integers such that

$$\begin{aligned}&\sum _{i=0}^{d-1} {n\atopwithdelims ()i} (q-1)^i< \frac{1}{2} q^{n-k} \text{ and } \\&\sum _{i=0}^{d^{\perp }-1} {n \atopwithdelims ()i} (q-1)^i < \frac{1}{2} q^{k}. \end{aligned}$$

Then there exists a q-ary [n, k] code \(\mathcal {C}\) with minimum distance at least d such that \(\mathcal {C}^{\perp }\) has minimum distance at least \(d^{\perp }\).

Proof

Let \(\mathcal {C}\) denote the set of all \([n,k]_q\) codes. By applying Lemma 6 with \(E=\{\varvec{e}\in {\mathbb {F}}_q^n \mid 1\le {{\,\textrm{wt}\,}}(\varvec{e}) \le d-1\}\) and using the first condition of the lemma, we see that more than half of the codes in \(\mathcal {C}\) have empty intersection with E, that is, have minimum distance at least d. Similarly, more than half of all q-ary \([n,n-k]\) codes have minimum distance at least \(d^{\perp }\), and so more than half of the codes in \(\mathcal {C}\) have a dual with minimum distance at least \(d^{\perp }\). We conclude that \(\mathcal {C}\) contains a code with both desired properties. \(\square \)

Theorem 7

(Gilbert-Varshamov-like bound by Construction 2) Let q be a prime power. Suppose the positive integers u, t, n, k, l with \(u,t\le n\) and \(l<k\) satisfy

$$\begin{aligned} \sum _{i=0}^{2t} \left( {\begin{array}{c}n\\ i\end{array}}\right) (q-1)^i < \frac{1}{2}q^{n-l}, \end{aligned}$$
(9)
$$\begin{aligned} \sum _{i=0}^{u-q+2} \left( {\begin{array}{c}n\\ i\end{array}}\right) (q-1)^i < \frac{1}{2}q^{l}, \end{aligned}$$
(10)
$$\begin{aligned} \sum _{i=0}^{2t} \left( {\begin{array}{c}n\\ i\end{array}}\right) (q-1)^i < q^{n-k+1}. \end{aligned}$$
(11)

Then there is a q-ary (u, 1, t) PSMC of length n and cardinality \(q^{k-l}\).

Proof

According to Lemma 7, (9) and (10) imply the existence of an \([n,l]_q\) code \(\mathcal {C}_0\) with minimum distance at least \(2t+1\) whose dual code has minimum distance at least \(u-q+3\). By (11), Lemma 5 shows that \(\mathcal {C}_0\) can be extended to an \([n,k]_q\) code \(\mathcal {C}\) with minimum distance at least \(2t+1\). As \(\mathcal {C}\) has a generator matrix of the form required by Construction 2, the theorem follows. \(\square \)
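As a numerical illustration of Theorem 7, the sketch below (our own; the parameters are assumed example values) picks the smallest l satisfying (10), the largest k satisfying (11), verifies (9), and reports the resulting rate \((k-l)/n\).

```python
# Sketch with assumed example parameters (not from the paper).
from math import comb

def vol(n, r, q):
    return sum(comb(n, i) * (q - 1) ** i for i in range(max(r, -1) + 1))

def rate_theorem7(n, q, u, t):
    # smallest l satisfying (10):  2 * Vol_q(n, u-q+2) < q^l
    l = next(l for l in range(1, n) if 2 * vol(n, u - q + 2, q) < q ** l)
    # largest k satisfying (11):  Vol_q(n, 2t) < q^(n-k+1)
    k = max(k for k in range(l + 1, n + 1)
            if vol(n, 2 * t, q) < q ** (n - k + 1))
    # condition (9) must hold as well:  2 * Vol_q(n, 2t) < q^(n-l)
    assert 2 * vol(n, 2 * t, q) < q ** (n - l)
    return (k - l) / n      # rate of the guaranteed (u, 1, t) PSMC of size q^(k-l)

print(rate_theorem7(n=200, q=8, u=10, t=20))   # assumed example values
```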

8.1.3 Finite GV bound based on Proposition 1

In this section, we give sufficient conditions for the existence of matrices satisfying the conditions of Proposition 1. We start with Lemmas 8 and 9, and then prove the main theorem (Theorem 8).

Lemma 8

Let \(\varvec{G}\) be a \(k\times n\) matrix over \({\mathbb {F}}_q\). For \(s\ge 1\), let

$$\begin{aligned} d_s = \min \{ {{\,\textrm{wt}\,}}(\varvec{m}\varvec{G}) \mid \varvec{m}\in {\mathbb {F}}_{q^s}^k \setminus \{\varvec{0}\}\}. \end{aligned}$$

Then \(d_s=d_1\).

Proof

Let \(s\ge 1\). As \({\mathbb {F}}_q\subseteq {\mathbb {F}}_{q^s}\), it is clear that \(d_1\ge d_s\).

To show the converse, we use the trace function T defined as \(T(x) = \sum _{i=0}^{s-1} x^{q^i}\). As is well-known, T is a non-trivial mapping from \({\mathbb {F}}_{q^s}\) to \({\mathbb {F}}_q\), and

$$\begin{aligned} T(ax+by) = aT(x)+bT(y), \end{aligned}$$
(12)

for all \(x,y\in {\mathbb {F}}_{q^s}\) and \(a,b\in {\mathbb {F}}_q\). We extend the trace function to vectors by applying it coordinate-wise.

Let \(\varvec{m}\in {\mathbb {F}}_{q^s}^k\setminus \{\varvec{0}\}\). We choose \(\lambda \in {\mathbb {F}}_{q^s}\) such that \(T(\lambda \cdot \varvec{m})\ne \varvec{0}\); such a \(\lambda \) exists, as \(\varvec{m}\) has a non-zero component \(m_i\) and T is non-trivial, so \(\lambda \) can be chosen with \(T(\lambda m_i)\ne 0\). As \(T(0)=0\), we infer that \( {{\,\textrm{wt}\,}}(\varvec{m}\varvec{G}) = {{\,\textrm{wt}\,}}(\lambda \cdot \varvec{m}\varvec{G}) \ge {{\,\textrm{wt}\,}}(T(\lambda \cdot \varvec{m}\varvec{G}))\). As all entries of \(\varvec{G}\) are in \({\mathbb {F}}_q\), it follows from (12) that \(T(\lambda \cdot \varvec{m}\varvec{G}) = T(\lambda \cdot \varvec{m})\varvec{G}\). As a consequence,

$$\begin{aligned} {{\,\textrm{wt}\,}}(\varvec{m}\varvec{G}) \ge {{\,\textrm{wt}\,}}(T(\lambda \cdot \varvec{m}\varvec{G})) = {{\,\textrm{wt}\,}}(T(\lambda \cdot \varvec{m})\varvec{G}) \ge d_1, \end{aligned}$$

where the last inequality holds as \(T(\lambda \cdot \varvec{m})\in {\mathbb {F}}_q^k{\setminus }\{\varvec{0}\}\). \(\square \)
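To make the trace argument tangible, here is a minimal sketch (ours, not from the paper) for the smallest case \(q=2\), \(s=2\): GF(4) is tabulated explicitly, and the code checks that \(T(x)=x+x^2\) is a non-trivial, additive (hence \({\mathbb {F}}_2\)-linear) map onto GF(2), which is exactly the property (12) used in the proof.

```python
# A minimal sketch: GF(4) represented by the integers {0, 1, 2, 3}, where 2
# stands for alpha and 3 for alpha + 1, with alpha^2 = alpha + 1.
# Addition is bitwise XOR; multiplication is tabulated below.

ADD = [[a ^ b for b in range(4)] for a in range(4)]
MUL = [[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 3, 1],   # alpha*alpha = alpha+1,  alpha*(alpha+1) = 1
       [0, 3, 1, 2]]   # (alpha+1)^2 = alpha

def trace(x):
    """Trace from GF(4) down to GF(2): T(x) = x + x^2."""
    return ADD[x][MUL[x][x]]

# T maps GF(4) onto GF(2) = {0, 1} and is non-trivial ...
print([trace(x) for x in range(4)])   # [0, 0, 1, 1]

# ... and additive over F_2, i.e. property (12) with a, b in {0, 1}.
assert all(trace(ADD[x][y]) == trace(x) ^ trace(y)
           for x in range(4) for y in range(4))
```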

Now we introduce Lemma 9, which is the binary version of Lemma 7, but with an extra restriction on the maximum weight of the codewords.

Lemma 9

Let k, n, d, and \(d^{\perp }\) be integers such that

$$\begin{aligned} \sum _{i=0}^{d-1} {n\atopwithdelims ()i}< \frac{1}{4} \cdot 2^{n-k} \text{ and } \sum _{i=0}^{d^{\perp }-1} {n \atopwithdelims ()i} <\frac{1}{2} \cdot 2^k. \end{aligned}$$

There exists a binary [n, k] code \(\mathcal {C}\) with minimum distance at least d, without a word of weight more than \(n-d\), such that \(\mathcal {C}^{\perp }\) has minimum distance at least \(d^{\perp }\).

Proof

Similar to the proof of Lemma 7. Let \(\mathcal {C}\) denote the set of all binary [nk] codes. By applying Lemma 6 with \(E=\{ \varvec{e}\in \mathbb {F}_2^{n}\mid 1\le {{\,\textrm{wt}\,}}(\varvec{e}) \le d-1 \text { or }{{\,\textrm{wt}\,}}(\varvec{e})\ge n-d+1\}\), we infer that the first inequality implies that more than half of the codes in \(\mathcal {C}\) contain no element from E. Similarly, the second inequality implies that more than half of the binary \([n,n-k]\) codes have minimum weight at least \(d^{\perp }\), and so more than half of the codes in \(\mathcal {C}\) have a dual with minimum distance at least \(d^{\perp }\). We conclude that there is a code in \(\mathcal {C}\) having both desired properties.\(\square \)

Theorem 8

(Gilbert-Varshamov-like bound by Construction 3) Let \(n,k,l,u,t,\mu \) be positive integers with \(u\le n\), \(2t< n\) and \(l<k\) such that

$$\begin{aligned} \sum _{i=0}^{2t} {n\atopwithdelims ()i} < \frac{1}{4}\cdot 2^{n-l}, \end{aligned}$$
(13)
$$\begin{aligned} \sum _{i=0}^{\lfloor \frac{u}{2^{\mu -1}}\rfloor } {n\atopwithdelims ()i}<\frac{1}{2}\cdot 2^{l}, \end{aligned}$$
(14)
$$\begin{aligned} \sum _{i=0}^{2t} {n\atopwithdelims ()i} (2^{\mu }-1)^i < 2^{\mu (n-k+1)}. \end{aligned}$$
(15)

Then there exists a (u, 1, t) PSMC of length n over \({\mathbb {F}}_{2^{\mu }}\) with cardinality

\(2\cdot 2^{\mu (k-l-1)}2^{l(\mu -1)}\).

Proof

By Lemma 9, there exists a binary [n, l] code \(\mathcal {C}_0\) with minimum distance at least \(2t+1\) for which \(\mathcal {C}_0^{\perp }\) has minimum distance at least \(\lfloor \frac{u}{2^{\mu -1}}\rfloor +1\), and with the following additional property: if \(\varvec{H}_0\in {\mathbb {F}}_2^{l\times n}\) is a generator matrix of \(\mathcal {C}_0\), then the binary code with generator matrix \(\begin{bmatrix} \varvec{H}_0 \\ \varvec{1} \end{bmatrix}\) has minimum distance at least \(2t+1\) (indeed, no word of \(\mathcal {C}_0\) has weight exceeding \(n-2t-1\), so \({{\,\textrm{wt}\,}}(\varvec{1}+\varvec{c}) = n-{{\,\textrm{wt}\,}}(\varvec{c})\ge 2t+1\) for every \(\varvec{c}\in \mathcal {C}_0\)). According to Lemma 8, the code \(\mathcal {C}_{\mu }\) over \({\mathbb {F}}_{2^{\mu }}\) with this same generator matrix also has minimum distance at least \(2t+1\). By (15), Lemma 5 implies that \(\mathcal {C}_{\mu }\) can be extended to an \([n,k]_{2^{\mu }}\) code with minimum distance at least \(2t+1\); this [n, k] code has a generator matrix of the form \(\varvec{G} = \begin{bmatrix} \varvec{H}_0 \\ \varvec{G}_1\\ \varvec{1} \end{bmatrix}\). Application of Proposition 1 yields the claim.\(\square \)
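The following sketch (ours; the parameters are assumed example values) mirrors the proof numerically: it picks the smallest l satisfying (14), checks (13), picks the largest k satisfying (15), and reports the rate of the guaranteed PSMC of cardinality \(2\cdot 2^{\mu (k-l-1)}2^{l(\mu -1)}\).

```python
# Sketch with assumed example parameters (not from the paper).
from math import comb

def vol(n, r, q):
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

def rate_theorem8(n, mu, u, t):
    q = 2 ** mu
    # smallest l satisfying (14):  2 * Vol_2(n, floor(u / 2^(mu-1))) < 2^l
    l = next(l for l in range(1, n)
             if 2 * vol(n, u // 2 ** (mu - 1), 2) < 2 ** l)
    # condition (13):  4 * Vol_2(n, 2t) < 2^(n-l)
    assert 4 * vol(n, 2 * t, 2) < 2 ** (n - l)
    # largest k satisfying (15):  Vol_q(n, 2t) < q^(n-k+1)
    k = max(k for k in range(l + 1, n + 1)
            if vol(n, 2 * t, q) < q ** (n - k + 1))
    log2_size = 1 + mu * (k - l - 1) + l * (mu - 1)   # log2 of the cardinality
    return log2_size / (mu * n)                       # rate in 2^mu-ary symbols

print(rate_theorem8(n=200, mu=3, u=20, t=20))   # assumed example values
```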

8.1.4 Finite GV bound from trivial construction

Clearly, a \((q-1)\)-ary code of length n with minimum distance at least \(2t+1\) is a q-ary (u, 1, t) PSMC of length n with \(u=n\): by using only the levels \(1,\ldots ,q-1\), every partially-stuck-at-1 cell is masked automatically. Combining this observation with the Gilbert bound for a \((q-1)\)-ary alphabet, we obtain the following corollary.

Corollary 5

Let \(q\ge 3\), and let

$$\begin{aligned} M= \Bigg \lceil \dfrac{(q-1)^n}{\sum _{i=0}^{2t} {n\atopwithdelims ()i} (q-2)^i} \Bigg \rceil . \end{aligned}$$

There is a q-ary (n, 1, t) PSMC of length n and cardinality M.
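A direct computation of this cardinality, for assumed example values, might look as follows (our sketch, not from the paper).

```python
from math import comb, log

def M_corollary5(n, q, t):
    """Gilbert bound over the reduced alphabet {1, ..., q-1} (Corollary 5)."""
    ball = sum(comb(n, i) * (q - 2) ** i for i in range(2 * t + 1))
    return -(-((q - 1) ** n) // ball)        # ceiling division

n, q, t = 200, 8, 20                          # assumed example values
M = M_corollary5(n, q, t)
print(log(M, q) / n)                          # rate of the resulting (n, 1, t) PSMC
```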

So far we have covered the GV-like bounds for our code constructions for finite length n.

8.2 Asymptotic GV bound on PSMCs

In this section, we present the asymptotic version of the GV bounds from the previous section. That is, we provide lower bounds on the achievable rates of q-ary (u, 1, t) PSMCs in the regime where the code length n tends to infinity, and the number u of partially-stuck-at cells and the number t of random errors both grow linearly in n.

We recall the following well-known lemma, which estimates the volume of a Hamming ball using the q-ary entropy function.

Lemma 10

For positive integers n, \(q\ge 2\) and real \(\delta \), \(0\le \delta \le 1-\frac{1}{q}\),

$$\begin{aligned} {{\,\textrm{Vol}\,}}_q(n,\delta n) \le q^{h_q(\delta )n}, \end{aligned}$$

where \({{\,\textrm{Vol}\,}}_q(n,r)= \sum _{j=0}^r {n\atopwithdelims ()j} (q-1)^j\) denotes the volume of a Hamming ball with radius r.

Proof

A proof of Lemma 10 can be found in many references, including [22, p. 105] and [10, Proposition 3.3.1].\(\square \)
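For a quick numerical sanity check of Lemma 10 (our own sketch with illustrative values, not from the paper):

```python
from math import comb, log

def vol(n, r, q):
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

def h(q, x):                 # q-ary entropy, defined here for 0 <= x < 1
    if x == 0:
        return 0.0
    return x * log(q - 1, q) - x * log(x, q) - (1 - x) * log(1 - x, q)

n, q = 200, 8
for delta in (0.05, 0.2, 0.5, 1 - 1 / q):
    lhs = log(vol(n, int(delta * n), q), q)       # log_q Vol_q(n, floor(delta*n))
    rhs = n * h(q, delta)
    assert lhs <= rhs + 1e-9
    print(f"delta={delta:.3f}: log_q Vol = {lhs:7.2f} <= n*h_q(delta) = {rhs:7.2f}")
```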

8.2.1 Asymptotic bound for Theorem 6

Theorem 9

Let q be a prime power. Let \(0\le \tau ,\upsilon <1\) be such that

$$\begin{aligned} 2\left( \tau +\frac{\upsilon }{q}\right) < 1-\frac{1}{q}. \end{aligned}$$

For sufficiently large n, there exists an \((\lfloor \upsilon n\rfloor , 1, \lfloor \tau n\rfloor )\) PSMC of length n and rate at least

$$\begin{aligned} 1-h_q\left( 2\left( \tau +\frac{\upsilon }{q}\right) \right) - \frac{2}{n}. \end{aligned}$$

Proof

Let n be a positive integer such that \(\lceil n h_q(2(\tau + \frac{\upsilon }{q}))\rceil < n\). Let \(t=\lfloor \tau n\rfloor \) and \(u=\lfloor \upsilon n \rfloor \). Take \(k=n-\lceil nh_q(2\tau + 2\frac{\upsilon }{q})\rceil \). Lemma 10 implies that Vol\(_q(n,2t+2\lfloor \frac{u}{q}\rfloor ) \le q^{n-k}\), and so, according to Theorem 6, there is a q-ary (u, 1, t) PSMC of length n with rate \(\frac{k-1}{n}\ge 1-h_q(2(\tau + \frac{\upsilon }{q})) - \frac{2}{n}\).\(\square \)

8.2.2 Asymptotic GV bound from Construction 2

Theorem 10

(Asymptotic Gilbert-Varshamov-like bound from Theorem 7) Let q be a prime power. Let \(\upsilon ,\tau \) be such that

$$\begin{aligned} 0< \upsilon ,2\tau< 1- \frac{1}{q} \text{ and } h_q(\upsilon )+h_q(2\tau ) < 1. \end{aligned}$$

For sufficiently large n, there exists a q-ary \((\lfloor \upsilon n\rfloor ,1, \lfloor \tau n \rfloor )\) PSMC of length n and rate at least

$$\begin{aligned} 1 - h_q(2\tau ) -h_q(\upsilon )-\frac{4\log _q(2)+2}{n}. \end{aligned}$$

Proof

Let n be a positive integer. Write \(u=\lfloor \upsilon n\rfloor \) and \(t=\lfloor \tau n\rfloor \). Then Vol\(_q(n,u-q+2)\le \text{ Vol}_q(n,u)\). Hence, by setting

$$\begin{aligned} l = \lceil nh_q(\upsilon ) +2 \log _q(2)\rceil , \end{aligned}$$

Lemma 10 implies that (10) is satisfied.

Similarly, by setting

$$\begin{aligned} k = n- \lceil nh_q(2\tau ) + 2\log _q(2)\rceil , \end{aligned}$$

it is ensured that (11) is satisfied. Moreover, since \(h_q(\upsilon )+h_q(2\tau )<1\), we have \(n-l\ge nh_q(2\tau )+2\log _q(2)\) for sufficiently large n, so that Lemma 10 also ensures that (9) is satisfied.

According to Theorem 7, there is a q-ary (u, 1, t) PSMC of length n and size \(q^{k-l}\), so with rate \(\frac{k-l}{n}\). The choices for k and l show that the theorem is true.\(\square \)

Remark 11

Theorem 10 in fact holds for classical stuck-at cells instead of partially-stuck-at-1 cells, as follows from considering the generalization of Theorem 7 in Proposition 2, i.e., Heegard’s construction [12].

8.2.3 Asymptotic GV bound from Construction 3

Theorem 11

(Asymptotic Gilbert-Varshamov-like bound from Theorem 8) Let \(\mu \) be a positive integer, and let \(\upsilon \) and \(\tau \) be such that

$$\begin{aligned} 0\le \frac{\upsilon }{2^{\mu -1}}<\frac{1}{2}, 0< 2 \tau<\frac{1}{2}, \text{ and } h_2\left( \frac{\upsilon }{2^{\mu -1}}\right) + h_2(2\tau ) < 1. \end{aligned}$$

For sufficiently large n there is a \(2^{\mu }\)-ary \((\lfloor \upsilon n\rfloor ,1,\lfloor \tau n\rfloor )\) PSMC of length n and rate at least

$$\begin{aligned} 1-h_{2^{\mu }}(2\tau ) -\frac{1}{\mu }h_2\left( \frac{\upsilon }{2^{\mu -1}}\right) -\frac{2}{n} - \frac{3}{\mu n}. \end{aligned}$$

Proof

For notational convenience, we set \(\upsilon _0=\frac{\upsilon }{2^{\mu -1}}\) and \(\eta =1-h_2(2\tau ) - h_2(\upsilon _0)\). Note that \(\eta >0\).

Let n be a positive integer satisfying \(n\ge \frac{7}{\eta }\), and let \(u=\lfloor \upsilon n\rfloor \), \(u_0=\lfloor \frac{u}{2^{\mu -1}}\rfloor \) and \(t=\lfloor \tau n\rfloor \). We set

$$\begin{aligned} l = \lceil nh_2(\upsilon _0) \rceil + 3. \end{aligned}$$

Lemma 10 implies that (14) is satisfied. Moreover, as

$$\begin{aligned} n-l-3 \ge n-nh_2(\upsilon _0) - 7 = nh_2(2\tau ) + n\eta -7 \ge nh_2(2\tau ), \end{aligned}$$

Lemma 10 implies that (13) is satisfied.

We set

$$\begin{aligned} k= n - \lceil n h_{2^{\mu }}(2\tau ) \rceil . \end{aligned}$$

Lemma 10 implies that (15) is satisfied.

According to [10, Corollary 3.3.4], we have that \(h_{2^{\mu }}(2\tau )\le h_2(2\tau )\), and so

$$\begin{aligned} k-l \ge n- nh_2(2\tau ) -1 - nh_2(\upsilon _0) - 4 = n\eta - 5 \ge 2. \end{aligned}$$

Theorem 8 implies the existence of a \(2^{\mu }\)-ary (u, 1, t) PSMC of length n with size \(2\cdot 2^{\mu (k-1)}2^{-l}\), i.e., its rate is

$$\begin{aligned} \frac{k-1}{n} - \frac{l-1}{\mu n} \ge 1-h_{2^{\mu }}(2\tau )-\frac{1}{\mu } h_2(\upsilon _0) - \frac{2}{n}-\frac{3}{\mu n}. \end{aligned}$$

\(\square \)

Theorem 12

(Asymptotic Gilbert-Varshamov bound from Corollary 5) Let \(q\ge 3\). For each positive integer n and each \(\tau \) with \(0\le 2\tau < 1-\frac{1}{q-1}\), there exists a q-ary \((n,1,\lfloor \tau n\rfloor )\) PSMC of length n and rate at least

$$\begin{aligned} (1-h_{q-1}(2\tau ))\cdot \log _q(q-1). \end{aligned}$$

Proof

Let \(t=\lfloor \tau n\rfloor \). Corollary 5 implies the existence of a q-ary (n, 1, t) PSMC of length n and cardinality M satisfying

$$\begin{aligned} M \ge \dfrac{(q-1)^n}{{{\,\textrm{Vol}\,}}_{q-1}(n,2t)} \ge (q-1)^{n(1-h_{q-1}(2\tau ))}, \end{aligned}$$

where the last inequality holds by Lemma 10.\(\square \)
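Before turning to the comparisons, the following sketch (ours; the values of \(\mu \), \(\tau \) and \(\upsilon \) are assumptions for illustration only) evaluates the asymptotic rates guaranteed by Theorems 9–12, ignoring the o(1) terms.

```python
from math import log

def hq(q, x):                    # q-ary entropy, 0 <= x < 1
    if x == 0:
        return 0.0
    return x * log(q - 1, q) - x * log(x, q) - (1 - x) * log(1 - x, q)

mu, tau, ups = 3, 0.05, 0.10     # assumed example values
q = 2 ** mu
print("Theorem  9:", 1 - hq(q, 2 * (tau + ups / q)))
print("Theorem 10:", 1 - hq(q, 2 * tau) - hq(q, ups))
print("Theorem 11:", 1 - hq(q, 2 * tau) - hq(2, ups / 2 ** (mu - 1)) / mu)
print("Theorem 12:", (1 - hq(q - 1, 2 * tau)) * log(q - 1, q))
```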

9 Comparisons

We provide several comparisons between our code constructions and the existence results based on Theorems 6, 7 and 8. We also compare them to the known limits and investigate the trade-off between masking and error correction as described in Sect. 6.

9.1 Comparison of Theorem 6 for \(u \le q-1\) to other Bounds

Figure 3 illustrates the rates of a \((q-1,1,t)\) PSMC obtained from Theorem 1 (applying Theorem 6 for the special case \(u\le q-1\)) for \(n=114\), \(q=7\) and \(0\le t\le 56\). We show how close explicit BCH codes of certain rates R that contain the all-one word and have designed distance \(d \ge 2t+1\) come to the rates achieved by Theorem 1. We note that the solid red graph matches the dashed-dotted green plot for a few code parameters and surpasses it at \(t=39\). We also compare to the classical q-ary GV bound (in dashed black) as well as to the reduced-alphabet \((q-1)\)-ary GV bound (in dashed-dotted blue). To this end, we show upper bounds on the rates that can be obtained from Theorem 1, using the Griesmer bound [9] and the Ball–Blokhuis bound [1] on the size of codes containing the all-one word.

Fig. 3
figure 3

Comparison of other upper and lower limits to our derived GV-like bound in Theorem 6, taking \(n=114\), \(q=7\), \(0\le t\le 56\) and \(u \le q-1\). The dashed-dotted green curve shows the rates for Theorem 1, obtained by Theorem 6 for \(u \le q-1\), in which codes that contain the all-one word are considered. For several code parameters, this curve matches the red line, which shows the rates of BCH codes that contain the all-one word with designed distance \(d \ge 2t+1\)

Fig. 4
figure 4

The achievable rates \(R=\frac{1}{n}\log _{2^3} {\mathcal {M}}\) of GV bounds for different u, t for \(n=200\) and \(q=2^3\) in Theorems 7 and 8, where \({\mathcal {M}}\) is the code cardinality. They are also compared to the rates from an ordinary 7-ary GV bound for different t, as illustrated by the dashed-dotted blue plot. The solid and the dashed lines represent the derived GV-like bounds from Theorems 7 and 8, respectively

Fig. 5
figure 5

The achievable rates \(R=\frac{1}{n}\log _{2^2} {\mathcal {M}}\) of GV bounds for different u, t for \(n=200\) and \(q=2^2\) in Theorems 7 and 8. They are also compared to the rates from an ordinary 3-ary GV bound for different t, as illustrated by the dashed-dotted blue plot. The solid and the dashed lines correspond to the derived GV-like bounds by Theorem 7 and by Theorem 8, respectively

Fig. 6
figure 6

The achievable rates \(R= \frac{1}{n} \log _{2^3} {\mathcal {M}}\) of GV bounds for different u, t for \(n=200\) and \(q=2^3\) in Theorem 6, compared to the reduced-alphabet conventional \((q-1)\)-ary GV bound for different t. The dashed-dotted green curve represents the rates from Theorem 6 when \(u \le q-1\)

9.2 Comparison among Theorems 7, 8 and \((q-1)\)-ary Gilbert-Varshamov bound

We plot the achievable rates \(R=\log _q(M)/n\) as a function of t for different fixed values of u. Figure 4 is the resulting plot for \(n=200\), \(\mu =3\) and \(q=2^\mu \). It can be seen that the GV-like bound based on Construction 2 improves upon the \((q-1)\)-ary GV bound in different ranges of u and t, for \(u\le 5\), as depicted by the solid red curve, and that the bound based on Construction 3 improves further (up to \(u\le 20\)), as shown by the dashed gray line (third from above). The dashed-dotted blue curve shows what happens if we map our \(2^3\) levels such that level 0 is avoided, in order to compare with 7 levels. It is obvious that for \(\mu =3\), the rate loss resulting from using \(q-1\) instead of q symbols is already quite small. Note that for \(u=0\) the solid red curve mostly achieves the exact rates obtained from the standard \(2^3\)-ary GV bound for \(0 \le t \le 80\), and so does the dashed red curve, but for \(0 \le t \le 47\).

For \(\mu =2\), the improvements from Construction 2 (\(u \le 10\)) and Construction 3 (\(u \le 30\)) upon a usual (\(q-1\))-ary GV bound are more significant as shown in Fig. 5.

9.3 Comparisons between Theorem 6 and \((q-1)\)-ary Gilbert-Varshamov bound

In Fig. 6, we compare the GV-like bound from Theorem 6 for \(q=2^3\) with the conventional GV bound for \(q-1=7\), shown as the dashed blue curve. For \(q=8\), we observe that the conventional \((q-1)\)-ary GV bound is superior to the derived GV-like bound from Theorem 6 for \(u \ge 40\). However, when Theorem 6 is applied with \(u \le 20\), the traditional \((q-1)\)-ary GV bound is the worse choice. We observe that the dashed-dotted green curve obtained from Theorem 6 for \(u\le 7\) (as stated in Remark 9) shows the highest rates.

9.4 Comparisons between Theorems 8 and 6

In Fig. 7, we compare Theorems 8 and 6. Theorem 8 yields higher rates for larger values of u; for example, for \(u= 40\) and \(t=1\), the rate is \(R = 0.87\) from Theorem 8, while it is \(R = 0.83\) from Theorem 6. It is interesting to note that for \(u=30\) and \(t> 10\), Theorem 6 is better, and for \(u =10\) and \(t > 18\), Theorem 6 is as good as Theorem 8.

Fig. 7
figure 7

The achievable rates \(R=\frac{1}{n}\log _{q} {\mathcal {M}}\) of GV bounds for \(u \in \{10,30,40\}\) and \(t \in \{0,1,2,4,5,10,20\}\) for \(n=200\) and \(q=2^3\) in Theorems 8 and 6. They are also compared to the rates of the \((q-1)\)-ary GV bound, as shown by the dashed black graph

9.5 Comparisons of application of Theorem 5 vs direct application of Theorem 7

For given (u, t), we illustrate the trade to \((u+1, t-1)\) in Fig. 8. For some values of t and u, it is advantageous if the encoder introduces an error in a partially-stuck-at position in order to mask this position. The orange solid curve, for instance, represents the rates determined by Theorem 7 for \(u=17\) and \(0\le t\le 50\), while the orange dotted sketch highlights the rates for \(u =16\) and \(1\le t\le 51\). Due to the exchange such that \(u+1 =17\) and \(0 \le t-1 \le 50\), the orange dotted line fluctuates slightly above and below the rates shown in the orange solid curve for most values of t.

Let us describe some points of Fig. 8 in Table 4. Let \(\mathcal {C}_{u,t}\) be a code by Theorem 7 whose rate R is given in Table 4 in row u and column t. Take \(\mathcal {C}_{21,15}\), whose rate is given in Table 4. By applying Theorem 5 to \(\mathcal {C}_{21,15}\), we obtain a code \(\mathcal {C}_{22,14}\) of \(R= 0.470\). Direct application of Theorem 7 yields a \(\mathcal {C}_{22,14}\) of higher rate, as highlighted in Table 4. We conclude that in this case, the trade by Theorem 5 gives lower rates than taking the same code directly by Theorem 7 for given \((u=22,t=14)\).

On the contrary, for larger values of t, Table 4 shows that the exchange is beneficial and gives higher rates. For example, we start with \(\mathcal {C}_{21,41}\), whose rate is given in Table 4; applying Theorem 5 then gives \(\mathcal {C}_{22,40}\) of \(R=0.105\), which is greater than the rate obtained directly by Theorem 7, as stated in Table 4.

Table 4 Table of selected points from Fig. 8 with slightly lower and higher rates due to trading. All points are from Theorem 7
Fig. 8
figure 8

The achievable rates \(R=\frac{1}{n}\log _{q} {\mathcal {M}}\) of GV bounds for different u and t for \(n=200\) and \(q=2^3\) in Theorem 7. The solid plots are the rates from the derived GV-like bound and the dotted graphs are the rates after trading \((u+1, t-1)\) by Theorem 5

9.6 Comparisons of applications of Theorem 5, Lemma 1, Lemma 2 vs direct application of Theorem 8

For the derived GV bound based on Construction 3 obtained by Theorem 8, we demonstrate in Fig. 9 the exchange of the ability to correct one error for the ability to mask one additional partially stuck cell, following Theorem 5. The solid and dotted lines represent the rates before and after trading, respectively. We also show the exchange by Lemmas 1 and 2, in which reducing the number of correctable errors by one increases u by \(2^{\mu -1}\) and \(2^\mu \), respectively. As can be seen in Fig. 9, every solid curve by Theorem 8 corresponds to multiple values of u due to the floor operation, as \(\mathcal {C}_0^{\perp }\) has minimum distance at least \(\lfloor \frac{u}{2^{\mu -1}}\rfloor +1\). For instance, for all \(u \in \{8,9,10,11\}\) and \(\mu =3\), we obtain \(\lfloor \frac{u}{2^{3-1}}\rfloor +1 = 3\), and the transition starts at \(u = 12\), where \(\lfloor \frac{u}{2^{3-1}}\rfloor +1 = 4\). Similarly, by the floor operation there is no change for the larger values \(u =13,14,15\), and so on. Let us discuss the following curves. For \(u=19\), the orange solid curve shows the rates by Theorem 8. Exchanging \(u+1\) and \(t-1\) via Theorem 5 yields the orange dotted line for \(u+1 = 20\), which lies slightly below the orange solid plot. Hence, the exchange gives lower rates. However, it provides rate \(R =0.380\) for \(t=30\), while direct application of Theorem 8 (compare its corresponding graph, the solid green curve for \(u = 20,21,22,23\)) does not.

Now, we apply Lemma 1 rather than Theorem 5. For the same achieved rates, we observe that the dashed red graph for \(u+2^{3-1} = 23\) shows exactly the rates of the orange dotted curve for \(u+1 = 20\). Therefore, it is clear that Lemma 1 provides a gain of masking exactly 3 more cells compared to Theorem 5. On the other hand, for the same number of masked cells u, applying Lemma 1 for \(u=19\) gives a \((u+4,t-1)\) PSMC, represented by the dashed red graph, which shows slightly higher rates compared to the dotted green curve for \(u=23\) by Theorem 5. Further, Lemma 1 still provides a rate at \(t=30\), while Theorem 5 gives no rates for \(t \ge 29\). For this graph, we conclude that Lemma 1 surpasses Theorem 5.

However, if we take \(u=23\) directly by Theorem 8, we achieve slightly higher rates. We conclude that Theorem 8 can directly attain the maximum number of masked cells u that can also be achieved by applying Lemma 1, and at slightly higher rates. On the contrary, Theorem 8 does not give rates for larger values of t, while Lemma 1 and Theorem 5 do.

On the other hand, as Theorem 8 is based on Construction 3, which contains a word of weight n, Lemma 2 is applicable under the condition that \(2(t+ \lfloor \frac{u}{q}\rfloor ) <d\) (cf. Remark 6). Hence, we can achieve higher rates, as shown by the dashed-dotted curve, while masking up to the same number u of cells, rather than employing Lemma 1 or Theorem 5.

We describe some points of Fig. 9 in Table 5. Let \(\mathcal {C}_{u,t}\) be a code by Theorem 8 whose rate R is given in Table 5 in row u and column t. Taking \(\mathcal {C}_{19, 27}\) gives \(\mathcal {C}_{20, 26}\) and \(\mathcal {C}_{23, 26}\) with \(R = 0.435\) by applying Theorem 5 and Lemma 1, respectively. In contrast, taking \(\mathcal {C}_{19, 31}\) is advantageous, as there are codes (\(\mathcal {C}_{20, 30}\) by Theorem 5 and \(\mathcal {C}_{23, 30}\) by Lemma 1) with \(R =0.380\), while direct application of Theorem 8 cannot provide these codes, as highlighted in Table 5. Now, we apply Lemma 2 to a code obtained by Theorem 6 for \((u=7, t=27)\) to obtain the code \(\mathcal {C}_{15, 26}\) of rate \(R = 0.465\), which satisfies \(2(26+\lfloor \frac{15}{8} \rfloor ) < 55\). The achieved rate is higher compared to the \(\mathcal {C}_{15, 26}\) of \(R = 0.460\) obtained directly by Theorem 8, to applying Theorem 5 to \(\mathcal {C}_{14, 27}\) to obtain \(\mathcal {C}_{15, 26}\) of \(R = 0.445\), or to using Lemma 1 on \(\mathcal {C}_{11, 27}\) to obtain \(\mathcal {C}_{15, 26}\) of \(R = 0.456\). This result does not mean that applying Lemma 2 to a code obtained by Theorem 6 always provides higher code rates for the same parameters (u, t) (see Fig. 7).

Table 5 Table of selected points from Fig. 9. All points are from Theorem 8
Fig. 9
figure 9

The achievable rates \(R=\frac{1}{n}\log _{q} {\mathcal {M}}\) of GV bounds for different u and t for \(n=200\) and \(q=2^3\) in Theorem 8. The solid plots are the rates from the derived GV-like bound and the dotted graphs are the rates after trading \((u+1, t-1)\) by Theorem 5. We also show the exchange by Lemma 1 and Lemma 2, in which the reduction of the correctable errors by one increases u by \(2^{\mu -1}\) and \(2^\mu \), respectively

9.7 Analytical comparison of asymptotic GV-like bounds

In this section, we state the results of the analytical comparisons of the asymptotic GV bounds from Theorems 9, 10 and 11, ignoring the terms that tend to zero for increasing n. We will use the following lemma.

Lemma 11

Let \(q\ge 2\) be an integer. If \(0\le x,y\) are such that \(x+y\le 1\), then

$$\begin{aligned} h_q(x+y) \le h_q(x) + h_q(y). \end{aligned}$$

Proof

Let \(0\le y < 1\), and consider the function \(f_y(x)=h_q(x+y)-h_q(x)-h_q(y)\) on the interval \([0,1-y]\). Clearly, \(f'_y(x)=h'_q(x+y)-h'_q(x) \le 0\), where the inequality follows from the fact that the second derivative of \(h_q\) is non-positive. Hence, \(f_y(x)\le f_y(0)=0\) for each \(x\in [0,1-y]\).\(\square \)
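A small numerical check of this subadditivity over a grid (our own sketch, illustrative only):

```python
from math import log

def hq(q, x):
    if x == 0 or x == 1:
        return 0.0 if x == 0 else log(q - 1, q)
    return x * log(q - 1, q) - x * log(x, q) - (1 - x) * log(1 - x, q)

q = 4
grid = [i / 50 for i in range(51)]
assert all(hq(q, x + y) <= hq(q, x) + hq(q, y) + 1e-12
           for x in grid for y in grid if x + y <= 1)
print("h_q(x+y) <= h_q(x) + h_q(y) on the whole grid")
```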

Proposition 5

If \(\upsilon ,\tau \) and q are such that the conditions of Theorems 9 and 10 are met, then the rate guaranteed by Theorem 9 is at least equal to the code rate guaranteed by Theorem 10.

Proof

Assume \(\tau \) and \(\upsilon \) are such that the conditions of Theorems 9 and 10 are satisfied, that is, such that \(2\tau +2\frac{\upsilon }{q} < 1-\frac{1}{q}\) and \(h_q(\upsilon )+h_q(2\tau ) < 1\). By invoking Lemma 11, we see that

$$\begin{aligned} h_q(2\tau +2\frac{\upsilon }{q}) \le h_q(2\tau ) + h_q(2\cdot \frac{\upsilon }{q}) \le h_q(2\tau ) + h_q(\upsilon ), \end{aligned}$$

where the final inequality holds as \(q\ge 2\) and \(h_q\) is monotonically increasing on \([0,1-\frac{1}{q}]\). As a consequence, the code rate guaranteed by Theorem 9 is at least equal to the code rate guaranteed by Theorem 10.\(\square \)

Proposition 6

If \(\upsilon ,\tau \) and \(q=2^{\mu }\) are such that the conditions of Theorems 10 and 11 are met, then the code rate guaranteed by Theorem 11 is at least equal to the code rate guaranteed by Theorem 10.

Proof

Assume that the conditions of Theorems 10 and 11 are satisfied. The difference between the rate of Theorem 11 and of Theorem 10 equals

$$\begin{aligned} h_{2^{\mu }}(\upsilon ) - \frac{1}{\mu }h_2(\frac{\upsilon }{2^{\mu -1}}). \end{aligned}$$
(16)

According to the conditions of Theorem 10, \(\upsilon \le 1-\frac{1}{2^{\mu }}\), and so \(h_{2^{\mu }}(\upsilon ) \ge h_2(\frac{\upsilon }{2^{\mu -1}})\). As \(h_{2^{\mu }}(x)=\frac{1}{\mu } h_2(x) + x\log _{2^{\mu }}(2^{\mu }-1)\), the difference in (16) is non-negative. That is, the bound of Theorem 11 is at least as good as that of Theorem 10. \(\square \)

We note that the requirement \(2\tau < \frac{1}{2}\) from Theorem 11 is stricter than the requirement \(2\tau < 1-\frac{1}{2^{\mu }}\) from Theorem 10. That is, there are pairs \((\tau ,\upsilon )\) for which Theorem 10 is applicable, but Theorem 11 is not.

Comparison of Theorems 9 and 11 is more complicated. We have the following partial result.

Proposition 7

Let \(\upsilon ,\tau >0\) and \(q=2^{\mu }\) be such that the conditions of Theorems 9 and 11 are met. If \(\upsilon \) is sufficiently small, then the rate guaranteed by Theorem 9 is larger than the rate guaranteed by Theorem 11.

Proof

Assume that \(\tau \) and \(\upsilon \) are such that the conditions of Theorem 9 and of Theorem 11 are satisfied, that is, \(2\tau + 2\frac{\upsilon }{2^{\mu }}< 1-\frac{1}{2^{\mu }}\),

$$\begin{aligned} 0\le \upsilon \le 2^{\mu -2},\; 0\le 2\tau \le \frac{1}{2} \text{ and } h_2(\frac{\upsilon }{2^{\mu -1}}) + h_2(2\tau ) < 1. \end{aligned}$$

Let \(f_{\mu }(\tau ,\upsilon _0)\), where \(\upsilon _0=\frac{\upsilon }{2^{\mu -1}}\), be the bound from Theorem 9 minus the bound from Theorem 11, that is

$$\begin{aligned} f_{\mu }(\tau ,\upsilon _0)= h_{2^{\mu }}(2\tau ) + \frac{1}{\mu }h_2(\upsilon _0)-h_{2^{\mu }}(2\tau + \upsilon _0).\end{aligned}$$

The definition of the entropy function implies that for any \(x\in [0,1]\)

$$\begin{aligned} h_{2^{\mu }}(x) = \frac{1}{\mu } \left( h_2(x) + x \log _2(2^{\mu }-1)\right) . \end{aligned}$$
(17)

Applying (17), we infer that

$$\begin{aligned} \mu f_{\mu }(\tau ,\upsilon _0) = h_2(2\tau ) + h_2(\upsilon _0) -h_2(2\tau + \upsilon _0) - \upsilon _0\log _2(2^{\mu }-1). \end{aligned}$$
(18)

In particular, \(\mu f_{\mu }(0,\upsilon _0) = -\upsilon _0\log _2(2^{\mu }-1) \le 0\).

So for \(\tau =0\), Theorem 11 is better than Theorem 9. It follows from Lemma 11 that the three leftmost terms in (18) form a non-negative number. The subtraction of the fourth term, however, can result in a negative function value, especially for large \(\mu \).

Example 6

(Numerical example) \(\mu f_{\mu }(0.055,0.11) = 2\cdot h_2(0.11)- h_2(0.22) -0.11\log _2(2^{\mu }-1)\approx 0.2397-0.11 \log _2(2^{\mu }-1)\) is positive for \(\mu \le 2\) and negative otherwise. \(\square \)

We now prove Proposition 7. That is, we show that for \(\tau >0\) and \(\upsilon _0\) sufficiently small, \(f_{\mu }(\tau ,\upsilon _0) > 0\). This follows from the Taylor expansion of \(\mu f_{\mu }(\tau ,\upsilon _0)\) around \(\upsilon _0=0\): indeed, \(f_{\mu }(\tau ,0)=0\), and the partial derivative of (18) with respect to \(\upsilon _0\) equals \(h'_2(\upsilon _0)-h'_2(2\tau +\upsilon _0)-\log _2(2^{\mu }-1)\), which tends to \(+\infty \) as \(\upsilon _0\downarrow 0\), because \(h'_2(x)\rightarrow \infty \) if \(x\downarrow 0\) while the remaining terms stay bounded for \(\tau >0\).\(\square \)
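The sign pattern in Example 6 can be reproduced with a few lines (our check, not from the paper):

```python
from math import log2

def h2(x):
    return -x * log2(x) - (1 - x) * log2(1 - x)

c = 2 * h2(0.11) - h2(0.22)                 # approximately 0.2397
for mu in range(1, 6):
    val = c - 0.11 * log2(2 ** mu - 1)      # mu * f_mu(0.055, 0.11) as in (18)
    print(f"mu={mu}: {val:+.4f}")           # positive for mu <= 2, negative otherwise
```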

10 Conclusion

In this paper, code constructions and bounds for non-volatile memories with partial defects have been proposed. Our constructions can handle both partial defects (also called partially stuck cells) and random substitution errors, and they require fewer redundancy symbols for \(u> 1\) and \(q>2\) than the known constructions for stuck cells. Compared to error-free masking of partially stuck cells, our achieved code sizes coincide with those in [29], or are even larger as shown in Proposition 1. We summarize our constructions and the previous works on partially/fully stuck cells in Table 3.

Further, we have shown that it can be advantageous to introduce errors in some partially stuck cells in order to satisfy the stuck-at constraints. For the general case that is applicable for all of our constructions, we have shown in Theorem 5 how to replace any \(0 \le j \le t\) errors by j masked partially stuck cells. This theorem has been improved for Construction 3 by Lemma 1, and further enhanced by another method for introducing errors in the partially stuck locations through Lemma 2 (cf. Example 5). We gain (e.g., for \(j=1\)) exactly \(2^{\mu -1}\) and \(2^{\mu }\) (under the condition that \(2(t+ \lfloor \frac{u}{q}\rfloor ) <d\)) additional masked partially stuck cells applying Lemmas 1 and 2, respectively. So far, determining if introducing errors in partially stuck cells is advantageous or not can only be done numerically.

We also derived upper and lower limits on the size of our constructions. Our sphere-packing-like bound for the size of \((\varSigma ,t)\) PSMCs has been compared to the usual sphere-packing upper bound, and for the case of no errors (\(t=0\)) to [29, Theorem 2].

We have numerically compared our Gilbert–Varshamov-type bounds, for given (u, t), to each other and to \((q-1)\)-ary codes. For \(u \le q-1\), Theorem 6 states the existence of (u, 1, t) PSMCs with rates that almost match the ones from the usual q-ary GV bound (shown in Fig. 3). Moreover, up to \(u = 20\) for \(q=8\), Fig. 6 shows that applying Theorem 6 is better than using a \((q-1)\)-ary code as mentioned in [29, Section III]. On the other hand, for \(q=4\) and \(u =10\), Theorems 7 and 8 obviously require less redundancy than a \((q-1)\)-ary code, as shown in Fig. 5.

Figures 8 and 9 demonstrate the application of Theorem 5 and Lemmas 1 and 2 to (u, 1, t) PSMCs whose rates have been obtained from Theorems 7 and 8. For some parameters (i.e., \(u=16\), \(t=41\) as shown in Table 4, and \(u=19\), \(t=31\) as shown in Table 5), the application of Theorem 5 and Lemma 1 achieves higher code rates and more masked cells. Applying Lemma 2 to a code obtained by Theorem 6 (i.e., \(u=7\), \(t=27\)) provides a higher code rate compared to the direct employment of Theorem 8, Theorem 5, or Lemma 1.

In the asymptotic regime of our GV-like bounds, Theorems 9 and 11 are remarkable competitors to Theorem 10. However, the analytical comparison between Theorem 9 and Theorem 11 is more complicated, and it is hard to decide which one is the better choice. This was also confirmed numerically via Fig. 7.