1 Introduction

Information theory is a field of applied probability that deals with the quantification of uncertainty. Initiated by a landmark paper by Claude Shannon1, it was originally intended for data compression and for determining the fundamental limits of communication. Today, however, it finds a variety of applications, including portfolio optimization2 and evolutionary biology3. Recently, there has also been work on characterizing quantum mechanical interactions as channels and viewing them from an information-theoretic perspective. This allows us to obtain channel capacity results for those interactions similar to the classical case. Moreover, there are some new notions of capacity (e.g., quantum capacity) which are specific to quantum communication. While there do exist extensive surveys4 on finite-dimensional quantum information theory, there are very few for infinite-dimensional cases. Interesting channels such as Gaussian Bosonic channels, which model optical channels, fall within this purview.

When considering classical channel coding problems, the channel is modeled as a conditional probability mass function (pmf): given a certain input, an output is generated according to the pmf. In quantum channel coding problems, the channel is modeled as a completely positive trace preserving (CPTP) map5,6,7. These recover the classical channel results as special cases. What makes quantum channels special is the use of entangled states and superpositions of states8, which have no classical equivalents and which aid in boosting channel capacity. An elementary example is superdense coding9, where, in the presence of entanglement, 2 bits of information may be sent using one entangled qubit. Thus the study of quantum information theory promises significant gains in telecommunications, provided we can get around the intricacies involved. For example, quantum modeling of optical fibers can provide a higher transmission rate of classical information than is possible via classical modeling (see10), and in fact higher even than semi-classical modeling with direct, homodyne or heterodyne detection.

While there are several potential benefits in switching to quantum communications, it must be noted that there are some restrictions that prevent us from abusing the technology. These are characterized by the so-called no-go theorems. For instance, the no-cloning theorem11,12,13 states that there is no unitary operation that can clone an arbitrary unknown state. This is unlike classical communication, where bits may be copied. Then there is the no-communication theorem14, which states that the very act of measuring an entangled state does not communicate information. This is important because measurement in quantum systems is very unlike measurement in classical systems, and by itself it does not help in communication. Finally, the quantum teleportation protocol15,16 may lead to the incorrect conclusion that, as the state transfer was instant, we communicated faster than the speed of light17. In fact, that protocol is constrained by an auxiliary classical channel, which forces the information transfer to be limited by the speed of light. Without the classical channel outputs, the receiver cannot extract any useful information from its state. These are just some of the many examples that illustrate the limitations of quantum technology.

When restricted to finite-dimensional Hilbert spaces, most results and techniques closely mimic those of classical information theory5,7. For example, achievability of the classical discrete memoryless channel capacity is derived using the method of typical sets2. The quantum analog is the method of typical subspaces, which uses a similar proof technique to derive the achievability bound. There are quantum analogs of several classical information theory concepts, e.g., quantum entropy (also known as von Neumann entropy), quantum KL divergence and quantum mutual information, to name a few (see5,6,7). For the concepts that have no classical analog, e.g., coherent information and quantum capacity, finite-dimensional quantum information theory is still tractable for several classes of channels. To aid in the understanding of infinite-dimensional results, we will recall the finite-dimensional results wherever applicable.

Other interesting applications of quantum information theory (QIT) include quantum error correction (QECC)8,18,19, quantum source coding (QSC)20,21, quantum key distribution (QKD), private capacity and the quantum internet, to name a few. We do not cover these in our survey but briefly comment on them below.

Classical error correction deals with the design of coding techniques to safeguard messages from a specified number of errors. This is done by adding redundant symbols to the message, which serve to recover the message when an error occurs. When the messages are binary, the errors are merely bit flips. In QECC, the error could be a bit flip (X gate), a phase flip (Z gate), a Y-gate error or any superposition of these. Thus there are more types of errors, and so the coding techniques are more sophisticated. The first quantum error-correcting code was the Shor 9-qubit code22, which requires 8 additional qubits to protect a single qubit from any type of Pauli gate (X, Z or Y gate) error. The idea that the uncountable number of errors can be efficiently corrected started with22, was refined in Calderbank-Shor-Steane (CSS) codes23, and culminated in stabilizer codes8. We require more qubits (at least five) to protect a single qubit than the bits needed to protect a single bit in classical ECC. Moreover, unlike bits, qubits have a tendency to decohere and, therefore, lose their information unless extreme conditions (near zero Kelvin temperature, superconductivity, etc.) are maintained. This led to the field of fault-tolerant quantum computing24, which also takes into account the failure rate of the qubits. However, building a sustainable fault-tolerant quantum system is still a distant goal. The above references concern QECC for discrete (qubit) variables, but there has also been extensive work on extending these to continuous variables (see24,25). An interesting no-go theorem here is that error correction of Gaussian noise imposed on Gaussian states using only Gaussian operations is impossible26.

Classical information theory gives a lower bound on the number of bits required to describe a source, known as the source coding theorem1,2. In QIT, one instead considers a quantum source, which outputs quantum states. The states need not be orthonormal and are output according to a given pmf. The quantum source coding result by Schumacher5,20 is analogous to the classical one, in that the von Neumann entropy of the averaged state of the source is the minimum number of qubits required to represent the source. For more recent results, see27. Shannon’s rate distortion theory2 has also been extended; see28 for previous references and further extensions.

Transmitting secret messages over a public channel can be achieved by a private key or by a public key protocol. Private key protocol was invented by Shannon1. It is provably secure. However, it requires a secret key shared between the transmitter and the receiver which is as long as the message to be transmitted and can be used only once. Thus public key cryptography is commonly used. However, its secrecy relies only on the difficulty of computing some functions; e.g., RSA29 exploits the difficulty of factoring large integers. But with Shor’s algorithm30, factoring can be accomplished efficiently if quantum computers are available. The QKD protocol BB84 (Bennett and Brassard31) helps in distributing a secret key between the transmitter and the receiver unconditionally. This key can be used in a private key protocol. In BB84, due to the no-cloning theorem, the eavesdropper cannot copy the message from the channel without being noticed. An improvement to the protocol, SARG0432, was proposed in 2004. Of all the quantum protocols mentioned so far, it is QKD that is closest to being implemented in practice33. See34 for different variations of QKD. For continuous (Gaussian) variables, see35,36,37. Scarani et al.36 provide an extensive comparison of discrete and continuous variable QKD.

Private classical capacity refers to the classical capacity of communication which is provably secret via a suitable secrecy metric. A good reference is the book by El Gamal and Kim38. In the quantum setting, major strides were made by Devetak39, who extended the HSW capacity results to private capacity, and by Ke Li et al.40, who showed that private classical capacity is non-additive. Private classical capacity is the maximum rate at which a secret key can be established between two users in QKD. For finite dimensions, it is lower bounded by the quantum capacity5,6. For degradable channels, it equals the quantum capacity. For infinite dimensions, recent contributions are41 and42, where for some Gaussian channels, private classical capacity is shown to be equal to the quantum capacity.

A widespread use of quantum computing and communication requires a quantum internet43,44,45. In a quantum internet, all communication between multiple parties is carried out using quantum protocols and secrecy is ensured using QKD. Thus, a major issue in a quantum internet is to establish an entangled state between two or more distributed nodes. This is needed for QKD, for transmitting quantum information (say, through teleportation) and for distributed quantum computation. We will also see in this review that entanglement between the transmitter and the receiver can increase the classical and quantum capacity of the quantum channel. However, entanglement between two nodes cannot be generated via a classical communication link alone; a quantum channel is also required. Thus, to design a quantum network efficiently, one needs to compute the entanglement generation capacity of point-to-point and general networks. For finite-dimensional point-to-point quantum channels, it equals the quantum capacity7 studied in this review. For some infinite-dimensional channels, this equality has been shown in42. Another important consideration in a quantum internet is the two-way capacities of networks of channels. In our review, we do not consider these capacities, although they are intimately connected to one-way communication between point-to-point channels studied in this review. See45,46 and the references therein for more details on these issues.

The useful concepts of zero-error channel capacity and network coding have also been extended to the quantum domain4.

The trouble starts when we venture into infinite-dimensional Hilbert spaces. Without a suitable framework, most, if not all, of the machinery developed for finite-dimensional Hilbert spaces is inapplicable. This is because the proof techniques rely on the finiteness of the dimension of the underlying Hilbert space. Moreover, without additional constraints on the states of the Hilbert space, the channels under study may be unphysical and have a trivially infinite classical capacity. Unfortunately, as mentioned earlier, a very important class of channels, namely the Gaussian Bosonic channels, are constrained channels in infinite dimensions. These model optical channels, which are encountered in fiber optic as well as free space optical communications47,48.

There are two ways to circumvent these issues as follows: One is to find an alternate finite-dimensional or parameter representation of the states and work with that. Fortunately, Bosonic Gaussian channels in constrained space are amenable to it. The other is to use suitable limit theorems that allow us to obtain the results we desire via a simple limit. As we are not providing proofs, we may not see it but we will recall those results.

Major milestones in the development of QIT are the following: Quantum entropy was defined by von Neumann in 1927. Holevo49 proved the first theorem of QIT and provided an upper bound on the mutual information between transmitter and receiver. The quantum source coding theorem, corresponding to Shannon’s1 source coding theorem, was proved by Schumacher20. Two early communication protocols, namely quantum teleportation15 and quantum dense coding9, have turned out to be fundamental in QIT. Holevo50 and Schumacher and Westmoreland51 independently proved the HSW theorem, providing the classical capacity result for a quantum channel.

Entanglement-assisted classical capacity was obtained by Bennett, Shor, Smolin and Thapliyal in52.

Quantum information transmission via a quantum channel was initiated in Schumacher and Nielsen53, Adami and Cerf54 along with others. The quantum capacity for finite-dimensional channels was proved in Devetak39. Shor47 proved that non-additivity of Holevo capacity is a result of non-additivity of minimum output entropy. An early survey providing far more references is found in55.

The quantum teleportation and dense coding algorithms have been extended to continuous variables in56 and57. For infinite-dimensional channels, the classical capacity for energy-constrained quantum channel is obtained in58. The results for entanglement-assisted classical capacity are provided in55. The energy-constrained quantum capacity is studied in42. A major breakthrough was in59 and Holevo60 for linear bosonic Gaussian gauge covariant or contravariant channels where additivity of the minimum output entropy was shown and hence a single-letter characterization of classical capacity was obtained.

The surveys4,6,55,61,62 provide a lot more information on the topics mentioned above. The book63 addresses the application of these issues in practical communication systems.

The paper is organized as follows: In Sect. 2, we provide the notation and basic definitions and results in QIT. We also particularize these for Gaussian states. Section 3 introduces C–Q channels, which are the most basic form of a quantum channel and we provide classical capacity results for the finite as well as the available results for the Gaussian C–Q channels. In Sect. 4, we provide the classical capacity results for quantum channels and then juxtapose them with the single mode Gaussian channel results. We also discuss the various types of Gaussian channels and discuss quantum channels with entanglement assistance. Section 5 discusses quantum capacities of the channels we studied so far. In Sect. 6, we discuss an elementary model for free space optical channels and in Sect. 6.1, we match the model obtained earlier with the Gaussian channels we studied and obtain corresponding capacity results. Section 7 discusses the multiuser versions of the quantum channels. In Sect. 8, we discuss recent developments and some open questions in this field. Finally, we conclude the survey.

Our focus, therefore, will be on quantum communication problems in infinite dimensions with special attention to Gaussian channels. We will provide the classical as well as quantum capacity expressions, for the cases where they are known.

2 Preliminaries

A good introduction to the following topics is in6,64. Let \(\mathcal {H}\) be a separable Hilbert space, which may be infinite dimensional. Let \(\mathscr {B}(\mathcal {H})\) be the Banach algebra of bounded linear operators on \(\mathcal {H}\).

For \( A \in \mathscr {B}(\mathcal {H}) \), let

$$\begin{aligned} \vert {A}\vert \triangleq \sqrt{A^*A},\quad \mathrm {Tr}(A) = \sum _i \langle e_i\big \vert A e_i \rangle ,\quad \Vert A\Vert _1 \triangleq \mathrm {Tr}(\vert {A}\vert ), \end{aligned}$$
(1)

where \(A^*\) is the adjoint of A and \( \{e_i\} \) is an orthonormal basis (o.n.b) in \(\mathcal {H}\) and \(\langle .\big \vert . \rangle \) denotes the inner product of \(\mathcal {H}\). \( \Vert A\Vert _1 \) is called the trace norm of A. It is independent of the chosen basis. The set

$$\begin{aligned} \Gamma (\mathcal {H}) = \{A \in \mathscr {B}(\mathcal {H}): \Vert A\Vert _1 < \infty \} \end{aligned}$$

is a Banach space (with norm \( \Vert .\Vert _1 \)) of operators. Its elements are called the trace class operators.

The operator norm (or simply norm unless otherwise specified) of \( A \in \mathscr {B}(\mathcal {H}) \)

$$\begin{aligned} \left\| A \right\| \triangleq \sup \limits _{\Vert x\Vert \le 1} \left\| Ax \right\| \end{aligned}$$
(2)

satisfies \( \left\| A \right\| \le \left\| A \right\| _1 \) and the dual \( \Gamma (\mathcal {H})^* = \mathscr {B}(\mathcal {H}). \)

We will use the following definitions:

Definition 1

Density Operator. An operator \( A \in \Gamma (\mathcal {H}) \) is called a density operator if it is positive semidefinite (denoted by \(A \ge 0\)) and \( \mathrm {Tr}(A) = 1 \).

A density operator A defines the state of a quantum system with the underlying space \(\mathcal {H}\). Let \( \mathscr {S}(\mathcal {H}) \) be the space of all density operators on \( \mathcal {H} \). It is a closed convex subset of \( \Gamma (\mathcal {H}) \). It is also a complete separable metric space with metric

$$\begin{aligned} \rho (S_1, S_2) = \Vert S_1-S_2\Vert _1, \end{aligned}$$
(3)

for any \( S_1,S_2 \in \mathscr {S}(\mathcal {H}) \).

For separable Hilbert spaces \(\mathcal {H}_1\) and \(\mathcal {H}_2\), \(\mathcal {H}_1\otimes \mathcal {H}_2\) denotes the tensor product space. A density operator on \(\mathcal {H}_1\otimes \mathcal {H}_2\) is separable if it is in the closure of the convex hull of product states \(S_1\otimes S_2 \in \Gamma (\mathcal {H}_1\otimes \mathcal {H}_2) \). Otherwise, it is called an entangled state.

Definition 2

Partial Trace Operators. Operators \(\mathrm {Tr}_A\) and \(\mathrm {Tr}_B\) are linear operators on \(\Gamma (\mathcal {H}_A\otimes \mathcal {H}_B)\) defined via \(\mathrm {Tr}_A(A \otimes B) = B\,\mathrm {Tr}(A)\) and \(\mathrm {Tr}_B(A \otimes B) = A\,\mathrm {Tr}(B)\). If \(\rho ^{AB}\) is a state of a system in \(\mathcal {H}_A\otimes \mathcal {H}_B\), then \(\mathrm {Tr}_A(\rho ^{AB})\) and \(\mathrm {Tr}_B(\rho ^{AB})\) are the corresponding states of the subsystems on \(\mathcal {H}_B\) and \(\mathcal {H}_A\), respectively.
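As a finite-dimensional illustration of Definition 2, the following minimal sketch computes partial traces of a two-qubit state. The function name `partial_trace`, the nested-list matrix representation and the basis ordering (index \(i \cdot \dim \mathcal{H}_B + j\) for the product basis vector of the two subsystems) are our own assumptions, not notation from the references.

```python
import math

def partial_trace(rho, dim_a, dim_b, traced):
    """Partial trace of a (dim_a*dim_b) x (dim_a*dim_b) matrix (nested lists).

    traced="B" implements Tr_B and returns a dim_a x dim_a matrix;
    traced="A" implements Tr_A and returns a dim_b x dim_b matrix.
    Assumed basis ordering: index = i * dim_b + j for |i>_A |j>_B.
    """
    if traced == "B":
        return [[sum(rho[i * dim_b + k][j * dim_b + k] for k in range(dim_b))
                 for j in range(dim_a)] for i in range(dim_a)]
    if traced == "A":
        return [[sum(rho[k * dim_b + i][k * dim_b + j] for k in range(dim_a))
                 for j in range(dim_b)] for i in range(dim_b)]
    raise ValueError("traced must be 'A' or 'B'")

# Bell state (|00> + |11>)/sqrt(2); amplitudes are real, so no conjugation is needed.
amp = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
bell = [[x * y for y in amp] for x in amp]

rho_a = partial_trace(bell, 2, 2, traced="B")  # state of subsystem A
rho_b = partial_trace(bell, 2, 2, traced="A")  # state of subsystem B
```

Both reduced states come out as the maximally mixed qubit state, which is the usual signature of a maximally entangled pure state.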

Definition 3

Weak Convergence. A sequence of operators \(A_n\) converges weakly to operator A, denoted by \( A_n {\mathop {\rightarrow }\limits ^{w}} A\), if for all \( \big \vert \psi \big \rangle , \big \vert \phi \big \rangle \in \mathcal {H} \),

$$\begin{aligned} \langle \psi \big \vert A_n\phi \rangle \rightarrow \langle \psi \big \vert A\phi \rangle . \end{aligned}$$
(4)

One can show6 that \( S_n {\mathop {\rightarrow }\limits ^{w}}S \) weakly in \( \mathscr {S}(\mathcal {H}) \) if and only if \( \Vert S_n -S\Vert _1 \rightarrow 0 \).

Let \( e_i \) be an o.n.b in \( \mathcal {H} \) and \( f_i \), a real valued lower bounded sequence. Let \( F \in \mathscr {B}(\mathcal {H}) \) be defined as

$$\begin{aligned} F\big \vert \psi \big \rangle = \sum _j f_j \big \vert e_j\big \rangle \langle e_j\big \vert \psi \rangle . \end{aligned}$$
(5)

It is self-adjoint with eigenvalues \( \{f_j\} \) with corresponding eigenvectors \( \{\big \vert e_j\big \rangle \} \). These will be referred to as operators of type \( \mathcal {F} \).

If S is a density operator, then

$$\begin{aligned} \mathrm {Tr}(SF) = \sum _j f_j \langle e_j\big \vert Se_j \rangle \le \infty \end{aligned}$$
(6)

is the expectation of the observable F in the state S. The mapping \( S\mapsto \mathrm {Tr}(SF) \) is affine and lower semicontinuous on \( \mathscr {S}(\mathcal {H}) \). For \( E < \infty \),

$$\begin{aligned} \mathscr {S}_E = \{S:\mathrm {Tr}(SF) \le E\} \end{aligned}$$
(7)

is a compact set.

Definition 4

von Neumann Entropy.5,6,7 For a density operator S with spectral decomposition \( S = \sum _i s_i \vert {e_i}\rangle \langle {e_i}\vert \), the von Neumann entropy is defined as

$$\begin{aligned} H(S) = -\sum _i s_i \log s_i = -\mathrm {Tr}(S\log S). \end{aligned}$$
(8)

Also for density operators S and T, the relative entropy of S w.r.t T is

$$\begin{aligned} H(S;T) = \mathrm {Tr}(S\log S) - \mathrm {Tr}(S\log T). \end{aligned}$$
(9)
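The definitions (8) and (9) can be evaluated directly once the spectra are known. The sketch below is a minimal illustration under two assumptions of ours: logarithms are natural (the text does not fix the base), and for the relative entropy both operators are diagonal in the same basis, so that only their eigenvalue lists are needed. The function names are hypothetical.

```python
import math

def von_neumann_entropy(eigenvalues):
    """H(S) = -sum_i s_i log s_i in nats, with the convention 0 log 0 = 0."""
    return -sum(s * math.log(s) for s in eigenvalues if s > 0)

def relative_entropy_commuting(s, t):
    """H(S;T) = Tr(S log S) - Tr(S log T) for S, T diagonal in a common basis.

    s and t are the eigenvalue lists in that common basis. The value is
    +infinity when S has weight on a direction where T has eigenvalue 0.
    """
    total = 0.0
    for si, ti in zip(s, t):
        if si > 0:
            if ti == 0:
                return math.inf
            total += si * (math.log(si) - math.log(ti))
    return total

pure = [1.0, 0.0]    # spectrum of a pure qubit state
mixed = [0.5, 0.5]   # spectrum of the maximally mixed qubit state
h_pure = von_neumann_entropy(pure)
h_mixed = von_neumann_entropy(mixed)
d_pure_mixed = relative_entropy_commuting(pure, mixed)
d_mixed_pure = relative_entropy_commuting(mixed, pure)
```

In this example the last quantity is infinite, which already shows in two dimensions why relative entropy has to be handled with care.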

For \( \mathcal {H} \) finite dimensional, \( H(S) < \infty \) and H(S) and H(S; T) are continuous in the \( \Vert .\Vert _1 \) topology5,42. But when \( \mathcal {H} \) is infinite dimensional, H(S) may be infinite and \( S \mapsto H(S) \) and \( (S,T) \mapsto H(S;T) \) are only lower semicontinuous6. However, the following important result holds: Let

$$\begin{aligned} \exp (-\theta F)\big \vert \psi \big \rangle = \sum _j \exp (-\theta f_j)\big \vert e_j\big \rangle \langle e_j\big \vert \psi \rangle \end{aligned}$$
(10)

for all \( \theta > 0 \).

Lemma 1

6 If \( \mathrm {Tr}(\exp (-\theta F)) < \infty \), then H(S) is bounded and continuous on \(\mathscr {S}_E\).

2.1 Gaussian States

Gaussian states are ubiquitous in quantum systems. Any physical quantum system, in small oscillation limit can be represented by a ground state or a thermal equilibrium state. These are Gaussian states. In the following, we will first define general single mode Bosonic states and then the subclass of Gaussian states. See6,62,65 for details.

Consider the Hilbert space \( \mathcal {H} = L^2(\mathbb {R}) \), where \( L^2(\mathbb {R}) \) is the space of square integrable complex-valued functions on \(\mathbb {R}\). For \( \big \vert \psi \big \rangle \in \mathcal {H} \), define the position operator q and the momentum operator p as

$$\begin{aligned} (q(\psi ))(x) = x\psi (x), \qquad (p(\psi ))(x) = -i\hbar \frac{d}{dx}\psi (x), \end{aligned}$$
(11)

where \( \hbar \) is the reduced Planck constant. Operators p, q, called canonical variables, are unbounded and self-adjoint with dense domains. For \(\omega \in \mathbb {R}^+\), define

$$\begin{aligned} a = \frac{1}{\sqrt{2\hbar \omega }}(\omega q + ip), \qquad a^\dagger = \frac{1}{\sqrt{2\hbar \omega }}(\omega q - ip). \end{aligned}$$
(12)

In electromagnetics, a, \(a^\dagger\) represent a mode of the EM field corresponding to frequency \(\omega\) at a particular polarization. Also define the number operator

$$\begin{aligned} \mathcal {N} = a^\dagger a = aa^\dagger - I. \end{aligned}$$
(13)

Also define

$$\begin{aligned} \big \vert n\big \rangle \triangleq \frac{(a^\dagger )^n}{\sqrt{n!}}\big \vert 0\big \rangle , \quad \text {for } n=0,1,2, \ldots \end{aligned}$$
(14)

where \(\big \vert 0\big \rangle \) is the eigenstate of \(a\) corresponding to eigenvalue 0. The set \(\{\big \vert n\big \rangle , n=0,1,2\ldots \}\) forms a countable o.n.b for the space \(\mathcal {H}\), called Fock space. Thus \(\mathcal {H}\) is a separable Hilbert space. State \(\big \vert n\big \rangle \) has n photons with frequency \(\omega \).

We have \( a\big \vert n\big \rangle = \sqrt{n}\big \vert n-1\big \rangle \), for \(n \ge 1\) and \( a^\dagger \big \vert n\big \rangle = \sqrt{n+1}\big \vert n+1\big \rangle \) for \(n \ge 0\). Thus \( a^\dagger \) is called the creation operator and a, the annihilation operator and \(\omega \) their frequency. Each \(z \in \mathbb {C}\) is an eigenvalue of \(a\) with eigenvector \(\vert {z}\rangle \), called a coherent state. Operator \( \mathcal {N} \) has spectral decomposition

$$\begin{aligned} \mathcal {N} = \sum \limits _{n=0}^\infty n\vert {n}\rangle \langle {n}\vert . \end{aligned}$$
(15)
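The ladder relations and the spectral decomposition (15) can be checked numerically in a truncated Fock basis. The sketch below is only an illustration: the cutoff `dim` and the helper names are our own assumptions, and the truncation necessarily breaks the identity \(aa^\dagger - a^\dagger a = I\) in the last basis vector.

```python
import math

def annihilation(dim):
    """Matrix of a in the truncated basis |0>, ..., |dim-1>: a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def transpose(m):
    """Transpose; equals the adjoint here because all entries are real."""
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    """Plain matrix product of nested-list matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

dim = 6  # hypothetical cutoff, chosen only for the illustration
a = annihilation(dim)
a_dag = transpose(a)
number = matmul(a_dag, a)  # should be diag(0, 1, ..., dim-1)
commutator = [[p - q for p, q in zip(r1, r2)]
              for r1, r2 in zip(matmul(a, a_dag), number)]
```

Here `number` is diagonal with entries 0, 1, ..., 5, and `commutator` is the identity except for the last diagonal entry, which equals \(-(\texttt{dim}-1)\) because of the cutoff.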

The variances \(\mathrm {Var}(p(\psi )) = \left\| (p-y)\big \vert \psi \big \rangle \right\| ^2\) and \(\mathrm {Var}(q(\psi )) = \left\| (q-x)\big \vert \psi \big \rangle \right\| ^2\), where \(y = \langle \psi \big \vert p\psi \rangle \) and \(x = \langle \psi \big \vert q\psi \rangle \), satisfy the following uncertainty principle:

$$\begin{aligned} \mathrm {Var}(p(\psi ))\mathrm {Var}(q(\psi )) \ge \frac{\hbar ^2}{4}. \end{aligned}$$
(16)

For a probability density p on \(\mathbb {C}\), the set of complex numbers, we can define a density operator

$$\begin{aligned} S = \int _{z \in \mathbb {C}} \vert {z}\rangle \langle {z}\vert p(z) d^2z . \end{aligned}$$
(17)

This is called a single-mode Bosonic state. A Bosonic state that maximizes entropy for a fixed energy is called a Thermal state.

As a special case, a single mode Gaussian state with mean \( \mu \) and variance N is defined as

$$\begin{aligned} S_{\mu } = \frac{1}{\pi N}\int _{z \in \mathbb {C}} \vert {z}\rangle \langle {z}\vert \exp \left( -\frac{\vert {z - \mu }\vert ^2}{N}\right) d^2z , \end{aligned}$$
(18)

where \( \mu \in \mathbb {C} \). Then we can show that for \(\mu = 0\),

$$\begin{aligned} S_0 = \frac{1}{N+1}\sum _{n=0}^{\infty } \left( \frac{N}{N+1}\right) ^n \vert {n}\rangle \langle {n}\vert \quad \text {and} \end{aligned}$$
(19)
$$\begin{aligned} H(S_0) = \frac{1}{N+1}\sum _{n=0}^{\infty } \left( \frac{N}{N+1}\right) ^n [(n+1)\log (N+1) - n\log N] = g(N), \end{aligned}$$
(20)

where

$$\begin{aligned} g(n) = (n+1)\log (n+1) - n\log n \end{aligned}$$
(21)

and \( g(0) = 0 \). \( S_\mu \) is pure or coherent iff \( N =0 \) and then \( S_\mu = \vert {\mu }\rangle \langle {\mu }\vert \). A thermal state is a Gaussian state with \(\mu = 0\).
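The identity \(H(S_0) = g(N)\) in (20) is easy to check numerically from the geometric spectrum in (19). The following sketch assumes natural logarithms and a finite cutoff on the photon number; the function names, the value of `N` and the cutoff are illustrative choices of ours.

```python
import math

def g(x):
    """g(x) = (x+1) log(x+1) - x log x, with g(0) = 0."""
    if x == 0:
        return 0.0
    return (x + 1) * math.log(x + 1) - x * math.log(x)

def thermal_spectrum(mean_photons, cutoff):
    """First `cutoff` eigenvalues of S_0 in (19): (N/(N+1))^n / (N+1)."""
    ratio = mean_photons / (mean_photons + 1)
    return [ratio ** n / (mean_photons + 1) for n in range(cutoff)]

def entropy(spectrum):
    """-sum s log s in nats, skipping zero eigenvalues."""
    return -sum(s * math.log(s) for s in spectrum if s > 0)

N = 1.5  # hypothetical mean photon number
spectrum = thermal_spectrum(N, cutoff=400)
total_probability = sum(spectrum)
mean_photon_number = sum(n * s for n, s in enumerate(spectrum))
numerical_entropy = entropy(spectrum)
```

With this cutoff the neglected tail is far below machine precision, so `numerical_entropy` agrees with `g(N)` to rounding error and `mean_photon_number` recovers `N`.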

The characteristic function of \( S_\mu \) is defined as

$$\begin{aligned} \mathrm {Tr}(S_\mu D(z)) = \exp (2i\mathcal {I}(\mu z) - \left( N+\frac{1}{2}\right) \vert {z}\vert ^2), \end{aligned}$$
(22)

where \(D(z)\psi (\xi ) = \exp \left( \frac{iy}{\hbar }\left( \xi - \frac{x}{2}\right) \right) \psi (\xi - x)\) for \(z = \frac{\omega x + iy}{\sqrt{2\hbar \omega }}\) and \(\mathcal {I}(z)\) is the imaginary part of z.

Even though Gaussian states have infinite-dimensional support, they can be completely described by first and second moments. Thus these are much more theoretically tractable, especially after using symplectic formulation provided below.

3 C–Q Channels

Let \(\mathcal {X}\) be a complete separable metric space with a \( \sigma \)-algebra of Borel sets.

Definition 5

A weakly continuous mapping \( \Phi : x \mapsto S_x\), from \( \mathcal {X} \) to \( \mathscr {S}(\mathcal {H}) \) is called a C–Q (classical-quantum) channel.

At the output \( S_x \) of the channel, we take measurements via non-negative measurement operators \( M = \{M_y, y\in \mathcal {Y}\} \) for some output alphabet \( \mathcal {Y} \). These operators satisfy \(\sum _{y \in \mathcal {Y}} M_y = I\). With these measurements, if the input is x, then the probability of observing outcome y is given by

$$\begin{aligned} p(y\vert x)= \mathrm {Tr}(S_xM_y) \end{aligned}$$
(23)

The above probability effectively defines a classical channel between alphabets \( \mathcal {X} \) and \( \mathcal {Y} \).
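To make (23) concrete, the sketch below builds the induced classical transition matrix for a toy C–Q channel with two qubit output states and a projective measurement in the computational basis. The input labels, the states and the measurement are hypothetical choices of ours; amplitudes are taken real so that plain nested lists suffice.

```python
import math

def trace_of_product(s, m):
    """Tr(S M) for square matrices given as nested lists."""
    n = len(s)
    return sum(s[i][k] * m[k][i] for i in range(n) for k in range(n))

def pure_state(theta):
    """Density matrix of cos(theta)|0> + sin(theta)|1> (real amplitudes)."""
    v = [math.cos(theta), math.sin(theta)]
    return [[x * y for y in v] for x in v]

# Hypothetical C-Q channel x -> S_x with two inputs.
states = {"x0": pure_state(0.0), "x1": pure_state(math.pi / 4)}
# Measurement operators M_y: projectors onto |0> and |1>, summing to I.
measurement = {"y0": [[1.0, 0.0], [0.0, 0.0]],
               "y1": [[0.0, 0.0], [0.0, 1.0]]}

# p(y|x) = Tr(S_x M_y), as in (23).
transition = {x: {y: trace_of_product(s, m) for y, m in measurement.items()}
              for x, s in states.items()}
```

Each row of `transition` sums to one because the measurement operators sum to the identity; a different choice of measurement gives a different classical channel for the same C–Q channel.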

We consider n independent uses of this channel with the transition probability given by (23). If we input the codeword \( x = (x_1, x_2, \ldots , x_n) \), where \( x_j \in \mathcal {X} \), to this channel then the probability of output \(y = (y_1, y_2, \ldots , y_n) \) is given by

$$\begin{aligned} P(y\vert x) = \prod _{i=1}^{n}\mathrm {Tr}(S_{x_i}M_{y_i}). \end{aligned}$$
(24)

Thus, this is a memoryless channel. Let \( P_e(W,M) \) be the maximum probability of error for codebooks W of codeword length n, with measurements M, with codewords chosen uniformly from W. Then for n channel uses and \( N_n \) codewords, the maximum probability of error for the channel is given by

$$\begin{aligned} P_e(n,N_n) = \min _{W, M} P_e(W,M), \end{aligned}$$
(25)

where minimum is taken over all codebooks W and measurements M.

3.1 Classical Capacity of a C–Q channel

Definition 6

The classical capacity C of a C–Q channel is defined as the supremum of the rates \(R = \lim \limits _{n \rightarrow \infty }\frac{\log N_n}{n}\) where \(n, N_n\) are such that \(\lim \limits _{n\rightarrow \infty } P_e(n,N_n) = 0\).

C is the maximum rate per channel use of classical information that can be transmitted with arbitrarily low probability of error. This is an asymptotic quantity. To achieve this rate, one may have to use the channel an infinite number of times. If \( \mathrm {dim}(\mathcal {H})< \infty \) (we can then take \( \mathcal {X} \) to be finite), then we have the following:

Theorem 1

(Holevo, Schumacher, Westmoreland)5,6,7 The classical capacity of a C–Q channel is given by

$$\begin{aligned} C = \max _\pi \left[ H\left( \sum _x \pi (x)S_x\right) - \sum _x \pi (x)H(S_x)\right] \triangleq \max _\pi \chi (\pi ) , \end{aligned}$$
(26)

where the max is over all probability distributions \( \pi \) on \( \mathcal {X} \). The quantity \( \chi (\pi ) \) is known as Holevo’s information. \(\square \)
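As a small worked instance of (26), the sketch below evaluates Holevo's information for a C–Q channel with two pure qubit output states and approximates the maximum over \(\pi\) by a crude grid search. Everything here (the two states, the grid, the closed-form 2x2 eigenvalue routine, natural logarithms) is an illustrative assumption of ours rather than a general-purpose method.

```python
import math

def pure_state(theta):
    """Density matrix of cos(theta)|0> + sin(theta)|1> (real amplitudes)."""
    v = [math.cos(theta), math.sin(theta)]
    return [[x * y for y in v] for x in v]

def eigenvalues_2x2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, in closed form."""
    mean = (m[0][0] + m[1][1]) / 2
    radius = math.hypot((m[0][0] - m[1][1]) / 2, m[0][1])
    return [mean - radius, mean + radius]

def entropy(m):
    """von Neumann entropy in nats; eigenvalues below 1e-15 are treated as 0."""
    return -sum(s * math.log(s) for s in eigenvalues_2x2(m) if s > 1e-15)

def holevo_information(priors, states):
    """chi(pi) = H(sum_x pi(x) S_x) - sum_x pi(x) H(S_x) for qubit states."""
    avg = [[sum(p * s[i][j] for p, s in zip(priors, states)) for j in range(2)]
           for i in range(2)]
    return entropy(avg) - sum(p * entropy(s) for p, s in zip(priors, states))

states = [pure_state(0.0), pure_state(math.pi / 4)]
# Crude grid search over priors (p, 1 - p) as a stand-in for the max over pi.
grid = [k / 100 for k in range(101)]
best_p = max(grid, key=lambda p: holevo_information([p, 1 - p], states))
capacity_estimate = holevo_information([best_p, 1 - best_p], states)
```

For two pure states the maximum is attained at equal priors, where the average state has eigenvalues \((1 \pm \vert c\vert )/2\) with \(c\) the overlap of the two state vectors. The resulting capacity is strictly below \(\log 2\) because the states are not orthogonal.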

If \( \mathrm {dim}(\mathcal {H}) = \infty \), the capacity in (26) could be infinite and the input energy used could also be infinite6,65. This happens in classical information theory as well. Thus, for channels of interest, an input energy constraint is imposed. Such constraints are also needed in practice. Let \( f:\mathcal {X}\rightarrow \mathbb {R}^+ \) be a Borel measurable function. Define

$$\begin{aligned} \mathscr {P}_E = \left\{ \text {Probability measures }\pi \text { on }\mathcal {X}: \int f(x)\pi (dx) \le E\right\} , \end{aligned}$$

for \(0< E < \infty \). Denote \( \overline{S}_\pi = \int S_x \pi (dx) \). Assume \( H(\overline{S}_\pi ) < \infty \) and define

$$\begin{aligned} \chi (\pi ) = H(\overline{S}_\pi ) - \int H(S_x) \pi (dx). \end{aligned}$$
(27)

Then we have

Theorem 2

6 Let F be a self-adjoint operator of type \( \mathcal {F} \) satisfying \( \mathrm {Tr}(\exp (-\theta F)) < \infty \) for all \( \theta > 0 \), and \( f(x) \ge \mathrm {Tr}(S_xF)\) for all \( x\in \mathcal {X} \). Then \( \sup _{\pi \in \mathscr {P}_E} H(\overline{S}_\pi )<\infty \) and the classical capacity of the C–Q channel \( x\mapsto S_x \) with constraint \( \int f(x) \pi (dx) \le E \) is given by

$$\begin{aligned} C_\chi = \sup _{\pi \in \mathscr {P}_E}\chi (\pi ) < \infty . \end{aligned}$$
(28)

This capacity is achievable if f is lower semicontinuous (l.s.c) and \( \{x:f(x) \le K\} \) is compact for any K with \( 0< K < \infty \). \(\square \)

In practice, one picks F so that it provides a meaningful energy constraint at the input. If f is l.s.c, then finitely supported probability distributions are dense in \(\mathscr {P}_E\). This helps in proving Theorem 2 from Theorem 1.

3.2 Gaussian C–Q Channel

We can define for each \( \mu \in \mathbb {C}\), mapping \(\mu \mapsto S_\mu \), where \( S_\mu \) is a Gaussian state as defined in (18). This C–Q channel is called a Gaussian C–Q channel. For transmission, we can use codeword \( w = (\mu _1, \mu _2, \ldots , \mu _n) \) with energy constraint

$$\begin{aligned} \sum _{i=1}^{n} \vert {\mu _i}\vert ^2 \le nE. \end{aligned}$$
(29)

Then the conditions of Theorem 2 are satisfied with \( F = a^\dagger a = \mathcal {N} \). Therefore the classical capacity of this channel is6,62,

$$\begin{aligned} C = \max _{\pi \in \mathscr {P}_E} \left[ H(\overline{S}_\pi ) - \int H(S_\mu ) \pi (d^2\mu )\right] , \end{aligned}$$
(30)

where \( \mathscr {P}_E = \{\pi \text { dist. }: \int \vert {\mu }\vert ^2\pi (d^2\mu ) \le E\} \).

We have \( H(S_\mu ) = H(S_0) = g(N) \) (from (20)). The maximum in (30) is attained by the Gaussian density operator and6

$$\begin{aligned} C = C_\chi &= g(N+E) - g(N) \\ &= \log \left( 1 + \frac{E}{N+1}\right) + (N+E)\log \left( 1 + \frac{1}{N+E}\right) - N\log \left( 1 + \frac{1}{N}\right) . \end{aligned}$$
(31)

This formula behaves as \( \log \left( 1 + \frac{E}{N+1}\right) \) for large N, and hence is a quantum generalization of Shannon’s formula.
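The two expressions in (31) and the large-N behaviour can be compared numerically. The sketch below assumes natural logarithms; the function names and the sample values of E and N are hypothetical.

```python
import math

def g(x):
    """g(x) = (x+1) log(x+1) - x log x, with g(0) = 0."""
    if x == 0:
        return 0.0
    return (x + 1) * math.log(x + 1) - x * math.log(x)

def gaussian_cq_capacity(energy, noise):
    """First line of (31): C = g(N + E) - g(N), in nats per channel use."""
    return g(noise + energy) - g(noise)

def expanded_form(energy, noise):
    """Second expression in (31); requires noise > 0 and energy > 0."""
    return (math.log(1 + energy / (noise + 1))
            + (noise + energy) * math.log(1 + 1 / (noise + energy))
            - noise * math.log(1 + 1 / noise))

def shannon_like(energy, noise):
    """Leading term log(1 + E / (N + 1)) that dominates for large N."""
    return math.log(1 + energy / (noise + 1))

c_small = gaussian_cq_capacity(2.0, 0.5)
c_small_expanded = expanded_form(2.0, 0.5)
gap_large_noise = gaussian_cq_capacity(1000.0, 1000.0) - shannon_like(1000.0, 1000.0)
c_noiseless = gaussian_cq_capacity(1.0, 0.0)  # equals g(E) when N = 0
```

The gap to the Shannon-like term is positive and shrinks as N grows with E/N fixed, which is the sense in which (31) generalizes the classical formula.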

4 Quantum Channel

To define a quantum channel, we first define the following terms:

Definition 7

Trace preserving mapping. For separable Hilbert spaces \(\mathcal {H}_A\) and \(\mathcal {H}_B\), a mapping \( \Phi : \Gamma (\mathcal {H}_A)\rightarrow \Gamma (\mathcal {H}_B) \) is called trace preserving if for all trace class operators T, \( \mathrm {Tr}(\Phi (T)) = \mathrm {Tr}(T) \).

Definition 8

Completely positive map. A mapping \( \Phi : \Gamma (\mathcal {H}_A)\rightarrow \Gamma (\mathcal {H}_B) \) is completely positive if \( \Phi \otimes I_n \) maps positive semidefinite operators to positive semidefinite operators for all \(n \ge 1\), where \(I_n\) is the identity operator in \( \mathbb {C}^n \). This definition holds for infinite dimensional spaces \(\mathcal {H}_A\) and \(\mathcal {H}_B\).

With these two definitions, we define a channel5,6,7 as a linear bounded map from \( \Gamma (\mathcal {H}_A) \) to \( \Gamma (\mathcal {H}_B) \) that is both trace preserving and completely positive. The C–Q channels are special cases.

Definition 9

Complementary channel and Entropy exchange. For a channel \(\Phi :\Gamma (\mathcal {H}_A) \rightarrow \Gamma (\mathcal {H}_B)\), there exists a Hilbert space \(\mathcal {H}_E\) and an isometry \(V:\mathcal {H}_A \rightarrow \mathcal {H}_B \otimes \mathcal {H}_E\), such that

$$\begin{aligned} \Phi (S) = \mathrm {Tr}_E(VSV^*), \end{aligned}$$
(32)

where \(\mathrm {Tr}_E\) is the partial trace wrt space \(\mathcal {H}_E\). (32) is known as the Stinespring representation of the channel \(\Phi \). Similarly we have the channel

$$\begin{aligned} \Psi (S) = \mathrm {Tr}_B(VSV^*), \end{aligned}$$
(33)

called the complementary channel of \(\Phi \). The entropy exchange of state S wrt \(\Phi \), is

$$\begin{aligned} H(S;\Phi ) = H(\Psi (S)). \end{aligned}$$
(34)

We provide some examples of channels that would be useful later.

Definition 10

For finite \(\mathrm {dim}(\mathcal {H}_A)\), the channel \(\Phi (S) = (1-p)S + p\Pi \), where \(0<p<1\) and \(\Pi \) is the maximally mixed state, is called a depolarizing channel.

Definition 11

Degradable and Anti-Degradable Channels.5,7 A channel \(\Phi \) is degradable if there exists a channel \(\Psi \) such that \(\widetilde{\Phi } = \Psi \circ \Phi \), where \(\widetilde{\Phi }\) is the complementary channel of \(\Phi \). Similarly, \(\Phi \) is antidegradable if \(\widetilde{\Phi }\) is degradable.

Definition 12

Entanglement Breaking Channels.5,7 A channel \(\Phi \) is entanglement breaking if for any input state \(S_{AB}\), \((\Phi \otimes I_d)S_{AB}\) is separable.

All entanglement breaking channels are anti-degradable.

Definition 13

A channel \(\Phi \) is (irreducible) gauge covariant if there exists a continuous, projective, unitary (irreducible) representation \(g \mapsto V_g\) of a symmetry group \(\mathcal {G}\) in \(\mathcal {H}\) such that

$$\begin{aligned} \Phi (V_g\rho V_g^*) = U_g\Phi (\rho ) U_g^*, \end{aligned}$$

where \(U_g\) is a unitary operator. Similarly, \(\Phi \) is gauge contravariant if \(\Phi (V_g\rho V_g^*) = U_g^*\Phi (\rho ) U_g\).

4.1 Classical Capacity

A block code \( \Sigma ^n \) for n uses of this channel consists of a C–Q channel that maps message i to a state \( S_i^{(n)} \) of \( \mathcal {H}_A^{\otimes n} \), together with a measurement \( M^{(n)} \) on the Hilbert space \( \mathcal {H}_B^{\otimes n} \) that decodes the output state \( \Phi ^{\otimes n}(S_i^{(n)}) \) to a classical message j. The probability of decoding j given message i is \( \mathrm {Tr}\left( \Phi ^{\otimes n}(S_i^{(n)})M_j^{(n)}\right) \triangleq P(\Sigma ^n, M^{(n)}, (i,j))\).

The maximum probability of error for this code is

$$\begin{aligned} P_e(\Sigma ^n, M^{(n)}) = \max _i (1- P(\Sigma ^n, M^{(n)}, (i,i))). \end{aligned}$$

Define \(P_e(n,2^{nR}) = \min \limits _{\Sigma ^n, M^{(n)}} P_e(\Sigma ^n,M^{(n)})\).

Definition 14

The classical capacity \(C(\Phi )\) of the quantum channel \( \Phi \) is the supremum of rates R, such that \( \lim _{n \rightarrow \infty } P_e(n, 2^{nR}) = 0 \).

For a fixed probability \( \Pi ^{(n)} \) on the input alphabet and observations \( M^{(n)} \), the classical capacity can be obtained via Shannon’s Theorem. Let \( \mathscr {I}_n(\Pi ^{(n)}, M^{(n)}) \) be the classical mutual information between input and output for n channel uses. Then, the classical capacity of the channel is

$$\begin{aligned} C(\Phi ) = \lim _{n \rightarrow \infty }\frac{1}{n} \sup _{\Pi ^{(n)}, M^{(n)}} \mathscr {I}_n(\Pi ^{(n)}, M^{(n)}). \end{aligned}$$
(35)

From classical information theory2, \(C(\Phi )\) is the maximum number of bits that can be transmitted on this channel with arbitrarily low probability of error.

In contrast to C–Q channels, there is no fixed input alphabet \( \mathcal {X} \), so we need to optimize over states \( \{S_i^{(n)}\} \) also. Let us define

$$\begin{aligned} C_\chi (\Phi )= & {} \sup _\Pi \left\{ H\left( \sum _i \Pi _i \Phi (S_i)\right) - \sum _i \Pi _iH(\Phi (S_i))\right\} \end{aligned}$$
(36)
$$\begin{aligned}\le & {} \sup _S H(\Phi (S)) - \min _S H(\Phi (S)), \end{aligned}$$
(37)

where \( C_\chi (\Phi ) \) is known as the Holevo Capacity of channel \(\Phi \) and \(\min _S H(\Phi (S))\) the minimal output entropy of \(\Phi \).
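To make the quantity inside the supremum in (36) concrete, the sketch below evaluates it for a qubit ensemble. It assumes (not from the source) that each output state \(\Phi (S_i)\) is specified by its Bloch vector r, so that its eigenvalues are \((1 \pm \vert r\vert )/2\); entropies are in bits and all names are illustrative.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def qubit_entropy(r) -> float:
    """von Neumann entropy (bits) of a qubit state with Bloch vector r."""
    norm = min(1.0, math.sqrt(sum(c * c for c in r)))
    return binary_entropy((1 + norm) / 2)


def holevo_quantity(probs, output_bloch_vectors) -> float:
    """H(sum_i p_i Phi(S_i)) - sum_i p_i H(Phi(S_i)) for qubit output states."""
    # The Bloch vector of the average state is the average of the Bloch vectors.
    mean = [sum(p * r[k] for p, r in zip(probs, output_bloch_vectors))
            for k in range(3)]
    average_entropy = sum(p * qubit_entropy(r)
                          for p, r in zip(probs, output_bloch_vectors))
    return qubit_entropy(mean) - average_entropy
```

For two equiprobable orthogonal pure outputs, `holevo_quantity([0.5, 0.5], [(0, 0, 1), (0, 0, -1)])` gives one bit; shrinking both Bloch vectors (a noisier channel) lowers the value.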

Theorem 3

Holevo Schumacher Westmoreland (HSW) Theorem50,51. For finite-dimensional Hilbert spaces, we get

$$\begin{aligned} C(\Phi ) = \lim _{n \rightarrow \infty }\frac{1}{n} C_\chi (\Phi ^{\otimes n}). \end{aligned}$$
(38)

\(\square \)

In general \(C_\chi (\Phi _1 \otimes \Phi _2) \ge C_\chi (\Phi _1) + C_\chi (\Phi _2)\). Strict inequality may hold because an entangled input to \(\Phi _1 \otimes \Phi _2\) may increase the mutual information with output compared to any product input states. If the Holevo capacity is additive for channel \( \Phi \), i.e., \( C_\chi (\Phi \otimes \Psi ) = C_\chi (\Phi ) + C_\chi (\Psi )\) for any channel \(\Psi \), then \( C_\chi (\Phi ^{\otimes n}) = n C_\chi (\Phi ) \) for every \( n \ge 1 \) and hence \( C(\Phi ) = C_\chi (\Phi ) \). This, for instance, holds for all C–Q, entanglement breaking and depolarizing channels. From this, we obtain Theorem 1.

By subadditivity of entropy, \(\max _{S^{(n)}} H(\Phi ^{\otimes n}(S^{(n)})) = n \max _S H(\Phi (S))\). Thus from (37), \(C_\chi (\Phi ^{\otimes n}) = n C_\chi (\Phi ) \) if we have equality in (37) and the minimal output entropy is additive. Equality in (37) holds for irreducibly covariant and contravariant channels.
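As an illustration of equality in (37), consider the depolarizing channel of Definition 10 in dimension d. The sketch assumes that \(\Pi = I/d\), that the minimal output entropy is attained on a pure input (by concavity of entropy), and that the maximal output entropy \(\log d\) is attained at the maximally mixed input; under these assumptions \(C_\chi (\Phi ) = \log d - \min _S H(\Phi (S))\). Logarithms are base 2 and the function names are illustrative.

```python
import math


def min_output_entropy_depolarizing(d: int, p: float) -> float:
    """Entropy (bits) of (1 - p)|psi><psi| + p I/d for a pure input |psi>."""
    lam_big = 1 - p + p / d      # eigenvalue along |psi>, multiplicity 1
    lam_small = p / d            # remaining eigenvalue, multiplicity d - 1
    entropy = -lam_big * math.log2(lam_big)
    if lam_small > 0:
        entropy -= (d - 1) * lam_small * math.log2(lam_small)
    return entropy


def holevo_capacity_depolarizing(d: int, p: float) -> float:
    """log2(d) minus the minimal output entropy, i.e. equality in (37)."""
    return math.log2(d) - min_output_entropy_depolarizing(d, p)
```

The two extreme cases behave as expected: the value is \(\log _2 d\) at \(p=0\) (noiseless channel) and 0 at \(p=1\) (output independent of the input).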

Once again, for infinite-dimensional Hilbert spaces, we impose a finite input energy constraint as for the C–Q channel. Let F be an operator of type \( \mathcal {F} \) on \( \mathcal {H}_A \). Consider the input states \( S^{(n)} \) of channel \( \Phi ^{\otimes n} \) that satisfy

$$\begin{aligned} Tr(S^{(n)}F^{(n)}) \le nE, \end{aligned}$$
(39)

for some finite \( E > 0 \), where

$$\begin{aligned} F^{(n)} = F\otimes I^{\otimes n-1} + I\otimes F\otimes I^{\otimes n-2}+ \cdots + I^{\otimes n-1}\otimes F. \end{aligned}$$
(40)

Theorem 4

6,66 Let \( \Phi \) satisfy

$$\begin{aligned} \sup _{S:\mathrm {Tr}(SF) \le E} H(\Phi (S)) < \infty , \end{aligned}$$
(41)

where F is an operator of type \(\mathcal {F}\) satisfying conditions of Lemma 1. Then the classical capacity \( C(\Phi , F,E) \) of \( \Phi \) is finite and

$$\begin{aligned} C(\Phi , F,E) = \lim _{n \rightarrow \infty } \frac{1}{n} C_\chi (\Phi ^{\otimes n}, F^{(n)}, nE) \end{aligned}$$
(42)

where

$$\begin{aligned} C_\chi (\Phi ,F,E) = \sup _{\Pi : Tr(\overline{S}_\Pi F) \le E}\left\{ H(\Phi (\overline{S}_\Pi )) - \sum _i\Pi _i H(\Phi (S_i))\right\} \end{aligned}$$
(43)

and \( \overline{S}_\Pi = \int \Pi (dx)S_x\).\(\square \)

For entanglement breaking channels, \(C(\Phi ,F,E) = C_\chi (\Phi ,F,E)\).

In the following, we will describe Gaussian Bosonic channels in detail. However, first we consider an important Bosonic non-Gaussian channel which models decoherence in fiber optic channels and some quantum circuits. Decoherence in fiber optic channels can occur due to Kerr non-linearities and imprecision in path length. An m-mode Bosonic dephasing channel maps a state \(\rho \) to a state

$$\begin{aligned} \int _{-\pi }^{\pi } d^m(\phi ) p(\phi ) \exp (-i\sum _{j=1}^ma_j^\dagger a_j \phi _j)\rho \exp (i\sum _{j=1}^ma_j^\dagger a_j \phi _j), \end{aligned}$$
(44)

where \(\phi = (\phi _1,\cdots ,\phi _m)\) and p is a probability density on \([-\pi ,\pi ]^m\). This is a degradable channel. For the \(m=1\) case and energy constraint E41,

$$\begin{aligned} C(\Phi ,F,E) = C_\chi (\Phi ,F,E) =g(E). \end{aligned}$$
(45)

In67, another infinite-dimensional general non-Gaussian Bosonic channel is considered. Even though \(C_\chi \) is non-additive in this case, tight upper and lower bounds on classical capacity are obtained.

4.2 Gaussian Channel

In this section, we specialize quantum channels to Gaussian quantum channels. The optical channels (fiber optic as well as free space channels) used in practice are Gaussian quantum channels. Gaussian channels also model natural physical phenomena such as linear photon amplification / loss and adding thermal noise. Often, nonlinear operations can also be accurately approximated by Gaussian channels.

First, we generalize the concepts of single mode Bosonic and Gaussian states6,62,65. We consider \( \mathcal {H} = L^2(\mathbb {R}^d) \), where \( \mathbb {R}^d \) is the coordinate space of the underlying classical system. For each coordinate j, we define the position and momentum operators \(q_j\) and \(p_j\), respectively, as before and specify the frequency \(\omega _j\). Define for \( x = (x_1, \ldots , x_d) \), \(y = (y_1,\cdots , y_d)\),

$$\begin{aligned} V_x & = \exp \left( i\sum _{j=1}^{d}x_jq_j\right) , \quad U_y = \exp \left( i\sum _{j=1}^{d}y_jp_j\right) ,\quad W(z) \\ & = \exp \left( \frac{i}{2}y^Tx\right) V_xU_y, \end{aligned}$$
(46)

where W(z) is called the Weyl operator and \( z = (x_1,y_1, x_2, y_2, \ldots x_d, y_d) \). For \( z,z' \), define

$$\begin{aligned} \Delta (z,z') = \sum _j (x_jy_j' - x_j'y_j). \end{aligned}$$
(47)

The space \( Z = \mathbb {R}^{2d} \) with the non-degenerate skew-symmetric form \( \Delta (z,z') \) is called the symplectic vector space of the classical system corresponding to the quantum system. W(z) is a multimode generalization of the displacement operator D(z).

A quantum state on Z will be called a general Bosonic state with d modes. Its characteristic function is

$$\begin{aligned} \phi _S(z) \triangleq \mathrm {Tr}(SW(z)), \quad z \in \mathbb {R}^{2d}. \end{aligned}$$
(48)

This is called the non-commutative Fourier transform. The state S can be uniquely obtained from \( \phi _S(z) \) using the following inversion formula:

$$\begin{aligned} S = \frac{1}{(2\pi )^d} \int \phi _S(z) W(-z) d^{2d}z. \end{aligned}$$
(49)

The state S is a general d mode Gaussian state if its characteristic function is of the following form:

$$\begin{aligned} \phi (z) = \exp \left( imz -\frac{1}{2}\alpha (z,z)\right) , \end{aligned}$$
(50)

where \( \alpha (z,z) = z^T\alpha z \). If \(m=0\), we call the state centered, denoted by \(S_0\). For \(\phi (z)\) to be the characteristic function of a Gaussian state, it is necessary and sufficient that \(\alpha \ge \pm \frac{i\Delta }{2}\). The corresponding Gaussian state is then given by

$$\begin{aligned} S_m = W(-\Delta ^{-1} m^T)S_0W(-\Delta ^{-1} m^T)^*. \end{aligned}$$
(51)
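For a single mode, the condition \(\alpha \ge \pm \frac{i\Delta }{2}\) can be checked by hand. The sketch below assumes \(d=1\), a real symmetric \(\alpha \) with entries a, b on the diagonal and c off the diagonal, and the matrix of the form (47), \(\Delta = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}\). The Hermitian matrix \(\alpha \pm \frac{i\Delta }{2}\) is then positive semidefinite iff its trace and determinant are non-negative, and both signs give the same test. The function name is illustrative.

```python
def is_valid_single_mode_covariance(a: float, b: float, c: float = 0.0,
                                    tol: float = 1e-12) -> bool:
    """Check alpha >= +/- i*Delta/2 for alpha = [[a, c], [c, b]] (one mode).

    alpha + i*Delta/2 = [[a, c + i/2], [c - i/2, b]] is Hermitian, so it is
    positive semidefinite iff trace >= 0 and determinant >= 0, where
    determinant = a*b - c**2 - 1/4.
    """
    trace = a + b
    determinant = a * b - c * c - 0.25
    return trace >= 0 and determinant >= -tol
```

With this convention the vacuum (`a = b = 0.5`) sits exactly on the boundary, a thermal state with N photons (`a = b = N + 0.5`) is valid, and `a = b = 0.4` violates the uncertainty relation.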

A general Gaussian \( S_m \) can be decomposed into elementary one-mode Gaussian states as

$$\begin{aligned} S = \bigotimes _{j=1}^d S^{(j)}. \end{aligned}$$
(52)

State S is pure iff each of \( S^{(j)} \) is pure. The entropy of S is given by

$$\begin{aligned} H(S) = \sum _{j=1}^{d} H(S^{(j)}). \end{aligned}$$
(53)

A Gaussian state has maximum entropy of all states with mean m and covariance \(\alpha \).

A multimode Bosonic channel6,65,68 can be best described via its Stinespring dilation. The channel \( \Phi \) from \( \mathcal {H}_A \) to \( \mathcal {H}_B \) is described by associating with it an environment \( \mathcal {H}_D \) at the input and \(\mathcal {H}_E\) at the output. Let the corresponding symplectic spaces be \( Z_A, Z_B, Z_D \) and \(Z_E\). Also note that \( \mathcal {H}_A \otimes \mathcal {H}_D = \mathcal {H}_B \otimes \mathcal {H}_E\). The composite system evolves according to the symplectic transformation

$$\begin{aligned} T = \begin{bmatrix} K &{} L \\ K_D &{} L_D \end{bmatrix}, \end{aligned}$$
(54)

with \( K:Z_B\rightarrow Z_A \), \( L:Z_E\rightarrow Z_A \), \( K_D:Z_B\rightarrow Z_D \) and \( L_D:Z_E\rightarrow Z_D \). The characteristic function \(\phi _B(z_B)\) of the output state is related to the characteristic function \(\phi _A(z_A)\) of the input state by

$$\begin{aligned} \phi _B(z_B)= & {} \phi _A(Kz_B) f(z_B), \end{aligned}$$
(55)
$$\begin{aligned} f(z_B)= & {} \phi _D(K_Dz_B). \end{aligned}$$
(56)

If additionally,

$$\begin{aligned} f(z_B) = \exp (il(z_B) - \frac{1}{2}\alpha (z_B,z_B)), \end{aligned}$$
(57)

where \( l(z_B) = m_D(K_Dz_B) \) and \(\alpha (z_B,z_B') = \alpha _D(K_Dz_B, K_Dz_B')\), the channel is called a Gaussian channel with parameters \( (K,l,\alpha ) \) provided \(\alpha \ge \pm \frac{i}{2}(\Delta _B - K^T\Delta _A K)\). Here \(\alpha _D\) is the covariance of the Gaussian input environment noise.

Gaussian channels transform Gaussian states to Gaussian states via relations

$$\begin{aligned} m_B = m_AK + l, \quad \alpha _B = K^T\alpha _AK + \alpha , \end{aligned}$$
(58)

where \(m_A\) and \(\alpha _A\) are the mean and covariance of the input state and \(m_B\) and \(\alpha _B\) those of the output state. Without loss of generality, we can take \(l=0\). A concatenation of Gaussian channels is a Gaussian channel.

A general Bosonic Gaussian channel is irreducibly gauge covariant / contravariant if it satisfies the conditions of Definition 13 under the \(z \rightarrow W(z)\) group. Gaussian channel \((K,0,\alpha )\) is gauge covariant if there exists an operator J in \((Z,\Delta )\) with \(J^2 = -I\) such that \([K,J] = 0\) and \([\Delta ^{-1}\alpha , J] = 0\).

A Gaussian channel \((K,0,\alpha )\) is entanglement breaking if and only if \(\alpha \) admits a decomposition \(\alpha = \alpha _A + \alpha _B\), where \(\alpha _A \ge \frac{i}{2}K^T\Delta _A K\) and \(\alpha _B \ge \frac{i}{2}\Delta _B\)6. Thus if \(K^T\Delta _A K = 0\), then the channel is entanglement breaking.

For Gaussian channels, the choice \(F = R\epsilon R^T\), where \(\epsilon \) is a positive definite matrix and \(R = (q_1,p_1,\cdots q_d,p_d)^T\), satisfies the conditions of Theorem 4. However, for a general Gaussian channel, computing the classical capacity from (43) explicitly is possible only if additivity of Holevo capacity holds. Due to this, the classical capacity of general Gaussian channels is unknown. More recently, it was shown59 that for a Gaussian gauge-covariant or contravariant channel, with finite second moment, the minimal output entropy of the channel is additive and attained at a Gaussian state. Thus, for such channels, the Holevo capacity is additive6 and hence \(C(\Phi ,F,E) = C_\chi (\Phi ,F,E)\). The same equality holds for entanglement breaking channels. Gaussian gauge contravariant channels are entanglement breaking.

However, it is challenging to even compute \( C_\chi (\Phi ,F,E) \) except for C–Q channels and Gauge symmetric channels6.

4.2.1 Single Mode Gaussian Channels

As mentioned earlier, a Gaussian channel can be characterized by \((K,l,\alpha )\). If we take \(l=0\) the characteristic function of the output state simplifies to

$$\begin{aligned} \phi _B(z) = \phi _A(Kz)\exp \left( -\frac{1}{2} z^T\alpha z \right) . \end{aligned}$$
(59)

For single mode, the Gaussian channels can be classified as follows. Let \(I_k\) denote the identity matrix in k dimensions.

Theorem 5

6,62 Let \(N_0\) be the mean number of photons in the environment. Then all single mode Gaussian channels can be classified into one of the following classes:

  • (A1) \( K=0,~\alpha = (N_0 + \frac{1}{2})I_2, ~ N_0 \ge 0 \).

  • (A2) \( K = \begin{bmatrix} 1 &{} 0\\ 0 &{} 0 \end{bmatrix}\), \(\alpha = (N_0 + \frac{1}{2})I_2, ~ N_0 \ge 0\).

  • (B1) \( K = I_2, \alpha = \frac{1}{2}(N_0 + \frac{1}{2})I_2\).

  • (B2) \( K = I_2, ~\alpha = N_cI_2,~ N_c \ge 0\).

  • (C) \(K=kI_2,~ k >0, ~ k \ne 1, ~ \alpha = \left( N_0 + \frac{\vert {k^2-1}\vert }{2}\right) I_2\).

  • (D) \(K = k\begin{bmatrix} 1 &{} 0\\ 0 &{}-1 \end{bmatrix}, ~ k>0, ~ k \ne 1, ~ \alpha = \left( N_0 + \frac{k^2+1}{2}\right) I_2\). \(\square \)

The above channels can be related to the environment as follows. If the input state is described by canonical variables q and p, the environment as Gaussian states described by \(q_E\) and \(p_E\), the output states by \(q'\) and \(p'\), and environment states at the output by \(q_E'\) and \(p_E'\), then the above channels can be specified by the following maps:

  • (A1) Completely depolarizing: \(q' = q_E\), \(p'= p_E\).

  • (A2) A degenerate classical signal q with additive quantum Gaussian noise \(q_E\): \(q' = q+q_E\), \(p' = p_E\).

  • (B1) Quantum signal plus degenerate classical Gaussian noise: \(q_E' = q\), \(p_E' = p-p_E\). Here \(p_E\) has variance \(\frac{1}{2}\) and \((q_E,p_E)\) is in the vacuum state. It is the complementary channel of (A2).

  • (B2) Non-degenerate additive classical Gaussian noise. \(q' = q+\zeta \), \(p' = p+\eta \), where \(\zeta , \eta \) are i.i.d. Gaussian with mean 0 and variance \(N_c\).

  • (C) Attenuation (if \(k<1\)), with quantum noise:

    $$\begin{aligned} q'= kq + \sqrt{1-k^2} q_E, ~ p' = kp + \sqrt{1-k^2} p_E. \end{aligned}$$
    (60)

    Amplification (if \(k>1\)), with quantum noise:

    $$\begin{aligned} q'= kq + \sqrt{k^2-1} q_E, p' = kp + \sqrt{k^2-1} p_E. \end{aligned}$$
    (61)
  • (D) Attenuation / amplification with phase conjugation: \(q'=q, p'=-p\).
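The maps (60) and (61) for class (C) can be turned into a simple variance calculation. The sketch below assumes (these conventions are not spelled out at this point in the text) that the signal and environment quadratures are zero-mean and uncorrelated, and that a phase-symmetric state with mean photon number N has quadrature variance \(N + \frac{1}{2}\), consistent with the vacuum variance \(\frac{1}{2}\) used in (B1). Function names are illustrative.

```python
def output_quadrature_variance(k: float, var_in: float, var_env: float) -> float:
    """Variance of q' = k q + sqrt(|1 - k^2|) q_E for uncorrelated q and q_E."""
    return k * k * var_in + abs(1 - k * k) * var_env


def output_photon_number(k: float, E: float, N0: float) -> float:
    """Mean output photon number for input energy E and environment energy N0."""
    var_in = E + 0.5     # assumed: variance = mean photon number + 1/2
    var_env = N0 + 0.5
    return output_quadrature_variance(k, var_in, var_env) - 0.5


def output_photon_number_closed_form(k: float, E: float, N0: float) -> float:
    """k^2 E + max(0, k^2 - 1) + |k^2 - 1| N0, obtained by expanding the above."""
    return k * k * E + max(0.0, k * k - 1) + abs(k * k - 1) * N0
```

For an attenuator (`k < 1`) the output energy is \(k^2E + (1-k^2)N_0\); for an amplifier (`k > 1`) the extra term \(k^2-1\) is the amplified vacuum noise.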

For these channels we take \(F = a^\dagger a\), so that the energy constraint \(\mathrm {Tr}(SF) \le E\) states that the mean number of photons in the input state is at most E. Then the conditions of Theorem 4 are satisfied. Channels (A1), (B2) and (C) are gauge covariant and channel (D) is gauge contravariant. Thus, for these channels, \(C(\Phi ,F,E) = C_\chi (\Phi , F, E)\). Also (A1), (A2), (D) and (C) with \(k \le \frac{1}{\sqrt{2}}\) are anti-degradable.

Channels (B1), (B2), (C) are also called phase insensitive Bosonic Gaussian channels. In particular, channel (C) is called quantum thermal channel. For applications, (C) and (B2) are most interesting. Channel (C) with \(k < 1\) represents a fiber optic channel and for \(k > 1\), a linear optical amplifier. Maximization in \(C_\chi (\Phi ,F,E)\) is a finite-dimensional optimization problem which is the quantum analog of waterfilling in classical information theory.

For channels (B2) and (C)6, more explicitly,

$$\begin{aligned} C(\Phi ,F,E) = g(k^2E+N_C + \max (0,k^2-1)) - g(N_C + \max (0,k^2-1)), \end{aligned}$$
(62)

where \(N_C = \vert {k^2-1}\vert N_0\) and g(.) is as defined in (21). When \(N_C = 0\) and \(k<1\), we have a pure loss channel and then \(C(\Phi , F,E) = g(k^2E)\). A pure loss channel with \(k \ge \frac{1}{\sqrt{2}}\) is degradable.

For channel (B2), \(k=1\) and (62) simplifies to \(C(\Phi ,F,E) = g(E+N_C) - g(N_C)\).
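Formula (62) is straightforward to evaluate. The following sketch assumes \(g(x) = (x+1)\log (x+1) - x\log x\) with natural logarithms and takes \(N_C\) directly as an argument, so that it covers both (C) and (B2); the function names are illustrative.

```python
import math


def g(x: float) -> float:
    """g(x) = (x + 1) log(x + 1) - x log x, with g(0) = 0 (natural logarithm)."""
    if x == 0:
        return 0.0
    return (x + 1) * math.log1p(x) - x * math.log(x)


def classical_capacity(k: float, E: float, N_C: float) -> float:
    """Equation (62) for channels (B2) (k = 1) and (C) (k != 1)."""
    excess = max(0.0, k * k - 1.0)   # amplified vacuum noise, present only for k > 1
    return g(k * k * E + N_C + excess) - g(N_C + excess)
```

Setting `N_C = 0` with `k < 1` reproduces the pure loss value \(g(k^2E)\), and `k = 1` reproduces \(g(E+N_C) - g(N_C)\).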

4.3 Entanglement-Assisted Classical Capacity

One can increase the classical capacity of a quantum channel by preparing, a priori, an entangled state shared between the transmitter and the receiver. Of course, preparing the initial entangled state has a cost, which is not considered in the following.

Let \(\Phi :\Gamma (\mathcal {H}_A) \rightarrow \Gamma (\mathcal {H}_B)\) be a quantum channel. Let \( S_{AB} \) be an entangled pure state on \( \mathcal {H}_A \otimes \mathcal {H}_B \). Classical message i arriving with probability \( \Pi _i \) is encoded into \(\varepsilon ^i_A\) on \( \mathcal {H}_A \) and this acts as operation

$$\begin{aligned} (\varepsilon _A^i \otimes I_B)(S_{AB}) \triangleq S_{AB}^i. \end{aligned}$$
(63)

The input part of this state is sent on \( \Phi \) and measurements are made at the receiver to get the classical information. The maximum rate transmissible via this protocol is called the entanglement assisted classical capacity \( C_{ea}(\Phi ) \).

Theorem 6

5,7,52 For a finite dimensional \( \mathcal {H}_A \),

$$\begin{aligned} C_{ea}(\Phi ) = \lim _{n \rightarrow \infty }\frac{1}{n} C_{ea}'(\Phi ^{\otimes n}) = \max _{S_A} I(S_A;\Phi ), \end{aligned}$$
(64)

where

$$\begin{aligned} C'_{ea}(\Phi )= & {} \sup _{\Pi _i, \varepsilon _A^i, S_{AB}^i} \chi (\Pi , (\Phi \otimes I_B)S_{AB}^i), \end{aligned}$$
(65)
$$\begin{aligned} \chi (\Pi , \Psi (S))= & {} H(\sum _i \Pi _i \Psi (S_i)) - \sum _i \Pi _i H(\Psi (S_i)) \text { and } \end{aligned}$$
(66)
$$\begin{aligned} I(S_A;\Phi )= & {} H(S_A) + H(\Phi (S_A)) - H(S_A;\Phi ). \end{aligned}$$
(67)

\(\square \)

One can show that

$$\begin{aligned} C_\chi (\Phi ) \le C_{ea}(\Phi ) \le \log d + C_\chi (\Phi ), \end{aligned}$$
(68)

where \( d = \mathrm {dim}(\mathcal {H}_A) \). Also, \(C(\Phi ) \le C_{ea}(\Phi )\).

For \( dim(\mathcal {H}_A) = \infty \), the entanglement-assisted classical capacity for the constrained input channel is given by the following theorem:

Theorem 7

6,69 Let F be an operator satisfying \(\mathrm {Tr}(\exp (-\theta F)) < \infty \) for all \(\theta > 0\). Let \(F'\) be a self adjoint operator of type \( \mathcal {F} \) satisfying \(\mathrm {Tr}(\exp (-\theta F')) < \infty \) for all \(\theta > 0\) and \(\mathrm {Tr}(\Phi (S)F') \le \mathrm {Tr}(SF)\), for all \(S \in \mathscr {S}(\mathcal {H}_A)\). Then

$$\begin{aligned} C_{ea}(\Phi , F, E) = \max _{S:\mathrm {Tr}(SF) \le E} I(S;\Phi ), \end{aligned}$$
(69)

where \( I(S;\Phi ) = H(S) + H(\Phi (S)) - H(S;\Phi ) \) and \(H(S;\Phi )\) is the entropy exchange.

Moreover, if \( \sup _{S \in \mathscr {S}(\mathcal {H}_A)} I(S;\Phi ) = \infty \), then the sup is achieved on a density operator S with \( Tr(SF) = E \). \(\square \)

For the Gaussian channel, if the input is restricted to the set of states with mean m and covariance \(\alpha \), then from Theorem 7, the entanglement-assisted classical capacity is

$$\begin{aligned} \sup _{S \in \Sigma (m,\alpha )}I(S;\Phi ) \end{aligned}$$
(70)

and the max is attained on a Gaussian state.

For channels (B2) and (C), taking \(F = a^\dagger a\), we obtain from Theorem 7, for \(\mathrm {Tr}(SF) \le E\):

$$\begin{aligned} C_{ea}(\Phi ) = g(E) + g(E') - g\left( \frac{D+E'-E-1}{2}\right) - g\left( \frac{D-E'+E-1}{2}\right) , \end{aligned}$$
(71)

where \(E' = k^2E + \max (0, k^2-1) + N_C\) and \(D = \sqrt{(E+E'+1)^2 -4k^2E(E+1)}\). The energy constraint implies that E is the mean number of photons of the signal. For the pure loss channel (\(N_C = 0\), \(k<1\)), \(D = (1-k^2)E + 1\) and (71) reduces to (73). From (62) and (71), for channel (C),

$$\begin{aligned} 1 \le \frac{C_{ea}(\Phi )}{C(\Phi )} \rightarrow \infty \end{aligned}$$
(72)

as \(E \rightarrow 0\) when \(\max (0, k^2-1) + N_C > 0\). Thus entanglement can significantly increase the capacity when signal energy is low. For \(N_C = 0\), \(k < 1\), the pure loss channel70,

$$\begin{aligned} C_{ea}(\Phi , F,E) = g(E) +g(k^2E) - g((1-k^2)E), \end{aligned}$$
(73)

and \(\frac{C_{ea}(\Phi , F,E)}{C(\Phi ,F,E)} \rightarrow 2\) as \(E \rightarrow 0\).
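The limiting ratio for the pure loss channel can be observed numerically from (73) and \(C(\Phi ,F,E) = g(k^2E)\). The sketch assumes \(g(x) = (x+1)\log (x+1) - x\log x\) (natural logarithms); function names are illustrative. The convergence to 2 is only logarithmic in \(1/E\), so very small energies are needed to get close.

```python
import math


def g(x: float) -> float:
    """g(x) = (x + 1) log(x + 1) - x log x, with g(0) = 0 (natural logarithm)."""
    if x == 0:
        return 0.0
    return (x + 1) * math.log1p(x) - x * math.log(x)


def pure_loss_classical_capacity(k: float, E: float) -> float:
    """C(Phi, F, E) = g(k^2 E) for the pure loss channel (N_C = 0, k < 1)."""
    return g(k * k * E)


def pure_loss_ea_capacity(k: float, E: float) -> float:
    """Equation (73): C_ea = g(E) + g(k^2 E) - g((1 - k^2) E)."""
    return g(E) + g(k * k * E) - g((1 - k * k) * E)


def entanglement_gain(k: float, E: float) -> float:
    """Ratio C_ea / C for the pure loss channel."""
    return pure_loss_ea_capacity(k, E) / pure_loss_classical_capacity(k, E)
```

Evaluating `entanglement_gain(k, E)` for a fixed k and decreasing E (for example `1e-2`, `1e-6`, `1e-12`) shows the ratio increasing slowly towards 2.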

5 Quantum Capacity

The above capacity results are the fundamental limits on transmitting classical information on quantum channels. Limits on transmission of quantum information are also considered. Correspondingly, there are quantum capacity and entanglement assisted quantum capacity.

Consider Hilbert spaces \(\mathcal {M}\) and \(\mathcal {N}\). Consider an isometric map \(V : \mathcal {M}\rightarrow \mathcal {N}\). A quantum code is a map, \(\varepsilon : S \mapsto VSV^*\). Repetition codes cannot be used due to no-cloning theorem.

Definition 15

Quantum Capacity. Consider a quantum channel \(\Phi :\Gamma (\mathcal {H}_A) \rightarrow \Gamma (\mathcal {H}_B)\). A rate R is achievable if there exist Hilbert spaces \(\mathcal {H}^{(n)}\) such that

$$\begin{aligned} \limsup _{n \rightarrow \infty } \frac{1}{n} \log \mathrm {dim}(\mathcal {H}^{(n)}) = R \end{aligned}$$

and a sequence of encodings

$$\begin{aligned} \mathscr {E}^{(n)} : \Gamma (\mathcal {H}^{(n)})\rightarrow \Gamma (\mathcal {H}_A^{\otimes n}) \end{aligned}$$

and decodings

$$\begin{aligned} \mathscr {D}^{(n)} : \Gamma (\mathcal {H}_B^{\otimes n})\rightarrow \Gamma (\mathcal {H}^{(n)}) \end{aligned}$$

such that

$$\begin{aligned} \lim \limits _{n \rightarrow \infty }F_s(\mathcal {H}^{(n)}, \mathscr {D}^{(n)}\circ \Phi ^{\otimes n} \circ \mathscr {E}^{(n)}) = 1, \end{aligned}$$

where \(F_s\) is the subspace fidelity, a measure of closeness of two spaces and is defined as

$$\begin{aligned} F_s(\mathcal {H}, \Phi ) = \min _{\psi \in \mathcal {H}, \left\| \psi \right\| = 1} \langle \psi \big \vert \Phi (\vert {\psi }\rangle \langle {\psi }\vert )\psi \rangle . \end{aligned}$$
(74)

The supremum of all achievable rates here is the quantum capacity \(Q(\Phi )\) of the channel \(\Phi \).

\(Q(\Phi )\) is the maximum number of qubits per channel use that can be transmitted on the channel \(\Phi \) reliably.

Define the coherent information as

$$\begin{aligned} I_C(S;\Phi ) = H(\Phi (S)) - H(S;\Phi ). \end{aligned}$$
(75)

Theorem 8

5,7,71,72 For finite-dimensional \(\mathcal {H}_A\), we have

$$\begin{aligned} Q(\Phi ) = \lim \limits _{n \rightarrow \infty } \frac{1}{n} \max _S I_C(S;\Phi ^{\otimes n}). \end{aligned}$$
(76)

\(\square \)

Since \(I_C(S;\Phi ) \le C_\chi (\Phi ;S)\), \(Q(\Phi ) \le C(\Phi )\).

If \(\Phi \) is degradable, then \(I_C(S;\Phi ) \ge 0\). Also then \(I_C(S;\Phi ^{\otimes n}) = nI_C(S;\Phi )\) and \(Q(\Phi ) = \max _S I_C(S;\Phi )\)72. If \(\Phi \) is antidegradable, then \(I_C(S;\Phi ) \le 0\) and hence it has a Quantum Capacity of 0. Note that while coherent information could be negative, quantum capacity cannot be.

A unique aspect of quantum capacity is superactivation. We can have two channels \(\Phi _1\) and \(\Phi _2\) such that \(Q(\Phi _1) = Q(\Phi _2) = 0\) but \(Q(\Phi _1 \otimes \Phi _2) > 0\)7. This is not possible in classical information theory.

For \(\mathrm {dim}(\mathcal {H}_A) = \infty \), \(Q(\Phi )\) defined by (76) can still be finite even without any energy constraint. \(Q(\Phi )\) with input energy constraint \(\mathrm {Tr}(SF)\le E\) can be shown to be

$$\begin{aligned} Q(\Phi ,F,E) = \lim _{n \rightarrow \infty } \frac{1}{n} \sup _{S:\mathrm {Tr}(SF)\le E}I_C(S;\Phi ^{\otimes n}), \end{aligned}$$
(77)

under the conditions of Theorem 4 (see 42). For degradable channels, \(Q(\Phi ,F,E) = \sup _{\mathrm {Tr}(SF) \le E}I_C(S;\Phi )\).

Since the m-mode Bosonic dephasing channel is degradable, its quantum capacity without energy constraint is41 \(m\log _2(2\pi ) - h(p)\), where h(p) is the classical differential entropy of p. Under an energy constraint E, no expression for \(Q(\Phi ,F,E)\) is available, but it is upper bounded by \(m\log _2(2\pi ) - h(p)\).
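For \(m=1\), the unconstrained expression \(\log _2(2\pi ) - h(p)\) can be evaluated numerically for any phase density p on \([-\pi ,\pi ]\). The sketch below uses a midpoint rule and normalizes the supplied density itself; the function name, the step count and the example density are illustrative assumptions, not taken from the cited reference.

```python
import math


def dephasing_quantum_capacity(density, steps: int = 2000) -> float:
    """log2(2*pi) - h(p) in bits for a single-mode dephasing channel.

    `density` is a non-negative function on [-pi, pi]; it need not be
    normalized, since the normalization is computed numerically here.
    """
    width = 2 * math.pi / steps
    values = [density(-math.pi + (i + 0.5) * width) for i in range(steps)]
    norm = sum(values) * width
    entropy = 0.0
    for value in values:
        p = value / norm
        if p > 0:
            entropy -= p * math.log2(p) * width   # differential entropy h(p)
    return math.log2(2 * math.pi) - entropy
```

A uniform phase distribution (complete dephasing) gives capacity 0, while a concentrated density such as `lambda phi: math.exp(2.0 * math.cos(phi))` gives a strictly positive value.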

The maximal achievable rate with the assistance of entanglement between the transmitter and the receiver states (as is for \(C_{ea}\)) is called the entanglement-assisted quantum capacity \(Q_{ea}(\Phi )\). Using quantum teleportation and dense coding, we can show that \(Q_{ea}(\Phi ) = \frac{1}{2}C_{ea}(\Phi )\)5,7. Also \(Q(\Phi ) \le Q_{ea}(\Phi )\). This holds for infinite-dimensional channels as well.

5.1 Gaussian Channels

If \(\Phi \) is a degradable Gaussian channel, \(Q(\Phi ) = \sup _S I_C(S;\Phi )\) is achievable by taking the supremum over all Gaussian states with mean m and covariance \(\alpha \), where m and \(\alpha \) satisfy the energy constraints. Its explicit expression for phase insensitive channels is provided in42.

Regarding single-mode Gaussian channels provided in Theorem 5, the channels (A1), (A2) and (D) are antidegradable and hence have zero quantum capacity. Moreover, it is known that for Gaussian channels \(\Phi \) of type (C) without any energy constraints46

$$\begin{aligned} Q(\Phi ) = 0, \quad \text {if } k \le \frac{1}{\sqrt{2}} \text { or, } k \ge \frac{1}{\sqrt{2}} \text { and } N_C \ge \frac{1}{2}(k^2 - \vert {k^2-1}\vert ). \end{aligned}$$
(78)

Under the above conditions, channel (C) is antidegradable. For \(1\ge k^2 \ge \frac{N_0+1}{N_0+2}\), from73, channel (C) is not degradable but we have \(Q(\Phi ) \le -\log 2((1-k^2)k^{N_0}) - g(N_0)\). There are other upper bounds for channel (C) in73.

For \(N_C=0\), \(k^2 \in \left[ \frac{1}{2},1\right] \), with mean number of photons in the input state \(\le E\), we get the quantum capacity from (77) as \(g(k^2E) - g((1-k^2)E)\). This can be taken as an upper bound for the channel with \(N_C > 0\) and energy constraint E. For \(N_C = 0\), taking \(E\rightarrow \infty \), we get \(\log \frac{k^2}{1-k^2}\). Furthermore, from42, we can generalize it to an m mode parallel pure loss channel with energy constraint E. Then

$$\begin{aligned} Q(\Phi ) = \sum _{j=1}^{m} g(k^2N_j(\beta )) - g((1 - k^2)N_j(\beta )), \end{aligned}$$
(79)

where \(N_j(\beta ) = \frac{1}{e^{\beta \omega _j}-1}\) and \(\beta \) is such that \(\sum _{j=1}^{m} N_j(\beta ) = E\). Here \(\omega _j\) is the frequency of the \(j^{th}\) mode of the channel. The result extends to the case when the \(k_j\) for different modes j differ, each satisfying \(k_j^2 \in \left[ \frac{1}{2}, 1\right] \).
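The single-mode pure loss expression \(g(k^2E) - g((1-k^2)E)\) and its \(E \rightarrow \infty \) limit can be checked numerically. The sketch assumes \(g(x) = (x+1)\log (x+1) - x\log x\) (natural logarithms) and returns 0 for \(k^2 \le \frac{1}{2}\), where the channel is antidegradable; function names are illustrative.

```python
import math


def g(x: float) -> float:
    """g(x) = (x + 1) log(x + 1) - x log x, with g(0) = 0 (natural logarithm)."""
    if x == 0:
        return 0.0
    return (x + 1) * math.log1p(x) - x * math.log(x)


def pure_loss_quantum_capacity(k: float, E: float) -> float:
    """Energy-constrained quantum capacity of the single-mode pure loss channel."""
    k2 = k * k
    if k2 <= 0.5:
        return 0.0   # antidegradable regime: zero quantum capacity
    return g(k2 * E) - g((1 - k2) * E)


def unconstrained_pure_loss_limit(k: float) -> float:
    """Limit of the above as E -> infinity: log(k^2 / (1 - k^2)), for 1/2 < k^2 < 1."""
    k2 = k * k
    return math.log(k2 / (1 - k2))
```

For instance, with `k = 0.9` the constrained value increases with E and approaches `unconstrained_pure_loss_limit(0.9)` from below.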

An example of superactivation in Gaussian channels is provided in74.

6 Modeling Optical Light and Channels

Consider an electromagnetic (EM) wave in free space, transmitted by a transmitter in the \(z=0\) plane, in pupil \(\mathcal {A}_0\), and measured by a receiver in the \(z=L\) plane, in pupil \(\mathcal {A}_L\), during time \(0\le t\le T\). Then48,62,65,75 the electric field at the input, in normal mode decomposition, can be written as

$$\begin{aligned} E_0(\rho ,t) = \sum _{n,m} c_{n,m}\Phi _n(\rho )\theta _m(t), \quad \rho \in \mathcal {A}_0,\quad 0 \le t\le T, \end{aligned}$$
(80)

and at the receiver as

$$\begin{aligned} E_L(\rho ,t) = \sum _{m,n}k_nc_{n,m} \Psi _n(\rho )\theta _m\left( t-\frac{L}{c}\right) , \quad \rho \in \mathcal {A}_L, \quad 0 \le t-\frac{L}{c} \le T, \end{aligned}$$
(81)

where \(0 \le k_n \le 1\), \(\{\theta _m(t)\}\) is a complete orthonormal set of functions, \(\Phi _n\) and \(\Psi _n\) are the input and output eigenfunctions respectively (independent of the input field applied) and c is the speed of light. The corresponding quantized electric fields are

$$\begin{aligned} \widehat{E}_0(\rho ,t) = \sum _{n,m} a_{n,m}\Phi _n(\rho )\theta _m(t), \end{aligned}$$
(82)

at \(z=0\) and

$$\begin{aligned} \widehat{E}_L(\rho ,t) = \sum _{m,n}(k_na_{n,m} + \sqrt{1-k_n^2}e_{n,m}) \Psi _n(\rho )\theta _m\left( t-\frac{L}{c}\right) , \end{aligned}$$
(83)

for \(z = L\), where \(0 \le k_n \le 1\), \(a_{n,m}\) are the annihilation operators at the input corresponding to the coefficients \(c_{n,m}\), and \(e_{n,m}\) are annihilation operators of the environment. Thus the state of the input field can be described by the joint density operator of the modes \(\{a_{n,m}\}\) and that of the output field by the joint density operator of the modes \(b_{n,m} = k_n a_{n,m} + \sqrt{1-k_n^2}e_{n,m}\). The mapping from \(\{a_{n,m}\}\) to \(\{b_{n,m}\}\) defines the channel for this optical system.

If \(\widehat{E}_0(\rho ,t)\) has only one term, then the channel is a single-mode Bosonic channel, a beam-splitter, discussed in more detail in the next subsection. If \(e_{11}\) is in a vacuum state then it is a Gaussian class (C) pure loss channel (because now \(N_C = 0\)). If \(e_{11}\) is in a thermal state then it is called a thermal noise channel. The \(C(\Phi , F, E)\) of these channels is provided in (62) and \(C_{ea}(\Phi ,F,E)\) in (71). For the pure loss channel, results are given more explicitly. For \(k \ge \frac{1}{\sqrt{2}}\), the pure loss channel is degradable and hence its quantum capacity has a single-letter characterization, provided in the discussion preceding (79). For \(k < \frac{1}{\sqrt{2}}\), it is antidegradable and hence has quantum capacity 0.

When there is more than one term in (83), and each mode is a pure loss mode with corresponding \(k_{mn}^2 \in \left[ \frac{1}{2},1\right] \), then the quantum capacity can be obtained from (79), modified for different \(k_{mn}\).

For classical capacity with energy constraints, if the general m-mode channel is with \(e_{n,m}\) in vacuum states, we obtain Bosonic Gaussian covariant channel and hence from the results of Section 4, \(C(\Phi ,F,E) = C_\chi (\Phi ,F,E)\) and is attained via Gaussian input. Thus we can compute it numerically. For single mode, it equals \(g(k^2E)\). For multimode case, we need to optimize the energy used in different modes. Similarly, one can obtain \(C_{ea}(\Phi ,F,E)\) from Sect. 4.3.

6.1 Optical Systems as Gaussian Quantum Systems

The coherent state \(\big \vert \alpha \big \rangle \), \(\alpha \in \mathbb {C}\), is the eigenstate of a with eigenvalue \(\alpha\). It is a special case of a Gaussian state (when \(var = 0\)) and models the state of an ideal laser light beam. The simplest coherent state, \(\big \vert 0\big \rangle \), is the vacuum state. The coherent states are minimum uncertainty states of p and q. For \(\alpha \in \mathbb {C}\), the coherent state \(\big \vert \alpha \big \rangle \) can be expanded in the basis \(\{\big \vert n\big \rangle \}\) as

$$\begin{aligned} \big \vert \alpha \big \rangle= & {} \exp \left( -\frac{\vert {\alpha }\vert ^2}{2}\right) \sum _{n=0}^{\infty } \frac{\alpha ^n}{\sqrt{n!}}\big \vert n\big \rangle \end{aligned}$$
(84)
$$\begin{aligned}= & {} \exp (\alpha a^\dagger - \alpha ^*a)\big \vert 0\big \rangle . \end{aligned}$$
(85)

It has average number of photons

$$\begin{aligned} \mu = \langle \alpha \big \vert a^\dagger a\vert \alpha \rangle = \vert {\alpha }\vert ^2, \end{aligned}$$
(86)

also called its intensity.
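The expansion (84) and the photon number (86) can be checked numerically. The following is a minimal sketch in a truncated Fock space; the truncation dimension `dim`, the test value of `alpha` and the helper name `coherent_amplitudes` are illustrative assumptions, not part of the original development.

```python
import math
import numpy as np

def coherent_amplitudes(alpha: complex, dim: int) -> np.ndarray:
    """Fock amplitudes <n|alpha> from (84), kept for n < dim (hypothetical helper)."""
    amps = np.empty(dim, dtype=complex)
    amps[0] = math.exp(-abs(alpha) ** 2 / 2)
    for n in range(1, dim):
        # c_n = c_{n-1} * alpha / sqrt(n) reproduces alpha^n / sqrt(n!) without large factorials
        amps[n] = amps[n - 1] * alpha / math.sqrt(n)
    return amps

alpha = 1.5 * np.exp(0.7j)   # arbitrary test amplitude
dim = 60                     # truncation, chosen so the neglected tail is negligible
c = coherent_amplitudes(alpha, dim)
n = np.arange(dim)

norm = float(np.sum(np.abs(c) ** 2))              # should be close to 1
mean_photons = float(np.sum(n * np.abs(c) ** 2))  # should be close to |alpha|^2, as in (86)

# Truncated annihilation operator, a|n> = sqrt(n)|n-1>, to test a|alpha> = alpha|alpha>
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
eigen_residual = float(np.linalg.norm(a @ c - alpha * c))
```

With these values, `mean_photons` agrees with \(\vert \alpha \vert ^2 = 2.25\) up to the truncation error, and `eigen_residual` is at the level of floating-point rounding.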

A coherent state can also be characterized by its intensity \(\mu \) and phase \(\theta \) by

$$\begin{aligned} \big \vert \mu ,\theta \big \rangle = \exp (\sqrt{\mu } e^{i\theta } a^\dagger - \mu /2)\big \vert 0\big \rangle . \end{aligned}$$
(87)

The phase \(\theta \) of a light pulse from an ordinary laser device is random, uniformly distributed on \([0, 2\pi ]\). Averaged over this phase, a coherent state with intensity \(\mu \) becomes a classical mixture of Fock states with a Poisson distribution of the photon number:

$$\begin{aligned} \rho _\mu = \frac{1}{2\pi } \int _0^{2\pi } \vert {\mu ,\theta }\rangle \langle {\mu ,\theta }\vert d\theta = e^{-\mu }\sum _{n=0}^\infty \frac{\mu ^n}{n!}\vert {n}\rangle \langle {n}\vert . \end{aligned}$$
(88)

If \(\mu \) is small and the state is not \(\big \vert 0\big \rangle \), then we can approximate it by the first two terms, \(n=0\) and \(n=1\), in (88). This models light with a single photon. Coherent states have the same variance in the p and q coordinates (see6,62,65 for more details).
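The phase-averaging identity (88) can be illustrated numerically. The sketch below replaces the integral over \(\theta \) by an average over a uniform grid of phases, which is exact for a truncated Fock space of dimension no larger than the number of grid points; the names `coherent_ket` and `phase_averaged_state`, and the values of `mu`, `dim` and `num_phases`, are assumptions made for illustration.

```python
import math
import numpy as np

def coherent_ket(mu: float, theta: float, dim: int) -> np.ndarray:
    """Truncated Fock amplitudes of |mu, theta>, i.e. alpha = sqrt(mu) e^{i theta}."""
    alpha = math.sqrt(mu) * np.exp(1j * theta)
    ket = np.empty(dim, dtype=complex)
    ket[0] = math.exp(-mu / 2)
    for n in range(1, dim):
        ket[n] = ket[n - 1] * alpha / math.sqrt(n)
    return ket

def phase_averaged_state(mu: float, dim: int, num_phases: int = 64) -> np.ndarray:
    """Average |mu,theta><mu,theta| over a uniform phase grid (left side of (88))."""
    rho = np.zeros((dim, dim), dtype=complex)
    for k in range(num_phases):
        ket = coherent_ket(mu, 2 * math.pi * k / num_phases, dim)
        rho += np.outer(ket, ket.conj())
    return rho / num_phases

mu, dim = 0.1, 20
rho = phase_averaged_state(mu, dim)

# Right side of (88): diagonal matrix of Poisson weights e^{-mu} mu^n / n!
poisson = np.array([math.exp(-mu) * mu ** n / math.factorial(n) for n in range(dim)])
max_deviation = float(np.max(np.abs(rho - np.diag(poisson))))

# Weight on two or more photons, which the small-mu approximation neglects
multi_photon_weight = float(1.0 - rho[0, 0].real - rho[1, 1].real)
```

For \(\mu = 0.1\) the neglected multi-photon weight is below \(\mu ^2/2\), which gives a rough sense of when keeping only the \(n=0\) and \(n=1\) terms is reasonable.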

Thermal states have a representation as described in (19), where N is the mean number of photons in them.

Squeezed states are generalizations of coherent states. In these states, one quadrature may have less uncertainty than the other, resulting in higher uncertainty in the other, due to the uncertainty principle. Then the covariance matrix has one eigenvalue larger than the other.

A single mode squeezed state is

$$\begin{aligned} \big \vert \zeta ,0\big \rangle = S(\zeta )\big \vert 0\big \rangle , \end{aligned}$$
(89)

where \(S(\zeta ) = \exp \left( -\frac{\zeta }{2}(a^\dagger )^2 + \frac{\zeta ^*}{2}a^2\right) \). If \(\zeta = re^{i\phi }\), then for \(\phi = 0\) the covariance matrix is

$$\begin{aligned} \begin{bmatrix} e^{-2r} &{} 0\\ 0 &{} e^{2r} \end{bmatrix}. \end{aligned}$$
(90)

The variance of position is \(e^{-2r}\) and of momentum is \(e^{2r}\).
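The variances in (90) can be reproduced numerically by applying \(S(\zeta )\) to the vacuum in a truncated Fock space. This is only a sketch under stated assumptions: we take \(\zeta = r\) real, and we normalise the quadratures as \(q = a + a^\dagger \), \(p = -i(a - a^\dagger )\) so that the vacuum has unit variance (other normalisations rescale both variances by a common factor). The function names and the truncation `dim` are illustrative.

```python
import math
import numpy as np

def annihilation(dim: int) -> np.ndarray:
    """Truncated annihilation operator, a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)

def squeeze_operator(zeta: complex, dim: int) -> np.ndarray:
    """S(zeta) = exp(-zeta/2 (a^dagger)^2 + zeta^*/2 a^2) on the truncated space."""
    a = annihilation(dim)
    ad = a.conj().T
    generator = -0.5 * zeta * (ad @ ad) + 0.5 * np.conj(zeta) * (a @ a)
    # The generator is anti-Hermitian, so 1j*generator is Hermitian and
    # exp(generator) = V diag(exp(-1j*w)) V^dagger with (w, V) from eigh.
    w, v = np.linalg.eigh(1j * generator)
    return (v * np.exp(-1j * w)) @ v.conj().T

def variance(op: np.ndarray, state: np.ndarray) -> float:
    mean = np.vdot(state, op @ state).real
    return float(np.vdot(state, op @ (op @ state)).real - mean ** 2)

r, dim = 0.5, 60
a = annihilation(dim)
q = a + a.conj().T           # assumed normalisation: vacuum variance 1
p = -1j * (a - a.conj().T)

vacuum = np.zeros(dim, dtype=complex)
vacuum[0] = 1.0
squeezed = squeeze_operator(r, dim) @ vacuum

var_q = variance(q, squeezed)   # compare with exp(-2r)
var_p = variance(p, squeezed)   # compare with exp(+2r)
```

The product `var_q * var_p` stays at its vacuum value, which is the sense in which squeezing trades uncertainty in one quadrature for the other.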

The coherent states and thermal states are like classical sinusoidal states with well-defined amplitude and phase. But Fock states \(\big \vert n\big \rangle \) and squeezed states do not have a classical equivalent.

Two-mode squeezed states are very useful and can be used as entangled states. These are generated by the squeezing operator

$$\begin{aligned} S = \exp (-\zeta ^*a_1a_2 + \zeta a_1^\dagger a_2^\dagger ). \end{aligned}$$
(91)
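To see the entanglement produced by (91), one can apply S to the two-mode vacuum numerically and inspect the Schmidt coefficients of the result. In the sketch below, the truncation `dim`, the real squeezing parameter `r` and the function names are assumptions for illustration; the comparison values \(\tanh ^n r/\cosh r\) are the standard Schmidt coefficients of the two-mode squeezed vacuum.

```python
import math
import numpy as np

def annihilation(dim: int) -> np.ndarray:
    """Truncated annihilation operator, a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)

def two_mode_squeezed_vacuum(zeta: complex, dim: int) -> np.ndarray:
    """Apply S = exp(-zeta^* a1 a2 + zeta a1^dagger a2^dagger) to |00>."""
    a = annihilation(dim)
    eye = np.eye(dim)
    a1, a2 = np.kron(a, eye), np.kron(eye, a)
    generator = -np.conj(zeta) * (a1 @ a2) + zeta * (a1.conj().T @ a2.conj().T)
    # Anti-Hermitian generator: exponentiate via the Hermitian matrix 1j*generator
    w, v = np.linalg.eigh(1j * generator)
    vacuum = np.zeros(dim * dim, dtype=complex)
    vacuum[0] = 1.0
    return (v * np.exp(-1j * w)) @ (v.conj().T @ vacuum)

r, dim = 0.5, 30
psi = two_mode_squeezed_vacuum(r, dim).reshape(dim, dim)   # psi[n1, n2]

# Schmidt coefficients across the mode-1 / mode-2 cut
schmidt = np.linalg.svd(psi, compute_uv=False)
expected = np.tanh(r) ** np.arange(5) / np.cosh(r)

# Photon statistics of mode 1 alone: thermal, with mean sinh^2(r)
p_n1 = np.sum(np.abs(psi) ** 2, axis=1)
mean_n1 = float(np.sum(np.arange(dim) * p_n1))

# Entanglement entropy (in nats) of the reduced state
probs = schmidt[schmidt > 1e-12] ** 2
entanglement_entropy = float(-np.sum(probs * np.log(probs)))
```

More than one non-zero Schmidt coefficient means the state is entangled; the reduced state of either mode is diagonal in the Fock basis with geometrically decaying weights.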

A beam splitter is a widely used optical device and can be taken as an example of a quantum channel. It has a two-mode input beam and a two-mode output beam. One input mode can be taken in the vacuum state and the other in a single-photon state. Thus the input state is \(\big \vert 1\big \rangle \otimes \big \vert 0\big \rangle \). The output state is denoted by

$$\begin{aligned} \big \vert \psi \big \rangle _{\text {out}} = U_B\big \vert 10\big \rangle , \end{aligned}$$
(92)

where \(U_B\) is the operator specified by the beam splitter. A 50 : 50 beam-splitter gives

$$\begin{aligned} \big \vert \psi \big \rangle _{\text {out}} = \frac{1}{\sqrt{2}}(\big \vert 10\big \rangle + \big \vert 01\big \rangle ), \end{aligned}$$
(93)

i.e., a maximally entangled state. If the input is a coherent state, then the output is a separable state. We can also obtain a two-mode squeezed state at the output given a one-mode squeezed state at the input. Thus a beam splitter is a very versatile optical device and is often used in quantum information processing.
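Both statements about the beam splitter output can be checked numerically. The text does not fix the form of \(U_B\), so the sketch below assumes \(U_B = \exp \big (\theta (a_1a_2^\dagger - a_1^\dagger a_2)\big )\) with \(\theta = \pi /4\) for the 50 : 50 case; other sign or phase conventions change the relative phase in (93) but not the entanglement. The function names, the truncation `dim` and the test amplitude `alpha` are likewise illustrative.

```python
import math
import numpy as np

def annihilation(dim: int) -> np.ndarray:
    """Truncated annihilation operator, a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)

def beam_splitter(theta: float, dim: int) -> np.ndarray:
    """Assumed convention: U_B = exp(theta (a1 a2^dagger - a1^dagger a2))."""
    a = annihilation(dim)
    eye = np.eye(dim)
    a1, a2 = np.kron(a, eye), np.kron(eye, a)
    generator = theta * (a1 @ a2.conj().T - a1.conj().T @ a2)
    # Anti-Hermitian generator: exponentiate via the Hermitian matrix 1j*generator
    w, v = np.linalg.eigh(1j * generator)
    return (v * np.exp(-1j * w)) @ v.conj().T

def schmidt_coefficients(ket: np.ndarray, dim: int) -> np.ndarray:
    """Singular values of the amplitude matrix psi[n1, n2]."""
    return np.linalg.svd(ket.reshape(dim, dim), compute_uv=False)

def fock(n: int, dim: int) -> np.ndarray:
    ket = np.zeros(dim, dtype=complex)
    ket[n] = 1.0
    return ket

dim = 14
u_b = beam_splitter(math.pi / 4, dim)

# Single photon in mode 1, vacuum in mode 2: compare with (93)
ket_10 = np.kron(fock(1, dim), fock(0, dim))
ket_01 = np.kron(fock(0, dim), fock(1, dim))
out_photon = u_b @ ket_10
photon_error = float(np.linalg.norm(out_photon - (ket_10 + ket_01) / math.sqrt(2)))
photon_schmidt = schmidt_coefficients(out_photon, dim)

# Coherent state in mode 1, vacuum in mode 2: output should be a product state
alpha = 0.5
coherent = np.array(
    [math.exp(-alpha ** 2 / 2) * alpha ** n / math.sqrt(math.factorial(n)) for n in range(dim)],
    dtype=complex,
)
out_coherent = u_b @ np.kron(coherent, fock(0, dim))
coherent_schmidt = schmidt_coefficients(out_coherent, dim)
```

For the single-photon input, the two leading Schmidt coefficients are equal, as expected for (93). For the coherent input, only one Schmidt coefficient is non-negligible, which is the numerical signature of a separable (product) output.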

7 Capacity of Multiuser Channels

The results provided so far are for point-to-point quantum channels. However, just as in classical information theory38, several of these results have been extended to multiuser settings. The focus, as in the classical case, has been on multiple access channels, broadcast channels, interference channels and general multihop networks of quantum channels. In the following, we only provide the references for the aforementioned channels with brief comments.

We first consider multiple access channels (MAC). In a MAC, multiple users transmit to one receiver. Winter76 provided the capacity region for classical information on the C–Q MACs. The capacity region obtained is similar to that of the classical MAC. Ref. 77 extended this result to a MAC with C–Q channels as well as quantum–quantum (q–q) channels and obtained the capacity results for both classical and quantum information. As for point-to-point quantum channels, there are no single-letter characterizations of the capacity region for general channels. However, for degradable channels, single-letter characterizations were obtained. In78, the capacity of the entanglement-assisted case has been studied for quantum–quantum MACs with two users. Ref. 79 studies a MAC with classical inputs and quantum outputs with discrete or continuous variables. Ref. 80 studies finite-dimensional and Bosonic thermal noise channels with more than two users.

The broadcast channel consists of one transmitter and multiple receivers. Ref. 81 studied this channel when classical information is transmitted to multiple users over quantum channels. Inner and outer bounds on the capacity region are provided. Next, they consider the case when the information transmitted to some of the users is classical while for others it is quantum.

In82, an interference channel is considered where there are two transmitters and two receivers. Each transmitter transmits to one of the receivers (no common receiver) and acts as interference to the other. For such channels, even in classical information theory, the capacity for the individual users is known only in special cases. Ref. 82 generalizes this channel to the quantum setting. The case when classical inputs are provided and quantum outputs are received (C–Q channel) is studied in detail. The capacity of the channel is obtained for the very strong and strong interference cases, as in the classical setting. Han–Kobayashi (see2,38) type results are also obtained.

Savov’s PhD Thesis83 considers several networks of C–Q channels. In these networks, classical inputs are provided and the network outputs quantum signals. Thus classical encoding with quantum decoding is used. Multiple access, broadcast, interference and relay channels are considered in the finite-dimensional setting. These results can be extended to quantum input and quantum output channels. Savov’s thesis also considers Bosonic quantum interference channels with classical inputs and continuous variable quantum state outputs.

8 Recent Developments and Future Directions

Capacity in classical and quantum information theory is an asymptotic concept. It provides the supremum of the achievable rates per channel use as the number of channel uses goes to infinity. In practice, we use only a finite number of channel uses. The probability of error in transmission is then non-zero for any positive rate, and the capacity is not achievable with arbitrarily low probability of error for most channels of interest.

More recently, in classical information theory, for a given probability of error \(\varepsilon \), upper and lower bounds on the achievable rate for a given number of channel uses have been obtained84,85. Extensions of these results to quantum systems for finite-dimensional channels have been summarized in86,87,88,89. A systematic book-length treatment is provided in90 and the upper and lower bounds obtained are used in providing the channel capacity and rates of convergence of the achievable rate to the channel capacity.

The finite-length achievable rates, rates of convergence to the capacity and continuity of capacity for infinite-dimensional channels have been studied in58,70,91. We provide the results of70 in more detail. These concern the achievable rates for the classical capacity of C–Q channels in the finite-blocklength case.

Consider a C–Q channel \(\Phi (x) = S_x\), where \(x \in \mathcal {X}\) can be a continuous set and \(S_x\) is a state on an infinite dimensional, separable Hilbert space. Let \(M^*(\Phi ^{\otimes n}, \varepsilon )\) be the maximum size of the codebook that can be transmitted in n channel uses of \(\Phi \) with probability of error \(\le \varepsilon \), where \(0<\varepsilon < 1\). Then

$$\begin{aligned} \log M^*(\Phi ^{\otimes n}, \varepsilon ) \ge nH(\rho ) -\sqrt{nV(\rho )}Q^{-1}(\varepsilon )+ O(\log n), \end{aligned}$$
(94)

where the state \(\rho \) and \(V(\rho )\) are obtained as follows: Let P be a probability density on \(\mathcal {X}\); then

$$\begin{aligned} \rho\triangleq & {} E_{x \sim P} [\vert {S_x}\rangle \langle {S_x}\vert ], \end{aligned}$$
(95)
$$\begin{aligned} V(\rho )= & {} \mathrm {Tr}\left[ \rho \left( -\log \rho - H(\rho )\right) ^2\right] . \end{aligned}$$
(96)

Also Q(.) is the complementary cdf of a Gaussian distribution with zero mean and unit variance. This lower bound is similar to the second-order expansions of achievable rates in classical information theory.

We apply the inequality (94) to a pure loss bosonic channel with Heisenberg evolution

$$\begin{aligned} \widehat{b} = \sqrt{\eta }\widehat{a} + \sqrt{1-\eta }\widehat{e}, \end{aligned}$$
(97)

where \(\eta \in [0,1]\), subject to an input photon number constraint \(\mathrm {Tr}(a^\dagger a \rho ) \le N_S\), \(N_S > 0\). Now for a coherent input state \(\big \vert \alpha \big \rangle \), the output is the coherent state \(\big \vert \sqrt{\eta }\alpha \big \rangle \). We take \(\eta =1\) for convenience and the distribution P as isotropic complex Gaussian with variance \(N_S\),

$$\begin{aligned} P_{N_S}(\alpha ) = \frac{1}{\pi N_S}\exp \left( -\frac{\vert {\alpha }\vert ^2}{N_S}\right) . \end{aligned}$$
(98)

Then, for the average state \(\rho \),

$$\begin{aligned} H(\rho )= & {} g(N_S), \end{aligned}$$
(99)
$$\begin{aligned} V(\rho )= & {} N_S (N_S + 1)\left( \log (N_S +1) -\log N_S\right) ^2, \end{aligned}$$
(100)

where g(.) is defined in (20).

For general \(\eta \in [0,1]\), the bound is obtained by replacing \(N_S\) with \(\eta N_S\).
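The closed forms (99) and (100) and the resulting bound (94) are easy to evaluate. The sketch below makes several assumptions that are not fixed by the text: g is taken to be the standard function \(g(x) = (x+1)\log (x+1) - x\log x\) (we assume this is what (20) defines), logarithms are natural so rates are in nats, \(V(\rho )\) is read as the entropy variance \(\mathrm {Tr}[\rho (-\log \rho - H(\rho ))^2]\), and the \(O(\log n)\) term is dropped. The average state is the thermal state with eigenvalues \(N_S^n/(N_S+1)^{n+1}\), which is used for an independent numerical check. All function names are hypothetical.

```python
import math
from statistics import NormalDist
import numpy as np

def g(x: float) -> float:
    """Assumed form of g in (20): (x+1) log(x+1) - x log x, in nats, with g(0) = 0."""
    return 0.0 if x == 0 else (x + 1) * math.log(x + 1) - x * math.log(x)

def entropy_variance(n_s: float) -> float:
    """Closed form (100)."""
    return n_s * (n_s + 1) * (math.log(n_s + 1) - math.log(n_s)) ** 2

def thermal_moments(n_s: float, dim: int = 400):
    """Entropy and entropy variance from the thermal spectrum n_s^n / (n_s+1)^(n+1)."""
    n = np.arange(dim)
    surprisal = (n + 1) * math.log(n_s + 1) - n * math.log(n_s)   # -log(eigenvalue)
    lam = np.exp(-surprisal)
    h = float(np.sum(lam * surprisal))
    v = float(np.sum(lam * (surprisal - h) ** 2))
    return h, v

def second_order_lower_bound(n: int, eps: float, n_s: float, eta: float = 1.0) -> float:
    """n H - sqrt(n V) Q^{-1}(eps) from (94), without the O(log n) term."""
    m = eta * n_s                                # N_S is replaced by eta * N_S
    q_inv = NormalDist().inv_cdf(1.0 - eps)      # inverse of the Gaussian complementary cdf
    return n * g(m) - math.sqrt(n * entropy_variance(m)) * q_inv

# Independent check of (99) and (100) for N_S = 2
h_num, v_num = thermal_moments(2.0)

# Example: n = 1000 uses, error 1e-3, N_S = 2, eta = 0.5
bound_nats = second_order_lower_bound(1000, 1e-3, 2.0, eta=0.5)
rate_per_use = bound_nats / 1000                 # compare with g(eta * N_S) = g(1)
```

In this example the rate per channel use falls below \(g(\eta N_S)\) by a term of order \(1/\sqrt{n}\), which is the finite-blocklength penalty expressed by the second term of (94).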

For more general Bosonic channels, results of this type are unavailable but would be of interest. Notice also that (94) is a lower bound; a non-trivial upper bound remains elusive. If we were to apply the upper bound results provided in87 and88, we would run into issues, such as finite condition numbering, which automatically force the dimensions of the Hilbert spaces to be finite. A suitable upper bound, therefore, would be desirable, and the result is likely to resemble (94), as it does in classical information theory for additive white Gaussian noise (AWGN) channels85. It would also be worthwhile to study the finite blocklength effects on the second-order terms in the presence of entanglement assistance.

As far as quantum capacity is concerned, there are no results of the finite blocklength type available, except for finite dimensions. In fact, as seen in Sect. 5, we do not even have a proper characterization of \(Q(\Phi )\) except for a very specific class of channels. These are most sought after in applications such as QKD and quantum internet.

9 Conclusion

In this survey, we have introduced and discussed QIT for various quantum channels, with a focus on single-mode Gaussian channels. We have discussed and compared capacity, both classical and quantum, with and without entanglement assistance, for these channels. We have also discussed the multiuser variants of these channels. While the infinite-dimensional setting is important, as that is where the practically useful channels lie, its analysis is also significantly more complex than the finite-dimensional one. As a result, a large number of open issues remain, and for these reasons this field will be vibrant for many years to come.