Positive Hankel operators, positive definite kernels and related topics

It is shown that a positive (bounded linear) operator on a Hilbert space with trivial kernel is unitarily equivalent to a Hankel operator that satisfies the double positivity condition if and only if it is non-invertible and has simple spectrum (that is, if this operator admits a cyclic vector). More generally, for an arbitrary positive (bounded linear) operator A on a Hilbert space H with trivial kernel, the collection V(A) of all linear isometries V from H into H such that AV is positive as well is investigated. In particular, operators A such that V(A) contains a pure isometry with a given deficiency index are characterized. Some applications to unbounded positive self-adjoint operators as well as to positive definite kernels are presented. In particular, positive definite matrix-type square roots of such kernels are studied and kernels that have a unique such root are characterized. The class of all positive definite kernels that have at least one such square root is also investigated.


Introduction
In [15] the authors characterized (in the language of the multiplicity theory of self-adjoint operators on separable Hilbert spaces) bounded self-adjoint operators that are unitarily equivalent to Hankel operators. This is a deep result whose proof is difficult and long. For a bounded positive operator A on a separable Hilbert space the following two statements immediately follow: • if A is unitarily equivalent to a Hankel operator, the essential supremum of the multiplicity function of A does not exceed 2; • if the essential supremum of the multiplicity function of A does not exceed 1, A is unitarily equivalent to a Hankel operator.
Hankel operators can be defined in a few equivalent ways. One of them, appropriate to our investigations, reads as follows: a bounded operator A : ℓ² → ℓ² is Hankel if AS = S*A, where S is the standard unilateral shift (that is, S is a linear isometry satisfying Se_n = e_{n+1} for any n ≥ 0, where e_0, e_1, . . . is the canonical orthonormal basis of ℓ²). So, A is both Hankel and self-adjoint iff both A and AS are self-adjoint. When in the last condition we replace the adjective self-adjoint by positive, we obtain the so-called double positivity condition [13]: A is a Hankel operator satisfying the double positivity condition if both A and AS are positive. With this notion in mind, a natural question (related to the main topic of the aforementioned paper) arises: when is a (bounded) positive operator on a separable Hilbert space unitarily equivalent to a Hankel operator satisfying the double positivity condition?
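The definition above can be checked on finite sections of a concrete example. The following is only a minimal sketch, under the assumption that the Hankel matrix is built from the moments a(n) = 1/(n + 1) of Lebesgue measure on [0, 1] (the Hilbert matrix); the helper names hankel_section, det and is_positive_definite are hypothetical and chosen for illustration. Positivity of all finite sections is a necessary condition for positivity of the operator, so the sketch illustrates the double positivity condition rather than proving it.

```python
from fractions import Fraction


def hankel_section(a, n, shift=0):
    """n-by-n section of the matrix with entries a(j + k + shift)."""
    return [[a(j + k + shift) for k in range(n)] for j in range(n)]


def det(m):
    """Exact determinant by Gaussian elimination with row swaps."""
    m = [row[:] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if m[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            sign = -sign
        result *= m[i][i]
        for r in range(i + 1, n):
            factor = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= factor * m[i][c]
    return sign * result


def is_positive_definite(m):
    """Sylvester's criterion: all leading principal minors are positive."""
    return all(det([row[:k] for row in m[:k]]) > 0 for k in range(1, len(m) + 1))


def a(n):
    # Assumed example: moments of Lebesgue measure on [0, 1].
    return Fraction(1, n + 1)


N = 5
A = hankel_section(a, N + 1)        # entries <A e_k, e_j> = a(j + k)
AS = hankel_section(a, N, shift=1)  # entries <A S e_k, e_j> = a(j + k + 1)

# The identity AS = S*A read entrywise: A[j][k + 1] == A[j + 1][k].
hankel_identity = all(A[j][k + 1] == A[j + 1][k] for j in range(N) for k in range(N))
```

Exact rational arithmetic is used because sections of the Hilbert matrix are badly conditioned, and floating-point minors would be unreliable even for small N.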
In the present paper we answer this question under the additional assumption that the operator in question has trivial kernel. Our main result reads as follows (below R(T) denotes the range of an operator T). (ii) A is non-invertible, dim(H) = max(m, ℵ₀) and the appropriate one of the following three conditions is fulfilled: (α) m < ℵ₀ and the essential supremum of the multiplicity function of A does not exceed m; or (β) m = ℵ₀; or (γ) m > ℵ₀ and there exists a closed linear subspace Z of H such that Z ∩ R(A) = {0} and dim(Z) = m. In particular, a bounded one-to-one linear operator is unitarily equivalent to a Hankel operator that satisfies the double positivity condition iff it is positive, non-invertible and has simple spectrum. Theorem 1.1 shows (in particular) that for infinite m the property (i) above depends only on the range of the operator A (i.e., if two positive operators A and B have trivial kernels and their ranges coincide, then either both A and B satisfy (i) or neither of them does). In Theorem 3.3 below we gather conditions on a dense operator range R in a Hilbert space H related to the foregoing statement (γ), that is, conditions equivalent to the existence of a closed linear subspace Z of H such that Z ∩ R = {0} and dim(Z) = dim(H).
Our proofs are independent of the results of [15] and are much simpler. We use only the basics of operator theory and of the spectral theory of self-adjoint operators.
Another topic we deal with in this paper is related to Hilbert space reproducing (that is, positive definite) kernels. They are a useful tool in both Hilbert space theory and complex analysis (where they are known as Bergman kernels). Since the seminal paper of Aronszajn [2], positive definite kernels have been the subject of an intrinsic theory. Although Bergman [3,4] is considered by a sizeable part of the mathematical community to be the father of that theory, it is Zaremba's work [24], published 15 years earlier than Bergman's first paper on kernels, where the reproducing property (without any name) appeared for the first time; see, e.g., [2] or [22]. We wish to emphasize Zaremba's contribution to the theory by calling him its forefather.
As positive definite kernels naturally generalize positive matrices (for complex square matrices can be seen as kernels defined on finite sets), it is natural to investigate various properties of such matrices and recognize those of them that inhere in all such kernels. In the present paper (in Section 5) we characterize those kernels which have a so-called matrix-type square root. To be more precise, we introduce the following
1.2. Definition. Let K : X × X → ℂ and L : X × X → ℂ be two positive definite kernels. K is said to be a positive definite matrix-type square root (for short: a pdms root) of L if for all x, z ∈ X:
(1-1) L(x, z) = \sum_{y \in X} K(x, y) K(y, z).
(More on the above notion can be found in Section 5.) A classical result from matrix theory (or, more generally, from the theory of bounded Hilbert space operators) says that any positive matrix has a unique positive square root. A natural question arises as to how far this result extends in the realm of reproducing kernels. Our main result in this direction reads as follows. (Below we write "K ≪ L" to express that L − K is a positive definite kernel; and δ_X is the kernel on X such that δ_X(x, y) = 1 if y = x and δ_X(x, y) = 0 otherwise.)
1.3. Theorem. For a positive definite kernel K : X × X → ℂ the following conditions are equivalent:
(i) K has a unique positive definite matrix-type square root;
(ii) each positive definite kernel L : X × X → ℂ such that L ≪ K has a positive definite matrix-type square root;
(iii) K ≪ cδ_X for some constant c > 0.
(Note that the equivalence of conditions (i)-(iii) above implies that each kernel L appearing in (ii) has in fact a unique pdms root.) We underline here that condition (i) above asserts both the existence and the uniqueness of a pdms root. In Section 5 we also give equivalent conditions for a positive definite kernel to have at least one such root.
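On a finite set X the sum in Definition 1.2 is an ordinary matrix product, so a pdms root is just a positive square root of a positive matrix, and condition (iii) of Theorem 1.3 holds automatically. The following minimal sketch illustrates the defining formula (1-1) in that finite setting only; the set X, the kernel K and the helper matrix_type_square are hypothetical names introduced for the example. For infinite X the sum in (1-1) additionally has to converge, which the sketch does not address.

```python
def matrix_type_square(K, X):
    """Return L with L[(x, z)] = sum over y in X of K[(x, y)] * K[(y, z)]."""
    return {(x, z): sum(K[(x, y)] * K[(y, z)] for y in X) for x in X for z in X}


# Assumed toy data: a two-point set and a positive definite kernel on it
# (the matrix [[2, 1], [1, 2]] has positive trace and determinant 3).
X = ["p", "q"]
K = {("p", "p"): 2, ("p", "q"): 1, ("q", "p"): 1, ("q", "q"): 2}

# K is then a pdms root of L in the sense of formula (1-1).
L = matrix_type_square(K, X)
```

Here L corresponds to the matrix [[5, 4], [4, 5]], which is again positive definite (its determinant is 9).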
The paper is organized as follows. In Section 2 we study in greater detail the collections V(A) and Z(A) introduced above, and give a proof of Theorem 1.1. The next section contains further conditions (not listed in Theorem 1.1) equivalent to condition (i) of that theorem. The reader can also find there a full description of all possible unitary equivalence types of isometries from V(A) (see Theorem 3.7 below). In the fourth part the results of Section 2 are applied to unbounded operators. We prove that any truly unbounded positive self-adjoint operator is the absolute value of some positive closed operator that is not self-adjoint (this serves as a criterion for the boundedness of positive self-adjoint operators); see Theorem 4.1. We also show that all closable operators are (in a certain sense) "conditionally" weakly continuous (see Lemma 4.3 therein). The last, fifth, part is devoted to the notion of a pdms root (for reproducing kernels) introduced above. We gather equivalent conditions for a positive definite kernel to have at least one pdms root (Theorem 5.5), prove Theorem 1.3 and study in greater detail kernels having such roots (see, e.g., Theorem 5.8). In particular, among all such roots (of a fixed kernel) we distinguish one which can be seen as the (unique) "self-adjoint" root (Corollary 5.10). We also show that if a kernel is a pdms root of some other kernel, then it automatically has a pdms root (item (V) of Theorem 5.8). This property enables one to define (positive definite matrix-type) roots of higher degrees. Apart from the results of Section 4, the proofs presented in the last part invoke the machinery of unbounded symmetric operators, with Friedrichs' theorem [12] on extending positive operators as the main tool.
Notation and terminology. Throughout this paper all Hilbert spaces are nontrivial and complex, and H denotes one of them. By dim(H) we denote the Hilbert space dimension of H, that is, dim(H) is the cardinality of an orthonormal basis of H. The scalar product of H will be denoted by ⟨·, −⟩_H. All operators are linear, act between Hilbert spaces and have dense domains. The linear subspace generated by a set F is denoted by lin(F), and \overline{lin}(F) stands for the closure of lin(F). For any non-empty set X we use ℓ²(X) to denote the Hilbert space of all square-summable complex-valued functions on X equipped with the standard inner product. More precisely, f : X → ℂ belongs to ℓ²(X) if \sum_{x \in X} |f(x)|² < ∞; and for u, v ∈ ℓ²(X), ⟨u, v⟩_{ℓ²(X)} = \sum_{x \in X} u(x) \overline{v(x)}. The canonical basis of ℓ²(X) consists of the functions e_x (where x runs over all elements of X) of the form: e_x(x) = 1 and e_x(y) = 0 for y ≠ x. For simplicity, we will denote by ℓ_fin(X) the linear span of the canonical basis (so, f : X → ℂ belongs to ℓ_fin(X) iff the set {x ∈ X : f(x) ≠ 0} is finite).
We use B(H) and U(H) to denote, respectively, the C*-algebra of all bounded operators on H and the group of all unitary operators on H; I = I_H is used to denote the unit of U(H), and B_+(H) stands for the collection of all bounded positive operators on H with trivial kernel. (In particular, each member of B_+(H) is a self-adjoint operator with dense range.) For two self-adjoint operators A, B ∈ B(H) we write A ≤ B if ⟨(B − A)x, x⟩_H ≥ 0 for any x ∈ H. By a contraction we mean a bounded operator between Hilbert spaces whose operator norm is not greater than 1.
Whenever T is an operator, we use D(T), N(T), R(T) and Γ(T) to denote, respectively, the domain, the kernel, the range and the graph of T. In addition, \overline{R}(T) denotes the closure of R(T). The operator T is closed if Γ(T) is closed in the product of the Hilbert spaces between which T acts. T is closable if the closure of Γ(T) is the graph of an operator; in that case \overline{T} denotes the unique operator whose graph coincides with the closure of Γ(T), \overline{T} is called the closure of T, and D(T) is called a core of \overline{T}. For any closed linear subspace K of H, P_K stands for the orthogonal projection from H onto K.
Basic facts on the multiplicity theory for bounded self-adjoint operators on separable Hilbert spaces can be found in §10 of Chapter IX in [6]. To understand the present paper it is sufficient to know the following result, which will be used several times: the essential supremum of the multiplicity function of a bounded self-adjoint operator A acting on a separable Hilbert space does not exceed n ∈ {1, 2, . . .} iff A can be decomposed as the direct sum of at most n self-adjoint operators each of which has a cyclic vector. Recall that a vector v ∈ H is a cyclic vector for a self-adjoint operator A ∈ B(H) if the linear span of the vectors v, Av, A²v, . . . is dense in H.
Any closed operator T : D(T) → K (where D(T) is a dense subspace of H) admits the so-called polar decomposition, which has the form
(1-2) T = QA
where A : D(T) → H is positive self-adjoint in H and Q : H → K is a partial isometry. Moreover, the above Q and A are uniquely determined by (1-2) and the condition N(Q) = N(T). The above operator A satisfies A² = T*T, is called the absolute value of T and is denoted by |T|. For the details see Theorem 7.20 in [23].
For any A ∈ B_+(H) we denote by V(A) and Z(A) the collections, respectively, of all isometries V ∈ B(H) such that AV ∈ B_+(H), and of all closed linear subspaces Z of H such that Z ∩ R(A) = {0}. Additionally, for simplicity, for any set F in H, [F]_A stands for the set \overline{lin}(\bigcup_{n=0}^{∞} A^n(F)); that is, [F]_A is the smallest closed linear subspace of H that contains F and is invariant under A.
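In finite dimensions the subspace [F]_A for a single vector F = {v} is the span of v, Av, A²v, . . ., and v is cyclic exactly when this span is the whole space. The sketch below illustrates only this finite-dimensional picture; it is an assumption-laden toy, since the operators of interest in this paper are non-invertible members of B_+(H) and hence act on infinite-dimensional spaces. The helper names rank, mat_vec, krylov_dimension and diag are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def mat_vec(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def krylov_dimension(A, v):
    """Dimension of span{v, Av, ..., A^(n-1) v}, i.e. of [v]_A in dimension n."""
    vectors, w = [], list(v)
    for _ in range(len(v)):
        vectors.append(w)
        w = mat_vec(A, w)
    return rank(vectors)


def diag(*entries):
    """Diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


# Distinct eigenvalues: (1, 1, 1) is cyclic, so the spectrum is simple.
simple = krylov_dimension(diag(1, 2, 3), [1, 1, 1])
# A repeated eigenvalue: the subspace generated by (1, 1, 1) is proper.
repeated = krylov_dimension(diag(1, 1, 2), [1, 1, 1])
```

For diag(1, 1, 2) no choice of v gives dimension 3, which matches the statement that a direct sum of n cyclic pieces is needed when the multiplicity function reaches n.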
All necessary notions concerning reproducing kernels are introduced and discussed in Section 5.

Isometries of class V (A)
In this section A is fixed and denotes a member of B_+(H). We begin with
2.1. Proposition. The function Υ_A : V(A) ∋ V ↦ R(V)^⊥ ∈ Z(A) is a well defined bijection.
satisfies R(W)^⊥ = Z. Define an operator U ∈ U(H) by U := V^{-1}W and observe that W = VU. It follows from the assumptions that B := AV and C := BU are bounded positive operators such that C² = CC* = (BU)(U*B) = B². Since bounded positive operators have unique positive square roots, we infer that B = C. Since B has trivial kernel, we get U = I and thus W = V. In other words, Υ_A is one-to-one. Now take any Z ∈ Z(A) and define D ∈ B_+(H) as the (unique) positive square root of A(I − P_Z)A. Note that then and hence the range of D is dense in H. We also infer that D = |(I − P_Z)A| and hence, by the properties of the polar decomposition: (The above formula will be used in the proof of the next result.) Further, since we see that D² ≤ A² and thus, by [10] (see also Theorem 2.1 in [11]), there is a contraction V ∈ B(H) such that D = AV. Observe that then D = V*A and R(V*) is dense in H. Moreover, it follows from (2-2) that AVV*A = A(I − P_Z)A. Since A has dense range and trivial kernel, we get that
Recall that any isometry V ∈ B(H) induces a unique decomposition H = H_u ⊕ H_p (called Wold's decomposition) of the space H such that both H_u and H_p are invariant under V, V ↾ H_u is a unitary operator on H_u, and {0} is the only closed linear subspace K of H_p such that V(K) = K. Each of the spaces H_u and H_p can be trivial. We call the restrictions of V to H_u and H_p, respectively, the unitary and pure parts of V. We call the isometry V pure if H = H_p. It is well-known (and easy to prove) that any pure isometry W is unitarily equivalent to the direct sum of α copies of the (standard) unilateral shift, where α = dim R(W)^⊥. We call the cardinal α defined above the deficiency index of the isometry W. For the proofs of the above facts consult, e.g., Chapter 1 in [16] (therein pure isometries are called shifts and the deficiency index of a pure isometry is called its multiplicity).
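The notion of deficiency index can be made concrete for powers of the unilateral shift. The following is a minimal sketch, assuming finitely supported sequences are modelled as Python lists (entry j is the coefficient of e_j); the function names shift, shift_adjoint and defect_projection are hypothetical. It shows that S^k is an isometry whose range has orthogonal complement spanned by e_0, . . . , e_{k−1}, so its deficiency index is k.

```python
def shift(x, k=1):
    """Apply S^k: move every coefficient k places to the right."""
    return [0] * k + list(x)


def shift_adjoint(x, k=1):
    """Apply (S*)^k: drop the first k coefficients."""
    return list(x[k:])


def defect_projection(x, k=1):
    """Apply I - S^k (S*)^k, the projection onto the complement of R(S^k)."""
    x = list(x)
    back = shift(shift_adjoint(x, k), k)  # S^k (S*)^k x
    back = back[:len(x)] + [0] * (len(x) - len(back))
    return [a - b for a, b in zip(x, back)]


x = [5, 7, 11, 13]
# (S*)^k S^k = I: S^k is an isometry.
recovered = shift_adjoint(shift(x, 2), 2)
# The defect projection keeps exactly the first k coordinates.
defect_two = defect_projection(x, 2)
defect_one = defect_projection(x, 1)
```

The case k = 2 is the one used later in the text, where S² serves as a pure isometry with deficiency index 2.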
Now we describe Wold's decompositions of members of V (A).
Proof. We continue the notation introduced in the proof of Proposition 2.1: let D denote the positive square root of A(I − P_Z)A. Then the formulas (2-1) and (2-2) are valid. Moreover, we have D = AV and N(D) = {0}. Define a bounded operator We claim that R(T) is dense in H ⊕ Z. To convince ourselves of that, fix u ∈ H and v ∈ Z such that u ⊕ v ⊥ R(T). This means that for any x ∈ H, So, Du = −Av, but Du ∈ A(Z^⊥) (by (2-1)) and Av ∈ A(Z). Since A is one-to-one, we get Du = 0 = Av and therefore u = v = 0 (as both A and D have trivial kernels). Further, (2-3) combined with (2-2) yields that ‖Tx‖² = ‖Ax‖² for any x ∈ H. Since both T and A have dense ranges, we infer that there exists a (unique) unitary operator Q : H ⊕ Z → H such that
(2-4) QT = A.
Consequently, T = Q^{-1}A and hence (by (2-3)) D = PQ^{-1}A, where P : H ⊕ Z → H is the projection onto the first coordinate. But D = V*A and A has dense range. So, V* = PQ^{-1}. Equivalently, V = QP*. In other words, For simplicity, denote by E the subspace it follows from the self-adjointness of A that A^n x ⊥ Z for any n ≥ 0. Hence P_Z A^n x = 0 and by a simple induction argument applied to (2-2) we get A^{2n}x = D^{2n}x for any n ≥ 0. Since there is a sequence of polynomials p_1, p_2, . . . such that p_n(A²) → A and p_n(D²) → D in the operator norm as n → ∞, we obtain Ax = Dx. So, thanks to (2-5), (2-3) and (2-4), So, to finish the whole proof, it is sufficient to show that V ↾ E is a pure isometry on E (recall that 1 is an eigenvalue of no pure isometry). To this end, we restrict our further considerations to the space E (note that E is invariant for all of A, D and V, and that Z ⊂ E and (A ↾ E)(V ↾ E) = D ↾ E). In other words, we assume that H = E. Although everywhere below we will identify A, V and D with their restrictions to E, we shall write E instead of H to avoid confusion. Let be the Wold's decomposition induced by V. We only need to show that E_p = E.
To this end, put U def = V ↾ E u ∈ U(E u ) and S def = V ↾ E p ∈ B(E p ) and note that S is a pure isometry on E p and Z ⊂ E p (as Z = R(V ) ⊥ ). Represent A as a block matrix A = B X X * C with respect to the decomposition (2-6) (that is, and Since S is a pure isometry, the range of I − S is dense in E p . We conclude that The above result shows that for Z ∈ Z (A) the structure of the isometry W Z is completely determined by two cardinal numbers: α(Z) It is a natural question of when it may happen (for a fixed operator A) that β A (Z) = 0 for some Z; that is, when V (A) contains a pure isometry with a pre-set deficiency index. This question is fully answered in the next three propositions.
2.3. Proposition. For any n ∈ {1, 2, . . .} the following conditions are equivalent: (i) there exists a pure isometry V ∈ V(A) with deficiency index n; (ii) H is separable, A is non-invertible and the essential supremum of the multiplicity function of A does not exceed n; (iii) A is non-invertible and there is a finite subset F of H such that [F]_A = H and card(F) ≤ n.
Before giving a proof, let us first separate a special case of the above result that will be applied several times in the sequel:
2.4. Lemma. The following conditions are equivalent: (i) there exists a pure isometry V ∈ V(A) with deficiency index 1; (ii) H is separable, A is non-invertible and has simple spectrum.
Proof. If V ∈ V(A) is pure and has deficiency index 1, then R(V)^⊥ is generated by a single unit vector, say z. In particular, A is non-invertible (since z ∉ R(A)). Moreover, it follows from Theorem 2.2 that H = [z]_A. So, z is a cyclic vector for A and therefore H is separable and A has simple spectrum.
To prove the reverse implication, we model A as the operator M_µ of multiplication by the independent variable on L²(µ), where µ is a Borel probability measure on the spectrum K ⊂ [0, ‖A‖] of A (consult, e.g., Theorem 3.4 in Chapter IX of [6]). That is, (M_µ f)(t) = tf(t) for any f ∈ L²(µ) and t ∈ K. Since A is one-to-one, µ({0}) = 0. According to Theorem 2.2, we only need to show that there is By an analogous reasoning, also u := 1 + |f| ∈ L²(µ) is not a value of M_µ. It follows from the description of all (closed linear) invariant subspaces of self-adjoint operators of the form M_µ (consult, e.g., Corollary 6.9 in Chapter IX of [6]) that there is a Borel set σ ⊂ K such that [u]_{M_µ} = {g ∈ L²(µ) : g = 0 µ-a.e. on σ}. But u ∈ [u]_{M_µ} and therefore µ(σ) = 0. Consequently, [u]_{M_µ} = L²(µ) and we are done.
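The role of a cyclic vector lying outside the range can be seen in a discrete toy model, which is an assumption made for illustration and not the construction used in the proof: take A = diag(1, 1/2, 1/3, . . .) on ℓ² and z = (1, 1/2, 1/3, . . .). Then A is positive, one-to-one and non-invertible, z belongs to ℓ², and the only candidate solution of Ax = z is x = (1, 1, 1, . . .), which is not square-summable, so z ∉ R(A). The function names below are hypothetical.

```python
from fractions import Fraction


def eigenvalue(n):
    """n-th diagonal entry of the assumed model operator A = diag(1/n)."""
    return Fraction(1, n)


def z(n):
    """n-th coordinate of the assumed candidate vector z."""
    return Fraction(1, n)


def z_norm_squared(N):
    """Squared norm of the first N coordinates of z (bounded in N)."""
    return sum(z(n) ** 2 for n in range(1, N + 1))


def preimage_norm_squared(N):
    """Squared norm of the first N coordinates of the formal solution of Ax = z."""
    return sum((z(n) / eigenvalue(n)) ** 2 for n in range(1, N + 1))
```

The partial sums of z_norm_squared stay below 2 (they increase to π²/6), while preimage_norm_squared(N) equals N and is unbounded. Since the diagonal entries of A are distinct and all coordinates of z are non-zero, z is also a cyclic vector for A in this model.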
Proof of Proposition 2.3. First of all, note that all items (i)-(iii) imply that H is separable and A is non-invertible. So, everywhere below we assume these two properties: that H is separable and A is non-invertible.
Implication Now assume (iii) holds. Let F be as specified therein. We claim that there are k ∈ {1, . . . , n} and unit vectors z_1, . . . , z_k such that To this end, we proceed by induction on n. When n = 1, our conclusion easily follows. So, assume n > 1, choose any a ∈ F, put F_0 and apply the induction hypothesis to H_0 and A_0 (and F_0): there are ℓ ∈ {1, . . . , n − 1} and unit vectors z_1, . . . , z_ℓ such that H_0 = \bigoplus_{j=1}^{ℓ} [z_j]_{A_0}. If H_0 = H, just put k = ℓ to finish the proof of (2-8). Otherwise b := P_{H_0^⊥}(a) ≠ 0 and it is sufficient to define k as ℓ + 1 and z_k as b/‖b‖ to get (2-8). Since each of the subspaces [z_j]_A is invariant under A (and z_j is a cyclic vector for the restriction of A to [z_j]_A), we see that A is the direct sum of at most n self-adjoint operators with simple spectrum, which yields (ii).
Finally, assume that (ii) is fulfilled. This means that A is the direct sum of at most n self-adjoint operators with simple spectrum, say A = k j=1 A j where k n (and A j has simple spectrum). Then each of A j is positive with trivial kernel and one of them, say A 1 , is non-invertible. Using e.g. the spectral measure of A 1 , we can decompose A 1 as A 1 = ∞ m=1 B m where each B m acts on a non-zero Hilbert space and lim m→∞ B m = 0. Now we decompose the set of all positive integers as the union of n pairwise disjoint sets J 1 , . . . , J n in a way such that J 1 is infinite and for any j ∈ {2, . . . , n}: for each s ∈ J j . Now define operators C 1 , . . . , C n as follows: It follows from the above construction that: • A is unitarily equivalent to n j=1 C j ; • each of C j is positive, non-invertible and has trivial kernel; • each of C j has simple spectrum. Only the last of these properties can be seen as non-trivial, so let us briefly explain it. Since B j def = s∈Jj B s is the restriction of A 1 to an invariant subspace of A 1 and A 1 has simple spectrum, B j has simple spectrum as well. Finally, if j > 1 and A j is invertible, then B j < 1/ A −1 j which implies that the spectra of B j and A j are disjoint. But then B j ⊕ A j has simple spectrum, as both A j and B j have so.
To conclude the proof, apply Lemma 2.4 to each of C j : there is a pure isometry S j with deficiency index 1 (acting on an appropriate Hilbert space) such that C j S j is positive. Then also ( n j=1 C j )( n j=1 S j ) is positive. So, we complete the proof by noticing that n j=1 S j is a pure isometry with deficiency index n and that A is unitarily equivalent to n j=1 C j . 2.5. Proposition. The following conditions are equivalent: Proof. The argument is similar to a part of the previous proof and goes as follows. It is clear that (ii) is implied by (i). Assume (i) holds and let B be a maximal set Then B is non-empty and (at most) countable, and We fix e ∈ B such that e / ∈ R(A). We infer that A 0 A is non-invertible (as an operator in B([e] A )). So, we may decompose A 0 (using, e.g., the spectral measure of A 0 ) as A 0 = ∞ n=1 D n where each D n acts on a non-zero Hilbert space and lim n→∞ D n = 0. Denote by C the set of all We divide the set of all positive integers into pairwise disjoint sets J b (b ∈ B) in a way such that: • J e is infinite; Finally, decompose J e as the union of pairwise disjoint infinite sets I 0 , I 1 , . . . Now we define operators T b (b ∈ B) and T (n) (n = 1, 2, . . .) by the rules: • T e = k∈I0 D k ; • T (n) = k∈In D k for all n > 0. For simplicity, gather all the operators defined above in a sequence L 1 , L 2 , . . . Since all the sets I n (n 0) and J b (b ∈ B \ {e}) are pairwise disjoint and their union coincides with the set of all positive integers, one shows that A is unitarily equivalent to ∞ n=1 L n (thanks to (2)(3)(4)(5)(6)(7)(8)(9)). Furthermore, it follows from the construction that each of L n is non-invertible and has a cyclic vector (cf. the proof of Proposition 2.3). So, thanks to Lemma 2.4 for any n > 0 there is a pure isometry S n with deficiency index 1 (that acts on a suitable Hilbert space) such that L n S n is positive. 
Then S def = ∞ n=1 S n is a pure isometry with deficiency index ℵ 0 and belongs to V ( ∞ n=1 L n ). This easily implies that (i) holds.
To convince oneself of that, assume card(B) < α. Then also dim( b∈B [b] A ) < dim(Z) and therefore there exists a unit vector z ∈ Z orthogonal to Note that Q commutes with all P b and that to get (iii) it is sufficient to show that QP b = 0 for any b ∈ B. To this end, fix b ∈ B and assume that, on the contrary, which contradicts the invertibility of that restriction.
Finally, assume (iii) holds. We want to show that there is V witnessing (i). Let {E s } s∈S be a maximal family of closed linear subspaces such that: • E s is separable and A(E s ) E s for all s ∈ S; • E s ⊥ E t for any distinct s, t ∈ S.
Further, keeping the setting (2-13), decompose F as F = t∈T W t where each W t is a separable closed linear subspace of F invariant under A. Then card(T ) α and hence there is a one-to-one mapping κ : T → S. Now define, for s ∈ S, H s as follows: • Observe that H = s∈S H s , and for any s ∈ S, H s is separable and A(H s ) H s . So, it follows from Proposition 2.5 that for any s ∈ S there is a pure isometry Observe that item (ii) of the above theorem-as well as the collection Z (A)depends only on the range of the operator A. Further conditions (also formulated only in terms of operator ranges) equivalent to this item are a subject of Theorem 3.3 from the next section.
Proof of Theorem 1.1. Just observe that the equivalence of conditions (i) and (ii) immediately follows from Propositions 2.3, 2.5 and 2.6, whereas the remaining (additional) part of the theorem is a reformulation of Lemma 2.4.

2.7. Remark. It was shown earlier, in [13], that one-to-one Hankel operators satisfying the double positivity condition have simple spectra. Both proofs (the one in [13] and ours) are based on the same idea, which is a kind of folklore in operator theory. For the details, consult [14] or Proposition 2.5 in [13] together with the preceding paragraph (therein).
The following result is a simple consequence of a deep theorem from [15]. Here we give its brief proof.

Corollary. The essential supremum of the multiplicity function of a positive Hankel operator A with trivial kernel does not exceed 2.
Proof. If S denotes the (classical) unilateral shift, then S² is a pure isometry with deficiency index 2 and AS² is a positive operator (because A is Hankel, so AS² = S*AS, and A is positive). So, the assertion follows from Theorem 1.1 (for m = 2).
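The positivity step in the proof above, AS² = S*AS ≥ 0, does not require AS itself to be positive. The sketch below illustrates this on an assumed toy example: the rank-one Hankel matrix with entries a(j + k), a(n) = (−1/2)^n, which is positive but has a non-trivial kernel, so it illustrates only the positivity argument and not the full hypotheses of the corollary. The names a and quadratic_form are hypothetical.

```python
from fractions import Fraction


def a(n):
    # Assumed example: moments of the unit point mass at -1/2.
    return Fraction(-1, 2) ** n


def quadratic_form(shift, x):
    """Value of sum over j, k of a(j + k + shift) * x[j] * x[k] for real x.

    shift = 0, 1, 2 corresponds to the forms of A, AS and AS^2 respectively.
    """
    n = len(x)
    return sum(a(j + k + shift) * x[j] * x[k] for j in range(n) for k in range(n))


samples = [[1, 0, 0], [3, -2, 5], [0, 1, 1], [2, 4, 0]]
forms = [[quadratic_form(s, x) for s in (0, 1, 2)] for x in samples]
```

For each sample vector the form of A is non-negative, the form of AS is non-positive (it equals −1/2 times the form of A), and the form of AS² is non-negative again (it equals 1/4 times the form of A).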
In [13] the authors showed that for any bounded Hankel operator A satisfying the double positivity condition the operator A ↾ \overline{R}(A) has simple spectrum. It is also well-known (and easy to prove) that the kernel of a Hankel operator is either trivial or infinite-dimensional. So, the following problem naturally arises:
2.9. Conjecture. Let A be a bounded positive operator on a separable Hilbert space such that N(A) is infinite-dimensional, R(A) is non-closed and A ↾ \overline{R}(A) has simple spectrum. Then A is unitarily equivalent to a Hankel operator satisfying the double positivity condition.

Operator ranges
In this part we give further conditions on a dense operator range R contained in H equivalent to the existence of a closed linear subspace Z of H such that Z ∩ R = {0} and dim(Z) = dim(H) (cf. item (ii) in Proposition 2.6). To this end we need to recall some well-known facts about operator ranges.
A linear subspace R of a Hilbert space H is called an operator range if there exist a Hilbert space K and a bounded operator T : The following is a basic result on operator ranges (see Theorem 1.1 in [11] Moreover, the operator range R given by is, in general, not unique. Also, some of the spaces H n can be zero-dimensional. For the purposes of this section, let us introduce the following 3.2. Definition. Let α 1 , α 2 , . . . be any sequence of cardinal numbers. We say an A single operator range can be of many different types. However, it is easy to see that dense operator ranges of a given type are all equivalent. When two types represent equivalent dense operator ranges is a subject of Theorem 3.3 in [11]. Here we skip the details. The main result of this section is 3.3. Theorem. Let R be a dense operator range of type R(α n ) ∞ n=1 in a Hilbert space H of dimension α ℵ 0 . The following conditions are equivalent: Before giving a proof, first we comment on the assertion of (f) in the above result: in the case when R is a non-closed operator range in a separable Hilbert space H it was first proved by Dixmier [8] (consult also Theorem 3.6 in [11]). He started his proof with a specific example of an operator range for which (f) holds and then described an elegant technique to get the assertion in full generality (in the separable case). Here we will generalize his method, but instead of using his "starting" example we propose a new approach to this issue-our starting tool will be the following proposition, which may be already known, but we could not find it in the literature. This result can be considered interesting in its own right.
Proof. First of all, recall that for separable H the space U(H) is separable and completely metrizable in the strong operator topology. Moreover, for any Hilbert space K, U(K) is a topological group with respect to this topology. So, in the separable case we can apply Baire's theorem, which will lead us to the final conclusion after showing the following property, valid in all (that is, possibly non-separable) infinite-dimensional Hilbert spaces H: We may and do assume that M = ∅. It easily follows from the compactness of ε for any f ∈ F . We may and do assume that 0 / ∈ F . For simplicity, put L def = M ∪ F and take m > 0 such that  y m (y ∈ L).
Next, choose any δ ∈ (0, 1) satisfying and let S ⊂ L be a finite (mδ/4)-net in L containing F . Set E def = lin(S) and take any W ∈ U(H) such that We conclude from the above property (and the fact that E is finite-dimensional) that there exists V ∈ U(H) that satisfies Since F ⊂ E, it follows from (3)(4) and (3)(4)(5) that for any f ∈ F , (where the last inequality is a consequence of (3-3)). Thus, to end the proof of (⋆), it remains to check that V (M ) ∩ M = ∅. To this end, first take a, b ∈ S ⊂ E ∩ L. Again, we infer from (3)(4) and (3)(4)(5) by . Now if x, y ∈ M are arbitrary, choose a, b ∈ S such that x − a mδ/4 and y−b mδ/4. We then have V x−y V a−b − V a−V x − y−b mδ/2 and we are done.
Having (⋆), the assertion of the proposition can briefly be proven. Since C is compact and the closed unit ballB K in K is weakly compact, the set D ∅} is open and dense in U(H). Finally, Baire's theorem yields that the intersection of all Ω n , which coincides with ∆, is dense in U(H).
In the proof of Theorem 3.3 we shall also apply the following result.
3.5. Corollary. There exists a dense operator range R_0 in a separable Hilbert space K and a closed linear Take any compact self-adjoint operator A : H → H with trivial kernel and choose (applying Proposition 3.4) any U ∈ U(H) such that
Proof of Theorem 3.3. For the aim of this proof, take a sequence of mutually orthogonal closed linear subspaces and dim(H_n) = α_n for any n > 0. Since R is dense, we have (by Theorem 3.1): Before passing to the main proof, consider an additional condition: (h) for any β < α and n > 0 there exists a closed linear subspace K of H such that (Note that (h) is a weakening of each of (a), (b), (c), (d) and (g).) We will show that (h) is equivalent to each of (a)-(g). For the reader's convenience, let us draw the scheme of the proof: Of course, (a) implies (b). If β < α and Y witnesses (b) (for β), and K is as specified in (d), then It is obvious that (d) implies (h). Now assume (h) holds and fix β < α and n ≥ 0. For simplicity, denote by P the orthogonal projection onto the orthogonal complement (in H) of \bigoplus_{j=1}^{n} H_j. By (h), there is a closed linear subspace K of H that satisfies (3-9). As argued previously, we conclude that dim(R(P)) ≥ dim(K) > β. But R(P) = \bigoplus_{j=n+1}^{∞} H_j (thanks to (3-8)) and hence \sum_{j>n} α_j > β. So, one can find m > n such that \sum_{j=n+1}^{m} α_j > β, which gives (c). Finally, assume (c) holds. Consider a bounded operator A : H → H defined as follows: It readily follows from (3-7) that R(A) = R. Moreover, it is also easy to show that for arbitrarily fixed ε > 0, \bigoplus_{j=n}^{∞} H_j ⊂ R(E((0, ε))) for sufficiently large n > 0, where E is the spectral measure of A. Consequently, we infer from (c) that dim(R(E((0, ε)))) = α and it suffices to apply Proposition 2.6 to get (a).
Further, (e) is easily implied by (f) as U (R) (for any U ∈ U(H)) is a dense operator range in H. And if (e) holds, for all n, m > 0. These properties easily yield (h).
To show the additional claim of the theorem and that (a) follows from (g), it is sufficient to prove that dim(W) = α whenever W is as specified in (g). To this end, assume, on the contrary, that dim(W) < α. Then, by (3-8), we can find n > 0 such that We turn to the hardest part of the proof, namely, that both (f) and (g) follow from (c). We adapt Dixmier's proof [8] (see also Theorem 3.6 in [11]) of the result mentioned in the paragraph following the statement of Theorem 3.3 above, but instead of his specific example of a dense operator range in a separable Hilbert space that satisfies (f) we apply our Proposition 3.4 and Corollary 3.5.
To simplify further arguments, let us call a linear subspace R ⊂ H of an arbitrary Hilbert space (f,g)-valid if both (f) and (g) hold for R. Here we do not assume that R is an operator range. In a similar manner we define (f)-valid and (g)-valid linear subspaces of Hilbert spaces. Moreover, for any non-empty set J and R ⊂ H let Finally, for any infinite cardinal γ we call an operator range R of type R γ if it is of type R(γ n ) ∞ n=1 where γ n = γ for each n. We divide the remaining part of the proof of the theorem into the following steps: (I) There are dense operator ranges R f and R g in separable Hilbert spaces such that R f is (f)-valid and R g is (g)-valid.
(IV) If R is a dense operator range in a separable Hilbert space H and J is an infinite set, then R J contains an operator range of type R card(J) that is dense in H J .
(V) For any infinite cardinal γ, all dense operator ranges of type R γ are (f,g)-valid.
(VI) If R and H are as specified in the statement of the theorem and (c) is fulfilled, then R is contained in a dense (in H) operator range of type R α . In particular, R is (f,g)-valid.

Note that property (VI) is exactly what we want. Below we give brief proofs of the above items (I)-(VI).
In a similar manner one shows the following result whose proof is left to the reader.
Everywhere below, $\alpha$ and $\beta$ are cardinal numbers and "$\beta < \infty$" means that $\beta$ is finite.

Now assume $\alpha = \dim(H) = \operatorname{cdr}(A)$. The proof of Proposition 2.6 shows that $A$ can be decomposed as $A = \bigoplus_{s \in S} A_s$ where $\operatorname{card}(S) = \alpha$ and each $A_s$ is non-invertible and acts on a separable Hilbert space. Moreover, we have shown there that the existence of such a decomposition is sufficient for the existence of $Z \in \mathcal{Z}(A)$ such that $\dim(Z) = \alpha$ and $[Z]_A = H$. We will use this property below.
If $\beta$ is infinite, we take disjoint subsets $S_1$ and $S_2$ of $S$ such that $S = S_1 \cup S_2$, $\operatorname{card}(S_1) = \alpha$ and $\operatorname{card}(S_2) = \beta$. We infer from the property recalled above that there is $W \in \mathcal{Z}(\bigoplus_{s \in S_1} A_s)$ such that $\dim(W) = \alpha$ and $[W]_A$ coincides with the domain of $\bigoplus_{s \in S_1} A_s$. Then $\dim([W]_A^{\perp}) = \operatorname{card}(S_2)$, because $S_2$ is infinite and all $A_s$ act on separable spaces. Consequently, $(\alpha, \beta) \in \Xi(A)$.
Finally, if $\beta < \operatorname{eig}(A)$ (and still $\alpha = \dim(H) = \operatorname{cdr}(A)$), we can find a linear subspace $E$ of $H$ of (finite) dimension $\beta$ that is invariant for $A$. It easily follows, e.g., from condition (c) of Theorem 3.3 that there is $Z \in \mathcal{Z}(A)$ such that $\dim(Z) = \alpha$ and $[Z]_A = E^{\perp}$. This yields $(\alpha, \beta) \in \Xi(A)$ and we are done.

Unbounded positive operators
Recall that a (densely defined) operator T : We emphasize that, according to the above definition, positive operators need not be self-adjoint.
The main aim of this section is the following consequence of the results presented in previous sections.

Theorem. For a (possibly unbounded) positive self-adjoint operator $A$ in a Hilbert space $H$ the following conditions are equivalent:

Proof. Assume $A$ is not bounded and let $E$ denote the spectral measure of $A$. We will show that there is an isometry $V$ on $H$ such that $\dim(R(V)^{\perp}) = 1$ and

It is sufficient to show our claim (stated at the beginning of the proof) for $A_0$ (because if $V_0$ is an appropriate isometry for $A_0$, then $V_0 \oplus I_{H_1}$ is appropriate for $A$) and thus we may and do assume that $A = A_0$ (and $H = H_0$). Let $B = A^{-1} \in \mathcal{B}_+(H)$. Since $A$ is unbounded, $R(B) = D(A) \ne H$, and we infer from Proposition 2.1 that there is an isometry $V \in \mathcal{V}(B)$ such that $\dim(R(V)^{\perp}) = 1$. For any $x \in D(A)$ set $y = Ax$ and observe that $\langle VAx, x \rangle_H = \langle Vy, By \rangle_H = \langle BVy, y \rangle_H \ge 0$, which shows that $T = VA$ is positive. Moreover, since $A$ is closed and $V$ is isometric, $T$ is closed as well and $T^* = AV^*$. Thus $T^*T = A^2$ and consequently $|T| = A$. Finally, since $V \ne I$ and $R(A)$ is dense in $H$, we have $T \ne A$. This shows that (i) implies (ii). The reverse implication is well known and left to the reader as a simple exercise.
As a consequence of the above theorem, we get the following slightly surprising result.

x ∈ X} is an orthonormal system in $\ell_2(X)$; (O3) $A(e_z) \ne u(z)e_z$ for some $z \in X$.
Proof. First assume that $u$ is bounded and $A$ is a positive self-adjoint operator such that (O1) and (O2) hold. We shall show that $Ae_x = u(x)e_x$ for any $x \in X$ (that is, that $A$ is bounded and diagonal in the canonical basis). For simplicity, set $f_x \stackrel{\mathrm{def}}{=} \frac{Ae_x}{u(x)}$ for $x \in X$. From (O2) and the boundedness of $u$ it easily follows that $A \restriction \operatorname{lin}\{e_x : x \in X\}$ is bounded. Consequently, $A$ is bounded as well. Moreover, for any $x, y \in X$ we have $\langle A^2 e_x, e_y \rangle_{\ell_2(X)} = \langle Ae_x, Ae_y \rangle_{\ell_2(X)} = u(x)u(y) \langle f_x, f_y \rangle_{\ell_2(X)} = u(x)u(y) \langle e_x, e_y \rangle_{\ell_2(X)}$ and hence $A^2 e_x = u(x)^2 e_x$. Since $A$ is the unique positive square root of $A^2$, we obtain that $Ae_x = u(x)e_x$ for any $x \in X$, as claimed above.
Finally, assume $u$ is unbounded and let $B$ be the diagonal operator (with respect to the canonical basis) induced by $u$; that is, $D(B) = \{f \in \ell_2(X) : uf \in \ell_2(X)\}$ and $Bf = uf$ for $f \in D(B)$. Since $B$ is not bounded, we conclude from the proof of u(x) = V (e x ) form an orthonormal system different from the canonical one. (Note also that it is possible to enlarge this system by adding a single vector to obtain an orthonormal basis of $\ell_2(X)$, since $\dim(R(V)^{\perp}) = 1$.) Now to end the proof, it suffices to apply Friedrichs' theorem [12] on extending positive operators (consult also, e.g., [1] or Theorem 5.38 in [23]) to get a positive self-adjoint operator $A$ in $\ell_2(X)$ that extends $T$ and satisfies conditions (O1)-(O3).
We leave it to the reader as an exercise that whenever conditions (O1)-(O3) are fulfilled (for a positive operator A), the system in (O2) is never an orthonormal basis. However, as the above proof shows, if only u is an unbounded function, we can always find such an operator A for which the closed linear subspace generated by the system from (O2) has codimension 1.
We end this section with the following result, which is unrelated to the main subject of the paper (that is, it says nothing about positivity). We will use it in the next section. It is likely that this result is already known; however, we could not find it in the literature. It can be considered interesting in its own right.

Proof of Lemma 4.3. Replacing $T$ by its closure $\bar{T}$, we may and do assume $T$ is closed. In that case we need to show that $T \restriction S$ is continuous in the weak topologies, and $S$ is closed. To this end, let $\mathbf{x} = (x_\sigma)_{\sigma \in \Sigma}$ be a net in $S$ that weakly converges to some $z \in H$ (where $H$ is the underlying space containing the domain of $T$). Let $(y_\lambda)_{\lambda \in \Lambda}$ be any subnet of $\mathbf{x}$ such that $(Ty_\lambda)_{\lambda \in \Lambda}$ is weakly convergent, say to $w$. Then $(y_\lambda, Ty_\lambda)_{\lambda \in \Lambda} \subset \Gamma(T)$ is a net weakly convergent (in the product space) to $(z, w)$. Since norm closed convex subsets of Banach spaces are weakly closed, we conclude that $(z, w) \in \Gamma(T)$; that is, $z \in D(T)$ and $Tz = w$. This shows that each weakly convergent subnet of $(Tx_\sigma)_{\sigma \in \Sigma}$ converges to $Tz$. So, it follows from the weak compactness of the unit ball (in the target space) that the net $(Tx_\sigma)_{\sigma \in \Sigma}$ itself converges weakly to $Tz$. In particular, $\|Tz\| \le 1$ and we are done.

Positive definite kernels
Before passing to the main issue of this part, first we recall necessary notions.
A kernel on $X$ (where $X$ is an arbitrary non-empty set) is a complex-valued function on $X \times X$. A kernel $K : X \times X \to \mathbb{C}$ is said to be a positive definite kernel (on $X$) (or a Hilbert space reproducing kernel, or briefly a reproducing kernel) if

(5-1) $\sum_{j,k=1}^{n} \lambda_j \overline{\lambda_k} K(x_j, x_k) \ge 0$

for any $n \ge 1$, $x_1, \ldots, x_n \in X$ and $\lambda_1, \ldots, \lambda_n \in \mathbb{C}$. Note that the above condition says that the sum on the left-hand side of (5-1) (whose summands are complex!) is a non-negative real number. It is well known (and easy to check) that for any reproducing kernel $K$ on $X$,

(5-2) $K(y, x) = \overline{K(x, y)}$

and $K(x, x) \ge 0$ for all $x, y \in X$. It is well known (and can briefly be proved by applying Sylvester's theorem on strictly positive definite matrices) that a kernel $K : X \times X \to \mathbb{C}$ is positive definite iff $K$ satisfies (5-2) and $\det[K(x_j, x_k)]_{j,k=1}^{n} \ge 0$ for all $x_1, \ldots, x_n \in X$ (and arbitrary $n \ge 1$). Another equivalent (and well-known) condition for the kernel $K$ to be a reproducing kernel is the existence of a function $j : X \to H$, where $H$ is some Hilbert space, such that $K(x, y) = \langle j(x), j(y) \rangle_H$. To shorten statements, below we will use the abbreviation "pd" for "positive definite." We will also write "$K \gg 0$" to express that $K$ is a pd kernel. More generally, for two kernels $K$ and $L$ defined on a common set the notations "$K \ll L$" and "$L \gg K$" will mean that $L - K$ is a pd kernel.
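The following short computation may help to see why the existence of a map $j$ forces positive definiteness. It is only a sketch: it assumes that the inner product is linear in the first variable and that the defining sum in (5-1) is $\sum_{p,q} \lambda_p \overline{\lambda_q} K(x_p, x_q)$; both conventions are assumptions about notation, and the indices $p, q$ are used only to avoid a clash with the map $j$.

```latex
% Assumption: K(x,y) = <j(x), j(y)>_H, inner product linear in the first slot.
\[
\sum_{p,q=1}^{n} \lambda_p \overline{\lambda_q}\, K(x_p, x_q)
  = \sum_{p,q=1}^{n} \lambda_p \overline{\lambda_q}\,
    \langle j(x_p), j(x_q) \rangle_H
  = \Bigl\| \sum_{p=1}^{n} \lambda_p\, j(x_p) \Bigr\|_H^{2} \;\ge\; 0 .
\]
% The same representation gives Hermitian symmetry and, via the
% Cauchy-Schwarz inequality, the two-point estimate
\[
K(y,x) = \overline{K(x,y)}, \qquad
|K(x,y)|^{2} \le K(x,x)\, K(y,y) \qquad (x, y \in X).
\]
```

The two-point estimate is the case $n = 2$ of the determinant condition quoted above.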
In this paper we study those pd kernels that have a pdms root (see Definition 1.2). Note that, thanks to (5-2), (1-1) is equivalent to $L(x, z) = \sum_{y \in X} K(x, y)\overline{K(z, y)}$. In particular, if $z = x$, the series in (1-1) has non-negative summands and thus its sum is well defined (it is either a real number or $\infty$). And, if all these series (with $z = x$) are finite, the Schwarz inequality shows that the right-hand side series in (1-1) converges for arbitrary $x, z \in X$. So, there is no ambiguity in understanding this formula. To simplify further statements, let us introduce the following

5.1. Definition. A kernel $K$ on $X$ is called an $\ell_2$-kernel if both $K(x, \cdot)$ and $K(\cdot, x)$ are members of $\ell_2(X)$ for any $x \in X$.
For two $\ell_2$-kernels $K$ and $L$ on $X$, $K * L$ is a kernel on $X$ given by $(K * L)(x, y) = \sum_{z \in X} K(x, z)L(z, y)$ $(x, y \in X)$. In addition, $K^*$ is a kernel on $X$ such that $K^*(x, y) = \overline{K(y, x)}$ for any $x, y \in X$.
The paragraph preceding the above definition explains that for any $\ell_2$-kernels $K$ and $L$ (both on $X$), $K * L$ is a well-defined kernel on $X$. Moreover, we readily have $(K * L)(x, y) = \langle K(x, \cdot), L^*(y, \cdot) \rangle_{\ell_2(X)}$. It is worth noting here that, in general, $K * L$ is not an $\ell_2$-kernel (see, e.g., item (A) in Example 5.21 below). Observe also that a pd kernel $K$ on $X$ is an $\ell_2$-kernel iff $K(x, \cdot) \in \ell_2(X)$ for all $x \in X$.
Using the notation introduced in Definition 5.1, we can reformulate the equation defining a pdms root as follows: $K \gg 0$ is a pdms root of $L \gg 0$ iff $K$ is an $\ell_2$-kernel and $L = K * K$. In the last statement the assumption that $L \gg 0$ can be dropped, as shown by the following very easy

5.2. Proposition. For any $\ell_2$-kernel $K$, $K * K^*$ is a pd kernel. In particular, if $K \gg 0$ is an $\ell_2$-kernel, then $K * K \gg 0$ as well.
Proof. Each pd kernel $K$ satisfies $K = K^*$, so it is sufficient to prove the first claim, which follows from the formula $(K * K^*)(x, y) = \langle K(x, \cdot), K(y, \cdot) \rangle_{\ell_2(X)}$.
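For readers who want the last step spelled out, here is a sketch of how the displayed formula yields positive definiteness. It assumes the inner product of $\ell_2(X)$ is linear in the first variable and that positive definiteness is tested on sums of the form $\sum_{p,q} \lambda_p \overline{\lambda_q}(\cdot)(x_p, x_q)$; these conventions are assumptions, not taken from the text.

```latex
% Assumption: <.,.> is linear in the first slot; K is an l_2-kernel on X.
\[
\sum_{p,q=1}^{n} \lambda_p \overline{\lambda_q}\,(K * K^{*})(x_p, x_q)
  = \sum_{p,q=1}^{n} \lambda_p \overline{\lambda_q}\,
    \langle K(x_p,\cdot), K(x_q,\cdot) \rangle_{\ell_2(X)}
  = \Bigl\| \sum_{p=1}^{n} \lambda_p\, K(x_p,\cdot) \Bigr\|_{\ell_2(X)}^{2}
  \;\ge\; 0 .
\]
% If moreover K = K^* (as for every pd kernel), then K * K = K * K^*,
% which gives the second claim of the proposition.
```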
For any kernel $K$ on $X$ let $K_{\mathrm{op}}$ denote the unique linear operator defined on $\ell_{\mathrm{fin}}(X)$ such that $K_{\mathrm{op}}(e_x) = K(x, \cdot)$ for all $x \in X$ (so, all values of $K_{\mathrm{op}}$ are complex-valued functions on $X$). Everywhere below its domain $D(K_{\mathrm{op}}) = \ell_{\mathrm{fin}}(X)$ will always be equipped with the norm and the topology inherited from $\ell_2(X)$ and considered as a subspace of $\ell_2(X)$. In contrast, the target space of $K_{\mathrm{op}}$ will vary and will always be specified.

Lemma.
For any $\ell_2$-kernel $K$ on $X$, $K_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to \ell_2(X)$ is a closable operator such that $K(x, y) = \langle K_{\mathrm{op}} e_x, e_y \rangle_{\ell_2(X)}$ for any $x, y \in X$.
Proof. The only thing that needs proving is the closability of $K_{\mathrm{op}}$. But this easily follows from the relation $\langle K_{\mathrm{op}} f, e_x \rangle_{\ell_2(X)} = \langle f, K^*(x, \cdot) \rangle_{\ell_2(X)}$ $(x \in X,\ f \in \ell_{\mathrm{fin}}(X))$. The details are left to the reader.
Every pd kernel $K$ on $X$ generates a unique Hilbert function space (consisting of complex-valued functions on $X$), to be denoted by $H_K$, such that the following two conditions are fulfilled:
• the functions $K(x, \cdot)$ (where $x$ runs over all elements of $X$) belong to $H_K$;
• $f(x) = \langle f, K(x, \cdot) \rangle_{H_K}$ for any $x \in X$ and $f \in H_K$.
In particular, $K(x, y) = \langle K(x, \cdot), K(y, \cdot) \rangle_{H_K}$. (Very often $H_K$ is defined by conditions obtained from the above two by replacing the functions $K(x, \cdot)$ by $K(\cdot, x)$. In general, this leads to a different vector space. However, both approaches are fully equivalent and it is a matter of taste which one to choose. Our choice is more convenient for our purposes.) A well-known fact says that $H_{\delta_X} = \ell_2(X)$, which we will use many times without any additional explanation. (Recall that $\delta_X$ is the pd kernel on $X$ such that $\delta_X(x, x) = 1$ for all $x \in X$ and $\delta_X(x, y) = 0$ whenever $x \ne y$.) To simplify further statements, let us say that an operator $T : D(T) \to H$ (where $\ell_{\mathrm{fin}}(X) \subset D(T) \subset \ell_2(X)$ and $H$ is a Hilbert space) factorizes a pd kernel $K$ on $X$ if $K(x, y) = \langle Te_x, Te_y \rangle_H$ $(x, y \in X)$.

Lemma.
For a pd kernel K on X the following conditions are equivalent: (i) there exists a closable operator that factorizes K; (ii) the restriction to ℓ f in (X) of each operator that factorizes K is closable.
Proof. It is easy to see that (i) is implied by (ii) (note that it is sufficient to show the existence of an operator that factorizes K): the operator K op : ℓ f in (X) → H K factorizes K. The reverse implication is also simple: if S : D(S) → H and T : D(T ) → E factorize K, then T e x , T e y E = Se x , Se y H for any x, y ∈ X and therefore there exists a linear isometry V : is so, and we are done.
For a collection of kernels $\{K_s : X_s \times X_s \to \mathbb{C}\}_{s \in S}$ where the sets $X_s$ are all pairwise disjoint we define the kernel $\bigoplus_{s \in S} K_s$ on the (disjoint) union $\bigsqcup_{s \in S} X_s$ of the sets $X_s$ as follows:
• $(\bigoplus_{s \in S} K_s)(x, y) = K_t(x, y)$ for $x, y \in X_t$ and arbitrary $t \in S$;
• $(\bigoplus_{s \in S} K_s)(x, y) = 0$ if $x \in X_p$, $y \in X_q$ and $p \ne q$.
It is easy to check, and left to the reader, that $\bigoplus_{s \in S} K_s \gg 0$ iff $K_s \gg 0$ for all $s \in S$.
For simplicity, let us call a vector $u \in \ell_{\mathrm{fin}}(X)$ K-unit (where $K$ is a pd kernel on $X$) if $\sum_{x,y \in X} u(x)\overline{u(y)}K(x, y) \le 1$.
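The name "K-unit" can be motivated by the following identity, given here only as a sketch. It assumes that the defining sum carries a complex conjugate on the second factor, that inner products are linear in the first variable, and that $K_{\mathrm{op}}$ is regarded as an operator into $H_K$.

```latex
% Assumptions: u has finite support; K_op u = \sum_x u(x) K(x, .) in H_K;
% K(x,y) = <K(x,.), K(y,.)>_{H_K}.
\[
\sum_{x,y \in X} u(x)\overline{u(y)}\, K(x,y)
  = \Bigl\langle \sum_{x \in X} u(x) K(x,\cdot),\;
      \sum_{y \in X} u(y) K(y,\cdot) \Bigr\rangle_{H_K}
  = \| K_{\mathrm{op}} u \|_{H_K}^{2} .
\]
% Hence u is K-unit exactly when K_op u lies in the closed unit ball of H_K.
% Example: for K = \delta_X the condition reads \|u\|_{\ell_2(X)} \le 1.
```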
Now we gather various criteria for a pd kernel to have a pdms root.

5.5.
Theorem. For a pd kernel $K$ on $X$ the following conditions are equivalent:
(a1) $K$ has a pdms root;
(a2) there exists an $\ell_2$-kernel $L$ on $X$ such that $K = L * L^*$;
(b) the set $X$ admits a decomposition $X = \bigsqcup_{s \in S} X_s$ into (pairwise disjoint) non-empty (at most) countable sets such that $K = \bigoplus_{s \in S} K \restriction X_s \times X_s$ and each of $K \restriction X_s \times X_s$ has a pdms root;
(c1) there exists a closable operator that factorizes $K$;
(c2) there exists a positive self-adjoint operator $B$ in $\ell_2(X)$ that factorizes $K$ and has $\ell_{\mathrm{fin}}(X)$ as a core;
(c3) $K_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to H_K$ is closable;
(d) $\ell_2(X) \cap H_K$ is dense in $H_K$;
(e1) if a sequence $(\alpha_n)_{n=1}^{\infty} \subset \ell_{\mathrm{fin}}(X)$ norm converges to 0 and consists of K-unit vectors, then $\lim_{n \to \infty} \sum_{x \in X} \alpha_n(x)K(x, z) = 0$ for all $z \in X$;
(e2) for any $z \in X$ and $\varepsilon > 0$ there is a finite non-empty set $F \subset X$ such that the following condition holds: if a K-unit vector $f \in \ell_{\mathrm{fin}}(X)$ vanishes at each point of $F$ and has norm not exceeding 1, then $|\sum_{x \in X} f(x)K(x, z)| \le \varepsilon$;
(e3) for any $z \in X$ and $\varepsilon > 0$ there are a finite orthonormal system $u_1, \ldots, u_k$ in $\ell_2(X)$ and $\delta > 0$ such that the following condition is fulfilled: if a K-

Proof. For the reader's convenience, let us draw a scheme of the proof:

First assume $L \gg 0$ satisfies $K = L * L$ (see (a1)). This means that $L_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to \ell_2(X)$ factorizes $K$. It follows from Lemma 5.3 that $L_{\mathrm{op}}$ is closable (recall that $L$ is an $\ell_2$-kernel). So, (c1) holds. Let us now check that (c2) is implied by (c1). To this end, let $T : \ell_{\mathrm{fin}}(X) \to H$ be closable and factorize $K$, and let $\bar{T} = QB$ be the polar decomposition of $\bar{T}$. Then $B$ is a positive self-adjoint operator whose domain contains $\ell_{\mathrm{fin}}(X)$ and $Q$ is isometric on $R(B)$. So, $K(x, y) = \langle Te_x, Te_y \rangle_H = \langle QBe_x, QBe_y \rangle_H = \langle Be_x, Be_y \rangle_{\ell_2(X)}$. Moreover, since $\ell_{\mathrm{fin}}(X)$ is a core of $\bar{T}$, it is a core of $B$ as well; that is, (c2) holds. Now we will show that (a1) follows from (c2). So, let $B$ be as specified in (c2) and define $L : X \times X \to \mathbb{C}$ by $L(x, y) = \langle Be_x, e_y \rangle_{\ell_2(X)}$ $(x, y \in X)$. Since $B$ is positive, it is readily seen that $L \gg 0$.
Moreover, observe that $Be_x = \sum_{y \in X} L(x, y)e_y$ and therefore $L$ is an $\ell_2$-kernel and $\langle L(x, \cdot), L(y, \cdot) \rangle_{\ell_2(X)} = \langle Be_x, Be_y \rangle_{\ell_2(X)} = K(x, y)$, so (a1) is fulfilled. Of course, (a1) implies (a2). Conversely, if $L$ is as specified in (a2), then $L_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to \ell_2(X)$ is closable (by Lemma 5.3) and factorizes $K$, which shows (c1).
Further, if (c1) holds, then $K_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to H_K$ is closable as it factorizes $K$ (see Lemma 5.4 and its proof), which shows that (c1) implies (c3). The reverse implication is trivial. To show that (c3) is equivalent to (d), consider $T = K_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to H_K$ and recall that (since $T$ is densely defined) $T$ is closable iff $T^*$ is densely defined. Therefore it is sufficient to check that $D(T^*) = \ell_2(X) \cap H_K$. To this end, observe that $g \in H_K$ belongs to $D(T^*)$ and $T^*g = f \in \ell_2(X)$ iff $\langle Te_x, g \rangle_{H_K} = \overline{f(x)}$ for all $x \in X$. Equivalently, we need to have $f(x) = \langle g, K(x, \cdot) \rangle_{H_K} = g(x)$. So, $g \in D(T^*)$ iff $g \in \ell_2(X) \cap H_K$ (and then $T^*g = g$), which finishes this part of the proof.
Going further, for simplicity, we denote by $S \subset \ell_{\mathrm{fin}}(X)$ the set of all K-unit vectors, and by $T$ the operator $K_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to H_K$. Observe that for any $u \in \ell_{\mathrm{fin}}(X)$, $u \in S \iff \|Tu\| \le 1$ and $\sum_{x \in X} u(x)K(x, y) = \langle Tu, K(y, \cdot) \rangle_{H_K}$ (for any $y \in X$). Now if (c3) holds, it follows from Lemma 4.3 that (e3) is fulfilled. Indeed, noting that all sets of the form $\{f \in \ell_2(X) : |\langle f, u_j \rangle_{\ell_2(X)}| < \delta \ (j = 1, \ldots, k)\}$ (where $\delta > 0$ and $u_1, \ldots, u_k$ is a finite orthonormal system in $\ell_2(X)$) form a neighbourhood basis of 0 in the weak topology of $\ell_2(X)$ enables one to deduce (e3) from (c3) (we leave the details to the reader). Further, if (e3) holds, then (e2) holds as well (for in (e2) K-unit vectors are taken from the unit ball of $\ell_{\mathrm{fin}}(X)$, and on the unit ball of $\ell_2(X)$ a neighbourhood basis of 0 in the weak topology can be formed by sets of the same form in which all vectors $u_1, \ldots, u_k$ are taken from the canonical basis of $\ell_2(X)$).

Let us now give a more detailed proof that (e1) follows from (e2). To this end, assume (e2) holds and let a sequence $(\alpha_n)_{n=1}^{\infty}$ be as specified in (e1). Fix $z \in X$ and $\varepsilon > 0$. Choose a finite set $F \subset X$ guaranteed by (e2) (for these $z$ and $\varepsilon$). Denote by $E$ the linear span of all $e_x$ with $x \in F$ and write $\alpha_n = \beta_n + \gamma_n$ where $\beta_n \in E$ and $\gamma_n \perp E$. Then $\lim_{n \to \infty} \beta_n = \lim_{n \to \infty} \gamma_n = 0$. Since $E$ is finite-dimensional,

(5-5) $\lim_{n \to \infty} T\beta_n = 0$.

Therefore $\beta_n \in S$ for sufficiently large $n$. Consequently, $\frac{1}{2}\gamma_n = \frac{1}{2}(\alpha_n - \beta_n)$ eventually belongs to $S$ as well. Since $\gamma_n$ vanishes at each point of $F$, we infer from (e2) that $|\langle T\gamma_n, K(z, \cdot) \rangle_{H_K}| \le 2\varepsilon$ for sufficiently large $n$. This inequality combined with (5-5) yields $|\langle T\alpha_n, K(z, \cdot) \rangle_{H_K}| \le 3\varepsilon$ (for sufficiently large $n$) and hence (e1) is fulfilled.

Finally, assume (e1) holds. Our aim is to show that (c3) is fulfilled; that is, that $T$ (defined above) is closable. To this end, assume $(\alpha_n)_{n=1}^{\infty} \subset \ell_{\mathrm{fin}}(X)$ norm converges to 0 and $T\alpha_n \to \beta \in H_K$ $(n \to \infty)$. We need to check that $\beta = 0$. Without loss of generality, we may and do assume that $\|T\alpha_n\| \le 1$.
So, $\alpha_n \in S$ and it follows from (e1) that $\lim_{n \to \infty} \langle T\alpha_n, K(z, \cdot) \rangle_{H_K} = 0$ for any $z \in X$, from which it easily follows that $\beta = 0$.
It remains to check that (b) is equivalent to (a1). First assume (a1) holds and choose any pd $\ell_2$-kernel $L$ such that $K = L * L$. Define an equivalence relation "$\sim$" on $X$ as follows: $x \sim y$ if there are points $a_0, \ldots, a_N \in X$ (for some $N > 0$) such that $a_0 = x$, $a_N = y$ and $L(a_{j-1}, a_j) \ne 0$ for $j = 1, \ldots, N$. Observe that all equivalence classes $[x]_{\sim}$ are at most countable (because the set $\{x \in X : L(x, y) \ne 0\}$ is such for any $y \in X$). So, we can divide $X$ into pairwise disjoint sets $X_s$ such that $L(x, y) = 0$ for any $x \in X_s$ and $y \in X_{s'}$ with distinct $s, s' \in S$ (namely, the sets $X_s$ are the equivalence classes of $\sim$). For simplicity, set $L_s \stackrel{\mathrm{def}}{=} L \restriction X_s \times X_s$ and note that $L = \bigoplus_{s \in S} L_s$ and each of $L_s$ is a pd $\ell_2$-kernel. It is then easy to verify that $L * L = \bigoplus_{s \in S} (L_s * L_s)$ and therefore (b) holds. Conversely, if (b) is satisfied, then for each $s \in S$ we can choose a pd $\ell_2$-kernel $L_s$ on $X_s$ such that $K \restriction X_s \times X_s = L_s * L_s$. Then it suffices to set $L \stackrel{\mathrm{def}}{=} \bigoplus_{s \in S} L_s$ to get a pd $\ell_2$-kernel such that $K = L * L$.
In Proposition 5.9 below we will show that the operator B witnessing the above condition (c2) is unique.
In the next theorem we will make use of the following two results. The former is well known (see, e.g., Theorem 6 on page 37 in [17]) and it is likely that so is the latter, but we could not find it in the literature and thus we give its proof.

5.6. Lemma. For two pd kernels $K$ and $L$ on a common set and a constant $c \ge 0$ the following conditions are equivalent:
(i) $H_K \subset H_L$ and the identity operator from $H_K$ into $H_L$ has norm not exceeding $c$;
(ii) $K \ll c^2 L$.
Moreover, $H_K \subset H_L$ iff (ii) holds for some $c > 0$.

5.7. Lemma. Let $\{K_\sigma\}_{\sigma \in \Sigma}$ be an increasing net of pd kernels on $X$, that is:
• $(\Sigma, \le)$ is a directed set;
• $K_\sigma \ll K_\tau$ for any $\sigma, \tau \in \Sigma$ with $\sigma \le \tau$.
If $K : X \times X \to \mathbb{C}$ is the pointwise limit of this net (that is, if $K(x, y) = \lim_{\sigma \in \Sigma} K_\sigma(x, y)$ for all $x, y \in X$), then $\bigcup_{\sigma \in \Sigma} H_{K_\sigma}$ is a dense linear subspace of $H_K$.
Since the functions K(z, ·) (z ∈ X) form a total subset of H K , it follows from the above convergence and the boundedness of the net under consideration that K σ (x, ·) weakly converge to K(x, ·). Therefore K(x, ·) belongs to the weak closure of H which coincides with the norm closure of H. Consequently, H K = lin{K(x, ·) : x ∈ X} is contained in the (norm) closure of H and we are done.
As a consequence of Theorem 5.5, we obtain the following 5.8. Theorem. Let K be a pd kernel on X.
(I) If $K$ has a pdms root and $A$ is a non-empty subset of $X$, then $K \restriction A \times A$ has a pdms root as well.
(II) If $K = \bigoplus_{s \in S} K_s$, then $K$ has a pdms root iff each of $K_s$ has a pdms root.
(III) If $K$ has a pdms root and $u : X \to \mathbb{C}$ is a bounded function, then the kernel $L : X \times X \ni (x, y) \mapsto u(x)\overline{u(y)}K(x, y) \in \mathbb{C}$ has a pdms root as well. In particular, the kernel has a pdms root provided so has $K$.
(IV) If $K$ is bounded and has a pdms root, then for any $x \in X$, $K(x, \cdot)$ is a $c_0$-function; that is, for any $x \in X$ and $\varepsilon > 0$ there is a finite set $F \subset X$ such that $|K(x, y)| < \varepsilon$ for any $y \notin F$.
(V) If $K$ is an $\ell_2$-kernel, it has a pdms root. In particular, each pdms root of a pd kernel has a pdms root.
(VI) If $K = \sum_{s \in S} K_s$ where $\{K_s\}_{s \in S}$ is an arbitrary family of pd kernels having pdms roots, then $K$ also has a pdms root.
(VII) If $K$ is the pointwise limit of an increasing net of pd kernels (cf. the statement of Lemma 5.7) each of which has a pdms root, then $K$ also has a pdms root.
(VIII) If $L$ is a pd kernel on $X$ such that $c_1 K \ll L \ll c_2 K$ for some positive constants $c_1$ and $c_2$, then either both $K$ and $L$ have pdms roots or neither of them does.
(IX) There exists a unique pd kernel $K_0 \ll K$ that has a pdms root and is the greatest kernel with these properties; that is, if $L \ll K$ is a pd kernel that has a pdms root, then $L \ll K_0$.
(X) There exists a pd kernel $K_1 \ll K$ that has a unique pdms root and the following property: whenever $L \gg 0$ is such that $L \ll aK$ for some constant $a > 0$, then $L$ has a unique pdms root iff $L \ll bK_1$ for some constant $b > 0$.
Before giving a proof, we explain how to understand the sum appearing in item (VI) above and when this (generalized) series converges. The formula $K = \sum_{s \in S} K_s$ is understood pointwise: we only assume that $K(x, y) = \sum_{s \in S} K_s(x, y)$ for any $x, y \in X$. In particular,

(5-7) $\sum_{s \in S} K_s(x, x) < \infty$ $(x \in X)$

(all summands in (5-7) are non-negative and hence the series is well defined as a quantity in $[0, \infty]$). Conversely, if (5-7) is fulfilled, then $\sum_{s \in S} K_s(x, y)$ is absolutely convergent (for any $x, y \in X$):

where the first inequality above follows from the property that the matrix $\begin{pmatrix} K(x, x) & K(x, y) \\ K(y, x) & K(y, y) \end{pmatrix}$ is positive.
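A chain of inequalities of the following kind gives the absolute convergence claimed above. This is a reconstruction sketch, not a quotation of the displayed estimate: it applies the positivity of the $2 \times 2$ matrix to each $K_s$ and then the Cauchy-Schwarz inequality for sums over $S$.

```latex
% Step 1: each K_s is pd, so |K_s(x,y)|^2 <= K_s(x,x) K_s(y,y).
% Step 2: Cauchy-Schwarz for non-negative families indexed by S.
% Step 3: finiteness is exactly condition (5-7) at the points x and y.
\[
\sum_{s \in S} |K_s(x,y)|
  \;\le\; \sum_{s \in S} \sqrt{K_s(x,x)}\,\sqrt{K_s(y,y)}
  \;\le\; \Bigl( \sum_{s \in S} K_s(x,x) \Bigr)^{1/2}
          \Bigl( \sum_{s \in S} K_s(y,y) \Bigr)^{1/2}
  \;<\; \infty .
\]
```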
Proof of Theorem 5.8. Since the restriction of a closable operator is closable as well, item (I) immediately follows from condition (c1) in Theorem 5.5, whereas (II) is a consequence of (I) and of item (b) therein.
To prove that $L$ defined in (III) has a pdms root, we use condition (c1) of Theorem 5.5. Since $K$ has a pdms root, there is a closable operator $T : \ell_{\mathrm{fin}}(X) \to H$ that factorizes $K$. Let $S : \ell_{\mathrm{fin}}(X) \to \ell_{\mathrm{fin}}(X)$ be given by $Se_x \stackrel{\mathrm{def}}{=} u(x)e_x$. Then $S$ is bounded (since so is $u$) and hence the operator $TS : \ell_{\mathrm{fin}}(X) \to H$ is closable. Observe that $TS$ factorizes $L$ and thus $L$ has a pdms root (by (c1)). The claim about $K_{\mathrm{bd}}$ follows from the boundedness of the function $X \ni x \mapsto \max(K(x, x), 1)^{-1/2} \in (0, \infty)$.
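The observation that $TS$ factorizes $L$ can be checked in one line. The sketch below assumes that the kernel in (III) is $L(x, y) = u(x)\overline{u(y)}K(x, y)$ (with a conjugate on the second factor) and that the inner product of $H$ is linear in the first variable.

```latex
% Assumptions: S e_x = u(x) e_x, T factorizes K, <.,.>_H linear in slot 1.
\[
\langle TSe_x, TSe_y \rangle_H
  = \langle u(x)\,Te_x,\; u(y)\,Te_y \rangle_H
  = u(x)\overline{u(y)}\, \langle Te_x, Te_y \rangle_H
  = u(x)\overline{u(y)}\, K(x,y)
  = L(x,y) .
\]
% With the particular choice u(x) = max(K(x,x), 1)^{-1/2} the resulting
% kernel satisfies L(x,x) = K(x,x) / max(K(x,x), 1) <= 1, hence |L| <= 1.
```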
We turn to (IV). Assume $K$ is bounded and has a pdms root. It is sufficient to show that for any $x \in X$, $\lim_{n \to \infty} K(x, y_n) = 0$ for any one-to-one sequence $(y_n)_{n=1}^{\infty} \subset X$. To this end, take an upper bound $M \ge 1$ of $|K|$ and apply condition (e2) (in Theorem 5.5) to $z = x$ and $f = \frac{1}{M}e_{y_n}$: (e2) implies that $\lim_{n \to \infty} \frac{1}{M}K(y_n, x) = 0$. Consequently, $K(x, y_n) = \overline{K(y_n, x)}$ converges to 0 as $n$ tends to $\infty$.

Now assume $K$ is an $\ell_2$-kernel. Then $\operatorname{lin}\{K(x, \cdot) : x \in X\} \subset \ell_2(X) \cap H_K$ and hence $\ell_2(X) \cap H_K$ is dense in $H_K$. So, condition (d) of Theorem 5.5 shows that $K$ has a pdms root. Since each pdms root is an $\ell_2$-kernel, the whole conclusion of (V) follows.
To prove (VI) we apply condition (c1) of Theorem 5.5. So, for any $s \in S$ there exists a closable operator $T_s : \ell_{\mathrm{fin}}(X) \to H_s$ that factorizes $K_s$. Then, for any $x \in X$, $\sum_{s \in S} \|T_s e_x\|^2 = \sum_{s \in S} K_s(x, x) = K(x, x) < \infty$ and therefore $\bigoplus_{s \in S} T_s e_x \in \bigoplus_{s \in S} H_s$. In particular, for any $f \in \ell_{\mathrm{fin}}(X)$, $Tf \stackrel{\mathrm{def}}{=} \bigoplus_{s \in S} T_s f \in \bigoplus_{s \in S} H_s$. In this way we have defined a linear operator $T : \ell_{\mathrm{fin}}(X) \to H \stackrel{\mathrm{def}}{=} \bigoplus_{s \in S} H_s$. It is readily seen that $T$ is closable, as all $T_s$ are such. Moreover, for any $x, y \in X$ we have $\langle Te_x, Te_y \rangle_H = \sum_{s \in S} \langle T_s e_x, T_s e_y \rangle_{H_s} = \sum_{s \in S} K_s(x, y) = K(x, y)$, which shows that $T$ factorizes $K$. So, an application of (c1) from Theorem 5.5 completes the proof of (VI).
We turn to (VII). Let $\{K_\sigma\}_{\sigma \in \Sigma}$ be an increasing net (of pd kernels with pdms roots) whose pointwise limit is $K$. For simplicity, we set $H_\sigma \stackrel{\mathrm{def}}{=} H_{K_\sigma}$ $(\sigma \in \Sigma)$ and $H \stackrel{\mathrm{def}}{=} H_K$. It follows from Lemma 5.7 that $H_* \stackrel{\mathrm{def}}{=} \bigcup_{\sigma \in \Sigma} H_\sigma$ is a dense subspace of $H$. Moreover, Lemma 5.6 yields that the identity operator $I_\sigma : H_\sigma \to H$ is continuous, whereas condition (d) of Theorem 5.5 implies that $\ell_2(X) \cap H_\sigma$ is dense in $H_\sigma$. Consequently, $I_\sigma(\ell_2(X) \cap H_\sigma)$ is dense in $I_\sigma(H_\sigma)$ and therefore the closure of $\ell_2(X) \cap H$ (in $H$) contains $H_*$, which finishes the proof of (VII).
Property (VIII) immediately follows from Lemma 5.6 (which implies that under the assumption of (VIII), H K = H L and their topologies coincide) and condition (d) of Theorem 5.5 (since in that case ℓ 2 (X) ∩ H K = ℓ 2 (X) ∩ H L ).
To prove (IX), denote by $P$ the orthogonal projection from $H_K$ onto the closure $E$ (in $H_K$) of $\ell_2(X) \cap H_K$ and define $K_0$ as follows:

Then $K_0$ is a pd kernel such that $H_{K_0} = E$ and the inner product of $H_{K_0}$ coincides with the one on $E$ inherited from $H_K$ (consult, e.g., Theorem 5 on page 37 in [17]). In particular, $\ell_2(X) \cap H_{K_0}$ is dense in $H_{K_0}$ (and hence $K_0$ has a pdms root; see (d) in Theorem 5.5) and Lemma 5.6 implies that $K_0 \ll K$. Now assume $L \gg 0$ has a pdms root and satisfies $L \ll K$. Again:
• condition (d) of Theorem 5.5 yields that $\ell_2(X) \cap H_L$ is dense in $H_L$;
• Lemma 5.6 implies that $H_L \subset H_K$ and the identity operator $I : H_L \to H_K$ has norm not exceeding 1.
It follows from the former of the above properties that $I(\ell_2(X) \cap H_L)$ is dense in $I(H_L)$. Consequently,

Since $\|I\| \le 1$, we obtain $L \ll K_0$ (once more by Lemma 5.6). The maximality of $K_0$ (just proved) implies the uniqueness of $K_0$.
Finally, we turn to (X). We will use here Theorem 1.3 (which has not been proved yet). A careful reader will verify that the proof of that theorem presented below is independent of this part of the present result. Equip the vector space $H \stackrel{\mathrm{def}}{=} \ell_2(X) \cap H_K$ with the inner product

It is a kind of folklore that the above $H$ is a Hilbert space on which all the evaluation functionals (that is, all functions of the form $u \mapsto u(x)$ where $x \in X$) are continuous (actually both these properties are easy to prove). This means that $H$ has a reproducing kernel, say $K'$. Since then $H_{K'} = H \subset H_K$, it follows from Lemma 5.6 that $K' \ll cK$ for some constant $c > 0$. We define $K_1$ as $\frac{1}{c}K'$. Observe that $K_1 \ll K$ and $H_{K_1} = H \subset \ell_2(X) = H_{\delta_X}$. Another application of Lemma 5.6 yields that $K_1 \ll c'\delta_X$ for some constant $c' > 0$. So, Theorem 1.3 implies that $K_1$ has a unique pdms root. Now let $L \gg 0$ be such that

(5-8) $L \ll aK$ for some constant $a > 0$

and $L$ has a unique pdms root. Then, again by Theorem 1.3, $L \ll a'\delta_X$ for some constant $a' > 0$. The last property combined with Lemma 5.6 gives $H_L \subset H_{\delta_X} = \ell_2(X)$, whereas, similarly, (5-8)

The next result gives a description of all pdms roots of a pd kernel (that has such a root).

5.9. Proposition. Let $K$ be a pd kernel that has a pdms root.
(I) There is a unique positive self-adjoint operator $B$ in $\ell_2(X)$ that factorizes $K$ and has $\ell_{\mathrm{fin}}(X)$ as a core.
(II) If $B$ is as specified in (I), then there is a one-to-one correspondence $\kappa : \mathcal{V} \to \mathcal{R}$ between the set $\mathcal{V}$ of all linear isometries $V : R(B) \to \ell_2(X)$ such that the operator $VB$ is positive, and the set $\mathcal{R}$ of all pdms roots of $K$; $\kappa$ is given by the rule:

(5-9) $\kappa(V)(x, y) = \langle VBe_x, e_y \rangle_{\ell_2(X)}$ $(x, y \in X)$.

Proof. The existence of the operator $B$ with all properties (apart from the uniqueness) specified in (I) follows from condition (c2) in Theorem 5.5. We fix it and after proving (II) we will show its uniqueness. We turn to (II). Fix for a moment $V \in \mathcal{V}$. It is easy to check that $\kappa(V)$ given by (5-9) is a pd kernel (because $VB$ is positive). Moreover, we have $\kappa(V)(x, \cdot) = VBe_x$ and hence $\kappa(V)$ is an $\ell_2$-kernel such that $(\kappa(V) * \kappa(V))(x, y) = \langle VBe_x, VBe_y \rangle_{\ell_2(X)} = \langle Be_x, Be_y \rangle_{\ell_2(X)} = K(x, y)$. So, $\kappa(V) \in \mathcal{R}$.
Since members of $\mathcal{V}$ are defined (only) on $R(B)$ and $\ell_{\mathrm{fin}}(X)$ is a core of $B$, we readily conclude that $\kappa$ is one-to-one. So, to end the proof of (II), it remains to check the surjectivity of $\kappa$. To this end, let $L$ be a pdms root of $K$. We consider $L_{\mathrm{op}}$ with target space $\ell_2(X)$. Observe that for any $x, y \in X$, $\langle L_{\mathrm{op}} e_x, L_{\mathrm{op}} e_y \rangle_{\ell_2(X)} = K(x, y) = \langle Be_x, Be_y \rangle_{\ell_2(X)}$. This equation implies that there is a linear isometry $V : B(\ell_{\mathrm{fin}}(X)) \to \ell_2(X)$ such that $L_{\mathrm{op}} f = V(Bf)$ for any $f \in \ell_{\mathrm{fin}}(X)$. Since $\ell_{\mathrm{fin}}(X)$ is a core for $B$, we get that $D(V) = R(B)$ and $VB$ is positive (as it is positive on $\ell_{\mathrm{fin}}(X)$). Consequently, $V \in \mathcal{V}$ and $(\kappa(V))(x, y) = \langle VBe_x, e_y \rangle_{\ell_2(X)} = \langle L_{\mathrm{op}} e_x, e_y \rangle_{\ell_2(X)} = L(x, y)$ $(x, y \in X)$.
Having (II), we can briefly validate the uniqueness of $B$. Assume $A$ is a positive self-adjoint operator in $\ell_2(X)$ that factorizes $K$ and has $\ell_{\mathrm{fin}}(X)$ as a core. Then $\langle Ae_x, Ae_y \rangle_{\ell_2(X)} = K(x, y) = \langle Be_x, Be_y \rangle_{\ell_2(X)}$ for any $x, y \in X$ and it follows from the previous paragraph that $A \restriction \ell_{\mathrm{fin}}(X) = VB \restriction \ell_{\mathrm{fin}}(X)$ for some $V \in \mathcal{V}$. Since $V$ is isometric and $\ell_{\mathrm{fin}}(X)$ is a core for both $A$ and $B$, we obtain $A = VB$. Hence $N(A) = N(B)$. Extend $V$ to the partial isometry $Q$ such that $N(Q) = N(B)$. Then also $A = QB$ and it follows from the uniqueness of the polar decomposition that $A = B$.
As an immediate consequence we obtain the following result (we skip its proof).

Corollary.
For any pd kernel that has a pdms root there exists a unique pdms root $K$ such that the closure of $K_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to \ell_2(X)$ is self-adjoint.

This is a good moment to give
Proof of Theorem 1.3. Everywhere in this proof $K_{\mathrm{op}}$ is considered as an operator with target space $H_K$. We start by showing that condition (iii) of the theorem is equivalent to:
(bd) $K_{\mathrm{op}} : \ell_{\mathrm{fin}}(X) \to H_K$ is bounded.
Indeed, (iii) says that for some $c > 0$ we have

(5-10) $\sum_{x,y \in X} u(x)\overline{u(y)}K(x, y) \le c \sum_{x \in X} |u(x)|^2$

for any $u \in \ell_{\mathrm{fin}}(X)$, which is equivalent to (bd), as the left-hand side of (5-10) coincides with $\|K_{\mathrm{op}} u\|^2$. Now we turn to the main part of the proof.
Assume (iii) holds. Then the absolute value $B$ of the closure of $K_{\mathrm{op}}$ is positive and bounded (by (bd)). Observe that $\langle Be_x, Be_y \rangle_{\ell_2(X)} = \langle K_{\mathrm{op}} e_x, K_{\mathrm{op}} e_y \rangle_{H_K} = K(x, y)$ and thus (e.g., by condition (c1) in Theorem 5.5) $K$ has a pdms root. Moreover, the above $B$ witnesses property (I) in Proposition 5.9. So, it follows from item (II) of that proposition that all other possible pdms roots are in one-to-one correspondence with linear isometries $V : R(B) \to \ell_2(X)$ such that $VB$ is positive. But if $V$ is such an isometry, then $VB$ is self-adjoint (being positive and bounded) and it follows from the uniqueness of $B$ that $VB = B$ or, equivalently, that $V$ is the identity. This shows (i); that is, $K$ has a unique pdms root.
To prove the reverse implication, we assume that (iii) is false and we will show that (i) is false as well. To this end, assume $K$ has a pdms root (otherwise (i) does not hold). Let $B$ witness property (I) in Proposition 5.9. We claim that $B$ is not bounded. Indeed, since (iii) does not hold, (bd) is false. And we infer from the proof of Lemma 5.4 that $K_{\mathrm{op}} = VB \restriction \ell_{\mathrm{fin}}(X)$ for some linear isometry $V : B(\ell_{\mathrm{fin}}(X)) \to H_K$. So, $B$ is not bounded, as $K_{\mathrm{op}}$ is not. Now it follows from Theorem 4.1 that there exists a positive (densely defined) operator $T$ in $\ell_2(X)$ such that $|T| = B \ne T$. Let $T = QB$ be the polar decomposition of $T$. Then $Q \restriction R(B)$ is isometric and differs from the identity map. So, there are at least two pdms roots of $K$ thanks to item (II) of Proposition 5.9.
Further, observe that it follows from the equivalence of (i) and (iii) (which we have already proved) that (ii) is implied by (i). So, it remains to show that if (iii) does not hold, then (ii) is false. To this end, assume there is no c > 0 for which $K \ll c\delta_X$. Equivalently, $H_K \not\subset H_{\delta_X} = \ell^2(X)$ (cf. Lemma 5.6). So, there exists a unit vector $u \in H_K$ such that $u \notin \ell^2(X)$. Then $L \stackrel{\mathrm{def}}{=} \bar{u} \otimes u$ is a pd kernel such that $L \ll K$ (since u is a unit vector in both $H_K$ and $H_L$; see, e.g., Corollary 2 on page 45 in [17]) and $H_L = \mathrm{lin}\{u\}$. In particular, $\ell^2(X) \cap H_L = \{0\}$ and hence L does not have a pdms root (thanks to condition (d) of Theorem 5.5).

5.11. Remark. An inspection of the proofs of Theorems 4.1 and 1.3, combined with Proposition 2.1, shows that if a pd kernel has at least two pdms roots, then it actually has uncountably many such roots. We leave the details to interested readers.

5.12. Corollary. A pd kernel K on X has a unique pdms root iff $H_K \subset \ell^2(X)$.

Proof. The assertion immediately follows from Theorem 1.3 and Lemma 5.6.

5.13. Corollary. Let K be a pd kernel on X that has a unique pdms root. For any pd kernel L on X, L has a pdms root iff so has K + L.

Proof. The proof of Theorem 1.3 shows that T
Thus, the conclusion follows from condition (c1) of Theorem 5.5 and Lemma 5.4.

5.14. Example. The simplest possible pd kernels on a set X are of the form $\bar{u} \otimes u$ (see (5-11)), where $u : X \to \mathbb{C}$ is totally arbitrary. (The simplicity of these kernels can be justified as follows: they are precisely those pd kernels K for which $\dim(H_K) \le 1$.) Since $H_{\bar{u} \otimes u} = \mathrm{lin}\{u\}$, we either have $H_{\bar{u} \otimes u} \subset \ell^2(X)$ or $\ell^2(X) \cap H_{\bar{u} \otimes u} = \{0\}$. So, condition (d) of Theorem 5.5 and Corollary 5.12 imply that for a function $v : X \to \mathbb{C}$ the following conditions are equivalent:
(i) $\bar{v} \otimes v$ has a pdms root;
(ii) $\bar{v} \otimes v$ has a unique pdms root;
(iii) $v \in \ell^2(X)$.
In Example 5.22 we will use the above characterization to give a (counter)example witnessing that the uniform limit of bounded pd kernels having a unique pdms root can have no pdms roots.
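The implication from (iii) to (ii) in Example 5.14 can also be checked by hand. The following is only a sketch, under two assumptions that are not spelled out in the surviving text: that $(\bar{v} \otimes v)(x, y) = \overline{v(x)}\,v(y)$, and that $K \ll c\delta_X$ means that the quadratic form of K on $\ell_{\mathrm{fin}}(X)$ is bounded by c times the squared $\ell^2$-norm. For $v \in \ell^2(X)$ and $\alpha \in \ell_{\mathrm{fin}}(X)$ the Schwarz inequality gives:

```latex
\sum_{x, y \in X} \overline{v(x)}\, v(y)\, \alpha(x)\, \overline{\alpha(y)}
  = \Bigl|\sum_{x \in X} \overline{v(x)}\, \alpha(x)\Bigr|^{2}
  \le \|v\|_{\ell^2(X)}^{2} \sum_{x \in X} |\alpha(x)|^{2},
\qquad\text{that is,}\qquad
\bar{v} \otimes v \ll \|v\|_{\ell^2(X)}^{2}\, \delta_X .
```

With the opposite conjugation convention the same bound holds after replacing $\alpha$ by $\bar{\alpha}$, so the conclusion does not depend on this choice.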
To simplify further statements, we introduce a few additional notions: • pointwise countable if for any x ∈ X the set $\{y \in X : K(x, y) \neq 0\}$ is (at most) countable.
Our nearest aim is to characterize those kernels that admit a non-vanishing rescaling having a pdms root. The following result will serve as a useful tool in investigating this issue. Everywhere below N denotes the set of all positive integers.

5.16. Lemma. Let K be a pd kernel on N and D be the diagonal pd kernel such that $D(n, n) = n^2$ for any n.
Although properties (I) and (II) are consequences of well-known results on Schatten class and Hilbert-Schmidt operators (consult, e.g., [18] or the material of §18 in [7]), below we present their brief proofs.
To show (I), for any $\alpha \in \ell_{\mathrm{fin}}(\mathbb{N})$ the Schwarz inequality yields (cf. the paragraph following Theorem 5.8). We turn to (III). For simplicity, denote the u-rescaling of a pd kernel L by $L_u$. In particular, Since |K| is bounded by c and $\sum_{n=1}^{\infty} n^{-2} = \frac{\pi^2}{6}$, we easily get that $\sum_{n=1}^{\infty} K_u(n, n) \le 1$. So, we conclude from (II) that $K_u \ll \delta_{\mathbb{N}}$. Consequently, $K = (K_u)_{1/u} \ll (\delta_{\mathbb{N}})_{1/u} = \frac{c\pi^2}{6} D$ and we are done.
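The arithmetic behind (III) can be made explicit. The block below is a sketch under two assumptions that are not visible in the text above: that the weight is $u(n) = \frac{1}{n}\sqrt{6/(c\pi^2)}$ (the choice forced by the final identity $(\delta_{\mathbb{N}})_{1/u} = \frac{c\pi^2}{6} D$), and that for a positive weight the u-rescaling is $L_u(n, m) = u(n)u(m)L(n, m)$.

```latex
\sum_{n=1}^{\infty} K_u(n, n)
  = \sum_{n=1}^{\infty} u(n)^{2} K(n, n)
  \le \frac{6}{c\pi^{2}} \sum_{n=1}^{\infty} \frac{c}{n^{2}}
  = \frac{6}{\pi^{2}} \cdot \frac{\pi^{2}}{6} = 1,
\qquad
(\delta_{\mathbb{N}})_{1/u}(n, n) = \frac{1}{u(n)^{2}}
  = \frac{c\pi^{2}}{6}\, n^{2} = \frac{c\pi^{2}}{6}\, D(n, n).
```

The first chain uses only $K(n, n) \le c$; the second shows why the diagonal kernel D with $D(n, n) = n^2$ is the natural dominating kernel for bounded pd kernels on N.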
Now we can characterize pd kernels admitting non-vanishing rescalings having pdms roots.

5.17. Theorem. For a pd kernel K on X the following conditions are equivalent:
(i) K has a non-vanishing rescaling that has a pdms root;
(ii) the v-rescaling of K has a unique pdms root for some v : X → (0, 1);
(iii) K ≪ D for some diagonal pd kernel D on X;
(iv) K ≪ C for some pointwise countable pd kernel C on X;
(v) K is pointwise countable.
In particular, • each pd kernel on a countable set satisfies conditions (ii) and (iii); • the collection C of all pd kernels K on X that satisfy (i) is a convex cone such that L ∈ C whenever 0 ≪ L ≪ K for some K ∈ C or L is the pointwise product of two members of C .
Proof. The additional claim of the theorem easily follows from the equivalence of (i), (iii) and (v): recall that, according to Schur's theorem [19], the pointwise product (usually called the Hadamard product) of two pd kernels defined on a common set is a pd kernel as well; see also Section 3 of Chapter I in [9] (consult the material on page 9 therein). So, we only need to show the equivalence of all conditions (i)-(v). As in the previous proof, we denote, for simplicity, the u-rescaling of a pd kernel L by $L_u$. First we will show the implications (iii) ⟹ (ii) ⟹ (i) ⟹ (v) ⟹ (iii) and then we will briefly deduce the equivalence of (iv) and (iii).

Assume (iii) holds and let D be as specified therein. Let v : X → (0, 1) be given by $v(x) \stackrel{\mathrm{def}}{=} 1/\max(D(x, x), 2)$ (x ∈ X). Since K ≪ D, we get $K_v \ll D_v$ and easily $D_v \ll \delta_X$. Hence $K_v \ll \delta_X$ and it follows from Theorem 1.3 that $K_v$ (or v) witnesses (ii). Now observe that (i) obviously follows from (ii), and that (i) implies (v) by condition (b) of Theorem 5.5.

So, we now assume that (v) holds and will show that (iii) is fulfilled. First of all, observe that K, being pointwise countable, induces a decomposition $X = \bigcup_{s \in S} X_s$ into at most countable (pairwise disjoint) sets $X_s$ such that $K = \bigoplus_{s \in S} K^{(s)}$ where $K^{(s)} = K \restriction X_s \times X_s$ (s ∈ S) (cf. the proof that (b) follows from (a1) in Theorem 5.5). Since $X_s$ is countable, there exists a function $u_s : X_s \to (0, 1)$ such that $\sum_{x \in X_s} u_s(x)^2 K^{(s)}(x, x) \le 1$. We infer from Lemma 5.16 that $(K^{(s)})_{u_s} \ll \delta_{X_s}$. Consequently, $\bigoplus_{s \in S} (K^{(s)})_{u_s} \ll \delta_X$. Now it suffices to define u : X → (0, 1) as the union of all $u_s$ (that is, $u \restriction X_s = u_s$) and D as $(\delta_X)_{1/u}$ to get $K = (K_u)_{1/u} = \bigl(\bigoplus_{s \in S} (K^{(s)})_{u_s}\bigr)_{1/u} \ll (\delta_X)_{1/u} = D$.

It remains to explain why (iv) is equivalent to (iii). Since diagonal pd kernels are pointwise countable, (iv) follows from (iii).
Conversely, if (iv) holds and C is as specified therein, then we know that there exists a diagonal pd kernel D such that C ≪ D (as C satisfies (v)). Then clearly K ≪ D and we are done.
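The step from (v) to (iii) can be illustrated numerically on a finite set, where every sum is finite. The sketch below is not the authors' construction; it assumes that $K \ll L$ means that $L - K$ is positive semidefinite and that the u-rescaling of K for a positive weight u is $K_u(x, y) = u(x)u(y)K(x, y)$. The helper names `rescale` and `dominated_by`, the random test kernel and the particular weight are all hypothetical choices made for the illustration.

```python
import numpy as np

def rescale(kernel, weight):
    """u-rescaling for a positive weight u: K_u(x, y) = u(x) * u(y) * K(x, y)."""
    return kernel * np.outer(weight, weight)

def dominated_by(small, big, tol=1e-9):
    """Check small << big, read here as: big - small is positive semidefinite."""
    diff = big - small
    eigenvalues = np.linalg.eigvalsh((diff + diff.T) / 2)
    return bool(eigenvalues.min() >= -tol * max(1.0, np.abs(diff).max()))

rng = np.random.default_rng(0)
N = 6
G = rng.standard_normal((N, N))
K = 50.0 * (G @ G.T)          # a real pd kernel on {0, ..., N-1} with large entries

# Weight u with values in (0, 1) chosen so that sum_x u(x)^2 K(x, x) <= 1.
u = 1.0 / np.sqrt(N * np.maximum(np.diag(K), 1.0))
K_u = rescale(K, u)

# Diagonal kernel D = (delta_X)_{1/u}, i.e. D(x, x) = 1 / u(x)^2.
D = np.diag(1.0 / u**2)

trace_ok = bool(np.trace(K_u) <= 1.0 + 1e-12)   # trace bound used in the proof
step_lemma = dominated_by(K_u, np.eye(N))       # K_u << delta_X
step_final = dominated_by(K, D)                 # K << D
```

Since the trace of a positive semidefinite matrix bounds its largest eigenvalue, the trace condition gives $K_u \ll \delta_X$, and conjugating by the diagonal matrix with entries $1/u(x)$ turns this into $K \ll D$.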
Property (VI) of Theorem 5.8 implies that if K and L are pd kernels on X that have pdms roots, then the kernel K + L has a pdms root as well. Conversely, a basic consequence of Theorem 1.3 is that if K + L has a unique pdms root, so have both K and L. It turns out that a counterpart of this property for pd kernels having at least two pdms roots is (in a sense always; see Proposition 5.18 below) false, and when the kernel acts on a countable set, the property mentioned above fails in a striking way (see item (A) in Proposition 5.18). To be more precise, we call a pd kernel K free of pdms roots if no non-zero pd kernel L ≪ K has a pdms root. With the aid of the results of Section 3, we now show the following.

5.18. Proposition. Let K be a pd kernel on X that has at least two pdms roots.
(A) If X is countable, K is a sum of two pd kernels each of which is free of pdms roots.
(B) K is a sum of two pd kernels none of which has a pdms root.
In the proof we shall make use of the following

5.19. Lemma. A pd kernel K on X is free of pdms roots iff $\ell^2(X) \cap H_K = \{0\}$.
Proof. Let $K_0$ be as specified in property (IX) listed in Theorem 5.8. Observe that K is free of pdms roots iff $K_0 \equiv 0$. An inspection of the proof of the aforementioned item (IX) shows that $H_{K_0}$ coincides with the closure in $H_K$ of $\ell^2(X) \cap H_K$, from which the conclusion follows.
Proof of Proposition 5.18. We start from (A). It follows from the countability of X that $H_K$ is separable. Since K has a pdms root, $R \stackrel{\mathrm{def}}{=} \ell^2(X) \cap H_K$ is dense in $H_K$ (by Theorem 5.5). Moreover, $R \neq H_K$, because K has more than one pdms root (see Corollary 5.12). Note that R is an operator range in $H_K$ (the proof of (X) in Theorem 5.8 shows that R admits a Hilbert space norm stronger than the norm induced from $H_K$). So, condition (g) of Theorem 3.3 implies that there is a closed linear subspace W of $H_K$ such that $W \cap R = W^{\perp} \cap R = \{0\}$. Equivalently,

(5-13) $\ell^2(X) \cap W = \ell^2(X) \cap W^{\perp} = \{0\}$.

Denoting by P and Q the orthogonal projections in $H_K$ onto, respectively, W and $W^{\perp}$, we define pd kernels $K_1$ and $K_2$ on X by $K_1(x, y) \stackrel{\mathrm{def}}{=} \langle PK(x, \cdot), PK(y, \cdot)\rangle_{H_K}$ and $K_2(x, y) \stackrel{\mathrm{def}}{=} \langle QK(x, \cdot), QK(y, \cdot)\rangle_{H_K}$. Then $H_{K_1} = W$ and $H_{K_2} = W^{\perp}$ (see Theorem 5 on page 37 in [17]) and therefore, thanks to (5-13) and Lemma 5.19, both $K_1$ and $K_2$ are free of pdms roots. But $K_1 + K_2 = K$ (since P + Q = I) and we are done.

Now we pass to the general case (item (B)). We may and do assume X is uncountable. We will show that X contains a countable subset A such that $K = (K \restriction A \times A) \oplus (K \restriction (X \setminus A) \times (X \setminus A))$ and $K \restriction A \times A$ has at least two pdms roots. Assume for a moment that we have already found such a set A. It then follows from (A) that $K \restriction A \times A = L_1 + L_2$ where $L_1$ and $L_2$ are pd kernels on A without pdms roots. We define pd kernels $K_1$ and $K_2$ on X by: $K_1 = L_1 \oplus (K \restriction (X \setminus A) \times (X \setminus A))$ and $K_2 = L_2 \oplus 0$ (where 0 here means the zero function on $(X \setminus A) \times (X \setminus A)$). It is easily seen that both $K_1$ and $K_2$ are pd kernels such that $K = K_1 + K_2$. Moreover, it follows from property (I) in Theorem 5.8 that neither $K_1$ nor $K_2$ has a pdms root. (Indeed, if $K_j$ had a pdms root, so would $K_j \restriction A \times A = L_j$, which is impossible.) Hence, it remains to construct the set A with appropriate properties. We will do this with the aid of condition (b) of Theorem 5.5.
It follows from the aforementioned result that $K = \bigoplus_{s \in S} K_s$ where for any s ∈ S, $K_s$ is a pd kernel on a countable set $X_s$ that has a pdms root. If one of the kernels $K_s$ has at least two pdms roots, we just set $A = X_s$ and finish the proof. So, we assume that $K_s$ has a unique pdms root for any s ∈ S. It follows from Theorem 1.3 that

(5-14) $K_s \ll c_s \delta_{X_s}$

for some constant $c_s \ge 0$. We take $c_s$ to be the smallest possible non-negative number witnessing (5-14). We claim that

(5-15) $\sup_{s \in S} c_s = \infty$.

Indeed, if $c \stackrel{\mathrm{def}}{=} \sup_{s \in S} c_s$ was finite, then we would have $K_s \ll c\delta_{X_s}$ for all s ∈ S and thus also $K \ll c\delta_X$, which would mean that K would have a unique pdms root (by Theorem 1.3). So, (5-15) holds and therefore there exists a sequence $s_1, s_2, \ldots \in S$ for which $\lim_{n \to \infty} c_{s_n} = \infty$. We set $A \stackrel{\mathrm{def}}{=} \bigcup_{n=1}^{\infty} X_{s_n}$. It follows from property (I) in Theorem 5.8 that $K \restriction A \times A$ has a pdms root. Further, since the numbers $c_s$ have been chosen optimal, there is no c > 0 such that $K \restriction A \times A \ll c\delta_A$. Consequently, $K \restriction A \times A$ has more than one pdms root (by Theorem 1.3) and we are done.
Since any pd kernel defined on a finite set has a (unique) pdms root, condition (b) of Theorem 5.5 completely reduces the study of the class of pd kernels having pdms roots to pd kernels on infinite countable sets. So, pd kernels on N (or Z) are of special interest. Below we give a single positive example of the existence of pdms roots and a list of counterexamples to related subjects. Everywhere below $\ell^2 = \ell^2(\mathbb{N})$ (that is, indices of sequences in $\ell^2$ start from 1), $\ell_{\mathrm{fin}} = \ell_{\mathrm{fin}}(\mathbb{N})$ and $e_1, e_2, \ldots$ is the canonical basis of $\ell^2$.

5.20. Example. Let $S \in B(\ell^2)$ be the standard unilateral shift (that is, S is a linear isometry such that $Se_n = e_{n+1}$) and $f \in \ell^2$ be arbitrary. Then the pd kernel K on N given by:

(5-16) $K(n, m) = \langle S^{n-1} f, S^{m-1} f\rangle_{\ell^2}$ (n, m ∈ N)

has a pdms root. To prove this, it is sufficient to find a closable operator that factorizes K (see item (c1) in Theorem 5.5). To this end, we model the shift S on the Hardy space $H^2$ of holomorphic functions on the disc $\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\}$.
Recall that:
($H^2$a) a holomorphic function $u : \mathbb{D} \ni z \mapsto \sum_{k=0}^{\infty} a_k z^k \in \mathbb{C}$ belongs to $H^2$ if $\sum_{k=0}^{\infty} |a_k|^2 < \infty$;
($H^2$b) the monomials $1, z, z^2, \ldots$ form an orthonormal basis of $H^2$ (we use here a standard simplified notation; $z^k$ is in fact the function $z \mapsto z^k$ restricted to $\mathbb{D}$);
($H^2$c) for any $z \in \mathbb{D}$, the evaluation functional $H^2 \ni u \mapsto u(z) \in \mathbb{C}$ is continuous (the reproducing kernel of $H^2$ has the form $(z, w) \mapsto \frac{1}{1 - z\bar{w}}$);
($H^2$d) the operator $M : H^2 \to H^2$ given by $(Mu)(z) = zu(z)$ ($z \in \mathbb{D}$, $u \in H^2$) is unitarily equivalent to S; more precisely, if $V : \ell^2 \to H^2$ is a unitary operator such that $Ve_n = z^{n-1}$ (n > 0), then $V^{-1}MV = S$.

Now for $f \in \ell^2$ set $F \stackrel{\mathrm{def}}{=} Vf \in H^2$ and define $T : \mathrm{lin}\{z^k : k \ge 0\} \to H^2$ as the multiplication operator by F (that is, $(Tp)(z) = F(z)p(z)$ for $p \in D(T)$ and $z \in \mathbb{D}$). It easily follows from the above property ($H^2$c) that T is closable. Therefore $V^{-1}TV : \ell_{\mathrm{fin}} \to \ell^2$ is closable as well. Note that $S^k = V^{-1}M^kV$ and hence $S^k f = V^{-1}M^k F = V^{-1}Tz^k = V^{-1}TVe_{k+1}$. Consequently, $V^{-1}TV$ factorizes K and we are done.
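For a finitely supported f the factorization in Example 5.20 reduces to finite matrices and can be checked directly. The following sketch assumes that the inner product is linear in the first variable; the helper names `shift_power` and `multiplication_matrix` and the sample vector `f` are hypothetical and serve only the illustration. In coordinates, multiplication by F sends the monomial $z^k$ to the coefficient sequence of F shifted by k places, which is exactly $S^k f$.

```python
import numpy as np

def shift_power(f, k, size):
    """Coordinates of S^k f in C^size, where S is the unilateral shift."""
    out = np.zeros(size, dtype=complex)
    out[k:k + len(f)] = f
    return out

def multiplication_matrix(f, n_cols):
    """Matrix of p -> F * p on polynomials of degree < n_cols.

    F is the polynomial whose Taylor coefficients are the entries of f,
    so column k holds the coefficients of z^k * F(z).
    """
    size = n_cols + len(f) - 1
    T = np.zeros((size, n_cols), dtype=complex)
    for k in range(n_cols):
        T[k:k + len(f), k] = f
    return T

f = np.array([1.0, -0.5, 0.25j, 2.0])   # a finitely supported element of l^2
n = 5                                   # number of kernel indices considered
T = multiplication_matrix(f, n)
size = T.shape[0]

# Vectors f, S f, ..., S^{n-1} f and the kernel K(a+1, b+1) = <S^a f, S^b f>.
vectors = [shift_power(f, k, size) for k in range(n)]
K = np.array([[np.vdot(vectors[b], vectors[a]) for b in range(n)]
              for a in range(n)])

# The same kernel obtained from the factorizing operator T.
K_from_T = (T.conj().T @ T).T
```

Because S is an isometry, the resulting matrix depends only on the difference of the indices, so every kernel of the form (5-16) is of Toeplitz type.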

5.21. Example. Similarly to the notion of an $\ell^2$-kernel, let us call a kernel K on N a $c_0$-kernel if $\lim_{n \to \infty} (|K(p, n)| + |K(n, p)|) = 0$ for all p ∈ N. Item (IV) in Theorem 5.8 says that each bounded pd kernel on N that has a pdms root is a $c_0$-kernel. On the other hand, pd $\ell^2$-kernels always have pdms roots. So, two natural (contrary) questions arise:
(Question A) Does every bounded pd $c_0$-kernel on N have a pdms root?
(Question B) Is every bounded pd kernel that has a pdms root an $\ell^2$-kernel?
In this example we answer these two questions in the negative. Both the kernels constructed below will be constant on the diagonal {(n, n) : n ∈ N}.

(A) In this part we construct a pd kernel K on N that has a pdms root and satisfies: • K(n, n) = 1 for all n ∈ N (and thus K is bounded); • $K(1, \cdot) \notin \bigcup_{p>0} \ell^p$. Define a sequence $f = (a_1, a_2, \ldots)$ of positive real numbers as follows:

$a_n = \sqrt{\frac{1}{H_n} - \frac{1}{H_{n+1}}} = \frac{1}{\sqrt{(n+1) H_n H_{n+1}}}$

where $H_n \stackrel{\mathrm{def}}{=} \sum_{k=1}^{n} \frac{1}{k}$. The second formula for $a_n$ shows that the sequence f is monotone decreasing, whereas the first implies that the series $\sum_{k=1}^{\infty} a_k^2$ is telescoping and its sum equals $\frac{1}{H_1} = 1$. So, $f \in \ell^2$. Let $S \in B(\ell^2)$ be the shift as specified in Example 5.20. That example shows that the pd kernel K given by (5-16) has a pdms root. We now check that K has all announced properties. It is easily seen that $K(n, n) = \|f\|^2 = 1$. Finally, for any n > 0 we have (recall that f is monotone decreasing):

$K(1, n) = \sum_{k=1}^{\infty} a_k a_{k+n-1} \ge \sum_{k=n}^{\infty} a_k^2 = \frac{1}{H_n}$.

Since $\lim_{n \to \infty} \frac{\log n}{H_n} = 1$, we get that $\sum_{n=1}^{\infty} \frac{1}{H_n^p} = \infty$ for any p > 0 and therefore $K(1, \cdot) \notin \bigcup_{p>0} \ell^p$.

(B) In contrast to the example given in (A), now we construct a pd kernel K : N × N → [0, 1] such that $K(m, \cdot) \in \bigcap_{p>2} \ell^p$ for any m ∈ N, but K has no pdms roots. To this end, we take any sequence $u \in (\bigcap_{p>2} \ell^p) \setminus \ell^2$ all of whose entries lie in [0, 1] and define $K_0$ as u ⊗ u (see (5-11)). It follows from Example 5.14 that $K_0$ has no pdms roots. However, $K_0(m, \cdot) = u(m)u \in \bigcap_{p>2} \ell^p$. Now let D be the diagonal pd kernel on N such that $D(n, n) = 1 - u(n)^2$. Since $D \ll \delta_{\mathbb{N}}$, we conclude that D has a unique pdms root. So, it follows from Corollary 5.13 that $K \stackrel{\mathrm{def}}{=} K_0 + D$ has no pdms roots. This kernel satisfies K(n, n) = 1 for all n ∈ N and $K(m, \cdot) \in \bigcap_{p>2} \ell^p$ for any m > 0.
The above examples suggest that there is no handy description of bounded pd kernels that have a pdms root.
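The slow decay of the first row of the kernel from part (A) of Example 5.21 can be explored numerically. The sketch below assumes that the sequence is $a_n = 1/\sqrt{(n+1)H_nH_{n+1}}$, which is the choice consistent with the telescoping sum $\sum a_k^2 = 1/H_1$ described in that example; the truncation levels `N` and `n_max` are arbitrary. All infinite sums are replaced by partial sums, so the comparison is made against the correspondingly truncated lower bound.

```python
import numpy as np

N = 20000      # truncation level for the infinite sums
n_max = 50     # largest index n for which K(1, n) is estimated

j = np.arange(1, N + n_max + 2)             # j = 1, ..., N + n_max + 1
H = np.cumsum(1.0 / j)                      # H[i] holds the harmonic number H_{i+1}
a = 1.0 / np.sqrt(j[1:] * H[:-1] * H[1:])   # a[i] holds a_{i+1}

# Telescoping: a_1^2 + ... + a_N^2 = 1/H_1 - 1/H_{N+1} = 1 - 1/H_{N+1}.
partial_norm_sq = np.sum(a[:N] ** 2)
telescoped = 1.0 - 1.0 / H[N]

# Truncated first row: K(1, n) ~ sum_{k=1}^{N} a_k * a_{k+n-1}.
K_row = np.array([np.dot(a[:N], a[n - 1:n - 1 + N])
                  for n in range(1, n_max + 1)])

# Since a is decreasing, a_k * a_{k+n-1} >= a_{k+n-1}^2, and the squares
# telescope to 1/H_n - 1/H_{N+n}.
lower_bound = np.array([1.0 / H[n - 1] - 1.0 / H[N + n - 1]
                        for n in range(1, n_max + 1)])
```

The lower bound decays like $1/\log n$, which is why the first row fails to belong to any $\ell^p$, even though the diagonal of the kernel is constant.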

5.22. Example. Properties (VI) and (VII) listed in Theorem 5.8 suggest that perhaps pd kernels on a given set that have pdms roots form a set closed in the pointwise or uniform topology (in the space of all kernels). As the following simple example shows, this is not the case.

Let u : N → [0, 1] be given by $u(n) = \frac{1}{\sqrt{n}}$ and $K \stackrel{\mathrm{def}}{=} u \otimes u$ (see (5-11)). We infer from Example 5.14 that K has no pdms roots. Moreover, since $\dim(H_K) = 1$, K is free of pdms roots. Now for n > 0 let $K_n$ be the kernel on N such that $K_n(p, q) = K(p, q)$ if both p and q are less than n, and $K_n(p, q) = 0$ otherwise.

It is easy to see that $K_n$ is a pd kernel. Since $K_n$ is supported on a finite set, it has a unique pdms root (e.g., by Theorem 1.3). However, since $\lim_{n \to \infty} u(n) = 0$, the kernels $K_n$ converge uniformly to K. So, the uniform limit of a bounded sequence of pd kernels each of which has a unique pdms root can be free of pdms roots.
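The uniform convergence claimed in Example 5.22 can be quantified: the largest entry of $K - K_n$ sits at the positions (n, 1) and (1, n) and equals $u(n)u(1) = 1/\sqrt{n}$. The sketch below checks this on a finite grid; the function name `sup_distance` and the grid size are hypothetical choices, and the computation is exact only for n not exceeding the grid size.

```python
import numpy as np

def sup_distance(n, grid):
    """Largest |K(p, q) - K_n(p, q)| over 1 <= p, q <= grid.

    K(p, q) = 1 / sqrt(p * q), and K_n agrees with K when both p and q
    are less than n and vanishes otherwise.
    """
    p = np.arange(1, grid + 1, dtype=float)
    full = 1.0 / np.sqrt(np.outer(p, p))
    truncated = full.copy()
    inside = (p[:, None] < n) & (p[None, :] < n)
    truncated[~inside] = 0.0
    return float(np.abs(full - truncated).max())

distances = {n: sup_distance(n, grid=200) for n in (2, 10, 100)}
```

Each value in `distances` agrees with $1/\sqrt{n}$, so the convergence is uniform but only of order $n^{-1/2}$.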
We end the paper with the following

5.23. Example. Property (III) listed in Theorem 5.8 gives a necessary condition for an unbounded pd kernel to have a pdms root that reads as follows: if an unbounded pd kernel has a pdms root, so have all its bounded rescalings. In this example we show that this condition is not sufficient. More precisely, we will construct an unbounded pd kernel K on N such that: • each bounded rescaling of K has a unique pdms root; • K has no pdms roots. To construct such a kernel, it is sufficient to find two self-adjoint bounded operators A, B ∈ B(H) on a separable Hilbert space H such that for some orthonormal basis (the latter formula follows from (aux2)). To show that K has no pdms roots, it suffices to check that the operator $BCU \restriction \ell_{\mathrm{fin}}$ is not closable (as BCU factorizes K; see condition (c1) in Theorem 5.5 and Lemma 5.4). Since U is unitary, we only need to verify that $T \stackrel{\mathrm{def}}{=} BC \restriction \mathrm{lin}\{f_n : n > 0\}$ is not closable. To this end, first note that the graph Γ(BC) of BC is contained in the closure (in H × H) of Γ(T) (as B is bounded and D(T) is a core for C). Thus, T is closable iff so is BC, iff $(BC)^*$ is densely defined. But $(BC)^* = C^*B^* = CB$ (since B is bounded and both B and C are self-adjoint) and D(CB) = {0} by (aux1). So, T is not closable and hence K has no pdms roots. Now let v : N → C be any function such that the v-rescaling of K is bounded. This means that $\sup_{n \in \mathbb{N}} |v(n)|^2 K(n, n) < \infty$. But $K(n, n) = \|Bf_n / r_n\|^2 = r_n^{-2}$ (by (aux3)). So, there exists c > 0 such that

(5-17) $|v(n)| \le c\, r_n$ (n > 0).

In particular, $\lim_{n \to \infty} v(n) = 0$. Denote by W the diagonal operator on $\ell^2$ (with respect to the orthonormal basis) such that $We_n = v(n)e_n$ (W is compact, but we do not need this property). Since $D = \{\sum_{n=1}^{\infty} \alpha_n f_n : \sum_{n=1}^{\infty} (|\alpha_n| / r_n)^2 < \infty\}$, we easily infer from (5-17) that R(UW) ⊂ D and CUW is bounded (here we do not apply the Closed Graph Theorem: the boundedness of CUW is a direct consequence of (5-17), (aux2) and the definitions of C, U and W). Consequently, This means that the v-rescaling L of K is factorized by a bounded operator (namely, by Q), from which it easily follows that L satisfies condition (iii) of Theorem 1.3 and therefore has a unique pdms root. In this way we have reduced our proof to the