On the consistency of the matrix equation $X^\top A X=B$ when $B$ is symmetric: the case where CFC($A$) includes skew-symmetric blocks

In this paper, which is a follow-up to [A. Borobia, R. Canogar, F. De Ter\'an, Mediterr. J. Math. 18, 40 (2021)], we provide a necessary and sufficient condition for the matrix equation $X^\top AX=B$ to be consistent when $B$ is symmetric. The condition depends on the canonical form for congruence of the matrix $A$, and is proved to be necessary for all matrices $A$, and sufficient for most of them. This result improves the main one in the previous paper, since the condition is stronger than the one in that reference, and the sufficiency is guaranteed for a larger set of matrices (namely, those whose canonical form for congruence, CFC($A$), includes skew-symmetric blocks).


Introduction
Let $A\in\mathbb{C}^{n\times n}$ and let $B\in\mathbb{C}^{m\times m}$ be a symmetric matrix. We are interested in the consistency of the matrix equation
$$X^\top A X=B, \tag{1}$$
where $(\cdot)^\top$ denotes the transpose. To be more precise, we want to obtain necessary and sufficient conditions for (1) to be consistent. The main tool to get these conditions is the canonical form for congruence, CFC (see Theorem 1), because (1) is consistent if and only if the equation that we obtain after replacing the matrices $A$ and/or $B$ by their CFCs is consistent. The CFC is a direct sum of three kinds of blocks of different sizes, named Type-0, Type-I, and Type-II, and the idea is to take advantage of this structure to analyze Eq. (1). In particular, the only symmetric canonical blocks are $I_1=[1]$ and $0_1=[0]$, so the CFC of the symmetric matrix $B$ is of the form $\mathrm{CFC}(B)=I_m\oplus 0_k$ (where $I_m$ and $0_k$ are, respectively, a direct sum of $m$ and $k$ copies of $I_1$ and $0_1$). With the help of Lemma 2 we can get rid of the null block $0_k$, so the equation we are interested in is
$$X^\top A X=I_m, \tag{2}$$
with $m\ge 1$.
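As a small numerical illustration of the claim that a symmetric matrix is congruent to an identity block, the following sketch checks one concrete case with NumPy. The matrices `B` and `X` are illustrative choices made here, not taken from the paper; note that congruence uses the plain transpose, not the conjugate transpose.

```python
import numpy as np

# Illustrative symmetric (non-diagonal) complex matrix; a hypothetical example.
B = np.array([[0, 1], [1, 0]], dtype=complex)

# Candidate invertible matrix X; congruence uses X.T (no conjugation).
X = np.array([[1, 1j], [1, -1j]], dtype=complex) / np.sqrt(2)

# X^T B X should equal I_2, so B is congruent to I_2 in this example.
congruent_form = X.T @ B @ X
is_identity = np.allclose(congruent_form, np.eye(2))
print(is_identity)
```

Here `X` is invertible, so this is a genuine congruence; replacing `X.T` by `X.conj().T` would give a different (Hermitian-type) relation that is not the one studied in the text.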
In [4] we introduced $\tau(A)$, a quantity that depends on the number of certain Type-0, Type-I, and Type-II blocks appearing in the CFC of $A$, and we proved in [4, Th. 2] that if Eq. (2) is consistent then $m\le\tau(A)$. Moreover, the main result of that paper, [4, Th. 8], establishes that if the CFC of $A$ contains neither $H_2(-1)$ nor $H_4(1)$ blocks (which are specific Type-II blocks) then Eq. (2) is consistent if and only if $m\le\tau(A)$. This is not necessarily true if we allow the CFC of $A$ to contain blocks $H_2(-1)$ and/or $H_4(1)$ (for instance, it is not true for $A=H_2(-1)$ nor for $A=H_4(1)$).
In the present work we introduce a new quantity $\upsilon(A)$, which also depends on the number of certain Type-0, Type-I, and Type-II blocks appearing in the CFC of $A$. In Theorem 7 we will prove that if Eq. (2) is consistent then $m\le\min\{\tau(A),\upsilon(A)\}$. Moreover, according to the main result in the present work (Theorem 12), if the CFC of $A$ does not contain $H_4(1)$ blocks, then Eq. (2) is consistent if and only if $m\le\min\{\tau(A),\upsilon(A)\}$. However, this is not necessarily true if the CFC contains blocks $H_4(1)$ (it is not true, for instance, for $A=H_4(1)$).
Note that the main result of this paper improves the main one in [4] in two senses: (i) the condition here is stronger than the one there; and (ii) the characterization is guaranteed for a larger set of matrices.
In the title we have referred to "the case where CFC($A$) includes skew-symmetric blocks". This highlights the fact that, compared to [4], in the present work the main result is applied to matrices whose CFC contains $H_2(-1)$ blocks, which are the only nonzero skew-symmetric blocks in a CFC.
Interest in Eq. (1) goes back to, at least, the 1920s [16], and it has been mainly devoted to describing the solution, $X$, for matrices $A$, $B$ over finite fields and when $A$ and/or $B$ have some specific structure [7-11, 13, 17]. More recently, some related equations have been analyzed [15] and, in particular, in connection with applications [1-3]. In [5] we have addressed the consistency of Eq. (1) when $B$ is skew-symmetric, where the connection between the consistency of (1) and the dimension of the largest subspace of $\mathbb{C}^n$ for which the bilinear form represented by $A$ is skew-symmetric and non-degenerate is emphasized. The same connection holds after replacing skew-symmetric by symmetric, which is the structure considered in the present work.
The paper is organized as follows. In Section 2 we introduce the basic notation and definitions (like the CFC), and we also recall some basic results that are used later. In Section 3 the quantities $\tau(A)$ and $\upsilon(A)$ are introduced. Section 4 presents the necessary condition for Eq. (2) to be consistent (Theorem 7), whereas in Section 6 we show that when the CFC of $A$ does not contain blocks $H_4(1)$ this condition is sufficient as well (Theorem 12). In between these two sections, Section 5 is devoted to introducing the tools (by means of several technical lemmas) that are used to prove the sufficiency of the condition. Finally, in Section 7 we summarize the main contributions of this work and indicate the main related open question.

Basic approach and definitions
Throughout the manuscript, $I_n$ and $0_n$ denote, respectively, the identity and the null matrix of size $n\times n$. By $0_{m\times n}$ we denote the null matrix of size $m\times n$. By $\mathrm{i}$ we denote the imaginary unit (namely, $\mathrm{i}^2=-1$), and by $e_j$ we denote the $j$th canonical vector (namely, the $j$th column of the identity matrix) of the appropriate size. The notation $M^{\oplus k}$ stands for a direct sum of $k$ copies of the matrix $M$.
Following the approach in [4] and [5], a key tool in our developments is the canonical form for congruence (CFC). For ease of reading we first recall the CFC, which depends on the following matrices: [...] for $k\ge 1$ (note that $\Gamma_1=I_1=[1]$); and
• $H_{2k}(\mu):=\begin{bmatrix}0 & I_k\\ J_k(\mu) & 0\end{bmatrix}$.

[...] Type II [...] ($\mu$ is determined up to replacement by $\mu^{-1}$)

Following [5], the notation $A\rightsquigarrow B$ means that the equation $X^\top AX=B$ is consistent, and [...]

The following result, which was presented in [5, Lemma 4], includes some basic laws of consistency that are straightforward to check.

Lemma 2. (Laws of consistency). For any complex square matrices $A, B, C, A_i, B_i$, the following properties hold: [...]
(vi) $J_1(0)$-law. For $k,\ell\ge 0$ we have $A\oplus J_1(0)^{\oplus k}\rightsquigarrow B\oplus J_1(0)^{\oplus\ell}$ if and only if $A\rightsquigarrow B$.

By the Canonical reduction law, in Eq. (1) we will assume without loss of generality that $A$ and $B$ are given in CFC.
When $B$ is symmetric, the CFC of $B$ is $I_{m_1}\oplus 0_{m_2}$. Then, as a consequence of the Canonical reduction law, we may restrict ourselves to the case where the right-hand side of (1) is of this form. Moreover, as a consequence of the $J_1(0)$-law, in our developments we will consider $B=I_m$ in Eq. (1) (leading to Eq. (2)). Therefore, our goal is to characterize those matrices $A$ such that $A\rightsquigarrow I_m$, for a fixed $m\ge 1$. This will be done by concatenating several equations $A\rightsquigarrow A_1\rightsquigarrow\cdots\rightsquigarrow A_k\rightsquigarrow I_m$, since the Transitivity law allows us to conclude that $A\rightsquigarrow I_m$. For this reason, we will use the word "transformation" for a single equation $A\rightsquigarrow B$.
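The concatenation argument rests on a one-line identity: if $X_1^\top A X_1=A_1$ and $X_2^\top A_1X_2=B$, then $(X_1X_2)^\top A(X_1X_2)=B$. The following sketch checks this numerically; the random matrices and the names `X1`, `X2`, `A1` are illustrative assumptions, not objects from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_complex(rows, cols):
    """Random complex matrix, used only as illustrative test data."""
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))

A = random_complex(4, 4)
X1 = random_complex(4, 3)   # solves X1^T A X1 = A1 by construction of A1
X2 = random_complex(3, 2)   # solves X2^T A1 X2 = B by construction of B

A1 = X1.T @ A @ X1          # first transformation:  A  ~> A1
B = X2.T @ A1 @ X2          # second transformation: A1 ~> B

# The product X1 X2 solves the composed equation X^T A X = B.
X = X1 @ X2
composed_ok = np.allclose(X.T @ A @ X, B)
print(composed_ok)
```

The solution matrices need not be square, which is why the sizes can shrink along the chain.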
One way to determine the CFC of an invertible matrix $A$ is by means of its cosquare, $A^{-\top}A$ (see [14]), where $(\cdot)^{-\top}$ denotes the transpose of the inverse. Moreover, the cosquare will be used to determine whether two given invertible matrices are congruent, using the following result.
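To make the cosquare concrete, the sketch below computes $A^{-\top}A$ for the $2\times 2$ Type-I block. It assumes the standard form $\Gamma_2=\begin{bmatrix}0&-1\\1&1\end{bmatrix}$ from the literature on canonical forms for congruence, since the definition of $\Gamma_k$ is not displayed in this section; the helper name `cosquare` is introduced here for illustration only.

```python
import numpy as np

def cosquare(A):
    """Return the cosquare A^{-T} A of an invertible matrix A."""
    return np.linalg.inv(A.T) @ A

# Assumed standard 2x2 Type-I block (not displayed in the text above).
Gamma2 = np.array([[0.0, -1.0], [1.0, 1.0]])

C = cosquare(Gamma2)

# C has the single eigenvalue -1 = (-1)^(k+1) for k = 2.
# N = C + I is nonzero with N^2 = 0, so C has one Jordan block of size 2.
N = C + np.eye(2)
is_single_jordan_block = bool(np.linalg.matrix_rank(N) == 1 and np.allclose(N @ N, 0))
print(C)
print(is_single_jordan_block)
```

Under that assumption, the cosquare is similar to $J_2(-1)$, which agrees with the statement in the text that the cosquare of $\Gamma_k$ is similar to $J_k((-1)^{k+1})$.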
We claim that $\widetilde\Gamma_k$ and $\widetilde H_{2k}(\mu)$ are congruent to, respectively, $\Gamma_k$ and $H_{2k}(\mu)$. In order to prove that $\Gamma_k$ and $\widetilde\Gamma_k$ are congruent, we give an indirect proof. Two matrix pairs $(A,B)$ and $(A',B')$ are strictly equivalent if there are invertible matrices $R$ and $S$ such that $RAS=A'$ and $RBS=B'$. It is known (see, for instance, [6, Lemma 1]) that two matrices $A,B\in\mathbb{C}^{n\times n}$ are congruent if and only if $(A,A^\top)$ and $(B,B^\top)$ are strictly equivalent. Since $(\Gamma_k,\Gamma_k^\top)$ and $\big(J_k((-1)^{k+1}),I_k\big)$ are strictly equivalent (see [6, Th. 4]) and $\big(J_k((-1)^{k+1}),I_k\big)$ and $(\widetilde\Gamma_k,\widetilde\Gamma_k^\top)$ are strictly equivalent as well (see Eq. (5) in [12]), the pairs $(\Gamma_k,\Gamma_k^\top)$ and $(\widetilde\Gamma_k,\widetilde\Gamma_k^\top)$ are strictly equivalent, so $\Gamma_k$ and $\widetilde\Gamma_k$ are congruent. An alternative way to show that $\Gamma_k$ and $\widetilde\Gamma_k$ are congruent is to check that their cosquares are similar to $J_k((-1)^{k+1})$ and then use Lemma 3.
To see that $H_{2k}(\mu)$ and $\widetilde H_{2k}(\mu)$ are congruent, consider the permutation matrix $P_{2k}$ [...] and note that $\widetilde H_{2k}(\mu)=P_{2k}^\top H_{2k}(\mu)P_{2k}$. Therefore, the congruence by $P_{2k}$ is actually a simultaneous permutation of rows and columns of $H_{2k}(\mu)$. More precisely, we start with $H_{2k}(\mu)$ and move rows (and columns) $(k+1,k+2,\ldots,2k)$ to, respectively, rows (and columns) $(2,4,\ldots,2k)$; and we also move rows (and columns) $(1,2,\ldots,k)$ to rows (and columns) $(1,3,\ldots,2k-1)$, respectively. So the 1's coming from the block $I_k$ and the 1's coming from the superdiagonal of the block $J_k(\mu)$ in $H_{2k}(\mu)$ get shuffled to form the superdiagonal of $P_{2k}^\top H_{2k}(\mu)P_{2k}$. Moreover, the $\mu$'s from the block $J_k(\mu)$ in $H_{2k}(\mu)$ are taken to the positions $(2,1),(4,3),\ldots,(2k,2k-1)$ in $\widetilde H_{2k}(\mu)$.

The advantage of using the matrices $\widetilde\Gamma_k$ and $\widetilde H_{2k}(\mu)$ instead of, respectively, $\Gamma_k$ and $H_{2k}(\mu)$, is that the former are tridiagonal, and this structure is more convenient for our proofs. Tridiagonal canonical blocks have already been used in [12] (actually, $\widetilde\Gamma_k$ is exactly the one introduced in Eq. (3) for $\varepsilon=1$ in that reference). For the rest of the manuscript, we will replace the blocks $\Gamma_k$ by $\widetilde\Gamma_k$ and $H_{2k}(\mu)$ by $\widetilde H_{2k}(\mu)$, so, in particular, we will assume that the CFC is a direct sum of blocks $J_k(0)$, $\widetilde\Gamma_k$, and $\widetilde H_{2k}(\mu)$. The only exceptions to this rule are $\Gamma_1$, which is equal to $\widetilde\Gamma_1$, and $H_2(-1)$, which is equal to $\widetilde H_2(-1)$.
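The row and column shuffle described above can be checked numerically. The sketch below assumes the block form $H_{2k}(\mu)=\begin{bmatrix}0&I_k\\J_k(\mu)&0\end{bmatrix}$ (consistent with the description of where the 1's and $\mu$'s come from) and builds the permutation from the stated moves; the function names are hypothetical helpers.

```python
import numpy as np

def jordan_block(k, mu):
    """k x k upper Jordan block J_k(mu)."""
    return mu * np.eye(k) + np.diag(np.ones(k - 1), 1)

def H(k, mu):
    """Assumed block form H_{2k}(mu) = [[0, I_k], [J_k(mu), 0]]."""
    Z = np.zeros((k, k))
    return np.block([[Z, np.eye(k)], [jordan_block(k, mu), Z]])

def shuffle_permutation(k):
    """Permutation sending rows 1..k to 1,3,...,2k-1 and rows k+1..2k to 2,4,...,2k."""
    order = []
    for i in range(k):
        order += [i, k + i]          # 0-based: new positions 2i and 2i+1
    return np.eye(2 * k)[:, order]

k, mu = 3, 2.0
P = shuffle_permutation(k)
H_tilde = P.T @ H(k, mu) @ P         # simultaneous row/column permutation

# Expected tridiagonal pattern: ones on the superdiagonal,
# mu at positions (2,1), (4,3), ..., (2k, 2k-1) in 1-based indexing.
expected = np.diag(np.ones(2 * k - 1), 1)
for i in range(k):
    expected[2 * i + 1, 2 * i] = mu
print(np.array_equal(H_tilde, expected))
```

Because `P` is a permutation matrix, `P.T @ H(k, mu) @ P` only relabels rows and columns, so the comparison can be exact rather than approximate.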
Table 1: Values of τ and υ for any single canonical block.
The quantities $\tau(A)$ and $\upsilon(A)$

The main result of this work (Theorem 12) depends on two intrinsic quantities of the matrix $A$, which we denote by $\tau(A)$ and $\upsilon(A)$. In this section, we introduce them and present some basic properties that will be used later.

Definition 4. Let $A$ be a complex $n\times n$ matrix and consider its CFC, [...] where
(i) $j_1$ is the number of Type-0 blocks with size 1;
(ii) $j_O$ is the number of Type-0 blocks with odd size at least 3;
(iii) $\gamma_O$ is the number of Type-I blocks with odd size;
(iv) $\gamma_\varepsilon$ is the number of Type-I blocks with even size;
(v) $h_{2O}$ is the number of Type-II blocks $\widetilde H_{4k-2}(-1)$ for any $k\ge 1$;
(vi) $h_{2\varepsilon}$ is the number of Type-II blocks $\widetilde H_{4k}(1)$ for any $k\ge 1$; and
(vii) it has an arbitrary number of other Type-0 and Type-II blocks.
The quantities $\tau$ and $\upsilon$ satisfy the following essential additive properties (the proof is straightforward):
$$\tau(A_1\oplus\cdots\oplus A_k)=\tau(A_1)+\cdots+\tau(A_k)\quad\text{and}\quad\upsilon(A_1\oplus\cdots\oplus A_k)=\upsilon(A_1)+\cdots+\upsilon(A_k). \tag{4}$$
The notation for the quantities in Definition 4 follows the one in [5]. In particular, the letters used for the number of blocks in parts (i)-(vi) resemble the notation for the corresponding blocks (see [5, Rem. 6]). In [4] we had not yet adopted this notation. The correspondence between the notation in that paper and the one used here is the following: [...] The values $\gamma_\varepsilon$ and $h_{2O}$ played no role in [4].
Table 1 contains the values of $\tau(A)$ and $\upsilon(A)$ for $A$ being a single canonical block in the CFC. We have displayed the values in three categories, from top to bottom, namely: first, those with $\tau(A)=\upsilon(A)$; second, those for which $\tau(A)<\upsilon(A)$; and, finally, those with $\tau(A)>\upsilon(A)$.
Notice that $\tau(A)\le\upsilon(A)$ whenever the CFC of $A$ consists of just a single canonical block, except for $H_2(-1)$. This, together with (4), implies the following result (Lemma 5).

In order for the condition that we obtain (in Theorem 7) to be sufficient, the following notion is key.

Definition 6. The transformation $A\rightsquigarrow B$ is $(\tau,\upsilon)$-invariant if the following three conditions are satisfied:
• $X^\top AX=B$ is consistent,
• $\tau(A)=\tau(B)$, and
• $\upsilon(A)=\upsilon(B)$.

A necessary condition
In this section, we introduce a necessary condition on the matrix $A$ for $A\rightsquigarrow I_m$ (namely, for Eq. (1) to be consistent when $B$ is symmetric and invertible). This condition improves the one provided in [4, Th. 2], namely $m\le\tau(A)$.

Theorem 7. If $A$ is a complex square matrix such that $X^\top AX=I_m$ is consistent, then $m\le\min\{\tau(A),\upsilon(A)\}$.

Proof. In [4, Th. 2] it was proved that $m\le\tau(A)$ (though the notation $\tau$ was not used there). Let us see that $m\le\upsilon(A)$ as well. Assuming that the CFC of $A$ is as in Definition 4, in the proof of Theorem 8 of [5] it was shown that
$$n-\operatorname{rank}(A+A^\top)=j_O+\gamma_\varepsilon+2h_{2O}. \tag{5}$$
By hypothesis, there exists some $X_0\in\mathbb{C}^{n\times m}$ such that $X_0^\top AX_0=I_m$. Transposing this equation and adding the result to the original one, we get $X_0^\top(A+A^\top)X_0=2I_m$. From this identity, and using (5), we obtain
$$m=\operatorname{rank}\big(X_0^\top(A+A^\top)X_0\big)\le\operatorname{rank}(A+A^\top)=n-j_O-\gamma_\varepsilon-2h_{2O},$$
so $m\le n-j_O-\gamma_\varepsilon-2h_{2O}=\upsilon(A)$, as claimed.
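The rank identity (5) can be checked on individual blocks. The sketch below evaluates $n-\operatorname{rank}(A+A^\top)$ for $J_3(0)$, $H_2(-1)$, and $H_4(1)$, assuming the block form $H_{2k}(\mu)=\begin{bmatrix}0&I_k\\J_k(\mu)&0\end{bmatrix}$; the helper names are illustrative and not from the paper.

```python
import numpy as np

def jordan_block(k, mu):
    """k x k upper Jordan block J_k(mu)."""
    return mu * np.eye(k) + np.diag(np.ones(k - 1), 1)

def H(k, mu):
    """Assumed block form H_{2k}(mu) = [[0, I_k], [J_k(mu), 0]]."""
    Z = np.zeros((k, k))
    return np.block([[Z, np.eye(k)], [jordan_block(k, mu), Z]])

def rank_defect(A):
    """Left-hand side of (5): n - rank(A + A^T)."""
    return A.shape[0] - int(np.linalg.matrix_rank(A + A.T))

defects = {
    "J3(0)": rank_defect(jordan_block(3, 0.0)),   # one odd Type-0 block: j_O = 1
    "H2(-1)": rank_defect(H(1, -1.0)),            # one block H_2(-1): 2 h_2O = 2
    "H4(1)": rank_defect(H(2, 1.0)),              # contributes nothing to (5)
}
print(defects)

# H_2(-1) is skew-symmetric, so x^T H_2(-1) x = 0 for every vector x:
# X^T H_2(-1) X = I_1 has no solution, in line with m <= rank(A + A^T) = 0.
x = np.array([1.0 + 2.0j, -0.5 + 1.0j])
quadratic_form_vanishes = bool(np.isclose(x @ H(1, -1.0) @ x, 0))
print(quadratic_form_vanishes)
```

By (5) and the last line of the proof, $\upsilon(A)=\operatorname{rank}(A+A^\top)$, so the same helper gives a direct numerical way to evaluate $\upsilon$ without computing the CFC.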
Absorbing the $H_2(-1)$ blocks

The main goal in the rest of the manuscript is to prove that the necessary condition presented in Theorem 7 is also sufficient when the CFC of $A$ does not contain $\widetilde H_4(1)$ blocks. If the CFC of $A$ contains neither $H_2(-1)$ nor $\widetilde H_4(1)$ blocks, this is already known [4, Th. 8]. In that case, as a consequence of Lemma 5, the condition for $A\rightsquigarrow I_m$ reduces to $m\le\tau(A)$. When the CFC of $A$ does not contain blocks $\widetilde H_4(1)$ but contains blocks $H_2(-1)$, this is no longer true (see, for instance, Example 1 in [4]), and then the quantity $\upsilon(A)$ comes into play. This is an indication that the presence of blocks $H_2(-1)$ in the CFC of $A$ deserves a particular treatment. In this section, we show how to deal with this type of block. To be more precise, we see that some blocks $H_2(-1)$ can be combined with other types of blocks in order to "eliminate" them by means of a $(\tau,\upsilon)$-invariant transformation. In this case, we say that the block $H_2(-1)$ has been "absorbed". We will consider separately the cases of Type-0, Type-I, and Type-II blocks, in Sections 5.1, 5.2, and 5.3, respectively.
The following notation is used in the proofs of this section: $E_{\alpha\times\beta}$ denotes the $\alpha\times\beta$ matrix whose $(\alpha,1)$ entry is equal to 1 and the remaining entries are zero.

The case of Type-0 blocks
In Lemma 8, we show how to "absorb" a block $H_2(-1)$ with a Type-0 block, $J_k(0)$, with $k\ne 3$. In the statement, $J_0(0)$ stands for an empty block.

Lemma 8. The following transformation is $(\tau,\upsilon)$-invariant:
$$J_k(0)\oplus H_2(-1)\rightsquigarrow J_{k-2}(0)\oplus\Gamma_1^{\oplus 2},\qquad k\ge 2,\ k\ne 3. \tag{6}$$

Proof. By considering separately the cases where $k$ in (6) is odd ($k=2t+1$) and even ($k=2t$), using (4) and looking at Table 1, we obtain: [...] so both sides of the transformation in (6) have the same $\tau$ and $\upsilon$. Now let us prove the consistency. The result is true for $k=2$, since [...] Let us prove it for $k\ge 4$. Note that $J_{a+b}(0)=$ [...] as wanted.
We will also use the following result, whose proof is straightforward.

The case of Type-I blocks
Lemma 10 is the counterpart of Lemma 8 for Type-I blocks, where $\Gamma_k$ is replaced by $\widetilde\Gamma_k$.

Lemma 10. The following transformation is $(\tau,\upsilon)$-invariant: [...]

Proof. Considering again separately the cases where $k$ in (7) is odd ($k=2t+1$) and even ($k=2t$), using (4) and looking at Table 1, we obtain: [...] so both sides of the transformation in (7) have the same $\tau$ and $\upsilon$. Now let us prove the consistency.
For $k=3$ we have [...] as can be directly checked. For $k\ge 4$ we are going to prove that [...] where the first transformation is just a block permutation. So for the rest of the proof we will focus on the second transformation. We use the following notation: $A(i:j)$ is the principal submatrix of $A$ containing the rows and columns from the $i$th to the $j$th ones.
[...], for $X_4$ as above, as can be directly checked.
To prove it we will use the identities [...] where in the last-but-one equality we use that $\widetilde\Gamma_k(5:k)=\widetilde\Gamma_{k-2}(3:k-2)$.

The case of Type-II blocks
Finally, Lemma 11 is the counterpart of Lemmas 8 and 10 for Type-II blocks. Again, instead of the blocks $H_{2k}(\mu)$ we use the tridiagonal version, $\widetilde H_{2k}(\mu)$. In the statement, $\widetilde H_0(\mu)$ stands for an empty block.
Lemma 11. The following transformations are $(\tau,\upsilon)$-invariant: [...]

In order to see that all transformations in (i)-(iii) are $(\tau,\upsilon)$-invariant, first note that [...]

Now let us prove the consistency in (i)-(iii).
The following identity is used: [...]

(ii) Let us prove, for $k\ge 1$, that [...] This is because [...] Finally, let us see that $\widetilde H_{4k}(-1)$ is congruent to $\widetilde\Gamma_{2k}^{\oplus 2}$ or, equivalently, that $H_{4k}(-1)$ is congruent to $\Gamma_{2k}^{\oplus 2}$. In order to do this, we are going to prove that the cosquares of $H_{4k}(-1)$ and $\Gamma_{2k}^{\oplus 2}$ are similar, and this immediately implies that $H_{4k}(-1)$ and $\Gamma_{2k}^{\oplus 2}$ are congruent, by Lemma 3. The cosquare of $H_{4k}(-1)$ is [...] and the cosquare of $\Gamma_{2k}^{\oplus 2}$ is [...] with (see [6, p. 13]) [...] where $\star$ denotes some entries that are not relevant in our arguments. As $J_{2k}(-1)^{-\top}$ is similar to $J_{2k}(-1)$, the previous identities show that $\big(H_{4k}(-1)\big)^{-\top}H_{4k}(-1)$ and $\big(\Gamma_{2k}^{\oplus 2}\big)^{-\top}\Gamma_{2k}^{\oplus 2}$ are similar, since the Jordan canonical form of both of them is $J_{2k}(-1)^{\oplus 2}$.

(iii) Let us prove that, for $k\ge 1$: [...] For $k=1$ the solution matrix is $X_1=C$, as can be directly checked. Let us now see it for $k\ge 2$: [...] It remains to see that $\widetilde H_{4k-2}(1)$ is congruent to $\widetilde\Gamma_{2k-1}^{\oplus 2}$ or, equivalently, that $H_{4k-2}(1)$ is congruent to $\Gamma_{2k-1}^{\oplus 2}$. To prove this, we can proceed as before, by showing that the cosquares of $H_{4k-2}(1)$ and $\Gamma_{2k-1}^{\oplus 2}$ are similar (in this case, their Jordan canonical form is $J_{2k-1}(1)^{\oplus 2}$), and this implies that $H_{4k-2}(1)$ and $\Gamma_{2k-1}^{\oplus 2}$ are congruent, again by Lemma 3.

The main result
The following result, which is the main result in this work, improves the main result in [4] (namely, Theorem 8 in that reference) by including the case where the CFC of $A$ contains blocks of type $H_2(-1)$, which were excluded in [4, Th. 8].

Theorem 12. Let $A$ be a complex square matrix whose CFC does not have blocks of type $\widetilde H_4(1)$, and $B$ a symmetric matrix. Then $X^\top AX=B$ is consistent if and only if $\operatorname{rank}B\le\min\{\tau(A),\upsilon(A)\}$.

Proof. The necessity of the condition is already stated in Theorem 7. We are going to prove that it is also sufficient.
By the $J_1(0)$-law and the Canonical reduction law, we may assume that both $A$ and $B$ are given in CFC and that neither $A$ nor $B$ has blocks of type $J_1(0)$. This implies, in particular, that $B=I_m$, for some $m$, and that $A$ is as in Definition 4, with $j_1=0$. We also assume that all blocks $\Gamma_k$ and $H_{2k}(\mu)$ in $A$, if present, have been replaced by $\widetilde\Gamma_k$ and $\widetilde H_{2k}(\mu)$, respectively. Let us recall that $\Gamma_1^{\oplus m}=I_m$. Throughout the proof, we mainly use the first notation, to emphasize that we are dealing with canonical blocks.
If the CFC of $A$ does not contain blocks $H_2(-1)$, then the result is provided in [4, Th. 8]. Otherwise, we are going to see that it is possible, by means of $(\tau,\upsilon)$-invariant transformations, to either "absorb" all blocks $H_2(-1)$ or to end up with a direct sum of blocks $H_2(-1)$, together with, possibly, other blocks, which are quite specific. More precisely, we can end up with a direct sum of blocks satisfying one of the following conditions:
(C0) There are no blocks $H_2(-1)$.
(C1) There are some blocks $H_2(-1)$ together with, possibly, a direct sum of blocks $J_3(0)$, $\widetilde\Gamma_2$, and/or $\Gamma_1$.
We are first going to see that, indeed, we can arrive at one of the situations described in cases (C0)-(C1). In the procedure, we may need to permute the canonical blocks, in order to use Lemmas 8, 10, and 11. By Theorem 1, this provides a congruent matrix which has, in particular, the same $\tau$ and $\upsilon$, so these permutations do not affect the consistency. Then, we will prove that in both cases (C0) and (C1) the statement holds.

So let us assume that the CFC of $A$ contains a direct sum of blocks $H_2(-1)$, together with some other Type-0, Type-I, and Type-II blocks (except $\widetilde H_4(1)$). Using Lemma 8, for each block $J_k(0)$ (with $k\ne 3$) we can "absorb" a block $H_2(-1)$ by means of a $(\tau,\upsilon)$-invariant transformation, and we end up with a direct sum of a block $J_{k-2}(0)$ together with two blocks $\Gamma_1$. We can keep reducing the size of the Type-0 blocks until either all $H_2(-1)$ blocks have been absorbed (so we end up in case (C0)) or there are no more Type-0 blocks, except maybe blocks $J_3(0)$. Now, we can proceed in the same way with Type-I blocks using Lemma 10. Again, we end up either with a direct sum containing no $H_2(-1)$ blocks (case (C0) again) or with no Type-I blocks, except maybe blocks $\Gamma_1$ and/or $\widetilde\Gamma_2$. Next, we do the same with Type-II blocks using Lemma 11. Note that the reductions in parts (ii) and (iii) in the statement of Lemma 11 produce as an output some Type-I blocks $\widetilde\Gamma_k$, with $k\ge 1$. In the case when $k>1$, we can use Lemma 10 again, provided that there are still blocks $H_2(-1)$. Therefore, after these reductions, either we have absorbed all blocks $H_2(-1)$ (case (C0) again), or there are blocks $H_2(-1)$, together with, possibly, a direct sum of other blocks that cannot absorb them, namely $J_3(0)$, $\widetilde\Gamma_2$, and/or $\Gamma_1$ (case (C1)).

Now, it remains to prove that in both cases (C0) and (C1) the statement holds, namely that $A\rightsquigarrow\Gamma_1^{\oplus m}$, for any $m\le\min\{\tau(A),\upsilon(A)\}$, in these two cases. Let $\widehat A$ be the matrix obtained after applying to $A$ all the transformations explained in the previous paragraph. By the Transitivity law, $A\rightsquigarrow\widehat A$. Moreover, since all these transformations are $(\tau,\upsilon)$-invariant, (4) implies that $\tau(A)=\tau(\widehat A)$ and $\upsilon(A)=\upsilon(\widehat A)$. Therefore, it is enough to prove that $\widehat A\rightsquigarrow\Gamma_1^{\oplus m}$ for any $m\le\min\{\tau(\widehat A),\upsilon(\widehat A)\}$. By the Elimination law, $\Gamma_1^{\oplus a}\rightsquigarrow\Gamma_1^{\oplus b}$ for any $b<a$, so it will be enough to prove that $\widehat A\rightsquigarrow\Gamma_1^{\oplus\min\{\tau(\widehat A),\upsilon(\widehat A)\}}$.

In case (C0) the statement is true, as a consequence of [4, Th. 8]. More precisely, in this case, $\min\{\tau(\widehat A),\upsilon(\widehat A)\}=\tau(\widehat A)$, as a consequence of Lemma 5. Then, [4, Th. 8] guarantees that $\widehat A\rightsquigarrow\Gamma_1^{\oplus\tau(\widehat A)}$ (in [4, Th. 8], however, the notation $\tau$ was not used).
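In case (C1), we may assume that
$$\widehat A=H_2(-1)^{\oplus j}\oplus J_3(0)^{\oplus h}\oplus\widetilde\Gamma_2^{\oplus k}\oplus\Gamma_1^{\oplus\ell},$$
for some $j,h,k,\ell\ge 0$.
Note that, in this case, $\min\{\tau(\widehat A),\upsilon(\widehat A)\}=\upsilon(\widehat A)$, since $\tau(\widehat A)=j+2h+k+\ell>\upsilon(\widehat A)=2h+k+\ell$. Hence, it is enough to prove that $\widehat A\rightsquigarrow\Gamma_1^{\oplus\upsilon(\widehat A)}$. In order to do this, we consider the transformations [...] where the first transformation is a consequence of the Elimination law, and the second transformation is a consequence of the Addition law, together with Lemma 9 (for the first addend) and with $\widetilde\Gamma_2\rightsquigarrow\Gamma_1$, which holds with solution $X=\begin{bmatrix}1\\0\end{bmatrix}$ (for the second addend).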
Remark 13. Unfortunately, when the CFC of $A$ contains at least one block $\widetilde H_4(1)$, it is no longer true that, for any $m\le\min\{\tau(A),\upsilon(A)\}$, the equation $X^\top AX=I_m$ is consistent. For instance, $X^\top\widetilde H_4(1)X=I_3$ is not consistent (see [4, Th. 7]), but $\tau(\widetilde H_4(1))=3$ and $\upsilon(\widetilde H_4(1))=4$, so $\min\{\tau(\widetilde H_4(1)),\upsilon(\widetilde H_4(1))\}=3$. Therefore, the case where the CFC of $A$ contains blocks $\widetilde H_4(1)$ deserves further analysis. Related to this, Theorem 12 can be slightly improved, allowing the CFC of $A$ to contain blocks $\widetilde H_4(1)$ provided that the number of these blocks is not larger than the number of blocks $H_2(-1)$. In this case, we can start the reduction procedure described in the proof of Theorem 12 by "absorbing" the blocks $\widetilde H_4(1)$ with the blocks $H_2(-1)$. More precisely, we can gather each block $\widetilde H_4(1)$ with a block $H_2(-1)$, and use the $(\tau,\upsilon)$-invariant transformation $\widetilde H_4(1)\oplus H_2(-1)\rightsquigarrow\Gamma_1^{\oplus 4}$. Once we have absorbed all blocks $\widetilde H_4(1)$ we can continue with the reduction as explained in the proof of Theorem 12.

Conclusions and open questions
In this paper, we have obtained a necessary condition for the equation $X^\top AX=B$ to be consistent, with $A$, $B$ being complex square matrices and $B$ being symmetric. This condition improves the one obtained in [4, Th. 2]. Moreover, we have proved that the condition is sufficient when the CFC of $A$ does not contain blocks $\widetilde H_4(1)$. This result also improves the one in [4, Th. 8], where the case in which the CFC has blocks $H_2(-1)$ was excluded.
As a natural continuation of this work it remains to address the case where the CFC of $A$ contains blocks $\widetilde H_4(1)$, in order to fully characterize the consistency of $X^\top AX=B$, with $B$ symmetric, for any matrix $A$. We have seen that the condition mentioned above is no longer sufficient in this case, so a different characterization is needed. So far, we have been unable to find such a characterization.

Lemma 5. If the CFC of $A$ has no blocks of type $H_2(-1)$ then $\tau(A)\le\upsilon(A)$.
Instead of the blocks $\Gamma_k$ and $H_{2k}(\mu)$ we will use the following blocks, for $k\ge 1$: [...]

[...] implies that $\tau(A)=\tau(\widehat A)$ and $\upsilon(A)=\upsilon(\widehat A)$. Therefore, it is enough to prove that $\widehat A\rightsquigarrow\Gamma_1^{\oplus m}$ for any $m\le\min\{\tau(\widehat A),\upsilon(\widehat A)\}$. By the Elimination law, $\Gamma_1^{\oplus a}\rightsquigarrow\Gamma_1^{\oplus b}$ for any $b<a$, so it will be enough to prove that $\widehat A\rightsquigarrow\Gamma$ [...]

[...] (C0) the statement is true, as a consequence of [4, Th. 8]. More precisely, in this case, $\min\{\tau(A),\upsilon(A)\}=\tau(A)$, as a consequence of Lemma 5. Then, [4, Th. 8] guarantees that $A\rightsquigarrow\Gamma$ [...]