Quantum similarity and QSPR in Euclidean-, and Minkowskian–Banach spaces

This paper first describes how Euclidean- and Minkowskian–Banach spaces are related via the definition of a metric or signature vector. It then discusses how these spaces can be generated using homothecies of the unit sphere or shell. This possibility allows a process of dimension condensation in such spaces to be proposed. The condensation of dimensions accounts for the incompleteness of classical QSPR procedures, independently of whether the algorithm used is statistically based or related to AI neural networks. Next, a quantum QSPR framework within Minkowskian vector spaces is discussed. Then, a well-defined set of general isometric vectors is proposed and connected to the set of molecular density functions generating the quantum similarity metric matrix. A convenient quantum QSPR algorithm emerges from this Minkowskian mathematical structure and isometry.


Introduction
Since the first study on quantum similarity [1], the application of this new way of using the geometrical side of quantum mechanics has been ubiquitous; see for instance a few references as an example. One of the aspects most studied by our laboratory has been the connection of quantum similarity with the Quantitative Structure-Properties Relations (QSPR) technique; see for example the earlier references [6, 11, 14, 17, 20, 23-25, 27-29, 31-33] for more information. The evolution of these earlier ideas has led to the description of quantum QSPR.
The present study will not only provide a scheme of quantum QSPR but will also associate it with the structure of vector spaces, to observe their role in the definition of both classical and quantum QSPR.
A new perspective on the quantum QSPR framework will also be given, taking into account that quantum similarity relies on molecular spaces, which in general are not Euclidean but Minkowskian. It can be anticipated that this geometrical issue has never been discussed in the QSPR literature.
The scheme followed in this paper first presents the study of N-dimensional Euclidean spaces and afterward introduces Euclidean-Banach spaces. This leads to the concept of dimension condensation and to the description of Minkowskian spaces. After this, the role of Minkowskian vector spaces in quantum similarity is presented, with a well-defined construction of an isometric set of discrete vectors. Then, in keeping with the simplicity of the present study, a quantum QSPR algorithm is defined. While developing the ideas connected with quantum similarity and quantum QSPR, the inherent incompleteness of classical QSPR will also be examined.

N-dimensional Euclidean vector spaces
Suppose an N-dimensional Euclidean vector space defined over the real field, $V_N(\mathbb{R})$, keeping in mind that for computational purposes it could also be redefined over the rational field, $V_N(\mathbb{Q})$. One can study the elements of the space $V_N(\mathbb{R})$ as column vectors and write:
$$|x\rangle = (x_1, x_2, \ldots, x_N)^T \in V_N(\mathbb{R}) \tag{1}$$
where Dirac's notation for column vectors has been used. The superscript $T$ means transposition and, again in Dirac's formalism, $\langle x| = (x_1, x_2, \ldots, x_N)$ corresponds to the row vectors forming the dual vector space $V_N^{*}(\mathbb{R})$. Some thought has been given to the structure of vector spaces and to the nature of the spaces where molecules can be described, see for instance [55,56], as a way to adopt new points of view with respect to the usual literature.

Euclidean-Banach space: Euclidean norm of a vector
Whenever a norm can be defined in a Euclidean space $V_N(\mathbb{R})$, one can admit that such a space has a Banach structure, thus transforming the Euclidean space into a Euclidean-Banach space.
For spaces defined as in Eq. (1), and in this study, the appropriate norm providing a Banach structure to any Euclidean space is the second-order Euclidean norm, which is straightforwardly defined by:
$$e_2(|x\rangle) = \langle x|x\rangle = \sum_{I=1}^{N} x_I^2 \tag{2}$$
The Euclidean norm $e_2(|x\rangle)$ is in this way a non-negative real number and becomes null only when computed with the zero vector $|0\rangle = (0, 0, \ldots, 0)^T$, that is:
$$e_2(|x\rangle) = 0 \iff |x\rangle = |0\rangle$$
This second-order norm can be associated with the squared Euclidean distance from any vector to the zero vector $|0\rangle$, which in turn can be seen as the origin of the vector space.
The Euclidean norm can also be described using first the inward vector product, as defined in numerous previous works [57-65], which possesses the following form:
$$|a\rangle * |b\rangle = |c\rangle \;\rightarrow\; \forall I: c_I = a_I b_I \tag{3}$$
followed by the complete sum of the elements of a vector of $V_N(\mathbb{R})$, noted as:
$$\langle\, |a\rangle \,\rangle = \sum_{I=1}^{N} a_I \tag{4}$$
Thus, keeping definitions (3) and (4) in mind, one can form the inward square of any vector belonging to $V_N(\mathbb{R})$ and perform the complete sum of this square vector, obtaining the Euclidean norm in this way:
$$e_2(|x\rangle) = \langle\, |x\rangle * |x\rangle \,\rangle \tag{5}$$
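The two operations just described, the inward product of Eq. (3) and the complete sum of Eq. (4), can be illustrated with a minimal Python sketch. The function names `inward_product`, `complete_sum`, and `euclidean_norm` are illustrative choices made here, not names taken from the cited works.

```python
from typing import List, Sequence


def inward_product(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Element-wise (inward) product of two vectors of equal dimension, as in Eq. (3)."""
    if len(a) != len(b):
        raise ValueError("both vectors must have the same dimension")
    return [x * y for x, y in zip(a, b)]


def complete_sum(v: Sequence[float]) -> float:
    """Complete sum of the elements of a vector, as in Eq. (4)."""
    return sum(v)


def euclidean_norm(v: Sequence[float]) -> float:
    """Second-order Euclidean norm: complete sum of the inward square, as in Eq. (5)."""
    return complete_sum(inward_product(v, v))


x = [3.0, 4.0]
print(inward_product(x, x))  # [9.0, 16.0]
print(euclidean_norm(x))     # 25.0
```

Note that, following the text, the "norm" computed here is the sum of squares; its square root is the module discussed in the next section.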

The Euclidean module of a vector
Once the Euclidean norm is defined by means of Eqs. (2) or (5), the associated module of any vector of the Euclidean-Banach space $V_N(\mathbb{R})$ is defined as the square root of the Euclidean norm:
$$\mu(|x\rangle) = \sqrt{e_2(|x\rangle)} \tag{6}$$
which, when the positive root is chosen, corresponds to the Euclidean distance between the vector and the origin of $V_N(\mathbb{R})$, the zero vector $|0\rangle$. However, nobody seems to have taken into account or discussed the fact that, because of the square root present in Eq. (6), Euclidean vector modules can also be assumed to be negative, with the same absolute value as the usual positive, distance-like modules.
This property of modules is of great importance when one wants to associate the whole real line ℝ with any canonical direction of a vector space. Any vector module might be thought of as a pair of symmetric ± real values, of which current practice, for obvious practical and classical definition purposes, considers the positive part only.

The unit sphere or shell
Once the Euclidean-Banach space is defined as in the previous section, any of its vectors can be normalized. This means that any vector can be scaled homothetically so that it is transformed into another vector possessing a unit module.
The usual way to perform such a homothetic scaling, except for the zero vector, which possesses a zero module, consists of using the inverse of the vector module $\mu(|x\rangle) = \sqrt{e_2(|x\rangle)}$, that is:
$$|\hat{x}\rangle = \mu(|x\rangle)^{-1}\, |x\rangle \in S_N(1) \tag{7}$$
where $S_N(1)$ denotes the unit sphere, unit shell, or 1-shell: the set of all the normalized vectors belonging to $V_N(\mathbb{R})$.
That is, the set of all the vectors $|\hat{x}\rangle$ of a Euclidean-Banach space bearing unit norm:
$$S_N(1) = \{\, |\hat{x}\rangle \in V_N(\mathbb{R}) \mid e_2(|\hat{x}\rangle) = 1 \,\} \tag{8}$$
Equation (8) shows that the elements of the set of all normalized vectors, the unit sphere or 1-shell $S_N(1)$, fulfill:
$$\forall\, |\hat{x}\rangle \in S_N(1): \; \sum_{I=1}^{N} \hat{x}_I^2 = 1 \tag{9}$$
In this sense, the set $S_N(1)$ can be considered as an N-dimensional sphere of radius 1, centered at the origin.
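The normalization step can be sketched in a few lines of Python. The names `module` and `normalize` are hypothetical helpers chosen for this illustration; the positive root is taken for the module, as in current practice.

```python
import math
from typing import List, Sequence


def module(v: Sequence[float]) -> float:
    """Positive module of a vector: square root of the sum of squared elements."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a non-zero vector by its inverse module so that it lands on the unit shell."""
    m = module(v)
    if m == 0.0:
        raise ValueError("the zero vector cannot be normalized")
    return [x / m for x in v]


u = normalize([3.0, 4.0, 12.0])
print(u)          # components of the unit-shell vector
print(module(u))  # approximately 1.0, up to floating-point rounding
```

The zero vector is rejected explicitly, since it is the only vector excluded from the scaling described above.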

The spheres of radius r or r-shells
This is so because any N-dimensional r-shell, or sphere of radius r centered at the origin, can be denoted by the set $S_N(r)$, defined in turn as:
$$S_N(r) = \{\, |x\rangle \in V_N(\mathbb{R}) \mid e_2(|x\rangle) = r^2 \,\} \tag{10}$$
so that all the vectors with a Euclidean norm equal to the same positive real number $r^2 \in \mathbb{R}^+$ belong to the vector set $S_N(r)$, the N-dimensional sphere of radius $r \in \mathbb{R}$, or r-shell.

Euclidean-Banach vectors as homothecies
The above definition amounts to saying that any vector in a Euclidean-Banach space $V_N(\mathbb{R})$ is a homothecy of a vector of the unit shell $S_N(1)$, with a scale factor equal to a given radius r. Therefore, any Euclidean-Banach space $V_N(\mathbb{R})$ can be constructed from the unit sphere or unit shell vector set $S_N(1)$ and the real line ℝ. Such a homothecy is general and can be applied to any vector space isomorphic to a Euclidean-Banach space. It can be succinctly noted by the equation:
$$V_N(\mathbb{R}) = \{\, r\,|\hat{x}\rangle \mid r \in \mathbb{R},\; |\hat{x}\rangle \in S_N(1) \,\} \tag{11}$$
The same construction can be written with the rational field ℚ in place of ℝ, if necessary, when thinking of the computational use of Euclidean-Banach spaces. This construction also amounts to saying that the spheres or shells of any radius, centered at the origin of any Euclidean-Banach space, are formed by the set of vectors of module r, starting at the origin (the zero vector) and ending at the surface of the sphere or r-shell $S_N(r)$.
Then, any Euclidean-Banach space can be seen as an infinite sequence of spheres or r-shells, generated by the infinite sequence of homothecies of the unit sphere $S_N(1)$ over the real or rational fields.
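As an illustration of this shell picture, the following Python sketch splits a vector into a radius and a unit-shell vector and rebuilds it by homothecy. The helper names `to_shell`, `homothecy`, and `on_shell`, as well as the tolerance used for the membership test, are assumptions made for this example only.

```python
import math
from typing import List, Sequence, Tuple


def to_shell(v: Sequence[float]) -> Tuple[float, List[float]]:
    """Split a non-zero vector into (r, u): its positive module r and a unit-shell vector u."""
    r = math.sqrt(sum(x * x for x in v))
    if r == 0.0:
        raise ValueError("the zero vector has no unit-shell representative")
    return r, [x / r for x in v]


def homothecy(r: float, u: Sequence[float]) -> List[float]:
    """Scale a unit-shell vector by the real factor r (which may be negative)."""
    return [r * x for x in u]


def on_shell(v: Sequence[float], r: float, tol: float = 1e-12) -> bool:
    """Check, up to a relative tolerance, that the sum of squares of v equals r squared."""
    return abs(sum(x * x for x in v) - r * r) <= tol * max(1.0, r * r)


r, u = to_shell([1.0, 2.0, 2.0])
print(r)                                  # 3.0
print(on_shell(u, 1.0))                   # True
print(on_shell(homothecy(-r, u), r))      # True: a negative factor reaches the same r-shell
```

The last line reflects the remark on signed modules: scaling a unit-shell vector by +r or -r produces two antipodal points of the same r-shell.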

Natural spheres or shells
Among the infinite variety of possible r-shells, one might underline the infinite sequence of natural shells, obtained via natural number homothecies, which can be noted by:
$$\forall n \in \mathbb{N}: \; S_N(n) = \{\, n\,|\hat{x}\rangle \mid |\hat{x}\rangle \in S_N(1) \,\}$$
Then, with the notation $S_N(\mathbb{N})$, one can describe the set of all N-dimensional vectors possessing a natural Euclidean norm and module. Among these natural spheres or shells, the prime spheres are to be highlighted.

Condensing a Euclidean-Banach space
Up to this point, no new information has been provided. However, this way of looking at the structure of Euclidean-Banach spaces yields interesting points of view for studying and using such spaces in QSPR problems.

Direct sum of Euclidean-Banach spaces
Suppose that two Euclidean-Banach spaces of dimensions P and Q are known, $V_P(\mathbb{R})$ and $V_Q(\mathbb{R})$, say. Then, calling N = P + Q the total dimension of the direct sum of both spaces, one can symbolically write:
$$V_N(\mathbb{R}) = V_P(\mathbb{R}) \oplus V_Q(\mathbb{R}) \tag{12}$$

Condensing one space in the direct sum
Using the information on the shell structure of Euclidean-Banach spaces, one can use Eq. (11) to rewrite the direct sum of Eq. (12) as:
$$V_N(\mathbb{R}) = V_P(\mathbb{R}) \oplus \{\, r\,|\hat{y}\rangle \mid r \in \mathbb{R},\; |\hat{y}\rangle \in S_Q(1) \,\} \tag{13}$$
which might be seen as a new formalism revealing that, the Q-dimensional unit shell being a constant set, one can simplify the structure of the direct sum by condensing the space $V_Q(\mathbb{R})$ into a one-dimensional line:
$$V_P(\mathbb{R}) \oplus V_Q(\mathbb{R}) \;\rightarrow\; V_P(\mathbb{R}) \oplus \mathbb{R} \tag{14}$$
In this way, the infinite variety of shells of $V_Q(\mathbb{R})$ homothetic to $S_Q(1)$ is condensed into a one-dimensional structure: instead of the whole space, one uses the signed modules of the vectors of the r-shells composing it, which just form a real line ℝ.
Looking more closely at Eq. (14), one can turn attention to the Euclidean spaces simply defined as direct sums of the real line ℝ:
$$\mathbb{R}^N = \bigoplus_{I=1}^{N} \mathbb{R}$$
where the most typical example is the usual three-dimensional space $\mathbb{R}^3$. In this structure, built with the real line ℝ as the basic building block, one can easily imagine that every space direction, composed of the set of real elements, is just the condensed direction of some M-dimensional space $V_M(\mathbb{R})$.
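A minimal Python sketch of the condensation step follows. It rests on an assumption made here for illustration: the Q trailing components of a vector are replaced by a single coordinate holding their signed module. The function name `condense` and its `sign` argument are hypothetical.

```python
import math
from typing import List, Sequence


def condense(v: Sequence[float], p: int, sign: float = 1.0) -> List[float]:
    """Keep the first p components and replace the remaining ones by their signed module.

    The sign argument selects the positive or the negative root of the module.
    """
    if not 0 <= p < len(v):
        raise ValueError("p must leave at least one component to condense")
    head, tail = list(v[:p]), v[p:]
    r = math.sqrt(sum(x * x for x in tail))
    return head + [math.copysign(r, sign)]


v = [1.0, 2.0, 3.0, 4.0]          # P = 2, Q = 2
print(condense(v, 2))             # [1.0, 2.0, 5.0]
print(condense(v, 2, sign=-1.0))  # [1.0, 2.0, -5.0]
```

Under this reading, the sum of squares of the condensed vector equals that of the original one, so the condensation discards the direction inside the Q-dimensional part but keeps the Euclidean norm.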

The QSPR framework as a condensed set of Euclidean vector spaces
The results of the previous sections indicate that the typical molecular space used in the classical QSPR framework can be seen as a set of vectors belonging to some Euclidean-Banach space.
This space structure is independent of the procedure employed to obtain a final relation between the vectors. QSPR vectors are constructed from rational molecular descriptors, which are used in turn to obtain a set of numerical vector images of the elements of some molecular set $\mathcal{M}$.
The descriptors leading to the discrete numerical molecular definition form a large pool of parameters, which grows steadily with time and computational facilities.
The dimension of such a classical molecular space corresponds to the cardinality of the set $\mathcal{M}$, $M = \mathrm{Card}(\mathcal{M})$. Therefore, such a molecular space can be generated in the way promoted here:
$$V_M(\mathbb{R}) = \bigoplus_{I=1}^{M} \mathbb{R}$$
However, every direction in $V_M(\mathbb{R})$ might be associated with a molecular structure belonging to the molecular set $\mathcal{M}$, because every molecule different from the rest has to be described as a linearly independent vector to avoid the dimensionality paradox [66,67].
Therefore, every molecular vector can be considered as the condensation of another Euclidean-Banach space of arbitrary dimension. As far as the author is aware, this point of view has never been discussed in the theory of classical QSPR at any level; see for example a modern assorted sample of references [68-78].
In this sense, the molecular description as an M-dimensional rational vector, or M-tuple, might correspond to a schematic vector obtained from another arbitrarily large Euclidean-Banach space.
This point of view shows that classical QSPR, whatever the associated computational structure, least squares or AI, will in any case be incomplete, a result that has been obtained earlier by other means [79,80].

Metric or signature vector
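Eq. (3), in association with the complete sum of vector elements, provides a very particular alternative definition of the scalar product.
This is so because one can define the inward product of two vectors including a third vector, the metric vector $|m\rangle$ of the Euclidean-Banach space in question, so that the scalar product becomes the complete sum $\langle\, |m\rangle * |x\rangle * |y\rangle \,\rangle$.
It is obvious that using the N-dimensional unity vector as a metric vector leaves the usual scalar product unchanged:
$$|m\rangle = |1_N\rangle = (1, 1, \ldots, 1)^T \;\rightarrow\; \langle\, |1_N\rangle * |x\rangle * |y\rangle \,\rangle = \langle\, |x\rangle * |y\rangle \,\rangle = \langle x|y\rangle$$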

Minkowskian-Banach spaces
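However, one can use the direct sum of two unity vectors of arbitrary dimensions $(P, Q)$, with $N = P + Q$, in the following way, where the Q-dimensional part carries the negative signs:
$$|m_N\rangle = |1_P\rangle \oplus \left(-|1_Q\rangle\right) \tag{15}$$
If the vector pair entering the inward product is also expressed as a direct sum of two parts, $|x\rangle = |x_P\rangle \oplus |x_Q\rangle$ and $|y\rangle = |y_P\rangle \oplus |y_Q\rangle$, the scalar product defined under the metric vector $|m_N\rangle$ of Eq. (15) corresponds to a complete vector sum, which can also be written in two parts:
$$\langle\, |m_N\rangle * |x\rangle * |y\rangle \,\rangle = \langle\, |x_P\rangle * |y_P\rangle \,\rangle - \langle\, |x_Q\rangle * |y_Q\rangle \,\rangle$$
Such an arrangement permits the definition of the N-dimensional $(P, Q)$ Minkowskian-Banach space, where the second-order norms, which one can now call Minkowskian, are constructed using the metric vector of Eq. (15):
$$\langle\, |m_N\rangle * |x\rangle * |x\rangle \,\rangle = \sum_{I=1}^{P} x_I^2 - \sum_{I=P+1}^{N} x_I^2$$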
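The role of the metric vector can be made concrete with a short Python sketch. It assumes, as a convention chosen for this example, that the first P entries of the metric vector are +1 and the last Q entries are -1; the function names are illustrative.

```python
from typing import List, Sequence


def metric_vector(p: int, q: int) -> List[float]:
    """Signature vector of a (P, Q) space: P entries equal to +1 followed by Q entries equal to -1."""
    return [1.0] * p + [-1.0] * q


def minkowski_product(m: Sequence[float], x: Sequence[float], y: Sequence[float]) -> float:
    """Complete sum of the triple inward product of the metric vector with x and y."""
    if not (len(m) == len(x) == len(y)):
        raise ValueError("all vectors must have the same dimension")
    return sum(mi * xi * yi for mi, xi, yi in zip(m, x, y))


def minkowski_norm(m: Sequence[float], x: Sequence[float]) -> float:
    """Second-order Minkowskian norm; unlike the Euclidean one it may be zero or negative."""
    return minkowski_product(m, x, x)


m = metric_vector(2, 1)
print(minkowski_norm(m, [3.0, 4.0, 5.0]))  # 0.0  (9 + 16 - 25)
print(minkowski_norm(m, [1.0, 0.0, 2.0]))  # -3.0 (1 + 0 - 4)
# With the unity vector as metric vector the Euclidean scalar product is recovered.
print(minkowski_product([1.0] * 3, [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```

The first two results show the main difference with the Euclidean case: non-zero vectors may have null or negative Minkowskian norms.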

Special relativistic spacetime as an example
The Minkowski space employed in the special theory of relativity is a (3, 1) Minkowski space, according to the nomenclature put forward here, where the three-dimensional part, P = 3, contains the space coordinates $x_1, x_2, x_3$ and the one-dimensional part, Q = 1, bears the time coordinate $ct$.
Such an example of widespread use is interesting to work with, because one can now refer to a spacetime model developed some years ago [57], where a (3, 3) Minkowskian-Banach space was naïvely described. Other related time structure models have also been described [81-85]. Such an extended spacetime can be easily written, in the light of the present discussion and according to the convention put forward here, as:
$$V_{(3,3)}(\mathbb{R}) = V_3(\mathbb{R}) \oplus V_3(\mathbb{R})$$
where the first three-dimensional part is spatial and the second temporal. In the present debate on vector space structure, one can recall the possibility of condensing the three-dimensional time part into a one-dimensional construction, which yields the special relativistic spacetime and can be written as:
$$V_{(3,3)}(\mathbb{R}) \;\rightarrow\; V_{(3,1)}(\mathbb{R}) = V_3(\mathbb{R}) \oplus \mathbb{R} \tag{16}$$
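The condensation of a three-dimensional time part into a single time coordinate can be tried numerically with the Python sketch below. The coordinates are hypothetical values in common units, the sign convention (space positive, time negative) is an assumption of this example, and the names `interval` and `condense_time` are illustrative.

```python
import math
from typing import List, Sequence


def interval(space: Sequence[float], time: Sequence[float]) -> float:
    """Minkowskian second-order norm: sum of squared space parts minus sum of squared time parts."""
    return sum(x * x for x in space) - sum(t * t for t in time)


def condense_time(time: Sequence[float]) -> List[float]:
    """Condense a multi-dimensional time part into one coordinate: its positive module."""
    return [math.sqrt(sum(t * t for t in time))]


space = [1.0, 2.0, 2.0]        # hypothetical space coordinates
time3 = [2.0, 3.0, 6.0]        # hypothetical three-dimensional time part, module 7
time1 = condense_time(time3)   # condensed one-dimensional time part

print(time1)                   # [7.0]
print(interval(space, time3))  # -40.0 in the (3, 3) space
print(interval(space, time1))  # -40.0 in the condensed (3, 1) space
```

In this sketch the Minkowskian norm is the same before and after condensation, because the condensed coordinate carries the module of the time part.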

Dimension condensation in Minkowskian-Banach spaces
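Thus, looking for a general point of view, Eqs. (13) and (14) can be rewritten in the case of the $(P, Q)$-dimensional Minkowskian-Banach space $V_{(P,Q)}(\mathbb{R})$ as:
$$V_{(P,Q)}(\mathbb{R}) = V_P(\mathbb{R}) \oplus \{\, r\,|\hat{y}\rangle \mid r \in \mathbb{R},\; |\hat{y}\rangle \in S_Q(1) \,\} \tag{17}$$
and in the condensed form as:
$$V_{(P,Q)}(\mathbb{R}) \;\rightarrow\; V_{(P,1)}(\mathbb{R}) = V_P(\mathbb{R}) \oplus \mathbb{R}$$
Therefore, the spacetime described in the paper [81] could be considered as one symmetric (3, 3) Minkowskian-Banach space, which can be condensed into the relativistic (3, 1) spacetime as in Eq. (16).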

Quantum similarity Minkowskian spaces
It is well known that quantum similarity matrices are metric matrices associated with a Minkowskian-Banach particularity, and this also holds for some extensions [86-89]. The theoretical background developed above is adequate for a further discussion of discrete Euclidean- or Minkowskian-Banach spaces.
It seems, though, that when one computes the metric matrix of a set of electronic density functions, as is done in quantum similarity practice, the metric matrix is not positive definite, contrary to what usually occurs in Euclidean-Banach spaces. Unless a metric vector like the one defined in Eq. (15) is taken into account, instead of the implicit unity vector customarily considered in linear algebra, the algebra of quantum molecular similarity spaces is not well defined.

Geometrical incompleteness of classical QSPR procedures
Such a classical panorama, associated with the unity vector $|1_N\rangle$ taken as a metric vector, is the one corresponding to the usual QSPR techniques. Moreover, it occurs whenever molecular structures are defined with sets of scalar descriptors collected into N-dimensional Euclidean vectors.
One must make clear that this situation is common to all the procedures aiming to obtain structure-property relations, even if complicated AI neural networks or other computational devices are employed to obtain QSAR-like results.
The problem of Minkowskian metric matrices has been previously discussed in two papers [90,91], but without providing a completely satisfactory solution. Therefore, a simple algorithmic structure has not been well defined for computational practice when such a Minkowskian, non-Euclidean property is used in calculations associated with the development of new methods in quantum similarity and quantum QSAR.
The present theoretical framework permits a sound solution, and thus a further computational development and extension of the quantum similarity theoretical background.
The answer is simple: it involves the use of a metric vector defined as in Eq. (15).
Moreover, such a solution can be extended to the computational framework of classical QSPR of any type, just by introducing into the attached and unavoidable Euclidean-Banach space a metric vector different from the usual unity vector. This transforms the classical Euclidean-Banach space into a Minkowskian-Banach space whenever scalar products have to be performed.
Perhaps classical QSPR procedures can be optimized in a manner that, as far as the author knows, has never been employed: transforming the usual Euclidean space into a Minkowskian one.
Furthermore, the possibility of enriching classical QSPR procedures with a Minkowskian metric shows that such widespread procedures, besides the incompleteness arising from the background of discrete dimensions, as discussed above and elsewhere within information theory [80,92,93], possess an additional geometrical incompleteness, based in principle on the Euclidean-Banach restriction on which all the classical QSPR computational background is built.

Construction of an N-dimensional isometric Minkowskian vector set
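Returning to the problem associated with quantum similarity and quantum QSPR: from the knowledge of an (N × N) quantum similarity metric matrix $\mathbf{Z}$, calculated most simply as the set of scalar products of a set of one-electron density functions, $P = \{\rho_I(\mathbf{r}) \mid I = 1, N\}$, one needs to find a set of N-dimensional vectors belonging to a Minkowski-Banach space:
$$\mathcal{P} = \{\, |p_I\rangle \mid I = 1, N \,\} \tag{18}$$
whose metric matrix is the same as that of the density function set P, that is:
$$\mathbf{Z}(\mathcal{P}) = \mathbf{Z}(P) = \mathbf{Z} = \left\{ z_{IJ} = \int \rho_I(\mathbf{r})\, \rho_J(\mathbf{r})\, d\mathbf{r} \right\} \tag{19}$$
If the vector set in Eq. (18) possesses the property (19), the set is said to be isometric to the density function set P.
To obtain the vector set $\mathcal{P}$ isometric to the set P, one can start with the secular equation of the metric matrix $\mathbf{Z}$ which, taking into account the symmetric nature of this matrix, $\mathbf{Z}^T = \mathbf{Z}$, can be written as:
$$\mathbf{Z}\mathbf{U} = \mathbf{U}\boldsymbol{\Theta}, \qquad \mathbf{U}^T\mathbf{U} = \mathbf{U}\mathbf{U}^T = \mathbf{I} \tag{20}$$
Equation (20) can be rewritten in the following way:
$$\mathbf{Z} = \mathbf{U}\boldsymbol{\Theta}\mathbf{U}^T \tag{21}$$
To proceed further, the eigenvalue matrix has to be written as a well-ordered set of values, separating the positive ones from the negative ones:
$$\boldsymbol{\Theta} = \mathrm{Diag}(\theta_1, \ldots, \theta_P, \theta_{P+1}, \ldots, \theta_N), \qquad \theta_I > 0 \;(I \le P), \quad \theta_I < 0 \;(I > P)$$
Considering that the inward absolute value of a diagonal matrix can be written as:
$$|\boldsymbol{\Theta}| = \mathrm{Diag}(|\theta_1|, |\theta_2|, \ldots, |\theta_N|)$$
one can suppose that the diagonal matrix of eigenvalues can be rewritten, with the aid of a Minkowski metric signature matrix $\mathbf{M}_N$, by means of the following expression:
$$\boldsymbol{\Theta} = \mathbf{M}_N\, |\boldsymbol{\Theta}|, \qquad \mathbf{M}_N = \mathrm{Diag}(+1, \ldots, +1, -1, \ldots, -1)$$
and therefore, defining a new real diagonal matrix:
$$\boldsymbol{\Lambda} = |\boldsymbol{\Theta}|^{1/2} = \mathrm{Diag}\left(|\theta_I|^{1/2}\right)$$
one can rewrite Eq. (21) in the following way:
$$\mathbf{Z} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{M}_N\boldsymbol{\Lambda}\mathbf{U}^T \tag{22}$$
Consequently, the isometric vectors can be defined as the set of columns of the matrix $\mathbf{P} = \boldsymbol{\Lambda}\mathbf{U}^T$:
$$|p_I\rangle = \boldsymbol{\Lambda}\, |u^{[T]}_I\rangle, \qquad I = 1, N \tag{23}$$
where the column vector set $\{ |u^{[T]}_I\rangle \mid I = 1, N \}$ is nothing else than the columns of the transpose $\mathbf{U}^T$ of the eigenvector matrix $\mathbf{U}$.
Thus one can see that the presence of the matrix of the signs of the eigenvalues, $\mathbf{M}_N$, is strictly necessary to define the metric matrix $\mathbf{Z}$ using the set of isometric vectors $\mathcal{P}$ defined in Eq. (23), as shown in Eq. (22).
Then, any scalar product of the isometric vector set $\mathcal{P}$ has to include the Minkowskian signature diagonal matrix in the corresponding expression.
That is, the set of (N × N) diagonal matrices and the set of N-dimensional vectors being isomorphic, it can be easily shown that:
$$\mathbf{M}_N = \mathrm{Diag}(|m_N\rangle) \;\rightarrow\; \langle p_I|\, \mathbf{M}_N\, |p_J\rangle = \langle\, |m_N\rangle * |p_I\rangle * |p_J\rangle \,\rangle$$
and accordingly, one can finally write the scalar products which isometrically construct the metric matrix $\mathbf{Z}$ in the form of a triple inward product:
$$z_{IJ} = \langle\, |m_N\rangle * |p_I\rangle * |p_J\rangle \,\rangle \tag{24}$$
The vector $|m_N\rangle$ can be referred to as the metric vector or metric signature vector for Minkowskian spaces.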
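The construction of the isometric vector set can be sketched in pure Python as follows. Several choices are assumptions of this example and not part of the source: the eigenproblem is solved with a textbook cyclic Jacobi rotation scheme (any symmetric eigen-solver would serve), a null eigenvalue would receive a +1 sign, and the 3 × 3 matrix is a hypothetical indefinite symmetric matrix, not actual similarity data. The names `jacobi_eigh`, `isometric_vectors`, and `metric_matrix` are illustrative.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def jacobi_eigh(z: Matrix, tol: float = 1e-12, max_sweeps: int = 100) -> Tuple[List[float], Matrix]:
    """Eigenvalues and eigenvectors (columns of U) of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(z)
    a = [row[:] for row in z]
    u = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                d = a[q][q] - a[p][p]
                # Rotation angle that annihilates the (p, q) element, kept within 45 degrees.
                theta = math.pi / 4.0 if d == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / d)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q of A
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q of A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate the rotation into U
                    ukp, ukq = u[k][p], u[k][q]
                    u[k][p], u[k][q] = c * ukp - s * ukq, s * ukp + c * ukq
    return [a[i][i] for i in range(n)], u


def isometric_vectors(z: Matrix) -> Tuple[List[float], Matrix]:
    """Return the signature vector m and the vectors p_I (one per row) built as columns of Lambda U^T."""
    theta, u = jacobi_eigh(z)
    n = len(z)
    order = sorted(range(n), key=lambda k: theta[k], reverse=True)  # positive eigenvalues first
    m = [1.0 if theta[k] >= 0.0 else -1.0 for k in order]
    lam = [math.sqrt(abs(theta[k])) for k in order]
    vectors = [[lam[j] * u[i][order[j]] for j in range(n)] for i in range(n)]
    return m, vectors


def metric_matrix(m: List[float], vectors: Matrix) -> Matrix:
    """Metric matrix from triple inward products of the signature vector with pairs of vectors."""
    return [[sum(mk * a * b for mk, a, b in zip(m, pi, pj)) for pj in vectors] for pi in vectors]


z = [[1.0, 2.0, 0.0], [2.0, 1.0, 1.0], [0.0, 1.0, 1.0]]  # hypothetical indefinite matrix
m, vectors = isometric_vectors(z)
print(m)                          # signature vector: +1 entries followed by -1 entries
print(metric_matrix(m, vectors))  # reproduces z up to rounding errors
```

Dropping the signature vector from `metric_matrix` would replace every eigenvalue by its absolute value, which is why the signs cannot be omitted when the matrix is not positive definite.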

Extension of the scalar products in Minkowski-Banach spaces
Once the scalar product of two vectors is defined, in addition to the Euclidean norms of Euclidean-Banach spaces, one can consider such spaces as Euclidean metric spaces.
Defining scalar products through the inward product coupled with the complete sum of a vector can be considered the first step toward defining higher-order scalar products. Some research on this topic has been previously performed; the interested reader can consult references [37,45,57,62-65] for deeper details.
Scalar products as defined in Eq. (24) can be considered second-order expressions:
$$\mathbf{Z}^{(2)} = \left\{ z_{IJ} = \langle\, |m_N\rangle * |p_I\rangle * |p_J\rangle \,\rangle \right\}$$
that can be extended to any order.
For instance, third-order scalar products, constructed like:
$$z_{IJK} = \langle\, |m_N\rangle * |p_I\rangle * |p_J\rangle * |p_K\rangle \,\rangle \tag{25}$$
define a third-order tensor, which can be written with the symbol $\mathbf{Z}^{(3)}$. In general, one can define a P-th order tensor $\mathbf{Z}^{(P)}$ using the construction:
$$z_{I_1 I_2 \ldots I_P} = \langle\, |m_N\rangle * |p_{I_1}\rangle * |p_{I_2}\rangle * \cdots * |p_{I_P}\rangle \,\rangle \tag{26}$$
Such higher-order scalar products have some interest in developing the theoretical background of quantum QSPR, as has been described earlier; see for example [64,87-89,93].
The tensor elements of Eq. (26) possess the extra generalization associated with the metric vector $|m_N\rangle$. In previous developments of quantum QSPR, the metric vector was ignored or, what amounts to the same, implicitly taken as the unity vector $|1_N\rangle$, which has no effect in expressions like (24), (25), or (26).
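Higher-order scalar products of this kind reduce to one inward product over several vectors followed by a complete sum, as the Python sketch below illustrates. The function names and the two-dimensional vectors are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from typing import List, Sequence


def inward_product(*vectors: Sequence[float]) -> List[float]:
    """Element-wise product of any number of vectors of equal dimension."""
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    result = [1.0] * n
    for v in vectors:
        result = [r * x for r, x in zip(result, v)]
    return result


def scalar_product(metric: Sequence[float], *vectors: Sequence[float]) -> float:
    """Complete sum of the inward product of the metric vector with the given vectors."""
    return sum(inward_product(metric, *vectors))


def tensor3(metric: Sequence[float], vectors: Sequence[Sequence[float]]) -> List[List[List[float]]]:
    """Third-order tensor whose elements are the triple scalar products of the vector set."""
    return [[[scalar_product(metric, a, b, c) for c in vectors] for b in vectors] for a in vectors]


m = [1.0, -1.0]                 # hypothetical (1, 1) signature vector
p = [[1.0, 2.0], [3.0, 1.0]]    # hypothetical vector set
print(scalar_product(m, p[0], p[1]))        # 1.0  (1*3 - 2*1)
print(scalar_product(m, p[0], p[0], p[1]))  # -1.0 (1*1*3 - 2*2*1)
print(tensor3(m, p)[0][0][1])               # -1.0, the same element read from the tensor
```

Replacing `m` by a vector of ones gives the expressions used when the metric vector is implicitly the unity vector.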

Development of a quantum QSPR algorithm
Quantum QSPR has evolved considerably since the first attempts to describe a theoretical framework providing basic mathematical background elements, aimed at dressing the empirical classical QSPR procedures with a broader point of view and looking for a working computational landscape whose horizon could lie beyond pure empiricism.
To achieve a sound theoretical setup and connect quantum QSPR with the development presented here, one can start from the fact that, knowing some Hermitian operator $\Omega(\mathbf{r})$ and an electronic density function, like those already described as forming part of a molecular set $P = \{\rho_I(\mathbf{r}) \mid I = 1, N\}$, quantum mechanics allows one to obtain a set of expectation values of the property associated with the operator through the integral:
$$\langle \pi_I \rangle = \int \Omega(\mathbf{r})\, \rho_I(\mathbf{r})\, d\mathbf{r}$$

The quantum expectation value in the isometric space
In the isometric space, the expectation value is expressed through an inward vector function of the isometric vector, acting now as the Hermitian operator, which can be developed as a Taylor series with coefficients $\{a_K\}$ in terms of the vectors $|p_I^{[K]}\rangle$ representing the K-th inward power of the corresponding isometric vector, that is:
$$|p_I^{[K]}\rangle = \left( p_{1I}^K, p_{2I}^K, \ldots, p_{NI}^K \right)^T$$
In this way, the properties attached to the inward function and the isometric vectors can be evaluated via a system of equations, Eq. (30), which for a set of known properties $\{\pi_I \mid I = 1, N\}$ of a given molecular set permits computing the coefficient set $\{a_K \mid K = 0, N - 1\}$. Then the inward function is determined approximately and can be used to compute values of the property for known molecules with an attached isometric vector. The snag in this formalism is that Eq. (30) has little or no predictive power.
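A rough Python sketch of such a system of equations is given below. Its functional form is an assumption made here, since the precise expression of Eq. (30) is not reproduced in this text: each property is modeled as $\pi_I \approx \sum_{K=0}^{N-1} a_K \langle\, |m_N\rangle * |p_I^{[K+1]}\rangle \,\rangle$, that is, the Taylor-expanded inward function multiplied inwardly by the isometric vector and the metric vector. All names and numerical values are hypothetical.

```python
from typing import List, Sequence


def inward_power(v: Sequence[float], k: int) -> List[float]:
    """K-th inward power of a vector: every element raised to the power k."""
    return [x ** k for x in v]


def design_matrix(metric: Sequence[float], vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Element (I, K): complete sum of the metric vector times the (K+1)-th inward power of p_I."""
    n = len(vectors)
    return [[sum(m * x for m, x in zip(metric, inward_power(p, k + 1))) for k in range(n)]
            for p in vectors]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("singular system: coefficients cannot be determined")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


metric = [1.0, -1.0]                  # hypothetical signature vector
vectors = [[1.0, 2.0], [3.0, 1.0]]    # hypothetical isometric vectors
properties = [-2.5, 6.0]              # hypothetical known property values
coefficients = solve(design_matrix(metric, vectors), properties)
print(coefficients)                   # approximately [1.0, 0.5]
```

With N molecules and N coefficients the fit is exact by construction, which illustrates why a system of this kind reproduces the known properties but offers little predictive power.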

The appropriate quantum QSPR algorithms
Other computational algorithms have been described so far; see for example references [49-54]. The algorithm of Eq. (30), though, is the simplest that one can formulate to solve a quantum QSPR problem.
A truly predictive quantum QSPR procedure consists of setting up a molecular set in which some molecules possess known properties and some possess unknown values. An equation of type (30) is set up for the subset of molecules with known properties, and the proposed algorithms use the whole molecular set to determine the values of the property for all the structures; see for example references [49-54], or the proposals of new applications and algorithms related to chemical problems and QSPR [94-98].

Discussion and conclusions
A schematic description of almost all the nuances associated with the connection of the quantum similarity background with QSPR problems has been developed.
On the path to this end, several ancillary problems have also been discussed. The structure of vector spaces with a defined norm, Banach spaces, is clarified using N-dimensional unit spheres or unit shells.
This vector space feature permits the definition of a condensation process of the space dimensions. As a result, such a possibility appears to connect classical QSPR procedures with some essential incompleteness, already put in evidence from the information theory point of view. Moreover, the possibility of transforming scalar products into inward vector products, and these into scalars via a complete vector sum, permits an easy description of a general algorithm to construct scalar products, norms, and modules of any order in any Euclidean or Minkowskian vector space.
Also, this alternative scalar product option can be seen as the first step toward building a general quantum QSPR algorithm.
The similarity matrices, involving scalar products of pairs of density functions, can be considered metric matrices, attached to the non-negative definite metric space where the density functions belong.
However, generally speaking, such similarity matrices might not be positive definite if associated with molecular sets, and consequently the space of molecular density functions should be studied from a Minkowskian rather than a Euclidean structure point of view.
Such a characteristic has impeded up to now the setup of a complete general algorithm to find a vector set isometric to the set of density functions attached to a set of molecular structures.
Such a possibility constitutes a sine qua non condition for developing a computationally sound quantum QSPR algorithm.
However, the definition of a metric signature vector, isomorphic to a diagonal matrix, provides an easy way to construct finite-dimensional isometric vector sets in Minkowskian spaces, like the ones appearing in quantum similarity.
Such a general possibility eases the path to the final construction of a complete quantum QSPR algorithm. Such an algorithm merges the inward powers of any order of the isometric vectors with a quantum mechanical way to obtain expectation values of some Hermitian operator.
Even if quantum QSPR is based on quantum similarity and is set up within molecular spaces to avoid the dimensionality paradox, such a framework does not prevent classical QSPR procedures from also benefiting from an equivalent outline, where Minkowskian spaces can be employed as a set of extra degrees of freedom.
In any classical QSPR technique, a finite-dimensional image of a set of molecules is constructed as a first step. Apart from always presenting the dimensionality paradox problem and the inherent extended incompleteness discussed in the present paper, one is also facing another very difficult (or perhaps impossible) requirement.
It consists of the hard task of taking into account different molecular conformations or optical isomers in the molecular description. Because of this fact, another element of incompleteness might be added to the previous fuzzy bundle of such techniques.
One can conclude that because of all the previously discussed nuances, quantum QSPR corresponds to a very general algorithmic procedure, encompassing the classical QSPR framework as a schematic particular case.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.