QSPR analysis of some novel neighborhood degree based topological descriptors

A topological index is a numerical value associated with the chemical constitution of a compound, used to correlate chemical structure with various physical properties, chemical reactivity or biological activity. In this work, some new indices based on the neighbourhood degree sum of nodes are proposed. To make the computation of the novel indices convenient, an algorithm is designed. A QSPR analysis of these newly introduced indices is carried out, which reveals their predictive power. Some mathematical properties of these indices are also discussed.


Introduction
Graph theory is a significant part of applied mathematics for modelling real-life problems. Chemical graph theory, a fascinating branch of graph theory, provides much information on chemical compounds using an important tool called the topological index [1,2]. Theoretical molecular descriptors, alias topological indices, are graph invariants that play an important role in chemistry, pharmaceutical sciences, materials science, engineering and so forth. Their role in QSPR/QSAR analysis [3,4,5,6,7], to model physical and chemical properties of molecules, is also remarkable. Among the several types of topological indices, vertex degree based [8] topological indices are the most investigated and widely used. The first vertex degree based topological index, known as the connectivity index or Randić index, was proposed in 1975 by M. Randić [9]. The connectivity index is defined by
\[ R(G) = \sum_{uv \in E(G)} \frac{1}{\sqrt{d_G(u)\, d_G(v)}}, \]
where $d_G(u)$, $d_G(v)$ represent the degrees of the nodes $u$, $v$ in the vertex set $V(G)$ of a molecular graph $G$, and $E(G)$ is the edge set of $G$. By a molecular graph, we mean a simple connected graph whose vertices are the atoms of a chemical compound and whose edges are the chemical bonds between them. The inverse Randić index [10] is given by
\[ RR(G) = \sum_{uv \in E(G)} \sqrt{d_G(u)\, d_G(v)}. \]
The Zagreb indices, introduced by Gutman and Trinajstić [11], are defined as follows:
\[ M_1(G) = \sum_{v \in V(G)} d_G(v)^2 = \sum_{uv \in E(G)} \left[ d_G(u) + d_G(v) \right], \qquad M_2(G) = \sum_{uv \in E(G)} d_G(u)\, d_G(v). \]
Furtula et al. [12] have introduced the forgotten topological index as follows:
\[ F(G) = \sum_{v \in V(G)} d_G(v)^3 = \sum_{uv \in E(G)} \left[ d_G(u)^2 + d_G(v)^2 \right]. \]
B. Zhou and N. Trinajstić have designed the sum connectivity index [13], which is as follows:
\[ SCI(G) = \sum_{uv \in E(G)} \frac{1}{\sqrt{d_G(u) + d_G(v)}}. \]
The symmetric division degree index [14] is defined as
\[ SDD(G) = \sum_{uv \in E(G)} \left[ \frac{d_G(u)}{d_G(v)} + \frac{d_G(v)}{d_G(u)} \right]. \]
The redefined third Zagreb index [15] is defined by
\[ ReZG_3(G) = \sum_{uv \in E(G)} d_G(u)\, d_G(v) \left[ d_G(u) + d_G(v) \right]. \]
For more on degree based topological indices, readers are referred to the articles [16,17,18,19,20,21]. Recently, the present authors introduced some new indices [22,23] based on the neighbourhood degree sum of nodes.
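As a continuation, we present here some new topological indices, named the first NDe index ($ND_1$), second NDe index ($ND_2$), third NDe index ($ND_3$), fourth NDe index ($ND_4$), fifth NDe index ($ND_5$), and sixth NDe index ($ND_6$), and defined as
\[
\begin{aligned}
ND_1(G) &= \sum_{uv \in E(G)} \sqrt{\delta_G(u)\, \delta_G(v)}, &
ND_2(G) &= \sum_{uv \in E(G)} \frac{1}{\sqrt{\delta_G(u) + \delta_G(v)}}, \\
ND_3(G) &= \sum_{uv \in E(G)} \delta_G(u)\, \delta_G(v) \left[ \delta_G(u) + \delta_G(v) \right], &
ND_4(G) &= \sum_{uv \in E(G)} \frac{1}{\sqrt{\delta_G(u)\, \delta_G(v)}}, \\
ND_5(G) &= \sum_{uv \in E(G)} \left[ \frac{\delta_G(u)}{\delta_G(v)} + \frac{\delta_G(v)}{\delta_G(u)} \right], &
ND_6(G) &= \sum_{uv \in E(G)} \left[ d_G(u)\, \delta_G(u) + d_G(v)\, \delta_G(v) \right],
\end{aligned}
\]
where $\delta_G(u)$ is the sum of the degrees of all vertices adjacent to $u \in V(G)$, i.e., $\delta_G(u) = \sum_{v \in N_G(u)} d_G(v)$, $N_G(u)$ being the set of vertices adjacent to $u$. The goal of this article is to check the chemical applicability of these newly designed indices and to discuss some bounds on them in terms of other topological descriptors, in order to characterise the indices mathematically. We organise the results into two parts. The first part starts with an algorithm for computing the indices; statistical regression analysis is then carried out to check the efficiency of the novel indices in modelling physical and chemical properties. We then test their degeneracy. This part ends with a comparative study of these indices with other topological indices. The second part deals with some mathematical relations between these indices and some other well-known indices.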

Computational aspects
In this section, we design an algorithm to make the computation of the novel indices convenient.
To keep it simple and understandable, we use a few variables and matrices. The matrix conn[E][2] stores the connection details among the vertices, whereas the matrices deg[V][2] and δ[V][2] store the degree and the neighbourhood degree sum of each vertex, respectively. The novel indices can be considered as function of
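The computation described in this section can be sketched in Python as follows. This is only an illustrative sketch, not the authors' algorithm: the names `conn`, `deg` and `delta` mirror the matrices mentioned in the text but are stored here as an edge list and dictionaries, the function names are hypothetical, and the six formulas written in the comments are the forms of $ND_1$ to $ND_6$ assumed for this example. The graph is assumed to be simple, with every edge listed exactly once.

```python
import math
from collections import defaultdict


def neighbourhood_degree_sums(conn):
    """Return (deg, delta) for a simple graph given as an edge list `conn`.

    deg[u]   = number of neighbours of u
    delta[u] = sum of the degrees of the neighbours of u
    """
    adj = defaultdict(set)
    for u, v in conn:
        adj[u].add(v)
        adj[v].add(u)
    deg = {u: len(nbrs) for u, nbrs in adj.items()}
    delta = {u: sum(deg[v] for v in nbrs) for u, nbrs in adj.items()}
    return deg, delta


def nd_indices(conn):
    """Sum the assumed edge contributions of ND1..ND6 over all edges."""
    deg, delta = neighbourhood_degree_sums(conn)
    nd = {name: 0.0 for name in ("ND1", "ND2", "ND3", "ND4", "ND5", "ND6")}
    for u, v in conn:
        a, b = delta[u], delta[v]
        nd["ND1"] += math.sqrt(a * b)         # sqrt(delta(u) * delta(v))
        nd["ND2"] += 1.0 / math.sqrt(a + b)   # 1 / sqrt(delta(u) + delta(v))
        nd["ND3"] += a * b * (a + b)          # delta(u) delta(v) (delta(u) + delta(v))
        nd["ND4"] += 1.0 / math.sqrt(a * b)   # 1 / sqrt(delta(u) * delta(v))
        nd["ND5"] += a / b + b / a            # delta(u)/delta(v) + delta(v)/delta(u)
        nd["ND6"] += deg[u] * a + deg[v] * b  # d(u) delta(u) + d(v) delta(v)
    return nd


# Hydrogen-suppressed graph of n-butane: the path 0 - 1 - 2 - 3.
butane = [(0, 1), (1, 2), (2, 3)]
print(nd_indices(butane))
```

For the path on four vertices the neighbourhood degree sums are 2, 3, 3, 2, so, under the assumed formulas, $ND_1 = 3 + 2\sqrt{6}$ and $ND_6 = 28$.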

Newly introduced indices in QSPR analysis
In this section, we study the ability of the newly designed topological indices to model physico-chemical properties [acentric factor (Acent Fac.), entropy (S), enthalpy of vaporization (HVAP), standard enthalpy of vaporization (DHVAP), and heat capacity at constant pressure (CP)] of the octane isomers and physical properties [boiling points (bp), molar volumes (mv) at 20 °C, molar refractions (mr) at 20 °C, heats of vaporization (hv) at 25 °C, surface tensions (st) at 20 °C and melting points (mp)] of the 67 alkanes from n-butane to the nonanes. The experimental values of the physico-chemical properties of the octane isomers (Table 1) are taken from www.moleculardescriptors.eu. The data on the 67 alkanes (Table 9) are compiled from [16]. We first consider the octane isomers and then the 67 alkanes.

Regression model for octane isomers:
We have tested the following linear regression model:
\[ P = c + m\,(TI), \]
where $P$ is the physical property, $TI$ is the topological index, and $c$ and $m$ are the intercept and the slope. Using the above formula, we have the following linear regression models for the different neighbourhood degree based topological indices. 1. $ND_1$ index: [...]

Now we describe the above linear models in the following tables. Here c, m, r, SE, F, SF stand for intercept, slope, correlation coefficient, standard error, F-test, and significance F, respectively. The correlation coefficient tells how strong the linear relationship is. The standard error of the regression is the precision with which the regression coefficient is measured. Significance F is useful for checking whether the results are reliable: if this value is less than 0.05, the model is statistically significant; if it is greater than 0.05, it is probably better to stop using that set of independent variables.

Now we depict the above correlations in the following figures.

Table 9: Experimental values of physical properties for 67 alkanes.

Now we use the same statistical parameters as in the previous discussion to interpret the above regression models, where N denotes the total number of alkanes.

Several interesting observations can be made on the data presented in Tables 3-16. From Table 3, the correlation coefficients of the $ND_1$ index with entropy, acentric factor and DHVAP for octane isomers are found to be good (Figure 1). In particular, it is strongly correlated with the acentric factor, with correlation coefficient r = -0.9904. The correlation of this index is also good for the physical properties of the 67 alkanes, except for cp and mp, which have correlation coefficients -0.6941 and 0.2516, respectively. For the remaining properties, the correlation coefficients range from 0.7436 to 0.8981.
The QSPR analysis of the $ND_2$ index reveals that this index is suitable for predicting the entropy, acentric factor and DHVAP of octane isomers (Figure 2). Also, one can [...]. Surprisingly, the correlation of $ND_2$ with hv is very high, with a correlation coefficient of 0.9638.

Table 13 shows that the $ND_3$ index is inadequate for any structure-property correlation in the case of alkanes, with correlation coefficients ranging from 0.2036 to 0.7318. However, Table 5 shows that $ND_3$ is well correlated with entropy and acentric factor, with correlation coefficients -0.9387 and -0.9765, respectively.

The QSPR analysis of the $ND_4$ index shows that this index is well correlated with entropy, acentric factor, DHVAP, and HVAP for octane isomers (Table 6). Table 14 shows that the $ND_4$ index is inadequate for structure-property correlation in the case of alkanes, except for cp and hv, which have correlation coefficients -0.8634 and 0.8679, respectively.

The QSPR analysis of the $ND_6$ index reveals that the correlation of this index with the physical properties of alkanes is very poor (Table 16): the correlation coefficients range from 0.2192 to 0.7823. However, Table 8 shows that this index is able to model the entropy, acentric factor, and DHVAP of octane isomers.
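The statistics used throughout this section (intercept c, slope m, correlation coefficient r, standard error SE and F) can be obtained from an ordinary least-squares fit of a property against an index. The following sketch shows one way to compute them in plain Python. The function name `linear_fit` and the sample numbers are hypothetical and are not taken from the tables of this paper. Significance F is left out because it requires the F distribution, which the Python standard library does not provide.

```python
import math


def linear_fit(ti, p):
    """Least-squares fit of p = c + m * ti, with r, SE and F.

    Assumes at least three data points and a non-constant `ti` and `p`.
    """
    n = len(ti)
    if n != len(p) or n < 3:
        raise ValueError("need two equally long samples with at least 3 points")
    mean_x = sum(ti) / n
    mean_y = sum(p) / n
    sxx = sum((x - mean_x) ** 2 for x in ti)
    syy = sum((y - mean_y) ** 2 for y in p)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ti, p))

    m = sxy / sxx                       # slope
    c = mean_y - m * mean_x             # intercept
    r = sxy / math.sqrt(sxx * syy)      # Pearson correlation coefficient

    sse = sum((y - (c + m * x)) ** 2 for x, y in zip(ti, p))  # residual sum of squares
    ssr = syy - sse                                           # regression sum of squares
    se = math.sqrt(sse / (n - 2))       # standard error of the regression
    f = ssr / (sse / (n - 2)) if sse > 0 else math.inf        # F statistic, 1 and n-2 d.o.f.
    return {"c": c, "m": m, "r": r, "SE": se, "F": f}


# Hypothetical index values and property values, for illustration only.
index_values = [1.0, 2.0, 3.0, 4.0]
property_values = [2.1, 3.9, 6.2, 7.8]
print(linear_fit(index_values, property_values))
```

For a model with a single predictor, F and r are linked by $F = r^2 (n-2)/(1-r^2)$, so a correlation coefficient close to ±1 always comes with a large F.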

Correlation with some well-known indices
In this section, we investigate the correlation between the new indices and some well-known indices for octane isomers. It is clear from Table 17 that the new indices, except the $ND_5$ index, are highly correlated with the well-established indices. The highest correlation coefficient (r = 0.9977) is between $ND_1$ and $M_2$. From Table 18, one can say that $ND_5$ has a significantly low correlation coefficient with the other indices. So we can conclude that $ND_5$ is independent of the other five new indices. A correlation graph (Figure 7) is drawn by taking the indices as vertices, two vertices being adjacent if and only if |r| ≥ 0.95.
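The construction of such a correlation graph can be sketched as follows: compute the Pearson correlation coefficient for every pair of indices and join two indices when |r| reaches the threshold 0.95. The function names and the toy data below are hypothetical; they are not the index values of the octane isomers. Each series is assumed to be non-constant, since otherwise r is undefined.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_graph(index_values, threshold=0.95):
    """Return the edge set {(a, b)} of index names with |r(a, b)| >= threshold."""
    edges = set()
    for a, b in combinations(sorted(index_values), 2):
        if abs(pearson(index_values[a], index_values[b])) >= threshold:
            edges.add((a, b))
    return edges


# Toy data: I1 and I2 are almost proportional, I3 is unrelated to both.
toy = {
    "I1": [1.0, 2.0, 3.0, 4.0],
    "I2": [2.0, 4.0, 6.0, 8.1],
    "I3": [4.0, 1.0, 3.0, 2.0],
}
print(correlation_graph(toy))
```

An index that ends up as an isolated vertex of this graph, as described for $ND_5$ in the text, carries information that the other indices do not duplicate.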

Degeneracy
The objective of a topological index is to encode as much structural information as possible. A good topological descriptor should distinguish between different structural formulae. A major drawback of most topological indices is their degeneracy, i.e., two or more isomers possess the same value of the index. Topological indices with high discriminating power capture more structural information. We use the measure of degeneracy known as sensitivity, introduced by Konstantinova [24], which is defined as follows:
\[ S_I = \frac{N - N_I}{N}, \]
where $N$ is the total number of isomers considered and $N_I$ is the number of them that cannot be distinguished by the topological index $I$. As $S_I$ increases, the isomer-discrimination power of the topological index increases. The vertex degree based topological indices have more discriminating power in comparison with other classes of molecular descriptors. For octane and decane isomers, the newly introduced indices perform well among the investigated degree based indices (Table 19).
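The sensitivity measure can be computed directly from the list of index values of the isomers. The sketch below is an illustration under two assumptions that are not stated in the text: two isomers are treated as indistinguishable when their index values agree after rounding to `decimals` places, and the function name `sensitivity` is hypothetical.

```python
from collections import Counter


def sensitivity(values, decimals=4):
    """Sensitivity S_I = (N - N_I) / N of an index over a set of isomers.

    N_I counts the isomers whose (rounded) index value is shared with
    at least one other isomer.
    """
    if not values:
        raise ValueError("at least one isomer is required")
    counts = Counter(round(v, decimals) for v in values)
    n_indistinguishable = sum(c for c in counts.values() if c > 1)
    return (len(values) - n_indistinguishable) / len(values)


# Hypothetical index values for seven isomers: five of them share a value
# with another isomer, so S_I = (7 - 5) / 7.
print(sensitivity([1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0]))
```

A value of 1 means that the index separates all isomers, while a value of 0 means that no isomer has a unique index value.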

Mathematical properties
In this section, we discuss some bounds on the newly proposed indices in terms of some well-known indices. Throughout this section, we consider simple connected graphs. The results rely on some standard inequalities. We start with the following inequality.
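Proposition 1. For a graph $G$ having $m$ edges with neighbourhood version of second Zagreb index $M_2^*(G)$ [23], we have
\[ ND_1(G) \le \sqrt{m\, M_2^*(G)}, \tag{3} \]
where equality holds iff $G$ is a regular or complete bipartite graph.
Proof. By the Cauchy-Schwarz inequality, we have
\[ \Big( \sum_{uv \in E(G)} \sqrt{\delta_G(u)\, \delta_G(v)} \Big)^2 \le \sum_{uv \in E(G)} 1^2 \sum_{uv \in E(G)} \delta_G(u)\, \delta_G(v). \tag{4} \]
Now, using the definitions of the $ND_1$ and $M_2^*$ indices, we easily obtain the required bound (3). Equality in (4) holds iff $\delta_G(u)\delta_G(v) = k$, a constant, for all $uv \in E(G)$. So the equality in (3) holds iff $G$ is a regular or complete bipartite graph.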
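Lemma 2. Let $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ be sequences of real numbers. Also let $z = (z_1, z_2, \ldots, z_n)$ and $w = (w_1, w_2, \ldots, w_n)$ be non-negative sequences. Then
\[ \sum_{i=1}^{n} w_i \sum_{i=1}^{n} z_i x_i^2 + \sum_{i=1}^{n} z_i \sum_{i=1}^{n} w_i y_i^2 \ge 2 \sum_{i=1}^{n} z_i x_i \sum_{i=1}^{n} w_i y_i. \tag{5} \]
In particular, if $z_i$ and $w_i$ are positive, then the equality holds iff $x = y = k$, where $k = (k, k, \ldots, k)$ is a constant sequence.
Proposition 2. For a graph $G$ having $m$ edges with neighbourhood version of second Zagreb index $M_2^*(G)$, we have
\[ ND_1(G) \le \frac{M_2^*(G) + m}{2}, \tag{6} \]
where equality holds iff $G$ is $P_2$.
Proof. Considering $x_i = \sqrt{\delta_G(u)\, \delta_G(v)}$, $y_i = 1$, $z_i = 1$ and $w_i = 1$ for every edge $uv \in E(G)$ in Lemma 2, we get
\[ m \sum_{uv \in E(G)} \delta_G(u)\, \delta_G(v) + m \cdot m \ge 2 m \sum_{uv \in E(G)} \sqrt{\delta_G(u)\, \delta_G(v)}. \]
After using the definitions of the $ND_1$ and $M_2^*$ indices, we obtain $m M_2^*(G) + m^2 \ge 2m\, ND_1(G)$. After simplification, the required bound is obvious. From Lemma 2, the equality in (6) holds iff $\delta_G(u)\, \delta_G(v) = 1$ for all $uv \in E(G)$, i.e., $G$ is $P_2$.
Remark: By arithmetic mean ≥ geometric mean, we can write $\frac{M_2^*(G) + m}{2} \ge \sqrt{m\, M_2^*(G)}$. So the upper bound of $ND_1(G)$ obtained in Proposition 1 is better than that obtained in Proposition 2.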
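The two upper bounds on $ND_1$ discussed above can be checked numerically on small graphs. The sketch below assumes the forms $ND_1(G) = \sum \sqrt{\delta_G(u)\delta_G(v)}$ and $M_2^*(G) = \sum \delta_G(u)\delta_G(v)$ (sums over the edges) and the bounds $\sqrt{m M_2^*(G)}$ and $(M_2^*(G) + m)/2$, as read from the surrounding proofs. The function name and the choice of test graphs are hypothetical.

```python
import math
from collections import defaultdict


def nd1_and_bounds(edges):
    """Return (ND1, sqrt(m * M2*), (M2* + m) / 2) for a simple graph.

    `edges` lists every edge once; delta[u] is the sum of the degrees
    of the neighbours of u.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    deg = {u: len(nbrs) for u, nbrs in adj.items()}
    delta = {u: sum(deg[v] for v in nbrs) for u, nbrs in adj.items()}

    m = len(edges)
    products = [delta[u] * delta[v] for u, v in edges]
    nd1 = sum(math.sqrt(p) for p in products)
    m2_star = sum(products)
    return nd1, math.sqrt(m * m2_star), (m2_star + m) / 2


graphs = {
    "P2": [(0, 1)],                        # single edge
    "P4": [(0, 1), (1, 2), (2, 3)],        # path on four vertices
    "K1,3": [(0, 1), (0, 2), (0, 3)],      # star, complete bipartite
    "C4": [(0, 1), (1, 2), (2, 3), (3, 0)] # cycle, regular
}
for name, edges in graphs.items():
    nd1, bound_cs, bound_am = nd1_and_bounds(edges)
    # ND1 <= sqrt(m M2*) <= (M2* + m) / 2, up to rounding error
    assert nd1 <= bound_cs + 1e-12 <= bound_am + 2e-12
    print(name, round(nd1, 4), round(bound_cs, 4), round(bound_am, 4))
```

On these examples the first bound is attained by $P_2$, the star and the cycle, while the second bound is attained only by $P_2$, in line with the equality cases stated in the two propositions.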
Proposition 3. For a graph $G$ having second Zagreb index $M_2(G)$, forgotten topological index $F(G)$, neighbourhood version of hyper Zagreb index $HM_N(G)$ [23], and neighbourhood Zagreb index $M_N(G)$ [22], we have [...] where equality holds iff $G$ is $P_2$.
Proof. For a graph $G$, [...]. We know that for any two non-negative numbers $x$, $y$, arithmetic mean ≥ geometric mean, i.e., $\frac{x+y}{2} \ge \sqrt{xy}$, where equality holds iff $x = y$. Now considering [...] and squaring both sides, we have [...]. After simplifying and using the formulations of the $ND_6$, $F$, $M_2$, $HM_N$, and $M_N$ indices, the required bound is clear. The equality in (7) [...].
For a graph $G$, consider $\Delta_N = \max\{\delta_G(u) : u \in V(G)\}$ and $\delta_N = \min\{\delta_G(u) : u \in V(G)\}$. Thus $\delta_N \le \delta_G(u) \le \Delta_N$ for all $u \in V(G)$. Equality holds iff $G$ is a regular or complete bipartite graph. Clearly we have the following proposition.
Proposition 4. For a graph $G$ with $m$ edges, we have the following bounds. [...]
Equality holds in each case iff $G$ is a regular or complete bipartite graph.
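Lemma 3. Let $a_i$ and $b_i$ be two sequences of real numbers with $a_i \ne 0$ $(i = 1, 2, \ldots, n)$ and such that $p a_i \le b_i \le P a_i$. Then
\[ \sum_{i=1}^{n} b_i^2 + pP \sum_{i=1}^{n} a_i^2 \le (p + P) \sum_{i=1}^{n} a_i b_i. \]
Equality holds iff either $b_i = p a_i$ or $b_i = P a_i$ for every $i = 1, 2, \ldots, n$.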
Proposition 5. For a graph $G$ with $m$ edges having neighbourhood version of second Zagreb index $M_2^*(G)$, we have [...] Equality holds iff $G$ is a regular or complete bipartite graph.
Proof. [...] Which implies [...] Equality holds iff $\delta_G(u)\delta_G(v) = \delta_N$ or $\delta_G(u)\delta_G(v) = \Delta_N$ for all $uv \in E(G)$, i.e., $G$ is a regular or complete bipartite graph. Hence the proof.
[...] Equality in both cases holds iff $G$ is a regular or complete bipartite graph.
Proof. [...] Equality holds iff $\frac{\delta_G(u) + \delta_G(v)}{\delta_G(u)\, \delta_G(v)} = k$, a constant, for all $uv \in E(G)$. That is, $\delta_G(u) = \text{some constant} \times \delta_G(v)$ for all $uv \in E(G)$, i.e., $G$ is a regular or complete bipartite graph. (ii) By the Cauchy-Schwarz inequality, we have [...] Equality holds iff $\delta_G(u) = \Delta_N = \delta_G(v)$ and $\delta_G(u) + \delta_G(v) = c$, a constant, occur simultaneously for all $uv \in E(G)$. That is, $G$ is a regular or complete bipartite graph. Hence the proof.
It is obvious that $\delta_G(u) \ge d_G(u)$ and $\delta_G(v) \ge d_G(v)$ for all $uv \in E(G)$. Equality appears for $P_2$ only. Keeping this fact in mind, we have the following proposition. [...] Equality holds in each case iff $G$ is $P_2$.

Conclusion
In this article, we have proposed some novel topological indices based on the neighbourhood degree sum of the end vertices of edges. Their predictive ability has been tested using octane isomers and 67 alkanes. It has been shown that these indices can be considered useful molecular descriptors in QSPR research. These indices are extensions of some well-known degree based topological indices (such as RR, SCI, SDD, R, etc.). The predictive power of these new indices is sometimes superior and sometimes slightly inferior to that of the old indices. However, the degeneracy test in Table 19 confirms the superiority of the newly designed indices over the old indices. We have also correlated these indices with other degree based topological indices. This investigation, reported in Tables 17 and 18, concludes that the $ND_5$ index is independent among all the novel indices. This work ends with the computation of some bounds on these novel indices. For further research, these indices can be computed for various graph operations and for some composite graphs and networks.