Graph realization of sets of integers

Wawrzyniak, Piotr; Formanowicz, Piotr

doi:10.1007/s10910-024-01642-4

Original Paper
Open access
Published: 24 June 2024


Journal of Mathematical Chemistry


Piotr Wawrzyniak¹ & Piotr Formanowicz¹


Abstract

Graph theory is used in many areas of chemical sciences, especially in molecular chemistry. It is particularly useful in the structural analysis of chemical compounds and in modeling chemical reactions. One of its applications concerns determining the structural formula of a chemical compound. This can be modeled as a variant of the well-known graph realization problem. In the classical version of the problem, a sequence of natural numbers is given, and the question is whether there exists a graph in which the vertices have degrees equal to the given numbers. In the variant considered in this paper, instead of a sequence of natural numbers, a sequence of sets of natural numbers is given, and the question is whether there exists a multigraph such that each of its vertices has a degree equal to a number from one of the sets. This variant of the graph realization problem matches the nature of the problem of determining the structural formula of a chemical compound better than other variants considered in the literature. We propose a polynomial time exact algorithm solving this variant of the problem.


1 Introduction

In the last decades, graph theory has found numerous applications in chemical sciences. It is extensively used in the structural analysis of chemical compounds, where vertices represent atoms and edges represent chemical bonds, allowing chemists to predict molecular properties and behaviors [1, 2]. Additionally, graph theory aids in understanding and predicting chemical reaction pathways by modeling reaction networks and studying kinetics and mechanisms [3, 4]. It also plays a crucial role in studying isomers, identifying structural and stereoisomers, and understanding their different chemical properties [5, 6].

One problem that graph theory helps to solve is the determination of the structural formula of a chemical compound on the basis of its chemical formula [7,8,9]. In other words, given the numbers of atoms of each element in a molecule of the compound, one has to determine how these atoms are connected by chemical bonds. This means that the structural formula, which shows the arrangement of atoms and the bonds between them, must be identified. Such problems appear, among others, in analyses performed using mass spectrometers [10,11,12].

The above-mentioned problem can be modeled (at least to some extent) as the well-known graph realization problem (or graphic sequence problem). In this problem, a sequence of natural numbers is given, and the question is whether there exists a graph whose vertices have degrees equal to the numbers from this sequence. The first algorithms for this problem were proposed in the 1950s and 1960s by Havel [13] and Hakimi [14].

One of the variants of the problem which has not been considered so far concerns a sequence of sets of positive integer numbers, not single numbers, as in the basic version of the problem. In other words, there is given a sequence of sets of positive integer numbers, and the question is whether it is possible to construct an undirected multigraph in which vertices have degrees corresponding to numbers from the sets (i.e., the vertex degree should be equal to one of the numbers from the set).

This problem corresponds to the problem of determining a structural formula of a chemical compound, where there may be parallel bonds (edges), since there can exist multiple bonds between pairs of atoms, and the number of bonds created by a given atom is equal to its valency. The fact that in this version of the problem there is a sequence of sets of numbers (not single numbers as in the basic version) follows from the well known property of chemical elements that many of them have more than one valency.

The basic variant of the problem considered by Havel and Hakimi concerns connected undirected graphs without self loops and parallel edges. However, many other variants of the problem for undirected and directed graphs with and without self loops or parallel edges were considered [15,16,17].

None of the variants of the graph realization problem known from the literature matches the problem of determining the structural formula as exactly as the one presented in this paper. Here, we define the problem formally and propose a polynomial algorithm for solving it.

The organization of the paper is as follows. In Sect. 2 the problem is formulated and its basic properties are described. In Sect. 3 a polynomial algorithm is proposed, while in Sect. 4 its properties are analyzed. The paper ends with conclusions in Sect. 5.

2 Description of the problem

2.1 Sequence of degrees

Consider a non-increasing sequence of positive integers describing the degrees of graph vertices:

Definition 1

$$\begin{aligned} S = (s_1, s_2, \dots , s_n), \text {where } s_1 \ge s_2 \ge \dots \ge s_n \end{aligned}$$

To determine whether it can be used to construct a molecular graph (i.e., a connected multigraph without loops), we must ensure that the following three criteria are fulfilled.

The basic one, common to all types of graphs, is the handshaking lemma, which states that the sum of all vertex degrees has to be even. This follows from the simple fact that each edge contributes exactly two to the sum of the degrees. It is also a sufficient condition for the existence of multigraphs with loops, referred to in some publications as pseudographs.

Condition (1)

$$\begin{aligned} \sum _{i=1}^n s_i=2e: e \in \mathbb {N} \end{aligned}$$

Because we are interested in the basic class of multigraphs, we need to exclude loops. For this to be possible, each vertex must have enough other vertices to connect to. Since multiple edges are allowed, it suffices to check the vertex with the highest degree: any other vertex of degree i can be connected, using i edges, to a vertex of degree j, where $j \ge i$. For the non-increasing list of vertex degrees, where $s_1$ is the maximum degree [18], the condition can be written as:

Condition (2)

$$\begin{aligned} \sum _{i=2}^n s_i \ge s_1 \end{aligned}$$

The last condition, which leads us to the class of molecular graphs, checks whether a given sequence can be used to build a connected graph. This potential connectivity condition says that at least one of the graphs that can be constructed for a given degree sequence is connected. It requires that the number of edges be at least the number of vertices minus one [18]. Expressing the number of edges through the sum of the vertex degrees, the condition is as follows:

Condition (3)

$$\begin{aligned} \sum _{i=1}^n s_i \ge 2(n-1) \end{aligned}$$
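As an illustration, the three conditions can be checked directly for a given degree sequence. The following is a minimal sketch; the function name `is_molecular_degree_sequence` and the treatment of an empty sequence are our assumptions and do not come from the paper.

```python
def is_molecular_degree_sequence(degrees):
    """Check conditions (1)-(3) for a sequence of positive vertex degrees."""
    s = sorted(degrees, reverse=True)  # s[0] plays the role of s_1
    if not s:
        return False  # assumption: an empty sequence is rejected
    total = sum(s)
    even_sum = total % 2 == 0                      # condition (1)
    no_loop_needed = total - s[0] >= s[0]          # condition (2)
    can_be_connected = total >= 2 * (len(s) - 1)   # condition (3)
    return even_sum and no_loop_needed and can_be_connected


print(is_molecular_degree_sequence([4, 3, 3, 4, 4]))  # all three conditions hold
print(is_molecular_degree_sequence([1, 1, 1, 1]))     # condition (3) fails
```

Sorting is used only to locate the maximum degree; the checks themselves are linear in the length of the sequence.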

2.2 Sequence of degree sets

Let us move from a simple sequence of vertex degrees to a sequence of vertex degree sets:

Definition 2

Instance: a sequence $D = (D_1, D_2, \dots , D_n)$, where each $D_j=\{d_{j,1}, d_{j,2}, \dots , d_{j,g_j}\}$ is a set of positive integers.

Answer: an undirected graph or multigraph $G=(V,E)$ such that the degree of every vertex $v_i \in V$ belongs to $D_i$.

In relation to the base problem from Sect. 2.1, we need to find a sequence of vertex degrees $R = (r_1, r_2, \dots , r_n)$, where $r_i \in D_i$, which is graphic. An illustration of this concept can be found in Fig. 1.

Due to the selection of degrees from their sets, the selected degree from the first set need not be the maximum one. So we need to specify the index f, which indicates the set of degrees from which the maximum degree is derived. Under this assumption, the described conditions can be modified to the following:

Condition (1’)

$$\begin{aligned} \sum _{j=1}^n{r_j} = 2e: e \in \mathbb {N} \end{aligned}$$

Condition (2’)

$$\begin{aligned} \sum _{j \ne f}^n{r_j} \ge r_f: r_f=\max _{1 \le x \le n}{r_x} \end{aligned}$$

Condition (3’)

$$\begin{aligned} \sum _{j=1}^n{r_j} \ge 2(n-1) \end{aligned}$$

To check whether the sequence of degree sets can be used to construct a graph, we only need to check whether any sequence of degrees R satisfies all three presented conditions. The sequence R is an element of the Cartesian product $\mathcal{R}$ of all degree sets:

$$\begin{aligned} R \in \mathcal{R} \text { where } \mathcal{R} = D_1 \times D_2 \times \ldots \times D_n \end{aligned}$$

The total number of such sequences to check is given by the following formula:

$$\begin{aligned} |\mathcal{R}| = \prod _{j=1}^{n} |D_j| \end{aligned}$$

As is easy to see, this number grows exponentially with the number of sets. If l unique degrees are distributed over two-element sets, there are $l/2$ sets and hence $2^{\frac{l}{2}}$ sequences of degrees to check.
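The exhaustive check over the Cartesian product can be sketched as follows. This is only an illustrative baseline with exponential running time, not the algorithm proposed in the paper; the function names and the example sets are hypothetical.

```python
from itertools import product


def satisfies_conditions(r):
    """Conditions (1'), (2') and (3') for one selected degree sequence r."""
    total, r_max = sum(r), max(r)
    return (total % 2 == 0                    # (1') even sum
            and total - r_max >= r_max        # (2') no loop needed
            and total >= 2 * (len(r) - 1))    # (3') enough edges to connect


def brute_force_realization(degree_sets):
    """Return the first selection satisfying all conditions, or None."""
    # Sets are sorted only to make the enumeration order deterministic.
    for r in product(*(sorted(d) for d in degree_sets)):
        if satisfies_conditions(r):
            return r
    return None


sets = [{1, 2}, {3, 4}, {5, 6}]     # l = 6 unique degrees in three sets
print(len(list(product(*sets))))    # 2 ** (6 // 2) = 8 candidate sequences
print(brute_force_realization(sets))
```

Such a baseline is mainly useful for cross-checking a faster procedure on small instances.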

However, let us modify condition (2’) by adding the maximum selected degree $r_f$ to both sides. After this operation we get:

Condition (2”)

$$\begin{aligned} \sum _{j = 1}^n{r_j} \ge 2r_f: r_f=\max _{1 \le x \le n}{r_x} \end{aligned}$$

Now it can be seen that conditions (2”) and (3’) both check whether the sum of the selected vertex degrees is greater than or equal to a doubled quantity: the maximum degree in (2”) and the number of vertices minus one in (3’).

It follows that instead of considering all possible combinations of degrees, it is enough to take, from each set, the maximum degree that is less than or equal to the selected degree $r_f$. Since the conditions then have to be checked only as many times as there are different vertex degrees, the problem can be reduced to one of polynomial complexity, provided that choosing the degrees less than or equal to $r_f$ also has polynomial complexity. This operation is just choosing the maximal value less than or equal to a given value, so it is linear in the worst case of an unsorted set.

$$\begin{aligned} &\exists {r_f \in D_f}:\\ &(1')\quad \sum _{j=1}^n{\max \{d_{j,k} : d_{j,k} \le r_f\}} = 2e : e \in \mathbb {N}\\ &(2'')\quad \sum _{j=1}^n{\max \{d_{j,k} : d_{j,k} \le r_f\}} \ge 2r_f\\ &(3')\quad \sum _{j=1}^n{\max \{d_{j,k} : d_{j,k} \le r_f\}} \ge 2(n-1) \end{aligned}$$

Additionally, we can dispense with indicating individual vertices and consider only their sum

$$\begin{aligned} u_f = \sum _{j=1}^n{\max \{d_{j,k}: d_{j,k} \le r_f\}} \end{aligned}$$

since we are considering a decision problem.
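The sum $u_f$ can be computed with one pass over each set. The sketch below assumes the sets are given as Python sets of positive integers; the function name `capped_maximum_sum` is hypothetical, and `None` is our way of signalling that some set has no degree less than or equal to $r_f$.

```python
def capped_maximum_sum(degree_sets, r_f):
    """Sum over all sets of the largest degree not exceeding r_f.

    Returns None if some set has no degree <= r_f, in which case r_f
    cannot serve as the maximum degree of a realization.
    """
    total = 0
    for degrees in degree_sets:
        allowed = [d for d in degrees if d <= r_f]
        if not allowed:
            return None
        total += max(allowed)
    return total


example_sets = [{2, 4}, {1, 3}, {4}]          # hypothetical instance
print(capped_maximum_sum(example_sets, 4))    # 4 + 3 + 4 = 11
print(capped_maximum_sum(example_sets, 3))    # the set {4} has no value <= 3
```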

The above considerations do not yet take into account condition (1’), i.e., the parity requirement for the sum of all vertex degrees. If the sum of the maximum degrees $u_f$ satisfies conditions (2”) and (3’) and is an even number, the answer to the graph construction question is $\mathbb {YES}$. However, if the sum $u_f$ is odd, it is impossible to construct a graph with exactly these degrees. Therefore, a procedure must be introduced to fix such a sum to an even number.

To change an odd sum to an even one, we need to replace one degree with another degree from the same set that has a different parity. If this is possible, subtracting the new degree from the previous one gives an odd difference $p_f$ by which the sum is corrected. This difference should, of course, be minimal in order to keep satisfying conditions (2”) and (3’), which are inequalities of the greater-than-or-equal-to type.

$$\begin{aligned} p_f =&\min _{1 \le j \le n}\{\max \{d_{j,k} : d_{j,k} \le r_f\}\\&- \max \{d_{j,k'} : {{d_{j,k'} < d_{j,k}} \wedge {parity(d_{j,k'}) \ne parity(d_{j,k})}}\}\} \end{aligned}$$

In the above parity correction procedure, it may happen that we change the maximum degree $r_f$, which would lead to a change of the condition bodies. However, changing the maximum degree $r_f$ to a different value $r_{f'}$ from the set $D_f$ would be the same as checking these conditions unmodified with this new degree $r_{f'}$ as the maximal one. Thus, such a degree change will be taken into account by the described procedure in another step and need not be processed.

Corollary 1

To satisfy the parity condition (1’), only a fix by an odd difference applied to the sum of the maximum degrees from sets other than $D_f$, the set providing the maximum vertex degree, is required. Since the resulting sum should also be as large as possible, we need to find the minimum odd difference between one of the degrees under consideration and a smaller degree in the corresponding set of degrees.

Now our new $q_f$ fix looks like:

$$\begin{aligned} q_f =&\min _{j \ne f}\{\max \{d_{j,k} : d_{j,k} \le r_f\}\\&- \max \{d_{j,k'} : {{d_{j,k'} < d_{j,k}} \wedge {parity(d_{j,k'}) \ne parity(d_{j,k})}}\}\} \end{aligned}$$

and the conditions take the following final form:

$$\begin{aligned} \exists {{r_f \in D_f} \wedge {q_f}}:\\ \text {(1') }&u_f - q_f = 2e : e \in \mathbb {N}\\ \text {(2'') }&u_f - q_f \ge 2r_f\\ \text {(3') }&u_f - q_f \ge 2(n-1) \end{aligned}$$

Of course, if the value $q_f$ exists, condition (1’) is always true. Its existence is not guaranteed: if all degrees in the input sequence D are odd, or all are even, there is no odd difference. Satisfying conditions (2”) and (3’) then answers our question about the existence of any graph with vertex degrees from the sets in the sequence D.
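The final form of the conditions can be turned into a direct decision procedure that tries every candidate maximum degree $r_f \in D_f$. The sketch below is a straightforward, unoptimized reading of these conditions and not the matrix-based algorithm of Sect. 3; the names `parity_fix` and `is_realizable` are hypothetical, all sets are assumed to be non-empty, and the parity fix is applied only when the sum is odd.

```python
def parity_fix(degrees, cap):
    """Smallest odd decrease available in one set, or None.

    It is the gap between the largest degree <= cap and the largest
    smaller degree of the opposite parity.
    """
    allowed = [d for d in degrees if d <= cap]
    top = max(allowed)
    other = [d for d in allowed if d % 2 != top % 2]
    return top - max(other) if other else None


def is_realizable(degree_sets):
    """Decide whether some selection of degrees meets (1'), (2'') and (3')."""
    n = len(degree_sets)
    for f, d_f in enumerate(degree_sets):
        for r_f in d_f:
            # Every set must offer a degree not exceeding r_f.
            if any(min(d) > r_f for d in degree_sets):
                continue
            u = sum(max(x for x in d if x <= r_f) for d in degree_sets)
            if u % 2 == 1:
                # Odd sum: apply the minimum odd fix from a set other than D_f.
                fixes = [parity_fix(d, r_f)
                         for j, d in enumerate(degree_sets) if j != f]
                fixes = [q for q in fixes if q is not None]
                if not fixes:
                    continue
                u -= min(fixes)
            if u >= 2 * r_f and u >= 2 * (n - 1):
                return True
    return False


print(is_realizable([{4}, {3, 4}, {1, 3}, {4}, {4}]))  # hypothetical instance
print(is_realizable([{2}, {2}, {1}]))                  # odd sum with no fix
```

Each candidate is processed in time linear in the total number of input degrees, so this sketch is polynomial but slower than the bounds discussed for the matrix formulation.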

3 Polynomial algorithm

Below we present a polynomial algorithm for calculating these values.

The algorithm’s input, a sequence of sets, is transferred into a matrix V. The matrix rows are labeled with all the unique vertex degrees sorted in descending order. If a degree d is present in both sets $D_x$ and $D_y$, it appears on the list only once. Such sorting takes time of order $n\log {n}$. The matrix columns are labeled with the indices j of the sets $D_j$ from the input sequence D.

Into such a matrix with m rows (i.e., degrees) and n columns (i.e., sets of degrees), we rewrite the values from the input as follows:

$$\begin{aligned} v_{i,j} = {\left\{ \begin{array}{ll} d_i &{} \text {if } d_i \in D_j, \\ v_{i+1,j} &{} \text {otherwise, if } i < m, \\ - &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

The first case copies the degree into the cells whose row label occurs in the set of the given column. The second case fills the missing values in all rows except the last one ($i<m$) with the value of the cell in the row below. The last case marks all values still unset in the last row as non-existent, using the minus sign −. The check $d_i \in D_j$ can take constant time if, while constructing the sorted list of unique degrees (i.e., the rows of the matrix V), we store for each $d_i$ the list of indices j of the sets that contain it. So when we fill the table starting from the last row, i.e., in ascending order of degrees, filling each cell takes constant time. Filling the whole matrix thus has a complexity equal to its size. In the worst case, it is $O(l^2)$, where l is the number of all input degrees. This worst case occurs when each set of degrees consists of a single element and each value is unique.

In this way, each row with all cells filled contains the maximum degrees from all sets $D_j$ that are less than or equal to the value $d_i$ labeling the given row. Of course, at least one of these values equals the row’s label $d_i$, because this is a mandatory condition for creating the row. For further calculations, we can use only such completely filled rows. It is also possible that all the rows are completely filled. Such row marking takes time equal to the size of the table and can be performed together with the first step of filling the matrix cells.

During this matrix processing, we can also compute the sum $t_i$ of all cells of each completed row, which is an extra value used in the final graphicality check.

$$\begin{aligned} t_i = {\left\{ \begin{array}{ll} \sum _{j=1}^{n}v_{i,j} &{} \text {if } \forall {j\text { }v_{i,j} \text { is set}}, \\ - &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
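The construction of the matrix V and of the row sums $t_i$ can be sketched as below. This is a minimal illustration under our own assumptions: `None` stands for the minus sign −, rows are indexed from 0, the instance is hypothetical (it is not the instance of Fig. 1), and membership is tested with Python sets rather than with the stored index lists described in the text.

```python
def build_matrix(degree_sets):
    """Return (labels, v, t) for a sequence of degree sets.

    labels: unique degrees in descending order (row labels d_i)
    v:      v[i][j] is the largest degree of D_j not exceeding labels[i]
    t:      t[i] is the sum of row i if the row is complete, else None
    """
    labels = sorted(set().union(*degree_sets), reverse=True)
    m, n = len(labels), len(degree_sets)
    v = [[None] * n for _ in range(m)]
    # Fill from the last row (smallest degree) upwards.
    for i in range(m - 1, -1, -1):
        for j, degrees in enumerate(degree_sets):
            if labels[i] in degrees:
                v[i][j] = labels[i]
            elif i < m - 1:
                v[i][j] = v[i + 1][j]  # inherit from the row below
    t = [sum(row) if None not in row else None for row in v]
    return labels, v, t


labels, v, t = build_matrix([{4}, {3, 4}, {1, 3}, {4}, {4}])
for d, row, row_sum in zip(labels, v, t):
    print(d, row, row_sum)
```

In this hypothetical instance only the row labeled 4 is complete, so it is the only row that can be used in the later checks.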

An example of this, derived from the instance depicted in Fig. 1 and presented in the form of an initialized matrix, can be found in Table 1.

Table 1 Matrix V with filled d, h and v values


For filled rows, some cells (at least one) can be selected as the current maximum degree. These cells have a value equal to the row’s label $d_i$, reflecting values in the input sets. Such a maximum degree must be compared to the sum of the remaining degrees in the constructed graph, as in condition (2’). Therefore, for cells with a value equal to the row label, we calculate the maximum sum of the remaining degrees. It is simply the calculated sum of the cells of the row minus the current cell value.

The maximum sums of the vertex degrees are sufficient for checking conditions (2’) and (3’). Condition (1’), which requires the sum of all vertex degrees to be even, remains to be considered. The maximum sum of the degrees can be odd, and if this happens, one of the vertex degrees should change its parity from even to odd or from odd to even. Because every vertex already has its maximum allowed degree, this change is done by decreasing the degree value. Of course, this modification is done within the values allowed by the set of degrees $D_j$ of this vertex. Obviously, due to conditions (2’) and (3’), it is preferable to reduce the degree of the vertex as little as possible.

For this purpose, we introduce the maximum even and odd values less than or equal to the current row label $d_i$. They are computed by the same rules as the values $v_{i,j}$, extended with a parity check, so this task can be performed simultaneously and depends only on the number of cells in the matrix V, i.e., $O(l^2)$.

At the same time, we can calculate the odd fix $p_{i,j}$ for the maximum degree. If a cell has both an even and an odd maximum value less than or equal to the row label, the fix is simply the difference between them. If this value cannot be calculated, we mark this with the minus sign −.

$$\begin{aligned} z^e_{i,j} = {\left\{ \begin{array}{ll} d_i &{} \text {if } d_i \in D_j \wedge d_i \text { is even},\\ z^e_{i+1,j} &{} \text {otherwise, if } i < m,\\ - &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

$$\begin{aligned} z^o_{i,j} = {\left\{ \begin{array}{ll} d_i &{} \text {if } d_i \in D_j \wedge d_i \text { is odd},\\ z^o_{i+1,j} &{} \text {otherwise, if } i < m,\\ - &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

$$\begin{aligned} p_{i,j} = {\left\{ \begin{array}{ll} |z^e_{i,j}-z^o_{i,j}| &{} \text {if }z^e_{i,j} \text { is set} \wedge z^o_{i,j} \text { is set},\\ - &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
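The values $z^e_{i,j}$, $z^o_{i,j}$ and $p_{i,j}$ can be computed in the same bottom-up pass as the matrix V. The following sketch uses our own conventions: `None` replaces the minus sign −, rows are indexed from 0, and the function name `parity_tables` and the example instance are hypothetical.

```python
def parity_tables(degree_sets):
    """Return (labels, z_even, z_odd, p) for a sequence of degree sets."""
    labels = sorted(set().union(*degree_sets), reverse=True)
    m, n = len(labels), len(degree_sets)
    z_even = [[None] * n for _ in range(m)]
    z_odd = [[None] * n for _ in range(m)]
    p = [[None] * n for _ in range(m)]
    for i in range(m - 1, -1, -1):
        d = labels[i]
        for j, degrees in enumerate(degree_sets):
            if i < m - 1:
                # Inherit both parities from the row below.
                z_even[i][j] = z_even[i + 1][j]
                z_odd[i][j] = z_odd[i + 1][j]
            if d in degrees:
                # The row label overrides the value of its own parity.
                if d % 2 == 0:
                    z_even[i][j] = d
                else:
                    z_odd[i][j] = d
            if z_even[i][j] is not None and z_odd[i][j] is not None:
                p[i][j] = abs(z_even[i][j] - z_odd[i][j])
    return labels, z_even, z_odd, p


labels, z_even, z_odd, p = parity_tables([{4}, {3, 4}, {1, 3}, {4}, {4}])
print(z_even[0], z_odd[0], p[0])
```

Because one of the two stored values always equals the largest allowed degree, their absolute difference is the smallest odd decrease available in that cell.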

For further insights into this process, one can refer to Table 2, which includes a matrix with these calculated values.

Table 2 Matrix V with filled z and p values


As mentioned in Corollary 1, a change in degrees other than the maximum degree currently under consideration is required. It requires the computation of a minimum odd difference of degrees from vertices other than the current one.

$$\begin{aligned} q_{i,j} = \min _{1 \le k \le n \wedge k \not = j \wedge p_{i,k}\text { is set}}p_{i,k} \end{aligned}$$

This operation has to be performed for each cell, and scanning the entire row for every cell would require O(n) time per cell of the matrix V. However, the operation can be optimized, because for each cell we exclude only one value, namely the fix of this cell. It is enough that in the first step we remember the two smallest values for each row (constant time per cell). Then, for each cell, we choose the lower of the two values, excluding the current cell’s value.

$$\begin{aligned} q_{i,j} = \min (\{q_{i,min_1}, q_{i,min_2}\} \setminus \{p_{i,j}\}) \end{aligned}$$

With this way of computing the minimum odd difference of degrees for the other vertices, the complexity remains $O(n^2)$, where n is the size of the matrix.
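The two-smallest-values trick can be sketched for a single row as follows. Note one deliberate difference from the formula: the sketch excludes the current cell by its column index rather than by its value, which is our assumption about the intended behavior when two cells share the same smallest fix. The function name is hypothetical and `None` stands for the minus sign −.

```python
def min_fix_excluding_each(p_row):
    """For each column j, the minimum of p_row over all columns k != j."""
    best = second = None  # (value, column) of the two smallest fixes
    for k, value in enumerate(p_row):
        if value is None:
            continue
        if best is None or value < best[0]:
            best, second = (value, k), best
        elif second is None or value < second[0]:
            second = (value, k)
    q_row = []
    for j in range(len(p_row)):
        if best is not None and best[1] != j:
            q_row.append(best[0])      # the smallest fix is in another column
        elif second is not None:
            q_row.append(second[0])    # fall back to the second smallest
        else:
            q_row.append(None)         # no fix available outside column j
    return q_row


print(min_fix_excluding_each([None, 1, None, None, None]))
print(min_fix_excluding_each([5, 2]))
```

Both passes over the row take constant time per cell, so the whole row is processed in time linear in its length.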

Regarding the adjusted parity condition (1’), let us calculate the sums of all degrees $u_{i,j}$ corrected by the parity fix. This is needed only for those cells that represent the maximum degree (the current row label), $d_i \in D_j$, which can be easily checked by the simple comparison $v_{i,j} = d_i$.

$$\begin{aligned} u_{i,j} = {\left\{ \begin{array}{ll} t_i &{} \text {if } d_i \in D_j \wedge t_i\text { is set} \wedge t_i \text { is even} \\ t_i - q_{i,j} &{} \text {if } d_i \in D_j \wedge t_i\text { is set} \wedge t_i \text { is odd} \wedge q_{i,j}\text { is set} \\ - &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

The final results of these calculations for the example instance from Fig. 1 are shown in Table 3.

Table 3 Matrix V with filled q, t, and u values


With such calculated values, we can check all conditions:

(1’) $u_{i,j}\text { is set}$; $u_{i,j}$ has an even value by its definition

(2”) $u_{i,j} \ge 2d_i$

(3’) $u_{i,j} \ge 2(n-1)$

If any cell in the matrix satisfies these conditions, a graph can be constructed using degrees given by the input sequence D. The appropriate degrees $v_{i,j}$ from the sets $D_j$ are in the row containing a cell that satisfies these conditions. If a parity correction was used, its value should be subtracted from the degree in the cell that provided this correction.

4 Properties of the algorithm

4.1 Computational complexity

Let us analyze the algorithm for determining graphicality:

As we can see, the algorithm’s runtime is bounded by

$$\begin{aligned} m \times \log (m) + m \times n \end{aligned}$$

where m is the number of unique degrees and n is the number of sets in the input sequence.

The memory complexity is also very small, equal to O(n) thanks to row-by-row processing. More precisely, the maximum memory usage is limited by the number of sets in the input sequence D.

Let us examine a few types of input sequences with different numbers of sets and various values of vertex degrees. The tests divide l degree values into the following numbers of sets, with the following maximum values of vertex degrees:

(a) max degree $d_{max} = l \times 4 \implies m \leqslant l$; number of sets $n = l\div 4$; theoretical complexity $O(l^2)$
(b) max degree $d_{max} = l\div 2 \implies m \leqslant l\div 2$; number of sets $n = l\div 2$; theoretical complexity $O(l^2)$
(c) max degree $d_{max} = l\times 4 \implies m \leqslant l$; number of sets $n = \sqrt{l}\times 4$; theoretical complexity $O(l\sqrt{l})$
(d) max degree $d_{max} = \sqrt{l}\times 4 \implies m \leqslant \sqrt{l}\times 4$; number of sets $n = l\div 4$; theoretical complexity $O(l\sqrt{l})$
(e) max degree $d_{max} = \sqrt{l}\times 4 \implies m \leqslant \sqrt{l}\times 4$; number of sets $n = \sqrt{l}\times 4$; theoretical complexity $O(l)$
(f) max degree $d_{max} = \sqrt{l}\times 2 \implies m \leqslant \sqrt{l}\times 2$; number of sets $n = \sqrt{l}\times 2$; theoretical complexity $O(l)$
(g) max degree $d_{max} = 7 \implies m \leqslant 7$; number of sets $n = l\div 2$; theoretical complexity $O(l)$
(h) max degree $d_{max} = 7 \implies m \leqslant 7$; number of sets $n = l\div 4$; theoretical complexity $O(l)$

For each of the above cases, 3,530 tests were run for random l values from 100 to 100,000. In order to simulate the worst-case scenario, the tests were not interrupted when a positive answer was found. The collected results are presented in Fig. 2.

As the charts show, the measured time complexity of the tests matches the theoretical one. The match is seen especially for long-running tests (up to $2{\times }10^{10}$ ns, i.e., 20 s), where the fitted exponent of the power law for the time complexity, with a coefficient of determination near 1, equals the theoretical one to within about one percent. For short-running tests (below 1 s), these values are probably more affected by random disturbances.

4.2 Results of the algorithm

The algorithm answers YES or NO to the question: ‘Can we construct a molecular graph (i.e., a connected multigraph without loops) using vertices with degrees matching the input sequence of degree sets?’ This check is done by testing three conditions for each cell. What more can be said about the graph matching this cell? With a single cell, we can only say YES or NO, but let us take a look at the whole row for the example instance in Table 4.

Table 4 Row from matrix V with a cell fulfilling the graphicality conditions

In the variables $v_{i,j}$, we have the maximum degree from the set $D_j$ less than or equal to the degree $d_i$. These degrees satisfy all conditions except the one requiring an even sum of degrees. To satisfy this condition, we have to find in row i a cell x other than j that has an odd fix $p_{i,x}$ equal to $q_{i,j}$, the minimum odd fix over the cells in row i other than the currently considered cell i, j. For such a cell i, x, we take not $v_{i,x}$, the current maximum degree from the set $D_x$ less than or equal to $d_i$, but the maximum value of the opposite parity, which is stored in $z^e_{i,x}$ or $z^o_{i,x}$. In the above example, $q_{2,4}=1$, and only one cell matches this value. It is $p_{2,2}=1$, so the cell to fix has the column index x equal to 2. In this cell, $v_{2,2}=4$ is even, so we have to replace it with the odd value $z^o_{2,2}=3$. As a result, we obtain the degrees detailed in Table 5.

Table 5 Vertex degrees fulfilling the graphicality conditions

These degrees 4, 3, 3, 4, 4 meet all conditions needed for constructing a molecular graph; the exact edges between the vertices can then be obtained with other algorithms.

Condition (1’)

$$\begin{aligned} &\sum _{i=1}^n d_{i, r_i} = 2e: e \in \mathbb {N} \\ &4+3+3+4+4=18=2\times 9 \end{aligned}$$

Condition (2”)

$$\begin{aligned} &\sum _{i=1}^n d_{i, r_i} \ge 2d_{f, r_f}:d_{f, r_f}=\max _{1 \le j \le n}{d_{j, r_j}} \\ &4+3+3+4+4=18 \ge 8=2\times 4 \end{aligned}$$

Condition (3’)

$$\begin{aligned} &\sum _{i=1}^n d_{i, r_i} \ge 2(n-1)\\ &4+3+3+4+4=18 \ge 8=2\times (5-1) \end{aligned}$$
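The recovery of the concrete degrees for a chosen cell can be sketched as below. The instance used here is hypothetical: it is not the instance of Fig. 1, which is not reproduced in the text, but it was chosen so that the outcome matches the degrees 4, 3, 3, 4, 4 discussed above. The function name `select_degrees` and the 0-based column index `f` are our conventions.

```python
def select_degrees(degree_sets, f, r_f):
    """Degrees for candidate maximum r_f taken from set f, or None."""
    if r_f not in degree_sets[f]:
        return None
    n = len(degree_sets)
    chosen = []
    for degrees in degree_sets:
        allowed = [d for d in degrees if d <= r_f]
        if not allowed:
            return None
        chosen.append(max(allowed))
    if sum(chosen) % 2 == 1:
        best = None  # (odd decrease, column, replacement degree)
        for x, degrees in enumerate(degree_sets):
            if x == f:
                continue  # the maximum degree itself is never changed
            other = [d for d in degrees
                     if d < chosen[x] and d % 2 != chosen[x] % 2]
            if other and (best is None or chosen[x] - max(other) < best[0]):
                best = (chosen[x] - max(other), x, max(other))
        if best is None:
            return None  # parity cannot be corrected
        chosen[best[1]] = best[2]
    total = sum(chosen)
    if total >= 2 * r_f and total >= 2 * (n - 1):
        return chosen
    return None


# Column 4 of the text corresponds to index 3 here.
print(select_degrees([{4}, {3, 4}, {1, 3}, {4}, {4}], 3, 4))
```

For this instance the maxima 4, 4, 3, 4, 4 sum to 19, and the only available odd fix lowers the second degree from 4 to 3, which gives the sequence 4, 3, 3, 4, 4.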

In the algorithm, we search for only one combination of degrees, although others may exist. The algorithm finds the result with the maximum degrees (Fig. 3), such as the sequence 4, 3, 3, 4, 4 from the example. Another combination of degrees with the maximum degree 4, but with the vertex degree from $D_3$ changed from 3 to 1 (Fig. 4), will not be indicated.

5 Conclusions

In this article, we have introduced a polynomial time algorithm addressing the innovative problem of graph realization from sets of integers. The focus of this problem was to check if a graph could be constructed from a provided sequence of degree sets.

Among various practical applications, a particularly notable one resides in the realm of chemistry. Our algorithm can aid in the validation of potential structural formulas of chemical compounds based on their chemical formulas. In simpler terms, the algorithm can determine whether the count of atoms of different chemical elements, comprising a molecule, can potentially be interconnected by chemical bonds, thus validating a possible structure of the molecule.

Such an application is invaluable in analysis involving mass spectrometers. While a mass spectrometer delivers data about the composition and isotopic distribution of chemical compounds, it lacks the ability to provide insights into their structural configuration. The algorithm we have developed in this study addresses this missing link, facilitating one aspect of the structural elucidation process.

In conclusion, we have successfully designed a novel algorithm for the graph realization of sets of integers, promising potential applications in the validation of chemical structure hypotheses. This advancement marks a significant stride toward more efficient and accurate analysis in the field of molecular structure. Our hope is that this study will act as a stepping stone toward enriching the methodological toolkit used in structural chemistry and, in turn, pave the way for enhanced analysis and understanding of molecular structures in the future.

Data availability

No datasets were generated or analysed during the current study.

Code Availability

The program codes are available and can be provided by the authors if required.

References

1. A.T. Balaban, Applications of graph theory in chemistry. J. Chem. Inform. Comput. Sci. 25(3), 334–343 (1985)
2. N. Trinajstic, Chemical Graph Theory (CRC Press, Boca Raton, 2018)
3. D.D. Bonchev, O. Mekenyan, Graph Theoretical Approaches to Chemical Reactivity (Springer, New York, 2012)
4. M.A. Tudoran, M.V. Putz, Molecular graph theory: From adjacency information to colored topology by chemical reactivity. Curr. Organ. Chem. 19(4), 359–386 (2015)
5. M. Randic, Characterization of molecular branching. J. Am. Chem. Soc. 97(23), 6609–6615 (1975)
6. P. Formanowicz, M. Kasprzak, P. Wawrzyniak, Labeled graphs in life sciences—two important applications. Graph-Based Modell. Sci. 8, 201–217 (2022)
7. R. Gugisch, A. Kerber, R. Laue, M. Meringer, C. Rücker, History and progress of the generation of structural formulae in chemistry and its applications. MATCH Commun. Math. Comput. Chem. 58, 239–280 (2007)
8. A. Kerber, R. Laue, M. Meringer, C. Rücker, Molecules in silico: the generation of structural formulae and its applications. J. Comput. Chem. Jpn. 3(3), 85–96 (2004)
9. J.-L. Faulon, D.P. Visco Jr., D. Roe, Enumerating molecules. Rev. Comput. Chem. 21, 209–286 (2005)
10. J. Meija, Mathematical tools in analytical mass spectrometry. Anal. Bioanal. Chem. 385, 486–499 (2006)
11. K. Scheubert, F. Hufsky, S. Böcker, Computational mass spectrometry for small molecules. J. Cheminform. 5, 1–24 (2013)
12. A.A. Aksenov, R. Silva, R. Knight, N.P. Lopes, P.C. Dorrestein, Global chemical analysis of biology by mass spectrometry. Nat. Rev. Chem. 1(7), 0054 (2017)
13. V. Havel, A remark on the existence of finite graphs (in Czech). Casopis pro Pestovani Matematiky 80(4), 477–480 (1955)
14. S.L. Hakimi, On realizability of a set of integers as degrees of the vertices of a linear graph. J. Soc. Indust. Appl. Math. 10(3), 496–506 (1962)
15. D. Meierling, L. Volkmann, A remark on degree sequences of multigraphs. Math. Methods Oper. Res. 69, 369–374 (2009)
16. T. Michael, Signed degree sequences and multigraphs. J. Graph Theory 41(2), 101–105 (2002)
17. M. Ferrara, Some problems on graphic sequences. Graph Theory Notes N.Y. 64, 19–25 (2013)
18. R.B. Eggleton, D.A. Holton, Simple and Multigraphic Realizations of Degree Sequence (Springer, New York, 1981), pp. 155–172

Funding

This work was partially supported by statutory funds of Poznan University of Technology.

Author information

Piotr Wawrzyniak and Piotr Formanowicz have contributed equally to this work.

Authors and Affiliations

Institute of Computing Science, Poznań University of Technology, Piotrowo 3, 60965, Poznań, Poland
Piotr Wawrzyniak & Piotr Formanowicz

Contributions

The authors have the same contributions to the paper.

Corresponding author

Correspondence to Piotr Wawrzyniak.

Ethics declarations

Conflict of interest

The authors have no conflict of interest to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wawrzyniak, P., Formanowicz, P. Graph realization of sets of integers. J Math Chem (2024). https://doi.org/10.1007/s10910-024-01642-4

Received: 23 April 2024
Accepted: 10 June 2024
Published: 24 June 2024
DOI: https://doi.org/10.1007/s10910-024-01642-4
