Structural analysis based dummy derivative selection for differential algebraic equations

The signature matrix structural analysis method developed by Pryce provides more structural information than the commonly used Pantelides method and applies to differential-algebraic equations (DAEs) of arbitrary order. It is useful to consider how existing methods using the Pantelides algorithm can benefit from such structural analysis. The dummy derivative method is a technique commonly used to solve DAEs that can benefit from such exploitation of underlying DAE structures and information found in the Signature Matrix method. This paper gives a technique to find structurally necessary dummy derivatives and how to use different block triangular forms effectively when performing the dummy derivative method and then provides a brief complexity analysis of the proposed approach. We finish by outlining an approach that can simplify the task of dummy pivoting.

P u blis h e r s p a g e : h t t p:// dx.d oi.o r g/ 1 0. 1 0 0 7/ s 1 0 5 4 3-0 1 6-0 6 4 2-9 < h t t p:// dx.d oi.o r g/ 1 0. 1 0 0 7/ s 1 0 5 4 3-0 1 6-0 6 4 2-9 > Pl e a s e n o t e: C h a n g e s m a d e a s a r e s ul t of p u blis hi n g p r o c e s s e s s u c h a s c o py-e di ti n g, fo r m a t ti n g a n d p a g e n u m b e r s m a y n o t b e r efl e c t e d in t his ve r sio n.Fo r t h e d efi nitiv e ve r sio n of t hi s p u blic a tio n, pl e a s e r ef e r t o t h e p u blis h e d s o u r c e.You a r e a d vis e d t o c o n s ul t t h e p u blis h e r's v e r sio n if yo u wi s h t o cit e t hi s p a p er.
Thi s v e r sio n is b ei n g m a d e a v ail a bl e in a c c o r d a n c e wit h p u blis h e r p olici e s. S e e h t t p://o r c a .cf. a c. u k/ p olici e s. h t ml fo r u s a g e p olici e s.Co py ri g h t a n d m o r al ri g h t s fo r p u blic a tio n s m a d e a v ail a bl e in ORCA a r e r e t ai n e d by t h e c o py ri g h t h ol d e r s .

Introduction
Differential-algebraic equations (DAEs) arise from the equation based modelling of physical systems, such as those found in engineering or physics, with problems arising, for instance, from chemical engineering [21], electronic circuits [1] and robotics [2].Models are now frequently built interactively using different components from large libraries in environments such as gPROMS and Simulink, as well as MapleSim and other tools that use the Modelica language.This way of modelling can lead to large DAEs [4].A common notion in DAE theory is the differentiation index-which is equal to the number of times all or part of the system has to be differentiated in order to solve the problem as an ODE.It is well known that solving a high index (larger than one) DAE directly is numerically difficult, hence modelling tools usually perform some structural analysis (SA) to determine the index of the problem.This analysis is usually based on Pantelides' algorithm [14], although we will be using the (equivalent for DAEs of order 1) Signature Matrix method [18] for the duration of this paper, as it applies to DAEs of arbitrary order and provides us with extra structural information we wish to exploit.As such if we talk about structural analysis of a DAE we mean information obtained by applying the Signature Matrix method to that DAE.After finding a DAE's index, different packages handle high index DAEs very differently, for example gPROMS [16] reports a set of equations and variables that are causing the high index, whereas Dymola [3] performs an index reduction algorithm (specifically dummy derivatives [8]) to form a new equivalent index 1 DAE, which it then solves.We are interested in the latter approach.We shall consider a DAE given by: f i (t, the x j and derivatives of them), i = 1,...,n where x j (t), j = 1,...,n are state variables and functions of some independent variable t, usually considered to be time.DAEs we consider can be fully non-linear but must be solvable, i.e. have a unique smooth solution when given a sufficient set of consistent initial conditions.The functions f i must be least smooth enough to find derivatives required by the method to succeed, usually three times differentiable is sufficient in practical examples.

A review of the signature matrix method
We begin by providing the reader with a needed background in the Signature Matrix method, for a more detailed explanation see [18] and [20].

Basic Structural information
Take a DAE as above, we begin by finding the problem's Signature Matrix Σ, with entries: Then find a highest value transversal (HVT).A transversal, T , is any choice of n positions (i, j) in an n × n matrix, such that only one entry in each row and column is selected.If we consider a transversal T , then we say Val(T ) = (i, j)∈T σ i, j .A HVT, say T , is such that Val(T ) ≥ Va l (T ) for all transversals T -we denote Val (T ) as Va l ( Σ).We call a DAE structurally well posed if Val(Σ)is finite, i.e. there exists a HVT with corresponding entries in Σ all finite.If the DAE is structurally well posed (SWP) and solvable, see [13], Val(Σ)is equal to its degrees of freedom (DOF).In this paper we only consider SWP DAEs.
We then find non-negative integer valued vectors, c and d satisfying: σ i, j ≤ d j − c i with equality on a HVT (1) called valid offset vectors.There exists a unique elementwise smallest choice of c and d termed the canonical offset vectors.It is easily seen that such canonical offsets must have min i c i = 0. Unless otherwise stated we use canonical offsets throughout the paper.Finding a HVT and offsets is a linear assignment problem [18].
Example 1 Consider the simple pendulum: Equations A and B are from Newton's second law, since we are measuring to the right in the x direction and downwards in the y direction.We keep the length of the pendulum fixed, which is how we get equation C. Displacement in the horizontal and vertical directions is measured by x and y respectively, λ is a multiple of tension in the rod.Gravity and rod length are given by G and L respectively.This is an index 3 formulation of the pendulum, if we differentiate equation c twice we can solve for ẍ and ÿ but λ is undifferentiated, so we must differentiate each equation once more, hence the problem is index 3.One can see from this simple example that it's easy to construct a high index DAE.One may think this problem is rather artificial, since changing to polar coordinates yields an ODE, however for large scale models found in practice such a manipulation is usually not obvious.
We form the problem's signature matrix as above, with canonical offsets around the side: There are two HVTs, marked by • and • in the signature matrix above.Comparing the offsets of the problem with what made the problem index 3 we see we have a scheme of which equations to differentiate and what their highest order variables will be.We go in to more detail on the ordering of this scheme in Sect.2.2.

Structural analysis stages
The example above leads us to the concept of a structural index ν s , defined as: The +1ifad j = 0 case is needed to add a first order derivative of otherwise undifferentiated variables, see λ in Example 1.IftheDAEisSWPν S is known to be an upper bound on the differentiation index, and in many applications they are equal, which is why we suggest using it as a basis for an index reduction algorithm such as DDs later.
We now differentiate each equation in stages specified by c. Initially the approach given by the Signature Matrix method was to solve parts of the DAE in stages via Taylor series, see [12], however we just focus on the equations used at each stage and the variables being solved for.We briefly inform the reader of how we denote derivatives.Consider for instance variables x and y, both of which depend on t, and some function f which is a function of t, x and y and perhaps some first order derivatives of x, i.e. f = f (x, ẋ, y, t).Then: This example shows we mean ordinary derivative with respect to t unless otherwise stated.
The offsets of a DAE provide us with a staged solution process starting at stage k =−max j d j = k min and increasing by one at subsequent stages where at each stage we use equations to solve for variables x Let m k be the number of i such that k + c i ≥ 0 and n k be the number of j such that k + d j ≥ 0, so that at stage k we solve m k equations for n k unknowns.Note: for k < − max j d j we have m k = n k = 0 and for k ≥ 0 we have that Example 2 Consider again the simple pendulum (2), it has SA stages as given in Table 1.
We now make use of the following Lemma, found in [5].
Lemma 1 (Griewank's Lemma) Suppose f (x 1 , x 2 ,...,x n ) is a smooth function that contains derivatives of x j not exceeding order m j .Then the (m j + 1)st derivative of x j occurs linearly in ḟ = df dt , and .
x (3) , y (3) , λ ẍ, ÿ,λ ẋ, ẏx, y 33 ... ... ... ... 33 We have the following formula for the System Jacobian J, with first equality coming from Lemma 1-as shown in [18]: Note: J depends on the choice of offsets, unless otherwise stated by the context we use canonical offsets.If there exists a consistent point at which J is non singular then the method succeeds and the DAE is locally solvable.Each stage k of SA listed above uses an m k × n k submatrix of J, denoted by J k , containing the rows and columns for equations and variables used at stage k from (3) and (4).
Example 3 Consider again the simple pendulum.Let J <i be taken to mean all Jacobians used in stages i − 1, i − 2, ... and similarly for J >i , we have the following stage wise system Jacobians for the simple pendulum, where []denotes an empty matrix:

SA based Block Triangular Forms
We discuss block forms natural to the Signature method, for more detail see [19].
A natural sparsity pattern for a DAE is the set where the entries of Σ are finite: A more informative BTF comes from the sparsity pattern of the system Jacobian J: (the sparsity pattern of J). (7) S 0 (c, d) ⊆ S for any c, d.
In applications a Block Triangular Form (BTF) based on S 0 is usually significantly finer than one based on S and as such we call a BTF based on S the coarse BTF and on S 0 the fine BTF.We now define the concept of a local offset: Definition 1 (Local offsets) We denote canonical local offsets of the fine BTF as ĉ and d, that is the offsets found by treating each fine block as a stand alone DAE.
We have following theorem from [20]:

Theorem 1 The difference between local and global offsets is a constant over a fine block
We call the difference between local and global offsets the lead time of a block l and denote it K l .
Example 4 Consider a modified double pendulum DAE: We present the different BTFs (produced by DAESA) indicated by dotted lines in Figs. 1, 2 and 3, where a blank means −∞.

A review of the dummy derivative method
We provide the reader with a background in the dummy derivative (DD) method, for a more detailed explanation we defer to [7,8].

The idea behind DDs
The usual manner of solving a DAE involves differentiating parts of the DAE in order to reveal the so called hidden constraints of the problem, as demonstrated above in the pendulum example.The DD method takes an n × n smooth DAE and adds the hidden constraint equations to the system directly, as opposed to solving them stage-wise as above.The issue with this is that the system is now over-determined.To get a square system again the DD method adds extra dependent variables to the system for each constraint equation, specifically the method finds a subset of appearing derivatives of variables to be considered algebraic.It is shown in [8] that such a formulation has the same solution as the original DAE.The advantage of this approach is by solving the resulting index 1 DAE we satisfy all constraint equations of the original Fig. 1 Modified double pendulum structural analysis DAE automatically.The solution of index 1 DAEs has been greatly studied in the literature and there are several tools that efficiently solve such problems, for example SUNDIALS [6], DASSL [15] and recently the MATLAB ODE solvers, such as ode15i, ode15s and ode25t.

The DD algorithm
Firstly we note that although originally the algorithm was given using a block lower triangular (BLT) form we shall, at least initially, consider the algorithm over one block.This simplifies notation and allows us to note interactions between blocks to find a reduced BTF with which to find DDs.
Consider a DAE to be written as F x = 0, where F is a (column n-vector) differential-algebraic operator (DAO), following the notation of [7]: Fig. 2 Modified double pendulum structural analysis-coarse BTF 5. A symbolic vector, z, of highest order derivatives (HODs) of x in the differentiated problem, with first entry equal to the HOD of x 1 etc... 6. Equations in Gx = 0 are written as g(z) = 0 There are three main stages in finding DDs: 1. Get ν(F).2. Obtain a differentiated problem Gx = 0. 3. Perform the index reduction algorithm.Loop through steps that select derivatives to be considered as algebraic variables in the solution process.
For ease of analysis later we will also assume, without loss of generality, that the equations (and variables) have been sorted into descending order with respect to number of differentiations needed, i.e. have been sorted with descending d j (and c i )-since d j ≥ c j this gives us each stage Jacobian a submatrix of previous stage Jacobians.We consider each stage by the superscript [κ].To make it simpler to draw comparisons between SA and DDs we reorder the index reduction part of original algorithm as presented in [7] to Algorithm 1.The main changes between the original Algorithm and Algorithm 1 are a new numbering starting from 0 and removing the appearance of the matrix M [κ] , since G [κ+1] = M [κ] .We also now compute H at the end of a stage as opposed to G, for more details see [11].
Example 5 We use an example found in [8] and apply Algorithm 1 to find DDs.Consider the linear constant coefficient DAE, with known smooth forcing functions u 1 (t), . . ., u 4 (t): We need a number of differentiations for each equation, to do this we'll compute the signature matrix and find c: .
Thus ẋ1 and ẋ3 are made DDs, denoted by x ′ 1 and x ′ 3 .Reducing the order of differentiation by 1 gives z [2] = (x 1 , x 3 ), the current equations are g [2] (z [2] ) = ( f 1 , f 2 ) T ,so m = 0 and H [2] =[ ]and the algorithm ends, yielding an index 1 system equivalent to the one found in [7]: The zeros in the third column of H [1] mean we never choose x 4 as a DD, this is always the case for undifferentiated variables, as explained in the following section.Also, the variables and equations used in G [κ] are just Dz [κ] and Dg [κ] (z [κ] ).

Structural analysis based dummy derivatives
If we consider H [κ] to be a matrix of size n κ × m κ , then from the algorithm in Sect. 3 we have m κ n κ potential index 1 systems at stage κ + 1.Thus the potential number of index 1 systems obtainable by the method can be very large for practical examples.We would like to use the structural analysis of Sect. 2 to inform our choice of DDs and thus limit the potential number of systems considered at each stage.See Sect. 5 for other ways of achieving numerical speed up.

Similarities in the methods
There are several similarities between Signature Matrix based SA and DDs.Firstly, as was said above, the ν(F) used in DDs is the same as c in SA, as from [18]w e see that Pantelides algorithm [14] and SA can be used interchangeably.Therefore we have that D ν(F ) = diag( d c 1 dt c 1 ,..., d cn dt cn ) and the following equality: We are differentiating each equation f i , c i times, so the maximum derivative for each variable x j in Gx = 0 will equal max i (σ i, j + c i );from(1)thisisd j .Hence, the 0 th stage system in DDs is the 0 th stage system in SA.Thus the differentiated problem can be written: j ; lower order derivatives ) = 0, for i, j = 1,...,n.
Therefore we must have that: The formula for the DD Jacobian matrix G [0] can now be written in this SA based notation to show it actually equals the stage 0 SA Jacobian J.
Going to the next stage in DDs by reducing the order of differentiation by 1 is equivalent to reducing the offset vector c by 1 (and consequently reducing d by 1 also).Therefore at stage 1 in DDs we will be considering the equations used in stage −1 of SA, since SA increases the order of differentiation by one at each stage.This leads us to the following observation.

Lemma 2
The equations used at stage k in SA are equal to those used at stage κ =−k in DDs (when writing down G [κ] ), for each stage k between k min and 0.
Proof We have already shown that at stage 0 both methods use the same equations.In DDs we now remove all equations such that c i = 0. We then omit one differentiation and repeat.Hence we remove, at stage 1, equations such that c i − 1 = 0, and by induction, at stage κ equations such that c i − κ = 0, where κ is the stage number.From (3) these are exactly the equations considered at stage −κ = k in SA.
⊓ ⊔ Due to the above theorem we will now use the term 'equivalent stage' to mean DD stage κ when talking about SA stage −k = κ and vice versa.We now take notation from [18] in order to write down the k th stage Structural Jacobian found in SA.Consider each variable x j as a function of an independent variable t and let x jl represent x (l) j (t).Then for an n × n DAE we have the index set: similarly for the equations we use the set: This gives us a notation for the variables used at each stage in the SA, that is: where the offsets are taken to be canonical unless otherwise stated.At each SA stage we have: We write f I k to mean the set of equations used at stage k in SA and f I ≤k to mean the set of equations used between k stage k min and similarly for the variables.Again, from [18] we have that the system Jacobian used at stage k in SA is given by: Recall the note at the end of Sect.3.2, giving us the following Lemma: and thus cannot be equal to σ i, j , so that (J k ) ij = 0 due to the definition of the System Jacobian in Equation ( 5).Since we have G [0] = J 0 = J and the DD algorithm is reducing the order of differentiation by one at each stage, if the column referring to x j appears in J k and H [−k −1] then its entries must be the same.Thus columns with , as they will be columns of structural zeros.

⊓ ⊔
Thus columns representing variables that are undifferentiated cannot be chosen as DDs, as one would expect.
Example 6 Consider example 5 and recall the index 1 system given in Eq. (10).Compare this with the SA results in Table 2, the variables that became DDs are marked by prime notation.In Example 5 we make a subset of the variables found at stage k in SA into DDs at stage −k + 1 = κ + 1 in the DD scheme, using the same equations in both cases, see Table 2.
The total number of DDs introduced will be i c i , since this is the total number of new equations introduced and we identify one DD with each new equation.At each stage the variables that produced DDs are a subset of the variables that produced DDs at the previous stage (necessarily excluding those with d j = κ + 1), of size m k−1 , with each variable being differentiated one time less than in the previous stage.So, the DDs will be a subset of the variables solved for at the equivalent stage +1ofSA, thus we have the following: Theorem 2 The matrix G [κ] is a submatrix of (it may be equal to) J k , where κ =−k.
Before going in to a deeper comparison we consider the 0 DOF case.

DDs and SA in the 0 DOF case
We begin with the following observation.Proof We have no choice in our selection of G [κ] since H [κ−1] (of size m ×n say) must contain only m columns of structural non-zeros, since n −m other columns correspond to undifferentiated variables due to Lemma 4. Noting that J 0 = G [0] completes the proof.⊓ ⊔

Example 7
In [2] the authors introduce a DAE for modelling a robot arm.It is reformulated to be SWP (i.e. have finite Val(Σ)) in [17].We shall use the 6 × 6 formulation introduced in [17] and arrange the equations and variables such that we clearly illustrate the block triangular structure of the problem.Structural Jacobian, signature matrix and offsets (with a HVT marked by • and −∞ entries left blank) for this DAE are, where F w means ∂ F/∂w and so on: For later observations it's useful to note the above signature matrix has four coarse (also fine) blocks, two of size 2 × 2 and two of size 1 × 1. Working through the DD algorithm yields ν(F) = (4, 4, 2, 2, 0, 0), and thus the differentiated system G is: Stage 0: The vector of HODs is 3 , ẅ, ẍ2 , u 2 , u 1 ) T and g [0] = (G (4) , H (4) , D (2) , F (2) , E, K ) T .
Thus we have a DD Jacobian of the form: 00004 . By Griewank's Lemma 1 this is equivalent to J. Removing equations with c i = 0 yields: 123 Stage 1: We are now forced to remove the last two columns of H [0] to get a non-singular matrix, choosing x 3 , ẅ, ẍ2 as DDs and reducing the order of differentiation: , .
The DDs are equivalent to the differentiated variables solved for at each prior stage in SA as expected, due to the 0 DOF in this example.

Structurally necessary dummy derivatives
Because the equations used in each equivalent stage of DDs and SA are the same and variables in each DD stage are a subset of those in the SA stage we have the following Theorem: [−k+1] in the DD scheme, i.e.H [κ] is square. [k] used by the DD scheme at stage κ =− k must be DDs in the final index 1 system, i.e.Dz [k] ,...,D (−k) z [k] must be DDs.This gives us the following improved DD algorithm, where we can identify structurally necessary DDs without computing numerical Jacobians:

Corollary 1 If m k =n k in the SA scheme at stage k then all subsequent derivatives of variables in z
Find SA solution scheme 3: Note stages where m K = n K 4: Make subsequent derivatives of such variables DDs 5: for K = 0 :−k min do 6: Work through DDs algorithm, but: 7: keep columns for already known DDs from step 4 when finding G [K ]   For 0 DOF systems this identifies all DDs as one might expect.Clearly Algorithm 2 finds all structurally necessary DDs.Applying the first half of Algorithm 2 (finding structurally necessary DDs) to the robot arm gives us the following staged solution scheme, produced by an extension (not yet released) to the authors' code DAESA [10]:  3 to convince themselves of the result.
Example 8 Recall the modified double pendulum DAE (8) and the associated signature matrix and offsets in Fig. 1.Table 4 gives the SA solution stages for this problem.In Table 4 we see all higher order derivatives of variables found at stage 4 in DDs will be made DDs (i.e.those used in SA stages −3, −2, −1 or DD stages 1, 2, 3).Consider now the different BTFs as shown in Fig. 1.Applying Corollary 1 corresponds to us solving the first coarse block as a stand alone system and then using it to solve the second coarse block.Compare this with the z [κ] and g [κ] (z [κ] ) found in DDs in Table 5 (we have re-ordered equations and variables so they correspond with the coarse block ordering).Due to our ordering in the DD algorithm we introduce DDs for Dz [κ] at stage κ, e.g. at stage 4 we are left with the 3 × 3 system given by the first coarse block in our BTF as expected.We note this algorithm for structurally necessary DDs looks similar to solving for DDs based on the coarse BTF, see Sect.4.4 for why this is not quite true.
Example 9 This improved DD algorithm does indeed achieve our goal of reducing the total number of potentially needed index 1 systems: Consider again the DAE (8), working through the structural analysis we see that at stage k =− 4w eh a v e m k = n k .A ts t a g e−4 we are solving for ẍ1 , ẍ2 , x 3 , so we know to keep columns corresponding to these variables when working through DDs.For example, with • indicating a structural non-zero and a blank indicating a structural zero we G [0]  and H [0] : .
We must keep the first three columns and hence only have to check 3 matrices for non-singularity (in practice best condition number), as opposed to the 15 we would otherwise have to check-although clearly in this case inspection tells us we choose the first 4 columns.

Block based dummy derivatives
Using a block decomposition may yield a way of reducing the size of potential G [κ]  matrices at each stage, which should offer computational speed up when checking the condition number of each Jacobian when doing dummy pivoting, see Sect. 5. Before giving an algorithm for finding DDs on blocks we ask if Algorithm 2 was already doing something similar to a BTF for us.We consider a fine block decomposition (which is itself a BTF within the coarse BTF) and ask if we could further reduce potential index 1 choices when considering m k being equal to n k during our SA stages.Proof Similar to the proof of Lemma 3 one partitions the matrix into the following and then notes the top right block is empty by (1).
This means, we cannot have a coarse block irreducible DAE with n k = m k , unless k ≥ 0ork < k min and therefore n k = m k never occurs when solving via fine blocks unless the DAE has m fine blocks and we are solving fine block m, i.e. are at global stage 0. Hence we turn our attention to the interactions between blocks rather than trying to optimise further within a block.We define local offsets associated with an arbitrary block form: Definition 3 (Block Local Offsets) We denote the canonical local offsets associated with an arbitrary block form, were its blocks treated as a stand alone systems, as či and ď j .
Given a block form of Σ with L blocks we have Algorithm 3. In the following Algorithm 3 Block based Dummy Derivatives 1: for l = 1,...,L do Require: Let m be the number of differentiated equations in g Let while Omit one differentiation to get z l ),... 12: (where we only consider variables and equations in G Let m be the number of differentiated equations in g Let l ,... 15: (the rows using differentiated equations in g Consider the new system using all equations g ),where κ l ≥ 0,... 18: and dummy derivatives for z l ,whereκ l > 0 and original variables,... 19: as well as all equations g l ,...

20:
and dummy derivatives for all variables x discussion we will call DDs block necessary dummy derivatives if they are found at the end of Algorithm 3, i.e. if they can be found by using a block form's local offsets and the global canonical offsets.There is no explicit interaction between blocks in Algorithm 3 (the inter block dependencies are taken in to account on lines 19 and 20) and therefore the majority of the algorithm could be performed in parallel.In our discussions that follow we will restrict ourselves to thinking about coarse and fine BTFs, as described in Sect.2.3, as these tend to be the most natural when using the signature matrix method.
To prove Algorithm 3 gives a suitable choice of DDs we would like an analogue of a Theorem in [20] that asserts the difference between local and global offsets is constant on a fine block form to hold for an arbitrary block form.This would mean that we introduce only as many DDs as differentiated equations when taking the interactions between blocks in to account.Consider however the following example: Example 10 Consider a DAE with the following signature matrix: .
Here we are block triangularising over coarse blocks.There is not a constant difference between coarse local and global offsets, so it's possible we introduce more DDs than we do equations when using Algorithm 3.
The potential problem in Example 10 is actually not a problem at all: Theorem 6 Given a BTF of Σ the difference between the sum of any block's local offsets and global offsets is equal with respect to c and d.
Proof Take an n × n signature matrix Σ and put its HVT on the main diagonal.Let S be any subset of {1,...,n} and č and ď be any valid offsets.Then, since ďi −č i = σ i,i we have: which is independent of č and ď.So, for any other valid offsets, say c and d we have: This is not what one might first expect: in our coarse block algorithm one might expect to differentiate an entire block a number of times to solve a later block, this Theorem shows that actually you may only need to differentiate some parts of the block to retain a square index 1 system using lines 19 and 20 in Algorithm 3.

Theorem 7 Algorithm 3 gives a suitable choice of DDs that could otherwise have been found by considering the entire system and original DD algorithm.
Proof If one is able to do the above algorithm then we have a block form whose diagonal sub-matrices are structurally non-singular.Because each block's coarse local offsets are a constant away from the global, by Lemma 1 we must have a non-singular Jacobian at each global stage of DDs if we have a non-singular Jacobian at each coarse local stage.Again, by Lemma 1 we see that a valid choice for the global stage DD algorithm is an amalgam of variables found at positive and negative local stages, using each block's lead times (if we consider adding variables and equations at the end of the algorithm as doing local stages 0,...,max j (d j − ď j )).
⊓ ⊔ Due to Theorem 5 and Example 7 one may think when restricting to coarse blocks Algorithm 3 is just a restating of Algorithm 2-consider a DAE with signature matrix: Looking for stages where m k = n k we find no structural DDs.However, using Algorithm 3 we find block necessary DDs for x 6 .We see that Algorithm 3 restricted to the coarse BTF yields a valid set of DDs that could be found using the global offsets and no BTF.All we've done is take a BTF, so one might assume in general it generates all possible sets of valid DDs that could be obtained globally, as certainly seems to be asserted in the original paper [7].Consider however: Example 11 Consider a DAE with signature matrix as follows: Note that the coarse BTF given above is the same as the fine BTF and c = č = ĉ and d = ď = d.Algorithm 3 will give a DD for either ẍ1 or ẋ2 from the first block, that is Algorithm 3 cannot find a set of DDs where neither of ẍ1 and ẋ2 are selected.However, if we carry out the original DD algorithm we see that there is an index one system that uses only derivatives of variables x 3 , x 4 and x 5 as DDs.Thus there are index 1 systems we may 'miss' using a BTF that would otherwise be available if considering the DAE as a whole.

Fine block based dummy derivatives
We now show that the choice of block form does directly affect the number of block necessary DDs as defined in Sect.4.4.We assert the fine BTF is a good choice.Before giving an example of Algorithm 3 using the fine BTF we wonder if there exists a more informative version of Algorithm 2 based on the fine BTF.Consider again Eq. ( 8), its signature matrices in Fig. 1 and its stages in Table 5.For stages −2 and −1 m k = n k because we have to introduce initial values for ẍ5 and ẋ5 .If we did not have to do this we would again have a square system, this time with 4 equations-the original coarse block containing equations 1, 2, 3 and variables 1, 2, 3 and a fine block containing equation 6 and variable 4. We note that variable 5 does not actually appear in the DD solution scheme until stage 0 because it does not have an associated equation.Therefore we can ignore any variables that must be specified as IVs and are left with the following definition Definition 4 Fine block local numbers of equations and variables are written as: where k is taken to be a local SA stage associated with a fine block.
Doing this we end up with an improved fine block based analogue of Algorithm 2. For each fine block l do the following (where k min,l is the − max j d j such that j is in block l): Algorithm 4 Fine BTF based structural DD algorithm 1: for K = k min,l :−1 do 2: Find SA solution scheme 3: Note stages where m k,l = n k,l 4: Make subsequent derivatives of such variables DDs 5: for K = 0 :− k min,l do 6: Work through DDs algorithm, but: 7: keep columns relating to already known DDs from step 4 when finding G [K ]   Knowing this it motivates us to continue using the fine BTF to find DDs, since we have a better set of structural DDs (we will term such DDs block structurally necessary DDs) than was previously found using the entire signature matrix.The only block we have to select DDs in is block 4 (equivalent to the simple pendulum).For ease of checking the Jacobians that follow we now consider this as a stand alone DAE (as would be done should Algorithm 3 be done in parallel over fine blocks) and give the differentiated problem Gx = 0: Performing the DD algorithm gives two possible index 1 systems, where the choice of DDs is given in Table 7.We are able to find 14 block necessary DDs without ever having to compute a numerical Jacobian and reduced our problem size by half at the outset.
A brief complexity analysis follows.Assume the DAE decomposes into L fine blocks labelled by a subscript l.At each stage of DDs the original algorithm has a complexity of order: because the selection of DDs is usually done via a QR decomposition in practice, which is an O(n 3 ) operation.The proposed algorithm has a complexity of order: where c i in the second summation is taken to be in block l.Assuming the system decomposes into relatively small fine blocks, i.e. n ≪ n and some c i < c i -from test models in DAESA this is usually the case-this should offer good numerical speed up.
Importantly the large reduction in potential DDs and problem size makes dummy pivoting less problematic, since we will have to consider smaller systems of equations with a reduced number of potential variables to choose from, see Sect. 5 for more detail.
Consider again Example 11, we are reminded that there are index 1 systems we may 'miss' using a fine BTF that would otherwise be available if considering the DAE as a whole.Whilst this seems like a rather large oversight of our method, this potential 'oversight' will never make an otherwise solvable (by DDs) DAE unsolvable: Theorem 8 If the DAE is solvable then Algorithm 3 restricted to the fine BTF will always be able to select a valid set of DDs.Proof If the system is solvable then there must exist DD matrices globally, i.e. there is a choice G [0] ,...,G [max i c i ] for which each matrix is non-singular.We also know that there exist global SA system Jacobian J 0 ,...,J − max i c i that have full row rank.If the SA scheme succeeds globally then it succeeds over fine-blocks.That is, over each fine block l there are J 0,l ,...,J − max i ĉi ,l that have full row rank.Since we have that any J 0 = G [0] and subsequent DD matrices are sub matrices of previous SA matrices it must be possible to select a non-singular DD matrix on each fine block if each SA Jacobian has full row rank in that block.
⊓ ⊔ Thus, although it is possible to find non-singular choices of DDs globally that cannot be found over a fine block, these choices are somehow a redundant selectionalthough admittedly it might be possible the global selections have better condition numbers there will exist a fine block selection if the DAE is solvable.

Dummy pivoting
We briefly discuss the issue of dummy pivoting as first examined in [8].When proceeding along a numerical solution it is possible that some chosen matrices G [κ] will become singular.It is therefore necessary to change our choice of G [κ] as our numerical solution evolves with time.Such a change is called dummy pivoting or dynamic selection of states [9].The main problem with making such a change is that changing a G [κ]  will (frequently, but not necessarily) produce a need to change all subsequent G [κ+i] .This problem is made worse when the DAE is of larger index (there will be multiple stages and thus more potential G [κ] to change) or when the problem is large (this will result in estimating condition numbers of potentially large matrices frequently throughout the solution process).One method of avoiding the former problem is to start by attempting to reduce the order of the problem and block triangularising as in Sect.4.5, a technique that in practice frequently reduces the order.Another alternative is to select which states must or cannot be chosen as DDs, this is implemented in some tools, such as is done in Dymola with each variable given a stateSelect parameter that can be prefer, default or avoid.Selecting dynamic states before solving however needs to be informed by problem specific knowledge, which it is not always possible a user will have.It is also dangerous as the user may avoid choosing necessary DDs if they are not careful.We offer an insight into the number of different potential Jacobians G [κ] gained by the following example.
Here u 1 (t),...,u 5 (t) are arbitrary driving functions.The non-linearity gives stages in the DD algorithm that may need pivoting.This DAE (13) has signature matrix and offsets: This is irreducible and has two HVTS, marked by • and •.Every entry of Σ corresponds to a structurally non-singular entry in each J k and their related DD stage Jacobians.For ease of checking entries in the DD stage Jacobians that follow we present all equations and their derivatives specified by c: Stage 0: In the DD algorithm, we have an initial Jacobians: , .
Stage 1: We need to find G [1] .We do this by choosing any 4 columns, since all 5 possibilities are (at least structurally) non-singular.Numerically one potentially has to pivot between different systems, for example say you chose columns 1, 2, 3, 4 and ẋ2 → 0 at some time t then you would pivot to the system with columns 1, 3, 4, 5.
Stage 2: Choose two columns from H [1] to form a square non-singular matrix.There are three choices.As above it may be necessary to pivot chosen DDs numerically.Say we choose columns 1 and 3, then we introduce DDs for ẍ1 and ẋ3 and get the following Jacobian: G [2] = ẋ1 x 3 f 3 2 ẋ1 2x 3 f 4 02 x 3 .
Removing undifferentiated equations yields H [2] =[]and the algorithm terminates.Note: we have selected the following dummies: 1 , x ′′ 2 , x ′ 3 , x ′′ 3 and x ′′ 4 .I fw e If one thinks of the number of possible DD index 1 systems as a tree diagram, with stage 1 producing the root and each subsequent stage producing branching nodes it is possible to make pivoting easier.In applications where the DAE is usually of large size but (relatively) low index the tree will likely be wide but shallow.Pivoting can then be done across nodes on each level, i.e. you can pivot between all nodes at one level (restricting yourself to nodes coming from one parent node), starting at the node that gives you a singular matrix G [κ] and considering its leaves, then if no nodes at the current level give a non singular matrix go up a level and repeat.Storing such a diagram may take a lot of memory, but it's highly likely many nodes will in fact have the same G [κ] .Consider Table 8.We see there are actually only three distinct nodes at stage 2, so we only have to store 3 Jacobians, not 9 and check 3 Jacobians for non-singularity in the worst case.

Conclusions
In this paper we have outlined the Signature matrix method of [18] in Sect. 2 and a reordered version of the Dummy Derivative method of [7] in Sect. 3 to draw parallels between the approaches and develop a method of doing DDs that is structurally well informed.The concept of structurally necessary DDs has been defined and explored for different block forms in Sect. 4. Such analysis should help those using the DD algorithm for index reduction to better understand the underlying restrictions on their reformulated index 1 system.We have also expanded upon the original description of block based DDs and demonstrated that one can restrict their potential index 1 system choices by using a block form, although such a restriction will not make a DAE unsolvable if it was otherwise solvable.We believe such a rigorous analysis of DDs using SA as a base will help modellers understand the subtleties inherent in this index reduction procedure and give them some insights when developing their models.We believe the complexity analysis for our revised algorithm will drive modellers to use it on their problems where they were previously using a different index reduction procedure.It is our opinion that dynamic selection of states is not a widely understood topic in the DAE community.We hope that Sect. 5 is useful to modellers that have to incorporate dummy pivoting into their codes.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Lemma 4 Theorem 3
If a square non-singular DAE has 0 DOF and a HVT on the main diagonal of Σ then d = c T .Proof Immediate from (1) equality on a HVT constraint.⊓ ⊔ This gives us the following Theorem: If we have 0 DOF then G [−k] = J k for each stage k between k min and 0.

Definition 2 (
Structurally necessary DDs) If there exists a k such that m k = n k and d j − k ≥ 0 then x (d j −k+1) j ,...,x (d j ) j are termed structurally necessary dummy derivatives.

Theorem 5
If given an n × n DAE and there exists a k ∈{ k min ,...,−1} such that n k = m k = µ for some 0 <µ<n, then the DAE must decompose in to at least 2 coarse blocks of size µ and (n − µ).

we have a non-singular matrix 10 :
Make the corresponding variables used in G [κ l ] l DDs 11:

Example 12
Consider again the DAE in Example 4. We carry out Algorithm 3 on this DAE to illustrate the advantages of the fine BTF.The local offsets tell us that we must have DDs shown in Table 6-found by comparing the offsets via lines 19 and 20 of the Algorithm.

Table 1
Stages for the simple pendulum

Table 2
The results of SA on the linear DAE (9)-also showing DDs by a prime

Table 3
DDs and SA stages for the robot arm

Table 8
systems at the end of the algorithm.If however we check for structural singularity at each stage (as is done above) we get 9 possible index 1 systems.Listed in Table8are selected DDs.