A mathematical model of p62-ubiquitin aggregates in autophagy

Aggregation of ubiquitinated cargo by oligomers of the protein p62 is an important preparatory step in cellular autophagy. In this work a mathematical model for the dynamics of these heterogeneous aggregates in the form of a system of ordinary differential equations is derived and analyzed. Three different parameter regimes are identified, where either aggregates are unstable, or their size saturates at a finite value, or their size grows indefinitely as long as free particles are abundant. The boundaries of these regimes as well as the finite size in the second case can be computed explicitly. The growth in the third case (quadratic in time) can also be made explicit by formal asymptotic methods. In the absence of rigorous results the dynamic stability of these structures has been investigated by numerical simulations. A comparison with recent experimental results permits a partial parametrization of the model.


Introduction
Autophagy is an intracellular pathway, which targets damaged, surplus, and harmful cytoplasmic material for degradation.This is mediated by the sequestration of cytoplasmic cargo material within double membrane vesicles termed autophagosomes, which subsequently fuse with lysosomes wherein the cargo is hydrolyzed.Defects in autophagy result in various diseases including neurodegeneration, cancer, and uncontrolled infections [8].The selectivity of autophagic processes is mediated by cargo receptors such as p62 (also known as SQSTM1), which link the cargo material to the nascent autophagosomal membrane [4].p62 is an oligomeric protein and mediates the selective degradation of ubiquitinated proteins.Its interaction with ubiquitin is mediated by its C-terminal UBA domain, while it attaches the cargo to the autophagosomal membrane due to its interaction with Atg8 family proteins such as LC3B, which decorate the membrane [10].Additionally, p62 serves to condensate ubiquitinated proteins into larger condensates or aggregates, which subsequently become targets for autophagy [12,15].It has been reported that this condensation reaction requires the ability of p62 to oligomerize and the presence of two or more ubiquitin chains on the substrates [13,15].
In this article a mathematical model for the condensation process is derived and analyzed.It is based on cross-linking of p62 oligomers by ubiquitinated substrate [15].A cross-linker is assumed to be able to connect two oligomers, where each oligomer has a number of binding sites corresponding to its size.As an approximation for the dynamics of large aggregates, a nonlinear system of ordinary differential equations is derived.
The oligomerization property of p62 has been shown to be necessary in the formation of aggregates [15]: too small oligomers of Ubiquitin do not form aggregates [13].
The dynamics of protein aggregation has been studied by mathematical modelling for several decades, but most models consider the aggregation of only one type of protein, which gives rise to models belonging to the class of nucleation-coagulation-fragmentation equations, see e.g.[2,11,14] for examples in the biophysical literature, and [1,3,5,7] for a sample of the mathematical literature.Contrary to these studies, the present work considers aggregates composed of two different types of particles with varying mixing ratios, which drastically increases the complexity of the problem.
In the following section the mathematical model is derived.It describes an aggregate by three numbers: the number of p62 oligomers, the number of cross-linkers bound to one oligomer, and the number of cross-linkers bound to two oligomers.The model considers an early stage of the aggregation process where the supply of free p62 oligomers and of free cross-linkers is not limiting.Since no other information about the composition of the aggregate is used, assumptions on the binding and unbinding rates are necessary.In the limit of large aggregates, whose details are presented in an appendix, the model takes the form of a system of three ordinary differential equations.Section 3 starts with a result on the well posedness of the model, and it is mainly devoted to a study of the long-time behaviour by a combination of analytical and numerical methods.Depending on the parameter values, three different regimes are identified, where either aggregates are unstable and completely dissolved, or their size tends to a limiting value, or they keep growing (as long as they do not run out of free oligomers and cross-linkers).In Section 4 we discuss the parametrization of the model and a comparison with data from [15].

Presentation of the model
Discrete description of aggregates: We consider two types of basic particles: 1. Oligomers of the protein p62, where we assume for simplicity that all oligomers contain the same number n ≥ 3 of molecules.These oligomers are denoted by p62 n and are assumed to possess n binding sites for ubiquitin each, 2. Cross-linkers in the form of ubiquitinated cargo, denoted by U bi and assumed to have two ubiquitin ends each.When one end of a U bi is bound to a p62 n , we call it one-hand bound, when both ends are bound we call it both-hand bound.
An aggregate is represented by a triplet (i, j, k) ∈ N 3 0 , where i denotes the number of one-hand bound U bi, j denotes the number of both-hand bound U bi, and k denotes the number of p62 n .It is a rather drastic step to describe an aggregate only by these three numbers, since the same triplet might represent aggregates with various forms.This will affect our modelling below.
An aggregate will be assumed to contain at least two p62 n , i.e. k ≥ 2, and enough both-hand bound U bi to be connected, i.e. j ≥ k − 1.Furthermore, an aggregate contains nk binding sites for U bi, implying i + 2j ≤ nk.A triplet (i, j, k) ∈ N 3 0 satisfying the three inequalities will be called admissible.An example of an admissible triplet describing a unique aggregate shape is (0, k − 1, k), representing a chain of p62 n .Adding one both-hand bound U bi already creates a shape ambiguity: The triplet (0, k, k) can be realized by a circular aggregate or by an open chain, where one connection is doubled.
The reaction scheme: Basically there are only two types of reactions: binding and unbinding of U bi to p62 n .However, depending on the situation these may have various effects on the aggregate, whence we distinguish between three binding and three unbinding reactions.
1. Addition of a free U bi, requiring at least one free binding site, i.e. nk − i − 2j ≥ 1, (see Fig. 2): The reaction rate (number of reactions per time) is modeled by mass action kinetics for a second-order reaction with reaction constant κ 1 and with the number [U bi] of free U bi. Since free U bi and free p62 oligomers will be assumed abundant, their numbers [U bi] and [p62 n ] will be kept fixed and the abbreviation κ 1 = κ 1 [U bi] will be used.This leads to a first-order reaction rate 2. Addition of a free p62 n , requiring at least one one-hand bound U bi, i.e. i ≥ 1: Analogously to above, we set κ 2 = κ 2 [p62 n ] and 3. Compactification of the aggregate by a U bi binding its second hand, requiring at least one one-hand bound U bi, i.e. i ≥ 1, and at least one free binding site, i.e. nk − i − 2j ≥ 1: (i, j, k) This is a second-order reaction with rate 4. Loss of a U bi, requiring at least one one-handed U bi, i.e. i ≥ 1.This is the reverse reaction to 1: Its rate is modeled by 5. Loss of a p62 n (leading to loss of the whole aggregate if k = 2): This and the following reaction need some comments.They are actually both the same reaction, namely breaking of a cross-link, which we assume to occur with rate κ − j.However, this can have different consequences.Here we consider something close to the reverse of reaction 2. This means we assume that the broken cross-link has been the only connection of a p62 oligomer with the aggregate, such that the oligomer falls off.This requires the condition nk − 2j ≥ n − 1, meaning the possibility that the other n − 1 binding sites of the lost oligomer are free of two-hand bound U bi.It is not quite the reverse of reaction 2, since we have to consider the possibility that one-hand bound U bi, 0 ≤ ≤ n − 1, are bound to the lost oligomer.The conditional probability α j,k to be in this case, when a cross-link breaks, is zero for a very tightly connected aggregate where each oligomer is cross-linked at least twice, i.e. nk − 2j ≤ n − 2, and it is one for a very loose aggregate, i.e. a chain with j = k − 1.This leads to the model and to the rate In the framework of our model, should be a random number satisfying the restrictions where the upper bound should be obvious and the lower bound implies that the last condition in (1) is satisfied after the reaction.We shall use the choice which can be interpreted as the rounded ( • denotes the closest integer) expectation value for the number of one-hand bound U bi on the lost oligomer in terms of the ratio between the number n − 1 of available binding sites on the lost oligomer and the total number nk − 2j of available binding sites for one-hand bound U bi in the whole aggregate.It is easily seen that in the relevant situation α j,k > 0, i.e. nk − 2j ≥ n − 1, the choice (9) without the rounding satisfies the conditions (8).Since the bounds in (8) are integer, the same is true for the rounded version.
Note that we neglect the possibility to lose more than one oligomer by breaking a cross-link, i.e. the fragmentation of the aggregate into two smaller ones.This is a serious and actually questionable modelling assumption.An a posteriori justification will be provided by some of the results of the following section, showing that growing aggregates are tightly connected.
6. Loosening of the aggregate by breaking a cross-link, requiring at least one excess bothhand bound U bi, i.e. j ≥ k: This is the reverse of reaction 3 with the rate which respects the requirement j ≥ k for a positive rate, because of A deterministic model for large aggregates: The next step is the formulation of an evolution problem for a probability density on the set of admissible states (i, j, k).In this problem the discrete state is scaled by a typical value k 0 of [U bi] and [p62 n ], assumed of the same order of magnitude: It is then consistent with the definitions of κ 1 and κ 2 above to introduce κ 3 := κ 3 k 0 .In the large aggregate limit k 0 → ∞, the new unknowns become continuous, and the equation for the probability density becomes a transport equation (see Appendix A for the details).It possesses deterministic solutions governed by the ODE initial value problem where α(q, r) is the limit of α j,k as k 0 → ∞.The conditions for admissible states (p, q, r) ∈ [0, ∞) 2 × (0, ∞) are obtained in the limit of (1): implying, as expected, The equations satisfied by s and q − r, show that the conditions ( 14) are propagated by (12).

Analytic results
Global existence: Since the right hand sides of ( 12) contain quadratic nonlinearities, it seems possible that solutions blow up in finite time.On the other hand, the right hand sides are not well defined for r = 0.The essence of the following global existence result is that neither of these difficulties occurs.
Then problem (12) has a unique global solution satisfying (p(t), q(t), r(t)) ∈ (0, ∞) 3 as well as (14) for any t > 0. Also the following estimates hold for t > 0: Proof.Local existence and uniqueness is a consequence of the Picard-Lindelöf theorem.Global existence will follow from the bounds stated in the theorem.Positivity of the solution components, of s = nr − p − 2q, and of q − r is an immediate consequence of the form of the equations ( 12), ( 16), (17).This also implies which shows (18) by the Gronwall lemma.With (14), the equation for q in (12) implies and another application of the Gronwall lemma and of ( 14) proves (19) and, thus, completes the proof of the theorem.
Long-time behaviour: The first step in the long-time analysis is the investigation of steady states.Although the right hand sides of ( 12) are not well defined for r = 0, the origin p = q = r = 0 can be considered as a steady state since hold for admissible states satisfying (14).The origin is the only acceptable steady state with r = 0, since α(q, r) and p/r are not well defined in this case, so the factor q, multiplying them in the equations, needs to be zero.Finally, for a steady state this implies also p = 0.The following result shows that at most one other steady state is possible which, somewhat miraculously, can be computed explicitly.
Proof.Assuming r > 0, we introduce and rewrite the steady state equations in terms of p and q: From (24) we obtain which is substituted into the sum of ( 22) and ( 23): The option n = 2q leads to ᾱ = 0, implying p = 0 and, thus, p = 0, which contradicts (23).Therefore the second paranthesis has to vanish, leading to a quadratic equation for q with the only positive solution 2κ − (n − 1) .Now (24) implies the formula for ᾱ stated in the theorem and we note that 0 < ᾱ < 1 implies 1 < q < n/2.We compute p from q by (25) and note that p > 0 since ᾱ > 0. We then compute ŝ = s/r = n − p − 2q from the sum of ( 22) and ( 23): which proves ŝ > 0. Finally we obtain the formula for p from (23) as well as r = p/p and q = r q.
For convenience below, the conditions in the theorem are made more explicit in terms of the parameters by The steady state approaches the origin p = q = r = 0 as ᾱ → 1, whereas all its components become unbounded as ᾱ → 0. This motivates the following.
The conjecture has been supported by numerical simulations.Figures 3, 4, and 5 show typical simulation results corresponding to the three cases.The conjecture is open, and its proof is not expected to be easy.Note for example that not even the local stability of the origin in Case 2 can be investigated by standard methods, since the right hand side of (12) lacks sufficient smoothness.Partial results will be published in separate work.
Closer inspection of the simulation results for growing aggregates (see Figure 5) shows that the growth is polynomial in time.This is verified by the following formal result.Theorem 3.With the notation of Theorem 2, if ᾱ < 0, then there exists a formal approximation of a solution of (12) of the form with The approximation is (from a formal point of view) unique, including the choice of the exponents of t, among solutions with polynomially or exponentially growing aggregate size r.Proof.Since 2r ≤ 2q ≤ nr holds for admissible states, when r(t) tends to infinity, then also q(t) tends to infinity at the same rate, which we write with the sharp order symbol O s as With α = s+p (n−2)r , we write the equations for r and for p + q as Since the right hand sides have to be asymptotically nonnegative by the growth of q and r, taking (30) into account, the first equation implies s(t) = O(p(t)), and the second implies If the growth were exponential, i.e. r(t), q(t) = O s (e λt ), λ > 0, then (31) would imply p(t), s(t) = O s (e λt ).Then the negative term −κ 3 p(t)s(t) = O s (e 2λt ) in the first equation in (12) could not be balanced by any of the positive terms, and would drive p to negative values.This contradiction excludes exponential growth.For polynomial growth, i.e. r(t), q(t) = O s (t γ ), (31) implies p(t), s(t) = O s (t γ−1 ).In the equation for q in (12), q and p are small compared to q. Therefore it is necessary that s(t)p(t) = O s (q(t)), implying 2γ − 2 = γ and, thus, γ = 2.This justifies the ansatz (28) with the addition s(t) = s 1 t + o(t).Substitution into the differential equations and comparison of the leading-order terms gives equations for the coefficients: 2nd equ. in (12): 1st equ. in (31): 2nd equ. in (31): This system can be solved explicitly by first noting that the first two equations imply α(q 2 , r 2 ) = 0 and, thus, 2q 2 = nr 2 .Using this in the third and fourth equation gives a linear relation between p 1 and s 1 .This again can be used in the fourth equation to write q 2 as a linear function of s 1 .
The division of the first equation by s 1 then gives the formula for p 1 in (29).The positivity of p 1 is a consequence of (27).
For all the results so far the positivity of the rate constant κ − for breaking cross-links has been essential.Therefore it seems interesting to consider the special case κ − = 0 separately.It turns out that the dynamics is much simpler.The aggregate size always grows linearly with time.
Theorem 4. Let 3 ≤ n ∈ N, κ 1 , κ 2 , κ 3 , κ −1 > 0, and κ − = 0. Let (p 0 , q 0 , r 0 ) ∈ (0, ∞) 3 satisfy (14).Then the solution of (12) satisfies Proof.For κ − = 0 the right hand sides in (12) depend only on p and s = nr − 2q − p, meaning that these two variables solve a closed system: The unique nontrivial steady state (p ∞ , s ∞ ) can be computed explicitly.We prove that it is globally attracting by constructing a Lyapunov functional.Let a ≥ 1 and For each point (p, s) ∈ (0, ∞) 2 there is a unique value of a ≥ 1 such that (p, s) ∈ ∂R a .Therefore the Lyapunov function is well defined and definite in the sense L(p, s) ≥ 0 with equality only for (p, s) = (p ∞ , s ∞ ).It remains to prove that the flow on ∂R a is strictly inwards.For example, for the left boundary part, where for the first inequality it has been used that p ∞ < κ 1 /κ 3 , and the equality follows from the fact that ṗ vanishes at the steady state.Similarly it can be shown that ṗ < 0 on the right boundary part, ṡ > 0 on the lower boundary part, and ṡ < 0 on the upper boundary part.
The linear growth of q and r follows from This result shows that the breakage of cross-links has somewhat contradictory effects, depending on the parameter regime: It can speed-up the aggregation dynamics, producing a quadratic rather than linear growth of the aggregate size (Case 3 of Conjecture 1).This is linked to the fact that it allows the aggregates to rearrange in a more compact way.On the other hand, it may slow down the dynamics, such that the aggregate only reaches a finite size (Case 1) or even disintegrates completely (Case 2).

Comparison with experimental data -Discussion
Comparison with experimental data: There are only limited options for a serious comparison of the theoretical results with experimental data.We shall use the data shown in Figure 6, which have been published in [15].It provides observed numbers of aggregates in dependence of ubiquitin for a fixed concentration of p62.Our results do not permit a direct comparison with this curve, which would require modelling of the process of nucleation of aggregates.However, the data provide at least some information about concentration levels of ubiquitin and p62, such that stable aggregates exist.
For meaningful quantitative comparisons with these scarce data we need to reduce the number of parameters in our model.As a first step, we fix the value n = 5 of the size of p62 oligomers, following [15] where values between 5 and 6 for GFP-p62 have been found (although we note that in [13] an average of about n = 24 has been reported for mCherry-p62 in vitro).This implies that the experiment corresponds to an oligomer concentration of [p62 5 ] = [p62]/5 = 0.4µM .
Concerning the rate constants, we make the assumption that the binding and, respectively, the unbinding rate constants are equal, i.e. κ 1 = κ 2 = κ 3 and κ −1 = κ − .This will allow to express all our results in terms of one dissociation constant K d := κ −1 /κ 1 .
From Figure 6 we conclude that for an oligomer concentration of [p62 5 ] = 0.4µM the growth of stable aggregates requires a cross-linker concentration [U bi] roughly between 0.6µM and 2.6µM ((1.6 ± 1)µM ).According to the results of the preceding section, these values should correspond to situations with either ᾱ = 0 or ᾱ = 1, depending on the question, if the equilibrium aggregate sizes of Case 1 in Conjecture 1 are large enough to be detected in the experiment, or if we need to be in Case 3 of growing aggregates.Therefore, with the above assumptions, with and with (26), ( 27), we obtain for ᾱ = 1: and for ᾱ = 0: Solving these equations for K d with n = 5, [p62 n ] = 0.4µM , and with [U bi] between 0.6µM and 2.6µM , gives estimates for K d between 0.44µM and 0.73µM for ᾱ = 1, and between 0.20µM and 0.31µM for ᾱ = 0.So we claim that at least the order of magnitude is significant.It differs by three orders of magnitude from published data on the reaction between ubiquitin and the UBA domain of p62 (K d ≈ 540µM [9]).This should not be so surprising, since in the context of growing aggregates the reactions can be strongly influenced by avidity effects.
Discussion: We return to Conjecture 1, where the long-time behaviour is described in terms of the value of the parameter ᾱ defined in (20).With the simplifying assumptions on the reaction rate constants from above, the statements of the conjecture are depicted in Figure 7  There is a significant uncertainty concerning the oligomer size n, which has so far been assumed to be 5, according to observations in [15].Actually, a distribution of oligomer sizes should be expected in the experiments of Figure 6 with the occurrence of much larger oligomers.For this reason the computation of K d from (33) has been repeated for a range of values of n between n = 3 and n = 100.The results are depicted in Figure 8, which shows that the predicted values of K d might be larger by up to an order of magnitude compared to the case n = 5, but still small compared to [9], if larger oligomer sizes are considered and ᾱ = 1 is relevant.The asymptotic behaviour for large oligomer sizes is easily seen to be K d = O(n 1/2 ).On the other hand, if ᾱ = 0 is relevant, the value of K d becomes smaller by up to an order of magnitude for large oligomers with the asymptotic behaviour

Conclusion
In this article, we have proposed an ODE model for the growth and decay of aggregates of p62 oligomers cross-linked by ubiquitin chains.Under the assumption of unlimited supply of free oligomers and cross-linkers we found three possible asymptotic regimes: complete degradation of aggregates, convergence towards a finite aggregate size, and unlimited growth (quadratic in time) of the aggregate size.In the latter case, growing aggregates are asymptotically tightly packed with the maximum number of cross-links.These statements are supported by a mixture of explicit steady state computations, formal asymptotic analysis, and numerical simulations.The three regimes, which can be separated explicitly in terms of the reaction constants, have been illustrated by the simulation results.Rigorous proofs of the long-time behaviour in the three regimes are the subject of ongoing investigations.A comparison of the theoretical results with data from [15] has provided an estimate for the dissociation constant of the elementary reaction between ubiquitin and the UBA domain of p62 in the context of growing aggregates.
There are several possible extensions of this work.A limitation of the original discrete model is that the description of aggregates by triplets (i, j, k) is very incomplete.Typically, very different configurations are described by the same triplet.For example, we could imagine very homogeneous or very heterogeneous aggregates, i.e. fully packed in certain regions and very loose in others.
Reaction rates will strongly depend on the configuration, including information about the geometry of the aggregate.In principle one can imagine an attempt to overcome these difficulties based on a random graph model [6], but the resulting model describing probability distributions on the sets of all possible aggregate shapes would be prohibitively complex.An intermediate solution would be a more serious approach to finding formulas for quantities like the probability α of losing an oligomer, when a cross-link breaks, based on typical probability distributions.
The model (12) describes an intermediate stage of the aggregation process.On the one hand, the large aggregate assumption means that we are dealing with the growth of already developed aggregates, neglecting the nucleation process, which is important for the number of established aggregates.A model of the nucleation process would be based on the discrete representation and it would have to be stochastic.On the other hand, we neglect two effects important for a later stage of the process.The first and obvious one is the limited availability of free p62 oligomers and ubiquitin cross-linkers.It would be rather straightforward to incorporate this into the model, however at the expense of increased complexity.It would also eliminate the dichotomy between the Cases 1 and 3 of Conjecture 1 since unbounded growth would be impossible.For relatively large initial concentrations of free particles, one could imagine a two-time-scale behaviour with an initial quadratic growth and saturation on a longer time scale.The other effect, which is neglected here but definitely present in experiments, is coagulation of aggregates.This is the subject of ongoing work, based on the PDE model (37) derived in the appendix and enriched by an account of the coagulation process.

A Large aggregate limit
We denote by c i,j,k (t) the probability of the aggregate to be in the state (i, j, k) at time t.Its evolution will be determined by a jump process model of the reactions with the rates given in ( 2), ( 3), ( 4), ( 5), ( 6), ( 7), (9), and (10).
For this purpose the relation between pre-reaction state (i , j , k ) and post-reaction state (i, j, k) needs to be inverted.This is easy except for Reaction 5, where we have j = j − 1, k = k − 1, and, with (9), The inversion is not possible in general.Occasionally, i ,j ,k will increase by one, when i is increased by one, implying that i might take the same value for two consecutive values of i .Even worse: For the extreme case nk − 2j = n − 1, where after the loss of a p62 oligomer all binding sites are busy with two-hand bound U bi except the one remaining after breaking the connection, i.e. nk − 2j = 1 = i.This state is independent from the number i ∈ {0, . . ., n − 1} of one-hand bound U bi getting lost with the oligomer.Therefore we introduce The equation for the probability distribution reads We introduce a typical value k 0 for the number k of oligomers in the aggregate and use it also as a reference value for i and j, leading by the definition (11) to the scaled triplet (p, q, r).The latter lives on a grid with spacing ∆p = ∆q = ∆r := 1/k 0 and, thus, becomes a continuous variable in the large aggregate limit k 0 → ∞.Therefore we postulate the existence of a probability density P (p, q, r, t) such that Division of (36) by k 3 0 and the limit k 0 → ∞ (∆p = ∆q = ∆r → 0) will lead to an equation for P .We deal with the six differences on the right hand side of (36), corresponding to the six reactions, separately.Reaction 1: Reaction 2: [κ 2 pP (p, q, r − ∆r, t) − κ 2 pP (p, q, r, t)] → ∂ p (κ 2 pP ) − ∂ q (κ 2 pP ) − ∂ r (κ 2 pP ) .
Reaction 3: Since this is a second-order reaction, it would dominate the dynamics for large k 0 , if the reaction constant were of the same order of magnitude as the others.In order to avoid this, we set κ 3 = κ 3 /k 0 and keep κ 3 fixed as k 0 → ∞.Note that p has been replaced by p in the argument of since p − p = O(∆p).At all these generic points the sum in (36) has only one term.We shall also need α j,k → nr − 2q (n − 2)r =: α(q, r) .

Figure 6 :
Figure 6: Number of aggregates in terms of [U bi] (or more precisely (4 × U bi − GST − GF P ) 2 ) at fixed [p62]= 2µM [15].Average and SD among three independent replicates are shown.The dashed line represents a fitted sigmoidal (more precisely, logistic) function, centered around [U bi] = 1.6µM .Note that here p62 monomers are counted.Under the assumption that p62 only occurs in oligomers of size n we have [p62]=n[p62 n ].The regression coefficient R 2 measures the quality of the fit.
for the fixed values n = 5 and K d = 0.5µM (motivated by the estimates above) in a bifurcation diagram in terms of the concentrations [U bi] and [p62 n ].Note the unsymmetry in the dependence on the two quantities: The critical values for [U bi] tend to zero as [p62 n ] tends to infinity, whereas the critical values for [p62 n ] tend to the positive values K d n−2 for ᾱ = 1 and nK d 2(n−2) for ᾱ = 0, as [U bi] tends to infinity.