Modelling the regulation of telomere length: the effects of telomerase and G-quadruplex stabilising drugs

Telomeres are guanine-rich sequences at the end of chromosomes which shorten during each replication event and trigger cell cycle arrest and/or controlled death (apoptosis) when reaching a threshold length. The enzyme telomerase replenishes the ends of telomeres and thus prolongs the life span of cells, but also causes cellular immortalisation in human cancer. G-quadruplex (G4) stabilising drugs are a potential anticancer treatment which work by changing the molecular structure of telomeres to inhibit the activity of telomerase. We investigate the dynamics of telomere length in different conformational states, namely t-loops, G-quadruplex structures and those being elongated by telomerase. By formulating deterministic differential equation models we study the effects of various levels of both telomerase and concentrations of a G4-stabilising drug on the distribution of telomere lengths, and analyse how these effects evolve over large numbers of cell generations. As well as calculating numerical solutions, we use quasicontinuum methods to approximate the behaviour of the system over time, and predict the shape of the telomere length distribution. We find those telomerase and G4-concentrations where telomere length maintenance is successfully regulated. Excessively high levels of telomerase lead to continuous telomere lengthening, whereas large concentrations of the drug lead to progressive telomere erosion. Furthermore, our models predict a positively skewed distribution of telomere lengths, that is, telomeres accumulate over lengths shorter than the mean telomere length at equilibrium. Our model results for telomere length distributions of telomerase-positive cells in drug-free assays are in good agreement with the limited amount of experimental data available.


Introduction
Most normal cells cycle and divide a limited number of times, a discovery first made by Hayflick (1965), who grew normal human fibroblasts in culture and observed 60-80 population doublings before apoptosis. The limited number of divisions is referred to as the Hayflick limit (Hayflick 1979). On reaching the Hayflick limit, cells cease proliferation permanently and enter a state called replicative senescence in which they are still alive and functional but do not divide. The erosion of protective structures located at the ends of chromosomes, known as telomeres, during each replication is responsible for the limited lifespan of a cell, marking the ageing of cells and eventually triggering irreversible cell cycle exit and cell death when telomeres become critically short. Unprotected telomeres are detected by the DNA repair machinery and trigger a DNA damage response characterized by the formation of telomere dysfunctioninduced foci at telomeres (Takai et al. 2003). Activation of the tumour suppressor gene p53 then triggers cell cycle arrest leading to apoptotic cell death or replicative senescence through the induction of p21. Alternatively, expression of p16 can induce senescence, although this is less well understood (Deng et al. 2008).
There have been several approaches based on mathematical modelling to understand telomere length dynamics of somatic and cancerous cells and how they contribute to chromosome stability and the initiation of senescence or apoptosis. The first papers on quantitative modelling of telomere dynamics describe the process of telomere shortening by simple deterministic (Levy et al. 1992) and probabilistic (Arino et al. 1995;Olofsson and Kimmel 1999) models, which only take account of losses due to a special phenomenon called the end-replication problem causing telomeres to shorten progressively with each cycle (see Sect. 2.1). Rubelj and Vondracek (1999) extended these previously established models of telomere shortening ('gradual telomere shortening') by the introduction of the possibility of 'abrupt telomere shortening' caused by DNA repair mechanisms due to accumulation of DNA damage, producing sudden, stochastic changes in telomere length, and becoming more frequent as telomere shortening advances. Similarly, Sozou and Kirkwood (2001) and Proctor and Kirkwood (2002) included environment-dependent components effecting telomere shortening, where oxidative stress in form of endogenous reactive oxygen species produced by mutant mitochondria is assumed to be the cause of substantial telomere loss.
Telomere length is maintained in most cancer cells by the enzyme telomerase capable of adding new telomeric DNA onto chromosome ends. Mechanisms contributing to telomere length equilibrium have been considered by Blagoev (2009), who proposed a model in which telomere extension by telomerase occurs more frequently at short telomeres than at long telomeres. A logistic function describes the probability of the occurrence of an extendible state of telomeres, opposed to a capped state, which was inspired by data from experiments in Teixeira et al. (2004) on telomere elongation in yeast cells. Other work on the dynamics of telomere length by Qi (2011) compares the effects of normal ageing, accelerated ageing of patients with Werner's syndrome and the unlimited lifespan of telomerase-positive cells, where models involve telomerelength-dependent telomere loss and gain as well as telomere-length-dependent probabilities for cell division.
Mathematical models can be a useful means for integrating different types of experimental data to predict the mechanism of action of compounds (Wolkenhauer et al. 2009). In this paper, we aim to develop and analyse differential equation models of the effects of G-quadruplex ligands on telomere length regulation in telomerase-positive cancer cells, where we consider different time-scales, from one cell cycle to a large number of cell replications.
In Sect. 2 we explain the background biology. In Sect. 3 we develop and analyse a model describing the telomere length dynamics for telomerase-positive cells during one cell cycle and investigate how they respond to treatment with the G4-stabilising drug RHPS4. This model can also be used to describe telomere length dynamics over a small number of cell generations. In Sect. 4 we investigate for what levels of telomerase and RHPS4 the telomere length distributions stabilise over a large number of cell generations, and we predict the corresponding steady-state length distributions. Section 5 summarises the results and contains a concluding discussion.

Telomeres
Mammalian telomeres are specialised nucleotide sequences that protect the end of chromosomes. Telomeres contain short tandem repeats-sequences of basepairs that are repeated numerous times. The sequence of the repeat unit is TTAGGG in mammalian cells and human telomeres contain around 10-15 kilobasepairs per chromosome end at birth. Telomeres of most somatic cells typically shorten in the synthesis (S) phase of the cell cycle during each replication due to the "end-replication problem", that is, the inability to fully replicate the terminating DNA sequences (Levy et al. 1992). One shorter and one longer telomeric end are generated during replication, where the longer strand is generally rich in guanosine (G) and devoid of cytosine (C). The single-stranded protrusion of the longer strand is referred to as the G-overhang, which varies between 50-500 nucleotides in mammalian cells and is considerably shorter in most other eukaryotes. About 3 basepairs are lost from one DNA strand on each round of cell division due to the end-replication problem. However, human and mouse telomeres shorten by about 50-200 basepairs during each replication at both telomeric ends and the average telomere length in human cells decreases by roughly 2-4 kilobases during their lifetime. A more likely explanation of the intensive and double-sided telomere shortening is postreplicative processing by a nuclease (Palm and de Lange 2008;Makarov et al. 1997). Also, oxidative stress in the from of reactive oxygen species, which accumulates over the lifespan of a cell, causes accidental lesion in the DNA and is assumed to be the cause of substantial telomere loss (Richter and von Zglinicki 2007;von Zglinicki et al. 2005).
This progressive telomere erosion has been designated as the reason why normal mammalian somatic cells only divide a finite number of times in vitro, before undergoing permanent growth arrest. It is, however, not yet clear whether it is the average telomere length (Martens et al. 2000) or the length of the shortest telomere (Hemann et al. 2001) that is critical for the onset of cell cycle arrest in a cell. Looking at the distribution of telomere lengths can help determine whether a percentage of short telomeres or the mean telomere length is a trigger for the onset of replicative senescence. Telomeres that become critically short, and thus unprotected, are recognised by the cell as DNA double-strand breaks, inducing senescence and apoptosis, which is dependent on the expression of the oncosuppressor gene p53. Those cells which pass this point in cell replication through inactivation of p53 continue dividing, lose all their protective telomeric DNA and enter a state called crisis, causing end-to-end joining of chromosomes and other forms of enormous genomic instability, carcinogenesis and eventually cell death (see Greenberg 2005 for a review).

Telomerase
The enzyme telomerase can antagonise telomere shortening by association with the telomeric end, where it progressively synthesises telomeric repeat sequences at the single-stranded overhang of the telomere, thus inhibiting telomere uncapping which occurs when telomeres become too short. It has been suggested that human telomerase acts rapidly on most (∼70-100 %) telomeres following replication (Wu and de Lange 2009) and requires the telomeric G-overhang for telomere elongation. In HeLa cells that were synchronised at the G 1 /S transition, the total overhang length gradually increased over the next 6-7.5 h (Zhao et al. 2009;Dai et al. 2010), indicating a phase of increased telomerase activity, and that telomeric sequences are replenished each time a cell divides. Telomerase consists of TERC (Telomerase RNA Component) with a template region for copying telomeric repeat sequences, and the catalytic protein TERT (Telomerase Reverse Transcriptase), which catalyses the G-rich extension of linear chromosomes. TERC is generally highly expressed in all cells, and independently of telomerase activity, whereas the concentration of TERT is estimated at less than 50 copies per cell. In normal somatic cells the catalytic subunit TERT is repressed, but it is upregulated in immortal cells, suggesting it is the major determinant for telomerase activity.
The majority of cancer cells express telomerase continually; they possess altered telomeres and have the potential for unlimited replication. Telomerase was found to be present in 85-90 % of cancerous cells and it is believed that its specific role is to immortalise these cells (Kim et al. 1994). Most of the remaining 10-15 % of cancer cells, in contrast, can maintain their telomeres by a telomerase-independent pathway called alternative lengthening of telomeres (ALT); see Cesare and Reddel (2010).

Telomere structure
The extendible, open form of telomeres is presumably the most likely form occurring during telomere synthesis. Telomeres, however, can loop back and tuck their single-stranded end into the duplex DNA of telomeric sequences to form a t-loop (reviewed by Blackburn 2001;de Lange 2004de Lange , 2009, where a specific protein complex named shelterin (or telosome) (de Lange 2005; Palm and de Lange 2008) is involved in protecting chromosome ends from DNA degradation and DNA damage responses. The t-loop might dissolve during DNA replication; however, it is not yet known whether t-loops switch into an open state during the S phase or persist throughout the cell cycle. In our model (Sect. 3) we will assume that t-loops also function as telomerase inhibitors, as they hide the telomeric G-rich end from access by telomerase, and structural rearrangements between t-loops and the open form of telomeres allow telomerase to establish telomere length homeostasis.
Alternatively, telomeric ends can spontaneously fold into guanine-rich structures called G-quadruplexes (G4), discovered by Henderson et al. (1987), which are supported by monovalent cations such as potassium (K + ) in the nucleus. G4 structures form in vivo and probably unfold during telomere replication (Schaffitzel et al. 2001). When G-quadruplexes are located at the very end of the telomeric G-overhang, which has been shown to be their preferred location (Tang et al. 2008), the enzyme telomerase is inhibited by the folding of the G-rich end (Zahler et al. 1991). For reviews of G-quadruplex structures in vitro and in vivo, see Lipps and Rhodes (2009) and Knig et al. (2010). More general reviews on telomere structures and their function in chromosome-end protection can be found in Oganesian andKarlseder (2009) andXu (2011).
Optimal telomerase activity seems to require the unfolded single-stranded form of terminal telomere sequences. Despite the length variation of individual telomeres within a cell or an organism, average telomere length is maintained within a narrow range that is specific for each species. Studies of sperm ), for instance, suggest 8-20 kb in human, where germ-line cells generally express telomerase (except from mature sperm and oocytes, see Wright et al. 1996). Furthermore, data on telomere length in the telomerase-positive HeLa and MCF-7 human breast cancer cell lines (Canela et al. 2007) indicate a coefficient of variation of, respectively, 0.23 and 0.11. The stability of telomere length suggests the thesis that telomerase-positive cells establish an equilibrium between telomere attrition and elongation. However, Cristofari and Lingner (2006) found that HeLa telomeres, which were observed over 56 population doublings (PD), elongated at a constant rate of 415-635 bp/PD upon overexpression of the main functional subunits of the enzyme telomerase, the catalytic protein TERT and the telomerase RNA component (TERC). This massive telomerase activity is referred to as super-telomerase, and long telomeres did not change into a permanently non-extendible state in super-telomerase cells.
Telomere length has been measured using different techniques, among which telomere restriction fragment (TRF) analysis using Southern blotting (Kimura et al. 2010), and quantitative fluorescence in situ hybridization (Q-FISH) (Poon et al. 1999) have been frequently used. Several reviews of the techniques of telomere length measurement can be found in the literature (Saldanha et al. 2003;Dmitriev and Vassetzky 2009;Samassekou et al. 2010). A recent high-throughput (HT) Q-FISH method (Canela et al. 2007) generates telomere-length frequency histograms, and allows for the analysis of interphase nuclei. Telomere length is maintained in telomerase-positive HeLa cells, for example, with a measured mean value of L 0 = 3.44 kb and standard deviation of  Canela et al. (2007), with permission from PNAS. b A gamma probability density function, p(x) = e −x/θ x γ −1 /(θ γ Γ (γ )), for the telomere length in HeLa cells, with mean L 0 = 3,440 bp and standard deviation σ 0 = 800 bp, that is with the parameters γ = L 2 0 /σ 2 0 and θ = σ 2 0 /L 0 , is indicated by the solid gray line. The rate of t-loop formation (see Sect. 3 and formula (1)), k c (x), is modelled by a sigmoidal function of telomere length (dashed line) with shape parameters α = 1,775 bp, β = 300 bp and δ = 5 × 10 −5 s −1 . Shorter telomeres are more likely to be in an unlooped form than longer telomeres σ 0 = 0.80 kb. A HT Q-FISH histogram of the telomere length distribution of HeLa cells is shown in Fig. 1a, which we approximate by a gamma distribution in Fig. 1b, shown by a solid gray line.
Investigation of Martens et al. (2000) into normal human fibroblasts (telomerasenegative) having a limited lifespan showed that short telomeres increasingly accumulate in cells and the length distribution of telomeres becomes positively skewed close to senescence. Proctor and Kirkwood (2003) considered the uncapping of telomeres by the opening of t-loops as a trigger for replicative senescence to account for the experimental results found by Martens et al. (2000).

Quadruplex-stabilising drugs
On the other hand, stabilisation of G-quadruplexes by specific ligands can limit telomerase activity and alter telomere function in cancer cells. Anti-cancer researchers are now trying to design G-quadruplex ligands that will mimic the effect of the metal ions and inhibit telomerase, with the aim of achieving antitumour activity through the effective stabilisation of G-quadruplexes (Monchaud and Teulade-Fichou 2008). The G-quadruplex ligands RHPS4 (3,8,3,2kl]acridinium methosulphate) (Heald et al. 2002) together with the 3,6,9-trisubstituted acridine compound BRACO-19 (Gunaratnam et al. 2007) and telomestatin (Tauchi et al. 2006), are promising compounds among the cancer inhibitor agents and have come close to clinical testing (Bilsland et al. 2011). These compounds all inhibit telomerase activity, limiting long term proliferation of cancer cells, and directly target components of the protective cap of telomeres, leading to immediate effects on cancer cell proliferation (Neidle 2010). Cheng et al. (2008) compared relative quadruplex and duplex binding affinity constants of different quaternary polycyclic acridinium salts and found that quaternised quino[4,3,2-kl]-acridinium salts, such as RHPS4, selectively bind and stabilise quadruplex DNA. Also, quadruplex DNA binding affinity correlated strongly with telomerase-inhibitory activity data for these G4 ligands. Cookson et al. (2005) showed a notable reduction in telomere length of MCF-7 breast cancer cells when treated with subtoxic doses (<1 µM) of RHPS4. RHPS4 treatment of human melanoma lines possessing relatively long telomeres resulted in a dose-dependent decrease in cell replication and accumulation of cells in the S-G 2 /M phase of the cell cycle (Leonetti et al. 2004). Furthermore, RHPS4 induces a marked decrease of cell growth in human cell lines such as the 21NT breast cancer cells and A431 vulval carcinoma cells after 15 days and for concentrations lower than the level of acute cytotoxicity (Gowan et al. 2001). It also rapidly induces telomere dysfunction by telomere uncapping, which leads to short-term cell death through usage of higher doses. Integrative approaches to investigate the effects of RHPS4 experimentally and by means of mathematical modelling were proposed by Johnson et al. (2011) and Hirt et al. (2012). The precise cell-cycle specific behaviour of RHPS4 and its mechanism of action in cancer cells, however, are still to be elucidated. Golubev et al. (2003) used mathematical modelling to investigate possible causes for the observed positive skewness of telomere length distribution, such as DNA damage caused by free radicals. Similarly, Grasman et al. (2011) characterise the dynamics of telomere shortening by the property that longer telomeres are more vulnerable to oxidative stress, as they are larger targets. The enzyme telomerase is also active at a low level in some somatic cells (Masutomi et al. 2003), such as human fibroblasts, and telomerase-dependent shortening, leading to positively skewed telomere length distributions, has been explained by op den Buijs et al. (2004). Another mechanism yielding skewed telomere length distributions has been described by Itzkovitz et al. (2008), who developed a population mixture model with a re-populating pool of stem cells of constant telomere length and a derived pool which experiences constant decrease in telomere length, where one daughter cell of the repopulating pool stays and one transfers to the derived pool after cell replication.

Mathematical modelling
Furthermore, there are a number of models of telomere-length maintenance in telomerase-positive cells. For example, Kowald (1997) developed a mathematical model involving the concentration of a capping protein, which can bind the G-overhang once it is sufficiently long, and which inhibits telomerase and facilitates DNA replication by maintenance of the single-stranded overhang it is bound to, but is released after telomere replication. To account for the assumption that the functional state of the telomere rather than its length determines the fate of a cell, Arkus (2005) considered the binding and dissociation of a telomeric protein, TRF2, to telomeric repeat sequences, assuming that TRF2 caps telomeres and inhibits telomerase. Rodriguez-Brenes and Peskin (2010) proposed another approach of modelling telomere length maintenance processes based on the biophysics of t-loop formation, which is assumed to determine the state of a telomere and also the cell's fate. They assumed that the longer telomeres are, the more frequently telomere ends come into close proximity of internal positions of the telomere, and hence the more likely are invasions of doublestranded DNA by the G-overhang, which can be facilitated by TRF2 and results in the Kinetics for each reaction are described by their rate constants k. Free telomerase (T) and the G4-stabilising drug (R) in the nucleus bind open forms (U) and G4 structures (G), respectively. Telomerase elongation occurs at rate ρ. Telomeres enter the system at rate k e and exit the system due to t-loop formation at rate k c (x) and due to G4-stabilisation by a G4 ligand at rate k r . Here, x is the length of a telomere. The transitions U → t-loop and C → stabilised G-quadruplex are assumed to be irreversible, and only through replication telomeres could leave these states formation of displacement loops together with t-loops. The dynamics of t-loop formation were described by a worm-like chain model and an algorithm was developed to sample telomeric chromatin chains (modelled as semi-flexible polymer chains) at thermodynamic equilibrium. An ODE model and a stochastic model describe shortening by the end-replication problem, C-strand processing and telomerase-induced telomere elongation. In addition to telomere length maintenance, Kowald (1997) modelled the increase in telomere length when oligonucleotides are added to cell culture. On the other hand, Sidorov et al. (2003) investigated the impact of telomerase inhibition on the growth of tumours possessing either homogeneous or heterogeneous telomere length distributions.
Telomerase-independent pathways of telomere length maintenance are considered by Olofsson and Bertuch (2010), who capture the mechanisms of survivorship of individual budding yeast cells. Another approach to understanding telomere length maintenance by means of mathematical modelling is presented by Antal et al. (2007), who superimpose stochastic telomere length variations upon the systematic decrease in telomere length in ALT cells.

Model for discrete generation numbers
We summarise the mechanisms of telomere length regulation in a simple model containing the states U, B, G and C respectively for the number of telomeres in the open (Uncapped/Unfolded) form, those Bound to telomerase (T), those in a G-quadruplex formation and those forming a Complex with the drug RHPS4 (R). The model is illustrated in Fig. 2. We refer to B, C and the t-loop in the model as the capped states, to G as the folded form of telomeres and to U as the state of telomeres that are neither capped nor folded, that is in the open form. After telomere duplication in the S phase telomeres are introduced into system at rate k e in the open form (U) state and then switch between the open and G4 forms (G4 folding rate k f and G4 unfolding rate k u ), where telomeres in the open form bind to free telomerase molecules T , with association rate k on and dissociation rate k off , which synthesise nucleotides with rate ρ at the telomere end. Telomeric intramolecular G4 structures do not allow telomerase association with the G-overhang, and can be stabilised by free G4 ligand molecules R, where the association and dissociation rates of RHPS4 are k s and k d , respectively. We assume that one telomerase molecule binds one telomere to elongate the telomeric end, and one G4 ligand molecule is sufficient to stabilise a G4 form. Furthermore, all kinetic rates are assumed to be constant and non-negative.
We aim to simulate not only the dynamics of the average telomere length, but also of the telomere length distribution over time for control cells and cells treated with a G4-stabilising compound. We include a variable x for the length of telomeres and allow for a constant influx of telomeres in the open form into the system, at rate k e , and losses with rates k c (x) and k r , after the formation of t-loops and locking of G4 structures by G4 ligands, respectively. We assume that the rate of t-loop formation k c is dependent on telomere length, where shorter telomeres are more likely to form t-loops than longer telomeres. In particular, we approximate the rate of the formation of t-loops by a sigmoidal function, shown as the dashed line in Fig. 1b, inspired by a quantitative model for the probability of uncapping of telomeres in mammalian cells (Proctor and Kirkwood 2003). The shorter the telomeres, the fewer binding sites for certain shelterin proteins (Griffith et al. 1999) facilitating the t-loop formation are present, and, in turn, the less likely the t-loop formation becomes. Hence the probability of uncapping is modelled as a decreasing function of telomere length. Using x to denote the number of basepairs (bp) of a telomere, we model the rate of t-loop formation (telomere capping) by with shape parameters α > 0 bp, β > 0 bp and δ > 0 s −1 . For small β, this has the form of a step function, with step at x = α; k c (x) ≈ 0 for x < α and k c (x) ≈ δ for x > α; and β describes the range of telomere lengths over which the transition occurs.
Since the average telomere loss of about μ = 45 bp during chromosome replication is much less than the initial telomere length of approximately 2k to 6k basepairs in HeLa cells, we treat telomere length, x, as a continuous variable. The dynamics of the number of individual telomeres of length x at time t can be mathematically described by a partial differential equation (PDE) model of the number densities of telomeres in the states U, B, G, C, that is, where p(x) is the probability density function of the length of telomeres entering the system at rate k e . Assuming that telomerase and RHPS4 are conserved quantities in the system, we have for the numbers of free telomerase molecules, T (t), and the numbers of free RHPS4 molecules, is the only derivative term with respect to x in the model equations and accounts for the process of telomere elongation at rate ρ by telomerase.

Steady state
We now assume that the numbers of bound telomerase and bound RHPS4 molecules are small compared respectively to the numbers of free telomerase and free RHPS4 molecules in the nucleus, that is, T (t) ≈ T 0 and R(t) ≈ R 0 . This assumption requires k on k off and k d k s , which will be verified later (see Table 1). Steady state telomere length distributions are described by the equations for each of the four telomere states U, B, C, G. Using Eqs. (9) and (10), we obtain C ∝ G ∝ U . We then express U as a function of B and p using equation (7), and subsequently rewrite (8) as an ODE for the variable B(x). Solving (10), (9) and (7) respectively, we find then the equation relating the distributions B and p is

Gamma-distributed input
If we temporarily restrict ourselves to consider the lengths x > α and assume β α so that k c = δ, then we can find explicit forms for the distribution. We write x = x − α so we are only concerned with x ≥ 0. The form p( x ) = ΛB( x ) + ν B ( x ) has special solutions in terms of the Gamma distribution. We model the input function p( x ) as a gamma distribution with parameters γ and θ , hence Choosing the parameters such that γ = L 2 /σ 2 and θ = σ 2 /L we obtain a distribution with mean telomere length L and variance σ 2 . Choosing the parameters θ = ν/Λ, we find that both B( x ) and p( x ) are Gamma-distributed with Hence, the effect of the process is to increase the exponent from the input value ( p) of γ − 1 to the output (B) value γ . Since the Gamma distribution has mean γ θ, and variance γ θ 2 , the overall effect of the process illustrated in Fig. 2 is to increase the mean from the input value of (γ − 1)ν/Λ, being the mean of the input function p( x ), to γ ν/Λ, as the mean of B, the output. The standard deviation is also increased by the process. These solutions satisfy the boundary conditions p, B → 0 as x → 0 + , ∞.

Gaussian-distributed input
We now return to the more general case, where x < α is permitted, and impose the boundary condition B(−∞) = 0 or B(+∞) = 0 on (8). This boundary condition is chosen such that to avoid negativity in B(x) (imposing B(0) = 0 leads to a sign change of B(x) in x = 0 due to p(x) > 0 for all x ∈ R). In order to investigate which parameters control the shape of the telomere length distributions at steady state, we derive approximate analytical expressions for the distributions. Assuming γ = L 2 /σ 2 is sufficiently large and using the central limit theorem, we approximate the gamma distribution p with the Gaussian distributioñ having the same mean, L, and variance, σ 2 , as p. We also assume β is small and so approximate k c (x) in (1) where and and a 1 , b 0 , b 1 are explicit expressions involving kinetic parameters of the system that are too complex to display here; for details see Hirt (2012). All constants satisfy a i , b i > 0. Here, the steady state distribution of B is a product of an exponential function and an error function, where the error function dominates the exponential function for x < α and vice versa for x > α. The parameter b 2 determines how rapidly the telomere length distribution approaches zero for increasing telomere lengths, x, with smaller values of b 2 increasing the positive skewness of the distribution. Hence, decreasing the rate of t-loop formation (by lowering δ) increases the positive skewness of B(x), and so does increasing the number T 0 of telomerase molecules (or the rate, ρ, of telomere elongation), for example. The parameters k off and k e function as scaling factors that determine the contribution of B(x) andp(x) to the telomere length distribution U (x). For small k off , when B(x) is increasingly positively skewed, we expect B(x) to have a larger tail than U (x) due to the respectively decreasing and increasing contributions of B(x) andp(x) to U (x). We note that U (x) is independent of the rates k f and k u of G4 folding and unfolding, respectively, for control cells (R 0 = 0). The distributions of G and C are of the same shape as U (x), where larger R 0 increases the ratio of telomere numbers C/G. The ratio of telomere numbers C/U increases with increasing R 0 in a nonlinear fashion, and tends to k f /k r for large R 0 . On the other hand, the ratio of telomere numbers G/U decreases with increasing R 0 in an inverse fashion, and is equal to k f /k u for R 0 = 0. For larger T 0 , the distributions become increasingly positively skewed.
By integrating the Eqs. (7)-(10) over the interval [−∞, ∞), assuming B(−∞) = B(∞) = 0, and taking the sum of all these equations, we obtain the steady-state input-output balance where k c (x) is given by (1). We have confirmed that this holds for the solutions plotted later. An analytic expression for the mean telomere length, L = L(T 0 , R 0 ), of telomeres leaving the system at steady state, is and an approximate formula for (24), based on the approximation of k c (x) by δ H(x − α), has been derived using MATHEMATICA with a series of variable substitutions and simplifications, as the formulae involved in the computation are long and complex. We consider only the limiting case R 0 = 0 (no drug) for L, whence where erfc(x) = 1 − erf(x) is the complementary error function, and we assume positive concentrations of telomerase, T 0 > 0, and L > α. Whereas the last term in (25) is dependent on telomerase, and mean telomere length increases linearly with the concentration of telomerase, the second term in (25) is small (≈0.63 nt) relative to the final term (≈37 nt for T 0 = 25), when we adopt the parameter values in Table 1 and choose L = L 0 −μ. This telomerase-independent term is due to short telomeres, which can stay in the system for an extremely long time, lengthening slowly only leaving when x > α. Approximating k c (x) by a step function gives rise to the discontinuous limit of L as T 0 → 0, but L = L when T 0 = 0. The expression L is independent of k e , and in the limiting case R 0 = 0, L is also independent of k f , k u , k s , k d and k r . An increase in the parameter values T 0 (or ρ) leads to a linear increase in (25), becoming nonlinear for positive values of R 0 . An increase in σ leads to an increase in L for R 0 = 0.

Numerical solutions
We aim to show numerical results of telomere length distributions over a few generations. To derive steady state distributions at the end of each replication, we initially assume the length distribution of telomeres before the first replication event to be Gaussian p 0 (x) as in (16) with L = L 0 and σ = σ 0 . Telomeres shorten at an average amount μ due to the end-replication problem and postreplicative processing, and consequently enter the system with length distribution p(x) = p 0 (x + μ). By numerically integrating (8) and using (23) we simulate the telomere length probability density function of telomeres leaving the system, p 1 (x) = (k c (x)U (x) + k r C(x))/k e , at steady state and compare it to the distribution p 0 (x) of telomeres entering the system before telomere shortening takes place. Assuming telomere length does not change between the S/G 2 phases of subsequent cell divisions, we treat the output telomere lengths of one cycle as the input telomere length for the next generation, p = p 1 (x + μ). In particular, we use the steady state distribution of telomeres leaving the system at the end of the initial generation i = 0 as input (of rate k e ) into the system at the beginning of generation i = 1 and after telomere loss of amount μ occurred, that is (k c (x + μ)U (x + μ) + k r C(x + μ))/k e = p 1 (x + μ) replacing p(x) in Eq. (2), and derive steady state distributions p i (x) for higher generations i in this fashion.
In general, there is only very little data on the rate parameters available in the literature. We estimate k e = 1.5 × 10 −2 s −1 according to an approximate influx of 2 × 2 × 82 telomeres [after telomere duplication and considering HeLa cells with a modal chromosome number of 82, as determined by Macville et al. (1999)] into the system per 6 h, that is the time during which telomere extension occurs (Zhao et al. 2009), and estimate, within biologically feasible ranges, the values α, β and δ for the sigmoid function k c (x) (compare Fig. 1b) and the loss rate k r , as shown in Table 1. In order to adopt appropriate values for the remaining parameters, given in Appendix, we employ the fact that a concentration of 1 nM corresponds to approximately 10 −9 N A V n mol l −1 ≈ 415 molecules per HeLa nucleus, where N A = 6.022 × 10 23 mol −1 is the Avogadro constant and V n = 6.9 × 10 −13 l is the nuclear volume of a HeLa cell. The BioNumbers data base of Milo et al. (2010) provides us with this average volume for HeLa nuclei, which is taken from Monier et al. (2000). Using standard conversion factors, we obtain the rate parameters shown in Table 1. Note that k on k off and k d k s as required by the earlier assumption that T (t) ≈ T 0 and R(t) ≈ R 0 for the numbers of bound telomerase and RHPS4 molecules, respectively. The exact values for each of the rates of RHPS4 binding, k s , and RHPS4 dissociation, k d , are not relevant for the steady state of the system, so we set k s = 10 −7 s −1 (following the size of the other molecular binding kinetic parameter, k on ), for example, and derive k d = k s K −1 from the RHPS4 equilibrium binding constant, K . Figure 3 shows simulations of telomere length distributions, p i (x), of telomeres entering the system at generations i = 0, 10, 20, 30, for different concentrations of RHPS4 for the cases T 0 = 25 (telomere length equilibrium) and T 0 = 50 (supertelomerase cells), respectively. For increasing drug concentrations, R 0 , the telomere length distributions are shifted towards zero and become less positively skewed. The value ρ = 6.287 × 10 −2 nt s −1 has been chosen in the simulations, as the telomere length distribution for telomeres leaving the system for T 0 = 25 is in good agreement with experimental data from HeLa cells (see Sect. 4.2).

Summary
We developed and analysed a model of telomere length dynamics for a single division event, which describes the length regulation by telomerase and a G4-stabilising drug. We solved the model numerically and iterated the replication process over few generations. We now investigate the long term evolution over many generations.

Model with continuous time
In order to investigate changes in the telomere length distribution over a large number of cell divisions, we now modify the model (2)-(5). We feed the telomeres that exit the system with rates k c (x) and k r back into the system in the unfolded form, assuming that telomeres shorten by an amount μ due to the end replication problem (in the S phase) before they re-enter the system. The strategy we employed to simulate telomere length distributions over several generations worked well for smaller generation numbers i, but is computationally too expensive for large values of i. Figure 4 illustrates a modified version of the model (2)-(5), allowing for analysis of the telomere length distribution after large numbers of cell divisions, that is iterative S/G 2 phases, where we assume that telomere length does not change in the G 0 /G 1 and the M phase of the cell cycle.
We aim to analyse the model dynamics and compare the output telomere lengths k c (x)U (x) + k r C(x) to experimental telomere length measurements. We replace k e p(x) in (2) by k c (x + μ)U (x + μ, t) + k r C(x + μ, t) and obtain the resulting closed system of differential equations of (3)-(5) together with which is a partial differential-delay equation.
We are interested in analysing the system's steady-state telomere length distributions and predicting how telomerase and/or RHPS4 affect telomere length distributions over large numbers of cell divisions.

Steady states
Assuming that steady state solutions exist is a major assumption, the validity of which we must consider carefully; for example, there may not be enough telomerase to maintain a steady state, or there may be too much telomerase and telomere length may increase without limit. Hence we expect a window in parameter space of feasible steady solutions. We assume that T (t) ≈ T 0 and R(t) ≈ R 0 hold at steady state and hence obtain the approximate steady state equations Solving (27), (29), (30) for B as a function of U (x) and U (x + μ), we obtain with where Inserting (31) into the ODE (28) reduces the governing equations to a single delay differential equation in U (x), namely Now we aim to find approximate solutions to Eq. (34) by using quasi-continuum approximations. Such approximations have previously been used for nonlinear waves in advance-delay equations (Collins 1981;Rosenau 1986;Wattis 1996).
By defining y = x + μ/2, we rewrite (34) as It is useful to introduce the notation ∂ y for the differential operator ∂/∂y, and use ∂ n y to denote ∂ n /∂ y n . For analytic functions, f (y), and α ∈ R, we express where exp(α∂ y ) = ∞ n=0 α n ∂ n y /n! is a differential operator which yields the Taylor series at y, that is, when applied to a function f . Assuming U is analytic, we use (36) with α = μ/2 and α = −μ/2 and re-formulate (35) as Rearranging using commutativity of operator multiplication yields where A is an operator. Expanding the components of A to third order in μ, we find and exp − 1 2 μ∂ y − exp 1 2 μ∂ y = −μ∂ y − and for the inverse operators and where we write for sufficiently large x. Hence, assuming the quantities μ and ρ/k off are small (from Table 1, μ = 45 nt, ρ/k off ≈ 286 nt) compared to x ≈ L 0 = 3,440 nt, from (39) we obtain Note that the integration constant that appears when one applies ∂ −1 y to U is equal to zero, as we assume U ( j) (x) → 0 as x → ±∞ for j = 0, 1, 2, 3. Thus we obtain the first-order differential equation which approximately describes the telomere length distribution U (x) at steady state.
We rewrite (45) as by defining and use (46) to analyse how different numbers T 0 and R 0 affect the distribution of telomeres in the system at steady state. Equation (46) is a separable ODE, which we solve by integrating with respect to x and using (32) and (1) for g(x) and k c (x), respectively, thus where s 0 ∈ R is a constant that depends on the initial number of telomeres in the system. We simplify (48) to obtain where and The amplitude of the distribution is determined by A. For example, we may choose s 0 and hence A, such that ∞ −∞ U (x) dx = 1, that is, U (x) is a probability density function. The quantity λ describes the skewness of the distribution, sech being a symmetric bell-like curve.

Parameter range of steady state solutions
Having constructed a steady-state approximation for the solution (49), and noted at the start of Sect. 4.1 that presuming the existence of a steady state is a significant assumption, we now analyse the conditions under which such a solution may be expected to be relevant. Necessary conditions for a steady solution, U (x), of the continuum model (27)-(30) to be a distribution are that U must have a maximum at a point x where U ( x ) = 0 and U (x) ≥ 0 for all x ∈ R. From (46) it follows that g( x ) = c T must be satisfied for the relevant values of T 0 ≥ 0 and R 0 ≥ 0 in order for U to be physical. We note that solutions U (x) > 0 for x < 0, representing a positive probability of telomeres with a negative length, should be regarded as unphysical. The probability of telomeres with length x < 0, however, is usually very small in our simulations and we interpret the occurrence of larger proportions of telomeres with negative length to reflect the presence of telomeres that have lost all their telomeric sequences and are no longer functional. Such telomeres would typically cause a cell to become senescent or undergo apoptosis. In the following, we aim to find conditions on T 0 and R 0 that must be satisfied to yield solutions with U 0, and we require x > 0 for such solutions to be physical.
It follows immediately from the definitions (32) and (1) that g(x) < δ + c R for all x ∈ R, and hence by (46) and (47) c T < g < δ + c R in x > x, providing us with an upper bound for the number T 0 of telomerase molecules, from (33) and (47) For high concentrations of RHPS4 namely as R 0 → ∞ we find T max (R 0 ) → T ∞ max = μk off (δ + k f )/(ρk on ).
By the same reasoning as for (52) and since, from (32), g(x) > c R for all x ∈ R, we find c T > c R , which provides us with a lower bound on T 0 , namely Thus, for high concentrations of RHPS4 (R 0 → ∞) we find T min (R 0 ) → T ∞ min = μk off k f /(ρk on ). For physical solutions U (x), x needs to be positive, and since g is monotonic increasing in x, we need g( x ) > g(0) = c R + δ/(1 + e α/β ), which provides a larger lower bound on T 0 than (53), namely Note that in the limit β → 0T min = T min , however, in the figures below we use β = 300 bp.
ForT min < T 0 < T max we expect steady state solutions, for T 0 <T min the amount of telomerase is insufficient and the telomere length decays causing the cell to become senescent or undergo apoptosis. For T 0 > T max telomere length grows without limit.
Alternatively, we reformulate these inequalities to provide a lower and an upper bound on R 0 , for given T 0 > 0, in a similar way to (52) and (53), that is, and for where no upper bound on R 0 exists for solutions U (x) to be physical, and this is not true for δ < k f , because then T max (0) < T ∞ min . The lower and upper bounds on R 0 or T 0 can be used to plot (T 0 , R 0 )-regions of parameter space where steady physical solutions U (x) exist, we can then identify the effects of changes of telomerase and RHPS4 concentrations in the cell. Examples with different values of δ are given in Fig. 5 to illustrate the cases δ < k f and δ > k f , where we choose ρ = 2.5 × 10 −1 nt s −1 to illustrate the shape of these regions (lower values of ρ result in unphysically large values of T ∞ min , for example T ∞ min > 10 5 for ρ = 4.5 × 10 −3 nt s −1 ). For δ > k f , there exists a range of telomerase concentrations where a steady state solution exists no matter how large or how small the concentration of RHPS4 is. For δ < k f , the region of steady state solutions is much smaller, hence more care for the regulation of telomerase and/or RHPS4 is required. We believe this latter case to be the more physically relevant by our choice of parameter values in Table 1, where δ k f . For smaller (and more realistic) values of ρ, one is likely to find simpler T 0 -R 0regions in the form of a band in the T 0 -R 0 -plane as illustrated in Fig. 6. The mean telomere length of telomeres leaving the system at steady state has been computed using the same formula as for L in (24) and tends to −∞ for large values of R 0 and small values of T 0 , and to +∞ for small values of R 0 and large values of T 0 . Figure 6 shows a contour plot of L as a function of T 0 and R 0 for ρ = 6.287 × 10 −2 nt s −1 (see below for explanation), and a plot of L as a function of T 0 for R 0 = 0.
We simulate the approximate telomere length distributions of telomeres leaving the system at steady state. The bounds on T 0 for physical solutions U (x) at R 0 = 0, that isT and δ = 2.5 × 10 −2 s −1 (δ > k f , right plot). The rate of telomerase-induced telomere synthesis is ρ = 2.5 × 10 −1 nt s −1 . The lower (T min (R 0 )) and upper (T max (R 0 )) bounds on T 0 are defined by (53) and (52), respectively, and there is no visible difference between the lower bound T min (R 0 ) and the larger lower boundT min (R 0 ), defined by (54) . 6 a Contour plot of the mean telomere length L in (T 0 , R 0 ) space for ρ = 6.287 × 10 −2 nt s −1 and δ = 5 × 10 −5 s −1 . The dotted lines indicates the best approximation for T 0 such that L = L 0 when R 0 = 0 (T 0 = 25), and the according upper limit R max (T 0 ). b A plot of the mean telomere length L of telomeres k c (x)U (x)+k r C(x) exiting the system per unit time as a function of the number T 0 of telomerase molecules for the case of no drug (R 0 = 0) in the system (ρ = 6.287 × 10 −2 nt s −1 , δ = 5 × 10 −5 s −1 ). The dotted lines indicate the value L 0 and according value T 0 are 1 ≤ T 0 ≤ 30 for ρ = 6.287 × 10 −2 nt s −1 . Figure 7 shows two plots of telomere length distributions with T 0 = 25 and T 0 = 30 and varying concentrations of R 0 . Whereas an increase of T 0 leads to more skewed telomere length distributions, an increase in R 0 causes telomere length distributions to become less skewed. In Table 2  we give the mean telomere length L with their respective standard deviations σ , which are computed using the probability density function The decreasing positive skewness with increasing concentrations of RHPS4, R 0 shown in Fig. 7 for T 0 = 25 and T 0 = 30 is predominantly caused by large numbers of telomeres forming a complex with RHPS4, whose length is overall shorter than the length of telomeres leaving the system when they are in the open form, as illustrated in Fig. 8. The dependence of the mean telomere length L on the concentration of RHPS4, R 0 , is shown in Fig. 9 for chosen values of T 0 = 25, 30, 35.
The value ρ = 6.287 × 10 −2 nt s −1 for the lengthening of telomeres by telomerase has been chosen in the simulations, as the telomere length distribution for telomeres leaving the system for T 0 = 25 is in good agreement ( L = 3,440 ± 1,517 nt) with experimental data from HeLa cells (compare Figs. 1a, 7). The choice of a smaller value of ρ would yield much larger values for the number of telomerase molecules, that is  T 0 ≈ 400 forρ = 4.5 × 10 −3 nt s −1 (mean value from experimental measurements, see Appendix) at telomere length equilibrium. Slightly smaller values of ρ with corresponding larger values of T 0 can produce telomere length distributions being similar to each other. Having more knowledge of the values ρ, therefore, will help us determine the number of telomerase molecules in the nucleus more accurately. The small value of T 0 = 25 is, however, consistent with the average value of about 20-50 telomerase molecules per HEK-293 (human embryonic kidney) nuclei measured by Cohen et al. (2007), which is the only quantitative measurement of telomerase levels in cells we are aware of in the literature.

Discussion and conclusions
We have investigated the effects of G-quadruplex interactive agents, such as RHPS4, on telomere erosion in telomerase-positive cells, and developed mathematical models of telomere length dynamics. In particular, we considered telomere length dynamics during the S/G 2 phases, when telomerase replenishes telomeric sequences of open t-loops but not the G-quadruplex structure at each cell division. We determined steadystate length distributions, over small and large numbers of cell generations, with and without treatment by RHPS4. We derived approximate analytic expressions, and simulated numerically, steadystates of the length distributions of telomeres, as cancer cells exit the cell cycle. We analysed the effects of different levels of telomerase and various concentrations of RHPS4 on telomere length during the S/G 2 phases and considered how these effects evolve over large numbers of generations. Our models predict a positively skewed length distribution of telomeres and are in good agreement with experimental observations of HeLa cells. Moreover, our predicted value of the number of telomerase molecules in the nucleus is consistent with experimental findings in cancer cells.
In our models we assumed that t-loops and G4 structures are the key inhibitors of telomerase, and telomerase-induced telomere elongation. Constant telomere shortening due to the end-replication problem and C-strand resection, described by a constant loss term, μ, in our model for a population of continuously cycling cells, that is (3)-(5) and (26), complete the mechanisms that determine the shape of telomere length distributions. We modelled telomere elongation as a length-dependent process in that shorter telomeres are more likely to remain in an open state and be extendible by telomerase than longer telomeres, as longer telomeres have a higher tendency to coil up and form t-loops.
We further assumed that telomeres only exit the system by t-loop formation, unless G4 structures are locked by RHPS4 binding, when telomeres leave the G 2 phase of the cell cycle in the complex state and may unfold either at a later stage or at the beginning of the next cell cycle. We also supposed that the concentrations of telomerase and RHPS4 do not change over cell generations.
We have estimated most of the kinetic parameters in each system by using experimental results from the literature, and estimated the remaining parameters in order to reproduce the experimental results of HeLa cells obtained by Canela et al. (2007). Our model results agree well with the experimental telomere length distribution of HeLa cells and suggests a low concentration of about T 0 = 25 telomerase molecules per nucleus. In the literature the telomerase processivity parameter ρ is given in the range 1.2 − 7.7 × 10 −3 nt s −1 . We investigated the sensitivity of the results to ρ within this range. For small ρ we obtained narrower distributions of telomere lengths (and larger values of T 0 ) than in experimental data. We therefore fitted ρ using a larger value, ρ = 6.287 × 10 −2 nt s −1 , to describe the experimental data. There is almost no visible change in the telomere length distribution for smaller numbers of T 0 (≤ 100) during one S/G 2 phase as simulated by the model for a population of telomeres undergoing a single replication event, described by Eqs. (2)-(5), but visible changes occur after 5-30 generations, as shown in Fig. 3 with T 0 = 25 and T 0 = 50.
Our simulations of conditions on T 0 and R 0 for physically relevant steady solutions showed that the range of values T 0 ≥ 25 reproduces well the experimental results in the literature, and telomeres grow unboundedly in length for values of T 0 larger than 30. In contrast, telomeres shorten below physical lengths (or indefinitely) if we use drug concentrations larger than ∼100 nM of RHPS4 (see Fig. 7). Hence, the equilibrium state is rather sensitive to the amount of telomerase and RHPS4 in the system: larger doses of RHPS4 lead to continuous telomere shortening, whilst telomerase over-expression, on the other hand, induces continuous telomere lengthening. The steady telomere length increase we found for large T 0 is consistent with the findings of Cristofari and Lingner (2006), who observed elongation of telomeres at a constant rate in super-telomerase HeLa cells for over 50 population doublings. Hence, telomere length homeostasis cannot be established with telomerase overexpression.
Our model suggests two different effects of the treatment with RHPS4 which are dependent on the drug concentration used: low concentrations reduce telomere length, but do not impair the ability of the system to maintain an equilibrium, and high concentrations destabilise the system leading to chromosome degradation and senescence and/or cell death. Additionally, overexpression of telomerase can counteract telomere degradation; however, telomerase addition should be carefully regulated to maintain the system in equilibrium and not trigger unlimited telomere elongation.
Note that the upper limit for R 0 , when telomere equilibrium can still be maintained, is probably lower than we predicted. The threshold value for telomere length as determined by the Hayflick limit (Hayflick 1979), triggering senescense or apoptosis pathways in a cell terminating cell proliferation, is likely to be based on the shortest telomere in the cell, not the average length (Hemann et al. 2001;Abdallah et al. 2009). As far as we are aware, telomere length distributions for cells exposed to different drug concentrations of RHPS4 have not yet been measured experimentally. Measurements of telomere length distributions of telomerase-positive cells at different population doublings and for varying concentrations of the drug RHPS4 or different concentrations of telomerase will be useful in comparing our model predictions to experimental results.
We found that higher concentrations of telomerase can lead to ongoing telomere lengthening, which is consistent with observations from experiments with telomerasepositive cells. We derived regions of different telomerase and RHPS4 levels that provide physically plausible solutions to our model of telomere length dynamics over large numbers of generations and showed that telomerase expression must be strictly regulated for telomere length maintenance. Too high concentrations of RHPS4 can lead to progressive telomere erosion; we estimate that drug concentrations larger than ≈100 nM impair the equilibrium of the system leading to continuous telomere shortening and triggering senescence and apoptosis.
In summary, the main results of this paper are showing how telomerase acts as a regulator for telomere length, and how RHPS4 can disrupt this regulation as illustrated by Figs. 6b and 9. At small concentrations, RHPS4 has little effect, but there is a critical concentration above which telomerase is unable to maintain a steady state, and rapid shortening occurs. The region of telomerase-RHPS4 parameter space where steady states exist has also been determined and is illustrated in Fig. 5.
Promising directions for future work include modelling other players involved in telomere maintenance. For instance, the shelterin protein POT1 is involved in several processes at the telomere end, which might be an interesting avenue to follow in future work. For example, two distinct functions of the protein have been identified, depending on the position of POT1 at the 3'-overhang: if POT1 is bound at the very end of the overhang (leaving less than 8 nt free at the 3' end), telomerase cannot extend the telomere (Lei et al. 2005). On the other hand, the formation of G-quadruplex structures require that POT1 is not bound to the terminal four telomeric sequences involved in G-quadruplex formation. Since G-quadruplexes form spontaneously at the end of telomeres and are in a dynamic equilibrium with unfolded or partially unfolded forms, POT1 binding of unfolded structures may trap telomeres in the open form . It may be interesting in future studies to analyse the effects of different levels of POT1 binding, in particular POT1 being overexpressed or suppressed in telomerase-positive cells.