1 Introduction

How can we compute the quantiles of a function evaluated on the agent distribution when we can only fit part of the distribution in memory at one time? We introduce the use of t-Digests for this purpose. t-Digests have two main properties that make them suited to this application: (i) they are fast to compute and use little memory while remaining accurate for quantiles, and (ii) they can be merged.Footnote 1 A t-Digest summarizes a one-dimensional distribution as a sequence of 'centroid means' and corresponding 'centroid weights'. A basic version of the concept would be to replace the distribution with its percentiles: these percentiles would be a sequence of values (the centroid means) and each percentile would have a weight of 1/100. If we know the percentiles of two sub-distributions we can easily imagine merging these to approximately calculate the percentiles of the full distribution. t-Digests improve on this idea by using a scaling function that imposes smaller weights near the tails of the distribution (with larger weights near the middle/median). Using a scaling function allows accurate calculation of statistics like the 99th or 99.9th percentile while using few centroids for the t-Digest, at very little cost to the accuracy of the median. Merging t-Digests is computationally easy and fast. The literature on computation and big data has developed t-Digests as an efficient implementation of this concept.

A full introduction and explanation of t-Digests and their properties is provided by Dunning and Ertl (2019), and a brief description of their applications in computing appears in Dunning (2021). Typical uses for t-Digests in computing include calculating the quantiles of a dataset that is distributed across multiple servers (compute a t-Digest for each server and then merge them) and detecting outliers in a stream of incoming data (compute a t-Digest, update it periodically with the incoming data, and flag an outlier as a shift in, e.g., the 99th percentile). To our knowledge we provide the first use of t-Digests in Economics.

In our practical example we consider a life-cycle model with N permanent types of agent, indexed by \(i=1,2,\ldots ,N\). A permanent type might be as simple as a different value of a given parameter, or more generally agents of different types might, e.g., have different utility functions or face different processes for the exogenous shocks.

When solving an overlapping-generations (OLG) heterogeneous agent model we will get an agent distribution \(\mu (a,z,j,i)\), where a is the endogenous state (vector), z is the exogenous state (vector), j is the period/age, and i is the permanent type. Our interest is in computing distributional statistics, such as quantiles, of some function of the agent distribution; the agent distribution is multi-dimensional but the function itself is scalar-valued, and so once we evaluate the function on the agent distribution we have a one-dimensional distribution. For example, say a is asset holdings and we are interested in computing the quartiles of the asset distribution, or we are interested in the deciles of the savings rate, \((a'-a)/income\), where \(a'\) is the policy for next period asset holdings. If the agent distribution \(\mu (a,z,j,i)\) fits in memory we can simply load the whole distribution, evaluate the function, and then calculate the quantiles of this distribution. Our interest is in how to proceed when \(\mu (a,z,j,i)\) cannot fit in memory at once.

If we were interested in calculating the mean, we could simply loop over \(i=1,\ldots ,N\); for each i we load \(\mu _i(a,z,j)\), the agent distribution for a specific permanent type i, into memory and calculate the (conditional) mean for this agent type. Having done this for each agent type we can then take a weighted sum of these to get the mean of the whole agent distribution. t-Digests implement this intuition of doing a calculation on each agent type and then merging the results, but in a way that delivers the quantiles: we loop over \(i=1,\ldots ,N\), and for each i we load \(\mu _i(a,z,j)\) into memory and calculate a t-Digest for it, denoted \(tD_i\). Once the loop is completed we have the set of t-Digests, \(\{tD_1,\ldots ,tD_N\}\). We then merge these t-Digests to get a new t-Digest tD, and we are able to calculate the distributional statistics of interest from tD.Footnote 2
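As a concrete illustration of this loop-and-merge workflow, the following Matlab sketch builds one t-Digest per permanent type and then merges them. The argument lists for createDigest() and mergeDigests(), and the helper loadTypeFromDisk(), are illustrative assumptions for this sketch rather than the exact interfaces of the toolkit functions discussed below.

```matlab
N = 5;                        % number of permanent types
ptypeweights = ones(N,1)/N;   % mass of each permanent type (must sum to one)
digests = cell(N,1);
for ii = 1:N
    % load (or compute) the distribution and function values for type ii only
    [mu_i, values_i] = loadTypeFromDisk(ii);   % hypothetical helper
    digests{ii} = createDigest(values_i(:), mu_i(:), 10000);
    clear mu_i values_i                        % free memory before the next type
end
tD = mergeDigests(digests, ptypeweights);      % merged t-Digest across all types
```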

t-Digests are likely to be most useful when using GPUs to solve models with large state-spaces for the agent distribution. For an overview of the heterogeneous agent incomplete markets models where these techniques might be useful, see Heathcote et al. (2009). For an introduction to the use of GPUs in economics see Aldrich et al. (2011). Quantiles for all of these models could be computed without t-Digests by using standard CPU memory. However, the evaluation of the function on the agent distribution can be made an order of magnitude faster by using the GPU, and this can introduce a memory bottleneck as GPU memory is often an order of magnitude smaller than CPU memory.Footnote 3 t-Digests enable us to take advantage of the faster runtimes of the GPU without running into the memory bottleneck imposed by the smaller size of GPU memory relative to CPU memory. While the use of t-Digests themselves imposes an additional computational cost, this cost is negligible compared to the speed gains of using the GPU to evaluate the function on the agent distribution: in our example in Sect. 5, using the CPU to compute the quantiles takes 526 s, compared to 3.21 s using the GPU even though the latter includes the additional steps involved in computing the t-Digests; using the GPU without t-Digests and just directly computing quantiles takes 3.18 s, but would run into GPU memory bottlenecks in larger models.

An alternative to using t-Digests to deal with the memory bottleneck imposed by GPUs when dealing with discretized agent distributions would be to use more parametric approximations of the agent distribution. For example, Algan et al. (2008) use polynomials to approximate the agent distribution.Footnote 4 Another example is Gouin-Bonenfant and Toda (2023), who combine a finite grid with a Pareto tail to approximate the agent distribution in infinite-horizon models where the agent distribution is known to have a Pareto tail, a class of models of interest in studying top-wealth inequality. These more parametric approaches to the agent distribution directly reduce the amount of memory required to store the agent distribution.Footnote 5

In our implementation we compute \(\mu (a,z,j,i)\) as grid points and their corresponding weights, but there is nothing in our use of t-Digests that requires this; t-Digests could be used with parameterized agent distributions. Nor is our division of the whole agent space into subspaces based on i essential. We could divide based on j or any other dimension (or combination of dimensions). Obviously t-Digests can also be applied to infinite horizon models, and models with aggregate shocks. In our implemented example the different agent permanent types will have the same state-space, but this is not necessary for the application of t-Digests.

While we work with the (discretized) agent distribution directly, an alternative solution technique for heterogeneous agent models is to simulate the agents, which generates a sample of data points, and t-Digests could be applied directly to this sample. For example, we might create S simulations in parallel over C CPU cores: each core simulates S/C agents and calculates the t-Digest for those S/C simulations; we then merge the C t-Digests to get the t-Digest for the whole agent distribution, and calculate statistics of interest from it.
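A minimal sketch of this simulation-based variant, again using the same illustrative createDigest()/mergeDigests() signatures as above and a hypothetical simulateAgents() helper (parfor requires the Parallel Computing Toolbox):

```matlab
S = 1e6;  C = 4;                    % total simulations, number of CPU cores
digests = cell(C,1);
parfor cc = 1:C
    sims = simulateAgents(S/C);     % hypothetical: returns S/C simulated draws
    wsim = ones(S/C,1)/(S/C);       % equal weights within this core's sample
    digests{cc} = createDigest(sims(:), wsim, 10000);
end
tD = mergeDigests(digests, ones(C,1)/C);   % each core covers 1/C of the agents
```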

Those using Matlab can directly use our functions, createDigest() and mergeDigests(). These functions are provided as part of VFI Toolkit (Kirkby (2022); vfitoolkit.com), but can be used as standalone functions. Users of VFI Toolkit with models containing permanent types of agents will enjoy the advantages of t-Digests without ever having to touch them directly. Codes implementing the two examples in this paper (Sects. 4 and 5) are available at: github.com/robertdkirkby/tDigestForEcon, which also provides copies of createDigest() and mergeDigests() (duplicates of those in VFI Toolkit).

2 Quantiles of a Function Evaluated on the Agent Distribution

We will use t-Digests to calculate the quantiles (or functions thereof) of the agent distribution in heterogeneous agent incomplete market models. This might be something like the distribution of assets, or earnings, or labor supply.Footnote 6 We now provide a brief technical definition of this, but many readers may simply skip to the next section.

Let X be the state-space for the agent distribution; so for our life-cycle model \(X=A \times {\mathbb {Z}} \times {\mathbb {J}} \times {\mathbb {I}}\). We want to calculate the quantiles Q(p), where \(p \in (0,1)\),Footnote 7 of the value of a function g(x), \(g: X \rightarrow {\mathbb {R}}\), on the agent distribution \(\mu (x)\), \(\mu : X \rightarrow [0,1]\).

The quantile is defined as,

$$\begin{aligned} Q(p)=\min _{{\bar{x}}} g({\bar{x}}) \text { s.t. } p \le \int _{\{x \in X: g(x) \le g({\bar{x}}) \}} \mu (x) \end{aligned}$$
(1)

Note that in our computational approximations to the model there is no issue around using the minimum and maximum rather than infimum and supremum.

Although the agent distribution itself, \(\mu (x)\), is multi-dimensional over (azji) once we evaluate the function g(x) on the agent distribution we can collapse to a single-dimensional version of \(\mu (x)\) by ordering x based on g(x). In codes this is a simple sort operation.Footnote 8 We can then use t-Digests, which only work with a single-dimensional distribution, on this sorted distribution.
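As an illustration, the following Matlab fragment computes Q(p) from Eq. (1) directly on the discretized distribution. Here gvals and mu are assumed to be same-sized arrays holding g evaluated on the grid and the corresponding masses of \(\mu \).

```matlab
[gsorted, order] = sort(gvals(:));   % collapse to one dimension, sorted by g
wsorted = mu(order);                 % reorder the distribution weights to match
cumw = cumsum(wsorted);
p = 0.99;                            % e.g. the 99th percentile
Qp = gsorted(find(cumw >= p, 1, 'first'));
```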

3 t-Digest

A t-Digest can be understood as a data structure created by clustering real-valued samples. Unlike standard clustering methods, the size of each cluster is here limited by a scaling function. By setting the scaling function to create more clusters in the tails of the distribution, t-Digests can achieve high accuracy for top percentiles, like the 99.9th percentile, without having to use many points. This comes at the cost of a minor loss in accuracy in the middle of the distribution for statistics like the median. Each cluster, or bin, is summarized by a centroid value and a weight. This is in contrast to, say, bins defined by minimum and maximum bounds, and it substantially simplifies merging because the bins are not required to be non-overlapping.

To simplify we explain t-Digests based on weighted grids where the observations are in increasing order, and where the sum of the weights is restricted to one.Footnote 9,Footnote 10

Due to the scaling function, only a few samples will be used to construct the bins corresponding to the extreme quantiles, and this is why the estimates of these quantiles remain accurate. t-Digests have an error when estimating the q quantile that is nearly constant relative to \(q(1-q)\); thus this error is small for extreme values of q near zero or one, which is an important distinction between t-Digests and existing alternative data structures for estimating quantiles (Dunning & Ertl, 2019). The scaling function depends on a scaling parameter, \(\delta \), with higher values of \(\delta \) putting more bins near the tails.Footnote 11

While most applications of t-Digests are based on a sample of data points, in our heterogeneous agent models we are working directly with the discretized agent distribution which is a series of grid points and associated weights. We will therefore describe t-Digests based on this approach. An alternative solution technique for heterogeneous agent models is to simulate data from the model, which would generate a sample of data points to which t-Digests could be applied; we refer the reader to Dunning and Ertl (2019) for explanation of the implementation of t-Digests for a sample of data points.

We begin with an ordered grid of b points, \([g_1,\ldots ,g_b]=G \subset {\mathbb {R}}\), together with their associated weights \([w_1,\ldots ,w_b] \subset [0,1]\), \(\sum _{i=1}^{b} w_i =1\). We want to create a t-Digest from this. Consider a partition of this distribution into clusters C, with each cluster summarized by two pieces of information: \({\bar{C}}\), the mean of the cluster, and |C|, the weight of the cluster. The mean \({\bar{C}}\) is defined as the weighted mean of the grid points in C, and the weight of the cluster is the sum of the corresponding weights. As a trivial example we might divide into two clusters \(C_1=[g_1,\ldots ,g_7]\) and \(C_2=[g_8,\ldots ,g_b]\). These would have means of \({\bar{C}}_1=\sum _{i=1}^7 w_i g_i/|C_1|\) and \({\bar{C}}_2=\sum _{i=8}^b w_i g_i/|C_2|\), and weights of \(|C_1|=\sum _{i=1}^7 w_i\) and \(|C_2|=\sum _{i=8}^b w_i\). What is important for t-Digests is that we only need to keep a few numbers, the cluster means and cluster weights, and can drop all information about which points were used to construct them; here we turned 2b pieces of information (b pieces for each of g and w) into just 4 pieces of information (two means and two weights).

Notice that this formulation can easily be used to progressively create any set of quantiles. Let's use the concrete example of calculating four clusters each of size 0.25. Start by building the cluster that will correspond to the first quartile: begin with an empty cluster and then one by one add elements of the grid until the sum of the corresponding weights exceeds 0.25, at which point we stop. Next, for the second quartile we again begin with an empty cluster, and starting from the point at which we stopped previously we keep adding points until the sum of the corresponding weights exceeds 0.25, at which point we stop. The process can clearly be repeated two more times to create the clusters for the remaining two quartiles. We obviously have the weight of each cluster, and as we go we simply keep a running track of the cluster mean, updated each time we add a point to the cluster, while discarding all information about the points themselves (both their grid value and their corresponding weight).
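A minimal Matlab sketch of this quartile construction, using the ordered grid g and weights w introduced above (each cluster keeps only a running weighted mean and a weight):

```matlab
m = 4;                                   % number of clusters (quartiles)
centroidmean = zeros(m,1); centroidweight = zeros(m,1);
cc = 1;                                  % index of the cluster currently being built
for ii = 1:length(g)
    % add point ii to the current cluster, updating its running weighted mean
    neww = centroidweight(cc) + w(ii);
    centroidmean(cc) = (centroidweight(cc)*centroidmean(cc) + w(ii)*g(ii))/neww;
    centroidweight(cc) = neww;
    if centroidweight(cc) >= 1/m && cc < m
        cc = cc + 1;                     % this cluster is full, start the next one
    end
end
```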

What we have so far, if we use m quantiles instead of the four quartiles in our trivial example, is a digest. We implicitly imposed an unnecessary restriction on the clusters, namely in our example that they were all evenly weighted (specifically, as there were four, that they were each of weight 0.25). When building a digest, we do not need to impose even weights, but rather we limit the weight of each cluster consecutively. The remaining piece for a t-Digest is a scale function, so that instead of evenly spaced quantiles we put more clusters near the extremes of the distribution (or equivalently, that the clusters near the extremes have smaller size/weight).

The scale function should be chosen to provide the appropriate trade-off between accurate estimation of the tails of the distribution without weakening accuracy near the median. The scale function will also determine (jointly with the distribution for which we are creating the digest) the number of clusters used. In most applications it is important to keep the number of clusters fairly small, but in our own application to agent distributions it seems appropriate to just use large numbers of clusters, say a few thousand, as the computational costs of the t-Digests are dwarfed by the rest of the heterogeneous agent model.

Fig. 1 Scaling function \(k(q;\delta )\) puts more points near \(q=0\) and \(q=1\)

To limit cluster size, we define the scale function as a non-decreasing function from quantile q to a notional index k with scaling parameter \(\delta \). The scaling function is given by,

$$\begin{aligned} k(q;\delta )=\frac{\delta }{2\pi } \sin ^{-1}(2q-1) \end{aligned}$$
(2)

It is possible to use an alternative scaling function but we have found this one to perform best; see Dunning and Ertl (2019) for alternatives (we are using their \(k_1\) out of four alternatives). As the scaling function k is non-decreasing, the maximum accuracy near the tails of the distribution is influenced by the end-most clusters, as determined by the minimum value \(k(0;\delta )=-\delta /4\) and maximum value \(k(1;\delta )=\delta /4\). The greater the value of \(\delta \), the more clusters will be generated near the tails of the distribution, and more generally the more clusters will be generated in total. We tried \(\delta =10\), 100, 1000, 10000 and 100000; we settled on \(\delta =10000\) as the default in our codes. \(\delta =10000\) led to up to roughly 5000 clusters in our t-Digests created from the agent distribution.Footnote 12 Sect. 4.1 shows how the accuracy of our results in Sect. 4 varies with different values for \(\delta \). Our choice of \(\delta \) is very conservative, in the sense of large numbers of clusters and high accuracy compared to most applications; this comes at a computational expense, but only a very small one, and it seemed appropriate as the run time for creating and merging t-Digests was tiny compared to the other aspects of heterogeneous agent models. Figure 1 plots the scaling function \(k(q;\delta )\) for \(\delta =10000\). As can be seen, it is much steeper near \(q=0\) and \(q=1\). As a result we obtain more clusters (with smaller weights) near the extremes.
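To make the role of \(\delta \) concrete, the following Matlab fragment evaluates the scale function of Eq. (2) and the cluster-size limit it implies: a cluster beginning at cumulative quantile q is allowed to absorb mass only while the notional index k increases by at most one.

```matlab
delta = 10000;
k    = @(q) (delta/(2*pi))*asin(2*q-1);    % scale function, Eq. (2)
kinv = @(y) (sin(2*pi*y/delta)+1)/2;       % inverse of the scale function
fprintf('k(0) = %g, k(1) = %g\n', k(0), k(1));              % -delta/4 and delta/4
% maximum weight of a cluster starting at the median vs near the top tail:
fprintf('limit at q=0.5:   %g\n', kinv(k(0.5)+1)   - 0.5);   % larger limit
fprintf('limit at q=0.999: %g\n', kinv(k(0.999)+1) - 0.999); % much smaller limit
```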

Using the scaling function to control the cluster size, our earlier idea of how to create clusters is implemented as Algorithm 1. It takes an agent distribution (grid points and associated weights) as an input, and creates a t-Digest.
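The following Matlab function is a minimal sketch of this construction, in the spirit of Algorithm 1 rather than a copy of the toolkit's createDigest(); it assumes g is ordered and the weights w sum to one.

```matlab
function [cmeans, cweights] = createDigestSketch(g, w, delta)
% Create a t-Digest (cluster means and weights) from an ordered grid g with
% weights w summing to one; each cluster is limited to spanning at most one
% unit of the notional index k of Eq. (2).
k = @(q) (delta/(2*pi))*asin(2*q-1);     % scale function, Eq. (2)
cmeans = []; cweights = [];
qL = 0;                                  % cumulative weight of closed clusters
cmean = g(1); cweight = w(1);            % open a cluster with the first point
for ii = 2:length(g)
    qR = min(qL + cweight + w(ii), 1);   % guard against rounding above 1
    if k(qR) - k(qL) <= 1
        % point fits: update the running weighted mean of the open cluster
        cmean = (cweight*cmean + w(ii)*g(ii))/(cweight + w(ii));
        cweight = cweight + w(ii);
    else
        % close the current cluster and open a new one with point ii
        cmeans(end+1,1) = cmean; cweights(end+1,1) = cweight; %#ok<AGROW>
        qL = qL + cweight;
        cmean = g(ii); cweight = w(ii);
    end
end
cmeans(end+1,1) = cmean; cweights(end+1,1) = cweight;   % close the last cluster
end
```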

That covers how to create a t-Digest for a specific agent permanent type (or just any subspace of the agent distribution). How do we now merge these t-Digests together? Merging t-Digests can be done in batches. We here just explain how to merge two t-Digests as the generalization to any finite number is trivial.Footnote 13

Say we have two t-Digests from independent samples/subspaces. Each t-Digest is a set of cluster means and corresponding cluster weights. We also need to know the size of each t-Digest, or more accurately of the samples/subspaces they represent. Suppose we have a model with two permanent types of agent, where 0.7 of agents are type 1 and 0.3 of agents are type 2. Our first step is simply to multiply all the cluster weights of the t-Digest for agent type 1 by the weight of agent type 1, namely 0.7. We then multiply all the cluster weights of the t-Digest for agent type 2 by the weight of agent type 2, namely 0.3. After reweighting we can join the two t-Digests together, giving us a set of cluster weights and cluster means. We sort this set by the cluster means, and the resulting ordered set of cluster means and cluster weights is just an ordered set of points and associated weights. We then create a t-Digest from this exactly as if it were any other ordered set of points and associated weights. So the task of merging two t-Digests simply involves taking their weighted union, sorting, and then creating a t-Digest from the result. In the pseudocode we use 'relative weight' to refer to the relative size of the sample/subspace sketched by each t-Digest (we assume that the relative weights sum to one, but t-Digests can be generalized to not require this).
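A minimal sketch of this merge, assuming the two t-Digests are given as cluster means m1, m2 and cluster weights w1, w2 (each set of weights summing to one), and reusing the illustrative createDigestSketch() from above:

```matlab
allmeans   = [m1(:); m2(:)];
allweights = [0.7*w1(:); 0.3*w2(:)];       % reweight by the relative type masses
[allmeans, order] = sort(allmeans);        % sort by cluster mean
allweights = allweights(order);
[mergedmeans, mergedweights] = createDigestSketch(allmeans, allweights, 10000);
```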

We now have the merged t-Digest. Any quantiles of interest can be calculated directly from it, as can statistics like the Lorenz curve and Gini coefficient.
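For illustration, quantiles (and the Lorenz curve and Gini coefficient) can be read off the merged centroids along the following lines; more refined implementations interpolate between neighbouring centroids rather than taking the first centroid whose cumulative weight reaches p.

```matlab
p = 0.99;                                          % e.g. the 99th percentile
cumw = cumsum(mergedweights);
Qp = mergedmeans(find(cumw >= p, 1, 'first'));
% Lorenz curve and Gini coefficient from the same centroids (this assumes the
% underlying values are non-negative, e.g. assets or earnings):
lorenz = cumsum(mergedweights.*mergedmeans)/sum(mergedweights.*mergedmeans);
gini   = 1 - sum(mergedweights.*([0; lorenz(1:end-1)] + lorenz));
```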

There are a number of important concepts for understanding why t-Digests are computationally advantageous. The first few are obvious: we are using just a few points (the cluster means and cluster weights), the only calculations involve keeping track of a mean and a weight as we loop over points, and thanks to the scaling function we have more clusters where accuracy is important to us. More subtle are that t-Digests are weakly ordered (as opposed to strongly ordered) and that they are fully merged. Let's start with what strongly ordered means: a digest is strongly ordered if \(g_i<g_j\) for all \(g_i\in C_i\) and \(g_j\in C_j\), for any \(i<j\) (points in the 'lower' cluster are always less than points in the 'higher' cluster). Algorithms involving bins based on upper and lower limits typically impose strong ordering. Weakly ordered relaxes this: the requirement \(g_i<g_j\) for all \(g_i\in C_i\) and \(g_j\in C_j\) is only imposed on clusters whose indices are at least some offset \(\Delta \ge 1\) apart (that is, for \(i+\Delta <j\)); the individual elements that make up neighbouring clusters need not be strictly ordered across those clusters, although the cluster means will still be ordered. When we create a t-Digest from the original distribution it will be strongly ordered, but once we merge t-Digests they need only be weakly ordered, and this substantially reduces the informational, and hence computational, requirements. This point is worth repeating: using t-Digests we can accurately estimate quantiles without needing to keep the underlying data strictly ordered! Fully merged refers to the fact that, due to the way we construct the t-Digests, it is not possible to combine any two of the clusters in our t-Digest into one and still satisfy the restrictions that we imposed on the weights of the clusters. We might think of fully merged as ensuring we are not 'wasting' any clusters.

[Algorithm 1: creating a t-Digest from grid points and associated weights]
[Algorithm 2: merging t-Digests]

Note that t-Digests do not retain information on the maximum and minimum values, but this could trivially be done alongside the implementation of t-Digests.

4 Trivial Example

We start with a very simple example (the code is provided as tDigest.m). We generate a matrix of uniform [0, 10] random variables containing \(10^6\) observations (so the weight of each observation is \(1/10^6\)). Exact calculation of the median gives 5.0017, while the t-Digest gives 5.0001. Exact calculation of the 99th percentile gives 9.9001, while the t-Digest gives 9.8998.Footnote 14 The t-Digest provides accurate estimates for both the median and the 99th percentile.
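A sketch of this first exercise, using the illustrative createDigestSketch() from Sect. 3 rather than the toolkit's createDigest(); the exact numbers will differ slightly across random draws.

```matlab
x = sort(10*rand(1e6,1));                 % 10^6 draws from uniform [0,10], ordered
w = ones(1e6,1)/1e6;                      % equal weights
[cm, cw] = createDigestSketch(x, w, 10000);
cumw = cumsum(cw);
median_tD = cm(find(cumw >= 0.5,  1, 'first'));   % compare to median(x)
p99_tD    = cm(find(cumw >= 0.99, 1, 'first'));   % compare to x(ceil(0.99*1e6))
```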

We repeat this with a second sample of \(10^7\) observations from a uniform [0, 5] distribution, but rather than giving each of these observations equal weight we instead draw the weights for each point from a uniform [0, 1] distribution. Exact calculation of the median gives 2.4990, and with a t-Digest gives 2.4982. Exact calculation of the 99th percentile gives 4.9498, and using t-Digest gives 4.9496.

We then join these two distributions using relative weights of 0.6 for the first distribution and 0.4 for the second. Exact calculation of the median from the joined sample gives 3.5721, while the merged t-Digest (obtained by merging the two independent t-Digests, not by creating a t-Digest from the joined sample) gives 3.5692. Exact calculation of the 99th percentile gives 9.8327, while the t-Digest gives 9.8318.

We create a third distribution of N(0, 9) normal random variables (with equal weights). Exact calculation of the median gives \(-\)0.0016, while the t-Digest gives \(-\)0.0029. Exact calculation of the 99th percentile gives 6.9853, while the t-Digest gives 6.9812. We join this together with the two previous distributions using relative weights of [0.4, 0.3, 0.3]. Exact calculation of the median from the joined distribution gives 2.5831, while the merged t-Digest gives 2.5798. Exact calculation of the 99th percentile gives 9.7545, while the t-Digest gives 9.7531.

We then try a very different distribution: \(10^6\) points with equal weights (of \(1/10^6\)). The first 0.3 fraction (so \(3\times 10^5\) points) take a value of 4, the next 0.2 take a value of 5, and the remaining 0.5 take a value of 7. Using t-Digests we get 4 for the 29.9th percentile, 5 for the 30.1st percentile, 5 for the 49.9th percentile, and 7 for the 50.1st percentile. We get the same when we do the exact calculation (and as the distribution is so simple, these values are trivially the theoretically correct results).

While these examples are so trivial that using t-Digests is superfluous, as we can just calculate quantiles directly from the distributions themselves, they show that t-Digests are accurate and provide a test of the createDigest() and mergeDigests() commands in Matlab that we provide as part of the contribution of this paper.Footnote 15 The example code tDigest.m also doubles as an interactive introduction to t-Digests, allowing users to play with different scaling functions and scaling parameter values, or to change the sample sizes or even the distributions. Both this example and the following example with the agent distribution use the settings of \(\delta =10,000\) with roughly 5,000 clusters as described in the previous section.

4.1 Hyperparameters of the t-Digests

We now repeat the exercises described in Sect. 4 to demonstrate how changing the hyperparameter \(\delta \) of the t-Digests affects accuracy. Using a larger \(\delta \) means more clusters and therefore more accuracy. The differences in run times associated with the five values of \(\delta \) are measured in hundredths of a second. The memory use in all cases is negligible, being at most a few megabytes.

The six exercises here correspond to those described in Sect. 4. We report the results in Table 1. The column 'precise' gives the results of directly calculating the statistics (largely the same statistics as those in Sect. 4; in most exercises these are the median and the 99th percentile). The column corresponding to \(\delta =10000\) is our default and relates to the results reported so far. As can be seen, a larger \(\delta \) corresponds to higher accuracy. We also include the number of clusters, which is controlled by \(\delta \).

Table 1 How the hyperparameter \(\delta \) influences accuracy

5 Example with Agent Distribution

We now solve a simple life-cycle model with five agent types, for which we can calculate the quantiles of the asset distribution both directly and using t-Digests. This example further demonstrates the accuracy of using t-Digests to calculate quantiles of functions evaluated on the agent distribution. We have made this example simple enough that we can calculate quantiles directly without the use of t-Digests. VFI Toolkit, which we use to create this example, can handle cases where this is not true, and it is for such cases that t-Digests are most useful; however, in those cases we would be unable to directly calculate the quantiles to compare accuracy, so we do not present one here.

We very briefly present the household problem here, with minimal explanation, as the model itself is of only indirect interest. A full description of the model appears in Appendix A. The household's problem is,

$$\begin{aligned} V_i(a,z,e,j) = \max _{c,a',h} \;&\frac{c^{1-\sigma }}{1-\sigma } - \psi \frac{h^{1+\eta }}{1+\eta } + (1-s_j)\beta {\mathbb {I}}_{(j\ge Jr+10)} warmglow(a') \\&+ s_j \beta E[V_i(a',z',e',j+1)|z] \end{aligned}$$
(3)
$$\begin{aligned} \text {if } j<Jr: \; c+a'=(1+r)a+wh \kappa _j \alpha _i z e \end{aligned}$$
(4)
$$\begin{aligned} \text {if } j \ge Jr: \; c+a'=(1+r)a+pension \end{aligned}$$
(5)
$$\begin{aligned} 0\le h \le 1, \; a'\ge 0 \end{aligned}$$
(6)
$$\begin{aligned} \log (z')=\rho _{z} \log (z) + \epsilon , \; \epsilon \sim N(0,\sigma _{\epsilon ,z}^2) \end{aligned}$$
(7)
$$\begin{aligned} e \sim N(1,\sigma _{e}^2) \end{aligned}$$
(8)

where a is assets, c is consumption, and h is labor supply. There are two exogenous shocks: z is AR(1) and e is i.i.d. normal. The effective labor units depend on z and e, as well as a deterministic age-dependent component \(\kappa _j\) and a fixed effect \(\alpha _i\). All additional details of the model appear in Appendix A.

The inclusion of the fixed effect \(\alpha _i\), which takes five possible values, is the dimension over which we will use t-Digests. Value function iteration can be done separately for each value of the fixed effect, and the agent distribution can be calculated separately for each value of the fixed effect. We then evaluate four functions, separately for each value of the fixed effect: (i) the fraction of time worked (h), (ii) earnings (\(w h \kappa _j \alpha _i z e\)), (iii) assets (a), and (iv) the savings rate (\((a'-a)/(r a + wh\kappa _j \alpha _i z e)\)). We then calculate a t-Digest for each of these, merge the five t-Digests, and calculate quantiles of the merged t-Digest; we report the median and the 99th percentile. For the purposes of comparison we alternatively join the five agent distributions and function evaluations together and directly calculate the median and 99th percentile.Footnote 16 The results are shown in Table 2.

Table 2 Accuracy of t-Digests for life-cycle model

We view the t-Digests as accurate, although the reader can of course draw their own conclusion. Our main purpose in introducing t-Digests is to allow us to find the quantiles of larger models for which the exact calculation is not possible.Footnote 17 This example, for which we can perform the exact calculation, reassures us that using t-Digests does provide accurate results. Would the t-Digests remain accurate for larger models? If we consider earnings in the model used here, they take a potential 722,925 different values, and each has an associated weight.Footnote 18 In a larger model we might expect \(10^6\) or \(10^7\) values and associated weights. Note that a larger model with \(10^7\) values and weights then looks a lot like our trivial exercise in Sect. 4, and so the t-Digests would be expected to perform with similar accuracy to that documented there.

The point of t-Digests is to enable us to use the GPU instead of the CPU. How does this impact runtimes? Using the CPU to compute the quantiles takes 526 s, compared to 3.21 s on the GPU, even though the latter includes the additional steps involved in computing the t-Digests; using the GPU without t-Digests and just directly computing quantiles takes 3.18 s, but would run into GPU memory bottlenecks in larger models.Footnote 19

6 Conclusion

t-Digests provide a powerful technique for summarizing distributional information. The loss in accuracy is negligible for most purposes, and the ability to parallelize/subdivide makes t-Digests easy to use and powerful. t-Digests have been implemented in VFI Toolkit as the default method for calculating distributional statistics when using 'permanent types' of agents. This enables handling substantially larger agent distributions. VFI Toolkit thus takes advantage of both the much larger memory of the CPU (to store the full agent distribution) and the faster speed of function evaluation on a grid using the GPU, as the latter can then simply be summarized as a t-Digest. VFI Toolkit uses t-Digests when the full agent distribution will fit in CPU memory but not in GPU memory. t-Digests might also be useful in distributed computation of heterogeneous agent models, e.g. solving each permanent type on a separate node, as they reduce the overhead of what needs to be returned (i.e., just the t-Digests).

We hope that others may find t-Digests useful for heterogeneous agent models, especially for studying the distributional properties of functions evaluated on the agent distribution. The methods are well developed and understood in the computational literature, and their use in Economics is simple to implement.