Abstract
When the cells of a multinomial distribution are ordinal, i.e., have a natural ordering, borrowing information among neighboring cells makes sense conceptually. In this paper, we introduce a novel probability distribution that borrows information among neighboring cells in order to provide reliable estimates of cell probabilities. The proposed smoothed Dirichlet distribution forces the probabilities of neighboring cells to be closer to each other than under the standard Dirichlet distribution. Basic properties of the proposed distribution, including the normalizing constant, moments, and marginal distributions, are developed. Sample generation from the smoothed Dirichlet distribution is discussed using the acceptance-rejection algorithm. We demonstrate the performance of the proposed smoothed Dirichlet distribution using 2018 Major League Baseball (MLB) batting data.
1 Introduction
The smoothed Dirichlet distribution has the same parametric form as the Dirichlet distribution but forces the neighboring cells to be closer to each other by adding a penalty function. Currently, researchers force neighboring cells to be closer to each other by choosing particular parameter values (\(\varvec{\alpha }\)) for the Dirichlet distribution. There is a clear need for a distribution suited to this task, and our proposed distribution fills that gap, allowing for efficient modeling of such data.
Several new parametric families of distributions extending the Dirichlet distribution have been introduced and developed to suit different purposes. [11] introduced the generalized Dirichlet distribution, which has a more general covariance structure than the Dirichlet distribution. The grouped Dirichlet distribution (GDD), a multivariate generalization of the Dirichlet distribution used to model incomplete categorical data, was first described by [5]. [6] introduced the nested Dirichlet distribution, also for modeling incomplete categorical data. The spherical-Dirichlet distribution, introduced by [1], is obtained by transforming the Dirichlet distribution on the simplex to the corresponding space on the hypersphere.
[4] proposed a different version of the smoothed Dirichlet distribution in the context of smoothed language model representations of documents. That distribution is the same as the Dirichlet distribution except that its domain is restricted to include only smoothed language models. [3] used this smoothed Dirichlet distribution for image categorization on a smoothed simplex. That version relies on a smoothed domain and focuses mainly on multimedia data, whereas our proposed smoothed Dirichlet distribution can apply to data in any context. [7] proposed the modified Dirichlet distribution, which performs smoothing by allowing the parameters (\(\varvec{\alpha }\)) to be negative.
2 Basic Properties
In this section, we introduce the proposed smoothed Dirichlet distribution and its basic properties.
2.1 Probability Density Function
The smoothed Dirichlet distribution was suggested by [2] as a variation of the Dirichlet distribution that forces the successive cell categories to be closer to each other. Let \({\textbf{x}}=(x_{1}, x_{2}, \ldots , x_{K})^{t}\) be a vector with K components where \(x_{j} \ge 0\) for \(j=1, 2, \ldots , K\) and \(\sum _{j=1}^K x_{j} = 1\). Also, let \(\varvec{\alpha } = (\alpha _1, \alpha _2, \ldots , \alpha _K)^{t}\), where \(\alpha _j >0\) for each j and \(\delta >0\). The smoothed Dirichlet (SD) probability density function is
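written out (in a form consistent with the normalizing constant \(C(\varvec{\alpha }, \delta )\) defined below) as

\[ f({\textbf{x}} \mid \varvec{\alpha }, \delta ) = C(\varvec{\alpha }, \delta ) \prod _{j=1}^K x_{j}^{\alpha _{j}-1} \, \text {exp}(-\delta \Delta ({\textbf{x}})), \]

which reduces to the standard Dirichlet density when \(\delta = 0\).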
where \(\text {exp}(-\delta \Delta ({\textbf{x}}))\) is a penalty function that forces successive \(x_{j}\)’s to be close to each other with higher probability than under the standard Dirichlet distribution satisfying \(\sum _{j=1}^K x_{j} = 1\). The parameter \(\delta\) dictates how close the neighboring \(x_{j}\)’s are forced to be. For instance, a large value of \(\delta\) forces realizations of \({\textbf{x}}\) to have small values of \(\Delta ({\textbf{x}})\). The constant \(C(\varvec{\alpha }, \delta )\) can be written as
From now on, we write \({\textbf{X}} = (X_1, X_2, \ldots , X_K)^{t} \sim \text {SDD} (\varvec{\alpha }, \delta , \Delta )\) for the smoothed Dirichlet distribution (SDD). Here \(\Delta = \Delta ({\textbf{X}})\), and its importance is discussed later.
2.2 Moments
In this subsection, we compute the first, second, and \(n{\text {th}}\)-order moments, variance, and covariance of the smoothed Dirichlet distribution. First, we compute the mean of the smoothed Dirichlet distribution:
Let \(\gamma _{il} = 0, \forall l, l \ne i\), and \(\gamma _{ii} = 1\). Here \(\varvec{\gamma }_i = (\gamma _{i1}, \gamma _{i2}, \ldots , \gamma _{iK})^{t}\). Then
where \(C(\varvec{\alpha +\gamma _{i}}, \delta ) = \dfrac{1}{\int \ldots \int \prod _{j=1}^K x_{j}^{\alpha _{j} + \gamma _{ij} - 1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K}\).
Using (1),
Here
Substituting (2),
Similarly, we compute the expected value for \(X_{i}^n\) as
Then
where \(C(\varvec{\alpha }+n\varvec{\gamma }_{i}, \delta ) = \dfrac{1}{\int \ldots \int \prod _{j=1}^K x_{j}^{\alpha _{j} + n\gamma _{ij} -1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K}\).
Using (1),
Here
Substituting (4),
More generally, the product moments of smoothed Dirichlet distribution random variables can be expressed as
where \(\varvec{\alpha }+{\varvec{n}} = (\alpha _1+n_1, \alpha _2+n_2, \ldots , \alpha _K+n_K)^{t}\). We can easily compute the second moment by plugging \(n=2\) into (5),
Then the variance of the smoothed Dirichlet distribution is
Next we compute the covariance of \(X_i\) and \(X_l\) (\(\text {Cov} (X_i, X_l)\)), \(i, l =1, 2, \ldots , K \text { and } i \ne l\). Using (6),
Then
2.3 Sample Generation
We use the acceptance-rejection method to generate a random sample from the smoothed Dirichlet distribution with parameters \(\varvec{\alpha }, \delta\), and \(\Delta\). In acceptance-rejection sampling, we generate candidate values from a proposal distribution \({\varvec{Y}}\) with density g and accept a candidate \({\varvec{y}}\) with probability \(f({\varvec{y}}) / (C g({\varvec{y}}))\), where f is the density of the target distribution \({\varvec{X}}\) and C is a constant satisfying \(f({\varvec{y}}) \le C g({\varvec{y}})\) for all \({\varvec{y}}\). We take the Dirichlet distribution as the proposal distribution; this choice is convenient because the smoothed Dirichlet density is proportional to the Dirichlet density times \(\text {exp}(-\delta \Delta ({\textbf{x}}))\). Here are the steps of the acceptance-rejection algorithm.
1. Generate a set of random samples \({\varvec{y}}_1, {\varvec{y}}_2, \ldots , {\varvec{y}}_M\) from the Dirichlet distribution with the given \(\varvec{\alpha }\).

2. Compute \(E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{y}}))]\) using all the generated random samples.

3. Generate a random number u from Uniform(0, 1) and set \(i=1\).

4. If \(u < \dfrac{\text {exp}(-\delta \Delta ({\textbf{y}}_i)) }{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{y}}))]}\), accept \({\varvec{y}}_i\); otherwise reject \({\varvec{y}}_i\). Set \(i=i+1\), draw a new u, and repeat until \(i=M\).
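As a concrete illustration, the steps above can be sketched in Python (our own code, not from the paper; to keep the acceptance probability in [0, 1], this sketch normalizes by the maximum observed weight rather than the Monte Carlo mean used in step 4):

```python
import numpy as np

def delta(x):
    # penalty Delta(x): sum of squared successive differences
    return float(np.sum(np.diff(x) ** 2))

def sample_sdd(alpha, delta_param, M, seed=None):
    """Acceptance-rejection draws from the smoothed Dirichlet distribution.

    Dirichlet(alpha) is the proposal; it is a valid envelope because the
    smoothed Dirichlet density is proportional to the Dirichlet density
    times exp(-delta_param * Delta(x)) <= 1.  We normalize the acceptance
    ratio by the maximum observed weight so it always lies in [0, 1].
    """
    rng = np.random.default_rng(seed)
    y = rng.dirichlet(alpha, size=M)                       # proposal draws
    w = np.exp(-delta_param * np.array([delta(row) for row in y]))
    u = rng.uniform(size=M)
    return y[u < w / w.max()]                              # accepted draws
```

For example, `sample_sdd(np.array([5.0, 5.0, 5.0]), 50.0, 10000)` keeps only points whose successive components are relatively close, which is exactly the smoothing effect of the penalty.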
2.4 Role of the Penalty Term (\(\text {Exp}(-\delta \Delta ({\textbf{x}}))\))
As mentioned before, the penalty function forces successive \(x_{j}\)’s to be close to each other with higher probability than under the standard Dirichlet distribution satisfying \(\sum _{j=1}^K x_{j} = 1\). The roles of \(\delta\) and \(\Delta\) are very important when constructing the smoothed Dirichlet distribution. The penalty parameter \(\delta\) dictates the extent to which cell probabilities of neighboring categories have to be similar. Some examples of \(\Delta\) functions that we consider are
From now on, for our analysis, we use \(\Delta = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\). When \(x_1 = x_2 = \ldots = x_K = \frac{1}{K}\) then
This is the minimum value of \(\Delta = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\). Also, the maximum value of \(\Delta\) is attained when one of the \(x_j=1\), \(2 \le j \le K-1\). Then the maximum value of \(\Delta\) is
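The maximum value is 2, attained when all mass sits on a single interior cell. A quick numerical check of both extremes (a sketch using our own helper function `delta`):

```python
import numpy as np

def delta(x):
    # Delta(x) = sum of squared successive differences
    return float(np.sum(np.diff(x) ** 2))

K = 5
uniform = np.full(K, 1.0 / K)   # x_1 = ... = x_K = 1/K
spike = np.zeros(K)
spike[2] = 1.0                  # all mass on an interior cell

print(delta(uniform))  # 0.0 (minimum)
print(delta(spike))    # 2.0 (maximum)
```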
Fig. 1 shows the effect of different \(\delta\) and \(\varvec{\alpha }\) values for the same penalty function \(\Delta = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\) after simulating data from the smoothed Dirichlet distribution with \(K=3\). The plots in each row of Fig. 1 share the same \(\varvec{\alpha }\), and the plots in each column share the same \(\delta\) value. We consider \((1, 1, 1)^{t}\), \((5, 5, 5)^{t}\), \((10, 10, 10)^{t}\), \((10, 5, 1)^{t}\), \((1, 5, 10)^{t}\), and \((1, 10, 1)^{t}\) as the \(\varvec{\alpha }\) values for the successive rows of Fig. 1. The color scale runs from yellow (lowest value) to red (highest value).
As \(\alpha _j\) increases, the distribution becomes more tightly concentrated around the center of the simplex for a given \(\delta\) value. Also, for a given \(\varvec{\alpha }\), the distribution becomes more tightly concentrated around the center of the simplex as \(\delta\) increases. If \(\alpha _j\) values are different, then we will get an asymmetric (non-central) distribution with higher values for the highest \(\alpha _j\). Also, the highest penalty occurs at \((x_1 = 0, x_2 = 1, x_3 = 0)^{t}\) for this penalty function.
Figure 2 shows the effect of different \(\delta\) values and penalty functions for \(\varvec{\alpha } = (5, 5, 5)^{t}\) after simulating data from the smoothed Dirichlet distribution with \(K=3\). The first row of Fig. 2 is for the penalty function \(\Delta _1= \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\). The second and third rows of Fig. 2 are for the penalty functions \(\Delta _2 = \sum _{j=1}^{K-1} (\text {log } x_{j+1} - \text {log } x_j )^2\) and \(\Delta _3 = \sum _{j=2}^{K-1} (x_{j+1} - 2x_j + x_{j-1})^2\), respectively. It is clear that the distribution becomes more tightly concentrated around the center of the simplex for the penalty function \(\Delta _2 = \sum _{j=1}^{K-1} (\text {log } x_{j+1} - \text {log } x_j )^2\) than for the other penalty functions. Also, the highest penalty occurs at \((x_1 = 0, x_2 = 1, x_3 = 0)^{t}\) for all of these penalty functions.
2.5 The Upper and Lower Limits for \(E(X_i)\) and \(\text {Var} (X_i)\)
Next, we compute the upper and lower limits for \(E(X_i)\) and \(\text {Var} (X_i)\); these limits delimit the possible ranges of the two quantities. First, we compute the upper and lower limits for \(E(X_i)\). We know that \(\Delta ({\textbf{x}}) = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\) takes values in [0, 2]. Then
Now we can compute
Then we will get
Similarly,
Then
We know that \(E_{\varvec{\alpha }} (X_i) = \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\). Then,
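since \(\text {exp}(-2\delta ) \le \text {exp}(-\delta \Delta ({\textbf{x}})) \le 1\), applying these bounds to both the numerator and the denominator of \(E(X_i) = E_{\varvec{\alpha }}[X_i \, \text {exp}(-\delta \Delta ({\textbf{X}}))] \, / \, E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]\) gives (our condensed restatement of the steps above)

\[ \text {exp} (-2\delta ) \, \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \le E (X_i) \le \text {exp} (2\delta ) \, \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}. \]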
The lower limit of \(E (X_i)\) is \(\text {exp} (-2\delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right)\) and the upper limit of \(E (X_i)\) is \(\text {exp} (2\delta )\) \(\left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right)\). Note that when \(\delta \rightarrow 0\), \(E (X_i) \rightarrow \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\), which is the expected value under the Dirichlet distribution. Next, we approximate these limits for small \(\delta\). We know that
This is the Maclaurin series expansion of \(\text {exp}(-2\delta )\). Then,
Similarly, \(\text {exp}(2 \delta ) = 1 + 2 \delta + O(\delta ^2)\). Then,
Also,
Also,
We know that, \(\text {Var} (X_i) = E(X_{i}^2) - (E(X_i))^2\). Then we compute the lower limit of \(\text {Var} (X_i)\) \((\text {Var}_{LL} (X_i))\) and the upper limit of \(\text {Var} (X_i) (\text {Var}_{UL} (X_i))\) separately.
and
Note that when \(\delta \rightarrow 0\), both \(\text {Var}_{LL} (X_i)\) and \(\text {Var}_{UL} (X_i) \rightarrow \dfrac{\alpha _{i}\left( \sum _{j=1}^K \alpha _{j} - \alpha _{i}\right) }{(\sum _{j=1}^K \alpha _{j})^2 (\sum _{j=1}^K \alpha _{j} +1)}\) which is the variance of the Dirichlet distribution. Next, we discuss the marginal distribution of the smoothed Dirichlet distribution.
2.6 Marginal Distributions
We know that the marginal distributions of the standard Dirichlet distribution are beta distributions. The smoothed Dirichlet (SD) probability density function is
Let \(A=\sum _{j=1}^K \alpha _j\). Then
The marginal probability density function of \(X_1\) is the product of the probability density function of Beta\((\alpha _1, A - \alpha _1)\) and the ratio \(\dfrac{\text {exp}(-\delta \Delta ({\textbf{x}}))}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]}\). Now let us obtain the marginal distribution of \(X_2\). The joint distribution of \(X_1\) and \(X_2\) is
Then
Substitute \(x_1 = (1 - x_2)u\), then
3 Estimation of Parameters and Bayesian Inference
We now outline the estimation of the parameters of the smoothed Dirichlet distribution. We first derive estimators for \(\alpha _j, j=1, 2, \ldots , K\) and \(\delta\) using the method of moments (MOM).
3.1 Method of Moments (MOM)
Suppose we have a random sample with n random vectors \({\varvec{X}}_1, {\varvec{X}}_2, \ldots , {\varvec{X}}_n\) such that \({\varvec{X}}_{i} = (X_{i1}, X_{i2}, \ldots , X_{iK})^{t}, X_{ij} \ge 0\) and \(\sum _{j=1}^K X_{ij} = 1\). We know that the first and second population moments are
and
We define the first and second sample moments as
and
for \(j=1, 2, \ldots , K\). We have \(K-1\) first-order moment equations and \(K-1\) second-order moment equations to solve for the \(K+1\) unknowns \(\alpha _1, \ldots , \alpha _K\) and \(\delta\). Then
and
for \(j=1, 2, \ldots , K\). There is no closed-form solution when solving (19) and (20) simultaneously for \(\alpha _j\) and \(\delta\), so we must solve numerically to obtain the corresponding method of moments estimators for the parameters.
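One way such a numerical solution can be set up (our own illustration, not code from the paper) is to estimate the SD moments by importance-weighting Dirichlet draws with \(w = \text {exp}(-\delta \Delta ({\textbf{x}}))\) and minimize the squared discrepancy with the sample moments; the function names below are ours, and the fixed seed makes the objective deterministic so a derivative-free optimizer can be applied:

```python
import numpy as np

def penalty(Y):
    # Delta(x) = sum of squared successive differences, computed row-wise
    return np.sum(np.diff(Y, axis=1) ** 2, axis=1)

def sdd_moments(alpha, d, M=20000, seed=0):
    """Monte Carlo first and second moments of the smoothed Dirichlet,
    via self-normalized importance weighting of Dirichlet(alpha) draws:
    E_SD[g(X)] = E_alpha[g(X) w] / E_alpha[w], with w = exp(-d * Delta)."""
    Y = np.random.default_rng(seed).dirichlet(alpha, size=M)
    w = np.exp(-d * penalty(Y))
    w = w / w.sum()
    return w @ Y, w @ Y ** 2  # (first moments, second moments)

def mom_objective(theta, m1_hat, m2_hat):
    # theta holds log-parameters; exponentiate to enforce positivity
    alpha, d = np.exp(theta[:-1]), np.exp(theta[-1])
    m1, m2 = sdd_moments(alpha, d)
    return np.sum((m1 - m1_hat) ** 2 + (m2 - m2_hat) ** 2)
```

Because the seed is fixed, `mom_objective` is a deterministic function of `theta` and can be handed to a derivative-free optimizer such as Nelder-Mead.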
3.2 Bayesian Inference
In the Bayesian paradigm, if the posterior distribution is in the same probability distribution family as the prior distribution, then the prior is called a conjugate prior distribution. Like the Dirichlet distribution, the smoothed Dirichlet distribution is also a conjugate prior to the multinomial cell probabilities vector. The distribution of cell counts \(({\textbf{X}})\) is given by
where \({\textbf{p}} = (p_{1}, p_{2}, \ldots , p_{K})^{t}\) denotes the vector of cell probabilities for the multinomial population. We suggest using the smoothed Dirichlet distribution as the prior distribution for \({\textbf{p}}\),
Then, the posterior distribution of \({\textbf{p}}\), given the observed counts \({\textbf{x}}\), is
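the smoothed Dirichlet kernel implied by the conjugacy above, namely (our restatement)

\[ \pi ({\textbf{p}} \mid {\textbf{x}}) \propto \prod _{j=1}^K p_{j}^{x_{j} + \alpha _{j} - 1} \, \text {exp}(-\delta \Delta ({\textbf{p}})), \]

that is, \({\textbf{p}} \mid {\textbf{x}} \sim \text {SDD}(\varvec{\alpha } + {\textbf{x}}, \delta , \Delta )\).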
We introduce the smoothed Dirichlet-multinomial distribution, a compound of a multinomial distribution and a smoothed Dirichlet distribution. The marginal distribution of \({\varvec{x}}\) is given by
4 Data Analysis
4.1 A Simulation Study
An in-depth simulation study would be useful to demonstrate and compare how the proposed and existing estimators perform across different scenarios. Here, we performed a brief simulation study using 10,000 Monte Carlo replications with known true cell probabilities. We report the Mean Squared Error (MSE) and compare the estimators below. We varied the number of populations (\(N = 100\), 200, and 500) and the number of categories (\(K = 3\), 6, and 9) to explore the performance of the estimators. We generated data from multinomial distributions with sample sizes ranging from 15 to 75, and considered two scenarios for the true cell probabilities. We compare our proposed method with the Dirichlet process method of [10] and the weighted likelihood method of [8]. Note that for the empirical Bayes method, we considered the standard Dirichlet distribution. In the proposed method, we used the penalty function \(\Delta = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\) for the smoothed Dirichlet distribution. Also, \(\delta\) changes with the scenario, N, and K, ranging from 100 to 300.
4.1.1 Scenario 1
In this scenario, the true cell probabilities are strictly decreasing, with equal differences between successive probabilities. For example, when \(K=3\), the true cell probabilities are \({\textbf{p}} = \left( \dfrac{3}{6}, \dfrac{2}{6}, \dfrac{1}{6}\right) ^{t}\).
Table 1 provides the mean squared error values for each estimator based on Scenario 1. When N is fixed and K increases the MSE decreases. The MSE also decreases when K is fixed and N increases. The Bayesian shrinkage estimator based on a smoothed Dirichlet prior is the best estimator based on the MSE.
4.1.2 Scenario 2
In this scenario, the true cell probabilities alternate between increasing and decreasing (a zigzag pattern). For example, when \(K=3\), the true cell probabilities are \({\textbf{p}} = \left( \dfrac{1}{12}, \dfrac{10}{12}, \dfrac{1}{12}\right) ^{t}\).
Table 2 provides the mean squared error values for each estimator based on Scenario 2. As in Scenario 1, the MSE decreases when N is fixed and K increases, and also when K is fixed and N increases. The estimator based on the weighted likelihood approach is the best estimator based on the MSE for this scenario. The Bayes estimator based on the smoothed Dirichlet prior does not perform well in this case, which is not surprising given it is not designed for a case where successive cell probabilities differ sharply.
4.2 Real Data Analysis
Real-world situations often arise where outcomes in certain categories are not observed due to limited sample size and small, but non-zero, cell probabilities. In such cases, maximum likelihood estimation (MLE) may yield poor results by underestimating the actual cell probabilities. Our proposed approach is particularly valuable in such scenarios when dealing with ordinal categories that possess a natural ordering: it ensures that the borrowing of information among neighboring cell categories is conceptually meaningful, making it an effective methodology for analyzing data with unobserved outcomes in the presence of small probabilities.
Let’s now consider the estimation of \({\textbf{p}}\) using a smoothed Dirichlet prior. For the data application, we consider 2018 Major League Baseball (MLB) batting data from the Baseball-Reference website (www.baseball-reference.com). We consider data for all the regular season games taking place between March 29, 2018, and October 12, 2018. Our analysis includes \(m=556\) players with at least 25 plate appearances. [8] proposed an estimator based on the weighted likelihood approach to predict a good baseball batting metric for each batter, especially when a batter has a few plate appearances. This estimator borrows information from other similar batters to make inferences about a target. Our proposed Bayesian estimator using smoothed Dirichlet prior distribution borrows information across other batters but, more importantly, also across neighboring ordinal categories (batting outcomes) to improve the estimation of cell probabilities. [9] used this proposed Bayesian estimator using smoothed Dirichlet prior distribution to estimate the distribution of positive COVID-19 cases across age groups for Canadian health regions.
In our analysis, we consider \(K=11\) possible outcomes of batting: SO - strikeout, GO - ground out, AO - air out, SH - sacrifice hit, SF - sacrifice fly, HBP - hit by a pitch, BB - base on balls/walk, S - single, D - double, T - triple, and HR - home run. The outcome of batting in baseball can be divided into discrete categories; this is the basis for constructing metrics that evaluate the batting performance of players, and it is also the basis of our analysis. Let \(x_{ij}\) be the number of plate appearances in which the batting outcome j occurs for the \(i^{\text {th}}\) batter \((j=1, 2, \ldots , K)\) and denote the number of plate appearances for the \(i^{\text {th}}\) batter by \(n_i\). With \(K=11\) batting outcomes, the joint distribution of the counts for these 11 discrete categories for batter i is given by
where \(\mathbf {p_i} = (p_{i1}, p_{i2}, \ldots ,p_{i11})^{t}\) represents the vector of outcome specific probabilities satisfying \(\displaystyle \sum _{j} p_{ij}=1\). Taking a Bayesian approach, assume that
Then, the posterior distribution for the \(i{\text {th}}\) batter is given by
There are two main groups of outcomes: strikeout, ground out, and air out are outs/dismissals, while single, double, triple, and home run are hits. For our analysis, we use \(\Delta = \sum _{j=1}^{K-1} (p_{i(j+1)} - p_{ij})^2\). We modify this penalty function slightly so that the borrowing of information across cell categories is done within each group only. Assuming the batting outcomes are arranged in the given order (SO, GO, \(\ldots\), HR) as above, the modified penalty function is given by
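a sum of squared successive differences restricted to each group. As an illustration (our own code; the index grouping below is an assumption read off from the stated outcome order SO, GO, AO, SH, SF, HBP, BB, S, D, T, HR):

```python
import numpy as np

# Batting outcomes in the stated order (indices 0..10):
# SO GO AO SH SF HBP BB S D T HR
# Assumed grouping for illustration: outs = {SO, GO, AO}, hits = {S, D, T, HR}
GROUPS = [slice(0, 3), slice(7, 11)]

def grouped_delta(p):
    """Penalty summing squared successive differences within each group only,
    so no information is borrowed across group boundaries."""
    p = np.asarray(p, dtype=float)
    return float(sum(np.sum(np.diff(p[g]) ** 2) for g in GROUPS))
```

Changing a probability outside both groups (e.g., SF) leaves this penalty unchanged, which is exactly the intended restriction of the smoothing.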
Fig. 3 shows that the estimates of the cell probabilities for the top 10 batters, obtained using the smoothed Dirichlet prior with different values of \(\delta\), are very close to the MLEs. This behavior was to be expected, given that the top 10 batters have a large number of plate appearances: in their case, as \(\delta\) increases, we see only small fluctuations from the MLEs. On the other hand, Fig. 4 shows that the estimates of the cell probabilities for the last 10 batters are very close to the overall proportion \({\bar{p}}_j = \frac{\sum _{i=1}^{556} x_{ij}}{\sum _{i=1}^{556} \sum _{j=1}^{11} x_{ij}}\). For these last ten batters, who have only a small number of plate appearances, the estimates fluctuate substantially from the MLEs as \(\delta\) increases.
5 Conclusions
The proposed smoothed Dirichlet distribution constitutes a superior alternative for borrowing information among neighboring cells. The proposed smoothed Dirichlet distribution forces the probabilities of neighboring cells to be closer to each other than under the standard Dirichlet distribution. Inference results were in close agreement with the behavior we expected for the data application. This data application also shows that the smoothed Dirichlet distribution is flexible enough to accommodate different ways of borrowing information from the cells. Future research may be aimed at determining a suitable \(\delta\) for the smoothed Dirichlet distribution.
Data Availability
All data generated or analyzed during this study are included in this published article.
References
Guardiola, J.H.: The spherical-Dirichlet distribution. J Stat Distrib Appl (2020). https://doi.org/10.1186/s40488-020-00106-9
Hjort, N. L.: Bayesian Statistics 5, Chapter Bayesian approaches to non- and semi-parametric density estimation, pp. 223–254, (1996)
Najar, F., Bouguila, N.: Image categorization using agglomerative clustering based smoothed Dirichlet mixtures. In: Yen, Y. (ed.) Advances in visual computing, pp. 27–38. Springer International Publishing (2020)
Nallapati, R.: The Smoothed Dirichlet Distribution: Understanding Cross-Entropy Ranking in Information Retrieval. Ph. D. thesis (2006)
Ng, K.W., Tang, M., Tan, M., Tian, G.: Grouped Dirichlet distribution: a new tool for incomplete categorical data analysis. J Multivar Anal 99(3), 490–509 (2008)
Ng, K.W., Tang, M., Tan, M., Tian, G.: The nested Dirichlet distribution and incomplete categorical data analysis. Stat Sin 19(1), 251–271 (2009)
Tu, K.: Modified Dirichlet distribution: allowing negative parameters to induce stronger sparsity. In Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, Texas, USA (2016)
Wickramasinghe, L., Leblanc, A., Muthukumarana, S.: Model-based estimation of baseball batting metrics. J Appl Stat 48(10), 1–23 (2020)
Wickramasinghe, L., Leblanc, A., Muthukumarana, S.: Bayesian inference on sparse multinomial data using smoothed Dirichlet distribution with an application to COVID-19 data. Model Assisted Statistics and Applications 18(3) (2023)
Wickramasinghe, L., Leblanc, A., Muthukumarana, S.: Semi-parametric Bayesian estimation of sparse multinomial probabilities with an application to the modeling of bowling performance in T20I cricket. Ann Biostat Biometric Appl 5(1), 1–13 (2023)
Wong, T.T.: Generalized Dirichlet distribution in Bayesian analysis. Appl Math Comput 97(2), 165–181 (1998)
Acknowledgements
We would like to sincerely thank the reviewers for their valuable comments, which have greatly improved the quality of our work.
Funding
No funding was received for conducting this study.
Contributions
The analyses were performed by LW. All of the authors (LW, AL, and SM) were involved in the study concept and design and the interpretation of the data. All of the authors read and approved the final manuscript.
Ethics declarations
Conflict of Interest
The authors have no competing interests to declare that are relevant to the content of this article.
Ethical Approval and Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Appendix
Appendix
1.1 Exponential Family
A random variable \({\varvec{x}}\) follows an exponential family of distributions if its probability density function can be written in the following form:
where
- \(\varvec{\eta }\) is a vector of parameters,
- \(T({\varvec{x}})\) is the vector of sufficient statistics,
- \(A(\varvec{\eta })\) is the cumulant function.
The smoothed Dirichlet distribution also belongs to the exponential family. The probability density function can be written as
where
- \(\varvec{\eta } = (\varvec{\alpha }, \delta )\),
- \(T({\varvec{x}}) = (\text {log}({\varvec{x}}), \Delta ({\varvec{x}}))\),
- \(A(\varvec{\eta }) = -\text {log}\, C(\varvec{\alpha }, \delta )\).
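Explicitly, combining these components (our restatement of the density above), we have

\[ f({\varvec{x}} \mid \varvec{\eta }) = \text {exp}\left( \sum _{j=1}^K (\alpha _j - 1)\, \text {log}\, x_j - \delta \Delta ({\varvec{x}}) - A(\varvec{\eta }) \right), \]

with \(A(\varvec{\eta }) = -\text {log}\, C(\varvec{\alpha }, \delta )\) ensuring normalization.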
1.2 Derivations of the Marginal Distributions
The marginal distribution of \(X_1\) is
If we use \(\Delta ({\textbf{x}}) = \sum _{j=1}^{K-1} (x_{j+1} - x_j)^2\), then \(\Delta ({\textbf{x}}) = (1-2x_1)^2\). Also,
The marginal distribution of \(X_2\) is
Here \(\Delta (x_2, u) = \dfrac{(1-x_2)^2}{2} (1-2u)^2 + \dfrac{1}{2}(1-3x_2)^2\). Then
Also
Wickramasinghe, L., Leblanc, A. & Muthukumarana, S. Smoothed Dirichlet Distribution. J Stat Theory Appl 22, 237–261 (2023). https://doi.org/10.1007/s44199-023-00062-8