A cover technique to verify the reliability of a model for calculating fuzzy probabilities

Many models have been suggested to calculate fuzzy probabilities in risk analysis. In general, the reliability of a model is demonstrated by practical effects or proved theoretically. In this article we suggest a new approach, called the cover technique, to verify a model's reliability. The technique is based on the hypothesis that a statistical result can approximately confirm a fuzzy probability as a fuzzy-set-valued probability. A cover is constructed from many biprobability distributions. The consistency degree of a cover and a fuzzy probability distribution is employed to verify the reliability of a model. We present a case that shows how to construct a distribution-cover and calculate the consistency degree of the cover and a possibility-probability distribution. A series of numerical experiments with random samples from a normal distribution verifies the reliability of the interior-outer-set model.


Introduction
In recent years the problem of estimating a fuzzy probability with a small sample has been given much attention in risk analysis. With incomplete information, it is difficult to clearly see the future scenes associated with an adverse incident. The scenes are fuzzy risks (Huang and Ruan 2008), and we employ models to calculate fuzzy probabilities to represent the risks. For example, the fuzzy probability of earthquake magnitude given in Karimi and Hüllermeier (2007) represents the fuzzy seismic risk of the North Anatolian Fault.
Since fuzzy set theory was born, the fuzzy community has been thinking about fuzzy probability. Most researchers accept the concept of the probability of a fuzzy event (Zadeh 1968), where a basic probability distribution is given.
Following an approach to modeling uncertainty that was pioneered by Ramsey (1931) and further developed by de Finetti (1937), Williams (1975), and Walley (1991), de Cooman (2005) has presented a sound and deep approach to vague probability. In statistical applications, imprecise probabilities usually come from subjectively assessed prior probabilities. Fuzzy set theory is applicable to the modeling of imprecise subjective probabilities and has been suggested by many researchers, for example Freeling (1980), Watson, Weiss, and Donnell (1979), and Dubois and Prade (1989).
There is an urgent need to verify whether a model that calculates fuzzy probabilities is reliable before it can be employed in risk analysis. For example, suppose that a group of terrorists monitored by a security department has slipped into a city. According to statistical data, the department could estimate the probability of death toll x resulting from an attack by the group, denoted p(x), and employ it to describe the risk of a terrorism attack. However, nobody believes p(x) because the available data are scarce. Thus, a fuzzy probability p̃(x) would be a reasonable improvement for risk analysis of a potential terrorism attack. It is necessary to verify the reliability of the model used to calculate p̃(x) before we suggest it to the security department.
Many models have been suggested to calculate fuzzy probabilities. Some have been demonstrated by practical effects (Tanaka, Fan, and Toguchi 1983; de Cooman 2005), and others have been proven by using mathematical theory (Moeller and Beer 2003).
In this article we develop the histogram-covering approach (Huang and Jia 2008) into a more general cover technique to verify the reliability of a model employed to calculate fuzzy probabilities. The technique is based on the hypothesis that a statistical result can approximately confirm a fuzzy probability as a fuzzy-set-valued probability. The key concept in the technique is the "biprobability distribution," a probability of a probability of event occurrence. Many biprobability distributions form a cover. This article is organized as follows: Section 2 presents the cover hypotheses; Section 3 introduces the cover of probability distributions; Section 4 defines the consistency degree of a cover and a possibility-probability distribution (PPD); Section 5 presents two kinds of covers constructed with histograms; Section 6 introduces the interior-outer-set model to calculate a possibility-probability distribution. In Section 7, we verify that the interior-outer-set model is reliable. We conclude this article with Section 8.

Primeval Hypotheses
This study illustrates that a statistical result can approximately confirm a fuzzy probability represented as a fuzzy-set-valued probability.
First of all, let us look at the example of observing balls drawn from an urn to estimate the probability of drawing a red ball. The urn contains black balls, brown balls, red balls, orange balls, yellow balls, green balls, blue balls, purple balls, grey balls, and white balls. If we draw all balls from the urn, we can accurately estimate the probability of drawing a red ball. If we draw only a small number of balls, the probability cannot be accurately estimated in terms of statistics.
There is no loss in generality when we suppose that there are n red balls and m non-red balls in the urn. Furthermore, suppose a ball is drawn at random from the urn. By the relative frequency approach, the probability of obtaining a red ball is P = n / (n + m).
The problem we study is to estimate the probability by observing s balls drawn from the urn, where s ≪ n + m. We suppose that there are n_s red balls among the s balls, and P̄ = n_s / s is used to estimate P. In terms of statistics, P̄ ≠ P. In other words, we cannot determine the exact probability of obtaining a red ball until we draw all balls from the urn. The fuzzy framework is suitable for representing the uncertainty in the probability estimate.
Let M be a model to fuzzify P̄ so that we can obtain a fuzzy probability P̃, particularly expressed with a possibility distribution π(p), p ∈ [0,1], to represent the uncertainty in the probability estimate. For example, the model in Eq. 1 can be used to fuzzify P̄ into the possibility distribution shown in Figure 1a. When all balls are drawn to estimate the probability, the fuzziness is zero (Figure 1b).

π(p) = 1 if p ∈ [a, b], and π(p) = 0 otherwise,  (Eq. 1)

where a = max{0, P̄ − [1 − s/(n + m)]^9} and b = min{1, P̄ + [1 − s/(n + m)]^9}. Obviously, nobody can confirm whether the model in Eq. 1 is suitable to represent the uncertainty of the probability of obtaining a red ball with respect to an experiment where s balls are drawn from the n + m balls in an urn. We suggest two statistical hypotheses to verify the reliability of a model M; they are called cover hypotheses.
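As an illustration, the fuzzification in Eq. 1 can be sketched as follows. This is a minimal sketch: the function name `fuzzify` and the example numbers are ours, and the exponent 9 is taken from Eq. 1 as printed.

```python
def fuzzify(P, s, n_plus_m):
    """Fuzzify a point estimate P of a probability (Eq. 1).

    Returns the possibility distribution pi(p): 1 on [a, b], 0 otherwise.
    The half-width [1 - s/(n+m)]^9 shrinks to 0 as s approaches n + m,
    so the fuzziness vanishes when all balls are drawn.
    """
    width = (1 - s / n_plus_m) ** 9
    a = max(0.0, P - width)
    b = min(1.0, P + width)
    return lambda p: 1.0 if a <= p <= b else 0.0

# Drawing s = 20 of 100 balls and seeing 6 red ones:
pi = fuzzify(6 / 20, s=20, n_plus_m=100)
```

When s = n + m the interval [a, b] collapses to the point P, reproducing the crisp case of Figure 1b.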

Subjective Cover Hypothesis
Consider the following case: In an experiment group there are l statisticians and one fuzzy engineer. Observing s balls drawn from an urn filled with S balls, s ≪ S, they estimate the probability of obtaining a red ball.
The statisticians are good at estimating the probability by using their experience. The estimate given by the ith statistician is denoted as p^(i).
The fuzzy engineer is interested in mining fuzzy information carried by a small sample and is good at constructing a fuzzy model M to fuzzify a probability P̄, estimated by using relative frequency, into a possibility distribution π(p).
From the point of view of statistics, it is easy to understand that p^(1) = p^(2) = ⋯ = p^(l) would imply that the probability is p^(1) with confidence. Regarding X_statisticians = {p^(1), p^(2), …, p^(l)} as a general sample, we can obtain a probability distribution such as a histogram. Any probability distribution P_statisticians(p) based on X_statisticians is called a subject cover. From the point of view of possibility theory (Zadeh 1978), π(p) implies that the probability is p with possibility π(p) in terms of the confidence restriction. Therefore, if a subject cover is similar to π(p), the cover confirms, to some degree, that the fuzzy model M is reliable.
Our hypothesis is that there must exist a subject cover P_statisticians(p) to verify whether a fuzzy model M is reliable. It is not difficult to invite l statisticians to participate in the experiment, where the statisticians estimate the probability of obtaining a red ball with the s observations and their experience. Therefore, the hypothesis is accepted that the set of their estimates is a subject cover.

Random Cover Hypothesis
Consider the following case: In an experiment group there are one statistician and one fuzzy engineer. The statistician runs N experiments, and each time he draws s balls from the urn. The estimate of the probability of obtaining a red ball from the ith experiment is also denoted as p^(i). The sample X_experiments = {p^(1), p^(2), …, p^(N)} also provides confidence information about the probability of obtaining a red ball. Any probability distribution P_experiments(p) based on X_experiments is called a random cover.
Observing s balls drawn from an urn filled with S balls, s ≪ S, the fuzzy engineer employs a fuzzy model M to obtain a possibility distribution π(p) for the probability of obtaining a red ball.
The random cover hypothesis states that there must exist a random cover P_experiments(p) to verify whether a fuzzy model M is reliable.
It is easy to run many experiments when s ≪ S, and the results of the experiments are different. Therefore, the hypothesis is accepted that the set of the results is a random cover.
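The random-cover experiment can be simulated directly. The sketch below is our own construction (urn composition, bin count, and seed are illustrative): it draws s balls N times from a simulated urn and bins the N estimates into a probability distribution P_experiments(p).

```python
import random
from collections import Counter

def random_cover(n_red, n_other, s, N, bins=10, seed=0):
    """Build a random cover: run N experiments, each drawing s balls
    without replacement from an urn with n_red red and n_other other
    balls; the N estimates p^(i) = (red among s)/s are binned into a
    relative-frequency distribution over the probability domain [0, 1]."""
    rng = random.Random(seed)
    urn = [1] * n_red + [0] * n_other        # 1 = red ball
    estimates = [sum(rng.sample(urn, s)) / s for _ in range(N)]
    counts = Counter(min(int(p * bins), bins - 1) for p in estimates)
    # relative frequency of estimates falling into each probability bin
    return [counts[k] / N for k in range(bins)]

cover = random_cover(n_red=30, n_other=70, s=10, N=200)
```

With many experiments the mass of the cover concentrates around the true probability n_red/(n_red + n_other), which is exactly the confidence information the hypothesis relies on.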

Cover of Probability Distributions
The primeval hypotheses in Section 2 concern fuzzy models that fuzzify a single probability value. In risk analysis, the fuzzy risk is frequently related to a possibility-probability distribution (Huang and Moraga 2002; Karimi and Hüllermeier 2007; Huang and Ruan 2008), defined in Eq. 2:

P = {π_x(p) | x ∈ V, p ∈ U_p},  (Eq. 2)

where V stands for the population from which we draw a sample v, and U_p is the universe of discourse of probability. Let φ be a real function defined on V; then x = φ(v), v ∈ V, is a random variable. x and v are equivalent for identifying an event. π_x(p) is the possibility that an event occurs with probability p.
We extend the concept of the cover to correspond with probability distributions p^(i)(x), i = 1, 2, …, l, instead of the probability values p^(i), i = 1, 2, …, l, shown in Section 2.
Let X be a sample drawn from a population V with a theoretical probability distribution p(x). Let c be a statistical method, such as Maximum Likelihood, which processes X to give an estimate of the probability distribution, written as p̂(x). That is, for a population V, the theoretical probability distribution p(x) is unique, but different samples X_1, X_2, …, X_N lead us to different estimates p̂_1(x), p̂_2(x), …, p̂_N(x). At a fixed point x_0, these estimates form a sample, called a probability sample, written as W_{x_0} = {p̂_1(x_0), p̂_2(x_0), …, p̂_N(x_0)}. Let w be a statistical model employed to process W_{x_0} and obtain a probability distribution at x_0, called a biprobability distribution, written as W_{x_0}(p).

For example, let x be a random variable obeying the normal distribution N(6.86, 0.372²). With 10 random seed numbers, respectively, running Program 2 in Huang and Shi (2002), a generator of random numbers, with MU=6.86, standard deviation SIGMA=0.372, and sample size N=11, we obtain 10 samples; one of them is shown in Figure 2. In practice, the biprobability distribution is not the normal distribution inferred by using the central limit theorem, because the integral of the function W_{7.3}(p) over [0,1] is less than 1. The normal distribution assumption is used only to reduce the complexity of discussing the properties of a biprobability distribution.

Let C(x, p) = {W_x(p) | x ∈ V} be a family of biprobability distributions corresponding to a population with distribution p(x). For any fixed x, C(x, p) is a probability distribution with respect to the variable p: a density function for a continuous distribution defined on the interval [0,1], or a discrete function for a discrete distribution defined on a universe of discourse of probability U_p. The family C(x, p) is called a cover of probability distributions. Obviously, C(x, p) is a random cover but not a subjective cover. According to the random cover hypothesis suggested in Section 2, we infer that there must exist a cover C(x, p) to verify the reliability of a fuzzy model M.
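The construction of a probability sample W_{x_0} can be sketched as follows. We cannot reproduce Program 2 of Huang and Shi (2002), so Python's `random.gauss` stands in as the N(6.86, 0.372²) generator, and the relative frequency over a small interval (half-width our choice) stands in for the distribution estimate at x_0.

```python
import random

def probability_sample(x0, mu=6.86, sigma=0.372, n=11, N=10,
                       halfwidth=0.2, seed=1):
    """Collect a probability sample W_{x0}: for each of N samples of
    size n from N(mu, sigma^2), estimate the probability that an
    observation falls in [x0 - halfwidth, x0 + halfwidth] by its
    relative frequency.  random.gauss replaces Program 2 of Huang and
    Shi (2002); the interval half-width is an illustrative choice."""
    rng = random.Random(seed)
    W = []
    for _ in range(N):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        hits = sum(1 for x in sample if abs(x - x0) <= halfwidth)
        W.append(hits / n)
    return W

W_73 = probability_sample(7.3)   # probability sample at x0 = 7.3
```

A biprobability distribution W_{x_0}(p) is then any probability distribution (e.g., a histogram) estimated from the values in W_{x_0}.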

Consistency Degree of a Cover and a PPD
The primeval hypotheses suggested in section 2 only assert that it is possible to verify the reliability of a fuzzy model with a cover. In this section, we suggest an approach to compare a cover and a PPD for verification. The reliability degree of the model is measured by using the consistency degree of the cover and the PPD.

Consistency Degree
The concept of consistency is quite rough. Strictly speaking, a PPD and a cover C are consistent if and only if they are equal. In many cases they cannot be equal directly. The family H = {h_x(p)}, with h_x(p) = C(x, p)/max_{p ∈ U_p} C(x, p), x ∈ Ω, p ∈ U_p, is called a normalized cover. We define that π_x(p) and C(x, p) are consistent if and only if ∀x ∈ V, π_x(p) = h_x(p).
It is interesting to notice from Figure 2 that, ∀x, C(x, p) is a probability distribution; therefore its sum over U_p is equal to 1, whereas the maximum of a possibility distribution is 1. Hence, C and P may not coincide even when H and P coincide well. Obviously, in a numerical experiment a PPD is in general not equal to a cover, because the size of a sample is always limited. Therefore, it is necessary to weaken the condition of consistency.
Let (Ω, A, P) be a probability space, and let U_p be the universe of discourse of probability. Let P = {π_x(p)} and H = {h_x(p)} be a PPD and a cover, defined on V × U_p, respectively.
Obviously, both π_x(p) and h_x(p) are two-variable bounded functions, with 0 ≤ π_x(p), h_x(p) ≤ 1; that is, they are 0-1 bounded functions. Hence, our problem can be simplified to studying the consistency between two functions defined on a domain U.
Let F be a set of two-variable functions with domain U, denoted as F = {f(x, y) : (x, y) ∈ U}. In our case, F is a set of 0-1 bounded functions, and U = V × U_p. The consistent approximation g, defined through the naive distance in Eq. 6, is a reasonable index to measure the consistency degree between two functions. However, with it we overlook the information that f_1(x, y) and f_2(x, y) may reach their extreme values at different points. In particular, when f_1(x, y) and f_2(x, y) are equal to zero on the main part of U, g is not a good index for consistency. g is an upper consistency.
Let F be a set of 0-1 bounded functions, f_1(x, y), f_2(x, y) ∈ F, and let Z be the peak set on which f_1 and f_2 reach their extreme values. The quantity defined in Eq. 7 is called the extremal error between f_1 and f_2, where |Z| is the cardinal number of Z, and the quantity b derived from it is called the consistent kernel.

In the case that the peak set Z is neither integrable nor discrete, the expression of the consistent kernel may be complex. In practice, Z is usually discrete.

Obviously, if the peak sets of f_1 and f_2 are equal, then b = 1. Otherwise, the largest error between f_1(x, y) and f_2(x, y) on the peak set Z determines the consistent kernel. b is a lower consistency.
Definition 4. Let f_1(x, y), f_2(x, y) be 0-1 bounded functions defined on U. f_2(x, y) is consistent with f_1(x, y) in degree a = (g + b) / 2 if and only if their consistent approximation is g and their consistent kernel is b.
a is also called the consistency degree of f_2(x, y) with respect to f_1(x, y).
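Definition 4 can be sketched numerically. Since Eqs. 6 and 7 are not reproduced above, the sketch assumes simple forms, which are our assumptions and not the paper's exact formulas: g is one minus the mean absolute difference on a grid, and b is one minus the largest difference on the peak set Z (the grid points where either function attains its maximum).

```python
def consistency_degree(f1, f2, grid):
    """Consistency degree a = (g + b) / 2 of two 0-1 bounded functions
    sampled on a common grid.  Assumed forms of Eqs. 6-7:
    g (consistent approximation) = 1 - mean |f1 - f2|;
    b (consistent kernel) = 1 - max |f1 - f2| on the peak set Z."""
    v1 = [f1(u) for u in grid]
    v2 = [f2(u) for u in grid]
    g = 1 - sum(abs(x - y) for x, y in zip(v1, v2)) / len(grid)
    # peak set Z: points where f1 or f2 reaches its maximum value
    Z = {i for i, v in enumerate(v1) if v == max(v1)}
    Z |= {i for i, v in enumerate(v2) if v == max(v2)}
    b = 1 - max(abs(v1[i] - v2[i]) for i in Z)
    return (g + b) / 2

grid = [k / 10 for k in range(11)]
a_equal = consistency_degree(lambda p: 1.0, lambda p: 1.0, grid)  # identical: 1.0
```

Identical functions give a = 1; functions that agree on average but peak at different points are penalized through b, which is exactly the role of the lower consistency.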

Consistency Degree of a Cover and a PPD
Let V be a population and U_p a universe of discourse of probability. Given a normalized cover H = {h_x(p)} and a PPD P = {π_x(p)} defined on V × U_p, we study the consistency degree of H and P.
Employing the formulas in Eq. 6 and Eq. 7, respectively, we obtain the naive distance and the extremal error between H and P, shown in Eq. 8 and Eq. 9.
According to Definition 4, the consistency degree of H and P is a(H, P), shown in Eq. 10. a(H, P) is called the consistency degree of a cover with normalization H and a PPD P. In other words, the consistency degree of a cover and a PPD is defined as the consistency degree of the cover's normalization and the PPD.

Two Kinds of Covers Constructed with Histograms
A histogram is a graph of grouped (binned) data in which the number of values in each bin is represented by the area of a rectangular box. A relative frequency histogram, as an estimate of the probability distribution of a continuous variable, is a bar graph constructed in such a way that the area of each bar is proportional to the fraction of observations in the category that it represents.

Relative Frequency Histogram
Let X = {x_i | i = 1, 2, …, n} be a sample drawn from V with a probability density function (PDF) p(x). The estimate p̂_X(x) defined in Eq. 11 is called a relative frequency histogram (RFH) with respect to X. p̂_X(x) is an estimate of the probability that an event occurs in the same interval as x.
Let u_j be the midpoint of interval I_j. We obtain a discrete domain of definition for p̂(u). The value p̂(u), estimated by using the RFH with X at u, is only one possible value of the probability that an event occurs in the interval that includes u.
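A relative frequency histogram in the sense of Eq. 11 can be sketched as below; the bin handling and the toy sample are our illustrative choices.

```python
def relative_frequency_histogram(X, t):
    """Relative frequency histogram of sample X with t equal-width
    bins: returns the bin midpoints u_j and the fraction of
    observations falling in each interval I_j (an estimate of the
    probability that an event occurs in I_j).  Assumes the sample
    is not constant (so the bin width is positive)."""
    lo, hi = min(X), max(X)
    width = (hi - lo) / t
    counts = [0] * t
    for x in X:
        j = min(int((x - lo) / width), t - 1)  # clamp x == hi into last bin
        counts[j] += 1
    midpoints = [lo + (j + 0.5) * width for j in range(t)]
    freqs = [c / len(X) for c in counts]
    return midpoints, freqs

u, p_hat = relative_frequency_histogram([6.5, 6.8, 6.9, 7.0, 7.1, 7.4], t=3)
```

Each p̂(u_j) is only one realization; repeating the estimate over many samples yields the probability samples from which the covers below are built.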

Natural Cover of Histograms
According to N (the size of the probability sample), we divide the probability domain [0,1] into t probability intervals, where t can be obtained by using Eq. 12, suggested by Otnes and Enochson (1972), and the probability step is d = 1 / t.
Then, with the probability sample W_{I_j}, we can construct a biprobability distribution, denoted as W_{I_j}(p_k), where p_k is the midpoint of the probability interval A_k. Mathematically, W_{I_j}(p_k) = (1/N) × (number of elements of W_{I_j} in the same bin as p_k), k = 1, 2, …, t.

Distribution-Cover of Histograms
According to n (the size of the sample drawn from V with PDF p(x)), we construct a discrete universe of discourse of probability shown in Eq. 14.
Huang (2000) proved that, in the case where we only have a small sample to estimate a probability distribution, a soft histogram estimate is better than a histogram estimate, with about 28 percent higher work efficiency. In other words, if the histogram method needs a sample of 30 observations, then 28 percent fewer, 30 − 30 × 28% ≈ 30 − 8 = 22, means that a sample with 22 observations can give a soft histogram estimate of similar accuracy. Therefore, we employ the distribution-cover of histograms to verify the reliability of a model for calculating fuzzy probabilities.

PPD Calculated by Interior-Outer-Set Model
The interior-outer-set model (IOSM) (Huang 2002a) is built on the quantity q_ij, called the information gain of observation x_i distributed to the controlling point u_j.

Definition 5. X_{in-j} = X ∩ I_j is called an interior set of interval I_j. The elements of X_{in-j} are called the interior points of I_j.
Let A and B be two sets. A \ B = {x | x ∈ A and x ∉ B} is called their set difference.
Definition 6. Let X_{in-j} be the interior set of interval I_j. X_{out-j} = X \ X_{in-j} is called an outer set of interval I_j. The elements of X_{out-j} are called the outer points of I_j.
∀x_i ∈ X: if x_i ∈ X_{in-j}, we say that it may lose information, with gain 1 − q_ij, to another interval; we use q_ij^− = 1 − q_ij to represent the loss. If x_i ∈ X_{out-j}, we say that it may give information, with gain q_ij, to I_j; we use q_ij^+ to represent the addition. That is, x_i may leave I_j with possibility q_ij^− if x_i ∈ X_{in-j}, or x_i may join I_j with possibility q_ij^+ if x_i ∈ X_{out-j}. q_ij^− is called the leaving possibility, and q_ij^+ is called the joining possibility. The leaving possibility of an outer point is defined as 0 (it has already gone); the joining possibility of an interior point is defined as 0 (it is already in the interval).
Any model based on q_ij^+ and q_ij^− to calculate a PPD on I × U_p is called an IOSM. The first IOSM was suggested in Huang (1998) and applied to study the risk of crop flood, giving a better result for supporting flood-risk management in agriculture than the traditional probability method (Huang 2002a). In Huang and Moraga (2002), the model was transformed into a matrix algorithm. The second IOSM was suggested in Moraga and Huang (2003), with complexity in the O(n log n) class instead of O(n²), and applied to make soft risk maps (Zhang 2005). The third IOSM was introduced in Zong (2004) to smooth the abrupt slopes in a PPD, where the membership is less than or equal to 0.5 if it is not 1. This article focuses on exploring a new approach to verify reliability, not on improving IOSM. Therefore, we employ the second IOSM, in Eq. 20, to calculate a PPD.
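As a sketch of the quantities in Definitions 5 and 6, the snippet below splits a sample into the interior and outer points of one interval and attaches their leaving and joining possibilities. Eq. 20 itself is not reproduced here, so we assume a linear information-distribution gain q_ij = max(0, 1 − |x_i − u_j|/Δ); this gain formula is our assumption, not necessarily the paper's exact choice.

```python
def iosm_possibilities(X, u, delta):
    """For interval I_j centred at controlling point u with step delta,
    split sample X into the interior set (points inside I_j) and the
    outer set, attaching the leaving possibility q_ij^- = 1 - q_ij to
    interior points and the joining possibility q_ij^+ = q_ij to outer
    points.  Assumes the linear gain q_ij = max(0, 1 - |x_i - u|/delta).
    """
    interior, outer = [], []
    for x in X:
        q = max(0.0, 1 - abs(x - u) / delta)
        if abs(x - u) <= delta / 2:          # x_i falls inside I_j
            interior.append((x, 1 - q))      # leaving possibility
        else:
            outer.append((x, q))             # joining possibility
    return interior, outer

inside, outside = iosm_possibilities([1.0, 1.4, 2.1, 3.0], u=1.0, delta=1.0)
```

An observation at the controlling point has leaving possibility 0 (it cannot leave), while a far-away outer point has joining possibility 0, matching the conventions stated above.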