Estimation of the Data Region Using Extreme-Value Distributions
In pattern recognition and outlier detection, it is necessary to estimate the region in which data of a particular class are generated. In other words, one must accurately estimate the support of the distribution that generates the data. For a one-dimensional distribution whose support is a finite interval, the data region can be estimated effectively from the maximum and minimum values in the sample. The limiting distributions of these values have been studied in extreme-value theory. In this research, we propose a method for estimating the data region using the sample maximum and minimum. We calculate the average loss of the estimator and derive optimally improved estimators for given loss functions.
Keywords: Loss Function, Data Region, Gaussian Mixture Model, Asymptotic Distribution, Outlier Detection
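As a rough illustration of the idea in the abstract, the sketch below assumes the simplest case: a uniform distribution on a finite interval [a, b]. The sample minimum and maximum always lie inside the true interval, so the naive region is too narrow on average. Widening each endpoint by (max - min) / (n - 1) removes that bias in the uniform case. The function names (`naive_region`, `widened_region`, `upper_endpoint_bias`) are hypothetical, and this textbook correction is a stand-in, not the loss-optimal estimators derived in the paper.

```python
import random


def naive_region(samples):
    """Estimate the data region directly by the sample minimum and maximum."""
    return min(samples), max(samples)


def widened_region(samples):
    """Widen the naive region by (max - min) / (n - 1) on each side.

    Assumption: the data are uniform on a finite interval, for which this
    margin makes both endpoint estimates unbiased. It is an illustrative
    correction, not the estimator proposed in the paper.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    lo, hi = min(samples), max(samples)
    margin = (hi - lo) / (n - 1)
    return lo - margin, hi + margin


def upper_endpoint_bias(estimator, a, b, n, trials, seed):
    """Monte Carlo estimate of E[estimated upper endpoint] - b."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = [rng.uniform(a, b) for _ in range(n)]
        _, upper = estimator(samples)
        total += upper - b
    return total / trials


if __name__ == "__main__":
    for name, est in [("naive", naive_region), ("widened", widened_region)]:
        bias = upper_endpoint_bias(est, a=0.0, b=1.0, n=20, trials=20000, seed=0)
        print(f"{name:8s} upper-endpoint bias: {bias:+.4f}")
```

With 20 samples from the unit interval, the naive upper endpoint is expected to fall short of the true value by about 1/21, while the widened estimate should be close to unbiased. A different loss function would generally call for a different margin, which is the question the paper addresses.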