Creation of Histograms for Data in Various Mineral Resource and Engineering Problems: A Review of Existing Methods and a Proposed New Method to Define Bin Number
Histograms are widely used in geosciences for data analysis and visualization. In cases where a distribution is not fitted to data, histograms are often used to address various sampling- and interpolation-related aspects. However, the results of these applications are substantially affected by the number of bins in the histogram, which depends on the binning method used. This paper proposes a new binning approach and compares its performance with that of several standard approaches. Three applications that rely on histograms serve as case studies: cut-off grade optimization for polymetallic deposits, Monte Carlo modeling, and derivation of conditional distributions. The proposed technique calculates the squared error for each bin in a histogram and combines these values into a total error for the histogram. It then selects the bin number that minimizes the total error. The results show that the new binning approach is well suited to binning small datasets and can be used in geoscience applications.
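The abstract does not give the exact error definition, so the following is only a minimal sketch of the general idea: score each candidate bin number by a total error built from per-bin terms, then keep the bin number with the smallest total. As a stand-in for the proposed criterion, it uses the leave-one-out cross-validation estimate of integrated squared error for equal-width histograms (Rudemo, 1982), which may differ from the error measure used in the paper. The function names `loo_cv_risk`, `select_bin_number`, and `standard_bin_numbers`, the `max_bins` limit, and the synthetic lognormal sample are all assumptions made for illustration. The standard rules are taken from NumPy's `histogram_bin_edges`.

```python
import numpy as np


def loo_cv_risk(data, n_bins):
    """Cross-validation estimate of integrated squared error (Rudemo, 1982)
    for an equal-width histogram with n_bins bins.

    NOTE: this is a stand-in criterion, not the paper's own error measure.
    """
    data = np.asarray(data, dtype=float)
    n = data.size  # requires at least two observations
    counts, edges = np.histogram(data, bins=n_bins)
    h = edges[1] - edges[0]  # common bin width
    # One squared-count term per bin; the terms are summed into a total.
    per_bin = counts.astype(float) ** 2
    return 2.0 / ((n - 1) * h) - (n + 1) / (n**2 * (n - 1) * h) * per_bin.sum()


def select_bin_number(data, max_bins=50):
    """Return the candidate bin number (1..max_bins) with the smallest risk."""
    risks = [loo_cv_risk(data, m) for m in range(1, max_bins + 1)]
    return int(np.argmin(risks)) + 1


def standard_bin_numbers(data):
    """Bin numbers suggested by some standard rules implemented in NumPy."""
    rules = ("sturges", "doane", "fd", "scott")
    return {r: len(np.histogram_bin_edges(data, bins=r)) - 1 for r in rules}


# Synthetic small sample (hypothetical, not data from the paper).
rng = np.random.default_rng(0)
grades = rng.lognormal(mean=0.0, sigma=0.5, size=60)

print("standard rules:", standard_bin_numbers(grades))
print("error-minimizing bin number:", select_bin_number(grades))
```

For small samples the standard rules can disagree noticeably, which is the situation the abstract describes. A data-driven search of this kind makes the choice explicit, at the cost of evaluating every candidate bin number.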
Keywords: Histogram bins; Cut-off grade; Binning methods; Conditional distribution
The authors thank the Natural Sciences and Engineering Research Council of Canada (NSERC) for supporting this research (Fund Number: 236482).