# Creation of Histograms for Data in Various Mineral Resource and Engineering Problems: A Review of Existing Methods and a Proposed New Method to Define Bin Number


## Abstract

Histograms are widely used in the geosciences for data analysis and visualization. When no distribution is fitted to the data, histograms are often used to address sampling- and interpolation-related questions. However, the results of these applications depend substantially on the histogram's number of bins, which differs from one binning method to another. This paper proposes a new binning approach and compares it with several standard approaches to assess its relative performance. Three histogram-based applications serve as case studies: cut-off grade optimization for polymetallic deposits, Monte Carlo modeling, and derivation of conditional distributions. The proposed technique calculates the squared error for each bin in a histogram and combines these values into a total error for that histogram; it then selects the bin number that minimizes the total error. The results show that the new binning approach is well suited to small datasets and can be used in geoscience applications.
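
As a rough illustration of the selection loop described in the abstract, the sketch below scores each candidate bin number with a total error and keeps the bin number with the smallest score. The abstract does not state the paper's error formula, so the sketch substitutes the cost function of Shimazaki and Shinomoto (2007). That substitution is an assumption and should not be read as the proposed method. The function names are hypothetical, and Sturges' rule is included only as one example of a standard baseline.

```python
import math
import random


def histogram_counts(data, n_bins):
    """Count observations in n_bins equal-width bins spanning [min, max]."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # The maximum value falls on the right edge; clamp it into the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts, width


def total_error(data, n_bins):
    """Stand-in total error (ASSUMPTION, not the paper's formula).

    Uses the Shimazaki-Shinomoto cost (2*mean - variance) / width**2, where
    the variance term sums the squared deviation of each bin count from the
    mean count.
    """
    counts, width = histogram_counts(data, n_bins)
    mean = sum(counts) / n_bins
    variance = sum((c - mean) ** 2 for c in counts) / n_bins
    return (2 * mean - variance) / width ** 2


def select_bin_number(data, candidates):
    """Return the candidate bin number with the smallest total error."""
    return min(candidates, key=lambda k: total_error(data, k))


def sturges_bins(n):
    """Sturges' rule, a standard baseline: ceil(log2(n)) + 1 bins."""
    return math.ceil(math.log2(n)) + 1


# Usage on a small synthetic sample (hypothetical data, fixed seed).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
print("error-minimizing bins:", select_bin_number(sample, range(2, 31)))
print("Sturges' rule bins:", sturges_bins(len(sample)))
```

Any other per-histogram error measure can be dropped into `total_error` without changing the selection step, which is the part the abstract actually specifies.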

### Keywords

Histogram bins, Cut-off grade, Binning methods, Conditional distribution

## Notes

### Acknowledgements

The authors thank the Natural Sciences and Engineering Research Council of Canada (NSERC) for supporting this research (Fund Number: 236482).
