Natural Resources Research, Volume 26, Issue 2, pp 201–212

Creation of Histograms for Data in Various Mineral Resource and Engineering Problems: A Review of Existing Methods and a Proposed New Method to Define Bin Number

  • Louis St-Pierre
  • Yuksel Asli Sari
  • Mustafa Kumral
Review Paper

Abstract

Histograms are widely used in the geosciences for data analysis and visualization. Where no distribution is fitted to the data, histograms are often used to address various sampling- and interpolation-related questions. The results of these applications, however, depend substantially on the number of bins in the histogram, which differs from one binning method to another. This paper proposes a new binning approach and compares its performance with that of several standard approaches. Three histogram-based applications serve as case studies: cut-off grade optimization for polymetallic deposits, Monte Carlo modeling, and derivation of conditional distributions. The proposed technique calculates the squared error for each bin in a histogram and combines these values into a total error for that histogram; it then selects the bin number that minimizes the total error. The results show that the new binning approach is well suited to small datasets and can be used in geoscience applications.
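
The selection step described above (score each candidate bin number by a total error built from per-bin squared terms, then keep the minimizer) can be sketched roughly as follows. The abstract does not define the error measure, so this sketch substitutes the Shimazaki-Shinomoto cost function, a published estimator of integrated squared error; it should not be read as the authors' proposed technique. The function names, the candidate range of 2 to 50 bins, and the synthetic Gaussian sample are all assumptions made for illustration, and Sturges' rule is included only as a familiar baseline.

```python
import math
import random


def bin_counts(data, n_bins):
    """Count observations in n_bins equal-width bins spanning the data range."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # The maximum value is placed in the last bin rather than overflowing.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts, width


def total_error(data, n_bins):
    """Shimazaki-Shinomoto cost (assumed stand-in, NOT the paper's measure).

    Built from the squared deviation of each bin count from the mean count.
    """
    counts, width = bin_counts(data, n_bins)
    mean = sum(counts) / n_bins
    per_bin_sq = [(c - mean) ** 2 for c in counts]
    var = sum(per_bin_sq) / n_bins  # biased variance (divide by n_bins)
    return (2 * mean - var) / width ** 2


def select_bin_number(data, candidates=range(2, 51)):
    """Return the candidate bin number with the smallest total error."""
    return min(candidates, key=lambda k: total_error(data, k))


def sturges_bins(n):
    """Sturges' rule, a standard baseline: ceil(log2(n)) + 1 bins."""
    return math.ceil(math.log2(n)) + 1


# Hypothetical small dataset: 200 draws from a standard normal distribution.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
print("error-minimizing bins:", select_bin_number(sample))
print("Sturges bins:", sturges_bins(len(sample)))
```

Scanning a fixed list of candidates keeps the sketch simple; with small datasets the cost curve can be noisy, so the selected bin number may shift noticeably between samples.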

Keywords

Histogram bins, Cut-off grade, Binning methods, Conditional distribution

Notes

Acknowledgements

The authors thank the Natural Sciences and Engineering Research Council of Canada (NSERC) for supporting this research (Fund Number: 236482).

Copyright information

© International Association for Mathematical Geosciences 2016

Authors and Affiliations

  • Louis St-Pierre (1)
  • Yuksel Asli Sari (1)
  • Mustafa Kumral (1)
  1. Department of Mining and Materials Engineering, McGill University, Montreal, Canada
