Abstract
Statistical evaluation of data collected in the field and laboratory is an important task in the earth sciences. A geological study aims to draw scientific results from the Earth, but, as in the natural and engineering sciences, its conclusions must rest on analytical inference. The importance of data analysis therefore grows as the technological methods used in geology improve. The purpose of data analysis in geology is to examine the rate at which a feature changes within a population. Geological data may be lithological, textural (e.g. grain size and shape), structural (e.g. bedding, faults, foliation, joints or lineation) or chemical (e.g. major and trace elements, isotope ratios) measurements of rock, mineral, fossil, soil or water specimens. GEOstats is an Excel-based data analysis program that provides graphical and numerical results, together with data simulation and statistical modeling (e.g. simple regression analysis, box plot, Q–Q plot, XYZ plot, sample distribution and classification), for samples representing a population. It is intended for geologists and other researchers. The program can perform both simple data analysis (e.g. basic statistical calculations) and complex multivariate analysis such as cluster analysis, principal component analysis and common factor analysis. GEOstats can also produce error bars from relative standard deviation values for different confidence intervals.
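The error-bar calculation mentioned in the abstract can be illustrated with a minimal sketch. It is not the GEOstats VBA implementation. It assumes normally distributed measurement errors and a symmetric error bar, and the function names `relative_std_dev` and `error_bar_half_width` are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def relative_std_dev(values):
    """Relative standard deviation in percent: 100 * sample std dev / mean."""
    return 100.0 * stdev(values) / mean(values)


def error_bar_half_width(value, rsd_percent, confidence=0.95):
    """Half-width of a symmetric error bar around `value`.

    Assumption (not taken from GEOstats): errors are normally distributed,
    so the half-width is z * sigma, with z the two-sided normal quantile.
    """
    sigma = abs(value) * rsd_percent / 100.0          # absolute 1-sigma uncertainty
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # e.g. about 1.96 for 95 %
    return z * sigma


# Three replicate measurements of the same quantity (illustrative numbers).
replicates = [10.0, 12.0, 14.0]
rsd = relative_std_dev(replicates)

# Error bar for a value of 50 reported with a 2 % RSD at 95 % confidence.
half_95 = error_bar_half_width(50.0, 2.0, confidence=0.95)
print(f"RSD = {rsd:.2f} %, 95 % half-width = {half_95:.3f}")
```

Changing `confidence` to 0.68 or 0.99 gives narrower or wider bars for the same relative standard deviation, which is the sense in which error bars depend on the chosen confidence interval.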
Data availability
GEOstats can be downloaded from GitHub (https://github.com/mstgndz/GEOstats). It occupies approximately 14.5 MB of disk space and runs on Microsoft Windows with Excel.
Code availability
The code used in GEOstats is available from two public GitHub repositories: (1) eigenvalues of a symmetric matrix (https://github.com/YoichiroUrita/Math) and (2) a k-means algorithm for clustering (https://github.com/gpolic/kmeans-excel). The VBA code (3.36 and 16.4 KB, respectively) was released as open access by its authors in 2018.
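To show why an eigenvalue routine for symmetric matrices matters for principal component analysis, here is a minimal sketch using the classical cyclic Jacobi rotation method in plain Python. Whether the linked VBA code uses this particular algorithm is an assumption. The function name `jacobi_eigenvalues` and the example correlation matrix are hypothetical.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy
    for _ in range(max_sweeps):
        # Stop once the off-diagonal part is numerically zero.
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                apq = a[p][q]
                if apq == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q].
                theta = (a[q][q] - a[p][p]) / (2.0 * apq)
                t = math.copysign(1.0, theta) / (
                    abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                a[p][p] -= t * apq
                a[q][q] += t * apq
                a[p][q] = a[q][p] = 0.0
                # Update the remaining entries of rows/columns p and q.
                for r in range(n):
                    if r != p and r != q:
                        arp, arq = a[r][p], a[r][q]
                        a[r][p] = a[p][r] = c * arp - s * arq
                        a[r][q] = a[q][r] = c * arq + s * arp
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Correlation matrix of two strongly correlated variables (illustrative).
corr = [[1.0, 0.8],
        [0.8, 1.0]]
eigs = jacobi_eigenvalues(corr)
# Share of total variance carried by the first principal component.
explained = eigs[0] / sum(eigs)
print(eigs, explained)
```

In PCA, each eigenvalue of the correlation (or covariance) matrix is the variance along one principal component, so the ratio of an eigenvalue to the sum of all eigenvalues gives the fraction of variance that component explains.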
Acknowledgements
The authors thank the editor, Dr. Hassan A. Babaie, and two anonymous reviewers for their critical and constructive comments, which improved the quality of this paper and the software. The first author also thanks Campus France for the Scholarships to Foreign Students in France and Turkey’s Council of Higher Education for the 100/2000 PhD Scholarship.
Author information
Contributions
M.G.: Software and Programming, Methodology, Conceptualization, Writing. K.A.: Methodology, Conceptualization, Writing, Supervision.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Communicated by: H. Babaie
About this article
Cite this article
Gündüz, M., Asan, K. GEOstats: an excel-based data analysis program applying basic principles of statistics for geological studies. Earth Sci Inform 15, 705–712 (2022). https://doi.org/10.1007/s12145-021-00710-6
DOI: https://doi.org/10.1007/s12145-021-00710-6