
Solving petrological problems through machine learning: the study case of tectonic discrimination using geochemical and isotopic data

Contributions to Mineralogy and Petrology

Abstract

Machine-learning methods are evaluated to study the intriguing and debated topic of discrimination among different tectonic environments using geochemical and isotopic data. Volcanic rocks characterized by a complete geochemical signature of major elements (SiO2, TiO2, Al2O3, Fe2O3T, CaO, MgO, Na2O, K2O), selected trace elements (Sr, Ba, Rb, Zr, Nb, La, Ce, Nd, Hf, Sm, Gd, Y, Yb, Lu, Ta, Th) and isotopes (206Pb/204Pb, 207Pb/204Pb, 208Pb/204Pb, 87Sr/86Sr and 143Nd/144Nd) have been extracted from open-access and comprehensive petrological databases (i.e., PetDB and GEOROC). The resulting dataset has been analyzed using support vector machines, a set of supervised machine-learning methods considered particularly powerful for classification problems. Results from the application of the machine-learning methods show that the combined use of major elements, trace elements and isotopes allows the geochemical composition of rocks to be associated with the corresponding tectonic setting with high classification scores (93 % on average). The lowest scores are recorded for volcanic rocks from back-arc basins (65 %). All the other tectonic settings display higher classification scores, with oceanic islands reaching values up to 99 %. The results of this study could have a significant impact on other petrological studies, potentially opening new perspectives for petrologists and geochemists. Other examples of applications include the development of more robust geothermometers and geobarometers and the recognition of volcanic sources for tephra layers in tephrochronological studies.
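As an illustrative and purely hypothetical sketch of the workflow summarized above (not the code used in this study), the snippet below trains an RBF-kernel support vector machine on a geochemical table with scikit-learn (Pedregosa et al. 2011); the file name, column labels and hyper-parameter values are placeholders.

```python
# Hypothetical sketch: SVM classification of tectonic setting from
# geochemical data with scikit-learn. File name, column names and
# hyper-parameters are placeholders, not values from this study.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

FEATURES = ["SiO2", "TiO2", "Al2O3", "Fe2O3T", "CaO", "MgO", "Na2O", "K2O",
            "Sr", "Ba", "Rb", "Zr", "Nb", "La", "Ce", "Nd", "Hf", "Sm",
            "Gd", "Y", "Yb", "Lu", "Ta", "Th",
            "206Pb/204Pb", "207Pb/204Pb", "208Pb/204Pb",
            "87Sr/86Sr", "143Nd/144Nd"]

df = pd.read_csv("georoc_petdb_compilation.csv")   # hypothetical compilation
X, y = df[FEATURES].values, df["tectonic_setting"].values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Standardize the features, then fit an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
clf.fit(X_train, y_train)
print("test-set accuracy:", clf.score(X_test, y_test))
```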


References

  • Abedi M, Norouzi G-H, Bahroudi A (2012) Support vector machine for multi-classification of mineral prospectivity areas. Comput Geosci 46:272–283. doi:10.1016/j.cageo.2011.12.014

  • Agrawal S, Guevara M, Verma SP (2004) Discriminant analysis applied to establish major-element field boundaries for tectonic varieties of basic rocks. Int Geol Rev 46:575–594. doi:10.2747/0020-6814.46.7.575

  • Agrawal S, Guevara M, Verma SP (2008) Tectonic discrimination of basic and ultrabasic volcanic rocks through log-transformed ratios of immobile trace elements. Int Geol Rev 50:1057–1079. doi:10.2747/0020-6814.50.12.1057

  • Bishop C (2007) Pattern recognition and machine learning. Springer, New York

  • Box GEP, Cox DR (1964) An analysis of transformations. J Roy Stat Soc B Met 26:211–252. doi:10.2307/2287791

  • Cannata A, Montalto P, Aliotta M et al (2011) Clustering and classification of infrasonic events at Mount Etna using pattern recognition techniques. Geophys J Int 185:253–264. doi:10.1111/j.1365-246X.2011.04951.x

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297. doi:10.1007/BF00994018

  • Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge

  • Dorffner G, Bischof H, Hornik K (2001) Artificial neural networks—ICANN 2001. Springer, Berlin

  • El-Khoribi RA (2008) Support vector machine training of HMT models for land cover image classification. ICGST-GVIP 8:7–11

  • Fischer CC, Tibbetts KJ, Morgan D, Ceder G (2006) Predicting crystal structure by merging data mining with quantum mechanics. Nat Mater 5:641–646. doi:10.1038/nmat1691

  • Frisch W, Meschede M, Blakey R (2011) Plate tectonics: continental drift and mountain building. Springer, New York

  • Goldstein EB, Coco G (2014) A machine learning approach for the prediction of settling velocity. Water Resour Res 50:3595–3601. doi:10.1002/2013WR015116

  • Grimes CB, Wooden JL, Cheadle MJ, John BE (2015) “Fingerprinting” tectono-magmatic provenance using trace elements in igneous zircon. Contrib Miner Petrol 170:46. doi:10.1007/s00410-015-1199-3

  • Hsu C-W, Lin C-J (2002) A comparison of methods for multiclass support vector machines. IEEE Trans Neural Netw 13:415–425. doi:10.1109/72.991427

  • Huang C, Davis LS, Townshend JRG (2002) An assessment of support vector machines for land cover classification. Int J Remote Sens 23:725–749. doi:10.1080/01431160110040323

  • Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv 31:264–323. doi:10.1145/331499.331504

  • James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning. Springer, New York

  • Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349:255–260. doi:10.1126/science.aaa8415

  • Kavzoglu T, Colkesen I (2009) A kernel functions analysis for support vector machines for land cover classification. Int J Appl Earth Obs Geoinf 11:352–359. doi:10.1016/j.jag.2009.06.002

  • Knerr S, Personnaz L, Dreyfus G (1990) Single-layer learning revisited: a stepwise procedure for building and training a neural network. In: Neurocomputing: algorithms, architectures and applications. Springer, Berlin, pp 41–50

  • Kotsiantis SB (2007) Supervised machine learning: a review of classification techniques. Informatica 31:249–268

  • Lach-hab M, Yang S, Vaisman II, Blaisten-Barojas E (2010) Novel approach for clustering zeolite crystal structures. Mol Inform 29:297–301. doi:10.1002/minf.200900072

  • Le Maitre RW (1982) Numerical petrology: statistical interpretation of geochemical data. Elsevier, Amsterdam

  • Lee JA, Verleysen M (2009) Quality assessment of dimensionality reduction: rank-based criteria. Neurocomputing 72:1431–1443. doi:10.1016/j.neucom.2008.12.017

  • Li C, Arndt NT, Tang Q, Ripley EM (2015) Trace element indiscrimination diagrams. Lithos 232:76–83. doi:10.1016/j.lithos.2015.06.022

  • Masotti M, Falsaperla S, Langer H et al (2006) Application of support vector machine to the classification of volcanic tremor at Etna, Italy. Geophys Res Lett 33:L20304. doi:10.1029/2006GL027441

  • Meschede M (1986) A method of discriminating between different types of mid-ocean ridge basalts and continental tholeiites with the Nb–Zr–Y diagram. Chem Geol 56:207–218. doi:10.1016/0009-2541(86)90004-5

  • Murphy KP (2012) Machine learning: a probabilistic perspective. The MIT Press, Cambridge

  • Pearce JA (1976) Statistical analysis of major element patterns in basalts. J Petrol 17:15–43. doi:10.1093/petrology/17.1.15

  • Pearce JA, Cann JR (1973) Tectonic setting of basic volcanic rocks determined using trace element analyses. Earth Planet Sci Lett 19:290–300. doi:10.1016/0012-821X(73)90129-5

  • Pearce JA, Norry MJ (1979) Petrogenetic implications of Ti, Zr, Y, and Nb variations in volcanic rocks. Contrib Miner Petrol 69:33–47. doi:10.1007/BF00375192

  • Pearce J, Stern R (2006) Origin of back-arc basin magmas: trace element and isotope perspectives. In: Christie DM, Fisher CR, Lee SM, Givens S (eds) Back-arc spreading systems: geological, biological, chemical, and physical interactions. American Geophysical Union, Washington, DC

  • Pearce JA, Harris NBW, Tindle AG (1984) Trace element discrimination diagrams for the tectonic interpretation of granitic rocks. J Petrol 25:956–983. doi:10.1093/petrology/25.4.956

  • Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

  • Petrelli M, Perugini D, Moroni B, Poli G (2003) Determination of travertine provenance from ancient buildings using self-organizing maps and fuzzy logic. Appl Artif Intell 17:885–900. doi:10.1080/713827251

  • Provost F, Kohavi R (1998) Guest editors’ introduction: on applied research in machine learning. Mach Learn 30:127–132. doi:10.1023/A:1007442505281

  • Saccani E (2015) A new method of discriminating different types of post-Archean ophiolitic basalts and their tectonic significance using Th–Nb and Ce–Dy–Yb systematics. Geosci Front 6:481–501. doi:10.1016/j.gsf.2014.03.006

  • Schölkopf B, Burges CJC, Girosi F et al (1997) Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans Signal Process 45:2758–2765. doi:10.1109/78.650102

  • Shalev-Shwartz S, Ben-David S (2014) Understanding machine learning: from theory to algorithms. Cambridge University Press, Cambridge

  • Shervais JW (1982) Ti–V plots and the petrogenesis of modern and ophiolitic lavas. Earth Planet Sci Lett 59:101–118. doi:10.1016/0012-821X(82)90120-0

  • Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14:199–222. doi:10.1023/B:STCO.0000035301.49549.88

  • Snow CA (2006) A reevaluation of tectonic discrimination diagrams and a new probabilistic approach using large geochemical databases: moving beyond binary and ternary plots. J Geophys Res 111:B06206. doi:10.1029/2005JB003799

  • Taylor B, Martinez F (2003) Back-arc basin basalt systematics. Earth Planet Sci Lett 210:481–497. doi:10.1016/S0012-821X(03)00167-5

  • Templ M, Filzmoser P, Reimann C (2008) Cluster analysis applied to regional geochemical data: problems and possibilities. Appl Geochem 23:2198–2213. doi:10.1016/j.apgeochem.2008.03.004

  • Thompson JB (1982a) Composition space: an algebraic and geometric approach. Rev Miner Geochem 10:1–31

  • Thompson JB (1982b) Reaction space: an algebraic and geometric approach. Rev Miner Geochem 10:33–52

  • Tomlinson E, Smith V, Albert P (2015) The major and trace element glass compositions of the productive Mediterranean volcanic sources: tools for correlating distal tephra layers in and around Europe. Quat Sci Rev 118:48–66. doi:10.1016/j.quascirev.2014.10.028

  • Verma SP, Pandarinath K, Verma SK, Agrawal S (2013) Fifteen new discriminant-function-based multi-dimensional robust diagrams for acid rocks and their application to Precambrian rocks. Lithos 168–169:113–123. doi:10.1016/j.lithos.2013.01.014

  • Vermeesch P (2006a) Tectonic discrimination diagrams revisited. Geochem Geophys Geosyst. doi:10.1029/2005GC001092

  • Vermeesch P (2006b) Tectonic discrimination of basalts with classification trees. Geochim Cosmochim Acta 70:1839–1848. doi:10.1016/j.gca.2005.12.016

  • Wood DA (1980) The application of a Th–Hf–Ta diagram to problems of tectonomagmatic classification and to establishing the nature of crustal contamination of basaltic lavas of the British Tertiary Volcanic Province. Earth Planet Sci Lett 50:11–30. doi:10.1016/0012-821X(80)90116-8

  • Yang Q, Li X, Shi X (2008) Cellular automata for simulating land use changes based on support vector machines. Comput Geosci 34:592–602. doi:10.1016/j.cageo.2007.08.003

  • Yu H, Yang J, Han J, Li X (2005) Making SVMs scalable to large data sets using hierarchical cluster indexing. Data Min Knowl Discov 11:295–321. doi:10.1007/s10618-005-0005-7

  • Zuo R, Carranza EJM (2011) Support vector machine: a tool for mapping mineral prospectivity. Comput Geosci 37:1967–1975. doi:10.1016/j.cageo.2010.09.014


Acknowledgments

We thank the editor (Prof. O. Müntener) and two anonymous reviewers for valuable comments and suggestions that contributed to increasing the quality of our manuscript. We also acknowledge Rebecca Astbury for proofreading the final version of the manuscript. This project was supported by the ERC Consolidator project “CHRONOS” (Grant No. 612776) and by the Microsoft Research Azure Award Program (Maurizio Petrelli: Azure Machine Learning Award).

Author information


Corresponding author

Correspondence to Maurizio Petrelli.

Additional information

Communicated by Othmar Müntener.

Appendices

Appendix A: mathematical principles of support vector machines

An extensive introduction to the mathematical principles of support vector machines can be found in Abedi et al. (2012) and Cortes and Vapnik (1995). To introduce the formulation of support vector machines, we first consider a two-class problem.

Consider a training dataset of S-dimensional samples (e.g., S chemical elements as input) \(x_{i}\), with i = 1, 2, 3, …, n, where n is the number of samples. A label \(y_{i}\) is assigned to each sample; \(y_{i}\) equals +1 for the first class and −1 for the second class.

If the two classes are linearly separable, there exists a set of linear separators that satisfy the following conditions (Kavzoglu and Colkesen 2009):

$$w \cdot x_{i} + b \ge +1 \quad \text{for} \quad y_{i} = +1$$
$$w \cdot x_{i} + b \le -1 \quad \text{for} \quad y_{i} = -1$$

As a consequence, the separating hyperplane can be formalized as a decision function:

$$f(x) = \operatorname{sgn}(w \cdot x + b)$$

with sgn(x) defined as follows:

$$\operatorname{sgn}(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0 \end{cases}$$

The parameters w and b can be obtained by solving the following optimization problem:

$$\text{minimize} \quad \tau(w) = \frac{1}{2}\lVert w \rVert^{2}$$

subject to:

$$y_{i}\left( w \cdot x_{i} + b \right) \ge 1, \quad i = 1, 2, 3, \ldots, n$$

An example of a two-dimensional problem in which two populations can be divided by a linear function is reported in Fig. 8a. However, there are problems where a nonlinear boundary separates the populations more efficiently (Fig. 8b).

Fig. 8 Simplified examples of 2D populations that can be separated by linear (a) and nonlinear (b) functions
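As an illustration of the linear two-class formulation above, the following sketch (assuming scikit-learn and synthetic 2D data, not the data of this study) fits a linear-kernel SVM and recovers the hyperplane parameters w and b, showing that sgn(w·x + b) reproduces the classifier's prediction.

```python
# Illustrative sketch of the linear two-class case on a synthetic 2D dataset.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
class_a = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(50, 2))   # y = -1
class_b = rng.normal(loc=[+2.0, +2.0], scale=0.5, size=(50, 2))   # y = +1
X = np.vstack([class_a, class_b])
y = np.hstack([-np.ones(50), np.ones(50)])

clf = SVC(kernel="linear", C=1e6)   # a large C approximates the hard margin
clf.fit(X, y)

# w and b define the separating hyperplane learned by the SVM.
w, b = clf.coef_[0], clf.intercept_[0]
new_point = np.array([1.5, 2.5])
print("sgn(w.x + b):", np.sign(w @ new_point + b))
print("clf.predict :", clf.predict([new_point])[0])
```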

In these cases, a projection function \(\phi(x)\) can be utilized to map the training data from the original space x to a Hilbert space X. This means that a nonlinear function is learned by a linear learning machine in a high-dimensional feature space, while the capacity of the system is controlled by a parameter that does not depend on the dimensionality of the space (Cristianini and Shawe-Taylor 2000). This is known as the “kernel trick”: the kernel function transforms the data into a higher-dimensional feature space in which a linear separation can be performed (Cortes and Vapnik 1995).

As reported by Abedi et al. (2012), the training algorithm in the Hilbert space depends on the data only through dot products, i.e., functions of the form \(\phi(x_{i}) \cdot \phi(x_{j})\). As a consequence, a kernel function K can be defined as follows:

$$K(x_{i}, x_{j}) = \phi(x_{i}) \cdot \phi(x_{j})$$
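The following toy example (a sketch, not part of the original study) illustrates the kernel trick for the homogeneous degree-2 polynomial kernel: evaluating the kernel in the original space gives the same value as the dot product of an explicit feature map \(\phi\), so the mapping never has to be computed explicitly.

```python
# Sketch of the kernel trick for the homogeneous degree-2 polynomial kernel.
import numpy as np

def phi(x):
    # Explicit feature map associated with the kernel K(u, v) = (u . v)^2.
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def K(u, v):
    # The same quantity computed directly in the original 2D space.
    return (u @ v) ** 2

u, v = np.array([1.0, 2.0]), np.array([0.5, -1.5])
print(phi(u) @ phi(v))   # dot product in the feature space: 6.25
print(K(u, v))           # identical value, without mapping:  6.25
```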

The two-class problem can also be solved in its dual form (El-Khoribi 2008):

$$\text{maximize} \quad \sum_{i = 1}^{n} \alpha_{i} - \frac{1}{2}\sum_{i,j = 1}^{n} \alpha_{i}\alpha_{j}\, y_{i} y_{j}\, K(x_{i}, x_{j})$$

subject to:

$$\alpha_{i} \ge 0, \quad i = 1, 2, 3, \ldots, n \quad \text{and} \quad \sum_{i = 1}^{n} \alpha_{i} y_{i} = 0$$

The decision function can now be rewritten as (Yang et al. 2008):

$$f(x) = \operatorname{sgn}\left( \sum_{i = 1}^{n} y_{i}\alpha_{i}\, K(x_{i}, x) \right)$$
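For concreteness, the sketch below (assuming scikit-learn and a linear kernel, not the classifiers of this study) evaluates this dual decision function from the quantities exposed by a fitted SVC; note that scikit-learn's decision function also adds the bias term b, which the expression above leaves implicit.

```python
# Sketch: evaluate the dual decision function from a fitted SVC.
# dual_coef_ stores y_i * alpha_i for the support vectors.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.5, 1.0, size=(40, 2)),
               rng.normal(+1.5, 1.0, size=(40, 2))])
y = np.hstack([-np.ones(40), np.ones(40)])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

x_new = np.array([0.3, 0.8])
# sum_i y_i * alpha_i * K(x_i, x_new), with K the linear kernel x_i . x_new
dual_sum = clf.dual_coef_[0] @ (clf.support_vectors_ @ x_new)
print("manual sgn(...):", np.sign(dual_sum + clf.intercept_[0]))
print("clf.predict    :", clf.predict([x_new])[0])
```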

Many functions can be utilized as \(K(x_{i}, x_{j})\) (Zuo and Carranza 2011). Among these, the radial basis function (RBF) kernel utilized in our work is defined as follows:

$$K(x_{i}, x_{j}) = e^{-\gamma \lVert x_{i} - x_{j} \rVert^{2}}$$
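A minimal check of this definition against the scikit-learn implementation (sklearn.metrics.pairwise.rbf_kernel) is sketched below; the value of γ and the sample vectors are arbitrary examples.

```python
# Sketch: the RBF kernel written out explicitly, checked against scikit-learn.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

gamma = 0.5                              # arbitrary example value
x_i = np.array([1.0, 2.0, 0.5])
x_j = np.array([0.0, 1.5, 1.0])

k_manual = np.exp(-gamma * np.sum((x_i - x_j) ** 2))
k_sklearn = rbf_kernel(x_i.reshape(1, -1), x_j.reshape(1, -1), gamma=gamma)[0, 0]
print(k_manual, k_sklearn)               # identical values

# The same kernel is selected in a classifier via SVC(kernel="rbf", gamma=0.5).
```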

As reported by Cortes and Vapnik (1995), support vector machines were originally developed for the solution of two-class problems, but many potential applications involve more than two classes (multiclass problems). The two most popular approaches to solving multiclass problems are One vs One (OVO) and One vs Rest (OVR) (Fig. 9). In OVO, one SVM classifier is built for each possible pair of classes (Fig. 9b, c) (Knerr et al. 1990; Dorffner et al. 2001). Each classifier outputs a class label, and the label returned most frequently is assigned to the sample (Hsu and Lin 2002). Since the number of SVMs required in this approach is M(M − 1)/2, it is not suitable for datasets characterized by a large number of classes (Dorffner et al. 2001).

Fig. 9 Exemplification of the OVR and OVO approaches in 2D for the discrimination among the three populations reported in (a). In OVO (b, c) each population is compared with each of the other populations separately (M − 1 times); in OVR (d) each population is compared against all the other populations mixed together

In contrast, in OVR one SVM is built for each of the M classes. The SVM for a particular class is trained using the examples from that class as positive examples and the examples from the remaining (M − 1) classes as negative examples (Fig. 9d).

In other words, in the OVO (One Vs One) approach, each population is compared with each other population, separately. In the OVR (One Vs Rest) approach, each population is compared with all the other populations mixed together, simultaneously.
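The sketch below (using scikit-learn's generic multiclass wrappers on synthetic data, not the classifiers of this study) makes the difference explicit: for M = 4 classes, OVO builds M(M − 1)/2 = 6 binary SVMs, while OVR builds M = 4.

```python
# Sketch of the two multiclass strategies with scikit-learn wrappers
# around a binary SVC, on a synthetic 4-class dataset.
from sklearn.datasets import make_blobs
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = make_blobs(n_samples=400, centers=4, random_state=0)   # M = 4 classes

ovo = OneVsOneClassifier(SVC(kernel="rbf", gamma="scale")).fit(X, y)
ovr = OneVsRestClassifier(SVC(kernel="rbf", gamma="scale")).fit(X, y)

print("OVO binary classifiers:", len(ovo.estimators_))   # M(M - 1)/2 = 6
print("OVR binary classifiers:", len(ovr.estimators_))   # M = 4

# Note: SVC alone already handles multiclass problems internally via OVO;
# its decision_function_shape argument only changes how decision values
# are reported, not how the underlying binary classifiers are trained.
```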

Appendix B: the logic behind classification

Figure 10 reports a flowchart showing the steps to be implemented to determine the tectonic environment of igneous rocks. The first step consists of verifying whether the learning process has already been performed. If no trained model is available, a new learning process is required. To complete this task, the reference dataset has to be normalized and split into two portions: the learning dataset and the test dataset. The role of the learning dataset is to train the system and develop a provisional model. The role of the test dataset is to assess the performance of the provisional model developed using the learning dataset; to do so, the samples belonging to the test dataset are evaluated as unknowns using the provisional model. If the validation process is completed successfully, the provisional model is promoted to a final model. Otherwise, the whole classification process is aborted and more detailed studies are required.

Fig. 10 Flowchart showing the steps to be performed when attempting to determine the tectonic setting of unknown samples
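A minimal sketch of this workflow is given below (assuming scikit-learn and synthetic stand-in data; the 0.90 acceptance threshold is an arbitrary placeholder, not a value from this study).

```python
# Sketch of the Fig. 10 workflow: normalize, split, train a provisional
# model, validate it, and only then classify the unknown samples.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for the labelled reference dataset and the unknown samples.
X_ref, y_ref = make_classification(n_samples=500, n_features=10,
                                   n_informative=6, n_classes=3,
                                   random_state=0)
X_unknown = X_ref[:5]   # placeholder for samples of unknown tectonic setting

X_learn, X_test, y_learn, y_test = train_test_split(
    X_ref, y_ref, test_size=0.3, stratify=y_ref, random_state=0)

provisional = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
provisional.fit(X_learn, y_learn)

if provisional.score(X_test, y_test) >= 0.90:       # validation step
    final_model = provisional                        # provisional -> final
    print("predicted settings:", final_model.predict(X_unknown))
else:
    print("validation failed: more detailed studies are required")
```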

When the final model is ready, the samples belonging to the unknown population are processed by the system. Results are then cross-validated using conventional techniques such as petrographic inspections, classical geochemical investigations and field observations.

Finally, if the provisional results are confirmed, the unknown samples can be safely assigned to a specific tectonic setting. Otherwise, further investigations are needed.


Cite this article

Petrelli, M., Perugini, D. Solving petrological problems through machine learning: the study case of tectonic discrimination using geochemical and isotopic data. Contrib Mineral Petrol 171, 81 (2016). https://doi.org/10.1007/s00410-016-1292-2
