An appraisal of an iterative construction of the endmembers controlling the composition of deep-sea manganese nodules from the Central Indian Ocean Basin

Journal of Earth System Science

Abstract

This paper describes an estimation of endmember compositions, followed by an assessment of those results by log-ratio variance analysis. As an appraisal, it deals only with the first objective of an endmember analysis, namely, to identify endmembers, if they exist, by estimating their compositions. Following the creation of the endmember estimates, the computation of an array of log-ratio variances was a key innovation in this type of study. Log-ratio variances revealed intrinsic linear associations between the dominant elements on each of the estimated endmember compositions, largely confirming the endmember analysis. The dataset under study contained the concentrations of 16 elements in 93 samples of deep-sea manganese nodules from the Central Indian Ocean Basin. Many previous analyses of these nodules were undertaken to assess the economic potential of the deposits. This study, by contrast, quantified the inter-element associations that account for the nodule compositions. Four endmembers were identified. The elements loaded on each were: (1) Mn, Zn, Ni, Cu: Mn-rich; (2) Fe, Ti, P, Co: Fe-rich; (3) Si, Al, Na, K: clay minerals; (4) Mg: ultramafic material, possibly including Mn, Cr, V, Ca, Na. These latter elements were also detected by their log-ratio variances to be associated with Mg on the fourth endmember.

References

  • Aitchison J 1986 The Statistical Analysis of Compositional Data; Chapman and Hall, London.

  • Aitchison J and Bacon-Shone J 1999 Convex linear combinations of compositions; Biometrika 86 351–364.

  • Chayes F 1960 On correlation between variables of constant sum; J. Geophys. Res. 65 4185–4193.

  • Chayes F 1962 Numerical correlation and petrographic variation; J. Geol. 70 440–552.

  • Chayes F 1983 Detecting nonrandom associations between proportions by tests of remaining space variables; Math. Geol. 15 197–206.

  • Chen J C and Owen R M 1989 The hydrothermal component in ferromanganese nodules from the southeast Pacific Ocean; Geochim. Cosmochim. Acta 53 1299–1305.

  • Comas M and Thió-Henestrosa S 2011 CoDaPack 2.0: A stand-alone, multi-platform compositional software; In: Proceedings of the 4th International Workshop on Compositional Data Analysis (eds) Egozcue J J, Tolosana-Delgado R and Ortego M I, pp. 1–10.

  • Dobigeon N, Moussaoui N, Coulon M, Tourneret J and Hero A O 2009 Signal processing; IEEE Trans. Signal Process. 57 4355–4368.

  • Full W E, Ehrlich R and Klovan J E 1981 Extended Qmodel – objective definition of external endmembers in the analysis of mixtures; Math. Geol. 13 331–344.

  • Imbrie J and Van Andel T H 1964 Vector analysis of heavy mineral data; Geo. Soc. Am. Bull. 76 1131–1156.

  • Jauhari P and Pattan J N 2000 Ferromanganese nodules from the Central Indian Basin; In: Handbook of Marine Mineral Deposits (ed.) Cronan D S (Boca Raton: CRC Press), pp. 171–195.

  • Jauhari P and Iyer S D 2008 A comprehensive view of manganese nodules and volcanics of the Central Indian Ocean Basin; Mar. Georesour. Geotechnol. 26 231–258.

  • Kaiser H F 1958 The varimax criterion for analytic rotation in factor analysis; Psychometrika 23 187–200.

  • Kaiser H F 1959 Computer program for varimax rotation in factor analysis; Educ. Psychol. Meas. 19 413–420.

  • Leinen M and Pisias N 1984 An objective technique for determining end-member compositions and for partitioning sediments according to their sources; Geochim. Cosmochim. Acta 48 47–62.

  • Menke W 1984 Geophysical data analysis: Discrete inverse theory; Academic Press, Orlando.

  • Miesch A T 1976 Q-mode factor analysis of compositional data; Comput. Geosci. 1 147–159.

  • Mukhopadhyay R, Ghosh A K and Iyer S D 2008 The Indian Ocean Nodule Field: Geology and Resource Potential; Elsevier, Amsterdam.

  • Palmer M J and Douglas G B 2008 A Bayesian statistical model for endmember analysis of sediment geochemistry, incorporating spatial dependencies; Appl. Stat. J. Roy. St. Series C 57 313–327.

  • Pawlowsky-Glahn V and Egozcue J J 2006 Compositional data and their analysis: An introduction; In: Compositional Data Analysis in the Geosciences: From Theory to Practice (eds) Buccianti A, Mateu-Figueras G and Pawlowsky-Glahn V; Geol. Soc. London, Spec. Publ. 264 1–10.

  • Pearson K 1897 Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs; Proc. Roy. Soc. London 60 489–498.

  • Renner R M 1988 On the resolution of compositional datasets into convex combinations of extreme vectors; Technical Report 88/02; Institute of Statistics and Operations Research, Victoria University of Wellington, Wellington, New Zealand.

  • Renner R M 1989 On the resolution of compositional datasets into convex combinations of extreme vectors; Ph.D. Thesis, Victoria University of Wellington, Wellington, New Zealand.

  • Renner R M 1991 An examination of the use of the logratio transformation for the testing of endmember hypotheses; Math. Geol. 23 549–563.

  • Renner R M 1993 The resolution of a compositional dataset into mixtures of fixed source compositions; Appl. Stat. J. Roy. St. Series C 42 615–631.

  • Renner R M 1995 The construction of extreme compositions; Math. Geol. 27 485–497.

  • Renner R M 1996 An algorithm for computing extreme compositions; Comput. Geosci. 22 15–25.

  • Renner R M 2012 Statistical comparisons of heavy metal pollutants between seven regions of the Polish exclusive economic zone; Environ. Earth Sci. 67 987–997.

  • Renner R M, Glasby G P and Walter P 1997 Endmember analysis of metalliferous sediments from the Galapagos Rift and East Pacific Rise between 2°N and 42°S; Appl. Geochem. 12 383–395.

  • Renner R M, Glasby G P and Szefer P 1998 Endmember analysis of heavy-metal pollution in surficial sediments from the Gulf of Gdansk and the southern Baltic Sea off Poland; Appl. Geochem. 13 313–318.

  • Sarkar C, Iyer S D and Hazra S 2008 Inter-relationship between nuclei and gross characteristics of manganese nodules, Central Indian Ocean Basin; Mar. Georesour. Geotechnol. 26 259–289.

  • Valsangkar A B 2001 Mineral resources; In: The Indian Ocean: A Perspective, Volume 2, Oxford & IBH Publishing Co. Pvt. Ltd, New Delhi, pp. 585–643.

  • Vineesh T C, Nath B C, Banerjee R, Jaisankar S and Lekshmi V 2009 Manganese nodule morphology as indicators for oceanic processes in the Central Indian Basin; Int. Geol. Rev. 51 27–44.

  • Weltje G J 1997 End-member modeling of compositional data: Numerical-statistical algorithms for solving the explicit mixing problem; Math. Geol. 29 503–549.

  • Zare A and Gader P 2011 An investigation of likelihoods and priors for Bayesian endmember estimation; Am. Inst. Physics Conf. Proc. 1305 311–318.

Acknowledgements

The authors thank Prof. J H Johnston, Victoria University of Wellington, who confirmed the identity of the endmembers; Dr B N Nath, National Institute of Oceanography, Goa, India, for supplying the primary data; and Dr Peter Beatty, Oncology Department, Whangarei Hospital, New Zealand, without whose clinical care this paper would not have been written.

Author information

Corresponding author

Correspondence to R M Renner.

Appendix

A given geochemical dataset, denoted by \(\{x_{ij}\}\), consists of measurements of the concentrations of \(p\) elements in each of \(n\) samples (e.g., rock, sediment). That is, \(i = 1, 2, \ldots, n\) and \(j = 1, 2, \ldots, p\), so that \(x_{ij}\) is the concentration of the \(j\)th element in the \(i\)th sample. Accordingly, \(\{x_{ij}\}\) represents an \(n \times p\) rectangular array, with the observation vectors of the samples laid out along the rows and the elements down the columns.

When the concentrations are all measured on the same scale and \(x_{i1} + x_{i2} + \cdots + x_{ip} = C\) (a fixed constant) for all \(i\), each row of the array \(\{x_{ij}\}\) is a composition. Suppose \(C = 100\%\), the \(p\)th (or any other) component is excluded from every sample, and the remaining components are then rescaled to sum to 100%. Each row of the \(n \times (p-1)\) array \(\{y_{ij}\}\) so formed is a subcomposition. In the case that it is \(x_{ip}\) that is excluded for all \(i\), the subcompositional component corresponding to \(x_{ij}\) is \(y_{ij} = a_i x_{ij}\), where \(a_i = 100/(100 - x_{ip})\) for each \(i\). So the ratio of any two components of a subcomposition, \(y_{ij}/y_{ik}\), is \(a_i x_{ij}/(a_i x_{ik}) = x_{ij}/x_{ik}\), the same ratio as that of the corresponding components in the full composition. (Indeed, this result would be true for any other nonzero value of \(a_i\).)
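The preservation of ratios under subcomposition can be checked numerically. The following is a minimal Python sketch rather than anything taken from the paper; the function name `subcomposition` and the three-part sample values are hypothetical and chosen only for illustration.

```python
def subcomposition(row, drop, total=100.0):
    """Drop component `drop` from one composition and rescale the rest to `total`."""
    kept = [x for j, x in enumerate(row) if j != drop]
    # For a row summing to 100 this scale equals a_i = 100 / (100 - x_ip).
    scale = total / sum(kept)
    return [scale * x for x in kept]


# Hypothetical three-part composition in percent (illustrative values only).
x = [50.0, 30.0, 20.0]
y = subcomposition(x, drop=2)

ratio_full = x[0] / x[1]  # ratio in the full composition
ratio_sub = y[0] / y[1]   # ratio of the same two parts in the subcomposition
print(y, ratio_full, ratio_sub)
```

The two printed ratios agree up to floating-point rounding, because the common factor \(a_i\) cancels in the quotient.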

Abundance data is typically compositional, being expressed as percentages, ppm, etc. To overcome the well-documented difficulties associated with the statistical analysis of such data, Aitchison (1986) showed that, by transforming the data to log-ratios, typically \(v_{ij} = \log(x_{ij}/x_{ip})\), \(j = 1, 2, \ldots, p-1\), it was possible in some cases to apply traditional multivariate statistical methodology to the \(n \times (p-1)\) array \(\{v_{ij}\}\). In addition, Aitchison (1986) showed that, given certain assumptions on the evolution of the data, \(v_{ij} = \log(x_{ij}/x_{ip})\) is a component of a multivariate normal distribution.
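As a small illustration of the transformation \(v_{ij} = \log(x_{ij}/x_{ip})\), the sketch below applies it to one sample, using the last part as the divisor. The function name `alr` and the data are assumptions made for this example; all parts must be strictly positive, since the logarithm of zero is undefined.

```python
import math


def alr(row):
    """Log-ratios v_j = log(x_j / x_p) of one sample, with the last part as divisor."""
    divisor = row[-1]
    return [math.log(x / divisor) for x in row[:-1]]


# Hypothetical three-part composition in percent (illustrative values only).
x = [50.0, 30.0, 20.0]
v = alr(x)

# Rescaling the whole sample (percent to ppm) leaves every log-ratio unchanged.
v_ppm = alr([10_000 * c for c in x])
print(v, v_ppm)
```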

A.1 Aitchison compositional variation array

If \(x_{ik}\) and \(x_{il}\) are the \(k\)th and \(l\)th concentrations in the \(i\)th sample, and \(w_{ikl} = \log(x_{ik}/x_{il})\), then the log-ratio mean \(\overline{w}_{kl}\) and the log-ratio variance (LRV) \(s_{kl}^{2}\) of the \(w_{ikl}\), over all \(n\) samples, are given by:

$$ \overline{w}_{kl} = \tfrac{1}{n}\sum\limits_{i=1}^{n} w_{ikl} $$
(1)
$$ s_{kl}^{2} = \tfrac{1}{n-1}\sum\limits_{i=1}^{n} (w_{ikl} - \overline{w}_{kl})^{2} $$
(2)

for \(k, l = 1, 2, \ldots, p\) in equations (1) and (2). The square \(p \times p\) array \(\{\overline{w}_{kl}\}\) is anti-symmetric, since \(w_{ikl} = -\log(x_{il}/x_{ik}) = -w_{ilk}\) for all \(i\), and so \(\overline{w}_{kl} = -\overline{w}_{lk}\). The diagonal entries of \(\{\overline{w}_{kl}\}\) are all zero because, when \(l = k\), \(x_{ik}/x_{ik} = 1\) and \(\log(1) = 0\). Similarly, the diagonal entries of the \(p \times p\) array \(\{s_{kl}^{2}\}\) are also zero. There is no limit on the magnitudes of the off-diagonal entries of \(\{s_{kl}^{2}\}\) except that they must be \(\ge 0\). Because \(w_{ikl} - \overline{w}_{kl} = -(w_{ilk} - \overline{w}_{lk})\), the squared deviations in equation (2) are unchanged when \(k\) and \(l\) are interchanged, so \(\{s_{kl}^{2}\}\) is symmetric. Aitchison (1986) defined the Compositional Variation Array to be the \(p \times p\) array that contains the values of \(\{s_{kl}^{2}\}\) above its diagonal and the values of \(\{\overline{w}_{kl}\}\) below. (The diagonal entries are left blank.) This array is computed by the free compositional data processing software package CoDaPack (Comas and Thió-Henestrosa 2011).
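Equations (1) and (2) translate directly into code. The sketch below is an illustrative Python version, not CoDaPack's implementation; the function name `variation_array` and the four hypothetical samples are assumptions. It returns the full means and variances arrays, from which the upper triangle of the variances and the lower triangle of the means would make up the Compositional Variation Array.

```python
import math


def variation_array(data):
    """Return (means, variances) of w_ikl = log(x_ik / x_il) for every pair (k, l)."""
    n, p = len(data), len(data[0])
    means = [[0.0] * p for _ in range(p)]
    variances = [[0.0] * p for _ in range(p)]
    for k in range(p):
        for l in range(p):
            w = [math.log(row[k] / row[l]) for row in data]
            mean = sum(w) / n                                               # equation (1)
            means[k][l] = mean
            variances[k][l] = sum((wi - mean) ** 2 for wi in w) / (n - 1)  # equation (2)
    return means, variances


# Hypothetical samples (rows) by elements (columns), in percent.
# The first two columns are kept in a fixed 2:1 ratio on purpose.
data = [
    [48.0, 24.0, 28.0],
    [40.0, 20.0, 40.0],
    [30.0, 15.0, 55.0],
    [20.0, 10.0, 70.0],
]
means, variances = variation_array(data)
print(variances[0][1], variances[0][2])
```

In this constructed example the LRV of the first two columns is zero up to rounding, while the LRV of either with the third column is clearly positive.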

It is evident that the mean and variance of the log-ratios of two elements in any subcomposition are equal to those for the same elements in the full composition, since the corresponding ratios are equal. This is a consequence of obtaining the \(i\)th row of \(\{y_{ij}\}\) by scaling all the elements of the \(i\)th row by a row-specific constant \(a_i\). A similar result, obtained by scaling the columns of \(\{x_{ij}\}\) (or \(\{y_{ij}\}\)), applies to just the LRV, \(s_{kl}^{2}\). Suppose all the concentrations of \(\{x_{ij}\}\) are percentages; then, with \(b_k = 10{,}000\), the term \(b_k x_{ik}\) is in ppm for all \(i\). That is, the \(k\)th column of \(\{x_{ij}\}\) is now scaled by the constant \(b_k\).

Let \(u_{ikl} = \log(b_k x_{ik}/x_{il})\). Then \(u_{ikl} = \log(b_k) + \log(x_{ik}/x_{il})\), that is, \(u_{ikl} = \log(b_k) + w_{ikl}\). So the mean \(\overline{u}_{kl}\) of the \(u_{ikl}\) is given by:

$$ \overline{u}_{kl} = \tfrac{1}{n}\sum\limits_{i=1}^{n} (\log(b_k) + w_{ikl}) = \log(b_k) + \overline{w}_{kl}. $$
(3)

Hence, in the expression for the LRV \(t_{kl}^{2}\) of the \(u_{ikl}\), the deviation from the mean is \((u_{ikl} - \overline{u}_{kl}) = (w_{ikl} - \overline{w}_{kl})\), since \(\log(b_k)\) cancels, so that \(t_{kl}^{2}\) becomes:

$$ t_{kl}^{2} = \tfrac{1}{n-1}\sum\limits_{i=1}^{n} (u_{ikl} - \overline{u}_{kl})^{2} = \tfrac{1}{n-1}\sum\limits_{i=1}^{n} (w_{ikl} - \overline{w}_{kl})^{2} = s_{kl}^{2} $$
(4)

which is the variance of the \(w_{ikl}\). Summing up, the LRV between two elements is the same for a composition and its subcompositions (i.e., it is invariant to row transformations). This property is called subcompositional coherence. It follows that the LRV remains invariant even in the special case where two-part subcompositions of the form \(X_1 + X_2 = 100\%\) are created by reducing all samples to just the two elements \(X_1\), \(X_2\). By contrast, the correlation coefficient between \(X_1\) and \(X_2\) in that case would necessarily be \(-1\) (or possibly undefined), irrespective of the relationship between \(X_1\) and \(X_2\). The LRV is also invariant to differences in the scales of measurement of the elements (column transformations). In particular, they need not be on the same scale. That also happens to be a property of the (Pearson) correlation coefficient. The fundamental difference is that the LRV is invariant to both row and column transformations. Such results would seem to imply that it measures a basic underlying relationship between any two elements of a given dataset.
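Both invariance properties, and the contrast with the correlation coefficient, can be illustrated with a short numerical check. This is a sketch under stated assumptions: the helper names `log_ratio_variance` and `pearson` and the two hypothetical element columns are invented for the example.

```python
import math


def log_ratio_variance(xs, ys):
    """Sample variance (divisor n - 1) of log(x_i / y_i), as in equation (2)."""
    w = [math.log(a / b) for a, b in zip(xs, ys)]
    mean = sum(w) / len(w)
    return sum((wi - mean) ** 2 for wi in w) / (len(w) - 1)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical concentrations of two elements in four samples (percent).
x1 = [12.0, 18.0, 25.0, 31.0]
x2 = [5.0, 9.0, 10.0, 16.0]

lrv_full = log_ratio_variance(x1, x2)
# Column transformation: express the first element in ppm (b_k = 10,000).
lrv_ppm = log_ratio_variance([10_000 * a for a in x1], x2)
# Row transformation: close each sample to a two-part subcomposition summing to 100.
y1 = [100 * a / (a + b) for a, b in zip(x1, x2)]
y2 = [100 * b / (a + b) for a, b in zip(x1, x2)]
lrv_sub = log_ratio_variance(y1, y2)

r_full = pearson(x1, x2)  # positive for these raw values
r_sub = pearson(y1, y2)   # forced to -1 by the closure y1 + y2 = 100
print(lrv_full, lrv_ppm, lrv_sub, r_full, r_sub)
```

The three LRV values coincide up to rounding, whereas the correlation changes from a positive value to \(-1\) once the samples are closed to two parts.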

A.2 Intrinsic linear associations

The identification of intrinsic associations between the elements of geochemical datasets is vital to understanding the makeup of minerals, sediments, and other geological substances. A strong intrinsic association between Al and Si, for example, would indicate the presence of alumino-silicates. Traditionally, the Pearson correlation coefficient has been widely misused in the geosciences as a measure of linear association, even though it has been well documented for over a century that the correlation coefficients between elements of a compositional dataset are spurious and therefore may or may not have any geochemical relevance (see Pearson 1897; Aitchison 1986; Pawlowsky-Glahn and Egozcue 2006). Moreover, unlike the LRV, the correlation coefficient lacks subcompositional coherence. In fact, the correlations between a corresponding pair of elements in a composition and in its subcompositions are not necessarily equal and may actually contradict each other, one being large and positive, the other equally large and negative (Renner 2012).

The potentially useful property of the LRV, \(s_{kl}^{2}\), is that if there were a perfect linear relation between two elements, that is, \(x_{ik}/x_{il} = m\ (>0)\), a fixed constant of proportionality for all \(i\), then \(s_{kl}^{2} = 0\). In this elementary situation, \(w_{ikl} = \log(x_{ik}/x_{il}) = \log(m)\). Hence \(\overline{w}_{kl} = \log(m)\), so \((w_{ikl} - \overline{w}_{kl})^{2} = 0\) for all \(i\), and hence \(s_{kl}^{2} = 0\). Geochemical data is, of course, notoriously noisy. Even if two elements were bound up in all the samples in the way that Al and Si are in an alumino-silicate, their ratios (e.g., \(x_{i\mathrm{Al}}/x_{i\mathrm{Si}}\)) would not be precisely constant. Moreover, the presence of these elements from more than one source mineral increases the variability in the log-ratio. Nonetheless, in the expression for \(s_{kl}^{2}\), all the terms \((w_{ikl} - \overline{w}_{kl})^{2} \ge 0\), so a value of \(s_{kl}^{2}\) close to zero would imply that all such terms are close to zero, and hence that the \(k\)th and \(l\)th elements of the dataset have an approximately linear, intrinsic association. When a subset of the elements of the dataset all have mutually pairwise LRVs close to zero, their joint intrinsic associations would indicate that the elements of that subset form a suite, such as the group of dominant elements in an endmember. The software package CoDaPack (Comas and Thió-Henestrosa 2011) outputs the Compositional Variation Array with the lowest values of the LRVs highlighted (in blue), thereby expediting the identification of related elements.
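The screening for element pairs with LRVs close to zero can be sketched as follows. This is only an illustration: the function `low_lrv_pairs`, the cut-off of 0.01, and the concentrations are all hypothetical, and the paper does not prescribe a numerical threshold.

```python
import math
from itertools import combinations


def log_ratio_variance(xs, ys):
    """Sample variance (divisor n - 1) of log(x_i / y_i)."""
    w = [math.log(a / b) for a, b in zip(xs, ys)]
    mean = sum(w) / len(w)
    return sum((wi - mean) ** 2 for wi in w) / (len(w) - 1)


def low_lrv_pairs(columns, threshold):
    """List element pairs whose LRV is below `threshold`, smallest LRV first."""
    pairs = []
    for a, b in combinations(columns, 2):
        lrv = log_ratio_variance(columns[a], columns[b])
        if lrv < threshold:
            pairs.append((a, b, lrv))
    return sorted(pairs, key=lambda item: item[2])


# Hypothetical concentrations: Al and Si are nearly proportional, Mn is not.
columns = {
    "Al": [4.0, 6.1, 7.9, 10.2],
    "Si": [8.1, 12.0, 16.2, 19.8],
    "Mn": [30.0, 12.0, 25.0, 8.0],
}
pairs = low_lrv_pairs(columns, threshold=0.01)  # arbitrary illustrative cut-off
print(pairs)
```

With these invented values only the Al and Si pair falls below the cut-off, which is the pattern one would look for when grouping elements into a suite.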

A.3 An alternative correlation matrix

The array of log-ratio variances can be transformed into an array with the same symmetric layout as the correlation matrix, except that all entries would be non-negative and still \(\le 1\). For example, \(\exp(-\text{LRV})\) and \(\exp(-\sqrt{\text{LRV}})\) are equal to one for \(\text{LRV} = 0\), and they both tend to zero with increasing LRV. In either case, a value of one signifies a perfect linear association, accounting for the ones on the diagonal of the array, while a value of zero would signify an absence of linear association. An unresolved problem is to determine the statistical properties of any such transformation. In the case of the Pearson correlation coefficient, the statistical properties are known when a bivariate pair is assumed to be bivariate normal. Ironically, this is never the case for compositional data, despite the widespread citation of p-values obtained from the correlation matrices output by computer packages. It is clear that the properties of the LRV described here apply to all multivariate data, and not just compositional data. Hence empirical studies involving specified statistical distributions could compile, for comparison, the results of Principal Components and other multivariate analyses based on both correlation and LRV matrices. So, until the theoretical issues are resolved, such empirical studies could provide information on the behaviour of the LRV under differing initial conditions.
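The two transformations mentioned above are easy to apply to an LRV array. The sketch below is illustrative only: the function name `similarity_from_lrv` and the 3 by 3 array of LRV values are hypothetical, and nothing is assumed about the statistical properties of the result.

```python
import math


def similarity_from_lrv(lrv_matrix, use_sqrt=False):
    """Map each LRV to exp(-LRV), or to exp(-sqrt(LRV)) when use_sqrt is True."""
    result = []
    for row in lrv_matrix:
        if use_sqrt:
            result.append([math.exp(-math.sqrt(v)) for v in row])
        else:
            result.append([math.exp(-v) for v in row])
    return result


# Hypothetical symmetric array of log-ratio variances with a zero diagonal.
lrv = [
    [0.00, 0.04, 2.25],
    [0.04, 0.00, 1.96],
    [2.25, 1.96, 0.00],
]
sim = similarity_from_lrv(lrv)
sim_sqrt = similarity_from_lrv(lrv, use_sqrt=True)
print(sim[0][1], sim_sqrt[0][1])
```

Both outputs have ones on the diagonal and off-diagonal entries strictly between zero and one, with the low-LRV pair mapped closest to one.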

Cite this article

Renner, R.M., Nath, B.N. & Glasby, G.P. An appraisal of an iterative construction of the endmembers controlling the composition of deep-sea manganese nodules from the Central Indian Ocean Basin. J Earth Syst Sci 123, 1399–1411 (2014). https://doi.org/10.1007/s12040-014-0469-1
