Mixed data types and the use of pattern analysis on the Australian groundnut germplasm data

Harch, B. D.; Basford, K. E.; DeLacy, I. H.; Lawrence, P. K.; Cruickshank, A.

doi:10.1007/BF00132957

Regular Research Papers
Published: August 1996

Volume 43, pages 363–376, (1996)
Genetic Resources and Crop Evolution

B. D. Harch¹,⁴,⁵, K. E. Basford¹, I. H. DeLacy¹, P. K. Lawrence² & A. Cruickshank³

Abstract

Data in germplasm collections contain a mixture of data types: binary, multistate and quantitative. Given the multivariate nature of these data, the pattern analysis methods of classification and ordination have been identified as suitable techniques for statistically evaluating the available diversity. The proximity (or resemblance) measure, which is in part the basis of the complementary nature of classification and ordination techniques, is often specific to particular data types. The use of a combined resemblance matrix has an advantage over data-type-specific proximity measures: it accommodates the different data types without manipulating them to be of a specific type. Descriptors are partitioned into their data types and an appropriate proximity measure is used on each. The separate proximity matrices, after range standardisation, are combined as a weighted average, and the combined resemblance matrix is then used for classification and ordination.
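The combination step described above can be illustrated with a minimal sketch. The abstract does not name the proximity measure used for each data type, so the choices here are assumptions made purely for illustration: a simple matching dissimilarity for binary descriptors and a range-scaled mean absolute difference for ordered multistate and quantitative descriptors. The function names and the three-accession toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def simple_matching_dissimilarity(binary):
    """Proportion of binary descriptors on which two accessions disagree.
    (Assumed measure; the abstract does not specify one.)"""
    b = np.asarray(binary, dtype=float)
    return np.abs(b[:, None, :] - b[None, :, :]).mean(axis=2)

def range_scaled_difference(values):
    """Mean absolute difference after dividing each descriptor by its range.
    (Assumed measure for ordered multistate and quantitative descriptors.)"""
    x = np.asarray(values, dtype=float)
    span = x.max(axis=0) - x.min(axis=0)
    span[span == 0] = 1.0  # a constant descriptor then contributes zero
    z = (x - x.min(axis=0)) / span
    return np.abs(z[:, None, :] - z[None, :, :]).mean(axis=2)

def range_standardise(d):
    """Rescale a proximity matrix so that its entries span [0, 1]."""
    d = np.asarray(d, dtype=float)
    lo, hi = d.min(), d.max()
    return np.zeros_like(d) if hi == lo else (d - lo) / (hi - lo)

def combined_resemblance(matrices, weights):
    """Weighted average of range-standardised proximity matrices."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the result is an average
    return sum(wi * range_standardise(m) for wi, m in zip(w, matrices))

# Hypothetical toy data: three accessions, descriptors split by data type.
binary = [[1, 0, 1, 1], [1, 0, 0, 1], [0, 1, 0, 0]]
multistate = [[1, 3], [2, 3], [5, 1]]
quantitative = [[10.0, 2.5], [12.0, 2.0], [30.0, 5.5]]

matrices = [
    simple_matching_dissimilarity(binary),
    range_scaled_difference(multistate),
    range_scaled_difference(quantitative),
]

# Equal weighting of the three data types; unequal weights are passed the same way.
combined = combined_resemblance(matrices, weights=[1, 1, 1])
print(np.round(combined, 3))
```

Passing different values in `weights` (for example, weights proportional to the number of descriptors of each type) gives an unequal weighting of data types under the same assumptions.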

Germplasm evaluation data for 831 accessions of groundnut (Arachis hypogaea L.) from the Australian Tropical Field Crops Genetic Resource Centre, Biloela, Queensland were examined. Data for four binary, five ordered multistate and seven quantitative descriptors have been documented. The interpretative value of different weightings—equal and unequal weighting of data types to obtain a combined resemblance matrix—was investigated by using principal co-ordinate analysis (ordination) and hierarchical cluster analysis.
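As a rough illustration of the ordination step, the sketch below implements classical principal co-ordinate analysis on a dissimilarity matrix using double centring and an eigendecomposition. It is a generic textbook formulation, not necessarily the implementation used by the authors, and the function name `principal_coordinates` and the four-point toy matrix are hypothetical.

```python
import numpy as np

def principal_coordinates(d, n_axes=2):
    """Classical principal co-ordinate analysis of a dissimilarity matrix.

    Returns the co-ordinates on the first n_axes axes and the
    corresponding eigenvalues.
    """
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities.
    b = -0.5 * centring @ (d ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(b)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_axes]
    # Negative eigenvalues (non-Euclidean dissimilarities) are clipped to zero.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale, eigvals[order]

# Hypothetical toy matrix: distances between four points on a line at 0, 1, 3, 6.
positions = np.array([0.0, 1.0, 3.0, 6.0])
toy = np.abs(positions[:, None] - positions[None, :])

coords, eigvals = principal_coordinates(toy, n_axes=2)
print(np.round(coords, 3))
```

The same dissimilarity matrix could also be passed to a hierarchical clustering routine, for example `scipy.cluster.hierarchy.linkage` applied to the condensed form produced by `scipy.spatial.distance.squareform`; the clustering strategy would be a further assumption, since the abstract does not state which one was used.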

Equal weighting of data types was found to be more valuable for these data as the results provided a greater insight into the patterns of variability available in the Australian groundnut germplasm collection. The complementary nature of pattern analysis techniques enables plant breeders to identify relevant accessions in relation to the descriptors which distinguish amongst them. This additional information may provide plant breeders with a more defined entry point into the germplasm collection for identifying sources of variability for their plant improvement program, thus improving the utilisation of germplasm resources.

Author information

Authors and Affiliations

Department of Agriculture, The University of Queensland, 4072, Brisbane, Qld, Australia
B. D. Harch, K. E. Basford & I. H. DeLacy
Australian Tropical Field Crops Genetic Resource Centre, P.O. Box 201, 4715, Biloela, Qld, Australia
P. K. Lawrence
J. Bjelke-Petersen Research Station, P.O. Box 23, 4610, Kingaroy, Qld, Australia
A. Cruickshank
Institute of Natural Resources, CSIRO, PMB No. 2, 5064, Glen Osmond, SA, Australia
B. D. Harch
Environment Biometrics Unit, CSIRO, PMB No. 2, 5064, Glen Osmond, SA, Australia
B. D. Harch

About this article

Cite this article

Harch, B.D., Basford, K.E., DeLacy, I.H. et al. Mixed data types and the use of pattern analysis on the Australian groundnut germplasm data. Genet Resour Crop Evol 43, 363–376 (1996). https://doi.org/10.1007/BF00132957

Received: 12 April 1995
Accepted: 18 September 1995
Issue Date: August 1996
DOI: https://doi.org/10.1007/BF00132957
