Multiple Data Analyses and Statistical Approaches for Analyzing Data from Metagenomic Studies and Clinical Trials
Metagenomics, also known as environmental genomics, is the study of the genomic content of a sample of organisms (microbes) obtained from a common habitat. Metagenomics and other “omics” disciplines have captured the attention of researchers for several decades. The effect of microbes in our body is a relevant concern for health studies. There are plenty of studies using metagenomics which examine microorganisms that inhabit niches in the human body, sometimes causing disease, and are often correlated with multiple treatment conditions. No matter from which environment it comes, the analyses are often aimed at determining either the presence or absence of specific species of interest in a given metagenome or comparing the biological diversity and the functional activity of a wider range of microorganisms within their communities. The importance increases for comparison within different environments such as multiple patients with different conditions, multiple drugs, and multiple time points of same treatment or same patient. Thus, no matter how many hypotheses we have, we need a good understanding of genomics, bioinformatics, and statistics to work together to analyze and interpret these datasets in a meaningful way. This chapter provides an overview of different data analyses and statistical approaches (with example scenarios) to analyze metagenomics samples from different medical projects or clinical trials.
Key words: Metagenomics, Metatranscriptomics, Microbiome, Clinical trials, Comparative metagenomics
The diversity of species on earth is high, and most of them are microorganisms. Their ubiquitous presence makes it extremely difficult to identify and classify all microbes in a laboratory environment. Standard genomics tries to enrich pure cultures and study them: for example, the taxonomy, the genome, the genes, and the pathways. However, only a minuscule fraction of all microbes can be cultured, because of their complex symbioses with other organisms and their nutrient requirements. With the development of new sequencing techniques and high-throughput analysis, the scientific community is now equipped to go beyond cultures. Metagenomics, also known as environmental genomics, makes it possible to study the genomic content of a sample of microorganisms obtained from a common habitat. Instead of taking the DNA for sequencing from isolated cultures, it is obtained directly from the environment. Therefore, the analysis of microbes that are deemed unculturable (that is, microbes that current laboratory culturing techniques are unable to grow) becomes possible.
Two main approaches are commonly used in metagenomic studies: marker gene-based metagenomics (e.g., 16S amplicon sequencing) and metagenomic shotgun sequencing. In the first approach, DNA is used as the template for PCR to amplify a segment of the conserved 16S ribosomal RNA (rRNA) gene sequence. Universal primers complementary to conserved regions are used so that the region can be amplified from any bacterium. After purification of the PCR products, the 16S rRNA gene is sequenced. In the second approach, shotgun sequencing, DNA is broken up randomly into multiple small segments, which are sequenced using the chain termination method to obtain reads. Multiple overlapping reads for the target DNA are obtained by performing several rounds of this fragmentation and sequencing. Computer programs then use the overlapping ends of different reads to assemble them into a continuous sequence.
Several publications discuss the differences in microbial biodiversity discovery between 16S amplicon and shotgun sequencing. In a recent study using water samples from Brazil’s major river floodplain systems, the authors showed that amplicon sequencing outperformed shotgun sequencing. They ascribed the poor performance of shotgun sequencing mainly to the weakness of the database used in the study, as compared to databases for the 16S rRNA gene. This study can serve as a caution for people working with rare environments (see the article by Catherine Offord in The Scientist). Comparisons of the two methods in well-studied systems such as the gut microbiome have generally found that shotgun sequencing identifies more microbial diversity.
Furthermore, recent advances in culturomics are shedding light on multiple high-throughput culture conditions [7, 8]. As the samples used in metagenomics contain the genomes of many different microorganisms rather than just one, it becomes possible to analyze their functional and metabolic interplay. Next-generation sequencing (NGS) technology has transformed infectious disease research over the last decade, fuelling the growth in genetic data and providing huge numbers of DNA reads at an affordable cost. Many studies use these techniques to examine microorganisms that inhabit niches in the human body, sometimes causing disease, and researchers often try to correlate these microorganisms and their changes with multiple treatment conditions. Gene annotations in these studies support the association of specific genes or metabolic pathways with health and with specific diseases. A recent article discussed how host gene–microbial interactions are major determinants for the development of multifactorial chronic disorders and thus for the relationship between genotype and phenotype. There are many other reports on the application of metagenomics in understanding oral health and disease [11, 12, 13]. As recently described by Forbes et al., metagenomics and other “omics” disciplines could provide the solution to a cultureless future in clinical microbiology, food safety, and public health.
No matter which environment the samples come from, the analyses of datasets from such studies are similar to some extent. Most projects aim either to determine the presence or absence of specific species of interest, or to obtain an overview of the taxa represented in a given metagenome and compare the biological diversity and functional activity of a wider range of microorganisms within their communities. The importance increases when different datasets are compared, as researchers need to determine and understand the similarities and dissimilarities among the metagenomes of different environments. These environments can be multiple patients with different conditions, multiple drugs, or multiple time points of the same treatment or the same patient. Researchers may also compare different environments, for example, to study antibiotic resistance genes (ARGs) and understand which environments are more prone to them. Thus, no matter how many hypotheses we have, we need a good understanding of genomics, bioinformatics, and statistics working together to analyze and interpret these datasets in a meaningful way.
This chapter provides an overview of different data analyses and statistical approaches for analyzing metagenomics samples from a number of clinically derived datasets. The methodological description in this chapter is guided by three main scenarios. The first is a published data set from human atherosclerotic plaque samples (Scenario 1); the second is a clinical trial example comparing the effects of two omega-3 polyunsaturated fatty acid (PUFA) supplements on healthy volunteers (Scenario 2); and the third is another clinical trial example comparing the efficacy of two drugs for an infectious disease (Scenario 3).
Scenario 3 comes from an ongoing unpublished project; therefore, the real datasets are not provided. This chapter focuses on the data analysis, annotation, and statistical approaches that can be used in similar situations; the biological findings of the example scenarios are not explained here. Although all of these scenarios are derived from medical projects, the analysis approaches can be adapted to environmental samples as well. On this occasion, I must emphasize the importance of having good metadata, that is, a detailed description of each parameter, such as health status, sampling site, age, or any similar information relating to specific samples that may be important for the analyses. Good metadata are key to good analyses and to noise reduction in data analysis processes.
2 Description of Example Studies
2.1 Scenario 1: Metagenomic Analyses of Human Atherosclerotic Plaque Samples
To investigate microbiome diversity within human atherosclerotic tissue samples, high-throughput metagenomic analysis was employed on (1) atherosclerotic plaques obtained from a group of patients who underwent endarterectomy due to recent transient cerebral ischemia or stroke and (2) presumed stable atherosclerotic plaques obtained at autopsy from a control group of patients who all died from causes not related to cardiovascular disease. Our data provide evidence suggesting a wide range of microbial agents in atherosclerotic plaques, and an intriguing new observation that this microbiota differs between symptomatic and asymptomatic plaques, as judged from the taxonomic profiles in these two groups of patients. Additionally, functional annotations reveal significant differences in basic metabolic and disease pathway signatures between these groups.
2.1.1 Methodology Details
For this study, we used atherosclerotic tissue samples from a group of 15 patients that underwent elective carotid endarterectomy following repeated transient ischemic attacks or minor strokes (samples from symptomatic atherosclerotic plaques as cases).2 Further, we have asymptomatic atherosclerotic plaques from seven persons who died from causes not related to atherosclerotic disease (samples from stable plaques as controls).3
Sample statistics and read assignments
In this study, our data provided evidence suggesting a wide range of microbial agents (some of them pathogens) in atherosclerotic plaques, and these microbes differed between symptomatic and asymptomatic plaques, as judged from the taxonomic profiles in these two groups of patients. Further, fluorescence in situ hybridization (FISH) was performed to validate the presence of biofilm-like structures of a few pathogens (previously predicted from the taxonomic analyses) in the symptomatic atherosclerotic plaque samples. FISH staining demonstrates the presence of live bacteria; thus, it is a very good approach for cross-validating computational findings in the lab.
These data can potentially be used not only for taxonomic annotation but also to reveal functional profiles through partial assembly of specific members and their functional annotation. Functional annotations reveal significant differences in basic metabolic and disease pathway signatures between these groups. Here, we will not provide details of the whole study, but interested readers may refer to the original publication.
It is worth mentioning that in any similar future project we would use DIAMOND for the alignment step, as it uses improved algorithms and additional heuristics and works much faster than other available aligners. Scenario 1 is an example of analyzing shotgun sequence datasets obtained from tissue samples, that is, a host-associated metagenome. Readers who have shotgun sequence datasets from environmental or fecal samples do not need to perform an alignment step to remove host-associated sequences, unless there is any doubt about contamination. We normally suggest including control or blank samples in two wells per 96-well plate to address any contamination issues.
2.2 Scenario 2: The Effect of Omega-3 Polyunsaturated Fatty Acid Supplements on the Human Intestinal Microbiota
2.2.1 Study Design
Two 200 mL drinks per day (providing approximately as the triglyceride daily) at any suitable time of day, or
Four soft-gel capsules (each containing 250 mg EPA and 250 mg DHA as the ethyl ester) twice daily with meals (providing 2000 mg EPA and 2000 mg DHA per day), both for 8 weeks.
After a 12-week “washout” period, participants took the second intervention for 8 weeks. We also included a final study visit after a second 12-week “washout” period (V5; Fig. 2). Fecal samples were collected at five time-points for microbiome analysis by 16S rRNA PCR and Illumina MiSeq sequencing. Parallel red blood cell (RBC) fatty acid analysis was performed by liquid chromatography–tandem mass spectrometry.
2.2.2 Sample Preparation and Sequencing
Microbial DNA extractions were performed based on the method of Yu and Morrison, with slight modifications. DNA was extracted from approximately 250 mg feces using the QIAamp DNA Stool Mini Kit (Qiagen, Germany) with bead beating. DNA Library Prep Kit for Illumina, NEBNext Singleplex Oligos for Illumina (New England Biolabs, UK), and unique in-house-designed index primers (Integrated DNA Technologies, UK) were used to allow for multiplexing of samples. Twelve cycles of enrichment PCR were performed, and final libraries were cleaned with AMPure Beads (Beckman Coulter, UK). Successful libraries were confirmed by DNA 1000 bioanalyzer chips or DNA Analysis screen tapes (Agilent, UK). Quantification was performed with the Quant-iT dsDNA Assay Kit, broad range. A total of 30 ng of each library was pooled and sequenced on an Illumina MiSeq (2 × 250 bp). The variable region (V4) of the 16S rRNA gene was sequenced for these samples.
2.2.3 Data Analyses
Demultiplexed FASTQ files were trimmed of adapter sequences using cutadapt. Paired reads were merged using fastq-join under default settings and then converted to FASTA format. Consensus sequences were removed if they contained any ambiguous base calls, two contiguous bases with a PHRED quality score lower than 33, or a length more than 2 bp different from the expected length of 240 bp. Further analysis was performed using QIIME. Operational taxonomic units (OTUs) were picked using usearch and aligned to the Greengenes reference database using PyNAST. Taxonomy was assigned using the RDP 2.2 classifier. The resulting OTU BIOM files were imported into MEGAN for detailed group-specific analyses, annotations, and plots. The R statistical programming language was used for multivariate statistics and other plots.
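The three removal criteria for merged reads can be written down compactly. The following Python sketch is only an illustration of those criteria, not the code used in the study; the function name keep_read and the representation of quality scores as a list of integers are assumptions made for this example.

```python
EXPECTED_LENGTH = 240   # expected merged read length in bp (from the text)
MAX_LENGTH_DIFF = 2     # tolerated deviation from the expected length in bp
MIN_QUALITY = 33        # PHRED threshold quoted in the text


def keep_read(sequence, qualities,
              expected_length=EXPECTED_LENGTH,
              max_length_diff=MAX_LENGTH_DIFF,
              min_quality=MIN_QUALITY):
    """Return True if a merged read passes all three filters.

    sequence  -- the consensus sequence as a string
    qualities -- one integer PHRED score per base (hypothetical input format)
    """
    # 1. Reject reads containing any base other than A, C, G, or T.
    if set(sequence.upper()) - set("ACGT"):
        return False
    # 2. Reject reads with two contiguous bases below the quality threshold.
    for first, second in zip(qualities, qualities[1:]):
        if first < min_quality and second < min_quality:
            return False
    # 3. Reject reads whose length deviates too far from the expected length.
    return abs(len(sequence) - expected_length) <= max_length_diff
```

A single low-quality base does not trigger the second filter; only two adjacent low-quality bases do, which mirrors the wording of the criterion.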
This dataset and pipeline are described purely as an example for similar analyses; thus, we will not explain the results here, and interested readers may refer to the original study. Scenario 2 is a typical example of analyzing 16S sequence data. In Subheading 3, we will describe a few of the analysis approaches using data from this study.
2.3 Scenario 3: Comparing Effects of Two Drug Treatments for an Infectious Disease
2.3.1 Sample Preparation and Sequencing and Data Analyses
In a clinically relevant setting, this type of study aims to determine which drug works better for a similar group of patients. Patients are randomized between drug arms to control for selection bias. Because we usually want to compare several factors in this type of project, we need many samples to start with. Readers are advised to seek statistical help with a power calculation to obtain the required sample size. In general, as we end up with hundreds of samples, we usually choose 16S sequencing as a cost-effective solution, although some projects also use shotgun sequencing. As in the previous examples, we assume that the samples have been sequenced (by either 16S or shotgun sequencing) and analyzed as outlined in the previous scenarios to obtain a taxonomic profile for each patient at each time point. Besides analyzing the time series of each individual separately, we have also grouped samples by time point, such as baseline, mid-treatment, end of treatment, and follow-up. Besides treatment groups, patients are also compared on multiple factors such as age, gender, and geography.
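To give a feel for what a power calculation involves, the sketch below uses the textbook normal approximation for comparing the means of two equally sized arms. It is an assumption-laden illustration rather than the calculation used in this project: the function name samples_per_arm is hypothetical, the effect size is a standardized difference that would have to come from pilot data, and microbiome endpoints usually call for more specialized methods and statistical advice.

```python
import math
from statistics import NormalDist


def samples_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate number of patients per arm for a two-sided comparison.

    effect_size -- standardized mean difference between the two arms
                   (an assumed input, e.g. estimated from pilot data)
    alpha       -- two-sided significance level
    power       -- desired probability of detecting the effect
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the power
    # Normal-approximation formula: n = 2 * ((z_alpha + z_power) / d)^2
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Halving the effect size roughly quadruples the required sample size.
n_medium = samples_per_arm(0.5)
n_small = samples_per_arm(0.25)
```

The approximation ignores dropout, multiple comparisons, and repeated measures over time points, all of which tend to increase the number of samples needed.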
3 General Methods for Annotation and Statistical Analyses
Broadening our focus beyond these studies, additional analysis techniques are explained below which are used in these studies and also can be used in similar projects.
3.1 Taxonomic and Functional Annotation
Taxonomic annotation addresses the question ‘Who is out there?’; in other words, it tries to obtain information on the species composition of a given metagenome. Functional annotation, on the other hand, attempts to answer the question ‘What are they doing?’ There are different approaches to metagenome analysis. One type of approach uses phylogenetic markers to distinguish between different species in a sample; the most widely used marker is the small subunit ribosomal ribonucleic acid (SSU rRNA) gene (16S or 18S). A second type of method is based on analyzing the nucleotide composition of reads. In a supervised approach, the nucleotide composition of a collection of reference genomes is used to train a classifier, which is then used to place a given set of reads into taxonomic bins. In an unsupervised approach, reads are clustered by composition similarity, and the resulting clusters are then analyzed in an attempt to place the reads. Subheading 4 of this chapter lists multiple approaches and available tools, which readers can use according to their preferences.
In general, for annotating 16S rRNA sequences we use QIIME, and for shotgun sequencing we use MEGAN, which can also be used for 16S data. MEGAN is a highly efficient program for interactive analysis and comparison of microbiome data, allowing one to explore hundreds of samples and billions of reads. While taxonomic profiling is performed based on the NCBI taxonomy, MEGAN also provides a number of different functional profiling approaches. MEGAN Community Edition also supports the use of metadata in the context of principal coordinate analysis and clustering analysis. In all three scenarios explained in this chapter, MEGAN is used as the primary tool for annotation. For more details on MEGAN, see Chapter 23.
With shotgun sequencing we have good options for functional annotation, but with 16S sequences we can only perform taxonomic analyses with confidence, although a few tools can predict metagenome functional content from marker genes [32, 33]. Most shotgun annotation pipelines (such as MEGAN, MG-RAST, IMG/MER, and EBI Metagenomics) support functional annotation, and they often use databases such as KEGG, SEED, eggNOG, and COG/KOG, as well as protein domain databases such as TIGRFAM and PFAM.
3.2 Metagenome Assembly
Metagenome assembly is similar in nature to genome assembly, the reconstruction of genomes from sequenced DNA segments (reads), but it is more complex. The main goal is to stitch together the reads that could be from the same genome. Here, the reads consist of a mixture of DNA from different organisms, which may also have widely different levels of abundance. A few recent reviews have discussed new challenges and opportunities and have assessed the most common freely available metagenome assembly tools with respect to their output statistics, their sensitivity for low-abundance community members, the variability in resulting community profiles, and their ease of use. Interested readers may refer to these reviews [43, 44].
3.3 Rarefaction Curves
Rarefaction is a powerful method for comparing species richness among habitats on an equal-effort basis, based on the construction of so-called rarefaction curves. It is a very useful tool for statistical data analysis that helps us to correct for bias in species number due to unequal sample sizes, by standardizing to the number of species expected in a sample if it had the same total size as the smallest sample. As an example, suppose we have two sample groups, the first with 50 individuals and the second with 30 individuals, each with a number of species obtained from taxonomic analysis. Rarefaction lets us compare the two groups as if they contained the same number of individuals. Rarefaction curves are used differently for 16S and shotgun metagenomics. Ni and colleagues have described methods for estimating a reasonable and practical amount of SSU rRNA gene sequencing and explained how much metagenomic sequencing is enough to achieve a given goal. In metagenomic shotgun sequencing, the fraction of the metagenome represented in the data set is termed coverage, which can be assessed through a rarefaction curve. Interested readers may refer to a recent publication that advocates estimating the average coverage obtained in metagenomic studies and briefly presents the advantages of different approaches.
Similarly, in Scenario 2, rarefaction was performed at various levels to compare diversity for different sample groupings. All groups were rarefied to the lowest read number, and diversity was calculated using weighted and unweighted UniFrac as well as the non-phylogenetic Bray–Curtis dissimilarity measure.
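The standardization behind rarefaction can be made concrete with the classical closed-form expectation: for a sample of N individuals in which species i has N_i individuals, the expected number of species in a random subsample of n individuals is the sum over species of 1 − C(N − N_i, n) / C(N, n). The sketch below is a minimal illustration of that formula; the function names and the example counts for the two groups of 50 and 30 individuals are hypothetical and are not data from the scenarios.

```python
from math import comb


def expected_richness(counts, depth):
    """Expected number of species in a random subsample of `depth` individuals.

    counts -- number of individuals observed per species
    depth  -- subsample size (must not exceed the total count)
    """
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must lie between 0 and the total count")
    all_draws = comb(total, depth)
    # A species is missed only if every drawn individual belongs to another
    # species; one minus that probability is its chance of being observed.
    return sum(1 - comb(total - c, depth) / all_draws
               for c in counts if c > 0)


def rarefaction_curve(counts, step=5):
    """(depth, expected richness) pairs from `step` up to the full sample."""
    total = sum(counts)
    depths = list(range(step, total, step)) + [total]
    return [(d, expected_richness(counts, d)) for d in depths]


# Hypothetical species counts for two groups with 50 and 30 individuals.
group_a = [20, 15, 10, 3, 1, 1]
group_b = [12, 10, 8]
# Compare both groups at the size of the smaller one (30 individuals).
richness_a_at_30 = expected_richness(group_a, sum(group_b))
richness_b_at_30 = expected_richness(group_b, sum(group_b))
```

Plotting the pairs returned by rarefaction_curve for each sample gives the familiar saturating curves; a curve that has flattened suggests that further sequencing would reveal few additional species.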
3.4 Subsample Comparison
In situations like that in Fig. 3, where two samples have a much higher sequencing depth than the rest, another option is subsample comparison. Instead of excluding the high-depth samples from further study, we repeatedly draw a random subsample from each of them, sized to match the smallest of the other samples in the study. We then take the median of the subsample profiles to generate a pseudo-profile, which can serve as a good comparable sample for the group. For example, suppose that most samples in a study have between 200,000 and 300,000 sequence reads, but a few samples have approximately 1 million reads. In that case, we draw subsamples of 200,000 reads from each deep sample a large number of times (say 1000) and take the median of the resulting profiles, which we can then compare with the other samples.
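The repeated-subsampling idea can be sketched in a few lines of Python. This is an illustrative sketch under simple assumptions (reads are drawn without replacement and the median is taken separately for each taxon); the function name median_subsample_profile and the toy counts are hypothetical.

```python
import random
from collections import Counter
from statistics import median


def median_subsample_profile(counts, depth, repeats=1000, seed=0):
    """Per-taxon median count over repeated random subsamples.

    counts  -- dict mapping taxon name to read count in the deep sample
    depth   -- number of reads to draw in each subsample
    repeats -- number of subsamples to draw
    seed    -- seed for the random generator, for reproducibility
    """
    rng = random.Random(seed)
    # One list entry per read, labelled with the taxon it was assigned to.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    tallies = {taxon: [] for taxon in counts}
    for _ in range(repeats):
        drawn = Counter(rng.sample(pool, depth))  # draw without replacement
        for taxon in counts:
            tallies[taxon].append(drawn[taxon])
    return {taxon: median(values) for taxon, values in tallies.items()}


# Toy example: a deep sample of 1000 reads reduced to a depth of 100.
deep_sample = {"Bacteroidaceae": 600, "Streptococcaceae": 300, "Other": 100}
pseudo_profile = median_subsample_profile(deep_sample, depth=100, repeats=201)
```

Because the median is taken taxon by taxon, the counts in the pseudo-profile need not sum exactly to the subsample depth; with realistic read numbers a vectorized implementation would also be preferable to this pure-Python loop.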
3.5 Comparative Visualization
3.6 Diversity Analyses
Diversity analysis is one of the most prominent statistical approaches for the downstream analysis steps of metagenomic studies. Estimates of species abundance are used to make inferences about the diversity of the whole community. The terms alpha, beta, and gamma diversity were all introduced by R. H. Whittaker to describe the spatial component of biodiversity. Alpha diversity is the diversity within each site (the samples in each group). Beta diversity represents the differences in species composition among sites. Gamma diversity is the diversity of the entire landscape of different sites (the pool of all species from multiple samples). A diversity index measures how many different types (such as species) there are in a dataset (a community) and simultaneously takes into account how evenly the basic entities (such as individuals) are distributed among these types. Three commonly used measures of diversity, Simpson’s index, Shannon’s entropy, and the total number of species, are related to Rényi’s definition of a generalized entropy, and are well explained and compared by Hill. Many other publications also explain this topic well, including proposals for consistent terminology for quantifying species diversity.
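The three measures and their relationship through Hill's numbers can be computed directly from a vector of species counts. The sketch below is a minimal illustration; the function name diversity_summary and the dictionary keys are hypothetical. It reports richness (Hill number of order 0), Shannon's entropy H with its exponential (order 1), and Simpson's concentration, the sum of squared proportions, with its reciprocal (order 2).

```python
import math


def diversity_summary(counts):
    """Richness, Shannon entropy, Simpson concentration, and Hill numbers."""
    positive = [c for c in counts if c > 0]
    total = sum(positive)
    if total == 0:
        raise ValueError("at least one species must have a positive count")
    proportions = [c / total for c in positive]
    richness = len(proportions)                              # Hill order 0
    shannon = -sum(p * math.log(p) for p in proportions)     # natural log
    simpson = sum(p * p for p in proportions)                # concentration
    return {
        "richness": richness,
        "shannon": shannon,
        "simpson_concentration": simpson,
        "hill_1": math.exp(shannon),   # effective species number, order 1
        "hill_2": 1.0 / simpson,       # effective species number, order 2
    }


even_community = diversity_summary([10, 10, 10, 10])
skewed_community = diversity_summary([97, 1, 1, 1])
```

For a perfectly even community all three Hill numbers equal the number of species, whereas for a skewed community the higher-order numbers fall well below the richness, which is how evenness enters the indices.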
3.7 Comparison Using Distance Matrices
Another common technique for comparing metagenomic datasets is to use distance matrices. First, a taxonomic profile is computed for each data set. Second, a matrix of pairwise distances is determined using one of several possible ecological indices. Finally, the distances are represented using an appropriate visualization technique. Mitra et al. explained multiple distance measures (such as Bray–Curtis, Kulczynski, χ², Hellinger, and Goodall) in the context of multiple metagenome comparison. In addition, UniFrac is another distance metric used for comparing biological communities. It differs from dissimilarity measures such as Bray–Curtis by incorporating the phylogenetic distances between observed organisms into the computation [52, 53, 54]. Both weighted (quantitative) and unweighted (qualitative) variants of UniFrac are often used in microbial ecology; the former accounts for the abundance of observed organisms, while the latter only considers their presence or absence.
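The first two steps, from taxonomic profiles to a matrix of pairwise distances, can be sketched for the Bray–Curtis dissimilarity, which for two profiles a and b is the sum of |a_i − b_i| over all taxa divided by the sum of (a_i + b_i). The function names and the toy profiles below are hypothetical; in practice the profiles would usually be normalized or rarefied first.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two taxon -> abundance dicts."""
    taxa = set(profile_a) | set(profile_b)
    difference = sum(abs(profile_a.get(t, 0) - profile_b.get(t, 0))
                     for t in taxa)
    total = sum(profile_a.get(t, 0) + profile_b.get(t, 0) for t in taxa)
    # Two empty profiles are treated as identical.
    return difference / total if total else 0.0


def distance_matrix(profiles):
    """Pairwise Bray-Curtis matrix for a dict of sample -> profile.

    Returns the sorted sample names and a square list-of-lists matrix
    whose rows and columns follow that order.
    """
    names = sorted(profiles)
    matrix = [[bray_curtis(profiles[row], profiles[col]) for col in names]
              for row in names]
    return names, matrix


# Toy profiles (hypothetical read counts per family).
samples = {
    "patient_1": {"Bacteroidaceae": 50, "Streptococcaceae": 50},
    "patient_2": {"Bacteroidaceae": 40, "Streptococcaceae": 60},
    "patient_3": {"Helicobacteraceae": 100},
}
sample_names, distances = distance_matrix(samples)
```

The resulting matrix is symmetric with zeros on the diagonal and can be passed on to the visualization step, for example an ordination or a clustering method.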
3.9 Hierarchical Clustering
Interestingly, the asymptomatic atherosclerotic plaques have a higher abundance of host microbiome-associated microbial families such as Porphyromonadaceae, Bacteroidaceae, Micrococcaceae, and Streptococcaceae than the symptomatic atherosclerotic plaques. In contrast, the symptomatic atherosclerotic plaques have a higher abundance of pathogenic microbial families such as Helicobacteraceae and Neisseriaceae, and of sulfur-consuming families such as sulfur-oxidizing symbionts and Thiotrichaceae, than the asymptomatic atherosclerotic plaques (Fig. 8). For P0613, the species profile appeared very different from all other samples; thus, this sample was also treated as an outlier in further analyses.
3.10 Principal Component Analysis (PCA) and Principal Coordinates Analysis (PCoA)
PCA and PCoA are tools for multivariate analysis. PCA uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. It is often used for quantitative variables, so the axes in the plot have a quantitative weight, and the positions of the samples are interpreted in relation to those weights. PCoA, or multidimensional scaling (MDS), on the other hand, is a means of visualizing the level of similarity of individual cases of a dataset. PCoA is similar to polar ordination (PO), which arranges samples between endpoints or ‘poles’ according to a distance matrix; PCoA maximizes the linear correlation between the distances in the distance matrix and the distances in a space of low dimension.
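To show what PCoA does with a distance matrix, the following pure-Python sketch implements classical scaling: the squared distances are double-centred and the leading eigenvectors of the resulting matrix, scaled by the square roots of their eigenvalues, give the sample coordinates. It is a teaching sketch under stated assumptions, not a production implementation: the function name pcoa is hypothetical, eigenvectors are found by simple power iteration with deflation, and axes are only returned while the dominant eigenvalue is positive (non-Euclidean distances can produce negative eigenvalues, which established packages handle more carefully).

```python
import math
import random


def pcoa(distances, n_axes=2, iterations=500, seed=0):
    """Classical PCoA of a symmetric distance matrix given as list of lists.

    Returns (coordinates, eigenvalues); coordinates[i] holds the position
    of sample i on each axis.
    """
    n = len(distances)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    squared = [[d * d for d in row] for row in distances]
    row_means = [sum(row) / n for row in squared]
    grand_mean = sum(row_means) / n
    b = [[-0.5 * (squared[i][j] - row_means[i] - row_means[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    coordinates = [[0.0] * n_axes for _ in range(n)]
    eigenvalues = []
    for axis in range(n_axes):
        # Random unit start vector for the power iteration.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-9:          # nothing left to extract
                break
            v = [x / norm for x in w]
        # The Rayleigh quotient gives the eigenvalue including its sign.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * bv[i] for i in range(n))
        if eigenvalue <= 1e-9:
            eigenvalues.append(0.0)  # axis carries no positive variation
            continue
        eigenvalues.append(eigenvalue)
        scale = math.sqrt(eigenvalue)
        for i in range(n):
            coordinates[i][axis] = v[i] * scale
        # Deflate B so that the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coordinates, eigenvalues
```

When the input distances are Euclidean, the distances between the returned coordinates reproduce the input distances; the sign of each axis is arbitrary, so plots from different runs or tools may appear mirrored.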
3.11 Canonical-Correlation Analysis (CCA) and Canonical-Correspondence Analysis (CCA)
It is important to note that CCA does not perform variable selection. Further, when the number of variables exceeds the number of observations (the sample size), CCA cannot be applied directly because the covariance matrix is singular. In a recent study, the authors discussed this problem and a few existing solutions. Additionally, they developed a method for structure-constrained sparse canonical correlation analysis (ssCCA) in a high-dimensional setting. ssCCA takes into account the phylogenetic relationships among bacteria, which provide important prior knowledge on evolutionary relationships among bacterial taxa.
3.12 Multivariate Analyses
Multivariate data analysis refers to any statistical approach used to analyze data with more than one variable. For example, as described in Scenario 3 we have multiple factors. The key to identifying important microbial taxa associated with two treatments is that the large datasets from each patient are compared within groups, and then the metadata from the patients’ groups are compared against each other. Analysis of multivariate data in response to factors, groups, or treatments in an experimental design needs sophisticated methods.
To achieve this, we can use PERMANOVA (permutational multivariate analysis of variance) to test whether groups differ in their multivariate location, on the basis of any resemblance measure. Because the test is also sensitive to differences in spread, the homogeneity of multivariate dispersions within groups should be checked alongside it. PERMANOVA is a better approach than ANOVA (analysis of variance) or MANOVA (multivariate analysis of variance) for our study, because it works with any distance measure that is appropriate to the data and uses permutations to make the test distribution free, rather than assuming normal distributions. Finally, in addition to the above comparisons, we can examine whether microbiota changes and patterns are consistent across the geographical locales of the treatment subjects, as our samples are from different countries. We do not show the details of the multivariate analyses here, but multiple packages with good tutorials are available for such analyses, as detailed below.
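The logic of a one-way PERMANOVA can be sketched directly from a distance matrix and a list of group labels. The total sum of squares is the sum of squared pairwise distances divided by the number of samples, the within-group sum of squares is the same quantity computed inside each group, and the pseudo-F statistic compares the among-group and within-group parts; its significance is assessed by shuffling the labels. The sketch below is an illustration only, with hypothetical function names; it assumes at least two groups and non-zero within-group variation, and real analyses would normally rely on established packages.

```python
import random
from itertools import combinations


def pseudo_f(distances, labels):
    """Pseudo-F statistic for a one-way design on a distance matrix."""
    n = len(labels)
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    n_groups = len(groups)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(distances[i][j] ** 2
                   for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, computed separately for each group.
    ss_within = sum(
        sum(distances[i][j] ** 2 for i, j in combinations(members, 2))
        / len(members)
        for members in groups.values())
    ss_among = ss_total - ss_within
    return (ss_among / (n_groups - 1)) / (ss_within / (n - n_groups))


def permanova(distances, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(distances, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break any link between samples and groups
        # Small tolerance so that ties are not lost to rounding error.
        if pseudo_f(distances, shuffled) >= observed * (1 - 1e-12):
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)
```

For example, with a Bray–Curtis matrix of the patients and their drug arm as labels, a small p-value would indicate that the arms differ in community composition more than expected by chance; additional factors such as country would require a more elaborate design than this one-way sketch.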
The Primer-E package is commonly used by microbial ecologists and supports multiple multivariate statistical analyses. We often use the R statistical programming language for multivariate statistics, as well as for several types of graphical representation. Particular packages, such as Bioconductor, vegan, and phyloseq, provide built-in functions and libraries (within the R environment) designed especially for metagenomic datasets.
4 Tools and Packages Commonly Used in Metagenomic Studies
- 1.Processing of raw sequence reads and quality control (QC):
Cutadapt (both adapter trimming and quality trimming).
Condetri (read trimmer for Illumina data).
Trimmomatic (allows multiple threads).
- 3.Analyses for 16S projects: OTU clustering, picking, and taxonomic assignment.
- 4.Assembly of shotgun metagenomics data.
- 5.Removing near-exact matches by mapping to specific genomes.
Bowtie 2.
- 6.Binning tools for metagenomes.
- 12.Statistical computing.
Many other tools can be used for statistical analyses.
5 Concluding Remarks
This chapter has illustrated multiple data analysis and annotation techniques in metagenomic studies with three case studies. It does not develop any new method but describes optimized pipelines built from various available tools. With these example scenarios, the use of multiple pipelines has been demonstrated for analyzing and interpreting the data, from raw sequences to the final statistical outputs. The example scenarios describe some of the tools that we have used for the projects selected for demonstration, but there are plenty of other tools available for metagenomics, most of which are listed in Subheading 4. This chapter does not provide the details of the tools or describe their pros and cons, but it can be a good starting point for readers to explore the available options for analyzing and interpreting their datasets. From this chapter, readers should get an idea of current research projects in medical studies and of the multiple approaches used to analyze the data originating from these projects, although they should keep in mind that this is not an exhaustive list of possible pipelines for analyzing metagenomic samples; there may be other approaches as well. While step-by-step instructions for all the tools are beyond the scope of this chapter, the methods outlined here may help researchers to plan, analyze, and interpret their research projects successfully.
All methods and experimental manuals were approved by the National Committee on Health Research Ethics (Denmark), and approval was granted by the Ethical Committee of the Region of Copenhagen (H-3-2011-013).
These samples originated from the tissue bank at the Department of Forensic Medicine (Approval No. 1501230).
- 21.R Development Core Team (2008) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
- 25.Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. Genomics 98(1):152–153
- 26.Aronesty E (2011) ea-utils: “Command-line tools for processing biological sequencing data”. https://github.com/ExpressionAnalysis/ea-utils
- 55.McGill R, Tukey JW, Larsen WA (1978) Variations of box plots. Am Stat 32(1):12–16
- 59.Borg I, Groenen PJ (2005) Modern multidimensional scaling: theory and applications. Springer Science & Business Media, New York
- 61.Michael PW Ordination methods—an overview. http://ordination.okstate.edu/overview.htm
- 66.Anderson MJ (2001) A new method for non-parametric multivariate analysis of variance. Austral Ecol 26(1):32–46
- 67.Clarke KG, Gorley RN (2006) PRIMER v6: user manual/tutorial. PRIMER-E, Plymouth
- 69.Oksanen J, Blanchet FG, Friendly M, Kindt R, Legendre P, McGlinn D, et al (2008) The vegan package. https://cran.r-project.org/web/packages/vegan/vegan.pdf
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.