Abstract
Population structure of Cannabis sativa L. was explored across nine independent collections, each containing a unique sampling of varieties. Hierarchical Clustering of Principal Components (HCPC) identified between three and seven genetic clusters per dataset, and structure based on use type was inconsistent across datasets, indicating the importance of sampling, particularly when passport data are limited. Modern cultivars showed broader genetic diversity than landraces. Further, in a subset of geo-referenced landrace accessions, population structure followed geography. The inconsistent structure across collections shows the complexity within Cannabis and the importance of understanding each particular collection, which could then be leveraged in breeding programs for future crop improvement.
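As a rough illustration of the clustering idea named in the abstract, the sketch below runs a principal component analysis on a genotype matrix and then applies Ward hierarchical clustering to the component scores. It is not the authors' pipeline: the function name `hcpc_sketch`, the simulated 0/1/2 genotypes, and the choice of two components and three clusters are all assumptions made for this example, and refinements that a full HCPC implementation may include, such as automatic selection of the number of clusters, are omitted.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def hcpc_sketch(genotypes, n_components=2, n_clusters=3):
    """PCA on a samples x SNPs matrix, then Ward clustering on the PC scores.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # centre each SNP column
    # PCA via singular value decomposition of the centred matrix
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    # Agglomerative clustering (Ward criterion) on the retained components
    tree = linkage(scores, method="ward")
    # Cut the tree so that at most n_clusters groups remain
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return scores, labels


# Simulated data (assumption): three groups of 20 samples, 200 SNPs coded 0/1/2,
# each group drawn from its own set of allele frequencies.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.05, 0.95, size=(3, 200))
genotypes = np.vstack([rng.binomial(2, f, size=(20, 200)) for f in freqs])

scores, labels = hcpc_sketch(genotypes, n_components=2, n_clusters=3)
print(np.unique(labels, return_counts=True))
```

With real data the number of clusters recovered depends heavily on which accessions are sampled, which is the point the abstract makes about differences among collections.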
Availability of data and material
Data are available upon reasonable request to the corresponding author.
Code availability
Code and filtered VCF files are available at https://github.com/ahmccormick, and high-resolution figures are available at https://figshare.com/authors/Anna_H_McCormick/17741367
Acknowledgements
We would like to thank Mr. Robert Connell Clarke for his curation of the use-type associations for the Phylos Bioscience (n=1378) dataset as well as for valuable discussions and insights.
Funding
This manuscript was prepared without external financial support or funding.
Author information
Contributions
Conceptualization: AHMC, KH, MBK, NB, RRM, KL, EJK; Formal Analysis: AHMC, RRM; Figure Preparation: AHMC; Manuscript Drafting: AHMC; Writing and Reviewing Manuscript: AHMC, KH, MBK, NB, RRM, KL, EJK.
Ethics declarations
Conflict of interest
LeafWorks Inc. is a for profit company.
About this article
Cite this article
Halpin-McCormick, A., Heyduk, K., Kantar, M.B. et al. Examining population structure across multiple collections of Cannabis. Genet Resour Crop Evol (2024). https://doi.org/10.1007/s10722-024-01928-1
DOI: https://doi.org/10.1007/s10722-024-01928-1