1 Introduction

Humans have intentionally managed animal and plant species around the world for over 15,000 years (Frantz et al. 2020; Larson and Fuller 2014). In some cases, management created subsets of the original population containing unique phenotypes and genotypes, i.e., domesticates. Archeological and archival evidence of domestication, where it exists, is an important source of specific migration, importation, or crossbreeding events; domestication centers; and morphological intermediates in the domestication process (Frantz et al. 2020; Larson et al. 2012; Zeder et al. 2006). However, reconstructing a complete domestication history from records alone is challenging. Modern genomics can address this gap by retroactively piecing together the demographic histories and phenotypic diversity of domesticate populations (Alberto et al. 2018; Braud et al. 2017; Meyer and Purugganan 2013; Sanchez et al. 2016).

The Western honey bee (Apis mellifera L.; henceforth “honey bee”) is one of the few intensely managed insect species in the world. Humans have managed honey bees for nearly 11,000 years for wax, honey, and, more recently, pollination (Batra 1995; Bloch et al. 2010; Crane 1999; Roffet-Salque et al. 2015). As such, we have a detailed historic account of the natural and management histories of honey bees around the world (Crane 1999; Ruttner 1988). The honey bee’s native range extends across Africa, Asia, and Europe, and is divided into at least five morphometrically, phenotypically, and genetically distinct lineages: A (Africa), M (northern Europe and central Asia), C (central and southern Europe), O (Middle East and western Asia), and Y (Arabian Peninsula and Ethiopian highlands) (Cridland et al. 2017; Han et al. 2012; Miguel et al. 2011; Ruttner 1988; Ruttner et al. 1978; Whitfield et al. 2006). Initially, these lineages were defined morphologically (Ruttner 1988) and were later supported by mitochondrial DNA (mtDNA) (Arias and Sheppard 1996; Cornuet and Garnery 1991; Crozier et al. 1989; Crozier and Crozier 1993; Garnery et al. 1992). Each lineage has unique mitochondrial haplotypes (mitotypes) with an average divergence of 2.5% between A, C, and M lineages (Garnery et al. 1992). Microsatellites further supported the mitochondrial and morphological groupings (Estoup et al. 1995; Garnery et al. 1998; Solignac et al. 2003). Most recently, genome-wide analyses taking advantage of the released honey bee genome (The Honeybee Genome Sequencing Consortium 2006) supported the five-lineage model (Han et al. 2012; Whitfield et al. 2006). Average genome-wide differentiation (measured by FST) among lineages ranges from 0.3 to 0.6 (Cridland et al. 2017; Harpur et al. 2014; Miguel et al. 2011; Wallberg et al. 2014). Each lineage can further be divided into several subspecies by the same mitochondrial, whole genome, and microsatellite metrics; presently, twenty-nine subspecies are recognized. Average FST between subspecies within the same lineage is smaller than between lineages (Wallberg et al. 2014); A. m. ligustica and A. m. carnica, both C lineage honey bees, have an FST of 0.06 (Dall’Olio et al. 2007).
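To make the differentiation statistic concrete, the following minimal sketch computes Wright's FST for biallelic loci from the allele frequencies of two populations, as (HT − HS)/HT, where HT is the expected heterozygosity of the pooled population and HS is the mean expected heterozygosity within populations. It assumes equal population weights and known allele frequencies; the function names and the example frequencies are hypothetical and are not taken from any of the cited studies, which generally rely on estimators that correct for sample size.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst_locus(p1, p2):
    """Wright's F_ST at one biallelic locus for two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_t = expected_heterozygosity(p_bar)  # pooled (total) heterozygosity
    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall; no differentiation to measure
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return (h_t - h_s) / h_t


def fst_genome_wide(freqs1, freqs2):
    """Genome-wide F_ST as a ratio of sums over loci (not a mean of per-locus values)."""
    if len(freqs1) != len(freqs2):
        raise ValueError("both populations need a frequency for every locus")
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        h_t = expected_heterozygosity((p1 + p2) / 2.0)
        h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
        numerator += h_t - h_s
        denominator += h_t
    return numerator / denominator if denominator > 0.0 else 0.0


if __name__ == "__main__":
    # Hypothetical allele frequencies at three loci in two populations.
    pop_a = [0.10, 0.50, 0.90]
    pop_b = [0.20, 0.45, 0.60]
    print(round(fst_genome_wide(pop_a, pop_b), 4))
```

Under this definition, two populations with identical allele frequencies give FST = 0, and populations fixed for alternative alleles at a locus give FST = 1, so values such as 0.06 within a lineage and 0.3 to 0.6 among lineages sit between those extremes.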

Today, the honey bee can be found in introduced populations in western Asia, North America, South America, and Australia, to name a few. Across this new range, beekeepers have imported honey bees from multiple lineages, subjected them to selective breeding, and allowed them to escape and spread, resulting in admixed populations that exist in both managed and feral states (Chapman et al. 2008, 2019; Harpur et al. 2015; Oldroyd et al. 1997; Rivera-Marchand et al. 2008; Seeley 2007; Smith 1991). In the United States of America, at least nine honey bee subspecies have been introduced since 1622 (as we review below). Today, both feral and managed honey bees can be found throughout the country in admixed populations. As the United States of America covers over 9,800,000 km2 and exhibits considerable variation in climate, geography, honey bee importation history, and honey bee management practices, we predict the country contains a highly diverse population of honey bees.

Understanding honey bee genetics and population structure is vital to understanding and maintaining genetic diversity (Bohling et al. 2016; Brito et al. 2017; Harpur et al. 2012, 2015), discovering loci contributing variation to desirable traits (Bolormaa et al. 2017; Harpur et al. 2019), and ultimately developing breeding programs (Dekkers 2012; Jarquín et al. 2014). Fortunately, the USA has a detailed archival record (Crane 1999; Horn 2005; Oertel 1976a, b, c, d, e; Pellett 1938; Sheppard 1989a, b) and there is a growing number of genetic analyses of honey bee populations (Bozek et al. 2018; Calfee et al. 2020; Cleary et al. 2018; Coulson et al. 2005; Cridland et al. 2018; Darger 2013; Delaney et al. 2009; Kono and Kohn 2015; Magnus et al. 2011, 2014; Magnus and Szalanski 2010; Mikheyev et al. 2015; Pinto et al. 2004, 2007; Rangel et al. 2016, 2020; Schiff and Sheppard 1993, 1995, 1996; Schiff et al. 1994; Seeley et al. 2015; Szalanski et al. 2016a, b; Szalanski and Magnus 2010). Here, we synthesize and review historic records (Table S1, S2) and available genetic data sets (Table S3) to understand how management practices have influenced genetic diversity and differentiation of honey bee populations across the United States of America (USA). The confluence of these data sets allows us to propose novel hypotheses and frame honey bee management in the USA in the context of domestication genetics.

2 The history of honey bee importation, expansion, and invasion into the United States of America

Honey bees have been continually imported into the United States of America since 1622 (Table S1). The earliest known record refers to English “dark bees” (M lineage: Apis mellifera mellifera), which were imported by colonists to Jamestown, VA (Table S1; Fig. 1). The available records mention only two other importations of honey bees to the United States of America before 1859: Mobile, AL (1773) (Oertel 1976b) and Pensacola, FL (1763) (Crane 1999; Pellett 1938). Two additional importations (to Plymouth, MA in the 1630s and Long Island, NY in the 1670s) are implied by historical records; both locations recorded honey bees sooner than would be expected had they moved by natural swarming (Fig. 1). Records suggest that A. m. mellifera was the only subspecies present in North America until the 1850s (Table S1).

Figure 1.

Estimated A. mellifera population expansions in the mainland United States. Locations have been recorded or estimated from digitized importation records of European colonialists or their accounts of swarm sightings. Sightings are given a specific location if possible; otherwise, the date text is larger and superimposed over the state or area where the sighting occurred (Table S1). Color gradients are proposed ranges estimated by combining known sightings and introductions with an estimation of average yearly spread as (50 years – establishment date) × 16.5 km. Alaska has been excluded because honey bees were first introduced to the state in 1924 and cannot be attributed to a source population (Table S2). Dates before 1859 are exclusively A. m. mellifera sightings, but dates after 1859 can be attributed to A. m. ligustica, A. m. mellifera, A. m. ligustica x A. m. mellifera hybrids, and/or various A. mellifera hybrids.

A. m. mellifera colonies radiated out from their introduction sites on the East Coast by swarming and with beekeeper assistance. By the 1700s, “bee trees” were found throughout the Carolinas and Pennsylvania (Crane 1999; Horn 2005; Oertel 1976c). Honey bees could be found on the west bank of the Mississippi River by 1790 (Crosby and Worster 2004) and continued to spread across the Great Plains during the nineteenth century (Fig. 1, Table S2) (Crane 1999; Horn 2005; Oertel 1976b, d; Pellett 1938). During the 1850s, colonists introduced colonies to the Pacific Coast. Over three hundred A. m. mellifera colonies were shipped from New York, Pennsylvania, and Michigan to central California between 1852 and 1860 (Crane 1999; Harbison 1919; Pellett 1938; Watkins 1967, 1968a, b), establishing a founder population for further introductions to Hawaii in 1857 (Hopkins 1857), Washington in 1856 (Watkins 1968b), Nevada in 1852, and Southern California in 1862 (Fox 1878; Handy 1872; Horn 2005).

The second subspecies introduced to the United States of America was the Italian honey bee (C lineage: A. m. ligustica), which was initially sourced from exported populations in Tambaschoff, Germany, and Tamins, Switzerland (Table S1) (Benton 1892; Mahan 1861; Watkins 1968c). Later, American beekeepers imported A. m. ligustica queens directly from northern Italy, favoring the Lake Maggiore region in the 1860s (Langstroth 1866a) and Milan and Bologna in the 1890s (Benton 1892; Robinson 1892). During the 1860s and 1870s, honey bee queen suppliers established themselves throughout the United States of America by providing imported or open-mated A. m. ligustica queens to beekeepers across the country. Their actions spread A. m. ligustica from Vermont to California (~ 2500 km) within 10 years of introduction (Table S1; S2) (Fox 1878; Langstroth 1881; Pellett 1938). While we do not know the exact number of queens or colonies imported, the available data suggest it was likely in the thousands. One importer, Charles Dadant, is reported to have introduced “hundreds [of A. m. ligustica queens] … each year for several years” beginning in 1874 with a single shipment of 400 queens (C. Dadant 1880; C. P. Dadant 1918), and another, A. I. Root, claimed to have introduced 50–75 queens per year from the late 1860s at least until 1899 (Root 1899).

The Carniolan honey bee (C lineage: A. m. carnica) was first imported to the United States of America from Dalmatia (modern day Croatia) in 1877 (C. Dadant 1877, 1878), with additional introductions occurring during and after the 1880s (Fig. 2, Table S2). Two of the most prominent A. m. carnica importers were Frank Benton, an American honey bee scientist, and Henry Alley, a breeder based in Coleraine, MA. Alley chiefly imported the “Banat bee,” which were A. m. carnica native to Banat, Hungary (Sheppard 1989a). Benton, who was living in Munich, Germany, sent A. m. carnica from the Austrian Alps to the United States of America from 1882 to 1892 (Benton 1881, 1892; Lyon 1905). A. m. carnica was a popular subspecies in the late 1880s; based on queen supplier ads, A. m. carnica was widespread across the United States of America by 1900 (Table S1; S2). Benton continued to import A. m. carnica queens to the USDA experimental apiary in Maryland until the early 1900s (Benton 1910).

Figure 2.

Seven subspecies were introduced to the mainland United States between 1859 and 1922: A. m. ligustica (1859), A. m. lamarckii (1866), A. m. carnica (1877), A. m. cypria (1880), A. m. syriaca (1880), A. m. caucasica (1882), and A. m. intermissa (1891) (Alley 1891; C. Dadant 1877; Houck 1883; Jones 1880a, 1880b, 1880c; Langstroth 1866b, 1881). Here, we have highlighted the first 5 years of a recorded importation of a given subspecies into the mainland United States. The exceptions are A. m. caucasica and A. m. cypria, both of which were imported sporadically or for a short time before importations ceased and were continued under Benton (see Table S1 for all importation records). One series of A. m. cypria and A. m. syriaca importations to Beeton (now New Tecumseth), Canada, is recorded because these were the source of importations to the USA (Table S1) (Benton 1892; Jones 1880a). The boxes represent large-scale operations in Ohio (Rev. L. L. Langstroth); Washington, D.C. (Frank Benton); and Canada (D. A. Jones).

Beekeepers from New York State imported Caucasian honey bees (O lineage: A. m. caucasica) as early as 1882 (Houck 1883; Tefft 1890). One beekeeper reported sourcing them from extant populations in Germany (Houck 1883). However, A. m. caucasica did not become widespread until the early 1900s. In 1902, Benton imported three A. m. caucasica queens from the shores of the Black and Caspian Seas (Benton 1905, 1906; Lyon 1905) and in 1905, he sourced an unknown number of queens from Russia (Table S1). He used both importations as a nucleus for a breeding program based in Beltsville, Maryland, from which he claims to have distributed at least 2000 A. m. caucasica queens to beekeepers across the United States of America in 1905 alone (Table S2) (Lyon 1905; Phillips 1906). There were other A. m. caucasica importers operating both before and simultaneous to Benton (Table S1). Political unrest in the Caucasus Mountains between 1890 and 1910 deterred or delayed at least two A. m. caucasica importers, suggesting importations were infrequent (Phillips 1906; Pratt 1905; Root 1915). Importations directly from the Caucasus region ended in 1913 with a final recorded importation to Texas (Table S1) (Wilder 1913).

Four additional subspecies have been documented being imported into the US—the Egyptian honey bee (A lineage: A. m. lamarckii), the Cyprian honey bee (O lineage: A. m. cypria), the Syrian honey bee (O lineage: A. m. syriaca), and the Tellian honey bee (A lineage: A. m. intermissa)—but records suggest that they were not distributed as widely because they were prone to swarming, absconding, and defensive behavior (Benton 1892; Massie 1891; Root 1899). A. m. lamarckii was only introduced to three states between 1866 and 1869 (Table S1). At least four A. m. intermissa queens were sourced from Hallamshire, England, after being imported from Northern Africa (Table S1) (Pratt 1891). Their progeny was distributed to five states between 1891 and 1893 and no record of their sale or propagation exists afterward (Table S1). A. m. syriaca was spread across nine states between 1881 and 1905 from an unknown number of source queens (Table S1). A. m. cypria was imported across 13 states between 1880 and 1910 (Table S2). Several hundred queens were imported directly (and weekly from 1880 to 1882) from Cyprus (Table S1), beginning with an initial shipment of 150 queens to Jones’ apiary in Beeton (now New Tecumseth), Ontario (Jones 1880a). Importation from Cyprus ended in 1882 but some queen suppliers continued to offer Cyprian honey bees—presumably the descendants of directly imported queens—into the 1890s (Brown 1891; Guenther 1891). In the early 1900s, Benton was able to import additional A. m. cypria from Cyprus for breeding experiments and did so until his retirement in 1910 (Benton 1910; Lyon 1905).

Events of the 1910s led to an overall reduction in the number of reported honey bee importations and ultimately to the creation of the Honey Bee Act of 1922. World War I (1914–1918) and the Isle of Wight disease (1906–1916) both reduced the number of colonies coming from Europe (Root 1921; Walther 1915). The Isle of Wight disease was partially responsible for the Honey Bee Act of 1922, which forbade the importation of live honey bees and hive products into the USA (Anonymous 1922). In the 1960s, the act was expanded to ban germplasm to prevent the spread of African-hybrid honey bees (also called “AHBs,” “Africanized honey bees,” or “killer bees”) from South and Central America (Sheppard 2013). Importations from Canada continued until 1987 when unease over African-hybrid honey bees caused that country to close its borders (Winston 1992) until 2004 (Sheppard 2013). However, both before and after the 1922 ban was reworded, imports were made for experimental purposes (Table S1). The introduction of Varroa destructor in 1987 sparked a second wave of importation as Varroa-resistant strains from Europe were introduced to augment US breeding programs. Notably, A. m. carnica “Yugo” honey bees were imported from Hungary in 1989. A decade later, between 1997 and 2002, hundreds of Varroa-tolerant hybrid queens from the Primorsky Region of eastern Russia were imported (Table S1).

In 1990, the first African-hybrid honey bees (AHBs; also called Africanized honey bees) became established in the United States of America. AHBs are the descendants of 36 mated A. m. scutellata queens sourced from Tanzania and Transvaal, South Africa (Winston 1992). Their offspring mated with the Brazilian honey bee population (likely M lineage: A. m. iberiensis) (Crane 1999; Wallberg et al. 2014; Whitfield et al. 2006), escaped captivity, and spread north and south (Calfee et al. 2020; García et al. 2018; Winston 1992). Since 1990, AHBs have spread to ten southern US states: Arizona, Arkansas, California, Florida, Louisiana, Oklahoma, Nevada, New Mexico, Texas, and Utah. Their spread, unlike that of the subspecies described above, has largely been accomplished by natural swarming and not beekeeping (Bozek et al. 2018; Cridland et al. 2018; Pinto et al. 2004).

In 2004, the United States of America initiated honey bee trade with New Zealand and Australia. Hundreds of thousands of colonies were imported from Australia alone between 2004 and 2010. Trade ceased with Australia in 2010 over concerns of importing Apis cerana (Sheppard 2013). Currently, the USA allows live honey bee importations from New Zealand and Canada, and germplasm importations from Australia, Bermuda, Canada, France, Great Britain, New Zealand, and Sweden (USDA n.d.). Germplasm from other sources can be used with USDA permission. Since 2009, there have been at least 11 importations of germplasm to Washington State from Germany (A. m. carnica, 2008, 2009), Italy (A. m. ligustica, 2008, 2009, 2010, 2012) (Cobey et al. 2015; Sheppard 2012), Georgia (A. m. caucasica, 2010, 2011) (Sheppard 2012), Slovenia (A. m. carnica, 2011) (Sheppard 2012), and Poland (A. m. carnica, 2018) (Cobey et al. 2019).

3 Unravelling honey bee management history in the age of genomics

While this record, like any archeological record, is incomplete and contains several caveats (Supplemental text; Table S1), historic primary sources are informative to our understanding of honey bee demography, genetic diversity, and differentiation across the country. Above, we documented that, since 1622, at least nine honey bee subspecies have been imported from at least four of the five honey bee lineages. By most accounts, the imported honey bees were distributed en masse to beekeepers across the country (Table S1; S2), with some spread by swarming (Table S1; Fig. 1). These actions by beekeepers over the course of several hundred years have likely led to a genetically diverse, highly admixed, and genetically structured population of honey bees across the country. We continue this review by intersecting the historic record we have compiled with the genetic studies performed in honey bee populations in the United States of America.

To our knowledge, there have been 23 population genetic studies in 22 US states that examined feral and/or managed honey bees (Fig. 3; Table S3). Most of these studies have focused on mtDNA and microsatellites with only four whole genome re-sequencing studies (Bozek et al. 2018; Calfee et al. 2020; Cridland et al. 2018; Mikheyev et al. 2015). Sample sizes have varied immensely. For example, most states have < 50 honey bees sequenced, while California has 606 and Texas has 1022 (Table S3). The geographic area sampled during these studies has also been inconsistent. While Hawaii (Szalanski et al. 2016b), Utah (Cleary et al. 2018), and California (Cridland et al. 2018) have been sampled statewide, the honey bees of Powdermill Creek Nature Reserve (4.5 km2) (Rangel et al. 2020) and Arnot Forest (17 km2) (Seeley et al. 2015) are the only feral honey bees sampled from Pennsylvania (total land area = 119,280 km2) and New York (total land area = 141,300 km2), respectively.

Figure 3.

Summary of locations, methods, and sample sizes of honey bee genetic studies in the United States of America, as well as lineages or subspecies (when identifiable) detected. Both managed and feral populations are included in each state, but there is considerable, if not complete, overlap between lineages and/or mitotypes found within a state’s feral and commercial colonies. The studies represented in this figure (as well as Table S3) specified the ancestry of individual samples. For example, Kono and Kohn (2015) used mtDNA to determine clines in A lineage ancestry in California, but did not release specific mitotypes associated with each collected honey bee. Overall, the figure highlights two major shortcomings in US honey bee genomics: small sample sizes (11 states have fewer than 11 samples) and biases towards either managed or feral colonies within states. Furthermore, despite the documented presence of AHBs in Louisiana, Alabama, Mississippi, and Georgia, sampling methods have missed them there.

4 Genetic diversity

4.1 Mitotypes from historically imported subspecies are still present in the United States of America

Mitochondrial studies currently provide the broadest look at the genetic diversity of honey bees across the USA, both because of the sheer number of such studies (Table S3) and because we have yet to assemble genomic data for all honey bee subspecies. Genomic studies of honey bees in Africa and Asia are especially needed (Harpur et al. 2014; Wallberg et al. 2014). Mitochondrial studies have demonstrated that seven of the expected nine subspecies imported into the USA have mitotypes present within the USA today (Table S3; Fig. 3). The two most common mitotypes in the United States of America are C1 (A. m. ligustica) and C2 (A. m. carnica): C1 has been found in twenty-three states and C2 in sixteen (Table S3). Less common mitotypes that have been documented include those from A. m. lamarckii (Pinto et al. 2004, 2007; Rangel et al. 2016; Schiff and Sheppard 1993) and O2 from A. m. syriaca (Magnus et al. 2014), which were only briefly imported into the USA (Fig. 2). The A. m. lamarckii mitotype was reported in the southern states of Arizona, Texas, Mississippi, Alabama, and Georgia, and the O2 mitotype, identical to a Lebanese A. m. syriaca mitotype (Magnus et al. 2014), was found in Texas and California. Because there have been no documented importations of A. m. syriaca since 1881 or of A. m. lamarckii since 1869 (Table S1), these mitotypes may be remnants of the initial importations or of more recent, unreported importations.

4.2 Genomic data suggest high standing genetic diversity in US honey bees but cannot yet identify where that variation came from

The number of mitotypes in the USA and the number of subspecies imported historically might suggest that the honey bees of the USA are highly genetically diverse. However, mitotypes alone give an incomplete picture of how much genetic diversity exists in populations, and honey bees in the USA have undergone at least three major bottlenecks as a result of pests and parasites (Horn 2005; Sanford 2001; vanEngelsdorp and Meixner 2010). A more complete picture can be made using the nuclear genome. The large number of independent markers used in analysis (up to several million for honey bees) provides accurate estimates of genetic diversity for colonies and populations, can increase the accuracy of ancestral sorting, and can allow one to tease apart the roles of demography and selection in contributing to genetic diversity in regions across the genome (Calfee et al. 2020; Cridland et al. 2018; Harpur et al. 2012; Mikheyev et al. 2015; Saelao et al. 2020; Wallberg et al. 2014). There have been five such studies focusing on honey bees in the mainland United States (Bozek et al. 2018; Calfee et al. 2020; Cridland et al. 2018; Mikheyev et al. 2015; Saelao et al. 2020) (Fig. 3).

These studies provide a glimpse at how historic importations may have impacted modern genetic diversity on a nationwide scale. Furthermore, when paired with additional genomic data from either museum specimens (Cridland et al. 2018; Mikheyev et al. 2015) or representatives from the honey bee’s native range (Cridland et al. 2018; Harpur et al. 2014; Wallberg et al. 2014), it is possible to compare genetic diversity over space and time. One key takeaway from such studies to date is that genetic diversity within modern US populations is comparable to populations in the honey bee’s native range. Californian non-AHB feral and commercial populations have genetic diversity similar to European M and C lineage source populations (Cridland et al. 2018); AHB have similar or greater genetic diversity compared to native A lineage honey bees (Bozek et al. 2018; Cridland et al. 2018); and within Texas and Arizona, genetic diversity increased following the introduction and spread of introgression with AHBs despite the loss of genetic diversity from Varroa (Bozek et al. 2018). Comparisons of genetic diversity of commonly used US commercial stocks to populations in Europe also suggest similar if not higher levels of genetic diversity (Saelao et al. 2020). This observation comes with several critical considerations. First, genetic comparisons between modern US honey bee populations and modern populations in the honey bee’s native range occur over a spectrum of management intensities in both locations. Second, honey bees in their introduced range in the USA and their native range have experienced declines in genetic diversity due to Varroa mites (Bozek et al. 2018; Espregueira Themudo et al. 2020). Finally, only 13 honey bee subspecies have been sequenced to date (Chen et al. 2016; Haddad et al. 2018; Harpur et al. 2014; Wallberg et al. 2014, 2017; Wragg et al. 2018; Yunusbaev et al. 2019) and the sequencing has been performed using different sequencing technologies, complicating the direct comparison of data sets (Cridland et al. 2017). This has limited not only our ability to cleanly compare levels of genetic diversity but also our ability to estimate relatedness of samples in the USA to samples within the honey bee’s native range. Where this has been done, US honey bees are related to the C, M, and A lineages (Bozek et al. 2018; Cridland et al. 2018; Mikheyev et al. 2015). While this supports some of the expectations of the historic record, the limited availability of genome sequence from honey bees within their native range and the limited sampling of honey bees within each lineage make it difficult to pinpoint precisely which subspecies is responsible for this relationship. More work in this field through careful sampling and comparison would be welcomed. The information gained would be incredibly useful to our understanding of honey bee management history, and a complete picture of the level of genome-wide genetic diversity and its distribution across management scenarios and the landscape has yet to be put together.

5 Population structure and ancestry

5.1 Genomic data provide evidence of population structure, the existence of feral populations, and variation in ancestry across the United States of America

Differences in importation history and the extent of natural and artificial selection experienced by populations can contribute to genetic differentiation. Historic data suggest beekeepers readily shared material across the country and in great numbers, a pattern that continues today. For example, a single queen supplier may rear thousands of queens from a single colony and disperse them across the country (Delaney et al. 2009). What does this mean for genetic differentiation? Are there distinct populations (e.g., East vs West; Feral vs Commercial) across the country or do honey bees form a single, contiguous population? How has selection shaped genetic differentiation in populations across the country?

Mitochondrial studies are, again, insightful. Clear differences in mtDNA haplotype frequency have been demonstrated between the Western and the Southeastern US (Delaney et al. 2009). Furthermore, mtDNA studies have suggested mitotype segregation by climate. O2 has been described as identical to a Lebanese A. m. syriaca mitotype (Magnus et al. 2014) and is found exclusively in states with mild winters (i.e., winter average temperatures above 12 °C; California and Utah) (Cleary et al. 2018; Magnus et al. 2014). Similarly, A. m. lamarckii mitotypes have only been identified in Alabama, Mississippi, and Texas, three states with similarly mild winters (Pinto et al. 2004; Schiff and Sheppard 1993; Schiff et al. 1994). Historic records indicate that beekeepers were using A. m. syriaca in Los Angeles as early as 1881 (Osborn 1881), only a year after its initial importation into the country (Table S2). There may also be some structural differences between mainland and island portions of the US. Hawaiian honey bees do not contain unique mitotypes but as much as 35% of feral Hawaiian honey bees have M lineage mitotypes (Szalanski et al. 2016b), as compared to an estimated continental average of 7% (Magnus et al. 2014).

Nuclear genomic studies, though less extensive, suggest similar population structure across the country. In California, Avalon Island is significantly differentiated from mainland Californian populations (Cridland et al. 2018). Populations in Texas and Arizona are currently genetically distinct and have been even prior to the spread of AHB (Bozek et al. 2018). Recent work by Saelao et al. (2020) showed that there is some evidence of genetic differentiation among commonly used honey bee stocks. Given the limited spatial sampling of genomic studies so far (Fig. 3), more work is certainly needed to quantify and understand how distinct honey bee populations across the country are and what drives that distinction.

One important contributor to differentiation in the mainland US has been AHBs. The AHB invasion, unlike those of the managed honey bee subspecies above, has largely been accomplished by natural swarming (Bozek et al. 2018; Cridland et al. 2018; Pinto et al. 2004). AHBs are highly genetically distinct from other honey bees in the USA, having both distinct mitotypes (Cleary et al. 2018; Darger 2013; Szalanski and Magnus 2010) and nuclear genomic variation (Kadri et al. 2016). Furthermore, AHBs have become established across much of the southern United States, where they demonstrate an ancestry cline between the previously introduced subspecies (largely C and M lineage “European honey bees,” or EHBs) and A. m. scutellata. In California, average AHB ancestry is highest at the southern border and almost nonexistent 750 km away at 38° N latitude. The size and smoothness of this cline suggest there will not be a hard border between EHBs and AHBs, but rather a large hybrid zone extending hundreds of kilometers. Within this area, AHBs will exhibit a variety of intermediate phenotypes selected by climate variation (Calfee et al. 2020).
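The shape of such an ancestry cline can be illustrated with a simple logistic function of latitude. The sketch below is only an illustration under stated assumptions: the logistic form, the function name, and the cline center and width are hypothetical values chosen to show a smooth decline towards 38° N, and they are not the model or parameter estimates of Calfee et al. (2020).

```python
import math


def expected_a_ancestry(latitude_deg, center_deg, width_deg, max_ancestry=1.0):
    """Expected A-lineage ancestry at a latitude under a logistic cline.

    Ancestry declines northward: it approaches max_ancestry well south of
    center_deg, equals max_ancestry / 2 at center_deg, and approaches zero
    well north of it. width_deg sets how gradual the transition is.
    """
    if width_deg <= 0:
        raise ValueError("width_deg must be positive")
    return max_ancestry / (1.0 + math.exp((latitude_deg - center_deg) / width_deg))


if __name__ == "__main__":
    # Hypothetical cline parameters, for illustration only.
    center, width = 34.0, 1.0
    for lat in (32.5, 34.0, 36.0, 38.0):
        print(lat, round(expected_a_ancestry(lat, center, width), 3))
```

A wide cline (large width) corresponds to the broad hybrid zone described above, whereas a width approaching zero would correspond to a hard border between EHBs and AHBs.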

One particularly exciting avenue of exploration is identifying feral honey bee populations. Feral populations, defined here as genetically distinct honey bee populations living outside the confines of management, have existed historically in the United States of America but were thought to have been largely extirpated by Varroa (Kraus and Page 1995; Loper et al. 2006). However, feral populations likely still exist within the US. There is substantial evidence of mitotypes restricted to wild-caught colonies (Magnus and Szalanski 2010; Pinto et al. 2007; Schiff et al. 1994), variation in ancestry across the genome between wild-caught and nearby commercial colonies (Cridland et al. 2018; Kono and Kohn 2015; Mikheyev et al. 2015), and lower genetic diversity relative to nearby managed populations (Cridland et al. 2018; López-Uribe et al. 2017). In New York, feral honey bees collected in 2010 show greater similarity to honey bees collected in the same area 33 years previously than to modern commercial honey bees (Mikheyev et al. 2015). However, a microsatellite study examining feral and commercial honey bees from North Carolina found small but significant differentiation between the populations. The study concluded feral honey bees could be escapees from nearby managed colonies (López-Uribe et al. 2017). Therefore, significantly distinct feral colonies are not likely to exist everywhere. Instead, there may be a “hybridization zone” between managed and feral populations, as is the case when any domesticate is kept near its wild counterpart (Gering et al. 2015; Khosravi et al. 2013; Nussberger et al. 2013; Quilodrán et al. 2019), or when AHBs and preexisting honey bee populations collide (Calfee et al. 2020). Truly distinct feral colonies have been demonstrated to exist in wildlife refuges, teaching forests, remote islands, and other areas with little human habitation and no beekeeping (Bozek et al. 2018; Mikheyev et al. 2015). The number of such populations across the country, their origins, and how selection has shaped levels of genetic diversity are all significant outstanding questions.

6 Conclusions and the future of honey bee population genomics

Despite the extensive sequencing efforts conducted to date (Table S3), many gaps in our knowledge remain, and these become even more apparent when the genetic data are compared to the historical record. There are obvious gaps in sampling across the country. Almost all existing genetic studies have been conducted in the southern United States and therefore likely underestimate genetic diversity across the country (Fig. 3; Table S3). This is also true of honey bees in their native ranges: only 13 of 29 subspecies have been sequenced (Chen et al. 2016; Haddad et al. 2018; Wallberg et al. 2014, 2017; Wragg et al. 2018). This incomplete record, in both areas, prevents us from fully identifying the origins of haplotypes present within US populations. For example, several sources record the importation of A. m. intermissa (Pratt 1891) and A. m. cypria (Jones 1880c) (Tables S1, S2), but neither mitochondrial nor nuclear DNA evidence has yet verified them (Table S3). Likewise, the increased relatedness of New York feral colonies to A. m. yemenitica and A. m. scutellata in Mikheyev et al. (2015), despite the lack of a historical record tying these subspecies to that location, may indicate either an incomplete historical record or an incomplete set of reference genomes from honey bees within their native range. Expanding sequencing efforts across both the USA and the honey bee’s native range would open the door for massive, continent-wide analyses of population structure, phylogeography, and local adaptation.

Given that beekeepers imported, distributed, and experimented with at least nine subspecies in the USA, and that some of these honey bees escaped and spread naturally, we predict that future genomic work will find haplotypes originating from these introductions that contribute to the phenotypic and genetic variation of honey bees in the US. We have yet to fully explore how much genomic variation from introduced subspecies is still present in US honey bees. We may expect some of those haplotypes to be segregating today as a result of selection or drift. How these haplotypes segregate across the country, where they originated, and how they contribute to phenotypic variation are all questions still left to be answered. Recent population genomic studies have already begun to show how haplotypes of different origins may contribute to phenotypic variation in admixed populations. In Puerto Rico, haplotypes originating from C lineage populations are likely contributing to reduced defense response in AHBs (Avalos et al. 2014). There is similar evidence in Brazil for European haplotypes contributing to colony-level defense response (Harpur et al. 2020). In California, there is clear clinal variation in ancestry at specific regions of the genome that is maintained by selection (Calfee et al. 2020). We hypothesize that heritable phenotypes of economic value within the USA and other admixed honey bee populations around the world are likely underpinned by haplotypes originating from historic importations.

Honey bees stand at the unique juncture of wild and domesticated. Genomics is a versatile field suitable for studying honey bees in both capacities through further investigations into genetic diversity and selective pressures. However, genomics alone cannot unravel honey bees’ complicated management history, especially in introduced ranges with rampant gene flow among disparate subspecies. Historical evidence can provide a scaffold for future genomic hypotheses; conversely, genomics can be used to support previous historical evidence or to open avenues for new historical studies.