Introduction

Horsegram (Macrotyloma uniflorum (Lam.) Verdc.) is a hardy pulse crop of the semi-arid tropics that has been poorly studied. Despite its current and historical importance to the diet of a large part of the population in India, there are entrenched biases against horsegram, as it is considered a low-status food of the poor, particularly in southern India (Kadam et al. 1985; Ambasta 1986, 181). Smartt (1985, 299) remarked that “[t]here has been remarkably little incentive to study domestication and evolution of horse gram”. Indeed, very little agronomic research has been done on this crop (Yadav 1992). The limited scientific knowledge of this crop is reflected in its status in textbooks, even those produced in India, its main country of production (Fig. 1). Horsegram has received far less research attention than pulses of higher status, such as Indian Vigna (V. radiata (L.) Wilczek, V. mungo (L.) Hepper) or pigeonpea (Cajanus cajan (L.) Millsp.). Whilst both the Indian Vigna spp. and Cajanus have received monographic studies of wild relatives (Tomooka et al. 2014; Khoury et al. 2015; Mallikarjuna et al. 2011; van der Maesen 1986) and genetic studies of relationships with wild relatives (Lee 2013; Xu-xiao et al. 2003; Zong et al. 2003; Aruna et al. 2009; Kassa et al. 2012; Saxena et al. 2014), only recently has small-scale genetic research been conducted on horsegram (e.g. Sharma et al. 2015). Horsegram earned its common English name because it has been used as fodder for horses and cattle for centuries (Watt 1889–1893), often in India as a supplement to the bulky straw fodders used (Nezamuddin 1970, 321), and was less often eaten by the British or higher-status Indians. Despite such prejudice, horsegram ranks among the most important pulse crops of India (Fig. 2).

Fig. 1

Quantitative comparison of knowledge based on textbook coverage of Indian pulses in a selection of agricultural reference books published in India (Kachroo and Arif 1970; Yadav 1992; Sundararaj and Thulasidas 1993; Lokeshwar 1997)

Fig. 2

A comparison of production of nine major pulse crops of India, in terms of estimated area sown and annual yield in tonnes, based on Lokeshwar (1997). Inset: estimated percentage of pulse cultivation area devoted to different species in India according to Randhawa (1958)

Indeed, horsegram is the fifth most widely grown pulse species in modern India (Fig. 2). It is amongst the most ubiquitous archaeological pulse finds (Fig. 3), indicating that it has been of widespread importance since the Neolithic period. It is especially important on the Indian peninsula in the Dravidian-speaking states of Tamil Nadu, Karnataka and Andhra Pradesh (Nezamuddin 1970, 321; Sundararaj and Thulasidas 1993, 159). Tamil Nadu and Andhra Pradesh together account for nearly 90% of the total Indian acreage under this crop. Annual yields of horsegram are low given its area of production, which may be due in part to its use on fields with poor agronomic conditions, but this may also reflect in part a bias against research on and improvement efforts devoted to this crop. It would appear that horsegram’s importance declines as one moves north (Lokeshwar 1997). Nevertheless, it is also cultivated, on a smaller scale, in Pakistan, Bangladesh, Nepal, and Myanmar (Spate and Learmonth 1967). It is reported to be grown in the northwest Himalayas up to ca. 2000 meters and in the eastern Himalayas (Sikkim) up to at least 1000 meters (Atkinson 1882; Watt 1889–1893) and in recent times in Australia, Taiwan and the Philippines as a fodder crop. It was introduced in colonial Southeast Asia as a fodder crop (Burkill 1966), although archaeological evidence indicates that it had previously been produced in peninsular Thailand for at least a few centuries, ca. 300 BC–AD 100 (Castillo et al. 2016). This raises the question as to whether cultivation of this crop was formerly more widespread in Southeast Asia.

Fig. 3

Archaeobotanical ubiquity in South Asia (percentage of archaeological sites with pulses present) based on Fuller and Harvey (2006). Total number of sites/site phases = 124
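The ubiquity measure used in Fig. 3 is a simple presence percentage, and can be illustrated with a minimal sketch. The site names and presence records below are hypothetical placeholders, not the dataset of Fuller and Harvey (2006), and the function name `ubiquity` is our own label for illustration.

```python
def ubiquity(site_taxa, taxon):
    """Percentage of sites/site phases in which `taxon` is recorded as present."""
    if not site_taxa:
        raise ValueError("no sites supplied")
    present = sum(1 for taxa in site_taxa.values() if taxon in taxa)
    return 100.0 * present / len(site_taxa)


# Hypothetical presence records (NOT the data behind Fig. 3).
example_sites = {
    "site_A": {"horsegram", "mungbean"},
    "site_B": {"mungbean"},
    "site_C": {"horsegram"},
    "site_D": set(),
}

# Horsegram is present at 2 of the 4 hypothetical sites.
horsegram_ubiquity = ubiquity(example_sites, "horsegram")
```

Because ubiquity counts presence only, it is insensitive to the number of seeds recovered at any one site, which is why it suits comparisons across unevenly sampled sites.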

Recent investigations, undertaken mainly by Indian researchers, have examined the genetic variability of horsegram in order to improve this crop (Bolbhat and Dhumal 2009; Dikshit et al. 2014; Dhumal and Bolbhat 2012; Bhardwaj et al. 2010, 2012, 2013a, b; Prakash et al. 2010; Varma et al. 2013; Sharma et al. 2015), particularly with regard to the genetic and biochemical properties of drought-tolerant variants in response to climate change and rapid population growth in India (Bhardwaj et al. 2013a; Bhardwaj and Yadav 2012; Morris et al. 2013; Reddy et al. 1998, 2008). Nevertheless, an evolutionary and historical perspective on this crop remains to be better developed.

The aim of the present paper is to start to redress the research imbalance of this important crop species by providing a comprehensive assessment of the evidence for the biogeographical dispersal, domestication and importance of horsegram in ancient times. We draw together all the available published archaeobotanical evidence, and provide quantitative evidence for the domestication process of this species, and a comparison of modern material. This study provides a baseline for further research into the origins and evolution of domesticated horsegram. In addition, we review inferences from historical linguistics that also highlight the long-term importance of this species in the agricultural systems of India, especially in the South.

Descriptive botany, nutrition and taxonomy

Botanically, Macrotyloma uniflorum, commonly known as horsegram, is an annual herb, growing to a height of 30–40 cm (Neelam et al. 2014, 17; Nezamuddin 1970, 322; Smartt 1985, 12; Sundararaj and Thulasidas 1993, 159). Recent studies on modern germplasm from Andhra Pradesh, India have revealed a wide range of phenotypic variation in this species (Neelam et al. 2014, 17). Unsurprisingly then, given its phenotypic plasticity, one of horsegram’s most important traits is its tolerance to a wide range of climatic and soil conditions (Kachroo and Arif 1970; Nezamuddin 1970, 321; Yadav 2002), even growing wild in the astringent soil of the eucalyptus forests of Queensland (Nezamuddin 1970, 322). In southern India horsegram is grown as a dry crop from August to October, in areas with less than 90 cm of rain annually, and as low as 40 cm, on mostly poor or lateritic soils, usually with no irrigation (Kingwell-Banham and Fuller 2014, 3490; Nezamuddin 1970, 321–322). It is considered native to the drier climatic tracts of India (Asouti and Fuller 2008, 67).

Along with horsegram’s catholic growing conditions, its main agrarian value lies in its multiple uses. It serves as a green manure, as its husks have excellent water-retaining capacities (Nezamuddin 1970, 321; Zaman and Mallick 1991); it has good soil retention abilities; and its short height allows it to be used as an understory crop, grown under taller crops such as sorghum (Sorghum bicolor (L.) Moench), pearl millet (Pennisetum glaucum (L.) R.Br.) or pigeonpea (Cajanus cajan) (Nezamuddin 1970, 321–322). Horsegram may be planted as a preparatory crop on new marginal land due to its nitrogen-fixing properties (Nezamuddin 1970, 322) and, advantageously, it is relatively free of pests and diseases. All these beneficial traits would have secured this pulse's place in cultivation since ancient times.

As an edible crop, horsegram is an excellent source of protein, carbohydrates, dietary fibre, and micronutrients (Jacobs and Steffen 2003; Yadav et al. 2004; Sangita et al. 2004). However, the use of horsegram flour has been limited by the anti-nutrient effects of phytate, tannins and trypsin inhibitors, which limit its nutrient value (Kawsar et al. 2008a; Sreerama et al. 2012, 462). Horsegram is regarded as having poor functional and expansion properties as a flour (Sreerama et al. 2008, 891), in contrast with several other Indian pulses. However, these same anti-nutrient phytochemicals are thought to have beneficial medicinal and nutraceutical properties (Muthukumara et al. 2014; Bhartiya et al. 2015; Prasad and Singh 2015). Moreover, processing horsegram into commercially viable food products, including composite pulse flour with cowpea and chickpea, has attracted growing interest from researchers and commercial food manufacturers (Abbas et al. 1984; Sreerama et al. 2012, 467; Khatum et al. 2013).

There has been a degree of taxonomic confusion over horsegram, reflected in the botanical, agronomic and recent archaeobotanical literature of India. Since Joseph Dalton Hooker’s Flora of British India (1879), Linnaeus’ Dolichos biflorus has been widely used as the scientific name for horsegram in the Indian floristic and archaeobotanical literature (e.g. Watt 1908; Gamble 1935; Kajale 1991; Saraswat 1992; Vishnu-Mittre 1989). However, the type material originally used by Linnaeus in 1753 was actually a catjang-type cowpea, now placed in the polytypic species Vigna unguiculata (L.) Walp. (Smartt 1985, 299). Thus, D. biflorus L. is an old synonym for the cowpea, Vigna unguiculata (or Vigna unguiculata subsp. unguiculata), a very different crop, with origins in Africa (D’Andrea et al. 2007; Fuller and Hildebrand 2013).

Unfortunately, this erroneous equation has persisted in some modern literature, leading to apparent reports of African cowpea where Indian horsegram is clearly implied by figures, descriptions, and English and Hindi names (e.g. Weber 1991; Reddy 1994; Devaraj et al. 1995; Kroll 1996). The archaeological finds traditionally described as D. biflorus should in fact be Dolichos uniflorus Lam., or its revised synonym Macrotyloma uniflorum (Lam.) Verdc. (Kingwell-Banham and Fuller 2014, 3490). Transferred from the heterogeneous genus Dolichos to Macrotyloma by the botanist Verdcourt (1970) (Smartt 1985, 298–299), horsegram is now in a genus that includes three economic plants: M. uniflorum, M. axillare (Meyer) Verdcourt (a fodder crop), and M. geocarpum (Harms) Maréchal & Baudet, the African groundbean or Bambara groundnut (Isely 1983, 492). Thus, Macrotyloma uniflorum (Lam.) Verdc. is used below as the correct name for Dolichos uniflorus Lam. and for mislabelled archaeological “Dolichos biflorus” (Fuller 2002; Kingwell-Banham and Fuller 2014, 3490).

Historical linguistics and the hypothesis of peninsular Indian origins

Historical linguistics reconstructs hypothetical ancestral languages based on what is essentially a phylogenetic approach to modern languages, based on shared innovations. This can allow for the reconstruction of words, both their past phonetics and their probable meaning, including the names for plants, for past periods when this is shared across several branches in a language family (e.g. Crowley 1997; Southworth 2005). It is also possible to infer ancient loanwords between languages. In India, except in the high Himalayas, there are three main language families (Indo-European, Austroasiatic [Munda], and Dravidian), and most common names across all of these families suggest a shared ancient name for horsegram, indicating deep cultural roots and ancient cultural knowledge of this crop that was transferred across languages (Fuller 2003, 2007a; Southworth 2005). We briefly summarize these data here, as they suggest an origin of this crop somewhere in the peninsular Indian region, and it can be suggested that knowledge of this crop, or at least its name, was transmitted from the early Dravidian speakers of peninsular India to early Indic languages (including Sanskrit) and Austroasiatic.

Based on regular sound correspondences between and across most Dravidian languages, Southworth (2005) has reconstructed an ancient word for horsegram, *kol- or *kol-ut. This is reflected in descended terms in many modern languages (Table 1), while two series of related words can be seen in the Indic languages, a branch of Indo-European, e.g. Sanskrit kulattha, and in the Munda group of languages found in the hills of eastern and central India, e.g. Juang kulto (Zide and Zide 1976; Fuller 2003; Southworth 2005). While the phylogeny of the Dravidian languages is reasonably well-established, it is more difficult to infer when different language sub-families diverged and where the speakers of past proto-languages lived. Written sources can help, but for Dravidian languages only Tamil, Kannada and Telugu have old written evidence and most of that is less than 2000 years old (Dravidian languages). Another approach to constraining the timing of divergence is to relate lists of reconstructed vocabulary for material things to when those things are known to be present archaeologically. Thus, for example, reconstructed vocabulary for iron metallurgy cannot date prior to the Iron Age, when such technology was first adopted in a region. In the case of South Dravidian languages, including the ancestral speech to modern Tamil and Kannada, vocabulary for iron metallurgy and textile production, especially from cotton, can be reconstructed (Fuller 2008, 2009). This places this ancestral speech no earlier than ca. 1200 BC, and suggests these languages diverged sometime more than 2000 years ago (i.e. during the Iron Age). Crops provide another useful set of material terms, as their earliest occurrences can be inferred from archaeobotanical data, especially for species that are not native to a region, such as wheat and barley, which arrived in South India from elsewhere, or crops such as sorghum, which are of African origin (Fuller 2003, 2007b).

Table 1 Linguistic evidence for horsegram in South Asia.

In addition, knowledge of native flora, such as the names of trees, can indicate something of the ecological zones with which early language speakers were familiar, and in the case of early Dravidian we can reconstruct names for several major trees from the moist deciduous, dry deciduous and savanna vegetation of peninsular India (Fuller 2007a; Asouti and Fuller 2008). Furthermore, the familiarity of proto-Dravidian speakers with several kinds of pottery indicates that this whole language family has differentiated since the Neolithic, when pottery first developed in India (Southworth 2005; Fuller 2009). Thus, while Southworth (2005) had inferred a lower Godavari valley origin for Dravidian, based on the geographical centre of language diversity, the tree vocabulary suggests somewhere around the margins of the Deccan plateau, no further north or west than Gujarat and Rajasthan (Fuller 2007a). This fits with the hypothesis that the Neolithic of southern India dispersed with early pastoralism (and pottery) from Gujarat through the open woodlands of the Deccan starting ca. 3000 BC (Fuller 2011). A more westerly Indian origin also fits with more recent arguments that Proto-Dravidian languages are more distantly related to the Elamite languages of the Iranian plateau (Southworth and McAlpin 2013). Horsegram would perhaps be among the earliest cultivars of the Dravidian-speaking Neolithic of peninsular India, alongside native millets like Brachiaria ramosa (L.) Stapf. and the mungbean (Vigna radiata) (Fuller et al. 2004).

Archaeological and botanical evidence for origins

Archaeobotanically, Macrotyloma is widely reported from Chalcolithic and Neolithic sites, with candidates for the earliest occurrences from Khujhun, in the Vindhyan plateau (Kajale 1991; Saraswat 1992), the Harappan site of Burthana Tigrana in Haryana (Willcox 1992) and Southern Neolithic sites of Andhra and Karnataka (Fuller et al. 2011; Kajale 1991, 1998). Thus, this pulse is well represented by archaeological finds across India, from the mid or late third millennium BC onwards. However, the regional origins of this pulse have been obscure, as wild progenitor populations have been poorly studied in South Asia and have never been described in the floristic studies of India (Fuller 2002). Nevertheless, Fuller and Harvey (2006) inferred a likely South Indian origin, and perhaps a separate northwest Indian origin, based on the distribution of archaeobotanical evidence and a limited assessment of herbarium specimens from the Botanical Survey of India, Pune. The Haryana region, the Gujarat region and the south Deccan are all plausible foci of early cultivation or domestication (Kingwell-Banham and Fuller 2014, 3492).

Materials and methods

The present study expands upon this earlier work through four lines of evidence. First, we have surveyed further herbarium specimens of likely wild progenitor populations held in herbarium collections from Kew and the London Natural History Museum (Fig. 4). These provide some augmentation of the distribution of wild populations, which can be combined with the extent of climatic conditions similar to where these have been found. Second, we provide an extensive baseline study of seed size in modern domesticated and wild horsegram, which provides a basis from which to infer the domesticated status of archaeological horsegram based on seed measurements. Third, we summarize current seed size data from archaeological specimens, which allows us to infer the time period(s) of horsegram domestication in or near likely regions of origin (Fig. 5). Fourth, we provide an updated database on the archaeological occurrence of horsegram in time and space (Fig. 6), which allows us to identify those regions in which it occurs earliest and which are therefore more likely to be at or close to the region(s) of initial cultivation and domestication.

Fig. 4

Herbarium specimen originally labelled as Dolichos biflorus with seeds in pouch (Image can be found at http://specimens.kew.org/herbarium/K001092968)

Herbarium collections of Macrotyloma were surveyed including those from South Asia and from Africa. In 2004 one of us (DQF) studied collections held in the Botanical Survey of India herbaria in Pune and Calcutta, which provided 6 localities in India, mostly associated with the drier savanna belt, where wild horsegram had been collected (Fuller and Harvey 2006). This is augmented in the current study by 31 additional occurrences (see results, below). M. uniflorum var. stenocarpum (Brenan) Verdc. are by definition wild specimens, but we also included those that had been identified as M. uniflorum var. uniflorum but have dehiscent (wild-type) pods and/or occurred in wild rather than in cultivated habitats. In addition, where seeds were visible on the herbarium specimen, or loose in attached pouches, these were measured. While Macrotyloma uniflorum occurs wild in Africa, and three subspecies have been described (Verdcourt 1970, 1971), these have never apparently been domesticated. Therefore, their seed metrics provide a useful baseline for wild size range which augments the more limited materials of wild horsegram from India.

Morphometric measurements were undertaken on the length, width (Fig. 5), thickness and hilum length of modern populations of horsegram, including all 3 wild subspecies (M. uniflorum var. stenocarpum (Brenan) Verdc., M. uniflorum var. verrucosum Verdc., and M. uniflorum var. benadirianum (Chiov.) Verdc.) and also sister species of horsegram including M. axillare (E. Mey.) Verdc. and M. ciliatum (Willd.) Verdc., from several reference collections including UCL, Mediterranean and Near Eastern Reference Collection, Royal Botanic Gardens, Kew, Economic botany collection and herbarium specimens, along with additional requested germplasm kindly supplied by the USDA (Table 2). We have gathered from the literature all the available published measurements of archaeological horsegram and have augmented these with measurements from our own archaeological collections (Table S6). Further analysis of archaeological seed metrics is currently ongoing and will be fully published elsewhere (Murphy and Fuller, in prep.).

Fig. 5

Measurements taken using archaeological specimens of horsegram for length (mm) of the longest point and width (mm) across the hilum at the widest point

Table 2 Modern specimens of horsegram measured
Fig. 6

Map of identified archaeological horsegram in South Asia (see Table S4 for further details). Sites numbered: 1 Rohira 2 Masudpur VII 3 Banawali 4 Kanmer 5 Balu 6 Kunal 7 Farmana 8 Masudpur VII 9 Jhusi 10 Ahirua Rajarampur 11 Kayatha 12 Paiyampalli 13 Hallur 14 Rojdi 15 Bahola 16 Kaothe 17 Watgal 18 Bahola 19 Hattibelagallu 20 Kurugodu 21 Mitathal 22 Hiregudda 23 Sanganakallu 24 Sanghol 25 Daimabad 26 Inamgaon 27 Piklihal IIIA 28 Senuwar 29 Hallur 30 Hanumantaraopeta 31 Hulas 32 Injedu 33 Peddamudiyam 34 Singanapalle 35 Tekkalakota 36 Apegaon 37 Tokwa 38 Ojiyana 39 Tuljapur Garhi 40 Golbai Sassan 41 Gopalpur 42 Harirajpur 43 Malhar 44 Piklihal IIIB 45 Narhan 46 Bahola 47 Charda 48 Narhan 49 Kadebakele 50 Ahichchhatra 51 Piklihal IIIB/IV 52 Ter (Thair) 53 Adam 54 Noh 55 Saunphari 56 Khao Sam Kheo 57 Paithan I 58 Nevasa 59 Veerapuram 60 Phu Khao Thong 61 Kodumanal 62 Perur 63 Bhagimohari 64 Sanghol 65 Mantai 66 Hund 67 Chungliyimti 68 Khezhakeno 69 Vikrampura 70 Khusomi 71 Ludwala (Mangali Ludwala) 72 Loteshwar

Results

Distribution and ecology of wild horsegram

Wild horsegram has received little attention from botanists working in India or from crop geneticists. However, relatively recently a separate species, Macrotyloma sar-garhwalensis, was proposed by Gaur and Dangwal (1997), although its scientific name is still considered unresolved (www.theplantlist.org). It was found at, and named after, the type locality of the village of Sara in the Garhwal Himalaya (Pauri District), Uttarakhand, India. It is commonly found near the edges of crop fields (Gaur and Dangwal 1997, 283; Gaur 1999), and can be expected up to elevations of about 1500 meters. Germplasm collections, for example that of the USDA, include only cultivated material, and while we relied on this extensively for measurements, we have had to turn to older herbarium collections, both to infer where wild populations have been encountered in the past and to provide measurements on wild seeds (Table 3). We can now provide an updated map of wild occurrences in India (Fig. 7); shading indicates areas that might be considered ecologically plausible zones for wild horsegram now or in the past, prior to habitat destruction through agricultural occupancy and pastoralism. The wild habitat of horsegram appears to be in the dry evergreen open woodlands (Acacia and Albizzia dominated), which represent India’s savannah vegetation (Asouti and Fuller 2008). Wild Macrotyloma uniflorum is reported from similar bioclimatic zones in Africa (Verdcourt 1971). As horsegram is an excellent fodder crop, it is likely that the spread of domesticated animals since the Neolithic has greatly reduced wild populations. In addition, in the recent Flora of Mizoram M. uniflorum is noted as a “common species in open places” (Singh et al. 2002, 485), and as this is not described as cultivated it is possible that some wild populations extend to north-eastern India and even to adjacent Myanmar. On the other hand, it is also possible that these represent feral populations derived from ancient crops. Targeted fieldwork to study wild horsegram is still needed (Fuller 2002, 485).

Table 3 Wild Populations of horsegram examined
Fig. 7

Map of distribution of wild populations of horsegram based upon data from Table 5 (and Table S4), and including M. sar-garhwalensis

Macrotyloma metrics: a modern baseline for domestication studies

Domestication in horsegram, as in other pulses, should involve loss of wild seed dispersal (i.e. retention of seeds in the pod through non-dehiscence), loss of seed dormancy and germination inhibition, and changes in seed dimensions (Zohary et al. 2012, 76; Smartt 1990; Fuller 2007b). Pods, however, have never been recovered archaeologically. Changes in seed coat thickness that might relate to dormancy require further study (Murphy and Fuller 2017). Another related domestication trait that we observed in modern collections is testa colour. Modern horsegram populations are polymorphic in testa colour, including non-cryptic colours (red and white) and mottled patterns, whilst wild populations have uniformly black to very dark brown seeds. However, because the archaeological grains are preserved by charring they all appear black and the original colour is not preserved. As far as we can determine under light microscopy and scanning electron microscopy, the changes in colour do not correlate with any obvious structural changes in the seed coat, and therefore this trait cannot be used to examine domestication in archaeological finds. Therefore, of all these traits, grain size increase is currently the most tractable archaeobotanical trait (Fig. 8).

Fig. 8

Scatterplot of modern Macrotyloma seeds, width (mm) plotted by length (mm)

Details of modern seed metrics

Morphometric measurements of modern specimens showed a great deal of intra-species variation, which may be attributable to the large geographical area under study. As expected, wild modern specimens of Macrotyloma axillare, M. uniflorum var. stenocarpum, and M. uniflorum var. verrucosum showed a much smaller seed size than modern domesticated horsegram (Fig. 9). If the comparison is restricted to the wild subspecies of M. uniflorum versus domesticated forms, overlap between the two forms is minimal and separation is feasible.

Fig. 9

Frequency histogram of seed width measurements in wild seeds (light/red) (n = 86) versus domesticated seeds (dark/blue) (n = 1000). Box plots compare width, length and thickness of seeds of wild versus domesticated forms; also shown are measurements adjusted for 20% shrinkage to approximate the effects of charring, the main form of archaeological preservation. (Color figure online)
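The 20% shrinkage adjustment mentioned in the caption of Fig. 9 amounts to scaling each modern measurement by 0.8 before comparing it with charred archaeological seeds. The following is a minimal sketch of that arithmetic, assuming a uniform reduction across all dimensions; the function names and the example seed dimensions are hypothetical and are not taken from our dataset.

```python
CHARRING_SHRINKAGE = 0.20  # 20% reduction, the figure quoted in the Fig. 9 caption


def adjust_for_charring(value_mm, shrinkage=CHARRING_SHRINKAGE):
    """Scale a modern measurement (mm) to approximate its charred equivalent."""
    if not 0 <= shrinkage < 1:
        raise ValueError("shrinkage must be a fraction in [0, 1)")
    return value_mm * (1.0 - shrinkage)


def adjust_seed(seed):
    """Apply the same shrinkage to every dimension of one seed record."""
    return {dim: round(adjust_for_charring(v), 2) for dim, v in seed.items()}


# Hypothetical modern seed (mm); illustrative values only.
modern_seed = {"length": 5.0, "width": 4.0, "thickness": 2.0}
charred_equivalent = adjust_seed(modern_seed)
```

Treating shrinkage as a single uniform factor is a simplification made for this sketch: real charring may affect length, width and thickness to different degrees.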

Seed metrics and pulse domestication in India

There have been discussions as to whether size increase occurs during the initial stages of domestication (Purugganan and Fuller 2009, 2011). Recent comparative studies suggest that in general seed size increase in pulses and other seed crops occurs during the same domestication episode that saw the evolution of other domestication traits (Fuller et al. 2014; Moles et al. 2007). Nevertheless, in South Indian mungbeans (Vigna radiata) seed size increase is marked in the Second Millennium BC (Fuller 2011; Fuller et al. 2014), and occurs after the introduction of mungbeans to the Ganges valley in this period (Fuller and Harvey 2006; Fuller 2007a, 903, 915–916). Although mungbeans were present earlier in the eastern Harappan zone, available data on seed size suggest that these were already enlarged and fully domesticated, whereas the mungbeans initially introduced to the Ganges plain were still small-seeded and in the wild size range, like those in the earlier Neolithic/Chalcolithic of the Deccan. It can be suggested that a separate trend of seed enlargement for mungbean took place in the Ganges region slightly faster than the late Neolithic domestication processes documented in South India (Fuller and Harvey 2006).

Distribution of archaeological horsegram

Wild horsegram (Macrotyloma uniflorum var. stenocarpum) is native to the Acacia thickets ranging from the Aravalli hills in Rajasthan, through Gujarat and the savannahs of the Southern Peninsula (Asouti and Fuller 2008; Fuller and Murphy 2014), as well as the margins of dry deciduous woodlands, possibly extending to hills in central and eastern India (Fig. 7). It may be that these wild progenitors of horsegram were more widespread during the mid-Holocene climatic wet phase. A key unresolved question is whether wild horsegram populations ever extended west into the Saurashtra peninsula region, south of the Thar Desert and west of the Aravalli hills. A subsequent reduction in the availability of wild horsegram, during the aridification that began in the later Fourth Millennium BC (Ponton et al. 2012; Prasad et al. 2014), may be connected to its domestication and the emergence of the Southern Neolithic, as hunter-gatherers began to collect and artificially concentrate patches of horsegram in their seasonal rounds (Fuller and Korisettar 2004; Asouti et al. 2005; Fuller 2006; Fuller and Murphy 2014; Murphy and Fuller 2016).

The earliest archaeological finds of horsegram come from three regions of India: the northwest in Haryana state, the western part of Gujarat (the Saurashtra peninsula), and the south Deccan (Karnataka). In the north-west the earliest reported horsegram is from sites of the Early Harappan period (3000-2600 BC), including Balu, Banawali and Masudpur VII (Bates 2015; see Table S4 for further primary sources). The evidence from Gujarat consists of a single find from Loteshwar, a site also known for early millet cultivation and pastoralism by the early part of the Third Millennium BC (García-Granero et al. 2016). Of note is that this site falls outside our expected distribution of wild Macrotyloma uniflorum var. stenocarpum. While it is possible that wild populations did extend through Saurashtra in the past, as this region does have a dry deciduous tropical and savannah flora in common with the Deccan, on current evidence this could indicate an early translocation into Gujarat from wild habitats further east. If so, despite the small size of this find (see Table S2), it can be regarded as a potential early cultivar. The third focus of early finds is in the South Deccan Neolithic. While most finds in this region are securely dated only to after 2000 BC, horsegram was found at Watgal, which was occupied from as early as 2800 BC (Fuller et al. 2007). Unfortunately, the stratigraphic position of these finds within Watgal is not reported, and as the site has a long sequence (until perhaps 1000 BC), it remains unclear how ancient this horsegram was.

Recent archaeobotanical sampling in the Deccan plateau of South India, a large, arid region featuring rich Neolithic period remains (Fuller 2002; Fuller et al. 2004, 2007a, b; see also, Bellwood 2005; Balter 2007) has shown that some of the earliest Southern Neolithic crop domesticates appear to have been locally domesticated. One of the staple crops of the Southern Neolithic (which falls within the territory of the modern states of Karnataka, Andhra Pradesh, and parts of Tamil Nadu) is the native domesticate horsegram (Fuller 2011; Fuller et al. 2014). In this region, published measurements have been augmented by our own measurements of specimens from Tekkalakota, sites of the Kunderu River, and Gopalpur (Fig. 10). What these data indicate is that seed length and width are smaller in the earliest populations and appear to increase around the middle of the Second Millennium BC, suggesting a domestication that started by at least 4000 years ago. Size increase is evident by around 3500 years ago (1500 BC) and finished by 3000 years ago (1000 BC). Golbai Sassan from eastern India (Odisha) is included on this graph, although it lies in a culturally distinct geographical zone, but could relate to dispersal of early cultivars from the Deccan.

Fig. 10

Seed size (length, top, and width, bottom) in archaeological assemblages of horsegram (Macrotyloma uniflorum) from the Deccan Plateau region of India, including eastern India (Odisha state), indicating mean, standard deviation, maximum recorded size (+) and minimum recorded size (−). All archaeological specimens are preserved carbonized (by charring). Modern wild and domesticated comparisons are given with a correction factor of −20% to account for the probable effects of charring. Grey zone indicates the overlap zone of the largest 32% of wild specimens and the lower end of the domesticated range, and thus provides a visual baseline against which to judge increases in size over time. Dataset summarized in Table S6. (Color figure online)
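One way to read the grey overlap zone described in the caption of Fig. 10 is as the interval between the 68th percentile of the wild measurements (the start of the largest 32%) or the domesticated minimum, whichever is higher, and the wild maximum. The sketch below illustrates that reading only; it is an assumption about how such a zone could be computed, not a record of the procedure used for the figure, and the function name and sample values are hypothetical.

```python
import statistics


def overlap_zone(wild, domesticated, top_fraction=0.32):
    """Return (lower, upper) bounds of the wild/domesticated overlap, or None.

    Assumed reading: the zone runs from the start of the largest
    `top_fraction` of wild values (or the domesticated minimum, if higher)
    up to the wild maximum (or the domesticated maximum, if lower).
    """
    cut = round(100 * (1 - top_fraction))  # 68 for the default 32%
    if not 1 <= cut <= 99:
        raise ValueError("top_fraction must leave a percentile between 1 and 99")
    # quantiles(..., n=100) returns the 99 percentile cut points.
    wild_cut = statistics.quantiles(wild, n=100, method="inclusive")[cut - 1]
    lower = max(wild_cut, min(domesticated))
    upper = min(max(wild), max(domesticated))
    return (lower, upper) if lower <= upper else None


# Hypothetical charring-adjusted seed widths in mm (illustrative only).
wild_widths = [2.1, 2.3, 2.4, 2.6, 2.7, 2.9, 3.0, 3.2]
domesticated_widths = [2.8, 3.1, 3.4, 3.6, 3.9, 4.1]
zone = overlap_zone(wild_widths, domesticated_widths)
```

Archaeological assemblage means falling below the lower bound would sit within the typical wild range, while means above the upper bound would exceed anything recorded for wild seeds.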

In the Upper Indo-Gangetic alluvium, including the state of Haryana, the earliest known agricultural settlements date to the Early Harappan period, starting by ca. 3000-2800 BC. The archaeobotanical evidence suggests that winter and summer crops, including horsegram, were already both part of the agricultural system at such sites (e.g., Kunal, Balu, Banawali, Masudpur). Finds are also fairly frequent during the Mature Harappan period (2600-2000 BC) in this region, and available metrics, such as those from Balu, suggest that these seeds may be marginally larger on average than expected for wild populations, indicating that domestication processes had begun (Fig. 11). In the Ganges basin, the introduction of winter crops from the Indus valley, including wheat, barley and lentils, was accompanied by some pulses of Indian origin, including horsegram, which are present sometime around 2000-1800 BC (Fuller 2011), and by livestock (sheep, goat and zebu cattle). Fuller (2011) previously suggested that mungbean (Vigna radiata) in the Ganges might have been introduced from the Deccan to the south, although dispersal from Haryana to the west is equally possible. Horsegram metrics from this region, as well as those from the western states of Gujarat and Rajasthan, indicate that trends towards size increase were underway before 2000 BC (Fig. 11). Size increase appears to have been completed somewhat earlier in the northwest than in South India, and this distinct trend suggests that the domestication process was separate and may have begun earlier in north-western India, perhaps from wild populations in the western Himalayas that were brought down to the Indo-Gangetic plains in the Haryana region for cultivation.

Fig. 11
Seed size (length, top, and width, bottom) in archaeological assemblages of horsegram (Macrotyloma uniflorum) from Northern and Northwest India, including eastern India (Odisha state), indicating mean, standard deviation, maximum recorded size (+) and minimum recorded size (−). All archaeological specimens are preserved carbonized (by charring). Modern wild and domesticated comparisons are given with a correction factor of −20% to account for the probable effects of charring. Grey zone indicates the overlap zone of the largest 32% of wild specimens and the lower end of the domesticated range, and thus provides a visual baseline against which to judge increases in size over time. Dataset summarized in Table S6

Discussion

Our review of the evidence leaves no doubt that horsegram (Macrotyloma uniflorum) was domesticated in ancient India. The evidence of remnant wild populations today suggests that the wild progenitors were distributed in the semi-arid savannah or scrub woodland zones, including margins of tropical dry deciduous woodlands, of western and peninsular India, and also through parts of the lower slopes of the western Himalayas. This suggests two main wild distribution areas that were geographically separated, one in the savannah corridor of western and peninsular India, and one in the western Himalayas. A similar disjunct distribution has been identified for wild mungbean (Vigna radiata var. sublobata), although the latter occurs in wetter habitats (moist deciduous woodlands) (Fuller and Harvey 2006; Fuller 2007a, b). In the case of mungbean, two distinct domestication trajectories have been inferred, one in northwest India and one in the South (Fuller and Harvey 2006; Fuller 2011). The data presented in this paper suggest a similar pattern in horsegram. In north-western India, seed size increase took place during the Harappan to Late Harappan periods, whereas in South India it took place over the course of the Second Millennium BC during the later Southern Neolithic/Deccan Chalcolithic (Figs. 10, 11). These two domestication episodes plausibly derive from the disjunct wild progenitor populations, and we would therefore predict distinct genetic differences between wild and crop specimens studied with genomic techniques. Limited modern genetic data suggest two groups of horsegram (Sharma et al. 2015), although it is not clear whether these relate to distinct origins. More broadly, syntheses on the origins of agriculture in India recurrently find evidence for likely independent centres of plant domestication in north-western India and South India (Fuller 2011; Fuller and Murphy 2014; Kingwell-Banham et al. 2015).
In the case of South India, seed size increase in horsegram appears to start later (ca. 1500 BC) (Fuller 2011; Fuller et al. 2014), and this could be indicative of cultivation and domestication of horsegram starting earlier. This makes sense considering archaeological evidence that the Neolithic in peninsular India was initially focused on the driest savannah habitats, which were highly suited to pastoralism (Murphy and Fuller 2016), and that only subsequently did farmers push into adjacent tropical moist deciduous zones where mungbean was wild (Kingwell-Banham and Fuller 2012). The status of early horsegram in western Gujarat, whether available from the wild or cultivated, remains unresolved.

As we have demonstrated in this paper, horsegram has a long history of use in South Asia, although it has received disproportionately little research. Wild populations appear to be rare, possibly endangered, in India, and ought to be the focus of collecting, as they have the potential to expand the genetic basis of horsegram improvement. Horsegram’s derided status as a crop of the poor needs to be re-evaluated in the light of modern economic and agrarian realities and its potential medicinal, utility and nutritional properties. Today, horsegram is grown across tropical Africa, South Asia, Southeast Asia, China, the Americas and Australia (Kingwell-Banham and Fuller 2014, 3490), although outside of South Asia it is largely used as fodder, and within South Asia it is often relegated to consumption by the poorer classes. As one of India’s most ancient indigenous pulses, as well as one of its most stress-tolerant, horsegram merits further research on its origins, diversification and improvement, which can be expected to contribute to future agricultural sustainability.