Quantifying an online wildlife trade using a web crawler

Legally protected plants are illegally traded through online sales platforms, and orchids are a significant component of this wildlife trade. This study focused on salep, a compound product made from wild-collected orchid tubers from several genera (including Anacamptis, Dactylorhiza, Himantoglossum, Ophrys, Orchis and Serapias) whose harvest endangers some of the species used, despite their collection and sale being restricted by national and international legislation. Using a custom-designed web crawler in combination with DNA barcoding of a subset of products, over 18 months we detected 1942 items of salep sold at a total value of US$ 37,775, estimated to be equivalent to 90,000 to 180,000 wild orchids being destructively harvested. Wild-harvested tubers traded at a value of $0.21 each, whereas equivalent cultivated orchids have a market price of $16–28; cultivation is therefore not currently a viable alternative to wild harvesting. Using a web crawler on open trade sites contributes to knowledge on illegal wildlife trade, which can be used to address illegal plant trade at the national and international level.


Communicated by David Hawksworth. This article belongs to the Topical Collection: Biodiversity exploitation and use.

Introduction
Market surveys are a longstanding method applied to investigate social and economic significance of plants and animals, and to identify wild-harvested products that present sustainability concerns (Cunningham 2014; D'Cruze et al. 2020; Ticktin et al. 2020). Online shopping by credit card started in 1994 with the purchase of a $12.48 music CD, and has rapidly grown to 1.66 billion people participating worldwide with a value of $2.3 trillion in 2017, projected to rise to $4.48 trillion by 2021 (Lewis 1994; Statista 2018). A notable way in which e-commerce has reduced obstructions in international trade is by making it easier for small and medium enterprises to extend beyond local markets and reach global markets (Dongwei 2016).
Using an online trading platform as a place for collating trade data is an extension of physical market surveys, and is particularly relevant as transactions increasingly occur online (Humair et al. 2015; Nijman 2020; Siriwat and Nijman 2020; Xiao et al. 2017; Ye et al. 2020). While publicly displayed information such as that on e-trade sites is a fraction of content on the internet, concealed darkweb wildlife trade is negligible.
Approximately 35,811 species of animals and plants are listed under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES), which is an international agreement between governments aimed at ensuring that international trade does not threaten the survival of wild species (CITES 2017). Out of these listed species a total of 28,484, roughly 80%, are orchids (WCSP 2018). In order to examine online commerce in wildlife trade we focused on one group of orchids in particular.
Orchids of horticultural value have previously been the topic of research in online trade (Hinsley et al. 2016; Lavorgna and Sajeva 2020; Wong and Liu 2019). Terrestrial orchids with edible tubers are also wild collected and traded internationally for use as chikanda (Challe and Price 2009; Veldman et al. 2014). Our specific focus was a group of orchids whose tubers are traded as salep, which is used to make a hot drink traditionally attributed medicinal properties, and ice cream with a chewy texture, in countries around the Eastern Mediterranean and within the Balkans, including Egypt, Greece, Israel, Montenegro, and Turkey (Abouzid and Mohamed 2011; Lev and Amar 2000; Menković et al. 2011; Sezik 2002a; Tamer et al. 2006). A total of 35 taxa (34 species and 1 subspecies) from the genera Anacamptis, Dactylorhiza, Himantoglossum, Neotinea, Ophrys, Orchis and Serapias have been reported as collected for salep in Greece, Iran, and Turkey (Ghorbani 2016; Kasparek and Grimm 1999; Kreziou et al. 2016), see Supplementary Table 1. Import and export of salep from Iran to Turkey and Turkey to Germany, and into Jordan, has also been documented (Ghorbani et al. 2014a; Kasparek and Grimm 1999; Lev and Amar 2002). Salep is also the subject of global patent applications in materials manufacture and novel medicines (Masters et al. 2020). Orchid genera used for salep are difficult to cultivate on a commercial scale for their tubers. Currently, only a few small-scale cultivation trials for the tubers of orchids sold as salep exist; consequently they are only available commercially via wild harvesting (Erzurumlu and Doran 2011). Salep is wild harvested and is representative of the orchid family, which dominates CITES listings and is itself predominantly wild harvested. All species of orchid used for salep are listed in Appendix II of CITES (CITES 2021). Furthermore, collection of their tubers for trade is considered to be a threat to the species used (Delforge 2006; Kasparek and Grimm 1999; Ghorbani 2016; Ghorbani et al. 2014a; Kreziou et al. 2016). Reported volumes of harvest and trade in salep vary, with estimates of between 7.5 and 30 tons being collected per year within Turkey, 7.5 tons exported, and lack of clarity on whether estimates include tubers illegally imported into Turkey (Kasparek and Grimm 1999; Sezik 2002a, b; Ghorbani et al. 2014a).
Finding relevant data is the biggest challenge in tackling illegal wildlife trade online. eBay is an international sales platform with significant annual trade, $10.8 billion in 2019 (eBay 2020). This sales platform also displays the number of products sold by each advert, thereby making it possible to quantify sales. Web scraping (extracting online data) can be done manually or by using a web crawler (Zhao 2020). As an automated system of data extraction, web scraping is more time efficient and less error prone than copy-and-pasting or manual transfer of data from web pages to a spreadsheet (Dogucu and Çetinkaya-Rundel 2020). Without login requirements, eBay adverts publish items for sale, sales, prices and general vendor location. Using a web scraper to collect publicly accessible data is permissible (Katris and Schaul 2021). In the creation of new knowledge by extracting data from the web, data scrapers should not overload targeted websites, access information behind password barriers, or steal market share from the targeted website, and should copy only information that is factual (Zamora 2019). By systematic and small-scale collection of public records of sales on eBay, and authenticating selected products with DNA barcoding, we investigated the extent of online trade in salep in 2019 and 2020.
We aimed to answer the following questions: (i) What volumes of salep are being traded? (ii) How much revenue is being generated? (iii) Are products sold as salep made from real orchid tubers or also from substitutes? We also looked at the intersection of global trade with local and international law, and how this can make products legal or illegal on the basis of vendor location. In doing this we progress from widely investigated traditional constructs of salep and historic trade to build a contemporary perspective on salep.

Collection of transaction data
Consistent and automatic data collection was done with a web crawler specifically constructed using Python to index data from eBay adverts containing trigger words (salep, salepi, sahlep, sahlab, salab misri) and save it to CSV files. Our selected data points included: eBay item number, advert title, advert URL, seller, price of product, mass of product, number of items sold, location of seller, state of product (roots, powder, packet, unknown), image of the product, and crawl time. This was extracted as one CSV per crawl with image files for each advert. Our web crawler ran weekly for the duration of 2019, resulting in 50 downloads, and 25 in 2020. The web crawler script is available on GitLab (https://gitlab.com/naturalis/bii/trade-monitoring/salep-crawler).
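
The filtering and storage step described above can be sketched as follows. This is a minimal illustration and not the published crawler: the column names, the helper functions `matches_trigger` and `write_crawl_csv`, and the example adverts are all hypothetical, and the step that actually retrieves adverts from eBay is deliberately left out. Only the trigger words and the list of data points are taken from the text.

```python
import csv
import io

# Trigger words as listed in the text.
TRIGGER_WORDS = ("salep", "salepi", "sahlep", "sahlab", "salab misri")

# Hypothetical column names for the data points named in the text.
FIELDS = [
    "item_number", "title", "url", "seller", "price", "mass",
    "items_sold", "seller_location", "product_state", "image_file",
    "crawl_time",
]


def matches_trigger(title):
    """Return True if the advert title contains any trigger word."""
    lowered = title.lower()
    return any(word in lowered for word in TRIGGER_WORDS)


def write_crawl_csv(adverts, handle, crawl_time):
    """Write matching adverts to one CSV per crawl; return rows written."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    kept = 0
    for advert in adverts:
        if matches_trigger(advert.get("title", "")):
            # Missing data points are written as empty cells.
            writer.writerow({**advert, "crawl_time": crawl_time})
            kept += 1
    return kept


# Made-up adverts standing in for one weekly crawl.
example_adverts = [
    {"item_number": "0001", "title": "Salep powder 100 g", "items_sold": 3},
    {"item_number": "0002", "title": "Vintage poster of a salep vendor", "items_sold": 1},
    {"item_number": "0003", "title": "Chamomile tea", "items_sold": 5},
]

buffer = io.StringIO()
kept = write_crawl_csv(example_adverts, buffer, "2019-01-07T00:00:00Z")
print(kept)
print(buffer.getvalue())
```

Note that simple keyword matching keeps the poster advert as well as the salep powder, which illustrates why the crawler output still needs manual checking before analysis.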

Validation
Prior to running the web crawler we placed a dummy advert on eBay and made a purchase from it ourselves to find out if the automatically listed number of purchases would change by one after the sale. This verified that our web crawler could correctly record the number of products sold.

Data cleaning
While the web crawler reduced the need to manually scan weekly listings, the CSV files given as outputs contained items that were not salep, and items that might contain salep or could be other products. Consequently, advertisement titles were read manually to remove items that should not or could not be used in our analysis, e.g. adverts for a poster of a salep vendor. In some adverts multiple items were for sale and it was not possible to determine whether sold items were salep or another category of the same product. For example, a sale from an advert for 18 Turkish herbs could have been salep or chamomile tea. Data used for analysis were from adverts for salep that was sold to third parties.

Sample testing
Adulteration of products with cheaper or more readily available fillers and substitutes is a known problem with plant-based products (Ichim 2019; Newmaster et al. 2013; Seethapathy et al. 2019). To gauge whether salep sold online is made from real orchid tubers, a subset of salep for sale on eBay was purchased and analysed. DNA barcoding was used to test whether salep products sold online contained orchids, and our sampling was limited by the aim to avoid creating extra demand. To capture the range of countries where salep is sold on eBay, 16 samples in which the product for sale was illustrated with different photos were bought from Greece (5), Turkey (3), and India (8). This sample size of 16 was less than 1% of the number of sales of salep detected other than our sample purchases. Samples bought in Greece and Turkey were authenticated by DNA barcoding within these countries, whereas samples from India were authenticated in Norway (see acknowledgments).
In order to establish a baseline to estimate the number of tubers traded by weight, we weighed all salep tuber samples held by the Economic Botany collection at Royal Botanic Gardens, Kew.

DNA extraction
To reduce gel formation caused by the high glucomannan content of tubers, 100 mg of salep powder was washed three times with STE buffer (0.25 M sucrose, 0.03 M Tris, 0.05 M EDTA) prior to following an optimised extraction protocol (Ghorbani 2016; Shepherd and McLay 2011). DNA from each product was extracted from homogenized contents using the Nucleospin Tissue kit (Macherey-Nagel, Düren, Germany), according to the manufacturer's instructions. The final elution volume was 100 μl. Extracted DNA was quantified using a Quawell UV-Vis Spectrophotometer Q5000 (Quawell Technology, San Jose, CA, US).

DNA amplification
The following nrITS2 primers (Invitrogen, UK) were used for the amplification of the sample DNA: ITS2-S2F 5′-ATG CGA TAC TTG GTG TGA AT-3′ and ITS2-S3R 5′-GAC GCT TCT CCA GAC TAC AAT-3′ (Chen et al. 2010). Polymerase chain reactions were carried out using DNA extracted from the powder products in final reaction volumes of 20 µl including 2 µl of template DNA solution (ranging from 3 to 20 ng/µl), 10 mM of each dNTP, 1.5 mM MgCl2, 10 μmol of primer, 2 μl of 10 × Taq DNA polymerase buffer, 1.5 mM Syto 9 green fluorescent nucleic acid stain and 1 U Kapa Taq DNA polymerase (Kapa Biosystems, USA). Accumulation of amplified product during PCR was monitored by using a third-generation DNA intercalating dye (Syto 9). PCR amplifications were performed in a Rotor-Gene Q (QIAGEN, Hilden, Germany) as follows: an initial step of 4 min at 94 °C, followed by 35 cycles, each one including 30 s at 94 °C for denaturation, 30 s at 52 °C for annealing and 1 min at 72 °C for elongation. A 7 min step at 72 °C was programmed as a final extension. Amplification products were separated by electrophoresis on a 1.5% agarose gel and stained with ethidium bromide. Gels and images were analysed using UV Minibis Pro (DNR Bio-Imaging Systems, Jerusalem, Israel) to quantify signal intensity.
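
As a worked check of the thermal profile given above, the programmed hold time can be added up as in the following sketch. The variable and function names are illustrative only, and the total excludes the ramping time between temperatures, which depends on the instrument.

```python
# Thermal profile as stated in the text, with durations in seconds.
INITIAL_STEP_S = 4 * 60            # 4 min at 94 C
CYCLE_STEPS = [
    ("denaturation, 94 C", 30),
    ("annealing, 52 C", 30),
    ("elongation, 72 C", 60),
]
CYCLES = 35
FINAL_EXTENSION_S = 7 * 60         # 7 min at 72 C


def programmed_hold_seconds():
    """Sum the programmed hold times, ignoring temperature ramping."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_STEP_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total_seconds = programmed_hold_seconds()
print(f"{total_seconds} s, i.e. {total_seconds // 60} min of programmed hold time")
```

The 35 cycles of 2 min account for 70 of the 81 programmed minutes.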

Sanger sequencing
PCR products were directly sequenced in two directions of each fragment with a Big Dye terminator v3.1 Cycle sequencing kit (PE Applied Biosystems, Foster City, CA, USA) in an automated ABI 3730 sequencer (PE Applied Biosystems). Sequences were aligned with the CLUSTALW program using the BioEdit software (Carlsbad, CA, USA).

DNA sequence analysis
The marketed samples were first analysed by sequencing the nrITS2 marker and the nucleotide sequences were submitted to the Basic Local Alignment Search Tool, BLAST (see http://blast.ncbi.nlm.nih.gov/Blast.cgi, and Table 2). Sequences were sequentially queried using megablast (Altschul et al. 1990) online at NCBI nucleotide BLAST against the nucleotide database. A sequence similarity-based approach with BLAST was used for DNA barcode identification. For this method, a similarity score was calculated for up to 100 BLAST hits if the query cover was 70% or higher: max score × (query cover/identity). Identifications were assigned based on a combination of the identity score (High identity: i ≥ 97%; Medium identity: 90% ≤ i < 97%; Low identity: i < 90%) and the number of species within 1% deviation of the similarity score, as described by ).
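
The scoring rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the pipeline used in the study: the `BlastHit` record, the function names and the example hits are hypothetical, the similarity score follows the formula exactly as written in the text, and "within 1% deviation" is read here as within 1% of the highest similarity score.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    species: str
    max_score: float
    query_cover: float  # percent
    identity: float     # percent


def similarity_score(hit):
    """Formula as written in the text: max score * (query cover / identity)."""
    return hit.max_score * (hit.query_cover / hit.identity)


def identity_class(identity):
    """Thresholds from the text: High >= 97%, Medium 90 to <97%, Low < 90%."""
    if identity >= 97:
        return "High"
    if identity >= 90:
        return "Medium"
    return "Low"


def summarise(hits, max_hits=100, min_cover=70.0):
    """Score up to 100 hits with query cover of at least 70%."""
    usable = [h for h in hits[:max_hits] if h.query_cover >= min_cover]
    if not usable:
        return None
    best = max(usable, key=similarity_score)
    top = similarity_score(best)
    # Assumption: count species scoring within 1% of the best score.
    close = {h.species for h in usable if similarity_score(h) >= 0.99 * top}
    return {
        "best_species": best.species,
        "identity_class": identity_class(best.identity),
        "species_within_1pct": len(close),
    }


# Made-up hits for illustration only.
example_hits = [
    BlastHit("species A", 600.0, 100.0, 99.0),
    BlastHit("species B", 598.0, 100.0, 98.8),
    BlastHit("species C", 400.0, 90.0, 92.0),
    BlastHit("species D", 300.0, 60.0, 85.0),  # dropped: query cover < 70%
]
result = summarise(example_hits)
print(result)
```

A high identity class combined with a single species within 1% of the top score would support a species-level identification, whereas several species within that margin would argue for reporting the genus or family only.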

Results
Over 2019 and the first 6 months of 2020 we detected 1942 items of salep sold at a total value of US$ 37,774.62 (see Table 1).

Tuber average mass
We weighed 394 individual tubers and 957 tubers in batches of 20–150, in total 1391 tubers traded as salep from the Economic Botany collection at Royal Botanic Gardens, Kew. We found an average tuber mass of 0.94 g overall. In the smaller sample of 394 tubers that were individually measured, average tuber mass was 1.34 g with a standard deviation of 1.09 g (individual and batch mass detailed in Appendix II and Appendix III). Using the average mass of tubers traded as salep sampled from this Economic Botany collection, and the price per gram where mass of sales was recorded by the web crawler ($0.21), we extrapolated the total number of tubers recorded as sold by the web crawler to be 179,879 tubers.

Location of vendors
Overall, the country from which the greatest value of salep was sold was Greece ($13,082.53), followed by India ($11,089.33) and Turkey ($8883.94) (see Fig. 1 and Appendix IV, sales by country).

DNA barcoding of samples analysed
Orchids present in the samples analysed were Orchis mascula, Anacamptis sp. and Dactylorhiza maculata. All three of these taxa have previously been reported as being used in salep (see Supplementary Table 1). Other taxa detected were unspecified Orchidaceae, Allium spp., Oryza sp. (rice) and Sida sp.; two samples could not be identified. See Table 2 for a summary of species identifications, sequences generated and their NCBI GenBank accession numbers.

Orchids specified in sales
Of the total 1942 items sold, 929, worth $23,908.70, had orchids specified by name in the advert title. Vendors use a mixture of scientific and vernacular names in product descriptions, and the names given below are vernacular names rather than scientific names. Note that we do not use italics when these are not currently accepted scientific names or genera. Whether an advert specified one type of orchid or several, the most frequently named was 'Orchis mascula' (in 864 items), followed by 'Latifolia' (in 197 items), 'Laxiflora' (98), Marsh orchid (59), 'Dactylorhiza hatagirea' (56), Munjataka (17) (see Fig. 3). Some items sold also had accompanying pictures of the flowers of orchids that have been reported as used in salep (Ghorbani et al. 2014b) (see Fig. 4). A total of 864 items sold, worth $21,928.39, were titled as containing 'Orchis mascula', of which 469 items worth $11,228.77 were from vendors in Greece and 297 items worth $8188.40 were from India. Munjataka and Marsh orchid are vernacular names used for Dactylorhiza hatagirea (Teoh 2016). Combining adverts that specified marsh orchid, marsh, munjataka or 'Dactylorhiza hatagirea' makes D. hatagirea the second most specified orchid in sales. It is not clear to which species the vernacular names of Latifolia and Laxiflora refer, but Dactylorhiza incarnata and Anacamptis laxiflora are among the most likely candidates because of the similarity with their past and current species names.

Quantifying salep trade in economic value and number of individuals harvested
In quantifying salep trade we found that the economic value of individual orchid tubers, and the plants destructively harvested to obtain them, is low. With one and sometimes two tubers collected per plant for salep, sales detected by the web crawler were estimated as the equivalent of 89,939-179,879 wild orchids being destructively harvested. The world's most expensive orchid, valued as an ornamental, was sold for approximately $580,000 (Seyler et al. 2019). In contrast we found that wild orchids traded in the form of tubers for salep were worth $0.21 per tuber (pers. obs.). Wild harvesting has an impact on populations of orchids not only when orchids are high-value individual plants, but also when individual orchids have a low value (Kreziou et al. 2016).
Furthermore, with each tuber being worth only $0.21, while species used for salep that are commercially propagated as ornamentals are sold for $16–28 per plant, there is a large financial gap between cultivated orchids and wild orchids traded as salep. Consumers of salep and of ornamental orchids are likely very different, but cultivated tubers for salep are not yet sold. The price of salep orchids sold for horticulture is therefore given as the nearest current example of prices of cultivated salep orchids as sold commercially. The price gap between wild and cultivated orchids is consistent with other market surveys (Gale et al. 2019), although some plants, for example American ginseng, are more valuable to consumers when labelled as wild collected (Burkhart et al. 2021). Cultivation is not always a viable conservation solution for overcollected wild plants. Consumers are willing to pay a small premium for food products that are labelled as sustainably sourced, for example 0.33% extra (Bissinger 2019). The gap between the recorded value of wild-harvested salep tubers as a commodity and the price of a cultivated salep orchid greatly exceeds this example of a premium that consumers are willing to pay for sustainability. If cultivation is to be developed as a viable means of protecting wild populations of orchids, innovation is needed to make cultivation of orchids for salep economically viable.
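
The figures in this section can be reproduced with the short calculation below. The text does not spell out the arithmetic, so this is one reading that happens to match the reported numbers: the total sales value is divided by $0.21 per tuber, and the plant estimate assumes either two tubers per plant (lower bound) or one tuber per plant (upper bound). Variable names are illustrative.

```python
TOTAL_VALUE_USD = 37_774.62        # total value of detected sales
VALUE_PER_TUBER_USD = 0.21         # reported value of one wild tuber
CULTIVATED_PRICE_USD = (16.0, 28.0)  # reported price range per cultivated plant

# Number of tubers implied by the total sales value (rounded down).
tubers = int(TOTAL_VALUE_USD / VALUE_PER_TUBER_USD)

# One or sometimes two tubers are collected per plant.
plants_low = tubers // 2   # two tubers per plant
plants_high = tubers       # one tuber per plant

# How many times more a cultivated plant costs than a wild tuber.
ratio_low = CULTIVATED_PRICE_USD[0] / VALUE_PER_TUBER_USD
ratio_high = CULTIVATED_PRICE_USD[1] / VALUE_PER_TUBER_USD

print(f"tubers: {tubers:,}")
print(f"plants: {plants_low:,} to {plants_high:,}")
print(f"price gap: about {ratio_low:.0f}x to {ratio_high:.0f}x")
```

On these numbers a cultivated plant costs roughly 76 to 133 times as much as a wild-harvested tuber, which is several orders of magnitude beyond the 0.33% sustainability premium cited above.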

Orchids and adulterants sold
Orchids were present in some of the products we sampled, but they were not always the species named in adverts. Our DNA barcoding analysis did not find Orchis mascula in products sold as salep that had specified it as the ingredient. Some adulterants were orchids, but others were not. With the DNA barcodes retrieved, the Allium, Oryza and Sida samples could not be identified to species level, so we do not know whether or not threatened species were used. In line with other research on botanical trade, there is substitution of species in products sold as salep (de Boer et al. 2015; Newmaster et al. 2013). While some adulteration may be accidental due to incorrect identification of plants or cross-contamination, it may also be the result of intentional substitution with cheaper materials (Ichim 2019). Consumers making purchases via e-commerce are more vulnerable to mislabelled or falsely portrayed products as they rely on images and text provided by the vendor, and cannot look at the product directly prior to purchase.

National salep trade and legislation
Our data showed a different pattern of salep sale than expected at national level. In conservation literature, salep trade has been predominantly associated with Turkey and the Turkish diaspora in Germany (Ghorbani et al. 2014b; Kasparek and Grimm 1999). However, e-commerce detected by the web crawler was in accordance with recent publications reporting that trade of these orchids is notably taking place in Greece, despite orchids being protected in Greece under Presidential decree 67/1981 (Kreziou et al. 2016). Consequently, a large proportion of the trade we detected was illegal, not because of the rarity of the orchids traded but due to national law. Similarly, collection and sale of wild orchids is prohibited in Israel under the National Parks and Reserves Law (Furst 2017). Each country has different legislation on protection of orchids. As indicated by the ease of detecting vendor location on e-trade sites, there is scope for national law enforcement authorities to use web crawlers as an efficient means of detecting vendors who are not compliant with national wildlife trade restrictions. Operating at a national level of enforcement offers alignment between jurisdiction of enforcement and where the crime takes place.

International salep trade and CITES
Sales data also indicated breach of international legislation. Product description and sample analysis found that terrestrial orchids reported as being sold as salep in Europe and Turkey were also traded in India, outside of their growing range, as salep, salep misri, and salab. CITES permits are not required by eBay, but for CITES-listed species trade is illegal not only when it occurs without CITES permits but also when specimens are harvested in contravention of national legislation (Hinsley et al. 2018). Cumulatively, most of the detected trade in salep on eBay was illegal, either as within-country trade or by not adhering to international permit requirements under CITES. Our results indicate that more regulation of online trade in salep is called for. Without engagement from online trade platforms, wild-harvested orchids are theoretically protected by legislation but in practice not protected from trade.

Wildlife trade and plant awareness disparity
Research on wildlife trade has concentrated on animals, and wildlife trade has attracted attention for zoonotic transmission of disease (Borsky et al. 2020; Fukushima et al. 2020). Despite the food and agricultural sector recognising the threat of phytosanitary disease being spread via cultivated plant material, wildlife trade research conforms with Plant Awareness Disparity, and the risk of plant disease being spread is largely absent from publications on wildlife trade (Iftikhar and Sajid 2020; Parsley 2020; Perrings 2016). While e-commerce opens access to markets for small and medium enterprises, it must also shape the online shopping experience so that vendors follow phytosanitary certification procedures.

Intervening in online wildlife trade
Heavy-handed interventions may drive illegal wildlife trade to be concealed. Both vendors and buyers in the transactions we detected may not be aware of the legislation they are contravening. Raising awareness of wildlife trade restrictions by working with online platforms is an approach that offers the potential to reduce wildlife trade both as part of the Coalition to end Wildlife Trafficking Online and at national level (Wong and Liu 2019). It has already been applied to wildlife trade for some products, for example rhino horn on Instagram (see Fig. 5).
We conclude that failure of online marketplaces to highlight required phytosanitary and CITES permits to prevent illegal trade has the consequence that endangered wild plant species that should be protected by legislation are currently being collected and sold in large numbers. This trade can efficiently be detected at national level using a web crawler, and could be deterred by installation of automatic pop-ups on salep and other CITES-listed species informing customers of permits required for sale to be legal.
Acknowledgements Thanks to Mark Nesbitt and Frances Cook of the Economic Botany Collection, Royal Botanic Gardens, Kew for providing access to samples of salep for measurement.

Import permit Unidentified commercial herbal products purchased in India were imported into Norway under Norwegian Medical Products Agency permit number 18/13493-2.

Funding Research supported by authors' affiliated institutions.

Data availability Code available, and additional data provided in appendices.

Declarations

Conflict of interest Not applicable.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.