Phylogenomic analysis of cytochrome P450 multigene family and their differential expression analysis in Solanum lycopersicum L. suggested tissue specific promoters
Cytochrome P450s (P450s) are a functionally diverse, multifamily class of enzymes that catalyse a vast variety of biochemical reactions. P450 genes play regulatory roles in growth, development and secondary metabolite biosynthesis. Solanum lycopersicum L. (tomato) is an economically important crop plant and a model system for various studies, with massive genomic data available. However, a comprehensive identification and characterization of its P450 genes was lacking. Probing the tomato genome for P450 genes would provide valuable information about the functions and evolution of the P450 gene family.
In the present study, we identified 233 P450 genes from the tomato genome, along with their conserved motifs. Phylogenetic analysis of the Solanum lycopersicum P450 (SlP450) protein sequences classified them into two major clades and nine clans, further divided into 42 families. RT-qPCR analysis of six selected candidate genes corroborated the digital expression profile. Out of 233 SlP450 genes, 73 showed evidence of expression in 19 tissues of tomato. Out of 22 intron gain/loss positions, two were conserved in tomato P450 genes, supporting the intron-late theory of intron evolution in SlP450 families. Comparison of tomato P450 families with those of other related plants showed that the CYP728, CYP733, CYP80, CYP92, CYP736 and CYP749 families evolved in tomato and a few higher plants but were lost from Arabidopsis. A global promoter analysis of SlP450 genes against all protein-coding genes, coupled with expression data, revealed statistical overrepresentation of a few promoter motifs in SlP450 genes that were highly expressed in specific tissues of tomato. Hence, these promoter motifs can be pursued further as tissue specific promoters that drive expression of the respective SlP450 genes.
The phylogenetic analysis and expression profiles of the tomato P450 gene family offer an essential genomic resource for their functional characterization. This study allows comparison of the SlP450 gene family with those of other economically important Solanaceae members and attempts to classify functionally important SlP450 genes into groups and families. This report should enable researchers working on tomato P450s to select appropriate candidate genes from the huge repertoire of P450 genes according to their phylogenetic class, tissue specific expression and promoter prevalence.
Keywords: Cytochrome P450; Phylogeny; Intron map; Genome-wide promoter analysis; Tissue specific promoter
AtP450: Arabidopsis thaliana P450
BLAST: Basic Local Alignment Search Tool
FPKM: Fragments Per Kilobase of Transcript per Million mapped reads
MEGA: Molecular Evolutionary Genetics Analysis
RT-qPCR: Real Time Quantitative Polymerase Chain Reaction
SlP450: Solanum lycopersicum Cytochrome P450
Cytochrome P450 (P450) belongs to a very divergent multigene family present in all living organisms. In angiosperms, approximately 300 genes are estimated per genome, in 50 plant families. The P450 monooxygenases are heme-thiolate enzymes that catalyse a broad range of chemical reactions such as epoxidation, sulfoxidation, dehalogenation, dealkylation, C-C cleavage, ring extension and reduction with the help of oxygen and NADPH. They are involved in the oxidative metabolism of various endogenous and exogenous compounds such as herbicides, pesticides and xenobiotics [3, 4]. The P450 proteins present in plants are membrane bound and difficult to characterize. The molecular mass of plant P450s ranges from 45 to 62 kDa, with an average of 55 kDa. They possess four conserved key domains, namely the heme-binding domain, I-helix, K-helix and PERF/W domain. The heme-binding signature motif has 10 conserved residues, among which cysteine is highly conserved. This heme-iron motif has a binding site for oxygen and for various compounds involved in drug metabolism. The P450 gene family is the third largest gene family in Arabidopsis. Most of the P450s studied in plants are localized in the endoplasmic reticulum, chloroplast or mitochondria and other secretory pathways. They are involved in the biosynthesis of many compounds such as alkaloids, flavonoids, lignans, isoprenoids, phenolics, antioxidants and phenylpropanoids [8, 9, 10]. The P450 genes are crucial for the metabolism of, and tolerance to, allelochemicals in plants as well as in animals. The gene families CYP90, CYP724 and CYP734 are involved in the biosynthesis of steroidal saponins and glycoalkaloids. The different types of glycoalkaloids present in all members of the Solanaceae family are vital compounds but toxic to other living organisms. The P450 proteins are involved in the biosynthesis of aglycones from cholesterol by oxygenation, transamination and cyclization at different carbon positions. P450-mediated derivatization of glycoalkaloids made them less toxic, and during the course of domestication, Solanaceae members with lower amounts of toxic glycoalkaloids have been selected.
The availability of whole genome sequences for a large number of plant species has allowed genome-wide identification of the P450 multigene family in different plants, namely soybean (Glycine max), mulberry (Morus notabilis), flax (Linum usitatissimum) and tobacco (Nicotiana tabacum). The draft genome sequence of tomato (Solanum lycopersicum) was made publicly available in 2012, which provides an opportunity for genome-wide study of tomato specific gene families. Tomato (Solanum lycopersicum L.) is an economically important crop and a routinely used model plant for fruit ripening, plant-pathogen interaction and molecular genetic mapping. However, very few P450 genes from tomato have been reported and functionally annotated. Moreover, no comprehensive genome-wide study of these genes has been reported to date. Therefore, in this study, we attempt to classify functionally important P450 genes into groups and families according to the standard system of the P450 nomenclature committee [20, 21]. Understanding the molecular evolution and the differential expression in different tissue types, together with intron and promoter analysis of SlP450 genes, will pave the way for functional characterization of important candidate genes.
Identification of P450 genes from the tomato genome
The Arabidopsis thaliana P450 genes were downloaded from ‘The Cytochrome P450 homepage’ maintained by D. R. Nelson (http://drnelson.uthsc.edu/CytochromeP450.html). These 254 Arabidopsis P450 sequences were used as queries in a BlastP search with an E-value ≤ 1e−40 against the tomato (Solanum lycopersicum) genome (ITAG2.3) available in the Phytozome database V10 (www.phytozome.net). The putative Solanum lycopersicum P450 (SlP450) sequences were then inspected manually for complete ORFs and truncation. The final set consisted of non-redundant, full-length SlP450 genes. Universal names for SlP450 genes were assigned according to the standard system of the P450 nomenclature committee [20, 21].
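The E-value filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the BlastP results were saved in BLAST tabular format (-outfmt 6), where the subject ID is column 2 and the E-value is column 11, and all query and subject identifiers in the example are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-40  # threshold stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return subject IDs having at least one hit with E-value <= cutoff.

    Assumes BLAST tabular output (-outfmt 6): tab-separated, subject ID in
    column 2 and E-value in column 11.
    """
    candidates = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        subject_id, e_value = row[1], float(row[10])
        if e_value <= cutoff:
            candidates.add(subject_id)
    return candidates


# Hypothetical hits: query, subject, %identity, length, mismatches, gaps,
# q.start, q.end, s.start, s.end, E-value, bit score
EXAMPLE_HITS = "\n".join(
    "\t".join(fields)
    for fields in [
        ("AtP450_query_1", "tomato_gene_A", "82.0", "480", "86", "0",
         "1", "480", "1", "480", "1e-150", "900"),
        ("AtP450_query_1", "tomato_gene_B", "35.0", "200", "130", "4",
         "1", "200", "5", "204", "2e-12", "70.1"),
        ("AtP450_query_2", "tomato_gene_A", "60.0", "450", "180", "2",
         "3", "452", "1", "450", "3e-95", "610"),
    ]
)

print(sorted(filter_hits(EXAMPLE_HITS)))  # ['tomato_gene_A']
```

Collecting subjects in a set removes duplicates when several queries hit the same tomato gene; checking for complete ORFs and truncation would still be a separate manual step, as in the text.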
Multiple sequence alignment, phylogenetic tree construction and conserved motif analysis
The 48 P450 protein sequences from other plants, namely Arabidopsis thaliana (40), Populus trichocarpa (1) and Solanum tuberosum (7), along with 233 SlP450s from Solanum lycopersicum, were used to construct the phylogenetic tree. The accession numbers are provided in Additional file 1. Multiple sequence alignment of these P450 sequences was carried out with the MUSCLE algorithm using the default parameters in MEGA X software. The phylogenetic tree was constructed using the neighbour-joining (NJ) and maximum likelihood (ML) algorithms. The Dayhoff substitution matrix (PAM250) with bootstrapping (1000 replicates) was employed for the NJ analysis. The unrooted maximum likelihood phylogenetic tree and evolutionary analyses were carried out using the IQ-TREE web server (http://iqtree.cibiv.univie.ac.at/). The best-fit model was selected from 168 amino acid substitution models using the ModelFinder tool. ModelFinder reported LG + F + I + G4 as the best-fit model according to the Bayesian information criterion (BIC score 420,547.05). The ML tree was built with 1000 ultrafast bootstrap replications, and the final tree with the highest log likelihood (−208,278.21) was used for phylogenetic inference. For conserved domain identification, multiple sequence alignment of SlP450 protein sequences was carried out using the Clustal X program with default parameters. The alignment file was submitted to the WebLogo generator (http://weblogo.berkeley.edu/) to generate logos of the conserved domains.
Intron map and their organization
The intron map of tomato P450 genes was drawn using the methods previously described by Barvkar et al. and Paquette et al. [30, 31]. The intron-exon boundaries, intron phases and intron positions in the protein sequences were considered. Introns present in the genomic sequences were mapped onto the protein sequences and numbered serially. An intron can be in one of three phases: phase 0, 1 or 2. Introns at identical positions in a codon and with the same intron phase are termed ‘conserved introns’. The intron map was constructed from the 145 (62.23%) SlP450 gene sequences that contain one or two introns.
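The intron phase and position described above can be illustrated with a short sketch. It uses the standard definition of intron phase (the number of coding bases before the intron, modulo 3) and is not taken from the cited methods; the gene names and exon lengths are hypothetical, and the helper `shared_introns` assumes positions are already expressed in a common aligned coordinate system.

```python
from collections import Counter


def intron_phases(cds_exon_lengths):
    """Return (codons_before_intron, phase) for each intron of one gene.

    cds_exon_lengths: coding-sequence length (bp) of each exon, 5' to 3'.
    Phase 0: intron lies between two codons; phase 1: after the first base
    of a codon; phase 2: after the second base.
    """
    introns = []
    coding_bases = 0
    for length in cds_exon_lengths[:-1]:  # no intron follows the last exon
        coding_bases += length
        introns.append((coding_bases // 3, coding_bases % 3))
    return introns


def shared_introns(introns_by_gene):
    """Return (position, phase) pairs found in more than one gene.

    Assumes positions are already in a common (aligned) coordinate system,
    which this sketch does not compute.
    """
    counts = Counter(
        intron for introns in introns_by_gene.values() for intron in set(introns)
    )
    return {intron for intron, n in counts.items() if n > 1}


# Hypothetical genes with made-up coding exon lengths
genes = {
    "gene_A": intron_phases([300, 200, 100]),
    "gene_B": intron_phases([300, 303]),
    "gene_C": intron_phases([151, 449]),
}
print(genes["gene_A"])        # [(100, 0), (166, 2)]
print(shared_introns(genes))  # {(100, 0)}
```

In a real intron map the positions would first be projected onto the multiple sequence alignment, so that equivalent residues in different proteins share one coordinate.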
Promoter analysis of SlP450 genes and identification of tissue specific promoters
Promoter analysis of tomato P450 genes helps to identify over-represented motifs that regulate gene expression. We used previously characterized motifs from the PLACE and PlantCARE databases to obtain regulatory motifs that are over-represented in a group of genes. The consensus motifs from these databases were used since they provide high coverage of previously characterized plant motifs (946 plant motifs in total). The complete Solanum lycopersicum genome was downloaded from the Phytozome database. The BED file with genomic coordinates was used to extract the 2 kb upstream sequence of all protein-coding genes using the getfasta option of the bedtools suite. The promoter motifs for all protein-coding genes were identified using a Perl script generously shared by Dr. Angelica Lindlöf. Because core promoter sequences are short, they can occur by chance. Hence, we excluded random occurrence of any promoter motif in the SlP450 upstream sequences. To assess non-random occurrence, the presence or absence of each promoter motif was compared statistically between two groups. The first group included SlP450 genes highly expressed in specific tissue types (leaf, buds, peel, petals, roots, seeds), and the second group contained all the protein-coding genes. The one-sample test for binomial proportions was applied at a significance threshold of p ≤ 0.05. We used fragments per kilobase of transcript per million mapped reads (FPKM) values from RNA sequencing of various tissue types to understand the relationship between promoter motif occurrence and the actual expression of individual SlP450 genes. Furthermore, a comparison was carried out between the two groups mentioned above. The motifs that were statistically significantly overrepresented were assigned as tissue specific promoter motifs driving expression of the selected SlP450 genes.
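The one-sample test for binomial proportions can be sketched as below. The text does not say whether an exact or a normal-approximation form was used, nor whether the test was one- or two-sided; this sketch assumes an exact, one-sided (upper-tail) test in which the background proportion comes from all protein-coding genes. All counts in the example are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p0):
    """Exact P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def motif_overrepresented(k, n, background_with_motif, background_total, alpha=0.05):
    """Test whether a motif is over-represented in a gene group.

    k: genes in the group (e.g. root specific SlP450s) carrying the motif
    n: total genes in the group
    background_*: counts over all protein-coding genes (null proportion p0)
    Returns (p_value, is_significant).
    """
    p0 = background_with_motif / background_total
    p_value = binomial_upper_tail(k, n, p0)
    return p_value, p_value <= alpha


# Hypothetical counts: 8 of 10 group genes carry the motif, against
# 9000 of 30000 protein-coding genes genome-wide (p0 = 0.3).
p_value, significant = motif_overrepresented(8, 10, 9000, 30000)
print(f"p = {p_value:.6f}, over-represented: {significant}")
```

With small gene groups an exact tail probability is generally safer than a normal approximation, which is why the exact form is used here.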
Digital expression analysis of SlP450 genes
Digital expression analysis was performed to gain insight into the role of the identified SlP450 genes in various tissues. We used publicly available RNA-sequencing data from Dr. Asaph Aharoni's lab (https://www.weizmann.ac.il/plants/aharoni/sites/plants.aharoni/files/uploads/tomato_rnaseq_data_19_tissues.xlsx) to examine expression of SlP450 genes in 19 different tissues, namely leaf, root, floral buds and petals, and the peel, flesh and seeds of immature green, mature green, breaker, orange and red fruits. The available RNA-sequencing data were normalized with the FPKM method. The digital expression profile of SlP450 genes was visualized as a heat map constructed with ClustVis software (http://biit.cs.ut.ee/clustvis/) using default parameters.
The Solanum lycopersicum L. cv MicroTom (TGRC accession number: LA3911) seeds, obtained from the Tomato Genetics Resource Center (http://tgrc.ucdavis.edu), were generously provided by Prof. Asaph Aharoni (Department of Plant Sciences, Weizmann Institute of Science, Israel). The tomato plants were grown in a polyhouse and maintained under controlled conditions of temperature (25 °C) and humidity (54%). When the plants matured, root (R), stem (S), leaf (L), flower (F), green fruit (GF) and mature green fruit (MGF) tissues were harvested. The tissues were frozen in liquid nitrogen and stored at −80 °C until further use.
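For readers unfamiliar with FPKM, the normalization can be sketched as below. The data set used in the study was already normalized, so this only illustrates the standard formula; the fragment counts and transcript length in the example are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2000 bp transcript in a library of
# 20 million mapped fragments.
print(fpkm(500, 2000, 20_000_000))  # 12.5
```

Dividing by transcript length and library size makes values comparable between genes of different lengths and between tissues sequenced to different depths.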
Real-time quantitative PCR (RT-qPCR) analysis
To confirm the digital expression analysis of SlP450s, we selected six genes, i.e. SlCYP51G1, SlCYP90A5, SlCYP77A20, SlCYP71AX11, SlCYP74C3 and SlCYP733A, based on their higher expression in various tissues. Total RNA from root, stem, leaf, flower, green fruit and mature green fruit tissues was extracted using TRIzol reagent (Invitrogen, USA) as per the manufacturer's protocol. Total RNA was quantified with a NanoDrop (ND-1000 spectrophotometer, Wilmington, USA) and then treated with RNase-free DNaseI (Promega, USA) to remove DNA contamination. A total of 2 μg of RNA was reverse transcribed into cDNA using AMV reverse transcriptase (Applied Biosystems, USA). The cDNA synthesized from the different tissues was used for RT-qPCR analysis. Primers for RT-qPCR were designed using Primer 3 software (http://bioinfo.ut.ee/primer3-0.4.0/). The primer sequences are available in Additional file 2. RT-qPCR analysis was performed in the Realflex2 Master cycler (Eppendorf, Germany). Each reaction contained 5 μl of 2x SYBR green master mix (Roche, USA), sterile milliQ water, 10 pM forward and reverse primers and 1.5 μl of (1:3 diluted) cDNA. The thermal profile was as follows: initial denaturation at 95 °C for 5 min, followed by 40 cycles of 95 °C for 15 s, 60 °C for 30 s and 72 °C for 30 s. After amplification, melting curve analysis was conducted from 60 to 95 °C with a 0.5 °C increment per cycle to check primer specificity. The elongation factor 1 alpha gene (EF1α, NCBI Acc. No. NM_001247106) was used as the housekeeping/internal control after verifying its uniform expression in all the studied tissues of tomato. Relative expression of the six selected candidate genes SlCYP51G1, SlCYP90A5, SlCYP77A20, SlCYP71AX11, SlCYP74C3 and SlCYP733A was determined using the 2^(−ΔΔCT) method as described by Livak et al. Each gene had a PCR efficiency and R² value between 0.9 and 1.00, along with a single melting curve peak. The experiment was repeated with three biological and two technical replicates for each gene.
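The 2^(−ΔΔCT) calculation can be sketched as follows. This is an illustration of the published formula, which assumes roughly equal and near-100% amplification efficiencies for the target and reference genes. The Ct values are hypothetical, and the choice of calibrator tissue is an assumption because the text does not name one.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct,
                        calibrator_target_ct, calibrator_reference_ct):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values:
      target_ct / reference_ct: target gene and reference gene (e.g. EF1a)
        in the tissue of interest
      calibrator_*: the same two genes in the calibrator tissue
    """
    delta_ct_sample = mean(target_ct) - mean(reference_ct)
    delta_ct_calibrator = mean(calibrator_target_ct) - mean(calibrator_reference_ct)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# (relative to the reference) in the tissue of interest than in the calibrator.
fold_change = relative_expression(
    target_ct=[23.9, 24.1],
    reference_ct=[18.0, 18.0],
    calibrator_target_ct=[26.0, 26.0],
    calibrator_reference_ct=[18.0, 18.0],
)
print(round(fold_change, 2))  # about 4-fold higher than the calibrator
```

Averaging technical replicates before taking differences, as done here, is one common convention; per-replicate fold changes followed by averaging is an alternative.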
Annotation and classification of tomato P450 multigene family
Phylogenetic analysis of the tomato P450 multigene family
Intron gain and loss events to investigate evolution of P450 multigene family
In silico analysis of tomato P450 gene promoters
Tissue specific SlP450 having over-represented promoter motifs along with their probable biological role
Over represented Motif name
Tomato Tissue Type
AC motif and MYB1LEPR motif
Leaf specific P450 genes
Root specific P450 genes
Buds specific P450 genes
Auxin responsive element
Root specific P450 genes
Soybean GH3 gene has three auxin responsive element which are important in auxin mediated gene expression .
HSE heat shock element
Root specific P450 genes
TCP transcription factor
Petal specific P450 genes
TCP transcription factors, which are involved in growth, development and defense mechanisms, also induce biosynthesis of brassinosteroid (BR), jasmonic acid (JA) and flavonoids, and might be involved in regulating genes for floral tissue development in tomato. In Arabidopsis, TCP14 and TCP15 are involved in the regulation of floral tissue and leaf blade development [54, 55, 56].
Digital expression profiling of tomato P450 genes
Cytochrome P450 enzymes catalyse a variety of reactions in growth, development and secondary metabolite biosynthetic pathways. In the present study we identified 233 P450 genes from tomato, which is comparable with the number identified in Arabidopsis thaliana (245) and more than in mulberry (176). All identified tomato P450 genes contain the four conserved P450 signature domains. Comparison of orthologs of tomato P450 gene families with those of plant species such as Arabidopsis, Medicago, poplar, flax, moss, rice and soybean revealed the evolution of the P450 gene family (Additional file 7). These results demonstrated that the CYP702 and CYP708 families are present in Arabidopsis and absent from the other analysed plants. This may be attributed to the biosynthesis of triterpenoid derivatives that are Brassicaceae specific. The CYP749A20 gene, of unknown function in tomato, was up-regulated in red and orange fruit; its orthologue is absent from Arabidopsis. During the course of evolution, the CYP749 family evolved only in Asterids, Rosids and Ranunculales members. The tomato CYP78 family contains only the CYP78A subfamily; interestingly, genes from this family are involved in flower development and meristem-specific functions in Arabidopsis. The SlCYP78A subfamily genes SlCYP78A75 and SlCYP78A77 were up-regulated in flower buds and root, respectively. In addition, SlCYP78A77 contains the root specific promoter motifs, i.e. the auxin responsive element and HSE (heat shock element). These motifs are involved, respectively, in auxin-mediated gene expression and in combating oxidative stress in other plants [51, 52, 53]. The SlCYP81 family has 10 genes distributed in four subfamilies, which belong to clan71. The SlCYP81B and SlCYP81C subfamily genes were up-regulated during different stages of tomato fruit development. In Arabidopsis, CYP81D, CYP81F, CYP81H and CYP81G subfamily genes have been shown to play an important role in disease resistance [57, 58]. SlCYP81B and SlCYP81C might be involved in tomato fruit development as well as in protection from different diseases, since they are highly expressed in these tissue types. The CYP80 family is present in tomato, poplar and grape. It is supposedly involved in phenolic coupling during alkaloid biosynthesis. The SlCYP80E6 gene was found to be up-regulated in petals, and it contains the overrepresented TCP transcription factor motif, which was a petal specific motif. In Arabidopsis, the TCP transcription factor is involved in floral organ development and in the biosynthesis of different phytohormones [54, 55, 56]. Hence, SlCYP80E6 is a potential candidate for studying floral development. The expression data suggest that the SlCYP84A2 gene was up-regulated in root, and it has the root specific overrepresented AGL promoter motif. In Arabidopsis, the CYP84A1 gene is involved in lignin biosynthesis, and functional analysis has shown that it affects lignification and vascular development. Expression and promoter data from tomato suggest that the SlCYP84A2 gene might be involved in vascular development of the root.
The phylogenetic tree topology of tomato and Arabidopsis P450s revealed similar clustering, which indicates the conserved nature of the P450 multigene family across plant species. The single-family clans contain low-copy genes with essential functions in all plants. They are kept from duplicating by a strong purifying (negative) selection process. CYP51 is an ancient and conserved clan with a single copy in all the phyla studied so far. SlCYP51G1 showed 82% identity with AtCYP51G1, and it is involved in sterol metabolism. RT-qPCR expression data showed that SlCYP51G1 is constitutively expressed in all selected tissues of tomato; it has the sterol demethylase activity required for the maintenance of membrane integrity. The CYP71 family genes evolved through gene duplication and seem to be recent in evolutionary history [60, 61]. Tomato CYP71 family genes have on average 30% sequence identity with the Arabidopsis CYP71 family. The expression data for SlCYP71AX11 and SlCYP77A20 from clan71 showed that these two genes were up-regulated in green and mature green fruit of tomato, which is in accordance with the transcriptome data. These two genes would be good candidates for the study of secondary metabolite synthesis in tomato fruit [62, 63].
The CYP74 family is an atypical plant P450 family, thought to catalyse the conversion of already oxygenated polyunsaturated C18 fatty acid hydroperoxides into other oxylipins. The RT-qPCR data showed up-regulation of the SlCYP74C3 gene in mature green tomato fruits; hence it could be a potential candidate gene for studying oxylipin biosynthesis in tomato fruit. The SlCYP90A5 gene was up-regulated in tomato flower and showed lower expression in leaf, which correlates with the transcriptome data. AtCYP90A1 is involved in brassinosteroid metabolism and shows low expression in expanding leaf [5, 58], and its tomato orthologue SlCYP90A5 has a similar expression profile. The CBF/DREB1 transcription factor, which plays a role in the cold response, has an over-represented motif in the SlCYP72A184, SlCYP85A1 and SlCYP96A48 genes. These genes can be candidates for cold stress tolerance in tomato. The intron map, along with intron phases and gain/loss events, plays a crucial role in understanding the evolution of gene families within a phylogenetic group. Conserved introns are ancient elements and are present with the same intron phase. Intron phase changes through intron sliding events or shifts of the intron-exon boundaries by one or two nucleotides. Introns tend to maintain their phases during evolution, given that changes in intron phase occur rarely. In mulberry P450s, most genes contain one or two introns, which is comparable with tomato P450 introns. Both conserved introns evolved in clan71 gene families through gene duplication events. In Arabidopsis, the two conserved introns were absent from non-A type P450 gene families, whereas they appeared in A-type P450 gene families. We observed that conserved intron I13 evolved gradually and conserved intron I14 was lost from SlP450 genes during the course of evolution. Intron gain was observed in the A-type SlP450 genes and was absent in the ancestral (non-A) gene families. Hence, these data support the intron-late view of intron evolution [30, 31].
Evidence of gene expression depends profoundly on developmental stage, age of the plant, environmental conditions, level of expression, tissue specificity and biotic or abiotic stress. In the present study, only 31.33% of P450 genes showed evidence of expression, which can be compared with rice (49.81%) and soybean (31.92%). In mulberry, Ma et al. (2014) identified 173 P450 genes, which were divided into five clusters by expression profile, and found that at most 23.6% of P450 genes were expressed. The present study was conducted on the available RNA sequencing data from different tissues of tomato. These data were not obtained by challenging plants with any pathogen or exposing them to different stress conditions. The following possibilities can therefore be proposed for the remaining genes: i) they have developmental specificity, ii) they are expressed under different biotic or abiotic stress conditions, iii) they are present at undetectable levels, or iv) they are inactive. The digital expression analysis provides a global landscape that could be instrumental in studying various tissue specific P450s. The promoter analysis suggested that SlP450 promoter motifs drive tissue specific expression. Thus, the present study may enable researchers to select appropriate candidate genes from the huge repertoire of SlP450 genes for detailed functional characterization.
The tomato genome has a greater number of P450 clans than Arabidopsis, with a variable number of P450 genes in each clan. Phylogenetic tree analysis provided information about the functional evolution of the P450 gene family in tomato. In the intron map, the gain and loss of conserved introns reveals the evolution of the P450 gene family in tomato. The digital and experimental expression profiles suggest tissue specific, highly expressed P450 genes that could be potential candidates for further study. The promoter motifs driving the higher expression of P450 genes in the analysed tissue types can be further evaluated using functional genomics for traits of economic importance. Thus, this study provides a solid foundation for functional characterization of candidate genes and their biological significance.
The authors thank Dr. Angelica Lindlöf, University of Skövde, Sweden, for sharing the Perl script for promoter analysis. We extend words of appreciation to Dr. A. S. Kashikar, Department of Statistics, S. P. Pune University, Pune, for her support and help during the statistical analysis, and to Prof. S. S. Bhargava for her valuable suggestions. The authors thank Dr. Asaph Aharoni for providing Solanum lycopersicum L. cv MicroTom seeds and tissue specific RNA sequencing data. We are grateful to the University Grant Commission (UGC), India, for financial support in the form of a Start-Up grant received by VTB. The authors acknowledge Savitribai Phule Pune University for publication fee assistance.
This study was funded by the University Grant Commission (UGC), India under the scheme of Start-Up grant received by VTB.
Availability of data and materials
All the data obtained in the current study have been presented in this article.
APV carried out the analysis of the sequences, cloning, RT-qPCR and drafted the manuscript. VTB participated in its design, coordination and supervised the study. Both the authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1. Ortiz De Montellano PR. Cytochrome P450: structure, mechanism, and biochemistry; 1986.
5. Bak S, Beisson F, Bishop G, Hamberger B, Höfer R, Paquette S, et al. Cytochromes P450. In: The Arabidopsis Book. American Society of Plant Biologists; 2011. p. e0144.
14. Guttikonda SK, Trupti J, Bisht NC, Chen H, An Y-QC, Pandey S, et al. Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases. BMC Plant Biol. 2010;10(1):243.
16. Babu PR, Rao KV, Reddy VD. Structural organization and classification of cytochrome P450 genes in flax (Linum usitatissimum L.). Gene. 2013;513(1):156–62.
30. Barvkar VT, Pardeshi VC, Kale SM, Kadoo NY, Gupta VS. Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns. BMC Genomics. 2012;13(1):175.
38. Roth MJ, Tanese N, Goff SP. Purification and characterization of murine retroviral reverse transcriptase expressed in Escherichia coli. J Biol Chem. 1985;260:9326–35.
39. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔCT) method. Methods. 2001;25:402–8.
51. Tiwari SB, Hagen G, Guilfoyle T. The roles of auxin response factor domains in auxin-responsive transcription. Plant Cell. 2003;15(2):533–43.
53. Storozhenko S, De PP, Van MM, Inze D, Kushnir S. The heat-shock element is a functional component of the Arabidopsis APX1 gene promoter. Plant Physiol. 1998;118(3):1005–14.
61. Kim HB, Schaller H, Goh CH, Kwon M, Choe S, An CS, et al. Arabidopsis cyp51 mutant shows postembryonic seedling lethality associated with lack of membrane integrity. Plant Physiol. 2005;138:2033–47.
67. Lan Z, Kai W, Jun TAN, Wei LI, Songgang LI. Putative cytochrome P450 genes in rice genome (Oryza sativa L. ssp. indica) and their EST evidence. Sci China Ser C Life Sci. 2002;45.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.