Abstract
The broad mite, Polyphagotarsonemus latus (Tarsonemidae: Acari), is a highly polyphagous species that damages plant species spread across 57 different families. This pest has developed high levels of resistance to some commonly used acaricides. In the present investigation, we deciphered the genome information of P. latus by PacBio HiFi sequencing. At 49.1 Mb, the P. latus genome is the third smallest arthropod genome sequenced so far. The entire genome was assembled into two contigs. A set of 9,286 protein-coding genes was annotated. Its compact genome size can be attributed to multiple features such as very low repeat content (5.1%) due to the lack of proliferation of transposable elements, high gene density (189.1/Mb), a high proportion of intronless genes (20.3%) and low microsatellite density (0.63%).
Background & Summary
The chelicerate mites and ticks, belonging to the Acari group, are the second most diverse group of animals on earth after insects. The spiders, mites, ticks, and scorpions together constitute one of the mega-diverse arthropod lineages with global distribution and significant economic and ecological importance. The class Acari comprises two superorders, namely Acariformes and Parasitiformes. Acari diverged from other arthropod lineages approximately 400 million years ago and formed a separate lineage1. Under Acariformes, the order Trombidiformes represents major phytophagous superfamilies of mites such as Tetranychoidea and Tarsonemoidea2. Eriophyoidea, yet another superfamily of phytophagous mites, is placed under another order, Sarcoptiformes, as per the recent classification3. Mites include many notorious pests of agricultural and veterinary importance. They can thrive across a wide range of habitats and display great diversity in their evolution.
The family Tarsonemidae contains about 40 genera and more than 500 described species, of which the broad mite or yellow mite, Polyphagotarsonemus latus (Banks) (Fig. 1a), is a serious pest of more than 250 commercially important crop plants spread across 57 plant families4. Crops such as hot (Fig. 1b) and sweet peppers, mulberry, citrus, cotton, tea, mango, jute, and potato are among those severely damaged. At present, it has become a well-established pest across all six zoogeographical regions worldwide, namely, Australia, Asia, Africa, North America, South America, and the Pacific Islands5 (Fig. 1c). Like many other mites, P. latus reproduces through the haplo-diploidy system with a female-biased sex ratio4. The males are haploid (n = 2) and produced by arrhenotokous parthenogenesis, and the females are diploid (2n = 4) and produced from fertilized eggs6. It is equally prevalent in tropical, subtropical, and greenhouse environments owing to several intrinsic and extrinsic factors7,8,9. Due to its microscopic size, its occurrence becomes evident only after substantial injury has been caused to the plants. Gregory and Young10 emphasized a positive relationship between genome size and body size in mites, while also noting that certain acarine genomes exhibit remarkably low levels of DNA content. P. latus is markedly smaller (<0.2 mm) than other mites such as Tetranychus urticae, amounting to approximately half their size or even less. The minute body size of P. latus, coupled with many of its genomic features, aligns well with this observation.
Chemical control with synthetic acaricides remains the common management strategy adopted by crop growers. The intensive and widespread use of acaricides, coupled with short life cycles and unusual modes of reproduction, has enabled many mite and tick species around the world to evolve resistance against acaricides with different modes of action11,12,13. In India, more than two dozen acaricides under 12 different modes of action were officially used for the management of phytophagous mites. Owing to their indiscriminate use and overuse, high levels of acaricide resistance have been documented in field-collected P. latus populations14,15.
To date, genomes of 39 mites and 17 ticks have been sequenced (NCBI, accessed 27 November 2023), including seven species of phytophagous mites, namely T. urticae16, T. cinnabarinus, T. truncatus17 (Spider mites: Tetranychoidea: Tetranychidae), Panonychus citri18 (Citrus red mite: Tetranychoidea: Tetranychidae), Aculops lycopersici19 (Tomato russet mite: Eriophyoidea: Eriophyidae), Fragariocoptes setiger20 (Gall mite: Eriophyoidea: Eriophyidae) and Halotydeus destructor21 (Redlegged earth mite: Eupodoidea: Penthaleidae). The majority of mite genomes are smaller than those of ticks. The de novo draft genome of P. latus generated in this study is the third smallest arthropod genome sequenced so far and the first genome deciphered from the Acari family Tarsonemidae. Its genome has been assembled into just two contigs, the lowest number among the sequenced mite species. The genome has very low repeat content (5.1%) due to a lack of proliferation of transposable elements. Although the genome encodes only 9,286 protein-coding genes, this low gene count does not appear to limit the pest’s adaptability to xenobiotics14,15. CAFE analysis revealed that P. latus exhibits one of the highest rates of gene family contractions and a lower rate of gene family expansions. The assembled genome of P. latus holds many novel features of interest both from a phylogenetic point of view and for developing new acaricidal molecules and novel pest management strategies.
Methods
Establishment and maintenance of P. latus colony
P. latus was originally collected from hot pepper plants (Fig. 1b) cultivated near Bengaluru, India (12.6254°N, 77.2319°E) during July 2020, and subsequently an iso-female colony was established (NBAIR-GR-TAR-01a) under laboratory conditions. This colony was maintained on potted mulberry plants (Morus alba; variety: V1) in a growth chamber since then at ICAR-National Bureau of Agricultural Insect Resources, Bengaluru, India. The identity was further confirmed by DNA barcoding. A 631 bp long COI sequence was deposited in the NCBI-GenBank (ON103156). The amplified COI sequence and the specimen details were submitted to the BOLD database V4 (BIN No. AED8321).
DNA isolation, library preparation, and sequencing
More than 5,000 adult females from the laboratory-reared susceptible iso-female colony were individually hand-picked from the infested leaves under the microscope. The collected mites were starved for a brief time and then frozen in liquid nitrogen. High-quality, high molecular-weight DNA was extracted using the CTAB method22. The extracted DNA was dissolved in TE buffer and sent to Nucleome Informatics Pvt. Ltd. (Hyderabad, India) for library preparation and sequencing. DNA was sheared to 15-20 Kb with the g-TUBE system (Covaris). The SMRTbell® gDNA Sample Amplification Kit was used to amplify the DNA. Approximately 600 ng of amplified DNA was used to generate a HiFi SMRTbell library for one Sequel II SMRT Cell using the SMRTbell Express template preparation kit 2 (PacBio, USA). Size selection (20 kb) was performed using the BluePippin system (Sage Science). In all the steps, DNA was quantified with a Qubit Fluorometer (ThermoFisher Scientific) and quality was checked with the Femto Pulse system (Agilent Technologies). The libraries were sequenced on the PacBio Sequel II platform. The subreads were processed with the CCS software (https://github.com/PacificBiosciences/ccs) in SMRT Link v10.2 (PacBio, USA) to produce HiFi (CCS) reads. The following settings were used: minimum number of passes: 1, minimum accuracy 0.9 with quality Q20 and above. From the two runs, 40.9 Gb and 38.8 Gb of raw data were generated with N50 values of 8,239 and 6,063, respectively (Table 1). A total of 186,760 HiFi reads were generated with an average read length of 11.6 Kb (Table 1).
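As a rough plausibility check on the sequencing depth, the read count and mean read length above can be combined with the assembly size reported in the assembly section (49.1 Mb). The sketch below is illustrative only: it uses rounded figures from the text, ignores the reads later removed as contaminants, and the function name `expected_coverage` is a hypothetical helper, not part of any pipeline used in the study.

```python
# Rounded figures taken from the text; the result is only an approximation.
HIFI_READS = 186_760            # number of HiFi reads
MEAN_READ_LEN_BP = 11_600       # average HiFi read length (11.6 Kb)
ASSEMBLY_SIZE_BP = 49_100_000   # assembly size (49.1 Mb)


def expected_coverage(n_reads: int, mean_len_bp: float, genome_bp: float) -> float:
    """Return approximate fold-coverage: total sequenced bases / genome size."""
    return n_reads * mean_len_bp / genome_bp


total_bases = HIFI_READS * MEAN_READ_LEN_BP
coverage = expected_coverage(HIFI_READS, MEAN_READ_LEN_BP, ASSEMBLY_SIZE_BP)

print(f"Total HiFi bases: {total_bases / 1e9:.2f} Gb")
print(f"Approximate HiFi coverage: {coverage:.0f}x")
```

Under these assumptions the HiFi data amount to roughly 2.2 Gb, or on the order of 44-fold coverage of the assembled genome.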
Contamination removal, contig level assembly, and assembly polishing
Microbial contaminants were filtered out using Kraken2 v2.1.223 and adapters were removed through a similarity search against the NCBI UniVec database using BLASTn. The reads were further mapped against the host (mulberry) genome (GenBank assembly accession no. GCA_012066045.3) using Minimap2 v2.2424 to eliminate host plant DNA contamination, which resulted in the removal of 1,494 reads. The filtered HiFi reads were assembled into contigs using the default parameters of Hifiasm v0.1625, an efficient and fast haplotype-resolved de novo assembler designed especially for PacBio HiFi reads. From the resulting partially phased genome, we took hap2 for the downstream analysis because the organism is homozygous (as indicated by the k-mer plot generated by the assembler) and hap2 contained fewer contigs than the hap1 assembly. To improve the contig assembly, HiFi reads were aligned back to the hap2 genome assembly using pbmm2 v1.5.0, and the resulting sorted BAM file was used to polish the hap2 assembly using gcpp v1.9.0. The phased genome has been assembled into two contigs with an assembly size of 49.1 Mb and an N50 value of 30.90 Mb (Table 2).
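For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows how it is computed from a list of contig lengths. The two contig lengths are assumptions for illustration: only the total size (49.1 Mb) and the N50 (30.90 Mb) are reported in the text, so the second length is simply the rounded difference between them.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


# Illustrative contig lengths (assumed, see lead-in): 30.9 Mb + 18.2 Mb = 49.1 Mb.
contigs = [30_900_000, 18_200_000]
print(f"N50 = {n50(contigs) / 1e6:.2f} Mb")
```

With only two contigs, the N50 is necessarily the length of the larger contig, which is why the N50 is such a large fraction of the total assembly size.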
Estimation of genome size and completeness
The k-mer frequency distribution histogram was constructed using Jellyfish v2.3.026. The histogram, along with read length and k-mer length, was used as input to GenomeScope v2.0 (k = 21) for the estimation of genome size, level of repeats, and heterozygosity. The assembly length was similar to the genome length estimated by k-mer analysis (49.3 Mb), with a low error rate of 0.27 percent. The average heterozygosity rate was estimated at 0.16 percent (Table S1; Fig. S1).
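The idea behind k-mer based genome size estimation can be illustrated with a deliberately simplified estimator: the total number of non-error k-mers divided by the depth of the main peak of the histogram. This is not the mixture model fitted by GenomeScope; the function name, the error cutoff, and the toy histogram below are all assumptions made for illustration.

```python
from typing import Mapping


def estimate_genome_size(histogram: Mapping[int, int], error_cutoff: int = 5) -> float:
    """Simplified k-mer estimate of genome size.

    histogram maps k-mer multiplicity -> number of distinct k-mers seen that
    many times (the two columns of a k-mer count histogram). Multiplicities
    below error_cutoff are treated as sequencing errors and ignored.
    """
    solid = {mult: n for mult, n in histogram.items() if mult >= error_cutoff}
    if not solid:
        raise ValueError("no k-mers above the error cutoff")
    peak_depth = max(solid, key=solid.get)  # multiplicity with most k-mers
    total_kmers = sum(mult * n for mult, n in solid.items())
    return total_kmers / peak_depth


# Toy histogram (made-up numbers): an error tail plus one peak at depth 40.
toy = {1: 1000, 2: 200, 38: 100, 39: 300, 40: 500, 41: 300, 42: 100}
print(estimate_genome_size(toy))
```

For a largely homozygous genome with few repeats, such a single-peak approximation tends to land close to model-based estimates, which is consistent with the small gap between the k-mer estimate (49.3 Mb) and the assembly size (49.1 Mb) reported here.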
Repeat identification and genome masking
RepeatModeler227 v2.0.4, an automated pipeline that employs different algorithms for repeat identification, was used to identify and compile sequence models representing all of the unique transposable element (TE) families dispersed in the assembled genome. After repeat identification, 2,497,763 bp were masked in the assembled genome. The dominant repeat elements were simple repeats accounting for 2.49%, followed by low complexity repeats (1.62%), total interspersed repeats (0.97%), retroelements (0.27%), and LINEs (0.24%) (Table 3).
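The masked length and the per-class percentages above can be cross-checked against the overall repeat content of 5.1% quoted elsewhere in the text. The sketch below uses the rounded assembly size (49.1 Mb) and assumes the usual nested layout of a repeat summary table, in which retroelements and LINEs are subsets of the interspersed repeats and therefore are not added again.

```python
MASKED_BP = 2_497_763        # masked bases reported in the text
ASSEMBLY_BP = 49_100_000     # rounded assembly size (49.1 Mb)

# Fraction of the assembly that was masked.
masked_pct = 100 * MASKED_BP / ASSEMBLY_BP

# Non-overlapping top-level classes (assumption: retroelements and LINEs
# are nested inside "interspersed" and must not be counted twice).
top_level_pct = {
    "simple repeats": 2.49,
    "low complexity": 1.62,
    "interspersed": 0.97,
}
summed_pct = sum(top_level_pct.values())

print(f"Masked fraction of assembly: {masked_pct:.2f}%")
print(f"Sum of top-level classes:    {summed_pct:.2f}%")
```

Both values round to the 5.1% repeat content stated in the abstract, so the reported figures are mutually consistent under these assumptions.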
Gene prediction
The genic and intergenic regions in the assembled genome were predicted using GeneMark-ES v228 and Augustus v3.429. GeneMark-ES uses a heuristic method to initialize a hidden semi-Markov model, finds the maximum likelihood parse of the sequence into coding and non-coding regions, and performs iterative self-training on the sequences, whereas Augustus used parameters trained through the BUSCO30 analysis with the arthropoda_odb10 lineage. In addition, RNA-Seq data from iso-female colonies of acaricide-resistant and susceptible populations were used as evidence for the prediction of gene models. A total of 9,286 and 7,787 genes were predicted from the genome assembly using GeneMark-ES and Augustus, respectively (Table S2). The genes predicted by GeneMark-ES were used for the annotation. A total of 22,909 introns were predicted in the entire genome, with 20.3% of the genes being intronless. A total of 32,195 exons were detected and the number of exons per gene varied from 1 to 33 (Table S3).
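The gene structure statistics above are internally consistent, which can be verified with simple arithmetic: a gene model with k exons has k - 1 introns, so (assuming one transcript per gene) the number of exons equals the number of introns plus the number of genes. The sketch below also derives the gene density quoted in the abstract; the intronless gene count and mean exon number are derived values, not figures reported in the text.

```python
GENES = 9_286                 # GeneMark-ES gene models
INTRONS = 22_909
EXONS = 32_195
INTRONLESS_FRACTION = 0.203   # 20.3% of genes
ASSEMBLY_MB = 49.1            # rounded assembly size

# One transcript per gene: each gene contributes (exons - 1) introns.
expected_exons = INTRONS + GENES

gene_density = GENES / ASSEMBLY_MB                 # genes per Mb
intronless_genes = round(GENES * INTRONLESS_FRACTION)
mean_exons_per_gene = EXONS / GENES

print(f"Exons expected from introns + genes: {expected_exons}")
print(f"Gene density: {gene_density:.1f} genes/Mb")
print(f"Approximate intronless genes: {intronless_genes}")
print(f"Mean exons per gene: {mean_exons_per_gene:.2f}")
```

The expected exon count matches the reported 32,195 exons, and the density reproduces the 189.1 genes/Mb given in the abstract.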
Functional Annotation and Gene-Ontology
The predicted protein sequences were further annotated with the eggNOG31, nr32, KEGG33, and InterPro34 databases. The corresponding Gene Ontology (GO) terms of the identified gene sequences were predicted using eggNOG-mapper v2.1.935. A total of 1,274 genes were annotated by BLAST+ v2.13.036 against the NCBI nr database; 6,964 genes were annotated by InterProScan v5.60.92; and a homology search against the KEGG database assigned 4,984 genes to their corresponding pathways. Gene ontology annotation classified 5,420 genes into three GO classes, namely biological process, molecular function, and cellular component.
Prediction of RNA species and microsatellites
To find the different RNAs present in the genome, the Infernal37 (INFERence of RNA ALignment) tool v1.1.4 was used with the default parameters. We used cmscan locally to search the Rfam CM libraries against the P. latus genome. The RNA species classification revealed a tRNA count of 102 and a miRNA count of four. Further, 5S_rRNA, 5.8S_rRNA, small subunit ribosomal RNA (SSU_rRNA_eukarya), and large subunit ribosomal RNA (LSU_rRNA_eukarya) were observed to be 25, 17, 16, and 16 in number, respectively (Table S4). To identify the microsatellite repeats in the assembled genome, Krait v1.4.038, a robust and ultrafast tool, was employed. The minimum repeats for each perfect Simple Sequence Repeat (SSR) type were set to 12 for mono-, 7 for di-, 5 for tri-, 4 for tetra-, 4 for penta-, and 4 for hexa-nucleotides, and the motif standardization level was set to Level 3. The total number of perfect SSRs was 23,116, which accounted for 0.64% of the genome (Table S5). Among the perfect microsatellites, tri-nucleotide microsatellites were the most abundant (324), followed by tetra-nucleotides (122), di-nucleotides (73), penta-nucleotides (38) and hexa-nucleotides (26). The most abundant SSR motifs were AAG (37.18%), followed by AAAT (30.53%), ATC (29.49%), and AAT (17.03%) (Table S6).
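The minimum-repeat thresholds described above can be illustrated with a small regular-expression scan for perfect SSRs. This is a minimal sketch of the thresholding logic only, not a reimplementation of Krait: it does not perform motif standardization, and the function name `find_perfect_ssrs` and the test sequence are hypothetical.

```python
import re
from typing import List, Tuple

# Minimum number of tandem copies per motif length, as stated in the text.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def find_perfect_ssrs(seq: str) -> List[Tuple[int, int, str, int]]:
    """Return (start, end, motif, copies) for each perfect SSR in seq.

    Coordinates are 0-based, end-exclusive. Motifs that are themselves
    repeats of a shorter unit (e.g. "ATAT") are skipped so that each
    repeat is reported only under its shortest motif.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in MIN_REPEATS.items():
        # One captured unit followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if redundant:
                continue
            copies = (match.end() - match.start()) // unit_len
            hits.append((match.start(), match.end(), motif, copies))
    return sorted(hits)


# Hypothetical sequence: A x12 and AAG x5 pass; AT x6 is below the di- threshold.
example = "GG" + "A" * 12 + "CC" + "AAG" * 5 + "TT" + "AT" * 6 + "C"
for hit in find_perfect_ssrs(example):
    print(hit)
```

In this example the (AT)6 run is not reported because di-nucleotide repeats require at least seven copies under the stated settings.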
Orthologous gene family identification and CAFE analysis
For the single-copy orthologous gene identification, genomic information of eight species was used. Broad mite (Polyphagotarsonemus latus); predatory mite (Metaseiulus occidentalis); two-spotted spider mite (Tetranychus urticae); tomato mite (Aculops lycopersici); common house spider (Parasteatoda tepidariorum); social velvet spider (Stegodyphus mimosarum) and black-legged tick (Ixodes scapularis) were used as the ingroup, and fruit fly (Drosophila melanogaster) was used as the outgroup reference. The proteomes of these organisms were collected from the NCBI (ftp://ftp.ncbi.nlm.nih.gov/) and Ensembl genome repositories (ftp://ftp.ensemblgenomes.org/).
OrthoVenn v339 is a comprehensive platform for comparative genomics, designed to identify orthogroups and orthologs. Blastp was used for the sequence homology search, and MUSCLE v5.140 and FastTree2 v2.1.1141 were used for multiple sequence alignment and tree inference for the phylogeny construction, respectively. OrthoVenn v3 generated 14,646 orthologous clusters with 225 overlaps and 669 single-copy clusters (Fig. S2, S3). A total of 145,984 proteins (proteome of P. latus and references) were present, of which 27,969 (19.16%) were singletons. CAFE42 v5.0 and the OrthoMCL v243 algorithm were used for grouping orthologous protein sequences with a P-value threshold of 0.05 to analyse the gene family expansions and contractions in P. latus and its related species. This analysis revealed that, among the eight arthropods analysed, P. latus exhibits one of the highest rates of gene family contractions (211) and a lower rate of gene family expansions (43) (Fig. S4).
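The notion of a single-copy cluster used above can be made concrete with a short sketch: a cluster is single-copy when every species in the comparison contributes exactly one protein. The cluster contents and gene identifiers below are invented for illustration and the helper `is_single_copy` is hypothetical; OrthoVenn performs this classification internally. The last lines simply recompute the singleton percentage from the counts given in the text.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def is_single_copy(cluster: Iterable[Tuple[str, str]], species: Sequence[str]) -> bool:
    """True if every species has exactly one (species, gene) member in cluster."""
    counts = Counter(sp for sp, _gene in cluster)
    return all(counts.get(sp, 0) == 1 for sp in species)


# Toy data with made-up identifiers for three of the eight species.
species = ["P_latus", "T_urticae", "D_melanogaster"]
clusters = {
    "cluster1": [("P_latus", "g1"), ("T_urticae", "g7"), ("D_melanogaster", "g3")],
    "cluster2": [("P_latus", "g2"), ("T_urticae", "g8"), ("T_urticae", "g9"),
                 ("D_melanogaster", "g4")],          # duplicated in T_urticae
    "cluster3": [("P_latus", "g5"), ("D_melanogaster", "g6")],  # missing species
}
single_copy = [name for name, members in clusters.items()
               if is_single_copy(members, species)]
print(single_copy)

# Singleton percentage recomputed from the counts reported in the text.
TOTAL_PROTEINS = 145_984
SINGLETONS = 27_969
singleton_pct = 100 * SINGLETONS / TOTAL_PROTEINS
print(f"Singletons: {singleton_pct:.2f}%")
```

The recomputed singleton share agrees with the 19.16% stated above.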
GO enrichment analysis using OrthoVenn v3 (P-value threshold of 0.05) showed that the significantly contracted gene families in P. latus included xenobiotic metabolic process (GO:0006805), xenobiotic transport (GO:0042908), glucosylceramide catabolic process (GO:0006680), ubiquitin-dependent protein catabolic process (GO:0006511), retinol metabolic process (GO:0042572), wing disc development (GO:0035220), visual learning (GO:0008542) and segmentation (GO:0035282). The GO terms related to biological processes like cholesterol metabolic process (GO:0008203), lipid catabolic process (GO:0016042), and neuron projection morphogenesis (GO:0048812) were also contracted.
The GO enrichment of the significantly expanded gene families included terms associated with xenobiotic detoxification enzymes, such as glutathione transferase activity (GO:0004364), UDP-glycosyltransferase activity (GO:0008194), serine-type carboxypeptidase activity (GO:0004185), glucuronosyltransferase activity (GO:0015020), and metalloexopeptidase activity (GO:0008235). The GO terms related to biological processes like testicular fusome organization (GO:0030724), antibiotic metabolic process (GO:0016999), and regulation of store-operated calcium entry (GO:2001256) were also expanded.
Data Records
The PacBio raw data, genome assembly, and annotation files have been submitted to NCBI under the BioProject ID: PRJNA904956; the assembly accession no: GCA_040055235.144 and SRA IDs: SRX19915179, SRX1973815245. The transcriptome reads used for validation of genome accuracy were submitted to NCBI (SRR21762741 to SRR21762743) in our previous study46. The genome assemblies of the species used in comparative studies were fruit fly (Drosophila melanogaster); predatory mite (Metaseiulus occidentalis) (GCA_000255335.2); two-spotted spider mite (Tetranychus urticae) (GCA_000239435.1); tomato mite (Aculops lycopersici) (Eriophyoidea: Eriophyidae) (GCA_015350385.1); common house spider (Parasteatoda tepidariorum) (GCA_000365465.3); spider (Stegodyphus mimosarum) (GCA_000611955.2); black-legged tick (Ixodes scapularis) (GCF_016920785.2); carmine spider mite (Tetranychus cinnabarinus) (GCA_022266195.1); spider mite (Tetranychus truncatus) (GCA_028476895.1); American house dust mite (Dermatophagoides farinae) (GCA_020809275.1); European house dust mite (Dermatophagoides pteronyssinus) (GCA_001901225.2); red-legged earth mite (Halotydeus destructor) (GCA_022750525.1) and human (Homo sapiens) (GCA_000001405.29). The gene annotation, GTF, and sequence files of P. latus are shared on figshare: https://doi.org/10.6084/m9.figshare.25825984.v147.
Technical Validation
Completeness and accuracy of the assembled genome were assessed by Benchmarking Universal Single-Copy Orthologs (BUSCO v5)30 analysis using the eukaryota_odb10, arthropoda_odb10, and arachnida_odb10 lineage datasets as ortholog references (n = 255, 1,013 and 2,934, respectively) and by mapping of RNA-Seq reads against the assembled genome. The BUSCO assessment of the draft genome assembly located 92.9%, 86.9% and 83.4% of the eukaryote single-copy orthologs (92.5% complete single copy and 0.4% duplicated), arthropod single-copy orthologs (84.3% complete single copy and 0.9% duplicated) and arachnid single-copy orthologs searched (82.0% complete single copy and 1.72% duplicated), respectively (Table S7; Fig. S5). The mapping of RNA-Seq reads against the genome was performed using the STAR48 aligner v2.7.10b with the default parameters, and a mapping rate of 96.44% was observed. These statistics indicate that the P. latus genome assembly is comparable to other chelicerate genomes with respect to the number of assembled contigs and completeness.
Code availability
No custom scripts were used in the current study. All data processing was done using the standard pipelines and bioinformatics tools described in the Methods section, following their guidelines.
References
Lozano-Fernandez, J. et al. Increasing species sampling in chelicerate genomic-scale datasets provides support for monophyly of Acari and Arachnida. Nat. Commun. 10, 2295 (2019).
Walter, D. E. & Krantz, G. W. Collecting, rearing, and preparing specimens. in A Manual of Acarology Vol. 3 (eds. G. W. Krantz & D. E. Walter), pp. 83–96. (Texas Tech University Press, 2009).
Klimov, P. B. et al. Comprehensive phylogeny of acariform mites (Acariformes) provides insights on the origin of the four-legged mites (Eriophyoidea), a long branch. Mol. Phylogenet. Evol. 119, 105–117 (2018).
Gerson, U. Biology and control of the broad mite, Polyphagotarsonemus latus (Banks) (Acari: Tarsonemidae). Exp. Appl. Acarol. 13, 163–178 (1992).
Lin, J. & Zhang, Z.-Q. Tarsonemidae of the world (Acari: Prostigmata): key to genera, geographical distribution, systematic catalogue and annotated bibliography (Systematic & Applied Acarology Society, 2002).
Flechtmann, C. H. W. & Flechtmann, C. A. H. Reproduction and chromosomes in the broad mite, Polyphagotarsonemus latus (Banks, 1904) (Acari, Prostigmata, Tarsonemidae). Acarol. VI 1, 455–456 (1984).
Rai, A. B., Satpathy, S., Gracy, R. G., Swamy, T. M. S. & Rai, M. Yellow mite (Polyphagotarsonemus latus Banks) menace in chilli crop. Veg. Sci. 34, 1–13 (2007).
Luypaert, G. et al. Temperature-dependent development of the broad mite Polyphagotarsonemus latus (Acari: Tarsonemidae) on Rhododendron simsii. Exp. Appl. Acarol. 63, 389–400 (2014).
Ovando-Garay, V., González-Gómez, R., Zarza, E., Castillo-Vera, A. & de Coss-Flores, M. E. Morphological and genetic characterization of the broad mite Polyphagotarsonemus latus Banks (Acari: Tarsonemidae) from two Mexican populations. PLoS One 17, e0266335 (2022).
Gregory, T. R. & Young, M. R. Small genomes in most mites (but not ticks). Int. J. Acarol. 46, 1–8 (2020).
Van Leeuwen, T. & Dermauw, W. The molecular evolution of xenobiotic metabolism and resistance in chelicerate mites. Annu. Rev. Entomol. 61, 475–498 (2016).
Agwunobi, D. O., Yu, Z. & Liu, J. A retrospective review on ixodid tick resistance against synthetic acaricides: implications and perspectives for future resistance prevention and mitigation. Pestic. Biochem. Physiol. 173, 104776 (2021).
De Rouck, S., İnak, E., Dermauw, W. & Van Leeuwen, T. A review of the molecular mechanisms of acaricide resistance in mites and ticks. Insect Biochem. Mol. Biol. 159, 103981 (2023).
Augustine, N. et al. Resistance to fenazaquin in broad mite, Polyphagotarsonemus latus (Banks) (Acari: Tarsonemidae): Realized heritability, risk assessment and cross-resistance. J. Appl. Entomol. 148, 279–286 (2024).
Augustine, N., Venkatasan, T., Upasana, S. & Mohan, M. Acaricide resistance among broad mite (Polyphagotarsonemus latus (Banks)) populations in Karnataka, India. Curr. Sci. 124, 1462–1468 (2023).
Grbić, M. et al. The genome of Tetranychus urticae reveals herbivorous pest adaptations. Nature 479, 487–492 (2011).
Chen, L. et al. The genome sequence of a spider mite, Tetranychus truncatus, provides insights into interspecific host range variation and the genetic basis of adaptation to a low-quality host plant. Insect Sci. 30, 1208–1228 (2023).
Yu, S. et al. Whole genome sequencing and bulked segregant analysis suggest a new mechanism of amitraz resistance in the citrus red mite, Panonychus citri (Acari: Tetranychidae). Pest Manag. Sci. 77, 5032–5048 (2021).
Greenhalgh, R. et al. Genome streamlining in a minute herbivore that manipulates its host plant. Elife 9, e56689 (2020).
Klimov, P. B. et al. Symbiotic bacteria of the gall-inducing mite Fragariocoptes setiger (Eriophyoidea) and phylogenomic resolution of the eriophyoid position among Acari. Sci. Rep. 12, 3811 (2022).
Thia, J. A. et al. The redlegged earth mite draft genome provides new insights into pesticide resistance evolution and demography in its invasive Australian range. J. Evol. Biol. 36, 381–398 (2023).
Hunt, G. J. Insect DNA extraction protocol. in Fingerprinting Methods based on Arbitrarily Primed PCR (eds. Micheli, M. R. & Bova, R.) pp. 21–24 (Springer Lab Manuals. Springer, Berlin, Heidelberg, 1997).
Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 1–13 (2019).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl. Acad. Sci. 117, 9451–9457 (2020).
Lomsadze, A., Bonny, C., Strozzi, F. & Borodovsky, M. GeneMark-HM: improving gene prediction in DNA sequences of human microbiome. NAR genom. Bioinform. 3, lqab047 (2021).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Huerta-Cepas, J. et al. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 47, D309–D314 (2019).
Pruitt, K. D., Tatusova, T. & Maglott, D. R. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 33, D501–D504 (2005).
Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Paysan-Lafosse, T. et al. InterPro in 2022. Nucleic Acids Res. 51, D418–D427 (2023).
Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
McGinnis, S. & Madden, T. L. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res. 32, W20–W25 (2004).
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
Du, L., Zhang, C., Liu, Q., Zhang, X. & Yue, B. Krait: an ultrafast tool for genome-wide survey of microsatellites and primer design. Bioinformatics 34, 681–683 (2018).
Sun, J. et al. OrthoVenn3: an integrated platform for exploring and visualizing orthologous data across genomes. Nucleic Acids Res. 51, W397–W403 (2023).
Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS One 5, e9490 (2010).
Mendes, F. K., Vanderpool, D., Fulton, B. & Hahn, M. W. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36, 5516–5518 (2020).
Li, L., Stoeckert, C. J. & Roos, D. S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13, 2178–2189 (2003).
Mohan, M. et al. Genbank https://identifiers.org/insdc.gca:GCA_040055235.1 (2024).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP428391 (2024).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP400343 (2023).
Mohan, M. et al. Dataset of P. latus genome, https://doi.org/10.6084/m9.figshare.25825984.v1 (2019).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Acknowledgements
The authors extend gratitude to the Indian Council of Agricultural Research (ICAR) for generous funding under the Consortia Research Platform (CRP) on Genomics project. The authors thank Nucleome Informatics Pvt. Ltd., Hyderabad, for the sequencing service, and Bionivid Technology Pvt. Ltd., Bengaluru, for bioinformatics analysis support through a dedicated high-performance computing facility. The infrastructure facilities provided by the Director, ICAR-National Bureau of Agricultural Insect Resources, Bengaluru are gratefully acknowledged.
Author information
Contributions
M.M. contributed to the conception and design of various experiments and interpretation of data (comparative genomics), drafting of the manuscript, and fund acquisition. N.A. contributed to the rearing and maintenance of iso-female colonies of P. latus, the preparation of mite samples for genome and transcriptome sequencing, and the writing and drafting of the manuscript. S.B.S. contributed to genome composition estimation, gene prediction, gene annotation, RNA-Seq mapping, and drafting of the manuscript. A.P.J. contributed to gene annotation, genome size estimation, and drafting of the manuscript. U.S. contributed to genome assembly, bioinformatics analysis, and drafting of the manuscript. J.P. and G.R.G. contributed to the drafting of the manuscript. V.T. contributed to the DNA barcoding of mite samples and proofreading of the manuscript. S.N.S. contributed to the population dynamics of P. latus.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mohan, M., Augustine, N., Selvamani, S.B. et al. The miniature genome of broad mite, Polyphagotarsonemus latus (Tarsonemidae: Acari). Sci Data 11, 748 (2024). https://doi.org/10.1038/s41597-024-03579-4
DOI: https://doi.org/10.1038/s41597-024-03579-4