The miniature genome of broad mite, Polyphagotarsonemus latus (Tarsonemidae: Acari)

Mohan, Muthugounder; Augustine, Neenu; Selvamani, Selva Babu; P. J., Aneesha; Selvapandian, Upasna; Pathak, Jyoti; Gracy R., Gandhi; Thiruvengadam, Venkatesan; S. N., Sushil

doi:10.1038/s41597-024-03579-4

Data Descriptor
Open access
Published: 09 July 2024

Volume 11, article number 748, (2024)

Scientific Data

Muthugounder Mohan ORCID: orcid.org/0000-0002-2327-6988¹,
Neenu Augustine¹,
Selva Babu Selvamani¹,
Aneesha P. J. ORCID: orcid.org/0000-0001-9547-429X¹,
Upasna Selvapandian¹,
Jyoti Pathak¹,
Gandhi Gracy R.¹,
Venkatesan Thiruvengadam¹ &
Sushil S. N.¹

Abstract

The broad mite, Polyphagotarsonemus latus (Tarsonemidae: Acari), is a highly polyphagous species that damages plants across 57 different families. This pest has developed high levels of resistance to some commonly used acaricides. In the present investigation, we deciphered the genome of P. latus by PacBio HiFi sequencing. With a size of 49.1 Mb, the P. latus genome is the third smallest arthropod genome sequenced so far. The entire genome was assembled into two contigs, and a set of 9,286 protein-coding genes was annotated. Its compact genome size can be attributed to multiple features, such as very low repeat content (5.1%) due to the lack of proliferation of transposable elements, high gene density (189.1/Mb), a high proportion of intronless genes (20.3%) and low microsatellite density (0.63%).

Background & Summary

The chelicerate mites and ticks, belonging to the Acari group, are the second most diverse group of animals on earth after insects. Spiders, mites, ticks, and scorpions together constitute one of the mega-diverse arthropod lineages, with a global distribution and significant economic and ecological importance. The class Acari comprises two superorders, namely Acariformes and Parasitiformes. Acari diverged from other arthropod lineages approximately 400 million years ago and formed a separate lineage¹. Under Acariformes, the order Trombidiformes includes major phytophagous superfamilies of mites such as Tetranychoidea and Tarsonemoidea². Eriophyoidea, yet another superfamily of phytophagous mites, is placed under a different order, Sarcoptiformes, as per the recent classification³. Mites include many notorious pests of agricultural and veterinary importance. They thrive across a wide range of habitats and display great evolutionary diversity.

The family Tarsonemidae contains about 40 genera and more than 500 described species. Among these, the broad mite or yellow mite, Polyphagotarsonemus latus (Banks) (Fig. 1a), is a serious pest of more than 250 commercially important crop plants spread across 57 plant families⁴. Crops such as hot (Fig. 1b) and sweet peppers, mulberry, citrus, cotton, tea, mango, jute, and potato are among those severely damaged. At present, it is a well-established pest across all six zoogeographical regions worldwide, namely Australia, Asia, Africa, North America, South America, and the Pacific Islands⁵ (Fig. 1c). Like many other mites, P. latus reproduces through the haplodiploid system with a female-biased sex ratio⁴. The males are haploid (n = 2) and produced by arrhenotokous parthenogenesis, whereas the females are diploid (2n = 4) and develop from fertilized eggs⁶. It is equally prevalent in tropical, subtropical, and greenhouse environments owing to several intrinsic and extrinsic factors⁷,⁸,⁹. Because of its microscopic size, its occurrence becomes evident only after substantial injury has been caused to the plants. Gregory and Young¹⁰ emphasized a positive relationship between genome size and body size in mites, while also noting that certain acarine genomes exhibit remarkably low DNA content. P. latus has a very small body (<0.2 mm), approximately half the size or less of other mites such as Tetranychus urticae. The minute body size of P. latus, coupled with many of its genomic features, aligns well with this observation.

Chemical control with synthetic acaricides remains the common management strategy adopted by crop growers. The intensive and widespread use of acaricides, coupled with short life cycles and unusual modes of reproduction, has enabled many mite and tick species around the world to evolve resistance against acaricides with different modes of action¹¹,¹²,¹³. In India, more than two dozen acaricides under 12 different modes of action have been officially used for the management of phytophagous mites. Owing to their indiscriminate use and overuse, high levels of acaricide resistance have been documented in field-collected P. latus populations¹⁴,¹⁵.

To date, genomes of 39 mites and 17 ticks have been sequenced (NCBI, accessed 27 November 2023), including seven species of phytophagous mites, namely T. urticae¹⁶, T. cinnabarinus, T. truncatus¹⁷ (Spider mites: Tetranychoidea: Tetranychidae), Panonychus citri¹⁸ (Citrus red mite: Tetranychoidea: Tetranychidae), Aculops lycopersici¹⁹ (Tomato russet mite: Eriophyoidea: Eriophyidae), Fragariocoptes setiger²⁰ (Gall mite: Eriophyoidea: Eriophyidae) and Halotydeus destructor²¹ (Redlegged earth mite: Eupodoidea: Penthaleidae). Most mite genomes are smaller than those of ticks. The de novo draft genome of P. latus produced in this study is the third smallest arthropod genome sequenced so far and the first genome deciphered for the Acari family Tarsonemidae. The genome has been assembled into just two contigs, the lowest number among the sequenced mite species. It has very low repeat content (5.1%) due to a lack of proliferation of transposable elements. Although the genome encodes only 9,286 protein-coding genes, this small gene count does not appear to limit the pest’s adaptability to xenobiotics¹⁴,¹⁵. CAFE analysis revealed that P. latus exhibits one of the highest rates of gene family contractions and a lower rate of gene family expansions. The assembled genome of P. latus holds many novel features of interest, both from a phylogenetic point of view and for developing new acaricidal molecules and novel pest management strategies.

Methods

Establishment and maintenance of P. latus colony

P. latus was originally collected from hot pepper plants (Fig. 1b) cultivated near Bengaluru, India (12.6254°N, 77.2319°E) during July 2020, and an iso-female colony (NBAIR-GR-TAR-01a) was subsequently established under laboratory conditions. The colony has since been maintained on potted mulberry plants (Morus alba; variety: V1) in a growth chamber at ICAR-National Bureau of Agricultural Insect Resources, Bengaluru, India. The species identity was further confirmed by DNA barcoding. A 631 bp COI sequence was deposited in NCBI GenBank (ON103156). The amplified COI sequence and the specimen details were submitted to the BOLD database V4 (BIN No. AED8321).

DNA isolation, library preparation, and sequencing

More than 5,000 adult females from the laboratory-reared susceptible iso-female colony were individually hand-picked from infested leaves under a microscope. The collected mites were starved for a brief time and then frozen in liquid nitrogen. High-quality, high-molecular-weight DNA was extracted using the CTAB method²². The extracted DNA was dissolved in TE buffer and sent to Nucleome Informatics Pvt. Ltd. (Hyderabad, India) for library preparation and sequencing. DNA was sheared to 15–20 kb with the g-TUBE system (Covaris). The SMRTbell® gDNA Sample Amplification Kit was used to amplify the DNA. Approximately 600 ng of amplified DNA was used to generate a HiFi SMRTbell library for one Sequel II SMRT Cell using the SMRTbell Express template preparation kit 2 (PacBio, USA). Size selection (20 kb) was performed using the BluePippin system (Sage Science). At every step, DNA was quantified with a Qubit Fluorometer (ThermoFisher Scientific) and its quality was checked with a Femto Pulse system (Agilent Technologies). The libraries were sequenced on the PacBio Sequel II platform. HiFi reads were called from the subreads with the CCS software (https://github.com/PacificBiosciences/ccs) in SMRT Link v10.2 (PacBio, USA). The following settings were used: minimum number of passes: 1, minimum accuracy 0.9 with quality Q20 and above. From the two runs, 40.9 Gb and 38.8 Gb of raw data were generated with N50 values of 8,239 and 6,063, respectively (Table 1). A total of 186,760 HiFi reads were generated with an average read length of 11.6 kb (Table 1).
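
As a rough sanity check on sequencing depth, the HiFi yield and the expected coverage can be estimated from the read count and mean read length reported above. The following is a minimal sketch and not part of the authors' pipeline; the function names are illustrative, and the 49.1 Mb genome size is the rounded assembly size reported in this article.

```python
def hifi_yield_bp(read_count: int, mean_read_length_bp: float) -> float:
    """Approximate total HiFi bases as read count x mean read length."""
    return read_count * mean_read_length_bp


def fold_coverage(total_bases: float, genome_size_bp: float) -> float:
    """Average depth of coverage: sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


# Values taken from the text: 186,760 HiFi reads averaging 11.6 kb and an
# assembly of 49.1 Mb. All inputs are rounded, so the result is approximate.
total = hifi_yield_bp(186_760, 11_600)
print(f"HiFi yield: {total / 1e9:.2f} Gb")
print(f"Approximate coverage: {fold_coverage(total, 49_100_000):.0f}x")
```

Because the mean read length is rounded to one decimal place, the estimate should be read as an order-of-magnitude check rather than an exact figure.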

Table 1 Complete PacBio sequencing metrics of the P. latus genome.

Contamination removal, contig level assembly, and assembly polishing

Microbial contamination was filtered out using Kraken2 v2.1.2²³, and adapters were removed through a similarity search against the NCBI UniVec database using BLASTN. The reads were further mapped against the host (mulberry) genome (GenBank assembly accession no. GCA_012066045.3) using Minimap2 v2.24²⁴ to eliminate host plant DNA contamination, which resulted in the removal of 1,494 reads. The filtered HiFi reads were assembled into contigs using the default parameters of Hifiasm v0.16²⁵, an efficient and fast haplotype-resolved de novo assembler designed for PacBio HiFi reads. From the resulting partially phased assembly, we took hap2 for downstream analysis because the organism is homozygous (as judged from the k-mer plot generated by the assembler) and hap2 had fewer contigs than the hap1 assembly. To improve the contig assembly, HiFi reads were aligned back to the hap2 assembly using pbmm2 v1.5.0, and the resulting sorted BAM file was used to polish the hap2 assembly with gcpp v1.9.0. The phased genome was assembled into two contigs with an assembly size of 49.1 Mb and an N50 value of 30.90 Mb (Table 2).
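
The N50 statistic quoted above is the length of the contig at which the cumulative length, counted from the longest contig downwards, first reaches half of the total assembly size. A minimal sketch of that calculation follows; the second contig length in the example is an assumption, taken as the remainder of the rounded 49.1 Mb assembly, and is not a value reported in the text.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Illustrative two-contig assembly: 30.9 Mb is the reported N50 contig;
# 18.2 Mb is an assumed remainder (49.1 Mb - 30.9 Mb).
example_contigs = [30_900_000, 18_200_000]
print(n50(example_contigs))
```

With only two contigs, the N50 is simply the length of the longer contig, which is why the reported value is so close to the total assembly size.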

Table 2 Contig level assembly statistics of P. latus genome.

Estimation of genome size and completeness

The k-mer frequency distribution histogram was constructed using Jellyfish v2.3.0²⁶. The histogram, along with the read length and k-mer length, was used as input to GenomeScope v2.0 (k = 21) to estimate genome size, repeat level, and heterozygosity. The assembly length was similar to the length estimated by k-mer analysis (49.3 Mb), with a low error rate of 0.27 percent. The average heterozygosity rate was estimated at 0.16 percent (Table S1; Fig. S1).
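
The idea behind k-mer based genome size estimation can be illustrated with a simple peak-depth heuristic: the total number of non-error k-mers divided by the depth of the main (homozygous) peak. The sketch below is a simplification under stated assumptions; GenomeScope fits a statistical model to the histogram rather than using this heuristic, and the function name, the `min_depth` error cutoff and the toy histogram are all hypothetical.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer histogram.

    histogram: iterable of (depth, number_of_distinct_kmers) pairs.
    min_depth: depths below this are treated as sequencing errors (assumed cutoff).
    """
    solid = [(depth, count) for depth, count in histogram if depth >= min_depth]
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # Depth of the main peak: the depth with the most distinct k-mers.
    peak_depth = max(solid, key=lambda pair: pair[1])[0]
    # Total k-mer occurrences contributed by non-error k-mers.
    total_kmers = sum(depth * count for depth, count in solid)
    return total_kmers / peak_depth


# Toy histogram (not real data): an error tail at depths 1-2 and a peak at depth 20.
toy_histogram = [(1, 1000), (2, 200), (19, 50), (20, 100), (21, 50)]
print(estimate_genome_size(toy_histogram))
```

In practice the histogram would be read from the two-column output of a k-mer counter, and the error cutoff would be chosen by inspecting the valley between the error tail and the main peak.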

Repeat identification and genome masking

To identify and compile sequence models representing all of the unique transposable element (TE) families dispersed in the assembled genome, we used RepeatModeler2²⁷ v2.0.4, an automated pipeline that employs several algorithms for repeat identification. After repeat identification, 2,497,763 bp were masked in the assembled genome. The dominant repeat elements were simple repeats, accounting for 2.49% of the genome, followed by low-complexity repeats (1.62%), total interspersed repeats (0.97%), retroelements (0.27%), and LINEs (0.24%) (Table 3).
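
The reported repeat content can be cross-checked against the number of masked bases. The short sketch below does this arithmetic; it assumes the rounded 49.1 Mb assembly size, and it assumes that retroelements and LINEs are nested within the "total interspersed repeats" category (as in typical repeat-masking summary tables), so they are not added a second time.

```python
def masked_percent(masked_bp: int, assembly_bp: int) -> float:
    """Percentage of the assembly that was masked as repeats."""
    return 100 * masked_bp / assembly_bp


MASKED_BP = 2_497_763        # masked bases reported in the text
ASSEMBLY_BP = 49_100_000     # rounded assembly size (49.1 Mb)

# Top-level repeat classes reported in the text (percent of genome).
# Retroelements (0.27%) and LINEs (0.24%) are assumed to be subsets of
# the interspersed repeats and are therefore left out of the sum.
top_level_classes = {
    "simple repeats": 2.49,
    "low complexity": 1.62,
    "total interspersed": 0.97,
}

print(f"Masked fraction: {masked_percent(MASKED_BP, ASSEMBLY_BP):.1f}%")
print(f"Sum of top-level classes: {sum(top_level_classes.values()):.2f}%")
```

Both figures agree with the repeat content of about 5.1% quoted in the abstract.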

Table 3 Repeat element statistics of P. latus genome.

Gene prediction

The genic and intergenic regions in the assembled genome were predicted using Genemark-ES v2²⁸ and Augustus v3.4²⁹. Genemark-ES uses a heuristic method to initialize a hidden semi-Markov model, which finds the maximum-likelihood parse of a sequence into coding and non-coding regions, and then trains itself iteratively on the input sequences. Augustus, in contrast, used species parameters trained through the BUSCO³⁰ analysis with the arthropoda_odb10 lineage. In addition, RNA-Seq data from iso-female colonies of acaricide-resistant and susceptible populations were used as evidence for predicting gene models. Genemark-ES and Augustus predicted 9,286 and 7,787 genes from the genome assembly, respectively (Table S2). The genes predicted by Genemark-ES were used for annotation. A total of 22,909 introns were predicted in the entire genome, with 20.3% of the genes being intronless. A total of 32,195 exons were detected, and the number of exons per gene varied from 1 to 33 (Table S3).
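
The gene structure figures above are internally consistent, which can be verified with a few lines of arithmetic. The sketch below assumes one transcript model per gene, in which case every gene has exactly one more exon than it has introns; the constant and function names are illustrative.

```python
GENES = 9_286          # protein-coding genes (Genemark-ES)
INTRONS = 22_909       # predicted introns
EXONS = 32_195         # detected exons
ASSEMBLY_MB = 49.1     # assembly size in Mb (rounded)


def expected_exons(genes: int, introns: int) -> int:
    """With one transcript per gene, exons = introns + genes."""
    return genes + introns


def gene_density_per_mb(genes: int, assembly_mb: float) -> float:
    """Number of genes per megabase of assembly."""
    return genes / assembly_mb


print(expected_exons(GENES, INTRONS) == EXONS)
print(f"Gene density: {gene_density_per_mb(GENES, ASSEMBLY_MB):.1f} genes/Mb")
print(f"Mean exons per gene: {EXONS / GENES:.2f}")
```

The gene density obtained this way matches the 189.1 genes/Mb stated in the abstract.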

Functional Annotation and Gene-Ontology

The predicted protein sequences were further annotated with the eggNOG³¹, nr³², KEGG³³, and InterPro³⁴ databases. The Gene Ontology (GO) terms corresponding to the identified gene sequences were predicted using eggNOG-mapper v2.1.9³⁵. A total of 1,274 genes were annotated by BLAST+ v2.13.0³⁶ against the NCBI nr database, 6,964 genes were annotated by InterProScan v5.60.92, and a homology search against the KEGG database assigned 4,984 genes to their corresponding pathways. Gene ontology annotation classified 5,420 genes into the three GO classes, namely biological process, molecular function, and cellular component.

Prediction of RNA species and microsatellites

To find the different RNAs present in the genome, the Infernal³⁷ (INFERence of RNA ALignment) tool v1.1.4 was used with the default parameters. We used cmscan locally to search the P. latus genome against the Rfam covariance model (CM) library. The RNA species classification revealed 102 tRNAs and four miRNAs. Further, 25, 17, 16, and 16 copies of 5S_rRNA, 5.8S_rRNA, small subunit ribosomal RNA (SSU_rRNA_eukarya), and large subunit ribosomal RNA (LSU_rRNA_eukarya) were observed, respectively (Table S4). To identify microsatellite repeats in the assembled genome, Krait v1.4.0³⁸, a robust and ultrafast tool, was employed. The minimum number of repeats for each perfect simple sequence repeat (SSR) type was set to 12 for mono-, 7 for di-, 5 for tri-, 4 for tetra-, 4 for penta-, and 4 for hexa-nucleotides, and the motif standardization level was set to Level 3. The total number of perfect SSRs was 23,116, which accounted for 0.64% of the genome (Table S5). Among the perfect microsatellites, tri-nucleotide microsatellites were the most abundant (324), followed by tetra-nucleotides (122), di-nucleotides (73), penta-nucleotides (38) and hexa-nucleotides (26). The most abundant SSR motifs were AAG (37.18%), followed by AAAT (30.53%), ATC (29.49%), and AAT (17.03%) (Table S6).
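
The minimum-repeat thresholds described above can be illustrated with a small regular-expression scan for perfect SSRs. This is a simplified sketch and not Krait's algorithm: it does not standardize motifs, it does not merge hits that overlap at their boundaries, and the example sequence is made up for demonstration.

```python
import re

# Minimum repeat counts per motif length, as stated in the text.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def is_redundant(motif: str) -> bool:
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_perfect_ssrs(sequence: str):
    """Return sorted (start, end, motif, repeat_count) tuples, 0-based half-open."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_redundant(motif):
                continue  # already reported under its shorter unit
            length = match.end() - match.start()
            hits.append((match.start(), match.end(), motif, length // unit_len))
    return sorted(hits)


def ssr_percent(sequence: str) -> float:
    """Percentage of the sequence covered by perfect SSRs (overlaps not merged)."""
    covered = sum(end - start for start, end, _, _ in find_perfect_ssrs(sequence))
    return 100 * covered / len(sequence)


# Made-up sequence: six AAG repeats and a run of twelve A's.
example = "CT" + "AAG" * 6 + "CC" + "A" * 12 + "TG"
for hit in find_perfect_ssrs(example):
    print(hit)
```

A dedicated tool would additionally collapse equivalent motifs (for example AAG, AGA and GAA, and their reverse complements) into one standardized motif before counting, which is the role of the motif standardization level mentioned above.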

Orthologous gene family identification and CAFE analysis

For the identification of single-copy orthologous genes, genomic information from eight species was used. Seven species formed the ingroup: broad mite (Polyphagotarsonemus latus); predatory mite (Metaseiulus occidentalis); two-spotted spider mite (Tetranychus urticae); tomato mite (Aculops lycopersici); common house spider (Parasteatoda tepidariorum); social velvet spider (Stegodyphus mimosarum); and black-legged tick (Ixodes scapularis). The fruit fly (Drosophila melanogaster) was used as the outgroup reference. The proteomes of these organisms were collected from the NCBI (ftp://ftp.ncbi.nlm.nih.gov/) and Ensembl Genomes (ftp://ftp.ensemblgenomes.org/) repositories.

OrthoVenn v3³⁹ is a comprehensive platform for comparative genomics, designed to identify orthogroups and orthologs. Blastp was used for the sequence homology search, and MUSCLE v5.1⁴⁰ and FastTree2 v2.1.11⁴¹ were used for multiple sequence alignment and tree inference, respectively, to construct the phylogeny. OrthoVenn v3 generated 14,646 orthologous clusters with 225 overlaps and 669 single-copy clusters (Fig. S2, S3). A total of 145,984 proteins (the proteomes of P. latus and the reference species) were analysed, of which 27,969 (19.16%) were singletons. The OrthoMCL v2⁴³ algorithm was used to group orthologous protein sequences, and CAFE⁴² v5.0 was used with a P-value threshold of 0.05 to analyse gene family expansions and contractions in P. latus and its related species. This analysis revealed that, among the eight arthropods analysed, P. latus exhibits one of the highest rates of gene family contractions (211) and a lower rate of gene family expansions (43) (Fig. S4).
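
The cluster categories mentioned above can be made concrete with a short sketch. Here a single-copy cluster is taken to be one that contains exactly one protein from every species, and a singleton is a protein that was not placed in any cluster; these are the usual meanings of the terms, but the data structures, species labels and protein identifiers below are hypothetical and do not reflect OrthoVenn's output format.

```python
def is_single_copy(cluster: dict, species: list) -> bool:
    """True if the cluster holds exactly one protein from each species."""
    return all(len(cluster.get(sp, [])) == 1 for sp in species)


def singleton_percent(n_singletons: int, n_proteins: int) -> float:
    """Share of all proteins that were not assigned to any cluster."""
    return 100 * n_singletons / n_proteins


# Hypothetical three-species example with made-up protein identifiers.
species = ["P_latus", "T_urticae", "D_melanogaster"]
clusters = [
    {"P_latus": ["pl1"], "T_urticae": ["tu1"], "D_melanogaster": ["dm1"]},
    {"P_latus": ["pl2", "pl3"], "T_urticae": ["tu2"], "D_melanogaster": ["dm2"]},
    {"P_latus": ["pl4"], "T_urticae": ["tu3"]},
]
print(sum(is_single_copy(c, species) for c in clusters))

# Figures from the text: 27,969 singletons among 145,984 proteins.
print(f"{singleton_percent(27_969, 145_984):.2f}%")
```

Only the first toy cluster counts as single-copy; the second contains a duplicated P. latus protein and the third lacks a fruit fly member.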

GO enrichment analysis of the significantly contracted gene families in P. latus, performed using OrthoVenn v3 with a P-value threshold of 0.05, identified xenobiotic metabolic process (GO:0006805), xenobiotic transport (GO:0042908), glucosylceramide catabolic process (GO:0006680), ubiquitin-dependent protein catabolic process (GO:0006511), retinol metabolic process (GO:0042572), wing disc development (GO:0035220), visual learning (GO:0008542) and segmentation (GO:0035282). GO terms related to biological processes such as cholesterol metabolic process (GO:0008203), lipid catabolic process (GO:0016042), and neuron projection morphogenesis (GO:0048812) were also contracted.

The GO enrichment of the significantly expanded gene families included those responsible for xenobiotic detoxification enzymes like glutathione transferase activity (GO:0004364), UDP-glycosyltransferase activity (GO:0008194), serine-type carboxypeptidase activity (GO:0004185), glucuronosyltransferase activity (GO:0015020), and metalloexopeptidase activity (GO:0008235). The GO terms related to biological processes like testicular fusome organization (GO:0030724), antibiotic metabolic process (GO:0016999), and regulation of store-operated calcium entry (GO:2001256) were also expanded.
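
GO enrichment of this kind is commonly assessed with a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of genes with a given GO term in a gene set drawn from the background. The sketch below shows that generic test under stated assumptions; the text does not specify the exact statistic used by OrthoVenn v3, and the numbers in the example are toy values, not results from this study.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric P-value, P(X >= k).

    N: genes in the background
    K: background genes annotated with the GO term
    n: genes in the set of interest (e.g. an expanded family set)
    k: genes in the set of interest annotated with the GO term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy example: 10 background genes, 3 carry the term, and all 3 genes
# in the set of interest carry it as well.
p_value = hypergeom_enrichment_p(k=3, n=3, K=3, N=10)
print(p_value < 0.05)
```

When many GO terms are tested at once, the resulting P-values are usually corrected for multiple testing before the 0.05 threshold is applied.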

Data Records

The PacBio raw data, genome assembly, and annotation files have been submitted to NCBI under BioProject ID PRJNA904956, with the assembly accession no. GCA_040055235.1⁴⁴ and SRA IDs SRX19915179 and SRX19738152⁴⁵. The transcriptome reads used for validation of genome accuracy were submitted to NCBI (SRR21762741 to SRR21762743) in our previous study⁴⁶. The genome assembly accession IDs of the species used in comparative studies were: fruit fly (Drosophila melanogaster); predatory mite (Metaseiulus occidentalis) (GCA_000255335.2); two-spotted spider mite (Tetranychus urticae) (GCA_000239435.1); tomato mite (Aculops lycopersici) (Eriophyoidea: Eriophyidae) (GCA_015350385.1); common house spider (Parasteatoda tepidariorum) (GCA_000365465.3); spider (Stegodyphus mimosarum) (GCA_000611955.2); black-legged tick (Ixodes scapularis) (GCF_016920785.2); carmine spider mite (Tetranychus cinnabarinus) (GCA_022266195.1); spider mites (Tetranychus truncatus) (GCA_028476895.1); American house dust mite (Dermatophagoides farinae) (GCA_020809275.1); European house dust mite (Dermatophagoides pteronyssinus) (GCA_001901225.2); red-legged earth mite (Halotydeus destructor) (GCA_022750525.1) and human (Homo sapiens) (GCA_000001405.29). The gene annotation, GTF, and sequence files of P. latus are shared on figshare: https://doi.org/10.6084/m9.figshare.25825984.v1⁴⁷.

Technical Validation

Completeness and accuracy of the assembled genome were assessed by Benchmarking Universal Single-Copy Orthologs (BUSCO v5)³⁰ analysis, using the eukaryota_odb10, arthropoda_odb10, and arachnida_odb10 lineage datasets as ortholog references (n = 255, 1,013 and 2,934, respectively), and by mapping RNA-Seq reads against the assembled genome. The BUSCO assessment located 92.9%, 86.9% and 83.4% of the eukaryote single-copy orthologs (92.5% complete single copy and 0.4% duplicated), arthropod single-copy orthologs (84.3% complete single copy and 0.9% duplicated) and arachnid single-copy orthologs (82.0% complete single copy and 1.72% duplicated), respectively (Table S7; Fig. S5). The RNA-Seq reads were mapped against the genome using the STAR⁴⁸ aligner v2.7.10b with the default parameters, and a mapping rate of 96.44% was observed. These statistics indicate that the P. latus genome assembly is comparable to other chelicerate genomes with respect to the number of assembled contigs and completeness.

Code availability

No custom scripts were used in the current study. All data processing was done with the standard pipelines and bioinformatics tools described in the Methods section, following their guidelines.

References

1. Lozano-Fernandez, J. et al. Increasing species sampling in chelicerate genomic-scale datasets provides support for monophyly of Acari and Arachnida. Nat. Commun. 10, 2295 (2019).
2. Walter, D. E. & Krantz, G. W. Collecting, rearing, and preparing specimens. in A Manual of Acarology Vol. 3 (eds. G. W. Krantz & D. E. Walter), pp. 83–96. (Texas Tech University Press, 2009).
3. Klimov, P. B. et al. Comprehensive phylogeny of acariform mites (Acariformes) provides insights on the origin of the four-legged mites (Eriophyoidea), a long branch. Mol. Phylogenet. Evol. 119, 105–117 (2018).
4. Gerson, U. Biology and control of the broad mite, Polyphagotarsonemus latus (Banks) (Acari: Tarsonemidae). Exp. Appl. Acarol. 13, 163–178 (1992).
5. Lin, J. & Zhang, Z.-Q. Tarsonemidae of the world (Acari: Prostigmata): key to genera, geographical distribution, systematic catalogue and annotated bibliography (Systematic & Applied Acarology Society, 2002).
6. Flechtmann, C. H. W. & Flechtmann, C. A. H. Reproduction and chromosomes in the broad mite, Polyphagotarsonemus latus (Banks, 1904) (Acari, Prostigmata, Tarsonemidae). Acarol. VI 1, 455–456 (1984).
7. Rai, A. B., Satpathy, S., Gracy, R. G., Swamy, T. M. S. & Rai, M. Yellow mite (Polyphagotarsonemus latus Banks) menace in chilli crop. Veg. Sci. 34, 1–13 (2007).
8. Luypaert, G. et al. Temperature-dependent development of the broad mite Polyphagotarsonemus latus (Acari: Tarsonemidae) on Rhododendron simsii. Exp. Appl. Acarol. 63, 389–400 (2014).
9. Ovando-Garay, V., González-Gómez, R., Zarza, E., Castillo-Vera, A. & de Coss-Flores, M. E. Morphological and genetic characterization of the broad mite Polyphagotarsonemus latus Banks (Acari: Tarsonemidae) from two Mexican populations. PLoS One 17, e0266335 (2022).
10. Gregory, T. R. & Young, M. R. Small genomes in most mites (but not ticks). Int. J. Acarol. 46, 1–8 (2020).
11. Van Leeuwen, T. & Dermauw, W. The molecular evolution of xenobiotic metabolism and resistance in chelicerate mites. Annu. Rev. Entomol. 61, 475–498 (2016).
12. Agwunobi, D. O., Yu, Z. & Liu, J. A retrospective review on ixodid tick resistance against synthetic acaricides: implications and perspectives for future resistance prevention and mitigation. Pestic. Biochem. Physiol. 173, 104776 (2021).
13. De Rouck, S., İnak, E., Dermauw, W. & Van Leeuwen, T. A review of the molecular mechanisms of acaricide resistance in mites and ticks. Insect Biochem. Mol. Biol. 159, 103981 (2023).
14. Augustine, N. et al. Resistance to fenazaquin in broad mite, Polyphagotarsonemus latus (Banks) (Acari: Tarsonemidae): Realized heritability, risk assessment and cross-resistance. J. Appl. Entomol. 148, 279–286 (2024).
15. Augustine, N., Venkatasan, T., Upasana, S. & Mohan, M. Acaricide resistance among broad mite (Polyphagotarsonemus latus (Banks)) populations in Karnataka, India. Curr. Sci. 124, 1462–1468 (2023).
16. Grbić, M. et al. The genome of Tetranychus urticae reveals herbivorous pest adaptations. Nature 479, 487–492 (2011).
17. Chen, L. et al. The genome sequence of a spider mite, Tetranychus truncatus, provides insights into interspecific host range variation and the genetic basis of adaptation to a low-quality host plant. Insect Sci. 30, 1208–1228 (2023).
18. Yu, S. et al. Whole genome sequencing and bulked segregant analysis suggest a new mechanism of amitraz resistance in the citrus red mite, Panonychus citri (Acari: Tetranychidae). Pest Manag. Sci. 77, 5032–5048 (2021).
19. Greenhalgh, R. et al. Genome streamlining in a minute herbivore that manipulates its host plant. Elife 9, e56689 (2020).
20. Klimov, P. B. et al. Symbiotic bacteria of the gall-inducing mite Fragariocoptes setiger (Eriophyoidea) and phylogenomic resolution of the eriophyoid position among Acari. Sci. Rep. 12, 3811 (2022).
21. Thia, J. A. et al. The redlegged earth mite draft genome provides new insights into pesticide resistance evolution and demography in its invasive Australian range. J. Evol. Biol. 36, 381–398 (2023).
22. Hunt, G. J. Insect DNA extraction protocol. in Fingerprinting Methods based on Arbitrarily Primed PCR (eds. Micheli, M.R., Bova, R.) pp. 21–24 (Springer Lab Manuals. Springer, Berlin, Heidelberg, 1997).
23. Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 1–13 (2019).
24. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
25. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
26. Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).
27. Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl. Acad. Sci. 117, 9451–9457 (2020).
28. Lomsadze, A., Bonny, C., Strozzi, F. & Borodovsky, M. GeneMark-HM: improving gene prediction in DNA sequences of human microbiome. NAR genom. Bioinform. 3, lqab047 (2021).
29. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
30. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
31. Huerta-Cepas, J. et al. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 47, D309–D314 (2019).
32. Pruitt, K. D., Tatusova, T. & Maglott, D. R. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 33, D501–D504 (2005).
33. Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
34. Paysan-Lafosse, T. et al. InterPro in 2022. Nucleic Acids Res. 51, D418–D427 (2023).
35. Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
36. McGinnis, S. & Madden, T. L. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res. 32, W20–W25 (2004).
37. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
38. Du, L., Zhang, C., Liu, Q., Zhang, X. & Yue, B. Krait: an ultrafast tool for genome-wide survey of microsatellites and primer design. Bioinformatics 34, 681–683 (2018).
39. Sun, J. et al. OrthoVenn3: an integrated platform for exploring and visualizing orthologous data across genomes. Nucleic Acids Res. 51, W397–W403 (2023).
40. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
41. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS One 5, e9490 (2010).
42. Mendes, F. K., Vanderpool, D., Fulton, B. & Hahn, M. W. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36, 5516–5518 (2020).
43. Li, L., Stoeckert, C. J. & Roos, D. S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13, 2178–2189 (2003).
44. Mohan, M. et al. Genbank https://identifiers.org/insdc.gca:GCA_040055235.1 (2024).
45. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP428391 (2024).
46. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP400343 (2023).
47. Mohan, M. et al. Dataset of P. latus genome, https://doi.org/10.6084/m9.figshare.25825984.v1 (2019).
48. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

Acknowledgements

The authors extend their gratitude to the Indian Council of Agricultural Research (ICAR) for generous funding under the Consortia Research Platform (CRP) on Genomics project. The authors thank Nucleome Informatics Pvt Ltd., Hyderabad, for the sequencing service, and Bionivid Technology Pvt. Ltd., Bengaluru, for supporting the bioinformatics analysis by providing a dedicated high-performance computing facility. The infrastructure facilities provided by the Director, ICAR-National Bureau of Agricultural Insect Resources, Bengaluru, are gratefully acknowledged.

Author information

Neenu Augustine
Present address: School of Agricultural Innovations and Advanced Learning (VAIAL), Vellore Institute of Technology, Tamil Nadu, 632014, India

Authors and Affiliations

Division of Genomic Resources, ICAR- National Bureau of Agricultural Insect Resources, Hebbal, Bengaluru, 560024, India
Muthugounder Mohan, Neenu Augustine, Selva Babu Selvamani, Aneesha P. J., Upasna Selvapandian, Jyoti Pathak, Gandhi Gracy R., Venkatesan Thiruvengadam & Sushil S. N.

Contributions

M.M. contributed to the conception and design of various experiments and interpretation of data (comparative genomics), drafting of the manuscript, and fund acquisition. N.A. contributed to the rearing and maintenance of iso-female colonies of P. latus, the preparation of mite samples for genome and transcriptome sequencing, and the writing and drafting of the manuscript. S.B.S. contributed to genome composition estimation, gene prediction, gene annotation, RNA-Seq mapping, and drafting of the manuscript. A.P.J. contributed to gene annotation, genome size estimation, and drafting of the manuscript. U.S. contributed to genome assembly, bioinformatics analysis, and drafting of the manuscript. J.P. and G.R.G. contributed to the drafting of the manuscript. V.T. contributed to the DNA barcoding of mite samples and proofreading of the manuscript. S.N.S. contributed to the population dynamics of P. latus.

Corresponding author

Correspondence to Muthugounder Mohan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Table S1, Table S2, Table S3, Table S4, Table S5, Table S6, Table S7

Fig. S1, Fig. S2, Fig. S3, Fig. S4, Fig. S5

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Mohan, M., Augustine, N., Selvamani, S.B. et al. The miniature genome of broad mite, Polyphagotarsonemus latus (Tarsonemidae: Acari). Sci Data 11, 748 (2024). https://doi.org/10.1038/s41597-024-03579-4

Received: 27 December 2023
Accepted: 27 June 2024
Published: 09 July 2024
DOI: https://doi.org/10.1038/s41597-024-03579-4
Springer Nature Limited
