Introduction

Solar salterns represent an extreme environment containing up to 5 M sodium chloride in addition to high ultraviolet (UV) radiation and low oxygen concentration (Jones and Baxter 2017; Chung et al. 2020). These habitats are enriched with heavy metals as a consequence of water evaporation and various anthropogenic activities like urbanization and industrialization (Voica et al. 2016).

Halophilic archaea, the predominant microflora of these hypersaline environments, encounter metals in their natural environment and use some of them, such as iron, cobalt, copper, manganese, molybdenum, and zinc, as trace elements for various key physiological functions. Other heavy metals, such as mercury, aluminum, cadmium, lead, and arsenic, are non-essential and toxic (Nies 1999). At high levels, both essential and non-essential metals can be toxic, and cells require adaptation strategies to overcome metal toxicity (Coombs and Barkay 2005; Kaur et al. 2006; Srivastava and Kowshik 2013). Thus, the concentrations of metals within cells are stringently controlled to maintain metal ion homeostasis (Srivastava and Kowshik 2013). The most common mechanisms of heavy metal resistance include regulation of metal transport across membranes (influx/efflux) by specific transporters, intra- or extracellular metal sequestration by biopolymers, enzymatic detoxification, and biotransformation into less toxic forms (Nies 1999; Kaur et al. 2006; Voica et al. 2016).

For many haloarchaea, resistance and homeostasis are achieved by a combination of two or more of these basic mechanisms (Nies 1999). Haloarchaea have also developed other molecular adaptation mechanisms for protecting cells against abiotic stressors, such as synthesis of stress proteins (Matarredona et al. 2020), DNA repair mechanisms (Gaba et al. 2020), synthesis of biopolymers (exopolysaccharide) (Shukla et al. 2017), or hyperpigmentation (Thombre et al. 2016; Giani and Espinosa 2020). Haloarchaea’s metabolic capabilities to tolerate salt and metals are of great importance, yet few assays have been carried out to study their natural susceptibility to heavy metals (Matarredona et al. 2021a). Compared with expensive physico-chemical techniques, the resistance mechanisms of haloarchaea to harmful heavy metals could be a promising, cheap alternative for the bioremediation of heavy metal-contaminated environments (Nies 1999; Almeida et al. 2009; Hrynkiewicz and Baum 2014; Kaushik et al. 2021).

These halophilic archaea, which are able to survive and thrive under exposure to a wide range of extreme environmental conditions, have gained increasing attention in biotechnological applications due to their unique metabolic capabilities, production of stable enzymes under extremely hostile conditions, and unique biomaterials and/or secondary metabolites (Shukla et al. 2017; Matarredona et al. 2021a; Pfeifer et al. 2021).

Adaptation to such extreme environments has made the genomes of halophilic archaea rich in multiple essential genes that are absent in other microorganisms (Papke et al. 2015). Genomic approaches have recently been developed for many halophilic archaea, opening the way toward an understanding of adaptation to harsh environments at the molecular level (Lim et al. 2016; Pfeiffer et al. 2020; Gaba et al. 2020; Lee et al. 2021).

An increase in the metal concentration in the Sfax solar saltern leads to accumulation of these pollutants in the sediment (Bahloul et al. 2018). Halophilic microorganisms colonizing this environment are able to survive not only under salt stress but also under metal stress. Previous work on the heavy metal tolerance of the prokaryotic flora in the superficial sediments of the Sfax solar saltern ponds revealed that the archaeal strains from the most contaminated ponds were tolerant to high concentrations of lead, cadmium, and nickel, from 2.5 to 4.5 mM (Baati et al. 2020a). The strains are related to Halobacterium salinarum NRC-1, with 99.7–100% 16S rRNA gene sequence similarity. The effect of heavy metal ions on the growth kinetics of strain AS1, selected for its high resistance to heavy metals, especially cadmium and lead, has recently been studied. This strain was able to tolerate high concentrations of lead, cadmium, and nickel in liquid medium (Baati et al. 2020b). However, to date, no information exists on the genomic basis of the survival of halophilic archaea under variable environmental conditions. This study reports the first genome analysis of these strains from the Sfax solar saltern. The whole genomes of the highly heavy metal-resistant strains AS1, AS2, AS8, AS11, and AS19 were sequenced to provide information on genes that confer tolerance to high salinity and high heavy metal concentrations, and to understand the main molecular mechanisms used by these microorganisms.

Materials and methods

Isolation and identification of strains

The archaeal strains used in this study were isolated from sediment samples collected aseptically during May 2017 from the Sfax solar saltern (Tunisia) as described previously (Baati et al. 2020a). Phylogenetic characterization of the strains by sequencing of PCR-amplified 16S rRNA genes showed their affiliation to Halobacterium salinarum. The obtained sequences were submitted to the EMBL/GenBank databases under accession numbers MT332425, MT741679, MT738366, MT738365, and MT738367 for the strains AS1, AS2, AS8, AS11, and AS19, respectively (Baati et al. 2020b).

Genomic DNA extraction and whole genome sequencing

The strains were cultivated for 10 days at 37 °C in DSC-97 medium, as described previously (Baati et al. 2020b), to obtain cells for genomic DNA extraction. The cells were harvested by centrifugation (8000 × g for 30 min), and the genomic DNA of each strain was extracted using a genomic DNA purification kit (NucleoSpin Tissue Kit, Macherey-Nagel) according to the manufacturer’s protocol. After DNA extraction, an ethanol precipitation was used to further clean the DNA. Genomic DNA of the strains was prepared for sequencing using the Nextera XT library preparation kit (Illumina Inc). The libraries were sequenced using a MiSeq V3 (2 × 300 bp) kit at the National Center for Agricultural Utilization Research, in Peoria, IL, USA. The sequences were quality trimmed (Q30) and assembled using CLC Genomics Workbench 20.0 (Qiagen Inc). The genomes were checked for completeness and contamination using CheckM (Parks et al. 2015). Genomic average nucleotide identity and aligned percentage were calculated using CLC Genomics Workbench 21.0, based on greater than 70% overlap and 70% nucleotide identity.
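The summary figures reported for each draft assembly in this study (number of contigs, total length, and G + C content) can be recomputed from a contig FASTA file. The following is a minimal sketch in Python and not the procedure used by CLC Genomics Workbench; the function names and the toy sequences are hypothetical, and excluding ambiguous bases (N) from the G + C denominator is an assumption.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_summary(handle):
    """Return contig count, total length (bp) and G + C percentage."""
    contigs = [seq for _, seq in read_fasta(handle)]
    total_bp = sum(len(seq) for seq in contigs)
    gc = sum(seq.count("G") + seq.count("C") for seq in contigs)
    # Assumption: only unambiguous bases count toward the G + C denominator.
    acgt = sum(seq.count(base) for seq in contigs for base in "ACGT")
    gc_percent = 100 * gc / acgt if acgt else 0.0
    return {"contigs": len(contigs), "total_bp": total_bp, "gc_percent": gc_percent}


# Toy two-contig assembly (hypothetical sequences, for illustration only).
toy_assembly = StringIO(">contig_1\nGGCC\nAT\n>contig_2\nGCGCNN\n")
summary = assembly_summary(toy_assembly)
print(summary)
```

For a real assembly, the `StringIO` object would be replaced by an open file handle on the contig FASTA exported from the assembler.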

Genome annotation and analysis

The genes in the assembled sequences were then predicted and annotated automatically using the RAST webserver (Rapid Annotation using Subsystem Technology) to predict protein-coding open reading frames, tRNAs, and structural RNA genes (Aziz et al. 2008). Functional classification analysis for the SEED subsystem was performed on genome data using the RAST server SEED viewer (Overbeek et al. 2014). RAST uses a strategy based on comparison with manually curated subsystems and subsystem-based protein families, guaranteeing a high degree of consistency and accuracy (Gudhka et al. 2015). In addition, the genomes were annotated using the NCBI prokaryotic genome annotation pipeline (PGAP) (Tatusova et al. 2016). The annotations were also compared with a manually curated genome of Halobacterium salinarum NRC-1 based on experimentally characterized homologs (Pfeiffer and Oesterhelt 2015). To compare the annotations, we used our most complete genome, AS11 (JAHTKX000000000), and performed a whole genome alignment with H. salinarum NRC-1 (AM774415, AM774416 and AM774417) using CLC Genomics Workbench 21.0 (Qiagen Inc). In areas that aligned, the annotations were transferred to the AS11 genome with the appropriate coordinates. The three sets of annotations were put in sequential order for manual evaluation. For the remaining four genomes, comparisons are included for the RAST and PGAP annotations.

Nucleotide sequence accession number

The genome sequences of the five archaeal strains (AS1, AS2, AS8, AS11, and AS19) were deposited at DDBJ/ENA/GenBank under the accessions JAHCLW000000000, JAHTKV000000000, JAHTKW000000000, JAHTKX000000000 and JAHTKY000000000, respectively.

Results and discussion

General genome features of the five archaeal strains

The draft genomes of the five haloarchaeal strains range between 2,060,688 and 2,467,461 bp. The estimated sequencing coverage depth ranged from 109- to 551-fold. All five genomes were confirmed to be > 98% complete and contained < 1% contamination (Table 1). The taxonomy of the strains was confirmed using average nucleotide identity, with all five strains sharing greater than 98.9% nucleotide identity with the type strain of H. salinarum NRC-1 (Table S1, Figure S1). The 2,060,688 bp genome of the archaeal strain AS1 (JAHCLW000000000) is assembled in a total of 558 contigs and has a G + C content of 65.5%. Annotation of this genome using RAST with the SEED database revealed 2803 coding sequences; 1315 of these are annotated as hypothetical proteins (46.91%). The genome of strain AS2 (JAHTKV000000000), containing 2,467,461 bp, is assembled in 195 contigs and has a G + C content of 66%. The genome encodes 2836 coding sequences, of which 1305 are annotated as hypothetical proteins (46.01%). The 2,236,624 bp genome of strain AS8 (JAHTKW000000000) has a G + C content of 67.0% and is assembled in 79 contigs. The genome contains 2476 coding sequences; 1082 sequences are annotated as hypothetical proteins (43.69%). The genome sizes of the two strains AS11 (JAHTKX000000000) and AS19 (JAHTKY000000000) are comparable. They have the same G + C content (66.2%) and r + tRNA count (50). The genomes are assembled in 48 and 67 contigs, respectively. The genome annotation results in a total of 2695 and 2710 coding sequences, of which 1500 and 1423 (55.65 and 52.50%) have hypothetical functions. According to Gaba et al. (2020), the abundance of hypothetical proteins in haloarchaeal genomes reveals that there are many genes whose function is still unknown.
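As a quick consistency check, the hypothetical-protein percentages can be recomputed from the coding sequence counts reported above. The sketch below is only illustrative: the function name is hypothetical, and truncating (rather than rounding) to two decimals is an assumption made here because it is what reproduces the reported figures.

```python
# Coding sequence (CDS) and hypothetical-protein counts as reported in the text:
# strain -> (total CDS, CDS annotated as hypothetical proteins)
COUNTS = {
    "AS1": (2803, 1315),
    "AS2": (2836, 1305),
    "AS8": (2476, 1082),
    "AS11": (2695, 1500),
    "AS19": (2710, 1423),
}


def hypothetical_percent(total_cds, hypothetical):
    """Percentage of hypothetical proteins, truncated to two decimals."""
    # Integer arithmetic keeps the truncation exact (no floating-point drift).
    hundredths = (hypothetical * 10_000) // total_cds
    return hundredths / 100


percentages = {
    strain: hypothetical_percent(total, hypo)
    for strain, (total, hypo) in COUNTS.items()
}

for strain, pct in percentages.items():
    print(f"{strain}: {pct:.2f}% hypothetical proteins")
```

With conventional rounding, four of the five values would be 0.01 higher (for example 46.02% instead of 46.01% for AS2), which does not affect the comparison between strains.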

Table 1 Genome features of the 5 halophilic archaeal strains and other Halobacterium sp.

The RAST annotation categorizes these genes into 159, 170, and 167 subsystems for AS1, AS2, and AS8 respectively. The same number of subsystems (168) is found for AS11 and AS19. The subsystem categories with the highest number of coding sequences are: protein metabolism (106–151), amino acids and derivatives (113–122), DNA metabolism (59–77), respiration (48–56), cofactors, vitamins, prosthetic groups, pigments (44–60), and several others (Fig. 1).

Fig. 1 Subsystem category distribution of the studied haloarchaeal strains (a): AS1; (b): AS2; (c): AS8; (d): AS11; (e): AS19 and (f): Halobacterium sp. NRC-1 using Rapid Annotation using Subsystem Technology (RAST)

Comparative genomic features between the studied archaeal strains and other Halobacterium species are shown in Table 1. The genome sizes are slightly smaller than those of other Halobacterium species previously sequenced (Ng et al. 2000; Lim et al. 2016; Pfeiffer et al. 2020). The r + tRNA numbers vary from 46 to 50 in the 5 studied archaeal genomes, while they range from 47 to 53 for the other Halobacterium species. On the other hand, the G + C content of the 5 archaeal genomes varies between 65.5 and 67%, which is similar to that of the other Halobacterium species. Ng et al. (2000) identified 2,682 likely genes in the genome of the well-characterized Halobacterium sp. NRC-1, of which only 591 genes were hypothetical proteins. With regard to the subsystem distribution of its coding sequences, the largest numbers of sequences are found in “cofactors, vitamins, prosthetic groups, pigments” (195), “protein metabolism” (181), “amino acids and derivatives” (164), and “carbohydrates” (112) (Fig. 1f).

Because this study is dependent on the annotation of the genomes, we performed a comparison of annotations. We compared the annotations by RAST, NCBI’s PGAP, and the manually curated genome of Halobacterium salinarum NRC-1 based on experimentally characterized homologs (Pfeiffer and Oesterhelt 2015). The results show the differences between the three types of annotations (Table S3, AS11). The most common difference (198 occurrences) is RAST annotating an ORF as a “hypothetical protein” while PGAP annotated it, usually at the domain or family level (e.g., “MGMT family protein”, “DUF2103 domain-containing protein”). The second most common difference (185 occurrences) was RAST calling unique ORFs and labeling them as “hypothetical proteins”; typically these are small or partial ORFs near the ends of contigs. In general, the PGAP annotations and the manual annotations were very close to each other. For this study, the differences in annotation do not represent a significant issue, since the proteins annotated at the family or domain level are not usually interpreted at the functional level and the comparisons are all based on the same annotations. The presence of additional “hypothetical proteins” does not impact the comparisons.
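The kind of tally described above can be sketched as follows. This is an illustrative Python example only, not the procedure used in the study, which relied on manual evaluation of aligned annotations: the category labels, the function name, and the toy ORF pairs are hypothetical, and the sketch assumes the ORFs of the two pipelines have already been paired by coordinates.

```python
from collections import Counter

HYPOTHETICAL = "hypothetical protein"


def classify(rast_product, pgap_product):
    """Classify one locus by how its RAST and PGAP products relate.

    None means the pipeline did not call an ORF at this locus.
    """
    rast = rast_product.strip().lower() if rast_product else None
    pgap = pgap_product.strip().lower() if pgap_product else None
    if rast is None and pgap is None:
        raise ValueError("at least one pipeline must call an ORF")
    if pgap is None:
        if rast == HYPOTHETICAL:
            return "RAST-only ORF, hypothetical"
        return "RAST-only ORF, annotated"
    if rast is None:
        return "PGAP-only ORF"
    if rast == HYPOTHETICAL and pgap != HYPOTHETICAL:
        return "RAST hypothetical, PGAP annotated"
    if rast == pgap:
        return "same product"
    return "different product"


# Toy loci (hypothetical pairs; the two PGAP family/domain names are the
# examples quoted in the text).
toy_pairs = [
    ("hypothetical protein", "MGMT family protein"),
    ("hypothetical protein", "DUF2103 domain-containing protein"),
    ("hypothetical protein", None),
    ("Universal stress protein", "universal stress protein"),
    ("hypothetical protein", "hypothetical protein"),
]

tally = Counter(classify(rast, pgap) for rast, pgap in toy_pairs)
for category, count in tally.most_common():
    print(f"{count:3d}  {category}")
```

Exact string matching is deliberately strict here; real product names from the two pipelines often differ in wording, so the "different product" category would still need manual review.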

Genes related to osmotic stress adaptation

Analysis of the genomes of the 5 archaeal strains, based on RAST annotation with the SEED database, revealed various genes responsible for resistance to osmotic stress (Table 2). Many genes involved in potassium transport, such as those for the Trk potassium uptake system proteins (TrkA and TrkH) and Kef-type K+ transport proteins, were identified. All the strains have 5 copies of TrkA genes except for AS2, which has 6 copies. AS1 and AS2 have 5 and 6 copies of TrkH genes, respectively, whereas the other strains have 3 copies. In response to changes in the external osmolarity, potassium is accumulated in the cytoplasm by transport via the Trk system potassium uptake protein (Gorriti et al. 2014). Furthermore, 4 copies of Kef-type K+ transport proteins were detected within the genomes of all strains. AS2, AS8, AS11 and AS19 each had 2 copies of potassium channel proteins and 3 copies of potassium-transporting ATPase (A, B and C chain), while AS1 had 3 copies of potassium channel proteins and 4 copies of potassium-transporting ATPase. All the strains have 1 gene encoding chloride channels except for AS1, which has 2. Moreover, genome analysis confirms the presence of a gene encoding a Na+/H+ antiporter (one gene for each strain) related to sodium efflux. The Na+/H+ antiporters keep the cytoplasm isosmotic with the environment by extruding sodium out of cells in exchange for hydrogen ions (Yang et al. 2006; Cimerman et al. 2018). Furthermore, the genomes contain Na+/H+ antiporter NhaC (one gene for each strain) and Na+/H+ antiporter subunits A, B, C, D, E, F, and G (nine genes for each strain) (Table 2).
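The copy numbers listed in this paragraph are easier to compare when tabulated per strain. The short Python sketch below transcribes only the counts stated in the text and flags, for each gene family, the strains that differ from the most common copy number; the function name and the use of the modal value as a reference are illustrative choices, not part of the original analysis.

```python
from collections import Counter

STRAINS = ("AS1", "AS2", "AS8", "AS11", "AS19")

# Copy numbers transcribed from the text, listed in STRAINS order.
COPIES = {
    "TrkA": (5, 6, 5, 5, 5),
    "TrkH": (5, 6, 3, 3, 3),
    "Kef-type K+ transport protein": (4, 4, 4, 4, 4),
    "potassium channel protein": (3, 2, 2, 2, 2),
    "potassium-transporting ATPase": (4, 3, 3, 3, 3),
    "chloride channel": (2, 1, 1, 1, 1),
    "Na+/H+ antiporter NhaC": (1, 1, 1, 1, 1),
}


def deviating_strains(counts):
    """Return the modal copy number and the strains that differ from it."""
    modal, _ = Counter(counts).most_common(1)[0]
    deviants = {s: c for s, c in zip(STRAINS, counts) if c != modal}
    return modal, deviants


for family, counts in COPIES.items():
    modal, deviants = deviating_strains(counts)
    note = ", ".join(f"{s}={c}" for s, c in deviants.items()) or "none"
    print(f"{family}: modal copies = {modal}; deviating strains: {note}")
```

Laid out this way, AS1 and AS2 stand out as the strains carrying additional copies in most of the listed families.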

Table 2 Genes involved in salinity adaptation for the 5 archaeal strains

The presence of potassium-related genes and chloride channels indicates that the studied strains accumulate K+ and Cl− passively through K+ and Cl− channels in the membrane and monitor potassium homeostasis in the cytoplasm, although active ATP-dependent K+ transport systems are also present. Most halophilic archaea are known to use the classical “salt-in” adaptation strategy (Thombre et al. 2016; Cimerman et al. 2018). Most organisms adopting this mechanism utilize Na+/H+ antiporters and ATPase-dependent ion transporters for the stable maintenance of the sodium gradient across the cell (Thombre et al. 2016). Choline sulfatase, necessary to convert choline sulfate into choline (Osteras et al. 1998), is detected in AS11 and AS19. The presence of this gene suggests that these two strains can use the “salt-in” strategy as well as the “salt-out” strategy (Roberts 2005), the latter through the biosynthesis of compatible solutes, to resist the osmotic stress in Sfax solar saltern sediments.

All genomes appear to have 2 copies of mechanosensitive ion channel genes that could function as osmosensors and ‘safety-valves’ at times when other systems fail to guard cells from certain death (Martinac 2004). Carbamoyl-phosphate synthase is found in all strains (4 genes for AS1 and 2 genes for the others). Coker et al. (2007) showed carbamoyl-phosphate synthase, which is also detected in Halobacterium sp. NRC-1, is up-regulated under high salt conditions and down-regulated under low salt conditions. Many kinds of protein kinases are found in the genome of the archaeal strains (Table 2). Protein kinases are believed to be involved in the salt stress response cascade in eukaryotic cells as well as some prokaryotic species (Shoumskaya et al. 2005).

Based on the annotation results, the studied genomes encode universal stress proteins (USPs) (1 gene for each strain). These proteins are induced under heat/cold shock, nutrient starvation, oxidative stress, and heavy metal toxicity. The primary function of universal stress proteins is protection against environmental stresses (Thombre et al. 2016). USPs are emerging as important players in stress resistance (Matarredona et al. 2020). Multiple genes coding for different classes of heat-shock proteins (Hsp), which may help the strains cope with temperature variations, were identified (5 genes for AS8 and 4 genes for the others) (Table 2). Hsps are another important group of proteins expressed in response to environmental stimuli, including abiotic and biotic stressors (Maleki et al. 2016). In addition, all strains harbor chaperone proteins (DnaK (Hsp70), DnaJ (Hsp40) and GrpE) (2 genes for AS1 and 3 genes for the others), cold-shock proteins (2 genes for each strain), and a small heat-shock protein (1 gene for each strain). Multiple chaperone proteins (Becker et al. 2014) and cold-shock genes (Coker et al. 2007) are found in the genomes of several haloarchaea exposed to different environmental stresses. Chaperone proteins play an important role in the maintenance of cellular proteins at the physiological level (Matarredona et al. 2020). Small heat-shock proteins (around 30 kDa or less) are also involved in salt stress responses in haloarchaea (Macario et al. 1999). The genes encoding these stress proteins are helpful for enduring extreme stress conditions and have widespread industrial applications (Thombre and Oke 2015; Lee et al. 2021; Matarredona et al. 2020). Other stress proteins, such as DNA repair proteins (DNA repair and DNA mismatch repair) and proteasomal components, are used to protect archaea against external environmental stresses. The proteasome, a multi-subunit protease complex synthesized in cells, plays a fundamental role in major stress conditions. It is considered to be the main molecular machine for regulating the degradation of intracellular proteins (Thombre et al. 2016; Matarredona et al. 2020).

To protect the stability of their genomes from potential damage, the strains have different molecular processes involved in DNA damage repair, such as DNA protection during starvation protein (1 gene for each strain), DNA damage-inducible protein (3 genes for each strain), DNA repair and recombination proteins RadA and RadB (2 genes for each strain), and Rad3, Rad25, and Rad50 (9 for AS1, 8 for AS2, 5 for AS11, and 4 for AS8 and AS19). For the mismatch repair system, genes for MutS (6 genes for AS1, 5 for AS2, 3 for AS8 and 2 for AS11 and AS19), MutL (3 genes for AS2 and 1 for the others), and other DNA mismatch repair proteins (2 genes for AS2 and AS8, 1 gene for the other strains) were also identified (Table 2). According to Gaba et al. (2020), DNA mismatch repair proteins possibly help halophilic archaea to repair their DNA and enhance their longevity. Halophilic archaea use a great number of DNA repair proteins to protect against external environmental stresses (Matarredona et al. 2020).

The genomes of the studied strains encode dodecin (2 genes for AS1 and 1 gene for the other strains) (Table 2). Dodecin is a flavin storage/sequestration protein probably involved in protecting the cell against changing environmental conditions. Archaeal dodecins, only found in the phylum Haloarchaea, reflect the adaptation of these species to hypersaline conditions (Grininger et al. 2009). In addition, the genome sequences of the strains in this study encode many origin of replication genes (origin recognition complex/cell division cycle 6) (9 for AS8; 12 for AS1 and AS2; 14 for AS11 and 16 for AS19). According to Dsouza et al. (1997), haloarchaea contain multiple dormant origins of replication that can be activated as a cellular and molecular response to environmental stresses that result in DNA replicative stress.

Moreover, many transposases were found (6 for AS8, 7 for AS1 and AS2, 12 for AS11 and AS19), suggesting a large number of insertion sequences in the genomes. A gene cluster with 14 gvp genes (gvpMLKJIHGFEDACNO) for gas vesicles and a gene encoding the S-layer are identified in the genome sequences (Table S2). The function of gas vesicles, hollow proteinaceous structures surrounding a gas-filled space, is to enable the cells to float to the more oxygenated surface layers (DasSarma 2004; Coker et al. 2007; DasSarma and DasSarma 2015). The S-layer is involved in the maintenance of haloarchaeal cellular morphology and in cell division, and influences the cells’ resistance to osmotic stress (Siddaramappa et al. 2012).
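The compact cluster notation gvpMLKJIHGFEDACNO can be expanded into individual gene names to confirm that it denotes 14 genes. A minimal Python sketch follows; the helper name is hypothetical, and it assumes a three-letter gene prefix followed by single-letter suffixes.

```python
def expand_cluster(shorthand, prefix_len=3):
    """Expand a shorthand such as 'gvpMLK' into ['gvpM', 'gvpL', 'gvpK']."""
    prefix, suffixes = shorthand[:prefix_len], shorthand[prefix_len:]
    return [prefix + suffix for suffix in suffixes]


gvp_genes = expand_cluster("gvpMLKJIHGFEDACNO")
print(len(gvp_genes), "genes:", ", ".join(gvp_genes))
```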

AS1 and AS2 are found to possess CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR associated proteins (Cas1 and Cas2). The CRISPR spacer system is an adaptive microbial immune system against foreign genetic elements in complex environments (Wang et al. 2019). Previous work noted the absence of CRISPRs or Cas genes in Halobacterium salinarum R1 and Halobacterium NRC-1 (Ng et al. 2000; Lynch et al. 2012) and the presence of 2 confirmed CRISPRs in Halobacterium sp. DL1 and 9 putative CRISPRs in Halobacterium noricense CBA1132 (Lim et al. 2016).

Genes related to heavy metal resistance

Results from the annotation by RAST reveal the presence of numerous genes in the genome of the studied strains predicted to be involved in heavy-metal resistance, transport, and detoxification (Table 3).

Table 3 Genes involved in heavy metal resistance for the 5 archaeal strains

Membrane transporters

Genes coding for copper-translocating P-type ATPases were identified (8 for AS1, 7 for AS2, 5 for AS8, and 4 for AS11 and AS19) (Table 3). P-type ATPases, a large family of transmembrane transporters, are known for their role in ion homeostasis and biotolerance of heavy-metal ions, such as Pb2+, Cd2+, Zn2+, Cu2+, Mn2+, Ag+, and Hg2+ (Nies 1999). Nies (2003) showed that P-type ATPases pump metal ions out of the cytoplasm into the periplasm, a mechanism that requires ATP hydrolysis. P-type ATPases were described as key factors in the response to heavy metals in the model organism Halobacterium NRC-1, which has 2 copies of the gene (Kaur et al. 2006). AS1 had 2 genes coding for cobalt–zinc–cadmium resistance protein (czcD), whereas the 4 other strains had only 1 gene. CzcD, an important member of the CDF (Cation Diffusion Facilitator) protein family, is responsible for the efflux of cobalt, zinc and cadmium (Nies 1999).

Several genes related to diverse ATP-binding cassette (ABC) transporters are identified in the genome sequences. The gene number varies among the studied strains (73–83). The highest number of ABC transporters was found in AS1 (Table 3). Genes related to ABC transporters for phosphate (PstABCS), dipeptide (DppABDF), oligopeptide (OppABD), zinc (ZnuABC), cobalt (CbiMNQO), molybdenum (ModB), and ferric iron were detected in the genomes, suggesting that these strains are able to resist metal stress and can survive in heavy metal environments. The number of genes with a role in the transport of metal ions is highest in AS1 compared to the other strains (Table 3). Besides the transport of specific substrates, some ABC proteins contribute to the high level of resistance against metals by pumping metal ions out of the cells (Nies 2003; Srivastava and Kowshik 2013; Gaba et al. 2020). According to Tame et al. (1994), ABC transporter proteins, such as phosphate, oligopeptide, and dipeptide transporters, were found to be differentially regulated by more than one metal ion. Rodionov et al. (2006) showed through metal accumulation assays the activity of CbiMNQO as a cobalt transporter. Völkel et al. (2018) indicated, through transcriptional analysis, that the Pst ABC transporter may have a metal import function. They also showed that the Dpp ABC transporters are involved in the peptides/nickel transport system pathway and might play a crucial role in the observed metal-specific biofilm formation in Halobacterium salinarum R1, whereas the Znu ABC transporter is not specific only to zinc ions but also transports nickel and copper. Völkel et al. (2020) showed, by a proteomic approach, that the ATP-binding protein ZnuC, a component of the Znu ABC transporter, was increased in planktonic cultures exposed to Ni2+, suggesting a role of these proteins in the transport of Ni2+.

Nickel-responsive regulator NikR (1 gene for each strain) and halocyanin (copper transport and blue copper proteins) (9 genes for AS1, 8 for AS2 and 7 for the others) were also present. Copper (I) chaperone CopZ is only present in AS1 and AS2. In all strains, 3 permease enzymes belonging to the drug/metabolite transporter superfamily were also found. These transporters are presumed to move toxic metabolites out of the cell, as described previously by Gudhka et al. (2015). A small multidrug resistance (SMR) family protein is also present in each strain (1 gene each). SMR proteins are the smallest secondary drug efflux proteins known, consisting of about 110 amino acids (Paulsen et al. 1996).

Oxidative stress defense system

Oxidative stress induced by high salinity and high heavy metal concentrations can damage cells through the generation of reactive oxygen species (ROS). The ROS are usually scavenged by specific enzymatic detoxification systems that reduce the toxic metal species to less damaging forms (Lushchak et al. 2011).

In this study, many genes that may help overcome oxidative stress were identified, including superoxide dismutases (3 genes for AS1 and 2 for the other strains), glutathione S-transferase, multicopper oxidase, and catalase-peroxidase KatG (1 gene of each per strain) (Table 3). Superoxide dismutase is responsible for limiting damage from reactive oxygen species, which are induced by extreme conditions, such as oxidative stress and excess irradiance, in many species (Cheeseman et al. 1997). Glutathione S-transferase is a detoxification enzyme involved in cell protection from reactive oxygen species (Oakley 2011). Multicopper oxidase protects against copper toxicity. Copper-translocating P-type ATPases and multicopper oxidase act together as an efflux pump in order to pump out excess Cu2+ from the cell (Ladomersky and Petris 2015). The catalase-peroxidase (KatG) gene is also annotated in the genomes of Halobacterium salinarum and Haloferax volcanii (Siddaramappa et al. 2012). Furthermore, many dehydrogenases/oxidoreductases are found in the genomes studied (12 genes for AS2, 10 for AS1, AS11, and AS19, and 9 for AS8). Several previous reports have suggested that oxidoreductases and dehydrogenases contribute to oxidative protection in both bacteria and archaea in response to heavy metals (Kaur et al. 2006; Williams et al. 2007). The strains in this study also encode ion scavenging systems, such as thioredoxin (1 gene for each strain except AS8, which lacks this gene), thioredoxin reductase (3 genes for AS1 and AS2, 2 genes for AS11 and AS19, and 1 gene for AS8), ferredoxin (4 genes for each strain), and glutaredoxin (1 gene for each strain) (Table 3). These oxidative defense proteins protect microorganisms from oxidative damage (Baker et al. 2007; Gorriti et al. 2014; Shigeki et al. 2020).

Additionally, the genome sequences revealed the presence of many other metal resistance-associated genes: arsenite/antimonite pump-driving ATPase (ArsA) (1 gene for each strain except AS11 and AS19, which lack this gene), arsenate reductase (ArsC) (1 gene for each strain), and arsenical resistance operon repressor (ArsR) (2 genes for AS1, AS2, and AS8, and 1 gene for AS11 and AS19). ArsR plays a major role in regulating arsenic resistance. In this detoxification system, the arsenate reductase ArsC reduces arsenate to arsenite, and the toxic arsenite is excreted through the ArsAB efflux pump (Gorriti et al. 2014). On the other hand, arsenic metallochaperone ArsD and arsenite methyltransferase are found only in AS1, AS2 and AS8. Arsenite methyltransferase may be used to transform arsenite to a volatile form (trimethylarsine) (DasSarma 2004). Similar mechanisms of arsenic tolerance have been reported in Halobacterium sp. NRC-1 and Halobacterium noricense CBA1132 (Ng et al. 2000; Lim et al. 2016).

Transcriptional regulation

Another mechanism used by halophilic archaea to deal with environmental variations, different stress conditions, and nutritional changes is transcriptional gene regulation (Matarredona et al. 2020, 2021b). The strains AS1 and AS2 show the highest numbers of genes coding for transcriptional regulators (30 and 29 genes, respectively), whereas the other strains have between 19 and 20. The most abundant families of these transcriptional regulators are AsnC, PadR, ArsR, and TrmB (Table 3). Plaisier et al. (2014) showed that AsnC in Halobacterium NRC-1 is regulated in response to oxidative stress conditions. The PadR family protein, involved in the regulation of phenolic acid metabolism, has also been shown to be involved in the regulation of multidrug pumps (Huillet et al. 2006). In addition to its role in dealing with environmental variations and stress conditions, the ArsR family is involved in the alleviation of heavy metal toxicity (Busenlehner et al. 2003). Becker et al. (2014) showed a wide distribution of transcription regulators through analysis of 23 haloarchaeal genera. These proteins play important roles in regulating physiological responses to shifting environmental parameters in hypersaline environments, including changes in oxygen availability, salinity, and concentrations of heavy metals. TATA-binding protein (TBP), transcription factor B (TFB), and transcription factors E and S are also found in the genomes studied (Table 3). AS11 and AS19 have the highest number of transcription factors (15). TFB is the most abundant factor in all the genomes. TBP and TFB may enable haloarchaeal species to quickly and efficiently modify their transcriptional response to environmental stresses (Becker et al. 2014). The number of transcriptional regulators present in a genome determines the regulatory potential of the organism (Charoensawan et al. 2010). According to Perez-Rueda et al. (2018), diverse stimuli and environmental challenges increase the transcription factor content.

Secondary metabolites

Genome analysis of the strains shows that they can use yet another strategy to overcome stressful conditions, based on secondary metabolites including carotenoids, chelating agents (siderophores), and exopolysaccharide (EPS). Wang et al. (2019) noted that secondary metabolites can act as an alternative defense mechanism for archaea living in extreme environments.

Many genes related to carotenoid biosynthesis were annotated in the genomes studied, as seen in Table S4. Carotenoids, one of the most diverse groups of secondary metabolites, perform many functions, such as photoprotection, stabilization of the cell membrane under stress, and protection from oxidative stress through their antioxidant activity (Thombre et al. 2016; Jones and Baxter 2017; Pfeifer et al. 2021). Recently, Giani and Espinosa (2020) demonstrated that oxidative stress induces modifications in the cytoplasmic membrane of Haloferax mediterranei and increases the level of the most abundant identified carotenoid, bacterioruberin.

Genes encoding retinal proteins, including 2 bacteriorhodopsins (bop gene), 1 halorhodopsin (hop gene), and 2 sensory rhodopsins (sopI, sopII), are found in all the strains. These proteins are extremely important for coping with harsh environments that are poor in nutrients and rich in salt (Lynch et al. 2012). Retinal is produced in sufficient quantities to serve as a chromophore for the opsins used in phototrophy and phototaxis (DasSarma 2004). Although several halophilic archaea can produce bacteriorhodopsin, H. salinarum has become the model organism for the industrialization of bacteriorhodopsin (Pfeifer et al. 2021).

The strains in this study can also use another mechanism for metal tolerance: the secretion of metal-sequestering compounds named siderophores. The genomes contained 6 genes coding for enzymes involved in the biosynthesis of siderophores (Table S2). Siderophores are a class of low molecular weight iron-chelating compounds that are overproduced under conditions of stress or iron deficiency (Neilands 1995). Siderophores are also considered candidates for chelating metals other than iron. They have become a useful tool in the bioremediation of a wide range of metals, such as Cd, Cu, Ni, Pb, Zn, Mn, Co, and Al (Ahmed and Holmström 2014). Hubmacher et al. (2007) reported that Halobacterium salinarum does not produce a siderophore. However, Haloferax volcanii is able to produce a siderophore, and its genome contains 6 genes encoding siderophore biosynthesis proteins (Niessen and Soppa 2020).

The archaeal strains are also able to secrete exopolysaccharides (EPS) (Table S2). EPS are among the most important secondary metabolites and are secreted to the cell surface for growth promotion, protection, and adhesion to solid surfaces. EPS production suggests that the strains could remove heavy metal contamination from high-salt environments by biosorption (Shukla et al. 2017). Research on archaeal EPS has mainly focused on its biological function, and there have not been any attempts to improve EPS yields (Pfeifer et al. 2021).

The strains contain many genes that might be responsible for heavy metal resistance: copper-translocating P-type ATPases, ABC transporters, CDF transporters (cobalt-zinc-cadmium resistance), detoxification enzymes (superoxide dismutases, multicopper oxidase, arsenic resistance genes, catalase-peroxidases, dehydrogenases, oxidoreductase, thioredoxin, and glutaredoxin), transcription regulators, and secondary metabolites (carotenoid, bacteriorhodopsin, siderophore, and exopolysaccharide). The abundance of metal resistance genes in the genome of AS1 suggests it can tolerate different metals, which is consistent with a previous report (Baati et al. 2020b). In addition to the heavy metals already tested (Cd, Pb, Zn, Cu, and Ni), the genome analysis suggests that the strains might be resistant to other heavy metals, such as cobalt (Co), manganese (Mn), iron (Fe), and arsenic (As). Further analysis will be performed to confirm their resistance to these metals.

Genes related to biotechnological applications

The RAST annotation analysis shows the presence of several genes coding for industrially important hydrolytic enzymes in the strains studied, including sulfatase, phosphatase, phosphoesterase, chitinase, and many types of proteases (Table S2), which suggests potential applications in biotechnology. The extracellular enzymes released by microbes adapted to harsh conditions (high temperature, salinity, high levels of UV light, high osmotic pressure, and high concentrations of toxic compounds) are involved in the nutrient cycles of carbon, nitrogen, phosphorus, and sulfur and are able to degrade complex molecules (Chung et al. 2020).
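
As an illustration only, a keyword screen of the kind that can pull such enzyme-coding genes out of an annotation table might be sketched as follows. The record layout (strain, function description), the keyword list, and the sample records are hypothetical; they do not reproduce the RAST output or Table S2.

```python
from collections import Counter

# Enzyme classes named in the text; matching is a case-insensitive substring test.
ENZYME_KEYWORDS = ("sulfatase", "phosphatase", "phosphoesterase", "chitinase", "protease")


def count_enzyme_genes(annotations):
    """Count annotated genes per (strain, enzyme keyword).

    `annotations` is an iterable of (strain, function_description) pairs,
    a hypothetical simplification of an annotation export.
    """
    counts = Counter()
    for strain, function in annotations:
        text = function.lower()
        for keyword in ENZYME_KEYWORDS:
            if keyword in text:
                counts[(strain, keyword)] += 1
    return counts


# Made-up records for demonstration; not data from this study.
example = [
    ("AS1", "Arylsulfatase"),
    ("AS1", "Alkaline phosphatase"),
    ("AS1", "Serine protease, subtilase family"),
    ("AS2", "Chitinase"),
    ("AS2", "DNA gyrase subunit A"),
]
result = count_enzyme_genes(example)
```

A substring screen like this is deliberately crude: it can miss enzymes annotated under other names and can over-count broad terms, so any hit would still need manual confirmation against the annotation itself.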

In addition to their capacity to produce enzymes, the strains studied are able to produce important secondary metabolites (carotenoids, retinal proteins, siderophore, and exopolysaccharide). Exopolysaccharide has various biotechnological applications, such as detoxification of heavy metals and bioremediation of pollutants. It can also be used as an anticoagulant and immunomodulatory agent (Shukla et al. 2017). Many genes encoding industrially important proteins were detected: stress proteins, surface layer (S-layer) proteins, and gas vesicles. The functions of the S-layer are not yet fully understood. Archaeal S-layer proteins could be a product of interest to be extracted from archaeal cultures in the future (Pfeifer et al. 2021). Gas vesicles can be used in vaccine development (DasSarma and DasSarma 2015; Pfeifer et al. 2021) and are being developed as bioengineerable and biocompatible antigen and drug-delivery systems (Andar et al. 2021).

Furthermore, the strains contain genes for the biosynthesis of valuable compounds such as squalene, an unsaturated triterpene intermediate of cholesterol biosynthesis that can influence spatial organization in archaeal membranes. Squalene production by Halobacterium salinarum has been studied previously by Gilmore et al. (2013). Squalene is known to play diverse roles as an antioxidant, anti-cancer agent, antibacterial agent, skin care ingredient, vaccine adjuvant, and drug carrier (Gohil et al. 2019).

Conclusion

This is the first report describing the genome sequences of archaeal strains isolated from the Sfax solar saltern. The genomes possess osmotic stress-related coding sequences (potassium uptake, sodium efflux, and kinases) as well as various genes encoding stress-tolerance proteins (universal stress proteins and cold- and heat-shock proteins), DNA repair systems, and proteasomal components.

The archaeal strains used in this study protect cell homeostasis and avoid heavy metal toxicity through reduced influx and enhanced efflux, enzymatic detoxification, transcription regulators, and three classes of transmembrane transporters: P-type ATPases, CDF transporters (“Cobalt-zinc-cadmium resistance protein”), and ABC transporters. The strains’ tolerance is also attributed to the accumulation of carotenoid pigments within cell membranes. Additionally, tolerance is achieved through metal binding by exopolysaccharide and sequestration by siderophores. The ability of AS1 to survive at high concentrations of metal ions, as previously shown, is attributed to its greater number of heavy metal resistance genes. Therefore, AS1 could be exploited as a bioremediation agent for multi-metal-contaminated environments.

In addition to the important secondary metabolites and proteins that can be produced by the strains (carotenoids, retinal proteins, EPS, squalene, stress proteins, and siderophore), the genome analysis shows that several commercially important enzymes, such as protease, sulfatase, phosphatase, phosphoesterase, and chitinase, are encoded in these strains.