Introduction

Post-translational modifications (TMPs), which are widespread throughout the phylogenetic scale, consist of chemical modifications that occur in proteins catalysed by specific enzymes1. TMPs allow cells to produce rapid responses to changes in the environment. Among the different types described in both prokaryotic and eukaryotic cells is the so-called ADP-ribosylation2,3, which introduces units of ADP-ribose (ADPr) at the expense of NAD+. This reaction is catalysed by a special class of glycosyltransferases, named ADP-ribosyltransferases (ARTs). They were first described in the diphtheria toxin and then in the choleric toxin as a form of interference with important proteins (e.g. elongation factor 2, G proteins, and Rho GTPases), thereby disrupting host cell biosynthetic, regulatory and metabolic pathways as a way of gaining advantage during the infection process4. ARTs can be divided into two main groups based on active site amino acids: the so-called ADP-ribosyl transferases cholera toxin-like (ARTCs) and ADP-ribosyl transferases diphtheria toxin-like (ARTDs). The first group includes GPI-anchored extracellular or secreted enzymes containing an R-S-E (Arg-Ser-Glu) motif, which catalyse the mono-ADP-ribosylation (MARylation) of their substrates5. The remaining group comprises intracellular ADP-ribosyl transferases able to transfer either a single ADP-ribose residue (H-Y-I/L motif) or several ADP-ribose residues (H-Y-E motif), resulting in linear or branched chains of ADP-ribose (poly-ADP-ribosylation or PARylation)6. In the latter group, the invariant Glu (E) is the key catalytic residue that coordinates the transfer of ADP-ribose to the acceptor site, the His (H) forms a hydrogen bond with the N-ribose, and the tyrosine (Y) side chain stacks with the N-ribose and the nicotinamide moiety, thus facilitating the binding of NAD+7. However, when the catalytic glutamate residue is replaced by a small hydrophobic residue in enzymes of the mono-ARTD group (mARTD), a glutamate residue of the substrate is used as the catalytic glutamate, giving rise to a substrate-assisted catalysis to transfer the ADP-ribose moiety. This produces a modified glutamate residue, which is then no longer available for the addition of new ADPr molecules8.

PARylation in mammal cells plays a crucial role in cellular functions, including mitosis, DNA repair and cell death9. Among the seventeen PARP enzymes identified in the human genome10, only Poly(ADP-ribose) polymerase-1 (PARP1 or ARTD1), PARP2, PARP3, PARP4, Tankyrase1 (TNKS1, also known as ARTD5 or PARP5a) and Tankyrase2 (TNKS2, also known as ARTD6 or PARP5b) are capable of catalysing poly-(ADP-ribosyl)ation, whereas PARP10, PARP12, PARP14 and PARP15 are mono-(ADP-ribosyl)transferases10. The remaining members of the family, PARP9 and PARP13, appear to be enzymatically inactive11. Among them, human PARP-1 (hPARP1) is the most abundant and most active protein in the PARP family, being a nuclear chromatin-associated protein11. It is also the best-studied protein in the PARP family since monotherapy with PARP-1 inhibitors selectively kills tumours harbouring deficiencies in BRCA1 and BRCA2 genes, which are involved in homologous recombination DNA repair pathway12. This ‘synthetic lethality’ has attracted clinical attention over the years as more potent and selective inhibitors have been identified. Several clinical trials are currently being conducted with them as a form of ‘personalized’ cancer therapy13.

hPARP1 has a modular architecture comprising six domains14. The N-ter site consists of two zinc finger domains (Zn1 and Zn2) that recognize the damaged DNA ends, and a third zinc finger domain (Zn3) that intervenes in DNA-dependent activation15. There is also a central BRCA C-terminal-like domain (BRCT) that modulates protein-protein interactions and accomplishes PAR self-modification, and a tryptophan-glycine-arginine (WGR) domain that is important for DNA-dependent activation after interaction with DNA15. The last portion of the protein is the catalytic domain, which has an α-helix domain serving in the allosteric regulation (PARP_reg) followed by an ART domain (PARP_cat), which contains the conserved catalytic glutamate14.

The last three domains (WGR-PARP_reg-PARP_cat) are also found in hPARP2 and hPARP3 but fused with a variable N-ter tail, as well as in most eukaryotes except for yeasts7. Nevertheless, the number of sequences in prokaryotes is reduced to only 28 PARP homologue sequences in 27 bacterial species16. Curiously, its activity has only been experimentally tested by western blot with anti-PAR antibodies with a recombinant enzyme cloned from the filamentous predatory gram-negative bacterium Herpetosiphon aurantiacus17. The above enzyme (HaPARP) was active in the presence of activated DNA and inhibited with the hPARP1 inhibitor KU-005894817. In addition, H. aurantiacus also has a DUF2263 protein (UniProt code: T3D766) that is capable of effectively removing PAR17, and whose sequence contains a poly (ADP-Ribose) glycohydrolase (PARG) signature (GGG-X6–8QEE)18. Thus, the presence of both enzymes in the same microorganism suggests that certain bacteria may have a functional PAR metabolism16. Another example of a microorganism with both putative PARP and PARG homologues is the rod-shaped, Gram-positive spore-forming anaerobic bacillus Clostridium difficile CD160, now reclassified as Clostridioides difficile CD16019,20.

C. difficile is a nosocomial opportunistic antibiotic-associated pathogen of humans responsible for a spectrum of diseases known collectively as C. difficile infections (CDI), which range from a mild self-limiting diarrhoea to pseudomembranous colitis and toxic megacolon21. These pathologies often result in death, causing 29,000 deaths and costs in excess of US 6.0 billion dollars per annum in the USA alone22. The major virulence factors of this microorganism are the potent A-type monoglycosyltransferases toxins A (TcdA) and B (TcdB) that attach glucose to Rho proteins using UDP-glucose as a cosubstrate23, and the C. difficile transferase (CDT) toxin, frequently produced by so-called hypervirulent strains. CDT is a two component toxin, CDTa being involved in the ADP-ribosylating activity and the CDTb in binding21. All the above-mentioned toxins are located within two defined loci, the PaLoc (Pathogenic Locus, with tdcA, tdcB, tdcR, tdcE and tdcC genes) and the CdtLoc (with cdtA and cdtB genes)24. Interestingly, non-toxigenic C. difficile strains (which lack the genes for toxins A and B and the binary toxin) are also relatively common, although little is known about their biology25. The same is the case with C. difficile CD160, which was isolated from a clinical stool at the University of Michigan Medical Centre, to test for its toxigenicity. However, the evaluation of this isolate by PCR with two different primers revealed that it was negative for C. difficile toxin genes A and B (tcdA and tcdB). Nevertheless, C. difficile CD160 was selected for sequencing because CE-PCR analysis showed it to be a unique ribotype (Prof. Seth Walk, Montana State University, USA, personal communication). Apart from this uniqueness, only one protein has been studied from this microorganism, a type IV pilin region (PilA1) with an unusual conformation compared with PilA1 from C. difficile R20291 and NAP08 strains26.

The present work makes a comprehensive phylogenetic analysis of bacterial PARPs to better understand the domain organization of these enzymes. Surprisingly, the domain organization in C. difficile CD160 PARP (CdPARP) differed from that of other clostridial PARPs. A detailed study of this anomalous distribution suggested the fact that C. difficile CD160 long ago diverged from other C. difficile strains and also defined a new toxigenotype. In addition, the presence of an active DUF2263 protein in C. difficile CD160 (CdPARG) revealed the existence of a functional PAR metabolism. Kinetic characterization also revealed CdPARP to be a highly active enzyme, with an activity that was 3-fold higher than that observed previously in HaPARP. Finally, the inhibition profile of the bacterial PARPs studied was also different from that of hPARP1, providing new information for the development of novel bacterial inhibitors with well-defined selectivity.

Results

CdPARP shows an atypical PARP domain organization compared with other clostridial PARPs

A phylogenetic analysis of bacterial PARPs was carried out to compare the sequence and domains of CdPARP, using the data available in the UniProt and NCBI databases. Seventy-two sequences were found containing at least the PARP domain (PARP_cat) (Supplementary Table S1) compared to the 28 previously described in the bibliography16. These sequences are distributed in six phyla (Actinobacteria, Bacteriodetes, Choloflexi, Cyanobacteria, Firmicutes, and Proteobacteria) and twelve orders (Fig. 1). The phylogenetic tree shows that these sequences are divided into two large clades. Clade 1 groups all the sequences of two orders, Cytophagales (Clade 1.1) and Clostridiales (Clade 1.2), with the exception for the sequence of CdPARP, which is in Clade 2 (see below). Almost all members of Clade 1 are characterized by the presence of WGR and PARP_cat domains without a defined PARP_reg domain. However, when these sequences are modelled, their corresponding three-dimensional structures show the presence of a helical domain, which could fulfil the same regulatory role as the canonical PARP_reg domain does. Clade 1.1 also contained the longest bacterial PARP used in this study (764 amino acids), which corresponded to one of the two PARPs reported for the bacterium Microscilla maritima, and whose origin seems to be linked to a fusion of a tail of 350 residues in the N-ter. In addition, Bacteroidetes bacterium ADurb.BinA174 PARP (UniProt code: A0A1V5GYX3) is a sister clade of Clade 1.1. On the other hand, Clade 2 is characterized by the balanced presence of both WGR-PARP_reg-PARP_cat and WGR-PARP_cat domain configurations. In the latter case, modelling of the sequences again shows the presence of a helical domain such as that found in the proteins of Clade 1. An example of this dichotomy can be found in Clade 2.1, which consists of four Deltaproteobacteria sequences with the two domain organizations described above. The rest of the Deltaproteobacteria found in the databases were encountered in Clade 2.2, where they formed a sister group with different bacilli. All members of this Clade 2.2, including CdPARP, have the canonical WGR-PARP_reg-PARP_cat organization, an architecture also found in Clades 2.3 and 2.4, formed by proteins belonging to the orders Herpetosiphonales (Clade 2.3), Pleurocapsales, Pseudomonadales and Vibronales (Clade 2.4). By contrast, in Clade 2.5, constituted by two Burkholderiales and one Firmicutes bacterium, the predominant organization was the WGR-PARP_cat. Finally, Clade 2.6 contains all the actinobacterial sequences found in this study, divided into two large groups, belonging to the Micrococcales and Corynebacteriales orders, the latter being the most abundant. The only member of this group with the three domains is the PARP of Leifsonia sp. Leaf264, a microorganism of the Arabidopsis leaf microbiota.

Figure 1
figure 1

Phylogenetic analysis of bacterial PARPs. The Neighbour-Joining (NJ) tree was obtained from 1000 replicates. Protein domain architecture is shown behind each protein code: WGR domain (Red), PARP regulatory domain (Blue), and PARP catalytic domain (Green). Protein codes are summarized in Supplementary Table S1.

The MISTIC server27 (mutual information server to infer coevolution) was used to analyse and visualize the extent of the coevolutionary relationship between two positions in the bacterial PARP protein family by using the information contained within the above MSA (multiple sequence alignment)28. The mutual information (MI) obtained is usually used to find structurally or functionally important positions in a given protein fold family29,30. The residue-based Kullback-Leibler conservation score (Fig. 2, coloured rectangles) and cumulative mutual information score (cMI) for each residue (Fig. 2, black histograms) were fairly similar, both revealing that highly coevolving residues were primarily localized in the donor site of the catalytic domain (N316-Y432 in CdPARP; Fig. 2), and in the DNA binding site of the WGR domain (V20-N100, Fig. 2). In addition, some residues of PARP_reg domain are also highly coevolving (K163-Y239, Fig. 2). The three main regions of the donor site corresponding to a nicotinamide (NI site; G323, Y359, S367, Y370 and E428; Fig. 2, green circles), a phosphate (PH site; D234 and D237; Fig. 2, black circles) and an adenine-ribose binding (AD site; D241, H322, S324, Y336 and K338; Fig. 2, blue circles) sites are in zones characterized by high cMI scores (Fig. 2). Among these residues, especially conserved are those involved in the catalytic triad (H322, Y359 and E428; Fig. 2, red stars). The circos representation also shows three areas of high cMI scores in the WGR domain (N34-Y39; Y57-G61 and K88-Y93), which are related with the DNA-dependent activity of PARPs, the highest being those corresponding to G58 and R59. Of note is that, in the bacterial sequences used, the first position of WGR zone is mostly occupied by a tyrosine (Y), followed by a tryptophan (W) or a phenylalanine (F). In the PARP_reg domain, two leucines (L181 and L236) involved in the stabilization of PARP_reg hydrophobic core are also highly conserved. These two leucines correspond to L269/L233 and L321/L286 in hPARP2/hPARP3, respectively, whose alanine mutants showed an increase in DNA-independent activity, mimicking of distortion of the helical domain produced after DNA binding31.

Figure 2
figure 2

Circos representation of the bacterial PARPs. The outer ring shows the amino acid code corresponding to CdPARP (Uniprot code; T3DQ72). Coloured rectangular boxes of the second circle indicate the KL (Kullback-Leibler) conservation score (from red to cyan, red: highest; cyan: lowest)27. The third circle shows the cMI (cumulative Mutual Information score) scores as histograms. Lines in the centre of the circle connect pairs of positions with MI (Mutational Information) score > 6.5. Red lines represent the top 5%, the black lines between 70 and 95%, and the grey lines account for the last 70%. Sequence distribution of WGR, PARP regulatory and PARP catalytic domains are shown. Amino acids belonging to the catalytic triad are marked with red stars, the amino acids from the NI site, PH site and AD site are marked with green, black, and blue circles, respectively.

The results described above, along with those found in Supplementary Fig. S1, which shows that the 12 bacterial sequences corresponding to each of the 12 different orders found are grouped together with DNA-repairing eukaryotic PARPs in Clade 116,32, may indicate that the proposed organization into three structural domains for the last common ancestor of all eukaryotes may occasionally have been acquired by bacteria through horizontal gene transfer, in some cases remaining intact and in others changing their canonical sequence but not theirs helical structure. This conclusion corroborates what Ahel’s group previously proposed16.

Genetic organization of C. difficile CD160 reveals a new toxinotype

In order to explain the above anomalous localization of CdPARP in a clade other than that of Clostridiales, a maximum likehood tree from the cdu1 genes of at least three C. difficile strains for each of the six clade already described24 was generated, including for the first time the corresponding gene from C. difficile CD160 (Uniprot code: QEW_0946) (Fig. 3). The result showed that C. difficile CD160 forms a sister group within Clade C-I, a highly divergent lineage containing toxigenic and non-toxigenic strains24,33, including SA10-0505 and CD10-165 strains24. These latter strains, along with toxinotypes XXX and XXXI, are a group of C. difficile that lack a complete tcdA gene while preserving tcdB and CDT genes (toxinotype AB+CDT+); however, CD160 strain is a completely new toxinotype (AB+CDT) with the same PaLoc organization as toxigenic SA10-0505 and CD10-165 strains (cdtR, tcdR, tcdB, tcdE) but without cdtB-cdtA genes (Supplementary Fig. S2). CD160 strain also contains the endolysin cwlH gene corresponding to a N-acetylmuramoyl-L-alanine amidase, which corroborates the phage origin of the PaLoc24. In addition, the homology of the TcdR (UniProt code: T3DH32) and CdtR (UniProt code: T3DFV6) regulators in CD160 and those of the reference strain CD630 (Uniprot codes: Q189K4 and Q182U3 respectively) is low (73% and 52%, respectively), but in the range of values described for SA10-0505 and CD10-165 strains (70% and 62%, respectively)24. These low values suggest a long-term divergence of CD160 strain compared not only with those of the C. difficile strains already studied but perhaps also with other clostridiales; indeed, a different origin for the C. difficile CD160 PARP gene is plausible.

Figure 3
figure 3

Phylogenetic analysis of the C. difficile cdu1 genes. Maximum likelihood tree with 1000 replicates was constructed using representative clade strains24.

C. difficile CD160 has functional PAR metabolism

In order to test whether or not CdPARP is a bona fide DNA-dependent PARP as predicted by in silico analysis, it was cloned into pET28a vector and transformed into E. coli Rosetta 2(DE3). In addition, due to the great variability that the bibliography mentions for kinetic parameters depending on the assay used, two additional PARPs with demonstrated DNA-dependent PARP activity were also cloned: the full-length hPARP1 (1–1014) into pET24b vector and the bacterial HaPARP into pET28a vector17,34. The soluble recombinant proteins obtained at 20 °C after induction with IPTG were isolated in three simple steps, as described in Materials and Methods. SDS-PAGE pure enzymes were obtained with their corresponding molecular masses of 113.5 kDa, 52 kDa and 47 kDa for hPARPl, CdPARP and HaPARP, respectively (Supplementary Fig. S3). CdPARP rendered the highest yield of purified protein (4.5 mg/L culture), followed by HaPARP (3.9 mg/L culture), whereas hPARPl had the lowest (1.2 mg/L).

HaPARP and hPARP1 DNA-dependent automodification has previously been demonstrated by western blot and PAR antibodies17. When this process was assayed with CdPARP, using hPARP1 as a control, both PARPs showed a noticeable shift in mobility as seen by western blot using both NAD+ and activated DNA as substrates (Fig. 4), thus indicating poly(ADP-ribosyl)ation of the corresponding PARP. This process was also sensitive to rucaparib, a well-known hPARP1 inhibitor (Fig. 4). In addition, the PARylation shown by hPARP1 and CdPARP was abolished by recombinant protein corresponding to the Clostridioides difficile CD160 QEW_4455 gene (Uniprot code: T3D766). This uncharacterized protein has a DUF2263 domain similar to that of hPARG and HaPARG, which also reverses PARylation (Supplementary Fig. S4). These results demonstrate that C. difficile CD160 has a functional PAR metabolism with both functional CdPARP and PdPARG, as has also been described for H. aurantiacus17.

Figure 4
figure 4

PARylation assay of CdPARP and hPARP1. It was carried out with CdPARP (A) and hPARP1 (B) as control in the presence of absence of NAD+, activated DNA and the PARP inhibitor rucaparib. Numbers on the right margin indicate protein markers (kDa).

Biochemical characterization of recombinant CdPARP

The kinetic parameters of hPARP1 were determined using different assay methods with PARP alone or combined with histones, together with NAD+ (radioactive, biotinylated or as ADP-ribose-pNP) and with activated DNA or double-stranded DNA oligomers35,36. This resulted in different KM (0.06–0.27 mM), kcat (0.41 s−1) and kcat/KM (1.47–6.95 mM−1 s−1) values37. Since no kinetic parameters have been described for any bacterial PARP, recombinant hPARP1 was used as a reference to validate the fluorescent method used in this work, which measures the consumption of substrate by the chemical conversion of the remaining NAD+ into a stable fluorescent condensation product upon treatment by acetophenone in ethanol at basic pH, followed by heating at acidic conditions38,39. Among all the enzymes used, hPARP1 was the most active with a kcat/KM at 3.06 × 10 mM−1 s−1, value 5.2- and 17-fold higher than those of CdPARP and HaPARP, respectively (Table 1). These results highlighted the important contribution of the three zinc fingers and BRCT domain for the activity, which results in a lower KM and a higher kcat (Table 1). As regards bacterial PARPs, both showed a similar KM but the kcat was higher in CdPARP, resulting in a 3-fold higher catalytic efficiency compared with HaPARP (Table 1). These data, together with the purification yield of CdPARP, point to this enzyme as a promising biocatalyst for PAR production.

Table 1 Kinetic parameters of hPARP1, CdPARP and HaPARP. Literature values are from Altmeyer et al.37.

The thermal stability of the above PARPs was also studied under different conditions by monitoring Sypro Orange fluorescence while heating the samples, and determining the midpoints of the transitions (melting temperature or Tm)30. Previous investigations showed that this assay provides a good estimate of binding affinity, optimal storage pH and the suitable protein stabilizers to be used40,41. To carry out such study, the melting temperature of each enzyme in MilliQ® water was taken as a reference, and thus the increment in Tm (ΔTm) was calculated. All enzymes showed a similar Tm in MilliQ® water (40–45 °C) (Supplementary Table S2), hPARP1 being the most stable. When the effect of pH on enzyme stability was studied (Supplementary Table S2), both hPARP1 and CdPARP showed higher increments in Tm at pH 7.5, while, in the case of HaPARP the values were higher at pH 8.0–8.5. However, pH values below pH 7.0 or above pH 8.5 decreased the stability of these enzymes. The protein-stabilizing compounds used (ammonium sulphate and hydroxyectoine), while producing an increase in their Tm with all the enzyme, presented a differential effect between them (Supplementary Table S2). Thus, ammonium sulphate was the best stabilizer for CdPARP, while hydroxyectoine was the best stabilizer for HaPARP and hPARP1. This information may be relevant for the case of hPARP1 as it is a commercial enzyme. In fact, when comparing the stability of hPARP1 in the presence of glycerol (10%) and hydroxyectoine (200 mM) at 4 °C, the latter compound preserved 100% of the activity after 24 h, while in the presence of glycerol it is reduced to 75% (Supplementary Fig. S5). This proportion is maintained even at 72 hours, where the activity in the presence of glycerol is only 8% compared with 32% in the presence of hydroxyectoine (Supplementary Fig. S5). Finally, NAD+ and ADP-ribose had a destabilizing effect at 1 mM, but nicotinamide and the inhibitor 3-aminobenzamide at the same concentration increased the Tm of all enzymes, especially in the case of HaPARP, where the Tm increase was 7 °C (Supplementary Table S2). This increase in Tm due to the presence of 3-aminobenzamide has previously been observed with an hPARP1 construct comprising the domains PARP_reg-PARP_cat (K654-L1013), but not with the entire protein40.

Bacterial PARPs are specifically inhibited by a compound that mimics NAD+

The inhibitory profile of bacterial PARPs was assessed using a subset of small molecules that were originally selected following an in silico high-performance screening campaign on two complementary libraries of 50,000 compounds each (DIVERSet-EXP and DIVERSet-CL, ChemBridge), designed to provide the greatest coverage of pharmacophore while maintaining structural diversity. In addition, nine well-known hPARP1 inhibitors were also tested using a more sensitive fluorescent method due to the high potency of some of them. Thus, the remaining NAD+ was amplified by coupling the enzymes alcohol dehydrogenase and diaphorase. Each time the NAD+ was recycled through these enzymes, a highly fluorescent resorufin molecule was generated. This assay was first used to determine the enzyme activity of NMN-adenyltransferase and ADP-ribosyl cyclase42,43.

The selected PARP inhibitors and analogues were initially screened at 10 µM and 50 µM with both hPARP1 and bacterial PARPs (CdPARP and HaPARP), respectively (Supplementary Table S3). Of the 44 compounds analysed, twenty-three showed more than 50% inhibition with hPARP1 (Supplementary Table S3). Half-inhibitory concentration (IC50) values were calculated for the first 17 hPARP1 inhibitors, obtaining very similar values for the 8 inhibitors already described in the bibliography (Table 2). After the 5 highly selective known inhibitors (rucaparib, ABT-888, PJ-34, EB-47 and DPQ) surprisingly, four compounds derived from 4-substituted 3-nitrophenyl-1(2 H)-phthalazinone appeared (Table 2; Supplementary Fig. S6). Among them, standing out for its selectivity, was compound 7655698 with an IC50 of 81 nM, which, to the best of our knowledge, has never been used as hPARP1 inhibitor, and which could be optimized to increase its potency and selectivity. The others members of this family of compounds with IC50 values between 136–481 nM have been tested as modulators of cytokine activity44 or maternal embryonic leucine zipper kinase45, and only one (7660328) has been used to calculate its Tm increment with a hPARP1 construct (K654-L1013)46. This was also the case with compound 7650155 (1,3-dioxo-2,3-dihydro-1H-benzo[de]isoquinoline-5-sulfonamide)46, which together with another two N-alkyl derivatives (7670490 and 9019116) that had never before been tested with hPARP1, belong to a second family of compounds with IC50 values between 403–676 nM (Table 2; Supplementary Fig. S6).

Table 2 IC50 values of well-known and new compounds against hPARP1, PdPARP and HaPARP.

As regards bacterial PARPs, only five compounds showed inhibition in excess of 50% with CdPARP, and three compounds with HsPARP (Supplementary Table S3). However, only rucaparib and EB-47 were able to inhibit them completely. When IC50 values were determined, rucaparib showed values of 5.3 and 7.2 µM, while EB-47 showed values of 0.8 and 1.0 µM, respectively (Table 2). The latter inhibitor was designed to mimic the NAD+ within the hPARP1 substrate-binding site47 and has been co-crystallized with tankyrase-2 with an IC50 of 45 nM48. When a structural alignment of this tankyrase-2 with EB-47 (PDB: 4BJ9) was carried out, both with hPARP1 (PDB: 4opx) and with the modelled CdPARP and HsPARP, it was observed that at the nicotinamide site the residues that form the hydrogen bonds (G1032 and S1068; TKNS2 numbering) and produce π-π stacking (Y1071) with isoindolinone moiety are also conserved in the rest of the studied PARPs (Fig. 5). In addition, the residues involved in binding the ribose hydroxyls (H1031 and S1033) and the two H-bonds with the adenosine (G1043 and D1045) were also structurally conserved (Fig. 5). The different IC50 of the above PARPs and TKNS2 could be related with the specific interaction of EB-47 with the D-loop of each enzyme, as previously described for TKNS248.

Figure 5
figure 5

PARP inhibitor EB-47 binding site. Structural alignment of TNKS2 (PDB: 4BJ9, orange), hPARP1 (PDB: 4OPX, cyan) and modelled CdPARP (salmon) and HaPARP (green) was carried out using chimera54. hPARP1 AD site is represented by R878, G876, H862 and S864, whereas Ni site by G863, S904 and Y907, respectively.

Discussion

Poly-ADP-rybosilation is a remarkably, well-conserved post-transcriptional modification in eukaryotes, but it is less common in bacteria. This was also observed in the phylogenetic studies carried out, mainly focusing on the evolutionary distribution of the poly(ADP-ribose) polymerases in eukaryotes. In this study, we looked into bacterial PARPs, not only as regard their phylogenetic distribution, by representing their first phylogenetic three (Fig. 1), but also the key amino acids responsible for catalysis, and DNA-dependent activation, using the coevolutionary relationship between two positions provided by the MISTIC server. With this aim, we focused on the C. difficile CD160 putative protein corresponding to QWE_4625 gene, since its protein in the PFAM database indicates the presence of a WGR-PARP_reg-PARP_cat canonical organization. The updated phylogenetic analysis was carried out with 72 sequences rather than the 27 previously used. Unexpectedly, the result showed that even when PFAM did not found in most of the protein sequences the PARP_reg domain, the corresponding modelled sequences displayed a similar helical domain to the canonical PARP_reg domain, which is necessary for the destabilization of PARP_cat domain after DNA-binding at the WGR domain. The study of this last domain produced two important findings: an YGR domain exits in most of bacterial PARPs along with other highly coevolved amino acid, which have never before been studied, such as G92 and Y93 (Fig. 2). In addition, the organization of the CdPARP anomalous domain compared with that of other clostridial PARPs suggests a long-term divergence of CD160 strain, based on its cdu1 gene phylogeny (Fig. 3) and homology in TcdR and CdtR. Of note too, is that the latter, poorly studied strain (just one paper mentions it), presents a new toxinotype never before described (AB+CDT) with an atypical “Bi-Toxin Paloc” comprising only the CdtR, tcdR, tcdB, tcdE, CwlH genes. Surprisingly, this PaLoc organization differs from that of the closely related SA10-50 and CD10-165 strains24. Furthermore, this microorganism has also a functional bona fide PAR metabolism, not only by the existence of PARP (Fig. 4) and PARG (Supplementary Fig. S5), but also by the presence of a putative MacroD (T3DIR9), which is the enzyme predictably responsible for the removal of the terminal ADP-ribose unit after PARG has acted, giving rise to a completely unmodified protein49. Surprisingly, no ARH3-like protein was found in UniProt for C. difficile CD160, whereas three genes were found in H. auranticus ATCC 23779 (Haur_3413, Haur_2362 and Haur_0057), in addition to the existing three MacroD genes (Haur_0008, Haur_4555 and Haur_2355). The presence of different genes involved in the PAR metabolism in Bacteria, together with the presence of sirtuins-like genes would seem to be rather more than a random horizontal gene transfer phenomenon, and more likely an adaptive advantage that has been preserved over time. Such an advantage has been recently described in a new toxin-antitoxin (TA) system, which possibly contributes to bacterial persistence50. The latter system is based on the combined action of a toxin protein (DarT) that specifically leads to a sequence-specific ADP-ribosylation on single-stranded DNA and an antitoxin macrodomain protein (DarG) that reverse such DNA modification50.

Finally, our data revealed different inhibitory profiles between hPARP1 and bacterial PARPs, the latter enzymes being specifically inhibited by EB-47, a compound that mimics NAD+ (Table 2). However, the major discovery in this study was a compound, never before used with PARP, that selectively inhibited hPARP1 (compound 7655698) but not the bacterial PARPs. The absence of Pan-PARP inhibition for bacterial PARPs, as produced by rucaparib (Table 2), makes this compound as an ideal selective inhibitor for seeking further improvements in its potency. Taken together, the above described results will provide the foundation for new future bacterial PAR metabolism studies and the development of new specific PARP inhibitors.

Materials and Methods

Protein expression and purification

Bacterial and human PARP and PARG proteins were expressed from the pET28a vector (EMD Millipore, Madrid, Spain) bearing an N-terminal His-tag, with the exception of the hPARP1 (1–1014), which has a C-terminal His-tag and was kindly provided by J. M. Pascal (Université de Montréal, Montréal, QC) in a pET24b vector. PARP (QEW_4625) and PARG (QEW_4455) genes from Clostridioides difficile CD160 (BioProject and SRA Accession numbers PRJNA85757 and SRR593185, respectively) were obtained from GenScript (Piscataway, USA). Genomic DNA from Herpetosiphon aurantiacus DSM785, purchased from DSMZ (Braunschweig, Germany), was used to clone PARG (Haur_1618) and PARP (Haur_4763) genes. hPARG-pCMV6-XL5 construct was purchased from Origene (Rockville, USA). Genes were amplified by PCR using KAPA HiFi DNA polymerase and the corresponding primers, including NheI and XhoI restriction site extensions. The cloning and transformation techniques used were essentially those previously described51.

All PARP constructs were expressed in Escherichia coli strain BL21(DE3) Rosetta2. The clones containing the constructions were grown in 1 litre of Terrific Broth (TB) medium supplemented with kanamycin (50 µg/mL) and chloramphenicol (30 µg/mL). When the culture reached an optical density of 2.5 at 600 nm, it was induced with 0.2 mM isopropyl-β-D-thiogalactoside (IPTG) for hPARP1 or with 0.1 mM IPTG for both HaPARP and CdPARP for 16 h at 20 °C with constant shaking. Pelleted cells were resuspended in lysis buffer (50 mM HEPES pH 7.5, 250 mM NaCl, 1 mM β-mercaptoethanol, 1 mM PMSF, 1 mM benzamidine and cOmplete EDTA-free protease inhibitor cocktail) before being disrupted by sonication (450-D Sonifier, Branson). After ultracentrifugation (40000 g, 40 min), the supernatant was incubated with Ni-NTA beads (Macherey-Nagel, Germany) at 4 °C for 1 h. Then, they were washed with lysis buffer, and the protein was eluted with the same buffer containing 250 mM imidazole. The PARP-containing fractions were loaded onto a HP heparin column coupled to a FPLC chromatography system (ÄKTA Prime Plus, GE Lifesciences)52, and eluted with 50 mM Tris-HCl pH 7.5, 1 M NaCl, 1 mM EDTA and 1 mM β-mercaptoethanol. Finally, the protein was loaded onto a Superdex 200 HiLoad 16/600 column (GE Lifesciences), obtaining an electrophoretically pure enzyme. PARP aliquots were stored at −20 °C with 10% glycerol.

All PARG constructs were expressed as above described but induced with 0.2 mM IPTG when the culture reached an optical density of 4.0 at 600 nm in TB medium. After ultracentrifugation (40000 g, 40 min), the resulting supernatant was purified in two steps comprising a Ni2+-chelating affinity chromatography (HiPrep IMAC 16/10 FF column) followed by gel filtration onto a Superdex 200 HiLoad 16/600 column (GE Life Sciences). PARG aliquots were also stored at −20 °C with 10% glycerol. The protein concentration was determined using Bradford reagent (Bio-Rad) and BSA as standard.

PARP activity assays

PARP activity was determined by a fluorescence-based assay based on the reaction of N-alkylpyridinium compounds, such as NAD+, with acetophenone to give rise a fluorescent compound after heating in acid38. The hPARP1 automodification reaction was allowed to proceed in a black 96-well fluorescence plate (Greiner Bio-One, USA) for 20 min at 25 °C in a reaction medium containing 90 nM of hPARP1, 1 mM NAD+ (Trevigen), 75 µg/mL of activated DNA (Trevigen), and PARP assay buffer (100 mM Tris-HCl pH 8.0, 10 mM MgCl and 1 mM DTT) in a final volume of 60 µL. The NAD+ present after the reaction was then determined by the addition of 20 µL of an aqueous 2 M KOH solution and 20 µL of a 20% acetophenone solution in ethanol. The plate was incubated at 4 °C for 10 min. Then, 80 µL of formic acid 88% was added, incubated at 110 °C for 5 min and allowed to cool for 30 min. Fluorescence was measured on Synergy HT equipment (Biotek Instruments) at an excitation wavelength of 360 nm and emission wavelength of 445 nm. HaPARP and CdPARP activity assays were carried out as above but at 30 °C for 1 h and at enzyme concentrations of 850 nM and 500 nM, respectively. KM values were estimated using plots of initial rates vs. NAD+ concentrations.

Due to the potency of some of them, the inhibitors were tested using a more sensitive cycling assay involving alcohol dehydrogenase (ADH) and diaphorase (DP)42,43. ADH reduces the NAD+ to NADH and oxidizes ethanol to acetaldehyde, while DP turns NADH back into NAD+ with the reduction of resazurin, providing resorufin, which has an excitation maximum at 544 nm and an emission maximum at 590 nm. The reaction mixture contains enzyme (90 nM hPARP1 or 0.5 µM CdPARP or 0.85 µM HsPARP), 100 nM NAD and 75 µg/mL of activated DNA in PARP assay buffer. The inhibitors were added at different concentrations at a constant volume of 0.3 µL of DMSO. The PARP automodification reaction was allowed to proceed for 20 min at 25 °C (hPARP1) or 60 min at 30 °C (CdPARP and HsPARP). Then, 30 µL of cycling reagent, containing 2% ethanol, 50 µg/mL ADH, 50 µM Resazurin, 5 µg/mL DP, 10 µM FMN in PARP assay buffer was added. The enzyme-coupled reaction was measured over 15 min in a Synergy HT (Biotek Instruments). The compounds were screened at 10 μM (hPARP1) or at 50 μM (CdPARP and HsPARP) in triplicate, and the best compounds were selected for measuring their corresponding IC50 values, which were estimated using dose-response curve fitting. The reported values represent means ± SE of the fits of the curves based on triplicate experiments, each determined on the basis of three replicates.

Western blot

PARP automodification was also assayed by western blot in a reaction containing different concentrations of enzyme (250 nM HaPARP, 200 nM CdPARP, or 85 nM hPARPl) in assay buffer, with or without NAD+ (1 mM), activated DNA or rucaparib (1 µM), respectively. The reactions were allowed to proceed for 1 h at 25 °C. PAR-mediated PARG hydrolysis was analysed using the above-mentioned PARP automodification reaction after stopping it with rucaparib, followed by the addition of PARG enzymes (500 nM) and incubation for 1 hour at 30 °C. The above reactions were then run on 7–10% SDS-PAGE gels and transferred to a nitrocellulose membrane. The presence of poly ADP-ribose (PAR) was detected by the use of rabbit polyclonal anti-PAR antibodies (1:1000; Trevigen), and goat anti-Rabbit IgG antibody conjugated to horseradish peroxidase (1:5000; Bio-Rad), and Opti-4CN as optimized colorimetric substrate (Bio-Rad).

Thermal stability assay

Protein melting curves to determine the thermal stability of PARPs were obtained using the fluorescent dye SYPRO Orange (Molecular Probes), as previous described30,41. Proteins (10 µg) were preincubated with 10X Sypro Orange (excitation at 490 nm, emission at 530 nm) in the presence of different buffers (100 mM), compounds (NAD+, ADP-ribose, nicotinamide, ammonium sulphate and hydroxyectoine) or inhibitors (3-aminobenzamidine). The assay, performed in 7500 RT-PCR equipment (Applied Biosystems), monitors Sypro Orange fluorescence while heating the samples from 5 to 98 °C. Each experiment was carried out in triplicate.

In silico analysis

Protein sequences were obtained from NCBI non-redundant (NR) and UniProt databases using Clostridiales difficile CD 160 PARP protein sequence as a query. Incomplete sequences and duplicates were removed, rendering the sequences described in Supplementary Table S1. A Neighbour-Joining (NJ) tree with 1000 replicates was constructed using the MAFF server (https://mafft.cbrc.jp/alignment/server/). Domain architectures of retrieved sequences were obtained from the Pfam (http://www.sanger.ac.uk/Software/Pfam) and GenomeNet (http://www.genome.jp/tools/motif/) databases. Mutation correlation analysis was made with the retrieved bacterial PARP sequences, using Mistic (Mutual Information Server to Infer Coevolution) web server (http://mistic.leloir.org.ar)27. The x-ray structure of hPARP1 (PDB: 2DR6) was used as model for the in silico high-performance screening campaign on two complementary libraries of 50,000 compounds each (DIVERSet-EXP and DIVERSet-CL, ChemBridge) using LeadIT and SeeSAR (https://www.biosolveit.de/). Protein sequences were modelled with SwissModel53. Protein structure images and structural alignments were obtained with Chimera54. Cdu1 gene phylogenetic three of C. difficile CD160 was constructed with MEGA 7.055 using maximum likelihood method and the data provided by Prof. Marc Monot (Institute Pasteur, FR). PaLoc and CdTLoc representations of C. difficile strains were obtained from Patric web (https://www.patricbrc.org/).