Introduction

Kingella negevensis is a gram-negative bacterium that produces RTX toxin-associated hemolysis [1]. It colonizes the respiratory tract and oropharynx, with carriage and colonization resembling those of K. kingae [2, 3]. Its prevalence increases in children from 6 to 24 months of age and decreases thereafter [4]. It has been implicated in infant septic arthritis [2, 4, 5], endocarditis, pediatric osteomyelitis and bacteremia [1], and it spreads through person-to-person contact [4].

Its genome has been sequenced and is around 2 Mb in size [6]. Different strains of this bacterium have been reported to be genetically heterogeneous [4]. Several key virulence factors that mediate binding to human epithelium, such as an exopolysaccharide, a polysaccharide capsule, an adhesin autotransporter, and type IV pili, are shared between K. negevensis and K. kingae [7]. Its integrative and conjugative elements (ICE) also resemble those of Neisseria gonorrhoeae [8], with high homology in the type IV secretion system protein VirB4, which is involved in human endothelial cell subversion. In addition, the type IV coupling protein (T4CP), the DNA transesterase enzyme relaxase and the integrase are exceptionally conserved. DNA uptake sequences within the ICE sequences match those of Neisseria gonorrhoeae and Neisseria meningitidis [8].

The availability of genome sequences is a boon for bioinformatics-based studies, in which algorithms and software can be used to mine therapeutic targets and explore the druggable potential of a bacterium [9]. This approach has previously been used to identify therapeutic targets in drug-resistant Salmonella typhi [10], the Enterobacteriaceae family [11], Bacillus sp. [12], Chlamydia pneumoniae [13], and others. The main principle is the application of ‘essentiality and selectivity criteria’, under which a gene product must be necessary for bacterial survival and absent from the host. This is intended to ensure that drug molecules targeted against the pathogen will not disrupt the host system [14]. The ability to identify molecules with robust modulatory activity against a pathogen using software (via a lock-and-key fit) is valuable for drug discovery. A large number of molecules with selectivity for an enzyme target can be screened in a short time. Molecular docking estimates the best binding through regression- or classification-based scoring and thus prioritizes hits in a library [15]. However, the hits should be passed through several filters, because decoys and toxic, poorly bioavailable or inactive molecules are of little practical use. For this purpose, absorption, distribution, metabolism, and excretion (ADME) properties, along with physiologically based pharmacokinetic (PBPK) profiling, help to assess drug efficacy and tolerability [16]. In the past, poor PK properties (e.g., low bioavailability) have led to the failure of a large fraction of lead compounds [17]. Good PK properties can therefore motivate further exploration of a molecule as a drug. PBPK profiling can also shed light on dosing [18] and on specific adaptations of the regimen for different ethnicities [19] and health conditions.

Natural products have gained considerable importance in drug design against pathogens, and around half of the FDA-approved drugs (USFDA, 1981–2019) are sourced from or based on natural compounds [20]. Their large chemical space [21], together with their established medicinal use as traditional remedies, makes them an important resource for screening against pathogenic bacteria. Traditional Indian (Ayurvedic) medicine and Traditional Chinese Medicine (TCM) have a deep-rooted history in the pharmacopeia of their respective regions. Drugs derived from these medicinal systems have been investigated in diseases such as cancer [22,23,24,25,26,27,28] and COVID-19 [28, 29]. Although traditional medicine comprises single- or multi-component preparations, computational screening offers a quick strategy for assessing single compounds. The therapeutic efficacy of compounds screened from these sources is assessed on the basis of structure rather than disease.

In this study, we inferred therapeutic targets and carried out virtual screening of two natural product libraries, of Indian and Chinese origin, against the pdxJ gene product of K. negevensis. ADMET and PBPK properties were also studied for the prioritized compounds from these libraries. To the best of the authors' knowledge, this is the first report of a therapeutic target map of K. negevensis and of the screening of natural product inhibitors against it.

Material and methods

Data acquisition

Genome data of Kingella negevensis strain Sch538 (accession CCNJ00000000) were obtained from NCBI. Coding DNA sequences and the proteome were also retrieved.

Subtractive genomics

The data were subjected to subtractive genomics for therapeutic target identification using a previously described pipeline [30]. A Core i7 (7th generation) machine with 8 GB RAM was used for the analysis. Paralogous sequences with more than 60% similarity were removed and the genomic data were converted to a protein dataset. Data for the DEG [31] and CEG [32] databases, the human proteome and the gut microbiota (n = 84 bacteria) were obtained from UniProt (https://www.uniprot.org), NCBI (https://www.ncbi.nlm.nih.gov), the DEG website (http://www.essentialgene.org) and the CEG website (http://cefg.uestc.cn/ceg). Homologous or non-homologous protein sets, as appropriate to each step, were retained by searching against these datasets in the same order using standalone BLAST 2.2.31. One protein, pyridoxine 5'-phosphate (PNP) synthase (accession no. WP_032137481.1), which is involved in vitamin B6 synthesis, was chosen from the resulting therapeutic targets for downstream analysis.
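
The subtraction logic described above can be sketched in a few lines. This is a minimal illustration, assuming that the BLAST searches have already been run and reduced to sets of query IDs with significant hits; the function name, the ID strings and the toy hit sets are hypothetical and are not taken from the pipeline in [30].

```python
from typing import Iterable, List, Set


def subtractive_filter(
    proteome: Iterable[str],
    deg_hits: Set[str],
    ceg_hits: Set[str],
    human_hits: Set[str],
    gut_hits: Set[str],
) -> List[str]:
    """Return protein IDs that pass every subtraction step, in input order.

    A protein is kept when it has hits in both essential-gene sets
    (DEG and CEG) and no hit in the human or gut microbiota proteomes.
    """
    selected = []
    for protein_id in proteome:
        essential = protein_id in deg_hits and protein_id in ceg_hits
        host_like = protein_id in human_hits or protein_id in gut_hits
        if essential and not host_like:
            selected.append(protein_id)
    return selected


# Hypothetical toy data: IDs and hit sets are invented for illustration.
proteome = ["P1", "P2", "P3", "P4", "P5"]
targets = subtractive_filter(
    proteome,
    deg_hits={"P1", "P2", "P3", "P4"},
    ceg_hits={"P1", "P2", "P3"},
    human_hits={"P2"},
    gut_hits={"P3"},
)
print(targets)
```

In practice the hit sets would be parsed from tabular BLAST output with an E-value or identity cutoff; those cutoffs are not restated here and would need to follow the original pipeline.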

Structure modeling and virtual screening

PNP synthase was subjected to 3D structural modeling using I-TASSER [33], with the LOMETS multi-threading algorithm at the back end. Because LOMETS picks many templates from the PDB library, I-TASSER retains only the significant ones, keeping the top ten based on Z-score. The top template was a crystal structure of PNP synthase from Escherichia coli with bound 1-deoxy-D-xylulose phosphate (PDB ID: 1M5W); ten structures in all were used for threading-based structure modeling. Alignments of these structures were obtained using several programs, i.e. SPARKS-X, HHSEARCH, FFAS-3D, Neff-PPAS, pGenTHREADER, wdPPAS, PROSPECT2 and SP3. The generated decoys were clustered and the top cluster was picked by the SPICKER program. The C-score was used for the final ranking of structures in the top cluster. PROMOTIF was used to study secondary structure, while COFACTOR and COACH were used to predict ligand binding site residues. Ramachandran analysis was performed for structure validation (https://zlab.umassmed.edu/bu/rama/index.pl).

Docking-based screening was carried out using Molecular Operating Environment (MOE) software version 2019.1. The TCM library consisted of 36,000 compounds, while the Ayurvedic library consisted of 2002 molecules. The Triangle placement method was used for the initial round of hit prioritization, followed by forcefield-based refinement of the hits. The three complexes with the lowest energy values were saved from each library, and ligand interaction diagrams were drawn for 2D visualization. Top-scoring compounds were subjected to a 100 ns dynamics simulation using Desmond, according to previously described parameters [34].
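
The selection of the three lowest-energy complexes per library amounts to a sort on the docking score. The sketch below assumes the scores have been exported from the docking software as a mapping from compound name to score, where a more negative score means a more favorable pose; the compound names and score values are hypothetical placeholders, not results from this study.

```python
from typing import Dict, List, Tuple


def top_hits(scores: Dict[str, float], n: int = 3) -> List[Tuple[str, float]]:
    """Return the n compounds with the lowest (most favorable) scores."""
    return sorted(scores.items(), key=lambda item: item[1])[:n]


# Hypothetical scores for illustration only (more negative = better).
library_scores = {
    "compound_a": -6.1,
    "compound_b": -8.4,
    "compound_c": -7.2,
    "compound_d": -5.0,
    "compound_e": -7.9,
}

for name, score in top_hits(library_scores):
    print(f"{name}\t{score:.1f}")
```

Applying the same function separately to each library keeps the two sets of hits comparable in size, which matches the per-library selection described above.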

ADMET and PBPK evaluation

These compounds were subjected to ADMET profiling using the machine learning-based pkCSM platform (https://biosig.lab.uq.edu.au/pkcsm/). It is centered on a graph-based technique in which distances between atoms are used to train predictive regression and classification models that calculate ADMET properties. Absorption variables include water solubility, intestinal absorption, Caco-2 permeability and skin permeability, whereas distribution variables include blood-brain barrier and central nervous system permeability. Metabolism by cytochrome enzymes is predicted, along with clearance as the excretion parameter. Toxicity is determined from the tolerated dose in rat, minnow and T. pyriformis, together with hepatotoxicity, Ames toxicity and skin sensitization [35].

PBPK modeling is based on a chain of differential equations and was carried out in GastroPlus software (SimulationsPlus LLC). Compound-specific parameters such as molecular weight and pKa values are supplied as input, and the plasma concentration is profiled after the chosen route of administration. Statistically, the oral route has been the most efficacious [36], so it was chosen. GastroPlus includes distribution factors for the Pgp, PepT1, HPT1, OCTN1, LAT2 and OATP1A2 transporters. Vmax is adjusted for each compartment based on the values from these transporters. Health conditions were taken as (a) normal, (b) cirrhosis and (c) renal impairment. The dose was 100 mg, taken in the fasting state by humans (population of 200). A pH-dependent dissolution model was selected, with pH = 7.2. Diffusion coefficient values were taken as 0.5–1.5 × 10⁻⁵ cm²/s, the transit time for the stomach as 0.25 h, for the caecum as 4.5 h and for the colon as 13.5 h, and the drug particle density as 1.2 g/mL. Compartment lengths, radii, transit times and pH values were adjusted across the population with respect to the weights of individuals. The simulation time was 12 h and the Advanced Compartmental and Transit (ACAT) model was implemented.
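
To illustrate what a chain of differential equations means for gut transit, the following is a simplified sketch and not the ACAT model as implemented in GastroPlus. It assumes first-order transfer between consecutive compartments with rate constant 1/(transit time), integrates with a plain Euler step, and uses per-compartment absorption rate constants that are hypothetical. Only the stomach, caecum and colon transit times are taken from the settings above; the small-intestinal compartments are omitted because their values are not listed.

```python
from typing import List, Sequence, Tuple


def simulate_transit(
    dose_mg: float,
    transit_times_h: Sequence[float],
    absorption_rates_per_h: Sequence[float],
    t_end_h: float = 12.0,
    dt_h: float = 0.001,
) -> Tuple[List[float], float, float]:
    """Euler integration of a first-order transit chain.

    Returns (amount left in each compartment, amount absorbed,
    amount that has left the last compartment), all in mg.
    """
    if len(transit_times_h) != len(absorption_rates_per_h):
        raise ValueError("one absorption rate is needed per compartment")
    amounts = [0.0] * len(transit_times_h)
    amounts[0] = dose_mg  # the whole dose starts in the first compartment
    absorbed = 0.0
    excreted = 0.0
    for _ in range(int(round(t_end_h / dt_h))):
        new_amounts = list(amounts)
        for i, (t_transit, k_abs) in enumerate(
            zip(transit_times_h, absorption_rates_per_h)
        ):
            moved = amounts[i] / t_transit * dt_h  # transfer rate = 1 / transit time
            taken_up = amounts[i] * k_abs * dt_h   # first-order absorption
            new_amounts[i] -= moved + taken_up
            absorbed += taken_up
            if i + 1 < len(amounts):
                new_amounts[i + 1] += moved
            else:
                excreted += moved
        amounts = new_amounts
    return amounts, absorbed, excreted


# Transit times (h) for stomach, caecum and colon as stated in the text;
# the absorption rate constants (1/h) are hypothetical.
amounts, absorbed, excreted = simulate_transit(
    dose_mg=100.0,
    transit_times_h=[0.25, 4.5, 13.5],
    absorption_rates_per_h=[0.0, 0.5, 0.1],
)
print(f"absorbed: {absorbed:.1f} mg, still in gut: {sum(amounts):.1f} mg, "
      f"excreted: {excreted:.1f} mg")
```

Because transfer is continuous, some of the dose is still present in the gut compartments at the end of the 12 h window, which is the behavior a transit-chain model is meant to capture.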

Results and discussion

Therapeutic candidate mapping

After paralog removal, 2037 hits were obtained from the total of 2104 CDSs. Of these, 944 sequences were similar to DEG entries and 814 to CEG entries; only 798 sequences were common to both the DEG and CEG sets. These were compared to the human proteome, yielding 386 non-similar proteins, of which 157 were also dissimilar to the human gut microbiota proteome. These were compared against DrugBank and 41 druggable proteins were obtained (Supplementary Table 1).

Among the druggable proteins, PNP synthase was selected for further analysis because it is an essential part of pathways involved in amino acid synthesis in prokaryotes but is absent in humans [37]. This protein is the product of the pdxJ gene, and the active form of vitamin B6 produced through this enzyme serves as a coenzyme in several amino acid, lipid and glucose metabolic pathways. It is vital to many processes in the bacterium, including deamination, transamination, decarboxylation, and racemization [26]. This enzyme family makes up a small but necessary fraction of the prokaryotic genome (~ 1.5% of the genome) [38]. Nearly 4% of the enzyme-catalyzed reactions catalogued in the Enzyme Commission database are linked with this family of enzymes, but approved drugs against this class of enzymes are scarce. We aimed to explore this enzyme as a drug target against K. negevensis and to screen traditional medicinal compounds against it. To this end, the structure was first modeled, as no experimental structure was available for this species in the Protein Data Bank.

3D structure modeling of PNP synthase

The structure of PNP synthase (EC 2.6.99.2) was threaded by I-TASSER using 10 templates. The overall ERRAT quality factor was 97%. The top templates, with more than 50% identity over both the aligned region and the whole protein sequence, were PNP synthase from Escherichia coli (PDB ID: 1M5W), pyridoxal phosphate biosynthetic protein from Burkholderia pseudomallei (PDB ID: 3GK0), and PNP synthase from Pseudomonas aeruginosa (PDB ID: 5DLC). The obtained model was composed of 1 sheet, 6 beta-alpha-beta motif units (with some residues in loops and some in helices), 1 parallel wide-type beta bulge, 8 parallel strands with topology 7X -1X -1X -1X -1X -1X -1X, 13 helices, 17 helix-helix interactions, 8 beta turns and 2 gamma turns (Fig. 1A). There were 218 non-glycine, non-proline residues and 23 glycine and proline residues. Normally, this enzyme adopts a TIM-barrel or alpha/beta fold with eight helices and parallel beta strands. However, the inner core is hydrophilic and three additional helices are present [37].

Fig. 1

A 3D structure of the modeled PNP synthase of K. negevensis, with active site residues indicated by dotted spheres and the central gorge visible as a tunnel in the surface representation of the ribbon structure. B Ramachandran plot showing highly favored observations (95.87%) as green crosses, slightly less favored ones as brown triangles (2.2%) and questionable ones as red circles (1.8%). Black and gray regions denote preferred conformations, with delta values ≥ 2

Five binding sites were identified by COACH and COFACTOR. The crystal structure of the Escherichia coli PNP synthase template contained eight binding sites, and the active site residues predicted by I-TASSER from the chosen templates were Asn6, His9, Thr12, His42, Arg44, Glu69, Val91, Glu93, Gly102, Phe130, His152, Gly191, Thr193, Gly212 and Ile216. Binding site residue prioritization based on all hits via I-TASSER predicted the highest bonding capability for Asn6, Glu150, Gly189, His190, Asn210, Ile211, Gly212 and His213. The Ramachandran plot showed about 95% of residues in the most favored region and most of the remainder in additionally allowed regions. Only 1.8% (four residues) were in the disallowed region (Fig. 1B).

Docking based screening

Structure modeling was followed by docking with TCM and Ayurvedic medicinal compounds. ZINC02525131, ZINC33833737 and ZINC85486932 were identified as the top inhibitors from the TCM library, while Cadiyenol, 9,11,13-Octadecatrienoic acid and 6-Gingerol were prioritized from the Ayurvedic library. The prioritized compounds (Table 1) bonded with the majority of the identified active site residues in all of the complexes (Fig. 2). The S-values and MM/PBSA values were in agreement with each other, except for the 9,11,13-Octadecatrienoic acid complex with PNP synthase: its MM/PBSA value was slightly lower than that of Cadiyenol, even though Cadiyenol had the lower S-value.

Table 1 Prioritized compounds with their binding score (S values) and MM/PBSA values
Fig. 2

2D representation of the docked complexes of PNP synthase with A ZINC02525131, B ZINC33833737, C ZINC85486932, D Cadiyenol, E 9,11,13-Octadecatrienoic acid and F 6-Gingerol. A 3D depiction of these complexes is shown in Supplementary Fig. 1

Information was not available in the literature for every compound. ZINC02525131 (β-Hydroxyisovalerylshikonin), the top hit from the TCM library, has previously been isolated from Lithospermum radix and is known to have chemotherapeutic properties, inducing apoptotic cell death in human lung cancer DMS114 cells [39] and in adenocarcinomic human alveolar basal epithelial A549 cells [39]. This compound has also been isolated from Lithospermum erythrorhizon [40]. It has also been reported to have antifungal properties and to reduce mycelium formation in C. albicans [39]. Cadiyenol has been isolated from Centella asiatica (also known as Indian pennywort) [41, 42] and has shown apoptotic activity in murine lymphoma cells. 9,11,13-Octadecatrienoic acid, also called punicic acid or α-eleostearic acid, is a linoleic acid derivative that occurs in Momordica cochinchinensis and is known to inhibit estrogen-negative and estrogen-positive breast cancer proliferation [43]. It also occurs in Pleurocybella porrigens [44], Momordica charantia [45], M. cymbalaria [46], Punica granatum [47] and Aleurites montana [48]. Its antibacterial properties have been reported [49, 50]. 6-Gingerol is present in the rhizome of ginger (Zingiber officinale), and the physicochemical properties of this rhizome have made it a timeless traditional medicinal plant against several ailments [51]. Gingerol is a phenolic constituent, and 6-Gingerol has previously been reported to reduce allergic rhinitis by suppressing T-cells [52]. An extract with a high content of 6-Gingerol acted as an antioxidant and anti-inflammatory agent in murine models exposed to the organophosphate pesticide chlorpyrifos, which causes oxidative damage [53]. It has also shown an anti-proliferative effect in prostate cancer cells [54]. Here, we have predicted antibacterial activity for all the compounds listed in Table 1.

Dynamics simulation

Dynamics simulation was conducted for the two top complexes, PNP synthase with ZINC02525131 and with Cadiyenol. The average RMSD for ZINC02525131 did not exceed 3 Å throughout the simulation, indicating that the complex was stable (Fig. 3A). The protein-ligand interaction plot (Fig. 3B) showed that Ser214 and Gly191 played a significant role in ligand binding through hydrogen bonding, which has a robust impact on the metabolization, specificity and adsorption of ZINC02525131. By comparison, Asn6, Asp8, His42, Arg44, Thr193, Gly212 and His213 made small or transitory contacts with the ligand through hydrogen bonds. Water bridges, i.e. hydrogen-bonded interactions mediated by a water molecule, were formed by Asn6, Asp8, His9, His42, Arg44, His190, Gly191, Thr193 and His213. Transitory ionic interactions mediated by the protein backbone were displayed by just two residues, His42 and Glu69, and a hydrophobic interaction by Ile211 alone. Arg44, His213, Ser214 and Gly191 retained interactions for more than 30% of the simulation time.
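
The statement that a residue retained interactions for more than 30% of the simulation time corresponds to a contact occupancy calculation over trajectory frames. The sketch below is one simple way to compute it, assuming the contacts have already been extracted per frame as sets of residue labels; the frame data are invented for illustration and the function names are hypothetical, not part of Desmond.

```python
from typing import Dict, List, Sequence, Set


def contact_occupancy(frames: Sequence[Set[str]]) -> Dict[str, float]:
    """Fraction of frames in which each residue contacts the ligand."""
    counts: Dict[str, int] = {}
    for contacts in frames:
        for residue in contacts:
            counts[residue] = counts.get(residue, 0) + 1
    total = len(frames)
    if total == 0:
        return {}
    return {residue: n / total for residue, n in counts.items()}


def persistent_contacts(
    frames: Sequence[Set[str]], threshold: float = 0.30
) -> List[str]:
    """Residues whose occupancy exceeds the threshold, sorted by name."""
    occupancy = contact_occupancy(frames)
    return sorted(r for r, frac in occupancy.items() if frac > threshold)


# Hypothetical 10-frame trajectory; real input would have thousands of frames.
toy_frames = [
    {"Ser214", "Gly191"},
    {"Ser214"},
    {"Ser214", "Gly191", "His42"},
    {"Ser214"},
    {"Gly191"},
    {"Ser214", "Gly191"},
    {"Ser214"},
    {"Ser214", "Gly191"},
    {"Ser214"},
    set(),
]
print(persistent_contacts(toy_frames))
```

In this toy input Ser214 is present in 8 of 10 frames and Gly191 in 5 of 10, so both pass the 30% cutoff, while His42 (1 of 10) does not.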

Fig. 3

A 100 ns MD simulation plot depicting the PNP synthase and ZINC02525131 interaction. B Details of the four types of interactions shown by PNP synthase residues with ZINC02525131

The interaction plot with Cadiyenol (Fig. 4A) showed some fluctuations at the beginning of the simulation and high variation from 20 to 30 ns and again from 60 to 80 ns. From 90 to 100 ns, the ligand seemed to move away from the protein, but the overall RMSD did not exceed 3 Å on average. It was tightly bound to PNP synthase from 20 to 30 ns, 60–70 ns and 80–90 ns. However, the pattern is not uniform, so this interaction is less stable than that of PNP synthase with ZINC02525131. This reveals a discrepancy between the binding score and the dynamics simulation plot: the binding score of Cadiyenol was higher than that of ZINC02525131, but its binding stability was inferred to be lower. Glu69, Phe130 and His190 retained interactions for more than 30% of the simulation time. Glu69 and His190 formed water bridge (Fig. 4B) and hydrogen bond interactions, while Phe130 showed a hydrophobic interaction.

Fig. 4

A 100 ns MD simulation plot depicting the PNP synthase and Cadiyenol interaction. B Details of the four types of interactions shown by PNP synthase residues with Cadiyenol

ADMET and pharmacokinetics

Good intestinal absorption was seen for the studied compounds, except for ZINC85486932. Cadiyenol, 6-Gingerol and ZINC33833737 had high Caco-2 permeability, indicating high absorption when taken orally. Cadiyenol and 9,11,13-Octadecatrienoic acid were both substrates and inhibitors of P-glycoprotein, meaning that they can bind to it but inhibit its transport of compounds out of the cell. ZINC85486932 seems to have a high tendency to be pumped out of the cell, as it did not inhibit any of the P-glycoproteins. ZINC02525131 had the least tendency to cross the skin, but almost all other compounds also had low skin permeability (values less than − 2.5). The steady-state volume of distribution (VDss) is the theoretical volume in which the total dose would need to be uniformly distributed to give the same concentration as in plasma. The values were neither too high (log VDss > 0.45) nor too low (log VDss < − 0.15). No compound showed high blood-brain barrier (BBB) permeability (none showed logBB > 0.3), and the poorest BBB permeability was seen for ZINC85486932 (logBB < − 1). ZINC33833737 showed some possibility of central nervous system penetration, while ZINC85486932 showed no capability to cross into the central nervous system. Compounds bind to cytochrome P450 (CYP) enzymes for detoxification, excretion or activation. While some compounds were substrates of CYP3A4, none inhibited CYP1A2, CYP2C19, CYP2C9 or CYP2D6. Only 6-Gingerol inhibited CYP3A4. Traditional Indian compounds had better clearance than TCM compounds, and no compound showed Ames toxicity. The highest tolerated doses were for ZINC02525131 and 6-Gingerol. ZINC02525131 showed hepatotoxicity and 9,11,13-Octadecatrienoic acid showed skin sensitization, but none of the compounds was an inhibitor of the hERG I/II potassium channel. Inhibition of this channel causes long QT syndrome, which can lead to ventricular arrhythmia, so compounds that inhibit it should be excluded from further processing. In the past, many drugs have been withdrawn after showing this inhibitory property.
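
The distribution cutoffs quoted above (log VDss of − 0.15 and 0.45, logBB of − 1 and 0.3) can be applied programmatically when many compounds are screened. The following is a minimal sketch under that assumption; the function names, category labels and example input values are hypothetical, and only the numeric thresholds come from the text.

```python
def classify_vdss(log_vdss: float) -> str:
    """Label the steady-state volume of distribution using the quoted cutoffs."""
    if log_vdss > 0.45:
        return "high"
    if log_vdss < -0.15:
        return "low"
    return "moderate"


def classify_bbb(log_bb: float) -> str:
    """Label blood-brain barrier permeability using the quoted cutoffs."""
    if log_bb > 0.3:
        return "readily crosses"
    if log_bb < -1.0:
        return "poorly distributed"
    return "intermediate"


# Hypothetical predicted values for illustration only.
example = {"log_vdss": 0.10, "log_bb": -1.20}
print(classify_vdss(example["log_vdss"]), "/", classify_bbb(example["log_bb"]))
```

Values that fall exactly on a cutoff are assigned to the middle category here; that tie-breaking rule is a choice made for this sketch.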

Gastrointestinal tract absorption and kinetics of the compounds were also simulated. ZINC02525131 and 6-Gingerol had the maximum bioavailability in the healthy state, while Cadiyenol bioavailability improved from ~ 77 to 99% in liver and renal impairment. In the impaired state, absorption of all compounds was equal to or higher than bioavailability. This may be because transit and excretion are treated as continuous processes in PBPK modeling, with the rates of these processes taken as the reciprocals of the individual compartments' transit times. This means that even after the transit time through the stomach, intestine and colon has elapsed, a substantial quantity of drug might still be absorbed in the gut. With enterohepatic recirculation, the absorbed percentage may be higher, as some of the dose is reabsorbed after secretion in the bile. The impaired state may affect this parameter, so values are higher than in the healthy state. In the healthy state, ZINC02525131 required the least time to reach its highest plasma concentration and 9,11,13-Octadecatrienoic acid the most. In the impaired states, this time was broadly in agreement with the healthy state, except for ZINC33833737, which showed a large disparity in the time to reach maximum plasma concentration (an increase from approximately 4–5 h to 10 h and 8–9 h in liver and renal impairment, respectively). ZINC33833737 also had the highest AUC over the 12 h of simulation in the healthy state, while 9,11,13-Octadecatrienoic acid had the lowest AUC in liver impairment. The concentration of ZINC33833737 was the highest among all the compounds in the healthy state, while ZINC85486932 showed the highest concentration in the impaired state.
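
The summary quantities compared here (maximum plasma concentration, the time at which it is reached, and AUC over the simulation window) can be derived from any simulated concentration-time profile. The sketch below uses the linear trapezoidal rule for AUC, which is a common convention but is assumed here rather than taken from the GastroPlus settings; the profile values and units are hypothetical.

```python
from typing import Dict, Sequence


def pk_summary(times_h: Sequence[float], conc: Sequence[float]) -> Dict[str, float]:
    """Return Cmax, Tmax and trapezoidal AUC for a concentration-time profile."""
    if len(times_h) != len(conc) or len(times_h) < 2:
        raise ValueError("need two or more matching time and concentration points")
    auc = 0.0
    for i in range(1, len(times_h)):
        # Area of the trapezoid between consecutive sampling times.
        auc += (times_h[i] - times_h[i - 1]) * (conc[i] + conc[i - 1]) / 2.0
    cmax = max(conc)
    tmax = times_h[list(conc).index(cmax)]  # first time point at which Cmax occurs
    return {"cmax": cmax, "tmax_h": tmax, "auc": auc}


# Hypothetical 12 h profile (time in h, concentration in arbitrary units).
times = [0.0, 1.0, 2.0, 4.0, 8.0, 12.0]
concentrations = [0.0, 4.0, 6.0, 5.0, 2.0, 1.0]
print(pk_summary(times, concentrations))
```

Running the same function on profiles from the healthy, cirrhosis and renal impairment settings would give directly comparable Cmax, Tmax and AUC values for each compound.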

K. negevensis has been isolated from the oral cavity of children [4, 55] and from the vagina in vaginosis [1]. It is known to produce RTX toxin and is thus designated a pathogen [1]. The mode of action of this class of toxins (part of the type I secretion system and acting as a virulence factor) is hemolysis/cytotoxicity through membrane perforation of the host cell [56]. To date, no report of therapeutic targets for K. negevensis exists in the literature. For this reason, a comprehensive subtractive genomics strategy was used to infer the candidates that qualify as druggable. More than 40 such proteins were inferred (Supplementary Table 1). Among these, one (PNP synthase) was selected for further analysis based on its importance, functional role in the cell and novelty. PNP synthase is a homooctameric enzyme that catalyzes the last step of B6 vitamer biosynthesis via condensation of deoxyxylulose-5-phosphate and aminoacetone-3-phosphate. This synthesis is essential to many pathways (such as amino acid metabolism and antibiotic production) [57, 58]. Additionally, because PNP synthase occurs in some bacteria but not in humans and is vital for bacterial survival, this enzyme is an encouraging target for screening antibacterial compounds. The structure of this enzyme in K. negevensis is not yet known but is necessary for docking, so it was modeled using the bioinformatics approach of threading. The full-length PNP synthase model was constructed using an iterative procedure, with similar structural fragments cut out from template protein structures and simulated for our sequence. Secondary structure and B-factor values complemented the 3D coordinate information, indicating which residues had helix or sheet architecture and which were flexible or rigid. Helices comprised the major portion of the protein, followed by coils and strands. The residues were flexible mostly in coil regions, with the most flexibility at the C- and N-terminal regions. Active site residues were predicted and used for screening natural product inhibitors (see Table 2).

Table 2 ADMET parameters of the studied compounds

Virtual screening of natural products is a swift strategy that involves modeling the interaction between drug and protein, with the favored pose having the lowest energy and a stable configuration. This strategy was therefore adopted, and natural products of traditional Indian and Chinese medicine origin, with a large structural and physicochemical variety, were screened. Docking revealed the binding conformations of PNP synthase with the natural product compounds. The use of natural products derived from Ayurvedic and traditional Chinese medicinal plants dates back to ancient times and continues in some parts of the world [59]. They have seen a revival in the cheminformatics-based drug mining literature, with new studies looking for natural product-based inhibitors against pathogens [30, 60,61,62]. Structure docking of TCM and Ayurvedic compounds against PNP synthase was performed to identify potent inhibitors. Previously, Ahmad et al. reported the compound 2-acetyl-3-(2-heptanamidoethyl)-1H-indol-6-yl heptanoate as a high-affinity inhibitor of this enzyme in Yersinia enterocolitica, identified through computational screening [63]. We prioritized six compounds from traditional medicinal plants/herbs based on the software's scoring functions, and ADMET profiling revealed 6-Gingerol as the most readily bioavailable and safe. Its medicinal properties have been demonstrated previously as well [52, 53]. Some scientists date the use of Zingiber officinale as a food condiment to more than 2000 years ago [64]. Its use has been reported in both Indian and Chinese medicine [65]. This rhizome is “generally recognized as safe” by the Food and Drug Administration, so the constituents of this condiment are considered nontoxic for consumption [66]. Its antibacterial properties have been demonstrated previously, and it is an encouraging substitute for synthetic antimicrobials [67,68,69]. We suggest that 6-Gingerol be tested further in the laboratory, on cell lines and in mouse models, for targeted antibacterial action against K. negevensis (see Table 3).

Table 3 PBPK parameters of the studied compounds

Conclusion

K. negevensis possesses a small genome (~ 2 Mb) but hosts several virulence factors and causes disease in children. A case of vaginosis in an adult patient has also been reported. The bacterium is currently understudied, and its similarity to and co-occurrence with K. kingae in many cases make it additionally difficult to study separately and to link causation with the symptoms of the resulting disease. This is why few data are available in the literature regarding this bacterium. Only three genome sequences are present in the public NCBI database. In this study, therapeutic targets were mined from a strain isolated from the pharynx of a child, and structure modeling and virtual screening of a key drug target, PNP synthase, were carried out using a biophysical approach. On the basis of PBPK and ADMET analysis, we propose that 6-Gingerol be pursued further as an antibacterial compound against K. negevensis.