Identification of potent HDAC 2 inhibitors using E-pharmacophore modelling, structure-based virtual screening and molecular dynamic simulation

Histone deacetylase 2 (HDAC 2) of class I HDACs plays a major role in embryonic and neural developments. However, HDAC 2 overexpression triggers cell proliferation by diverse mechanisms in cancer. Over the decades, many pan and class-specific inhibitors of HDAC were discovered. Limitations such as toxicity and differential cell localization of each isoform led researchers to hypothesize that isoform selective inhibitors may be relevant to bring about desired effects. In this study, we have employed the PHASE module to develop an e-pharmacophore model and virtually screened four focused libraries of around 300,000 compounds to identify isoform selective HDAC 2 inhibitors. The compounds with phase fitness score greater than or equal to 2.4 were subjected to structure-based virtual screening with HDAC 2. Ten molecules with docking score greater than  -12 kcal/mol were chosen for selectivity study, QikProp module (ADME prediction) and dG/bind energy identification. Compound 1A with the best dock score of  -13.3 kcal/mol and compound 1I with highest free binding energy,  -70.93 kcal/mol, were selected for molecular dynamic simulation studies (40 ns simulation). The results indicated that compound 1I may be a potent and selective HDAC 2 inhibitor. Further, in vitro and in vivo studies are necessary to validate the potency of selected lead molecule and its derivatives. Graphical abstract Supplementary Information The online version contains supplementary material available at 10.1007/s00894-022-05103-0.


Introduction
Chromatin comprises the repeating units of nucleosomes and facilitates to embed the DNA in the core. Chromatin consists of H2A, H2B, H3 and H4 histone proteins (two molecules of each histone proteins form an octamer) crossing roughly 147 bp of DNA has been subjected to intense research [1]. For the past few years, there has been significant progress in our knowledge on the different types of histone modifications such as phosphorylation, methylation, acetylation and ubiquitination. Acetylation and deacetylation modifications of lysine present in H3 and H4 are facilitated by the activities of histone acetyltransferases (HATs) and histone deacetylases (HDACs) respectively. The activity of HATs is associated with a relaxed chromatin structure and transcription promotion, while HDACs activity promotes compact chromatin structure and downregulates the transcription. Till now, eighteen diverse types of HDACs are discovered and classified into four groups: class I (HDAC 1, 2, 3 and 8), class II (HDAC 4, 5, 6, 7, 9 and 10), class III (sirtuins: NAD-dependent enzymes) and class IV (HDAC 11). These isoforms are recognised as potential therapeutic targets due to their significant role in different diseases such as cancer, inflammation, neurological and lung disorders [2,3].
Histone deacetylase inhibitors (HDACIs) can induce cell cycle arrest and apoptosis. Hence, they are attracted considerable interest as therapeutically potent scaffolds [4]. Recent data suggest that HDACIs enhance cognitive ability and repair neurodegenerative impairment thereby helps to re-establish long-term memory [5]. The drugs that have been discovered to date are pan inhibitors that target all the HDAC isoforms or class selective inhibitors. Few pan HDAC inhibitors such as suberoylanilide hydroxamic acid (SAHA), belinostat and panobinostat have been approved to treat cutaneous T-cell lymphoma, peripheral T-cell lymphoma and multiple myeloma, respectively. Some of the HDAC class-specific inhibitors reported are nanatinostat (class I) and rocilinostat (class II) [6]. These inhibitors affect the global acetylation and deacetylation process which in turn alters large number of genes. Even though pan inhibitors are potent therapeutics, they can be toxic due to simultaneous inhibition of several isoforms. Therefore, second-generation HDAC inhibitors are focused on isoform-selective compounds and can serve as potent drug molecules for several diseases [7].
Current work focuses on HDAC 2, which belongs to class I HDACs. HDAC 2 plays a vital role in embryonic, neural development and cardiac functions [8]. However, literature survey has shown that HDAC 2 is overexpressed leading to proliferation of oral, breast and colon cancers [9][10][11]. Aberrant HDAC 2 influences the expression of tumour suppressor genes such as p21 (WAF1/Cip1) and p53 [12,13]. HDAC 2 is downregulated by microRNAs such as miR-200 and miR-145 and promotes apoptosis [14,15]. Moreover, selective HDAC 1/HDAC 2 inhibition induces neuroblastoma differentiation and reduces cell viability [16]. Furthermore, HDAC 2 modulates synaptic plasticity and longlasting changes of neural circuits may negatively regulate the processes of learning and memory [17]. Aberration in HDAC 2 affects several pathways such as NF-kB and STAT1 signalling [18,19]. Even though HDAC 2 plays a crucial role in various diseases, till date no compounds have been approved as an isoform selective HDAC 2 inhibitor. Therefore, recently many researchers have focused on the development of isoform selective HDAC 2 inhibitors [20].
Computational methods help to reduce drug development cost by screening large databases. It analyses the interaction between the ligand and the protein in a biological environment. Many novel lead molecules were discovered using virtual screening [21][22][23]. Despite being a promising approach in drug discovery, very few drugs based on virtual screening have entered into clinical studies and these include PRX-03140 (phase IIB) and PRX-08066 (phase IIA) to treat Alzheimer's disease and pulmonary hypertension, respectively [24]. Similar approaches were used to identify potent HDAC inhibitors. A comparative structure and ligand-based in silico study was carried out to explore the structural requirements of isoform selective HDACIs [25]. In another study, non-hydroxamic acid based HDAC inhibitors were recognised and the lead compound obtained was evaluated for its HDAC inhibitory and anticancer activity in vitro [26]. HDAC 8 isoform selective inhibitors were identified by in silico study, using 167,000 molecules and the three best hits were filtered based on factors such as rule of five, presence of zinc-binding groups (ZBG), binding pattern and pharmacophore models. The hits were then subjected to in vitro enzyme inhibition assays [27]. Similarly, in another study, potent HDAC 8 inhibitors were identified from a library of 4.3 × 10 6 molecules [22]. Novel HDAC 6 selective inhibitors were discovered from library of 330,000 compounds and tested for HDAC 6 inhibition and cytotoxicity [28]. HDAC 2 selective inhibitors were identified using quantum polarised ligand docking, pharmacophore generations and binding free energy calculation [29]. In another study, 3D QSAR pharmacophore generation and structure-based virtual screening were used to recognise HDAC 2 selective inhibitors [21].
In the present study, we have combined e-pharmacophore, structure-based virtual screening, free binding energy calculation and molecular dynamic simulation to discover novel HDAC 2 selective inhibitors.

Protein preparation for docking studies
All the studies were performed with Maestro version 11.4 (Schrodinger Inc.) The crystal structures of different isoforms of HDACs were retrieved from the Protein Data Bank (Table 1).
These structures were subjected to protein preparation wizard (PPW), missing hydrogens were added and the metal ionisation state was corrected to maintain the formal charge and force field [30]. Protein preparation wizard has three-step workflow as follows: (1) pre-processing, (2) review and modify and (3) minimise. In the pre-processing steps, the PPW tool automatically identifies any problem with the imported protein structure, like missing hydrogen atoms, missing side chains and missing loops, and rectifies them as per its inbuilt algorithm. PPW can assign bond orders, create zero-order bonds to metals and create disulphide bonds. PPW employs integrated prime functionality to fill missing side chains or loops. In the second step, it can generate het states using Epik at any specified pH. For minimization, OPLS3e force field was used [31]. The crystal structure of HDAC 5 (PDB ID: 5UWI) did not had Zn 2+ ion and crystal structures of HDAC 9 and 11 were unavailable in the Protein Data Bank. Therefore, we have obtained 3D protein model for HDAC 5 (Q9UQL6), HDAC 9 (Q9UKV0) and HDAC 11 (Q96DB2) using SWISS-MODEL [32] (https:// swiss model. expasy. org/).

Database selection and ligand preparation
We have utilised four different databases: (a) targeted oncology from Asinex (6,728 compounds), (b) National Institute of Health (NIH) from National Cancer Institute (NCI) (237,000 compounds), (c) screened compounds from Maybridge (4,107 compounds) and (d) HDAC inhibitors from Maybridge (53,352 compounds). These compounds were subjected to Ligprep module (version 44,011) to generate 3D structures with low energy retaining the originality of chirality and ionisation. Force field applied was OPLS 2003e to produce minimal energy structures with corrected chirality [30]. SAHA and MS-275 are known pan and class I HDAC inhibitors, respectively and considered as positive controls.

E-pharmacophore modelling and virtual screening
E-pharmacophore-based virtual screening combines structure and ligand-based approaches and carried out by PHASE module maestro Schrodinger, which is used to screen the compounds based on e-pharmacophore generated [33,34]. In the current study, we generated e-pharmacophore by using co-crystallised ligand 4-(acetylamino)-N-[2-amino-5-(thiophen-2-yl)-phenyl]-benzamide (selective for HDAC 1 and HDAC 2 isoforms) with the HDAC 2 protein [35]. The compounds which fulfil the hypothesis were selected for the further study.

Structure-based virtual screening
Grid box of HDAC 2, 3, 8, 4 and 7 proteins was made at the site of a co-crystallised ligand. SiteMap module comprises of an algorithm to locate binding sites and can be used to setup grid boxes [36,37]. Active sites of HDAC 1, 6 and 10 were identified using the SiteMap module. Since FDAapproved HDAC inhibitors chelate Zn 2+ ion to inhibit its activity, we have considered amino acids surrounding Zn 2+ ion to create the grid box of modelled proteins. HDAC 5 protein grid box was generated including PRO  Glide offers rapid vs precision options ranging from highthroughput virtual screening (HTVS) -capable of screening large compound libraries, standard precision (SP) -up to hundreds of compounds with high precision and extra precision (XP) -highly accurate models eliminating false positives [38][39][40].

Free binding energy calculation (MM-GBSA)
Docking of identified ligands demonstrates efficient binding to the active site of protein. However, the protein-ligand association should continue in the same state to promote any potential biological response. These responses mainly depend upon the free binding energy. Therefore, ten best hits are subjected to prime module to determine the free binding energy. ADME prediction ADME (absorption, distribution, metabolism and excretion) prediction is discovered to reduce the late stage failures in the drug development process. We have used QikProp version 5.4 of Maestro module to predict the ADME properties of best ten hits. Properties such as molecular weight, donor HB (number of H-bond donors), acceptor HB (number of H-bond acceptors), QPlogPO/w (Octonol/water partition coefficient), QPlogS (predicted aqueous solubility), QPP-Caco (Caco-2 permeability), QPlogBB (blood/brain partition coefficient), HOA (qualitative human oral absorption value ranges from 1 for low, 2 for medium and 3 for high), PHOA (percentage of human absorption), QPlogKhsa (binding to human serum albumin), ROF (rule of five), PSA (polar surface area), Metab (number of likely metabolic reactions) and ROT (rule of three) were measured.

Molecular dynamics simulation
Molecular dynamics (MD) simulation is done to overcome the disadvantages of molecular docking studies. It provides flexible receptor-ligand interaction by solvating the system. Compounds IA and 1I, with high dock score and with high dG/binding score, respectively, were subjected to molecular dynamics simulation for the better understanding of the stability of protein-ligand interactions. MD simulation studies were performed using Desmond module of Schrodinger. It has a three-step workflow where system builder was the first step. In this step, ligand-protein complex was solvated using simple point charge (SPC) solvent model in an orthorhombic box shape. SPC is a three-site solvent model widely used in MD simulations studies for small molecule-protein complexes. SPC assumes an ideal tetrahedral shape (HOH angle of 109.47 °) instead of the observed angle of 104.5 °. Second step was the minimisation of the solvated ligand-protein complex using steepest descent (SD) method with maximum iterations fixed in 2000 and convergence threshold at 1 kcal/ mol/Å. Slow relaxation protocol was followed for the minimised complex, and it was calibrated at a temperature and pressure of 300 K and 1 bar, respectively. Nose-Hoover method was used as thermostat and Martina-Tobias-Klein method was used as barostat. The last step was simulating this minimised complex for 40 ns. A frame was captured every 40 ps and thus a total of 1000 frames were generated. RMSD plots, RMSF plots, ligand interaction diagrams, histogram plots etc. were generated to analyse the results of MD simulation studies [41].

E-pharmacophore modelling and virtual screening
The ligand 4-(acetylamino)-N-[2-amino-5-(thiophen-2-yl)phenyl]-benzamide, which is co-crystallised with HDAC 2, is used as a reference for the e-pharmacophore generation [35]. E-pharmacophore hypothesis consists of three aromatic rings (R7, R8 and R9), one H-bond acceptor (A2) and one H-bond donor (D4) (Fig. 1). This hypothesis is used to form a basic skeleton of compounds with specific angle and distance which is likely to bind to HDAC 2. Formed e-pharmacophore is used to screen the libraries of compounds and subjected to Ligprep module, which generated 17,054; 443,161; 8,210 and 83,797 compounds from targeted oncology database, National Cancer Institute database, HDAC inhibitory compounds collection and screening collection database from Maybridge, respectively. Compounds matching with minimum of three criteria in the hypothesis were selected. Among these, compounds with phase fitness score more than or equal to 2.4 were chosen for structure-based virtual screening. Targeted oncology produced 6,124 compounds. However, all the compounds had phase fitness score less than 2.4. Hence, 17,054 compounds generated in Ligprep were directly subjected to structurebased virtual screening. Similarly, 7 out of 38,578 and 79 Fig. 1 A five-feature e-pharmacophore (RRRAD) model generated using PHASE module for selective HDAC 2 inhibitor illustrating hydrogen bond acceptor (pink sphere), hydrogen bond donor (sky blue sphere) and aromatic ring (orange rings). A green area indicated inter-site angle between features and B purple lines indicate inter-site distance between features out of 4,127 compounds were selected from NCI and HDAC inhibitory compounds collection database from Maybridge. All the 6,088 compounds obtained from Maybridge screening database had phase fitness score less than 2.4 and hence not considered for further analysis.

Structure-based virtual screening and MM-GBSA
17,054 compounds selected from targeted oncology database from Asinex were subjected to HTVS mode of docking. One hundred forty two compounds with dock score more than or equal to -9 kcal/mol were chosen for SP docking. Fifty compounds with SP dock score more than or equal to -10 kcal/ mol were subjected to XP docking. Finally, 4 compounds (1A to 1D) with XP dock score above -12 kcal/mol were selected. Similarly, 86 compounds from NCI and HDAC inhibitors from Maybridge were subjected to XP mode of docking and 6 compounds (1E to 1 J) with XP dock score more than -12 kcal/mol were selected. Chemical structures of best 10 hits (Fig. 2) are shown below.
To check the selectivity, ligands were docked with other HDAC isoforms. XP docking scores of the selected ligands with HDAC 2 were greater than -12 and greater than the positive controls, SAHA and MS-275 (-11.6 kcal/ mol) ( Table 2). Compounds showed higher docking scores towards HDAC 2, compared to all other HDAC isoforms. The XP dock score with HDAC isoforms ranges from -2.0 to -13.3 kcal/mol. To validate the docking study, the co-crystallised ligand of HDAC 2 in the PDB ID is redocked and RMSD value was found to be 0.25 Å. The ligand poses and the interacted amino acids are compared between X-ray crystallography and redocked ligand-protein complex (Figs. S1 and S2).
Compound 1A showed maximum XP dock score with the value of -13.3 kcal/mol, followed by compounds 1E and 1B with the values of -12.8 kcal/mol 1 and -12.6 kcal/mol, respectively. Among the ten hits, compound 1I showed better isoform selectivity towards HDAC 2 with dock score of -12.1 kcal/mol and dock score less than -7 kcal/mol for other isoforms. Best 10 hits were subjected to MM-GBSA analysis with HDAC 2 and free binding energy was determined ( Table 2). Compound 1I showed better dG/binding energy of -70.9 kcal/mol, followed by compound 1J with the value of -70.3 kcal/mol −1 . 2D interaction of 10 best hits with HDAC 2 was compared to SAHA and MS-275 (Fig. 3).

ADME prediction
ADME prediction was carried out by QikProp module and showed that the compounds selected could be promising HDAC 2 inhibitors (Table 4). All the compounds had recommended values for molecular weight (Mol. wt), hydrogen bond donor ability (donorHB), hydrogen bond acceptor ability (AccptHB), water/gas partition coefficient (QPlogPo/w) and aqueous solubility (QPlogS). Caco-2 cell permeability is predicted by QPPCaco and ranges from 41.69 to 851.00.
The compounds 1F and 1I have shown great permeability with 851 nm/s and 606 nm/s, respectively. The blood-brain partition coefficient is predicted by QPlogBB. All the compounds listed have shown the ability to cross the blood-brain barrier with values ranging from -0.05 to -1.62. All the best hits showed HOA value 3 indicating better oral absorption. PHOA ranges from 74 to 100, PSA ranges from 61.16 to 115.48 and QPlogKhsa ranges from -0.07 to 1.11. None of the compounds violated the rule of five and two compounds (1A and 1B) violated the rule of three (ROT).

Molecular dynamic simulation
Compound 1A with best XP dock score (-13.3) and compound 1I with best MM-GBSA score (-70.9) were exposed to molecular dynamic simulation for 40 ns. Overall, 1000 frames were generated in the trajectory. Protein-ligand interaction stability throughout the simulation was studied by RMSD (root mean square deviation) analysis. Figure 4A demonstrates RMSD for 1A-HDAC 2 complex and was almost stable throughout the simulation. However, slight drift was observed at 2 to 11 ns, 21 to 28 ns and 35 to 37 ns. Figure 4B demonstrates the conformational changes taking place along the HDAC 2 protein side chain. RMSF (root mean square fluctuation) data of protein depicts the flexibility from 0.40 to 2.3 Å. Ligand-protein interactions were analysed throughout the simulation. XP docking protein-ligand interaction of compound 1A and MD simulation protein-ligand interaction were compared. It retained hydrogen bond interaction with GLY 154, hydrophobic interaction with PHE 155, PHE 210 and TYR 308, charged negative interaction with ASP 181 and ASP 269, polar interaction with HIS 145, HIS 146 and HIS 183 from XP docking. In  addition, it also showed, H-bond interaction with HIS 154 and pi-pi stacking with HIS 146 during molecular dynamic simulation. Interactions of compound 1A-protein complex is shown in Fig. 4C and D. Figure 5A depicts the RMSD analysis of 1I -HDAC 2 complex which was stable throughout the simulation. Figure 5B demonstrates the conformational changes taking place in the HDAC 2 protein side chain. Root mean square   Fig. 5C and D. Both the compounds exhibited stable protein-ligand complex throughout 40 ns simulation. Compound 1I has maximum MM-GBSA score and showed stable interaction with HDAC 2 protein. Even though it has lesser XP dock score than 1A, its dock score was more than the positive control. Also, dock score of 1I with other HDAC isoforms were less than seven indicating that it may be a best HDAC 2 isoform selective inhibitor among the selected compounds.

Conclusions
In this study, we have identified ten best compounds as potent HDAC 2 inhibitors and among these compounds 1I can be more selective towards HDAC 2 compared to other HDAC isoforms. We designed an e-pharmacophore model using ligand co-crystallized with HDAC 2 protein (PDB ID: 4LY1) and compounds were subjected to e-pharmacophore-based virtual screening. Filtered compounds were subjected to structure-based virtual screening. Based on the docking scores, ten best hits were selected. All the hits showed better XP docking score with the minimum value of -12 and better than SAHA and MS-275. The best hits were subjected to virtual screening against other HDAC isoforms. Further, they were subjected to ADME and MM-GBSA score prediction. Based on docking score, ADME and MM-GBSA results, all the hits were efficient to be developed as potent HDAC 2 inhibitors. However, two best hits, one with top docking score and the other one with top MM-GBSA score were selected and subjected to molecular dynamic simulation. Molecular dynamic simulation of these compounds exhibited stable protein-ligand interaction throughout the simulation. Further, by validating the potency of selected lead molecules, this study could aid in developing selective HDAC 2 inhibitors.  Author contribution K. S. Babitha: conceptualisation, supervision, resources, methodology, data curation, writing-original draft preparation, writing-reviewing and editing, final approval of the manuscript. Padmini: methodology, data curation, writing-original draft preparation, writing-reviewing and editing, final approval of the manuscript.
Avinash Kumar: resources, methodology, software, investigation, writing-reviewing and editing, final approval of the manuscript.
Manasa Gangadhar Shetty: methodology, data curation, writing-original draft preparation, writing-reviewing and editing, final approval of the manuscript.
Suvarna Ganesh Kini: resources, methodology, software, investigation, writing-reviewing and editing, final approval of the manuscript.
Manoj Bhat Krishna: resources, methodology, software, investigation, writing-reviewing and editing, final approval of the manuscript.
Kapaettu Satyamoorthy: supervision, resources, methodology, data curation, writing-reviewing and editing, final approval of the manuscript.

Data availability Not applicable.
Code availability Not applicable.

Declarations
Ethics approval Not applicable.

Consent to participate Not applicable.
Consent for publication All the authors gave their consent for publication.

Conflict of interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated C and D represents bar graph and 2D interaction between ligand and protein throughout trajectory otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.