Machine learning approaches in designing anti-HIV nitroimidazoles: 2D/3D QSAR, kNN-MFA, docking, dynamics, PCA analysis and MMGBSA studies
⁎Corresponding authors. palladiumsalt@gmail.com (Mohd Usman Mohd Siddique), nanomedicine96@gmail.com (Tasneem Khan) tasneem.khan_mpharm@jamiahamdard.ac.in (Tasneem Khan)
This article was originally published by Elsevier and was migrated to Scientific Scholar after the change of Publisher.
Abstract
In this study, 20 newly synthesized nitroimidazole derivatives were subjected to 2D and 3D quantitative structure-activity relationship (QSAR) studies to investigate their anti-HIV activity against both the ROD and IIIB strains. The resulting hypothesis was then tested virtually through further in silico studies. The statistically significant 2D-QSAR models gave r2 values of 0.9241 for the IIIB strain and 0.9412 for the ROD strain, with corresponding q2 values of 0.7706 and 0.8299, respectively. Further models were constructed using three kNN-MFA 3D-QSAR variable selection approaches: stepwise forward-backward (SW-FB), simulated annealing (SA) and genetic algorithm (GA). Using the generated hypothesis, new nitroimidazole analogues were designed, and molecular modelling studies were conducted to test the hypothesis. Three of the designed molecules displayed good docking scores compared with the reference molecule. The stabilities of the docked complexes were analyzed by MD simulations and MMGB/SA calculations. These results offer design guidance for the synthesis of novel anti-HIV compounds and suggest directions for future pharmaceutical research.
Keywords
HIV
2D/3D QSAR
kNN-MFA
Molecular docking
MD simulations
MMGB/SA calculations
1 Introduction
Acquired immunodeficiency syndrome (AIDS) remains one of the most serious challenges facing healthcare professionals today. Since it first appeared in the early 1980s, AIDS has changed from a mysterious and frequently fatal illness to a chronic but controllable disease. The human immunodeficiency virus (HIV), a lentivirus that mainly targets the CD4 T cells of the immune system and impairs the body's capacity to fight off infections and diseases, is the primary cause of AIDS (Deeks et al., 2015; Fauci et al., 1996). The virus can be passed from mother to child during childbirth or breastfeeding, as well as through unprotected sexual contact and contaminated needles. According to the most recent data available, about 38 million individuals are living with HIV/AIDS globally, and the disease has claimed millions of lives since it first appeared. Significant advances in the understanding of the virus and in the development of effective therapeutic agents have been made over the previous four decades (Gardner et al., 2011; Trickey et al., 2017). HIV therapy has advanced dramatically over time, and antiretroviral therapy (ART) has reduced the severity of the disease for many people and made it manageable. Without therapy, HIV infection can progress to AIDS, a condition in which the immune system is severely compromised, leaving the body vulnerable to many infections and malignancies. Effective medication or vaccination is therefore needed for the proper treatment of HIV infection and AIDS. Despite significant progress in developing effective anti-HIV medication, the emergence of drug-resistant strains and the need for better therapeutic alternatives continue to highlight the necessity of continued research (Nastri et al., 2023; Richman et al., 2009). HIV is treated by targeting various stages of the viral life cycle or components of the virus.
Antiretroviral therapy has proven to be the most effective clinical approach for treating HIV infection, and it frequently targets the HIV protease enzyme, which is essential to the viral life cycle. With the aim of improving treatment effectiveness, lessening side effects and eventually moving closer to the objective of an HIV-free world, we set out to design new nitroimidazole derivatives (Cohen et al., 2011; Lv et al., 2015; Qian et al., 2009). An enzyme is a biomolecule, usually a protein, that acts as a catalyst in biological reactions, enabling and accelerating the chemical reactions necessary for diverse cellular processes. Enzymes interact with particular substrates and transform them into products without being consumed in the process. They play a vital role in the physiology of living organisms and in the replication of most viruses (Agarwal, 2006; Cooper, 2000; Walsh, 2014).
The therapeutic potential of substituted nitroimidazoles in numerous medical applications has made them a popular class of compounds, and they have been used against various diseases for decades. Anticancer drugs of this class include dacarbazine (DTIC), temozolomide and misonidazole (Bartroli et al., 1992; Biskupiak et al., 1991). The most commonly used antifungal agents bearing a nitroimidazole moiety include clotrimazole and metronidazole (Ross et al., 1973). Capravirine, a clinically evaluated antiviral agent, also contains the imidazole ring (Fujiwara et al., 1998). Some nitroimidazole compounds have been found to act as powerful and selective histamine H-3 receptor agonists (Eriks et al., 1992; Kovalainen et al., 1999), inhibitors of mitogen-activated protein (MAP) kinases (Oeztuerk-Winder and Ventura, 2012), inhibitors of nitric oxide synthase (Salerno et al., 1999) and antibacterial agents (Castelli et al., 2000). In addition, a class of these compounds known as 5-nitro-substituted haloimidazoles has shown significant biological activity as potential radiosensitizers (Adams et al., 1979). Thus, optimizing or altering the nitroimidazole structure may open a new path for the design and discovery of imidazole derivatives as inhibitors of viral protease enzymes.
We used computer-aided drug design for the development and assessment of novel molecules, with the quantitative structure–activity relationship (QSAR) approach as the central strategy (Aouidate et al., 2018; Chtita et al., 2016). QSAR is a mathematical approach that links structural characteristics to the related pharmacological activity (Er-rajy et al., 2023; Ghamali, 1988; Halgren and Nachbar, 1996; Kumar et al., 2024; Saha et al., 2023). Moreover, the development of scientific knowledge, the discovery of new medicines, the design of new drugs and many other research projects all benefit greatly from molecular modelling studies (Aouidate et al., 2017; Chtita et al., 2022; El Rhabori et al., 2024; Mitra et al., 2024; Tropsha et al., 2024). However, the preliminary data obtained from an HTVS protocol or from molecular docking studies often need validation, because these methods treat the protein as a rigid structure. In contrast, the protein structure is dynamic and exists in a solvated state (Abchir et al., 2023; Chalkha et al., 2022). Molecular dynamics simulation is therefore an important computational technique for obtaining better-validated results. These simulations provide information about the behavior of molecules at the atomic and molecular levels and help researchers understand the dynamics, interactions and stability of drug candidates and their targets (Chtita et al., 2022; Daoui et al., 2023). Molecular dynamics simulations are thus a crucial tool in drug design and discovery, giving a deeper knowledge of the molecular interactions that underlie efficacy and safety. They bridge the gap between theoretical understanding and experimental findings, providing molecular insights that could be difficult or time-consuming to obtain through laboratory research alone (Golbraikh and Tropsha, 2003; Aouidate et al., 2017; Baumann, 2002; De Vivo et al., 2016; Golbraikh and Tropsha, 2000; Moorthy et al., 2011; Salo-Ahen et al., 2020). The goal of the current study was to develop statistically significant two-dimensional, group-based, k-nearest neighbor and pharmacophore-based QSAR models for anti-HIV activity and to design new nitroimidazole derivatives. These techniques made it possible to examine specific chemical substitution sites and to create statistically significant QSAR models for nitroimidazole derivatives as anti-HIV drugs. The proposed hypothesis was then tested virtually by ligand matching, molecular docking, MD simulation studies and MMGB/SA calculations in order to develop a new anti-HIV drug candidate (Aouidate et al., 2017; Elkaeed et al., 2024; Ji et al., 2020; Muhammed and Aki-Yalcin, 2024; Patel et al., 2024; Vinjoda et al., 2024).
2 Materials and methods
The detailed workflow is shown in Fig. 1. All 2D- and 3D-QSAR calculations and molecular modelling investigations were carried out on a Dell workstation using the VLife Molecular Design Suite (VLife MDS), version 4.2. Molecular docking studies were conducted with the Glide module of the Schrödinger software. The docked complexes were then subjected to MD simulation using Desmond v3, developed by the D. E. Shaw Research group. Finally, binding free energies were calculated by the MMGB/SA method in the Prime module of the Schrödinger software, using the last 30 frames generated by the MD simulations. The individual methodologies are described below.
Fig. 1. Workflow diagram of the proposed study.
2.1 Dataset and biological activity
The QSAR analysis was conducted on a series of 20 substituted nitroimidazole compounds reported as highly selective and potent non-nucleoside reverse transcriptase inhibitors (Alderton et al., 2001). To relate activity to the structural characteristics (physicochemical descriptors), the micromolar EC50 values of the compounds were converted to logarithmic values (log EC50). Table 1 lists the substituents of the 20 compounds together with the biological activity values used to generate the 2D and 3D QSAR models.
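Table 1. Substituents (R) and anti-HIV activity of the 20 nitroimidazole derivatives against the IIIB and ROD strains.

| Sr. No. | R | EC50 IIIB (µM) | log EC50 IIIB | EC50 ROD (µM) | log EC50 ROD |
|---|---|---|---|---|---|
| 1 | -CH2Cl | 0.41 | −0.38 | 0.50 | −0.30103 |
| 2 | -CH2-S-Ph | 1.74 | 0.24055 | 2.74 | 0.43775 |
| 3 | -CH2-S-p-Cl-Ph | 2.17 | 0.33646 | 1.94 | 0.2878 |
| 4 | -CH2-S-pOMe-Ph | 6.02 | 0.7796 | 5.29 | 0.72346 |
| 5 | -CH2-S-benzyl | 7.64 | 0.88309 | 6.48 | 0.81158 |
| 6 | -CH2-S-p-Cl-benzyl | 5.95 | 0.77452 | 5.81 | 0.76418 |
| 7 | -CH2-S-naphthyl | 4.48 | 0.65128 | 3.99 | 0.60097 |
| 8 | -CH2-S-CH2-CO2-Et | 2.48 | 0.39445 | 10.64 | 1.02694 |
| 9 | -CH2-SO2-naphthyl | 17.70 | 1.24797 | 14.50 | 1.16137 |
| 10 | -CH3 | 63.70 | 1.80414 | 63.10 | 1.80003 |
| 11 | p-Cl-Ph | 21.20 | 1.32634 | 22.50 | 1.35218 |
| 12 | p-F-Ph | 57.50 | 1.75967 | 63.20 | 1.80072 |
| 13 | p-NO2-Ph | 84.70 | 1.92788 | 86.00 | 1.9345 |
| 14 | p-OMe-Ph | 64.70 | 1.83059 | 71.80 | 1.85612 |
| 15 | 2,4-Cl2-Ph | 14.60 | 1.16435 | 14.50 | 1.16137 |
| 16 | 2-thiophene | 49.00 | 1.6902 | 54.80 | 1.73878 |
| 17 | -(CH2)2CO2Me | 29.80 | 1.47422 | 38.10 | 1.97174 |
| 18 | -(CH2)2CONH2 | 78.90 | 1.89708 | 93.70 | 1.97174 |
| 19 | O-9-methyl-9H Carbazole | 11.30 | 1.05308 | 9.24 | 0.96567 |
| 20 | 3-ethyl-1H-indole | 15.20 | 1.18184 | 15.40 | 1.18752 |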
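The logarithmic activity values in Table 1 follow from the micromolar EC50 values by a base-10 logarithm. The short Python sketch below illustrates that conversion; the function name `log_ec50` is an assumption made for illustration and is not part of the original workflow.

```python
import math


def log_ec50(ec50_um: float) -> float:
    """Return log10 of an EC50 value expressed in micromolar."""
    if ec50_um <= 0:
        raise ValueError("EC50 must be a positive concentration")
    return math.log10(ec50_um)


# Compounds 2 and 10 of Table 1 (IIIB strain), used only as a spot check.
for ec50 in (1.74, 63.70):
    print(f"EC50 = {ec50} uM -> log EC50 = {log_ec50(ec50):.5f}")
```

Because the values are logarithms of micromolar concentrations, a lower (or negative) value corresponds to a more potent compound.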
2.2 Dataset division into training and test sets
Compounds were sketched with the 2D draw application and converted into 3D structures. Energy minimization and geometry optimization were performed in batch mode using the Merck molecular force field and its atomic charges, with a dielectric constant of 1 for the medium, a maximum of 1000 cycles and a convergence criterion (RMS gradient) of 0.01. To support the accuracy of the QSAR model, a conformational search was conducted on each energy-minimized structure using a strategy similar to the RIPS approach, in which the coordinates of each atom in the molecule are perturbed to generate fresh molecular conformations. When creating the training and test sets, we took care that the test set compounds had structures and a range of biological activities similar to those in the training set. To achieve this we used the sphere exclusion method (Bhadoriya et al., 2013; Bhatia et al., 2017), in which the sphere radius is determined by a dissimilarity value (Table 1). The 20 molecules were thereby divided into a training set of 14 compounds and a test set of 6 compounds. Random selection was also used to assign compounds to the training and test sets. This division was essential in constructing and validating the model.
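The idea behind sphere exclusion can be sketched in a few lines of Python. This is a simplified illustration and not the VLife MDS implementation: the function name, the Euclidean distance in descriptor space and the rule of taking the next unassigned compound as the sphere centre are all assumptions.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def sphere_exclusion(descriptors, radius):
    """Split compound indices into (training, test) lists.

    descriptors: one numeric vector per compound.
    radius: dissimilarity radius in descriptor space (a tunable assumption).
    """
    remaining = list(range(len(descriptors)))
    training, test = [], []
    while remaining:
        center = remaining.pop(0)      # next unassigned compound seeds a sphere
        training.append(center)
        still_free = []
        for idx in remaining:
            if euclidean(descriptors[center], descriptors[idx]) <= radius:
                test.append(idx)       # inside the sphere: held out for testing
            else:
                still_free.append(idx)
        remaining = still_free
    return training, test


# Toy one-descriptor example (hypothetical values).
toy = [[0.0], [0.1], [1.0], [1.05], [2.0]]
print(sphere_exclusion(toy, radius=0.2))
```

A larger radius places more compounds inside each sphere and therefore yields a larger test set, which is why the dissimilarity value controls the training/test ratio.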
2.3 Molecular modeling
2.3.1 2D-QSAR studies and calculation of 2D descriptor
As part of the 2D-QSAR study, the QSAR Plus module of VLife MDS computed roughly 239 physicochemical descriptors. Descriptors that were invariant across all molecules were removed because they carry no information for the QSAR. A genetic algorithm was used for variable selection. Different molecular structure characteristics were encoded using 2D descriptors such as the retention index (chi), the atomic valence connectivity index (chi V), molecular weight, chain path count, path count, path cluster, cluster, semi-empirical descriptors, element count, estate number, topological index, slogP and molecular refractivity. Alignment-independent (AI) descriptors were also calculated (Bhadoriya et al., 2013), using a distance range of 0–7 and the following attributes: 2 (double-bonded atom), 3 (triple-bonded atom), C, N, S, O and H. QSAR models were considered acceptable if they met criteria such as q2 > 0.6 or r2 > 0.7 (Bhatia et al., 2017; Choudhari and Bhatia, 2015; Clark et al., 1989). The q2 value was used as the criterion for the robustness and internal predictive capability of the models; a high q2 (for example q2 > 0.5) indicates a model suitable for prediction (Cochran et al., 1996). Lower standard errors are preferable: low values of r2se, q2se and pred_r2se indicate that the regression function reproduces the observed data precisely and support the quality and fit of the model. To assess the predictive capability of the generated QSAR models, both leave-one-out (LOO) cross-validation, giving q2, and external validation, giving pred_r2, were employed. For external validation, the data were divided into training and test sets, and pred_r2 was calculated from the test set predictions. High pred_r2 and low pred_r2se values indicate good external predictive performance, and these two parameters are therefore reported for all models. In addition, the residuals, that is the differences between the actual and predicted activities, were small for all compounds, which supports the predictive potential of the QSAR models.
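The two validation statistics can be illustrated with a short Python sketch. It assumes the commonly used definitions q2 = 1 − PRESS/SS (leave-one-out predictions, deviations taken from the training set mean) and pred_r2 = 1 − Σ(y_test − ŷ_test)²/Σ(y_test − ȳ_train)², and it uses a one-descriptor least-squares line as a stand-in model; the function names are hypothetical and the actual VLife MDS models use several descriptors.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for a single descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def q2_loo(x, y):
    """Leave-one-out cross-validated q2 (x and y are Python lists)."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_total


def pred_r2(y_test, y_pred, train_mean):
    """External predictivity computed on a held-out test set."""
    press = sum((yt - yp) ** 2 for yt, yp in zip(y_test, y_pred))
    ss_total = sum((yt - train_mean) ** 2 for yt in y_test)
    return 1.0 - press / ss_total
```

A pred_r2 of zero means the model predicts the test set no better than the training set mean, which is a useful reference point when reading the validation tables.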
2.4 3D-QSAR analysis
2.4.1 kNN-MFA
kNN-MFA is a methodology that, in contrast to conventional QSAR regression techniques, can correlate molecular field descriptors with biological activity non-linearly (Choudhari et al., 2012; Rajesh Sharma and Swaraj Patil, 2013; Sharma et al., 2013). Because kNN is fundamentally non-linear, it is better suited to explaining activity trends than traditional correlation techniques, which seek to establish a linear link with the activity. The kNN technique is a conceptually straightforward method for solving pattern recognition problems. An unknown pattern is classified according to the majority class membership of its k nearest neighbors in the training set. Nearness is measured by a suitable distance metric, such as a molecular similarity index determined from the field interactions of the molecular structures. The conventional kNN algorithm proceeds as follows (Choudhari et al., 2012; Rajesh Sharma and Swaraj Patil, 2013; Sharma et al., 2013): first, the distances between an unclassified object (u) and every object in the training set are calculated; next, the k training set objects closest to u are selected; finally, u is assigned to the category to which most of these k objects belong. The optimal k value is chosen through leave-one-out cross-validation or by optimizing the classification of a test set. The variable selection techniques used to select the variables and the best k values are described below.
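The neighbour search described above can be sketched as follows. Since the activities modelled here are continuous, the sketch uses a regression variant in which the prediction is a distance-weighted mean of the activities of the k nearest neighbours. The inverse-distance weighting, the Euclidean metric and the function names are assumptions for illustration and do not reproduce the VLife MDS implementation.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length field-descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(train_x, train_y, query, k=3):
    """Predict an activity from the k nearest training compounds."""
    # Step 1: distance from the query compound to every training compound.
    ranked = sorted((euclidean(row, query), y) for row, y in zip(train_x, train_y))
    # Step 2: keep the k closest compounds.
    nearest = ranked[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]           # identical descriptors: reuse that activity
    # Step 3: inverse-distance-weighted mean of the neighbours' activities.
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Hypothetical one-descriptor training data.
print(knn_predict([[0.0], [1.0], [10.0]], [1.0, 3.0, 100.0], [0.5], k=2))
```

Because the prediction depends only on local neighbours, no linear relationship between the field descriptors and the activity is assumed.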
2.4.2 kNN-MFA with simulated annealing
Simulated annealing (SA) is a commonly used stochastic method for function optimization in QSAR. SA mimics the physical process of annealing, in which a system is first heated to a high temperature and then gradually cooled to a preset temperature such as room temperature. During cooling, the system samples configurations according to the Boltzmann distribution, so that at equilibrium low-energy states predominate.
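As an illustration of how annealing can drive variable selection, the sketch below swaps one descriptor at a time and accepts worse subsets with a Boltzmann (Metropolis) probability that shrinks as the temperature falls. Everything here is an assumption made for illustration: the function names, the geometric cooling schedule and the toy error function. In a real kNN-MFA run the error would be the cross-validated error of the kNN model built on the selected field descriptors.

```python
import math
import random


def anneal_subset(n_features, subset_size, error_fn, steps=500,
                  t_start=1.0, t_end=0.01, seed=0):
    """Search for a low-error descriptor subset (requires subset_size < n_features)."""
    rng = random.Random(seed)
    current = rng.sample(range(n_features), subset_size)
    current_err = error_fn(current)
    best, best_err = list(current), current_err
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose a neighbour: replace one selected descriptor by an unselected one.
        candidate = list(current)
        outside = [f for f in range(n_features) if f not in candidate]
        candidate[rng.randrange(subset_size)] = rng.choice(outside)
        cand_err = error_fn(candidate)
        delta = cand_err - current_err
        # Metropolis criterion: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_err = candidate, cand_err
            if current_err < best_err:
                best, best_err = list(current), current_err
    return sorted(best), best_err


def toy_error(subset):
    """Hypothetical error: number of mismatches against the 'true' descriptors 1 and 4."""
    return len(set(subset) ^ {1, 4})


print(anneal_subset(n_features=8, subset_size=2, error_fn=toy_error))
```

Accepting occasional uphill moves at high temperature is what allows the search to escape local minima before the schedule cools.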
2.4.3 kNN-MFA with genetic algorithm
Genetic algorithms (GAs), developed by Holland, mimic the principles of natural evolution by simulating a dynamic population of chromosomes, which serve as carriers for encoding specific traits. Typically, the encoding is a bit string in which set bits indicate that a given attribute is selected and cleared bits indicate that it is not. A model is built from the features encoded by each chromosome, and the model error calculated on the training data acts as the fitness function. During the course of evolution, the chromosomes undergo crossover and mutation. By allowing the fittest chromosomes to survive and reproduce, the approach progressively lowers the error function in succeeding generations (Rajesh Sharma and Swaraj Patil, 2013).
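The bit-string encoding, crossover and mutation steps can be illustrated with the following Python sketch. It is a generic textbook-style GA under stated assumptions (truncation selection, single-point crossover, a hypothetical Hamming-distance error function) and is not the algorithm as implemented in VLife MDS.

```python
import random


def ga_select(n_features, error_fn, pop_size=20, generations=30,
              mutation_rate=0.1, seed=0):
    """Evolve bit-string chromosomes; bit i set means descriptor i is used.

    Requires n_features >= 2 and pop_size >= 4.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=error_fn)                  # fittest (lowest error) first
        survivors = pop[: pop_size // 2]        # the better half survives unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = survivors + children
    return min(pop, key=error_fn)


TARGET = [1, 0, 1, 0, 0, 1]                     # hypothetical 'ideal' descriptor mask


def hamming_error(chromosome):
    """Toy fitness: number of bits that differ from the target mask."""
    return sum(c != t for c, t in zip(chromosome, TARGET))


print(ga_select(len(TARGET), hamming_error))
```

Keeping the survivors unchanged (elitism) guarantees that the best error in the population never increases from one generation to the next.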
2.4.4 kNN-MFA with stepwise (SW) variable selection
The k-nearest neighbour molecular field analysis (kNN-MFA) method was also combined with stepwise (SW) variable selection. Stepwise selection adds and removes descriptors one at a time, so that only the most significant descriptors are retained in the model. This combination allowed us to build QSAR models with improved predictive ability. It also gives a finer resolution of the molecular features responsible for the observed biological effects and provides a rational basis for developing novel anti-HIV drugs.
2.4.5 Alignment rules
The set of molecules was aligned in order to compare the structural variation within the series, and a rectangular grid was then generated around the aligned molecules. Unsubstituted nitroimidazole (the template structure) was used for alignment, taking into account the parts shared across the series, as illustrated in Fig. 2A and 2B. The reference molecule shown in Fig. 2A was chosen as the template because of its potent inhibitory activity, which made it a reliable lead. After optimization, every molecule of the series was superimposed on the template structure and the reference molecule by the template-based alignment method, in accordance with the kNN-MFA methodology.
Fig. 2. (A) Template molecule used for the QSAR study and (B) stereoview of all 20 aligned molecules.
2.4.6 Creation of interaction energies
The aligned molecules have RMSD values within 2 Å. The atom/shape-based RMS fit and multi-fit approaches were used to superimpose the molecules onto the template structure.
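For reference, the RMSD between two already superimposed coordinate sets is the square root of the mean squared distance between matched atoms. The following Python sketch shows the calculation; the function name and the toy coordinates are hypothetical, and the superposition step itself is not included.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same unit as the input, e.g. angstrom) between matched atom lists.

    Both inputs are lists of (x, y, z) tuples that are assumed to be
    superimposed already; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]   # every atom displaced by 1 unit
print(rmsd(template, shifted))
```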
2.4.7 Generation of training and test sets
The QSAR model was built from the training set molecules, whose biological activities are known. To evaluate the model's performance, the dataset was divided into training and test sets using several methods, including sphere exclusion, random selection and manual selection. The test set, which is not used during model construction, was used to check the predictive capacity of the model.
2.5 Molecular docking studies
The co-crystallized protein structure (PDB ID: 1HIV) was obtained from the publicly accessible Protein Data Bank (http://www.rcsb.org), and molecular docking was carried out with the selected ligands and the reference molecule (Deeks et al., 2015). The downloaded protein was processed with the Protein Preparation Wizard, which added missing hydrogen atoms and assigned the correct bond orders. Formal charges were assigned to protein co-factors, and het states were generated for the co-crystallized ligand. The last step was energy minimization with the OPLS2005 force field, performed to eliminate unphysical contacts between protein and ligand atoms while keeping the co-crystallized complex in a relaxed state. Water molecules within 5 Å were also removed. A receptor grid was then generated to define the site where the compounds under study should bind. The grid was centred on the centroid of the co-crystallized ligand with a 10 Å box, and default parameters were used with a scaling factor of 1.0. The designed nitroimidazole ligands and the reference molecule were prepared with the LigPrep tool of Maestro, which generated the possible conformers, tautomers and stereoisomers of the ligands using the OPLS2005 force field. The energy-minimized ligands were docked in XP mode with a van der Waals scaling factor of 0.8. Five poses per ligand were retained within the grid, and the best-scoring pose was examined with the XP Visualizer.
2.6 MD simulation studies
The compounds with satisfactory docking scores were simulated using the Desmond v3 program developed by the D. E. Shaw Research group. A three-step procedure comprising system building, minimization and the molecular dynamics run was used to produce the trajectory file, which was subsequently analyzed. The ligand–protein complex obtained from the docking experiments was solvated with the SPC (simple point charge) water model in an orthorhombic box with a buffer distance of 10 Å. The system was neutralized with Na+ and Cl− ions, and a physiological salt concentration of 0.15 M was maintained. The system was then energy-minimized and relaxed in the NPT ensemble at 300 K and 1 bar. A 100 ns MD simulation was performed on the equilibrated system at 300 K and 1 bar to produce the trajectory. After a successful MD run, the .cms file was imported into the Simulation Interaction Diagram utility of Desmond and the results were examined. RMSD (root mean square deviation), RMSF (root mean square fluctuation) and ligand 2D interaction plots were generated to examine the stability of the complexes.
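To make the RMSF measure concrete, the sketch below computes a per-atom RMSF from a list of trajectory frames: for each atom, it is the root of the mean squared deviation from that atom's average position. This is a stand-alone Python illustration with hypothetical names and toy coordinates; it does not call Desmond, and it assumes the frames have already been aligned to a common reference.

```python
import math


def rmsf(frames):
    """Per-atom RMSF from aligned frames.

    frames: list of frames, each a list of (x, y, z) tuples in a fixed atom order.
    Returns one RMSF value per atom, in the unit of the input coordinates.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for atom in range(n_atoms):
        # Average position of this atom over the trajectory.
        mean = [sum(frame[atom][d] for frame in frames) / n_frames for d in range(3)]
        # Mean squared fluctuation around the average position.
        msf = sum(
            sum((frame[atom][d] - mean[d]) ** 2 for d in range(3))
            for frame in frames
        ) / n_frames
        values.append(math.sqrt(msf))
    return values


# Two toy frames: atom 0 moves by 2 units along x, atom 1 stays fixed.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(rmsf(toy_frames))
```

High RMSF values flag flexible residues such as loops, whereas a low and stable RMSD over time indicates that the complex as a whole remains close to its starting structure.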
2.7 Principal component analysis (PCA)
Principal component analysis (PCA) was employed to gain a deeper understanding of how ligand binding affects the Cα atom dynamics of the target protein. This statistical method allowed us to relate the different Cα conformations sampled across the four simulated systems during the molecular dynamics (MD) simulations and to account for the Cα root-mean-square deviation (RMSD). The analysis was carried out with Bio3D, an R package developed for comparing protein structures. The Desmond trajectory files (.dtr) were loaded into VMD and converted to the CHARMM/NAMD trajectory format (.dcd). To prepare the data for PCA, the Cα atoms from every trajectory frame were first superimposed onto the minimised structure (PDB file). The resulting Cartesian coordinates (x, y, z) of the Cα atoms were then imported into RStudio along with the corresponding .dcd files. A lower-dimensional representation of the structural dataset was obtained by projecting the minimised structure and the MD trajectory snapshots onto the subspace defined by the largest principal components (PCs), which capture the highest variance in the Cα atom positions. The pca.xyz function of the Bio3D package was used for this projection, which captured the most important conformational changes observed during the simulations.
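The core of the PCA step, finding the direction of largest variance in the aligned Cα coordinates, can be sketched in plain Python. This is an illustrative stand-in and not the Bio3D implementation: it extracts only the first principal component by power iteration on the covariance matrix, and the function name, iteration count and toy data are assumptions.

```python
import math


def first_pc(frames, iterations=200):
    """Return (unit vector, variance) of the first principal component.

    frames: one flattened coordinate vector (x1, y1, z1, x2, ...) per aligned
    trajectory frame. The covariance matrix C is never formed explicitly;
    C v is evaluated as X^T (X v) / (n - 1) on the mean-centred data X.
    """
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(dim)]
    centered = [[f[j] - mean[j] for j in range(dim)] for f in frames]

    # Non-uniform start vector; a start exactly orthogonal to the first PC
    # would fail, which is unlikely for real data but possible in principle.
    vec = [1.0 + j for j in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]

    for _ in range(iterations):
        proj = [sum(row[j] * vec[j] for j in range(dim)) for row in centered]
        new = [sum(centered[i][j] * proj[i] for i in range(n)) / (n - 1)
               for j in range(dim)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        vec = [v / norm for v in new]

    # Variance captured along the converged direction.
    proj = [sum(row[j] * vec[j] for j in range(dim)) for row in centered]
    variance = sum(p * p for p in proj) / (n - 1)
    return vec, variance


# Toy 'trajectory' that only moves along the first coordinate.
toy = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
print(first_pc(toy))
```

Projecting each frame onto the leading components gives the low-dimensional conformational map that is typically plotted as PC1 versus PC2.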
2.8 Binding free energies calculations (MMGB/SA calculations)
MMGB/SA (Molecular Mechanics/Generalized Born Surface Area) is a computational technique used in computational chemistry and structural biology to estimate the binding free energy between proteins and ligands (small molecules). In drug development and studies of protein–ligand interactions, it is commonly used to predict the binding affinity of potential drug candidates for target proteins. The MMGBSA method combines molecular mechanics, which uses force fields to describe the internal energy of a system, with a generalized Born solvent model and surface area calculations. The binding free energy (ΔG_bind) is given by equation (1):
$$\Delta G_{\mathrm{bind}} = G_{\mathrm{complex}} - \left(G_{\mathrm{protein}} + G_{\mathrm{ligand}}\right) \qquad (1)$$
where G_complex, G_protein and G_ligand are the free energies of the complex, the protein and the ligand, respectively.
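Because the binding free energies in this study were computed from the last 30 MD frames, the per-frame values have to be averaged. The sketch below shows that bookkeeping in Python, assuming the usual end-point definition ΔG_bind = G_complex − (G_protein + G_ligand). The function name and the energies are hypothetical, and the Prime module itself is not called.

```python
def mean_binding_energy(frames):
    """Average binding free energy (kcal/mol) over trajectory frames.

    frames: iterable of (g_complex, g_protein, g_ligand) tuples, one per frame.
    """
    per_frame = [g_complex - (g_protein + g_ligand)
                 for g_complex, g_protein, g_ligand in frames]
    return sum(per_frame) / len(per_frame)


# Hypothetical energies for two frames, for illustration only.
toy_frames = [(-100.0, -60.0, -10.0), (-104.0, -62.0, -10.0)]
print(mean_binding_energy(toy_frames))
```

A more negative average indicates a more favourable predicted binding, and averaging over several frames reduces the dependence on any single snapshot.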
2.9 In-silico ADME calculation
Many drug design and discovery projects have failed because potentially effective compounds had inadequate pharmacokinetic profiles. A few molecules have even been withdrawn from the market because of severe toxicities during clinical use, leading to lost time and a financial burden on the organizations involved. Early-stage calculation of ADME properties is therefore important for successful drug development. In the present study, QikProp (version 3.0) from Schrödinger, LLC was used for virtual ADME predictions. The parameters evaluated included the drug-likeness of the ligand molecules based on Lipinski's Rule of Five and the Rule of Three, lipophilicity, blood–brain barrier permeability, molecular weight and the numbers of hydrogen bond donors and acceptors.
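Lipinski's Rule of Five can be expressed as a simple count of violations: molecular weight above 500, logP above 5, more than 5 hydrogen bond donors and more than 10 hydrogen bond acceptors. The Python sketch below applies these standard thresholds; the function name and the example property values are hypothetical, and QikProp is not called.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count Rule of Five violations for one compound."""
    checks = [
        mol_weight > 500,    # molecular weight in Da
        logp > 5,            # octanol/water partition coefficient
        h_donors > 5,        # hydrogen bond donors
        h_acceptors > 10,    # hydrogen bond acceptors
    ]
    return sum(checks)


# Hypothetical property values for a small drug-like ligand.
print(lipinski_violations(mol_weight=350.4, logp=2.1, h_donors=2, h_acceptors=6))
```

Compounds with no more than one violation are conventionally regarded as likely to be orally bioavailable.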
3 Results and discussion
Based on a literature survey, recently reported nitroimidazole derivatives were chosen for the generation of 2D/3D QSAR models. 3D QSAR models were generated by the kNN method using the SW-FB, SA and GA variable selection methods, and the best 3D QSAR model was identified with the help of different statistical parameters. Using the best model, the activities of the reported nitroimidazole derivatives were predicted and compared. New nitroimidazole derivatives were then designed with the help of the pharmacophoric features derived from the QSAR models. The resulting hypothesis was then validated virtually to test the credibility of the 3D QSAR models. For validation, the designed nitroimidazole derivatives were subjected to molecular docking studies, and the three best-scoring molecules were used for molecular dynamics simulation studies. The detailed interactions between the ligands and the protein were assessed, and the generated MD trajectories were used for binding free energy calculations by the MMGB/SA method. The results obtained at each step are discussed in the following sections.
3.1 2D QSAR (2D QSAR for IIIB strain)
2D QSAR studies play a crucial role in drug design projects by correlating the structural features of chemical compounds with their biological activities. These studies relate structural descriptors such as hydrophobicity, electronic properties, atomic correlations and steric factors to biological activity. The descriptors are easily calculated, and the resulting models are inexpensive, rapid and robust, making 2D QSAR an essential tool for early-stage drug design and lead optimization (Chtita et al., 2022; Chtita et al., 2022). Here, an atom-based 2D QSAR study of the reported nitroimidazoles was performed, and the results are discussed below.
3.1.1 Interpretation
According to the established equation (Table 2), the model explains 92.41 % (r2 = 0.9241) of the total variance in the training set. With q2 and pred_r2 values of 77.06 % and 40.80 %, respectively, it showed good internal predictive ability and a more modest external predictive ability. These results support the use of the generated model for prediction. The coefficients of the QSAR equation provide information about the molecular features that affect the biological activity. The descriptor T_2_Cl_7 is the count of double-bonded atoms separated from a chlorine atom by seven bonds; its negative coefficient indicates a decrease in inhibitory activity for such configurations. On the other hand, the positive coefficient of SssSEindex, an electrotopological state index (ESI), shows that lower values of this descriptor are associated with stronger inhibitory action. Further analysis reveals the role of the descriptors T_N_Cl_6 (count of N atoms, single or double bonded, separated from any chlorine atom by six bonds) and SaasCE-index (electrotopological state index for C atoms with one single bond and two aromatic bonds) in predicting inhibitory activity: a larger T_N_Cl_6 value corresponds to increased inhibitory activity, whereas a greater SaasCE-index value corresponds to decreased inhibitory effects. The contributions of the electrotopological state indices to the predicted inhibitory activity are shown in Fig. 3. These results lay a foundation for understanding the structural components needed for biological activity and for rational drug design and development.
Model-1 (test set: compounds 2, 6, 14, 18, 19, 20): pIC50 = −0.7398(T_2_Cl_7) + 0.7873(SssSEindex) − 0.8244(T_N_Cl_6) + 0.1539(SaasCEindex) + 1.2286.
Statistics: [n = 14; degrees of freedom = 9; r2 = 0.9241; q2 = 0.7706; F test = 27.38; r2se = 0.2017; q2se = 0.3506; pred_r2 = 0.4080; pred_r2se = 0.4891].
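The reported degrees of freedom and F value can be cross-checked from r2, the number of training compounds and the number of descriptors. The sketch below assumes the standard regression F statistic, F = (r2/k) / ((1 − r2)/(n − k − 1)), with n = 14 and k = 4; the function name is hypothetical, and small differences from the reported values are expected because r2 is rounded to four decimals.

```python
def f_statistic(r2, n_samples, n_descriptors):
    """Return (F value, residual degrees of freedom) for a linear QSAR model."""
    dof = n_samples - n_descriptors - 1
    f_value = (r2 / n_descriptors) / ((1.0 - r2) / dof)
    return f_value, dof


# Model-1 for the IIIB strain: 14 training compounds, 4 descriptors.
f_value, dof = f_statistic(0.9241, n_samples=14, n_descriptors=4)
print(f"degrees of freedom = {dof}, F = {f_value:.2f}")
```

The same check applied to the ROD model (r2 = 0.9412) gives a value close to the reported 36.027.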
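Table 2. Statistical parameters of the 2D QSAR models for the IIIB strain.

| Parameters | r2 | q2 | r2se | q2se | Pred_r2 | Pred_r2se | F test |
|---|---|---|---|---|---|---|---|
| 1 (Model-1) | 0.9241 | 0.7706 | 0.2017 | 0.3506 | 0.4080 | 0.4891 | 27.38 |
| 2 (Model-2) | 0.9528 | 0.8592 | 0.1377 | 0.2378 | 0.3358 | 0.6647 | 45.40 |
| 3 (Model-3) | 0.9047 | 0.6959 | 0.2353 | 0.4204 | 0.6540 | 0.3544 | 21.35 |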
Fig. 3. 2D QSAR for the IIIB strain: (A) contribution plot of the descriptors in the QSAR model, which can guide the rational design of new potent derivatives; (B) data fitness plot of actual versus predicted activity for the best 2D QSAR model; (C and D) training and test set plots showing how the predicted activity deviates from the actual activity.
3.2 2D QSAR for ROD strain
3.2.1 Interpretation
The internal (q2) and external (pred_r2) predictive abilities of the model were 82.99 % and 48.94 %, respectively. The model also accounts for 94.12 % (r2 = 0.9412) of the total variance in the training set. The F-test value of 36.027 exceeds the tabulated threshold, supporting the statistical significance of the model (Table 3). SaaNEindex, the electrotopological state index for N atoms bonded through two aromatic bonds, has a negative coefficient. The positive coefficient of T_C_N_4, the count of C atoms (single or double bonded) that are four bonds away from an N atom, indicates that lower values give more potent inhibitory activity. T_T_N_5 here denotes single or double bonded N atoms that are five bonds away from another N atom in a molecule; its negative coefficient is associated with a reduction in inhibitory activity. T_2_Cl_7 denotes any double-bonded atom separated by seven bonds from a chlorine atom in a molecule; its negative coefficient is associated with increased inhibitory activity, whereas a positive coefficient would indicate a reduced inhibitory effect. These results reveal molecular characteristics that affect biological activity and provide insights for the design and discovery of new molecules, as shown in Fig. 4.
Model-1 (test set: compounds 1, 5, 8, 9, 14, 17): pIC50 = −12.6359(SaaNEindex) + 0.2926(T_C_N_4) − 0.1239(T_C_N_5) − 0.4142(T_2_Cl_7) + 54.9262.
Statistics: [n = 14; degrees of freedom = 9; r2 = 0.9412; q2 = 0.8299; F test = 36.027; r2se = 0.1698; q2se = 0.2889; pred_r2 = 0.4894; pred_r2se = 0.5197].
| Trials | r2 | q2 | r2se | q2se | Pred_r2 | Pred_r2se | F test |
|---|---|---|---|---|---|---|---|
| 1 (Model-1) | 0.9412 | 0.8299 | 0.1698 | 0.2889 | 0.4894 | 0.5197 | 36.027 |
| 2 (Model-2) | 0.9649 | 0.8406 | 0.1249 | 0.2660 | 0.3883 | 0.6580 | 61.76 |
| 3 (Model-3) | 0.9464 | 0.9113 | 0.1350 | 0.1736 | 0.3919 | 0.7489 | 39.70 |
(A) 2D QSAR contribution plot showing the contribution of each QSAR descriptor, which can be used for the rational design of new potent derivatives; (B) data fitness plot of actual vs predicted activity for the best 2D QSAR model; (C and D) training and test set plots showing how the predicted activity deviates from the actual activity.
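The statistics quoted for the 2D QSAR models (r2, q2 and pred_r2) can be illustrated with a minimal sketch. It assumes an ordinary multiple linear regression with leave-one-out cross-validation and uses synthetic descriptor data; the function names (`fit_mlr`, `q2_loo`, `pred_r2`) and the data are illustrative assumptions, not the software or the data set used in the study.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y = X @ b + c; returns [b..., c]."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply a fitted model to a descriptor matrix."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

def r2_fit(X, y):
    """Fraction of training-set variance explained by the fitted model."""
    resid = y - predict_mlr(fit_mlr(X, y), X)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def q2_loo(X, y):
    """Leave-one-out cross-validated r2 (internal validation)."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def pred_r2(X_train, y_train, X_test, y_test):
    """External validation: test-set errors relative to the training mean."""
    coef = fit_mlr(X_train, y_train)
    press = np.sum((y_test - predict_mlr(coef, X_test)) ** 2)
    return 1.0 - press / np.sum((y_test - y_train.mean()) ** 2)

# Synthetic stand-in data (assumption): 20 compounds, 4 descriptors,
# split 14/6 into training and test sets as in the models above.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = X @ np.array([1.0, -0.5, 0.3, -0.4]) + 1.0 + rng.normal(scale=0.1, size=20)
X_train, y_train, X_test, y_test = X[:14], y[:14], X[14:], y[14:]

stats = {
    "r2": r2_fit(X_train, y_train),
    "q2": q2_loo(X_train, y_train),
    "pred_r2": pred_r2(X_train, y_train, X_test, y_test),
}
print(stats)
```

With 14 training compounds and four descriptors plus an intercept, the fit has 9 degrees of freedom, as in the model statistics above. For a least-squares model q2 can never exceed r2, so a large gap between the two is a common sign of overfitting.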
3.3 3D QSAR for IIIB strain
3D QSAR extends 2D QSAR by taking into account the spatial arrangement of atoms and its correlation with biological activity. It gives a more comprehensive understanding of how molecular conformation affects biological activity, which supports lead optimization and the optimization of drug–receptor interactions towards more potent and selective drug candidates. A 3D QSAR study of the reported nitroimidazoles was therefore carried out, and the results are reported in Table 4. For Model 1, the values of k (3), q2 (0.7183), pred_r2 (0.7585), q2_se (0.3277) and pred_r2se (0.3040) demonstrated the statistical significance of the derived QSAR equation, with a predictive power of 71.83 % in internal validation and 75.85 % in external validation (Table 4). The negative range obtained for the electrostatic field descriptor E_1656 (−0.8562, −0.6964) suggested that negative electrostatic potential at that position favours activity, so electronegative substituents are preferred there. Similarly, the negative range of the steric field descriptor S_1383 (−0.045, −0.0866) indicated that negative steric potential is favourable, hence less bulky substituent groups are preferred at that position. The data set was divided into training and test sets so that the model could be evaluated on compounds not used to build it; the predicted inhibitory activities for both sets are given in Table 1. The data fitness plot for Model 1 and the observed vs predicted activity plot for the training and external test sets are shown in Fig. 5.
The accuracy with which the model predicts inhibitory activity, both for compounds seen during training and for those in the external test set, supports the robustness of the approach and the usefulness of the QSAR model in drug discovery and design. Model-11 (Test set: 15, 16, 17, 2, 3, and 9). pIC50 = −E_1656 (−0.1119, −0.0747) & −S_1383 (−0.0448, −0.0124). Statistics: [kNN = 3; n = 14; DOF = 11; q2 = 0.88; q2_se = 0.20; pred_r2 = 0.52; pred_r2se = 0.48].
| Trials | kNN | DOF | q2 | q2_se | pred_r2 | pred_r2se |
|---|---|---|---|---|---|---|
| 1 (Model-1) | 3 | 11 | 0.7183 | 0.3277 | 0.7585 | 0.3040 |
| 2 (Model-2) | 2 | 11 | 0.7436 | 0.2900 | 0.3281 | 0.6839 |
| 3 (Model-3) | 2 | 11 | 0.7661 | 0.3103 | 0.4255 | 0.4727 |
(A) 3D alignment of the molecules with the steric points; (B) data fitness plot of actual vs predicted activity for the best model; (C and D) training and test set plots showing how the predicted activity deviates from the actual activity.
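The k-nearest-neighbour idea behind kNN-MFA can be sketched as follows: the activity of a compound is predicted as a distance-weighted average of the activities of its k nearest training compounds in the space of the selected field descriptors. The sketch below is a simplified illustration under that assumption. The descriptor values, the inverse-distance weighting and the function names are hypothetical and do not reproduce the field point selection (SW-FB, SA, GA) performed by the QSAR software.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Inverse-distance-weighted mean activity of the k nearest neighbours."""
    dist = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dist)[:k]
    weights = 1.0 / (dist[nearest] + 1e-9)  # small constant avoids division by zero
    return float(np.sum(weights * y_train[nearest]) / np.sum(weights))

def knn_q2_loo(X, y, k=3):
    """Leave-one-out cross-validated q2 for the kNN model."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        press += (y[i] - knn_predict(X[keep], y[keep], X[i], k)) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Hypothetical field values at two selected grid points
# (column 0: electrostatic, column 1: steric) and activities.
X = np.array([[-0.9, -0.05], [-0.8, -0.06], [-0.7, -0.08],
              [-0.3, -0.02], [-0.2, -0.01], [-0.1, -0.03]])
y = np.array([1.8, 1.7, 1.6, 0.6, 0.5, 0.4])

query = np.array([-0.75, -0.07])
prediction = knn_predict(X, y, query, k=3)
q2 = knn_q2_loo(X, y, k=2)
print(prediction, q2)
```

Because the prediction is a weighted average of neighbour activities, it always lies between the lowest and highest activity among the k neighbours, which is why the range of each selected field descriptor, rather than a regression coefficient, is reported for kNN-MFA models.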
3.4 3D QSAR for ROD strain
Given the importance of 3D QSAR for optimizing the molecular interactions between ligand and receptor, 3D QSAR models of the reported nitroimidazole derivatives were also built for the ROD strain, and the results are reported in Table 5. The statistical parameters of Model 1, namely k (2), q2 (0.9039), pred_r2 (0.5622), q2_se (0.2071) and pred_r2_se (0.3355), indicate its robustness and predictive power. The negative range obtained for the electrostatic field descriptor E_1812 (−0.274, −0.265) showed that negative electrostatic potential is favourable for activity, so electronegative substituents are preferred at that position. Similarly, the negative range of the steric field descriptor S_1120 (−0.1017, −0.0875) suggested that less bulky substituent groups are preferred there. The high q2 of 0.9039 in internal validation shows that the model accounts for about 90.39 % of the variability in the training set under cross-validation. The pred_r2 of 0.5622 in external validation shows that the model predicts about 56.22 % of the variability in the inhibitory activities of the test set. The predicted inhibitory activities for the training and test sets are summarized in Table 6 and displayed in Fig. 6. The observed vs predicted activity plot confirms that the model captures the underlying trends in the data.
Having passed this validation, Model 1 can be used to generalize predictions and guide decisions in drug discovery and design. Model-1 (Test set: 11, 16, 2, 20, 7, and 9). pIC50 = −E_1812 (−0.0799, 0.0511), E_955 (−10.00, −10.00) and S_1120 (−0.1017, −0.0875). Statistics: [kNN = 2; n = 14; DOF = 10; q2 = 0.9039; q2_se = 0.2071; pred_r2 = 0.5622; pred_r2se = 0.3355].
| Trials | kNN | DOF | q2 | q2_se | pred_r2 | pred_r2se |
|---|---|---|---|---|---|---|
| 1 (Model-1) | 2 | 10 | 0.9039 | 0.2071 | 0.5622 | 0.3355 |
| 2 (Model-2) | 2 | 10 | 0.7627 | 0.2042 | 0.5245 | 0.3540 |
| 3 (Model-3) | 2 | 12 | 0.7072 | 0.3254 | 0.5109 | 0.4848 |
| Sr. No. | Log EC50 (IIIB) | Log EC50 (ROD) | Predicted activity IIIB (2D) | Predicted activity ROD (2D) | Predicted activity IIIB (3D) | Predicted activity ROD (3D) |
|---|---|---|---|---|---|---|
| 1 | −0.38 | −0.30103 | 0.07 | 0.974 | 0.608 | 0.506 |
| 2 | 0.24055 | 0.43775 | 0.787 | 0.816 | 0.647 | 0.499 |
| 3 | 0.33646 | 0.2878 | 0.267 | 0.288 | 0.647 | 0.364 |
| 4 | 0.7796 | 0.72346 | 0.827 | 0.861 | 0.684 | 0.506 |
| 5 | 0.88309 | 0.81158 | 0.893 | 0.76 | 1.112 | 0.966 |
| 6 | 0.77452 | 0.76418 | 1.116 | 0.502 | 0.346 | 0.832 |
| 7 | 0.65128 | 0.60097 | 0.654 | 1.379 | 0.862 | 0.986 |
| 8 | 0.39445 | 1.02694 | 0.401 | 1.168 | 0.475 | 1.306 |
| 9 | 1.24797 | 1.16137 | 1.616 | 1.676 | 1.475 | 1.953 |
| 10 | 1.80414 | 1.80003 | 1.627 | 1.383 | 1.488 | 1.869 |
| 11 | 1.32634 | 1.35218 | 1.104 | 1.88 | 1.84 | 1.953 |
| 12 | 1.75967 | 1.80072 | 1.604 | 2.027 | 1.694 | 1.886 |
| 13 | 1.92788 | 1.9345 | 1.673 | 1.479 | 1.674 | 1.886 |
| 14 | 1.83059 | 1.85612 | 1.815 | 1.279 | 1.671 | 1.315 |
| 15 | 1.16435 | 1.16137 | 1.317 | 1.576 | 1.148 | 1.289 |
| 16 | 1.6902 | 1.73878 | 1.79 | 1.952 | 1.512 | 1.887 |
| 17 | 1.47422 | 1.97174 | 1.597 | 1.634 | 1.147 | 1.886 |
| 18 | 1.89708 | 1.97174 | 1.589 | 0.752 | 1.687 | 0.764 |
| 19 | 1.05308 | 0.96567 | 1.604 | 1.173 | 1.186 | 1.301 |
| 20 | 1.18184 | 1.18752 | 1.801 | 0.774 | 1.641 | 0.549 |
(A) 3D alignment of the molecules with the steric points; (B) data fitness plot of actual vs predicted activity for the best model; (C and D) training and test set plots showing how the predicted activity deviates from the actual activity.
3.5 Molecular docking studies
The molecules with good predicted IC50 values (Table 7) were used to further test the proposed hypothesis by computational study. Before docking, the docking protocol was validated by extracting the internal ligand and redocking it without any change in its heterostate, and the RMSD relative to its original position was determined. The low RMSD of 1.533 Å indicated that the protocol was validated and ready to use for further computational study. Molecular docking is a computational technique that researchers use to predict how small molecules, such as novel HITs or ligands, interact with a target macromolecule, usually a protein. It predicts the binding mode and binding affinity of a ligand to a protein receptor. This knowledge is essential for understanding molecular interactions, creating novel medicines and improving existing ones. The downloaded structure, 1HIV, is a co-crystallized structure of HIV-1 protease. Structural analysis of this co-crystallized protein (Kinemage 2) showed that it is made up of two 99-residue polypeptide chains. The residues of the first chain are numbered 1 to 99 and those of the second chain 101 to 199. The additional compound, U-75875, is a diol inhibitor numbered 201–206. This inhibitor is situated within a large and well-defined active site cleft, as in other protease–inhibitor complexes described by Wlodawer et al. in 1992. Two separate “flaps”, one originating from each monomer, make up the 23 Å long cleft that runs across the dimer interface on one side of the molecule. The HIV protease (1HIV protein) is an important target for the design and development of antiretroviral drugs (Thanki et al., 1992). Hence, molecular docking was used to examine the binding interactions between the identified HITs and this protein, and the results were compared with those of the reference molecule.
Docking was carried out to determine the binding affinities and binding modes of the identified HITs at the active site of the 1HIV protein. The docking studies highlighted the importance of certain hydrogen bonds, hydrophobic contacts and electrostatic interactions. The results are summarized in Table 8, and the 3D and 2D interaction diagrams of the reference molecule and the identified HITs are displayed in Figs. 7 and 8, respectively. The docking scores of the reference molecule, ZH-1, ZH-2 and ZH-3 were −4.399, −4.712, −2.172 and −1.454, respectively, so only ZH-1 scored better than the reference. The binding of ZH-3 might be stabilized by its two H bonds with the Asp29 residue. Hydrophobic interactions with the active site residues Pro81, Val82, Ile84, Leu23, Ala28, Ile50, Ile47 and Val32 were observed for the reference molecule and the identified HITs, and ZH-3 showed an additional hydrophobic contact with Phe83. Similarly, negatively charged interactions with the Asp25, Asp29 and Asp30 residues were observed in all the compounds, including the reference. Compared with the reference, two additional interactions were observed for ZH-3: a pi-cation interaction with Arg8 and a polar interaction with Thr80. Overall, the interactions with active site residues were very similar for the reference molecule and the identified HITs, which makes them good candidates for the design of antiretroviral therapy. The additional interactions observed for ZH-3 make this compound a better candidate for development as an anti-HIV drug. However, docking is only a preliminary screen of receptor–ligand binding, and the results needed to be validated with more advanced computational tools. Hence, molecular dynamics (MD) simulations were performed to check the stability of the docked complexes and of the observed interactions at the atomic level.
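| Sr. no | New molecules | Activity IIIB | ROD |
|---|---|---|---|
| 1 | | 0.0004 | 0.103 |
| 2 | | 0.050 | 0.08 |
| 3 | | 0.084 | 0.0014 |
| 4 | | 0.083 | 2.08 |
| 5 | | 0.0169 | 0.096 |
| 6 | | 0.0102 | 1.62 |
| 7 | | 0.016 | 0.479 |
| 8 | | 0.010 | 0.00098 |
| 9 | | 0.0072 | 0.0030 |
| 10 | | 0.00020 | 0.00427 |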
| Sr. No | Compound | Docking score | Interactions | Active site residues (IUT6) |
|---|---|---|---|---|
| 1 | Reference | −4.399 | H bonds | Asp25 |
| | | | Hydrophobic | Pro81, Val82, Ile84, Leu23, Ala28, Ile50, Ile47, Val32, Ala28, Ile84 |
| | | | Polar | Nil |
| | | | Charged (−ve) | Asp25, Asp29, Asp30 |
| 2 | ZH-1 | −4.712 | H bonds | Nil |
| | | | Hydrophobic | Val82, Leu23, Ala28, Ile50, Ile47, Val32, Ala28, Ile84 |
| | | | Polar | Thr80 |
| | | | Charged (−ve) | Asp25, Asp29, Asp30 |
| 3 | ZH-2 | −2.172 | H bonds | Asp29 (2H bonds) |
| | | | Hydrophobic | Phe83, Pro81, Val82, Leu23, Ala28, Ile50, Ile47, Val32, Ala28 |
| | | | Polar | Thr80 |
| | | | Charged (−ve) | Asp25, Asp29 |
| | | | Pi-cation | Arg8 |
| 4 | ZH-3 | −1.454 | H bonds | Asp29 (2H bonds) |
| | | | Hydrophobic | Phe83, Pro81, Leu10, Val82, Leu23, Ala28, Ile50, Ile47, Val32, Ala28 |
| | | | Polar | Thr80 |
| | | | Charged (−ve) | Asp25, Asp29, Asp30 |
| | | | Pi-cation | Arg8 |
3D interaction diagrams of the molecules under study at the active site of 1HIV: (A) reference molecule, (B) ZH-1, (C) ZH-2 and (D) ZH-3.
2D interaction diagrams of (A) reference molecule, (B) ZH-1, (C) ZH-2 and (D) ZH-3 at the active pocket of the HIV protease enzyme.
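The redocking validation described in this section compares the redocked pose of the co-crystallized ligand with its crystal pose. A minimal sketch of that comparison is given below, assuming both poses are available as matched arrays of heavy-atom coordinates in the same reference frame (no superposition is applied). The coordinates and the 2.0 Å acceptance cutoff are illustrative assumptions; the cutoff is a commonly used convention and is not a value taken from the study.

```python
import numpy as np

def pose_rmsd(crystal_xyz, docked_xyz):
    """In-place RMSD (Å) between two poses with identical atom ordering."""
    crystal = np.asarray(crystal_xyz, dtype=float)
    docked = np.asarray(docked_xyz, dtype=float)
    if crystal.shape != docked.shape:
        raise ValueError("poses must contain the same atoms in the same order")
    squared_dist = np.sum((crystal - docked) ** 2, axis=1)
    return float(np.sqrt(squared_dist.mean()))

# Hypothetical three-atom ligand; the "docked" pose is shifted 1 Å along x.
crystal_pose = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]])
docked_pose = crystal_pose + np.array([1.0, 0.0, 0.0])

rmsd = pose_rmsd(crystal_pose, docked_pose)
ACCEPT_CUTOFF = 2.0  # Å, assumed convention for a successful redocking
print(rmsd, rmsd <= ACCEPT_CUTOFF)
```

Under this assumed cutoff, the reported redocking RMSD of 1.533 Å would count as a successful reproduction of the crystal pose.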
3.6 MD simulations studies
Biological systems are dynamic networks of molecular interactions, in contrast to the single rigid picture of the protein–ligand interaction that molecular docking provides (Bunker and Róg, 2020; Gelpi et al., 2015). We used the SPC water model to simulate the dynamical behaviour of the three promising docked complexes for 100 ns each, in order to investigate the different conformations that these complexes might adopt in the solvated phase. Each 100 ns run produced 1000 frames of the protein–ligand complex, from which the protein Cα RMSD, the ligand RMSD with respect to the protein, the root mean square fluctuation (RMSF) and the protein–ligand contacts were analyzed. The root mean square deviation (RMSD) is the mean displacement of the atoms of the protein–ligand complex in a given frame relative to the original (native structure) frame. It is a key quantitative measure of the stability of the protein and ligand structures during simulation: the smaller the variation in RMSD during the MD simulation, the more stable the protein–ligand complex, and vice versa (Liu et al., 2018). The mean Cα RMSD values of the protein bound to the reference, ZH-1, ZH-2 and ZH-3 were 1.332 Å, 1.461 Å, 1.335 Å and 1.511 Å, respectively (Fig. 9A), whereas the mean Cα RMSD of the 1HIV protein was 1.41 Å (Table 9). According to the protein RMSD analysis (Fig. 9B), the ZH-3-1HIV complex showed slight variations at 80 ns, with an RMSD of 1.4 Å, before converging to a stable system by the end of the simulation. Fig. 9C showed that all of the systems (reference, ZH-1, ZH-2 and ZH-3) maintained their molecular interaction profile during the 100 ns MD simulation.
More importantly, none of the systems showed significant fluctuation, which supports the consistency of the protein–ligand interactions. When the ligand–protein RMSD values were analyzed, the reference showed the highest RMSD deviations of all the compounds. The RMSD of ZH-1 and ZH-2 remained constant during the MD simulation. For ZH-3, the initial fluctuations during equilibration settled to a constant value with only slight variations. The RMSD data suggest that the ligands under study have a stronger stabilizing effect on the 1HIV protein than the reference (Fig. 9B). The stability of the protein–ligand complexes was also analyzed using the RMSF, which measures the flexibility of each protein residue in terms of the fluctuation of its atoms across the 100 ns MD simulation. The average RMSF values for 1HIV bound to the reference, ZH-1, ZH-2 and ZH-3 were 0.953 Å, 0.866 Å, 0.862 Å and 0.930 Å, respectively, whereas the RMSF of the apo protein was 0.903 Å (Table 1). The low RMSF values show that no significant changes in the amino acid residues emerged during the simulation. In the ZH-1HIV protein complex, the ligand interacted with Asp25, Gly27, Ala28, Gly49, Ile50 and Gly48. Variations of up to 2.6 Å were seen in the residues that make up the binding pocket. The reference, ZH-1, ZH-2 and ZH-3 showed the least variation, and the RMSF trend of the reference was consistent with its RMSD pattern. The RMSF profile also showed that the ZH-1HIV amino acids varied most at Gly16 and Gly52 during the simulation (Fig. 9C). Another criterion is the radius of gyration (Rg), the root mean square distance of a collection of atoms from their common centre of mass, which gives an indication of the approximate compactness of the system. In Fig. 9D, the Rg values for each frame are plotted against time to show how they varied. Table 1 displays the maximum, minimum and median Rg values. Fig. 9D shows no significant change in Rg for any of the complexes. The small differences in Rg values showed that all systems adopt similarly compact protein structures, which may be attributable in part to significant contributions from the α-helices of the protein structure. The RMSD, RMSF and Rg parameters derived from the MD trajectories showed that the 1HIV inhibitors were retained within the protein cavity and that the protein–ligand complexes were stable in a dynamic state (Table 9). To gain a better understanding of the binding of the inhibitors to the 1HIV protein, each ligand–protein interaction was examined throughout the 100 ns simulation. The protein–ligand interaction histograms for the full 100 ns run are shown in Fig. 10. The protein–ligand contacts showed that the identified inhibitors interact with HIV primarily through hydrophobic and hydrogen bond interactions. The binding free energies were then computed with MMGB/SA calculations to further confirm these observations.
(A) Structural changes at the Cα atoms of the protein (RMSD) over the entire 100 ns trajectory on binding with the identified ligands and the reference molecule, (B) RMSD values of the protein backbone (PDB id: 1HIV), (C) fluctuations of residues (RMSF) on binding with ligand, and (D) compactness of the protein represented by the radius of gyration.
| Sr. No. | Parameter | Reference | ZH-1 | ZH-2 | ZH-3 |
|---|---|---|---|---|---|
| 1. | RMSD Cα (Å)* | 1.3329 | 1.4611 | 1.3356 | 1.5111 |
| 2. | RMSD backbone (Å)* | 1.3479 | 1.4735 | 1.3349 | 1.5245 |
| 3. | RMSF (Å)* | 0.9536 | 0.8663 | 0.8626 | 0.9300 |
| 4. | Radius of Gyration (Å)* | 3.4378 | 5.8047 | 5.2955 | 5.1815 |
Protein–ligand interactions and protein–ligand contact histograms of (A) reference molecule, (B) ZH-1, (C) ZH-2 and (D) ZH-3 at the active site of HIV over the entire run of 100 ns (PDB id:). Green, violet and blue represent hydrophobic interactions, H bonds and water bridges, respectively.
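The three trajectory metrics used in this section (RMSD, RMSF and Rg) can be written down compactly. The sketch below assumes a trajectory stored as a NumPy array of shape (frames, atoms, 3) whose frames have already been superposed on the reference structure; the synthetic coordinates, the unit masses and the function names are assumptions for illustration and do not reproduce the simulation package used in the study.

```python
import numpy as np

def rmsd_per_frame(traj, ref):
    """RMSD of every frame from a reference structure (frames pre-aligned)."""
    return np.sqrt(np.mean(np.sum((traj - ref) ** 2, axis=2), axis=1))

def rmsf_per_atom(traj):
    """Fluctuation of every atom about its time-averaged position."""
    mean_pos = traj.mean(axis=0)
    return np.sqrt(np.mean(np.sum((traj - mean_pos) ** 2, axis=2), axis=0))

def radius_of_gyration(traj, masses):
    """Mass-weighted RMS distance of the atoms from the centre of mass."""
    com = np.einsum("n,fnd->fd", masses, traj) / masses.sum()
    sq_dist = np.sum((traj - com[:, None, :]) ** 2, axis=2)
    return np.sqrt((sq_dist * masses).sum(axis=1) / masses.sum())

# Synthetic stand-in for a trajectory: 1000 frames of 50 atoms that
# jitter around a fixed reference structure.
rng = np.random.default_rng(1)
reference = rng.normal(scale=5.0, size=(50, 3))
trajectory = reference + rng.normal(scale=0.1, size=(1000, 50, 3))
masses = np.ones(50)  # unit masses assumed for simplicity

rmsd = rmsd_per_frame(trajectory, reference)
rmsf = rmsf_per_atom(trajectory)
rg = radius_of_gyration(trajectory, masses)
print(rmsd.mean(), rmsf.mean(), rg.mean())
```

The mean of the per-frame RMSD and of the per-atom RMSF correspond to the kind of averaged values listed in the table above, while the per-frame Rg series is what would be plotted against time.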
3.7 PCA analysis
PCA is a widely used and powerful dimensionality reduction method that can extract, from MD simulation data, the main collective motions and conformational changes of biomolecular systems. Large-scale MD simulations generate a huge amount of data: to study a medium-sized protein accurately, the positions of at least 10,000 atoms must be calculated every 10⁻¹⁵ s. PCA involves five steps: data centring, computation of the covariance matrix, computation of the eigenvectors and eigenvalues, sorting of the eigenvectors by eigenvalue, and projection of the data onto the new feature space. Fig. 11 displays the PCA of the trajectories of the reference, ZH-1, ZH-2 and ZH-3 complexes. For the 1HIV-reference complex, PC1, PC2 and PC3 contributed 79.1 %, 7.81 % and 4.01 % of the variance, respectively (Fig. 11A). Similarly, 1HIV-ZH1 gave 78.62 %, 11.92 % and 3.37 % for PC1, PC2 and PC3, respectively (Fig. 11B). For 1HIV-ZH2, the PC1, PC2 and PC3 values were 74.67 %, 15 % and 3.51 %, respectively (Fig. 11C). Together, these three top PCs accounted for most of the variation in the original distribution, yielding a succinct yet useful picture of the atomic fluctuations among the studied conformations. We therefore focused on the conformational changes projected onto the subspace defined by PC1 and PC2, which reduces the trajectories of the simulated systems to a two-dimensional landscape. The projection captured the three-dimensional fluctuations of the Cα atoms in a lower-dimensional form, which made the conformational shifts easier to interpret. The continuous colour scale, which went from blue to white to red, showed irregular transitions between different conformational states.
A clear separation between the blue and red conformations along the PC1 and PC2 axes suggests that the simulated systems underwent a conformational transition. This approach revealed the conformational diversity sampled during the simulations and offered insights into possible conformational routes and transitions connected to ligand binding and unbinding events.
Instantaneous conformations of the trajectory obtained by PCA, coloured in chronological order from blue to red. The spread of the blue and red dots describes the degree of conformational change in the simulation; the colour scale from blue to white to red corresponds to simulation time, with the first, intermediate and final timesteps shown in blue, white and red, respectively. The atomic displacements for PC1 are shown on the left side, indicating notable changes. (A) Reference molecule, (B) ZH-1, (C) ZH-2 and (D) ZH-3.
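The five PCA steps listed in this section (centring, covariance matrix, eigen-decomposition, sorting and projection) can be sketched directly with NumPy. The example assumes aligned Cα coordinates stored as an array of shape (frames, atoms, 3) and uses a synthetic trajectory with one dominant collective motion; the function name `pca_project` and the data are illustrative assumptions and not the analysis tool used in the study.

```python
import numpy as np

def pca_project(coords, n_components=2):
    """PCA of a trajectory; returns projections and explained-variance ratios."""
    X = coords.reshape(len(coords), -1)       # flatten to (frames, 3 * atoms)
    Xc = X - X.mean(axis=0)                   # 1. centre the data
    cov = np.cov(Xc, rowvar=False)            # 2. covariance matrix
    evals, evecs = np.linalg.eigh(cov)        # 3. eigenvalues and eigenvectors
    order = np.argsort(evals)[::-1]           # 4. sort by decreasing eigenvalue
    evals, evecs = evals[order], evecs[:, order]
    scores = Xc @ evecs[:, :n_components]     # 5. project onto the top PCs
    return scores, evals / evals.sum()

# Synthetic trajectory: 200 frames of 10 atoms moving along one collective
# mode, plus a small amount of random noise.
rng = np.random.default_rng(2)
reference = rng.normal(size=(10, 3))
mode = rng.normal(size=(10, 3))
mode /= np.linalg.norm(mode)
amplitude = np.linspace(-3.0, 3.0, 200)
trajectory = (reference
              + amplitude[:, None, None] * mode
              + rng.normal(scale=0.05, size=(200, 10, 3)))

scores, explained = pca_project(trajectory, n_components=2)
print(explained[:3])
```

The entries of `explained` play the role of the PC1, PC2 and PC3 percentages quoted above, and a scatter plot of the two columns of `scores`, coloured by frame index, gives the kind of blue-to-red projection described in the figure caption.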
3.8 Binding free energies calculations (MMGB/SA)
Molecular mechanics/Generalized Born Surface Area (MMGB/SA) is an important computational technique in drug design and molecular modelling for studying and optimizing molecular interactions. It provides quantitative estimates of binding free energy that help to identify lead compounds, improve their binding affinities and guide decisions throughout the drug development pipeline. The estimate combines contributions from molecular mechanics (MM) interactions and solvation effects (Genheden and Ryde, 2015). The calculated dG_Bind values were −64.2064, −100.314, −86.211 and −100.461 kcal/mole for the reference, ZH-1, ZH-2 and ZH-3, respectively. The lowest (most negative) binding energy, that of ZH-3 (−100.461 kcal/mole), may be due to the additional interactions discussed in the MD simulation section. The component binding energies dG_Bind_Coulomb, dG_Bind_Covalent, dG_Bind_Hbond, dG_Bind_Lipo and dG_Bind_Packing (stacking interactions) are summarized in Table 10; the H-bond, lipophilic and packing terms were negative or zero for all complexes, which indicates good interactions for all of the identified HITs. Compounds with lower (more negative) total binding free energies are good candidates for further investigation because they are more likely to interact strongly with the target protein.
| S. No. | Docked complexes | dG_Bind (kcal/mole) | dG_Bind_Coulomb (kcal/mole) | dG_Bind_Covalent (kcal/mole) | dG_Bind_Hbond (kcal/mole) | dG_Bind_Lipo (kcal/mole) | dG_Bind_Packing (kcal/mole) |
|---|---|---|---|---|---|---|---|
| 1. | Reference | −64.2064 | −8.6706 | 0.7049 | −0.2029 | −33.9547 | 0 |
| 2. | ZH-1 | −100.314 | −6.8246 | 2.0771 | −0.2728 | −50.1729 | −1.9133 |
| 3. | ZH-2 | −86.2110 | −7.82931 | 2.0573 | −1.1308 | −44.5062 | −1.5334 |
| 4. | ZH-3 | −100.461 | 48.2928 | 3.9930 | −0.1962 | −52.3336 | −1.2216 |
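The comparison of binding free energies can be made explicit by ranking the complexes and expressing each dG_Bind relative to the reference. The short sketch below uses the dG_Bind values reported above; the dictionary layout and variable names are illustrative assumptions.

```python
# dG_Bind values (kcal/mole) as reported for the four docked complexes.
dg_bind = {
    "Reference": -64.2064,
    "ZH-1": -100.314,
    "ZH-2": -86.2110,
    "ZH-3": -100.461,
}

# Rank from most to least favourable (most negative first).
ranked = sorted(dg_bind, key=dg_bind.get)

# Difference from the reference: negative means stronger predicted binding.
ddg = {name: round(value - dg_bind["Reference"], 4)
       for name, value in dg_bind.items()}

for name in ranked:
    print(f"{name:9s} dG_Bind = {dg_bind[name]:9.4f}  ddG vs reference = {ddg[name]:9.4f}")
```

On these numbers ZH-3 and ZH-1 are almost indistinguishable (a difference of about 0.15 kcal/mole), so the ranking between them should be read with caution.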
In-silico ADME calculation: In-silico ADME prediction was carried out for the reference, ZH-1, ZH-2 and ZH-3, and the results are summarized in Table 11. The parameters mentioned in the materials and methods section were evaluated. The designed compounds were not predicted to cross the blood–brain barrier or to cause central nervous system toxicity. The most active predicted compound, ZH-1, had 0 H-bond donors and 6.5 H-bond acceptors. The results indicated that the molecule has a lower oral absorption rate and metabolism. ZH-1 had a polar surface area of 83.459. Its other physicochemical properties, a QPlogPo/w of 6.3, a QPlogBB of −0.761 and a QPPMDCK of, suggested that the identified molecule has minimal toxicity and drug-like properties.
| ADME Properties | Reference | ZH-1 | ZH-2 | ZH-3 |
|---|---|---|---|---|
| Molecular Weight | 329.706 | 567.704 | 592.096 | 593.083 |
| HB Donor | 1 | 0 | 0 | 0 |
| HB Acceptor | 3.5 | 6.5 | 6 | 6.5 |
| LogPO/W | 3.865 | 7.773 | 7.248 | |
| logBB | 0.057 | −0.761 | −1.013 | −1.033 |
| #Metabolism | 4 | 4 | 4 | 3 |
| % Human oral Absorption | 3 | 3 | 3 | 1 |
| PSA | 50.482 | 83.459 | 84.057 | 87.619 |
| Ro5 Violation | 0 | 2 | 2 | 2 |
| Ro3 Violation | 0 | 1 | 1 | 1 |
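The rule-of-five violation counts in the ADME results can be reproduced with a few comparisons. The sketch below assumes the classical Lipinski thresholds (molecular weight above 500, more than 5 H-bond donors, more than 10 H-bond acceptors, logP above 5); the function name is hypothetical, and the prediction software used in the study may apply slightly different cut-offs. The ZH-1 logP of 6.3 is the QPlogPo/w value quoted in the text.

```python
def ro5_violations(mol_weight, hb_donors, hb_acceptors, logp):
    """Count Lipinski rule-of-five violations (classical thresholds assumed)."""
    rules = [
        mol_weight > 500,    # molecular weight
        hb_donors > 5,       # hydrogen-bond donors
        hb_acceptors > 10,   # hydrogen-bond acceptors
        logp > 5,            # octanol/water partition coefficient
    ]
    return sum(rules)

# Values for the reference compound and ZH-1 as reported in this section.
reference_violations = ro5_violations(329.706, 1, 3.5, 3.865)
zh1_violations = ro5_violations(567.704, 0, 6.5, 6.3)
print(reference_violations, zh1_violations)
```

Under these assumptions the two violations of ZH-1 come from its molecular weight and its lipophilicity, which are the properties to reduce if better oral drug-likeness is required.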
4 Conclusion
The goal of this study was to clarify the structural requirements that influence the inhibitory activity of a series of nitroimidazole derivatives as prospective anti-HIV agents. Statistically significant 2D/3D-QSAR models were obtained. The reliability and robustness of the 2D-QSAR models were established by extensive validation, including cross-validation, randomization tests and external test set predictions. T_2_Cl_7, SsssEIndex, T_N_Cl_6 (IIIB), SaaNE index, T_C_N_4, T_T_N_5, T_2_Cl_7, and T_2_S_5 HCT-116 were the most significant descriptors found in the best-performing 2D-QSAR models. These findings provide information about the structural factors that control inhibitory activity and useful direction for the development of new compounds with improved anti-HIV activity. The 2D/3D-QSAR models also proved useful for molecular design, opening the door to the discovery of novel molecules with enhanced therapeutic potential. In the docking studies, ZH-1 gave a lower docking score than the reference compound (−4.712 vs −4.399), which provides additional support for the designed molecules. Furthermore, the MD simulations suggested that the ligand–protein complexes have the stability needed for good inhibitory activity. The lower MMGB/SA binding energies indicated strong interactions between the designed molecules and the target, which may translate into improved potency and help in the optimization of drug candidates. In summary, our study emphasizes the value of 2D/3D-QSAR modelling as an effective tool for rational drug design. The in-silico data generated in this study can support the creation of more potent anti-HIV drugs, a crucial step in the fight against this serious health issue.
Funding
We are thankful to the Researchers Supporting Project number (RSPD2024R1005), King Saud University, Riyadh, Saudi Arabia, for supporting this work.
CRediT authorship contribution statement
Momin Ziyaul-Haque: Data curation, Conceptualization. Rashid Ayub: Investigation, Formal analysis. Mohd Usman Mohd Siddique: Resources, Methodology, Conceptualization. Amit Gangwal: Visualization, Validation. Azim Ansari: Formal analysis, Data curation. Mudassar Shahid: Writing – review & editing, Visualization, Conceptualization. Yogeeta O. Agrawal: Data curation. Tasneem Khan: Writing – review & editing, Writing – original draft.
Acknowledgement
Authors are thankful to the Researchers Supporting Project number (RSPD2024R1005), King Saud University, Riyadh, Saudi Arabia, for supporting this work.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
- Exploration of Cannabis constituents as potential candidates against diabetes mellitus disease using molecular docking, dynamics simulations and ADMET investigations. Sci. Afr.. 2023;21:e01745
- Structure-activity relationships in the development of hypoxic cell radiosensitizers: I. Sensitization efficiency. Int. J. Radiat. Biol. Relat. Stud. Phys. Chem. Med.. 1979;35:133-150.
- Enzymes: an integrated view of structure, dynamics and function. Microb. Cell Fact.. 2006;5:1-12.
- Nitric oxide synthases: structure, function and inhibition. Biochem. J.. 2001;357:593-615.
- Combined 3D-QSAR and molecular docking study on 7, 8-dialkyl-1, 3-diaminopyrrolo-[3, 2-f] Quinazoline series compounds to understand the binding mechanism of DHFR inhibitors. J. Mol. Struct.. 2017;1139:319-327.
- Furanone derivatives as new inhibitors of CDC7 kinase: development of structure activity relationship model using 3D QSAR, molecular docking, and in silico ADMET. Struct. Chem.. 2018;29:1031-1043.
- Bartroli, J., Alguero, M., Boncompte, E., Forn, J., 1992. Synthesis and antifungal activity of a series of difluorotritylimidazoles.
- An alignment-independent versatile structure descriptor for QSAR and QSPR based on the distribution of molecular features. J. Chem. Inf. Comput. Sci.. 2002;42:26-35.
- Three-dimensional quantitative structure–activity relationship (3D-QSAR) analysis and molecular docking-based combined in silico rational approach to design potent and novel TRPV1 antagonists. Med. Chem. Res.. 2013;22:2312-2327.
- Pharmacophore modeling and 3D QSAR studies of aryl amine derivatives as potential lumazine synthase inhibitors. Arab. J. Chem.. 2017;10:S100-S104.
- Synthesis of an (iodovinyl)misonidazole derivative for hypoxia imaging. J. Med. Chem.. 1991;34:2165-2168.
- Mechanistic understanding from molecular dynamics simulation in pharmaceutical research 1: drug delivery. Front. Mol. Biosci.. 2020;7:604770
- Cytotoxicity and probable mechanism of action of sulphimidazole. J. Antimicrob. Chemother.. 2000;46:541-550.
- Synthesis, characterization, DFT mechanistic study, antimicrobial activity, molecular modeling, and ADMET properties of novel pyrazole-isoxazoline hybrids. ACS Omega. 2022;7:46731-46744.
- Pharmacophore identification and QSAR studies on substituted benzoxazinone as antiplatelet agents: KNN-MFA approach. Sci. Pharm.. 2012;80:283
- 3D QSAR, pharmacophore indentification studies on series of 1-(2-ethoxyethyl)-1H-pyrazolo [4, 3-d] pyrimidines as phosphodiesterase V inhibitors. J. Saudi Chem. Soc.. 2015;19:265-273.
- Investigation of antileishmanial activities of acridines derivatives against promastigotes and amastigotes form of parasites using quantitative structure activity relationship analysis. Adv. Phys. Chem.. 2016;2016:1-16.
- In silico investigation of phytoconstituents from Cameroonian medicinal plants towards COVID-19 treatment. Struct. Chem.. 2022;33:1799-1813.
- Validation of the general purpose tripos 5.2 force field. J. Comput. Chem.. 1989;10:982-1012.
- Prevention of HIV-1 infection with early antiretroviral therapy. N. Engl. J. Med.. 2011;365:493-505.
- The Central Role of Enzymes as Biological Catalysts. Sinauer Associates; 2000.
- Cyclohexane-1,3-dione derivatives as future therapeutic agents for NSCLC: QSAR modeling, in silico ADME-Tox properties, and structure-based drug designing approach. ACS Omega. 2023;8:4294-4319.
- Role of molecular dynamics and related methods in drug discovery. J. Med. Chem.. 2016;59:4035-4061.
- Design of new molecules against cervical cancer using DFT, theoretical spectroscopy, 2D/3D-QSAR, molecular docking, pharmacophore and ADMET investigations. Heliyon. 2024;10
- Computer-aided drug discovery of natural antiviral metabolites as potential SARS-CoV-2 helicase inhibitors. J. Chem. Res.. 2024;48:17475198231221253
- Histamine H2-receptor agonists. Synthesis, in vitro pharmacology, and qualitative structure-activity relationships of substituted 4- and 5-(2-aminoethyl)thiazoles. J. Med. Chem.. 1992;35:3239-3246.
- Design of novel anti-cancer drugs targeting TRKs inhibitors based 3D QSAR, molecular docking and molecular dynamics simulation. J. Biomol. Struct. Dyn.. 2023;41:11657-11670.
- S-1153 inhibits replication of known drug-resistant strains of human immunodeficiency virus type 1. Antimicrob. Agents Chemother.. 1998;42:1340-1345.
- The spectrum of engagement in HIV care and its relevance to test-and-treat strategies for prevention of HIV infection. Clin. Infect. Dis.. 2011;52:793-800.
- Molecular dynamics simulations: advances and applications. AABC. 2015;37
- The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities. Expert Opin. Drug Discov.. 2015;10:449-461.
- DFT-based QSAR Studies of MK801 derivatives for non. Mol. Pharmacol.. 1988;33:581-584.
- Merck molecular force field. IV. Conformational energies and geometries for MMFF94. J. Comput. Chem.. 1996;17:587-615.
- Prediction of the binding affinities and selectivity for CB1 and CB2 ligands using homology modeling, molecular docking, molecular dynamics simulations, and MM-PBSA binding free energy calculations. ACS Chem. Neurosci.. 2020;11:1139-1158.
- Synthesis and in vitro pharmacology of a series of new chiral histamine H3-receptor ligands: 2-(R and S)-amino-3-(1H-imidazol-4(5)-yl)propyl ether derivatives. J. Med. Chem.. 1999;42:1193-1202.
- Discerning of isatin-based monoamine oxidase (MAO) inhibitors for neurodegenerative disorders by exploiting 2D, 3D-QSAR modelling and molecular dynamics simulation. J. Biomol. Struct. Dyn.. 2024;42:2328-2340.
- Molecular dynamics simulations and novel drug discovery. Expert Opin. Drug Discov.. 2018;13:23-37.
- HIV protease inhibitors: a review of molecular selectivity and toxicity. HIV/AIDS-Res. Palliat. Care. 2015:95-104.
- Finding structural requirements of structurally diverse α-glucosidase and α-amylase inhibitors through validated and predictive 2D-QSAR and 3D-QSAR analyses. J. Mol. Graph. Model.. 2024;126:108640
- Structural analysis of α-glucosidase inhibitors by validated QSAR models using topological and hydrophobicity based descriptors. Chemom. Intell. Lab. Syst.. 2011;109:101-112.
- Molecular docking: principles, advances, and its applications in drug discovery. Lett. Drug Des. Discovery. 2024;21:480-495.
- The many faces of p38 mitogen-activated protein kinase in progenitor/stem cell differentiation. Biochem. J.. 2012;445:1-10.
- Hydroxamic acid derivatives as selective HDAC3 inhibitors: computer-aided drug design strategies. J. Biomol. Struct. Dyn.. 2024;42:362-383.
- HIV entry inhibitors and their potential in HIV therapy. Med. Res. Rev.. 2009;29:369-393.
- Sharma, R., Patil, S., 2013. Three dimensional quantitative structure analysis substituted 1,3-diaryl propenone derivatives as antimalarial activity.
- Antiparasitic nitroimidazoles. 3. Synthesis of 2-(4-carboxystyryl)-5-nitro-1-vinylimidazole and related compounds. J. Med. Chem.. 1973;16:347-352.
- Field and atom-based 3D-QSAR models of chromone (1-benzopyran-4-one) derivatives as MAO inhibitors. J. Biomol. Struct. Dyn.. 2023;41:12171-12185.
- N-substituted-imidazoles as inhibitors of nitric oxide synthase: a preliminary screening. Pharmazie. 1999;54:685-690.
- Molecular dynamics simulations in drug discovery and pharmaceutical development. Processes. 2020;9:71
- 3D QSAR kNN-MFA studies on 6-substituted benzimidazoles derivatives as nonpeptide angiotensin II receptor antagonists: a rational approach to antihypertensive agents. J. Saudi Chem. Soc.. 2013;17:167-176.
- Crystal structure of a complex of HIV-1 protease with a dihydroxyethylene-containing inhibitor: comparisons with molecular modeling. Protein Sci.. 1992;1:1061-1072.
- Survival of HIV-positive patients starting antiretroviral therapy between 1996 and 2013: a collaborative analysis of cohort studies. The Lancet HIV. 2017;4:e349-e356.
- Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat. Rev. Drug Discov.. 2024;23:141-155.
- In silico identification of novel drug target and its natural product inhibitors for herpes simplex virus. In: Nanotechnology and in Silico Tools. Elsevier; 2024. p. 377-383.
- Proteins: Biochemistry and Biotechnology. John Wiley & Sons; 2014.