Integrating machine learning, docking analysis, molecular dynamics, and experimental validation for accelerated discovery of novel FLT3 inhibitors against AML
* Corresponding authors: E-mail addresses: 2225694159@qq.com (Y. Zhao), huangqiang65@sina.com (Q. Huang)
Abstract
Acute myeloid leukemia (AML) is a malignant clonal disorder driven by the excessive proliferation of immature myeloid cells in the bone marrow and blood, often linked to Fms-like tyrosine kinase 3 (FLT3) mutations, which occur in about one-third of AML patients. While FLT3 inhibitors such as midostaurin, quizartinib, and gilteritinib have demonstrated clinical efficacy, their therapeutic potential is often limited by drug resistance and adverse reactions. Therefore, the development of novel FLT3 inhibitors is critical for improving AML treatment outcomes. In this study, we employed a multi-faceted computer-aided drug design (CADD) approach, integrating machine learning (ML), molecular docking, and molecular dynamics simulations, to accelerate the discovery of new FLT3 inhibitors. An ML-based FLT3 classification model achieved an accuracy of 0.958, while an MV4-11 cell activity prediction model demonstrated strong predictive performance with an R2 of 0.846, MAE of 0.368, and RMSE of 0.492. Virtual screening of 7,280 compounds from the ChemDiv database led to the identification of 68 potential FLT3 inhibitors, with molecular dynamics simulations confirming their stable binding to the FLT3 protein. Experimental validation of four selected compounds showed promising activity in MV4-11 cellular assays, demonstrating the reliability of this integrated CADD approach. These results underscore the potential of a CADD-driven approach, enhanced by ML, to rapidly design new FLT3 inhibitors for AML treatment.
Keywords
AML
FLT3
Machine learning
Molecular docking
Molecular dynamics

1. Introduction
Acute myeloid leukemia (AML) constitutes a neoplastic clonal disorder that originates from myeloid hematopoietic stem or progenitor cells. This malignancy is defined by the uncontrolled growth of primitive and immature myeloid cells in the bone marrow and bloodstream, leading to rapid disease progression, low cure rates, poor prognosis, and short survival periods [1-3]. FMS-like tyrosine kinase 3 (FLT3) mutations are the most common mutations in AML, affecting approximately one-third of patients [4]. FLT3, a receptor tyrosine kinase, plays a critical role in hematopoiesis. Its mutation or overexpression can cause aberrant cell proliferation and survival, driving leukemia development [5]. Recently, significant efforts have been directed towards developing FLT3 inhibitors for leukemia treatment. Numerous studies have validated the efficacy of FLT3 inhibitors in treating AML [6-8]. These inhibitors, such as midostaurin [6,9,10], quizartinib [8,11], and gilteritinib [7,12,13], primarily inhibit FLT3 by competitively binding to its adenosine triphosphate (ATP) binding site, blocking FLT3 receptor phosphorylation. Despite their approval for commercial distribution in the United States, current FLT3 inhibitors face challenges such as resistance, adverse reactions, and limited drug selectivity, which impact treatment efficacy and patient quality of life. Therefore, developing novel and potent FLT3 inhibitors is crucial for improving therapeutic outcomes in AML.
However, the traditional process of drug development is lengthy and costly, typically taking 10 to 20 years and costing around $1.8 billion [14]. The time from the discovery of FLT3 mutations to the clinical approval of FLT3 inhibitors spans more than 20 years [6,15], primarily due to the limitations of conventional laboratory methods. Advanced technologies that can accelerate drug discovery are therefore critical for improving development efficiency and success rates [16]. Computer-aided drug design (CADD), which encompasses molecular docking and molecular dynamics [17], has emerged as a transformative approach in drug discovery [18,19], particularly with the inclusion of machine learning (ML). The successful application of CADD in various therapeutic areas has demonstrated its potential to enhance drug development efficiency [19-21]. The integration of ML into CADD workflows has further amplified this impact. ML excels in processing complex biological data [22], learning the intricate relationships between molecular structures and biological activity, and providing actionable insights for drug design. When molecular docking, molecular dynamics, and ML are combined, they enhance the predictive accuracy and expedite the drug discovery process, thereby increasing the chances of identifying novel inhibitors [23,24]. For FLT3-targeted AML therapy, integrating ML techniques with CADD can more precisely accelerate the discovery process by optimizing compound screening and refinement, compared to traditional FLT3 activity prediction models alone [25,26].
Therefore, this study aims to expedite the design and development of FLT3 inhibitors for AML by employing a multi-faceted approach that integrates CADD techniques, including ML, molecular docking, and molecular dynamics simulations (Scheme 1). An ML-based FLT3 classification model was developed to identify compounds with inhibitory activity, achieving 0.958 accuracy on an independent test set. Additionally, an ML model predicting MV4-11 cell activity (R2 = 0.846) facilitated rapid screening of the 7,280 ChemDiv compounds that met the docking-score threshold (≤ -10.524 kcal/mol). Of these, 68 potential FLT3 inhibitors were selected, and the binding stability of a representative compound was examined through molecular dynamics simulations. Four promising compounds were experimentally validated, supporting the reliability of the computational predictions. The findings of this study highlight the efficiency of a CADD-driven approach, enhanced by ML, in the rapid identification of novel FLT3 inhibitors, providing a valuable framework for the development of future therapeutic agents targeting AML and related diseases.

Scheme 1. Flowchart of integrating ML, docking analysis, molecular dynamics, and experimental validation for accelerated discovery of novel FLT3 inhibitors against AML.
2. Materials and Methods
2.1. Construction of FLT3 inhibitor activity dataset
In this study, a classification model dataset (CHEMBL1974) was sourced from the ChEMBL online database on December 1, 2023 [27]. To ensure accurate classification, the inhibitors collected were divided into active inhibitors (marked as ‘1’) and inactive inhibitors (marked as ‘0’). Generally, IC50 values for approved and clinical FLT3 drugs are below 100 nM. Therefore, a threshold of 100 nM was set in this study: inhibitors with IC50 values under 100 nM were classified as active, while those with values above 1000 nM were deemed inactive. To reduce the influence of boundary effects and experimental errors on predictions, inhibitors with IC50 values between 100 nM and 1000 nM were excluded. For the classification model, we employed 1,189 active inhibitors and 1,412 inactive inhibitors. Using DataWarrior software [28], we analyzed the plain ring systems and Murcko scaffolds of the 2,601 inhibitors, identifying 265 plain rings and 1,141 Murcko scaffolds. Figure 1 highlights the 24 most frequent plain rings, showcasing the compound diversity within the dataset, which contributed to the construction of robust predictive models. Table S1 and S2 provide the SMILES and occurrence frequencies of the top 24 plain rings and Murcko scaffolds after analyzing the 2,601 FLT3 inhibitors.

Figure 1. The top 24 plain rings of the 2,601 FLT3 inhibitors.
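The labeling rule described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function name `label_activity` and the example IC50 values are hypothetical, and treating values of exactly 100 nM or 1000 nM as excluded is an assumption, since the text specifies only "under" and "above" those limits.

```python
from typing import Optional

ACTIVE_BELOW_NM = 100.0     # IC50 under this value -> active ('1')
INACTIVE_ABOVE_NM = 1000.0  # IC50 above this value -> inactive ('0')


def label_activity(ic50_nm: float) -> Optional[int]:
    """Return 1 (active), 0 (inactive), or None for the excluded 100-1000 nM band."""
    if ic50_nm < ACTIVE_BELOW_NM:
        return 1
    if ic50_nm > INACTIVE_ABOVE_NM:
        return 0
    return None  # boundary band (assumed inclusive): dropped from the dataset


# Hypothetical IC50 values in nM, for illustration only.
example_ic50_nm = [3.2, 85.0, 100.0, 450.0, 1000.0, 2500.0]
labels = [label_activity(v) for v in example_ic50_nm]

# Keep only compounds that received a class label.
kept = [(v, y) for v, y in zip(example_ic50_nm, labels) if y is not None]
```

Dropping the intermediate band trades dataset size for cleaner class separation, which is the rationale given in the text.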
The dataset at the cellular level (MV4-11, all compounds are FLT3 inhibitors) for this research was compiled through an extensive literature review and meticulous data collection, resulting in 844 sets of IC50 data for FLT3 inhibitors at the MV4-11 cell level. For compounds with multiple IC50 values, the average value was calculated and used as the final result. In this study, we used the negative base-10 logarithm of IC50 (-logIC50, or pIC50) as the dependent variable, rather than IC50 itself. The distribution of these data is shown in Figure 2, with a notable clustering of pIC50 values between 5.00 and 9.00. DataWarrior software was again employed to examine the plain ring systems and Murcko scaffolds of the 844 FLT3 inhibitors at the MV4-11 cell level, identifying 92 plain rings and 337 Murcko scaffolds. These findings highlight the compound diversity within our MV4-11 dataset. Tables S3 and S4 present the SMILES and occurrence frequencies of the top 24 plain rings and Murcko scaffolds after analyzing the 844 FLT3 inhibitors at the MV4-11 cell level.

Figure 2. Distribution of pIC50 data for 844 FLT3 inhibitors at the MV4-11 cell level.
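As a hedged sketch of the preprocessing described above, the snippet below averages replicate IC50 values per compound and converts the result to pIC50. It assumes IC50 is given in nM (so pIC50 = 9 - log10(IC50 in nM)) and that replicates are averaged arithmetically before the logarithm is taken; the function names and records are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in mol/L)."""
    return 9.0 - math.log10(ic50_nm)


def aggregate_pic50(records):
    """Average replicate IC50 values per compound, then convert to pIC50."""
    grouped = defaultdict(list)
    for compound_id, ic50_nm in records:
        grouped[compound_id].append(ic50_nm)
    return {cid: pic50_from_nm(mean(values)) for cid, values in grouped.items()}


# Hypothetical (compound, IC50 in nM) records; "A" has two measurements.
records = [("A", 50.0), ("A", 150.0), ("B", 1000.0)]
pic50_by_compound = aggregate_pic50(records)
```

Under this convention, the 100 nM activity threshold used for classification corresponds to a pIC50 of 7.00.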
2.2. ML models building
The molecular fingerprints employed in this study served as inputs for ML algorithms, facilitating the transformation of chemical structures into numerical representations through descriptor computation. These conversions, expressed as either binary or continuous values, enable molecular diversity and similarity to be quantified in a manner suitable for ML models. By encoding molecular data into fixed-length vectors, processing efficiency was substantially improved. Specifically, fingerprints were calculated using the PaDEL software [29], with four types of molecular fingerprints utilized: CDK fingerprint, CDK extended fingerprint, Substructure fingerprint, and Substructure fingerprint count. These fingerprints capture essential molecular properties, aiding in the development of predictive models for FLT3 activity. Additional information on the molecular fingerprints is available in Table S5.
To create binary classification models, LightGBM, Random Forest, k-nearest neighbors (KNN), multi-layer perceptron (MLP), Decision Tree, and Logistic Regression were employed. For regression, LightGBM, Random Forest, KNN, PLS, LASSO, and XGBoost were used. Among these, the LightGBM algorithm was preferred due to its advantages as an ensemble learning method [30]. Additionally, the LightGBM algorithm is capable of building both classification models and powerful regression models. LightGBM, a modification of Gradient Boosting Decision Trees (GBDT), is well-known for its efficiency in tackling classification and regression challenges involving extensive datasets and high-dimensional attributes. It incorporates two primary strategies: gradient-based one-sided sampling (GOSS) and exclusive feature bundling (EFB) [31]. GOSS enables efficient handling of large sample sizes, while EFB optimizes the management of numerous features, reducing overfitting risks. Furthermore, LightGBM’s leaf-wise tree splitting strategy enhances accuracy by focusing on nodes with the greatest delta loss. This approach has consistently outperformed traditional level-wise splitting methods [32]. A ten-fold cross-validation strategy was employed to minimize the impact of random data splitting, and enhancement methods such as hyper-parameter adjustment were utilized to boost the model’s accuracy. The grid search approach was employed to identify the best parameters for the models. When the parameter space is limited and computational resources are sufficient, grid search ensures a comprehensive exploration of parameter combinations and the reproducibility of results, while efficiently utilizing parallel computing to accelerate the optimization process. The datasets were split into training and testing subsets with a 90:10 ratio. 
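The data-handling side of this protocol (a 90:10 split, ten-fold cross-validation, and exhaustive grid enumeration) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the authors' pipeline: no model is trained here, the helper names are hypothetical, and the grid values for the LightGBM parameters `num_leaves` and `learning_rate` are placeholders rather than the settings reported in the supplementary tables.

```python
import random
from itertools import product


def train_test_split_indices(n, test_fraction=0.10, seed=0):
    """Shuffle indices 0..n-1 and hold out `test_fraction` as the test set."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def parameter_grid(param_grid):
    """Enumerate every combination of the listed parameter values."""
    keys = list(param_grid)
    for values in product(*(param_grid[key] for key in keys)):
        yield dict(zip(keys, values))


# 2,601 compounds is the size of the classification dataset in the text.
train_idx, test_idx = train_test_split_indices(2601)
folds = list(k_fold_indices(train_idx, k=10))

# Placeholder grid; each combination would be scored on the ten folds.
grid = list(parameter_grid({"num_leaves": [15, 31], "learning_rate": [0.05, 0.1]}))
```

Because the grid is enumerated exhaustively, its cost grows multiplicatively with the number of values per parameter, which is why the text restricts it to a limited parameter space.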
In this study, compounds with IC50≤100 nM were classified as the “positive class,” while those with IC50≥1000 nM were classified as the “negative class.” The confusion matrix was used to visually represent the model’s performance, comparing true positive (TP), false negative (FN), false positive (FP), and true negative (TN) predictions. Accuracy was defined as (Eq. 1):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
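When class distribution is balanced, accuracy is a reliable metric, but for imbalanced classes, it tends to favor the majority class. Therefore, the F1-score, which balances precision and recall, was used. Precision, recall, and the F1-score were calculated as follows (Eqs. 2-4):

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$

$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4}$$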
The formulas show that recall depends on the true label, while precision is influenced by predictions. A high recall usually comes with low precision, and vice versa, depending on how the model balances these factors. The F1 score combines both metrics, with higher scores indicating superior model performance.
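The following short sketch computes these classification metrics from confusion-matrix counts using their standard definitions. The function name and the example counts are hypothetical and are not taken from the study's confusion matrix.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fn=10, fp=5, tn=95)
```

With these counts, recall is limited by the 10 missed actives while precision is limited by the 5 false alarms, which illustrates why the two metrics can move in opposite directions.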
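For regression models, evaluation metrics included mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2). The corresponding formulas are (Eqs. 5-7):

$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i^{\mathrm{exp}} - y_i^{\mathrm{pred}}\right| \tag{5}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{pred}}\right)^2} \tag{6}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{m}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{pred}}\right)^2}{\sum_{i=1}^{m}\left(y_i^{\mathrm{exp}} - \bar{y}\right)^2} \tag{7}$$

In these equations, $y_i^{\mathrm{exp}}$ and $y_i^{\mathrm{pred}}$ represent the true and predicted values of sample $i$, respectively, $\bar{y}$ is the mean of the true values, and $m$ denotes the number of samples. The complete source code and dataset are available at https://github.com/Yihuan-Zhao93/FLT3ML.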
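A minimal sketch of the three regression metrics is given below. It uses the conventional definitions, with the mean in R2 taken over the experimental values; the function name and the pIC50 values are hypothetical.

```python
import math


def regression_metrics(y_exp, y_pred):
    """Return MAE, RMSE and R2 for paired experimental and predicted values."""
    if len(y_exp) != len(y_pred) or not y_exp:
        raise ValueError("inputs must be non-empty and of equal length")
    m = len(y_exp)
    residuals = [e - p for e, p in zip(y_exp, y_pred)]
    mae = sum(abs(r) for r in residuals) / m
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / m)
    y_bar = sum(y_exp) / m  # mean of the experimental values
    ss_tot = sum((e - y_bar) ** 2 for e in y_exp)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical pIC50 values for illustration only.
scores = regression_metrics([5.0, 6.0, 7.0, 8.0], [5.5, 6.0, 6.5, 8.0])
```

RMSE is never smaller than MAE because squaring weights large residuals more heavily, so a wide gap between the two points to a few poorly predicted compounds.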
2.3. Molecular docking studies and molecular dynamics (MD) simulation
To identify novel FLT3 inhibitors, we performed a structure-based virtual screening approach. The crystal structure of FLT3 (PDB ID: 4XUF) was retrieved from the Protein Data Bank (PDB) [33]. Initially, the PDB file was prepared by removing crystallographic water molecules and the bound ligand, followed by assigning partial charges. The 3D structure of the ligand was generated and saved as a pdbqt file using AutoDock software [34]. Once the protein and ligand structures were prepared, molecular docking was conducted with AutoDock Vina. Docking simulations were carried out by placing the ligands into the active site of FLT3, with the grid box set to 20 × 20 × 20 Å³, centered at the original ligand’s binding site. Binding affinity was evaluated based on the docking scores, where a more negative binding energy suggested stronger interactions between the ligand and the FLT3 receptor. For detailed interaction analysis, Discovery Studio Visualizer [35] was employed, allowing visualization of the molecular interactions between the ligands and protein. To ensure the reliability of the docking procedure, the co-crystallized ligands were re-docked into their respective binding sites within the FLT3 crystal structure. The resulting docking conformations closely matched the experimentally observed ones, with low root-mean-square deviation (RMSD) values (Figure S1), validating the docking methodology. Subsequently, a virtual screening of approximately 1.6 million compounds from the ChemDiv database was performed using the Glide software. Docking poses and scores were generated for all compounds, providing a comprehensive evaluation of potential inhibitors. Notably, the ChemDiv compounds are commercially available, facilitating rapid procurement and experimental verification.
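The re-docking check can be illustrated with a small RMSD calculation. The sketch below assumes that the crystal and docked poses list the same heavy atoms in the same order and share the receptor's coordinate frame, so no superposition is applied; the function name and coordinates are hypothetical. A cutoff of about 2 Å is a commonly used convention for a successful re-docking, not a value stated in this study.

```python
import math


def pose_rmsd(reference, docked):
    """RMSD in Å between two poses given as matched lists of (x, y, z) tuples."""
    if len(reference) != len(docked) or not reference:
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for ref_atom, dock_atom in zip(reference, docked)
        for a, b in zip(ref_atom, dock_atom)
    )
    return math.sqrt(squared / len(reference))


# Hypothetical two-atom poses: the docked pose is shifted by 1 Å along z.
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = pose_rmsd(crystal_pose, docked_pose)
```

Skipping the superposition step is deliberate: aligning the poses first would hide translations of the ligand within the pocket, which is exactly what a re-docking test is meant to detect.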
Molecular dynamics (MD) simulations were conducted with the GROMACS 2022 suite [36]. Following the detailed protocol outlined in prior literature [22], we ran a 100 ns MD simulation, saving frames every 10 ps. After the simulation, trajectory analysis was performed using Visual Molecular Dynamics (VMD) and PyMOL, and the binding free energy between the protein and ligand was calculated using the g_mmpbsa tool.
2.4. Experimental validation
In this study, compounds C1-C4 were procured from TargetMol (Shanghai). Cell viability was assessed using the Cell Counting Kit-8 (CCK-8) assay. MV4-11 cells were seeded into 96-well plates at a density of 3 × 10⁴ cells per well in 100 μL of medium and incubated for 24 h. The experimental setup included a blank control group, which contained only the 1640 complete medium, and treatment groups with varying concentrations of compounds C1-C4 at 1, 10, 100, 1,000, 10,000, and 100,000 nM. Each treatment condition was replicated in three wells, and the absorbance in each well was recorded three times to ensure accuracy. Compounds C1-C4 were dissolved and diluted in 1640 complete medium to prepare the respective treatment solutions. Following a 24-h treatment period with the compounds, 10 μL of CCK-8 solution was added to each well. Subsequently, the plates were incubated for an additional 2 h at 37°C. Absorbance was then measured at 450 nm using a microplate reader to evaluate cell viability. The percentage of cell viability was calculated with the following formula (Eq. 8):
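As a hedged illustration, the sketch below applies the normalization conventionally used for CCK-8 assays: viability (%) = (A_treated - A_blank) / (A_control - A_blank) × 100. This form is an assumption and may differ from the authors' Eq. 8. In particular, it requires untreated control wells containing cells, which the text above does not describe explicitly. The function name and absorbance readings are hypothetical.

```python
from statistics import mean


def cell_viability_percent(treated, control, blank):
    """Percent viability from replicate absorbance readings at 450 nm.

    treated: wells with cells and compound
    control: wells with cells and no compound (assumed to exist)
    blank:   wells with medium only
    """
    a_treated, a_control, a_blank = mean(treated), mean(control), mean(blank)
    if a_control == a_blank:
        raise ValueError("control and blank absorbances must differ")
    return (a_treated - a_blank) / (a_control - a_blank) * 100.0


# Hypothetical triplicate absorbance readings for one concentration.
viability = cell_viability_percent(
    treated=[0.60, 0.61, 0.59],
    control=[1.10, 1.12, 1.08],
    blank=[0.10, 0.10, 0.10],
)
```

Subtracting the blank removes the absorbance contributed by the medium and the CCK-8 reagent themselves, so that a reading equal to the blank maps to 0% viability.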
3. Results and Discussion
3.1. Construction of FLT3 biological activity ML classification models
In this study, ML models were developed to predict the activity of FLT3 inhibitors, utilizing various molecular fingerprints as input features. The representation of FLT3 inhibitors in the datasets included CDK fingerprints, CDK extended fingerprints, substructure fingerprints, and substructure fingerprint counts. CDK fingerprints are 1024-bit one-dimensional arrays that capture the presence of distinct structural features. The extended CDK fingerprints improve upon standard CDK fingerprints by incorporating additional ring features. Substructure fingerprints and counts, represented as 307-bit binary strings, indicate the presence of SMiles ARbitrary Target Specification (SMARTS) patterns for functional group classification, as defined by Christian Laggner. Given that model performance can vary significantly based on the choice of input features, we first assessed the predictive performance of models with different input representations. Table 1 presents the performance metrics for various models using ten-fold cross-validation on an independent test set. The results demonstrated that the model employing CDK extended fingerprints (1024 bits) achieved the highest accuracy, precision, and F1-score, indicating superior predictive performance. Combining different molecular fingerprints did not enhance the model’s performance; therefore, CDK extended fingerprints (1024 bits) were selected as the optimal input for the classification models. Figure 3(a) displays the confusion matrix for the classification model using CDK extended fingerprints without cross-validation, with prediction accuracies of 0.959 for active compounds and 0.957 for inactive compounds, resulting in an overall accuracy of 0.958, demonstrating high predictive accuracy. Additionally, the performance of the LightGBM algorithm was compared to five other algorithms. The optimal parameter values for different classification ML prediction models have been shown in Table S6. 
Figure 3(b) shows the comparative F1-scores of various algorithms utilizing CDK extended fingerprints as input. The LightGBM algorithm exhibited superior predictive performance, validating its effectiveness in constructing structure-activity relationship models for inhibitors. Thus, the LightGBM algorithm was also used in the subsequent activity prediction model at the MV4-11 cell level. To further validate the model’s accuracy, 400 compounds not included in the training or independent test sets (out-of-sample, sourced from the BindingDB database) were selected, consisting of 200 active and 200 inactive compounds. The model accurately predicted the activity of 391 out of the 400 compounds, achieving an overall accuracy of 0.978. The detailed information of the 400 compounds has been listed in Table S7. This high accuracy underscores the model’s robust generalization capability, indicating its potential in predicting FLT3 activity for unknown samples. To illustrate the higher predictive accuracy of our model, we downloaded the data from the referenced study [25] and used CDK extended fingerprints as input, applying the LightGBM method to build a classification model. With ten-fold cross-validation, the training set achieved an accuracy of 0.9782 ± 0.0018, precision of 0.9782 ± 0.0019, and F1 score of 0.9782 ± 0.0018. For the independent test set, the model yielded an accuracy of 0.8427 ± 0.0186, a precision of 0.8437 ± 0.0185, and an F1 score of 0.8426 ± 0.0186. Additionally, we downloaded the 1,350 data points from the recent publication [26] and applied the LightGBM method to build a regression model. With ten-fold cross-validation, the training set achieved an R2 of 0.998 ± 0.000, MAE of 0.031 ± 0.001, and RMSE of 0.047 ± 0.001. The test set achieved an R2 of 0.896 ± 0.005, MAE of 0.242 ± 0.007, and RMSE of 0.313 ± 0.008. These results further demonstrate the high accuracy of our ML method.
| Input (fingerprints) | Input number | Accuracy (independent test set) | Precision (independent test set) | F1-score (independent test set) |
|---|---|---|---|---|
| CDK | 1024 | 0.9167 ± 0.0180 | 0.9159 ± 0.0179 | 0.9162 ± 0.0181 |
| CDK extended | 1024 | 0.9184 ± 0.0192 | 0.9179 ± 0.0193 | 0.9180 ± 0.0192 |
| Substructure | 307 | 0.8436 ± 0.0255 | 0.8435 ± 0.0251 | 0.8429 ± 0.0254 |
| Substructure count | 307 | 0.8726 ± 0.0210 | 0.8719 ± 0.0207 | 0.8719 ± 0.0209 |
| CDK + CDK extended | 2048 | 0.9167 ± 0.0207 | 0.9159 ± 0.0206 | 0.9163 ± 0.0207 |
| Substructure + Substructure count | 614 | 0.8751 ± 0.0246 | 0.8745 ± 0.0247 | 0.8745 ± 0.0246 |
| CDK + Substructure count | 1331 | 0.9128 ± 0.0185 | 0.9120 ± 0.0184 | 0.9124 ± 0.0186 |
| CDK extended + Substructure count | 1331 | 0.9154 ± 0.0192 | 0.9150 ± 0.0189 | 0.9150 ± 0.0192 |
| CDK extended + Substructure + Substructure count | 1638 | 0.9150 ± 0.0211 | 0.9146 ± 0.0208 | 0.9145 ± 0.0213 |
| CDK + Substructure + Substructure count | 1638 | 0.9132 ± 0.0209 | 0.9126 ± 0.0208 | 0.9129 ± 0.0210 |
| CDK+CDK extended + Substructure | 2355 | 0.9137 ± 0.0194 | 0.9130 ± 0.0190 | 0.9133 ± 0.0195 |
| CDK+CDK extended + Substructure count | 2355 | 0.9120 ± 0.0173 | 0.9112 ± 0.0170 | 0.9116 ± 0.0173 |
| CDK + CDK extended + Substructure + Substructure count | 2662 | 0.9141 ± 0.0168 | 0.9134 ± 0.0167 | 0.9137 ± 0.0168 |

Figure 3. The performance of the FLT3 classification prediction model: (a) confusion matrix; (b) comparison of the prediction performance of different algorithms utilizing CDK extended fingerprints as input.
3.2. Construction of MV4-11 ML models
To identify potent FLT3 inhibitors for targeted AML therapy, we developed ML predictive models based on activity data from 844 FLT3 inhibitors tested in MV4-11 cells. Initially, we evaluated the impact of different molecular fingerprint inputs on prediction performance, as summarized in Table 2. The results show that using individual molecular fingerprints, such as CDK or CDK extended fingerprints, resulted in high prediction accuracy. Specifically, with CDK or CDK extended fingerprints alone, the LightGBM model achieved R2 values of 0.809 and 0.805, MAE values of 0.402 and 0.406, and RMSE values of 0.548 and 0.553, respectively, indicating that even single molecular fingerprints provide reasonably accurate predictions. Combining different molecular fingerprints further improved the model’s performance. For instance, the combination of CDK extended and Substructure count fingerprints yielded an R2 value of 0.817, an MAE of 0.394, and an RMSE of 0.536. When CDK, CDK extended, Substructure, and Substructure count fingerprints were combined, the model’s performance reached its highest level, achieving an R2 value of 0.819, an MAE of 0.394, and an RMSE of 0.533 on the independent test set. This combination improved prediction accuracy relative to single fingerprints (R2 of 0.819 versus 0.809 for CDK alone), although the gain is comparable to the cross-validation standard deviations reported in Table 2. Even though integrating all molecular fingerprints yielded the highest predictive performance, utilizing only the CDK, CDK extended, and Substructure count fingerprints gave the same metrics with fewer input features. Consequently, we chose this combination as the input for our final predictive model. In summary, combining multiple molecular fingerprints improves the performance of FLT3 inhibitor activity prediction models. Figure 4 illustrates the prediction performance of the model using CDK, CDK extended, and Substructure count fingerprints as inputs.
For the independent test set, the model achieved an R2 value of 0.846, an MAE of 0.368, and an RMSE of 0.492, demonstrating excellent predictive capabilities. Figure S2 presents a comparison of the model performance between LightGBM and other algorithms, including Random Forest, KNN, PLS, LASSO, and XGBoost. From the figure, it can be observed that the model constructed using the LightGBM algorithm has the smallest MAE and RMSE. Table S8 presents the parameter values of different ML prediction models. Therefore, after constructing classification and regression prediction models with high prediction accuracy, we applied these models to predict the activity of compounds that exhibited high docking scores obtained through molecular docking.
| Fingerprint | Input number | R2 (independent test set) | MAE (independent test set) | RMSE (independent test set) |
|---|---|---|---|---|
| CDK | 1024 | 0.809±0.011 | 0.402±0.014 | 0.548±0.016 |
| CDK extended | 1024 | 0.805±0.021 | 0.406±0.016 | 0.553±0.030 |
| Substructure | 307 | 0.665±0.020 | 0.541±0.011 | 0.726±0.021 |
| Substructure count | 307 | 0.625±0.035 | 0.561±0.021 | 0.767±0.036 |
| CDK + CDK extended | 2048 | 0.808±0.016 | 0.402±0.015 | 0.549±0.024 |
| Substructure + Substructure count | 614 | 0.625±0.035 | 0.562±0.021 | 0.767±0.036 |
| CDK extended + Substructure count | 1331 | 0.817±0.015 | 0.394±0.012 | 0.536±0.022 |
| CDK + Substructure count | 1331 | 0.815±0.015 | 0.400±0.009 | 0.539±0.022 |
| CDK extended + Substructure + Substructure count | 1638 | 0.817±0.015 | 0.394±0.012 | 0.536±0.022 |
| CDK + Substructure + Substructure count | 1638 | 0.815±0.015 | 0.400±0.009 | 0.539±0.022 |
| CDK+CDK extended + Substructure | 2355 | 0.812±0.012 | 0.399±0.013 | 0.543±0.018 |
| CDK+CDK extended + Substructure count | 2355 | 0.819±0.012 | 0.394±0.011 | 0.533±0.018 |
| CDK+CDK extended + Substructure + Substructure count | 2662 | 0.819±0.012 | 0.394±0.011 | 0.533±0.018 |

Figure 4. True and predicted pIC50 values obtained from the LightGBM model using CDK, CDK extended, and Substructure count fingerprints as inputs for (a) the training set and (b) the independent test set.
3.3. Molecular docking studies
To ensure the reliability of molecular docking, we validated the protocol beforehand. The ligand P30 (quizartinib) was re-docked into the FLT3 binding site, achieving an RMSD of 1.481 Å, as illustrated in Figure S1, confirming accurate pose prediction. The binding energy between P30 and FLT3 was determined to be -10.524 kcal/mol, indicating a robust interaction. Additionally, we performed docking with sorafenib and FLT3, which yielded a docking score of -10.997 kcal/mol. This supports the use of -10.524 kcal/mol as a reliable threshold for screening potent FLT3 inhibitors. Next, we docked 1,600,000 molecules from the ChemDiv library into the FLT3 site using the same parameters. Applying the threshold of -10.524 kcal/mol selected 7,280 compounds as potential FLT3 inhibitors for further study. This criterion was chosen to identify compounds with strong binding affinity. We employed DataWarrior software to explore the structural diversity, analyzing plain ring systems and Murcko scaffolds. This revealed 436 unique rings and 2,678 scaffolds, indicating notable diversity beneficial for discovering new FLT3 inhibitors. Molecular weight and AlogP values were calculated and plotted, with Figure 5(a) showing a wide chemical space distribution. The ML classification model predicted compound activity, identifying 2,720 compounds as active, with 304 having a high activity probability, indicating strong potential as FLT3 inhibitors. The MV4-11 prediction model assessed cellular activity, finding 951 compounds (13.06%) with a predicted pIC50 above 7.00, as seen in Figure 5(b). Applying criteria of high activity probability (probability ≥ 0.90) and a pIC50≥7.00, we identified 68 potential FLT3 inhibitors for AML cells. Table S9 details these selected compounds.
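The selection criteria in this paragraph can be expressed as a simple filter. The sketch below is an illustration under stated assumptions: the three cutoffs are taken from the text, but the `Candidate` record, its field names, and the example compounds are hypothetical, and the three criteria are combined into one predicate even though the study applied the docking cutoff first and the ML predictions afterwards.

```python
from dataclasses import dataclass

DOCKING_CUTOFF = -10.524   # kcal/mol; more negative means stronger predicted binding
PROBABILITY_CUTOFF = 0.90  # predicted probability of being an active FLT3 inhibitor
PIC50_CUTOFF = 7.00        # predicted MV4-11 pIC50


@dataclass
class Candidate:
    compound_id: str
    docking_score: float
    active_probability: float
    predicted_pic50: float


def passes_funnel(c: Candidate) -> bool:
    """True if the compound meets the docking, classifier and pIC50 criteria."""
    return (
        c.docking_score <= DOCKING_CUTOFF
        and c.active_probability >= PROBABILITY_CUTOFF
        and c.predicted_pic50 >= PIC50_CUTOFF
    )


# Hypothetical candidates for illustration only.
candidates = [
    Candidate("cpd-A", -11.2, 0.95, 7.4),  # meets all three criteria
    Candidate("cpd-B", -10.1, 0.97, 7.8),  # docking score too weak
    Candidate("cpd-C", -10.9, 0.85, 7.2),  # classifier probability too low
    Candidate("cpd-D", -10.6, 0.93, 6.5),  # predicted pIC50 too low
]
hits = [c.compound_id for c in candidates if passes_funnel(c)]
```

Note that the docking comparison uses `<=` because docking scores are negative binding energies, so a compound passes when its score is at or below the quizartinib reference value.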
The scaffold analysis of the 68 compounds identified the most frequently occurring Murcko scaffolds (Table S10), resulting in the selection of four compounds containing the most common Murcko scaffolds for further molecular dynamics simulations and experimental validation, as shown in Figure 6.

Figure 5. (a) The chemical space distribution of the 7,280 molecules. (b) The distribution of predicted pIC50 values for the 7,280 molecules.

Figure 6. The structural formulas of the four compounds selected for experimental validation.
3.4. Molecular dynamics simulations
To evaluate the binding stability of the selected potential FLT3 inhibitors with FLT3 protein, we conducted a 100 ns molecular dynamics simulation on the docked complex of compound C1 and FLT3. Since the structural formulas of compounds C2-C4 are similar to C1, we will only discuss the case of compound C1 in this context. First, we examined the RMSD over the course of the 100 ns molecular dynamics simulation. RMSD is an essential measure that accurately represents the total displacement of all atoms from the reference conformation at any specific time point, acting as a primary indicator for evaluating the stability of the system. A detailed analysis of the data presented in Figure 7(a) reveals that as the simulation progresses, the RMSD values for both the complex structure and the protein exhibit a gradually stabilizing trend, with no significant fluctuations observed, indicating a progressively stable complex structure. Furthermore, to investigate the dynamic behavior of the small molecule on the protein surface, we focused on the interactions between the small molecule and the protein. By precisely calculating the distance between the centroid of the initial docking site residues and the centroid of the small molecule, as well as the distance between the ligand and the protein’s centroid, we enhanced our understanding of the ligand’s binding state to the protein. Figure 7(b) clearly illustrates that the distances between the ligand and the protein center, as well as between the ligand and its original binding site, remain highly stable without significant fluctuations. This finding strongly suggests that the small molecule consistently binds to the initial binding site on the protein throughout the simulation, demonstrating a stable binding relationship between the small molecule and the protein.

Figure 7. (a) RMSD and (b) distance obtained from the MD simulation of the complex between compound C1 and the FLT3 protein.
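The distance analysis described above reduces to computing centroids and the distance between them for each trajectory frame. The sketch below is a minimal illustration for a single frame, assuming unweighted geometric centroids and coordinates in Å; the function names and coordinates are hypothetical, and reading coordinates from an actual trajectory is outside its scope.

```python
import math


def centroid(coords):
    """Unweighted geometric center of a list of (x, y, z) coordinates."""
    if not coords:
        raise ValueError("at least one atom is required")
    n = len(coords)
    return tuple(sum(atom[axis] for atom in coords) / n for axis in range(3))


def centroid_distance(group_a, group_b):
    """Distance between the centroids of two atom groups."""
    return math.dist(centroid(group_a), centroid(group_b))


# Hypothetical single-frame coordinates for illustration only.
binding_site_atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ligand_atoms = [(1.0, 3.0, 4.0)]
distance = centroid_distance(binding_site_atoms, ligand_atoms)
```

Repeating this calculation for every saved frame gives a distance trace over time; a flat trace is what indicates that the ligand stays in its original pocket.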
Considering the solvation effects, we integrated several parameters, including buried solvent-accessible surface area and interaction energy, to comprehensively assess the stability of the complex. Based on these considerations, we selected the stable-state trajectories of the complex and performed calculations using the Molecular Mechanics-Poisson Boltzmann Surface Area (MM-PBSA) method. The binding energy and related energy terms are detailed in Table 3. The van der Waals interaction energy (ΔEvdw) within the complex is considerably larger in magnitude than the electrostatic interaction energy (ΔEele), by a factor of approximately 2.7. Both are also larger in magnitude than the non-polar (hydrophobic) term (ΔEnonpol). This indicates that van der Waals interactions dominate the binding energy composition, with electrostatic interactions playing a supplementary role and hydrophobic interactions providing further support. Notably, the binding energy (ΔEMMPBSA) between the small molecule and the protein reached -186.622±2.329 kJ/mol, a strongly negative value indicating favorable binding and high predicted affinity between the small molecule and the protein. This finding provides important insights for further understanding and optimizing the interactions between the small molecule and the protein.
| Terms | Compound C1-FLT3 (kJ/mol) |
|---|---|
| ΔEvdw (van der Waals energy) | −279.302 ± 2.823 |
| ΔEele (electrostatic energy) | −104.423 ± 1.305 |
| ΔEpol (polar solvation free energy) | 227.063 ± 0.263 |
| ΔEnonpol (non-polar solvation free energy) | −29.961 ± 0.186 |
| ΔEMMPBSA* | −186.622 ± 2.329 |
| -TΔS | 16.304 ± 0.857 |
| ΔGbind* (calculated Gibbs free energy) | −170.319 ± 3.181 |
* ΔEMMPBSA = ΔEvdw + ΔEele + ΔEpol + ΔEnonpol; ΔGbind = ΔEMMPBSA − TΔS.
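The relations in the table footnote can be checked with a few lines of arithmetic. The sketch below uses only the mean values from Table 3 and ignores the reported uncertainties. The variable names are illustrative and do not come from the authors' code.

```python
# Mean energy terms from Table 3, in kJ/mol
terms = {
    "vdw": -279.302,     # van der Waals energy
    "ele": -104.423,     # electrostatic energy
    "pol": 227.063,      # polar solvation free energy
    "nonpol": -29.961,   # non-polar solvation free energy
}
minus_t_delta_s = 16.304  # entropic term -T*dS from Table 3

# dE_MMPBSA = dE_vdw + dE_ele + dE_pol + dE_nonpol
e_mmpbsa = sum(terms.values())

# dG_bind = dE_MMPBSA - T*dS
g_bind = e_mmpbsa + minus_t_delta_s

# Relative size of the van der Waals and electrostatic contributions
vdw_to_ele = terms["vdw"] / terms["ele"]

print(f"dE_MMPBSA = {e_mmpbsa:.3f} kJ/mol")
print(f"dG_bind   = {g_bind:.3f} kJ/mol")
print(f"dE_vdw / dE_ele = {vdw_to_ele:.2f}")
```

The sums reproduce the tabulated ΔEMMPBSA and ΔGbind to within rounding of the last digit, and the van der Waals term is roughly 2.7 times the electrostatic term in magnitude.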
For a detailed structural and interaction analysis, we selected the final conformation of the simulation. As depicted in Figure 8, the residues CYS828, ASP829, LYS644, GLU661, and CYS694 formed stable hydrogen bonds with the small molecule. In addition, hydrophobic contacts of the π-sulfur, π-π stacked, π-π T-shaped, and π-alkyl types were observed with MET665, PHE691, PHE830, MET664, CYS828, LYS644, LEU818, ALA642, CYS694, and LEU616. Van der Waals contacts were also noted with LEU668, GLU692, and GLY697. These results indicate that the small molecule binds securely to the protein, consistent with the MM-PBSA result that van der Waals interactions predominate, electrostatic interactions are secondary, and hydrophobic interactions add to the overall stability.

- Binding interactions of compound C1 with FLT3, based on the final conformation of the MD simulation.
3.5. Experimental validation: Biological activities assessment
After comprehensive screening using molecular docking, ML model predictions, and molecular dynamics simulations, we selected four compounds with potential inhibitory activity for assays in the MV4-11 cell line. We measured the effect of different concentrations of each compound on MV4-11 cell viability and quantified activity as the half-maximal inhibitory concentration (IC50). As shown in Figure 9, all four compounds (C1-C4) inhibited MV4-11 cell proliferation to varying degrees across the tested concentrations. Compound C2 showed the strongest inhibitory effect, with an IC50 of 1177 nM. The next most potent compound had an IC50 of 1731 nM, and the remaining two showed weaker inhibition, with IC50 values of 7280 nM and 8843 nM. The predicted and experimental values for the four compounds are summarized in Table S11. These results show that the screening method can identify compounds that inhibit MV4-11 cells, and they support integrating computational chemistry with ML-based screening in early drug development. Although the observed activity is promising, none of the IC50 values reaches the 100 nM threshold, so further structural optimization through structure-activity relationship (SAR) exploration is needed. Key modifications could include introducing halogen atoms (e.g., Cl, F) to enhance hydrophobic interactions with FLT3's ATP-binding pocket and optimizing hydrophobic substituents to strengthen van der Waals interactions.

- Activity levels of compounds C1-C4 in MV4-11 cells.
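To put the measured potencies on the logarithmic scale commonly used in activity modelling, the IC50 values quoted above can be converted to pIC50. The sketch below assumes the standard definition pIC50 = −log10(IC50 in mol/L). Whether the MV4-11 prediction model in this study was trained on exactly this scale is an assumption here, and the function name is illustrative.

```python
import math

def pic50_from_nm(ic50_nm):
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)

# IC50 values reported in the text (nM), from most to least potent
measured_ic50_nm = [1177, 1731, 7280, 8843]
threshold_nm = 100  # potency threshold mentioned in the text

pic50_values = [pic50_from_nm(v) for v in measured_ic50_nm]
fold_above_threshold = [v / threshold_nm for v in measured_ic50_nm]

for ic50, pic50, fold in zip(measured_ic50_nm, pic50_values, fold_above_threshold):
    print(f"IC50 = {ic50} nM  pIC50 = {pic50:.2f}  {fold:.1f}x above {threshold_nm} nM")
```

On this scale the 100 nM threshold corresponds to pIC50 = 7, so the most potent compound would need to gain about one log unit of potency to reach it.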
4. Conclusions
This study successfully applied a multidisciplinary CADD approach, integrating ML, molecular docking, and molecular dynamics simulations, to accelerate the discovery of novel FLT3 inhibitors for AML treatment. The models developed for FLT3 classification and MV4-11 cell activity demonstrated high predictive accuracy, enabling efficient virtual screening of large molecular libraries. From 7,280 compounds, 68 potential FLT3 inhibitors were identified, and molecular dynamics simulations confirmed stable binding interactions with the FLT3 protein. Experimental validation further substantiated the activity of selected compounds, supporting the reliability of this computational framework. These findings underscore the utility of combining CADD techniques with ML to streamline drug discovery processes. This approach not only enhances the speed and accuracy of identifying effective inhibitors but also provides a scalable framework for developing future therapeutic agents targeting AML and related diseases.
Acknowledgment
This project was supported by Guizhou Provincial Science and Technology Projects (No. Qian Ke He Jichu-[2024] youth 322) and the Science and Technology Plan Project of Guizhou (No. Qian Science Platform Talent [2021]1350-017).
CRediT authorship contribution statement
Yihuan Zhao designed the research. Yihuan Zhao and Qiang Huang performed the research and wrote the manuscript. Qiang Liu and Zhonghua Shi contributed to data analysis. Yihuan Zhao, Qiang Huang and Fushan Tang revised the manuscript. All authors reviewed the manuscript.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data availability
The datasets and code supporting the conclusions of this article are available on GitHub (https://github.com/Yihuan-Zhao93/FLT3ML).
Declaration of Generative AI and AI-assisted technologies in the writing process
The authors confirm that there was no use of artificial intelligence (AI)-assisted technology for assisting in the writing or editing of the manuscript and no images were manipulated using AI.
Supplementary data
Supplementary material to this article can be found online at https://dx.doi.org/10.25259/AJC_220_2024.
References
- Advances in acute myeloid leukemia. BMJ (Clinical Research Ed.). 2021;375:n2026. https://doi.org/10.1136/bmj.n2026
- Acute myeloid leukemia: Current progress and future directions. Blood Cancer Journal. 2021;11:41. https://doi.org/10.1038/s41408-021-00425-3
- Acute myeloid leukemia: 2019 update on risk-stratification and management. American Journal of Hematology. 2018;93:1267-1291. https://doi.org/10.1002/ajh.25214
- Genomic classification and prognosis in acute myeloid leukemia. The New England Journal of Medicine. 2016;374:2209-2221. https://doi.org/10.1056/NEJMoa1516192
- FLT3 inhibitors in acute myeloid leukemia: Ten frequently asked questions. Leukemia. 2020;34:682-696. https://doi.org/10.1038/s41375-019-0694-3
- Midostaurin plus chemotherapy for acute myeloid leukemia with a FLT3 mutation. New England Journal of Medicine. 2017;377:454-464. https://doi.org/10.1056/nejmoa1614359
- Gilteritinib or chemotherapy for relapsed or refractory FLT3-mutated AML. New England Journal of Medicine. 2019;381:1728-1740. https://doi.org/10.1056/nejmoa1902688
- Quizartinib versus salvage chemotherapy in relapsed or refractory FLT3-ITD acute myeloid leukaemia (QuANTUM-R): A multicentre, randomised, controlled, open-label, phase 3 trial. Lancet Oncology. 2019;20:984-97. https://doi.org/10.1016/S1470-2045(19)30150-0
- Inhibitors of protein kinases: CGP 41251, a protein kinase inhibitor with potential as an anticancer agent. Pharmacology & Therapeutics. 1999;82:293-301. https://doi.org/10.1016/s0163-7258(99)00005-4
- Uniform sensitivity of FLT3 activation loop mutants to the tyrosine kinase inhibitor midostaurin. Blood. 2007;110:4476-4479. https://doi.org/10.1182/blood-2007-07-101238
- AC220 is a uniquely potent and selective inhibitor of FLT3 for the treatment of acute myeloid leukemia (AML). Blood. 2009;114:2984-2992. https://doi.org/10.1182/blood-2009-05-222034
- Gilteritinib, a FLT3/AXL inhibitor, shows antileukemic activity in mouse models of FLT3 mutated acute myeloid leukemia. Investigational New Drugs. 2017;35:556-565. https://doi.org/10.1007/s10637-017-0470-z
- Preclinical studies of gilteritinib, a next-generation FLT3 inhibitor. Blood. 2017;129:257-260. https://doi.org/10.1182/blood-2016-10-745133
- How to improve R&D productivity: the pharmaceutical industry’s grand challenge. Nature Reviews Drug Discovery. 2010;9:203-14. https://doi.org/10.1038/nrd3078
- Internal tandem duplication of the flt3 gene found in acute myeloid leukemia. Leukemia. 1996;10:1911-1918.
- Principles of early drug discovery. British Journal of Pharmacology. 2011;162:1239-1249. https://doi.org/10.1111/j.1476-5381.2010.01127.x
- Molecular modeling and simulations of some antiviral drugs, benzylisoquinoline alkaloid, and coumarin molecules to investigate the effects on Mpro main viral protease inhibition. Biochemistry and Biophysics Reports. 2023;34:101459. https://doi.org/10.1016/j.bbrep.2023.101459
- Computer-aided drug discovery approaches against the tropical infectious diseases malaria, tuberculosis, trypanosomiasis, and leishmaniasis. ACS Infectious Diseases. 2016;2:8-31. https://doi.org/10.1021/acsinfecdis.5b00093
- Computer aided drug design: Success and limitations. Current Pharmaceutical Design. 2016;22:572-581. https://doi.org/10.2174/1381612822666151125000550
- Artificial intelligence in drug discovery: Recent advances and future perspectives. Expert Opinion on Drug Discovery. 2021;16:949-959. https://doi.org/10.1080/17460441.2021.1909567
- Discovering anti-cancer drugs via computational methods. Frontiers in Pharmacology. 2020;11. https://doi.org/10.3389/fphar.2020.00733
- Construction of IRAK4 inhibitor activity prediction model based on machine learning. Molecular Diversity. 2024;28:2289-2300. https://doi.org/10.1007/s11030-024-10926-5
- Discovery of novel JAK1 inhibitors through combining machine learning, structure-based pharmacophore modeling and bio-evaluation. Journal of Translational Medicine. 2023;21:579. https://doi.org/10.1186/s12967-023-04443-6
- Exploring novel lead scaffolds for SGLT2 inhibitors: Insights from machine learning and molecular dynamics simulations. International Journal of Biological Macromolecules. 2024;263:130375. https://doi.org/10.1016/j.ijbiomac.2024.130375
- Classification of FLT3 inhibitors and SAR analysis by machine learning methods. Molecular Diversity. 2024;28:1995-2011. https://doi.org/10.1007/s11030-023-10640-8
- A simple machine learning-based quantitative structure–activity relationship model for predicting pIC50 inhibition values of FLT3 tyrosine kinase. Pharmaceuticals. 2025;18:96. https://doi.org/10.3390/ph18010096
- The ChEMBL database in 2023: A drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Research. 2024;52:D1180-D1192. https://doi.org/10.1093/nar/gkad1004
- DataWarrior: An open-source program for chemistry aware data visualization and analysis. Journal of Chemical Information and Modeling. 2015;55:460-473. https://doi.org/10.1021/ci500588j
- PaDEL‐descriptor: An open source software to calculate molecular descriptors and fingerprints. Journal of Computational Chemistry. 2011;32:1466-1474. https://doi.org/10.1002/jcc.21707
- LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems. 2017;30:3146-54.
- LightBBB: Computational prediction model of blood–brain-barrier penetration based on LightGBM. Bioinformatics. 2021;37:1135-1139. https://doi.org/10.1093/bioinformatics/btaa918
- Partner relationships, hopelessness, and health status strongly predict maternal well-being: An approach using light gradient boosting machine. Scientific Reports. 2023;13:17032. https://doi.org/10.1038/s41598-023-44410-1
- RCSB protein data bank: Powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. Nucleic Acids Research. 2021;49:D437-D451. https://doi.org/10.1093/nar/gkaa1038
- AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry. 2010;31:455-461. https://doi.org/10.1002/jcc.21334
- Studio, D., 2008. Discovery studio, Accelrys [2.1], 420.
- GROMACS: Fast, flexible, and free. Journal of Computational Chemistry. 2005;26:1701-1718. https://doi.org/10.1002/jcc.20291
