Predictive methods for the heat of decomposition of reactive chemicals: From CHETAH, QSPR, and quantum chemical calculations to deep learning
Corresponding author: E-mail address: zhanghz.qday@sinopec.com (H. Zhang)
Abstract
The heat of decomposition of reactive chemicals is a critical parameter for characterizing their thermodynamic properties and is widely applied in hazard identification, chemical classification, and safety risk assessment of chemical reactions. Currently, the heat of decomposition of compounds is determined primarily by experimental techniques, which are not only complex but also inherently hazardous, while existing predictive methodologies are still at a developmental stage. This paper first analyzes and summarizes current research progress on the prediction of heat of decomposition, then explores in depth the three most prominent predictive methods: the Chemical Thermodynamic and Energy Release Computer Program (CHETAH), quantum chemical calculations, and Quantitative Structure-Property Relationship (QSPR) methods. It focuses on their underlying prediction mechanisms, accuracy, advantages, and limitations. Among these, quantum chemical calculations and the QSPR approach have demonstrated superior predictive performance. The study specifically emphasizes that integrating deep learning techniques can overcome existing bottlenecks: transfer learning can mitigate the challenge of limited samples in QSPR modeling, while large language models (LLMs) in chemistry can address the difficulty of predicting decomposition reaction equations. These directions are expected to significantly enhance predictive accuracy and provide technical pathways for future research.
Keywords
CHETAH
Deep learning
Heat of decomposition
Quantitative structure-property relationship
Quantum chemical calculations

1. Introduction
Reactive chemicals are a class of hazardous substances capable of undergoing chemical reactions either with themselves or with other materials [1,2]. These reactions are typically strongly exothermic, which can lead to fire and explosion hazards in industrial settings. Consequently, the thermal hazards associated with reactive chemicals have long been a focal point in industrial production. In this context, one of the key parameters for assessing thermal hazards is the heat of decomposition, which is the amount of heat released when a reactive chemical undergoes a decomposition reaction. This parameter is widely applied across various fields and serves as a critical indicator for evaluating and controlling the safety of reactive chemicals [3-5]. Specifically, in hazard identification and classification of chemicals, the heat of decomposition is used to determine whether chemicals qualify as explosives or self-reactive substances, as outlined in the “Recommendations on the Transport of Dangerous Goods: Manual of Tests and Criteria” [6]. In the safety risk assessment of chemical reactions, the heat of decomposition serves as an essential criterion for evaluating the potential combustibility and explosiveness of materials [7]. Generally, substances with higher heats of decomposition exhibit correspondingly higher adiabatic temperature rises during decomposition, which increases the potential explosion hazard. Therefore, the precise determination and prediction of the heat of decomposition are vital for ensuring the safety of chemicals and chemical reactions.
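The link between heat of decomposition and adiabatic temperature rise noted above is commonly written in the following form. This is a standard textbook relation rather than an equation taken from the cited works, and the numerical values in the worked line are assumed purely for illustration:

$$\Delta T_{ad} = \frac{|\Delta H_{dec}|}{c_p}$$

where $c_p$ is the specific heat capacity of the sample. For an assumed $|\Delta H_{dec}| = 1000\ \mathrm{J/g}$ and an assumed $c_p = 2\ \mathrm{J/(g\,K)}$, this gives $\Delta T_{ad} = 500\ \mathrm{K}$.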
Currently, the most reliable method for determining the heat of decomposition of compounds predominantly relies on calorimetric testing. Common calorimeters include the Differential Scanning Calorimeter (DSC), Accelerating Rate Calorimeter (ARC), and Calvet calorimeter (e.g., the C80 microcalorimeter) [8,9]. However, these experimental methods are often time-consuming, cumbersome, and high-risk, necessitating the development of theoretical predictive methods that are simple, rapid, and reliable to overcome the limitations of experimental techniques. In this context, investigating the relationship between the molecular structure of reactive chemicals and their heats of decomposition, and subsequently exploring theoretical predictions based on this intrinsic linkage, has become a focal point in current theoretical research. Nevertheless, to date, there is a lack of a systematic review dedicated to specific methods for predicting the heat of decomposition based on the molecular structure of compounds.
This paper will systematically analyze and summarize the research progress in predicting the heat of decomposition of reactive chemicals based on molecular structures. It will discuss the mechanisms of various prediction methods in detail, summarize their advantages, key challenges, and potential improvements, and conclude with an outlook on future trends in the prediction of heat of decomposition in reactive chemicals.
2. Evaluation of current predictive methods for heat of decomposition
The prediction of heat of decomposition plays a critical role in chemical hazard identification and classification, as well as chemical reaction safety risk assessment. By analyzing the molecular structure of compounds to predict their heat of decomposition, this area of research holds significant theoretical value and practical importance in chemical safety evaluations and chemical reaction process optimization. Table 1 summarizes representative research achievements in this field, highlighting three primary predictive methods: the commercial software Chemical Thermodynamic and Energy Release Computer Program (CHETAH), quantum chemistry calculation methods (QC Methods), and Quantitative Structure-Property Relationship (QSPR) methods. Each of these methods demonstrates unique advantages and potential applications. The related research primarily focuses on the thermal stability and reaction characteristics of reactive chemicals, such as organic peroxides, as well as risk assessments in chemical processes. These studies have not only advanced our understanding of chemical safety but have also provided a scientific basis for safety evaluations in the chemical industry.
| Prediction methods | Prediction parameter | Application field | Ref. |
|---|---|---|---|
| CHETAH | Heat of decomposition | Chemical process hazard evaluation | [10] |
| CHETAH | Heat of decomposition | Decomposition reactions evaluation of chemical process | [11] |
| CHETAH | Heat of decomposition | Molecular reactivity evaluation of some aryl azides and diazides | [12] |
| CHETAH | Heat of decomposition for condensed phases | Prediction of the explosibility of self-reactive materials | [13] |
| CHETAH | Heat of decomposition | Thermochemical stability evaluation of common use chemicals | [14] |
| QC Methods | Heat of decomposition | Predicting heats of detonation | [15] |
| QC Methods | Enthalpy of formation for condensed phases | Prediction of heats of formation of energetic materials | [16] |
| QC Methods | Heat of decomposition | Predicting thermal hazards of nitroaromatic compounds (NACs) | [17] |
| QC Methods | Enthalpy of formation | Accurate prediction of the standard enthalpy of formation | [18] |
| QSPR | Heat of decomposition | Prediction of heat of decomposition of urea inclusion compounds | [19] |
| QSPR | Heat of decomposition | Prediction of thermal stability of NACs | [20] |
| QSPR | Heat of decomposition | Prediction of thermal stability of NACs | [21] |
| QSPR | Heat of decomposition | Prediction of thermal stability of NACs | [22] |
| QSPR | Heat of decomposition | Prediction of the reactivity hazards for organic peroxides | [23] |
| QSPR | Heat of decomposition | Hazardous Chemicals Classification System | [24] |
| QSPR | Heat of decomposition | Prediction of the thermal decomposition of organic peroxides | [25] |
| QSPR | Heat of decomposition | Predicting heats of decomposition of nitroaromatics | [26] |
| QSPR | Heat of decomposition | Prediction of heat of decomposition of organic peroxides | [27] |
| QSPR | Heat of decomposition | Predicting the decomposition heat of organic peroxides | [28] |
| QSPR | Heat of decomposition | Predicting thermal stability of self-reactive substances | [29] |
| Pattern recognition | The entire DSC curve | Thermal stability predictions of chemicals | [30] |
Note: QC refers to quantum chemical.
Specifically, as shown in Figure 1, the CHETAH program analyzes the molecular structure to calculate enthalpy of formation data, which, combined with the decomposition reaction equation predicted by the maximum exothermic rule, is used to predict the heat of decomposition. In contrast, quantum chemistry methods employ molecular simulations and thermodynamic calculations to obtain key enthalpy of formation data, offering a higher-precision route to the heat of decomposition. Both CHETAH and quantum chemistry methods therefore share a common feature: they rely on enthalpy of formation data and a predicted decomposition reaction equation to obtain the heat of decomposition. By comparison, the QSPR method adopts a data-driven approach, constructing quantitative models that relate molecular structure directly to heat of decomposition and thereby bypassing the need for reaction equations and enthalpy of formation data. This simplified workflow improves prediction efficiency and has made QSPR one of the most widely used predictive tools.

Figure 1. The heat of decomposition prediction process of the three commonly used methods.
Furthermore, we systematically evaluated the predictive accuracy of the three approaches (CHETAH, QC, and QSPR) for estimating the heat of decomposition. To quantify prediction performance, representative case studies were selected and predicted values were compared with experimental values using two statistical metrics: the root mean square error (RMSE) and the coefficient of determination (R2). R2 reflects the fraction of the data variance explained by the model (values approaching 1 indicate a near-perfect fit), while RMSE quantifies the typical deviation between predicted and experimental values (lower values denote higher precision). As presented in Table 2, the CHETAH method demonstrated limited predictive capability, with R2 values below 0.10, indicating that less than 10% of the data variability could be explained by the model, coupled with substantially higher RMSE values, confirming systematic deviations between predictions and experimental measurements. In contrast, the QC methods exhibited intermediate predictive accuracy, with R2 values between 0.59 and 0.90. The QSPR approach achieved the best overall performance, with R2 values of 0.85 to 0.90 and the lowest RMSE in both unit systems (J/g and kJ/mol). These results support the advantage of the QSPR framework integrated with machine learning (ML) for predicting decomposition heats of reactive chemicals.
| Prediction methods | Substances | Prediction parameter | RMSE | R2 | Ref. |
|---|---|---|---|---|---|
| CHETAH | Nitro compounds | ΔH (J/g) | 2280 | 0.09 | [14] |
| CHETAH | Organic peroxides | ΔH (J/g) | 2030 | 0.08 | [14] |
| QC Methods | Explosives | ΔH (kJ/mol) | 287 | 0.90 | [15] |
| QC Methods | Nitroaromatic compounds | ΔH (J/g) | 570 | 0.59 | [17] |
| QSPR | Organic peroxides | ΔH (J/g) | 113 | 0.90 | [25] |
| QSPR | Self-reactive substances | ΔH (kJ/mol) | 52 | 0.85 | [29] |
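The two metrics reported in Table 2 can be computed as in the minimal sketch below. The function names and the sample values are hypothetical and are not taken from any of the cited studies; R2 is computed here as 1 - SS_res/SS_tot, which is one common definition, and individual studies may instead report a squared correlation coefficient.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between experimental and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination, defined as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical heats of decomposition in J/g (illustration only).
experimental = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]

print(f"RMSE = {rmse(experimental, predicted):.1f} J/g")
print(f"R2   = {r_squared(experimental, predicted):.2f}")
```

Because RMSE carries the unit of the predicted quantity, values reported in J/g and in kJ/mol in Table 2 are not directly comparable with each other.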
In addition to these mainstream methods, several innovative predictive strategies have emerged in recent years. For instance, L. Mage et al. [30] proposed a pattern recognition-based strategy that leverages image processing algorithms to analyze DSC images, identifying different decomposition modes of compounds. Based on their exothermic characteristics, compounds are categorized, and corresponding predictive models are developed for each class of molecules, enabling the prediction of the entire DSC curve. By analyzing these DSC curves, researchers can further calculate and obtain important thermodynamic parameters such as the onset exothermic temperature and heat of decomposition. Although this method is still in its early exploratory stages, its novelty and potential applications undoubtedly deserve attention.
In summary, existing heat of decomposition prediction methods possess distinct characteristics in terms of theoretical foundations, computational accuracy, and application scope, with each method offering unique advantages and facing certain limitations. The following sections will delve into the prediction mechanisms of the CHETAH program, quantum chemistry methods, and QSPR methods, analyzing their respective strengths and weaknesses, and providing a forward-looking perspective on future research trends, with the aim of offering theoretical support and practical guidance for the continued development of heat of decomposition prediction techniques.
3. Heat of decomposition prediction utilizing the CHETAH program
3.1. Commercially available rapid and simplified predictive approaches
CHETAH is a chemical thermodynamics and energy release (hazard) evaluation tool developed by the E27.07 Committee of the American Society for Testing and Materials (ASTM) [31]. Initially launched in 1974 and updated to its eleventh iteration in 2020, CHETAH’s primary function is to predict energy release hazards and assess the potential explosion risks of chemical substances or mixtures by providing thermodynamic data. For instance, in the case of the classic explosive 2,4,6-trinitrotoluene (TNT), CHETAH’s process for predicting enthalpy of formation and maximum heat of decomposition is schematically outlined in Figure 2. At its core, CHETAH employs the Benson group contribution method to estimate the enthalpy of formation of compounds, which is then used to calculate decomposition enthalpy, reaction heat, entropy, and other thermodynamic parameters. The Benson group contribution method, developed by S. W. Benson, is a semi-empirical technique that estimates the enthalpy of formation by decomposing a compound into fundamental molecular fragments (or groups) and summing the known thermochemical contributions of these groups to obtain the overall enthalpy of formation [32]. To predict the heat of decomposition, CHETAH employs a method known as “maximum heat of decomposition” [33], which assumes that all energy released during the thermal decomposition of a reactive chemical originates from the compound itself, with decomposition products limited to those that can be formed from the atoms present in the compound. Consequently, the program searches its database for the set of products that maximizes the heat of decomposition, typically selecting small, stable molecules such as water, methane, graphite, and carbon dioxide as plausible decomposition products.
Although this calculation approach may not be entirely accurate from a thermodynamic perspective, since it omits the entropy component and does not perform a free energy minimization procedure, it remains the most efficient and widely applicable method available today. Despite its limitations, CHETAH is widely utilized and recognized in chemical hazard assessment and energy release prediction due to its simplicity, speed, and relatively high applicability.

Figure 2. A schematic diagram of the process for maximum heat of decomposition in CHETAH.
To further explore the prediction of thermodynamic properties of chemical reactions, particularly regarding the thermal hazard evaluation of primary reactions (those anticipated during the manufacturing process) and secondary reactions (undesired sequential or side reactions), Theodor Grewer et al. [11] utilized CHETAH to estimate the reaction heats of main reactions (e.g., polymerization, diazotization, and hydrogenation reactions) alongside the heats of decomposition of secondary reactions (e.g., decomposition reactions). By comparing the thermodynamic data predicted by CHETAH with reaction heats and decomposition enthalpies obtained through experimental methods, research findings indicated a strong correlation between predicted and experimental values, highlighting the significant utility of CHETAH in assessing the thermal hazards of chemical reactions.
Additionally, Pasturenzi et al. [14] employed the CHETAH program to construct a risk database encompassing the heats of decomposition and Energy Release Potentials (ERP) of 342 commonly encountered chemicals, and then compared the predicted results with experimental data to verify the accuracy of the program’s predictions. Selected comparisons of predicted and experimental results are presented in Table 3, illustrating that CHETAH, as an initial screening tool, is practical for evaluating the ERPs of compounds. Although CHETAH generally overestimates the actual heat of decomposition, its hazard ranking remains reliable for most functional groups. However, for certain functional groups, such as epoxides, the predicted heats of decomposition are less accurate, primarily because the Benson group contribution method does not cover all possible groups and disregards the three-dimensional structure of the molecule, which increases the error in the predicted enthalpy of formation. Overall, while CHETAH is an effective tool for assessing thermochemical stability, it requires supplementation with experimental data for further validation and refinement, particularly when dealing with specific functional groups.
| Compound | CAS | ΔHf (kJ/mol) | ΔHdec,max (kJ/g) | ERP | Instrument | ΔHdec (kJ/g) | ε (%) |
|---|---|---|---|---|---|---|---|
| 4-Bromobutyl nitrate | 146563-40-8 | -178.91 | -2.12 | HIGH | DSC | -1.07 | 49.5 |
| 1-Chloro-3,4-dinitrobenzene | 610-40-2 | 23.85 | -4.64 | HIGH | ARC | -1.8 | 61.2 |
| 2,4-Difluoronitrobenzene | 446-35-5 | -310.87 | -4.06 | HIGH | ARC | -0.92 | 77.3 |
| 2,4-Dinitro-6-bromoaniline | 1817-73-8 | 78.65 | -3.68 | HIGH | ARC | -0.63 | 82.9 |
| 1,2,3-Trichloronitrobenzene | 17700-09-3 | -20.08 | -2.59 | HIGH | ARC | -0.67 | 74.1 |
| 4-Chloronitrobenzene | 100-00-5 | 37.24 | -3.76 | HIGH | ARC | -1.76 | 53.2 |
| 4-Chloro-2-nitroaniline | 89-63-4 | 44.77 | -3.56 | HIGH | ARC | -2 | 43.8 |
| 2-Nitrophenol | 88-75-5 | -96.23 | -4.35 | HIGH | ARC | -2.13 | 51.0 |
| 2-Nitroaniline | 88-74-4 | 63.6 | -4.22 | HIGH | ARC | -2 | 52.6 |
| p-Nitrobenzoyl chloride | 122-04-3 | -119.24 | -3.39 | HIGH | DSC | -2.17 | 36.0 |
| 2-Nitrobenzoic acid | 552-16-9 | -304.18 | -3.55 | HIGH | ARC | -1.71 | 51.8 |
| 3,5-Dinitrobenzyl alcohol | 71022-43-0 | -126.77 | -5.03 | HIGH | DSC | -3.47 | 31.0 |
| Benzoyl peroxide | 94-36-0 | -271.12 | -3.01 | HIGH | ARC | -1.84 | 38.9 |
| Peroxyacetic acid | 79-21-0 | -336.6 | -4.52 | HIGH | DSC | -2.04 | 54.8 |
| tert-Butyl hydroperoxide | 75-91-2 | -246.01 | -3.89 | HIGH | DSC | -1.05 | 73.0 |
| tert-Butylperoxide | 110-05-4 | -348.94 | -2.72 | HIGH | ARC | -1.36 | 50.0 |
| Cumene hydroperoxide | 80-15-9 | -78.66 | -3.64 | HIGH | DSC | -1.88 | 48.4 |
Note: The data were obtained from Pasturenzi et al. [14].
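The “maximum heat of decomposition” calculation behind the ΔHdec,max column can be sketched as follows for the 2-nitroaniline row of Table 3 (C6H6N2O2, ΔHf = 63.6 kJ/mol). This is a simplified illustration and not CHETAH’s actual algorithm: the two candidate product sets are written by hand rather than found by a database search, the product enthalpies of formation are standard gas-phase textbook values, and the relative deviation ε is computed with a formula inferred from the tabulated numbers. All function names are hypothetical.

```python
# Atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Candidate products: (elemental composition, gas-phase dHf at 298 K in kJ/mol).
# Elements in their reference states (graphite, N2) are zero by definition.
PRODUCTS = {
    "H2O": ({"H": 2, "O": 1}, -241.8),
    "CO2": ({"C": 1, "O": 2}, -393.5),
    "CH4": ({"C": 1, "H": 4}, -74.8),
    "C": ({"C": 1}, 0.0),
    "N2": ({"N": 2}, 0.0),
}


def molar_mass(formula):
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


def is_balanced(formula, products, tol=1e-9):
    """Check that the product set contains exactly the atoms of the reactant."""
    totals = {}
    for name, coeff in products.items():
        for el, n in PRODUCTS[name][0].items():
            totals[el] = totals.get(el, 0.0) + coeff * n
    elements = set(formula) | set(totals)
    return all(abs(formula.get(el, 0) - totals.get(el, 0.0)) <= tol for el in elements)


def heat_of_decomposition(formula, dhf_reactant, products):
    """Hess's law: sum of product dHf minus reactant dHf, returned in kJ/g."""
    if not is_balanced(formula, products):
        raise ValueError("decomposition equation is not atom-balanced")
    dh_molar = sum(c * PRODUCTS[name][1] for name, c in products.items()) - dhf_reactant
    return dh_molar / molar_mass(formula)


def most_exothermic(formula, dhf_reactant, candidates):
    """Pick the candidate product set with the most negative heat of decomposition."""
    return min(candidates, key=lambda p: heat_of_decomposition(formula, dhf_reactant, p))


def overestimation_percent(dh_max, dh_exp):
    """Relative deviation of the predicted maximum from experiment (inferred formula)."""
    return 100.0 * (dh_max - dh_exp) / dh_max


nitroaniline = {"C": 6, "H": 6, "N": 2, "O": 2}
candidates = [
    {"H2O": 2, "CH4": 0.5, "C": 5.5, "N2": 1},  # oxygen goes to water
    {"CO2": 1, "CH4": 1.5, "C": 3.5, "N2": 1},  # oxygen goes to carbon dioxide
]
best = most_exothermic(nitroaniline, 63.6, candidates)
print(best, round(heat_of_decomposition(nitroaniline, 63.6, best), 2))
```

Under these assumptions the water-forming route is the more exothermic one and gives about -4.23 kJ/g, close to the tabulated -4.22 kJ/g. Applying `overestimation_percent(-4.22, -2.0)` to the tabulated values reproduces the listed ε of 52.6%.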
3.2. Limitations of the CHETAH program in predicting heat of decomposition
While CHETAH has found widespread application in the domain of heat of decomposition prediction, it continues to confront challenges necessitating enhancement.
3.2.1. Inherent limitations of the Benson group contribution method
The prediction accuracy of compound standard enthalpies of formation is relatively low. CHETAH calculates the standard enthalpies of formation through the Benson group contribution method; however, this technique has inherent limitations. Notably, gaps in numerical data for certain groups restrict their applicability, and group contribution methods fail to account for structural effects, such as steric hindrances, ring deformations, and chelation formation within molecules, thereby constraining calculation accuracy. For instance, Sato et al. [13] employed the CHETAH program to estimate the energy release of self-reactive materials. However, due to the limitations of the Benson group additivity thermochemical database, the heat of formation for certain samples proved challenging to calculate. Furthermore, Grewer et al. [11] utilized the CHETAH program to estimate the maximum heat of decomposition for typical decomposition reactions. Nevertheless, for some critical functional groups lacking Benson data, only estimation methods could be employed, which significantly constrained the accuracy of the results. Such intrinsic defects in the Benson group contribution method fundamentally limit CHETAH’s application and precision in complex molecular systems. Despite researchers’ attempts to expand the range of Benson groups [34], the method still lacks coverage for all potential groups and inadequately considers the influence of spatial structure on enthalpy of formation, rendering its predictive accuracy inferior to that of advanced techniques such as quantum chemical calculations.
Moreover, the calculation of enthalpy of formation within the CHETAH framework assumes that compounds are in a gaseous state; however, enthalpies of formation for condensed phases (liquids or solids) must be adjusted with enthalpies of phase change. The absence of available literature data for enthalpies of phase change precludes effective prediction by CHETAH. Although researchers have proposed the use of condensed-phase Benson group values for estimating enthalpies of formation under condensed phases [13,35], the method’s limited group coverage and insufficient prediction accuracy remain constraining factors.
3.2.2. Constraints of the maximum exothermicity rule
CHETAH also faces pronounced challenges in predicting decomposition reaction equations. The program can only predict decomposition reaction equations by following the “maximum exothermic rule,” which often leads to significant disparities between the predicted and actual reaction equations. As shown in Tables 2 and 3, this limitation causes the heat of decomposition values produced by CHETAH to be generally overestimated and to carry considerable errors.
Consequently, although CHETAH retains some potential for heat of decomposition prediction, these intrinsic flaws mean that neither CHETAH nor similar software has been a focus of cutting-edge research in recent years.
4. Heat of decomposition prediction via quantum chemical methods
4.1. High-precision computational methods for energy calculation
Quantum chemical computations are a computational approach grounded in the principles of quantum mechanics and used to investigate molecular and atomic behavior. These computations aim to predict molecular structures, energies, reaction pathways, and a range of other physical and chemical properties by solving the Schrödinger equation [36,37]. As illustrated in Figure 3, quantum chemical methods encompass four principal categories: Hartree-Fock (HF) methods, semi-empirical methods, density functional theory (DFT) methods, and higher-level ab initio methods, such as MP2 and CCSD [38]. Among these, DFT methods have become one of the most widely used and successful computational strategies in contemporary quantum chemistry, primarily because of their favorable combination of computational efficiency and accuracy, particularly for large molecular systems. The core idea of DFT stems from the Hohenberg-Kohn theorem, which states that all ground-state properties of an atom or molecule are uniquely determined by its electron density. Building on this theorem, DFT strikes a practical balance between chemical accuracy and computational cost and has become a foundational tool in modern materials science and molecular modeling [39-41].

Figure 3. Current quantum chemical methods. HF: Hartree-Fock, ROHF: Restricted Open-shell Hartree-Fock, UHF: Unrestricted Hartree-Fock, MINDO/3: Modified Intermediate Neglect of Differential Overlap (Version 3), AM1: Austin Model 1, PM3: Parameterized Model 3, B3LYP: Becke, 3-parameter, Lee-Yang-Parr, LDA: Local Density Approximation, GGA: Generalized Gradient Approximation, CI: Configuration Interaction, CC: Coupled Cluster, MPn: Møller-Plesset Perturbation Theory (n-th order), CCSD: Coupled Cluster Singles and Doubles, CCSD(T): CCSD with Perturbative Triples.
Quantum chemical computations are now applied across a wide array of fields, including drug design, materials science, environmental science, catalysis, and theoretical and computational chemistry. They are particularly effective for thermochemical property calculations, where high-level methods can reach sub-kJ/mol accuracy in molecular energies, enabling precise predictions of thermodynamic parameters such as the enthalpy of formation and phase transition enthalpy [42-44]. Consequently, compared with traditional programs such as CHETAH, quantum chemical methods significantly reduce errors in a compound’s calculated enthalpy of formation and enthalpy of phase transition, and they overcome the shortcomings associated with CHETAH’s reliance on the Benson group contribution method. These advantages make quantum chemical computations a powerful tool for predicting a compound’s heat of decomposition.
In a study aimed at predicting the heat of decomposition of nitroaromatic compounds (NACs), Mathieu et al. [17] employed the semi-empirical RM1 Hamiltonian method to compute the theoretical heat of decomposition. This process commenced by identifying potential decomposition products of NACs, based on the “maximum exothermicity rule”, following a sequence of priority: HF, CF4, H2O, CO2, CCl4, CO, and HCl, converting residual elements into their corresponding elemental forms where necessary. Subsequently, the RM1 Hamiltonian method was utilized to calculate the theoretical enthalpies of formation of NACs and their decomposition products. Ultimately, these enthalpies of formation data were input into the reaction equation to compute the theoretical heat of decomposition. While the RM1 Hamiltonian method demonstrated superior accuracy in enthalpy of formation calculations compared to the CHETAH program, its precision within the domain of existing quantum chemical calculation methods remains relatively modest, leading to a discernible discrepancy between the theoretical heat of decomposition and the values obtained through DSC. To address this issue, researchers introduced a heat of decomposition correction model, incorporating the electronegativity and chemical hardness parameters of NAC molecules. This correction model is expressed as Eq. (1):
where ΔdH(DSC) represents the experimental heat of decomposition obtained from DSC experiments, ΔdH0 is the theoretical heat of decomposition calculated via the aforementioned method, χ denotes the molecular electronegativity, and η is the chemical hardness of the molecule. Using this correction model, the researchers improved the predictive precision of the heat of decomposition for sealed samples tested via DSC.
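The product-assignment step described above for NACs (forming products in the order HF, CF4, H2O, CO2, CCl4, CO, HCl, then converting leftover atoms to their elemental forms) can be sketched as a greedy allocation. This is one literal reading of the priority sequence and not necessarily how the original work applies it; the function name and the choice of elemental forms are assumptions made for illustration.

```python
# Priority sequence as listed in the text: (product name, atoms per molecule).
PRIORITY = [
    ("HF", {"H": 1, "F": 1}),
    ("CF4", {"C": 1, "F": 4}),
    ("H2O", {"H": 2, "O": 1}),
    ("CO2", {"C": 1, "O": 2}),
    ("CCl4", {"C": 1, "Cl": 4}),
    ("CO", {"C": 1, "O": 1}),
    ("HCl", {"H": 1, "Cl": 1}),
]

# Assumed elemental forms for leftover atoms: (species, atoms per formula unit).
ELEMENTAL_FORM = {
    "C": ("C", 1),
    "H": ("H2", 2),
    "N": ("N2", 2),
    "O": ("O2", 2),
    "F": ("F2", 2),
    "Cl": ("Cl2", 2),
}


def assign_products(formula):
    """Greedily form products in priority order; return moles per mole of reactant."""
    remaining = {el: float(n) for el, n in formula.items()}
    products = {}
    for name, composition in PRIORITY:
        # The scarcest required element limits how much of this product can form.
        amount = min(remaining.get(el, 0.0) / n for el, n in composition.items())
        if amount > 0:
            products[name] = amount
            for el, n in composition.items():
                remaining[el] -= amount * n
    # Whatever is left over is converted to its elemental form.
    for el, left in remaining.items():
        if left > 1e-12:
            species, atoms = ELEMENTAL_FORM[el]
            products[species] = left / atoms
    return products


# 2-Nitroaniline, C6H6N2O2
print(assign_products({"C": 6, "H": 6, "N": 2, "O": 2}))
```

With a strictly greedy reading, CO only forms if oxygen remains after CO2 has been assigned, which cannot happen while carbon is in excess, so a different tie-breaking convention may be intended in the original study.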
In the domain of high-energy materials, such as propellants, thermodynamic properties are critically important for the development and deployment of advanced high-performance tanks, artillery, electrothermal chemical systems, and naval bombardment systems. Against this backdrop, Rice et al. [45] predicted the enthalpy of formation of high-energy molecules in both the gas and condensed phases via quantum chemical computations. The structures of some of these high-energy molecules are shown in Figure 4(a). First, using DFT, the gas-phase enthalpies of formation of the high-energy molecules were calculated with an atom equivalent scheme. Specifically, the gas-phase enthalpy of formation is calculated according to Eq. (2):
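Written out from the symbol definitions given in the following paragraph, the atom equivalent relation takes the form below. This is a reconstruction in LaTeX notation based on those definitions and on the usual statement of the atom equivalent scheme, since the typeset equation is not shown here:

$$\Delta H_{f,i}^{(g)} = E_i - \sum_j n_j\,\epsilon_j, \qquad \epsilon_j = E_j - x_j \tag{2}$$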

Figure 4. (a) The structure of some high-energy molecules. (b) Calculated gas-phase heats of formation and (c) heats of vaporization versus experimental values for 35 energetic molecules.
where Ei is the energy of molecule i, and ϵj is the “atom equivalent”, defined as ϵj = Ej - xj, where Ej is the energy of atom j within molecule i, xj is the correction for atom j at the employed theoretical level, and nj is the number of j atoms within molecule i. The atom equivalent method offers significant benefits: it avoids the need for high-level electron correlation treatment or experimental input data for each molecule, while correcting systematic computational errors such as zero-point energy discrepancies. These features have made the method a widely applied tool for predicting gas-phase enthalpies of formation, since it both simplifies the computation and improves the accuracy and reliability of the predictions to some extent. In this study, the researchers compared computational results across different basis sets and functionals and found that the B3LYP functional is not only economical for calculating enthalpies of formation but also yields gas-phase enthalpies of formation in close agreement with experimental values (Figure 4b). Specifically, the RMSE of the gas-phase enthalpy of formation is 3.1 kcal/mol, with a maximum deviation of 7.3 kcal/mol from experimental values. Given that the standard state of these materials typically corresponds to the condensed phase, predicting the condensed-phase enthalpy of formation is of paramount importance. To this end, the researchers combined the gas-phase enthalpy of formation with phase transition enthalpies, such as the sublimation enthalpy and vaporization enthalpy, using Hess’s Law [46] to obtain the condensed-phase enthalpy of formation. Hess’s Law is expressed as Eq. (3) and Eq. (4):
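In LaTeX notation, and assuming that Eq. (3) refers to the solid and Eq. (4) to the liquid (the ordering is an assumption based on the order in which the two phase transition enthalpies are mentioned), the Hess’s Law relations read:

$$\Delta H_f^{(s)} = \Delta H_f^{(g)} - \Delta H_{sub} \tag{3}$$

$$\Delta H_f^{(l)} = \Delta H_f^{(g)} - \Delta H_{vap} \tag{4}$$

Both lines express the same cycle: forming the gas from the elements and then condensing it must give the same enthalpy change as forming the condensed phase directly.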
Furthermore, the researchers formulated functional relationships between the vaporization enthalpy, the sublimation enthalpy, and the electrostatic potential of isolated molecules obtained through quantum chemical computations, expressed as Eq. (5) and Eq. (6) [47]:
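The forms commonly quoted for these electrostatic-potential correlations are reproduced below in LaTeX notation. They are given here as a reconstruction consistent with the symbols defined in the next paragraph, and the assignment of equation numbers is an assumption; the coefficients a, b, and c take different fitted values in each equation.

$$\Delta H_{vap} = a\sqrt{SA} + b\sqrt{\sigma_{tot}^{2}\,\nu} + c \tag{5}$$

$$\Delta H_{sub} = a\,(SA)^{2} + b\sqrt{\sigma_{tot}^{2}\,\nu} + c \tag{6}$$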
Here, SA denotes the surface area of the isosurface at an electron density of 0.001 electrons/bohr³; the electrostatic potential on this isosurface is used to generate two statistically based quantities, σ²tot and ν, where σ²tot is an indicator of the variability of the molecular surface electrostatic potential, and ν reflects the degree of balance between the positive and negative electrostatic potentials on the molecular surface. Using least squares fitting on these parameters, the researchers derived the coefficients a, b, and c for each phase transition enthalpy. This method was successfully applied to predict the condensed-phase enthalpy of formation, with phase transition enthalpy predictions exhibiting good agreement with experimental data (Figure 4c). To further improve the predictive accuracy of the enthalpy of formation and phase transition enthalpy, Byrd and Rice [16] repeated the same computational procedure with a larger basis set (6-311++G(2df,2p)), which significantly reduced the prediction error.
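The least squares step mentioned above can be sketched as follows, assuming a three-parameter model of the form ΔH ≈ a·f(SA) + b·√(σ²tot·ν) + c, where f is a square root for vaporization or a square for sublimation. The model form, the function names, and the data are assumptions for illustration; the data are synthetic, generated from known coefficients so that the fit can be checked, and do not come from the cited studies.

```python
import math


def solve_linear(a, b):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_abc(surface_area, sigma2_nu, enthalpy, area_term=math.sqrt):
    """Least squares fit of enthalpy ~ a*area_term(SA) + b*sqrt(sigma2_tot*nu) + c."""
    rows = [[area_term(sa), math.sqrt(sv), 1.0]
            for sa, sv in zip(surface_area, sigma2_nu)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, enthalpy)) for i in range(3)]
    return solve_linear(xtx, xty)


# Synthetic descriptors (arbitrary units) and enthalpies generated with
# a = 2.0, b = 1.5, c = -3.0 so that the fit can be verified.
surface_area = [100.0, 144.0, 196.0, 256.0, 324.0]
sigma2_nu = [4.0, 25.0, 9.0, 16.0, 1.0]
enthalpy = [2.0 * math.sqrt(sa) + 1.5 * math.sqrt(sv) - 3.0
            for sa, sv in zip(surface_area, sigma2_nu)]

a, b, c = fit_abc(surface_area, sigma2_nu, enthalpy)
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}")
```

For a sublimation-type fit under the same assumptions, `area_term=lambda sa: sa ** 2` can be passed instead of the default square root. In practice a numerical library routine would normally replace the hand-written solver, which is used here only to keep the sketch dependency-free.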
In summary, quantum chemical computation methods demonstrate substantial advantages and efficacy in overcoming the CHETAH program’s deficiencies in gaseous phase enthalpy prediction accuracy and addressing the challenges inherent in condensed phase enthalpy calculations.
4.2. Challenges in predicting decomposition reaction equations
Quantum chemistry calculation methods, owing to their high accuracy in energy calculations, can enable precise predictions of enthalpy of formation and phase transition enthalpy, thereby demonstrating potential advantages over traditional methods, such as the CHETAH program, in predicting heat of decomposition. However, despite the significant value of quantum chemistry in this area, its application in predicting the heat of decomposition still faces several inherent limitations and challenges.
Firstly, the prediction of decomposition reaction equations for reactive compounds remains a significant challenge. Like the CHETAH program, quantum chemistry methods typically rely on the “maximum exothermicity rule” to predict decomposition reaction equations. As previously mentioned, Rice and Hare [15] accurately calculated the gas-phase enthalpy of formation and phase transition enthalpy of explosives using quantum chemistry methods, then applied the “maximum exothermicity rule” to infer the decomposition products and calculate the heat of decomposition. Because the rule selects the most exothermic product set rather than the products actually formed, it generally overestimates the heat of decomposition.
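The dependence of the computed heat on the assumed product set can be made concrete with a small Hess's-law calculation. The sketch below is illustrative only: it compares two hand-written candidate equations instead of optimizing over all possible products, the product enthalpies are rounded textbook gas-phase values, and the reactant (nitromethane) with its approximate enthalpy of formation is an assumption chosen for demonstration, not a case taken from the cited work.

```python
# Rounded textbook gas-phase standard enthalpies of formation at 298 K, kJ/mol.
HF = {"CO2": -393.5, "H2O": -241.8, "CO": -110.5, "N2": 0.0, "H2": 0.0, "C": 0.0}

# Elemental composition of each candidate product, used for the balance check.
ATOMS = {
    "CO2": {"C": 1, "O": 2}, "H2O": {"H": 2, "O": 1}, "CO": {"C": 1, "O": 1},
    "N2": {"N": 2}, "H2": {"H": 2}, "C": {"C": 1},
}


def is_balanced(reactant_atoms, products, tol=1e-9):
    """Check that the products contain exactly the atoms of one mole of reactant."""
    totals = {}
    for species, n in products.items():
        for element, count in ATOMS[species].items():
            totals[element] = totals.get(element, 0.0) + n * count
    elements = set(totals) | set(reactant_atoms)
    return all(abs(totals.get(e, 0.0) - reactant_atoms.get(e, 0.0)) < tol
               for e in elements)


def heat_of_decomposition(hf_reactant, products):
    """Hess's law: product enthalpies of formation minus that of the reactant."""
    return sum(n * HF[s] for s, n in products.items()) - hf_reactant


def most_exothermic(hf_reactant, candidates):
    """Name of the candidate equation with the most negative reaction enthalpy."""
    return min(candidates,
               key=lambda k: heat_of_decomposition(hf_reactant, candidates[k]))


# Illustrative reactant: nitromethane, CH3NO2. The value below is an approximate
# liquid-phase enthalpy of formation used only for demonstration (assumption).
reactant_atoms = {"C": 1, "H": 3, "N": 1, "O": 2}
hf_reactant = -113.0

# Two hypothetical product sets per mole of reactant.
candidates = {
    "water_first": {"H2O": 1.5, "CO2": 0.25, "C": 0.75, "N2": 0.5},
    "co_rich": {"CO": 1.0, "H2O": 1.0, "H2": 0.5, "N2": 0.5},
}

for name, products in candidates.items():
    assert is_balanced(reactant_atoms, products)
    print(name, round(heat_of_decomposition(hf_reactant, products), 1), "kJ/mol")
print("selected by the rule:", most_exothermic(hf_reactant, candidates))
```

Both equations are element-balanced, yet they differ by roughly 100 kJ/mol. If the real decomposition resembles the less exothermic set, a rule that always picks the most exothermic one overestimates the heat released.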
To address this problem, quantum chemistry methods offer a significant advantage: they can locate reaction transition states and, combined with experimental methods, determine reaction pathways [48-51]. This allows a deeper theoretical understanding of the microscopic mechanisms of chemical reactions, particularly for complex decomposition reactions. With this approach, researchers can systematically identify the specific pathways of decomposition reactions, providing a reliable basis for accurate heat of decomposition calculations. For instance, Gong et al. [52] combined experimental and theoretical methods to investigate the initial decomposition mechanisms of di-tert-butyl peroxide (DTBP), tert-butyl peroxyisopropyl carbonate (TBCP), tert-butyl peroxybenzoate (TBPB), and tert-butyl peroxy-2-ethylhexanoate (TBPEH) (Figure 5). First, gas chromatography-mass spectrometry (GC-MS) experiments were conducted to identify the thermal decomposition products (Figure 5a), and decomposition pathways were proposed based on this analysis (Figure 5b). The researchers then used DFT calculations to compute the energy change at each step of the decomposition pathway (Figure 5c). The experimental and DFT results were combined to validate the proposed decomposition pathways from a microscopic perspective. This demonstrates the validity and feasibility of using quantum chemistry methods to infer reaction pathways.

Figure 5. (a) First-level mass spectra of DTBP and TBCP. (b) Initial decomposition pathways of DTBP and TBCP. (c) Optimized configurations of reactants, intermediates, transition states, and products.
However, despite the high theoretical accuracy demonstrated by this method, several challenges remain in its practical application. Specifically, the process of inferring reaction pathways is often cumbersome and time-consuming, especially when dealing with multi-step reaction systems, where the computational demands are enormous, leading to prolonged calculation times for heat of decomposition. Additionally, the method’s general applicability is relatively limited, making it difficult to apply to all types of decomposition reactions. Thus, its practical use faces constraints that need to be overcome.
Apart from these primary drawbacks, the use of quantum chemistry software typically requires operators to possess advanced expertise, as the process is relatively complex and the hardware requirements for computational devices are high. To address the challenges related to the complexity of quantum chemistry software operation and the high level of user expertise required, the development of integrated computational software is undoubtedly an effective solution. For instance, the quantum chemistry software T1 is designed for the direct calculation of thermodynamic parameters such as the enthalpy of formation [53]. Its user-friendly design allows users to perform relevant calculations more efficiently.
In conclusion, quantum chemistry methods, while offering significant potential for predicting the heat of decomposition of reactive compounds, face several challenges, particularly in the prediction of decomposition reaction equations. The prediction of these equations often relies on the “maximum exothermicity rule,” which, although simplifying the prediction process to some extent, generally leads to overestimations of decomposition heat, thus impacting the accuracy of the predictions. Furthermore, although quantum chemistry methods can predict reaction pathways, this process is not only time-consuming but also computationally intensive, especially when complex reaction mechanisms are involved, further limiting the method’s applicability and generalizability. Therefore, to overcome these inherent limitations, there is an urgent need to develop more efficient, precise, and widely applicable predictive methods for the determination of decomposition reaction equations.
4.3. Deep learning (DL)-assisted prediction of decomposition reaction equations
4.3.1. Overview of DL technologies and large language models (LLMs)
Machine learning (ML), as a data-driven artificial intelligence technology, utilizes various algorithms that allow computer systems to automatically learn from input data and make predictions or decisions without explicit programming [54-56]. The core objective of ML is to analyze input sample data and construct a mathematical model capable of recognizing patterns and making inferences, thereby enabling the system to make reasonable judgments in various tasks.
DL is a branch of ML built on multi-layer neural network architectures [57,58], such as convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory networks (LSTM), and generative adversarial networks (GAN). These architectures enable computers to extract features automatically during data processing, without predefined feature values. This self-learning capability represents a major breakthrough: models discover and build useful features through the training process itself, without explicit human guidance. As the depth of the network increases, the model’s expressive power is enhanced, because each layer applies an activation function to transform its input and thereby extracts increasingly abstract features. Adding hidden layers therefore allows DL to address highly complex tasks and can improve prediction accuracy, provided sufficient training data are available. However, DL models are often highly complex, and their internal workings are difficult to understand intuitively, which is why they are often referred to as “black box” models. Despite this, DL has been widely applied in fields such as medicine [59-61] and chemistry [62-64].
LLMs are natural language processing models based on DL, typically containing billions to trillions of parameters, and are capable of understanding, generating, translating, and summarizing natural language texts [65-67]. These models are trained on vast corpora of data, learning the syntax, semantics, and complex relationships between contexts within texts. One of the core technologies of LLMs is the Transformer architecture, which effectively captures long-range dependencies through self-attention mechanisms, significantly improving training efficiency and enhancing model performance. In recent years, LLMs have seen rapid development across various fields [68-71] such as natural language processing [72], computer vision [73], and autonomous driving [74]. With their exceptional understanding and reasoning abilities, they have shown great potential in scientific research. Notably, LLMs have been successfully applied to tasks related to chemistry, such as predicting molecular properties [75,76], designing molecular structures [77] and designing experimental protocols [78], yielding satisfactory results. Therefore, LLMs based on DL technology, specifically designed for reaction prediction, hold promise as a potential solution for predicting decomposition reaction equations of reactive compounds.
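The self-attention mechanism mentioned above can be illustrated with a minimal pure-Python sketch of scaled dot-product attention. It is a simplification: in an actual Transformer the queries, keys, and values are learned linear projections of the token embeddings and several attention heads run in parallel, whereas here the toy token vectors are used directly.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V, one row per token."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[col] for w, v in zip(row, values)) for col in range(len(values[0]))]
        for row in weights
    ]
    return outputs, weights


# Three toy token vectors used as queries, keys and values at once (assumption:
# no learned projections, single head).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Every token receives a weight for every other token regardless of their distance in the sequence, which is how long-range dependencies are captured in a single step.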
4.3.2. Decomposition reaction equation prediction based on LLMs
To address the challenge of accurately predicting the decomposition reaction equations of reactive compounds, LLMs specifically tailored for reaction prediction can be developed using advanced DL techniques [79]. Such models could assist in determining decomposition reaction equations, thereby enhancing the precision and reliability of the prediction process. For instance, ChemDFM, the first billion-scale specialized model for the chemical science domain, released by Zhao et al. [80], can predict reaction product structures given specific reactants and conditions. The construction of ChemDFM consists of two stages: domain pre-training and instruction tuning (Figure 6a). First, to address the lack of domain-specific knowledge in general LLMs, the research team employed a multi-source data fusion strategy to construct a high-quality chemical corpus of 34 billion tokens for domain-adaptive pre-training of the general large model LLaMA. This corpus integrates nearly 4 million rigorously screened research articles in chemistry together with foundational chemical textbooks and reference materials, balancing frontier research with fundamental knowledge. Second, to overcome the bottleneck in chemical semantic understanding of large models, the team developed a structured instruction dataset of 1.7 million entries covering diverse tasks such as molecular identification, property prediction, and reaction prediction, based on PubChem (the world’s largest molecular database) and USPTO (the authoritative chemical reaction database) (Figure 6b). Multi-task fine-tuning significantly enhanced the model’s specialized representational capacity.
Experimental validation demonstrates that ChemDFM possesses robust chemical knowledge comprehension and reasoning capabilities, outperforming commercial large models like GPT-4 in numerous chemical tasks. Notably, ChemDFM has achieved exceptional performance in reaction prediction (Figure 6c), proving the practical feasibility of using LLMs for predicting chemical reaction equations.

Figure 6. (a) Two-step training procedure to obtain the chemical LLM ChemDFM. (b) Itemized list of the instruction tuning dataset. MD: Molecule Description, TBMD: Text-Based Molecule Design, MPP: Molecular Property Prediction, RC: Reaction Completion, MNA: Molecular Notation Alignment. (c) Accuracy scores of different models in reaction prediction and retrosynthesis tasks. YP: Yield Prediction, RP: Reactant Prediction, RS: Reagent Selection, Retro: Retrosynthesis.
Additionally, the ChemLLM model developed by Zhang et al. [81] also possesses reaction product prediction capabilities. This model employs a two-stage instruction fine-tuning strategy (general corpus pre-training + chemical data hybrid optimization), which adapts it to chemical task requirements while maintaining its general language capabilities (Figure 7a). The researchers collected chemical data from numerous online repositories, including PubChem, ChEMBL, ChEBI, and ZINC, and created a large-scale dataset named ChemData for fine-tuning ChemLLM (Figure 7b). ChemData uses a template-based instruction construction method to transform structured chemical data into natural conversational formats suitable for LLM training. The dataset comprises 7 million question-answer pairs for instruction fine-tuning, covering extensive chemical domain knowledge, with the QA pairs grouped into molecular, reaction, and other chemistry-related task categories. Experimental results demonstrate that ChemLLM achieves performance comparable to the commercial large-scale model GPT-4 in multiple chemical tasks, including reaction product prediction.

Figure 7. (a) Two-stage instruction tuning pipeline for ChemLLM. (b) Composition of ChemData and ChemBench. ChemData contains 7 million instruction-tuning Q&A pairs, aligned with three principal task categories: molecules, reactions, and other domain-specific tasks. ChemBench contains 4k multiple-choice questions, aligned with two principal task categories: molecules and reactions.
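The template-based instruction construction described for ChemData can be sketched as follows. The templates, field names, and records are hypothetical and only illustrate the general idea of turning structured entries into question-answer pairs; they are not taken from the ChemData dataset itself.

```python
# Hypothetical templates; the real ChemData templates are not reproduced here.
TEMPLATES = {
    "property": [
        "What is the {prop} of the molecule {smiles}?",
        "Please report the {prop} for the compound with SMILES {smiles}.",
    ],
    "reaction": [
        "Which product forms when {reactants} react under {conditions}?",
    ],
}


def build_qa(record, variant=0):
    """Fill one template of the record's task type with the record's fields."""
    templates = TEMPLATES[record["task"]]
    question = templates[variant % len(templates)].format(**record["fields"])
    return {"question": question, "answer": record["answer"]}


# Two structured records as they might come from a database export (assumed layout).
records = [
    {"task": "property",
     "fields": {"prop": "molecular formula", "smiles": "CCO"},
     "answer": "C2H6O"},
    {"task": "reaction",
     "fields": {"reactants": "CC(=O)O and CCO", "conditions": "acid catalysis"},
     "answer": "CCOC(C)=O"},
]

qa_pairs = [build_qa(record, i) for i, record in enumerate(records)]
for pair in qa_pairs:
    print(pair["question"], "->", pair["answer"])
```

Rotating through several phrasings per task, as the `variant` argument does here, is one simple way to keep the generated dialogue from being monotonous.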
Thus, the development of efficient prediction models integrating DL technologies to assist in precisely determining compound decomposition reaction equations is anticipated to become a promising and forefront research direction with significant application prospects. This research domain not only addresses the shortcomings of current methods in predicting decomposition reaction equations but also promotes further development of thermodynamic parameter prediction through the robust pattern recognition and data-driven prowess of DL.
5. Heat of decomposition prediction based on QSPR models
5.1. Data-driven approaches for predictive modeling
5.1.1. Introduction to QSPR methodology
Quantitative Structure-Property Relationships (QSPR) represent a sophisticated scientific methodology designed to predict the physical and chemical properties of compounds by establishing a mathematical relationship between the properties and molecular structure descriptors. The foundational principle of QSPR is based on the assumption that a quantifiable correlation exists between a compound’s properties and its molecular structure, with such structural features represented by molecular descriptors that encapsulate critical information such as molecular shape, size, and charge distribution [82-84]. The construction of QSPR models involves initially assembling a dataset comprising multiple compounds, each annotated with molecular descriptors and corresponding physical and chemical property values. Subsequently, statistical techniques are employed to ascertain the optimal mathematical relationship between these descriptors and the target properties, thereby enabling accurate prediction of relevant properties for novel compounds (Figure 8). Critical to the predictive performance of these models are both the selection of molecular descriptors and the modeling methodologies employed. Depending on the dimensionality of the molecular descriptors employed, QSPR models may be categorized into multiple dimensions: 1D, 2D, 3D, and even higher dimensions [85,86]. Specifically, 1D descriptors typically include attributes such as molecular weight, atom type counts, number of hydrogen bond donors or acceptors, number of rings, or specific functional groups. In contrast, 2D descriptors are derived from analyzing molecular graphs to compute various molecular indices; 3D descriptors are contingent upon geometric molecular knowledge, such as polar surface area; while 4D descriptors arise from conformation searches or molecular dynamics simulations. 
As the variety of molecular descriptors continues to expand, increasingly sophisticated ML algorithms, such as k-nearest neighbors, support vector machines, random forests, and artificial neural networks, have been extensively incorporated into QSPR modeling, significantly enhancing the models’ predictive accuracy and generalization capabilities [87-90].

Figure 8. Mechanism of the QSPR method employing input data representing molecules in 1D to 4D.
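As an illustration of the simplest descriptor class, the sketch below derives a few 1D descriptors (atom counts and molecular weight) directly from a molecular formula. The parser is a deliberately minimal assumption: it handles only flat formulas without parentheses and only the four elements listed, whereas practical work would rely on a cheminformatics toolkit.

```python
import re

# Standard atomic masses (g/mol), rounded; only C, H, N, O are covered in this sketch.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def parse_formula(formula):
    """Count atoms in a flat formula such as 'C8H18O2' (no parentheses supported)."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def descriptors_1d(formula):
    """Return a small dictionary of composition-based (1D) descriptors."""
    counts = parse_formula(formula)
    return {
        "molecular_weight": sum(ATOMIC_MASS[el] * n for el, n in counts.items()),
        "n_C": counts.get("C", 0),
        "n_H": counts.get("H", 0),
        "n_N": counts.get("N", 0),
        "n_O": counts.get("O", 0),
        "n_heavy_atoms": sum(n for el, n in counts.items() if el != "H"),
    }


# Di-tert-butyl peroxide (DTBP) has the molecular formula C8H18O2.
print(descriptors_1d("C8H18O2"))
```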
5.1.2. Methods for molecular descriptors screening
The methods for molecular descriptor screening can be categorized into unsupervised and supervised approaches. Unsupervised methods perform dimensionality reduction or redundancy elimination based on the intrinsic properties of descriptors, with principal component analysis (PCA) and its nonlinear extension kernel principal component analysis (Kernel PCA) being representative examples [91]. PCA transforms high-dimensional descriptors into low-dimensional principal components through linear combinations, significantly reducing data dimensionality while retaining most information, though it cannot handle nonlinear relationships. Kernel PCA addresses this limitation by introducing nonlinear mapping via kernel functions. For instance, Wen et al. [92] employed both PCA and Kernel PCA to process 200 topological descriptors in a flash point study, obtaining 8 principal components via PCA and 5 via Kernel PCA (Figure 9). This approach enhanced model stability and simplicity while further minimizing the loss of critical information. However, unsupervised methods do not incorporate target property information, limiting their applicability.

Figure 9. Schematic diagram of the molecular descriptor screening process using PCA and KPCA.
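The redundancy-elimination side of unsupervised screening can be sketched without any target property. The example below is a simple pairwise correlation filter, not PCA or Kernel PCA: it drops constant descriptors and any descriptor that is almost perfectly correlated with one already kept. The descriptor names, values, and the 0.95 threshold are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_redundant(descriptors, threshold=0.95):
    """Keep a descriptor only if it varies and is not near-collinear with a kept one."""
    kept = []
    for name, column in descriptors.items():
        if len(set(column)) == 1:
            continue  # a constant descriptor carries no information
        if all(abs(pearson(column, descriptors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy descriptor table for five hypothetical compounds (one list entry per compound).
descriptors = {
    "n_C": [2, 4, 6, 8, 10],
    "mol_weight": [30.07, 58.12, 86.18, 114.23, 142.28],  # nearly linear in n_C
    "n_OO_bonds": [1, 0, 1, 0, 0],
    "n_rings": [0, 0, 0, 0, 0],                            # constant
}
print(drop_redundant(descriptors))
```

Because no target values are used, the filter cannot tell whether a surviving descriptor is actually relevant to the property, which is the limitation of unsupervised screening noted above.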
Supervised methods are further divided into traditional regression techniques and intelligent search algorithms. Traditional methods include stepwise selection (SS), partial least squares (PLS), and an improved variant of all subsets regression (VSMP). Although intuitive and efficient, these methods exhibit limited performance when addressing complex nonlinear problems or large-scale descriptor pools. Intelligent search algorithms employ optimization strategies to identify globally optimal descriptor combinations. Among these, genetic algorithms (GA) simulate biological evolution, utilizing crossover and mutation operations to screen optimal subsets, combining efficiency with global search capability. Its derivative method, genetic function approximation (GFA), further integrates adaptive regression splines, enabling simultaneous variable selection and model construction [93]. Particle swarm optimization (PSO) mimics bird flock foraging behavior, employing a velocity-position update mechanism to search for optimal solutions, characterized by few parameters and ease of implementation [94]. Ant colony optimization (ACO) guides path selection through a pheromone feedback mechanism, with its probabilistic search nature making it particularly effective for multi-solution problems [95]. Compared to traditional methods, intelligent algorithms can effectively handle high-dimensional, nonlinear data, though a trade-off between computational cost and model complexity must be considered. Currently, these algorithms have become mainstream descriptor screening tools in QSPR research.
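A minimal genetic algorithm for descriptor selection is sketched below to make the crossover and mutation steps concrete. Each individual is a 0/1 mask over the descriptor pool. The fitness function is a toy stand-in (hypothetical relevance scores minus a per-descriptor penalty); in an actual QSPR study it would typically be a cross-validated model score. Population size, mutation rate, and selection scheme are arbitrary assumptions.

```python
import random


def genetic_selection(n_descriptors, fitness, pop_size=20, generations=30,
                      mutation_rate=0.1, seed=0):
    """Evolve 0/1 masks over descriptors; returns the best mask and the fitness history."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_descriptors)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]        # truncation selection
        children = [population[0][:]]                # elitism: best mask survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_descriptors - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical relevance of six descriptors to the target property.
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.2]
PENALTY = 0.3  # cost of adding a descriptor, discourages oversized models


def toy_fitness(mask):
    """Toy objective: reward relevant descriptors, penalize model size."""
    return sum(r - PENALTY for r, g in zip(RELEVANCE, mask) if g)


best_mask, history = genetic_selection(len(RELEVANCE), toy_fitness)
print(best_mask, round(history[-1], 3))
```

Elitism guarantees that the best fitness never decreases from one generation to the next, but, as with any stochastic search, finding the global optimum is not guaranteed for large descriptor pools.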
5.1.3. Algorithms for model construction
The construction of QSPR models primarily relies on statistical learning and ML methods, which can be categorized into linear and nonlinear regression approaches. Among linear models, multiple linear regression (MLR) establishes a linear relationship between target properties and molecular descriptors, offering strong interpretability and computational efficiency. However, it requires careful handling of multicollinearity among independent variables and may perform poorly with insufficient sample sizes. Partial least squares regression (PLSR) addresses multicollinearity by extracting latent components that capture both predictor and response variations, resulting in improved model stability. Linear models are particularly suitable for scenarios with limited data and well-defined relationships, though optimal solutions should be determined through parameter optimization and cross-validation.
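For reference, the MLR model and its ordinary least squares solution can be written in standard textbook notation (the symbols below are not taken from the original text): y is the target property, x_ik is descriptor k of compound i, p is the number of descriptors, and n is the number of compounds.

```latex
\begin{align}
y_i &= \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i, \qquad i = 1, \dots, n \\
\hat{\boldsymbol{\beta}} &= (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}
\end{align}
```

When descriptors are strongly collinear, the matrix XᵀX becomes nearly singular and the estimated coefficients become unstable, which is the multicollinearity problem that PLSR is designed to avoid.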
In contrast, nonlinear models capture intricate dependencies through complex mapping functions. For instance, artificial neural networks (ANN) employ layered architectures with adaptive weights to model nonlinear responses, often achieving superior accuracy compared to linear models [96]. However, their black-box nature and susceptibility to overfitting necessitate mitigation strategies such as genetic algorithm optimization or data augmentation. Support vector machines (SVM) use kernel functions to project data into higher-dimensional spaces. Because SVM training is a convex optimization problem, it yields a globally optimal solution rather than a local one, and it handles small-sample problems effectively [97]. Random forests (RF) reduce prediction variance by aggregating multiple decision trees, making them robust for classification and regression tasks involving high-dimensional data [98]. K-nearest neighbors (KNN) relies on local similarity for rapid predictions but is sensitive to high-dimensional data [99]. While nonlinear models excel in modeling complex datasets, their adoption requires careful consideration of trade-offs among model complexity, interpretability, and computational cost. The final modeling strategy should be guided by performance evaluation and application-specific requirements.
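Of the nonlinear methods listed, k-nearest neighbors is the simplest to write down and shows the local-similarity idea directly. The sketch below predicts a property as the mean over the k closest training compounds in descriptor space. The descriptor vectors and property values are invented for illustration, and in practice the descriptors would be scaled before distances are computed.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict the property of `query` as the mean over its k nearest training points."""
    distances = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = distances[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical two-descriptor training set with invented property values.
train_x = [[1, 0], [2, 0], [3, 1], [8, 2], [9, 2]]
train_y = [100.0, 110.0, 150.0, 400.0, 420.0]

print(knn_predict(train_x, train_y, [2, 0], k=3))
```

As more descriptors are added, distances between compounds become less discriminating, which is the sensitivity to high-dimensional data mentioned above.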
5.2. Application of QSPR techniques in heat of decomposition prediction
QSPR technology has found extensive applications across multiple disciplines, particularly in chemical safety assessment and risk evaluation. Research dedicated to the prediction of heat of decomposition for reactive compounds has gradually proceeded, achieving significant advancements. Organic peroxides, given their high energy potential, are widely utilized in industrial and military spheres; however, their thermal instability can present safety hazards during storage and transportation. Thus, assessing the thermal stability of these compounds becomes imperative, particularly focusing on key parameters such as initial decomposition temperature and heat of decomposition, which are crucial for evaluating the thermal stability of organic peroxides. Prana et al. [25] developed a database consisting of experimental heat of decomposition data for 38 organic peroxides and constructed a multiple linear regression model utilizing quantum chemical descriptors derived from DFT computations to successfully predict the thermal stability of organic peroxides (Figures 10a and b). Within this model, all molecular descriptors exhibited similar importance, with concentration emerging as a significant descriptor. Additionally, the descriptors of local softness and NBO charge on oxygen were directly associated with the peroxide bond, corroborating the critical role of peroxide bond dissociation in the decomposition process.

Figure 10. (a) Families of organic peroxides. (b) Prediction model and values of the heat of decomposition versus experimental data (Zohari, 2016). (c) Experimental vs. calculated heats of decomposition (Fayet, 2022). (d) Williams’ plot illustrating the applicability domain. (e) Distribution of absolute deviations observed between predicted and measured heats of decomposition in the training and in the validation set.
Moreover, Zohari et al. [27] employed simpler descriptors, such as the number of carbon atoms, oxygen atoms, and the presence of specific functional groups, to formulate innovative QSPR models for predicting the thermal stability parameters of organic peroxides. This approach obviated reliance on complex quantum chemical computational parameters, thereby rendering the model easier to build and more practically applicable.
In addition to organic peroxides, self-reactive substances represent another class of highly unstable chemicals. They are prone to decomposition during transport, storage, or processing and can cause severe outcomes such as explosions under extreme conditions. Consequently, evaluating their thermal stability is vital for process safety and for classification purposes. Fayet et al. [29] developed QSPR models to predict the heat of decomposition of self-reactive substances. The models are based on three-dimensional molecular structures and on SMILES codes, and both were validated effectively. The SMILES-based model was deemed the most practical tool owing to its robust predictive performance on smaller datasets and its avoidance of intricate quantum chemical computations (Figures 10c-e). Recent work shows a preference for avoiding quantum chemical descriptors, which carry high computational costs and time demands, in favor of simpler descriptors such as composition and topology descriptors, thus reducing the computational burden and improving the efficiency of heat of decomposition prediction.
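The applicability domain shown in a Williams plot (Figure 10d) is conventionally assessed by plotting standardized residuals against leverage values. The definitions below are the ones commonly used in QSPR work and are given here as an assumption about the convention, not as a statement of what the cited study used: x_i is the descriptor vector of compound i, X is the training descriptor matrix, p is the number of descriptors, and n is the number of training compounds.

```latex
\begin{align}
h_i &= \mathbf{x}_i^{\top} (\mathbf{X}^{\top}\mathbf{X})^{-1} \mathbf{x}_i \\
h^{*} &= \frac{3\,(p + 1)}{n}
\end{align}
```

A compound with h_i above the warning value h* lies far from the training data in descriptor space, so its prediction is an extrapolation and should be treated with caution.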
5.3. Enhancement of QSPR models through DL techniques
QSPR methods show promise for heat of decomposition prediction, primarily because they deliver relatively precise predictions. The method rests on the existence of a quantitative relationship between a compound’s heat of decomposition and its molecular structure; statistical methods are used to establish this relationship, which then enables prediction for novel compounds. Compared with the CHETAH program, QSPR methods therefore do not require the compound’s enthalpy of formation or phase transition enthalpy, and they are not subject to the limitations of the Benson group attribution method, namely incomplete coverage of chemical groups and neglect of molecular spatial structure. This makes them an effective alternative for heat of decomposition prediction. However, QSPR methods still face several challenges: (1) High-precision prediction requires a substantial sample size [100,101]. Because the QSPR approach depends on statistical methods to relate molecular descriptors to target properties, it needs extensive, high-quality sample data. (2) The applicability range of predictive models is relatively narrow [102,103]. Separate models typically need to be constructed for different classes of compounds, which limits their general use. In heat of decomposition prediction this challenge is especially pronounced, because the number of heat of decomposition measurements available for any single class of compounds is insufficient. Despite these challenges, QSPR methods remain a significant focus in chemical safety assessment research owing to their simplicity and efficiency.
Over recent years, DL techniques have been extensively adopted in QSPR modeling due to their robust feature learning capabilities and adeptness at capturing complex patterns. Deep neural networks, generative adversarial networks, RNNs, and other DL techniques have effectively enhanced the predictive accuracy of physical and chemical properties, with promising potential to diminish the need for large sample sizes. The methodology has already achieved notable successes in pioneering fields such as drug discovery [104-107]. Notably, Uesawa et al. introduced the “DeepSnap” method, which uses compound images as input features for DL, thereby obviating traditional molecular descriptors and achieving superior prediction performance (Figure 11) [108-110].

Figure 11. Comparison of traditional and deep QSPR modeling approaches.
As a pivotal extension of DL, transfer learning offers an innovative solution to the data scarcity challenge in QSPR modeling. As illustrated in Figure 12, transfer learning mitigates the dependence on target domain data volume by transferring molecular representation knowledge acquired from pre-trained models in source domains (e.g., large-scale compound databases) to target domains (e.g., specific property prediction tasks) [111]. For instance, Li et al. [112] proposed a Convolutional Recurrent Neural Network and Transfer Learning (CRNNTL) framework, which integrates the local feature extraction capability of CNNs, the global sequence modeling strength of RNNs, and transfer learning. By leveraging transfer learning technology to effectively transfer knowledge from larger datasets to smaller ones, this approach demonstrated superior predictive performance across 27 drug and material property datasets compared to conventional methods, substantially alleviating the challenges of small dataset training.

Figure 12. Schematic diagram of the model construction process using transfer learning.
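The principle of transferring knowledge from a data-rich source task to a data-poor target task can be shown with a deliberately small toy model. In the sketch below, a straight line stands in for a neural network: the slope plays the role of the pre-trained representation and the intercept plays the role of the task-specific output layer that is fine-tuned. All data are synthetic and the setup is an assumption made for illustration; it is not the CRNNTL architecture.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def fine_tune_intercept(slope, xs, ys):
    """Keep the transferred slope frozen and refit only the intercept."""
    return sum(y - slope * x for x, y in zip(xs, ys)) / len(xs)


# Source task: plenty of clean synthetic data following y = 2x + 1.
source_x = [i / 10 for i in range(50)]
source_y = [2.0 * x + 1.0 for x in source_x]

# Target task: same slope but shifted intercept (y = 2x + 3); only two noisy points.
target_x = [1.0, 1.5]
target_y = [5.3, 5.7]

pretrained_slope, _ = fit_line(source_x, source_y)
transferred_intercept = fine_tune_intercept(pretrained_slope, target_x, target_y)
scratch_slope, scratch_intercept = fit_line(target_x, target_y)

x_new, y_true = 4.0, 11.0  # unseen target-domain point
transfer_pred = pretrained_slope * x_new + transferred_intercept
scratch_pred = scratch_slope * x_new + scratch_intercept
print("transfer:", round(transfer_pred, 2), "from scratch:", round(scratch_pred, 2))
```

With only two noisy target points, the model fitted from scratch learns a distorted slope and extrapolates poorly, whereas the transferred model only has to estimate a single parameter from the scarce data. The benefit depends on the source and target tasks actually sharing structure, which is assumed by construction here.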
Furthermore, Li et al. [113] introduced the MolPMoFiT method, an efficient transfer learning strategy based on self-supervised pre-training followed by task-specific fine-tuning for QSPR/QSAR modeling. The large-scale molecular structure prediction model was pretrained in a self-supervised manner using one million unlabeled molecules from ChEMBL, followed by fine-tuning on smaller chemical datasets with specific endpoints across various QSPR/QSAR tasks. Ultimately, this method outperformed existing DL approaches in tasks such as lipophilicity prediction, HIV inhibition, and blood-brain barrier penetration, demonstrating robust predictive performance even in few-shot learning scenarios.
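Self-supervised pre-training on unlabeled molecules is commonly framed as next-token prediction over SMILES strings, so that the training targets come from the molecules themselves. The sketch below shows how such (context, target) pairs could be built. The regular-expression tokenizer and the `<bos>`/`<eos>` markers are simplifying assumptions and do not reproduce the tokenization used in MolPMoFiT.

```python
import re

# Simplified SMILES tokenizer (assumption): bracket atoms, two-letter halogens,
# two-digit ring closures, then any single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into tokens."""
    return TOKEN_PATTERN.findall(smiles)


def next_token_pairs(smiles):
    """Build (context, next token) training pairs; no property label is required."""
    tokens = ["<bos>"] + tokenize(smiles) + ["<eos>"]
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Acetyl chloride as a small example molecule.
for context, target in next_token_pairs("CC(=O)Cl"):
    print("".join(context[1:]) or "(start)", "->", target)
```

A model pre-trained on such pairs learns SMILES syntax and common substructures from unlabeled data, and is then fine-tuned on the much smaller labeled dataset for the property of interest.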
These advances indicate that integrating transfer learning with established DL techniques (e.g., DeepSnap) could significantly reduce the sample size required for QSPR model construction while enhancing predictive accuracy. Consequently, applying DL techniques to the construction of heat of decomposition prediction models is anticipated to further improve prediction accuracy and open new avenues for the advancement of QSPR technologies.
6. Conclusions
The heat of decomposition is a critical parameter in characterizing the thermodynamic properties of compounds. Theoretical prediction methods offer a valuable alternative to experimental testing, effectively mitigating the drawbacks of lengthy time requirements and potential hazards associated with experiments. This article provides a comprehensive review of the advancements in three methods for predicting heat of decomposition: the CHETAH program, the QSPR method, and the quantum chemical calculation method. It emphasizes a detailed analysis of the predictive mechanisms, advantages, limitations, and potential improvement directions for each approach.
The CHETAH program primarily relies on the Benson group attribution method to estimate the enthalpy of formation of compounds. It then predicts the heat of decomposition by determining the decomposition reaction equation based on the “maximum exothermicity rule”. Owing to its simple operation and rapid calculation speed, the CHETAH program has found extensive use in predicting heat of decomposition and is particularly suitable for simple, fast estimations where high accuracy is not required. However, the inherent limitations of the group attribution method, such as its inability to account for all possible chemical groups and its neglect of molecular spatial structures, fundamentally restrict the scope and accuracy of the CHETAH program’s applications. These limitations lead to significant errors in predicting standard enthalpies of formation and to difficulties in estimating the enthalpies of formation of condensed phases. Consequently, neither CHETAH nor similar software has emerged as a prominent research area in recent years.
Quantum chemical calculations, leveraging high-precision computational approaches such as DFT, allow for precise prediction of the enthalpy of formation and phase transition enthalpy of compounds. Like CHETAH, they use the “maximum exothermicity rule” to determine decomposition reaction equations and thereby predict the heat of decomposition. The accuracy and applicability of this method in predicting enthalpy of formation far exceed those of the CHETAH program, and it can also predict the enthalpy of formation of condensed phases. Therefore, quantum chemical computational methods are suitable for scientific research requiring higher accuracy. However, they still face challenges in determining decomposition reaction equations. To address these challenges, a promising future research direction is the development of LLMs for predicting decomposition reaction equations by integrating DL methods.
The QSPR method uses statistical techniques to build a mathematical model relating molecular descriptors to the heat of decomposition, and thus predicts the heat of decomposition in a data-driven way. Unlike the CHETAH program, it does not require enthalpies of formation or phase transition to be computed, and so avoids the errors those calculations introduce. It is particularly suitable for rapid and accurate prediction of the heat of decomposition for specific classes of compounds with large sample sizes. Nevertheless, QSPR modeling depends heavily on large volumes of sample data, which limits its scope of application. In studies focused on specific categories of compounds, a common problem is that too few heat of decomposition data are available. Looking forward, integrating DL methods, which have strong feature learning capabilities and can capture complex patterns, into QSPR modeling could significantly enhance predictive accuracy.
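As a minimal sketch of the data-driven idea, the example below fits a single-descriptor linear model by ordinary least squares and scores it with leave-one-out cross-validation, a check often used when samples are scarce. Everything here is an assumption made for illustration: the function names are hypothetical, the data are synthetic and exactly linear, and a real QSPR model would use several descriptors and measured heats of decomposition.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(x: List[float], y: List[float]) -> float:
    """Leave-one-out cross-validated Q^2: 1 - PRESS / total sum of squares."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Synthetic, noise-free data (NOT experimental values): one hypothetical
# descriptor and a target generated from a known linear relation.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
heat = [-150.0 - 100.0 * d for d in descriptor]

slope, intercept = fit_line(descriptor, heat)
q2 = loo_q2(descriptor, heat)
```

With only a handful of samples, each left-out point changes the fit noticeably on real data, which is one way the small-sample problem mentioned above shows up in practice.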
In conclusion, the application scope and accuracy of the CHETAH program are constrained by inherent flaws of the Benson group additivity method, namely its inability to cover all groups and its neglect of the influence of molecular spatial structure, which ultimately hinders its further development. Quantum chemical methods face challenges in determining decomposition reaction equations, while the QSPR approach is limited by its dependence on sample data and by the insufficient number of samples for certain compound types. Integrating DL technologies into the existing research framework is viewed as a cutting-edge direction for advancing this field. On the one hand, using DL to develop LLMs that assist in inferring decomposition reaction equations, and combining them with quantum chemical methods, opens new pathways toward high-precision prediction of the heat of decomposition of reactive compounds. On the other hand, combining DL with QSPR methods could significantly enhance the accuracy of models for predicting the physical and chemical properties of compounds.
Acknowledgment
The authors gratefully acknowledge financial support from the Technology Development Program of SINOPEC, China (Grant No. H23012).
CRediT authorship contribution statement
Liao Zhang: Concepts, Design, Literature search, Manuscript preparation, Manuscript editing and review, Definition of intellectual content, Statistical analysis. Xiangning Song: Manuscript editing and review, Concepts, Design. Peng Li: Design. Yuan Yuan: Manuscript editing and review. Kefeng Wan: Manuscript editing and review. Fei Huang: Design, Definition of intellectual content, Manuscript editing and review. Yafeng Guo: Concepts, Design, Definition of intellectual content, Manuscript editing and review. Hongzhe Zhang: Concepts, Design, Definition of intellectual content, Manuscript preparation, Manuscript editing and review. All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Declaration of competing interest
We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work.
Declaration of generative AI and AI-assisted technologies in the writing process
The authors confirm that there was no use of AI-assisted technology for assisting in the writing of the manuscript and no images were manipulated using AI.
