Distribution of concentration in hybrid computation and finite element method in assessment of the flow properties in the two-phase contactor
* Corresponding author: E-mail address: happy20242024@sina.com (F. Wang)
Abstract
This research paper explores the application of several machine learning (ML) models for simulating ozone content in a hybrid membrane ozonation (OZ) process. The data for ML computations were extracted from computational fluid dynamics (CFD) analysis of the membrane system. The dataset comprises over 10,000 data points derived from CFD simulation for the feed side of a two-phase membrane contactor. The ML models employed in this research include Multi-Layer Perceptron (MLP), Gated Recurrent Unit (GRU), Decision Tree Regression (DTR), and Huber Regression (HBR). To optimize the performance of these models, Bayesian Hyper-parameter Optimization (BHO) was adopted for hyper-parameter tuning. The results indicate that MLP achieved an R2 of 0.99523, root mean square error (RMSE) of 0.07995, and maximum error of 0.45358, revealing strong accuracy and consistency. GRU attained the highest R2 of 0.99611 with RMSE of 0.07282 but with slightly higher variability. DTR performed well with an R2 of 0.99305 and RMSE of 0.09738, while HBR, with an R2 of 0.85574 and RMSE of 0.39933, demonstrated weaker performance. These findings revealed MLP as a robust model for predicting ozone concentration in a hybrid membrane-OZ process to improve the efficiency of the process by uniform distribution of ozone throughout the feed solution, which consequently improves the degradation/separation of water pollutants.
Keywords
Machine learning
Membranes
Model
Numerical simulation
Removal

1. Introduction
Combination of ozonation (OZ) with other separation/contactor processes can be beneficial to enhance the efficiency of the OZ process in water and wastewater treatment. Given that the membrane contactor systems provide a high surface area for separation, they are considered as great candidates for separation and combination with OZ process for water/wastewater treatment [1-3]. This novel hybrid process has been successfully implemented for degradation of organic compounds from solution, and great results have been reported for the process [3,4], which opens new avenues for research in this field to enhance its efficiency and possibility for commercialization.
The new process of membrane-OZ, which is a hybrid process, can be analyzed via theoretical and experimental methods, and some results have been obtained. Computational fluid dynamics (CFD) has been reported to be a reliable method for simulation and understanding separation in membranes, by which the concentration distribution of all compounds can be tracked in the process by solution of mass transfer model [5-8]. The method based on CFD was developed by Cao and Ghadiri [1], where they utilized the CFD method to solve mass transfer equations in the process and predicted solute concentration profile in the membrane module unit. Indeed, they obtained the concentration for the ozone as solute, which plays an important role in the hybrid OZ-based process [1]. One of the key parameters in this process, as predicted by CFD simulation, is ozone distribution. If ozone, as an oxidizing agent, is distributed well, one can expect high separation efficiency of the process for removal of organic compounds. By CFD modeling, the effect of other parameters such as membrane properties on the separation efficiency can be investigated. Other work was conducted on the simulation of OZ-membrane process via machine learning (ML) models, which resulted in great accuracy for the modeling of the process [4]. For this type of modeling, some data are required. These can be collected using experimental evaluation of the process or by CFD simulation of the membrane-OZ unit.
Considering the complexity of implementing the CFD technique in simulation of OZ systems, hybrid modeling is preferred for understanding the membrane-OZ process. Therefore, availing of the benefits of both modeling approaches, i.e., CFD and ML, would be a new idea in this area to understand and improve the OZ-membrane process for water/wastewater treatment applications. Moreover, the generality of models and rigorous framework for preprocessing and optimization remains a challenge for hybrid modeling membrane-OZ process via two-phase contactors. In this work, an attempt was made to address the challenges for simulation of OZ-membrane process by combination of ML and CFD model to develop a hybrid model of process. Traditional analytical approaches for concentration estimation often rely on complex mathematical models and physical measurements, which can be time-consuming and expensive. In the past few years, ML models have emerged as reliable methods for process analysis, offering the potential for more efficient and accurate estimation of process behavior [9,10]. Therefore, ML models can be used as versatile models in combination with CFD for modeling the OZ-membrane process.
For the first time, CFD simulation data of OZ-membrane process is combined with four advanced ML models in this study, including Multi-Layer Perceptron (MLP), Gated Recurrent Unit (GRU), Decision Tree Regression (DTR), and Huber Regression (HBR). The MLP model is a type of artificial neural network (ANN) known for its ability to find challenging relationships between variables [11]. The DTR model uses a hierarchical structure of decision trees to make predictions, making it suitable for capturing nonlinear relationships [12]. The HBR model, on the other hand, is a robust regression method that can handle outliers and data with varying levels of noise [13]. The GRU is also a neural network architecture known for its efficient gating mechanisms, enabling it to capture complex relationships and dependencies in regression tasks [14].
Bayesian Hyper-parameter Optimization (BHO) is being applied for hyper-parameter tuning to maximize model performance. As a sophisticated method, BHO efficiently searches the hyper-parameter space to identify ideal configurations for the models. We seek to improve the predictive accuracy and capabilities of the models by adjusting the hyper-parameters [15,16].
In this work, the challenge of accurately simulating the hybrid membrane-OZ process for water and wastewater treatment is addressed by combining CFD with ML. A CFD simulation was first conducted to generate the concentration dataset used in the subsequent ML step. The research question centers on how this hybrid modeling approach can improve predictions of ozone concentration distributions within a membrane system. Unlike traditional analytical methods, this study introduces a framework that integrates CFD-derived data with four ML models (MLP, GRU, DTR, and HBR), with hyper-parameters tuned through BHO, to enhance predictive accuracy. This is the first time these specific models have been applied for this purpose. The results show that combining the two approaches can substantially improve the accuracy of concentration prediction in the ozone membrane system. The importance of this study lies in its potential to streamline the design and analysis of water treatment systems, offering a robust method that is both efficient and scalable for practical applications.
2. Materials and Methods
Data used in this modeling work were obtained from a CFD simulation of a hybrid membrane-OZ process. The studied OZ-membrane system is a hollow-fiber contactor used for injecting ozone into the process. The meshed geometry of the process is shown in Figure 1. As seen, the system comprises three sections: feed, membrane, and shell. The left compartment is the feed side, where the ozone concentration is obtained [4,17].

Figure 1. Meshed domain of the OZ-membrane system in water treatment.
The CFD simulations were performed in COMSOL Multiphysics (v. 3.5a), following the methodology detailed and validated in [1]. This package solves the governing equations by the finite element method, which was applied here to the OZ-membrane process. An adaptive solver was used, which also determined the optimum number of mesh elements for the numerical solution. After simulation of the process, the data were extracted for the ML step: the inputs are the coordinates r and z, and the only response is the ozone concentration in the feed compartment, denoted by C. The dataset consists of more than 10,000 data points, each containing three values: C (the target variable) and r and z (the input variables) [17]. The same data-generation procedure was used previously [17]; in this study, different ML models were employed to compare their accuracy in simulating the membrane-OZ process.
Cook’s distance analysis was utilized to detect and remove outliers. Cook’s Distance is a prevalent approach employed to detect influential observations that possess the potential to exert a disproportionate influence on the regression model [18]. By calculating Cook’s Distance for each data point, we were able to assess the impact of individual observations on the model’s parameters and predictions [19]. Outliers with high Cook’s Distance values indicate a significant influence on the model, and their removal helps ensure the robustness and accuracy of the subsequent analyses. After identifying the outliers, we applied appropriate data-cleaning techniques to remove them from the dataset, ensuring that our models were trained on a more reliable and representative set of data [20].
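The outlier screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it fits an ordinary least-squares model of C on normalized (r, z), uses synthetic data with one injected outlier, and applies the common 4/n rule of thumb as a cutoff, which the source does not specify.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each row of an ordinary least-squares fit of y on X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])       # design matrix with intercept
    n, p = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)    # OLS coefficients
    resid = y - A @ beta
    # Leverage h_ii: diagonal of the hat matrix A (A^T A)^-1 A^T
    h = np.einsum("ij,jk,ik->i", A, np.linalg.pinv(A.T @ A), A)
    s2 = resid @ resid / (n - p)                    # residual variance estimate
    return resid**2 * h / (p * s2 * (1.0 - h) ** 2)

# Synthetic stand-in for the (r, z) -> C data; values are illustrative only.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))            # normalized r and z
c = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(0.0, 0.05, 200)
c[17] += 5.0                                        # inject one gross outlier

d = cooks_distance(X, c)
flagged = np.flatnonzero(d > 4.0 / len(c))          # assumed 4/n cutoff
X_clean = np.delete(X, flagged, axis=0)
c_clean = np.delete(c, flagged)
```

The cutoff is a judgment call; a stricter or looser threshold changes how many points are removed before training.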
Table 1 provides an overview of the dataset statistics for the parameters C(mol/m3), r(m), and z(m). Also, Figure 2 displays the histogram representing the distribution of the output.
| Parameter | Minimum | Maximum | Mean | Standard deviation |
|---|---|---|---|---|
| C(mol/m3) | 0.0 | 5.557697 | 1.267909 | 1.193625 |
| r(m) | 0.0 | 0.001300 | 0.000680 | 0.000378 |
| z(m) | 0.0 | 0.006667 | 0.003396 | 0.002032 |

Figure 2. Histogram of the ozone concentration, C.
2.1. Modeling of process
In this study, the modeling process began with data preprocessing steps, which included outlier detection using Cook’s Distance and normalization to ensure consistent scaling of features. After preprocessing, the data was divided into training and testing sets with an 80-20 split, allowing 80% of the data for training the model. Four ML models of MLP, GRU, DTR, and HBR were used for predictive analysis. To enhance model robustness, K-fold cross-validation with K=5 was applied, whereby the training set was split into five folds, and each model was trained and validated on different fold combinations. This approach ensured that each data point contributed to the model evaluation, providing a more reliable performance assessment.
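The splitting scheme above can be sketched in plain Python. The min-max scaler is an assumption (the source says only that the features were normalized), and the function names are hypothetical.

```python
import random

def min_max_scale(column):
    """Rescale a list of numbers to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]

def train_test_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and hold out a test fraction."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_fold(indices, k=5):
    """Yield (train, validation) index lists; each index is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val

train_idx, test_idx = train_test_indices(1000)      # 80-20 split
splits = list(k_fold(train_idx, k=5))               # 5-fold CV on the training set
```

Each model would be fitted on `train` and scored on `val` for every fold, and the mean and standard deviation of the fold scores summarize how well it generalizes.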
2.2. CFD modeling of OZ-membrane process
For the CFD simulation of the OZ-membrane process, the whole domain was divided into three subdomains, as illustrated in Figure 1, and the mass transfer equation was then solved in all subdomains to obtain the concentration distribution of ozone in the contactor unit. A laminar flow regime was considered. Eq. (1) is the main mass transfer equation, which was solved for ozone in the membrane contactor [1,4]:
$$\frac{\partial C}{\partial t} + \nabla \cdot \mathbf{N} = R \tag{1}$$
where C is the ozone concentration (mol/m3), N is the mass transfer flux including diffusion and convection, R is the chemical reaction term, and t is time.
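Since the flux N combines diffusion and convection, it is commonly written as the sum of a Fickian term and an advective term. The form below is a sketch of that standard decomposition; the symbols D (ozone diffusion coefficient) and u (velocity field) are assumptions, as the source does not name them.

```latex
\mathbf{N} = -D\,\nabla C + C\,\mathbf{u}
```

Setting u = 0 in this expression recovers pure diffusion.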
2.3. BHO
One effective method used to adjust the hyper-parameters of ML models for enhancing their performance and generalizing capacity is BHO. It uses Bayesian optimization, a probabilistic method combining observed data with prior knowledge to effectively investigate the hyper-parameter space and identify the best design [21,22].
The process of BHO involves constructing a surrogate model, also known as a probabilistic surrogate or response surface, that approximates the performance metric as a function of the hyper-parameters. This surrogate model is typically a Gaussian process (GP), used for being flexible and interpretable. The GP captures the uncertainty of the model’s performance across the hyper-parameter space, allowing for efficient exploration and exploitation.
BHO starts with a small number of initial samples to construct the surrogate model. An acquisition function is then formulated to balance exploration and exploitation. The acquisition function directs the search for optimal hyper-parameters by identifying the most promising configurations to evaluate next. Useful acquisition functions include:
Expected Improvement (EI), which selects points likely to improve upon the current best result:
$$\mathrm{EI}(x) = \mathbb{E}\left[\max\left(f(x) - f(x^{+}),\, 0\right)\right]$$
where $f(x^{+})$ is the best observed value so far.
Upper Confidence Bound (UCB), which balances exploration and exploitation by incorporating model uncertainty:
$$\mathrm{UCB}(x) = \mu(x) + \kappa\,\sigma(x)$$
where $\mu(x)$ is the surrogate's predicted mean at x, $\kappa$ is a parameter controlling the exploration-exploitation trade-off, and $\sigma(x)$ is the standard deviation (uncertainty) at x [16].
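For a Gaussian surrogate, both acquisition functions can be evaluated directly from the posterior mean and standard deviation at a candidate point. The sketch below assumes a maximization problem (for example, maximizing validation R2) and uses the standard closed form of EI for a normal posterior; the candidate values and the default kappa are made up for illustration.

```python
import math

def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def normal_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization under a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)                  # no uncertainty left
    u = (mu - best) / sigma
    return (mu - best) * normal_cdf(u) + sigma * normal_pdf(u)

def upper_confidence_bound(mu, sigma, kappa=2.0):
    """Posterior mean plus kappa standard deviations."""
    return mu + kappa * sigma

# Hypothetical surrogate predictions (mean, std) for two configurations.
candidates = {"A": (0.90, 0.01), "B": (0.85, 0.10)}
best_so_far = 0.92

next_by_ei = max(candidates,
                 key=lambda k: expected_improvement(*candidates[k], best_so_far))
next_by_ucb = max(candidates,
                  key=lambda k: upper_confidence_bound(*candidates[k]))
```

Here the more uncertain configuration wins under both criteria even though its predicted mean is lower, which is the exploration behavior the acquisition function is meant to provide.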
2.4. DTR
DTR is a non-parametric method that constructs a hierarchical structure of decision trees to facilitate predictions. Within the tree structure, every internal node signifies a feature or attribute, while each leaf node symbolizes a predicted value [23].
The decision tree structure is constructed through a recursive process known as recursive binary splitting. The goal is to partition the data into subsets that are as homogeneous as possible in terms of the target variable (compound concentration). During each step of the algorithm, the best feature and its corresponding threshold are chosen to minimize the impurity or error within each subset [23].
In DTR, the primary objective of each split is to reduce the error in the target variable within each node. This error is typically measured using the Mean Squared Error (MSE), which quantifies the variance in the target values. The formula for calculating MSE in a node is:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2$$
where $y_i$ represents the actual target values, $\bar{y}$ stands for the predicted mean of the target values in the node, and n indicates the number of samples in that node.
To find the optimal split, the DTR algorithm evaluates all possible thresholds for each feature, aiming to minimize the total MSE of the resulting child nodes. The best split is the one that produces the smallest combined MSE for the left and right nodes after the split. Mathematically, the optimal threshold is determined as:
$$\theta^{*} = \arg\min_{\theta}\left[\mathrm{MSE}_{\mathrm{left}}(\theta) + \mathrm{MSE}_{\mathrm{right}}(\theta)\right]$$
where $\mathrm{MSE}_{\mathrm{left}}$ and $\mathrm{MSE}_{\mathrm{right}}$ represent the mean squared errors for the left and right nodes created by the split. This process allows the decision tree to learn complex patterns by sequentially dividing the data into increasingly homogeneous subsets, leading to accurate predictions [24].
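The threshold search for a single feature can be sketched as an exhaustive scan over midpoints between sorted values. This follows the "combined MSE" criterion as worded in the text; weighting each child's MSE by its sample count is a common variant in library implementations. The data are made up.

```python
def node_mse(values):
    """Mean squared deviation of the targets from their node mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(x, y):
    """Scan candidate thresholds on one feature; return (threshold, combined MSE)."""
    pairs = sorted(zip(x, y))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                                # no threshold between equal values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        score = node_mse(left) + node_mse(right)    # combined MSE of the two children
        if score < best_score:
            best_threshold = 0.5 * (pairs[i - 1][0] + pairs[i][0])
            best_score = score
    return best_threshold, best_score

# Illustrative data: concentration jumps between r = 0.3 and r = 0.7.
r_values = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
c_values = [1.0, 1.1, 0.9, 3.0, 3.2, 2.8]
threshold, score = best_split(r_values, c_values)
```

A full tree repeats this search over every feature and then recurses into the two child nodes until a stopping rule such as a maximum depth is met.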
2.5. MLP
The MLP structure is a prevalent form of ANN extensively employed for regression tasks, such as predicting compound concentrations from input variables. An MLP is a feedforward neural network consisting of several layers of interconnected nodes, or neurons. It utilizes non-linear activation functions to represent intricate relationships between the input features and the output parameter.
An MLP's architecture typically comprises an input layer, one or more hidden layers, and an output layer that generates the result. Neurons in each layer collaborate to process and analyze the input data. The weights assigned to the connections between neurons determine the strength and direction of signal flow [25]. These weights are adjusted throughout the training process to optimize the model's performance. A schematic representation of the MLP is shown in Figure 3.

Figure 3. Schematic representation of the MLP.
Mathematically, the output of a neuron in an MLP can be calculated as follows [11]:
$$a_i = f\left(\sum_{j} w_{ij}\, x_j + b_i\right)$$
where $a_i$ stands for the activation of neuron i, the activation function is indicated by f, $w_{ij}$ stands for the connection weight linking neuron j from the preceding layer to neuron i, $x_j$ indicates the input from neuron j, and $b_i$ represents the bias term associated with neuron i [26].
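The neuron equation translates directly into a forward pass. The sketch below assumes one hidden layer with ReLU activation and a linear output; the layer sizes and weights are hypothetical, since the source does not report the tuned architecture.

```python
def dense_layer(inputs, weights, biases, activation):
    """Apply a_i = f(sum_j w_ij * x_j + b_i) for every neuron i in the layer."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def relu(v):
    return max(0.0, v)

def identity(v):
    return v

def mlp_forward(r, z, params):
    """Map the two inputs (r, z) to a single predicted concentration."""
    hidden = dense_layer([r, z], params["W1"], params["b1"], relu)
    return dense_layer(hidden, params["W2"], params["b2"], identity)[0]

# Hypothetical weights for a 2-2-1 network (not fitted values).
params = {
    "W1": [[1.0, -1.0], [0.5, 0.5]],
    "b1": [0.0, 0.1],
    "W2": [[2.0, 1.0]],
    "b2": [0.5],
}
c_hat = mlp_forward(0.2, 0.6, params)
```

Training adjusts the entries of `W1`, `b1`, `W2`, and `b2` to reduce the prediction error on the training set.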
2.6. HBR
HBR is a robust regression method commonly used to predict compound concentrations based on input variables. It is developed to minimize the influence of outliers and deviations from the assumed underlying distribution compared to traditional regression models. HBR combines the advantages of both least squares regression and absolute deviation regression, providing a robust and reliable estimation of compound concentrations [27,28].
The aim of HBR is to minimize a loss function that applies squared error loss to small residuals and absolute error loss to large residuals. This combination allows HBR to strike a balance between the benefits of the two loss functions and provides robustness against outliers. Mathematically, the HBR loss function can be expressed as follows [13]:
$$L_{\delta}(y, \hat{y}) = \begin{cases} \dfrac{1}{2}\left(y - \hat{y}\right)^2, & \left|y - \hat{y}\right| \le \delta \\ \delta\left(\left|y - \hat{y}\right| - \dfrac{1}{2}\delta\right), & \left|y - \hat{y}\right| > \delta \end{cases}$$
where $y$ represents the actual compound concentration, $\hat{y}$ represents the predicted value, and $\delta$ stands for a tuning parameter that determines the threshold for distinguishing between small and large residuals.
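The piecewise behavior of the loss is easy to see in code. This is a minimal sketch of the standard Huber loss with a hypothetical threshold of 1.0; it is quadratic inside the threshold and linear outside, so a single large residual contributes far less than it would under squared error.

```python
def huber_loss(y_true, y_pred, delta=1.0):
    """Huber loss for one sample: quadratic for small residuals, linear for large."""
    residual = abs(y_true - y_pred)
    if residual <= delta:
        return 0.5 * residual * residual
    return delta * (residual - 0.5 * delta)

def mean_huber_loss(y_true, y_pred, delta=1.0):
    """Average Huber loss over paired lists of targets and predictions."""
    losses = [huber_loss(t, p, delta) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)

small = huber_loss(1.0, 1.5)        # residual 0.5, quadratic branch
large = huber_loss(0.0, 3.0)        # residual 3.0, linear branch
squared = 0.5 * 3.0 ** 2            # same residual under squared error
```

The two branches meet at the threshold with the same value and slope, which keeps the loss smooth for gradient-based fitting.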
2.7. GRU
The GRU is a neural network architecture which, while originally designed for sequential data, can be effectively adapted for simple regression tasks. Its gating mechanisms, the update and reset gates, enable selective filtering of information, allowing the model to capture complex non-linear relationships between input features and outputs. This makes GRUs applicable even when the data is not sequential, such as in spatial or feature-based regression tasks [29]. By leveraging these gates, the GRU efficiently learns hierarchical patterns in the input space, making it a robust choice for regression tasks where the relationship between variables is intricate.
For simple regression, the GRU processes input features through its gated architecture, dynamically focusing on relevant data while ignoring noise. The output of the GRU layer is then passed through a fully connected layer with a linear activation function to predict the continuous target variable. This setup ensures that the GRU captures essential patterns in the input features and maps them to the desired output effectively [14]. While GRUs are often associated with time-series data, their ability to handle non-linear dependencies justifies their use in simple regression tasks, particularly when the data exhibits complex relationships. Additionally, GRUs are computationally efficient, requiring fewer parameters than Long Short-Term Memory (LSTM) networks, which makes them a practical alternative for resource-constrained scenarios.
Hyper-parameter selection is crucial for optimizing the performance of GRUs in regression tasks. Important hyper-parameters include the number of units (hidden size), which controls the dimensionality of the hidden state and determines the model's capacity to learn complex patterns; the number of layers, where additional GRU layers can capture deeper features but increase computational demands; and the dropout rate, typically ranging from 0.1 to 0.5, which helps prevent overfitting by randomly deactivating neurons during training. The learning rate is another critical factor, with smaller values giving more stable convergence and larger ones speeding up training at the cost of potential instability. The batch size determines the number of samples used for each gradient update, where smaller batches capture fine-grained patterns while larger batches improve stability. The choice of optimizer, such as Adam, RMSprop, or SGD, affects how effectively the model converges; Adam is a common choice for its adaptive learning rate. Finally, the activation function (ReLU, tanh, or sigmoid) affects how non-linear relationships are modeled; ReLU is usually chosen for efficiency.
By tuning these hyper-parameters, the GRU can be adapted to simple regression tasks. Outside their conventional domain of sequential data, GRUs remain a useful tool, largely because of their computational efficiency and adaptability.
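The gating mechanism and the linear output layer described in this section can be sketched as a single GRU update followed by a dense read-out. This is an illustration under stated assumptions: the source does not say how (r, z) was presented to the GRU, so the sketch treats each sample as a length-1 sequence with a zero initial state, and the weights are random, untrained values. References also differ on which side of the blend the update gate multiplies; one common convention is used here.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x, h, p):
    """One GRU update; p holds input weights W*, recurrent weights U*, biases b*."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])            # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])            # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * h_cand                            # gated blend

def gru_regress(r_coord, z_coord, p):
    """Length-1 sequence through the GRU, then a linear output layer."""
    x = np.array([r_coord, z_coord], dtype=float)
    h = gru_step(x, np.zeros_like(p["bz"]), p)
    return float(p["w_out"] @ h + p["b_out"])

def init_params(hidden=4, n_in=2, seed=0):
    """Random, untrained parameters for illustration only."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "zrh":
        p["W" + gate] = rng.normal(0.0, 0.5, (hidden, n_in))
        p["U" + gate] = rng.normal(0.0, 0.5, (hidden, hidden))
        p["b" + gate] = np.zeros(hidden)
    p["w_out"] = rng.normal(0.0, 0.5, hidden)
    p["b_out"] = 0.0
    return p

params = init_params()
c_hat = gru_regress(0.3, 0.7, params)
```

The `hidden` argument corresponds to the number of units discussed above; the dropout rate, learning rate, and batch size only come into play during training, which this sketch omits.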
3. Results and Discussion
After implementation of the developed ML models using Python 3.10 (sklearn, matplotlib, bejoor, and numpy packages), the performance of four distinct ML models, namely MLP, GRU, DTR, and HBR, was evaluated. The final models were optimized by hyper-parameter tuning using the BHO method to achieve better predictive accuracy. Table 2 presents a summary of the performance metrics of the four models, including the R2 score, the mean and standard deviation of the cross-validation (CV) R2, root-mean-square error (RMSE), and maximum error. Of the data points, 80% (exactly 7,916) were used for training and cross-validation, and the remaining 1,980 were used for testing.
| Model | R2 score | Mean CV R2 | CV standard dev | RMSE | Maximum error |
|---|---|---|---|---|---|
| MLP | 0.99523 | 0.99307 | 0.00871 | 0.07995 | 0.45358 |
| DTR | 0.99305 | 0.98628 | 0.02698 | 0.09738 | 0.42966 |
| HBR | 0.85574 | 0.85199 | 0.05593 | 0.39933 | 1.75696 |
| GRU | 0.99611 | 0.99228 | 0.00948 | 0.07282 | 0.45843 |
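The three test-set metrics reported in the table follow standard definitions and can be computed as in the sketch below. The example values are made up and unrelated to the reported results.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, and maximum absolute error for paired lists."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)           # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "max_error": max(abs(t - p) for t, p in zip(y_true, y_pred)),
    }

# Illustrative reference and predicted concentrations.
c_ref = [0.0, 1.0, 2.0, 3.0]
c_pred = [0.1, 0.9, 2.2, 2.9]
metrics = regression_metrics(c_ref, c_pred)
```

RMSE and maximum error carry the units of C (mol/m3), whereas R2 is dimensionless, so the first two should be read against the concentration range in Table 1.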
The MLP model achieved an R2 of 0.99523, second only to GRU among the four models (Table 2), signifying a strong correlation between the predicted and reference (CFD) concentrations. Its RMSE of 0.07995 indicates minimal discrepancies between the predicted and actual values. The maximum error for the MLP model was 0.45358, which represents the largest deviation from the actual concentration. Figure 4 displays the distribution of residuals generated by the MLP model. Most predicted data points exhibit residuals close to zero, and the frequency of residuals decreases as their absolute value increases. Figure 5 depicts the MLP prediction surface as a function of the two inputs in 3D.

Figure 4. Distribution of residuals using the MLP model.

Figure 5. The MLP prediction surface.
The DTR model demonstrated a slightly lower but still high R2 of 0.99305, indicating a good fit. The RMSE for the DTR model was 0.09738, a slightly higher level of error than that of the MLP model. The maximum error for the DTR model was 0.42966. Figure 6 displays the distribution of residuals generated by the DTR model. As with MLP, a large proportion of the predicted data points have residuals close to zero. Furthermore, Figure 7 depicts a three-dimensional plot of the DTR prediction surface as a function of the two input variables.

Figure 6. Distribution of residuals using the DTR model.

Figure 7. The DTR prediction surface for the concentration parameter.
Conversely, the HBR model exhibited a lower R2 of 0.85574, indicating a weaker correlation with the source (CFD) concentrations of the solute. The RMSE for the HBR model was 0.39933, a relatively higher level of error than that of the other models. Its maximum error, 1.75696, was the largest among the models. The distribution of residuals produced by the HBR model is depicted in Figure 8, which clearly indicates the superior precision of the other models. The HBR prediction surface is displayed as a function of the two inputs in the 3D plot of Figure 9.

Figure 8. Distribution of residuals using the HBR model.

Figure 9. The HBR prediction surface.
These results show that MLP outperformed both the DTR and HBR models in terms of predictive accuracy, as evidenced by its higher R2 score and lower RMSE. However, the DTR model also performed strongly, while the HBR model showed relatively weaker predictive capabilities.
The GRU model achieved the highest R2 of the four models, 0.99611, which reflects a strong fit for ozone concentrations. The RMSE of 0.07282 further confirms the accuracy of the model, indicating minimal average error in predictions, while the maximum error was 0.45843. Figure 10 illustrates the residual distribution for the GRU model, showing that most residuals are close to zero, with few outliers. Furthermore, the 3D prediction surface is illustrated in Figure 11.

Figure 10. Distribution of residuals using the GRU model.

Figure 11. The GRU prediction surface.
Evaluating the generalization performance of the models in Table 2 depends critically on the cross-validation (CV) results. The mean CV R2 values show how well the models generalize to unseen data. With a mean CV R2 of 0.99307, the MLP model generalized well. Additionally, its low CV standard deviation of 0.00871 highlights its consistency across folds, with minimal performance variance.
The DTR model followed with a strong mean CV R2 of 0.98628. However, its higher CV standard deviation of 0.02698 suggests that its predictions were somewhat more variable across folds. The HBR model performed the weakest, with a Mean CV R2 of 0.85199 and the highest CV standard deviation of 0.05593, indicating notable struggles with generalization and increased variability in performance.
The GRU model displayed competitive performance, with a mean CV R2 of 0.99228 and a CV standard deviation of 0.00948, slightly higher than that of MLP. While GRU achieved the highest R2 score of 0.99611 and the lowest RMSE of 0.07282, its higher CV standard deviation indicates slightly less consistency across folds compared to MLP.
Overall, the results highlight the effectiveness of ML in predicting ozone concentrations from the radial and axial coordinates. The GRU, MLP, and DTR models are suitable for accurate predictions, whereas the HBR model may require further refinement or alternative modeling approaches for improved performance. Because the CV standard deviation of the MLP model is slightly lower than that of GRU and its 3D prediction surface is smoother, the final analysis was performed using MLP. A comparison of actual and predicted values using MLP is illustrated in Figure 12. Figures 13 and 14 show the effect of each input parameter on the output. Based on these figures, C generally increases with both inputs [10,30]. Figure 15 is the contour plot of C obtained with this model. The ozone concentration changes in both directions, which could originate from mass transfer in the radial and axial directions by the convection-diffusion mechanism [4,10,31]. The results are in good agreement with previous studies of the membrane-OZ process using different ML models and ML combined with CFD [4,10,17]. The same trend of concentration change along both the radial and axial directions was observed previously [17].

Figure 12. Comparison of actual and predicted values using MLP as the best model.

Figure 13. Dependence of C on r at different z levels.

Figure 14. Dependence of C on z at different r levels.

Figure 15. Contour plot of the ozone concentration distribution in the feed side.
4. Conclusions
This study examined the use of four ML models, namely MLP, GRU, DTR, and HBR, to predict the concentration (C) of ozone in a hybrid OZ-membrane contactor system for water treatment. A CFD simulation was performed first to generate the data for training and testing the ML models.
Based on the evaluation of performance metrics, the GRU model demonstrated the highest accuracy with an R2 score of 0.99611, the lowest RMSE of 0.07282, and a maximum error of 0.45843, showcasing its strong predictive capability. The MLP model closely followed with an R2 score of 0.99523, an RMSE of 0.07995, and a maximum error of 0.45358, highlighting its robust correlation and consistent performance across folds.
The DTR model also exhibited excellent performance, achieving an R2 score of 0.99305 and an RMSE of 0.09738, although slightly lower than GRU and MLP. In contrast, the HBR model showed comparatively weaker performance, with an R2 score of 0.85574, an RMSE of 0.39933, and the highest maximum error of 1.75696.
Overall, the findings indicate that the GRU and MLP models are highly effective for predicting ozone concentration from the radial and axial coordinates. While GRU achieved slightly better accuracy, MLP was more consistent across folds, making it the most reliable model for this study.
CRediT authorship contribution statement
Fen Wang: Writing, Investigation, Methodology, Validation, Software. Caihong Li: Writing, Investigation, Formal analysis, Conceptualization.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data availability
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.
Declaration of Generative AI and AI-assisted technologies in the writing process
The authors confirm that there was no use of artificial intelligence (AI)-assisted technology for assisting in the writing or editing of the manuscript and no images were manipulated using AI.
References
- Numerical evaluation of the ozonation process in a hollow fibre membrane contactor. Process Safety and Environmental Protection. 2023;170:817-823. https://doi.org/10.1016/j.psep.2022.12.075
- Nano-ceramic membranes combined with ozonation for water treatment: Fundamentals and engineering applications. Journal of Hazardous Materials Advances. 2023;10:100279. https://doi.org/10.1016/j.hazadv.2023.100279
- Decolorization and control of bromate formation in membrane ozonation of humic-rich groundwater. Water Research. 2022;221:118739. https://doi.org/10.1016/j.watres.2022.118739
- Numerical simulation of ozonation in hollow-fiber membranes for wastewater treatment. Engineering Applications of Artificial Intelligence. 2023;123:106380. https://doi.org/10.1016/j.engappai.2023.106380
- CFD Simulation of He/CH4 Separation by Hyflon AD60X Membrane. Chemical and Biochemical Engineering Quarterly. 2022;35:355-367. https://doi.org/10.15255/cabeq.2021.1957
- Reviewing two-phase flow modeling in membrane processes through computational fluid dynamics. Chemical Engineering Research and Design. 2025;214:28-38. https://doi.org/10.1016/j.cherd.2024.12.018
- Computational fluid dynamics simulation of a membrane contactor for CO2 separation: Two types of membrane evaluation. Chemical Engineering Technology. 2023;46:2034-2045. https://doi.org/10.1002/ceat.202300102
- Theoretical investigations on the effect of absorbent type on carbon dioxide capture in hollow-fiber membrane contactors. PloS One. 2020;15:e0236367. https://doi.org/10.1371/journal.pone.0236367
- Optimization of ultrasonic-excited double-pipe heat exchanger with machine learning and PSO. International Communications in Heat and Mass Transfer. 2023;147:106985. https://doi.org/10.1016/j.icheatmasstransfer.2023.106985
- Combination of CFD and machine learning for improving simulation accuracy in water purification process via porous membranes. Journal of Molecular Liquids. 2023;386:122456. https://doi.org/10.1016/j.molliq.2023.122456
- Multi layer perceptron. Machine Learning Lab Special Lecture, University of Freiburg 2014:7-24. https://doi.org/10.1016/B978-0-12-409545-8.00007-8
- Decision trees. In: Data mining and knowledge discovery handbook. New York: Springer-Verlag; p. 165-192. https://doi.org/10.1007/0-387-25465-X_9
- A statistical learning assessment of Huber regression. Journal of Approximation Theory. 2022;273:105660. https://doi.org/10.1016/j.jat.2021.105660
- Gate-variants of gated recurrent unit (GRU) neural networks. in 2017 IEEE 60th international midwest symposium on circuits and systems (MWSCAS) IEEE 2017 https://doi.org/10.1109/MWSCAS.2017.8053243
- A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring. Expert Systems with Applications. 2017;78:225-241. https://doi.org/10.1016/j.eswa.2017.02.017
- Development of SVM-based machine learning model for estimating lornoxicam solubility in supercritical solvent. Case Studies in Thermal Engineering. 2023;49:103268. https://doi.org/10.1016/j.csite.2023.103268
- Artificial intelligence modeling and simulation of membrane-based separation of water pollutants via ozone process: Evaluation of separation. Thermal Science and Engineering Progress. 2024;51:102627. https://doi.org/10.1016/j.tsep.2024.102627
- Cook’s distance in linear longitudinal models. Communications in Statistics - Theory and Methods. 1998;27:2973-2983. https://doi.org/10.1080/03610929808832267
- Detection of influential observation in linear regression. Technometrics. 1977;19:15-18. https://doi.org/10.1080/00401706.1977.10489493
- A review of statistical outlier methods. Pharmaceutical Technology. 2006;30:82. https://doi.org/10.1016/j.dajour.2023.100164
- Initializing hyper-parameter tuning with a metaheuristic-ensemble method: A case study using time-series weather data. Evolutionary Intelligence. 2023;16:1019-1031. https://doi.org/10.1007/s12065-022-00717-y
- Practical bayesian optimization of machine learning algorithms. Advances in Neural Information Processing Systems. 2012;25:1-10. https://doi.org/10.48550/arXiv.1206.2944
- Data mining with decision trees. Theory and Applications. 2007;69:1-11. World scientific. https://doi.org/10.1142/9789812771728_0001
- Decision tree for predicting the mortality in hemodialysis patient with diabetes. Jurnal Minfo Polgan. 2023;12:346-356. https://doi.org/10.33395/jmp.v12i1.12412
- Machine learning basics. Deep Learning. 2016;1:1-11, 98-164. https://doi.org/10.1016/j.ipha.2024.11.003
- Multilayer Perceptron: Architecture Optimization and Training. International Journal of Interactive Multimedia and Artificial Intelligence. 2016;4:26. https://doi.org/10.9781/ijimai.2016.415
- A new principle for tuning-free huber regression. Statistica Sinica. 2021;10:1-10. https://doi.org/10.5705/ss.202019.0045
- Robust estimation of a location parameter. The Annals of Mathematical Statistics. 1964;35:73-101. https://doi.org/10.1214/aoms/1177703732
- Recurrent attention unit: A new gated recurrent unit for long-term memory of important parts in sequential data. Neurocomputing. 2023;517:1-9. https://doi.org/10.1016/j.neucom.2022.10.050
- Advanced modeling and intelligence-based evaluation of pharmaceutical nanoparticle preparation using green supercritical processing: Theoretical assessment of solubility. Case Studies in Thermal Engineering. 2023;48:103150. https://doi.org/10.1016/j.csite.2023.103150
- Separation of organic compound from water using membrane process: Hybrid machine learning-based modeling and validation. Case Studies in Thermal Engineering. 2023;51:103583. https://doi.org/10.1016/j.csite.2023.103583
