Application of deep learning and random forest algorithms in a machine learning-based well log analysis for a small data set of a sand zone

PETROLEUM EXPLORATION & PRODUCTION  
PETROVIETNAM JOURNAL  
Volume 6/2020, pp. 4 - 14  
ISSN 2615-9902  
Ruwantha Ratnayake1, Pham Huy Giao1,2  
1Asian Institute of Technology (AIT)  
2Vietnam Petroleum Institute (VPI)  
Email: hgiao@ait.asia/giaoph@vpi.pvn.vn  
Summary  
Artificial intelligence (AI) and machine learning (ML) have the potential to reshape the oil and gas exploration and production
landscape. Once viewed as a promising novelty, AI and ML are close to becoming mainstream for exploration and production
companies. Many researchers have applied intelligent analyses such as artificial neural networks (ANN), deep learning
(DL), fuzzy logic and genetic algorithms (GA) to well log interpretation; these methods are generally considered effective for large data sets. The random forest (RF)
algorithm has so far seen little application in well log analysis. In this research, a Python code was developed for DL and
RF analyses in well log interpretation. To highlight the advantages of RF-based well log analysis, the new code was applied to a small
data set over a 50 m depth zone consisting of clay and sand zones.
Porosity, permeability and water saturation of the reservoir zone were predicted by the RF analysis, compared with those obtained
by the DL analysis, and validated with core measurements. The RF analysis showed a significant improvement in both running
time and accuracy of the predicted well log answers compared with the DL analysis. It is therefore recommended that
more applications of RF-based well log analysis be carried out for clastic reservoirs in Vietnam in the future.
Key words: Machine learning (ML), Python, random forest (RF), well log analysis, sand reservoir.  
1. Introduction  
The RF algorithm is a supervised learning model: it uses labelled data to “learn” how to classify unlabelled data. It can be used to solve both regression and classification problems, which makes it a versatile model that is widely used by engineers [1].
RF is a classifier that evolves from decision trees. To classify a new instance, each decision tree provides a classification for the input data; RF collects the classifications and chooses the most voted prediction as the result. The input of each tree is data sampled from the original data set. In addition, a subset of features is randomly selected from the optional features to grow the tree at each node. Each tree is grown without pruning. Essentially, RF enables a large number of weak or weakly-correlated classifiers to form a strong classifier.
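The voting step described above can be illustrated with a minimal sketch. It assumes that class labels are plain strings and that each tree's prediction is already available; the function names `bootstrap_sample` and `majority_vote` and the example votes are hypothetical and are not taken from the code developed in this study.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (the sampled data fed to one tree)."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(votes):
    """Return the class label predicted by the largest number of trees."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical votes from five trees for one depth sample.
tree_votes = ["sand", "shale", "sand", "sand", "shale"]
forest_label = majority_vote(tree_votes)

# Each tree would be trained on its own resampled copy of the data.
rng = random.Random(0)
sample = bootstrap_sample(list(range(10)), rng)
```

Three of the five hypothetical trees vote for sand, so the forest returns that label; the resampled list has the same length as the original data but may repeat some instances and omit others.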
1.1. Earlier developments to RF  
Ho proposed a method to overcome a fundamental  
limitation on the complexity of decision tree classifiers  
derived with traditional methods [2]. Such classifiers  
cannot grow to arbitrary complexity without sacrificing  
the generalisation accuracy on unseen data. The proposed  
method uses oblique decision trees which are convenient  
for optimising training set accuracy. The essence of the  
method is to build multiple trees in randomly selected  
subspaces of the feature space. The trees generalise their  
classification in complementary ways, and their combined  
classification can be monotonically improved.  
The RF algorithm is composed of different decision  
trees, each with the same nodes, but using different data  
that leads to different leaves. It merges the decisions  
of multiple decision trees in order to find an answer,  
which represents the average of all these decision trees.  
Amit and Geman proposed a shape recognition approach based on the joint induction of shape features and tree classifiers [3]. Because of the virtually infinite number of features, they concluded that no classifier based on the full feature set could be evaluated, as it was impossible to determine a priori which features were informative. Due to the number and nature of the features, standard decision tree construction based on a fixed-length feature vector was not feasible. An alternative approach would be to entertain a small random sample of features at each node, constrain their complexity to increase with tree depth, and grow multiple trees. Terminal nodes contain estimates of the corresponding posterior distribution over shape classes. By sending the image down each tree and aggregating the resulting distributions, the image can be classified.
Date of receipt: 22/11/2019. Date of review and editing: 22/11 - 24/12/2019. Date of approval: 5/6/2020.
In another paper, Ho [4] proposed a method to resolve the dilemma between overfitting and achieving maximum accuracy. This was done by constructing a decision-tree-based classifier that maintained the highest accuracy on training data and, at the same time, improved on generalisation accuracy as it grew in complexity. The classifier consisted of multiple trees constructed systematically by pseudo-randomly selecting subsets of components of the feature vector, that is, trees constructed in randomly chosen subspaces. When empirically tested against publicly available data sets, the subspace method proved superior to single-tree classifiers and other forest construction methods. The next section introduces RF, which is an ensemble method that combines existing techniques in order to construct a collection of decision trees with controlled variation.
1.2. RF algorithm
RF is an ensemble learning method used for classification and regression. Developed by Breiman [5], the method combines Breiman’s bagging sampling approach [6] and the random selection of features, introduced independently by Ho [2, 4] and Amit and Geman [3], in order to construct a collection of decision trees with controlled variation. Using bagging, each decision tree in the ensemble is constructed using a sample with replacement from the training data. Statistically, the sample is likely to have about 63% of instances appearing at least once in the sample. Instances in the sample are referred to as in-bag instances, and the remaining instances (about 37%) are referred to as out-of-bag instances. Each tree in the ensemble acts as a base classifier to determine the class label of an unlabelled instance. This is done via majority voting, where each classifier casts one vote for its predicted class label and the class label with the most votes is used to classify the instance.
Decision tree: Figure 1 shows a schematic decision tree, a structure used in the decision making process. This structure starts with a root node, which then branches to another decision node, repeating this process until a leaf is reached. A node asks a question in order to help classify the data. A branch represents the different possibilities that this node could lead to.
Some of the basic terms related to decision trees are given below:
Root node: It represents the entire population or sample, which is further divided into two or more homogeneous sets.
Splitting: The process of dividing a node into two or more sub-nodes.
Decision node: A sub-node that splits into further sub-nodes.
Leaf node: A node which does not split, also called a terminal node.
Pruning: The removal of a sub-node of a decision node. It is the opposite of splitting.
Branch/Sub-tree: A subsection of an entire tree.
Parent and child node: A node which is divided into sub-nodes is called the parent node of those sub-nodes, whereas the sub-nodes are the children of the parent node.
Figure 1. Decision tree (root node, branches, decision nodes and leaf nodes).
Splitting decision trees: Breiman [5] introduced additional randomness during the construction of
decision trees using the classification and regression trees (CART) technique. Using this technique, the subset of features selected in each interior node is evaluated with the Gini index heuristic. The feature whose split gives the largest decrease in Gini index, i.e. the purest child nodes, is chosen as the split feature in that node. The Gini index was introduced to decision trees by Breiman et al. [7]; however, it was first proposed by the Italian statistician Corrado Gini in 1912. The index is a function that is used to measure the impurity of data, i.e. how uncertain the outcome of an event is. In classification, this event would be the determination of the class label [8]. The general form of the Gini index is shown below:

$$ Gini = 1 - \sum_{i=1}^{c} p_i^{2} \tag{1} $$

Where: Gini is the Gini index; $p_i$ is the probability of an object being classified to a particular class; c is the number of unique labels.
Breiman [5] showed that the RF error rate depends on correlation and strength. Increasing the correlation between any two trees in the RF increases the forest error rate. A tree with a low error rate is a strong classifier. Increasing the strength of the individual trees decreases the RF error rate. Such findings are consistent with a study by Bernard et al. [9], which showed that the error rate statistically decreases by jointly maximising the strength and minimising the correlation.
1.3. Advantages of RF
Key advantages of RF are robustness to noise and overfitting [5, 10]. Overfitting generally occurs when a model is constructed in such a way that it fits the data more than is warranted. A model which has been overfit will generally have poor predictive performance, as it does not generalise well. By generalisation we mean how well the model will make predictions for cases that are not in the training set. Hawkins pointed out that overfitting adds complexity to a model without any gain in performance or, even worse, leads to poorer performance [11]. A classifier that suffers from overfitting is likely to have a low error rate for the training instances (in-bag instances), and a higher error rate for the out-of-bag instances.
Other advantages of RF can be listed as follows:
- High versatility: Whether the task is regression or classification, RF is an applicable model. It can handle binary features, categorical features, and numerical features. There is very little pre-processing that needs to be done. The data does not need to be rescaled or transformed.
- Parallelisable: RF models are parallelisable, meaning that the process can be split across multiple machines. This results in faster computation time. Boosted models, in contrast, are sequential and would take longer to compute.
- Quick prediction/training speed: Each tree is faster to train than a conventional decision tree because only a subset of features is considered, so data sets with hundreds of features can be handled easily. Prediction is significantly faster than training because generated forests can be saved for future use.
- Handles unbalanced data: RF has methods for balancing error in data sets with unbalanced class populations. RF tries to minimise the overall error rate, so when we have an unbalanced data set, the larger class will get a low error rate while the smaller class will have a larger error rate.
- Low bias, moderate variance: Each decision tree has a high variance, but low bias. However, because we average all the trees in RF, we are averaging the variance as well, so that we have a low bias and moderate variance model [1].
1.4. General applications of RF algorithm
There are several sectors where RF can be applied, as listed below:
Banking sector: The banking sector serves a very large number of users, among whom are many loyal customers but also fraudulent ones. RF analysis can be used to determine whether a customer is loyal or fraudulent, and RF-based systems can identify fraudulent transactions from patterns in the series of transactions.
Medicines: Medicines need a complex combination of specific chemicals, and RF can be used to identify the right combination. With the help of machine learning algorithms, it has become easier to detect and predict the drug sensitivity of a medicine. RF also helps to identify a patient’s disease by analysing the patient’s medical record.
Stock market: Machine learning also plays a role in stock market analysis. The RF algorithm can be used to analyse the behaviour of the stock market, and it can show the expected loss or profit from purchasing a particular stock.
Applications of RF algorithm in oil & gas: Chen successfully applied ML methods to predict well productivity and to design hydraulic fracturing parameters in the Montney and Duvernay Formations. He found that ensemble models such as RF and ExBoost seem to outperform other types of ML methods (SVM, ANN), with a higher prediction accuracy [12].
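As a worked illustration of the Gini index in Equation (1) and of the bagging step in Section 1.2, the following sketch computes the index for small label lists and the expected share of in-bag instances in a bootstrap sample. The helper names (`gini_index`, `split_gini`, `in_bag_fraction`) and the label lists are assumptions made for this example only.

```python
import math
from collections import Counter


def gini_index(labels):
    """Gini = 1 - sum(p_i ** 2) over the c unique class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini index of the two child nodes produced by a split."""
    n = len(left) + len(right)
    return len(left) / n * gini_index(left) + len(right) / n * gini_index(right)


def in_bag_fraction(n):
    """Expected share of distinct instances in a bootstrap sample of size n."""
    return 1.0 - (1.0 - 1.0 / n) ** n


pure = gini_index(["sand"] * 4)                             # single class
mixed = gini_index(["sand", "sand", "shale", "shale"])      # two equal classes
perfect = split_gini(["sand", "sand"], ["shale", "shale"])  # split separates the classes

frac = in_bag_fraction(1000)
limit = 1.0 - math.exp(-1.0)   # large-sample limit, about 0.632
```

A node holding one class has an index of 0, an even two-class mix has 0.5, and a split that separates the two classes brings the weighted index of the children back to 0, which is why such a split would be preferred. The in-bag share approaches 1 - 1/e for large samples.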
1.5. Grid search method in ML
Grid search is the process of performing hyper parameter tuning in order to determine the optimum values for a given model. This is significant as the performance of the entire model depends on the hyper parameter values specified. ‘GridSearchCV’ in the sklearn library of Python is a method which calculates a score (here the R2 score) for each hyper parameter combination and also records the fitting and scoring times. The combination which has the highest R2 score is selected as the optimum combination. The R2 score is calculated by Equations (2 - 4).

$$ SST = \sum_{i} (y_i - \bar{y})^{2} \tag{2} $$

$$ SSR = \sum_{i} (\hat{y}_i - \bar{y})^{2} \tag{3} $$

$$ R^{2} = \frac{SSR}{SST} \tag{4} $$

Where: SST is the total variation in the data (total sum of squares), SSR is the regression sum of squares, $y_i$ is the y value for observation i, $\bar{y}$ is the mean of the y values, $\hat{y}_i$ is the predicted value of y for observation i, and R2 is the coefficient of determination.
1.6. Transfer functions
A transfer function maps the weighted sum of a neuron's inputs to its output in the hidden layers and the output layer. The transfer function is chosen to satisfy some specification of the problem that the neural network is attempting to solve. Common transfer functions are listed in Table 1.
2. Methodology
The general workflow adopted in this study is shown in Figure 2.
2.1. Data collection and preparing the training input data
The published data set used in this study is taken from Darling [13] for a clastic reservoir located from 616 to 675 m deep. The well log data consist of gamma ray (GR), deep resistivity (LLD), sonic (DT), density (RHOB), and neutron porosity (NPHI). The part of the well data from 616 to 631 m was used as training data, while the part from 631 to 675 m was used for prediction. The target effective porosity (Φe) was calculated from the density porosity (ΦD) and neutron porosity (ΦN) as given in Equations (5) and (6) [14].
Table 1. Common transfer functions used in neural networks (ANN and DL analyses)
Activation function | Equation | Derivative
Linear | $f(x) = x$ | $f'(x) = 1$
Unit step (Heaviside function) | $f(x) = 0$ for $x < 0$; $0.5$ for $x = 0$; $1$ for $x > 0$ | $f'(x) = 0$
Sign (Signum) | $f(x) = -1$ for $x < 0$; $0$ for $x = 0$; $1$ for $x > 0$ | $f'(x) = 0$
Logistic (Sigmoid) | $f(x) = \dfrac{1}{1 + e^{-x}}$ | $f'(x) = f(x)(1 - f(x))$
Hyperbolic tangent (tanh) | $f(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$ | $f'(x) = 1 - f(x)^{2}$
ReLu | $f(x) = 0$ for $x < 0$; $x$ for $x > 0$ | $f'(x) = 0$ for $x < 0$; $1$ for $x > 0$
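The transfer functions of Table 1 translate directly into code. The sketch below is illustrative only: the function names and the single-neuron helper `neuron_output` are assumptions and are not part of the DL module of this study.

```python
import math


def linear(x):
    return x


def unit_step(x):
    return 0.0 if x < 0 else (0.5 if x == 0 else 1.0)


def sign(x):
    return -1.0 if x < 0 else (0.0 if x == 0 else 1.0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_derivative(x):
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    return max(0.0, x)


def relu_derivative(x):
    return 1.0 if x > 0 else 0.0


def neuron_output(inputs, weights, bias, transfer):
    """Weighted sum of the inputs plus bias, passed through a transfer function."""
    z = sum(w * v for w, v in zip(weights, inputs)) + bias
    return transfer(z)


# Hypothetical neuron with two inputs and a ReLu transfer function.
out = neuron_output([1.0, 2.0], [0.5, 0.25], 0.0, relu)
```

Passing the transfer function as an argument keeps the weighted-sum step separate from the choice of activation, which is the quantity varied during hyper parameter tuning.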
Figure 2. Workflow of the study: start; well log data collection [13]; WL interpretation and preparation of training data; development of the DL module and the RF module using Python; running the machine learning analyses (DL & RF) and comparing the results.
Table 2. Well log-calculated and core petrophysical parameters  
Porosity  
Calculated  
Permeability (mD)  
Core Calculated  
Water saturation  
Depth (m)  
Core  
0.02  
0.02  
0.1105  
0.01  
0.095  
0.156  
0.15  
0.075  
0.105  
0.06  
0.179  
0.156  
Calculated  
1.0  
620.116  
622.097  
624.078  
626.059  
628.040  
630.022  
632.003  
634.136  
636.118  
638.099  
640.080  
642.061  
0.03  
0.022  
0.11  
0.014  
0.091  
0.16  
0.13  
0.1  
0.08  
0.04  
0.16  
0.155  
0.01  
0.02  
22  
0.11  
0.05  
10.1  
1.0  
0.042  
0.74  
0.036  
0.018  
0.022  
0.035  
0.04  
0.03  
10.5  
135.6  
120  
11  
0.029  
7.12  
201.12  
68.45  
8.27  
15.3  
0.8  
5.42  
0.16  
1.0  
0.025  
0.046  
350  
130  
482.71  
218.9  
$$ \Phi_D = \frac{\rho_m - \rho}{\rho_m - \rho_f} \tag{5} $$

$$ \Phi_e = \frac{\Phi_D + \Phi_N}{2} \tag{6} $$

Where: ρm is the matrix density (g/cc), equal to 2.65 g/cc in this case for sandstone; ρ is the bulk density (g/cc); ρf is the fluid density (g/cc); ΦD is the density porosity; ΦN is the neutron porosity; and Φe is the effective porosity.
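Equations (5) and (6) can be written as two small functions. This is a sketch under stated assumptions: the density porosity is taken as (ρm - ρ)/(ρm - ρf), the effective porosity as the arithmetic mean of the density and neutron porosities, and the fluid density default of 1.0 g/cc is an assumed value because the source does not state the one used. The function names are hypothetical. The second call uses the density and neutron porosities listed for the 639-655 m sand zone in Table 7.

```python
def density_porosity(rho_bulk, rho_matrix=2.65, rho_fluid=1.0):
    """Density porosity from bulk density (all densities in g/cc)."""
    return (rho_matrix - rho_bulk) / (rho_matrix - rho_fluid)


def effective_porosity(phi_density, phi_neutron):
    """Effective porosity taken as the mean of density and neutron porosity."""
    return (phi_density + phi_neutron) / 2.0


# Illustrative bulk density reading; rho_fluid = 1.0 g/cc is an assumed value.
phi_d = density_porosity(2.40)

# Density and neutron porosity of the 639-655 m sand zone as listed in Table 7.
phi_e_zone = effective_porosity(0.15, 0.09)
```

With the Table 7 inputs the mean is 0.12, which agrees with the effective porosity tabulated for that zone.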
The water saturation of the training data set was calculated using the Simandoux method [17], as shown in Equation (7). The Simandoux equation was used because the zone of analysis includes shaly sand intervals.

$$ S_w = \frac{0.4 \times R_w}{\Phi^{m}} \times \left[ \sqrt{ \frac{5 \times \Phi^{m}}{R_w \times R_t} + \left( \frac{V_{sh}}{R_{sh}} \right)^{2} } - \frac{V_{sh}}{R_{sh}} \right] \tag{7} $$

Where: Sw is the water saturation, Φ is the effective porosity, Rw is the resistivity of water, Vsh is the shale volume, Rt is the formation resistivity, Rsh is the resistivity of shale, a is an empirical constant (a value of 0.8 gives the factors 0.4 and 5) and m is the cementation exponent.
To determine permeability, a poro-perm relationship was developed based on Equation (8) using the core data.

$$ k = 10^{(k_a + k_b \times \Phi_e)} \tag{8} $$

Using core porosity and permeability values, ka and kb were determined as -2 and 28.04, respectively. The core measurements for this case study are presented in Table 2. The calculated petrophysical parameters mentioned above are plotted against the core values in Figure 3, which shows a very good match.
Figure 3. Calculated petrophysical parameters vs core values.
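The permeability and water saturation relations can be sketched as follows. The permeability function uses the ka and kb values reported above. The saturation function is an assumption: it solves the commonly quoted quadratic form of the Simandoux relation, 1/Rt = Φ^m/(a Rw) Sw^2 + (Vsh/Rsh) Sw, with a saturation exponent of 2 and defaults a = 0.8 and m = 2, which may differ in detail from the form used in this study. All input values and function names are hypothetical.

```python
import math


def permeability_md(phi_e, k_a=-2.0, k_b=28.04):
    """Equation (8): k = 10 ** (k_a + k_b * phi_e), in mD."""
    return 10.0 ** (k_a + k_b * phi_e)


def simandoux_sw(phi, rt, rw, vsh, rsh, a=0.8, m=2.0):
    """Positive root of 1/Rt = phi**m / (a*Rw) * Sw**2 + Vsh/Rsh * Sw."""
    quad = phi ** m / (a * rw)      # coefficient of Sw**2
    lin = vsh / rsh                 # coefficient of Sw
    sw = (-lin + math.sqrt(lin ** 2 + 4.0 * quad / rt)) / (2.0 * quad)
    return min(1.0, max(0.0, sw))   # keep the result in the physical range


k = permeability_md(0.10)

# Hypothetical readings: a clean sand and the same sand with 30% shale.
sw_clean = simandoux_sw(phi=0.20, rt=20.0, rw=0.05, vsh=0.0, rsh=5.0)
sw_shaly = simandoux_sw(phi=0.20, rt=20.0, rw=0.05, vsh=0.3, rsh=5.0)
```

With no shale the expression reduces to the Archie result, and adding shale conductivity lowers the computed water saturation for the same resistivity reading, which is the reason a shaly sand model is preferred for such intervals.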
2.2. Developing the DL module
DL can be usefully applied in well log analysis, as indicated by Giao and Sandunil [15]. Figure 4 shows the architecture of the DL network used in this study, which has three main types of layers, namely the input layer, hidden layer(s) and output layer. The input values of each hidden layer are multiplied by their weights and the summation is passed to the transfer function assigned to each neuron. A DL network is trained using training examples. The grid search method, an inbuilt function of the sklearn library, was used to find the optimum hyper parameters for this data set. A total of 960 combinations were tested by varying the number of hidden layers from 1 to 50, the number of neurons from 5 to 100, the learning rate from 0.0001 to 0.1, and the transfer function among linear, unit step, sign, Sigmoid, tanh and ReLu. Table 3 shows the combination of hyper parameters with the highest score, which was used in the DL code.
2.3. Developing the RF module
The RF module was developed with multiple decision trees and was also coded in the Python programming language, as explained in Section 2.4. The flowchart of the Python code is shown in Figure 5. Normally in RF, the accuracy of the predicted results changes with the number of decision trees used. Therefore, in this case a trial and error method was used to find the optimum number of decision trees giving the most accurate results.
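The trial-and-error search for the number of trees can be sketched without any external library. The code below is a toy random forest regressor written for illustration (small regression trees, one randomly chosen feature per node, bagging, averaged predictions); it is not the sklearn-based module of this study, and the synthetic data and all function names are assumptions.

```python
import random


def r2(y_true, y_pred):
    """Coefficient of determination in the residual form 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


def build_tree(rows, targets, depth, rng):
    """Grow a small regression tree; each node splits on one randomly chosen feature."""
    mean = sum(targets) / len(targets)
    if depth == 0 or len(rows) < 4:
        return mean                                   # leaf: predict the mean target
    f = rng.randrange(len(rows[0]))                   # random feature selection
    best = None
    for t in sorted({r[f] for r in rows})[1:]:        # candidate thresholds
        left = [i for i, r in enumerate(rows) if r[f] < t]
        right = [i for i, r in enumerate(rows) if r[f] >= t]
        sse = 0.0
        for idx in (left, right):
            m = sum(targets[i] for i in idx) / len(idx)
            sse += sum((targets[i] - m) ** 2 for i in idx)
        if best is None or sse < best[0]:
            best = (sse, t, left, right)
    if best is None:                                  # feature is constant in this node
        return mean
    _, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [targets[i] for i in left], depth - 1, rng),
            build_tree([rows[i] for i in right], [targets[i] for i in right], depth - 1, rng))


def predict_tree(node, row):
    while isinstance(node, tuple):                    # internal nodes are tuples
        f, t, left, right = node
        node = left if row[f] < t else right
    return node


def fit_forest(rows, targets, n_trees, rng, depth=4):
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]    # bootstrap sample (bagging)
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], depth, rng))
    return forest


def predict_forest(forest, row):
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)


rng = random.Random(42)
# Synthetic stand-in for log data: two features, target depends mostly on the first.
X = [(rng.random(), rng.random()) for _ in range(120)]
y = [0.3 * a + 0.05 * b for a, b in X]
X_train, y_train, X_val, y_val = X[:90], y[:90], X[90:], y[90:]

scores = {}
for n_trees in range(1, 11):                          # 1 to 10 trees, as in this study
    forest = fit_forest(X_train, y_train, n_trees, rng)
    preds = [predict_forest(forest, row) for row in X_val]
    scores[n_trees] = r2(y_val, preds)
best_n = max(scores, key=scores.get)
```

The loop keeps the R2 score of every candidate so that the effect of the number of trees can be inspected, and `best_n` is simply the candidate with the highest validation score.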
2.4. Coding and running the DL and RF modules in Python language
The Python programming language was used to develop both modules because it is easy to learn and offers a vast number of machine learning libraries. A number of standard libraries were used in developing the codes, as shown in Table 4.
An illustration of the Python codes for the DL and RF analyses is shown in Tables 5 & 6. The Python distribution used in this study was Anaconda, a free and open-source distribution of the Python programming language for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.) that aims to simplify package management and deployment. The code was written in Jupyter Notebook [16], an open-source web application that allows users to create and share documents that contain live code, equations, visualisations and narrative text.
Table 3. The best combination of hyper parameters
Hyper parameter | Result
Gridsearch R2 score | 0.952
Number of hidden layers | 50
Neurons per hidden layer | 100
Learning rate | 0.0001
Number of iterations | 1000
Activation function | ReLu
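The exhaustive search that produces a result such as Table 3 can be sketched in plain Python. This is a minimal illustration of the grid search idea, not the sklearn `GridSearchCV` routine used in this study (which additionally cross-validates each combination). The grid values are placeholders rather than the 960 combinations actually tested, and `toy_score` is a hypothetical stand-in for training the network and returning its R2 score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid and return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid: the value lists are placeholders chosen for the example.
param_grid = {
    "hidden_layers": [1, 10, 50],
    "neurons": [5, 50, 100],
    "learning_rate": [0.0001, 0.01, 0.1],
    "activation": ["tanh", "relu"],
}


def toy_score(params):
    """Stand-in for 'train the network and return its R2 score'."""
    score = 0.5 + 0.004 * params["hidden_layers"] + 0.002 * params["neurons"]
    score -= params["learning_rate"]
    return score + (0.05 if params["activation"] == "relu" else 0.0)


best_params, best_score = grid_search(param_grid, toy_score)
n_combinations = len(list(product(*param_grid.values())))
```

The number of model fits grows as the product of the list lengths (3 x 3 x 3 x 2 here), which is why a grid over four hyper parameters quickly reaches hundreds of combinations.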
Figure 4. Architecture of the DL network employed in this study.  
Figure 5. Flowchart of the Python code created for RF module: import machine learning libraries and training data; split the imported data into training, testing and validation sets; build multiple decision trees using the training data; compute the R2 score between predicted and actual well log values; import the target data set and predict well log results using the trained trees.
Table 4. Libraries used in the code
Module | Library | Purpose
DL | Pandas | Importing training and predicting data sets
DL | Matplotlib | Plotting the final interpretation plots
DL | Keras | Building the neural network with desired hyper parameters
DL | Sklearn | Splitting and scaling the training data, calculating the R2 score
RF | Pandas | Importing training and predicting data sets
RF | Matplotlib | Plotting the final interpretation plots
RF | Sklearn | Scaling the training data and building random decision trees using the training data set
Table 5. Structure of Python codes for DL module
Task (code listings not preserved)
Importing libraries
Building the neural network
Training the network
Testing and validating the network
Running the module for predicting data set and saving the output
Table 6. Structure of Python codes for RF module
Task (code listings not preserved)
Importing libraries
Splitting the data into training, testing and validation
Creating decision trees
Training phase
Testing, validating phase
Running the module for predicting data set and saving the output
Figure 6. Well log curves [13].  
Table 7. Log responses and answers interpreted for the zoned layers at the study location [13]
Zone | Lithology | Depth interval (m) | GR (API) | RHOB (g/cc) | PHIN | LLD (Ω.m) | Shale volume (Vsh) | Density porosity (ФD) | Effective porosity (Фe) | Water saturation (Sw) | Permeability (mD)
01 | Shale | 616-624 | 90 | 2.63 | 0.07 | 11 | 0.67 | 0.01 | 0.04 | 0.85 | 1.83
02 | Sand | 624-637 | 35 | 2.50 | 0.065 | 8 | 0.17 | 0.19 | 0.13 | 0.05 | 61.39
03 | Shale | 637-639 | 88 | 2.60 | 0.04 | 15 | 0.65 | 0.10 | 0.07 | 0.08 | 104.52
04 | Sand | 639-655 | 33 | 2.45 | 0.09 | 2 | 0.15 | 0.15 | 0.12 | 0.18 | 186.29
05 | Shaly sand | 655-668 | 74 | 2.55 | 0.12 | 15 | 0.52 | 0.04 | 0.08 | 0.65 | 215.5
3. Results and discussion
The collected log curves and data, taken from Darling [13], are shown in Figure 6 and Table 7.
Results from developed Python modules: Analyses for the public data set were done using both the DL and RF modules developed, and the results are presented in Figures 7 and 8.
As shown in Table 8, the R2 scores were calculated for each training, testing and predicting phase and compared between the two machine learning techniques used in this study, i.e., DL and RF.
It was also observed that the running time of the RF analysis is significantly lower than that of the DL module, as seen in Table 8 and Figure 9b.
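The R2 score and the averages reported in Table 8 can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: `r2_explained` follows the SST and SSR definitions given with Equations (2 - 4), while `r2_residual` is the residual form used by `sklearn.metrics.r2_score`; the two agree for a perfect prediction but can differ otherwise. The function names and the three-point porosity example are hypothetical; the score lists are the predicting-phase values of Table 8.

```python
def r2_explained(y_true, y_pred):
    """R2 = SSR / SST, following the definitions given with Equations (2 - 4)."""
    mean = sum(y_true) / len(y_true)
    sst = sum((y - mean) ** 2 for y in y_true)
    ssr = sum((p - mean) ** 2 for p in y_pred)
    return ssr / sst


def r2_residual(y_true, y_pred):
    """R2 = 1 - SS_res / SST, the form used by sklearn.metrics.r2_score."""
    mean = sum(y_true) / len(y_true)
    sst = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / sst


# Predicting-phase R2 scores from Table 8 (porosity, water saturation, permeability).
dl_predicting = [0.904, 0.754, 0.786]
rf_predicting = [0.939, 0.894, 0.997]
dl_mean = sum(dl_predicting) / len(dl_predicting)
rf_mean = sum(rf_predicting) / len(rf_predicting)

# Hypothetical porosity values predicted exactly: both forms give 1.0.
perfect_explained = r2_explained([0.10, 0.12, 0.15], [0.10, 0.12, 0.15])
perfect_residual = r2_residual([0.10, 0.12, 0.15], [0.10, 0.12, 0.15])
```

The two means evaluate to about 0.814 for DL and 0.943 for RF, matching the averages listed in Table 8.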
Figure 7. Results for the DL module.
Figure 8. Results from RF module.
Table 8. R2 scores for DL and RF modules
Well log answer | Data set | R2 score (DL) | R2 score (RF) | Running time, sec (DL) | Running time, sec (RF)
Porosity | Training | 0.999 | 0.997 | 91 | 3
 | Testing | 0.989 | 0.984 | |
 | Predicting | 0.904 | 0.939 | |
Water saturation | Training | 0.954 | 0.989 | 98 | 5
 | Testing | 0.845 | 0.980 | |
 | Predicting | 0.754 | 0.894 | |
Permeability | Training | 0.995 | 0.994 | 97 | 4
 | Testing | 0.932 | 0.975 | |
 | Predicting | 0.786 | 0.997 | |
Average | Predicting | 0.814 | 0.943 | 95.3 | 6
Figure 9. Comparison of (a) R2 score of the predicting phase from two modules, (b) running time of the modules.
4. Conclusions and recommendations
In this study, two machine learning-based analysis modules, i.e. DL and RF, were developed in the Python programming language to perform well log analysis. The two modules were tested on a small public data set of a clastic reservoir [13] and the accuracy of the results was compared. The following concluding remarks can be drawn:
- Based on a conventional well log interpretation of the study data set taken from Darling [13], a main sand reservoir zone from 639 to 655 m was identified with an average effective porosity (ФD+N) of 0.125, permeability (K) of 123.84 mD, and water saturation (Sw) of 0.18, which match well with the core measurement values, i.e. Фcore = 0.13 and Kcore = 149.24 mD.
- A number of DL analyses were conducted by varying the hyper parameters, i.e. the number of hidden layers from 1 to 50, the number of neurons per hidden layer from 5 to 100, the learning rate from 0.0001 to 0.1, and the transfer function among linear, unit step, sign, Sigmoid, tanh and ReLu in both input and output layers. A total of 960 DL analyses were run, of which the best was the one with 50 hidden layers, 100 neurons per hidden layer, a learning rate of 0.0001 and the ReLu transfer function, which gave an average porosity of 0.124, permeability of 112.14 mD and water saturation of 0.14.
- Similarly, a number of RF analyses were run, varying the number of trees from 1 to 10, of which the analysis with 6 trees was found to be the best, giving an average porosity of 0.126, permeability of 122.15 mD and water saturation of 0.19.
- By comparing the results predicted by the DL and RF analyses, it was found that those by the RF analysis are better than those by the DL analysis (Table 8 and Figure 9a), i.e. a better R2 and a shorter running time. For example, the average running time is 6 s for RF and 95.3 s for DL.
- A notable advantage of RF analysis is that it can avoid the overfitting problem that is very common in ANN or DL analyses. Overfitting can be detected when the R2 value of the testing phase is significantly higher than that of the predicting phase (Table 8).
- In terms of code building, the RF algorithm is easier to develop in Python than ANN because the RF libraries are more diverse and an RF analysis requires fewer hyper parameters to be tuned, i.e. only the number of trees, while for an ANN or DL analysis more hyper parameters have to be tested, i.e. the number of hidden layers, the number of neurons per hidden layer, the learning rate range, and the type of activation or transfer function.
- Normally, machine learning-based analysis requires a big data set to be effective. However, in this study the RF algorithm proved that it can be applied to a small data set, which increases its applicability, and it can be recommended for more applications in well log analysis.
References
[1] Madision Schott, "Random Forest Algorithm for machine learning", Part 4 of a Series on Introductory Machine Learning Algorithms, 25/4/2019. [Online]. forest-algorithm-for-machine-learning-c4b2c8cc9feb.
[2] Tim Kam Ho, "Random decision forests", Proceedings of the 3rd International Conference on Document Analysis and Recognition, 1995.
[3] Yali Amit and Donald Geman, "Shape quantization and recognition with randomized trees", Neural Computation, Vol. 9, No. 7, pp. 1545 - 1588, 1997.
[4] Tim Kam Ho, "The random subspace method for constructing decision forests", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 8, pp. 832 - 844, 1998.
[5] Leo Breiman, "Random Forests", Machine Learning, Vol. 45, pp. 5 - 32, 2001.
[6] Leo Breiman, "Bagging predictors", Machine Learning, Vol. 24, pp. 123 - 140, 1996.
[7] Leo Breiman, Jerome Friedman, R.A.Olshen, and Charles J.Stone, Classification and regression trees. Chapman & Hall/CRC, 1984.
[8] Mohamed Bader-El-Den and Mohamed Medhat Gaber, "GARF: Towards self-optimised random forests", Proceedings of the 19th International Conference on Neural Information Processing, pp. 506 - 515, 2012.
[9] Simon Bernard, Laurent Heutte, and Sébastien Adam, "A study of strength and correlation in random forests", Proceedings of the 6th International Conference on Intelligent Computing, pp. 186 - 191, 2010.
[10] Praveen Boinee, Alessandro De Angelis, and G.L.Foresti, "Meta random forests", International Journal of Computational Intelligence, Vol. 2, No. 3, pp. 138 - 147, 2005.
[11] Douglas M.Hawkins, "The problem of overfitting", Journal of Chemical Information and Computer Sciences, Vol. 44, pp. 1 - 12, 2004.
[12] Shengnan Chen, "Application of machine learning methods to predict well productivity in Montney and Duvernay", Training course at SPE Canada Unconventional Resources Conference, 17 March 2019.
[13] Toby Darling, Well logging and formation evaluation. Gulf Professional Publishing, 2005.
[14] Pham Huy Giao, "Lecture notes of the CE71.70 course (Petrophysics)", Asian Institute of Technology, Bangkok, Thailand, 2018.
[15] Pham Huy Giao and Kushan Sandunil, "Applications of deep learning in predicting the fracture porosity", Petrovietnam Journal, Vol. 10, pp. 14 - 22, 2017.
[17] P.Simandoux, "Dielectric measurements in porous media and application to shaly formation", Revue de l'Institut Français du Pétrole, pp. 193 - 215, 1963.