Abstract:
Imbalanced data is a common problem in classification that can degrade
classification performance, because classifiers tend to favor the majority class and
largely ignore the minority class. One way to handle this problem is to apply the
Synthetic Minority Oversampling Technique (SMOTE)
to balance the data. This study discusses the application of SMOTE to the random forest
algorithm for handling imbalanced data in the analysis of livable house classification in
Riau Province in 2020. The results showed that the random forest algorithm obtained
accuracy, sensitivity, specificity, G-Mean, and AUC values of 94.38%, 68.83%,
98.92%, 82.51%, and 83.87%, respectively. With data balanced using SMOTE, the
random forest algorithm obtained accuracy, sensitivity, specificity, G-Mean, and AUC
values of 76.54%, 71.65%, 82.80%, 77.02%, and 77.22%, respectively. This shows that
implementing SMOTE improves the ability of the random forest algorithm to classify
the minority class, as the sensitivity increases from 68.83% to 71.65%.
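
The abstract reports the metrics but does not state their formulas. As a minimal sketch, assuming the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and G-Mean as the geometric mean of the two), the reported G-Mean values can be recomputed from the reported sensitivity and specificity. The function names below are illustrative and are not taken from the study.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of minority (positive) cases correctly found."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of majority (negative) cases correctly found."""
    return tn / (tn + fp)


def g_mean(sens: float, spec: float) -> float:
    """Geometric mean of sensitivity and specificity (assumed standard definition)."""
    return math.sqrt(sens * spec)


# Sensitivity and specificity as reported in the abstract.
g_without_smote = g_mean(0.6883, 0.9892)
g_with_smote = g_mean(0.7165, 0.8280)

print(f"G-Mean without SMOTE: {g_without_smote:.2%}")  # 82.51%
print(f"G-Mean with SMOTE:    {g_with_smote:.2%}")     # 77.02%
```

Under this assumed definition, both recomputed values agree with the G-Mean figures reported above to two decimal places.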