Classifying Satisfaction Levels with Random Forest Machine Learning Model and Evaluating Performance
DOI:
https://doi.org/10.65168/
Keywords:
Likert scale, categorical analysis
Abstract
Satisfaction ratings are widely used to measure service and organizational performance, but because they rely on subjective Likert-scale data, they are difficult to interpret fully with traditional statistical methods. This study classifies High and Medium levels of satisfaction from satisfaction questions V1–V4 using a Random Forest machine learning model and comprehensively evaluates the model's performance.
The study used a dataset of 42 observations and tuned the Random Forest model using 10-fold cross-validation repeated 20 times. Model performance was evaluated using accuracy, Cohen's Kappa, the confusion matrix, the ROC curve, AUC, feature importance, and a permutation test. The Random Forest model achieved 97.75% accuracy and a Cohen's Kappa of 0.93, indicating high classification agreement. ROC analysis yielded AUC = 0.9998, indicating almost perfect discrimination between High and Medium satisfaction levels. According to the feature importance analysis, V2 had the greatest impact on classification performance, while the impact of V1 was relatively small. A permutation test with 100 permutations also confirmed that the actual model performed significantly better than chance (p < 0.01).
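To make the reported metrics concrete, the following is a minimal Python sketch of how accuracy, Cohen's Kappa, and a permutation p-value can be computed for a two-class High/Medium problem. It is not the study's code: the function names, the toy labels, and the random seed are all hypothetical. The permutation routine is also a simplification, since it shuffles the true labels against a fixed set of predictions rather than refitting a Random Forest on each permuted dataset.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of observations whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (accuracy) and p_e is the agreement
    expected by chance from the marginal label frequencies.
    Assumes p_e < 1, i.e. more than one class is present.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def permutation_p_value(y_true, y_pred, n_permutations=100, seed=0):
    """Share of label shuffles that score at least as well as the real labels.

    Simplified illustration: predictions stay fixed and only the true
    labels are shuffled. The +1 terms count the observed result itself,
    so the smallest attainable p-value is 1 / (n_permutations + 1).
    """
    rng = random.Random(seed)
    observed = accuracy(y_true, y_pred)
    shuffled = list(y_true)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if accuracy(shuffled, y_pred) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical toy labels (not the study's data): 10 observations, 1 error.
y_true = ["High"] * 6 + ["Medium"] * 4
y_pred = ["High"] * 5 + ["Medium"] * 5

print("accuracy:", accuracy(y_true, y_pred))
print("kappa:", cohen_kappa(y_true, y_pred))
print("permutation p-value:", permutation_p_value(y_true, y_pred))
```

With 100 permutations, the smallest p-value this estimator can return is 1/101, which is just below 0.01. A result of p < 0.01 therefore corresponds to no shuffled run matching the real model.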
The Random Forest model is shown to be a reliable, stable, and interpretable method for classifying satisfaction levels, demonstrating the practical value of applying machine learning methods to satisfaction rating data.