Classifying Satisfaction Levels with a Random Forest Machine Learning Model and Evaluating Its Performance

Odgerel Boldbaatar; Purevdolgor Luvsantseren; Javzmaa Tsend; Baasandorj Chalkhaasuren; Oyuntsetseg Sandag; Akhyt Tilyeubai; Nyamdavaa Uugandavaa; Ajnai Luvsan-Ish

doi:10.65168/

Authors

Odgerel Boldbaatar, MNUMS, https://orcid.org/0000-0003-4479-8299
Purevdolgor Luvsantseren, MNUMS, https://orcid.org/0000-0002-3151-515X
Javzmaa Tsend, MNUMS, https://orcid.org/0000-0002-4369-5549
Baasandorj Chalkhaasuren, MNUMS
Oyuntsetseg Sandag, MNUMS
Akhyt Tilyeubai, MNUMS, https://orcid.org/0000-0002-2514-3038
Nyamdavaa Uugandavaa, MNUMS, https://orcid.org/0009-0001-7609-3894
Ajnai Luvsan-Ish, MNUMS, https://orcid.org/0000-0001-6911-4020

DOI:

https://doi.org/10.65168/

Keywords:

Likert scale, categorical analysis

Abstract

Satisfaction ratings are widely used to measure service and organizational performance, but because they are subjective and collected on Likert scales, they are difficult to interpret fully with traditional statistical methods. This study uses a Random Forest machine learning model to classify High and Medium satisfaction levels from satisfaction questions V1–V4 and comprehensively evaluates the model's performance.
The data consisted of 42 observations, and the Random Forest model was tuned using 10-fold cross-validation repeated 20 times. Model performance was evaluated using accuracy, Cohen's Kappa, the confusion matrix, the ROC curve, AUC, feature importance, and a permutation test. The model achieved 97.75% accuracy and a Cohen's Kappa of 0.93, indicating high classification agreement. ROC analysis yielded AUC = 0.9998, indicating almost perfect discrimination between the High and Medium satisfaction levels. In the feature importance analysis, question V2 had the greatest impact on classification performance, while the impact of question V1 was relatively small. A permutation test with 100 permutations confirmed that the performance of the actual model was significantly higher than chance (p < 0.01).
These results indicate that the Random Forest model is a reliable, stable, and interpretable method for classifying satisfaction levels, and they demonstrate the practical value of applying machine learning methods to satisfaction rating data.
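
The evaluation workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the abstract does not state which software was used, so Python with scikit-learn is an assumption, and the data here is a synthetic stand-in (42 rows, four Likert items named V1–V4, two classes), not the study's survey. The helper names `make_toy_likert_data`, `repeated_cv_metrics`, and `permutation_p_value` are hypothetical. Cohen's Kappa is the usual chance-corrected agreement, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             confusion_matrix, roc_auc_score)
from sklearn.model_selection import StratifiedKFold, permutation_test_score


def make_toy_likert_data(n_per_class=21, seed=0):
    """Synthetic stand-in for the survey (assumption): 4 Likert items, 2 classes."""
    rng = np.random.default_rng(seed)
    high = rng.integers(4, 6, size=(n_per_class, 4))    # item scores 4-5
    medium = rng.integers(2, 5, size=(n_per_class, 4))  # item scores 2-4
    X = np.vstack([high, medium])                       # columns play the role of V1..V4
    y = np.array([1] * n_per_class + [0] * n_per_class)  # 1 = High, 0 = Medium
    return X, y


def repeated_cv_metrics(X, y, n_splits=10, n_repeats=20, n_trees=100, seed=0):
    """Pool out-of-fold predictions over repeated stratified k-fold CV."""
    y_true, y_pred, y_prob = [], [], []
    for rep in range(n_repeats):
        # A new shuffle per repeat gives a different fold assignment each time.
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True,
                             random_state=seed + rep)
        for train_idx, test_idx in cv.split(X, y):
            model = RandomForestClassifier(n_estimators=n_trees,
                                           random_state=seed)
            model.fit(X[train_idx], y[train_idx])
            y_true.extend(y[test_idx])
            y_pred.extend(model.predict(X[test_idx]))
            # Column 1 holds the predicted probability of class 1 (High).
            y_prob.extend(model.predict_proba(X[test_idx])[:, 1])
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_prob),
        "confusion": confusion_matrix(y_true, y_pred, labels=[0, 1]),
    }


def permutation_p_value(X, y, n_permutations=100, n_splits=10,
                        n_trees=100, seed=0):
    """Compare CV accuracy on the real labels with accuracy on shuffled labels."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    model = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    score, _, p_value = permutation_test_score(
        model, X, y, cv=cv, n_permutations=n_permutations,
        scoring="accuracy", random_state=seed)
    return score, p_value


def feature_importances(X, y, n_trees=100, seed=0):
    """Impurity-based importances from a forest fitted on all rows."""
    model = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    model.fit(X, y)
    return dict(zip(["V1", "V2", "V3", "V4"], model.feature_importances_))
```

Calling `repeated_cv_metrics(X, y)` and `permutation_p_value(X, y)` with their defaults mirrors the settings named in the abstract (10 folds, 20 repeats, 100 permutations). With 100 permutations the smallest p-value scikit-learn can report is 1/101, so a result of p < 0.01 means no shuffled-label run matched the real model. Numbers from the synthetic data will not reproduce the reported results.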

Author Biography

Odgerel Boldbaatar, MNUMS

Master of Medical Science; conducts research in information technology and education studies.
