Confident Learning on IndoBERT: Enhancing Sentiment Classification Performance

Authors

  • Daffa Al Akhdaan, UIN Syarif Hidayatullah Jakarta
  • Taufik Edy Sutanto
  • Muhaza Liebenlito

DOI:

https://doi.org/10.33022/ijcs.v13i5.4401

Keywords:

IndoBERT, Confident Learning, Sentiment Analysis, Label Quality

Abstract

In the rapidly evolving field of artificial intelligence (AI), label uncertainty in datasets has become a significant challenge to building reliable models. This study investigates whether integrating the Confident Learning (CL) method improves IndoBERT's performance in Indonesian sentiment analysis. IndoBERT, an adaptation of BERT for Indonesian, performs strongly but is affected by label uncertainty. CL is applied to identify and correct mislabeled data and thereby improve model accuracy. The results show that IndoBERT + CL raises accuracy from 85.15% to 86.03%, with precision, recall, and F1 score improving to 87.93%, 85.00%, and 86.44%, respectively. The confusion matrix results also show that IndoBERT + CL identifies positive labels more accurately. This research highlights the importance of applying CL to enhance label quality and model performance in NLP sentiment analysis.
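
To illustrate the kind of label check that CL performs, the following is a minimal sketch of its per-class thresholding idea. It is not the authors' pipeline. The function name `find_label_issues_sketch` and the toy arrays are hypothetical, and the sketch assumes that predicted probabilities come from a classifier such as IndoBERT evaluated on examples it was not trained on, and that every class appears at least once among the given labels.

```python
import numpy as np

def find_label_issues_sketch(labels, pred_probs):
    """Flag examples whose given label disagrees with a confident prediction.

    Simplified illustration of Confident Learning's thresholding step
    (hypothetical helper, not the method exactly as used in the paper).
    """
    labels = np.asarray(labels)
    pred_probs = np.asarray(pred_probs, dtype=float)
    n_classes = pred_probs.shape[1]

    # Per-class threshold: mean predicted probability of class j over the
    # examples whose given label is j (assumes each class occurs at least once).
    thresholds = np.array(
        [pred_probs[labels == j, j].mean() for j in range(n_classes)]
    )

    # A class is a "confident" candidate when its probability meets its threshold.
    above = pred_probs >= thresholds
    has_confident = above.any(axis=1)

    # Among confident candidates, take the most probable class.
    masked = np.where(above, pred_probs, -np.inf)
    confident_label = masked.argmax(axis=1)

    # Suspected label issue: a confident class exists and differs from the given label.
    return has_confident & (confident_label != labels)

# Toy data: 0 = negative, 1 = positive. The third example is labeled 0
# although the model assigns probability 0.9 to class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
issues = find_label_issues_sketch(labels, pred_probs)
print(issues)  # only the third example is flagged
```

Flagged examples could then be relabeled or removed before the classifier is retrained; which of the two the study applies is not stated in this abstract.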

Published

29-10-2024