Exploring Strategies for Optimizing MobileNetV2 Performance in Classification Tasks Through Transfer Learning and Hyperparameter Tuning with a Local Dataset from Kigezi, Uganda.

Authors

  • Jack Turihohabwe, Mbarara University of Science and Technology
  • Ssembatya Richard, Mbarara University of Science and Technology, Uganda
  • Wasswa William, Department of Biomedical Engineering, Mbarara University of Science and Technology, Uganda

DOI:

https://doi.org/10.33022/ijcs.v14i1.4436

Keywords:

Hyperparameter tuning, Transfer Learning

Abstract

Background

Deep learning has proved to be vital in numerous applications in recent years. However, developing a model requires access to suitable datasets, and training models can impose significant computational demands, making it inefficient in resource-constrained environments. In this study, a local dataset from Kigezi, Uganda was used.

The study also explored strategies for optimizing MobileNetV2 through transfer learning and hyperparameter tuning.

Main Objective:

This study explored strategies for optimizing MobileNetV2 performance on classification tasks through transfer learning, data augmentation, and hyperparameter tuning with local data from Kigezi, Uganda. The dataset consisted of 2,415 images, which were expanded to 9,660 images after data augmentation.
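
The paper does not describe the augmentation pipeline in detail; the sketch below shows one plausible way the roughly four-fold expansion (2,415 to 9,660 images) could be produced with TensorFlow/Keras preprocessing layers. The specific transforms (flip, rotation, zoom), the `expand_dataset` helper, and the in-memory tensor input are illustrative assumptions, not details taken from the paper.

```python
# A plausible augmentation pipeline, assuming TensorFlow/Keras; the transforms,
# the expand_dataset helper, and the in-memory tensor input are assumptions.
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

def expand_dataset(images, copies=3):
    """Return the originals plus `copies` augmented variants of each image,
    e.g. 2,415 originals -> about 9,660 images when copies=3."""
    batches = [images] + [augment(images, training=True) for _ in range(copies)]
    return tf.concat(batches, axis=0)
```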

Methodology

The methodology was experimental, applying transfer learning and hyperparameter tuning to the model.
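
As a rough illustration of the transfer-learning setup, the sketch below builds a MobileNetV2 backbone pretrained on ImageNet with a new dense classification head, assuming TensorFlow/Keras; `NUM_CLASSES` and the 224×224 input size are assumptions, not values stated in the paper.

```python
# A minimal transfer-learning setup with a MobileNetV2 backbone pretrained on
# ImageNet and a new dense classification head; NUM_CLASSES and the input size
# are assumptions.
import tensorflow as tf

NUM_CLASSES = 4  # hypothetical number of classes in the Kigezi dataset

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # start with the pretrained backbone frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # final dense layer
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```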

Results

Model Layer Freezing

  • Freezing all layers except the final dense layer: training accuracy 90%, testing accuracy 85%. The model was not flexible enough to adapt to the new dataset.
  • Unfreezing the top 10 layers: training accuracy 92%, testing accuracy 88%. Moderate improvement observed, but still underperforming.
  • Unfreezing the top 20 layers: training accuracy 95%, testing accuracy 91%. Significant improvement, suggesting that more layers needed to be fine-tuned.
  • Unfreezing the entire network: training accuracy 98%, testing accuracy 96%. The model showed substantial improvement in learning task-specific features.
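
A minimal sketch of the four freezing strategies compared above, reusing the `base` backbone from the earlier sketch; `set_trainable_layers` is a hypothetical helper, and the model must be recompiled after each change.

```python
# Sketch of the four freezing strategies; set_trainable_layers is a hypothetical
# helper, and the model must be recompiled after trainability changes.
def set_trainable_layers(base_model, n_unfrozen=None):
    """Freeze all backbone layers except the last `n_unfrozen`;
    n_unfrozen=None unfreezes the entire backbone, 0 freezes it completely."""
    base_model.trainable = True
    if n_unfrozen is None:
        return
    cutoff = len(base_model.layers) - n_unfrozen
    for layer in base_model.layers[:cutoff]:
        layer.trainable = False

set_trainable_layers(base, 0)     # freeze all layers except the final dense head
set_trainable_layers(base, 10)    # unfreeze the top 10 layers
set_trainable_layers(base, 20)    # unfreeze the top 20 layers
set_trainable_layers(base, None)  # unfreeze the entire network
```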

Hyperparameter Tuning

The optimal configuration was found by unfreezing the entire network, which allowed the model to fine-tune all layers, thus improving the model’s ability to generalize to the new dataset.

  • Learning rate tuning: the learning rate is one of the most crucial hyperparameters. An extensive grid search was performed over the values 0.1, 0.01, 0.001, 0.0001, and 0.00001.
  • Batch size tuning: different batch sizes (16, 32, 64, and 128) were tested to determine the most efficient size for gradient updates.
  • Optimizer selection: various optimizers were tested, including SGD, RMSprop, and Adam. The Adam optimizer was selected for its adaptive learning rate capabilities.
  • Epochs and early stopping: the number of epochs was tuned along with early stopping criteria to prevent overfitting. Epochs were tested in the range of 10 to 100 with a patience of 5 for early stopping (one way to realize this search is sketched after the list).
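
One way to realize the search described above is a simple joint grid over learning rate, batch size, and optimizer with early stopping, as sketched below. `build_model`, `train_ds`, and `val_ds` are hypothetical placeholders (a model factory and unbatched tf.data datasets), and the joint grid is an illustrative choice rather than the paper's exact protocol.

```python
# Hedged sketch of a grid search over learning rate, batch size, and optimizer
# with early stopping; build_model, train_ds, and val_ds are hypothetical.
import itertools
import tensorflow as tf

learning_rates = [0.1, 0.01, 0.001, 0.0001, 0.00001]
batch_sizes = [16, 32, 64, 128]
optimizers = {"sgd": tf.keras.optimizers.SGD,
              "rmsprop": tf.keras.optimizers.RMSprop,
              "adam": tf.keras.optimizers.Adam}

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

results = {}
for lr, bs, (name, opt_cls) in itertools.product(
        learning_rates, batch_sizes, optimizers.items()):
    model = build_model()  # hypothetical factory wrapping the MobileNetV2 setup
    model.compile(optimizer=opt_cls(learning_rate=lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(train_ds.batch(bs), validation_data=val_ds.batch(bs),
                        epochs=100, callbacks=[early_stop], verbose=0)
    results[(lr, bs, name)] = max(history.history["val_accuracy"])
```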

Learning rate results:
  • 0.1: training accuracy 60%, testing accuracy 55%. The model was unable to converge.
  • 0.01: training accuracy 80%, testing accuracy 75%. Improved but still underperforming.
  • 0.001: training accuracy 90%, testing accuracy 88%. Further improvement, but overfitting observed.
  • 0.0001: training accuracy 99%, testing accuracy 98%. Optimal performance achieved.
  • 0.00001: training accuracy 95%, testing accuracy 92%. Learning was too slow.

Hyperparameter tuning by batch size:

  • 16: training accuracy 97%, testing accuracy 94%. Good performance but higher computational cost.
  • 32: training accuracy 99%, testing accuracy 98%. Optimal balance between performance and efficiency.
  • 64: training accuracy 95%, testing accuracy 93%. Slightly reduced performance.
  • 128: training accuracy 90%, testing accuracy 87%. The model struggled with larger batch sizes.

Hyperparameter tuning by optimizer:

  • SGD: training accuracy 85%, testing accuracy 80%. Slower convergence.
  • RMSprop: training accuracy 92%, testing accuracy 88%. Moderate performance.
  • Adam: training accuracy 99%, testing accuracy 98%. Best performance due to its adaptive learning rate.

The final customized model, after applying transfer learning and extensive hyperparameter tuning, achieved outstanding results: training accuracy 99%, testing accuracy 98%, training loss 0.02, testing loss 0.04.
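
For reference, the sketch below assembles the reported best configuration (full fine-tuning, Adam with learning rate 0.0001, batch size 32, early stopping with patience 5), reusing the hypothetical `base`, `model`, `train_ds`, and `val_ds` names from the earlier sketches.

```python
# Sketch of the reported best configuration: full fine-tuning, Adam, learning
# rate 0.0001, batch size 32, early stopping with patience 5. The base, model,
# train_ds, and val_ds names are the hypothetical ones introduced above.
base.trainable = True  # unfreeze the entire MobileNetV2 backbone
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds.batch(32), validation_data=val_ds.batch(32), epochs=100,
          callbacks=[tf.keras.callbacks.EarlyStopping(patience=5,
                                                      restore_best_weights=True)])
```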

Published

07-02-2025