Exploring Strategies for Optimizing MobileNetV2 Performance in Classification Tasks Through Transfer Learning and Hyperparameter Tuning with a Local Dataset from Kigezi, Uganda.

Authors

  • Jack Turihohabwe, Mbarara University of Science and Technology
  • Ssembatya Richard, Mbarara University of Science and Technology, Uganda
  • Wasswa William, Department of Biomedical Engineering, Mbarara University of Science and Technology, Uganda

DOI:

https://doi.org/10.33022/ijcs.v14i1.4436

Keywords:

Hyperparameter tuning, Transfer Learning

Abstract

Background

Deep learning has proved to be vital in numerous applications in recent years. However, developing a model requires access to suitable datasets, and training models can impose significant computational demands, making it inefficient in resource-constrained environments. In this study, a local dataset from Kigezi, Uganda was used.

The study also explored strategies for optimizing MobileNetV2 through transfer learning and hyperparameter tuning.

Main Objective:

This study explored strategies for optimizing MobileNetV2 performance on classification tasks through transfer learning, data augmentation, and hyperparameter tuning with local data from Kigezi, Uganda. The dataset consisted of 2,415 images, which were expanded to 9,660 images after data augmentation.
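
The paper does not describe the augmentation pipeline in detail; the sketch below shows one plausible way the roughly four-fold expansion (2,415 to 9,660 images) could be produced with TensorFlow/Keras preprocessing layers. The specific transforms (flip, rotation, zoom), the `expand_dataset` helper, and the in-memory tensor input are illustrative assumptions, not details taken from the paper.

```python
# A plausible augmentation pipeline, assuming TensorFlow/Keras; the transforms,
# the expand_dataset helper, and the in-memory tensor input are assumptions.
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

def expand_dataset(images, copies=3):
    """Return the originals plus `copies` augmented variants of each image,
    e.g. 2,415 originals -> about 9,660 images when copies=3."""
    batches = [images] + [augment(images, training=True) for _ in range(copies)]
    return tf.concat(batches, axis=0)
```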

Methodology

The methodology was experimental, applying transfer learning and hyperparameter tuning to the model.
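
As a rough illustration of the transfer-learning setup, the sketch below builds a MobileNetV2 backbone pretrained on ImageNet with a new dense classification head, assuming TensorFlow/Keras; `NUM_CLASSES` and the 224×224 input size are assumptions, not values stated in the paper.

```python
# A minimal transfer-learning setup with a MobileNetV2 backbone pretrained on
# ImageNet and a new dense classification head; NUM_CLASSES and the input size
# are assumptions.
import tensorflow as tf

NUM_CLASSES = 4  # hypothetical number of classes in the Kigezi dataset

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # start with the pretrained backbone frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # final dense layer
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```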

Results

Model Layer Freezing

  • Freezing all layers except the final dense layer: training accuracy 90%, testing accuracy 85%. The model was not flexible enough to adapt to the new dataset.
  • Unfreezing the top 10 layers: training accuracy 92%, testing accuracy 88%. Moderate improvement observed, but still underperforming.
  • Unfreezing the top 20 layers: training accuracy 95%, testing accuracy 91%. Significant improvement, suggesting that more layers needed to be fine-tuned.
  • Unfreezing the entire network: training accuracy 98%, testing accuracy 96%. The model showed substantial improvement in learning task-specific features.
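
A minimal sketch of the four freezing strategies compared above, reusing the `base` backbone from the earlier sketch; `set_trainable_layers` is a hypothetical helper, and the model must be recompiled after each change.

```python
# Sketch of the four freezing strategies; set_trainable_layers is a hypothetical
# helper, and the model must be recompiled after trainability changes.
def set_trainable_layers(base_model, n_unfrozen=None):
    """Freeze all backbone layers except the last `n_unfrozen`;
    n_unfrozen=None unfreezes the entire backbone, 0 freezes it completely."""
    base_model.trainable = True
    if n_unfrozen is None:
        return
    cutoff = len(base_model.layers) - n_unfrozen
    for layer in base_model.layers[:cutoff]:
        layer.trainable = False

set_trainable_layers(base, 0)     # freeze all layers except the final dense head
set_trainable_layers(base, 10)    # unfreeze the top 10 layers
set_trainable_layers(base, 20)    # unfreeze the top 20 layers
set_trainable_layers(base, None)  # unfreeze the entire network
```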

Hyperparameter Tuning

The optimal configuration was found by unfreezing the entire network, which allowed the model to fine-tune all layers, thus improving the model’s ability to generalize to the new dataset.

  • Learning rate tuning: the learning rate is one of the most crucial hyperparameters. An extensive grid search was performed over the values 0.1, 0.01, 0.001, 0.0001, and 0.00001.
  • Batch size tuning: different batch sizes (16, 32, 64, and 128) were tested to determine the most efficient size for gradient updates.
  • Optimizer selection: various optimizers were tested, including SGD, RMSprop, and Adam. The Adam optimizer was selected for its adaptive learning rate capabilities.
  • Epochs and early stopping: the number of epochs was tuned along with early stopping criteria to prevent overfitting. Epochs were tested in the range of 10 to 100 with a patience of 5 for early stopping (one way to realize this search is sketched after the list).
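
One way to realize the search described above is a simple joint grid over learning rate, batch size, and optimizer with early stopping, as sketched below. `build_model`, `train_ds`, and `val_ds` are hypothetical placeholders (a model factory and unbatched tf.data datasets), and the joint grid is an illustrative choice rather than the paper's exact protocol.

```python
# Hedged sketch of a grid search over learning rate, batch size, and optimizer
# with early stopping; build_model, train_ds, and val_ds are hypothetical.
import itertools
import tensorflow as tf

learning_rates = [0.1, 0.01, 0.001, 0.0001, 0.00001]
batch_sizes = [16, 32, 64, 128]
optimizers = {"sgd": tf.keras.optimizers.SGD,
              "rmsprop": tf.keras.optimizers.RMSprop,
              "adam": tf.keras.optimizers.Adam}

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

results = {}
for lr, bs, (name, opt_cls) in itertools.product(
        learning_rates, batch_sizes, optimizers.items()):
    model = build_model()  # hypothetical factory wrapping the MobileNetV2 setup
    model.compile(optimizer=opt_cls(learning_rate=lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(train_ds.batch(bs), validation_data=val_ds.batch(bs),
                        epochs=100, callbacks=[early_stop], verbose=0)
    results[(lr, bs, name)] = max(history.history["val_accuracy"])
```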

Learning rate results:
  • 0.1: training accuracy 60%, testing accuracy 55%. The model was unable to converge.
  • 0.01: training accuracy 80%, testing accuracy 75%. Improved but still underperforming.
  • 0.001: training accuracy 90%, testing accuracy 88%. Further improvement, but overfitting observed.
  • 0.0001: training accuracy 99%, testing accuracy 98%. Optimal performance achieved.
  • 0.00001: training accuracy 95%, testing accuracy 92%. Learning was too slow.

Hyperparameter tuning by batch size:

  • 16: training accuracy 97%, testing accuracy 94%. Good performance but higher computational cost.
  • 32: training accuracy 99%, testing accuracy 98%. Optimal balance between performance and efficiency.
  • 64: training accuracy 95%, testing accuracy 93%. Slightly reduced performance.
  • 128: training accuracy 90%, testing accuracy 87%. The model struggled with larger batch sizes.

Hyperparameter tuning by optimizer:

  • SGD: training accuracy 85%, testing accuracy 80%. Slower convergence.
  • RMSprop: training accuracy 92%, testing accuracy 88%. Moderate performance.
  • Adam: training accuracy 99%, testing accuracy 98%. Best performance due to its adaptive learning rate.

The final customized model, after applying transfer learning and extensive hyperparameter tuning, achieved outstanding results: training accuracy 99%, testing accuracy 98%, training loss 0.02, testing loss 0.04.
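
For reference, the sketch below assembles the reported best configuration (full fine-tuning, Adam with learning rate 0.0001, batch size 32, early stopping with patience 5), reusing the hypothetical `base`, `model`, `train_ds`, and `val_ds` names from the earlier sketches.

```python
# Sketch of the reported best configuration: full fine-tuning, Adam, learning
# rate 0.0001, batch size 32, early stopping with patience 5. The base, model,
# train_ds, and val_ds names are the hypothetical ones introduced above.
base.trainable = True  # unfreeze the entire MobileNetV2 backbone
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds.batch(32), validation_data=val_ds.batch(32), epochs=100,
          callbacks=[tf.keras.callbacks.EarlyStopping(patience=5,
                                                      restore_best_weights=True)])
```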

Published

07-02-2025