Resnet-18 With Attention Mechanism-Bidirectional LSTM Hybrid Approach for Music Genre Classification Using Stacking MFCC and Mel-Spectogram Features
DOI:
https://doi.org/10.33022/ijcs.v13i6.4456
Keywords:
Music genre classification, ResNet-18, Bi-LSTM, CBAM attention mechanism, MFCC, Mel-spectrogram
Abstract
Music genres are still frequently misclassified. One cause is inappropriate feature selection, which limits classifier performance because many traditional machine learning methods depend heavily on the features they are given. Combining several features, especially spectral features, can improve classifier performance. In addition, deep learning methods such as CNNs and RNNs have been shown to outperform traditional machine learning methods. This study proposes a hybrid ResNet18-BiLSTM model augmented with the Convolutional Block Attention Module (CBAM) to improve the accuracy of music genre classification. It also combines two spectral features, the mel-spectrogram and MFCC. Experiments on the GTZAN dataset show that combining the mel-spectrogram and MFCC features and adding CBAM yields a classification accuracy of 95.60% in validation and 95.30% in testing.
License
Copyright (c) 2024 Dimas Elang Setyoko
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.