Classification of Bird Sound Species Using YAMNet and Random Forest for Nature Conservation
DOI:
https://doi.org/10.30865/jurikom.v13i1.9443

Keywords:
Classification of Bird Sounds, YAMNet, Random Forest, Confusion Matrix, Nature Conservation

Abstract
Monitoring and identifying bird species are important aspects of biodiversity conservation, but manual identification methods based on direct observation and expert listening remain limited in terms of time, cost, and subjectivity. Further challenges arise from variation in recording quality, environmental noise, and similar vocalization patterns across species, all of which complicate automated classification. This study develops an automatic bird species classification system based on acoustic signals by combining the YouTube Audio Event Network (YAMNet) model with the Random Forest algorithm. YAMNet is used to extract log-Mel spectral features that represent the frequency and temporal characteristics of bird sounds, while Random Forest serves as the classifier that assigns species based on those features. The dataset used is Sound of 114 Species of Birds till 2022, which covers wide species variation, varied recording durations, and complex acoustic conditions. The results show that the features produced by YAMNet form visually separable clusters between species under Principal Component Analysis (PCA), although species with similar vocalization characteristics still overlap. Evaluation with a confusion matrix shows that some species are classified with high accuracy, while misclassification occurs mainly among species with similar frequency patterns. Receiver Operating Characteristic (ROC) analysis yields Area Under the Curve (AUC) values of up to 0.98 for certain species, indicating excellent discriminative ability. These findings suggest that integrating YAMNet and Random Forest can provide an efficient and reliable solution for automated bird species identification in nature conservation.
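The pipeline described above (pre-trained audio embeddings fed to a Random Forest, evaluated with accuracy and one-vs-rest ROC-AUC) can be sketched as follows. This is a minimal illustration, not the authors' code: synthetic 1024-dimensional vectors stand in for real YAMNet embeddings (in practice they would come from the TensorFlow Hub YAMNet model, which emits one 1024-dimensional embedding per ~0.96 s audio frame), and the species count, sample sizes, and hyperparameters here are placeholder choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for YAMNet embeddings: one 1024-dim vector per clip.
# Each "species" gets a distinct mean so the classes are separable.
rng = np.random.default_rng(0)
n_species, per_species, dim = 5, 40, 1024
X = np.vstack([rng.normal(loc=k, scale=1.0, size=(per_species, dim))
               for k in range(n_species)])
y = np.repeat(np.arange(n_species), per_species)

# Train a Random Forest on the embeddings, as in the study's pipeline.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Evaluate with accuracy and one-vs-rest multi-class ROC-AUC.
acc = clf.score(X_te, y_te)
auc = roc_auc_score(y_te, clf.predict_proba(X_te), multi_class="ovr")
```

On real recordings the same structure applies, except that `X` is built by running each clip through YAMNet and pooling (e.g. averaging) its per-frame embeddings before classification.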



