Dialect Classification of English Speakers Using Random Forest

Authors

DOI:

https://doi.org/10.30865/mib.v5i2.2754

Keywords:

Dialect Recognition, Speech Recognition, Class Imbalance, MFCC, Random Forest

Abstract

Speech recognition is an important research field that is now widely used in many applications. However, its performance is affected by the speaker's dialect, so dialect recognition is often used as an additional feature in speech recognition systems. Recognizing dialects is not easy, and machine learning is now widely applied to the task. Among the challenges in machine learning-based dialect recognition are class imbalance and class overlap, which affect a wide variety of classification techniques. This study applies Random Forest with oversampling to dialect recognition and uses the Grid Search method to optimize the Random Forest hyper-parameters. Experiments on the Speech Accent Archive data using MFCC features yielded an accuracy of 0.91 and an AUC of 0.95.
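The workflow described in the abstract (oversampling, Random Forest, Grid Search, then accuracy and AUC) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the study's actual setup: the data are synthetic stand-ins for MFCC vectors, not the Speech Accent Archive; the task is binary; the oversampling is naive random duplication, since the abstract does not name the technique used; and the parameter grid is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split


def random_oversample(X, y, seed=0):
    """Duplicate random samples of each smaller class until all classes
    match the size of the largest class (naive random oversampling)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    picked = []
    for label, count in zip(classes, counts):
        idx = np.flatnonzero(y == label)
        picked.append(idx)  # keep every original sample
        picked.append(rng.choice(idx, size=target - count, replace=True))
    indices = np.concatenate(picked)
    return X[indices], y[indices]


# Assumption: synthetic 13-dimensional vectors stand in for MFCC features,
# with a roughly 9:1 class imbalance.
X, y = make_classification(
    n_samples=600, n_features=13, n_informative=8, n_redundant=2,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Oversample the training split only; the test split keeps its distribution.
X_bal, y_bal = random_oversample(X_train, y_train)

# Hypothetical grid; the values searched in the study are not given here.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 10]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="roc_auc"
)
search.fit(X_bal, y_bal)

best = search.best_estimator_
accuracy = accuracy_score(y_test, best.predict(X_test))
auc = roc_auc_score(y_test, best.predict_proba(X_test)[:, 1])
print(search.best_params_, round(accuracy, 3), round(auc, 3))
```

One caveat on this design: because the training data are oversampled before cross-validation, duplicated samples can land in both the fitting and validation folds, which tends to inflate the cross-validated score. A stricter variant would oversample inside each fold; the held-out test metrics above are not affected by this.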

Author Biography

Muhamad Azhar, STMIK Nusa Mandiri, Jakarta

Computer Science

Published

2021-04-25

Issue

Section

Articles