Review: Feature Extraction and Classification Methods for Speaker Identification (Review: Metode-Metode Ekstraksi Ciri dan Klasifikasi Identifikasi Pembicara)

Faisal Dharma Adhinata; Nur Ghaniaviyanto Ramadhan

doi:10.30865/mib.v6i1.3469

Authors

Faisal Dharma Adhinata, Institut Teknologi Telkom Purwokerto, Purwokerto (ORCID: http://orcid.org/0000-0002-2624-173X)
Nur Ghaniaviyanto Ramadhan, Institut Teknologi Telkom Purwokerto, Purwokerto

Keywords:

Speaker Identification, MFCC, GMM, Hybrid Classifier

Abstract

Personal identity is still commonly verified with an ID card (KTP national identity card, SIM driving license, passport, etc.). This approach is weak because an ID card is easily damaged or lost. Biometric recognition systems offer an alternative by using characteristics of the human body to establish identity. Voice is a readily available source of biometric information. In speaker identification, voice pattern recognition is used to determine the identity of the person speaking. This paper reviews several feature extraction and classification methods that are commonly used in speaker identification. The choice of feature extraction and classification methods influences both the computational cost and the accuracy of a speaker identification system. Across the datasets surveyed, the Mel Frequency Cepstral Coefficients (MFCC) feature extraction method achieves high accuracy even with noisy input. For classification, the Gaussian Mixture Model (GMM) is the most frequently used method because it can work in noisy conditions. More recently, hybrid classifiers have been developed that further increase accuracy.
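
The GMM-based classification mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that feature vectors have already been extracted, and it uses synthetic 2-D points as a hypothetical stand-in for MFCC frames. All function names (`fit_gmm`, `identify`, `synthetic_frames`) and the speaker labels are invented for illustration. One diagonal-covariance GMM is fitted per speaker with plain EM, and an utterance is assigned to the speaker whose model gives the highest average log-likelihood.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def fit_gmm(data, k, iters=30, seed=0, var_floor=1e-3):
    """Fit a k-component diagonal GMM with EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    means = [list(p) for p in rng.sample(data, k)]  # random frames as initial means
    gmean = [sum(p[d] for p in data) / n for d in range(dim)]
    gvar = [max(sum((p[d] - gmean[d]) ** 2 for p in data) / n, var_floor)
            for d in range(dim)]
    variances = [list(gvar) for _ in range(k)]      # start from the global variance
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each frame
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss_diag(x, means[j], variances[j])
                    for j in range(k)]
            norm = logsumexp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10    # guard against empty components
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                        for d in range(dim)]
            variances[j] = [max(sum(r[j] * (x[d] - means[j][d]) ** 2
                                    for r, x in zip(resp, data)) / nj, var_floor)
                            for d in range(dim)]
    return weights, means, variances

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under one speaker model."""
    weights, means, variances = gmm
    total = 0.0
    for x in frames:
        total += logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                            for w, m, v in zip(weights, means, variances)])
    return total / len(frames)

def identify(frames, models):
    """Return the speaker whose GMM scores the frames highest."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))

def synthetic_frames(centers, n, rng, spread=0.5):
    """Hypothetical stand-in for MFCC frames: points scattered around a few centers."""
    frames = []
    for _ in range(n):
        cx, cy = rng.choice(centers)
        frames.append([rng.gauss(cx, spread), rng.gauss(cy, spread)])
    return frames

rng = random.Random(42)
speaker_centers = {
    "speaker_a": [(0.0, 0.0), (3.0, 3.0)],
    "speaker_b": [(8.0, -4.0), (11.0, -1.0)],
}
# Enrollment: one GMM per speaker, trained on that speaker's frames
models = {name: fit_gmm(synthetic_frames(c, 200, rng), k=2)
          for name, c in speaker_centers.items()}
# Identification: score an unseen utterance against every model
test_utterance = synthetic_frames(speaker_centers["speaker_b"], 50, rng)
predicted = identify(test_utterance, models)
print(predicted)
```

In this toy setup the two speakers occupy well-separated regions, so the test utterance drawn from the second speaker's centers is assigned to `speaker_b`. With real data, the frames would be MFCC vectors of higher dimension, and more mixture components would typically be needed.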
