Improving Infant Cry Recognition with CNNs and Imbalance Mitigation
DOI:
https://doi.org/10.30865/mib.v8i2.7370
Keywords:
Baby Cry Classification, Neural Network, Handling Data Imbalance, Audio Analysis
Abstract
Classifying baby cries with machine learning is essential for building automated systems that help caregivers identify and respond to infants' needs promptly and accurately. This study aims to improve on previous research using the Cry Baby Dataset, which is highly imbalanced. To address the imbalance, we combine oversampling and undersampling using SMOTE and ENN with data augmentation through pitch shifting and noise addition. The processed data were then modeled using a Convolutional Neural Network (CNN). The model achieved an overall accuracy of 88%, with balanced accuracy across all five classes: cries attributed to belly pain caused by colic, cries related to burping, cries associated with discomfort or other symptoms, cries of hunger, and cries indicating fatigue or the need for sleep. This is a notable improvement over previous studies, which reported very unbalanced accuracy across classes.
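The distinction the abstract draws between overall accuracy and per-class accuracy can be illustrated with a small sketch. The class labels below are shorthand for the five categories named in the abstract, the helper function names are hypothetical, and the label counts are made-up toy values, not the paper's data. The sketch only shows why a single overall accuracy figure can hide poor performance on minority classes, which is the problem the resampling and augmentation steps are meant to address.

```python
from collections import Counter

# Shorthand labels for the five classes named in the abstract (illustrative names).
CLASSES = ["belly_pain", "burping", "discomfort", "hungry", "tired"]


def per_class_recall(y_true, y_pred, classes):
    """Return {class: fraction of that class's samples predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in classes if totals[c] > 0}


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly, regardless of class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred, classes):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred, classes)
    return sum(recalls.values()) / len(recalls)


# Toy, made-up label distribution with one dominant class (NOT the paper's data).
y_true = (["hungry"] * 80 + ["belly_pain"] * 5 + ["burping"] * 5
          + ["discomfort"] * 5 + ["tired"] * 5)
# A degenerate classifier that always predicts the majority class.
y_pred = ["hungry"] * len(y_true)

print(overall_accuracy(y_true, y_pred))            # 0.8
print(balanced_accuracy(y_true, y_pred, CLASSES))  # 0.2
print(per_class_recall(y_true, y_pred, CLASSES))
```

In this toy case the always-majority classifier reaches 80% overall accuracy while four of the five classes are never recognized, so reporting per-class results alongside the overall figure gives a more honest picture on imbalanced data.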
License

This work is licensed under a Creative Commons Attribution 4.0 International License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).