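The Effect of Oversampling and Cross Validation on Machine Learning Models for Sentiment Analysis of the Student Graduation Output Policy (Pengaruh Oversampling dan Cross Validation Pada Model Machine Learning Untuk Sentimen Analisis Kebijakan Luaran Kelulusan Mahasiswa)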

 Mufida Rahayu (Universitas Dian Nuswantoro, Semarang, Indonesia)
 (*) Ardytha Luthfiarta (Universitas Dian Nuswantoro, Semarang, Indonesia)
 Lailatul Cahyaningrum (Universitas Dian Nuswantoro, Semarang, Indonesia)
 Alya Nurfaiza Azzahra (Universitas Dian Nuswantoro, Semarang, Indonesia)

(*) Corresponding Author

Submitted: November 20, 2023; Published: January 9, 2024

Abstract

The Minister of Education, Culture, Research, and Technology issued a new policy on graduation standards for undergraduate and postgraduate students. The policy was announced on August 29, 2023, in a live stream on the Kemendikbudristek YouTube channel during the Merdeka Belajar seminar episode 26: Transformation of National Standards and Higher Education Accreditation. The policy drew both positive and negative responses from the public. This research therefore analyzes public sentiment toward the policy so that the findings can be useful to the community in the future. Two algorithms are used, the Naïve Bayes Classifier (NBC) and K-Nearest Neighbor (KNN), on a dataset of 1,085 comments collected from the YouTube video. The data are pre-processed, including stemming with Sastrawi, and then labeled using a lexicon-based method. The comments are grouped into positive and negative sentiment, and the labeling results show that the classes are imbalanced. The Synthetic Minority Over-sampling Technique (SMOTE) is then applied to balance the data, with the aim of improving accuracy. The test results after SMOTE show that NBC achieves higher accuracy than KNN: NBC reaches an accuracy of 74%, precision of 74.6%, recall of 74%, and F1-score of 73.9%, while KNN reaches an accuracy of 50.2%, precision of 75.2%, recall of 50.2%, and F1-score of 34.5%.
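
The oversampling step described in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbors. This is not the authors' code. The function names `smote` and `balance`, the toy two-dimensional feature vectors, and the labels `"neg"` and `"pos"` are assumptions made for illustration only; the paper's actual features and tooling are not given on this page.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of minority feature vectors."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbors of the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        # Interpolate at a random position between base and neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def balance(X, y, seed=0):
    """Oversample every smaller class up to the size of the largest class."""
    counts = {}
    for label in y:
        counts[label] = counts.get(label, 0) + 1
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        if count < target:
            minority = [x for x, lab in zip(X, y) if lab == label]
            new_points = smote(minority, target - count, seed=seed)
            X_out.extend(new_points)
            y_out.extend([label] * len(new_points))
    return X_out, y_out


# Hypothetical toy data: six "neg" vectors and two "pos" vectors.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
     [0.2, 0.4], [0.4, 0.1], [1.0, 1.0], [1.2, 0.9]]
y = ["neg"] * 6 + ["pos"] * 2

X_bal, y_bal = balance(X, y)
print(y_bal.count("neg"), y_bal.count("pos"))  # prints "6 6"
```

When oversampling is combined with cross validation, a common precaution is to apply it only to the training portion of each fold, so that synthetic points derived from a test sample do not leak into training. Whether the study followed this order is not stated in the abstract.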


Copyright (c) 2024 JURNAL MEDIA INFORMATIKA BUDIDARMA

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.



JURNAL MEDIA INFORMATIKA BUDIDARMA
STMIK Budi Darma
Secretariat: Sisingamangaraja No. 338 Telp 061-7875998
Email: mib.stmikbd@gmail.com
