Implementation of the GPT-3.5 Turbo Model for Automated Essay Scoring in Online Learning Systems
DOI:
https://doi.org/10.30865/jurikom.v12i6.9317
Keywords:
GPT-3.5 Turbo, Essay, Online Learning, Automated Scoring
Abstract
Manual essay assessment in online learning demands significant time and effort, and consistency is difficult to maintain. This study explores the use of the large language model GPT-3.5 Turbo as the core of an automated essay scoring system for online learning platforms. The study follows a Research and Development (R&D) approach based on the ADDIE development model (Analysis, Design, Development, Implementation, and Evaluation) and adopts the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework as its methodology. The automated essay scoring system using Prompt 4 demonstrated high accuracy and reliability: the model achieved an accuracy of 94.3%, an F1-Score of 0.955, and a Cohen's Kappa of 0.878. This Kappa value indicates very strong agreement between the AI-generated assessments and the gold standard validated by educators, and it far exceeds the initial inter-rater agreement among the educators themselves, which was only 0.1157. The superior performance of Prompt 4 is further confirmed by the lowest Mean Absolute Error (MAE), 30.54, and the highest Area Under the Curve (AUC), 0.956.
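To make the agreement metrics named in the abstract concrete, the following is a minimal Python sketch of how accuracy, F1-Score, Cohen's Kappa, and MAE can be computed from paired gold-standard and AI-generated assessments. It is not the authors' evaluation code. The binary label encoding (1 = meets the rubric, 0 = does not), the function names, and all toy data are assumptions made for illustration only, and the toy values do not reproduce the figures reported in the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of items on which the two raters assign the same label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohens_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    # Expected agreement if both raters labelled independently
    # according to their own marginal label frequencies.
    p_e = sum(true_counts[k] * pred_counts[k] for k in labels) / (n * n)
    if p_e == 1.0:
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def mean_absolute_error(scores_true, scores_pred):
    """Average absolute difference between two sets of numeric scores."""
    diffs = [abs(a - b) for a, b in zip(scores_true, scores_pred)]
    return sum(diffs) / len(diffs)


# Hypothetical toy data, not taken from the study.
gold_labels = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]  # educator-validated gold standard
ai_labels = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]    # labels produced by the model
gold_scores = [80, 65, 90]                    # numeric essay scores (assumed scale)
ai_scores = [75, 70, 90]

if __name__ == "__main__":
    print("accuracy:", accuracy(gold_labels, ai_labels))
    print("f1:", f1_score(gold_labels, ai_labels))
    print("kappa:", cohens_kappa(gold_labels, ai_labels))
    print("mae:", mean_absolute_error(gold_scores, ai_scores))
```

Cohen's Kappa is the more conservative of these measures because it subtracts the agreement two raters would reach by chance, which is why a Kappa of 0.878 is read as very strong agreement while 0.1157 indicates agreement only slightly better than chance.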



