Application of the CRISP-DM Method to Classifying Visitor Review Data for the Lake Toba Destination Using the Naïve Bayes Classifier (NBC) and Decision Tree (DT) Algorithms

Authors

DOI:

https://doi.org/10.30865/mib.v7i3.6461

Keywords:

Sentiment Analysis, CRISP-DM, Toba Lake, Classification, NBC, DT

Abstract

This study implements a classification method using the Naïve Bayes Classifier (NBC) and Decision Tree (DT) algorithms on Lake Toba visitor review text data. It follows the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology, which comprises six stages: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. At the business understanding stage, the context is the tourism sector, specifically tourist perceptions of the quality of products and services at Lake Toba tourist destinations. At the data understanding stage, the review data came from the Tripadvisor website, which contained 858 reviews with the following rating distribution: 8 abysmal, 22 poor, 81 neutral, 304 good, and 443 excellent. At the data preparation stage, data cleansing left 382 records for processing, split into 70 percent training data and 30 percent test data. At the modeling stage, the performance of the NBC and DT algorithms was evaluated with and without the SMOTE upsampling operator. The comparison shows that the best-performing model is DT with SMOTE upsampling, with an accuracy of 98.27 percent, precision of 98.83 percent, recall of 97.71 percent, F-measure of 98.26 percent, and AUC of 0.982. At the evaluation stage, ranking the five most frequently used terms in the Lake Toba visitor review data highlighted the importance of excellent service (quality human resources) and supporting infrastructure (tourism facilities and infrastructure). At the deployment stage, the development of attractions, accessibility, lodging, and tourism-supporting amenities needs to be balanced to generate visiting intention and revisit motivation for Lake Toba.
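The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: the variable and function names are hypothetical, and it assumes the reported F-measure is the standard harmonic mean of precision and recall (F1). It sums the rating counts, shows how skewed the rating classes are, and recomputes the F-measure from the reported precision and recall.

```python
# Rating counts reported in the abstract (Tripadvisor reviews of Lake Toba).
rating_counts = {
    "abysmal": 8,
    "poor": 22,
    "neutral": 81,
    "good": 304,
    "excellent": 443,
}

total_reviews = sum(rating_counts.values())

# Share of each rating class in percent. The skew toward "good" and
# "excellent" is the kind of imbalance an oversampling step such as
# SMOTE is typically used to offset.
class_share = {
    label: round(100 * count / total_reviews, 1)
    for label, count in rating_counts.items()
}


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for DT with SMOTE, in percent.
reported_precision = 98.83
reported_recall = 97.71
recomputed_f = f_measure(reported_precision, reported_recall)

print(total_reviews)
print(class_share)
print(round(recomputed_f, 2))
```

The counts sum to the 858 reviews stated in the abstract, and the recomputed F-measure falls within about 0.01 percentage points of the reported 98.26 percent, a gap that is plausibly explained by rounding of the published precision and recall.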

Published

2023-07-31
