Analysis of Telkom University News Subjects on Popular Indonesian News Portals Using a Combination of Hidden Markov Model (HMM) and Rule Based Methods

Authors

  • Rendhy Al-Farrel, Telkom University, Bandung
  • Donni Richasdy, Telkom University, Bandung
  • Mahendra Dwifebri Purbolaksono, Telkom University, Bandung

DOI:

https://doi.org/10.30865/mib.v6i4.4566

Keywords:

POS Tagger, Subject, Hidden Markov Model, Rule-Based

Abstract

News media are a common source of public information about current events. News articles often contain sentences in which a subject promotes an object in order to increase its popularity. Part-of-Speech (POS) tagging assigns each word in a sentence a word class from the tagset defined by a corpus, so the subject of a sentence in a news article can be located from the word classes assigned to its words. This research focuses on finding the subject, the "who", that repeatedly spreads news about Telkom University. It applies POS tagging with a Hidden Markov Model (HMM) combined with a rule-based method to a dataset of news about Telkom University taken from popular Indonesian news portals. All news about Telkom University on these portals is collected and then classified using the HMM and rule-based combination. To improve the results, we varied the probability estimator used in the HMM. Across the scenarios tested, the best result obtained by the combined HMM and rule-based method was an accuracy of 94.96%, a precision of 94.99%, a recall of 94.96%, and an F1-score of 94.95%.
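
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the tag names (NN, NNP, VB), the additive (Lidstone) estimator, and the rule "the subject is the nouns before the first verb" are all assumptions made for illustration. The sketch trains an HMM from tagged sentences, decodes tags with the Viterbi algorithm, and then applies the rule to the predicted tags. The gamma parameter stands in for the idea of changing the probability estimator.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; the paper uses an Indonesian POS-tagged corpus.
TRAIN = [
    [("Rektor", "NNP"), ("meresmikan", "VB"), ("gedung", "NN")],
    [("Mahasiswa", "NN"), ("memenangkan", "VB"), ("lomba", "NN")],
    [("Telkom", "NNP"), ("University", "NNP"), ("membuka", "VB"),
     ("pendaftaran", "NN")],
]

def train(corpus):
    """Count tag transitions and word emissions from tagged sentences."""
    trans = defaultdict(Counter)  # previous tag -> counts of next tag
    emit = defaultdict(Counter)   # tag -> counts of lowercased word
    vocab = set()
    for sent in corpus:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            vocab.add(word.lower())
            prev = tag
    return trans, emit, sorted(emit), vocab

def lidstone(counter, key, bins, gamma=1.0):
    """Additive smoothing; gamma is the assumed 'estimator' knob."""
    return (counter[key] + gamma) / (sum(counter.values()) + gamma * bins)

def viterbi(words, model, gamma=1.0):
    """Return the most probable tag sequence for a list of words."""
    trans, emit, tags, vocab = model
    n_tags, n_words = len(tags), len(vocab) + 1  # one extra bin for unseen words
    # best[tag] = (log-probability, tag path) of the best path ending in tag
    best = {"<s>": (0.0, [])}
    for word in words:
        w = word.lower()
        step = {}
        for tag in tags:
            e = math.log(lidstone(emit[tag], w, n_words, gamma))
            step[tag] = max(
                (score + math.log(lidstone(trans[prev], tag, n_tags, gamma)) + e,
                 path + [tag])
                for prev, (score, path) in best.items()
            )
        best = step
    return max(best.values())[1]

def find_subject(words, tags):
    """Assumed rule: the subject is every noun before the first verb."""
    subject = []
    for word, tag in zip(words, tags):
        if tag.startswith("VB"):
            break
        if tag.startswith("NN"):
            subject.append(word)
    return " ".join(subject)

model = train(TRAIN)
sentence = ["Rektor", "membuka", "lomba"]
predicted = viterbi(sentence, model)
print(predicted, find_subject(sentence, predicted))
```

Working in log space avoids numeric underflow on longer sentences. Rerunning viterbi with a different gamma is one simple way to compare estimators in the spirit of the scenarios mentioned above.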

Author Biography

Rendhy Al-Farrel, Telkom University, Bandung

Informatics Study Program

Published

2022-10-25