Optuna Hyperparameter Optimization of the mT5 Model for Angkola-Indonesian Translation
DOI: https://doi.org/10.30865/jurikom.v13i1.9465

Keywords: Machine Translation, Angkola Language, mT5, Optuna, Hyperparameter Tuning, BLEU Score, chrF Score

Abstract
This research addresses the challenge of preserving the Angkola language in the digital era, a challenge exacerbated by the lack of an adequate digital data corpus, by developing an accurate and efficient automatic Angkola-to-Indonesian machine translation system. The proposed method fine-tunes the Multilingual Text-to-Text Transfer Transformer (mT5-base) model on a corpus of Angkola-Indonesian text. After cleaning, the dataset consisted of 28,775 Angkola-Indonesian sentence pairs, which were split into 70% training data (20,142 lines), 15% validation data (4,316 lines), and 15% test data (4,317 lines). Model performance was optimized with Optuna hyperparameter tuning to find the best hyperparameter combination; the Optuna objective function was designed to maximize a composite score of the BLEU and chrF metrics computed on the validation set. The optimization process yielded a best trial (Trial 50) with key hyperparameters learning rate = 0.0004316 and num_beams = 4. The best fine-tuned model was then evaluated on the held-out test set using standard translation metrics and demonstrated excellent performance, achieving a BLEU score of 73.84 and a chrF score of 83.34. Overall, this research successfully applied Optuna hyperparameter optimization to the mT5 model, producing an Angkola-to-Indonesian translation model with high accuracy and more efficient performance. These results make a tangible contribution to the preservation of the Angkola language by providing a modern and accurate translation tool.
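As an illustration of the optimization setup described in the abstract, the following minimal Python sketch shows how an Optuna objective can maximize a composite of validation BLEU and chrF. It is not the authors' released code: the equal 0.5/0.5 weighting, the search ranges, the trial count, and the fine_tune_and_translate() helper (standing in for the mT5-base fine-tuning and decoding step) are illustrative assumptions.

import optuna
import evaluate  # Hugging Face `evaluate`; provides sacreBLEU and chrF

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")

def composite_score(predictions, references):
    # Blend BLEU and chrF on the validation set; equal weights are an assumption.
    b = bleu.compute(predictions=predictions, references=[[r] for r in references])["score"]
    c = chrf.compute(predictions=predictions, references=[[r] for r in references])["score"]
    return 0.5 * b + 0.5 * c

def objective(trial):
    # Search space chosen to cover the reported best values
    # (learning rate = 0.0004316, num_beams = 4).
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    num_beams = trial.suggest_int("num_beams", 1, 8)
    # fine_tune_and_translate() is a hypothetical helper: it fine-tunes
    # google/mt5-base with these settings and returns the model's validation
    # translations alongside the reference translations.
    predictions, references = fine_tune_and_translate(
        model_name="google/mt5-base",
        learning_rate=learning_rate,
        num_beams=num_beams,
    )
    return composite_score(predictions, references)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=60)  # the paper's best result came at Trial 50
print(study.best_trial.params)

The final test-set figures reported in the abstract (BLEU 73.84, chrF 83.34) would then be obtained by scoring the best model's translations of the held-out test split with the same two metrics.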
References
[1] G. P. M. Virgilio, F. Saavedra Hoyos, and C. B. Bao Ratzemberg, “The impact of artificial intelligence on unemployment: a review,” Int. J. Soc. Econ., no. January, pp. 25553–25579, 2024, doi: 10.1108/IJSE-05-2023-0338.
[2] I. Bakhov, “Artificial intelligence tools for automating philological text research,” 2025, doi: 10.62486/latia2025293.
[3] S. Cahyawijaya et al., “NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages,” 2023.
[4] D. Arbian, A. Prasetya, D. Dwi, and F. Almu, “Multilingual Parallel Corpus for Indonesian Low-Resource Languages,” vol. 9, Sep. 2025.
[5] N. Nadra, R. Marnita, and K. A. Amini, “Assimilation of the Batak Angkola Language in Pintu Padang, North Sumatra, Indonesia,” J. Arbitrer, vol. 11, no. 1, pp. 29–38, 2024, doi: 10.25077/ar.11.1.29-38.2024.
[6] K. A. Amini, N. Nadra, and R. Marnita, “Affix Form and Morphonemic Process of Batak Angkola Language [Bentuk Afiks dan Proses Morfofonemik Bahasa Batak Angkola],” Pendidikan, Bahasa, dan Sastra, vol. 11, no. 1, pp. 30–38, 2023.
[7] N. H. Hrp, M. Fikry, and Y. Yusra, “Algoritma Stemming Teks Bahasa Batak Angkola Berbasis Aturan Tata Bahasa [A Grammar Rule-Based Stemming Algorithm for Batak Angkola Text],” J. Comput. Syst. Informatics, vol. 4, no. 3, pp. 642–648, 2023, doi: 10.47065/josyc.v4i3.3458.
[8] V. Agarwal, S. B. Pooja Rao, and D. B. Jayagopi, “Hinglish to English Machine Translation using Multilingual Transformers,” Int. Conf. Recent Adv. Nat. Lang. Process. RANLP, vol. 2021-Septe, pp. 16–21, 2021, doi: 10.26615/issn.2603-2821.2021_003.
[9] L. Xue et al., “mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer,” in Proc. NAACL-HLT, pp. 483–498, 2021.
[10] M. Kale et al., “nmT5 – Is parallel data still relevant for pre-training massively multilingual language models?,” in Proc. ACL-IJCNLP, pp. 683–691, 2021.
[11] M. Kresic and N. Abbas, “Normalizing Swiss German Dialects with the Power of Large Language Models,” Procedia Comput. Sci., vol. 244, pp. 287–295, 2024, doi: 10.1016/j.procs.2024.10.202.
[12] Z. Liu, Y. Lin, and M. Sun, Representation Learning for Natural Language Processing, 2nd ed., 2023, doi: 10.1007/978-981-99-1600-9.
[13] B. Zhang et al., “Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation,” 2025, [Online]. Available: http://arxiv.org/abs/2504.06225
[14] N. Alinda and S. Defit, “The Use of Hyperparameter Tuning in Model Classification: A Scientific Work Area Identification,” vol. 8, pp. 2181–2188, Dec. 2024.
[15] B. Haddow, R. Bawden, A. V. Miceli Barone, J. Helcl, and A. Birch, “Survey of Low-Resource Machine Translation,” Comput. Linguist., vol. 48, no. 3, pp. 673–732, 2022.
[16] T. Kudo and J. Richardson, “SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing,” in Proc. EMNLP: System Demonstrations, pp. 66–71, 2018.
[17] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A Next-generation Hyperparameter Optimization Framework,” pp. 1–10, 2019.
[18] M. A. K. Raiaan, S. Sakib, and N. M. Fahad, “A systematic review of hyperparameter optimization techniques in Convolutional Neural Networks,” Decis. Anal. J., vol. 11, p. 100470, 2024, doi: 10.1016/j.dajour.2024.100470.
[19] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “BLEU: a Method for Automatic Evaluation of Machine Translation,” in Proc. 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318, 2002.
[20] M. Popović, “chrF: character n-gram F-score for automatic MT evaluation,” in Proc. Tenth Workshop on Statistical Machine Translation, pp. 392–395, 2015.