Phrase Based Statistical Machine Translation Javanese-Indonesian

 (*)Aufa Eka Putri Lesatari (Telkom University, Bandung, Indonesia)
 Arie Ardiyanti (Telkom University, Bandung, Indonesia)
 Ibnu Asror (Telkom University, Bandung, Indonesia)

(*) Corresponding Author

DOI: http://dx.doi.org/10.30865/mib.v5i2.2812

Abstract

This research aims to build a statistical machine translation (SMT) system for Javanese-Indonesian translation and to determine how the two main data sources of SMT, the parallel corpus and the monolingual corpus, affect translation quality. Testing was carried out by gradually increasing the quantity of parallel and monolingual corpus across seven configurations of the Javanese-Indonesian SMT system. Every configuration was tested on a test set of 500 lines of Javanese sentences, and the output was evaluated automatically using Bilingual Evaluation Understudy (BLEU). Across the seven configurations, the BLEU score rose as the quantity of parallel and monolingual corpus increased. With additional parallel corpus, the score increased by 3.6% from configuration 1 to 2, by 8.23% from configuration 2 to 3, and by 14.92% from configuration 3 to 7. With additional monolingual corpus, the score increased by 0.18% from configuration 4 to 5, by 0.06% from configuration 5 to 6, and by 0.24% from configuration 6 to 7. The results show that increasing either corpus improves the BLEU score of Javanese-Indonesian SMT, but that the quantity of parallel corpus has a greater influence than the quantity of monolingual corpus.
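
To make the evaluation metric concrete, the following is a minimal sketch of corpus-level BLEU. The abstract does not say which BLEU tool or settings were used, so the details here are assumptions: one reference per sentence, whitespace tokenization, n-grams up to length 4, and no smoothing. The function names and the toy Indonesian sentences are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), single reference per line, no smoothing.

    Assumption: sentences are already tokenized by whitespace.
    """
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU 0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty for output shorter than the reference.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)


# Hypothetical toy example: system output vs. reference translation.
system_output = ["saya mau makan nasi"]
reference = ["saya mau makan nasi goreng"]
print(round(corpus_bleu(system_output, reference), 2))
```

Scoring each of the seven configurations this way on the same 500 test lines would allow the BLEU values to be compared directly as the corpus sizes grow.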

Keywords


Statistical Machine Translation; Parallel Corpus; Monolingual Corpus; BLEU; Phrase Based





Copyright (c) 2021 JURNAL MEDIA INFORMATIKA BUDIDARMA

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.



JURNAL MEDIA INFORMATIKA BUDIDARMA
STMIK Budi Darma
Sekretariat : Jln. Sisingamangaraja No. 338 Telp 061-7875998
email : mib.stmikbd@gmail.com
