Analysis Of Database Record Compression Using The Levenstein Algorithm

The amount of data stored in a database determines how much storage it uses: the more data stored, the more storage is used. To reduce storage usage, the stored data must be made smaller, and one technique for doing so is data compression. The Levenstein algorithm is a data compression technique that replaces the original bits of a string with bits from a Levenstein code table, resulting in a smaller size after compression. The Levenstein algorithm is suitable for compressing data that contains many repetitions.

Keywords– Levenstein, Compression, Record Database


INTRODUCTION
Databases are commonly used to store information from an application system, whether it is website based or desktop based. Applications that use databases aim to make use of existing storage capacity so that the data can be accessed anytime and anywhere. However, the larger the data stored, the larger the storage media it requires [1], [2]. The size of the data also affects the speed of data transmission. Data size is also influenced by the number of repetitions of letters or words [3]. One technique that can be used is data compression, which reduces the repetition of letters or words. One of the data compression algorithms is Levenstein.

Compression
Compression is the process of converting data consisting of a collection of characters into a coded form in order to save storage and data transmission time [1]. Data compression is the process of converting an input data stream (the source stream or the original raw data) into another data stream (the output, the bit stream, or the compressed stream) that has a smaller size [4]-[6].

1. Read the data that has been compressed.
2. Convert the Levenstein code bit string that became the new data values back to binary form. Restore the binary to the original bit string by reading the last 8 bits as a decimal number. Call the result n, then remove 7 + n bits from the end.
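The removal of the trailing bits described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper name `strip_padding` is hypothetical, and the bits are held as a string of "0" and "1" characters purely for readability.

```python
def strip_padding(bits):
    """Remove the padding and the final 8 bits from a padded bit string.

    Hypothetical helper: `bits` is a str made of '0' and '1' characters.
    """
    if len(bits) < 8 or len(bits) % 8 != 0:
        raise ValueError("expected a non-empty multiple of 8 bits")
    # Read the last 8 bits as a decimal number n.
    n = int(bits[-8:], 2)
    if 7 + n > len(bits):
        raise ValueError("final 8 bits do not describe a valid padding")
    # Remove 7 + n bits from the end: the final 8 bits plus the padding.
    return bits[: len(bits) - (7 + n)]


# 48-bit string taken from the worked example in this paper.
padded = "10111000 00110001 01110010 11011100 11100011 00000010".replace(" ", "")
original = strip_padding(padded)
print(len(original))  # 39
```

Here the last 8 bits are 00000010, so n = 2 and 7 + 2 = 9 bits are removed, leaving the 39-bit string from before padding.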

RESULT AND DISCUSSION
For example, take the input string "characters". The first step is to sort the characters by frequency, from the largest to the smallest; see table 2.

The resulting bit string has 39 bits. Dividing 39 by 8 leaves a remainder of 7, which we call n. Because the remainder is 7, we add 7 - n + "1" = 7 - 7 + "1" = "1", that is, no zeros followed by a single "1", and call it L. The padding is therefore 1 bit added at the end of the bit string (the last bit below):

10111000 00110001 01110010 11011100 11100011

Then take the binary form of 9 - n = 9 - 7 = 2, which is 00000010, call it the final bit, and add it to the end of the bit string after the padding. The final bit string is shown below, with the final bit as the last 8 bits:

10111000 00110001 01110010 11011100 11100011 00000010

This completes the data compression with Levenstein. We then generate, from the bits, the new characters obtained after compression; see table 4.
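The padding arithmetic in this example can be checked with a short Python sketch. It assumes that the expression 7 - n + "1" means 7 - n zero bits followed by a single "1" bit, which is consistent with the example (n = 7 gives the single bit "1"). The function names `pad_bits` and `to_bytes` are hypothetical and not part of the paper.

```python
def pad_bits(bits):
    """Pad a bit string (str of '0'/'1') to a multiple of 8 bits.

    Assumption: the padding is (7 - n) zeros followed by one '1'.
    """
    n = len(bits) % 8
    if n == 0:
        # Remainder 0: only the final 8 bits 00000001 are added.
        return bits + "00000001"
    padding = "0" * (7 - n) + "1"      # L in the text
    final_bits = format(9 - n, "08b")  # binary form of 9 - n
    return bits + padding + final_bits


def to_bytes(padded):
    """Turn every group of 8 bits into one byte value."""
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


# The 39-bit string from the example above.
bits = "10111000 00110001 01110010 11011100 1110001".replace(" ", "")
padded = pad_bits(bits)
print(len(bits), len(padded))  # 39 48
print(to_bytes(padded).hex())  # b83172dce302
```

The 10 input characters become 6 byte values in this sketch; how those values are stored as characters in the database is not covered here.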

CONCLUSION
The conclusions can be summarized as follows:

3. Turn the resulting Levenstein code bit string into new data values. Before doing so, check the length of the bit string first. The steps for checking the length of the bit string are as follows.
a. If the remainder of the bit string length divided by 8 is 0, then add 00000001. Call this the final bit.
b. If the remainder of the bit string length divided by 8 is n (1, 2, 3, 4, 5, 6, 7), then add 7 - n + "1" at the end of the bit string, that is, 7 - n zeros followed by a "1". Call this L. Then add the binary form of 9 - n. Call this the final bit.

The steps of decompression using the Levenstein Code algorithm are as follows:

The IJICS | Rian Syahputra | http://ejurnal.stmik-budidarma.ac.id/index.php/ijics

2. Form a Levenstein code table. The data bits to be compressed are replaced with bits from the Levenstein code table. After replacing, count the number of bits in each value.
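Forming the code table and replacing the data can be sketched in Python as follows. Several things here are assumptions rather than statements from the paper: the codes follow the standard Levenstein coding of non-negative integers, the most frequent character receives the code for 0, and ties in frequency are broken alphabetically. The names `levenstein_code`, `build_table` and `encode` are hypothetical.

```python
from collections import Counter


def levenstein_code(n):
    """Standard Levenstein code of a non-negative integer, as a bit string."""
    if n == 0:
        return "0"
    code = ""
    count = 1
    while True:
        tail = bin(n)[3:]  # binary form of n without its leading 1
        code = tail + code
        if not tail:
            break
        count += 1
        n = len(tail)      # next, encode the number of bits just written
    return "1" * count + "0" + code


def build_table(text):
    """Map each character to a code, most frequent character first.

    Assumption: ties in frequency are broken alphabetically.
    """
    freq = Counter(text)
    ordered = sorted(freq, key=lambda ch: (-freq[ch], ch))
    return {ch: levenstein_code(i) for i, ch in enumerate(ordered)}


def encode(text, table):
    """Replace every character with its code from the table."""
    return "".join(table[ch] for ch in text)


table = build_table("characters")
bits = encode("characters", table)
print(table["c"], table["h"])  # 10 1110000
print(len(bits))               # 39
```

Under these assumptions the sketch produces 39 bits for "characters", the same bit count as in the worked example.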

Table 2. Sorting the input string

… 5% data left after the compression using Levenstein. From the table above, we can build the new table data in the database that will be used to compare codes during the decompression process.

Table 4. Generating new code

From the bit string above, we read the bits from left to right one by one and compare them with table 3. The first bit, index-0, is 1; compare it with table 3. It is not available, so join it with the next index: index-0 + index-1 = 10, then compare it with table 3. It is available, so replace the bits with the character found, "c". Next we read index-2, which is 1, and compare it with table 3. It is not available, so we continue with index-2 + index-3 = 11, which is also not available, then index-2 + index-3 + index-4 = 111, and so on, until index-2 + index-3 + index-4 + … + index-8 = 1110000, which is found in table 3 as "h". This continues until the whole bit string has been replaced with the initial characters.
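The bit-by-bit lookup described above can be sketched in Python as follows. The code table in the sketch is not copied from table 3: it is reconstructed by assuming standard Levenstein codes assigned to the characters of "characters" in order of frequency, with ties broken alphabetically. The function name `decode` is hypothetical.

```python
def decode(bits, table):
    """Rebuild the text by growing a buffer bit by bit until it matches a code."""
    out = []
    buffer = ""
    for bit in bits:
        buffer += bit            # join the next index to the bits read so far
        if buffer in table:      # available in the table: emit the character
            out.append(table[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not match any code: " + buffer)
    return "".join(out)


# Assumed code table (reconstructed, see the note above).
table = {
    "0": "a",
    "10": "c",
    "1100": "r",
    "1101": "e",
    "1110000": "h",
    "1110001": "s",
    "1110010": "t",
}

# The 39-bit string from the worked example, before padding.
bits = "10111000 00110001 01110010 11011100 1110001".replace(" ", "")
print(decode(bits, table))  # characters
```

Matching as soon as the buffer is found in the table works here because no code in the table is a prefix of another code.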