Keyword Security Implementation Based on Hill Cipher Optimized Using Genetic Algorithms

In the process of exchanging data and information, the most important task is to keep that data and information secure while ensuring it reaches the intended parties. One way to achieve this is encryption, a technique from the field of cryptography. Cryptography scrambles messages so that, even if intercepted, they cannot be immediately read. One example of an encryption algorithm is the Hill Cipher. The Hill Cipher uses an m-by-m matrix as the key for encryption and decryption, making it a challenging algorithm to crack. However, the key cannot be arbitrary: a key matrix with an unsuitable determinant cannot be used, because the encrypted message could then not be restored to its original form. A genetic algorithm can be used to overcome this obstacle by determining the keys used to encrypt and decrypt with the Hill Cipher. A key with the appropriate composition is obtained through the genetic algorithm's evaluation function. This research aims to enhance message security by generating Hill Cipher encryption and decryption keys with the correct composition. The results indicate that, out of 10 tests conducted with original texts of different lengths, eight succeeded and two failed to complete the encryption and decryption process.


I. INTRODUCTION
One of the most important parts of an information system is the availability of data and information. As a result, today's society competes to obtain as much data and information as possible, which has increased the flow of information exchange between communities. However, it cannot be denied that exchanging information carries many dangers. For example, with the rise of information theft and wiretapping, information is no longer confidential because it can fall into the hands of unauthorized parties. Because information exchange must be accompanied by efforts to maintain information security, one widely used approach is to encrypt or encode the information. This method is known as cryptography. According to [1], cryptography is the art or science of maintaining message security, whereas according to [2], cryptography is the science that studies methods related to information security. According to [3], cryptography is a method for maintaining the confidentiality of messages by converting text into encoded text with a special format whose contents are difficult to understand or even completely incomprehensible. Many cryptographic algorithms, including the Hill Cipher, can be applied to the encryption process. The Hill Cipher is a symmetric algorithm classified as a classical algorithm; it is a substitution cipher based on matrix multiplication [4]. The working principle of the Hill Cipher is to compare plaintext and ciphertext in both negative and positive directions [5]. Encryption is carried out by multiplying the key matrix by the plaintext matrix, and decryption by multiplying the inverse key matrix by the ciphertext [6]. However, the Hill Cipher still has several drawbacks. One of them is that the key matrix used for encryption and decryption cannot be arbitrary, because an unsuitable key leaves the text unable to be returned to its original form. Key mismatches in the Hill Cipher can be avoided by using genetic algorithms.
A genetic algorithm is defined as computer software used to simulate the evolutionary process, in which each population produces chromosomes randomly and allows these chromosomes to develop according to the laws of evolution, with the expectation of producing better chromosomes [7][8][9]. According to [10], a genetic algorithm is a computational algorithm inspired by evolutionary theory and adopted to search for values or solutions in optimization problems. Such optimization has been widely applied, especially in securing data or information.
Research on Genetic Algorithms, the Hill Cipher, and their combination has been carried out by several previous researchers. The study in [11] discusses how to solve the Sudoku game by implementing genetic algorithms. The study in [12] examines the use of the Hill Cipher for encryption and decryption with rectangular matrix keys. The research in [13] discusses text data security by applying the Hill Cipher to Telephone Codes and the Five Modulus Method. The work in [14], on Hill Cipher hybrid cryptography as a development of symmetric key cryptography, discusses how to secure student academic grades using the Hill Cipher hybrid algorithm. The publication [15] addresses optimization of population service affairs using genetic algorithms. The work in [16] discusses a genetic algorithm for predicting the time and cost of a construction project. The research in [17] discusses the application of genetic algorithms and the Hill Cipher for data encoding.

A. Data Collection Methods
Data collection methods are the methods used by researchers to obtain data that support their research activities. These methods are oriented towards data sources obtained through main or supporting methods [18]. The data supporting this research come from secondary sources in the form of books, journals, and proceedings that discuss the implementation of the Hill Cipher algorithm and genetic algorithms. Test data are taken from several important documents, on which the encryption and decryption process is attempted.

B. Hill Cipher Algorithm
In the Hill Cipher algorithm, encryption applies the key matrix to the plaintext, whereas decryption applies the inverse of that matrix to the ciphertext [19]. The plaintext and ciphertext symbol sets each contain 29 characters. Encryption operates on the plaintext block by block, and each block has the same size as the key matrix [20].
The Hill Cipher, a polyalphabetic cipher, can be categorized as a block cipher because the text to be processed is divided into blocks of a certain size. According to [21], the Hill Cipher algorithm uses an m-by-m matrix as the key for the encryption and decryption process. The matrix operations on which the Hill Cipher relies include matrix multiplication and matrix inversion [22].

C. Hill Cipher Encryption Technique
The Hill Cipher encryption process is shown in Figure 1. Encryption in the Hill Cipher is carried out on the plaintext block by block. Each block is the same size as the key matrix. Before the text is divided into blocks, the plaintext is first converted to numbers, for example A=0, B=1, up to Z=25, as illustrated in Figure 2. Mathematically, the encryption process in the Hill Cipher is given by Equation (1) [23], where C is the ciphertext (encoded text), K is the key, and P is the plaintext (original text).

$$C = K \cdot P \qquad (1)$$

To clarify the Hill Cipher encryption process, the following is an example of a case and its solution.
Plaintext: UDINUS
Converted to numbers: 20 3 8 13 20 18
The key, presented in matrix form, is

$$K = \begin{pmatrix} 4 & 3 \\ 3 & 3 \end{pmatrix}$$

Based on the converted numbers and the key matrix, the encryption steps are carried out for each block as follows:
UD: (4*20 + 3*3) mod 26 = 89 mod 26 = 11 = L; (3*20 + 3*3) mod 26 = 69 mod 26 = 17 = R
IN: (4*8 + 3*13) mod 26 = 71 mod 26 = 19 = T; (3*8 + 3*13) mod 26 = 63 mod 26 = 11 = L
US: (4*20 + 3*18) mod 26 = 134 mod 26 = 4 = E; (3*20 + 3*18) mod 26 = 114 mod 26 = 10 = K
From this process, the plaintext UDINUS is encrypted to LRTLEK.
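The block-by-block encryption described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 2-by-2 key is an assumption inferred from the determinant expression (4*3) - (3*3) that appears in the decryption example.

```python
# Illustrative sketch: Hill Cipher encryption with a 2x2 key over A=0..Z=25.
# KEY is an assumption inferred from the determinant expression in the
# decryption example; all function names are hypothetical.

KEY = [[4, 3],
       [3, 3]]
MODULUS = 26


def text_to_numbers(text):
    """Map uppercase letters to numbers: A=0, B=1, ..., Z=25."""
    return [ord(ch) - ord("A") for ch in text]


def numbers_to_text(numbers):
    """Map numbers 0..25 back to uppercase letters."""
    return "".join(chr(n + ord("A")) for n in numbers)


def hill_encrypt(plaintext, key=KEY, modulus=MODULUS):
    """Encrypt block by block using C = K * P, reduced modulo `modulus`."""
    size = len(key)
    numbers = text_to_numbers(plaintext)
    if len(numbers) % size != 0:
        raise ValueError("plaintext length must be a multiple of the key size")
    cipher = []
    for start in range(0, len(numbers), size):
        block = numbers[start:start + size]
        # Each key row times the block gives one ciphertext number.
        for row in key:
            cipher.append(sum(k * p for k, p in zip(row, block)) % modulus)
    return numbers_to_text(cipher)


print(hill_encrypt("UDINUS"))  # LRTLEK
```

With this assumed key, the sketch reproduces the ciphertext LRTLEK reported for the plaintext UDINUS.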

D. Hill Cipher Decryption Technique
The decryption technique of the Hill Cipher follows the same procedure as encryption, except that the key matrix must first be inverted. The flow of the decryption procedure is depicted in Figure 3. Mathematically, the decryption process of the Hill Cipher is given by Equation (2) [24].

$$P = K^{-1} \cdot C \qquad (2)$$

To make the application of decryption in the Hill Cipher clearer, the following is an example of a case and its solution.
Ciphertext: LRTLEK
Key:

$$K = \begin{pmatrix} 4 & 3 \\ 3 & 3 \end{pmatrix}$$

Calculating the determinant of the key yields det K = (4*3) - (3*3) = 3. From this value, the modular inverse $3^{-1} \bmod 26$ is found as follows:

$$3x \equiv 1 \pmod{26} \;\Rightarrow\; 3x = 1 + 26k \;\Rightarrow\; x = \frac{1 + 26k}{3}$$

Find the value of k for which x is an integer:
k = 0: x = (1 + 26*0)/3 = 1/3 (not an integer)
k = 1: x = (1 + 26*1)/3 = 9 (an integer)
From these results, the inverse of 3 mod 26 is 9, so the inverse matrix is obtained as

$$K^{-1} = 9 \begin{pmatrix} 3 & -3 \\ -3 & 4 \end{pmatrix} \bmod 26 = \begin{pmatrix} 1 & 25 \\ 25 & 10 \end{pmatrix}$$

The mod operation returns the remainder of dividing X by Y, where X is the result of the matrix arithmetic and Y is the number of characters in the Hill Cipher character conversion (26 in this example). The result of the decryption process is the plaintext UDINUS.
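The inverse search and the decryption step can be sketched as follows. The sketch mirrors the trial of k = 0, 1, ... used in the worked example; the function names are hypothetical, and the key matrix is assumed from the determinant expression (4*3) - (3*3) given in the text.

```python
# Illustrative sketch: inverting a 2x2 Hill Cipher key modulo 26 and decrypting.
# KEY is assumed from the determinant expression in the text; all function
# names are hypothetical.

MODULUS = 26
KEY = [[4, 3],
       [3, 3]]


def mod_inverse(value, modulus=MODULUS):
    """Find x with (value * x) % modulus == 1 by trying k = 0, 1, 2, ..."""
    value %= modulus
    for k in range(value):
        # x = (1 + modulus * k) / value must be an integer.
        if (1 + modulus * k) % value == 0:
            return (1 + modulus * k) // value
    raise ValueError("value has no inverse for this modulus")


def invert_key_2x2(key, modulus=MODULUS):
    """Return the inverse of a 2x2 key matrix modulo `modulus`."""
    (a, b), (c, d) = key
    det_inv = mod_inverse(a * d - b * c, modulus)
    adjugate = [[d, -b], [-c, a]]
    return [[(det_inv * entry) % modulus for entry in row] for row in adjugate]


def hill_decrypt(ciphertext, key=KEY, modulus=MODULUS):
    """Decrypt block by block using P = K^-1 * C, reduced modulo `modulus`."""
    if len(ciphertext) % 2 != 0:
        raise ValueError("ciphertext length must be a multiple of 2")
    inverse = invert_key_2x2(key, modulus)
    numbers = [ord(ch) - ord("A") for ch in ciphertext]
    plain = []
    for start in range(0, len(numbers), 2):
        block = numbers[start:start + 2]
        for row in inverse:
            plain.append(sum(k * c for k, c in zip(row, block)) % modulus)
    return "".join(chr(n + ord("A")) for n in plain)


print(mod_inverse(3))          # 9
print(invert_key_2x2(KEY))     # [[1, 25], [25, 10]]
print(hill_decrypt("LRTLEK"))  # UDINUS
```

A determinant that shares a factor with 26, such as 13, makes `mod_inverse` raise an error. This is the kind of key that cannot restore the original text.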

E. Genetic Algorithm
Genetic algorithms apply ideas derived from natural evolution to solve a problem [25]. The term "genetic algorithm" can also refer to a strategy for solving optimization problems based on natural selection, that is, a strategy that mirrors biological evolution over time [26]. Using an evaluation function within the genetic algorithm makes it possible to determine the key required for encryption and decryption in the Hill Cipher, while the Hill Cipher is the algorithm that applies that key in the encryption and decryption process. In general, the stages of solving a problem with the genetic algorithm are shown in Figure 4.

III. RESULT AND DISCUSSION
This study aims to analyze how the genetic algorithm optimization process determines the data security key with Hill Cipher.

A. Population Testing
In the Hill Cipher key matrix, there are nine chromosomes. Each gene has a value in the range 0 to 255, which is of type byte. This value is obtained randomly and placed in a cell of the key matrix.
The information above outlines how chromosomes are constructed, using the Hill Cipher key matrix as the starting point. The key, which is a three-by-three matrix, is transformed into a one-dimensional vector, so each chromosome has nine genes.
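The chromosome encoding described above can be sketched as follows. The row-major flattening order, the function names, and the use of a seeded random generator are assumptions made for illustration; the text states only that the three-by-three key becomes a nine-gene vector with byte-valued genes.

```python
import random

# Illustrative sketch of the chromosome encoding. Row-major flattening and all
# names are assumptions; the text only specifies nine genes in the range 0..255.

GENES_PER_CHROMOSOME = 9       # one gene per cell of the 3x3 key matrix
GENE_MIN, GENE_MAX = 0, 255    # byte range stated in the text


def matrix_to_chromosome(matrix):
    """Flatten a 3x3 key matrix into a one-dimensional list of nine genes."""
    return [value for row in matrix for value in row]


def chromosome_to_matrix(chromosome):
    """Rebuild the 3x3 key matrix from a nine-gene chromosome."""
    return [chromosome[i:i + 3] for i in range(0, GENES_PER_CHROMOSOME, 3)]


def random_chromosome(rng):
    """Create one chromosome with nine random byte-valued genes."""
    return [rng.randint(GENE_MIN, GENE_MAX) for _ in range(GENES_PER_CHROMOSOME)]


def initial_population(size, seed=None):
    """Create `size` random chromosomes (size * 9 random values in total)."""
    rng = random.Random(seed)
    return [random_chromosome(rng) for _ in range(size)]


population = initial_population(25, seed=0)
print(len(population), len(population[0]))  # 25 9
```

With a population of 25, this produces 25 * 9 = 225 random values.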

B. Fitness Value
The fitness value is determined immediately after the current genes are populated with random numbers. The fitness value is obtained by calculating the determinant of the key matrix. The main requirement is F = 1, which is reached when the determinant of the key matrix has the value D = 1. Equation (3) calculates the fitness value, where F is the fitness and D is the determinant.
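The determinant on which the fitness depends can be computed directly from a nine-gene chromosome. The sketch below does not reproduce Equation (3); it only evaluates D and checks the target condition D = 1. The row-major gene layout, the plain integer determinant (no modular reduction), and the function names are assumptions.

```python
# Illustrative sketch: determinant of the 3x3 key encoded by a chromosome.
# This does not implement Equation (3); it only computes D and tests the
# target condition D == 1. Gene layout and names are assumptions.


def determinant_3x3(chromosome):
    """Determinant of the 3x3 matrix stored row by row in nine genes."""
    a, b, c, d, e, f, g, h, i = chromosome
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def meets_target(chromosome):
    """True when D == 1, the condition the text associates with F = 1."""
    return determinant_3x3(chromosome) == 1


identity = [1, 0, 0, 0, 1, 0, 0, 0, 1]
print(determinant_3x3(identity), meets_target(identity))  # 1 True
```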

C. Testing
The testing procedure is conducted to find suitable values for the nine cells of the Hill Cipher key matrix. Initialization is the first stage, and its purpose is to establish the parameters that serve as initial settings:
Generation: 30
Population size: 25
Crossover level: 0.8
Mutation level: 0.5
As can be observed from the parameters above, the population has 25 members. Random value generation is repeated 25 times, once for each member of the population, and each repetition produces nine random values. The population formed according to the criteria in Table I is then used in several procedures, including the calculation of fitness values, probabilities, and cumulative probabilities. These calculations are needed to determine how close the population is to the expected solution. Their outcomes are presented in Table II. Once the cumulative probabilities have been calculated, the selection, crossover, and mutation processes are applied in sequence. Each of these processes changes the population, and together they form the final population. The population that results from the genetic algorithm process is shown in Table III.
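The probability, cumulative probability, selection, crossover, and mutation steps can be sketched as below. The text does not name the specific operators, so roulette-wheel selection, one-point crossover, and single-gene random-reset mutation are assumptions. Applying the mutation level once per chromosome is also an assumption, and all function names are hypothetical.

```python
import random
from itertools import accumulate

# Illustrative sketch of one set of GA operators. Roulette-wheel selection,
# one-point crossover, and per-chromosome random-reset mutation are assumed;
# the text gives only the crossover and mutation levels.

CROSSOVER_RATE = 0.8  # "Crossover level" from the text
MUTATION_RATE = 0.5   # "Mutation level" from the text


def cumulative_probabilities(fitness_values):
    """Return selection probabilities and their running totals.

    Assumes the fitness values are non-negative with a positive sum.
    """
    total = sum(fitness_values)
    probabilities = [f / total for f in fitness_values]
    return probabilities, list(accumulate(probabilities))


def roulette_select(population, cumulative, rng):
    """Pick the first chromosome whose cumulative probability covers r."""
    r = rng.random()
    for chromosome, threshold in zip(population, cumulative):
        if r <= threshold:
            return chromosome
    return population[-1]  # guard against floating-point rounding


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two parents with probability CROSSOVER_RATE."""
    if rng.random() >= CROSSOVER_RATE:
        return parent_a[:], parent_b[:]
    point = rng.randint(1, len(parent_a) - 1)
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(chromosome, rng):
    """Replace one random gene with a new byte value with probability MUTATION_RATE."""
    child = chromosome[:]
    if rng.random() < MUTATION_RATE:
        child[rng.randrange(len(child))] = rng.randint(0, 255)
    return child


probabilities, cumulative = cumulative_probabilities([1, 1, 2])
print(probabilities, cumulative)  # [0.25, 0.25, 0.5] [0.25, 0.5, 1.0]
```

In a full run, these operators would be applied repeatedly for the configured number of generations, re-evaluating fitness after each round.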

The decryption process is carried out per block based on the inverse matrix, starting with the block LR.

Fig. 4. The GA Algorithm Process Flow
DOI: http://dx.doi.org/10.25139/ijair.v5i2.6907
The number of chromosomes utilized equals 9. Using this value as the starting point, 225 values will be generated in this initial stage. Table I displays the generation results up to this point in the process.