Customer Loyalty Clustering Model Using K-Means Algorithm with LRIFMQ Parameters

Loyal customers are one of the factors that determine the development of a business. Businesses therefore need a strategy to keep customers loyal, and even to make previously less loyal customers more loyal. The strategy must be targeted according to customer segmentation. The purpose of this paper is to model clusters of customer loyalty to help businesses make the right marketing strategy decisions. Segmentation is done using the k-means algorithm with LRIFMQ (length, recency, interval, frequency, monetary, quantity) as parameters, and the CLV (customer lifetime value) of each cluster is calculated. Data were obtained from PT. XYZ (a company engaged in food processing) for one year (1 January 2019 to 31 December 2019), with 337,739 transactions and 26,683 customers. The AHP (analytical hierarchy process) method is used for LRIFMQ weighting because it includes a consistency index calculation. The silhouette coefficient is used to measure cluster quality and determine the optimal number of clusters. The best results are obtained with a silhouette coefficient of 0.632904 at 6 clusters.
Keywords: customer analysis; k-means; LRIFMQ; analytical hierarchy process; AHP; silhouette coefficient.


I. INTRODUCTION
One of the intangible assets owned by a company is a loyal customer [1]. Business competition is tight, so an owner must know the customers' needs well so that customers do not turn away. Companies need to carry out special strategies to establish good relations with their customers. The strategy must match the customers' needs, because the right marketing strategy can increase the company's profits [2].
One step that a company can take is to assign value to customers based on certain criteria that can benefit the company. This concept is known as CLV (customer lifetime value) [3]. CLV is an estimated value. Even so, it can be used to evaluate the future value of customers to the company, using data mining techniques to detect patterns and relationships in historical data [4]. One of the models that can measure CLV is RFM (recency, frequency, monetary) [5].
RFM, first introduced by Hughes, is the most common segmentation method used to identify customer value in a company based on three variables: recency, frequency, and monetary. Recency is the number of days from the customer's last transaction up to today. Frequency is the number of transactions made during the period, while monetary is the amount of money the customer has transacted [6].
Chang and Tsay extended the RFM model with a length variable, giving the LRFM model (length, recency, frequency, monetary) [7]. Length is the number of days between the customer's first and last transactions. In this paper, the author adds interval and quantity variables. Interval is the average number of days between a customer's transactions, while quantity is the sum of all items transacted. Adding these two variables to the previous model gives LRIFMQ (length, recency, interval, frequency, monetary, quantity). Each parameter has a weight that is determined using the AHP (analytical hierarchy process) method.
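The six parameter definitions above can be illustrated with a minimal sketch. The function name `lrifmq`, the tuple layout `(date, amount, quantity)`, and the sample history are hypothetical, and interval is assumed here to be the mean gap in days between consecutive transactions; the paper's exact computation may differ.

```python
from datetime import date

def lrifmq(transactions, today):
    """Compute LRIFMQ for ONE customer.

    transactions: list of (date, amount, quantity) tuples (hypothetical layout).
    today: reference date used for recency.
    """
    dates = sorted(t[0] for t in transactions)
    first, last = dates[0], dates[-1]
    # Gaps in days between consecutive transactions (assumed meaning of "interval").
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    return {
        "length": (last - first).days,            # first to last transaction
        "recency": (today - last).days,           # last transaction to reference date
        "interval": sum(gaps) / len(gaps) if gaps else 0.0,
        "frequency": len(transactions),           # number of transactions
        "monetary": sum(t[1] for t in transactions),
        "quantity": sum(t[2] for t in transactions),
    }

# Hypothetical purchase history for a single customer.
history = [
    (date(2019, 1, 10), 150.0, 3),
    (date(2019, 1, 20), 200.0, 5),
    (date(2019, 2, 9), 50.0, 1),
]
features = lrifmq(history, today=date(2019, 12, 31))
print(features)
```

A customer with a single transaction has no gap to average, which is why such customers are dropped before clustering; the sketch returns 0.0 in that case only to stay runnable.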
After assessing customers with LRIFMQ, the next step is to segment customers. This segmentation can be done with the help of data mining techniques known as clustering. Clustering in data mining is the process of forming segments or clusters by looking at the similarities between data based on the parameters given.
Many clustering methods exist, each with its advantages and disadvantages. Table I shows that the majority of previous research uses the k-means algorithm, which is widely used because its performance is quite effective and efficient [8]. Another quite popular algorithm is fuzzy c-means, which handles an element of fuzziness that is difficult to address with the k-means algorithm. Therefore, this paper uses both algorithms.
Research by A.J. Christy shows that segmentation using k-means requires a shorter time and far fewer iterations than fuzzy c-means [6]. Also, the average silhouette coefficient obtained with the k-means algorithm is better than with fuzzy c-means. The silhouette coefficient and the elbow method are used to determine the best number of clusters. In addition, the silhouette coefficient can be used to determine the quality of a clustering process. This paper aims to obtain a cluster model of customer loyalty that is assessed based on CLV, using the k-means algorithm and the silhouette coefficient method to determine the best number of clusters. The customer loyalty cluster model is expected to assist businesses in determining the best marketing strategy.
II. RESEARCH METHODOLOGY
The work in this paper proceeds through the following steps:

A. Data Collection
The data were obtained from PT. XYZ (a company engaged in food processing). The data are sales transactions consisting of 2,629,261 transactions and 64,239 customers. The attributes used are the transaction date, transaction value, and quantity.

B. Weighting Using Analytical Hierarchy Process (AHP)
The AHP method is one of the methods used to determine weights. In this case, the weights to be determined are those of LRIFMQ. The steps taken are as follows [9]:
• Prioritize using the pairwise comparison matrix with the pairwise comparison index shown in Table II [5].
• Normalize the pairwise comparison matrix.
• Calculate the weights as the average of each row of the normalized pairwise comparison matrix. This calculation produces WL, WR, WI, WF, WM, WQ, which can be written as a 6x1 matrix W. The weights are then multiplied by the initial pairwise comparison matrix (before normalization), which produces a 6x1 matrix X.
• Calculate the consistency index (CI) using Equation (2).
• Calculate the consistency ratio (CR) using Equation (3).

$$CR = \frac{CI}{RI} \qquad (3)$$
where RI for n = 6 is 1.24. A consistency ratio below 0.1 is required for the weights to be considered valid and usable [9].
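The AHP steps above can be sketched as follows. The pairwise comparison matrix is hypothetical (it is not the matrix used for PT. XYZ), the function name `ahp_weights` is illustrative, and the consistency index is assumed to follow the standard Saaty form CI = (λmax − n) / (n − 1), with λmax estimated as the mean of X_i / W_i; only RI = 1.24 for n = 6 is taken from the text.

```python
def ahp_weights(matrix, ri):
    """Return (weights, CI, CR) for a square pairwise comparison matrix."""
    n = len(matrix)
    # Normalize each column so that it sums to 1.
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    # Weights W: average of each row of the normalized matrix.
    weights = [sum(row) / n for row in normalized]
    # X: original (un-normalized) matrix multiplied by W.
    x = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    # Assumed standard estimate of the principal eigenvalue.
    lambda_max = sum(x[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return weights, ci, ci / ri

# Hypothetical judgments, rows and columns ordered L, R, I, F, M, Q.
matrix = [
    [1,   2,   1/4, 1/4, 1/3, 1/2],
    [1/2, 1,   1/5, 1/5, 1/4, 1/3],
    [4,   5,   1,   1,   1,   2],
    [4,   5,   1,   1,   2,   2],
    [3,   4,   1,   1/2, 1,   2],
    [2,   3,   1/2, 1/2, 1/2, 1],
]
weights, ci, cr = ahp_weights(matrix, ri=1.24)  # RI for n = 6, as stated in the text
print([round(w, 4) for w in weights], round(cr, 4))
```

A perfectly consistent matrix (every entry equal to the ratio of two underlying scores) gives CI = 0, so CR grows only as the judgments contradict each other.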

C. Data Preparation
The steps taken in this data processing stage are as follows:
• Remove duplicate data.
• Remove customers who made only one transaction, because the interval parameter requires at least two transactions.
• Normalize the data using the min-max method, with a range of 0-1. Equation (4) is used to calculate the min-max normalization.
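The normalization step can be sketched as below, assuming Equation (4) is the usual min-max form x' = (x − min) / (max − min) applied to each LRIFMQ column separately. The function names and sample rows are illustrative only.

```python
def min_max(values):
    """Scale a list of numbers to the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0 (a choice made here).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def normalize_columns(rows):
    """Apply min-max scaling to every column of a row-oriented table."""
    columns = [min_max(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

# Hypothetical (frequency, monetary) pairs for three customers.
rows = [[1, 100], [3, 300], [2, 200]]
scaled = normalize_columns(rows)
print(scaled)  # [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
```

Scaling every parameter to the same range keeps large-valued parameters such as monetary from dominating the Euclidean distance used in clustering.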

D. Clustering Process
The main clustering method used is k-means, while fuzzy c-means is used as a comparison. The steps of clustering using k-means are as follows [10]:
• Determine the number of clusters k.
• Randomly choose k centroid points.
• Calculate the distance from each data point to the centroids using the Euclidean distance formula, as in Equation (5).
• Update the centroid points based on the average value in each cluster.
• Repeat the distance and update steps until all centroid points have converged (no longer move).
• Run the k-means algorithm several times to search for the global optimum.
The steps of clustering with fuzzy c-means are as follows:
• Choose k center points randomly (the number k is predetermined).
• Calculate the fuzzy membership, based on Equation (6).
• Update the centroid v_j of each cluster, using Equation (7).
• Repeat until the stopping criterion of Equation (8) is met, where ε is a limit specified between 0 and 1.
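The fuzzy c-means steps can be sketched as follows. Since Equations (6) to (8) are not reproduced here, the sketch assumes the standard formulation: membership u_ij = 1 / Σ_l (d_ij / d_il)^(2/(m−1)), centers as means weighted by u_ij^m, and termination when no membership changes by more than ε. The fuzzifier m = 2, the default ε, and all names are assumptions.

```python
import math
import random

def fuzzy_c_means(points, k, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Return (memberships, centers); memberships[i][j] is point i's degree in cluster j."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)            # random initial centers
    dims = len(points[0])
    previous = None
    for _ in range(max_iter):
        # Membership update: closer centers receive a larger share of each point.
        memberships = []
        for p in points:
            d = [max(math.dist(p, c), 1e-12) for c in centers]  # avoid division by zero
            memberships.append([
                1.0 / sum((d[j] / d[l]) ** (2.0 / (m - 1.0)) for l in range(k))
                for j in range(k)
            ])
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(k):
            w = [u[j] ** m for u in memberships]
            centers.append(tuple(
                sum(wi * p[dim] for wi, p in zip(w, points)) / sum(w)
                for dim in range(dims)
            ))
        # Stop once no membership changed by more than eps.
        if previous is not None and max(
            abs(a - b)
            for new, old in zip(memberships, previous)
            for a, b in zip(new, old)
        ) < eps:
            break
        previous = memberships
    return memberships, centers

# Two hypothetical, well-separated groups of normalized points.
points = [(0.0, 0.1), (0.3, 1.0), (1.1, 0.2), (9.0, 4.0), (9.7, 5.2), (10.4, 4.1)]
memberships, centers = fuzzy_c_means(points, k=2)
hard_labels = [max(range(2), key=row.__getitem__) for row in memberships]
print(hard_labels)
```

Unlike k-means, every point keeps a degree of membership in every cluster; a hard label is obtained only at the end by taking the cluster with the largest membership.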

E. Determine the Optimal Cluster Number
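The optimal number of clusters is determined using the silhouette coefficient method and the elbow method. The silhouette coefficient is calculated with the following steps:
• Calculate the average distance between a data object and the other data objects in the same cluster, which gives the value a(o), based on Equation (9).
• Calculate the average distance between the data object and the data objects in each other cluster. From these per-cluster averages, take the smallest, which gives the value b(o), based on Equation (10).
• Combine the two values into the silhouette value of the object, s(o) = (b(o) - a(o)) / max(a(o), b(o)), and average s(o) over all data objects to obtain the silhouette coefficient.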
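The silhouette computation can be sketched as follows, using a(o) and b(o) as defined above and the standard combination s(o) = (b(o) − a(o)) / max(a(o), b(o)). Giving a single-member cluster a score of 0 is a common convention assumed here; the function name and the sample points are illustrative.

```python
import math

def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points; requires at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)      # assumed convention for singleton clusters
            continue
        # a(o): mean distance to the OTHER members of the same cluster
        # (the distance of p to itself is 0, so divide by len(own) - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b(o): smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items() if key != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_coefficient(points, [0, 0, 1, 1])   # natural grouping
bad = silhouette_coefficient(points, [0, 1, 0, 1])    # mixed grouping
print(round(good, 3), round(bad, 3))
```

The natural grouping scores close to 1, while the mixed grouping scores below 0, which is the behaviour the method relies on when comparing numbers of clusters.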
The elbow method determines the optimal number of clusters visually. As shown in Figure 1, there is a significant change at cluster point 3, so if the elbow method is used as the reference, the optimal number of clusters obtained is three. The plotted value is the sum of squared errors based on Equation (12).
$$SSE = \sum_{k=1}^{K} \sum_{x_i \in S_k} \lVert x_i - c_k \rVert^2 \qquad (12)$$
Where: K = number of clusters, x_i = i-th data point in cluster S_k, c_k = centroid of cluster k.
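The quantity plotted in an elbow graph can be sketched as below, assuming Equation (12) is the within-cluster sum of squared errors (each point's squared distance to its own cluster centroid). The function name `sse` and the hand-made labelings are illustrative; in practice the labels would come from the clustering run for each k.

```python
def sse(points, labels):
    """Sum of squared distances from each point to the centroid of its cluster."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(c) / len(members) for c in zip(*members)]
        total += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )
    return total

points = [(0, 0), (0, 2), (10, 0), (10, 2)]
# Hand-made labelings standing in for clustering results at k = 1, 2, 4.
curve = {
    1: sse(points, [0, 0, 0, 0]),
    2: sse(points, [0, 0, 1, 1]),
    4: sse(points, [0, 1, 2, 3]),
}
print(curve)  # {1: 104.0, 2: 4.0, 4: 0.0}
```

The value falls steeply from k = 1 to k = 2 and only slightly afterwards; that bend is the "elbow" read off the graph.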

G. Evaluation and Analysis of Results
Evaluate and analyze the results of all trials. If the results are not appropriate because of an error in a trial, the process can be repeated from the step where the error occurred. The best clustering is the trial result whose silhouette coefficient is closest to one. An analysis of the customer type in each cluster is also performed.

III. RESULT AND DISCUSSION
In the initial preparation stage, the AHP process is carried out to determine the weight of each trial parameter. Weighting with the pairwise comparison index is subjective because it depends on how the company views the importance of each LRIFMQ parameter. The weights obtained for WL, WR, WI, WF, WM, WQ are 0.0672, 0.0395, 0.2519, 0.2850, 0.2082, 0.1481, with a consistency ratio of 0.039. With a consistency ratio below 0.1, the weights are considered valid and can be used.
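One way these weights could enter a cluster-level CLV score is sketched below. The text does not state the CLV formula, so this is only an assumption: a weighted sum of a cluster's normalized LRIFMQ centroid, with R and I inverted so that smaller values count as more loyal. The names, the choice to invert R and I, and the centroid values are all hypothetical; only the six weights come from the text.

```python
# AHP weights reported above, in the order L, R, I, F, M, Q.
WEIGHTS = {"L": 0.0672, "R": 0.0395, "I": 0.2519, "F": 0.2850, "M": 0.2082, "Q": 0.1481}

# ASSUMPTION: recency and interval are treated as "lower is better".
LOWER_IS_BETTER = {"R", "I"}

def clv(centroid, weights=WEIGHTS):
    """Weighted sum of a normalized (0-1) LRIFMQ centroid (assumed CLV form)."""
    total = 0.0
    for name, weight in weights.items():
        value = centroid[name]
        if name in LOWER_IS_BETTER:
            value = 1.0 - value
        total += weight * value
    return total

def rank_clusters(centroids):
    """Cluster ids ordered from highest to lowest assumed CLV."""
    scores = {cid: clv(c) for cid, c in centroids.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical normalized centroids for two clusters.
centroids = {
    1: {"L": 0.9, "R": 0.1, "I": 0.2, "F": 0.8, "M": 0.7, "Q": 0.6},
    2: {"L": 0.3, "R": 0.7, "I": 0.6, "F": 0.2, "M": 0.1, "Q": 0.2},
}
print(rank_clusters(centroids))
```

Because the score is linear in the weights, changing the pairwise judgments can reorder clusters whose centroids differ mainly in the heavily weighted parameters, which is consistent with the rank changes reported after weighting.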

A. Trial Using K-Means Algorithm
The trial using the k-means algorithm was repeated 40 times. In each repetition, clustering is carried out from k = 2 to k = 9. The trials stop at k = 9 because the silhouette coefficient decreases as the number of clusters k increases.
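The trial protocol (repeated k-means runs for each k, keeping the run with the highest silhouette coefficient) can be sketched as follows. This is a minimal pure-Python illustration on toy data, not the implementation used for the reported results; all function names are hypothetical, and the example uses k = 2 to 4 on nine points rather than k = 2 to 9.

```python
import math
import random

def kmeans(points, k, rng, max_iter=100):
    """Plain k-means; returns one cluster label per point."""
    centroids = rng.sample(points, k)              # random initial centroids
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:                   # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                            # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette value; singleton clusters contribute 0 (assumed convention)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0                                # assumption: degenerate result scores worst
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def run_trials(points, k_values, repeats, seed=0):
    """For each k, keep the (score, labels) of the best of `repeats` runs."""
    rng = random.Random(seed)
    best = {}
    for k in k_values:
        for _ in range(repeats):
            labels = kmeans(points, k, rng)
            score = silhouette(points, labels)
            if k not in best or score > best[k][0]:
                best[k] = (score, labels)
    return best

# Toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
best = run_trials(points, k_values=range(2, 5), repeats=40)
best_k = max(best, key=lambda k: best[k][0])
print(best_k, round(best[best_k][0], 3))
```

Repeating the runs matters because a single random initialization can settle in a poor local optimum; keeping the best score per k makes the comparison across k less sensitive to that.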
The silhouette coefficient is used to measure the quality of the clustering process, as well as to determine the optimal number of clusters. The silhouette coefficient ranges from -1 to 1. The closer the value is to 1, the better the quality of the resulting clusters; the closer it is to -1, the worse the cluster quality [12].
Table V shows the trial results with the highest silhouette coefficient for each number of clusters from 40 iterations. The silhouette coefficient closest to 1 is at 2 clusters, iteration 1, with a value of 0.71616. The elbow method graph in Figure 2, by contrast, drops significantly and then bends at 3 clusters. What is unique in this graph is the increase at 6 clusters.
Table VI and Table VII show the number of members in each cluster, along with the CLV value, with and without AHP. For 2 clusters, cluster 1 gets the first rank by CLV value. This shows that there are more potentially loyal customers than potentially non-loyal customers. The AHP weights do not change the cluster ranking. For 3 clusters, the first-ranked cluster has more members than the clusters ranked after it. This means that PT. XYZ tends to have more potentially loyal customers than non-loyal ones.
A comparison of LRIFMQ values for each cluster without AHP weights is shown in Figure 3. The first cluster has higher L, I, F, M, and Q values and a lower R value than the second cluster. A high value of I is a minus point for loyalty, but overall the loyalty of the first cluster is higher. A comparison of LRIFMQ values using AHP weights is shown in Figure 4.
Compared with the unweighted LRIFMQ values, the value of R drops considerably, while the values of I and F rise quite significantly. Even so, for 2 clusters, the AHP weighting does not change the cluster loyalty ranking.
Based on Table V, the silhouette coefficient for 3, 4, and 5 clusters decreases as the number of clusters increases. However, at 6 clusters the silhouette coefficient increases to 0.632904, higher than at 4 clusters. Details of the clustering results for 6 clusters are shown in Table VIII. For 6 clusters, the first rank is the sixth cluster, with or without AHP weights. This clustering can also separate out the two most loyal customers, whose CLV value is far above the CLV of the other clusters. Furthermore, ranks 2 and 3 are held by the same clusters with or without AHP weights. However, other ranks change after weighting: the fifth cluster moves from rank 4 to rank 6, the second cluster moves from rank 5 to rank 4, and the third cluster moves from rank 6 to rank 5.
A comparison of LRIFMQ values without AHP weights is shown in Figure 5. It is clear that cluster 6 has a larger L, F, M, Q, and R, and I value compared to other clusters. For the last rank, the third cluster has an R value that is quite large compared to other clusters. The LRIFMQ comparison after AHP weighting is shown in Figure 6. The third cluster, which previously had a high R value, now has a lower one, which raises this cluster to rank 5. Meanwhile, the fifth cluster, which already had a high I value, has an even higher one, which drops this cluster to the last rank.

B. Trial Using Fuzzy C-Means Algorithm
The trial using the fuzzy c-means algorithm follows the same treatment as the k-means experiments: 40 iterations for each k from k = 2 to k = 9. Figure 7 shows the LRIFMQ comparison without AHP weights. The second cluster has higher L, I, F, M, and Q values and a lower R value than the first cluster.
Table IX shows the trial results with the highest silhouette coefficient for each number of clusters from 40 iterations. The silhouette coefficient closest to 1 is at 2 clusters, iteration 1, with a value of 0.716398.
Table X shows the number of members in each cluster, along with the CLV value, with and without AHP. The second cluster gets the first rank by CLV value. This shows that there are more potentially loyal customers than potentially non-loyal customers, as seen from the 14171 members of the second cluster. This is more than with the k-means algorithm, where the first-ranked cluster has 13852 members.
Table IX shows that the greater the number of clusters, the smaller the silhouette coefficient. At 6 clusters, however, the silhouette coefficient is 0.581685, higher than at 5 clusters. This value is obtained at the 24th iteration. The detailed trial results for 6 clusters, 24th iteration, are shown in Table XI. Without AHP weights, the first-ranked cluster is the third cluster; with AHP weights, the first rank changes to the fifth cluster. In the fuzzy c-means trials, the highest silhouette coefficient for this number of clusters is still lower than with k-means. Also, the cluster membership in the fuzzy c-means results is not divided as sharply as in the k-means results.
A comparison of LRIFMQ values without using AHP is shown in Figure 8. The third cluster has significantly higher L and F values than the other clusters. In addition, the third cluster also has R, and I value compared to other clusters. The last rank is the first cluster because its L value is on average lower and its R value is significantly higher than those of the other clusters. After weighting using AHP, the fifth cluster, previously ranked 3rd, moves to 1st place. This is because the value of I increased significantly. A comparison of LRIFMQ values after weighting can be seen in Figure 9.
[Figure: Comparison of LRIFMQ Fuzzy C-Means (k = 6)]
[Figure: Comparison of LRIFMQ AHP Fuzzy C-Means (k = 6)]
IV. CONCLUSION
Two trials were conducted in this study, using the k-means and fuzzy c-means algorithms. With 6 clusters, the k-means trial can separate out a group of very loyal customers with two members, while with fuzzy c-means for the same number of clusters, the most loyal cluster has 1421 members. If the business wants a more specific grouping, the k-means results will be quite helpful. In this case, even though the highest silhouette coefficient is at 2 clusters, this value cannot be taken as a benchmark that 2 clusters is the clustering to use in business decision making. A silhouette coefficient value or elbow method graph that increases between decreases can also be taken into consideration. Meanwhile, the use of AHP weights can also change the CLV ranking. Therefore, AHP weighting is important and should be adjusted to the priorities of each business.
Suggestions for further research are to increase the number of clustering iterations. This is needed because, in the research done here, the highest silhouette coefficient value could only be obtained after dozens of iterations. In addition, adding variables other than LRIFMQ should be considered. Examples of such variables are age, region, economic level, and others. These variables can further help businesses obtain more specific customer clusters.
V. REFERENCES