The Level of Student Satisfaction with the Online Learning Process During a Pandemic Using the K-means Algorithm

Abstract: The number of COVID-19 cases during the pandemic has continued to rise and become harder to control. This prompted the Indonesian government to require schools to adopt online learning. Online learning does not run smoothly for everyone, because several factors hinder the learning process: poor internet connectivity, limited data quotas, unfamiliarity with learning media, and an unsupportive environment all make online learning less effective. The purpose of this study is to determine the level of student satisfaction with online learning during the pandemic. The study uses quantitative research methods with a descriptive approach, and the data are processed with the K-Means clustering algorithm to group students by level of satisfaction. The dataset was obtained from a questionnaire containing statements of satisfaction and dissatisfaction. The clusters correspond to high, medium, and low levels. In the final iteration, the satisfied statements fall in the high category with a value of 11.79, while the dissatisfied statements fall in the moderate category with a value of 7.46. The low category has no members, with a value of 0.00. The cluster results indicate that students are satisfied with online learning, with a value of 9.33.


I. INTRODUCTION
The coronavirus disease (COVID-19) is spreading around the world. The virus first appeared in Wuhan, China, at the end of 2019 [1]. It has spread so quickly that cases are rising worldwide and are difficult to control [2]. On March 2, 2020, the Indonesian government announced two positive COVID-19 patients, and Indonesia was designated as one of the countries exposed to COVID-19 [3]. The government also established a social distancing policy to break the chain of COVID-19 transmission by maintaining distance and limiting human interaction. The policy is implemented by encouraging people to stay at home, to do all activities from home, and to carry out educational activities online [4].
Since March 16, 2020, Paramadina University has implemented social distancing and work from home (WFH) policies for students, lecturers, and staff. Under this policy, face-to-face learning is replaced with online learning, which connects students and teachers who are far apart and lets them communicate and interact with the support of technology and the internet. Online learning uses digital platforms such as Zoom, Google Meet, e-learning, WhatsApp, and others, both to deliver lessons and as a means of communication between teachers and students [5]. This situation also creates obstacles for teachers and students, such as slow internet connections that cause problems when submitting assignments and disconnections while lectures are in progress [6].
During the COVID-19 pandemic, students attend online lectures, which have both positive and negative impacts. On the positive side, online lectures foster independence in learning. Students are required to be more active in exploring the material the lecturer has given, which makes the material easier for them to understand [7]. Online lectures also offer flexible timing, because virtual classes can be accessed anytime and anywhere [8]. On the negative side, excessive use of gadgets and computers can lead to serious health problems [7]. In addition, many students complain that assignments have increased during online lectures, resulting in fatigue and stress [9]. The problem addressed in this paper is to determine the level of student satisfaction with online learning using the K-Means algorithm. This algorithm groups data into classes based on their similarity [10]. K-Means was chosen because it is regarded as a very reliable method [11].
Previous research has used data mining to create clusters in dataset analysis, including the Fuzzy C-Means algorithm [12]-[14], the Multifactor Evaluation Process (MFEP) [15], a comparison of the single linkage, complete linkage, and average linkage clustering methods [16], [14], and the K-Means algorithm [14], [17], also in combination with particle swarm [18]. However, none of these studies grouped students to determine their level of satisfaction with online learning. Policymakers can use the results of this study to evaluate online learning during the pandemic.

II. RESEARCH METHODOLOGY
This study uses quantitative research methods with a descriptive approach. The quantitative data are processed using data mining. Data mining is a series of processes for obtaining information from data sets. One of the techniques used in data mining is the K-Means Clustering Algorithm. A descriptive approach is a research approach that solves a problem by reaching decisions based on data [19].

A. Online Learning
The online learning model can also be referred to as a blended learning (BL) model. Blended learning is a learning approach that combines the advantages of offline learning and e-learning [20]. Online learning is a learning method carried out using digital technology such as smartphones, computers, applications, or websites that rely on internet networks. Digital platforms often used for online learning include WhatsApp, Google Meet, Zoom, e-learning, Google Classroom, video conferencing, live chat, and various others [5]. Online learning therefore requires the use of technology and an internet connection [21].
Online learning does not run smoothly for everyone, because not all students have supporting facilities such as laptops, an adequate network connection, the financial means to buy data quotas, and a supportive environment for online learning. These constraints make online learning ineffective [22]. In addition, online learning means spending more time in front of a smartphone or computer screen. Using these devices for very long periods can cause physical health problems. Mustakim's survey found that students face many physical complaints during online learning, such as headaches, eye fatigue, difficulty resting, and frequent drowsiness [23].
Learning also becomes less effective because of slow networks, lack of facilities, an inadequate environment, and lack of understanding of the technology, and because students find the material easier to understand face to face than online [5]. The various academic demands that students must meet cause academic stress [24]. Academic stress is the pressure that arises in students due to competition and academic demands. It is caused by academic stressors, which stem from the learning process, such as pressure to get good grades, increased assignments, length of study, low grades or performance, and fear of taking exams [25]. Students who are unable to adapt to these conditions experience stress [23].

B. Datasets
The subject of this study is a dataset of students from Paramadina University's Faculty of Engineering in the 2019 and 2020 batches, with a total of about 200 students. Only a random sample of 10% of these students was used in this study. Data were collected through questionnaires, observation, and literature study. The data-collection period ran from April through July 2021. The variables in this study are statements of extreme satisfaction and dissatisfaction, which serve as the measure of satisfaction across the categories of questions in the questionnaire.
The data collected from the questionnaires of the 20 randomly sampled respondents are preprocessed first and then clustered. The clustering step groups student satisfaction with the online learning process into three clusters: a high satisfaction cluster, a medium satisfaction cluster, and a low satisfaction cluster. The data are processed in a spreadsheet (Excel) that is linked automatically to the questionnaire data. Figure 1 shows the clustering flowchart, that is, the steps for dividing a collection of data into several classes based on criteria. Euclidean distance is used during this grouping step.
Data mining is a collection of procedures for obtaining information from data sets, and the K-Means Clustering Algorithm is one of its techniques. It applies the clustering principle, which groups objects into classes based on their similarity [10]. Overall, the K-Means Clustering Algorithm is a non-hierarchical clustering approach that divides sample data into one or more groups based on similar features [27]. The K-Means algorithm classifies data by determining the center of each cluster and then finding the members of each cluster [28].
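The sampling step described in this subsection can be sketched as follows. This is a minimal illustration only, assuming a population of exactly 200 students (the text says "about 200"), hypothetical student IDs 1 to 200, and a fixed seed for reproducibility; none of these details come from the study.

```python
import random

POPULATION_SIZE = 200   # assumption: "about 200 students" taken as exactly 200
SAMPLE_FRACTION = 0.10  # 10% of the population, as stated in the text


def draw_sample(population_size, fraction, seed=0):
    """Draw a simple random sample of student IDs without replacement."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    student_ids = range(1, population_size + 1)  # hypothetical ID scheme
    sample_size = round(population_size * fraction)
    return sorted(rng.sample(student_ids, sample_size))


respondents = draw_sample(POPULATION_SIZE, SAMPLE_FRACTION)
print(len(respondents))
```

Under these assumptions the sample contains 20 distinct respondents, which matches the sample size reported for the clustering step.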

C. K-Means Clustering Algorithm
The K-Means calculation begins by identifying the data to be grouped, where n is the number of data group members and m is the number of variables. Because K-Means is in principle an unsupervised method, the center of each cluster is chosen freely at the start. The distance between each data point and each cluster center is then computed so that every data point can be assigned to a cluster. Euclidean distance is used to measure the distance between a data point and a cluster center; it expresses how similar the data point is to other data and is given by Equation (1).

$$d_e = \sqrt{(x_i - s_i)^2 + (y_i - t_i)^2} \tag{1}$$

where (x, y) are the object coordinates, (s, t) are the centroid coordinates, and i is the index of the object. After determining the data distances, the next step is to determine the center of each cluster by averaging its members using Equation (2).

$$C_{ij} = \frac{1}{N_i} \sum_{k=1}^{N_i} X_{kj} \tag{2}$$

where $C_{ij}$ is the average (centroid) of cluster i for variable j, $N_i$ is the number of members of cluster i, i is the cluster index, k is the index of a member within the cluster, j is the variable index, and $X_{kj}$ is the value of the k-th data item for variable j in the cluster.
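Equations (1) and (2) can be illustrated with a short Python sketch. The function names `euclidean_distance` and `cluster_mean` are hypothetical, and the example members passed to `cluster_mean` are made up for illustration rather than taken from the study's dataset.

```python
import math


def euclidean_distance(point, centroid):
    """Equation (1): distance between an object (x, y) and a centroid (s, t)."""
    x, y = point
    s, t = centroid
    return math.sqrt((x - s) ** 2 + (y - t) ** 2)


def cluster_mean(members):
    """Equation (2): per-variable average of the members of one cluster."""
    n = len(members)
    if n == 0:
        raise ValueError("cannot average an empty cluster")
    return tuple(sum(m[j] for m in members) / n for j in range(len(members[0])))


# Illustrative values only: a point (9, 1) against centroids (10, 0) and (5, 5).
d1 = euclidean_distance((9, 1), (10, 0))
d2 = euclidean_distance((9, 1), (5, 5))
print(round(d1, 3), round(d2, 3))

# Hypothetical cluster members, not taken from the study's dataset.
print(cluster_mean([(9, 1), (10, 0), (8, 2)]))
```

In this illustration the point (9, 1) is closer to (10, 0) than to (5, 5), so it would be assigned to the cluster whose center is (10, 0).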
The clustering process is complete when no cluster member moves to another cluster. If members are still moving, the calculation is repeated from the first step: the data distances are recalculated, and then the cluster averages are recalculated to determine the new cluster centers. This procedure is repeated until no cluster member changes cluster [17].
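The stopping rule described above can be sketched as a loop that alternates the assignment and update steps until the assignments stop changing. This is a minimal sketch, not the spreadsheet procedure used in the study: the function name `k_means`, the `max_iter` safeguard, the choice to leave an empty cluster's centroid unchanged, and the six data points and third centroid (0, 10) are all assumptions made for illustration.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Repeat assignment and centroid update until no member changes cluster."""
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_assignment = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no member moved, so the grouping is final
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids


# Hypothetical (satisfied, dissatisfied) scores, not the study's dataset.
points = [(9, 1), (10, 0), (8, 2), (5, 5), (4, 6), (6, 4)]
labels, centers = k_means(points, [(10, 0), (5, 5), (0, 10)])
print(labels)
print(centers)
```

With these illustrative inputs the loop stops on the second pass because no point changes cluster, and the third cluster ends up with no members.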

III. RESULT AND DISCUSSION
The results are based on a dataset built from a questionnaire filled out by students of Paramadina University's Faculty of Engineering. The dataset contains 20 samples, one per respondent, which are used to establish the degree of student satisfaction. The number of clusters is 3, and the number of statements or characteristics is 2, with the categories Satisfied and Dissatisfied.
In the first iteration, the starting centroids are determined at random from the data. The data used are the datasets in Table I, with C1 = (10,0), C2 = (5,5), and C3 = (10,0). (1,9). Based on the results of the equation computation, the nearest centroid was calculated for the sample data (9,1). The data were then grouped, with the findings shown in Table II, based on Table I above.

TABLE II
ITERATION DATA GROUPING
Cluster | Sample Data
C1 | IV, V, VI, VII, VIII, XIII, XV, XVII, XIX
C2 | I, II, III, IX, X, XI, XII, XIV, XVI, XVIII, XX

The values did not change after the first iteration, and the third iteration gave the same values. The iteration process therefore stopped, with a final result of three clusters. The study of student satisfaction with online learning during the COVID-19 pandemic produced the graph in Figure 2.

IV. CONCLUSION
The K-Means algorithm can be used to form clusters for evaluating the level of satisfaction with the online learning process, based on statements of satisfaction and dissatisfaction with online learning. The data were grouped into three clusters. In the first cluster, the satisfied category was rated "high" with a value of 9.33, while the dissatisfied category was rated "low" with a value of 0.67. In the second cluster, the satisfied category was rated "low" with a value of 4.73, while the dissatisfied category was rated "high" with a value of 5.27. In the third cluster, neither the satisfied nor the dissatisfied category has a value. These findings suggest that students at Paramadina University's Faculty of Engineering are satisfied with online learning.