Analytical Comparison of Lung Cancer Classification Using K-Nearest Neighbor and Naïve Bayes Algorithms

  • Yunisa Darmayanti Informatics Department, Universitas Widyagama, Malang
  • Fitri Marisa Informatics Department, Universitas Widyagama, Malang
  • Aviv Yuniar Rahman Informatics Department, Universitas Widyagama, Malang
Keywords: Lung Cancer Classification, K-Nearest Neighbor, Naïve Bayes Classifier


Lung cancer is a major cause of death worldwide, accounting for 25% of all cancer-related deaths in 2021. About a quarter of diagnosed cases show no early symptoms, which makes timely detection difficult. Unlike some other cancers, lung cancer cannot be seen with the naked eye, and its symptoms are often mistaken for those of other conditions such as bronchitis, asthma, or a persistent cough. Early diagnosis is pivotal for effective treatment and higher survival rates. This study therefore investigates the prediction of lung cancer using data mining, the process of searching large data repositories for patterns and trends that yield useful insights. Within data mining, classification is a fundamental task that assigns objects to categories based on their distinguishing characteristics. To address the difficulty of lung cancer classification, the study compares the K-Nearest Neighbor (KNN) and Naïve Bayes Classifier (NBC) algorithms on a dataset of 1,000 instances and 24 attributes, with the aim of determining which algorithm is preferable for this task. The KNN algorithm achieves an accuracy of 98.34%, compared with 89.37% for the NBC algorithm. The study concludes that, for lung cancer classification on this dataset, KNN outperforms Naïve Bayes. These findings may help improve early lung cancer detection, and with it patient survival, through better classification methods.
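The comparison described in the abstract can be illustrated with a minimal sketch. It does not reproduce the study: the abstract does not state the value of k, the preprocessing, the train/test protocol, or the Naïve Bayes variant, so the Gaussian variant, k = 5, the 80/20 split, and the synthetic three-class data below are all assumptions chosen for illustration. The function names (`make_synthetic`, `knn_predict`, `gnb_fit`, `gnb_predict`, `compare`) are hypothetical, and the accuracies this sketch produces are unrelated to the 98.34% and 89.37% reported above.

```python
import math
import random
from collections import Counter


def make_synthetic(n=300, n_features=4, seed=0):
    """Hypothetical stand-in data (NOT the study's dataset):
    three classes whose features are Gaussians with shifted means."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 3
        X.append([rng.gauss(label * 2.0, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def knn_predict(X_train, y_train, x, k=5):
    """Majority vote among the k training points closest to x (Euclidean)."""
    pairs = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))
    return Counter(label for _, label in pairs[:k]).most_common(1)[0][0]


def gnb_fit(X_train, y_train):
    """Gaussian Naive Bayes (assumed variant): class prior plus
    per-feature mean and variance for each class."""
    model = {}
    for c in set(y_train):
        rows = [x for x, label in zip(X_train, y_train) if label == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(col) for col in cols]
        # Small constant avoids division by zero for constant features.
        variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                     for col, m in zip(cols, means)]
        model[c] = (len(rows) / len(y_train), means, variances)
    return model


def gnb_predict(model, x):
    """Pick the class with the highest log prior + summed log likelihoods."""
    def log_posterior(c):
        prior, means, variances = model[c]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_lik
    return max(model, key=log_posterior)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare(k=5, test_fraction=0.2, seed=0):
    """Train both classifiers on the same split; return (knn_acc, nb_acc)."""
    X, y = make_synthetic(seed=seed)
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]

    model = gnb_fit(X_train, y_train)
    knn_pred = [knn_predict(X_train, y_train, x, k) for x in X_test]
    nb_pred = [gnb_predict(model, x) for x in X_test]
    return accuracy(y_test, knn_pred), accuracy(y_test, nb_pred)


if __name__ == "__main__":
    knn_acc, nb_acc = compare()
    print(f"KNN accuracy: {knn_acc:.4f}")
    print(f"Naive Bayes accuracy: {nb_acc:.4f}")
```

Evaluating both classifiers on the same held-out split is what makes the two accuracy figures comparable. On real data with categorical or ordinal attributes, a different Naïve Bayes variant and feature scaling for KNN might be more appropriate.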


