Comparison of Hyperparameter Tuning Methods for Optimizing K-Nearest Neighbor Performance in Predicting Hypertension Risk

Dimas Trianda, Dedy Hartama, Solikhun Solikhun

Abstract


Hypertension is a major cause of cardiovascular disease, making early risk prediction essential. According to WHO, hypertension cases are estimated to reach 1.28 billion by 2023. This study aims to optimize the K-Nearest Neighbor (KNN) algorithm for predicting hypertension risk through hyperparameter tuning. Three methods, GridSearchCV, BayesSearchCV, and RandomizedSearchCV, are compared to determine the best parameter configuration. The dataset, obtained from Kaggle, consists of 520 balanced samples (260 positive and 260 negative) with 18 health-related features, including age, gender, blood pressure, cholesterol, and glucose. After preprocessing, the KNN model is tuned with each method by testing combinations of the number of neighbors (k), weight type, and distance metric. BayesSearchCV achieved the highest accuracy, 92%, compared with 85% for the baseline KNN model. The ROC AUC score of 0.96191 also indicates excellent classification performance. In conclusion, BayesSearchCV significantly improves KNN's predictive ability in hypertension risk classification.
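
The tuning procedure described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the data are synthetic (not the Kaggle dataset), the search space values are assumptions, and the helper names (`make_data`, `cv_accuracy`, `grid_search`, `random_search`) are hypothetical stand-ins for what library classes such as GridSearchCV do internally. Bayesian search is omitted because it needs a surrogate model that does not fit in a short example.

```python
import itertools
import math
import random

def make_data(n=120, seed=0):
    """Synthetic two-class data; an assumption, NOT the study's dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2                      # alternate labels: balanced classes
        centre = 1.5 * label
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)
    return X, y

def distance(a, b, metric):
    if metric == "manhattan":
        return sum(abs(p - q) for p, q in zip(a, b))
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))  # euclidean

def knn_predict(X_train, y_train, x, k, weights, metric):
    """Classify x by a (possibly distance-weighted) vote of its k nearest neighbors."""
    nearest = sorted((distance(x, xt, metric), yt)
                     for xt, yt in zip(X_train, y_train))[:k]
    votes = {}
    for d, label in nearest:
        w = 1.0 if weights == "uniform" else 1.0 / (d + 1e-9)
        votes[label] = votes.get(label, 0.0) + w
    return max(sorted(votes), key=votes.get)

def cv_accuracy(X, y, params, folds=5):
    """Mean accuracy over `folds` interleaved cross-validation folds."""
    n, correct = len(X), 0
    for f in range(folds):
        test_idx = set(range(f, n, folds))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            correct += knn_predict(X_tr, y_tr, X[i], **params) == y[i]
    return correct / n

# Assumed search space over the three hyperparameters named in the abstract.
SPACE = {
    "k": [1, 3, 5, 7, 9, 11],
    "weights": ["uniform", "distance"],
    "metric": ["euclidean", "manhattan"],
}

def all_configs(space):
    keys = list(space)
    return [dict(zip(keys, combo)) for combo in itertools.product(*space.values())]

def grid_search(X, y, space):
    """Exhaustive search: score every configuration in the grid."""
    return max(((cv_accuracy(X, y, p), p) for p in all_configs(space)),
               key=lambda t: t[0])

def random_search(X, y, space, n_iter=8, seed=0):
    """Random search: score only n_iter configurations sampled from the grid."""
    sampled = random.Random(seed).sample(all_configs(space), n_iter)
    return max(((cv_accuracy(X, y, p), p) for p in sampled), key=lambda t: t[0])

X, y = make_data()
baseline = cv_accuracy(X, y, {"k": 5, "weights": "uniform", "metric": "euclidean"})
grid_score, grid_params = grid_search(X, y, SPACE)
rand_score, rand_params = random_search(X, y, SPACE)
print("baseline accuracy:", baseline)
print("grid search:      ", grid_score, grid_params)
print("random search:    ", rand_score, rand_params)
```

Because the baseline configuration is part of the grid, the grid search score can never fall below the baseline on the same folds, and random search can at best match grid search while evaluating fewer configurations. The scores printed here say nothing about the accuracy figures reported in the study.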


Keywords


Optimization; Hypertension; Machine Learning; KNN; GridSearchCV



DOI: https://doi.org/10.15408/jti.v18i1.42260


Copyright (c) 2025 Dimas Trianda, Dedy Hartama, Solikhun Solikhun

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Creative Commons Licence
Jurnal Teknik Informatika by Prodi Teknik Informatika Universitas Islam Negeri Syarif Hidayatullah Jakarta is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at http://journal.uinjkt.ac.id/index.php/ti.
