A data enhancement approach to improve machine learning performance for predicting health status using remote healthcare data

Shaira Tabassum, Masuda Begum Sampa, Rafiqul Islam, Fumihiko Yokota, Naoki Nakashima, Ashir Ahmed

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Machine learning (ML) is becoming increasingly important for improving the performance of remote healthcare systems. Portable Health Clinic (PHC), a remote healthcare system, contains a triage function that classifies patients into two major groups: (a) healthy and (b) unhealthy. Unhealthy patients require regular health checkups. This paper aims to predict the health status of registered patients in order to decide the follow-up date and frequency. Health management cost can be reduced by decreasing the number of follow-ups. We carried out an experiment on 271 corporate members, monitoring their health status every three months and collecting four phases of data. The data records contain clinical, socio-demographic, and dietary behavior data. However, most machine learning algorithms cannot work directly with categorical data. Several encoding techniques are available, which can also enhance prediction performance. In this paper, we applied three encoding techniques and proposed a new encoding approach to handle categorical variables. The results show that the Random Forest classifier performs best, with 95.33% accuracy. A comparison chart displaying the performance of eight different supervised learning algorithms across the three existing encoding mechanisms is reported.
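
The abstract does not name the three existing encoding techniques or describe the proposed one, so the following is only a minimal sketch of two widely used encodings (integer/ordinal and one-hot), chosen here as an assumption for illustration. The records, field names, and helper functions are hypothetical and are not taken from the PHC dataset or the paper.

```python
# Hypothetical toy records: field names and values are illustrative only,
# not drawn from the PHC dataset described in the abstract.
records = [
    {"smoking": "never", "diet": "balanced"},
    {"smoking": "daily", "diet": "high_salt"},
    {"smoking": "occasional", "diet": "balanced"},
]


def ordinal_encode(values):
    """Map each distinct category to an integer, using sorted category order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Map each value to a 0/1 indicator vector with one slot per category."""
    categories = sorted(set(values))
    vectors = [[1 if v == cat else 0 for cat in categories] for v in values]
    return vectors, categories


# Encode a single categorical column both ways.
smoking = [r["smoking"] for r in records]
smoking_codes, smoking_mapping = ordinal_encode(smoking)
smoking_onehot, smoking_categories = one_hot_encode(smoking)

print(smoking_codes)
print(smoking_onehot)
```

Integer encoding keeps one column per variable but imposes an artificial order on the categories, while one-hot encoding avoids that order at the cost of one column per category. Which trade-off suits a given classifier is the kind of question the comparison in the paper addresses.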

Original language: English
Title of host publication: 2020 2nd International Conference on Advanced Information and Communication Technology, ICAICT 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 308-312
Number of pages: 5
ISBN (Electronic): 9780738123226
Publication status: Published - Nov 28 2020
Event: 2nd International Conference on Advanced Information and Communication Technology, ICAICT 2020 - Dhaka, Bangladesh
Duration: Nov 28 2020 → Nov 29 2020

Publication series

Name: 2020 2nd International Conference on Advanced Information and Communication Technology, ICAICT 2020

Conference

Conference: 2nd International Conference on Advanced Information and Communication Technology, ICAICT 2020
Country: Bangladesh
City: Dhaka
Period: 11/28/20 → 11/29/20

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
