TY - GEN
T1 - Performance Verification of a Text Analyzer Using Machine Learning for Radiology Reports Toward Phenotyping
AU - Yamashita, Takanori
AU - Izukura, Rieko
AU - Nakashima, Naoki
N1 - Publisher Copyright: © 2021, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2021
Y1 - 2021
AB - The medical field is embracing the information age, and the rapidly increasing volume of medical data generated by hospital information systems signifies the advent of Big Data in the healthcare arena, such that real-time data are now available to assist many clinical decisions. Real World Data (RWD) from hospital information systems comprise structured numerical data and unstructured text data, and it is imperative that phenotyping reproducibly extracts patients with an accurate phenotype from RWD using a rule-based approach. In this study, computed tomography reports were sampled from 100 patients, 48 of whom were diagnosed with interstitial pneumonia. Three machine learning methods (Support Vector Machine, Feature Selection and Gradient Boosting Decision Tree (GBDT)) were combined to develop a text phenotyping method, which was applied to the analysis and achieved good predictive performance. We extracted several feature words that predict true cases of interstitial pneumonia and found that the effect of feature selection was evident in the good AUC achieved by GBDT. We also found that, when applying machine learning to text-based RWD, the variables have to be narrowed down.
UR - http://www.scopus.com/inward/record.url?scp=85111088260&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111088260&partnerID=8YFLogxK
U2 - 10.1007/978-981-16-3013-2_14
DO - 10.1007/978-981-16-3013-2_14
M3 - Conference contribution
AN - SCOPUS:85111088260
SN - 9789811630125
T3 - Smart Innovation, Systems and Technologies
SP - 171
EP - 182
BT - Innovation in Medicine and Healthcare - Proceedings of 9th KES-InMed 2021
A2 - Chen, Yen-Wei
A2 - Tanaka, Satoshi
A2 - Howlett, Robert J.
A2 - Jain, Lakhmi C.
PB - Springer Science and Business Media Deutschland GmbH
T2 - 9th KES International Conference on Innovation in Medicine and Healthcare, KES-InMed 2021
Y2 - 14 June 2021 through 16 June 2021
ER -