TY - JOUR
T1 - Explanation of machine learning models using Shapley additive explanation and application for real data in hospital
AU - Nohara, Yasunobu
AU - Matsumoto, Koutarou
AU - Soejima, Hidehisa
AU - Nakashima, Naoki
N1 - Funding Information:
This work was supported by JSPS KAKENHI Grant Number JP20K11938.
Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2022/2
Y1 - 2022/2
N2 - Background and Objective: When machine learning techniques are used in decision-making processes, the interpretability of the models is important. In the present paper, we adopted the Shapley additive explanation (SHAP), which is based on fair profit allocation among many stakeholders depending on their contribution, to interpret a gradient-boosting decision tree model trained on hospital data. Methods: For better interpretability, we propose two novel techniques: (1) a new metric of feature importance based on SHAP and (2) a technique termed feature packing, which packs multiple similar features into one grouped feature, allowing easier understanding of the model without reconstructing it. We then compared the explanation results of the SHAP framework with those of existing methods using cerebral infarction data from our hospital. Results: The interpretation by SHAP was mostly consistent with that by the existing methods. Using the proposed techniques, we showed how the A/G ratio works as an important prognostic factor for cerebral infarction. Conclusion: Our techniques are useful for interpreting machine learning models and can uncover the underlying relationships between features and the outcome.
UR - http://www.scopus.com/inward/record.url?scp=85121556741&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85121556741&partnerID=8YFLogxK
U2 - 10.1016/j.cmpb.2021.106584
DO - 10.1016/j.cmpb.2021.106584
M3 - Article
C2 - 34942412
AN - SCOPUS:85121556741
SN - 0169-2607
VL - 214
JO - Computer Methods and Programs in Biomedicine
JF - Computer Methods and Programs in Biomedicine
M1 - 106584
ER -