Although new technologies in biology such as DNA microarrays and next-generation sequencers have given researchers large volumes of data representing genome-wide biological responses, deriving knowledge from these data that is both accurate and understandable is not necessarily easy. In this study, we applied the Classification Based on Association (CBA) algorithm, a class association rule mining technique, to the TG-GATEs database, which stores both toxicogenomic and toxicological data for more than 150 compounds in rats and humans. We compared the classifiers generated by CBA with those generated by linear discriminant analysis (LDA) and showed that CBA is superior to LDA in terms of both predictive performance (accuracy: 83% for CBA vs. 75% for LDA; sensitivity: 82% for CBA vs. 72% for LDA; specificity: 85% for CBA vs. 75% for LDA) and interpretability.
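For readers unfamiliar with the three performance measures reported above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally computed from a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not taken from the study and do not reproduce its results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity for a binary classifier.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives, respectively.
    """
    total = tp + fp + tn + fn
    return {
        # Fraction of all samples that were classified correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of truly positive samples that were predicted positive.
        "sensitivity": tp / (tp + fn),
        # Fraction of truly negative samples that were predicted negative.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting sensitivity and specificity alongside accuracy shows whether a classifier's errors fall mainly on the positive or the negative class, which accuracy alone does not reveal.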