Computing the similarity or dissimilarity between protein structures is an important task in structural biology. Conventional methods for computing protein structure dissimilarity require a structural alignment of the proteins. However, defining a single best alignment is difficult, especially when the structures are very different. In this paper, we propose a new similarity measure for protein structure comparison that uses a set of multi-view 2D images of 3D protein structures. In this approach, each protein structure is represented by a subspace spanned by its image set. The similarity between two protein structures is then characterized by the canonical angles between the two subspaces. The primary advantage of our method is that precise alignment is not needed. We employed Grassmann Discriminant Analysis (GDA) as the subspace-based learning method in the classification framework. We applied our method to the classification of protein 3D structures into seven SCOP structural classes. The proposed method outperformed the k-nearest neighbor method (k-NN) based on the conventional alignment-based methods CE, FATCAT, and TM-align. Our method was also applied to the classification of SCOP folds of membrane proteins, where it recognized the fold HEM-binding four-helical bundle (f.21) much better than TM-align.
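To make the idea of comparing subspaces by canonical angles concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the subspace dimension, the image size, and the use of random arrays in place of rendered protein views are all assumptions made for illustration. The summary score (mean squared cosine of the canonical angles) is one common choice in subspace methods and is not necessarily the measure used in the paper.

```python
import numpy as np

def subspace_basis(images, dim):
    """Orthonormal basis (columns) of the dim-dimensional subspace
    that best fits a set of images (hypothetical helper)."""
    X = np.asarray(images, dtype=float)
    X = X.reshape(X.shape[0], -1).T          # shape: (pixels, views)
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :dim]                        # leading left singular vectors

def canonical_angles(A, B):
    """Canonical (principal) angles between the column spaces of two
    orthonormal bases: the singular values of A^T B are their cosines."""
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.arccos(np.clip(s, 0.0, 1.0))   # clip guards against rounding

def subspace_similarity(A, B):
    """Mean squared cosine of the canonical angles, in [0, 1]
    (an assumed summary score, not taken from the paper)."""
    return float(np.mean(np.cos(canonical_angles(A, B)) ** 2))

# Random stand-ins for two sets of 20 multi-view images of size 8x8.
rng = np.random.default_rng(0)
views_a = rng.random((20, 8, 8))
views_b = rng.random((20, 8, 8))

A = subspace_basis(views_a, dim=5)
B = subspace_basis(views_b, dim=5)

angles = canonical_angles(A, B)
sim_ab = subspace_similarity(A, B)
sim_aa = subspace_similarity(A, A)
```

Because the bases depend only on the span of each image set, no point-by-point alignment between the two structures is required; a subspace compared with itself gives a similarity of 1 up to rounding.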
|Number of pages||14|
|Journal||IEEE/ACM Transactions on Computational Biology and Bioinformatics|
|Publication status||Published - Jan 1 2018|
All Science Journal Classification (ASJC) codes
- Applied Mathematics