Protein homology detection using string alignment kernels

Hiroto Saigo, Jean-Philippe Vert, Nobuhisa Ueda, Tatsuya Akutsu

Research output: Contribution to journal › Article › peer-review

280 Citations (Scopus)

Abstract

Motivation: Remote homology detection between protein sequences is a central problem in computational biology. Discriminative methods involving support vector machines (SVMs) are currently the most effective methods for the problem of superfamily recognition in the Structural Classification Of Proteins (SCOP) database. The performance of SVMs depends critically on the kernel function used to quantify the similarity between sequences.

Results: We propose new kernels for strings adapted to biological sequences, which we call local alignment kernels. These kernels measure the similarity between two sequences by summing up scores obtained from local alignments with gaps of the sequences. When tested in combination with SVM on their ability to recognize SCOP superfamilies on a benchmark dataset, the new kernels outperform state-of-the-art methods for remote homology detection.
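
The idea of summing scores over gapped local alignments can be illustrated with a small dynamic program. The sketch below is only an illustration under stated assumptions and is not taken from the article: the function name `local_alignment_kernel`, the match/mismatch scoring (a stand-in for a real amino-acid substitution matrix), the affine gap penalties, and the scaling parameter `beta` are all hypothetical choices. It adds up exp(beta * score) over every local alignment of the two strings, counting the empty alignment as 1.

```python
import math


def local_alignment_kernel(x, y, beta=0.5, match=2.0, mismatch=-1.0,
                           gap_open=3.0, gap_extend=1.0):
    """Sum exp(beta * score) over all gapped local alignments of x and y.

    Assumptions (illustrative, not from the article): a simple
    match/mismatch score and an affine gap penalty
    gap_open + gap_extend * (length - 1), both given as positive costs.
    """
    n, m = len(x), len(y)
    w_open = math.exp(-beta * gap_open)    # weight for opening a gap
    w_ext = math.exp(-beta * gap_extend)   # weight for extending a gap

    # M[i][j]: total weight of alignments ending with x[i-1] paired to y[j-1].
    # X[i][j]: alignments ending in a gap that skips letters of x.
    # Y[i][j]: alignments ending in a gap that skips letters of y.
    # A Y-gap may follow an X-gap but not the reverse, so every
    # alignment is counted exactly once.
    M = [[0.0] * (m + 1) for _ in range(n + 1)]
    X = [[0.0] * (m + 1) for _ in range(n + 1)]
    Y = [[0.0] * (m + 1) for _ in range(n + 1)]

    total = 1.0  # the empty alignment
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            M[i][j] = math.exp(beta * s) * (
                1.0 + X[i - 1][j - 1] + Y[i - 1][j - 1] + M[i - 1][j - 1]
            )
            X[i][j] = w_open * M[i - 1][j] + w_ext * X[i - 1][j]
            Y[i][j] = w_open * (M[i][j - 1] + X[i][j - 1]) + w_ext * Y[i][j - 1]
            # Every non-empty local alignment ends at some paired position.
            total += M[i][j]
    return total
```

In a pipeline of the kind the abstract describes, such values would be computed for every pair of sequences to fill a kernel matrix for an SVM. In practice the values grow quickly with sequence length, so some normalization or a logarithm is usually needed before training; that step is left out of this sketch.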

Original language: English
Pages (from-to): 1682-1689
Number of pages: 8
Journal: Bioinformatics
Volume: 20
Issue number: 11
DOIs
Publication status: Published - Jul 22, 2004
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
