TY - JOUR
T1 - Searching for common sequence patterns among distantly related proteins
AU - Suyama, Mikita
AU - Nishioka, Takaaki
AU - Oda, Jun'ichi
N1 - Funding Information: The authors thank Minoru Kanehisa for critical reading of this manuscript and Ikuo Uchiyama and Atsushi Ogiwara for helpful discussions. Computation time was provided by the Supercomputer Laboratory, Institute for Chemical Research, Kyoto University, Japan. This work was supported by a Grant-in-Aid for Scientific Research on Priority Areas, 'Genome Informatics', from the Ministry of Education, Science and Culture of Japan, and by a Fellowship of the Japan Society for the Promotion of Science for Japanese Junior Scientists to M.S.
PY - 1995/11
Y1 - 1995/11
N2 - We have developed a program Gap Allowing Pattern Explorer (GAPE) to extract amino acid sequence motifs conserved among distantly related proteins. The GAPE program is designed to allow gaps in the sequences. First, this program generates all possible amino acid patterns comprising up to five amino acids. Sequences containing the amino acid residues in the same order as a generated pattern are selected as subsequences, where the differences in the distances between two consecutive amino acids are ignored. Next, the motifs are extracted from the subsequences under conditions in which all four distances between the five amino acids are fixed. At this stage, motifs with gaps in their subsequence are also found by relaxing one of the four fixed distances. The statistical significance for a motif obtained is calculated based on the amino acid composition of the sequences under consideration. When the GAPE program was applied to 59 pyridoxal-phosphate-related sequences and 64 ATP (AMP-forming)-related sequences, motifs extracted with a low expectation of occurrence contained some of the amino acid residues chemically proved to be involved in the ligand recognition.
AB - We have developed a program Gap Allowing Pattern Explorer (GAPE) to extract amino acid sequence motifs conserved among distantly related proteins. The GAPE program is designed to allow gaps in the sequences. First, this program generates all possible amino acid patterns comprising up to five amino acids. Sequences containing the amino acid residues in the same order as a generated pattern are selected as subsequences, where the differences in the distances between two consecutive amino acids are ignored. Next, the motifs are extracted from the subsequences under conditions in which all four distances between the five amino acids are fixed. At this stage, motifs with gaps in their subsequence are also found by relaxing one of the four fixed distances. The statistical significance for a motif obtained is calculated based on the amino acid composition of the sequences under consideration. When the GAPE program was applied to 59 pyridoxal-phosphate-related sequences and 64 ATP (AMP-forming)-related sequences, motifs extracted with a low expectation of occurrence contained some of the amino acid residues chemically proved to be involved in the ligand recognition.
UR - http://www.scopus.com/inward/record.url?scp=0029557778&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0029557778&partnerID=8YFLogxK
U2 - 10.1093/protein/8.11.1075
DO - 10.1093/protein/8.11.1075
M3 - Article
C2 - 8819973
AN - SCOPUS:0029557778
VL - 8
SP - 1075
EP - 1080
JO - Protein Engineering, Design and Selection
JF - Protein Engineering, Design and Selection
SN - 1741-0126
IS - 11
ER -