We have developed a program, Gap Allowing Pattern Explorer (GAPE), to extract amino acid sequence motifs conserved among distantly related proteins. GAPE is designed to allow gaps in the sequences. First, the program generates all possible amino acid patterns comprising up to five amino acids. Sequences containing the amino acid residues in the same order as a generated pattern are selected as subsequences, ignoring differences in the distances between consecutive amino acids. Next, motifs are extracted from the subsequences under the condition that all four distances between the five amino acids are fixed. At this stage, motifs with gaps in their subsequences are also found by relaxing one of the four fixed distances. The statistical significance of each motif obtained is calculated from the amino acid composition of the sequences under consideration. When GAPE was applied to 59 pyridoxal-phosphate-related sequences and 64 ATP (AMP-forming)-related sequences, the motifs extracted with a low expectation of occurrence contained some of the amino acid residues chemically proven to be involved in ligand recognition.
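The two extraction stages described above (fixed distances first, then one relaxed distance) can be illustrated with a minimal Python sketch. It is not the GAPE implementation: the function names, the `max_span` window limit, and the use of `None` to mark a relaxed distance are assumptions made for illustration, and the significance calculation is omitted because the abstract does not give its formula.

```python
from collections import defaultdict
from itertools import combinations


def find_fixed_distance_motifs(sequences, k=5, max_span=12):
    """Map (residues, distances) -> set of indices of sequences containing it.

    Every ordered choice of k residues inside a window of max_span positions
    is recorded together with the k-1 distances between consecutive residues.
    max_span is an assumed limit that keeps the enumeration small.
    """
    strict = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for start in range(len(seq)):
            window_end = min(len(seq), start + max_span)
            for rest in combinations(range(start + 1, window_end), k - 1):
                pos = (start,) + rest
                residues = "".join(seq[p] for p in pos)
                dists = tuple(b - a for a, b in zip(pos, pos[1:]))
                strict[(residues, dists)].add(idx)
    return strict


def relax_one_distance(strict):
    """Merge fixed-distance motifs that differ in exactly one distance.

    The relaxed distance is replaced by None, so sequences carrying an
    insertion or deletion at that position support the same motif.
    """
    relaxed = defaultdict(set)
    for (residues, dists), support in strict.items():
        for i in range(len(dists)):
            key = (residues, dists[:i] + (None,) + dists[i + 1:])
            relaxed[key] |= support
    return relaxed


# Toy example with three residues instead of five (hypothetical sequences).
toy = ["AXCXXD", "AXCXD", "AXCD"]
strict = find_fixed_distance_motifs(toy, k=3, max_span=6)
relaxed = relax_one_distance(strict)
print(sorted(strict[("ACD", (2, 3))]))      # [0]
print(sorted(relaxed[("ACD", (2, None))]))  # [0, 1, 2]
```

In the toy data, no fixed-distance form of the motif A, C, D is shared by more than one sequence, but relaxing the distance between C and D lets all three sequences support it. This is the kind of gapped motif the relaxation step is meant to recover.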
All Science Journal Classification (ASJC) codes
- Molecular Biology