Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides

Osamu Maruyama, Akiko Matsuda, Satoru Kuhara

Research output: Contribution to journal, Article

Abstract

In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively short oligopeptides of a fixed length from their complete proteomes. For each organism, a vector of the frequencies of the sampled oligopeptides is generated and used as the building block for tree reconstruction. Repeating this procedure creates multiple phylogenetic trees independently, from which a consensus tree is built. We applied our method to a set of 109 organisms, comprising 16 Archaea, 87 Bacteria, and 6 Eukarya, using around 10% of all the 3,200,000 (= 20^5) oligopeptides of length 5 in the reconstruction of each single phylogenetic tree. Our consensus tree agrees with the tree of Bergey’s Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed by Qi et al. using all the 20^K oligopeptides of length K = 5 and 6. We therefore conclude that the frequencies of a relatively small number of oligopeptides of length 5, even when those oligopeptides are chosen at random, carry phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.
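
The sampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the distance measure, the tree-building algorithm, or the consensus method, so the cosine distance used here is an assumption, and tree construction and the consensus step are left out. All function names and the toy proteomes are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def index_to_peptide(index, k):
    """Decode an integer in [0, 20**k) as a length-k oligopeptide (base 20)."""
    letters = []
    for _ in range(k):
        index, remainder = divmod(index, len(AMINO_ACIDS))
        letters.append(AMINO_ACIDS[remainder])
    return "".join(reversed(letters))


def sample_oligopeptides(k, fraction, rng):
    """Draw a random subset of the 20**k possible length-k oligopeptides."""
    total = len(AMINO_ACIDS) ** k  # 3,200,000 for k = 5
    n = max(1, round(total * fraction))
    # Sample without replacement, so every chosen oligopeptide is distinct.
    return [index_to_peptide(i, k) for i in rng.sample(range(total), n)]


def frequency_vector(proteome, sampled, k):
    """Relative frequency of each sampled oligopeptide in one proteome."""
    counts = dict.fromkeys(sampled, 0)
    windows = 0
    for protein in proteome:
        for i in range(len(protein) - k + 1):
            windows += 1
            word = protein[i:i + k]
            if word in counts:
                counts[word] += 1
    return [counts[w] / windows if windows else 0.0 for w in sampled]


def cosine_distance(u, v):
    """Assumed distance: 1 - cosine similarity (not taken from the paper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(proteomes, k, fraction, seed):
    """One sampling round: returns organism names and pairwise distances."""
    rng = random.Random(seed)
    sampled = sample_oligopeptides(k, fraction, rng)
    names = sorted(proteomes)
    vectors = {n: frequency_vector(proteomes[n], sampled, k) for n in names}
    matrix = [[cosine_distance(vectors[a], vectors[b]) for b in names]
              for a in names]
    return names, matrix


if __name__ == "__main__":
    # Toy data only; real input would be complete proteomes.
    toy = {"org_a": ["ACDEFGHIK"], "org_b": ["ACDEFGHIL"], "org_c": ["WWYWWYWWY"]}
    for seed in range(3):  # each seed gives an independent sampling round
        print(distance_matrix(toy, k=2, fraction=0.5, seed=seed))
```

Each seed yields one distance matrix, which would feed one tree. With k = 5 and fraction = 0.1, a round samples 320,000 of the 3,200,000 possible oligopeptides, matching the "around 10%" figure in the abstract.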

Original language: English
Pages (from-to): 429-446
Number of pages: 18
Journal: International Journal of Bioinformatics Research and Applications
Volume: 1
Issue number: 4
DOI: 10.1504/IJBRA.2005.008446
Publication status: Published - Jan 1 2005

Fingerprint

Oligopeptides
Genes
Genome
Sampling
Archaea
Proteome
Eukaryota
Bacteria
Proteins

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
  • Health Informatics
  • Clinical Biochemistry
  • Health Information Management

Cite this

Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides. / Maruyama, Osamu; Matsuda, Akiko; Kuhara, Satoru.

In: International Journal of Bioinformatics Research and Applications, Vol. 1, No. 4, 01.01.2005, p. 429-446.

@article{b0ec74434c0b4e9b8a2607903a7b4d80,
title = "Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides",
abstract = "In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively short oligopeptides of a fixed length from their complete proteomes. For each organism, a vector of the frequencies of the sampled oligopeptides is generated and used as the building block for tree reconstruction. Repeating this procedure creates multiple phylogenetic trees independently, from which a consensus tree is built. We applied our method to a set of 109 organisms, comprising 16 Archaea, 87 Bacteria, and 6 Eukarya, using around 10{\%} of all the 3,200,000 (= $20^5$) oligopeptides of length 5 in the reconstruction of each single phylogenetic tree. Our consensus tree agrees with the tree of Bergey’s Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed by Qi et al. using all the $20^K$ oligopeptides of length K = 5 and 6. We therefore conclude that the frequencies of a relatively small number of oligopeptides of length 5, even when those oligopeptides are chosen at random, carry phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.",
author = "Osamu Maruyama and Akiko Matsuda and Satoru Kuhara",
year = "2005",
month = "1",
day = "1",
doi = "10.1504/IJBRA.2005.008446",
language = "English",
volume = "1",
pages = "429--446",
journal = "International Journal of Bioinformatics Research and Applications",
issn = "1744-5485",
publisher = "Inderscience Enterprises Ltd",
number = "4",

}

TY - JOUR

T1 - Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides

AU - Maruyama, Osamu

AU - Matsuda, Akiko

AU - Kuhara, Satoru

PY - 2005/1/1

Y1 - 2005/1/1

AB - In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively short oligopeptides of a fixed length from their complete proteomes. For each organism, a vector of the frequencies of the sampled oligopeptides is generated and used as the building block for tree reconstruction. Repeating this procedure creates multiple phylogenetic trees independently, from which a consensus tree is built. We applied our method to a set of 109 organisms, comprising 16 Archaea, 87 Bacteria, and 6 Eukarya, using around 10% of all the 3,200,000 (= 20^5) oligopeptides of length 5 in the reconstruction of each single phylogenetic tree. Our consensus tree agrees with the tree of Bergey’s Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed by Qi et al. using all the 20^K oligopeptides of length K = 5 and 6. We therefore conclude that the frequencies of a relatively small number of oligopeptides of length 5, even when those oligopeptides are chosen at random, carry phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.

UR - http://www.scopus.com/inward/record.url?scp=84946441983&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946441983&partnerID=8YFLogxK

U2 - 10.1504/IJBRA.2005.008446

DO - 10.1504/IJBRA.2005.008446

M3 - Article

C2 - 18048147

AN - SCOPUS:84946441983

VL - 1

SP - 429

EP - 446

JO - International Journal of Bioinformatics Research and Applications

JF - International Journal of Bioinformatics Research and Applications

SN - 1744-5485

IS - 4

ER -