Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides

Osamu Maruyama, Akiko Matsuda, Satoru Kuhara

Research output: Contribution to journal › Conference article

Abstract

In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively small oligopeptides of a fixed length from their complete proteomes. For each of the organisms, a vector of frequencies of those sampled oligopeptides is generated and used as a building block in reconstructing phylogenetic trees. By this procedure, phylogenetic trees are generated independently, and a consensus tree of the resulting trees is obtained. We have applied our method to a set of 109 organisms, including 16 Archaea, 87 Bacteria, and 6 Eukarya, using less than 10% of all the 3,200,000 oligopeptides of length 5. Our consensus tree agrees with the tree of Bergey's Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed using all the 20^K oligopeptides of length K = 5 and 6 given by Qi et al. Thus we can conclude that the frequencies of a relatively small number of oligopeptides of length 5, even if those oligopeptides are determined by a random method, carry phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.
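The sampling and frequency-vector steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy sequences, the relative-frequency normalisation, and the Euclidean distance are all assumptions made for illustration, since the abstract does not say how frequencies are normalised or compared. Tree building and the consensus step are left out.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, so 20**5 = 3,200,000 strings of length 5


def sample_oligopeptides(k, n, rng):
    """Draw n distinct random length-k strings over the 20-letter alphabet (hypothetical helper)."""
    if n > len(AMINO_ACIDS) ** k:
        raise ValueError("cannot sample more oligopeptides than exist")
    sampled = set()
    while len(sampled) < n:
        sampled.add("".join(rng.choice(AMINO_ACIDS) for _ in range(k)))
    return sorted(sampled)


def frequency_vector(proteome, oligos, k):
    """Relative frequency of each sampled oligopeptide among all length-k windows.

    `proteome` is a list of protein sequences; the normalisation is an assumption.
    """
    counts = Counter()
    total = 0
    for protein in proteome:
        for i in range(len(protein) - k + 1):
            counts[protein[i:i + k]] += 1
            total += 1
    return [counts[o] / total if total else 0.0 for o in oligos]


def euclidean(u, v):
    """Plain Euclidean distance, used here only as a stand-in dissimilarity."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


# Toy data (invented for illustration): two tiny "proteomes" and k = 2.
rng = random.Random(0)
oligos = sample_oligopeptides(k=2, n=40, rng=rng)  # 40 of the 400 possible dipeptides
org_a = ["MKVLAAGIVG", "MSTNPKPQRK"]
org_b = ["MKVLAAGLVG", "MSTNPKPARK"]
vec_a = frequency_vector(org_a, oligos, 2)
vec_b = frequency_vector(org_b, oligos, 2)
print(euclidean(vec_a, vec_b))
```

Repeating the sampling with different seeds would give one distance matrix per run. Each matrix could then be passed to a tree-building method of choice, and the resulting trees combined into a consensus, as the abstract outlines.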

Original language: English
Pages (from-to): 911-918
Number of pages: 8
Journal: Lecture Notes in Computer Science
Volume: 3515
Issue number: II
Publication status: Published - Sep 30 2005

Fingerprint

Phylogenetic Tree
Genome
Genes
Sampling
Phylogenetics
Bacteria
Building Blocks
Oligopeptides
Proteins

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides. / Maruyama, Osamu; Matsuda, Akiko; Kuhara, Satoru.

In: Lecture Notes in Computer Science, Vol. 3515, No. II, 30.09.2005, p. 911-918.

@article{2615f3a5048742ee8ceab5d9199a92bb,
title = "Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides",
abstract = "In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively small oligopeptides of a fixed length from their complete proteomes. For each of the organisms, a vector of frequencies of those sampled oligopeptides is generated and used as a building block in reconstructing phylogenetic trees. By this procedure, phylogenetic trees are generated independently, and a consensus tree of the resulting trees is obtained. We have applied our method to a set of 109 organisms, including 16 Archaea, 87 Bacteria, and 6 Eukarya, using less than 10{\%} of all the 3,200,000 oligopeptides of length 5. Our consensus tree agrees with the tree of Bergey's Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed using all the $20^K$ oligopeptides of length K = 5 and 6 given by Qi et al. Thus we can conclude that the frequencies of a relatively small number of oligopeptides of length 5, even if those oligopeptides are determined by a random method, carry phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.",
author = "Osamu Maruyama and Akiko Matsuda and Satoru Kuhara",
year = "2005",
month = "9",
day = "30",
language = "English",
volume = "3515",
pages = "911--918",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",
number = "II",

}

TY - JOUR

T1 - Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides

AU - Maruyama, Osamu

AU - Matsuda, Akiko

AU - Kuhara, Satoru

PY - 2005/9/30

Y1 - 2005/9/30

N2 - In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively small oligopeptides of a fixed length from their complete proteomes. For each of the organisms, a vector of frequencies of those sampled oligopeptides is generated and used as a building block in reconstructing phylogenetic trees. By this procedure, phylogenetic trees are generated independently, and a consensus tree of the resulting trees is obtained. We have applied our method to a set of 109 organisms, including 16 Archaea, 87 Bacteria, and 6 Eukarya, using less than 10% of all the 3,200,000 oligopeptides of length 5. Our consensus tree agrees with the tree of Bergey's Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed using all the 20^K oligopeptides of length K = 5 and 6 given by Qi et al. Thus we can conclude that the frequencies of a relatively small number of oligopeptides of length 5, even if those oligopeptides are determined by a random method, carry phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.

AB - In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively small oligopeptides of a fixed length from their complete proteomes. For each of the organisms, a vector of frequencies of those sampled oligopeptides is generated and used as a building block in reconstructing phylogenetic trees. By this procedure, phylogenetic trees are generated independently, and a consensus tree of the resulting trees is obtained. We have applied our method to a set of 109 organisms, including 16 Archaea, 87 Bacteria, and 6 Eukarya, using less than 10% of all the 3,200,000 oligopeptides of length 5. Our consensus tree agrees with the tree of Bergey's Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed using all the 20^K oligopeptides of length K = 5 and 6 given by Qi et al. Thus we can conclude that the frequencies of a relatively small number of oligopeptides of length 5, even if those oligopeptides are determined by a random method, carry phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.

UR - http://www.scopus.com/inward/record.url?scp=25144453345&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=25144453345&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:25144453345

VL - 3515

SP - 911

EP - 918

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

IS - II

ER -