A preliminary metagenome analysis based on a combination of protein domains

Yoji Igarashi, Daisuke Mori, Susumu Mitsuyama, Kazutoshi Yoshitake, Hiroaki Ono, Tsuyoshi Watanabe, Yukiko Taniuchi, Tomoko Sakami, Akira Kuwata, Takanori Kobayashi, Yoshizumi Ishino, Shugo Watabe, Takashi Gojobori, Shuichi Asakawa

Research output: Contribution to journal › Article

Abstract

Metagenomic data have mainly been analyzed by showing the composition of organisms based on a small, well-examined part of the genomic sequence, such as ribosomal RNA genes or mitochondrial DNA. In contrast, whole metagenomic data obtained by shotgun sequencing have rarely been fully analyzed through homology searches, because the genomic data available in databases for organisms living on Earth are insufficient. To complement the results that homology-search-based methods give for shotgun metagenome data, we focused on the composition of protein domains deduced from genome and metagenome sequences, and we used it to characterize genomes and metagenomes, respectively. First, we compared relationships based on similarity in protein domain composition with relationships based on sequence similarity. We searched for protein domains in 325 bacterial species using the Pfam database. Next, we examined the correlation coefficients of protein domain compositions between every pair of bacteria. We also calculated every pairwise genetic distance from 16S rRNA or DNA gyrase subunit B. Comparing the results of the two approaches, we found a moderate correlation between them. Essentially the same results were obtained when we used random 100 bp partial DNA sequences of the bacterial genomes, which simulated the raw sequence data produced by short-read next-generation sequencers. We then applied the method to actual environmental data obtained by shotgun sequencing. The method revealed a transition of the microbial phase that occurred with the seasonal change in water temperature. These results show the usefulness of the method for characterizing metagenomic data based on protein domain compositions.
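
The pairwise comparison of protein domain compositions described in the abstract can be illustrated with a minimal sketch. It assumes that each genome has already been reduced to a list of Pfam accessions (one entry per domain hit) and that the correlation coefficient is Pearson's; the abstract does not state which coefficient or normalization the authors used. The genome names, the toy hit lists, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def domain_composition(domain_hits, all_domains):
    """Return the relative frequency of each domain in a fixed domain order."""
    counts = Counter(domain_hits)
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in all_domains]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant vector
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(hits_by_genome):
    """Correlate the domain compositions of every pair of genomes."""
    # Use the union of all observed domains as the shared vector space.
    all_domains = sorted({d for hits in hits_by_genome.values() for d in hits})
    compositions = {
        genome: domain_composition(hits, all_domains)
        for genome, hits in hits_by_genome.items()
    }
    return {
        (a, b): pearson(compositions[a], compositions[b])
        for a, b in combinations(sorted(compositions), 2)
    }


# Toy input (assumption): made-up hit lists, not data from the paper.
toy_hits = {
    "genome_A": ["PF00005", "PF00005", "PF00072", "PF00528"],
    "genome_B": ["PF00005", "PF00005", "PF00072", "PF00528"],
    "genome_C": ["PF00072", "PF00072", "PF00072", "PF00528"],
}
correlations = pairwise_correlations(toy_hits)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

The resulting matrix of coefficients could then be set against pairwise genetic distances from a marker gene such as 16S rRNA, which is the comparison the abstract reports. The same functions apply unchanged to domain hits found in short reads, since only the hit lists differ.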

Original language: English
Article number: 19
Journal: Proteomes
Volume: 7
Issue number: 2
DOIs: https://doi.org/10.3390/proteomes7020019
Publication status: Published - Jun 1 2019

Fingerprint

Metagenome
Metagenomics
Genes
Chemical analysis
Proteins
DNA Gyrase
Ribosomal RNA
Genome
Databases
DNA sequences
Bacterial Genomes
Mitochondrial DNA
Phase Transition
Bacteria
rRNA Genes
Earth (planet)
Protein Domains
Water
Temperature

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Clinical Biochemistry

Cite this

Igarashi, Y., Mori, D., Mitsuyama, S., Yoshitake, K., Ono, H., Watanabe, T., ... Asakawa, S. (2019). A preliminary metagenome analysis based on a combination of protein domains. Proteomes, 7(2), [19]. https://doi.org/10.3390/proteomes7020019

@article{9bcc3445b67741a09d1ae5a3d1089258,
title = "A preliminary metagenome analysis based on a combination of protein domains",
author = "Yoji Igarashi and Daisuke Mori and Susumu Mitsuyama and Kazutoshi Yoshitake and Hiroaki Ono and Tsuyoshi Watanabe and Yukiko Taniuchi and Tomoko Sakami and Akira Kuwata and Takanori Kobayashi and Yoshizumi Ishino and Shugo Watabe and Takashi Gojobori and Shuichi Asakawa",
year = "2019",
month = "6",
day = "1",
doi = "10.3390/proteomes7020019",
language = "English",
volume = "7",
journal = "Proteomes",
issn = "2227-7382",
publisher = "MDPI AG",
number = "2",

}

TY - JOUR

T1 - A preliminary metagenome analysis based on a combination of protein domains

AU - Igarashi, Yoji

AU - Mori, Daisuke

AU - Mitsuyama, Susumu

AU - Yoshitake, Kazutoshi

AU - Ono, Hiroaki

AU - Watanabe, Tsuyoshi

AU - Taniuchi, Yukiko

AU - Sakami, Tomoko

AU - Kuwata, Akira

AU - Kobayashi, Takanori

AU - Ishino, Yoshizumi

AU - Watabe, Shugo

AU - Gojobori, Takashi

AU - Asakawa, Shuichi

PY - 2019/6/1

Y1 - 2019/6/1

UR - http://www.scopus.com/inward/record.url?scp=85066760394&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066760394&partnerID=8YFLogxK

U2 - 10.3390/proteomes7020019

DO - 10.3390/proteomes7020019

M3 - Article

AN - SCOPUS:85066760394

VL - 7

JO - Proteomes

JF - Proteomes

SN - 2227-7382

IS - 2

M1 - 19

ER -