PPSampler2: Predicting protein complexes more accurately and efficiently by sampling

Chasanah Kusumastuti Widita, Osamu Maruyama

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Predicting the sets of components of heteromeric protein complexes is a challenging problem in systems biology. Many tools have been proposed to predict those complexes. Among them, PPSampler, a protein complex prediction algorithm based on the Metropolis-Hastings algorithm, is reported to outperform other tools. In this work, we improve PPSampler by refining the scoring functions and the proposal distribution used inside the algorithm, so that the predicted clusters are more accurate and the resulting algorithm runs faster. The new version is called PPSampler2. In computational experiments, PPSampler2 is shown to outperform other tools, including PPSampler. The F-measure score of PPSampler2 is 0.67, which is at least 26% higher than those of the other tools. In addition, about 82% of the predicted clusters that do not match any known complex are statistically significant with respect to the biological process aspect of Gene Ontology. Furthermore, the running time is reduced to twenty minutes, which is 1/24 of that of PPSampler.
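The abstract names the Metropolis-Hastings algorithm but does not spell out PPSampler2's scoring functions or proposal distribution, so the following is only a minimal sketch of the general technique, not the authors' method. The toy `score` function, the size cap `max_size`, the `temperature` parameter, and the single-protein relabelling proposal are all assumptions chosen for illustration. Because a protein and a label are both drawn uniformly from fixed sets, the proposal is symmetric and the Hastings correction factor equals 1.

```python
import math
import random


def score(assign, edges, max_size=4, penalty=1.0):
    """Toy score (an assumption, not the PPSampler2 scoring function):
    total weight of edges inside clusters, minus a penalty for each
    protein by which a cluster exceeds max_size."""
    total = sum(w for u, v, w in edges if assign[u] == assign[v])
    sizes = {}
    for label in assign:
        sizes[label] = sizes.get(label, 0) + 1
    total -= penalty * sum(max(0, s - max_size) for s in sizes.values())
    return total


def metropolis_hastings(n, edges, steps=20000, temperature=0.1, seed=0):
    """Sample cluster assignments of n proteins; return the best one seen."""
    rng = random.Random(seed)
    assign = list(range(n))  # start with every protein in its own cluster
    current = score(assign, edges)
    best, best_score = assign[:], current
    for _ in range(steps):
        # Symmetric proposal: uniform protein, uniform label from 0..n-1.
        i = rng.randrange(n)
        new_label = rng.randrange(n)
        old_label = assign[i]
        if new_label == old_label:
            continue  # proposal leaves the state unchanged
        assign[i] = new_label
        proposed = score(assign, edges)
        delta = proposed - current
        # Accept with probability min(1, exp(delta / temperature)).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = proposed
            if current > best_score:
                best, best_score = assign[:], current
        else:
            assign[i] = old_label  # reject: restore the previous state
    return best, best_score


def clusters(assign):
    """Group proteins by label and drop singleton clusters."""
    groups = {}
    for protein, label in enumerate(assign):
        groups.setdefault(label, set()).add(protein)
    return [g for g in groups.values() if len(g) > 1]


if __name__ == "__main__":
    # Hypothetical input: two tightly connected triples joined by one weak edge.
    toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
                 (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
                 (2, 3, 0.1)]
    best, best_score = metropolis_hastings(6, toy_edges)
    print(sorted(sorted(c) for c in clusters(best)), best_score)
```

On this toy input the sampler would be expected to separate the two triples, since merging them across the weak edge gains little and is penalised by the size cap. A real scoring function would also need the interaction weights and size preferences described in the paper itself.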

Original language: English
Article number: S14
Journal: BMC Systems Biology
Volume: 7
DOI: 10.1186/1752-0509-7-S6-S14
Publication status: Published - Jan 1 2013

Fingerprint

Sampling
Proteins
Protein
Biological Phenomena
Metropolis-Hastings Algorithm
Gene Ontology
Systems Biology
Scoring
Computational Experiments
Refining
Ontology
Genes
Predict
Prediction
Experiments

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Modelling and Simulation
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

PPSampler2: Predicting protein complexes more accurately and efficiently by sampling. / Widita, Chasanah Kusumastuti; Maruyama, Osamu.

In: BMC Systems Biology, Vol. 7, S14, 01.01.2013.

@article{30e5b8f8dbad4ceb8ca35a99d0cf7497,
title = "PPSampler2: Predicting protein complexes more accurately and efficiently by sampling",
abstract = "The problem of predicting sets of components of heteromeric protein complexes is a challenging problem in Systems Biology. There have been many tools proposed to predict those complexes. Among them, PPSampler, a protein complex prediction algorithm based on the Metropolis-Hastings algorithm, is reported to outperform other tools. In this work, we improve PPSampler by refining scoring functions and a proposal distribution used inside the algorithm so that predicted clusters are more accurate as well as the resulting algorithm runs faster. The new version is called PPSampler2. In computational experiments, PPSampler2 is shown to outperform other tools including PPSampler. The F-measure score of PPSampler2 is 0.67, which is at least 26{\%} higher than those of the other tools. In addition, about 82{\%} of the predicted clusters that are unmatched with any known complexes are statistically significant on the biological process aspect of Gene Ontology. Furthermore, the running time is reduced to twenty minutes, which is 1/24 of that of PPSampler.",
author = "Widita, {Chasanah Kusumastuti} and Maruyama, Osamu",
year = "2013",
month = "1",
day = "1",
doi = "10.1186/1752-0509-7-S6-S14",
language = "English",
volume = "7",
journal = "BMC Systems Biology",
issn = "1752-0509",
publisher = "BioMed Central",
}

TY - JOUR

T1 - PPSampler2

T2 - Predicting protein complexes more accurately and efficiently by sampling

AU - Widita, Chasanah Kusumastuti

AU - Maruyama, Osamu

PY - 2013/1/1

Y1 - 2013/1/1

AB - The problem of predicting sets of components of heteromeric protein complexes is a challenging problem in Systems Biology. There have been many tools proposed to predict those complexes. Among them, PPSampler, a protein complex prediction algorithm based on the Metropolis-Hastings algorithm, is reported to outperform other tools. In this work, we improve PPSampler by refining scoring functions and a proposal distribution used inside the algorithm so that predicted clusters are more accurate as well as the resulting algorithm runs faster. The new version is called PPSampler2. In computational experiments, PPSampler2 is shown to outperform other tools including PPSampler. The F-measure score of PPSampler2 is 0.67, which is at least 26% higher than those of the other tools. In addition, about 82% of the predicted clusters that are unmatched with any known complexes are statistically significant on the biological process aspect of Gene Ontology. Furthermore, the running time is reduced to twenty minutes, which is 1/24 of that of PPSampler.

UR - http://www.scopus.com/inward/record.url?scp=84908544846&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84908544846&partnerID=8YFLogxK

U2 - 10.1186/1752-0509-7-S6-S14

DO - 10.1186/1752-0509-7-S6-S14

M3 - Article

C2 - 24565288

AN - SCOPUS:84908544846

VL - 7

JO - BMC Systems Biology

JF - BMC Systems Biology

SN - 1752-0509

M1 - S14

ER -