The metabolic network is an important biological network consisting of enzymes and chemical compounds. However, a large number of metabolic pathways remain unknown, and most organism-specific metabolic pathways contain many missing enzymes. We present a novel method to identify the genes coding for missing enzymes using available genomic and chemical information from bacterial genomes. The proposed method consists of two steps: (a) estimation of the functional association between genes with respect to chromosomal proximity and evolutionary association, using supervised network inference; and (b) selection of gene candidates for missing enzymes based on the original candidate score and the chemical reaction information encoded in the EC number. We applied the proposed method to infer the metabolic network of the bacterium Pseudomonas aeruginosa from two genomic datasets: gene position and phylogenetic profiles. Next, we predicted several missing enzyme genes to reconstruct the lysine-degradation pathway in P. aeruginosa using EC number information. As a result, we identified PA0266 as a putative 5-aminovalerate aminotransferase (EC 2.6.1.48) and PA0265 as a putative glutarate semialdehyde dehydrogenase (EC 1.2.1.20). To verify our prediction, we conducted biochemical assays and examined the activity of the products of the predicted genes, PA0265 and PA0266, in a coupled reaction. We observed that the predicted gene products catalyzed the expected reactions; no activity was seen when both gene products were omitted from the reaction.
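The idea behind step (a) can be illustrated with a minimal sketch. It is not the supervised network inference used in the study; it is a simplified, unsupervised stand-in that combines the two data sources named above (gene position and phylogenetic profiles) into a single association score and ranks candidate genes by their average association with genes already assigned to a pathway. All gene names, positions, profiles, the weighting and the distance scale are hypothetical values chosen for illustration, and the EC-number filtering of step (b) is not modelled.

```python
from math import exp


def profile_similarity(p, q):
    """Jaccard similarity between two binary phylogenetic profiles."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def proximity(pos_a, pos_b, scale=5.0):
    """Score that decays with gene-order distance on the chromosome.

    `scale` is an assumed tuning parameter, not a value from the study.
    """
    return exp(-abs(pos_a - pos_b) / scale)


def association(gene_a, gene_b, weight=0.5):
    """Weighted mix of chromosomal proximity and profile similarity."""
    return (weight * proximity(gene_a["pos"], gene_b["pos"])
            + (1 - weight) * profile_similarity(gene_a["profile"],
                                                gene_b["profile"]))


def rank_candidates(candidates, known_genes):
    """Order candidate genes by mean association with known pathway genes."""
    scores = {
        name: sum(association(gene, k) for k in known_genes) / len(known_genes)
        for name, gene in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: one gene already placed in the pathway,
# and two unannotated candidates.
known_genes = [{"pos": 10, "profile": [1, 1, 0, 1, 0]}]
candidates = {
    "cand_near": {"pos": 11, "profile": [1, 1, 0, 1, 0]},
    "cand_far": {"pos": 200, "profile": [0, 0, 1, 0, 1]},
}

ranking = rank_candidates(candidates, known_genes)
print(ranking)
```

In this toy setting the candidate that is adjacent on the chromosome and shares the same phylogenetic profile ranks first. A supervised approach would instead learn how to weight such evidence from known enzyme pairs rather than fixing the weight by hand.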
All Science Journal Classification (ASJC) codes
- Molecular Biology
- Cell Biology