The connectivity structure, giant strong component and centrality of metabolic networks

Bioinformatics. 2003 Jul 22;19(11):1423-30. doi: 10.1093/bioinformatics/btg177.

Abstract

Motivation: Structural and functional analysis of genome-based large-scale metabolic networks is important for understanding the design principles and regulation of metabolism at the system level. The metabolic network is conventionally considered to be highly integrated and very complex. A rational reduction of the metabolic network to its core structure and a deeper understanding of its functional modules are important.

Results: In this work, we show that the metabolites in a metabolic network are far from fully connected. A connectivity structure consisting of four major subsets of metabolites and reactions, i.e. a fully connected sub-network, a substrate subset, a product subset and an isolated subset, is found to exist in the metabolic networks of 65 fully sequenced organisms. The largest fully connected part of a metabolic network, called 'the giant strong component (GSC)', represents the most complicated part and the core of the network and has the features of scale-free networks. The average path length of the whole network is primarily determined by that of the GSC. For most of the organisms, the GSC contains less than one-third of the nodes of the network. This connectivity structure is very similar to the 'bow-tie' structure of the World Wide Web. Our results indicate that the bow-tie structure may be common for large-scale directed networks. More importantly, the uncovered structural feature makes the structural and functional analysis of large-scale metabolic networks more tractable. As shown in this work, comparing the closeness centrality of the nodes in the GSC can identify the most central metabolites of a metabolic network. To quantitatively characterize the overall connection structure of the GSC, we introduce the term 'overall closeness centralization index (OCCI)'. OCCI correlates well with the average path length of the GSC and is a useful parameter for a system-level comparison of metabolic networks of different organisms.
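
The four-subset decomposition and the closeness comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy edge list, the function names (`adjacency`, `reachable`, `bow_tie`, `closeness`) and the specific closeness formula (number of other GSC nodes divided by the sum of shortest directed path lengths to them) are assumptions made for illustration only. The abstract does not give the formula for OCCI, so it is not computed here.

```python
from collections import deque


def adjacency(edges):
    """Build forward and reverse adjacency maps from directed (u, v) pairs."""
    fwd, rev = {}, {}
    for u, v in edges:
        fwd.setdefault(u, set()).add(v)
        rev.setdefault(v, set()).add(u)
    return fwd, rev


def reachable(adj, start):
    """Breadth-first search: all nodes reachable from start (start included)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def bow_tie(edges):
    """Split nodes into (GSC, substrate subset, product subset, isolated subset)."""
    fwd, rev = adjacency(edges)
    nodes = {n for edge in edges for n in edge}
    if not nodes:
        return set(), set(), set(), set()
    # The strong component of n is descendants(n) intersected with ancestors(n).
    gsc, unassigned = set(), set(nodes)
    while unassigned:
        n = next(iter(unassigned))
        component = reachable(fwd, n) & reachable(rev, n)
        unassigned -= component
        if len(component) > len(gsc):
            gsc = component
    seed = next(iter(gsc))
    substrates = reachable(rev, seed) - gsc  # can reach the GSC, not reachable from it
    products = reachable(fwd, seed) - gsc    # reachable from the GSC, cannot return
    isolated = nodes - gsc - substrates - products
    return gsc, substrates, products, isolated


def closeness(edges, component):
    """Assumed closeness: (n - 1) / sum of shortest directed distances in component."""
    fwd, _ = adjacency(edges)
    scores = {}
    for source in component:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in fwd.get(u, ()):
                if v in component and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[source] = (len(component) - 1) / total if total else 0.0
    return scores


# Hypothetical toy network: node names are placeholders, not real metabolites.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "C"),
             ("S", "A"), ("C", "P"), ("X", "Y")]
gsc, substrates, products, isolated = bow_tie(toy_edges)
scores = closeness(toy_edges, gsc)
most_central = max(scores, key=scores.get)
```

In this toy network the cycle among `A`, `B` and `C` forms the GSC, `S` feeds into it, `P` is produced from it, and `X` and `Y` are disconnected from it. Restricting the distance search to the GSC loses nothing, because every shortest path between two nodes of a strong component stays inside that component.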

Supplementary information: http://genome.gbf.de/bioinformatics/

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / classification
  • Bacteria / metabolism*
  • Cell Physiological Phenomena*
  • Computer Simulation
  • Energy Metabolism / physiology*
  • Models, Biological*
  • Proteome / metabolism*
  • Proteomics / methods*
  • Species Specificity

Substances

  • Proteome