The post-genomic era is characterized by a high degree of integration and interdisciplinarity, drawing on research resources from such diverse fields as computational biology, bioinformatics, functional genomics, structural biology, and proteomics. From this perspective, established biological systems can be comprehensively investigated in terms of the interactions of individual proteins and enzymes, or groups of them, as well as the behaviour of the collective networks formed by such interactions. At the same time, these systems can be re-examined in the light of new results that suggest novel associations between otherwise unrelated pathways and individual proteins.
Modern experimental technologies provide seemingly endless opportunities to generate massive amounts of sequence, expression and functional data. Continuous advances and improvements have enabled proteome analyses to proceed with increased depth and efficiency. To capitalize on this enormous pool of information and to understand fundamental biological phenomena, it is essential to collect, organize, categorize, analyze, and share data and results.
However, whilst the large international genome sequencing projects attracted considerable public attention with the creation of huge sequence databases, it has become increasingly apparent that functional data for the gene products, in particular for enzymes, are either of limited accessibility or unavailable. Additionally, although enzyme structural information has accumulated rapidly in databases, little effort has been invested in the systematic characterization of enzyme functions.
The problem is twofold. First, deriving data from experimental work is expensive and very time-consuming. Second, it is inherently difficult to collect, interpret and standardize published data, since they are widely distributed among journals covering a number of fields and are often dependent on the experimental conditions.