The beginnings of this conference series date back to 2003 and grew out of an initially nebulous idea. This idea took concrete shape through the lectures and discussions at the Beilstein Bozen Symposium 2002 - Molecular Informatics: Confronting Complexity. In preparation for this symposium, the organisers, Martin Hicks and Carsten Kettner, stated in the overview [1] that "the flood of information generated as a result of research in genomics and proteomics is often completely overwhelming". This flood makes it inherently difficult to use the information for analysis, confirmation, and interpretation, to understand the experimental results, and to distinguish real findings from assumptions. Because much of the experimental data is reused for model development, e.g., in systems and structural biology and in drug discovery and targeting, the accuracy and contextual quality of these data must be ensured. Contextual data, which describes the experimental data with unambiguous attributes, is now called metadata.
An international panel of molecular informatics researchers presented the results of their analysis and understanding of the storage, processing, and distribution of information encoded by molecules and molecular interactions, covering protein structure, pattern recognition, drug discovery, design and delivery, and software tools for analysis and prediction. A unifying theme throughout the symposium was the goal of gaining insight into the behaviour of biological and molecular systems through computer simulations.
From the energetic discussions that followed the presentations, it quickly became clear that the software tools presented could only run successfully on very well-defined data sets. It also became clear that limitations in data quality were preventing researchers and modellers from generating knowledge that went significantly beyond hypotheses. Even subsequent data-driven research was considered ineffective, if not impossible.