Online Symposium

Models of Convenience

Beilstein Bozen Symposium
2020

1–2 September 2020

3.00 to 6.00 pm (CEST)

#BeilsteinBozen2020

 

Scientific Committee:

Tim Clark / University of Erlangen-Nürnberg

Lee Cronin / University of Glasgow

Martin G. Hicks, Carsten Kettner / Beilstein-Institut

Aspects covered by this conference

 

Biology and materials science are posing new challenges for chemistry; the sheer size of biological systems and composite materials raises new questions of complexity (as opposed to mere complicatedness), of the visual and digital depiction of structures, and of the collection and collation of results. Our traditional models of chemical compounds, atoms and bonds, are often proving inadequate and need to be extended. Even more fundamental, however, is the recognition that models are just that, models, with an often-problematic relationship to “reality”.

 

After the symposium was cancelled as a physical event, Tim Clark and Martin Hicks continued their discussions on the theme of Models of Convenience. They have now put their thoughts on paper in a commentary article published in the Beilstein Journal of Organic Chemistry:

 

Models of Necessity
Clark, T.; Hicks, M. G. Beilstein J. Org. Chem. 2020, 16, 1649–1661.
doi:10.3762/bjoc.16.137

Models of Convenience

Chemists generally depict molecules using Lewis structures, a concept that has carried over into the computer representations in use since the 1970s and 80s. These representations still form the basic records of most chemical databases, which since the end of the 1980s have been used to construct predictive models based on neural networks and machine learning, as well as on various group-contribution methods involving regression algorithms.
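
To make this concrete, here is a minimal, purely illustrative Python sketch (not part of the original symposium text), assuming the open-source RDKit toolkit is installed. It shows how little such a Lewis-structure-derived record actually contains: a list of atoms and a table of formal single, double, and triple bonds.

    # Minimal sketch: what a Lewis-structure-based record stores.
    # Assumes the open-source RDKit toolkit is installed.
    from rdkit import Chem

    mol = Chem.MolFromSmiles("CC(=O)O")  # acetic acid, encoded as a SMILES string

    # The "identifier" boils down to this connection table (metadata):
    for atom in mol.GetAtoms():
        print(atom.GetIdx(), atom.GetSymbol())
    for bond in mol.GetBonds():
        # Formal bond orders (1.0, 2.0, ...) are labels of the model,
        # not measured quantities.
        print(bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(),
              bond.GetBondTypeAsDouble())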

The scientific community is now being challenged by the hype surrounding Big Data and new ML/AI methods, such as deep learning, that often run counter to the traditional validation concepts of the cheminformatics community. Chemistry is still entering the digital era, and as a discipline it is confronted with a problem not found in some of the more commonplace applications of ML/AI, for example at large e-commerce companies, in real-time analysis and simulation in the automobile industry, or in the data streams of large spectrometers and colliders such as the LHC.

The central element in chemistry, i.e. the basic identifier/model of a molecule, is no more than metadata that encodes the apparent connectivity between atoms in the formal terms of single, double, and triple bonds. Furthermore, chemical data are not generally measured directly in real time, and the findings are often subject to human selection and interpretation based on a predetermined bonding model. The distinction between measurements and the experimentalists’ interpretation of those measurements is far less clear in chemistry than in many other branches of science and engineering. For instance, 13C chemical shifts were for years believed to be equivalent to net atomic charges, a concept without physical reality.

Even worse, deviations from the model are often treated as sensational exceptions to established principles and given far more significance than they deserve. Boranes, for example, exhibit exceptional bonding only within the Lewis model.

Is this model sufficient for ML/AI applications or for large simulations in biology and materials science? Will its lifetime thereby be extended or curtailed? The reality of a molecule in solution or in vivo is three-dimensional and dynamic, and thus far more complicated than our simple models can accommodate; many interactions between molecules are non-equilibrium events. Yet it has been shown in the past that, for many property-prediction systems, 2D representations give better results than 3D ones.
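
As a purely illustrative sketch (again not from the symposium text, and with invented numbers), the following shows the kind of 2D pipeline such comparisons typically rest on, assuming RDKit and scikit-learn are installed: each molecule is reduced to a Morgan-fingerprint bit vector, with no conformational or 3D information at all, and a simple regression model is fitted on top.

    # Illustrative sketch: property prediction from a purely 2D representation.
    # Assumes RDKit and scikit-learn are installed; the target values are invented.
    import numpy as np
    from rdkit import Chem
    from rdkit.Chem import AllChem
    from sklearn.linear_model import Ridge

    def fingerprint(smiles):
        """Morgan (circular) fingerprint: a 2D bit vector, no geometry."""
        mol = Chem.MolFromSmiles(smiles)
        return np.array(list(AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=1024)))

    train_smiles = ["CCO", "CCCO", "CCCCO", "CCCCCO"]  # toy homologous series
    train_y = [-0.3, 0.3, 0.9, 1.5]                    # hypothetical property values

    model = Ridge().fit([fingerprint(s) for s in train_smiles], train_y)
    print(model.predict([fingerprint("CCCCCCO")]))     # predict for the next homolog

Whether such bit vectors really capture the essential spatial information, or merely correlate with it, is exactly the question raised below.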

Why should this be the case? Are our models capturing the essential spatial information? How does the effective averaging of dynamic conformations in a stylized 2D representation, developed so that chemists could exchange ideas by drawing structures on paper, affect our ability to predict molecular properties? How close do our models of molecules have to be to reality to work effectively with contemporary and future ML/AI techniques? Are we underusing the immense power of modern hardware and software with our simple models? What can we do better using ML/AI, and what new directions in chemistry could data-driven discovery make possible? Have we already reached a limit imposed by the accuracy and precision of the data available for predictions?

Scientific Program

Tuesday, Sept. 1


3.00 pm
Opening
Martin G. Hicks

Session Chair: Tim Clark

3.15 pm
Tim Clark
Models and Approximations versus Intractability

3.45 pm
Robin Hendry
Molecular Structure: not just a Model

4.15 pm
Karoline Wiesner
What is a Complex System?

4.45 pm
Coffee break and poster session on Stackfield

5.10 pm
Joseph Moran
Metabolism before Enzymes?

5.40 pm
Richard Bourne
Cognitive Chemical Manufacturing

6.10 pm
Closing discussion

Wednesday, Sept. 2


Session Chair: Lee Cronin

3.00 pm
Lee Cronin
Chemical Space: Designed or Discovered?

3.30 pm
Sonia Antoranz Contera
Biological Shapes Emerging from Physics at the Nanoscale

4.00 pm
Susan Stepney
Open-ended Parasite Evolution in a Spatial Automata Chemistry

4.30 pm
Coffee break and poster session on Stackfield

4.50 pm
Oscar Ces
Artificial Cells and Cellular Bionics

5.20 pm
Jean-Loup Faulon
Machine Learning for and by Synthetic Biology

5.50 pm
Closing discussion