“I simply wish that, in a matter which so closely concerns the well-being of mankind, no decision shall be made without all the knowledge which a little analysis and calculation can provide.”
Daniel Bernoulli, on smallpox inoculation, 1766
Every day, enormous amounts of experimental and theoretical data become available across diverse research fields. In chemistry, for example, the chemical space is being explored; in physics, the fundamental structures of the particles that make up the world we know are examined; and in biology, the paradigm that leads from DNA to structure, function and regulation is studied. In economics research, real-time data on micro- and macroeconomic developments are collected and analysed. Climate research likewise draws on diverse data sources to document observations of climate change over the past 200 years.
Most of these sciences are at a crossroads: advances in computing power, machine learning methods and artificial intelligence (AI) allow these data to be explored in new and hitherto unexpected ways, yielding new insights and enabling predictions about new mechanisms, products and future developments. This places growing demands on data quality, the understanding of bias, and standardisation. Scientific progress therefore requires not only unrestricted access to these data in digitised form, from both successful and unsuccessful experiments, but also the ability for researchers to process, analyse and re-use them for simulations and predictions across disciplines.