Open Science and the Chemistry Lab of the Future

Beilstein Open Science Symposium 2017

22 – 24 May 2017
Hotel Jagdschloss Niederwald, Rüdesheim, Germany

Scientific Program:

Ian Bruno / CCDC, Cambridge, UK

Leah McEwen / Cornell University, Ithaca, USA

Martin G. Hicks and Carsten Kettner / Beilstein-Institut

Aspects covered by this conference

Which data do we want to save, how and why and how long?
What really needs to be reproducible?
Are current reporting standards being used sufficiently?
If not, why not?
Are the current procedures for depositing data too onerous for scientists?
Will technology, through increasing automation, fix most of the problems?

Is bureaucracy killing creativity in science?
Have we got a reproducibility crisis?
If we save and share data routinely, what is the future of the publication?
Are funding agencies causing science to be too short term in their quest for value for money?
Are chemists repeating too many experiments?
What can chemistry learn from other areas and what can they learn from chemistry?

Overview

Open Access, Open Data, Open Science, Data Sharing and Big Data are examples of buzz words that are used to describe the new opportunities and demands for sharing and reusing the results of scientific research. The advantages for the scientific community in gaining unrestricted access are immense, but only if the data is standardized, comprehensively reported, validated and authenticated. In chemistry and biochemistry the consequences of funding agency mandates for depositing and sharing data are just becoming apparent.

In the laboratory, interconnected devices are allowing new paradigms to develop in the design, control and reproducibility of experiments and procedures. This will involve modifying current workflows to integrate reporting standards and make use of automation. This symposium will bring together research scientists, data scientists, publishers, funders and other interested parties to critically review their needs and concerns and discuss how they see the future developing in the context of the highly interconnected, data-driven lab of the future and the new infrastructures that are being set up.

Summary

Leah McEwen / Cornell University, Ithaca, USA

The practice of science has long been an international endeavor, building on exchange among scientists across disciplines and borders. The International Council for Science (ICSU) Principle of Universality of Science states that the free and responsible practice of science is fundamental to scientific advancement and human and environmental well-being. Such practice, in all its aspects, requires freedom of movement, association, expression and communication for scientists, as well as equitable access to data, information, and other resources for research. It requires responsibility at all levels to carry out and communicate scientific work with integrity, respect, fairness, trustworthiness, and transparency, recognizing its benefits and possible harms. This principle is being echoed in research policies across the globe, and supported by interconnected digital communication technologies in a movement towards more open science.

The Niederwald of the beautiful Rhine Valley World Heritage area was the setting for a recent discussion on the prospect of open science in chemistry and related disciplines. This symposium brought together research scientists, data scientists, publishers, funders and other interested parties to review current publication and data sharing practices. Over the course of three days, scientists and research centers presented on experiences with automating interconnectivity in laboratories, managing data from generation to sharing, roles for hard and soft infrastructures, and particular issues related to exchanging chemical information. Representatives from various scientific and industry community groups discussed efforts to coordinate over common needs. Scientific publishers and funding organizations considered challenges supporting the evolving landscape of dissemination and communication while maintaining integrity of research outputs.

There is much potential in the interconnected nature of the Internet to support semi-automation of research workflows as well as to facilitate communication. This was beautifully demonstrated by several projects in process flow chemistry applying combinations of sensors, pumps and computer scripts. Refining the automation enables more precise data collection at greater scale, improving the ability to analyze problems across the chemical space and allowing researchers to focus on interpretation. These processes depend critically on interoperability of data within and between experiments. The benefits can be further realized across research programs and laboratories through world-wide contributions to global research problems, as demonstrated by open-source projects developing drugs for diseases. Workflows for capturing and packaging data and analyses to support further re-use are being explored by research institutions in support of local research as well as cooperative infrastructure projects. It is not enough to publish data individually in papers; ensuring the FAIR data principles of findability, accessibility, interoperability and re-usability requires a focus on coordination.

This symposium set out to explore big picture questions concerning which data to share, the efficacy of reporting standards, automation optimization, issues of reproducibility, and funder requirements for open publication and data sharing. A range of proactive experiments in publishing and open collaboration were represented, and discussion surfaced more questions than answers. Panels debated practical aspects of data sharing, such as incentives and disincentives for authors to openly share research outputs, including visibility, credit, and concerns about scooping. What are the responsibilities of other stakeholders while scientists focus on data analysis? How can publishers and librarians continue to ensure quality peer review, technical curation and longer term availability in an open environment? Will the journal mode continue to be a relevant publishing form, and can there be more coordination across publishers for reciprocity and standard submission formats? Are there obvious pre-competitive gains across the community? Are there industry practices relevant to sustainable sharing in research? All stakeholders are continually challenged to consider more proactive and collective roles to move beyond the current workflows.

Many superb projects showcase the benefits of an open flow of data and communication for discovery, scientific methodology, leveraging community resources, improving public health, cleaner and safer research practices, provenance, and posterity. There is no doubt that excellent science can benefit from increased exchange. The greatest challenges will be in coordinating across stakeholders and disciplines to address community-wide issues. Different parties are consumed with the business of sustaining services and keeping up with technology. If it is the responsibility of scientists to uphold the integrity of scientific practice and contribute their findings to the knowledge landscape, then it is the responsibility of the rest of the research enterprise to work out the supporting technical and organizational infrastructures. It is fundamentally the responsibility of the entire community to consider the training and support of young scientists so that they can successfully navigate emerging data-driven research economies from the beginning of their careers. Realizing the value of open sharing and exchange must be a collective endeavor to identify gains that exceed practical costs, sustain efforts and uphold integrity. While compiled data has long been valued in chemistry, open sharing is not de facto practice, and we look forward to more such conversations among leading stakeholders.

SPEAKERS

Enda Bergin / Nature Research, London, UK

Richard Bourne / University of Leeds, UK

Duncan L. Browne / University of Cardiff, UK

Christoph Bruch / Helmholtz Association, Potsdam, Germany

Ian Bruno / Cambridge Crystallographic Data Centre, UK

Stuart Chalk / University of North Florida, Jacksonville, USA

Helena Cousijn / Elsevier, Amsterdam, The Netherlands

Lee Cronin / University of Glasgow, UK

Johannes Fournier / German Research Foundation, Bonn, Germany

Jeremy Frey / University of Southampton, UK

Kai Karin Geschuhn / Max Planck Digital Library, München, Germany

Rolf Grigat / Leverkusen, Germany

Nicole Jung / Karlsruhe Institute of Technology, Germany

Carsten Kettner / Beilstein-Institut, Frankfurt, Germany

Richard Kidd / Royal Society of Chemistry, Cambridge, UK

Stefan Knapp / Goethe University Frankfurt, Germany

Wolfram Koch / Gesellschaft Deutscher Chemiker, Frankfurt, Germany

Angelina Kraft / Technische Informationsbibliothek, Hannover, Germany

Greg Landrum / KNIME AG, Zurich, Switzerland

Andrew Leach / European Bioinformatics Institute, Hinxton, UK

Frédérique Lisacek / Swiss Institute for Bioinformatics, Switzerland

Leah McEwen / Cornell University, Ithaca, USA

Michael Penk / Beilstein-Institut, Frankfurt, Germany

Henry Rzepa / Imperial College London, UK

Frank Schuhmacher / Max Planck Institute of Colloids and Interfaces, Potsdam, Germany

Vera Szöllösi-Brenig / Volkswagen-Stiftung, Hannover, Germany

Klaus Tochtermann / Leibniz Information Centre for Economics, Kiel, Germany

Matthew Todd / University of Sydney, Australia

Richard Whitby / University of Southampton, UK

Egon Willighagen / Maastricht University, The Netherlands

John Wise / Pistoia Alliance, UK

Roland Wohlgemuth / Sigma-Aldrich, Buchs, Switzerland
