Research Data in Organic Chemistry
One-day workshop on July 24, 2019
Beilstein-Institut, Frankfurt am Main
The Beilstein-Institut organized a one-day workshop with early-career organic chemists from several German research groups. To set the scene for the discussions, Prof. Christoph Steinbeck from the University of Jena gave an overview of research data management in chemistry. The extended discussion that followed yielded many insights into everyday laboratory practice:
In most cases, research data is stored in whatever way an individual research group considers most appropriate. This is generally a pragmatic solution that results from insufficient resources, whether personnel, software, IT infrastructure or funding. A major hindrance is the lack of an easy-to-use digitized workflow that links laboratory equipment with data management systems and with repositories that not only store the data but also process it into a form ready for publication. Furthermore, when it comes to making data reusable, there are hardly any generally applicable guidelines on what should be stored, in what form and for how long. As a result, once projects have been completed and archived, it is often difficult to locate specific data. Routine data sharing is hard to imagine under these circumstances. At the same time, only a few concrete use cases are apparent at the moment, which limits the driving force for change.
Creating a network of research data repositories is one of the tasks of the National Research Data Infrastructure (NFDI) consortia. To enable this, Minimum Information Standards are needed, as are clear standards for good laboratory practice. Defining these standards, including metadata, for the individual subject areas will be a huge but not insurmountable task. It calls for pioneers who lead by example and have a certain reputation in the scientific world. It is important not only to provide the necessary incentives and support for storing and sharing data, but also to ensure that such an infrastructure is sustainable. To accomplish this, it is vital that universities create sufficient permanent positions for each research department or group and redefine existing positions in libraries to ensure long-term support for data management. In addition, funders, for example the German Research Foundation (DFG), should implement procedures to check compliance with data management requirements during and after the runtime of their projects.
An important initial step that would accelerate the use and development of a repository would be the availability of a suitable open source Electronic Laboratory Notebook for the chemistry community. If data recording and management were made significantly easier, the cultural change that data sharing also requires, namely a rethinking in the minds of scientists, would be easier to achieve. At the moment, it is sometimes difficult to get answers to simple questions on synthesis problems, even between members of different research groups at the same university.
One of the wishes of the workshop participants is to find a way within the publishing workflow to make the original data available to referees to facilitate plausibility checking. Initially, publishers could serve as a collection point for original data. The goal, however, should be a central repository that contains raw data, processed data and images for easy manual scanning. Publication of raw data would also make fraudulent claims significantly more difficult. Another wish is to enable routine storage of, and searching for, structures and reactions in an open database. It would be very useful, particularly for AI/ML applications, if negative experiments were documented. With regard to further databases, the participants did not see a need for physical or biophysical data (except for NMR data).
The participants evaluated the workshop positively, and there is great interest in a follow-up meeting.