Sharing data. All data
Data and metadata generated by transmission electron microscopes exist in a format that makes them difficult to share openly. Cécile Hébert, the director of EPFL's Electron Spectrometry and Microscopy Laboratory (LSME), wants to change that.
A PhD student at EPFL’s LSME lab who came into possession of a fragment of a meteorite managed to extract the secrets of this priceless nugget using a transmission electron microscope. These bulky and relatively expensive instruments allow researchers to determine the chemical composition of natural and artificial materials down to the atomic level. The microscopes produce terabytes of data and metadata – which are just as precious as the extra-terrestrial sample analyzed by the PhD student. However, because of the format adopted by the instrument’s manufacturer, the data are next to impossible to share with the rest of the scientific community. It’s even a struggle for another LSME researcher to use them.
“These instruments are made and sold by companies that also provide maintenance services,” says Cécile Hébert. “Developments in electronics have increased the stability, resolution and accuracy of these instruments, and they are now entirely controlled by computer, including their data acquisition. This means that the suppliers develop the software, deciding which data acquisition and storage formats are used and which metadata are produced. As a result, if we turn them into open-access data, they become almost unusable.” Even when a lab owns the instrument and the software, some of the data will be recorded in the scientists’ lab books, making it hard to search for information.
“The raw data generated by the instrument must be accessible to users,” Hébert continues. “We want to develop formats and ways of distributing and documenting these data so that they are genuinely open access.” The plan is to focus initially on EPFL, so that other labs here that use microscopes can make their data openly accessible. Outside labs could eventually adopt the same approach. “If we can show the community that it’s possible, we’ll have more leverage with the manufacturers,” concludes Hébert.
Cécile Hébert in her laboratory at the LSME.
© Alain Herzog
Open standard, open tools
The lab plans to hire a post-doctoral researcher to convert the data into an open format and document them with metadata. The post-doc will need to identify which metadata are important, find ways of obtaining any that are missing, and develop and document tools that can do this work semi-automatically. The result will be an open standard and a system for obtaining the right metadata, along with open-access tools for reading, converting and processing data from transmission electron microscopes.
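To give a feel for what a semi-automatic metadata check might look like, here is a minimal Python sketch. It is purely illustrative and not the LSME's tooling: the field names in `REQUIRED_FIELDS`, the helper functions and the choice of JSON as the open format are all assumptions made for this example, since the project has yet to define which metadata matter.

```python
import json

# Hypothetical list of metadata fields a lab might consider essential.
# The real list is exactly what the project still has to work out.
REQUIRED_FIELDS = (
    "instrument",
    "acceleration_voltage_kv",
    "detector",
    "acquired_at",
    "operator",
)


def missing_metadata(metadata, required=REQUIRED_FIELDS):
    """Return the required field names that are absent or left empty."""
    return [name for name in required if metadata.get(name) in (None, "")]


def to_open_record(metadata, required=REQUIRED_FIELDS):
    """Serialize the metadata as JSON and list the fields a person must still supply."""
    record = {
        "metadata": dict(metadata),
        "missing": missing_metadata(metadata, required),
    }
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Made-up example values, not taken from a real acquisition.
    sample = {"instrument": "TEM-1", "acceleration_voltage_kv": 200, "detector": ""}
    print(to_open_record(sample))
```

In a sketch like this, the software fills in whatever the instrument already provides and flags the rest, so a researcher only has to supply the details that would otherwise stay in a lab book.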
The work, which is expected to take around two years, will be funded by the Open Science Fund and the LSME’s own budget. As Hébert explains, “although this isn’t a core area of research for my lab, I started the project because I just don’t find the current situation workable.”