Meta AI recently announced that it will soon release an entirely new data set for green hydrogen fuel ML modeling and simulation, focused on oxide catalysts for the oxygen evolution reaction (OER), a critical chemical reaction used in green hydrogen fuel production via wind and solar energy.
Meta AI and the Carnegie Mellon University (CMU) Department of Chemical Engineering are collaborating on the Open Catalyst Project to speed up the discovery of green-energy catalysts. The project uses machine learning to find catalysts that help convert renewable energy from sources such as solar and wind into fuels such as hydrogen, for clean energy use and storage. The team collects simulation results from density functional theory (DFT) calculations on different materials and trains ML models to stand in for those calculations, with the aim of getting faster and better results. In October 2020 they open-sourced OC20, one of the world's largest training data sets for renewable energy storage. The data set consists of 1,281,040 DFT calculations covering a wide range of materials, surfaces, and adsorbates.
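The idea of replacing an expensive simulation with a learned model can be illustrated with a toy sketch. The code below is not the Open Catalyst Project's method or data: `expensive_energy` is a hypothetical stand-in for a DFT calculation (a simple Morse potential with made-up parameters), and the surrogate is a basic Gaussian kernel regression chosen only because it fits in a few lines of standard-library Python.

```python
import math


def expensive_energy(r):
    """Hypothetical stand-in for a costly DFT calculation.

    Returns a Morse-potential energy for a separation r. The parameters
    are illustrative only and do not describe any real material.
    """
    depth, width, r_eq = 1.0, 1.5, 1.2
    return depth * (1.0 - math.exp(-width * (r - r_eq))) ** 2 - depth


def fit_surrogate(train_r, train_e, bandwidth=0.05):
    """Build a Gaussian kernel-regression predictor from (r, energy) pairs."""
    def predict(r):
        # Weight each training energy by its closeness to the query point.
        weights = [
            math.exp(-((r - ri) ** 2) / (2.0 * bandwidth ** 2))
            for ri in train_r
        ]
        return sum(w * e for w, e in zip(weights, train_e)) / sum(weights)
    return predict


# "Run the simulation" once on a grid to build the training set.
train_r = [0.8 + 0.05 * i for i in range(41)]
train_e = [expensive_energy(r) for r in train_r]
surrogate = fit_surrogate(train_r, train_e)

# Check the surrogate on points it has not seen (grid midpoints).
test_r = [0.825 + 0.05 * i for i in range(40)]
max_error = max(abs(surrogate(r) - expensive_energy(r)) for r in test_r)
print(f"largest held-out error: {max_error:.4f}")
```

In a real setting the training energies come from DFT runs over atomic structures rather than a one-dimensional formula, and the models are far larger, but the workflow has the same shape: pay for the simulations once, then query the cheap model many times.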
OER is one of the most important chemical reactions in hydrogen generation. It plays a role in many renewable energy technologies, including solar and wind energy and the rechargeable metal-air batteries used in electric cars. The metal oxides typically used in the process, such as ruthenium and iridium oxides, are limited in availability and expensive. This has driven considerable effort to find low-cost OER catalysts that make the process fast and economical.
The Meta AI blog described the upcoming data set in a recently published post titled "Accelerating renewable energy with new data set for green hydrogen fuel":
The OER data set contains ~8M data points from 40K unique simulations. We believe it’s the largest data set for oxide catalysis to date, spanning a swath of oxide materials across 52 elements. It includes interactions between the surfaces of the oxide materials and five important molecules (O, OH, H2O, OOH, and O2) involved in OER, in addition to surface interactions with CO, H, C, and N. It also explores interactions on the surface when crystal defects and multiple molecules are present. The data set and baseline models will be open-sourced in the coming months to help the global scientific community advance renewable energy technologies.
Generating this data set requires millions of compute hours. Meta AI stated that, as part of Meta’s Net Zero program, the carbon emissions from the computation used to build the data set would be 100 percent offset.
To attract machine-learning engineers and researchers to this area, Meta AI and CMU organized a competition last June for the best ML models for the catalyst selection problem. Meta AI selected the winners during the NeurIPS 2021 conference last December. They hope these efforts will speed up the search for better solutions to the green-energy catalyst discovery problem.