A multi-scale time-series dataset with benchmark for machine learning in decarbonized energy grids.


  • The electric grid is a key enabling infrastructure for the ambitious transition towards carbon neutrality as we grapple with climate change. With deepening penetration of renewable resources, the reliable operation of the electric grid becomes increasingly challenging. In this paper, we present PSML, a first-of-its-kind open-access multi-scale time-series dataset, to aid in the development of data-driven machine learning (ML)-based approaches towards reliable operation of future electric grids. The dataset is synthesized from a joint transmission and distribution electric grid to capture the increasingly important interactions and uncertainties of the grid dynamics, and contains power, voltage and current measurements over multiple spatio-temporal scales. Using PSML, we provide state-of-the-art ML benchmarks on three challenging use cases of critical importance: (i) early detection, accurate classification and localization of dynamic disturbances; (ii) robust hierarchical forecasting of load and renewable energy; and (iii) realistic synthetic generation of physical-law-constrained measurements. We envision that this dataset will enable use-inspired ML research in safety-critical systems, while simultaneously enabling ML researchers to contribute towards decarbonization of energy sectors.

published proceedings

  • Sci Data

author list (cited authors)

  • Zheng, X., Xu, N., Trinh, L., Wu, D., Huang, T., Sivaranjani, S., Liu, Y., & Xie, L.

citation count

  • 6

complete list of authors

  • Zheng, Xiangtian; Xu, Nan; Trinh, Loc; Wu, Dongqi; Huang, Tong; Sivaranjani, S; Liu, Yan; Xie, Le

publication date

  • June 2022