Some Perspectives on the Scalability and Resiliency of Reinforcement Learning Control

Speaker

Sayak Mukherjee

Affiliation

Staff Scientist
Optimization and Control Group
Pacific Northwest National Laboratory

Abstract

Recently, considerable attention has been devoted to reinforcement learning-driven controllers that can be learned without accurate knowledge of the system dynamics, relying instead on persistently excited system trajectory data. Unsurprisingly, such designs face multiple challenges in both theory and implementation. Because these learning techniques are model agnostic, they require gathering a sufficient amount of data, through interactions, to capture how the system behaves under different control actions. The learning time, which translates to the number of data samples required, therefore suffers from the curse of dimensionality as the system size increases. With this motivation, various structural properties of the dynamic system can be exploited to obtain more distributed and scalable learning control designs. We show designs that exploit structures such as time-scale separation in the dynamics, distributed system interconnections, and structure imposed on the control gains, with much of the motivation coming from power grid dynamics problems. We will also show recent applications to grid emergency voltage control using hierarchical structures of policies. Along with these scalability issues, we must also consider the resiliency of these designs, for which we will present two different approaches in this talk. The first resilient design deals with securely learning the control gains in the presence of intruders. In the second, we will present preliminary results on enhancing the cyber resilience of networked microgrids using vertical federated reinforcement learning. In summary, some of these scalability- and resiliency-related aspects will be briefly brought forward to the workshop audience for further discussion during the conference.

Bio


Sayak Mukherjee is a Staff Scientist in the Optimization and Control group at the Pacific Northwest National Laboratory. He received his Ph.D. in Electrical Engineering from North Carolina State University, USA, in 2020 and his B.E.E. from Jadavpur University, India, in 2015. He joined PNNL as a Post-doctoral Research Associate. He is currently working on several problems involving system-theoretic approaches to reinforcement learning (RL)-based control of dynamical systems, with applications to power and energy systems. His areas of expertise include systems theory, control, and learning, especially data-driven optimal control using reinforcement learning and adaptive dynamic programming, AI/ML for dynamical systems, resilient control designs, large-scale power system stability and control, and grid operation with distributed energy resources.