Verifying Online Safety Properties for Safe Deep Reinforcement Learning
Author(s)
Marzari, Luca; Cicalese, Ferdinando; Farinelli, Alessandro; Amato, Christopher; Marchesini, Enrico
Download: 3770068.pdf (1.867 MB)
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Terms of use
Abstract
Ensuring safety in reinforcement learning (RL) is critical for deploying agents in real-world applications. During training, current safe RL approaches often rely on indicator cost functions that provide sparse feedback, resulting in two key limitations: (i) poor sample efficiency due to the lack of safety information in neighboring states, and (ii) dependence on cost-value functions, leading to brittle convergence and suboptimal performance. After training, safety is guaranteed via formal verification methods for deep neural networks (FV), whose computational complexity hinders their application during training. We address the limitations of using cost functions via verification by proposing a safe RL method based on a violation value---the risk associated with policy decisions in a portion of the state space. Our approach verifies safety properties (i.e., state-action pairs) that may lead to unsafe behavior, and quantifies the size of the state space where properties are violated. This violation value is then used to penalize the agent during training to encourage safer policy behavior. Given the NP-hard nature of FV, we propose an efficient, sample-based approximation with probabilistic guarantees to compute the violation value. Extensive experiments on standard benchmarks and real-world robotic navigation tasks show that violation-augmented approaches significantly improve safety by reducing the number of unsafe states encountered while achieving superior performance compared to existing methods.
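To illustrate the sample-based approximation mentioned in the abstract, below is a minimal Python sketch (not the authors' implementation) of estimating a violation value over a region of the state space: states are sampled, the policy's action is checked against a safety predicate, and a Hoeffding-style bound picks the sample size so the empirical violation rate is within a chosen tolerance with high probability. The names sample_state, action_of, and is_unsafe are hypothetical placeholders for the environment, policy, and safety property.

    # Sketch: sample-based estimate of a violation value with a
    # Hoeffding-style probabilistic guarantee. All callables below are
    # illustrative assumptions, not part of the paper's code.
    import math
    from typing import Callable, Sequence, Tuple

    def hoeffding_sample_size(epsilon: float, delta: float) -> int:
        """Samples needed so the empirical violation rate is within
        +/- epsilon of the true rate with probability >= 1 - delta."""
        return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

    def estimate_violation(
        sample_state: Callable[[], Sequence[float]],        # draw a state uniformly from the region
        action_of: Callable[[Sequence[float]], int],        # the policy's (greedy) action
        is_unsafe: Callable[[Sequence[float], int], bool],  # safety predicate on (state, action)
        epsilon: float = 0.05,
        delta: float = 0.01,
    ) -> Tuple[float, int]:
        """Return (empirical violation rate, number of samples used)."""
        n = hoeffding_sample_size(epsilon, delta)
        violations = 0
        for _ in range(n):
            s = sample_state()
            if is_unsafe(s, action_of(s)):
                violations += 1
        return violations / n, n

    # During training, such an estimate could be folded into the return as a
    # penalty, e.g.  shaped_reward = reward - penalty_weight * violation_rate,
    # which is one plausible reading of the "violation-augmented" training
    # signal described in the abstract.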
Date issued
2025-09-30
Department
Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Journal
ACM Transactions on Intelligent Systems and Technology
Publisher
ACM
Citation
Luca Marzari, Ferdinando Cicalese, Alessandro Farinelli, Christopher Amato, and Enrico Marchesini. 2025. Verifying Online Safety Properties for Safe Deep Reinforcement Learning. ACM Trans. Intell. Syst. Technol.
Version: Final published version
ISSN
2157-6904