Verifying Online Safety Properties for Safe Deep Reinforcement Learning
Author(s)
Marzari, Luca; Cicalese, Ferdinando; Farinelli, Alessandro; Amato, Christopher; Marchesini, Enrico
Download: 3770068.pdf (1.867 MB)
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Terms of use
Abstract
Ensuring safety in reinforcement learning (RL) is critical for deploying agents in real-world applications. During training, current safe RL approaches often rely on indicator cost functions that provide sparse feedback, resulting in two key limitations: (i) poor sample efficiency due to the lack of safety information in neighboring states, and (ii) dependence on cost-value functions, leading to brittle convergence and suboptimal performance. After training, safety is guaranteed via formal verification methods for deep neural networks (FV), whose computational complexity hinders their application during training. We address the limitations of using cost functions via verification by proposing a safe RL method based on a violation value---the risk associated with policy decisions in a portion of the state space. Our approach verifies safety properties (i.e., state-action pairs) that may lead to unsafe behavior, and quantifies the size of the state space where properties are violated. This violation value is then used to penalize the agent during training to encourage safer policy behavior. Given the NP-hard nature of FV, we propose an efficient, sample-based approximation with probabilistic guarantees to compute the violation value. Extensive experiments on standard benchmarks and real-world robotic navigation tasks show that violation-augmented approaches significantly improve safety by reducing the number of unsafe states encountered while achieving superior performance compared to existing methods.
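To illustrate the sample-based approximation mentioned in the abstract, below is a minimal Python sketch (not the authors' implementation) of estimating a violation value over a region of the state space: states are sampled, the policy's action is checked against a safety predicate, and a Hoeffding-style bound picks the sample size so the empirical violation rate is within a chosen tolerance with high probability. The names sample_state, action_of, and is_unsafe are hypothetical placeholders for the environment, policy, and safety property.

    # Sketch: sample-based estimate of a violation value with a
    # Hoeffding-style probabilistic guarantee. All callables below are
    # illustrative assumptions, not part of the paper's code.
    import math
    from typing import Callable, Sequence, Tuple

    def hoeffding_sample_size(epsilon: float, delta: float) -> int:
        """Samples needed so the empirical violation rate is within
        +/- epsilon of the true rate with probability >= 1 - delta."""
        return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

    def estimate_violation(
        sample_state: Callable[[], Sequence[float]],        # draw a state uniformly from the region
        action_of: Callable[[Sequence[float]], int],        # the policy's (greedy) action
        is_unsafe: Callable[[Sequence[float], int], bool],  # safety predicate on (state, action)
        epsilon: float = 0.05,
        delta: float = 0.01,
    ) -> Tuple[float, int]:
        """Return (empirical violation rate, number of samples used)."""
        n = hoeffding_sample_size(epsilon, delta)
        violations = 0
        for _ in range(n):
            s = sample_state()
            if is_unsafe(s, action_of(s)):
                violations += 1
        return violations / n, n

    # During training, such an estimate could be folded into the return as a
    # penalty, e.g.  shaped_reward = reward - penalty_weight * violation_rate,
    # which is one plausible reading of the "violation-augmented" training
    # signal described in the abstract.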
Date issued
2025-09-30
Department
Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Journal
ACM Transactions on Intelligent Systems and Technology
Publisher
ACM
Citation
Luca Marzari, Ferdinando Cicalese, Alessandro Farinelli, Christopher Amato, and Enrico Marchesini. 2025. Verifying Online Safety Properties for Safe Deep Reinforcement Learning. ACM Trans. Intell. Syst. Technol.
Version: Final published version
ISSN
2157-6904