Abstract:
Safe reinforcement learning (Safe RL) seeks to learn policies that maximize cumulative reward while satisfying stringent safety constraints during both training and deployment. Most existing solutions, e.g., Lyapunov- and barrier-based methods, either lack the flexibility to handle nonlinear dynamics or rely on analytically hand-crafted safety certificates. To overcome these limitations, we introduce Neural-barrier Lyapunov-constrained Proximal Policy Optimization (NBLC-PPO), a general architecture that combines data-driven neural control barrier functions, Lyapunov stability filters, and trust-region policy updates within PPO. The approach enforces safe actions at every step while providing stability and constraint-satisfaction guarantees in nonlinear environments. NBLC-PPO learns safety certificates and policy parameters simultaneously, enforcing dynamic feasibility through differentiable constraints embedded in the optimization loop. Empirical evaluations show that NBLC-PPO attains state-of-the-art safety-performance trade-offs in constrained control tasks. It achieves a cumulative reward of ∼24, outperforming Lyapunov-PPO (∼19) and baseline PPO (∼17.5), while keeping the average constraint violation to only 0.04–0.06. It also achieves a safety rate above 98.5%, a training stability score of nearly 0.95, and converges 33% faster than baseline PPO. Furthermore, it delivers a reward-to-constraint ratio of over 500, a 66% improvement over Lyapunov-PPO and 2.5× that of baseline PPO. These findings confirm the effectiveness of NBLC-PPO in enabling safe, stable, and high-performing RL in real-world constrained environments.
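To make the idea of "differentiable constraints in the optimization loop" concrete, the sketch below shows one plausible way learned barrier and Lyapunov penalties could be folded into a clipped PPO objective. This is a minimal illustrative sketch, not the authors' implementation: the network shapes, the discrete-time barrier and Lyapunov decrease conditions, and the penalty weights (`alpha`, `beta`, `lam_cbf`, `lam_lyap`) are hypothetical placeholders, and the per-step safe-action filter mentioned in the abstract is omitted.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Small feed-forward network used here for the policy mean, barrier h(s), and Lyapunov V(s)."""
    def __init__(self, in_dim, out_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

class GaussianPolicy(nn.Module):
    """Diagonal-Gaussian policy with a state-independent log standard deviation."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.mean = MLP(obs_dim, act_dim)
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs):
        return torch.distributions.Normal(self.mean(obs), self.log_std.exp())

def nblc_ppo_loss(policy, barrier, lyapunov,
                  obs, next_obs, actions, old_log_probs, advantages,
                  clip_eps=0.2, alpha=0.1, beta=0.1,
                  lam_cbf=1.0, lam_lyap=1.0):
    """Clipped PPO surrogate augmented with differentiable barrier/Lyapunov penalties (illustrative)."""
    # Standard PPO clipped surrogate objective.
    dist = policy(obs)
    log_probs = dist.log_prob(actions).sum(-1)
    ratio = torch.exp(log_probs - old_log_probs)
    surrogate = torch.min(
        ratio * advantages,
        torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages,
    )

    # Discrete-time barrier condition: h(s') >= (1 - alpha) * h(s); penalize violations.
    h, h_next = barrier(obs).squeeze(-1), barrier(next_obs).squeeze(-1)
    cbf_violation = torch.relu((1.0 - alpha) * h - h_next)

    # Lyapunov decrease condition: V(s') <= (1 - beta) * V(s); penalize violations.
    v, v_next = lyapunov(obs).squeeze(-1), lyapunov(next_obs).squeeze(-1)
    lyap_violation = torch.relu(v_next - (1.0 - beta) * v)

    # Minimize the negative surrogate plus the two constraint penalties.
    return (-surrogate.mean()
            + lam_cbf * cbf_violation.mean()
            + lam_lyap * lyap_violation.mean())
```

In a full training loop, this loss would be minimized with the usual PPO rollout and advantage-estimation machinery, and a per-step safety filter over sampled actions (as suggested by the abstract's "per-step safe action enforcement") would sit on top of this objective.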