"Before you let the agent act, you need the rules and a bounded context — some kind of guardrail that allows you to audit, understand, and make judgments on the agent's decisions."
The Boom Nobody Is Thinking Carefully About
Agents are everywhere right now. OpenAI acquired the team behind OpenDevin (now OpenHands), one of the most prominent open-source cognitive agent frameworks, and the news was greeted with the kind of excitement reserved for moon landings. Every week another framework drops, another benchmark is shattered, another demo goes viral showing an AI autonomously browsing the web, writing code, deploying to production, and sending emails — all from a single natural language instruction.
And nobody, or almost nobody, is asking the obvious question: What stops it from doing something catastrophically wrong?
This isn't a doomsday essay. The capabilities are real, and the productivity gains are real. But in the rush to ship agents, the industry has quietly agreed to skip the part where we define what agents are actually allowed to do. We give them access to databases, filesystems, APIs, cloud credentials, and communication channels. We tell them the goal. We hit run. We pray.
The problem is that we are treating the LLM as both the engine and the steering wheel and the brakes — and LLMs were not designed to be any of those last two things.
The Unconstrained Agent Problem
Consider how humans operate within organizations. You have access to some systems and not others. You can approve invoices up to a certain value. You can send emails on behalf of your team but not the CEO. You can delete files in your own directory but not in finance. These constraints are not bugs — they are the architecture of trust. They exist because even the most competent, well-intentioned person makes mistakes, and bounded authority limits the blast radius of those mistakes.
Now consider what happens when you give an LLM-powered agent unrestricted access to everything and tell it to "make the business run 200% faster." The agent does not have intuitions about appropriate scope. It does not feel the social friction of overstepping. It does not hesitate before spending the company's entire cloud budget on a compute-intensive experiment it decided, autonomously, was a good idea.
We would never grant a new employee — or even a senior one — that kind of unconstrained authority. Yet we routinely grant it to agents because the demo looked impressive and the deadline is next sprint.
The issue is structural. Powerful LLMs are excellent at pattern completion, language understanding, and even multi-step reasoning within a context window. What they are not is rule-compliant by construction. You can ask an LLM to follow rules. You can put rules in the system prompt. But there is no formal guarantee, no verifiable enforcement, no audit trail that proves the rules were respected. The guardrails are suggestions whispered to a very confident guesser.
To fix this, we need to think about what actually constrains behavior in reliable systems — and that means going back, much further back than the transformer paper.
The Quiet Return to the 1970s
Here is the uncomfortable truth: the core problem of "how do you make an automated reasoning system behave within defined boundaries" was studied intensively from the 1960s through the 1990s. The field was called knowledge engineering, and its primary artifact was the expert system — a program that encoded human expertise as explicit rules and used a reasoning engine to draw conclusions and decide actions.
Expert systems fell out of fashion. Neural networks ate the world. But the problems they were designed to solve did not disappear. They are back, wearing agent-shaped clothing, and they are considerably more powerful and therefore considerably more dangerous when left unconstrained.
To understand what we lost — and what we can recover — we need to revisit two foundational technologies in depth: Prolog and CLIPS.
Prolog: Logic as Computation
Prolog (from the French programmation en logique, "programming in logic") was developed in the early 1970s, primarily by Alain Colmerauer and Robert Kowalski, and it represented a genuinely radical idea: what if the program was not a sequence of instructions, but a set of logical facts and rules, and computation was the act of proving whether something was true?
The Anatomy of Prolog
A Prolog program consists of three kinds of statements:
Facts — unconditional assertions about the world:
```prolog
% Clearance levels are integers so that >=/2 can compare them below.
has_clearance(alice, 3).
has_clearance(bob, 1).
resource_sensitivity(patient_records, 3).
resource_sensitivity(public_docs, 1).
```
Rules — conditional assertions derived from other facts:
```prolog
can_access(User, Resource) :-
    has_clearance(User, Level),
    resource_sensitivity(Resource, Required),
    Level >= Required.
```
Queries — questions posed to the system:
```prolog
?- can_access(alice, patient_records).
% true
?- can_access(bob, patient_records).
% false
```
The Prolog engine uses SLD resolution — a form of backward chaining — to prove or disprove queries. It does not execute a procedure; it constructs a proof. The difference matters enormously. When a Prolog program says "bob cannot access patient records," it is not returning an error code or a probability — it is the logical conclusion of the rules you encoded. You can inspect the proof tree. You can audit exactly why the answer was what it was.
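For readers who do not think in Prolog, the same query behavior can be sketched in plain Python. The dictionaries and the `can_access` function below are illustrative stand-ins for the facts and rule above, not a real resolution engine:

```python
# Facts: clearance levels and resource sensitivities as plain data.
# Levels are integers so they can be compared, mirroring Level >= Required.
clearance = {"alice": 3, "bob": 1}
sensitivity = {"patient_records": 3, "public_docs": 1}

def can_access(user: str, resource: str) -> bool:
    """Mirrors: can_access(User, Resource) :-
         has_clearance(User, Level),
         resource_sensitivity(Resource, Required),
         Level >= Required."""
    return (user in clearance
            and resource in sensitivity
            and clearance[user] >= sensitivity[resource])

print(can_access("alice", "patient_records"))  # True
print(can_access("bob", "patient_records"))    # False
```

Note what this sketch cannot do: answer an open query like "who can access patient_records?" Enumerating all solutions from the same rules is exactly what the resolution engine adds.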
Prolog and the Knowledge Representation Problem
Prolog excels at domains where knowledge can be expressed as Horn clauses — implications of the form "if A and B and C, then D." Medical diagnosis, legal reasoning, configuration validation, access control — all of these map naturally onto Prolog's paradigm.
For agentic systems, this translates directly. Consider encoding agent permissions as Prolog rules:
```prolog
% An agent can perform an action if:
%   1. It has the capability
%   2. The action is permitted in the current context
%   3. The preconditions are satisfied
agent_may_act(Agent, Action, Context) :-
    agent_capability(Agent, Action),
    context_permits(Context, Action),
    preconditions_satisfied(Action, Context).

% Specific rules. Under the closed-world assumption, simply never asserting
% context_permits(production, database_write) already forbids it; the
% explicit failing clause documents the prohibition for human readers.
context_permits(production, database_write) :- fail.  % Never in prod
context_permits(staging, database_write).
context_permits(_, database_read).

preconditions_satisfied(send_email, Context) :-
    user_approved(Context, send_email).
```
Every action the agent wants to take can be routed through this reasoning layer. The layer is transparent, auditable, and formally correct given the rules you specify.
Where Prolog Struggles
Prolog is not a universal solution. Its constraint system, while powerful, can be difficult to scale. The closed-world assumption — the premise that anything not explicitly stated as true is false — creates problems in open domains where incomplete knowledge is the norm. Negation in Prolog is handled through negation as failure, which is semantically clean but can produce counterintuitive results in complex rule sets.
More practically: Prolog requires experts who can think in formal logic, and the tooling around it — debugging, profiling, integration with modern systems — is sparse compared to mainstream languages. Writing a large Prolog codebase is not merely "a bit magical." It is entirely magical to most engineers, and that opacity is a genuine barrier.
This is why, for production expert systems at scale, the world eventually turned to CLIPS.
CLIPS: The Production-Grade Expert System Shell
CLIPS — C Language Integrated Production System — was developed at NASA's Johnson Space Center in the mid-1980s. The need was specific and revealing: NASA needed to build expert systems that could run reliably on production hardware, integrate with C-based infrastructure, and be maintained by engineers who were not Prolog specialists.
The result was one of the most successful knowledge engineering tools ever built, eventually used across NASA, the U.S. military, the IRS, major financial institutions, and industrial control systems.
The CLIPS Architecture
CLIPS implements the Rete algorithm for rule matching, which makes it highly efficient at pattern matching across large working memory. Its architecture has three core components:
1. The Fact Base (Working Memory)
Facts are the current state of the world as the system understands it:
```clips
; (Assumes deftemplates for agent-request, agent-clearance, and environment.)
(assert (agent-request (agent "agent-01")
                       (action "delete-records")
                       (target "production-db")
                       (timestamp 1718294400)))
(assert (agent-clearance (agent "agent-01") (level "standard")))
(assert (environment (name "production") (criticality "high")))
```
2. The Rule Base (Production Memory)
Rules define what to do when certain patterns are present in the fact base:
```clips
(defrule block-destructive-actions-in-production
  "Prevent any destructive action in high-criticality environments"
  ?req <- (agent-request (agent ?a) (action ?act) (target ?t))
  (environment (criticality "high"))
  (test (member$ ?act (create$ "delete-records" "drop-table" "wipe-storage")))
  =>
  (assert (action-blocked (agent ?a) (action ?act) (target ?t)
                          (reason "Destructive action in high-criticality environment")))
  (retract ?req))
```
```clips
(defrule require-approval-for-financial-actions
  "Financial actions above threshold require human approval"
  ?req <- (agent-request (agent ?a) (action "transfer-funds") (amount ?amt))
  (test (> ?amt 10000))
  =>
  (retract ?req)
  (assert (pending-approval (agent ?a) (action "transfer-funds")
                            (amount ?amt) (approver "human-supervisor"))))
```
3. The Inference Engine
The Rete-based inference engine continuously matches rules against the fact base and fires applicable rules in priority order. The result is a system that processes state changes in real time, applying expert-encoded rules to every incoming situation.
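The match-fire cycle can be illustrated with a deliberately naive forward chainer. Real Rete networks avoid re-matching every fact on every cycle; the fact shapes and the single rule here are invented for the sketch:

```python
# Naive forward chaining: fire every applicable rule, add the conclusions
# to working memory, and repeat until nothing new is derived.
def run(facts: set, rules) -> set:
    facts = set(facts)
    while True:
        derived = {conclude(f)
                   for condition, conclude in rules
                   for f in facts if condition(f)} - facts
        if not derived:
            return facts
        facts |= derived

rules = [
    # "block destructive actions": a delete-records request derives a block
    (lambda f: f[0] == "request" and f[2] == "delete-records",
     lambda f: ("blocked", f[1], "destructive action")),
]
wm = {("request", "agent-01", "delete-records")}
print(sorted(run(wm, rules)))
# [('blocked', 'agent-01', 'destructive action'),
#  ('request', 'agent-01', 'delete-records')]
```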
Why CLIPS Solved What Prolog Couldn't for Production Use
CLIPS was designed from the ground up for the constraints of real operational systems:
Explainability — every rule that fires produces an audit trail. You can reconstruct exactly why the system made a decision, which rule triggered, what facts were matched, and what conclusions were drawn. This is not a feature bolted on. It is intrinsic to how production systems work.
Integration — CLIPS is written in C and designed to embed into larger systems. NASA used it inside mission-critical software. Modern ports exist for Java (Jess), Python (pyclips, experta), and JavaScript.
Forward chaining — unlike Prolog's backward chaining, CLIPS primarily uses forward chaining: it starts with known facts and derives new ones until no more rules can fire. This maps naturally onto event-driven systems where you receive observations and need to reason forward to actions and alerts.
Conflict resolution — when multiple rules are applicable simultaneously, CLIPS uses configurable conflict resolution strategies (priority, recency, specificity) to determine firing order. This lets you encode policy hierarchies explicitly.
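A minimal sketch of conflict resolution: the agenda entries and the salience-then-recency ordering below are illustrative, and CLIPS offers several other built-in strategies:

```python
# Pick the next rule activation from the agenda: highest salience first,
# ties broken by recency of the matched facts.
def next_activation(agenda):
    return max(agenda, key=lambda a: (a["salience"], a["recency"]))

agenda = [
    {"rule": "log-request",       "salience": 0,  "recency": 5},
    {"rule": "block-destructive", "salience": 10, "recency": 3},
    {"rule": "audit-trail",       "salience": 10, "recency": 1},
]
print(next_activation(agenda)["rule"])  # block-destructive
```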
CLIPS in the Agent Context
The mental model for deploying CLIPS as an agent guardrail layer is straightforward. Every action the agent wishes to take is translated into a CLIPS fact and asserted into the working memory. The rule engine runs. If the action is permitted, it proceeds; if blocked, an explanation is returned to the agent or escalated to a human.
```clips
; The agent wants to send a customer email
(assert (agent-request (agent "marketing-agent-7")
                       (action "send-email")
                       (recipient "customer-segment-A")
                       (content-type "promotional")
                       (volume 15000)))

; Rule: promotional emails require opt-in verification
(defrule verify-opt-in-before-promotional-send
  ?req <- (agent-request (agent ?a) (action "send-email")
                         (content-type "promotional") (recipient ?seg))
  (not (opt-in-verified (segment ?seg)))
  =>
  (retract ?req)
  (assert (action-blocked (agent ?a) (action "send-email")
                          (reason "Opt-in status not verified for segment")))
  (assert (required-verification (type "opt-in") (segment ?seg))))
```
The agent cannot send the email. The reason is explicit and machine-readable. The required remediation step is automatically asserted. No probabilistic guessing. No "trust me, I read the policy in my system prompt."
The Deeper Problem: Rules Are Not Enough Without Structure
CLIPS and Prolog are powerful, but they share a fundamental limitation: the rules they enforce are only as good as the knowledge engineering that produced them. Rules are brittle against novelty. An expert system with 10,000 rules about medical diagnosis will fail gracefully but silently when presented with a disease that emerged after the rules were written.
For agentic systems operating in open-ended domains, this is a serious problem. Agents are precisely valuable because they can operate in novel situations. If every situation requires an explicit rule written in advance, the knowledge engineering burden becomes impossible.
What we need is a layer above individual rules — a formal framework that defines the structure of what agents can do and how they can interact, in a way that is expressive enough to cover novel situations and formal enough to enforce. This is where Promise Theory and ontological action encoding enter the picture.
Promise Theory: A Formal Grammar for Agent Behavior
Promise Theory, developed by Mark Burgess (the creator of CFEngine) over the course of the 2000s and 2010s, is a mathematical framework for describing how autonomous agents interact through voluntary commitments. It was originally developed for distributed systems configuration management, but its scope is considerably broader than that origin suggests.
The Core Concepts
The fundamental unit of Promise Theory is, unsurprisingly, the promise: a declaration by an agent about its own future behavior.
Notation:
π+(a, b, B, τ)
This means: Agent a makes a positive promise to agent b regarding behavior B within context τ.
Promises have a crucial asymmetry: an agent can only promise about its own behavior, never about the behavior of others. You cannot promise that the database will be available. You can only promise that you will attempt to access it and handle failure gracefully. This asymmetry turns out to be the key to modeling realistic distributed systems — and agentic systems.
Key Promise Types:
+promise (provision): I will provide this to you.
-promise (use): I will accept or consume this from you.
Cooperation: When a makes a +π to b and b makes a −π to a for the same behavior B, they form a cooperation — a bilateral agreement.
Body of a Promise:
The body B is not just a string. It is a typed, constrained specification:
```
B = {action:          "write_file",
     path_constraint: "/tmp/*",
     max_size_bytes:  1048576,
     precondition:    "caller_authenticated",
     postcondition:   "file_integrity_verified"}
```
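A reduced version of such a body can be modeled directly. The precondition and postcondition fields are omitted here, and `fnmatch` stands in for whatever path-constraint language a real system would use — both simplifications are assumptions of the sketch:

```python
from dataclasses import dataclass
from fnmatch import fnmatch

# Typed promise body: field names follow the example above.
@dataclass(frozen=True)
class PromiseBody:
    action: str
    path_constraint: str  # glob pattern, e.g. "/tmp/*"
    max_size_bytes: int

def within_scope(body: PromiseBody, action: str, path: str, size: int) -> bool:
    """A concrete request is covered only if it fits the promised body."""
    return (action == body.action
            and fnmatch(path, body.path_constraint)
            and size <= body.max_size_bytes)

b = PromiseBody("write_file", "/tmp/*", 1_048_576)
print(within_scope(b, "write_file", "/tmp/out.log", 2048))  # True
print(within_scope(b, "write_file", "/etc/passwd", 2048))   # False
```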
Promise Theory as a Meta-Framework for Rules
The insight for agentic guardrails is that Promise Theory gives us a formal language for describing not just what rules exist but the relationship structure within which rules make sense.
In traditional rule-based systems, rules are flat: "if X then Y." In a Promise Theory framework, rules are anchored to agents, directional, and typed by the nature of the obligation being described.
Consider an AI agent operating within an organization. Its behavioral constraints can be modeled as a promise graph:
```
Agent_Marketing --+π--> Organization    : "I will only contact opted-in customers"
Agent_Marketing --+π--> DataStore       : "I will only read, not write, customer records"
Agent_Marketing --+π--> Agent_Audit     : "I will log every action I take"
Organization    --+π--> Agent_Marketing : "I will provide the opted-in customer list"
Agent_Audit     --+π--> Agent_Marketing : "I will notify you if I detect policy violation"
```
This is not just documentation. It is a formal, machine-verifiable specification of the agent's behavioral contract. The promises can be encoded as rules in CLIPS or as dependent types (more on that shortly). Violations can be detected automatically. The structure of obligations is explicit and auditable.
Promises and Bounded Context
One of the most powerful concepts in Promise Theory for agentic systems is the body scope constraint — the idea that a promise is valid only within a defined context τ. This maps directly onto the bounded context of the opening quote: agents do not have universal permissions; they have permissions within specific contexts.
```
π+(Agent_Finance_Bot, Ledger_System,
   {action: "approve_payment", amount_max: 5000},
   τ = {environment: "internal", time: business_hours, requester: "verified_employee"})
```
This promise is conditional on context. Outside business hours, or from an unverified requester, or in an external environment, the promise does not hold. The agent cannot perform the action because its promise was context-scoped.
This is how humans actually work. Your authority is not absolute; it is contextual. Promise Theory gives us the mathematics to encode that.
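A sketch of context-scoped validity, with field names following the finance-bot example above. The matching logic — exact equality on every scope field — is a simplifying assumption:

```python
# A promise holds only when every field of its scope tau matches the
# current context; otherwise the promised behavior is simply not on offer.
def promise_holds(tau: dict, context: dict) -> bool:
    return all(context.get(key) == value for key, value in tau.items())

tau = {"environment": "internal", "time": "business_hours",
       "requester": "verified_employee"}

in_scope = {"environment": "internal", "time": "business_hours",
            "requester": "verified_employee"}
after_hours = dict(in_scope, time="after_hours")

print(promise_holds(tau, in_scope))     # True
print(promise_holds(tau, after_hours))  # False
```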
Encoding Actions and Rules as Ontology
We have now established two complementary tools: CLIPS for operational rule enforcement, and Promise Theory for structural behavioral contracts. The final piece is unifying them into a formal action ontology — a machine-readable schema that describes the space of possible actions, their preconditions, postconditions, constraints, and relationships.
What Is an Action Ontology?
An ontology, in the knowledge representation sense, is a formal specification of a domain — its entities, properties, relationships, and constraints. OWL (Web Ontology Language) and RDF (Resource Description Framework) are common implementation technologies, though the concepts are implementation-independent.
An action ontology specifically models:
Action types and their taxonomic relationships (a DeleteRecord is a subtype of WriteAction, which is a subtype of DataMutationAction)
Preconditions: what must be true before an action can be performed
Postconditions: what becomes true after an action is performed
Constraints: boundaries that must not be violated
Effects: changes to the world state
Requiredness: which actions are mandatory, optional, or forbidden in a context
A Sketch of an Action Ontology for Agent Guardrails
Using a simplified OWL-like notation:
```turtle
@prefix agent: <http://example.org/agent-ontology#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .

# Action taxonomy
agent:Action a owl:Class .
agent:ReadAction a owl:Class ; rdfs:subClassOf agent:Action .
agent:WriteAction a owl:Class ; rdfs:subClassOf agent:Action .
agent:DestructiveAction a owl:Class ; rdfs:subClassOf agent:WriteAction .
agent:CommunicationAction a owl:Class ; rdfs:subClassOf agent:Action .

# Precondition property
agent:requiresPrecondition a owl:ObjectProperty ;
    rdfs:domain agent:Action ;
    rdfs:range agent:Condition .

# Context constraint
agent:permittedInEnvironment a owl:ObjectProperty ;
    rdfs:domain agent:Action ;
    rdfs:range agent:Environment .

# Specific action definition
agent:DeleteProductionRecord a owl:Class ;
    rdfs:subClassOf agent:DestructiveAction ;
    agent:requiresPrecondition agent:HumanApprovalGranted ;
    agent:requiresPrecondition agent:BackupVerified ;
    agent:permittedInEnvironment agent:ProductionEnvironment ;
    agent:requiresRole agent:DataAdmin .

# Prohibition axiom
agent:DeleteProductionRecord owl:disjointWith agent:AutomatedAgentPermission .
```
This ontology is not just documentation that humans read. It is machine-processable. A reasoning engine (a Description Logic reasoner like HermiT or Pellet) can determine at runtime whether a proposed action is consistent with the ontology. An agent that attempts to delete a production record without HumanApprovalGranted and BackupVerified will be refused — not by a hard-coded if-statement, but by an inference that the action is ontologically inconsistent with the current world state.
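The shape of such a runtime check can be sketched in a few lines. The class and condition names follow the ontology sketch above; this checker is a toy stand-in for a real Description Logic reasoner such as HermiT or Pellet, not an implementation of one:

```python
# Toy subsumption + precondition check over a tiny action taxonomy.
SUBCLASS_OF = {
    "DeleteProductionRecord": "DestructiveAction",
    "DestructiveAction": "WriteAction",
    "WriteAction": "Action",
    "ReadAction": "Action",
}
REQUIRED_PRECONDITIONS = {
    "DeleteProductionRecord": {"HumanApprovalGranted", "BackupVerified"},
}

def is_subclass(cls: str, ancestor: str) -> bool:
    """Walk up the taxonomy until the ancestor is found or the root passed."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def action_consistent(action_cls: str, world_state: set) -> bool:
    """An action is admissible only if all its preconditions hold."""
    return REQUIRED_PRECONDITIONS.get(action_cls, set()) <= world_state

print(is_subclass("DeleteProductionRecord", "WriteAction"))  # True
print(action_consistent("DeleteProductionRecord",
                        {"HumanApprovalGranted"}))           # False
```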
Connecting Promise Theory, Ontology, and Rules
Here is how the three layers compose:
Layer 1 — Promise Theory (Structural Contracts)
Defines the relationships between agents, the scope of their authority, and the nature of their obligations. This is the constitution — the highest-level specification of what an agent is and what it has committed to do.
Layer 2 — Action Ontology (Domain Knowledge)
Defines the space of possible actions in the domain, their types, preconditions, and constraints. This is the law — the codified rules of what is permitted, what is required, and what is forbidden.
Layer 3 — CLIPS / Rule Engine (Operational Enforcement)
Implements the ontology as operational rules that run in real time, matching agent action requests against the current system state and enforcing the constraints defined in layers 1 and 2. This is the police — the active enforcement mechanism that operates moment to moment.
The LLM agent sits above all three layers, generating proposed actions. The three-layer system sits below the agent, acting as a formal filter. The agent proposes; the guardrail layer disposes.
┌─────────────────────────────────────────────┐
│ LLM / Agent Layer │
│ (proposes actions, generates plans) │
└─────────────────────┬───────────────────────┘
│ Action Request
▼
┌─────────────────────────────────────────────┐
│ Promise Theory Layer │
│ (validates authority and context scope) │
└─────────────────────┬───────────────────────┘
│ Contextually scoped request
▼
┌─────────────────────────────────────────────┐
│ Action Ontology Layer │
│ (checks type safety and preconditions) │
└─────────────────────┬───────────────────────┘
│ Ontologically validated request
▼
┌─────────────────────────────────────────────┐
│ CLIPS Rule Engine Layer │
│ (operational enforcement, audit trail) │
└─────────────────────┬───────────────────────┘
│ Permitted / Blocked + Explanation
▼
System State
Dependent Types: Making Rules Part of the Type System
There is a fourth layer that can be added for the highest-assurance scenarios: dependent types from type theory.
In conventional programming, types describe what kind of thing a value is: int, string, User. In dependent type theory, types can depend on values, allowing you to express properties like "a list of exactly 5 elements" or "a function that only accepts positive integers" or "an action that is only callable when environment == staging" — as types, not runtime checks.
Formal proof assistants like Agda, Lean, and Idris implement dependent type systems. In these systems, you can encode:
```idris
-- A type for actions that are only valid outside production
data SafeAction : Environment -> Type where
  ReadData     : SafeAction env          -- valid in any environment
  WriteStaging : SafeAction Staging
  -- No constructor for WriteProduction, DeleteProduction, etc.:
  -- a production write cannot even be expressed as a well-typed value

-- An agent that can only invoke safe actions
execute : {env : Environment} -> SafeAction env -> Effect
```
The crucial property here is that the constraint is enforced at compile time — or, in the context of a type-checking guardrail layer, at validation time before execution. An agent that attempts to call WriteProduction directly cannot even construct a well-typed action request. The violation is not caught by a runtime rule; it is ruled out by the structure of the type system itself.
This is what Burgess's Promise Theory combined with dependent types gives you: the formal guarantee that certain classes of violations are structurally impossible, not merely improbable or rule-checked-at-runtime.
Deductive Databases: Closing the Loop
The final architectural component is the deductive database — a database system that combines stored facts with inference rules to derive new knowledge automatically.
Systems like Datalog (the basis for many modern deductive database implementations, including commercial products like LogicBlox and open-source ones like Soufflé) allow you to express queries and rules that are evaluated against a stored fact base. Unlike full Prolog, Datalog forbids function symbols, which guarantees termination, and its semantics is clean — purely declarative and set-based, with negation handled through stratification.
For agentic guardrails, a deductive database serves as the persistent, queryable representation of:
The promise graph: which agents have made which commitments to which other agents
The current world state: what the system believes to be true right now
The audit log: what actions have been taken, by whom, when, and with what result
The violation history: what constraints have been breached and how they were resolved
The CLIPS rule engine operates on the live fact base. The deductive database provides the persistent, queryable record. Together, they form a complete guardrail infrastructure: real-time enforcement plus historical accountability.
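The bottom-up evaluation that distinguishes Datalog can be sketched in a few lines. The `promises`/`reaches` relations are invented for illustration, and production engines use semi-naive evaluation rather than this naive loop, but the fixpoint semantics is the same:

```python
# Naive bottom-up Datalog evaluation: apply every rule to the fact base
# until no new tuples are derived (a fixpoint).
def fixpoint(facts: set, rules) -> set:
    facts = set(facts)
    while True:
        derived = {t for rule in rules for t in rule(facts)} - facts
        if not derived:
            return facts
        facts |= derived

# reaches(A, B) :- promises(A, B).
def base(facts):
    return {("reaches", a, b) for (r, a, b) in facts if r == "promises"}

# reaches(A, C) :- reaches(A, B), promises(B, C).
def step(facts):
    return {("reaches", a, c)
            for (r1, a, b) in facts if r1 == "reaches"
            for (r2, b2, c) in facts if r2 == "promises" and b2 == b}

db = {("promises", "marketing-agent", "audit-agent"),
      ("promises", "audit-agent", "organization")}
result = fixpoint(db, [base, step])
print(("reaches", "marketing-agent", "organization") in result)  # True
```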
"But Where Is the LLM?"
This is exactly the right question, and it deserves a careful answer.
LLMs are extraordinary pattern matchers. They have internalized vast amounts of human knowledge, can reason across domains, and produce fluent, contextually appropriate text. These are genuinely powerful capabilities. The problem is not that LLMs are bad — the problem is that we are asking them to do things they were not designed to do.
Specifically, we are asking LLMs to:
Enforce rules — but LLMs are trained to produce plausible next tokens, not to follow logical constraints.
Maintain consistency — but LLMs have no persistent state and can contradict themselves across a conversation.
Guarantee safety — but LLMs hallucinate, and hallucinated safety is no safety at all.
The symbolic-neural integration, sometimes called neurosymbolic AI, addresses this by assigning each component what it is actually good at:
The neural (LLM) layer handles language understanding, intent parsing, domain knowledge retrieval, and natural language generation. It proposes; it explains; it converses.
The symbolic (rules/ontology/deductive) layer handles constraint enforcement, logical consistency, auditing, and verification. It approves; it blocks; it records.
The LLM does not need to know the rules exhaustively — it needs to be able to express its proposed actions in a formal enough notation that the symbolic layer can evaluate them. This is actually a task LLMs are quite good at: translating natural language intentions into structured, formal representations.
The more hallucination-prone the model, the more critical the symbolic guardrail layer becomes. The guardrail does not prevent the LLM from hallucinating. It prevents the hallucination from taking effect in the world.
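A first line of defense can be as simple as schema validation on the structured proposal: a hallucinated request that does not even parse never reaches the rule layers. The field names mirror the action-request format used in the architecture section; the validator itself is an illustrative assumption:

```python
# Reject malformed proposals before any rule ever evaluates them.
REQUIRED_FIELDS = {"agent_id": str, "action": str, "target": str, "context": dict}

def well_formed(request: dict) -> bool:
    return all(field in request and isinstance(request[field], expected)
               for field, expected in REQUIRED_FIELDS.items())

good = {"agent_id": "agent-marketing-001", "action": "send_bulk_email",
        "target": "customer_segment_new_users",
        "context": {"environment": "production"}}
print(well_formed(good))                           # True
print(well_formed({"action": "send_bulk_email"}))  # False
```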
A Concrete Architecture
Putting all of this together, here is what a production-grade agentic guardrail system looks like:
1. Intent Parsing (LLM)
The agent receives a natural language instruction and produces a structured action request:
```json
{
  "agent_id": "agent-marketing-001",
  "action": "send_bulk_email",
  "target": "customer_segment_new_users",
  "parameters": {
    "template": "promotional_spring_2025",
    "volume": 45000
  },
  "context": {
    "environment": "production",
    "timestamp": "2025-03-15T14:30:00Z",
    "session_id": "sess-abc123"
  }
}
```
2. Promise Validation (Promise Theory Layer)
The system checks whether agent-marketing-001 has a valid promise that covers this action in this context. If the agent has no promise covering send_bulk_email with volume > 10,000, the request is escalated for approval.
3. Ontology Type-Checking (Action Ontology Layer)
The send_bulk_email action is checked against the ontology. Its preconditions include OptInVerified(customer_segment_new_users) and AntiSpamCompliance(template). If either precondition is not satisfied, the action is blocked with a structured explanation.
4. Operational Rule Enforcement (CLIPS)
Real-time rules fire against the current system state. Rules might check: daily send limits, DNS reputation status, recent bounce rate, compliance flags, and time-of-day restrictions.
5. Audit and Logging (Deductive Database)
Every step is logged. The promise invoked, the ontological check result, the rules that fired, the final decision. This log is queryable. Compliance auditors can reconstruct any decision at any time.
6. Response to Agent (LLM)
If blocked, the agent receives a structured explanation it can parse and, optionally, explain to the human in natural language: "I cannot send this campaign because opt-in status for the new user segment has not been verified. Would you like me to initiate the verification process?"
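The six steps compose into a simple pass/block pipeline. This sketch uses invented checks (a `volume` threshold and an `opt_in_verified` flag) in place of the real promise, ontology, and CLIPS layers:

```python
# Each layer either waves the request through or returns a structured
# refusal naming the layer and the reason - the explanation the agent
# can then translate back into natural language.
def guardrail_pipeline(request: dict, layers) -> dict:
    for name, check in layers:
        ok, reason = check(request)
        if not ok:
            return {"decision": "blocked", "layer": name, "reason": reason}
    return {"decision": "permitted"}

layers = [
    ("promise", lambda r: (r["volume"] <= 10_000,
                           "no promise covers send_bulk_email above 10000")),
    ("ontology", lambda r: (r.get("opt_in_verified", False),
                            "OptInVerified precondition not satisfied")),
]

request = {"action": "send_bulk_email", "volume": 45_000}
verdict = guardrail_pipeline(request, layers)
print(verdict["decision"], "-", verdict["reason"])
```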
The Knowledge Engineering Burden
There is an honest objection to this architecture: it requires significant upfront knowledge engineering. Someone has to write the promises. Someone has to build the ontology. Someone has to encode the CLIPS rules. This is not trivial work.
But consider the alternative. Without this work, the guardrails exist only as system prompt text — informal, unverifiable, prone to jailbreaks both intentional and accidental. When something goes wrong (and it will), there is no audit trail, no formal record of what the agent was supposed to do versus what it did, no structured explanation for the post-mortem.
The knowledge engineering burden is not a cost of having guardrails. It is the cost of understanding your own domain well enough to make trustworthy automated decisions within it. If you cannot specify the rules formally, you do not understand the domain well enough to automate it safely. The formalization process is also the design process — it forces precision about constraints that were previously assumed, implicit, or inconsistent.
Moreover, LLMs can assist the knowledge engineering process. A domain expert describes rules in natural language; the LLM drafts the Datalog rules or CLIPS productions; the expert reviews and refines. The tooling is becoming available to make this loop significantly faster.
What Full Autonomy Actually Requires
The dream of fully autonomous agents — systems where you set a goal and the agent figures out everything else — is not inherently impossible. But it is further away than the current hype cycle suggests, and the path there runs through formal guarantees, not through bigger models.
Full autonomy requires:
Formal verification that the agent's behavior is provably consistent with its constraints in the class of situations it will encounter. This is the domain of model checking and formal methods — mature fields that are largely absent from current agent discussions.
Bounded uncertainty — the agent must be able to know when it does not know, and to stop or escalate rather than guess. Current LLMs are famously overconfident. Symbolic systems with explicit knowledge gaps are better at representing and communicating uncertainty.
Interpretable reasoning — every significant decision must be explainable in terms that humans can evaluate. This is a regulatory requirement in many domains (finance, healthcare, law) and an ethical requirement in most others.
Graceful degradation — when the guardrails block an action, the agent must be able to decompose the goal, find an alternative path, or escalate appropriately. This requires the agent to reason about the guardrail layer, not just hit it.
None of these are solved problems. But all of them have predecessors in the expert systems research of the 1970s–1990s that we have largely forgotten. The tragedy of the current moment is not that we lack the intelligence to build safe agents — it is that we lack the patience to build on the foundations that already exist.
Conclusion: The Guardrail Layer Is Not Optional
Agents are powerful. LLMs are powerful. The combination is powerful. And power without constraint is not productivity — it is liability.
The history of software engineering is a history of gradually formalizing what had previously been implicit. We formalized memory management. We formalized type systems. We formalized concurrency. We formalized API contracts. Each formalization was resisted by people who thought it slowed things down, and each formalization eventually made things dramatically more reliable and productive.
The formalization of agent behavior is next. Promise Theory gives us the structural vocabulary. Action ontologies give us the domain-specific schema. CLIPS and deductive databases give us the operational enforcement. Dependent types give us the strongest formal guarantees for the highest-stakes decisions.
The "huge elephant in the room" that the current agent boom is studiously ignoring is that we are granting unprecedented autonomous authority to systems with no formal behavioral contracts, no provably enforced constraints, and no interpretable audit trails. The guardrail layer is not a limitation on what agents can do. It is the foundation on which anything agents do can be trusted.
We went to the 1970s and built the systems. Then we forgot them. Now we need them back, updated for 2025, integrated with the neural layer that genuinely is extraordinary, and deployed as a mandatory architectural component rather than an afterthought.