Policy Guide

How to write policies that precisely and predictably govern agent behaviour.

Policies can be created, edited, deleted, and toggled (enabled/disabled) from the console at console.blacklake.systems/policies (cloud) or http://localhost:3200/policies (local), via the API (POST /v1/policies, PATCH /v1/policies/:id, DELETE /v1/policies/:id), or via the SDK (bl.policies.create(), bl.policies.update(), bl.policies.delete()). All three interfaces are equivalent — changes from any source take effect immediately.

Bind your tools first, or every call denies

Before any policy is even evaluated, BlackLake checks one thing: is the tool bound to the agent? Bindings are explicit: you create them with bl.agents.bindTool(agentId, toolId) (or via the console). Without a binding, every govern() call for that pair returns deny with a reason of the form "Tool 'X' is not bound to agent 'Y'". That deny comes from the binding check, not from a policy and not from the default-deny fallback.

This is by design: granting a tool to an agent should take a deliberate step, so that the wrong agent never gains a tool implicitly. It is also the most common cause of "I wrote an allow policy and it still denies" confusion. If you're testing and seeing unexpected denies:

Confirm the binding exists — Console → Agents → <agent> → Tools, or bl.agents.listTools(agentId).
The deny reason names the cause: Tool 'X' is not bound to agent 'Y'. That's the bindings issue, not your policy.
The govern simulator (/playground → Govern simulator) shows binding state on its result card before policy match — use it to debug fast.

The MCP proxy auto-creates bindings on first call (so MCP traffic just works). The SDK does not — bind explicitly.
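
The check order described above can be modelled in a few lines of TypeScript. This is a sketch for building intuition, not BlackLake's implementation: `Binding`, `Decision`, and `governWithBindingCheck` are hypothetical names, and the policy step is passed in as a callback so the example stays self-contained. Only the deny reason format is taken from the text above.

```typescript
// Illustrative model only: these names and types are assumptions, not the BlackLake SDK.
type Decision = { decision: 'allow' | 'deny' | 'approval_required'; reason: string };
type Binding = { agent: string; tool: string };

function governWithBindingCheck(
  bindings: Binding[],
  agent: string,
  tool: string,
  evaluatePolicies: () => Decision,
): Decision {
  // Step 1: the binding check runs before any policy is consulted.
  const isBound = bindings.some((b) => b.agent === agent && b.tool === tool);
  if (!isBound) {
    return { decision: 'deny', reason: `Tool '${tool}' is not bound to agent '${agent}'` };
  }
  // Step 2: only bound pairs reach policy evaluation.
  return evaluatePolicies();
}

const demoBindings: Binding[] = [{ agent: 'support-bot', tool: 'read_file' }];
const allowEverything = (): Decision => ({ decision: 'allow', reason: 'matched allow policy' });

const unboundCall = governWithBindingCheck(demoBindings, 'support-bot', 'write_file', allowEverything);
const boundCall = governWithBindingCheck(demoBindings, 'support-bot', 'read_file', allowEverything);
console.log(unboundCall.decision, '|', unboundCall.reason);
console.log(boundCall.decision, '|', boundCall.reason);
```

The allow-everything callback never runs for the unbound pair, which is why writing an allow policy cannot fix a missing binding.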

How Selectors Work

A policy has two selectors: agent_selector and tool_selector. Each is a flat key-value object matched against the corresponding resource record.

Matching rules:

Every key in the selector must match the exact value of that field on the agent or tool record.
An empty selector ({}) matches everything.
All fields must match — it is an AND condition, not OR.

Example agent selector:

{ "environment": "production", "risk_classification": "high" }

This matches any agent where environment is "production" and risk_classification is "high". An agent in production with a medium classification would not match.

Fields available for matching:

Agents: name, environment, risk_classification, status, approval_mode

Tools: name, risk_classification
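
The matching rules above amount to "every selector key must equal the record's value". A minimal TypeScript sketch of that rule follows; `matchesSelector` and the type names are hypothetical and only model the behaviour described here, they are not the engine's code.

```typescript
// Illustrative model of the matching rules; not the engine's actual code.
type Selector = Record<string, string>;
type ResourceRecord = Record<string, string>;

function matchesSelector(selector: Selector, record: ResourceRecord): boolean {
  // every() over zero keys is true, so the empty selector {} matches any record.
  return Object.keys(selector).every((key) => record[key] === selector[key]);
}

const prodHigh: Selector = { environment: 'production', risk_classification: 'high' };
const highAgent: ResourceRecord = { name: 'agent-a', environment: 'production', risk_classification: 'high' };
const mediumAgent: ResourceRecord = { name: 'agent-b', environment: 'production', risk_classification: 'medium' };

console.log(matchesSelector(prodHigh, highAgent));   // true: both keys match
console.log(matchesSelector(prodHigh, mediumAgent)); // false: one key differs (AND, not OR)
console.log(matchesSelector({}, mediumAgent));       // true: empty selector matches everything
```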

Priority Ordering

Policies are evaluated in ascending priority order — lower number = evaluated first.

The engine walks through the sorted list and stops at the first policy where both selectors match. That policy's outcome is applied. No further policies are checked.

Specific policies should have lower priority numbers than broad catch-all policies.

Priority 1   →  evaluated first  (most specific rules)
Priority 10  →  evaluated second
Priority 100 →  evaluated last   (broadest catch-all)

The Three Outcomes

Outcome	Meaning
`allow`	The tool invocation is permitted. Proceed.
`deny`	The tool invocation is blocked. Do not proceed.
`approval_required`	The invocation requires human review before proceeding. Surface creates a pending approval record and fires an `approval.created` webhook. Your application must wait for or receive the decision before executing the tool.

Default Deny

If no policy matches, the outcome is default_deny. This is not a policy — it is the engine's fallback. It means you have not written a policy that covers this agent-tool combination.

There is no implicit allow. Every permitted invocation requires an explicit policy.
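
Priority ordering and the default-deny fallback combine into a short first-match loop. The sketch below is a simplified model under stated assumptions: `evaluate` and the types are hypothetical, disabled policies and cost conditions are ignored, and the order of policies that share a priority is not specified in this guide (the sketch simply keeps their input order).

```typescript
// Simplified first-match model; names are assumptions, not the engine's API.
type Fields = Record<string, string>;
type Outcome = 'allow' | 'deny' | 'approval_required';
type Policy = { name: string; priority: number; agent_selector: Fields; tool_selector: Fields; outcome: Outcome };

const selectorMatches = (selector: Fields, record: Fields): boolean =>
  Object.keys(selector).every((key) => record[key] === selector[key]);

function evaluate(policies: Policy[], agent: Fields, tool: Fields): Outcome | 'default_deny' {
  // Ascending priority: the lowest number is checked first.
  const ordered = policies.slice().sort((a, b) => a.priority - b.priority);
  for (const policy of ordered) {
    if (selectorMatches(policy.agent_selector, agent) && selectorMatches(policy.tool_selector, tool)) {
      return policy.outcome; // first match wins; later policies are never checked
    }
  }
  return 'default_deny'; // engine fallback when nothing matched
}

const samplePolicies: Policy[] = [
  { name: 'allow-all-in-dev', priority: 100, agent_selector: { environment: 'development' }, tool_selector: {}, outcome: 'allow' },
  { name: 'block-high-risk', priority: 1, agent_selector: {}, tool_selector: { risk_classification: 'high' }, outcome: 'deny' },
];
const devAgent = { name: 'dev-bot', environment: 'development' };
const stagingAgent = { name: 'stage-bot', environment: 'staging' };
const riskyTool = { name: 'drop-table', risk_classification: 'high' };
const safeTool = { name: 'read-docs', risk_classification: 'low' };

console.log(evaluate(samplePolicies, devAgent, riskyTool));    // deny
console.log(evaluate(samplePolicies, devAgent, safeTool));     // allow
console.log(evaluate(samplePolicies, stagingAgent, safeTool)); // default_deny
```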

MCP Policies

When you add a server to ~/.blacklake/mcp-config.json, you can set a policy field that creates an initial policy for that server at startup:

{
  "servers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/you/Documents"],
      "policy": "ask"
    }
  }
}

The policy field is a shorthand that creates a single catch-all policy for the server:

Value	Effect
`"allow"`	All tools from this server are allowed
`"deny"`	All tools from this server are denied
`"ask"`	All tools require human approval before executing

Policies created from mcp-config.json are ordinary policy rows in the database. You can create, edit, delete, and toggle them from the console at console.blacklake.systems/policies (cloud) or http://localhost:3200/policies (local) or via the API at any time. Changes take effect immediately — the policy cache is flushed on every write — and are not overwritten on restart.
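
To make the shorthand concrete, here is a sketch of how a policy value could expand into an ordinary catch-all policy. Treat every detail as an assumption: `catchAllPolicyFor`, the generated name, and the priority of 1000 are hypothetical, the mapping of "ask" to approval_required is inferred from the two tables in this guide, and the agent_selector mirrors the one used in this guide's MCP examples. The real proxy may generate something different.

```typescript
// Hypothetical expansion of the mcp-config.json shorthand; not the proxy's actual code.
type Shorthand = 'allow' | 'deny' | 'ask';
type Outcome = 'allow' | 'deny' | 'approval_required';
type CatchAllPolicy = {
  name: string;
  priority: number;
  agent_selector: { name: string };
  tool_selector: Record<string, string>;
  outcome: Outcome;
};

const SHORTHAND_OUTCOME: Record<Shorthand, Outcome> = {
  allow: 'allow',
  deny: 'deny',
  ask: 'approval_required', // inferred: "ask" means human approval
};

function catchAllPolicyFor(serverName: string, shorthand: Shorthand): CatchAllPolicy {
  return {
    name: `${serverName}-catch-all`,      // assumed naming scheme
    priority: 1000,                       // assumed: a high number so specific rules are checked first
    agent_selector: { name: serverName }, // assumed: the server key is the agent name
    tool_selector: {},                    // empty selector matches every tool
    outcome: SHORTHAND_OUTCOME[shorthand],
  };
}

const generatedPolicy = catchAllPolicyFor('filesystem', 'ask');
console.log(generatedPolicy.name, generatedPolicy.outcome);
```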

Example: allow reads, require approval for writes

To allow some tools but require approval for others, start with "policy": "allow" in mcp-config.json and then create specific policies from the console or API. For example, to require approval for write operations on a filesystem server:

# Allow read_file
curl -X POST http://localhost:3100/v1/policies \
  -H "x-api-key: local" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "filesystem-allow-reads",
    "priority": 10,
    "agent_selector": { "name": "filesystem" },
    "tool_selector": { "name": "read_file" },
    "outcome": "allow"
  }'

# Require approval for write_file
curl -X POST http://localhost:3100/v1/policies \
  -H "x-api-key: local" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "filesystem-approve-writes",
    "priority": 5,
    "agent_selector": { "name": "filesystem" },
    "tool_selector": { "name": "write_file" },
    "outcome": "approval_required"
  }'

With these two policies in place, read_file calls are allowed immediately and appear in the Evaluations page as allow. write_file calls pause and create an approval record. Approve or reject from the console at console.blacklake.systems/approvals (cloud) or http://localhost:3200/approvals (local).

Example: deny all tool calls from a specific MCP server

To fully block a server while you review it:

curl -X POST http://localhost:3100/v1/policies \
  -H "x-api-key: local" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "block-untrusted-server",
    "priority": 1,
    "agent_selector": { "name": "untrusted-mcp-server" },
    "tool_selector": {},
    "outcome": "deny"
  }'

Priority 1 places this policy ahead of every policy with a higher priority number. The empty tool_selector matches all tools, so every call from untrusted-mcp-server is denied.

To lift the block: disable or delete the policy in the console. You do not need to edit mcp-config.json.

Common Patterns

Block all high-risk tools in production

High-risk tools should not be callable by any agent in production. This policy denies them outright; give it a low priority number so it is evaluated before broader allow rules.

await bl.policies.create({
  name: 'block-high-risk-in-prod',
  priority: 1,
  agent_selector: { environment: 'production' },
  tool_selector: { risk_classification: 'high' },
  outcome: 'deny',
});

Require approval for medium-risk tools in production

await bl.policies.create({
  name: 'approve-medium-risk-in-prod',
  priority: 2,
  agent_selector: { environment: 'production' },
  tool_selector: { risk_classification: 'medium' },
  outcome: 'approval_required',
});

Allow everything in development

Development environments typically need unrestricted access for testing. Give this policy a high priority number so that more specific deny or approval rules that also match development agents are evaluated first.

await bl.policies.create({
  name: 'allow-all-in-dev',
  priority: 100,
  agent_selector: { environment: 'development' },
  tool_selector: {},
  outcome: 'allow',
});

Block a specific agent from a specific tool

Name-based selectors let you block a single agent from a single tool regardless of its environment or classification.

await bl.policies.create({
  name: 'block-legacy-agent-from-delete-record',
  priority: 5,
  agent_selector: { name: 'legacy-agent' },
  tool_selector: { name: 'delete-record' },
  outcome: 'deny',
});

Allow everything for trusted agents

An agent classified as critical may represent a fully audited system that should have unrestricted access. Allow it explicitly, but place it at a high priority number so narrower deny rules still take effect when they match first.

await bl.policies.create({
  name: 'allow-critical-agents',
  priority: 90,
  agent_selector: { risk_classification: 'critical' },
  tool_selector: {},
  outcome: 'allow',
});

Policy Evaluation Walkthrough

Consider this policy set, ordered by priority:

Priority	Name	Agent Selector	Tool Selector	Outcome
1	block-high-risk-in-prod	`{ environment: production }`	`{ risk_classification: high }`	deny
10	approve-medium-risk-in-prod	`{ environment: production }`	`{ risk_classification: medium }`	approval_required
50	allow-support-agent	`{ name: customer-support-agent }`	`{}`	allow
100	allow-all-dev	`{ environment: development }`	`{}`	allow

Scenario A: customer-support-agent (production, medium) tries to use send-email (medium).

Priority 1: agent selector matches (production). Tool selector requires high — send-email is medium. No match.
Priority 10: agent selector matches (production). Tool selector requires medium — send-email is medium. Match. Outcome: approval_required.

Scenario B: customer-support-agent (production, medium) tries to use read-knowledge-base (low).

Priority 1: tool is low, not high. No match.
Priority 10: tool is low, not medium. No match.
Priority 50: agent name is customer-support-agent. Tool selector is {} — matches everything. Match. Outcome: allow.

Scenario C: data-pipeline-agent (production, high) tries to use write-to-s3 (high).

Priority 1: agent is production. Tool is high. Both match. Outcome: deny.

Evaluation stops at the first match. Priorities 10, 50, and 100 are never reached in Scenario C.

Scenario D: new-agent (staging, low) tries to use send-notification (low).

Priority 1: agent is staging, not production. No match.
Priority 10: agent is staging, not production. No match.
Priority 50: agent name is not customer-support-agent. No match.
Priority 100: agent environment is staging, not development. No match.
No match. Outcome: default_deny.
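
The four scenarios can be replayed as data. The sketch below encodes the policy table from this walkthrough and records which priorities were checked before a decision was reached. `trace` and the type names are hypothetical; the code is a model for checking your reading of the table, not the engine itself.

```typescript
// Replays the walkthrough against a local model; not the engine's implementation.
type Fields = Record<string, string>;
type Policy = { priority: number; name: string; agent: Fields; tool: Fields; outcome: string };

const policySet: Policy[] = [
  { priority: 1, name: 'block-high-risk-in-prod', agent: { environment: 'production' }, tool: { risk_classification: 'high' }, outcome: 'deny' },
  { priority: 10, name: 'approve-medium-risk-in-prod', agent: { environment: 'production' }, tool: { risk_classification: 'medium' }, outcome: 'approval_required' },
  { priority: 50, name: 'allow-support-agent', agent: { name: 'customer-support-agent' }, tool: {}, outcome: 'allow' },
  { priority: 100, name: 'allow-all-dev', agent: { environment: 'development' }, tool: {}, outcome: 'allow' },
];

const fits = (selector: Fields, record: Fields): boolean =>
  Object.keys(selector).every((key) => record[key] === selector[key]);

function trace(agent: Fields, tool: Fields): { outcome: string; checked: number[] } {
  const checked: number[] = [];
  for (const policy of policySet.slice().sort((a, b) => a.priority - b.priority)) {
    checked.push(policy.priority); // record every policy the walk visits
    if (fits(policy.agent, agent) && fits(policy.tool, tool)) {
      return { outcome: policy.outcome, checked };
    }
  }
  return { outcome: 'default_deny', checked };
}

const supportAgent = { name: 'customer-support-agent', environment: 'production', risk_classification: 'medium' };
const scenarioA = trace(supportAgent, { name: 'send-email', risk_classification: 'medium' });
const scenarioB = trace(supportAgent, { name: 'read-knowledge-base', risk_classification: 'low' });
const scenarioC = trace(
  { name: 'data-pipeline-agent', environment: 'production', risk_classification: 'high' },
  { name: 'write-to-s3', risk_classification: 'high' },
);
const scenarioD = trace(
  { name: 'new-agent', environment: 'staging', risk_classification: 'low' },
  { name: 'send-notification', risk_classification: 'low' },
);
console.log(JSON.stringify({ scenarioA, scenarioB, scenarioC, scenarioD }, null, 2));
```

In this model, Scenario C reports only priority 1 as checked, which matches the point that evaluation stops at the first match.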

Practical Tips

Space priorities apart. Use values like 1, 10, 20, 50, 100 rather than 1, 2, 3. This gives you room to insert rules between existing ones without renumbering everything.

Put catch-all allows at the highest priority number. Broad permissive rules should be evaluated last, after all specific deny and approval rules have had a chance to match.
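
A misplaced catch-all can make a narrower rule unreachable. One way to catch this before deploying is a small local lint over your policy definitions. The sketch below is an assumption-laden helper, not a BlackLake feature: `findShadowed` is a hypothetical name, it only detects full shadowing (an earlier policy whose selectors are a subset of a later policy's selectors), and it ignores enabled flags and cost conditions.

```typescript
// Hypothetical local lint; not part of the BlackLake SDK.
type Selector = Record<string, string>;
type Policy = { name: string; priority: number; agent_selector: Selector; tool_selector: Selector };

// True when `broad` matches every record that `narrow` matches:
// each key `broad` requires is also required, with the same value, by `narrow`.
function covers(broad: Selector, narrow: Selector): boolean {
  return Object.keys(broad).every((key) => narrow[key] === broad[key]);
}

function findShadowed(policies: Policy[]): { shadowed: string; by: string }[] {
  const findings: { shadowed: string; by: string }[] = [];
  for (const later of policies) {
    for (const earlier of policies) {
      if (
        earlier.priority < later.priority &&
        covers(earlier.agent_selector, later.agent_selector) &&
        covers(earlier.tool_selector, later.tool_selector)
      ) {
        // `earlier` is checked first and matches whenever `later` would, so `later` never fires.
        findings.push({ shadowed: later.name, by: earlier.name });
      }
    }
  }
  return findings;
}

const draftPolicies: Policy[] = [
  { name: 'allow-all-in-dev', priority: 1, agent_selector: { environment: 'development' }, tool_selector: {} },
  { name: 'block-dev-delete', priority: 50, agent_selector: { environment: 'development' }, tool_selector: { name: 'delete-record' } },
  { name: 'block-high-risk-in-prod', priority: 10, agent_selector: { environment: 'production' }, tool_selector: { risk_classification: 'high' } },
];
const shadowReport = findShadowed(draftPolicies);
console.log(shadowReport);
```

Here the catch-all allow at priority 1 shadows block-dev-delete. Moving the catch-all to the highest priority number clears the finding.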

Test before relying on a policy. Use POST /v1/govern directly to verify a policy behaves as expected before deploying the agent that depends on it.

# Test: does the new deny rule block the expected agent?
curl -X POST http://localhost:3100/v1/govern \
  -H "x-api-key: local" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "customer-support-agent",
    "tool": "delete-record",
    "context": { "test": true }
  }'

Disable rather than delete. Set enabled: false to temporarily remove a policy from evaluation without losing its definition. Use DELETE only when you are sure the rule is no longer needed.

Combine selectors carefully. A selector with multiple keys requires all keys to match. If you want different rules for different combinations, write separate policies — do not try to express OR logic in a single selector.

Audit with evaluations. After deploying a new policy, query the evaluations endpoint to confirm that traffic is being matched and decided as expected. Look for unexpected default_deny outcomes as a signal that a required allow policy is missing.

`approval_required` in Practice

When a policy with outcome: 'approval_required' matches, Surface:

Records the evaluation (as always).
Creates a pending approval record with a 24-hour expiry.
Fires an approval.created webhook to any registered subscribers.
Returns decision: 'approval_required' and approval_id in the govern response.

The caller must act on this before executing the tool. Two patterns:

Short-running workflows: Call bl.approvals.wait(approval_id) to block and poll for up to 5 minutes. If a decision arrives in time, proceed or abort based on the result.
Long-running workflows: Store the approval_id and register a webhook to receive approval.approved or approval.rejected events when a human makes a decision.

For MCP tool calls: the proxy holds the request open and waits automatically. When you approve or reject from the console, the proxy responds to the MCP client immediately.

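Example: require approval for every payment sent by expense-bot. The selectors match on agent and tool names only, so the policy applies regardless of the payment amount:

await bl.policies.create({
  name: 'require-approval-for-payments',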
  priority: 5,
  agent_selector: { name: 'expense-bot' },
  tool_selector: { name: 'payments.send' },
  outcome: 'approval_required',
});

With this policy in place, every call to bl.govern({ agent: 'expense-bot', tool: 'payments.send', ... }) returns approval_required with an approval_id. Your agent code must not execute the payment until an approved decision is confirmed. The approval expires after 24 hours if no decision is made.
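
The short-running pattern can be wrapped in a helper that only runs the tool after an allow or an approved decision. This is a sketch under assumptions: the `GovernClient` interface, the `status` field returned by wait, and the 'approved' value are hypothetical shapes chosen for illustration, and the in-memory `fakeClient` stands in for the real SDK so the example runs on its own. Check the SDK's actual response types before relying on this.

```typescript
// Hypothetical response shapes; the real SDK types may differ.
type GovernRequest = { agent: string; tool: string };
type GovernResult = { decision: 'allow' | 'deny' | 'approval_required'; approval_id?: string };

interface GovernClient {
  govern(request: GovernRequest): Promise<GovernResult>;
  approvals: { wait(approvalId: string): Promise<{ status: string }> };
}

async function runGoverned<T>(
  bl: GovernClient,
  request: GovernRequest,
  execute: () => Promise<T>,
): Promise<T | null> {
  const result = await bl.govern(request);
  if (result.decision === 'allow') return execute();
  if (result.decision === 'approval_required' && result.approval_id !== undefined) {
    const approval = await bl.approvals.wait(result.approval_id);
    if (approval.status === 'approved') return execute();
  }
  return null; // denied, rejected, or undecided: the tool never runs
}

// In-memory stand-in for the SDK, used only to make the sketch runnable.
function fakeClient(finalStatus: string): GovernClient {
  return {
    govern: async (): Promise<GovernResult> => ({ decision: 'approval_required', approval_id: 'apr_demo' }),
    approvals: { wait: async (): Promise<{ status: string }> => ({ status: finalStatus }) },
  };
}

let paymentsSent = 0;
const sendPayment = async (): Promise<string> => {
  paymentsSent += 1;
  return 'sent';
};
const paymentRequest: GovernRequest = { agent: 'expense-bot', tool: 'payments.send' };

const demoRun = runGoverned(fakeClient('approved'), paymentRequest, sendPayment).then((approvedRun) =>
  runGoverned(fakeClient('rejected'), paymentRequest, sendPayment).then((rejectedRun) => ({
    approvedRun,
    rejectedRun,
    paymentsSent,
  })),
);
demoRun.then((summary) => console.log(summary));
```

Only the approved run sends a payment; the rejected run returns null without calling execute.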

Two-person approval

For the highest-risk policies, set requires_two_person: true so a single approver isn't enough.

await bl.policies.create({
  name: 'require-two-approvers-prod-mutations',
  priority: 1,
  agent_selector: { environment: 'production' },
  tool_selector: { risk_classification: 'critical' },
  outcome: 'approval_required',
  requires_two_person: true,
});

The flag is snapshotted onto the approval at creation, so subsequent edits to the policy don't change the rules mid-flow. The first bl.approvals.approve(...) records the decision but leaves the approval pending; a second approve from a different decided_by identity closes it. A single reject still terminates immediately. The same identity attempting to approve twice is rejected with 409 DUPLICATE_APPROVER.
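
The approval lifecycle described above behaves like a small state machine. The following sketch models it locally so the rules are easy to see. It is not the server's implementation: the `Approval` shape and function names are hypothetical, and `ALREADY_DECIDED` is an invented code for a case this guide does not cover. Only `DUPLICATE_APPROVER` comes from the text.

```typescript
// Local model of the two-person rules; shapes and names are assumptions.
type Approval = {
  requires_two_person: boolean; // snapshotted from the policy at creation
  status: 'pending' | 'approved' | 'rejected';
  approvers: string[];
};
type ApproveResult =
  | { ok: true; approval: Approval }
  | { ok: false; code: 'DUPLICATE_APPROVER' | 'ALREADY_DECIDED' };

function approve(approval: Approval, decidedBy: string): ApproveResult {
  if (approval.status !== 'pending') return { ok: false, code: 'ALREADY_DECIDED' }; // hypothetical code
  if (approval.approvers.indexOf(decidedBy) !== -1) return { ok: false, code: 'DUPLICATE_APPROVER' };
  const approvers = approval.approvers.concat(decidedBy);
  const needed = approval.requires_two_person ? 2 : 1;
  const status = approvers.length >= needed ? 'approved' : 'pending';
  return { ok: true, approval: { ...approval, approvers, status } };
}

function reject(approval: Approval): Approval {
  return { ...approval, status: 'rejected' }; // a single reject terminates immediately
}

const freshApproval: Approval = { requires_two_person: true, status: 'pending', approvers: [] };
const firstApprove = approve(freshApproval, 'alice');
const afterFirst = firstApprove.ok ? firstApprove.approval : freshApproval;
const duplicateApprove = approve(afterFirst, 'alice');
const secondApprove = approve(afterFirst, 'bob');

console.log(afterFirst.status);                                                     // pending
console.log(duplicateApprove.ok ? 'ok' : duplicateApprove.code);                    // DUPLICATE_APPROVER
console.log(secondApprove.ok ? secondApprove.approval.status : secondApprove.code); // approved
console.log(reject(afterFirst).status);                                             // rejected
```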

Break-glass override

bl.approvals.breakGlass(...) is an emergency override. It force-approves regardless of requires_two_person and sets break_glass: true on the approval as a permanent audit marker. The reason field is required to be at least 40 characters — the friction is intentional. If a team reaches for it routinely, the underlying policy probably needs adjusting.

Cost-aware policies

Policies can carry a cost_conditions block that's evaluated alongside selectors. When set, the policy only matches if every condition holds: the selectors determine which (agent, tool) pairs the policy targets, and the conditions decide whether to apply the outcome based on observed or estimated spend.

await bl.policies.create({
  name: 'block-expensive-opus-on-prod',
  priority: 5,
  agent_selector: { environment: 'production' },
  outcome: 'deny',
  mode: 'monitor', // observe before restricting
  cost_conditions: {
    all: [
      { signal: 'tool.estimated_cost_usd', op: 'gt', value: 1.00 },
      { signal: 'tool.model', op: 'in', value: ['claude-opus-4-7', 'claude-opus-4-6'] },
    ],
  },
});

Signals

Signal	Source	Notes
`agent.spend_today_usd`	sum of cost_records for this agent since UTC midnight	Workspace-day totals; not per-task
`agent.spend_per_task_usd`	sum since the call's `task_id` started	Caller passes `task_id` on `govern()`; null otherwise
`agent.cumulative_spend_session_usd`	sum since the call's `session_id` started	Long-running agent loops
`tool.estimated_cost_usd`	from the `estimate` block on the govern call	Pre-call gate
`tool.input_tokens`	from the `estimate` block	Block large-context calls
`tool.model`	from the `estimate` block	Use with `op: 'in'`
`workspace.spend_today_usd`	workspace-wide today	Coarse last-resort cap
`user.spend_today_usd`	user-scoped if a user_id is on the call	For shared dev workspaces

Operators

gt, gte, lt, lte, eq, ne work on numbers and strings; in takes an array of strings (used with tool.model).
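
To show how an `all` block of conditions could be evaluated against a set of signal values, here is a small local model. Several details are assumptions rather than documented behaviour: `holds` and `allHold` are hypothetical names, strings are compared lexicographically, a null or missing signal is treated as failing the condition, and mismatched types never match.

```typescript
// Local model of cost_conditions evaluation; behaviour for edge cases is assumed.
type Op = 'gt' | 'gte' | 'lt' | 'lte' | 'eq' | 'ne' | 'in';
type Condition = { signal: string; op: Op; value: number | string | string[] };
type Signals = Record<string, number | string | null | undefined>;

function holds(cond: Condition, signals: Signals): boolean {
  const actual = signals[cond.signal];
  const expected = cond.value;
  const op = cond.op;
  if (actual === null || actual === undefined) return false; // assumption: missing signal fails
  if (op === 'in') {
    return Array.isArray(expected) && typeof actual === 'string' && expected.indexOf(actual) !== -1;
  }
  let sign: number;
  if (typeof actual === 'number' && typeof expected === 'number') {
    sign = actual - expected;
  } else if (typeof actual === 'string' && typeof expected === 'string') {
    sign = actual < expected ? -1 : actual > expected ? 1 : 0; // assumption: lexicographic order
  } else {
    return false; // assumption: mismatched types never satisfy a condition
  }
  switch (op) {
    case 'gt': return sign > 0;
    case 'gte': return sign >= 0;
    case 'lt': return sign < 0;
    case 'lte': return sign <= 0;
    case 'eq': return sign === 0;
    case 'ne': return sign !== 0;
    default: return false;
  }
}

// The policy matches only if every condition in the `all` block holds.
function allHold(conditions: Condition[], signals: Signals): boolean {
  return conditions.every((cond) => holds(cond, signals));
}

const opusConditions: Condition[] = [
  { signal: 'tool.estimated_cost_usd', op: 'gt', value: 1.0 },
  { signal: 'tool.model', op: 'in', value: ['claude-opus-4-7', 'claude-opus-4-6'] },
];

console.log(allHold(opusConditions, { 'tool.estimated_cost_usd': 2.5, 'tool.model': 'claude-opus-4-7' })); // true
console.log(allHold(opusConditions, { 'tool.estimated_cost_usd': 0.4, 'tool.model': 'claude-opus-4-7' })); // false
console.log(allHold(opusConditions, { 'tool.estimated_cost_usd': 2.5, 'tool.model': 'some-other-model' })); // false
```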

Modes

enforce (default for selector-only policies): applies the policy outcome, so selector-only policies keep their historical behaviour.
monitor (default when cost_conditions is set): records the would-be decision but never enforces it. This lets operators ship a cost-aware policy and watch what would have happened for a week before flipping to enforce.

The receipt records policy_snapshot.mode and includes a cost_signals context block under request_context so the audit trail explains exactly which signal pushed the call into the policy.

Pre-call gating with `estimate`

Cost-based denial only works pre-spend if BlackLake knows the projected cost. Pass estimate on the govern call:

// countedClientSide stands for a token count you compute yourself before the call.
const estimate = await bl.cost.estimate({
  provider: 'anthropic',
  model: 'claude-opus-4-7',
  input_tokens: countedClientSide,
  output_ceiling_tokens: 4000,
});

const result = await bl.govern({
  agent: 'claude-prod',
  tool: 'github__create-issue',
  estimate: {
    provider: 'anthropic',
    model: 'claude-opus-4-7',
    input_tokens: countedClientSide,
    output_ceiling_tokens: 4000,
    estimated_cost_usd: estimate.total_usd,
  },
});

Without an estimate, tool.estimated_cost_usd evaluates to 0, so threshold conditions such as op: 'gt' on that signal never trigger.

Budgets vs cost-aware policies

Budgets cap total spend over time (workspace / agent / tool / user). Cost-aware policies gate specific calls by attribute (model, input size, projected cost). Use both together:

a workspace budget (scope_type: 'workspace', hard $1000/month) is the last-resort cap;
a cost-aware policy (tool.estimated_cost_usd > 1, mode monitor for two weeks then flipped to enforce) catches the long tail of overrun before the budget fires.

Worked example: cost-aware approval gate

Require approval whenever a single planned call is projected to cost more than $5 and the agent is in production. Routine calls flow without friction; the expensive ones land in your approval queue.

await bl.policies.create({
  name: 'approve-expensive-prod-calls',
  priority: 7,
  agent_selector: { environment: 'production' },
  outcome: 'approval_required',
  mode: 'enforce',
  cost_conditions: {
    all: [
      { signal: 'tool.estimated_cost_usd', op: 'gt', value: 5.00 },
    ],
  },
});