ARF sits transparently between your AI coding CLI and the upstream API, intercepting, translating, governing, and recording every message before it touches the wire. No code changes. No configuration on the agent side. Just point the CLI at localhost and ARF handles the rest.
The proxy is fully transparent to the runner. Any AI coding CLI sees a standard API endpoint. ARF handles all protocol translation internally through a canonical intermediate representation, then forwards to whatever upstream engine you've configured.
Anthropic uses server-sent events with content_block_delta streaming. OpenAI uses chat completion chunks. Gemini has its own protobuf-flavored JSON. They're not compatible.
ARF's translation layer converts between all of them through a Canonical Intermediate Representation (CIR), a single internal message format that captures the full semantics of any supported protocol. Translation is lossless for governance-relevant fields.
This means you can run Claude Code against OpenAI's API, or Codex CLI against Anthropic's. The runner doesn't know. The engine doesn't know. ARF does the work.
-- INBOUND (from claude-code) ----------------------
POST /v1/messages HTTP/1.1
x-arf-runner: claude-code
x-api-key: [redacted]

{ "model": "anthropic/claude", "messages": [...] }

-- CANONICAL IR --------------------------------
CIR {
  runner: claude-code
  engine: anthropic -> openai
  session: 01HX...QPBZ
  policy: standard ✓
}

-- OUTBOUND (to openai endpoint) ----
POST /v1/chat/completions HTTP/1.1

{ "model": "openai/gpt", "messages": [translated], "stream": true }

✓ Translation complete · 2.1ms
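To make the translation path above concrete, here is a minimal Python sketch of an Anthropic-style request passing through a canonical form and coming out as an OpenAI-style request. The CanonicalRequest and CanonicalMessage types and both function names are assumptions made for illustration; ARF's actual IR is not shown here, and the sketch covers only text content, the system prompt, and the stream flag.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class CanonicalMessage:
    role: str  # "system", "user", or "assistant"
    text: str


@dataclass
class CanonicalRequest:
    """Hypothetical canonical form, not ARF's real IR."""
    model: str
    messages: List[CanonicalMessage] = field(default_factory=list)
    stream: bool = False


def _flatten(content: Any) -> str:
    """Anthropic content is either a plain string or a list of typed blocks."""
    if isinstance(content, str):
        return content
    return "".join(b.get("text", "") for b in content if b.get("type") == "text")


def from_anthropic(body: Dict[str, Any]) -> CanonicalRequest:
    """Parse an Anthropic Messages request body into the canonical form."""
    messages = []
    # Anthropic carries the system prompt as a top-level field.
    if body.get("system"):
        messages.append(CanonicalMessage("system", _flatten(body["system"])))
    for m in body.get("messages", []):
        messages.append(CanonicalMessage(m["role"], _flatten(m["content"])))
    return CanonicalRequest(
        model=body["model"],
        messages=messages,
        stream=bool(body.get("stream", False)),
    )


def to_openai(req: CanonicalRequest, model: str) -> Dict[str, Any]:
    """Render the canonical form as an OpenAI chat completions request body."""
    return {
        "model": model,
        # OpenAI carries the system prompt as an ordinary leading message.
        "messages": [{"role": m.role, "content": m.text} for m in req.messages],
        "stream": req.stream,
    }
```

Because both directions go through one canonical type, adding a third protocol in this sketch would mean writing one parser and one renderer rather than a converter for every pair.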
ARF works by intercepting the HTTP traffic at the environment level. You set one environment variable, ANTHROPIC_BASE_URL=http://localhost:4554, and every CLI that respects that variable is now governed by ARF.
Claude Code doesn't need a plugin. Codex CLI doesn't need a flag. Gemini CLI doesn't need configuration. The interposition is at the network layer, not the application layer. This is the Agent Runtime Firewall model: everything goes through the fence, nothing bypasses it.
ARF handles TLS termination, certificate injection for mTLS environments, and credential substitution, so the runner sees a plain HTTP endpoint while ARF manages the secure upstream connections.
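If you launch runners from a script rather than a shell, the same one-variable setup can be sketched in Python. This is a minimal illustration, assuming the runner is started as a child process; the helper names are hypothetical and the command to run is left to the caller.

```python
import os
import subprocess

ARF_URL = "http://localhost:4554"  # address quoted in the text above


def governed_env(base=None):
    """Return a copy of the environment with ANTHROPIC_BASE_URL pointed at ARF."""
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = ARF_URL
    return env


def run_governed(command):
    """Launch a runner (argv list supplied by the caller) with the ARF environment."""
    return subprocess.run(command, env=governed_env(), check=False)
```

Only the child process sees the override, so the parent shell's environment stays untouched.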
The Agent Runtime Firewall doesn't care which runner is connecting. It identifies the runner from the User-Agent header and incoming protocol shape, assigns the session to the appropriate protocol adapter, and proceeds. All runners share the same governance policy, the same audit chain, the same credential vault.
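The identification step can be pictured with a short Python sketch. The User-Agent fragments below are assumptions chosen for illustration, since real runner strings may differ, and the function names are hypothetical. The request paths are the public Anthropic, OpenAI, and Gemini endpoint shapes.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical User-Agent fragments mapped to runner names; not ARF's real table.
UA_HINTS = {
    "claude": "claude-code",
    "codex": "codex-cli",
    "gemini": "gemini-cli",
}


def protocol_from_path(path: str) -> Optional[str]:
    """Guess the wire protocol from the shape of the request path."""
    if path.startswith("/v1/messages"):
        return "anthropic"
    if path.startswith("/v1/chat/completions"):
        return "openai"
    if ":generateContent" in path or ":streamGenerateContent" in path:
        return "gemini"
    return None


def identify(headers: Mapping[str, str], path: str) -> Tuple[str, Optional[str]]:
    """Return (runner, protocol) for an incoming request."""
    user_agent = ""
    for name, value in headers.items():
        if name.lower() == "user-agent":  # header names are case-insensitive
            user_agent = value.lower()
    runner = next(
        (label for hint, label in UA_HINTS.items() if hint in user_agent),
        "unknown",
    )
    return runner, protocol_from_path(path)
```

Keeping runner and protocol as separate results reflects the point above: an unrecognized runner can still be routed to the right adapter as long as the protocol shape is recognizable.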
Running three agents in parallel, say Claude on a feature branch, Codex doing code review, and a DeepSeek model writing tests? ARF tracks them all as separate sessions under a shared work graph, applies governance uniformly, and records each decision to the same proof chain. The TUI shows all three simultaneously.
   SESS        RUNNER       ENGINE     MSGS  TOKENS
●  01HX…AB12   claude-code  anthropic   142   28.4k
●  01HX…CD34   codex-cli    openai       88   12.1k
◐  01HX…EF56   gemini-cli   google       34    8.7k
●  01HX…GH78   ai-cli       ollama       21    4.2k
4 active sessions · policy: standard · uptime: 3h 42m
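The session IDs in the listing above are ULIDs: a 48-bit millisecond timestamp followed by 80 random bits, written as 26 Crockford base32 characters. The Python sketch below follows the public ULID layout; the function names are illustrative, and how ARF itself generates IDs is not specified here.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # base32 without I, L, O, U


def new_ulid(timestamp_ms=None, randomness=None):
    """Build a 26-character ULID from a ms timestamp and 10 random bytes."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if randomness is None:
        randomness = os.urandom(10)
    if len(randomness) != 10 or not 0 <= timestamp_ms < 2**48:
        raise ValueError("need a 48-bit timestamp and exactly 10 random bytes")
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):  # 26 chars x 5 bits covers the 128-bit value
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def ulid_timestamp_ms(ulid):
    """Decode the millisecond timestamp from the first 10 characters."""
    value = 0
    for ch in ulid[:10]:
        value = value * 32 + CROCKFORD.index(ch)
    return value
```

Because the timestamp occupies the leading characters, IDs created in different milliseconds sort lexicographically by creation time, which is convenient when sessions are listed or audited in order.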
Bidirectional conversion between Anthropic, OpenAI, and Gemini wire formats. Every message passes through the canonical IR. No governance-relevant information is lost.
Every request and response is evaluated against your TOML policy rules before being forwarded. Policy violations trip circuit breakers in real time.
Each message is signed with Ed25519 and added to the SHA-256 hash chain. The record is tamper-evident from the first request to the last seal.
API keys are pulled from the encrypted vault just-in-time, injected into outbound headers, and never written to disk. Runners hold no credentials.
Inbound prompts and outbound completions can be rewritten, augmented, or filtered by ARF rules before the runner or engine sees them.
Every request is tagged with a ULID session ID, git commit context, and caller identity. Attribution is precise enough for compliance audits.
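The tamper-evident record described above rests on a simple idea: each entry's hash covers the previous entry's hash, so changing any earlier message invalidates everything after it. Here is a minimal Python sketch of that chaining using only the standard library. The record layout and the all-zero genesis value are assumptions, and the Ed25519 signature step is deliberately left out because it needs a third-party library.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed previous-hash value for the first record


def _digest(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON encoding."""
    # Sorted keys and fixed separators keep the encoding reproducible.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a payload to the chain, linking it to the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {
        "prev": prev_hash,
        "payload": payload,
        "hash": _digest(prev_hash, payload),
    }
    chain.append(record)
    return record


def verify_chain(chain):
    """Recompute every link; return False at the first mismatch."""
    prev_hash = GENESIS
    for record in chain:
        expected = _digest(prev_hash, record["payload"])
        if record["prev"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True
```

In a signed variant, each record's hash would also be signed, so a verifier could check both the ordering and the origin of every entry.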
Use Claude's UX with OpenAI's backend, or Codex's workflow against Anthropic's API. ARF translates everything. Mix, match, experiment without touching agent config.
Your cost caps, content rules, and circuit breakers are defined once in TOML and enforced at the proxy layer. An agent can't route around policy by switching models.
The hash-chained proof bundle captures every message, request and response alike, with timing, token counts, policy decisions, and cryptographic signatures. Nothing slips through.
Ollama, DeepSeek, Qwen, and any OpenAI-compatible local model plug into ARF just like the cloud APIs. Route sensitive work locally, experimental work to the cloud. Policy follows either way.