Chapter 28: NeamClaw API, CLI, and Deployment #
"The best code in the world is worthless if it never leaves your laptop. Deployment is not the last step -- it is the first step toward value." -- Production engineering axiom
In the previous chapters, you learned to declare claw agents with persistent sessions,
forge agents with iterative build-verify loops, and traits that add cross-cutting behavior
to both. You can now build sophisticated multi-turn assistants and autonomous code
generators. But all of that work has been running locally -- compiled and executed on your
own machine. In this chapter, you will learn how to expose your agents to the outside
world through the HTTP API, run forge agents from the command line with the neam-forge
CLI, orchestrate multiple agents with spawn and dag_execute, and deploy everything
to Docker, Kubernetes, and AWS Lambda.
By the end of this chapter, you will be able to take a .neam file containing claw and
forge agents from source code to a production deployment accessible over the network.
- Understand the neam-forge CLI and its internal workflow
- Use the HTTP API to interact with claw agents (sessions, messages, reset, compaction)
- Use the HTTP API to manage forge agent loops (start, stop, status)
- Configure authentication with API keys and admin keys
- Tune the neam-api server with runtime flags
- Orchestrate multiple agents with spawn and dag_execute
- Package agents in Docker containers with proper volume mounts
- Deploy to Kubernetes with PersistentVolumeClaims and health probes
- Deploy to AWS Lambda with EFS for forge workspace persistence
Think of a restaurant kitchen. A brilliant chef who can prepare extraordinary dishes is only useful if the restaurant has a dining room, a front-of-house staff to take orders, a delivery system for takeout, and health inspections to ensure safety. The chef is your agent logic. The dining room is your HTTP API. The front-of-house staff is your CLI tooling. The delivery system is your deployment pipeline. And the health inspections are your authentication and monitoring. Without all of these pieces working together, your agents remain private experiments. This chapter connects every piece so that your agents can serve real users in real environments.
28.1 The neam-forge CLI #
The neam-forge command-line tool runs forge agents directly from the terminal. It
compiles a .neam file, finds the forge agent inside it, executes the build-verify
loop, and prints the outcome. This is the fastest way to test forge agents during
development and the simplest way to integrate them into shell scripts and CI pipelines.
Purpose #
The neam-forge CLI exists for a specific use case: you have a .neam file containing
one or more forge agents, and you want to run one of them immediately without starting
an HTTP server. It handles compilation, VM setup, and loop execution in a single command.
Usage #
neam-forge --agent <file.neam> [options]
CLI Flags #
| Flag | Required | Default | Description |
|---|---|---|---|
| --agent FILE | Yes | -- | Path to the .neam file containing forge agent(s) |
| --name NAME | No | First found | Name of the specific forge agent to run |
| --workspace DIR | No | Agent default | Override the agent workspace directory |
| --max-iterations N | No | Agent default | Override the maximum iteration count |
| --max-cost N | No | Agent default | Override the maximum cost limit (USD) |
| --verbose | No | Off | Print compilation details and agent metadata to stderr |
| --help | No | -- | Show usage information and exit |
Internal Workflow #
When you execute neam-forge, the tool performs seven steps internally, from compilation
through loop execution. Understanding this workflow helps you debug issues and predict
behavior.
The key insight is that the wrapper code is compiled and executed on the same VM
that already has all agents, skills, and structs registered. This means the forge agent
can call any skill defined in the original .neam file, and the spawn keyword can
reach any other agent registered during Step 2.
Exit Codes #
| Exit Code | Meaning |
|---|---|
| 0 | Forge loop completed successfully |
| 1 | Error (compilation failure, missing agent, runtime error) |
Examples #
Basic run with defaults:
neam-forge --agent build_pipeline.neam
This compiles build_pipeline.neam, finds the first forge agent, runs it with all
default settings, and prints the outcome.
Run with overrides:
neam-forge --agent build_pipeline.neam \
--workspace ./output \
--max-iterations 10 \
--max-cost 3.0 \
--verbose
This overrides the workspace directory, limits iterations to 10 and cost to $3.00, and prints compilation details to stderr.
Run a specific named agent:
neam-forge --agent agents.neam --name code_builder
When a .neam file contains multiple forge agents, use --name to select which one
to run. Without --name, the CLI picks the first forge agent it finds.
Use --verbose during development to see exactly which agent was selected, what
overrides were applied, and how many skills and structs were registered on the VM.
This information goes to stderr, so it does not interfere with piping the JSON
outcome from stdout.
28.2 HTTP API: Claw Endpoints #
The neam-api server exposes your claw and forge agents over HTTP. Start the server
by pointing it at a .neam file that contains your agent declarations:
neam-api --program my_agents.neam --port 8080
The server compiles the program, registers all agents on a VM pool, and begins listening for HTTP requests. Below is the complete endpoint map for claw and forge agents:
Endpoint Map #
┌──────────────────────────────────────────────────────────────────────┐
│ neam-api Endpoint Map │
│ │
│ Health & Metrics │
│ ─────────────────────────────────────────────────────────────────── │
│ GET /health → Liveness probe │
│ GET /ready → Readiness probe │
│ GET /api/v1/health → Health + VM stats │
│ GET /api/v1/metrics → p50/p95/p99 metrics │
│ │
│ Legacy Agents │
│ ─────────────────────────────────────────────────────────────────── │
│ GET /api/v1/agents → List legacy agents │
│ POST /api/v1/agent/ask → Query legacy agent │
│ │
│ Claw Agents │
│ ─────────────────────────────────────────────────────────────────── │
│ GET /api/v1/claw → List claw agents │
│ POST /api/v1/claw/:agent/sessions/:key/message → Send message │
│ POST /api/v1/claw/:agent/sessions/:key/reset → Reset session │
│ POST /api/v1/claw/:agent/compact → Trigger compaction │
│ │
│ Forge Agents │
│ ─────────────────────────────────────────────────────────────────── │
│ GET /api/v1/forge → List forge agents │
│ POST /api/v1/forge/:agent/start → Start forge loop │
│ POST /api/v1/forge/:agent/stop → Stop forge loop │
│ GET /api/v1/forge/:agent/status → Get loop status │
│ │
│ Admin │
│ ─────────────────────────────────────────────────────────────────── │
│ POST /api/v1/admin/disable → Kill switch │
│ POST /api/v1/admin/enable → Re-enable │
│ GET /api/v1/admin/status → Monitoring status │
│ POST /api/v1/confirm → Approve/deny tool │
│ GET /api/v1/admin/confirmations → Pending confirmations │
└──────────────────────────────────────────────────────────────────────┘
Let us walk through each claw endpoint in detail.
GET /api/v1/claw -- List Claw Agents #
This endpoint returns all claw agents registered on the server, along with metadata about their provider, model, active session count, and channel configuration.
curl http://localhost:8080/api/v1/claw
Response:
{
"claw_agents": [
{
"name": "support_bot",
"provider": "openai",
"model": "gpt-4o",
"session_count": 3,
"channels": ["cli_chan"]
},
{
"name": "tutor",
"provider": "ollama",
"model": "llama3",
"session_count": 0,
"channels": ["web_chan"]
}
]
}
The session_count field tells you how many active sessions exist for each agent.
This is useful for monitoring and capacity planning.
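As a monitoring sketch, the listing response can be folded into simple capacity numbers. The snippet below parses the example payload shown above; the field names (claw_agents, session_count, provider) follow that sample and are an assumption about the stable response shape.

```python
import json

def summarize_claw_agents(payload: str) -> dict:
    """Summarize a GET /api/v1/claw response: total active sessions
    plus a per-provider breakdown. Field names follow the example
    response above and may not be a stable contract."""
    data = json.loads(payload)
    total = 0
    by_provider = {}
    for agent in data.get("claw_agents", []):
        count = agent.get("session_count", 0)
        total += count
        provider = agent.get("provider", "unknown")
        by_provider[provider] = by_provider.get(provider, 0) + count
    return {"total_sessions": total, "by_provider": by_provider}

# Using the example response from this section:
sample = '''{"claw_agents": [
  {"name": "support_bot", "provider": "openai", "model": "gpt-4o",
   "session_count": 3, "channels": ["cli_chan"]},
  {"name": "tutor", "provider": "ollama", "model": "llama3",
   "session_count": 0, "channels": ["web_chan"]}
]}'''
print(summarize_claw_agents(sample))
# {'total_sessions': 3, 'by_provider': {'openai': 3, 'ollama': 0}}
```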
POST /api/v1/claw/:agent/sessions/:key/message -- Send Message #
This is the primary endpoint for interacting with claw agents. You send a message to a specific agent and session, and the agent responds using its full context (system prompt, session history, skills, guards, and semantic memory).
Sessions are created automatically on first use. If you send a message to a session key that does not exist yet, the server creates the session, initializes it, and processes your message.
curl -X POST http://localhost:8080/api/v1/claw/support_bot/sessions/user_42/message \
-H "Content-Type: application/json" \
-d '{"message": "How do I reset my password?"}'
Response:
{
"agent": "support_bot",
"session_key": "user_42",
"response": "To reset your password, go to Settings > Security > Reset Password. You will receive a confirmation email within 2 minutes."
}
Subsequent messages to the same session key maintain conversation context:
curl -X POST http://localhost:8080/api/v1/claw/support_bot/sessions/user_42/message \
-H "Content-Type: application/json" \
-d '{"message": "What if I do not receive the email?"}'
{
"agent": "support_bot",
"session_key": "user_42",
"response": "If you do not receive the confirmation email within 5 minutes, check your spam folder first. If it is not there, I can trigger a manual password reset for you. Would you like me to do that?"
}
The agent remembers the previous message because the session history is loaded from disk (JSONL format) and included in the context sent to the LLM.
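Because the history is stored as JSONL, each line is one self-contained JSON object. Here is a minimal sketch of reading such a transcript; the role/content field names are illustrative assumptions, since the actual on-disk schema belongs to the neam-api server.

```python
import json
import io

def load_session_history(stream) -> list:
    """Read a JSONL session transcript: one JSON object per line.
    The 'role'/'content' keys below are assumed for illustration."""
    turns = []
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            turns.append(json.loads(line))
    return turns

transcript = io.StringIO(
    '{"role": "user", "content": "How do I reset my password?"}\n'
    '{"role": "assistant", "content": "Go to Settings > Security."}\n'
)
history = load_session_history(transcript)
print(len(history), history[0]["role"])
# 2 user
```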
POST /api/v1/claw/:agent/sessions/:key/reset -- Reset Session #
Reset a claw agent session. This archives the current conversation history and starts a fresh session. The agent will have no memory of previous interactions in that session after a reset.
curl -X POST \
http://localhost:8080/api/v1/claw/support_bot/sessions/user_42/reset
Response:
{
"agent": "support_bot",
"session_key": "user_42",
"status": "reset",
"message": "Session archived and reset"
}
Resetting a session is destructive for the active conversation. The archived history is preserved on disk for audit purposes, but the agent will not load it into context for future messages. If you want to reduce context size without losing the conversation thread, use compaction instead.
POST /api/v1/claw/:agent/compact -- Trigger Compaction #
Compaction reduces the context window by summarizing older conversation history. This is useful when a long-running session approaches the model context limit. The compaction process first flushes important facts to the agent workspace (MEMORY.md), then replaces older turns with a concise summary.
curl -X POST http://localhost:8080/api/v1/claw/support_bot/compact
Response:
{
"agent": "support_bot",
"status": "compacted",
"turns_before": 87,
"turns_after": 12,
"summary_length": 342
}
If a claw agent has compaction: "auto" in its session configuration, compaction
triggers automatically when the context reaches approximately 80% of the model
window. Manual compaction via this endpoint is useful when you have compaction:
"manual" or when you want to force compaction at a specific point.
28.3 HTTP API: Forge Endpoints #
Forge agents run iterative build-verify loops that can take minutes or hours to complete. The HTTP API lets you start loops in the background, monitor their progress, and stop them if needed.
GET /api/v1/forge -- List Forge Agents #
Returns all registered forge agents with their configuration and current running state.
curl http://localhost:8080/api/v1/forge
Response:
{
"forge_agents": [
{
"name": "code_builder",
"provider": "openai",
"model": "gpt-4o",
"max_iterations": 25,
"max_cost": 10.0,
"running": false
},
{
"name": "doc_generator",
"provider": "openai",
"model": "gpt-4o",
"max_iterations": 15,
"max_cost": 5.0,
"running": true
}
]
}
The running field indicates whether a forge loop is currently in progress for each
agent. An agent can only have one active loop at a time.
POST /api/v1/forge/:agent/start -- Start Forge Loop #
Start a forge agent build-verify loop in the background. You can optionally override the iteration and cost limits.
curl -X POST http://localhost:8080/api/v1/forge/code_builder/start \
-H "Content-Type: application/json" \
-d '{"max_iterations": 10, "max_cost": 5.0}'
Response:
{
"status": "started",
"agent": "code_builder",
"max_iterations": 10,
"max_cost": 5.0
}
If you omit the request body, the agent runs with its declared defaults:
curl -X POST http://localhost:8080/api/v1/forge/code_builder/start
{
"status": "started",
"agent": "code_builder",
"max_iterations": 25,
"max_cost": 10.0
}
If you attempt to start a forge loop while one is already running for the same agent,
the server returns an error. You must either wait for the current loop to finish or
explicitly stop it first with the /stop endpoint.
POST /api/v1/forge/:agent/stop -- Stop Running Loop #
Request a running forge loop to stop. The loop finishes its current iteration and then terminates gracefully.
curl -X POST http://localhost:8080/api/v1/forge/code_builder/stop
Response:
{
"status": "stopping",
"agent": "code_builder"
}
The status is "stopping" rather than "stopped" because the current iteration
must complete before the loop actually halts. Poll the /status endpoint to confirm
when the loop has fully stopped.
GET /api/v1/forge/:agent/status -- Check Loop Status #
Check the current state of a forge agent loop, including the iteration count, accumulated cost, and last outcome.
curl http://localhost:8080/api/v1/forge/code_builder/status
Response (while running):
{
"agent": "code_builder",
"running": true,
"iteration": 7,
"total_cost": 0.0023,
"last_outcome": "running"
}
Response (after completion):
{
"agent": "code_builder",
"running": false,
"iteration": 12,
"total_cost": 0.0051,
"last_outcome": "completed"
}
The last_outcome field contains one of: "running", "completed", "aborted",
"cost_limit", "iteration_limit", or "error".
Write a shell script that starts a forge loop, then polls the status endpoint every 5 seconds until the loop finishes. Print the final iteration count and total cost. Here is a starting point:
curl -s -X POST http://localhost:8080/api/v1/forge/code_builder/start
while true; do
status=$(curl -s http://localhost:8080/api/v1/forge/code_builder/status)
running=$(echo "$status" | jq -r '.running')
if [ "$running" = "false" ]; then
echo "$status" | jq '.'
break
fi
sleep 5
done
28.4 Authentication #
In production, you do not want anyone on the network to be able to query your agents
or start forge loops. Neam provides two environment variables for authentication:
NEAM_API_KEY for general API access and NEAM_ADMIN_KEY for administrative
endpoints.
General API Access: NEAM_API_KEY #
When NEAM_API_KEY is set, all API requests must include an Authorization:
Bearer header with the matching key. Requests without a valid key receive a 401
Unauthorized response.
# Set the API key before starting the server
export NEAM_API_KEY="sk-my-secret-key"
neam-api --program my_agents.neam --port 8080
Now every request must include the authorization header:
# This works
curl -H "Authorization: Bearer sk-my-secret-key" \
http://localhost:8080/api/v1/claw
# This fails with 401
curl http://localhost:8080/api/v1/claw
Admin Endpoints: NEAM_ADMIN_KEY #
Admin endpoints (/api/v1/admin/* and /api/v1/confirm) require a separate
NEAM_ADMIN_KEY. This allows you to give API access to application code while
restricting administrative operations to operators.
export NEAM_API_KEY="sk-my-secret-key"
export NEAM_ADMIN_KEY="admin-super-secret-key"
neam-api --program my_agents.neam --port 8080
Calling an admin endpoint requires the admin key:
# Disable an agent (requires admin key)
curl -X POST \
-H "Authorization: Bearer admin-super-secret-key" \
-H "Content-Type: application/json" \
-d '{"agent": "support_bot"}' \
http://localhost:8080/api/v1/admin/disable
{
"status": "disabled",
"agent": "support_bot"
}
Authentication Flow Diagram #
When NEAM_API_KEY is not set, the server runs in development mode with no
authentication. Never deploy to production without setting both NEAM_API_KEY and
NEAM_ADMIN_KEY. The health check endpoints (/health and /ready) are always
accessible without authentication so that load balancers and Kubernetes probes can
reach them.
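The rules above can be summarized as a small decision function. This is a behavioral sketch of the documented policy, not the server's actual code; in particular, whether admin endpoints are also open in development mode is an assumption here.

```python
def authorize(path, bearer, api_key, admin_key):
    """Return an HTTP status (200 = allowed) for a request, following
    the rules described in this section. A sketch only."""
    if path in ("/health", "/ready"):
        return 200  # probes are always open to load balancers
    if api_key is None:
        return 200  # NEAM_API_KEY unset: development mode, no auth
    if path.startswith("/api/v1/admin") or path == "/api/v1/confirm":
        # admin endpoints require the separate admin key
        return 200 if admin_key is not None and bearer == admin_key else 401
    return 200 if bearer == api_key else 401

print(authorize("/api/v1/claw", "sk-my-secret-key", "sk-my-secret-key", None))  # 200
print(authorize("/api/v1/claw", None, "sk-my-secret-key", None))                # 401
```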
28.5 Server Flags #
The neam-api command accepts several flags to control its runtime behavior. These
flags let you tune performance, configure logging, and adapt the server to different
deployment environments.
neam-api Flag Reference #
| Flag | Default | Description |
|---|---|---|
| --host HOST | 0.0.0.0 | Network interface to bind to |
| --port PORT | 8080 | TCP port to listen on |
| --workers N | 4 | Number of worker threads in the thread pool |
| --timeout MS | 60000 | Request timeout in milliseconds |
| --pool-size N | (workers) | Number of VM instances in the pool |
| --llm-pool-size N | 8 | LLM HTTP connection pool size |
| --bytecode DIR | -- | Directory containing pre-compiled .neamb files |
| --drain-seconds N | 10 | Graceful shutdown drain period in seconds |
| --program FILE | -- | Path to .neam file for claw/forge agent registration |
| --audit-sink SINK | stderr | Audit log destination: stderr, file, or json |
| --audit-log PATH | -- | File path for audit log (when --audit-sink file) |
Flag Usage Examples #
Development server with verbose logging:
neam-api --program agents.neam --port 3000 --workers 2 --audit-sink stderr
Production server with tuned pools and file-based audit:
neam-api \
--program agents.neam \
--host 0.0.0.0 \
--port 8080 \
--workers 8 \
--pool-size 8 \
--llm-pool-size 16 \
--timeout 120000 \
--drain-seconds 30 \
--audit-sink file \
--audit-log /var/log/neam/audit.jsonl
Load pre-compiled bytecode for faster startup:
# First, compile ahead of time
neamc compile agents.neam --output ./bytecode/
# Then start the server with bytecode
neam-api --bytecode ./bytecode/ --port 8080
Think of --workers as the number of cooks in a kitchen, --pool-size as the
number of stoves, and --llm-pool-size as the number of delivery drivers. More
cooks help with more simultaneous orders. More stoves prevent cooks from waiting
for a free burner. More delivery drivers prevent finished orders from piling up
while waiting to go out. In practice, start with --workers equal to your CPU
core count and --llm-pool-size at 2x workers, then adjust based on observed
latency metrics.
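The starting-point heuristic above is easy to encode. The helper below derives initial flag values from the CPU count; treat the numbers as a tuning baseline to adjust against observed latency, not as hard rules.

```python
import os

def suggested_flags(cores=None) -> dict:
    """Starting-point neam-api tuning from the guidance above:
    workers = CPU cores, pool-size = workers (the documented default),
    llm-pool-size = 2x workers. Heuristics, not hard rules."""
    cores = cores or os.cpu_count() or 4
    return {"--workers": cores,
            "--pool-size": cores,
            "--llm-pool-size": cores * 2}

print(suggested_flags(8))
# {'--workers': 8, '--pool-size': 8, '--llm-pool-size': 16}
```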
28.6 Multi-Agent Orchestration #
In Chapter 13, you learned about multi-agent patterns using the basic agent type.
NeamClaw extends orchestration to claw and forge agents with two mechanisms: the
spawn keyword for invoking agents from code, and dag_execute() for running
dependency-ordered workflows.
The spawn Keyword #
The spawn keyword invokes another agent by name. It performs a single-turn LLM call
regardless of the target agent type. This makes spawn lightweight and suitable for
quick delegation tasks within a larger workflow.
// Keyword form -- static agent name
let result = spawn researcher("Find recent papers on RAG");
// Native function form -- dynamic agent name
let agent_name = "researcher";
let result = spawn(agent_name, "Find recent papers on RAG");
Both forms are equivalent. The keyword form is more readable when you know the agent name at write time. The function form is useful when the agent name comes from a variable or configuration.
Spawn Resolution Order #
When you call spawn, the VM resolves the agent name against its registered agents in a specific order, so the same name is always matched deterministically.
spawn does a simplified single-turn LLM call. It does not invoke the full
.ask() session flow for claw agents or the full .run() forge loop for forge
agents. If you need full session behavior, call agent.ask() directly. If you need
the full build-verify loop, call agent.run() directly.
DAG Execution with dag_execute() #
For workflows where tasks have dependencies, dag_execute() accepts an array of
node definitions and executes them in topological order. Independent nodes at the
same level can run concurrently (depending on available workers), while dependent
nodes wait for their prerequisites to complete.
let results = dag_execute([
{
"id": "research",
"agent": "researcher",
"task": "Gather data on market trends",
"depends_on": []
},
{
"id": "analysis",
"agent": "analyst",
"task": "Analyze the research findings",
"depends_on": ["research"]
},
{
"id": "report",
"agent": "writer",
"task": "Write executive summary",
"depends_on": ["research", "analysis"]
}
]);
// results is a Map:
// { "research": "...", "analysis": "...", "report": "..." }
DAG Node Fields #
| Field | Type | Required | Description |
|---|---|---|---|
| id | String | Yes | Unique identifier for this node |
| agent | String | Yes | Name of the agent to invoke via spawn |
| task | String | No | Task description passed to the agent |
| depends_on | List of strings | No | IDs of nodes that must complete first |
DAG Execution Flow #
The dag_execute() function uses Kahn's topological sort algorithm to determine the
correct execution order: nodes with no unmet dependencies run first, and each completed
node unlocks the nodes that depend on it.
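The ordering logic can be sketched in a few lines. This is a level-grouped variant of Kahn's algorithm operating on the same node shape dag_execute() accepts; it illustrates the scheduling order, not the engine's actual implementation.

```python
def kahn_order(nodes: list) -> list:
    """Topologically order DAG nodes (Kahn's algorithm), grouped into
    levels whose members have no dependencies on each other -- the
    groups a scheduler could run concurrently."""
    deps = {n["id"]: set(n.get("depends_on", [])) for n in nodes}
    levels = []
    while deps:
        # nodes whose remaining dependency set is empty are ready
        ready = sorted(nid for nid, d in deps.items() if not d)
        if not ready:
            raise ValueError("cycle detected in DAG")
        levels.append(ready)
        for nid in ready:
            del deps[nid]
        for d in deps.values():
            d.difference_update(ready)
    return levels

print(kahn_order([
    {"id": "research", "depends_on": []},
    {"id": "analysis", "depends_on": ["research"]},
    {"id": "report",   "depends_on": ["research", "analysis"]},
]))
# [['research'], ['analysis'], ['report']]
```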
Complete DAG Example #
Here is a complete example that defines three claw agents and orchestrates them with a DAG:
// Shared skill and guard for all agents
skill web_search {
description: "Search the web for information"
params: { query: string }
impl(query) {
return "Search results for: " + query;
}
}
guard safety {
description: "Basic safety filter"
on_observation(text) { return text; }
}
guardchain safe_chain = [safety];
channel web_chan {
type: "http"
}
// Three specialized agents
claw agent researcher {
provider: "openai"
model: "gpt-4o"
system: "You are a research assistant. Gather and summarize information."
skills: [web_search]
guards: [safe_chain]
channels: [web_chan]
}
claw agent analyst {
provider: "openai"
model: "gpt-4o"
system: "You are a data analyst. Identify patterns and key insights."
skills: [web_search]
guards: [safe_chain]
channels: [web_chan]
}
claw agent writer {
provider: "openai"
model: "gpt-4o"
system: "You are a technical writer. Create clear, well-structured reports."
skills: [web_search]
guards: [safe_chain]
channels: [web_chan]
}
// Orchestration with Orchestrable trait callbacks
impl Orchestrable for researcher {
fn on_spawn(self, child) {
emit "Researcher spawning: " + child;
return nil;
}
fn on_delegate(self, result) {
emit "Researcher received delegation result";
return nil;
}
}
// Execute the DAG
let results = dag_execute([
{
"id": "gather",
"agent": "researcher",
"task": "Research recent advances in AI agent frameworks",
"depends_on": []
},
{
"id": "analyze",
"agent": "analyst",
"task": "Identify key trends and compare frameworks",
"depends_on": ["gather"]
},
{
"id": "write",
"agent": "writer",
"task": "Write a 500-word executive summary",
"depends_on": ["gather", "analyze"]
}
]);
emit "Research: " + results["gather"];
emit "Analysis: " + results["analyze"];
emit "Report: " + results["write"];
28.7 Docker Deployment #
Docker is the standard packaging format for deploying Neam agents. The generated Dockerfile uses a multi-stage build to keep the runtime image small while including all necessary binaries.
Updated Dockerfile #
The NeamClaw Dockerfile includes the neam-forge binary alongside the existing
toolchain:
# =============================================================================
# Neam Multi-Stage Docker Build (NeamClaw)
# =============================================================================
# -----------------------------------------------------------------------------
# Stage 1: Builder
# -----------------------------------------------------------------------------
FROM ubuntu:24.04 AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
cmake \
g++ \
libcurl4-openssl-dev \
libssl-dev \
libpq-dev \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /build
COPY CMakeLists.txt ./
COPY deps/ deps/
COPY NeamC/ NeamC/
RUN cmake -B build \
-DCMAKE_BUILD_TYPE=Release \
-DNEAM_BACKEND_POSTGRES=ON \
&& cmake --build build -j$(nproc)
# -----------------------------------------------------------------------------
# Stage 2: Runtime
# -----------------------------------------------------------------------------
FROM ubuntu:24.04
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
curl \
libcurl4 \
libssl3 \
libpq5 \
git \
&& rm -rf /var/lib/apt/lists/*
# Copy all Neam binaries including neam-forge
COPY --from=builder /build/build/neamc /usr/local/bin/neamc
COPY --from=builder /build/build/neam /usr/local/bin/neam
COPY --from=builder /build/build/neam-cli /usr/local/bin/neam-cli
COPY --from=builder /build/build/neam-api /usr/local/bin/neam-api
COPY --from=builder /build/build/neam-lsp /usr/local/bin/neam-lsp
COPY --from=builder /build/build/neam-forge /usr/local/bin/neam-forge
# Create non-root user
RUN useradd --create-home --shell /bin/bash neam
USER neam
WORKDIR /home/neam
# Create directories for sessions and workspaces
RUN mkdir -p /home/neam/sessions /home/neam/workspace
# Copy agent source files
COPY --chown=neam:neam agents/ /home/neam/agents/
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1
ENTRYPOINT ["neam-api"]
CMD ["--program", "/home/neam/agents/main.neam", "--port", "8080"]
Note that git is included in the runtime image. This is required for forge agents
that use checkpoint: "git" -- the forge controller executes git init, git add,
and git commit inside the workspace directory.
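For intuition, the checkpoint sequence can be sketched as a command builder. The doc above names git init, git add, and git commit; the -A flag, running init only on the first iteration, and the commit message format are all illustrative assumptions.

```python
def checkpoint_commands(iteration: int) -> list:
    """Build the git command sequence the forge controller runs in the
    workspace for checkpoint: "git". Running init only on iteration 1,
    the -A flag, and the message format are assumptions."""
    cmds = [["git", "init"]] if iteration == 1 else []
    cmds += [["git", "add", "-A"],
             ["git", "commit", "-m", f"forge iteration {iteration}"]]
    return cmds

print(checkpoint_commands(1))
# [['git', 'init'], ['git', 'add', '-A'], ['git', 'commit', '-m', 'forge iteration 1']]
```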
Session and Workspace Volumes #
Claw agents store session data (JSONL conversation histories) in a sessions directory. Forge agents write files to a workspace directory. Both of these need persistent storage in production to survive container restarts.
Docker Compose Example #
A complete docker-compose.yaml with persistent volumes:
services:
neam-agent:
build: .
ports:
- "8080:8080"
environment:
- NEAM_API_KEY=${NEAM_API_KEY}
- NEAM_ADMIN_KEY=${NEAM_ADMIN_KEY}
- OPENAI_API_KEY=${OPENAI_API_KEY}
volumes:
- session-data:/home/neam/sessions
- workspace-data:/home/neam/workspace
command:
- "--program"
- "/home/neam/agents/main.neam"
- "--port"
- "8080"
- "--workers"
- "4"
- "--audit-sink"
- "file"
- "--audit-log"
- "/home/neam/sessions/audit.jsonl"
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 5s
retries: 3
volumes:
session-data:
driver: local
workspace-data:
driver: local
Build and Run Commands #
# Build the image
docker build -t neam-agents:latest .
# Run with environment variables
docker run -d \
--name neam-server \
-p 8080:8080 \
-e NEAM_API_KEY="sk-my-secret-key" \
-e NEAM_ADMIN_KEY="admin-secret-key" \
-e OPENAI_API_KEY="sk-openai-key" \
-v neam-sessions:/home/neam/sessions \
-v neam-workspace:/home/neam/workspace \
neam-agents:latest
# Verify it is running
curl http://localhost:8080/health
# Run neam-forge directly in a container (for CI pipelines)
docker run --rm \
-e OPENAI_API_KEY="sk-openai-key" \
-v $(pwd)/output:/home/neam/workspace \
neam-agents:latest \
neam-forge --agent /home/neam/agents/build_pipeline.neam \
--workspace /home/neam/workspace \
--max-iterations 10
Use docker compose up during development for the full stack with persistent
volumes. Use docker run --rm with neam-forge as the entrypoint for one-shot
forge agent execution in CI pipelines.
28.8 Kubernetes Deployment #
Kubernetes provides production-grade orchestration for Neam agent containers. The key
additions for NeamClaw are PersistentVolumeClaims for session and workspace data,
updated deployment manifests with volume mounts, and health check probes that match
the /health and /ready endpoints.
PersistentVolumeClaims #
Claw agent sessions and forge agent workspaces must survive pod restarts. Define PersistentVolumeClaims for both:
# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: neam-sessions-pvc
namespace: neam-prod
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
storageClassName: standard
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: neam-workspace-pvc
namespace: neam-prod
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 50Gi
storageClassName: standard
The workspace PVC is larger because forge agents may generate significant code, documentation, or data artifacts during their build-verify loops.
Deployment with Volume Mounts #
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: neam-agents
namespace: neam-prod
labels:
app: neam-agents
spec:
replicas: 2
selector:
matchLabels:
app: neam-agents
template:
metadata:
labels:
app: neam-agents
spec:
containers:
- name: neam-api
image: neam-agents:latest
ports:
- containerPort: 8080
env:
- name: NEAM_API_KEY
valueFrom:
secretKeyRef:
name: neam-secrets
key: api-key
- name: NEAM_ADMIN_KEY
valueFrom:
secretKeyRef:
name: neam-secrets
key: admin-key
- name: OPENAI_API_KEY
valueFrom:
secretKeyRef:
name: neam-secrets
key: openai-key
args:
- "--program"
- "/home/neam/agents/main.neam"
- "--port"
- "8080"
- "--workers"
- "4"
- "--drain-seconds"
- "30"
volumeMounts:
- name: tmp
mountPath: /tmp
- name: session-storage
mountPath: /home/neam/sessions
- name: forge-workspace
mountPath: /home/neam/workspace
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 10
periodSeconds: 15
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
failureThreshold: 3
resources:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "2000m"
memory: "2Gi"
volumes:
- name: tmp
emptyDir: {}
- name: session-storage
persistentVolumeClaim:
claimName: neam-sessions-pvc
- name: forge-workspace
persistentVolumeClaim:
claimName: neam-workspace-pvc
Service and Ingress #
Expose the deployment with a Service and an Ingress:
# service.yaml
apiVersion: v1
kind: Service
metadata:
name: neam-agents-svc
namespace: neam-prod
spec:
selector:
app: neam-agents
ports:
- port: 80
targetPort: 8080
protocol: TCP
type: ClusterIP
---
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: neam-agents-ingress
namespace: neam-prod
annotations:
nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
spec:
ingressClassName: nginx
rules:
- host: agents.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: neam-agents-svc
port:
number: 80
tls:
- hosts:
- agents.example.com
secretName: neam-tls-secret
Health Check Endpoints #
The two health check endpoints serve different purposes in Kubernetes:
| Endpoint | Kubernetes Probe | Checks | Failure Action |
|---|---|---|---|
| /health | Liveness | Process alive, event loop responsive | Kill and restart pod |
| /ready | Readiness | Agents loaded, LLM connectivity | Remove from service endpoints |
The liveness probe answers "is the process alive?" -- if it fails, the pod is irrecoverably broken and Kubernetes restarts it. The readiness probe answers "can this pod serve requests?" -- if it fails, the pod stays alive but stops receiving traffic until it recovers.
Resource Limits Recommendations #
| Workload Type | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|
| Claw agents (chat only) | 250m | 1000m | 256Mi | 1Gi |
| Forge agents (code generation) | 500m | 2000m | 512Mi | 2Gi |
| Mixed claw + forge | 500m | 2000m | 512Mi | 2Gi |
| High-traffic production | 1000m | 4000m | 1Gi | 4Gi |
Forge agents that run git operations and compile generated code need more CPU and
memory than pure chat agents. If you are running both claw and forge agents on the
same deployment, size your resource limits for the forge workload.
28.9 AWS Lambda Deployment #
AWS Lambda offers a serverless deployment model for Neam agents. The key challenge is persistence: Lambda functions are stateless, but forge agents need a workspace that persists across iterations, and claw agents need session storage. Amazon Elastic File System (EFS) solves this by providing a shared filesystem that Lambda functions can mount.
Lambda Architecture #
SAM Template #
Use the AWS Serverless Application Model (SAM) to define the Lambda function, EFS mount, and API Gateway configuration:
# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Neam Agent Lambda Deployment

Globals:
  Function:
    Timeout: 120
    MemorySize: 1024

Resources:
  # EFS for persistent storage
  NeamFileSystem:
    Type: AWS::EFS::FileSystem
    Properties:
      PerformanceMode: generalPurpose
      ThroughputMode: bursting
      Encrypted: true

  NeamMountTarget:
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref NeamFileSystem
      SubnetId: !Ref PrivateSubnet
      SecurityGroups:
        - !Ref EfsSecurityGroup

  NeamAccessPoint:
    Type: AWS::EFS::AccessPoint
    Properties:
      FileSystemId: !Ref NeamFileSystem
      PosixUser:
        Uid: "1000"
        Gid: "1000"
      RootDirectory:
        CreationInfo:
          OwnerUid: "1000"
          OwnerGid: "1000"
          Permissions: "755"
        Path: "/neam"

  # Lambda function
  NeamAgentFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      Architectures:
        - x86_64
      VpcConfig:
        SubnetIds:
          - !Ref PrivateSubnet
        SecurityGroupIds:
          - !Ref LambdaSecurityGroup
      FileSystemConfigs:
        - Arn: !GetAtt NeamAccessPoint.Arn
          LocalMountPath: /mnt/neam
      Environment:
        Variables:
          NEAM_SESSIONS_DIR: /mnt/neam/sessions
          NEAM_WORKSPACE_DIR: /mnt/neam/workspace
          NEAM_API_KEY: !Ref ApiKeyParameter
      Events:
        ClawList:
          Type: HttpApi
          Properties:
            Path: /api/v1/claw
            Method: GET
        ClawMessage:
          Type: HttpApi
          Properties:
            Path: /api/v1/claw/{agent}/sessions/{key}/message
            Method: POST
        ClawReset:
          Type: HttpApi
          Properties:
            Path: /api/v1/claw/{agent}/sessions/{key}/reset
            Method: POST
        ForgeList:
          Type: HttpApi
          Properties:
            Path: /api/v1/forge
            Method: GET
        ForgeStart:
          Type: HttpApi
          Properties:
            Path: /api/v1/forge/{agent}/start
            Method: POST
        ForgeStatus:
          Type: HttpApi
          Properties:
            Path: /api/v1/forge/{agent}/status
            Method: GET
        HealthCheck:
          Type: HttpApi
          Properties:
            Path: /health
            Method: GET
    Metadata:
      DockerTag: latest
      DockerContext: .
      Dockerfile: Dockerfile.lambda

Parameters:
  ApiKeyParameter:
    Type: String
    NoEcho: true
    Description: API key for authentication

Outputs:
  ApiUrl:
    Description: API Gateway URL
    Value: !Sub "https://${ServerlessHttpApi}.execute-api.${AWS::Region}.amazonaws.com"
API Gateway Route Configuration #
The SAM template above automatically creates an HTTP API (API Gateway v2) with routes for each event definition. API Gateway v2 is preferred over REST API (v1) because it has lower latency, lower cost, and native support for Lambda proxy integration.
The routes map directly to the neam-api endpoints:
| API Gateway Route | Lambda Handler | Purpose |
|---|---|---|
| `GET /api/v1/claw` | List claw agents | Discovery |
| `POST /api/v1/claw/{agent}/sessions/{key}/message` | Send message | Conversation |
| `POST /api/v1/claw/{agent}/sessions/{key}/reset` | Reset session | Session management |
| `GET /api/v1/forge` | List forge agents | Discovery |
| `POST /api/v1/forge/{agent}/start` | Start forge loop | Execution |
| `GET /api/v1/forge/{agent}/status` | Check status | Monitoring |
| `GET /health` | Health check | Load balancer |
Lambda has a maximum execution timeout of 15 minutes. Forge agents that run many iterations may exceed this limit. For long-running forge loops, consider using AWS Step Functions to break the loop into individual Lambda invocations (one iteration per invocation) with state stored in EFS between invocations. Alternatively, deploy forge agents to ECS or Kubernetes where there is no execution time limit.
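The Step Functions pattern can be sketched as a Lambda handler that performs one iteration and persists loop state to EFS between invocations. This is a minimal sketch under stated assumptions: `run_one_iteration` is a stub standing in for however your runtime exposes a single-iteration entry point, and the state-file path under `/mnt/neam` is an assumption matching the mount path above:

```python
import json
import pathlib

# EFS-backed default (assumption); override the path for local testing.
DEFAULT_STATE = "/mnt/neam/workspace/forge_state.json"

def load_state(path=DEFAULT_STATE):
    """Read loop state persisted on EFS between invocations."""
    p = pathlib.Path(path)
    if p.exists():
        return json.loads(p.read_text())
    return {"iteration": 0, "total_cost": 0.0, "done": False}

def save_state(state, path=DEFAULT_STATE):
    """Write loop state back to EFS for the next invocation."""
    p = pathlib.Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(json.dumps(state))

def run_one_iteration(state):
    """Stub: stands in for a single forge iteration in your runtime."""
    return {"cost": 0.0, "outcome": "running"}

def handler(event, context, path=DEFAULT_STATE):
    """One Step Functions step = one forge iteration."""
    state = load_state(path)
    result = run_one_iteration(state)
    state["iteration"] += 1
    state["total_cost"] += result.get("cost", 0.0)
    state["done"] = (result.get("outcome") == "completed"
                     or state["iteration"] >= event.get("max_iterations", 10))
    save_state(state, path)
    return state  # a Step Functions Choice state branches on "done"
```

A Choice state in the state machine re-invokes the function while `done` is false, so no single invocation ever approaches the 15-minute limit.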
28.10 Real-World Example: Full Deployment Walkthrough #
Let us walk through a complete deployment from source code to production. We will build a system with two agents: a claw agent for customer support and a forge agent for code generation. We will test locally, package in Docker, deploy to Kubernetes, and verify via the HTTP API.
Step 1: Define the Agents #
Create a file called agents.neam with both agents:
// =============================================================================
// agents.neam -- Customer support + code builder
// =============================================================================

// ── Shared skills ────────────────────────────────────────────────────────────

skill lookup_order {
  description: "Look up order status by order ID"
  params: { order_id: string }
  impl(order_id) {
    return "Order " + order_id + ": Shipped, arriving Tuesday";
  }
}

skill escalate {
  description: "Escalate issue to human agent"
  params: { reason: string }
  impl(reason) {
    return "Escalated to human: " + reason;
  }
}

skill write_file {
  description: "Write content to a file in the workspace"
  params: { path: string, content: string }
  impl(path, content) {
    workspace_write(path, content);
    return "Wrote " + path;
  }
}

skill read_file {
  description: "Read a file from the workspace"
  params: { path: string }
  impl(path) {
    return workspace_read(path);
  }
}

// ── Guards ───────────────────────────────────────────────────────────────────

guard input_filter {
  description: "Filter harmful input"
  on_observation(text) { return text; }
}

guardchain safety = [input_filter];

// ── Channel ──────────────────────────────────────────────────────────────────

channel web_chan {
  type: "http"
}

// ── Claw Agent: Customer Support Bot ─────────────────────────────────────────

claw agent support_bot {
  provider: "openai"
  model: "gpt-4o"
  system: "You are a customer support agent for Acme Corp. Be helpful, concise, and professional. Use lookup_order to check order status. Escalate complex issues with the escalate skill."
  temperature: 0.3
  skills: [lookup_order, escalate]
  guards: [safety]
  channels: [web_chan]
  workspace: "./support_data"
  session: {
    idle_reset_minutes: 30
    max_history_turns: 50
    compaction: "auto"
  }
  semantic_memory: {
    search: "hybrid"
  }
}

// ── Verify function for forge agent ──────────────────────────────────────────

fun verify_code(ctx) {
  let main = workspace_read("src/main.py");
  if (main == nil) {
    return "retry";
  }
  let tests = workspace_read("tests/test_main.py");
  if (tests == nil) {
    return "No tests found. Please create tests/test_main.py";
  }
  return true;
}

// ── Forge Agent: Code Builder ────────────────────────────────────────────────

forge agent code_builder {
  provider: "openai"
  model: "gpt-4o"
  system: "You are an expert Python developer. Write clean, tested, production-quality code. Always create both source files and test files."
  skills: [write_file, read_file]
  guards: [safety]
  verify: verify_code
  workspace: "./generated_code"
  checkpoint: "git"
  loop: {
    max_iterations: 10
    max_cost: 5.0
    plan_file: "plan.md"
    progress_file: "progress.jsonl"
    learnings_file: "learnings.jsonl"
  }
}

// ── Trait implementations ────────────────────────────────────────────────────

impl Monitorable for support_bot {
  fn baseline(self) {
    return { "avg_tool_calls": 2, "avg_response_time_ms": 800 };
  }
  fn on_anomaly(self, event) {
    emit "ALERT: anomaly on support_bot: " + event;
    return nil;
  }
}

impl Monitorable for code_builder {
  fn baseline(self) {
    return { "avg_iterations": 5, "avg_cost": 2.0 };
  }
  fn on_anomaly(self, event) {
    emit "ALERT: anomaly on code_builder: " + event;
    return nil;
  }
}
Step 2: Test Locally with neam-forge CLI #
Before deploying, verify that the forge agent works locally:
# Test the forge agent
neam-forge --agent agents.neam --name code_builder \
  --workspace ./local_test \
  --max-iterations 3 \
  --verbose
Expected output (stderr with --verbose):
[neam-forge] Compiling agents.neam...
[neam-forge] Registered 2 skills, 1 guard, 1 channel
[neam-forge] Found forge agent: code_builder
[neam-forge] Applied overrides: workspace=./local_test, max_iterations=3
[neam-forge] Starting forge loop...
Expected output (stdout):
{outcome: "completed", iterations: 3, total_cost: 0.0018, message: "All tasks verified"}
Also test the full system by starting the API server:
# Start the server locally
neam-api --program agents.neam --port 3000
# In another terminal, test the claw agent
curl -X POST http://localhost:3000/api/v1/claw/support_bot/sessions/test_user/message \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the status of order ORD-12345?"}'
Step 3: Build the Docker Image #
Create the Dockerfile (as shown in Section 28.7) and build:
# Build the image
docker build -t neam-agents:latest .
# Test locally with Docker
docker run -d \
  --name neam-test \
  -p 8080:8080 \
  -e OPENAI_API_KEY="${OPENAI_API_KEY}" \
  neam-agents:latest
# Verify health
curl http://localhost:8080/health
# List agents
curl http://localhost:8080/api/v1/claw
curl http://localhost:8080/api/v1/forge
# Clean up
docker stop neam-test && docker rm neam-test
Step 4: Deploy to Kubernetes #
Tag and push the image, then apply the manifests:
# Tag and push to your container registry
docker tag neam-agents:latest registry.example.com/neam-agents:latest
docker push registry.example.com/neam-agents:latest
# Create namespace and secrets
kubectl create namespace neam-prod
kubectl create secret generic neam-secrets \
  --namespace neam-prod \
  --from-literal=api-key="${NEAM_API_KEY}" \
  --from-literal=admin-key="${NEAM_ADMIN_KEY}" \
  --from-literal=openai-key="${OPENAI_API_KEY}"
# Apply manifests
kubectl apply -f pvc.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress.yaml
# Verify pods are running
kubectl get pods -n neam-prod -l app=neam-agents
Expected output:
NAME                           READY   STATUS    RESTARTS   AGE
neam-agents-6b8f9c4d7f-abc12   1/1     Running   0          45s
neam-agents-6b8f9c4d7f-def34   1/1     Running   0          45s
Step 5: Test via HTTP API #
Once the pods are running and the ingress is configured, test the deployment through the HTTP API:
# Set the base URL (use port-forward for direct testing)
kubectl port-forward -n neam-prod svc/neam-agents-svc 8080:80 &

# Health check
curl http://localhost:8080/health
{
  "status": "ok",
  "uptime_seconds": 52
}

# List claw agents
curl -H "Authorization: Bearer ${NEAM_API_KEY}" \
  http://localhost:8080/api/v1/claw
{
  "claw_agents": [
    {
      "name": "support_bot",
      "provider": "openai",
      "model": "gpt-4o",
      "session_count": 0,
      "channels": ["web_chan"]
    }
  ]
}

# Chat with the support bot
curl -X POST \
  -H "Authorization: Bearer ${NEAM_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the status of order ORD-99887?"}' \
  http://localhost:8080/api/v1/claw/support_bot/sessions/k8s_test/message
{
  "agent": "support_bot",
  "session_key": "k8s_test",
  "response": "Order ORD-99887: Shipped, arriving Tuesday. Is there anything else I can help you with?"
}

# Start a forge loop
curl -X POST \
  -H "Authorization: Bearer ${NEAM_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"max_iterations": 5, "max_cost": 2.0}' \
  http://localhost:8080/api/v1/forge/code_builder/start
{
  "status": "started",
  "agent": "code_builder",
  "max_iterations": 5,
  "max_cost": 2.0
}

# Monitor progress
curl -H "Authorization: Bearer ${NEAM_API_KEY}" \
  http://localhost:8080/api/v1/forge/code_builder/status
{
  "agent": "code_builder",
  "running": true,
  "iteration": 3,
  "total_cost": 0.0012,
  "last_outcome": "running"
}

# Wait and check final status
sleep 30
curl -H "Authorization: Bearer ${NEAM_API_KEY}" \
  http://localhost:8080/api/v1/forge/code_builder/status
{
  "agent": "code_builder",
  "running": false,
  "iteration": 5,
  "total_cost": 0.0024,
  "last_outcome": "completed"
}
Deploy the agents from this walkthrough to your own Kubernetes cluster or Docker
environment. Modify the support_bot system prompt to match your own domain. Add a
third agent -- perhaps a documentation writer -- and wire it into a DAG with the
existing two agents.
Summary #
- The neam-forge CLI runs forge agents directly from the command line. It compiles the `.neam` file, executes bytecode on a fresh VM, finds the forge agent, applies CLI overrides, generates wrapper code, runs the loop, and prints the LoopOutcome map. Exit code 0 means success; exit code 1 means error.
- The HTTP API exposes claw and forge agents over the network. Claw endpoints handle agent listing, message sending, session reset, and compaction. Forge endpoints handle agent listing, loop start, loop stop, and status checking.
- Authentication uses two environment variables: `NEAM_API_KEY` for general API access (all endpoints) and `NEAM_ADMIN_KEY` for administrative operations (disable, enable, confirmations). Health check endpoints are always accessible without authentication.
- The neam-api server flags control binding address, port, worker count, timeouts, pool sizes, bytecode loading, graceful shutdown, and audit logging.
- Multi-agent orchestration uses `spawn` for quick single-turn delegation and `dag_execute()` for dependency-ordered workflows. The `spawn` resolution searches `claw_agents_`, then `forge_agents_`, then `globals_`.
- Docker deployment uses a multi-stage build that includes all Neam binaries (including `neam-forge`), with named volumes for session data and workspace files.
- Kubernetes deployment adds PersistentVolumeClaims for persistent session and workspace storage, liveness probes on `/health`, readiness probes on `/ready`, and resource limits tuned to the workload type.
- AWS Lambda deployment uses EFS for forge workspace persistence and API Gateway v2 for HTTP routing. Long-running forge loops may need Step Functions to work within Lambda timeout limits.
Exercises #
Exercise 28.1: CLI Basics #
Run a forge agent using the neam-forge CLI. Create a .neam file with a forge agent
that writes a "Hello, World!" Python file and a corresponding test file. Use the
--verbose flag and --max-iterations 3. Verify that the output directory contains
both src/main.py and tests/test_main.py.
Exercise 28.2: API Exploration #
Start neam-api with a .neam file containing one claw agent and one forge agent.
Using only curl commands:
- List all claw agents
- List all forge agents
- Send three messages to a claw agent session and verify context is maintained
- Reset the session and verify the agent has no memory of previous messages
- Start a forge loop with custom limits and poll the status until completion
Document each curl command and its response.
Exercise 28.3: Authentication #
Configure neam-api with both NEAM_API_KEY and NEAM_ADMIN_KEY. Write curl
commands that demonstrate:
- A request without authentication (should fail with 401)
- A request with the correct API key (should succeed)
- An admin request with only the API key (should fail with 403)
- An admin request with the admin key (should succeed)
- A health check without any key (should succeed)
Exercise 28.4: Docker Compose Stack #
Create a complete docker-compose.yaml that runs:
- A `neam-api` service with two claw agents and one forge agent
- Named volumes for sessions and workspaces
- Environment variables loaded from a `.env` file
- A health check that restarts the container if unhealthy
Build and run the stack, then verify all agents are accessible via the API.
Exercise 28.5: DAG Pipeline #
Design and implement a three-agent DAG pipeline for a code review workflow:
- `researcher` -- gathers best practices for the programming language
- `reviewer` -- reviews a code file against the best practices
- `reporter` -- writes a summary report with actionable feedback
Define all three agents, wire them into a dag_execute() call, and verify that
each node receives the results of its dependencies.
Exercise 28.6: Kubernetes Deployment #
Write complete Kubernetes manifests for deploying a Neam agent system. Include:
- A Namespace
- A Secret for API keys
- Two PersistentVolumeClaims (sessions and workspace)
- A Deployment with liveness and readiness probes
- A Service and Ingress
Apply the manifests to a local cluster (minikube or kind) and verify the agents are accessible.
Exercise 28.7: Monitoring Script #
Write a shell script that monitors a forge agent deployment. The script should:
- Start a forge loop via the API
- Poll the status endpoint every 5 seconds
- Print a progress line for each poll (iteration number and cost)
- Print the final outcome when the loop completes
- Exit with the same code as the forge loop (0 for completed, 1 for error)
Exercise 28.8: Multi-Environment Configuration #
Design a deployment strategy that supports three environments: development, staging, and production. For each environment, specify:
- The Docker image tag strategy
- The number of replicas
- Resource limits
- Authentication configuration
- Volume sizes for sessions and workspaces
- Server flags (workers, pool size, timeout)
Write the Kubernetes manifests (or Kustomize overlays) for all three environments and explain the rationale behind each difference.