Chapter 20: Docker and Kubernetes #
"Containers do not solve the deployment problem. Containers solve the 'it works on my machine' problem. Kubernetes solves the deployment problem." -- Kelsey Hightower, paraphrased
What You Will Learn #
In this chapter, you will learn how to package Neam agents as Docker containers and
deploy them to Kubernetes. You will build a multi-stage Docker image, stand up a
complete development stack with Docker Compose, write Kubernetes manifests with
production-grade health probes, configure GitOps with Kustomize overlays, set up
continuous deployment with ArgoCD and FluxCD, and implement autoscaling strategies.
By the end, you will be able to take a Neam agent from source code to a production
Kubernetes cluster with a single git push.
20.1 The Docker Multi-Stage Build #
The Neam Dockerfile uses a multi-stage build to keep the final image small. The builder stage includes all compilation tools (CMake, g++, development headers); the runtime stage includes only the compiled binaries and minimal runtime libraries.
The Complete Dockerfile #
# =============================================================================
# Neam v0.6.0 Multi-stage Docker Build
# =============================================================================

# -----------------------------------------------------------------------------
# Stage 1: Builder
# -----------------------------------------------------------------------------
FROM ubuntu:24.04 AS builder

RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        cmake \
        g++ \
        libcurl4-openssl-dev \
        libssl-dev \
        libpq-dev \
        ca-certificates \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /build

# Copy dependency files first for better layer caching
COPY CMakeLists.txt ./
COPY deps/ deps/
COPY NeamC/ NeamC/
COPY tests/ tests/

# Configure and build with PostgreSQL backend enabled
RUN cmake -B build \
        -DCMAKE_BUILD_TYPE=Release \
        -DNEAM_BACKEND_POSTGRES=ON \
    && cmake --build build -j$(nproc)

# Run tests during build to surface issues early; `|| true` keeps a failing
# test from aborting the image build (failures still appear in the build log)
RUN ctest --test-dir build --output-on-failure || true

# -----------------------------------------------------------------------------
# Stage 2: Runtime
# -----------------------------------------------------------------------------
FROM ubuntu:24.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
        libcurl4 \
        libssl3 \
        libpq5 \
    && rm -rf /var/lib/apt/lists/* \
    && groupadd -r neam && useradd -r -g neam -d /app -s /sbin/nologin neam

WORKDIR /app

# Copy binaries from builder
COPY --from=builder /build/build/neamc /usr/local/bin/neamc
COPY --from=builder /build/build/neam /usr/local/bin/neam
COPY --from=builder /build/build/neam-cli /usr/local/bin/neam-cli
COPY --from=builder /build/build/neam-api /usr/local/bin/neam-api
COPY --from=builder /build/build/neam-forge /usr/local/bin/neam-forge
COPY --from=builder /build/build/neam-lsp /usr/local/bin/neam-lsp

# Copy stdlib
COPY --from=builder /build/NeamC/stdlib /app/stdlib

# Create data directories (include session and workspace storage)
RUN mkdir -p /app/data /app/sessions /app/workspace /tmp/neam && \
    chown -R neam:neam /app /tmp/neam

ENV NEAM_ENV=production
ENV NEAM_LOG_LEVEL=info
ENV NEAM_PORT=8080

EXPOSE 8080

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

USER neam

ENTRYPOINT ["neam-api"]
CMD ["--port", "8080"]
Understanding the Layers #
Let us examine the key design decisions in this Dockerfile.
Layer caching: The COPY instructions are ordered from least-frequently-changed to
most-frequently-changed. CMakeLists.txt and deps/ change rarely, so Docker can
cache those layers. NeamC/ contains the source code and changes with every commit, so
it is copied last. This optimization can reduce rebuild times from 10 minutes to under
1 minute for source-only changes.
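If you build with BuildKit (the default in recent Docker releases), cache mounts can shorten rebuilds further by persisting the apt package cache and the CMake build tree across builds without baking them into any layer. A sketch of how the builder stage might use them; this is an illustrative variant, not the project's actual Dockerfile, and the `cp` destination is an assumption:

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical builder stage using BuildKit cache mounts
FROM ubuntu:24.04 AS builder

# Keep downloaded .deb packages between builds instead of re-downloading
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends \
        build-essential cmake g++ libcurl4-openssl-dev libssl-dev libpq-dev

WORKDIR /build
COPY . .

# Persist the CMake build tree so incremental rebuilds recompile only changes.
# Files inside a cache mount are not part of the layer, so copy the binary out.
RUN --mount=type=cache,target=/build/build \
    cmake -B build -DCMAKE_BUILD_TYPE=Release -DNEAM_BACKEND_POSTGRES=ON \
    && cmake --build build -j$(nproc) \
    && cp build/neam-api /usr/local/bin/neam-api
```

Cache mounts are shared across builds on the same builder, which is why the compiled artifacts must be copied out of the mount before the layer is committed.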
Non-root user: The runtime stage creates a neam user and group, and the final
USER neam instruction ensures the container runs as non-root. This is a security
requirement in most production Kubernetes clusters.
Health check: The HEALTHCHECK instruction tells Docker (and Docker Compose) how
to verify the container is healthy. The neam-api server exposes a /health endpoint
that returns HTTP 200 when the service is operational.
Minimal runtime: The runtime image includes only the libraries needed to run the compiled binaries. The build tools, headers, and intermediate objects are left behind in the builder stage. This reduces the image from ~2 GB to ~150 MB.
Building the Image #
# Build the image
docker build -t neam-agent:latest .
# Build with a specific tag
docker build -t neam-agent:v0.6.0 .
# Build with additional cloud backends (assumes the Dockerfile declares an
# ARG CMAKE_FLAGS and forwards it to the cmake configure step)
docker build \
  --build-arg CMAKE_FLAGS="-DNEAM_BACKEND_POSTGRES=ON -DNEAM_BACKEND_AWS=ON" \
  -t neam-agent:aws .
# Run the image
docker run -p 8080:8080 \
-e OPENAI_API_KEY="sk-..." \
neam-agent:latest
# Run with a custom agent file
docker run -p 8080:8080 \
-e OPENAI_API_KEY="sk-..." \
-v $(pwd)/src:/app/src \
neam-agent:latest \
neam-api --port 8080 --agent-file /app/src/main.neamb
20.2 Docker Compose: The Development Stack #
For local development, Docker Compose orchestrates Neam alongside its supporting services: PostgreSQL for state, Redis for caching, an OpenTelemetry Collector for trace ingestion, Jaeger for trace visualization, and Prometheus for metrics.
The Complete docker-compose.yml #
services:
  neam-agent:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    environment:
      - NEAM_ENV=development
      - NEAM_LOG_LEVEL=debug
      - NEAM_PORT=8080
      - DATABASE_URL=postgres://neam:neam_dev@postgres:5432/neam_dev
      - REDIS_URL=redis://redis:6379
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - neam-data:/app/data
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 15s
      timeout: 5s
      retries: 5
      start_period: 10s

  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: neam_dev
      POSTGRES_USER: neam
      POSTGRES_PASSWORD: neam_dev
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U neam -d neam_dev"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  redis:
    image: redis:7
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    ports:
      - "4317:4317"   # gRPC OTLP
      - "4318:4318"   # HTTP OTLP
      - "8889:8889"   # Prometheus exporter
    volumes:
      - ./docker/otel-config.yaml:/etc/otelcol-contrib/config.yaml:ro
    depends_on:
      - jaeger
    restart: unless-stopped

  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686" # Jaeger UI
      - "14250:14250" # gRPC model.proto
      - "14268:14268" # HTTP collector
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    restart: unless-stopped

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./docker/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=7d"
    depends_on:
      - otel-collector
    restart: unless-stopped

volumes:
  neam-data:
  postgres-data:
  redis-data:
  prometheus-data:
Starting the Development Stack #
# Start all services
docker compose up -d
# Watch logs
docker compose logs -f neam-agent
# Check service health
docker compose ps
# Access the services:
# - Neam API: http://localhost:8080
# - Jaeger UI: http://localhost:16686
# - Prometheus: http://localhost:9090
# - Postgres: localhost:5432
# - Redis: localhost:6379
# Stop all services
docker compose down
# Stop and remove volumes (clean slate)
docker compose down -v
The OpenTelemetry Collector Configuration #
The OTel Collector receives traces and metrics from the Neam agent and forwards them to Jaeger (traces) and Prometheus (metrics):
# docker/otel-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    send_batch_size: 1024
    timeout: 5s
  memory_limiter:
    check_interval: 5s
    limit_mib: 512
    spike_limit_mib: 128
  resource:
    attributes:
      - key: service.name
        value: neam-agent
        action: upsert

exporters:
  # Recent collector-contrib releases removed the dedicated `jaeger`
  # exporter; Jaeger accepts OTLP directly when COLLECTOR_OTLP_ENABLED is set
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889
    namespace: neam
    resource_to_telemetry_conversion:
      enabled: true
  debug:
    verbosity: basic

extensions:
  health_check:
    endpoint: 0.0.0.0:13133

service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, resource]
      exporters: [otlp/jaeger, debug]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch, resource]
      exporters: [prometheus, debug]
20.3 Kubernetes Deployment #
Moving from Docker Compose to Kubernetes is the leap from single-machine development
to production-grade orchestration. Neam provides both pre-built manifests in the
gitops/ directory and the neamc deploy command for generating custom manifests.
Kubernetes Architecture Overview #
Base Manifests #
The Neam project ships with production-ready Kubernetes manifests in gitops/base/.
Let us examine each one.
Deployment #
The Deployment defines how Neam pods are created and managed:
# gitops/base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: neam-agent
  labels:
    app.kubernetes.io/name: neam-agent
    app.kubernetes.io/version: "0.6.0"
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: neam-agent
  template:
    metadata:
      labels:
        app.kubernetes.io/name: neam-agent
    spec:
      serviceAccountName: neam-agent
      terminationGracePeriodSeconds: 30
      containers:
        - name: neam-agent
          image: neam-agent:latest
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          envFrom:
            - configMapRef:
                name: neam-config
            - secretRef:
                name: neam-secrets
                optional: true
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 15
            periodSeconds: 20
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
          securityContext:
            runAsNonRoot: true
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: data
              mountPath: /app/data
            - name: session-storage
              mountPath: /app/sessions
            - name: forge-workspace
              mountPath: /app/workspace
      volumes:
        - name: tmp
          emptyDir: {}
        - name: data
          emptyDir: {}
        - name: session-storage
          emptyDir: {}
        - name: forge-workspace
          emptyDir: {}
For production deployments with claw agents that need session persistence
across pod restarts, replace the emptyDir: {} volumes with PersistentVolumeClaim
references:
- name: session-storage
  persistentVolumeClaim:
    claimName: neam-sessions-pvc
- name: forge-workspace
  persistentVolumeClaim:
    claimName: neam-workspace-pvc
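The claims themselves must exist before the Deployment references them. A minimal sketch, assuming the cluster has a default StorageClass; the sizes and access mode here are illustrative assumptions, only the names match the claimName references above:

```yaml
# Hypothetical PVC manifests -- sizes and access modes are assumptions
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: neam-sessions-pvc
spec:
  accessModes:
    - ReadWriteOnce   # single-node attach; use ReadWriteMany to share across replicas
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: neam-workspace-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

Note that a ReadWriteOnce claim can attach to only one node at a time, which matters once the HPA scales the Deployment beyond a single replica.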
Key points to understand:
- Health probes: The liveness probe hits `/health` to detect deadlocked processes. The readiness probe hits `/ready` to check that the state backend and LLM providers are accessible. We cover health check semantics in detail in Chapter 22.
- Security context: The container runs as non-root with a read-only filesystem, no privilege escalation, and all Linux capabilities dropped. The `tmp`, `data`, `session-storage`, and `forge-workspace` volumes provide writable directories for temporary files, local state, claw agent sessions, and forge agent workspaces.
- Resource requests and limits: The requests guarantee the pod gets at least 500m CPU and 512Mi memory. The limits cap it at 1 CPU and 1Gi to prevent noisy-neighbor problems.
Service #
The Service provides a stable network endpoint for the pods:
# gitops/base/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: neam-agent
  labels:
    app.kubernetes.io/name: neam-agent
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
  selector:
    app.kubernetes.io/name: neam-agent
ConfigMap #
# gitops/base/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: neam-config
  labels:
    app.kubernetes.io/name: neam-agent
data:
  NEAM_ENV: "production"
  NEAM_LOG_LEVEL: "info"
  NEAM_PORT: "8080"
  NEAM_TELEMETRY_ENABLED: "true"
  NEAM_OTEL_ENDPOINT: "http://otel-collector.observability.svc.cluster.local:4318"
HorizontalPodAutoscaler #
# gitops/base/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: neam-agent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: neam-agent
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 120
The HPA behavior section is critical for AI agent workloads. Scale-up is aggressive (add up to 2 pods per minute) because LLM calls have latency and you want capacity before requests start queuing. Scale-down is conservative (remove at most 1 pod every 2 minutes) to avoid flapping during intermittent traffic patterns.
PodDisruptionBudget #
# gitops/base/pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: neam-agent
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: neam-agent
The PDB ensures at least 1 pod remains available during voluntary disruptions like node upgrades or cluster scaling.
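The same budget can also be expressed as a ceiling on disruption rather than a floor on availability. A sketch using maxUnavailable, which behaves equivalently here and stays sensible as the replica count grows:

```yaml
# Alternative form: allow at most one pod to be voluntarily disrupted at a time
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: neam-agent
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: neam-agent
```

A PodDisruptionBudget may set minAvailable or maxUnavailable, but not both.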
NetworkPolicy #
# gitops/base/networkpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: neam-agent
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: neam-agent
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - ports:
        - port: 8080
          protocol: TCP
  egress:
    - {}
Ingress is restricted to port 8080 only. Egress is open because Neam agents need to reach external LLM APIs, state backends, and the OTel collector. In a more restrictive environment, you would enumerate the allowed egress targets.
Kustomization #
Kustomize ties all base resources together:
# gitops/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
commonLabels:
  app.kubernetes.io/name: neam-agent
  app.kubernetes.io/part-of: neam
  app.kubernetes.io/managed-by: kustomize
resources:
  - deployment.yaml
  - service.yaml
  - configmap.yaml
  - hpa.yaml
  - pdb.yaml
  - networkpolicy.yaml
  - serviceaccount.yaml
20.4 GitOps with Kustomize Overlays #
Kustomize overlays allow you to customize the base manifests for different environments without duplicating YAML. The Neam project ships with three overlays: dev, staging, and production.
Development Overlay #
# gitops/overlays/dev/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: neam-dev
resources:
  - ../../base
patches:
  - target:
      kind: Deployment
      name: neam-agent
    patch: |
      - op: replace
        path: /spec/replicas
        value: 1
      - op: replace
        path: /spec/template/spec/containers/0/resources/requests/cpu
        value: 250m
      - op: replace
        path: /spec/template/spec/containers/0/resources/requests/memory
        value: 256Mi
      - op: replace
        path: /spec/template/spec/containers/0/resources/limits/cpu
        value: 500m
      - op: replace
        path: /spec/template/spec/containers/0/resources/limits/memory
        value: 512Mi
  - target:
      kind: ConfigMap
      name: neam-config
    patch: |
      - op: replace
        path: /data/NEAM_ENV
        value: development
      - op: replace
        path: /data/NEAM_LOG_LEVEL
        value: debug
commonLabels:
  app.kubernetes.io/instance: dev
Staging Overlay #
# gitops/overlays/staging/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: neam-staging
resources:
  - ../../base
patches:
  - target:
      kind: Deployment
      name: neam-agent
    patch: |
      - op: replace
        path: /spec/replicas
        value: 2
      - op: replace
        path: /spec/template/spec/containers/0/resources/requests/cpu
        value: 500m
      - op: replace
        path: /spec/template/spec/containers/0/resources/limits/cpu
        value: "1"
  - target:
      kind: ConfigMap
      name: neam-config
    patch: |
      - op: replace
        path: /data/NEAM_ENV
        value: staging
      - op: replace
        path: /data/NEAM_LOG_LEVEL
        value: info
  - target:
      kind: HorizontalPodAutoscaler
      name: neam-agent
    patch: |
      - op: replace
        path: /spec/minReplicas
        value: 2
      - op: replace
        path: /spec/maxReplicas
        value: 10
commonLabels:
  app.kubernetes.io/instance: staging
Production Overlay #
# gitops/overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: neam-production
resources:
  - ../../base
patches:
  - target:
      kind: Deployment
      name: neam-agent
    patch: |
      - op: replace
        path: /spec/replicas
        value: 3
      - op: replace
        path: /spec/template/spec/containers/0/resources/requests/cpu
        value: "1"
      - op: replace
        path: /spec/template/spec/containers/0/resources/requests/memory
        value: 512Mi
      - op: replace
        path: /spec/template/spec/containers/0/resources/limits/cpu
        value: "2"
      - op: replace
        path: /spec/template/spec/containers/0/resources/limits/memory
        value: 1Gi
  - target:
      kind: ConfigMap
      name: neam-config
    patch: |
      - op: replace
        path: /data/NEAM_ENV
        value: production
      - op: replace
        path: /data/NEAM_LOG_LEVEL
        value: warn
  - target:
      kind: HorizontalPodAutoscaler
      name: neam-agent
    patch: |
      - op: replace
        path: /spec/minReplicas
        value: 3
      - op: replace
        path: /spec/maxReplicas
        value: 20
  - target:
      kind: PodDisruptionBudget
      name: neam-agent
    patch: |
      - op: replace
        path: /spec/minAvailable
        value: 2
commonLabels:
  app.kubernetes.io/instance: production
Building and Applying Overlays #
# Preview the dev overlay
kubectl kustomize gitops/overlays/dev
# Apply the dev overlay
kubectl apply -k gitops/overlays/dev
# Preview the production overlay
kubectl kustomize gitops/overlays/production
# Apply the production overlay
kubectl apply -k gitops/overlays/production
# Diff against the running cluster
kubectl diff -k gitops/overlays/production
20.5 ArgoCD Setup #
ArgoCD provides automated GitOps deployment. When you push changes to the Git repository, ArgoCD detects the change, renders the Kustomize overlay, and applies the resulting manifests to the cluster.
ArgoCD Application #
# gitops/argocd/application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: neam-agent
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: neam
  source:
    repoURL: https://github.com/neam-lang/Neam.git
    targetRevision: main
    path: gitops/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: neam-production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
  revisionHistoryLimit: 10
Key ArgoCD concepts:
- automated.prune: Deletes resources that are no longer in Git
- automated.selfHeal: Reverts manual changes made directly to the cluster
- CreateNamespace: Creates the target namespace if it does not exist
- retry: Retries failed syncs with exponential backoff
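The Application above references project: neam, which the setup steps below apply from gitops/argocd/appproject.yaml. As a sketch of what such an AppProject typically contains (the field values here are illustrative, not the repository's actual file), it scopes which repositories, clusters, and namespaces the project's applications may touch:

```yaml
# Hypothetical AppProject; destinations and whitelists are assumptions
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: neam
  namespace: argocd
spec:
  description: Neam agent deployments
  sourceRepos:
    - https://github.com/neam-lang/Neam.git
  destinations:
    - server: https://kubernetes.default.svc
      namespace: neam-*
  clusterResourceWhitelist:
    - group: ""
      kind: Namespace
  namespaceResourceWhitelist:
    - group: "*"
      kind: "*"
```

Scoping the project this way means a compromised or misconfigured Application cannot deploy outside the neam-* namespaces.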
Setting Up ArgoCD #
# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f \
https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Wait for ArgoCD to be ready
kubectl -n argocd rollout status deploy/argocd-server
# Get the initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret \
-o jsonpath="{.data.password}" | base64 -d
# Apply the ArgoCD project and application
kubectl apply -f gitops/argocd/appproject.yaml
kubectl apply -f gitops/argocd/application.yaml
# Port-forward to access the ArgoCD UI
kubectl port-forward svc/argocd-server -n argocd 8443:443
# Open https://localhost:8443 in your browser
20.6 FluxCD Setup #
FluxCD is an alternative GitOps controller. Where ArgoCD centers on its web UI and Application abstraction, FluxCD is more Kubernetes-native: everything is expressed as CRDs (Custom Resource Definitions) and managed with kubectl.
FluxCD Kustomization #
# gitops/fluxcd/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: neam-agent
  namespace: flux-system
spec:
  interval: 5m
  retryInterval: 2m
  timeout: 3m
  sourceRef:
    kind: GitRepository
    name: neam
  path: ./gitops/overlays/production
  prune: true
  wait: true
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: neam-agent
      namespace: neam-production
  patches:
    - patch: |
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: neam-agent
          annotations:
            fluxcd.io/automated: "true"
      target:
        kind: Deployment
        name: neam-agent
FluxCD GitRepository #
# gitops/fluxcd/gitrepository.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: neam
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/neam-lang/Neam.git
  ref:
    branch: main
  secretRef:
    name: neam-git-credentials
Setting Up FluxCD #
# Install FluxCD
flux install
# Create the GitRepository source
kubectl apply -f gitops/fluxcd/gitrepository.yaml
# Create the Kustomization
kubectl apply -f gitops/fluxcd/kustomization.yaml
# Check the reconciliation status
flux get kustomizations
flux get sources git
20.7 Health Probes #
Kubernetes uses three types of probes to manage container lifecycle. The Neam API server implements all three:
| Probe | Endpoint | Purpose | Failure Action |
|---|---|---|---|
| Liveness | `GET /health` | Is the process alive? | Kill and restart the pod |
| Readiness | `GET /ready` | Can the pod serve traffic? | Remove from Service endpoints |
| Startup | `GET /startup` | Has initialization completed? | Wait, then fall through to liveness |
Liveness Probe: /health #
Returns HTTP 200 if the Neam process is alive and responsive. This is a simple heartbeat -- it does not check external dependencies.
$ curl -i http://localhost:8080/health
HTTP/1.1 200 OK
Content-Type: application/json
{"status": "ok", "version": "0.6.0"}
Readiness Probe: /ready #
Returns HTTP 200 only when the agent can serve requests. This means:
- The state backend is connected and responsive
- At least one LLM provider circuit is closed (not all providers failed)
- If telemetry is enabled, the OTLP export queue is not full
$ curl -i http://localhost:8080/ready
HTTP/1.1 200 OK
Content-Type: application/json

{
  "status": "ready",
  "state_backend": "postgres",
  "state_backend_status": "connected",
  "llm_providers": {
    "openai": "healthy",
    "anthropic": "healthy"
  },
  "telemetry": "ok"
}
If the state backend is down:
$ curl -i http://localhost:8080/ready
HTTP/1.1 503 Service Unavailable
Content-Type: application/json

{
  "status": "not_ready",
  "state_backend": "postgres",
  "state_backend_status": "connection_refused",
  "llm_providers": {
    "openai": "healthy"
  }
}
Startup Probe: /startup #
Returns HTTP 200 once the VM has completed initialization: loaded the bytecode, connected to the state backend, registered agents, and started the autonomous executor (if configured).
# Adding a startup probe to the deployment
startupProbe:
  httpGet:
    path: /startup
    port: http
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 30
With the startup probe configured, Kubernetes waits up to 150 seconds (30 x 5s) for initialization before starting liveness and readiness checks. This is important for agents that need to ingest knowledge bases at startup.
20.8 Scaling Strategies #
HPA: CPU-Based Scaling #
The default HPA scales based on CPU utilization. This works well for most Neam workloads: even though LLM calls spend most of their time waiting on external APIs, the agent still consumes CPU for request marshaling, response parsing, and RAG retrieval.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: neam-agent
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: neam-agent
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 120
KEDA: Event-Driven Scaling #
For event-driven workloads (message queues, scheduled autonomous agents), KEDA provides more sophisticated scaling triggers:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: neam-agent
  namespace: neam-production
spec:
  scaleTargetRef:
    name: neam-agent
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.observability:9090
        metricName: neam_llm_requests_queued
        threshold: "10"
        query: |
          sum(neam_llm_requests_queued{service="neam-agent"})
    - type: cron
      metadata:
        timezone: America/New_York
        start: 0 8 * * 1-5
        end: 0 18 * * 1-5
        desiredReplicas: "5"
This KEDA configuration scales based on two triggers:
- Prometheus metric: When the LLM request queue exceeds 10, add more pods
- Cron schedule: During business hours (8 AM - 6 PM ET, Monday-Friday), maintain at least 5 replicas
20.9 Security Best Practices #
Non-Root Execution #
The Dockerfile creates a dedicated neam user, and the deployment enforces it:
securityContext:
  runAsNonRoot: true
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
Read-Only Root Filesystem #
The container's root filesystem is mounted read-only. The only writable paths are explicitly mounted volumes:
volumeMounts:
  - name: tmp
    mountPath: /tmp        # For temporary files during LLM calls
  - name: data
    mountPath: /app/data   # For local SQLite state (if used)
volumes:
  - name: tmp
    emptyDir: {}
  - name: data
    emptyDir: {}
Network Policies #
The NetworkPolicy restricts which pods can communicate with the Neam agent:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: neam-agent
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: neam-agent
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - ports:
        - port: 8080
          protocol: TCP
  egress:
    - {}  # Open egress for LLM APIs
For stricter environments, restrict egress to known endpoints:
egress:
  # Allow DNS
  - to:
      - namespaceSelector: {}
    ports:
      - port: 53
        protocol: UDP
  # Allow LLM API calls
  - to:
      - ipBlock:
          cidr: 0.0.0.0/0
    ports:
      - port: 443
        protocol: TCP
  # Allow state backend
  - to:
      - namespaceSelector:
          matchLabels:
            name: databases
    ports:
      - port: 5432
        protocol: TCP
  # Allow OTel Collector
  - to:
      - namespaceSelector:
          matchLabels:
            name: observability
    ports:
      - port: 4318
        protocol: TCP
Secrets Management #
Never put secrets in ConfigMaps. Use Kubernetes Secrets with encryption at rest:
# Create a secret
kubectl create secret generic neam-secrets \
-n neam-production \
--from-literal=OPENAI_API_KEY="sk-..." \
--from-literal=ANTHROPIC_API_KEY="sk-ant-..."
# Or use a cloud secrets provider (AWS, GCP, Azure) with External Secrets Operator
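With the External Secrets Operator mentioned above, the Kubernetes Secret is synchronized from the cloud provider rather than created by hand. A sketch, assuming the operator is installed and a ClusterSecretStore named aws-secrets-manager exists; the store name and remote key paths are illustrative assumptions:

```yaml
# Hypothetical ExternalSecret -- store name and remoteRef keys are assumptions
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: neam-secrets
  namespace: neam-production
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager
  target:
    name: neam-secrets   # the Secret the Deployment's envFrom reads
  data:
    - secretKey: OPENAI_API_KEY
      remoteRef:
        key: prod/neam/openai-api-key
    - secretKey: ANTHROPIC_API_KEY
      remoteRef:
        key: prod/neam/anthropic-api-key
```

The operator keeps the Secret in sync on the refresh interval, so rotating a key in the cloud provider propagates to the cluster without a manual kubectl step.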
20.10 The neamc deploy Command #
The neamc deploy command generates deployment artifacts from your neam.toml:
# Generate Kubernetes manifests
neamc deploy --target kubernetes --output ./deploy/
# Dry run (preview without writing files)
neamc deploy --target kubernetes --dry-run
# Generate Helm chart
neamc deploy --target helm --output ./chart/
# Generate for Docker only
neamc deploy --target docker --output ./deploy/
The command reads your neam.toml configuration and generates manifests tailored to
your settings. For example, if you have telemetry enabled, the Kubernetes manifest will
include an OTel Collector sidecar. If you have an HPA section, the manifest will
include a HorizontalPodAutoscaler.
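The exact neam.toml schema is defined by the Neam toolchain; purely as an illustration of the kind of configuration the generator reads, a deploy-relevant fragment might look like the sketch below. All table and key names here are assumptions, not the documented schema:

```toml
# Hypothetical neam.toml fragment -- key names are illustrative assumptions
[project]
name = "customer-service-agent"
version = "2.1.0"

[state]
backend = "postgres"

[telemetry]
enabled = true
otel_endpoint = "http://otel-collector:4318"

[deploy]
target = "kubernetes"
namespace = "neam-production"
replicas = 3

[deploy.hpa]
min_replicas = 3
max_replicas = 20
cpu_target = 70
```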
$ neamc deploy --target kubernetes --dry-run
=== Generating Kubernetes manifests ===
Reading neam.toml...
Project: customer-service-agent v2.1.0
State backend: postgres
Telemetry: enabled (endpoint: http://otel-collector:4318)
Deploy target: kubernetes (namespace: neam-production)
Generated files:
deployment.yaml (3 replicas, resource limits, health probes)
service.yaml (ClusterIP, port 80 -> 8080)
configmap.yaml (NEAM_ENV, NEAM_LOG_LEVEL, ...)
hpa.yaml (min: 3, max: 20, CPU target: 70%)
pdb.yaml (minAvailable: 2)
networkpolicy.yaml (ingress: 8080, egress: all)
serviceaccount.yaml (neam-agent service account)
kustomization.yaml (ties all resources together)
Dry run complete. No files written.
20.11 CI/CD Pipelines with GitHub Actions #
Continuous integration and continuous deployment (CI/CD) pipelines automate the build-test-deploy cycle. The Neam project ships three GitHub Actions workflows: one for continuous integration on every push, one for deploying to Kubernetes, and one for security scanning.
Continuous Integration Workflow #
The CI workflow runs on every push and pull request. It builds the Neam toolchain across multiple platforms, runs the test suite, and publishes container images for tagged releases.
name: CI

on:
  push:
    branches: [main]
    # Tag pushes must be listed here, or the container job's tag check
    # below can never fire
    tags: ["v*"]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest]
        build_type: [Release, Debug]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies (Ubuntu)
        if: runner.os == 'Linux'
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake libcurl4-openssl-dev \
            libssl-dev libpq-dev
      - name: Install dependencies (macOS)
        if: runner.os == 'macOS'
        run: brew install cmake openssl postgresql libpq
      - name: Configure
        run: cmake -B build -DCMAKE_BUILD_TYPE=${{ matrix.build_type }}
      - name: Build
        run: cmake --build build -j$(nproc 2>/dev/null || sysctl -n hw.ncpu)
      - name: Test
        run: ctest --test-dir build --output-on-failure

  container:
    needs: build
    if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v')
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          push: true
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.ref_name }}
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
The matrix strategy builds across Ubuntu and macOS with both Release and Debug
configurations, catching platform-specific issues early. The container job only
runs on version tags (v1.0.0, v0.6.0), pushing multi-tagged images to the
GitHub Container Registry.
Kubernetes Deployment Workflow #
The deployment workflow uses environment-based dispatch. Pushing to main deploys
to staging automatically; production deployments require manual approval.
name: Deploy to Kubernetes

on:
  workflow_dispatch:
    inputs:
      environment:
        description: "Target environment"
        required: true
        type: choice
        options:
          - staging
          - production
  push:
    branches: [main]

# Workflows do not share env blocks, so REGISTRY and IMAGE_NAME must be
# defined here as well as in the CI workflow
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ github.event.inputs.environment || 'staging' }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up kubectl
        uses: azure/setup-kubectl@v3
      - name: Configure kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG }}" | base64 -d > ~/.kube/config
      - name: Set image tag
        run: |
          cd gitops/overlays/${{ github.event.inputs.environment || 'staging' }}
          kustomize edit set image \
            neam-agent=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
      - name: Apply manifests
        run: |
          kubectl apply -k \
            gitops/overlays/${{ github.event.inputs.environment || 'staging' }}
      - name: Wait for rollout
        run: |
          kubectl -n neam-${{ github.event.inputs.environment || 'staging' }} \
            rollout status deployment/neam-agent --timeout=300s
      - name: Verify health
        run: |
          kubectl -n neam-${{ github.event.inputs.environment || 'staging' }} \
            exec deploy/neam-agent -- curl -sf http://localhost:8080/ready
The workflow uses Kubernetes environment protection rules. For production, you can
configure required reviewers and deployment branch restrictions in your repository
settings.
20.12 Security Scanning #
Automated security scanning catches vulnerabilities before they reach production. A dedicated workflow runs three types of scans: container image scanning, static code analysis, and dependency review.
Security Scan Workflow #
name: Security Scan

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: "0 6 * * 1"  # Weekly Monday at 6 AM UTC

permissions:
  contents: read
  security-events: write

jobs:
  trivy-container:
    name: Trivy Container Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t neam-agent:scan .
      - name: Run Trivy container scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: neam-agent:scan
          format: sarif
          output: trivy-container.sarif
          severity: CRITICAL,HIGH
      - name: Upload results to GitHub Security
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: trivy-container.sarif

  trivy-filesystem:
    name: Trivy Filesystem Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Trivy filesystem scan
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: fs
          scan-ref: .
          format: sarif
          output: trivy-fs.sarif
          severity: CRITICAL,HIGH,MEDIUM
      - name: Upload results to GitHub Security
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: trivy-fs.sarif

  codeql:
    name: CodeQL Analysis
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake libcurl4-openssl-dev \
            libssl-dev libpq-dev
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: cpp
      - name: Build
        run: |
          cmake -B build -DCMAKE_BUILD_TYPE=Release
          cmake --build build -j$(nproc)
      - name: Perform CodeQL analysis
        uses: github/codeql-action/analyze@v3

  dependency-review:
    name: Dependency Review
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Dependency review
        uses: actions/dependency-review-action@v4
        with:
          fail-on-severity: high
Each scan serves a different purpose:
| Scan | Tool | What It Finds |
|---|---|---|
| Container scan | Trivy | OS package vulnerabilities in the Docker image |
| Filesystem scan | Trivy | Vulnerabilities in source dependencies and config files |
| Static analysis | CodeQL | Code-level security issues (buffer overflows, injection, etc.) |
| Dependency review | GitHub | New vulnerable dependencies introduced in pull requests |
The weekly scheduled scan (cron: "0 6 * * 1") catches newly disclosed CVEs in
dependencies that were safe when first added. Results are uploaded as SARIF
(Static Analysis Results Interchange Format) to GitHub's Security tab, where they
appear alongside code scanning alerts.
Interpreting Scan Results #
When Trivy reports a vulnerability, evaluate whether it is reachable from your code.
A HIGH severity CVE in libssl matters if your agent makes HTTPS calls (it does).
A HIGH severity CVE in libx11 does not matter because the container has no
graphical interface.
To suppress known false positives, create a .trivyignore file:
# Not exploitable: Neam does not use the affected API
CVE-2024-XXXXX
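Recent versions of the Trivy action also accept a trivyignores input pointing at that file; verify the input name against the action's documentation for the version you pin:

```yaml
- name: Run Trivy container scan
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: neam-agent:scan
    format: sarif
    output: trivy-container.sarif
    severity: CRITICAL,HIGH
    trivyignores: .trivyignore   # suppress the CVEs listed in this file
```

Keep the justification comments in .trivyignore up to date; a suppression without a reason is indistinguishable from a mistake six months later.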
20.13 Helm Chart Generation #
Helm is the package manager for Kubernetes. While Kustomize works well for teams
that manage their own clusters, Helm charts are the standard distribution format
for applications that others will deploy. The neamc deploy command can generate
a complete Helm chart from your neam.toml.
Generating a Helm Chart #
# Generate a Helm chart
neamc deploy --target helm --output ./chart/
# Preview the generated chart structure
tree chart/
The generated chart follows the standard Helm layout:
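A typical generated layout looks like the following; the exact set of template files depends on your neamc version:

```
chart/
├── Chart.yaml            # chart metadata
├── values.yaml           # configurable parameters
└── templates/
    ├── _helpers.tpl      # named template helpers
    ├── deployment.yaml
    ├── service.yaml
    ├── hpa.yaml
    └── NOTES.txt         # printed after helm install
```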
Chart.yaml #
The Chart.yaml metadata is populated from your neam.toml project section:
apiVersion: v2
name: neam-agent
description: A Neam AI agent deployed to Kubernetes
type: application
version: 0.1.0
appVersion: "0.6.0"
keywords:
- neam
- ai-agent
- llm
values.yaml #
The values.yaml file exposes all configurable parameters with sensible defaults:
replicaCount: 3
image:
  repository: ghcr.io/neam-lang/neam-agent
  tag: "0.6.0"
  pullPolicy: IfNotPresent
env:
  NEAM_ENV: production
  NEAM_LOG_LEVEL: info
  NEAM_PORT: "8080"
  NEAM_TELEMETRY_ENABLED: "true"
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: "1"
    memory: 1Gi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilization: 70
probes:
  liveness:
    path: /health
    initialDelaySeconds: 15
    periodSeconds: 20
  readiness:
    path: /ready
    initialDelaySeconds: 5
    periodSeconds: 10
  startup:
    path: /startup
    periodSeconds: 5
    failureThreshold: 30
securityContext:
  runAsNonRoot: true
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  dropCapabilities:
    - ALL
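To see how these values reach the cluster, here is a hypothetical excerpt of a templates/deployment.yaml; the actual template generated by neamc may differ:

```yaml
# Hypothetical excerpt: values.yaml fields are injected via the .Values object.
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: neam-agent
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          env:
            - name: NEAM_ENV
              value: {{ .Values.env.NEAM_ENV | quote }}
          livenessProbe:
            httpGet:
              path: {{ .Values.probes.liveness.path }}
              port: 8080
```

Anything a user overrides with --set or a custom values file flows through these expressions at render time, which is why every tunable in values.yaml must have a corresponding reference in the templates.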
Installing the Chart #
# Install from the generated chart directory
helm install my-agent ./chart/ \
--namespace neam-production \
--create-namespace \
--set env.OPENAI_API_KEY="sk-..."
# Install with custom values
helm install my-agent ./chart/ \
--namespace neam-production \
-f custom-values.yaml
# Upgrade an existing release
helm upgrade my-agent ./chart/ \
--namespace neam-production \
--set image.tag="v1.1.0"
# Preview what will be applied (dry run)
helm install my-agent ./chart/ --dry-run --debug
# Package the chart for distribution
helm package ./chart/
# Creates neam-agent-0.1.0.tgz
Helm provides rollback capabilities that Kustomize does not:
# List release history
helm history my-agent -n neam-production
# Roll back to the previous release
helm rollback my-agent -n neam-production
# Roll back to a specific revision
helm rollback my-agent 3 -n neam-production
20.14 Complete Deployment Walkthrough #
Let us walk through a complete deployment from source to production:
# 1. Write your agent
cat > src/main.neam << 'EOF'
agent Assistant {
  provider: "openai"
  model: "gpt-4o-mini"
  system: "You are a helpful assistant."
  memory: "assistant_memory"
}
{
  let query = input();
  emit Assistant.ask(query);
}
EOF
# 2. Compile the agent
neamc src/main.neam -o src/main.neamb
# 3. Build the Docker image
docker build -t my-registry.com/neam-agent:v1.0.0 .
# 4. Push to container registry
docker push my-registry.com/neam-agent:v1.0.0
# 5. Update the image tag in the overlay
cd gitops/overlays/production
# Add an image transformer to kustomization.yaml:
cat >> kustomization.yaml << 'EOF'
images:
  - name: neam-agent
    newName: my-registry.com/neam-agent
    newTag: v1.0.0
EOF
# 6. Commit and push
git add -A && git commit -m "Deploy v1.0.0"
git push origin main
# 7. ArgoCD/FluxCD detects the change and deploys automatically
# 8. Verify the deployment
kubectl -n neam-production get pods
kubectl -n neam-production rollout status deployment/neam-agent
# 9. Test the endpoint
kubectl -n neam-production port-forward svc/neam-agent 8080:80
curl -X POST http://localhost:8080/api/v1/agent/ask \
-H "Content-Type: application/json" \
-d '{"agent_id": "Assistant", "query": "Hello!"}'
Summary #
In this chapter, you learned:
- How to build Neam agents as multi-stage Docker images
- How to run a complete development stack with Docker Compose
- The structure of Kubernetes base manifests: Deployment, Service, ConfigMap, HPA, PDB, NetworkPolicy
- GitOps with Kustomize overlays for dev, staging, and production environments
- ArgoCD and FluxCD setup for automated continuous deployment
- Health probe semantics: liveness, readiness, and startup
- Autoscaling with HPA (CPU-based) and KEDA (event-driven)
- Security best practices: non-root, read-only filesystem, dropped capabilities, network policies
- The neamc deploy command for generating platform-specific manifests
- CI/CD pipelines with GitHub Actions: matrix builds, environment-based deployment, rollout verification
- Security scanning with Trivy (container and filesystem), CodeQL (static analysis), and dependency review
- Helm chart generation from neam.toml: chart structure, installation, upgrades, and rollbacks
In the next chapter, we will extend these deployment patterns across multiple cloud providers: AWS, GCP, and Azure.
Exercises #
Exercise 20.1: Docker Build Optimization #
Modify the Neam Dockerfile to support building with additional backends. Add a
CMAKE_FLAGS build argument so that the image can be built with:
docker build --build-arg CMAKE_FLAGS="-DNEAM_BACKEND_POSTGRES=ON -DNEAM_BACKEND_AWS=ON" .
Hint: Use ARG CMAKE_FLAGS="-DNEAM_BACKEND_POSTGRES=ON" in the builder stage.
Exercise 20.2: Docker Compose Extension #
Add a Grafana service to the Docker Compose stack that:
- Runs on port 3000
- Auto-provisions Prometheus as a data source
- Auto-provisions Jaeger as a data source
- Depends on both Prometheus and Jaeger
Exercise 20.3: Kustomize Overlay #
Create a new Kustomize overlay for a canary environment that:
- Runs in the neam-canary namespace
- Uses 1 replica (no HPA)
- Sets NEAM_LOG_LEVEL to debug
- Sets NEAM_TELEMETRY_SAMPLING_RATE to 1.0 (trace everything)
- Uses 256Mi memory request and 512Mi limit
Exercise 20.4: Health Probe Tuning #
A Neam agent takes 45 seconds to ingest its knowledge base at startup. The current
deployment has no startup probe, and the liveness probe has initialDelaySeconds: 15.
- What happens when this agent is deployed? (Hint: the liveness probe fires before the knowledge base is loaded.)
- Write a startup probe configuration that handles this case.
- What values would you use for periodSeconds and failureThreshold?
Exercise 20.5: KEDA Scaling #
Design a KEDA ScaledObject for a Neam agent that processes messages from an AWS SQS queue. The scaling should:
- Scale to 0 when the queue is empty
- Add 1 pod per 5 messages in the queue
- Maximum 15 replicas
- Cool down for 5 minutes before scaling to 0
Exercise 20.6: Network Policy #
Write a NetworkPolicy that restricts a Neam agent to:
- Accept ingress only from pods in the api-gateway namespace on port 8080
- Allow egress only to:
  - DNS (port 53 UDP)
  - HTTPS (port 443 TCP) for LLM API calls
  - PostgreSQL (port 5432 TCP) to pods labeled app=postgres in the databases namespace
  - OTel Collector (port 4318 TCP) in the observability namespace
Exercise 20.7: End-to-End Deployment #
Starting from scratch, deploy a Neam agent to a local Kubernetes cluster (minikube or kind). The steps are:
- Create a kind cluster
- Build the Docker image and load it into the cluster
- Create the namespace and secrets
- Apply the base manifests
- Verify the pods are running and healthy
- Send a test query to the agent
Document each command and its expected output.