Chapter 21: Multi-Cloud Deployment #
"The cloud is not a place. It is a way of doing IT. And if you are going to do IT that way, you should not be locked into doing it in only one place." -- Cloud architecture principle
What You Will Learn #
In this chapter, you will learn how to deploy Neam agents to the three major cloud providers: Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. You will deploy to serverless platforms (Lambda, Cloud Run, Container Apps), container orchestrators (ECS Fargate, AKS), and managed AI services (Bedrock, Vertex AI, Azure OpenAI). You will configure cloud-native state backends (DynamoDB, CosmosDB) and secrets managers for each provider. By the end of this chapter, you will understand how to choose the right deployment target for your workload and how to run the same Neam agent across multiple clouds.
21.1 Multi-Cloud Architecture Overview #
Neam v0.6.0 supports eight deployment targets across three cloud providers plus Kubernetes. The Neam source code is identical across all targets; only neam.toml and the compile flags change.
Compile Flags #
Each cloud provider's integrations are gated behind compile flags to avoid pulling in unnecessary dependencies:
| Flag | Enables | Required Libraries |
|---|---|---|
| `-DNEAM_BACKEND_POSTGRES=ON` | PostgreSQL state backend | libpq |
| `-DNEAM_BACKEND_REDIS=ON` | Redis state backend | hiredis |
| `-DNEAM_BACKEND_AWS=ON` | DynamoDB, Bedrock, Lambda, ECS Fargate | libcurl (bundled) |
| `-DNEAM_BACKEND_GCP=ON` | Cloud Run, Vertex AI, GCP Secret Manager | libcurl (bundled) |
| `-DNEAM_BACKEND_AZURE=ON` | CosmosDB, Azure OpenAI, Container Apps, AKS, Key Vault | libcurl (bundled) |
The AWS, GCP, and Azure backends use custom REST clients built on libcurl (which is already bundled with Neam), not full cloud SDKs. This keeps the binary small and avoids SDK version conflicts.
# Build for AWS deployment
cmake -B build \
-DCMAKE_BUILD_TYPE=Release \
-DNEAM_BACKEND_POSTGRES=ON \
-DNEAM_BACKEND_AWS=ON
# Build for all clouds
cmake -B build \
-DCMAKE_BUILD_TYPE=Release \
-DNEAM_BACKEND_POSTGRES=ON \
-DNEAM_BACKEND_AWS=ON \
-DNEAM_BACKEND_GCP=ON \
-DNEAM_BACKEND_AZURE=ON
21.2 AWS Deployment #
AWS Lambda #
AWS Lambda is ideal for event-driven agents that handle sporadic traffic. The agent starts on demand, processes a request, and shuts down -- you pay only for compute time used.
Generating Lambda Artifacts #
neamc deploy --target lambda --dry-run
This generates a SAM (Serverless Application Model) template:
# Generated: template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Neam Agent - Lambda Deployment

Globals:
  Function:
    Timeout: 30
    MemorySize: 512
    Runtime: provided.al2023
    Architectures:
      - arm64

Resources:
  NeamAgentFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: bootstrap
      CodeUri: ./build/
      Description: Neam agent function
      Policies:
        - SecretsManagerReadWrite
        - DynamoDBCrudPolicy:
            TableName: !Ref StateTable
      Environment:
        Variables:
          NEAM_ENV: production
          NEAM_STATE_BACKEND: dynamodb
          NEAM_STATE_CONNECTION_STRING: !Sub "dynamodb://${AWS::Region}/${StateTable}"
          NEAM_SECRETS_PROVIDER: aws-secrets-manager
      Events:
        ApiEvent:
          Type: Api
          Properties:
            Path: /api/v1/agent/ask
            Method: post
        HealthEvent:
          Type: Api
          Properties:
            Path: /health
            Method: get

  StateTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: neam-state
      AttributeDefinitions:
        - AttributeName: PK
          AttributeType: S
        - AttributeName: SK
          AttributeType: S
      KeySchema:
        - AttributeName: PK
          KeyType: HASH
        - AttributeName: SK
          KeyType: RANGE
      BillingMode: PAY_PER_REQUEST
      TimeToLiveSpecification:
        AttributeName: ttl
        Enabled: true

Outputs:
  ApiUrl:
    Description: API Gateway URL
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/"
Deploying to Lambda #
# Build the agent
neamc src/main.neam -o build/main.neamb
# Generate the SAM template
neamc deploy --target lambda --output ./deploy/
# Deploy using SAM CLI
cd deploy
sam build
sam deploy --guided \
--stack-name neam-agent \
--capabilities CAPABILITY_IAM \
--parameter-overrides \
OpenAIApiKeySecret=production/neam/OPENAI_API_KEY
Lambda Configuration in neam.toml #
[deploy]
target = "lambda"
[deploy.lambda]
memory-mb = 512
timeout-seconds = 30
architecture = "arm64" # Graviton (cheaper)
vpc-enabled = false
[state]
backend = "dynamodb"
connection-string = "dynamodb://us-east-1/neam-state"
[secrets]
provider = "aws-secrets-manager"
region = "us-east-1"
prefix = "production/neam/"
AWS ECS Fargate #
ECS Fargate is the right choice when your agent needs to be always-on (for autonomous agents with scheduled triggers) or when it needs persistent connections (WebSocket, SSE streaming).
Generating ECS Artifacts #
neamc deploy --target ecs-fargate --dry-run
This generates a task definition, ECS service configuration, and deployment script:
// Generated: task-definition.json
{
  "family": "neam-agent",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::ACCOUNT:role/neam-execution-role",
  "taskRoleArn": "arn:aws:iam::ACCOUNT:role/neam-task-role",
  "containerDefinitions": [
    {
      "name": "neam-agent",
      "image": "ACCOUNT.dkr.ecr.REGION.amazonaws.com/neam-agent:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 15
      },
      "environment": [
        {"name": "NEAM_ENV", "value": "production"},
        {"name": "NEAM_PORT", "value": "8080"},
        {"name": "NEAM_STATE_BACKEND", "value": "postgres"}
      ],
      "secrets": [
        {
          "name": "OPENAI_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:REGION:ACCOUNT:secret:neam/OPENAI_API_KEY"
        },
        {
          "name": "DATABASE_URL",
          "valueFrom": "arn:aws:secretsmanager:REGION:ACCOUNT:secret:neam/DATABASE_URL"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/neam-agent",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
Deploying to ECS #
# Build and push Docker image
docker build -t neam-agent:latest .
docker tag neam-agent:latest ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/neam-agent:latest
aws ecr get-login-password | docker login --username AWS --password-stdin \
ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
docker push ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/neam-agent:latest
# Generate and deploy ECS artifacts
neamc deploy --target ecs-fargate --output ./deploy/
# Register the task definition
aws ecs register-task-definition --cli-input-json file://deploy/task-definition.json
# Create or update the service
aws ecs update-service \
--cluster neam-cluster \
--service neam-agent \
--task-definition neam-agent \
--force-new-deployment
AWS Bedrock Integration #
AWS Bedrock provides managed access to foundation models from Anthropic, Meta, Amazon, and others. Using Bedrock means your LLM calls stay within the AWS network, which can reduce latency and simplify compliance.
agent BedrockAgent {
  provider: "bedrock"
  model: "anthropic.claude-3-5-sonnet-20241022-v2:0"
  system: "You are a helpful assistant powered by AWS Bedrock."
}

{
  let response = BedrockAgent.ask("Explain serverless architecture.");
  emit response;
}
Bedrock authentication uses AWS IAM credentials -- no API key needed. The Neam VM signs requests with AWS Signature V4, using the credentials from the environment (environment variables, instance profile, or ECS task role).
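The signing-key derivation at the heart of Signature V4 can be sketched in a few lines. This is a simplified illustration of the key-derivation and signing step only (the full algorithm also canonicalizes the request and builds the string to sign), not Neam's actual implementation:

```python
import hashlib
import hmac

def sigv4_signature(secret_key, date_stamp, region, service, string_to_sign):
    """Derive the SigV4 signing key and sign the string-to-sign.

    date_stamp is YYYYMMDD; string_to_sign is the canonical string
    AWS specifies for the request being signed.
    """
    def _hmac(key, msg):
        return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

    # Key derivation chain: date -> region -> service -> "aws4_request"
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    k_signing = _hmac(k_service, "aws4_request")
    return hmac.new(k_signing, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()

# Example call with made-up credentials
sig = sigv4_signature("example-secret", "20250101",
                      "us-east-1", "bedrock", "example-string-to-sign")
```

Because the key is derived from the date, region, and service, a leaked signature cannot be replayed against a different service or region.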
# neam.toml for Bedrock
[llm]
default-provider = "bedrock"
default-model = "anthropic.claude-3-5-sonnet-20241022-v2:0"
[llm.rate-limits.bedrock]
requests-per-minute = 60
Supported Bedrock model families:
| Model Family | Model ID Pattern | Notes |
|---|---|---|
| Anthropic Claude | `anthropic.claude-*` | Messages API format |
| Meta Llama | `meta.llama3-*` | Llama request format |
| Amazon Titan | `amazon.titan-*` | Amazon Titan format |
DynamoDB State Backend #
DynamoDB is a natural fit for serverless (Lambda) deployments because it scales to zero with your compute:
[state]
backend = "dynamodb"
connection-string = "dynamodb://us-east-1/neam-state"
The single-table design uses composite partition (PK) and sort (SK) keys, matching the StateTable schema in the SAM template above.
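As an illustration of how such composite keys are typically laid out, the helper below builds PK/SK pairs. The `AGENT#` and record-type prefixes are assumptions for the sake of the example, not Neam's documented key format:

```python
def state_keys(agent_name, record_type, record_id):
    """Build composite keys for a single-table design (illustrative).

    All records for one agent share a partition; the sort key is
    prefixed with the record type so one query can fetch a whole
    range of, say, interactions.
    """
    return {
        "PK": f"AGENT#{agent_name}",         # partition key
        "SK": f"{record_type}#{record_id}",  # sort key
    }

# Two items that would land in the same TriageAgent partition:
interaction = state_keys("TriageAgent", "INTERACTION", "uuid-123")
memory_item = state_keys("TriageAgent", "MEMORY", "customer_memory")
```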
DynamoDB distributed locking uses conditional writes:
PutItem(
  PK = "LOCK#autonomous:leader",
  SK = "LOCK",
  holder_id = instance_id,
  expires_at = now + ttl,
  ConditionExpression = "attribute_not_exists(PK) OR expires_at <= :now"
)
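The semantics of that conditional write are easy to model. The sketch below is an in-memory stand-in for DynamoDB's PutItem with a condition expression, useful for reasoning about lock behavior; it is not the Neam implementation:

```python
import time

class FakeLockTable:
    """In-memory model of DynamoDB conditional-write locking."""

    def __init__(self):
        self.items = {}

    def acquire(self, key, holder_id, ttl, now=None):
        now = time.time() if now is None else now
        item = self.items.get(key)
        # ConditionExpression: attribute_not_exists(PK) OR expires_at <= :now
        if item is None or item["expires_at"] <= now:
            self.items[key] = {"holder_id": holder_id,
                               "expires_at": now + ttl}
            return True
        return False  # lock held by another live instance

locks = FakeLockTable()
got_a = locks.acquire("LOCK#autonomous:leader", "instance-a", ttl=30, now=0)
got_b = locks.acquire("LOCK#autonomous:leader", "instance-b", ttl=30, now=10)
# After instance-a's TTL lapses, the condition passes again:
got_b_later = locks.acquire("LOCK#autonomous:leader", "instance-b", ttl=30, now=40)
```

Because expiry is checked inside the conditional write, a crashed leader never blocks the cluster for longer than one TTL.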
AWS Secrets Manager #
[secrets]
provider = "aws-secrets-manager"
region = "us-east-1"
prefix = "production/neam/"
# Store secrets
aws secretsmanager create-secret \
--name "production/neam/OPENAI_API_KEY" \
--secret-string "sk-..."
aws secretsmanager create-secret \
--name "production/neam/DATABASE_URL" \
--secret-string "postgresql://user:pass@host:5432/neam"
21.3 GCP Deployment #
Google Cloud Run #
Cloud Run is GCP's serverless container platform. It automatically scales from zero to many instances and charges per request. Unlike Lambda, Cloud Run runs your actual Docker container, so everything that works in Docker works on Cloud Run.
Generating Cloud Run Artifacts #
neamc deploy --target cloud-run --dry-run
This generates a Cloud Run service YAML:
# Generated: cloud-run-service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: neam-agent
  labels:
    cloud.googleapis.com/location: us-central1
  annotations:
    run.googleapis.com/ingress: all
    run.googleapis.com/cpu-throttling: "false"
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "10"
        run.googleapis.com/cpu-throttling: "false"
        run.googleapis.com/startup-cpu-boost: "true"
    spec:
      serviceAccountName: neam-agent@PROJECT.iam.gserviceaccount.com
      containerConcurrency: 80
      timeoutSeconds: 300
      containers:
        - name: neam-agent
          image: gcr.io/PROJECT/neam-agent:latest
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: "1"
              memory: 1Gi
          env:
            - name: NEAM_ENV
              value: production
            - name: NEAM_PORT
              value: "8080"
            - name: NEAM_STATE_BACKEND
              value: postgres
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: neam-database-url
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  key: latest
                  name: neam-openai-key
          startupProbe:
            httpGet:
              path: /startup
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 30
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 20
Key Cloud Run annotations and settings:
- `run.googleapis.com/cpu-throttling: "false"`: keeps CPU allocated even between requests. Essential for autonomous agents that run scheduled tasks.
- `run.googleapis.com/startup-cpu-boost: "true"`: allocates extra CPU during cold starts for faster initialization.
- `autoscaling.knative.dev/minScale: "0"`: scales to zero when idle (cost saving).
- `containerConcurrency: 80`: maximum concurrent requests per instance.
Deploying to Cloud Run #
# Build and push image
docker build -t gcr.io/my-project/neam-agent:latest .
docker push gcr.io/my-project/neam-agent:latest
# Generate Cloud Run manifest
neamc deploy --target cloud-run --output ./deploy/
# Deploy using gcloud
gcloud run services replace deploy/cloud-run-service.yaml \
--region us-central1
# Or deploy directly
gcloud run deploy neam-agent \
--image gcr.io/my-project/neam-agent:latest \
--region us-central1 \
--memory 1Gi \
--cpu 1 \
--min-instances 0 \
--max-instances 10 \
--no-cpu-throttling \
--set-secrets "OPENAI_API_KEY=neam-openai-key:latest" \
--set-secrets "DATABASE_URL=neam-database-url:latest"
Vertex AI Integration #
Vertex AI provides managed access to Google's Gemini models and third-party models. Using Vertex AI instead of the public Gemini API gives you VPC service controls, customer-managed encryption keys, and data residency guarantees.
agent VertexAgent {
  provider: "vertex"
  model: "gemini-2.0-flash"
  system: "You are an assistant powered by Vertex AI."
}

{
  let response = VertexAgent.ask("Explain Cloud Run autoscaling.");
  emit response;
}
Vertex AI authentication uses Google Application Default Credentials (ADC). The Neam
VM reads the service account key from GOOGLE_APPLICATION_CREDENTIALS, exchanges it
for an OAuth2 access token, and caches the token until it expires (1 hour).
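The caching behavior is straightforward to model. The sketch below refreshes the token slightly before expiry; the 60-second refresh skew and the class shape are assumptions for illustration, not Neam's internal cache:

```python
import time

class TokenCache:
    """Cache an OAuth2 access token until shortly before it expires."""

    def __init__(self, fetch_token, refresh_skew=60):
        self._fetch = fetch_token  # callable returning (token, lifetime_seconds)
        self._skew = refresh_skew
        self._token = None
        self._expires_at = 0.0

    def get(self, now=None):
        now = time.time() if now is None else now
        # Refresh when missing or within refresh_skew of expiry
        if self._token is None or now >= self._expires_at - self._skew:
            self._token, lifetime = self._fetch()
            self._expires_at = now + lifetime
        return self._token

# Simulated credential exchange (a real one would call the OAuth2 endpoint)
fetches = []
def fake_exchange():
    fetches.append(1)
    return ("token-%d" % len(fetches), 3600)

cache = TokenCache(fake_exchange)
first = cache.get(now=0)
again = cache.get(now=1000)      # served from cache
refreshed = cache.get(now=3550)  # within 60 s of expiry: refreshed
```

Refreshing before the hard expiry avoids a window where an in-flight request carries a token that dies mid-call.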
# neam.toml for Vertex AI
[llm]
default-provider = "vertex"
default-model = "gemini-2.0-flash"
[llm.rate-limits.vertex]
requests-per-minute = 60
GCP Secret Manager #
[secrets]
provider = "gcp-secret-manager"
project = "my-gcp-project"
# Store secrets
echo -n "sk-..." | gcloud secrets create neam-openai-key --data-file=-
echo -n "postgresql://..." | gcloud secrets create neam-database-url --data-file=-
# Grant access to the service account
gcloud secrets add-iam-policy-binding neam-openai-key \
--member="serviceAccount:neam-agent@my-project.iam.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"
Firestore State Backend #
Firestore is GCP's serverless document database. Like DynamoDB for AWS, it pairs naturally with Cloud Run because both scale to zero:
[state]
backend = "firestore"
connection-string = "firestore://my-gcp-project/neam-state"
Firestore uses a collection-per-type design, with one top-level collection per record type (for example, a locks collection for distributed locks).
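For illustration, document paths under that design can be built as follows. The path format matches Firestore's REST API; the collection name `locks` appears in the locking example in this section, while any other collection names you plug in are your own:

```python
def firestore_doc_path(project, collection, doc_id):
    """Build a full Firestore document path as used by the REST API."""
    return (f"projects/{project}/databases/(default)/"
            f"documents/{collection}/{doc_id}")

# The leader-lock document from the locking example below:
lock_doc = firestore_doc_path("my-gcp-project", "locks", "autonomous:leader")
```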
Firestore distributed locking uses transactions with optimistic concurrency:
transaction {
  doc = get("locks/autonomous:leader")
  if doc.exists AND doc.expires_at > now:
    abort  // Lock held by another instance
  set("locks/autonomous:leader", {
    holder_id: instance_id,
    expires_at: now + ttl
  })
}
Authentication uses the same Application Default Credentials as Vertex AI. No additional configuration is needed when running on Cloud Run — the service account is inherited from the Cloud Run service configuration.
21.4 Azure Deployment #
Azure Container Apps #
Azure Container Apps is a serverless container platform built on Kubernetes and KEDA. It handles autoscaling, load balancing, and health management automatically.
Generating Container Apps Artifacts #
neamc deploy --target azure-container-apps --dry-run
# Generated: azure-container-app.yaml
properties:
  managedEnvironmentId: /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.App/managedEnvironments/{env}
  configuration:
    ingress:
      external: true
      targetPort: 8080
      transport: http
    secrets:
      - name: openai-key
        keyVaultUrl: https://neam-vault.vault.azure.net/secrets/openai-key
        identity: system
      - name: database-url
        keyVaultUrl: https://neam-vault.vault.azure.net/secrets/database-url
        identity: system
  template:
    containers:
      - name: neam-agent
        image: neamacr.azurecr.io/neam-agent:latest
        resources:
          cpu: 0.5
          memory: 1Gi
        env:
          - name: NEAM_ENV
            value: production
          - name: NEAM_PORT
            value: "8080"
          - name: OPENAI_API_KEY
            secretRef: openai-key
          - name: DATABASE_URL
            secretRef: database-url
        probes:
          - type: Liveness
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          - type: Readiness
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          - type: Startup
            httpGet:
              path: /startup
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 30
    scale:
      minReplicas: 0
      maxReplicas: 10
      rules:
        - name: http-rule
          http:
            metadata:
              concurrentRequests: "50"
Deploying to Container Apps #
# Build and push to Azure Container Registry
az acr build --registry neamacr --image neam-agent:latest .
# Generate artifacts
neamc deploy --target azure-container-apps --output ./deploy/
# Deploy
az containerapp create \
--name neam-agent \
--resource-group neam-rg \
--environment neam-env \
--image neamacr.azurecr.io/neam-agent:latest \
--target-port 8080 \
--ingress external \
--cpu 0.5 \
--memory 1.0Gi \
--min-replicas 0 \
--max-replicas 10 \
--secrets "openai-key=keyvaultref:https://neam-vault.vault.azure.net/secrets/openai-key,identityref:system" \
--env-vars "NEAM_ENV=production" "OPENAI_API_KEY=secretref:openai-key"
Azure Kubernetes Service (AKS) #
AKS is the right choice when you need the full power of Kubernetes with Azure
integration. The neamc deploy --target azure-aks command generates standard
Kubernetes manifests enhanced with AKS-specific annotations.
neamc deploy --target azure-aks --dry-run
The generated manifests include:
- Workload Identity annotations for pod-level Azure AD authentication
- Azure Key Vault SecretProviderClass for mounting secrets from Key Vault
- Azure Load Balancer annotations for the Service
- Azure Disk/Files StorageClass references for persistent volumes
# Additional AKS annotations on the Deployment
metadata:
  annotations:
    azure.workload.identity/use: "true"
spec:
  template:
    metadata:
      labels:
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: neam-agent
      volumes:
        - name: secrets-store
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: neam-secrets
Deploying to AKS #
# Generate AKS manifests
neamc deploy --target azure-aks --output ./deploy/
# Connect to the AKS cluster
az aks get-credentials --resource-group neam-rg --name neam-aks
# Apply manifests
kubectl apply -f deploy/
# Verify
kubectl -n neam-production get pods
Azure OpenAI Integration #
Azure OpenAI provides OpenAI models (GPT-4, GPT-4o) hosted in Azure data centers. This gives you the same models as OpenAI but with Azure's compliance certifications, VNet integration, and managed identity authentication.
agent AzureAgent {
  provider: "azure_openai"
  model: "gpt-4o-mini"
  endpoint: "https://neam-openai.openai.azure.com"
  system: "You are an assistant powered by Azure OpenAI."
}

{
  let response = AzureAgent.ask("Explain Azure Container Apps.");
  emit response;
}
The Azure OpenAI adapter uses the same request/response format as the OpenAI adapter but with different authentication (API key header or Azure AD token) and endpoint URL construction:
https://{resource}.openai.azure.com/openai/deployments/{deployment}/chat/completions?api-version=2024-10-21
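That URL construction can be sketched as a small helper (the function name is mine; the pattern follows the endpoint format above):

```python
def azure_openai_chat_url(resource, deployment, api_version="2024-10-21"):
    """Build the Azure OpenAI chat-completions endpoint URL.

    Unlike the public OpenAI API, the model is selected by the
    deployment name in the URL path, not by a field in the body.
    """
    return (f"https://{resource}.openai.azure.com/openai/deployments/"
            f"{deployment}/chat/completions?api-version={api_version}")

url = azure_openai_chat_url("neam-openai", "gpt-4o-mini")
```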
# neam.toml for Azure OpenAI
[llm]
default-provider = "azure_openai"
default-model = "gpt-4o-mini"
[llm.rate-limits.azure_openai]
requests-per-minute = 120
CosmosDB State Backend #
CosmosDB is Azure's globally distributed database. It offers single-digit millisecond reads, multi-region writes, and multiple consistency models:
[state]
backend = "cosmosdb"
connection-string = "cosmosdb://neamaccount.documents.azure.com/neam-db"
CosmosDB uses the SQL API with a single container partitioned by agent_name. Each
document includes a type field for multiplexing:
{
"id": "uuid-123",
"agent_name": "TriageAgent",
"type": "learning_interaction",
"query": "Help with my order",
"response": "I'll route you to...",
"reflection_score": 0.85,
"timestamp": 1706745600
}
Azure Key Vault #
[secrets]
provider = "azure-key-vault"
vault-url = "https://neam-vault.vault.azure.net"
# Store secrets
az keyvault secret set --vault-name neam-vault \
--name openai-key --value "sk-..."
az keyvault secret set --vault-name neam-vault \
--name database-url --value "postgresql://..."
# Grant access to the managed identity
az keyvault set-policy --name neam-vault \
--object-id $(az identity show --name neam-identity -g neam-rg --query principalId -o tsv) \
--secret-permissions get list
21.5 Multi-Cloud Strategy #
When to Use Each Target #
Start
  |
  v
Is traffic sporadic/event-driven?
  |
  +-- Yes --> Is it on AWS?
  |             +-- Yes --> Lambda
  |             +-- No  --> Cloud Run (GCP)
  |                         Container Apps (Azure)
  |
  +-- No  --> Does it need always-on
              (autonomous agents, WebSocket)?
                |
                +-- Yes --> Need Kubernetes features?
                |             +-- Yes --> K8s / AKS / GKE
                |             +-- No  --> ECS Fargate (AWS)
                |                         Cloud Run (GCP, no
                |                           CPU throttling)
                |                         Container Apps (Azure)
                |
                +-- No  --> Cloud Run / Container Apps
                            (scale to zero, save cost)
Target Comparison #
| Concern | Lambda | ECS Fargate | Cloud Run | Container Apps | Kubernetes |
|---|---|---|---|---|---|
| Cold start | 1-5s | None | 1-3s | 1-3s | None |
| Scale to zero | Yes | No | Yes | Yes | With KEDA |
| Max timeout | 15 min | Unlimited | 60 min | Unlimited | Unlimited |
| WebSocket/SSE | No | Yes | Yes | Yes | Yes |
| Autonomous agents | No | Yes | Yes (no throttle) | Yes | Yes |
| Pricing model | Per request | Per vCPU-hour | Per request | Per vCPU-sec | Per node |
| Complexity | Low | Medium | Low | Low | High |
Cross-Cloud Agent Example #
Here is a complete example of a Neam agent that can run on any cloud provider. The
agent code is identical; only neam.toml changes:
agent CustomerBot {
  provider: "openai"
  model: "gpt-4o-mini"
  temperature: 0.3
  system: "You are a customer service agent. Be helpful and professional."
  learning: {
    strategy: "experience_replay"
    review_interval: 20
  }
  memory: "customer_memory"
}

agent EscalationBot {
  provider: "openai"
  model: "gpt-4o"
  temperature: 0.1
  system: "You handle escalated customer issues with extra care."
  memory: "escalation_memory"
}

fun handle_request(query) {
  let response = CustomerBot.ask(query);

  // Check if escalation is needed
  if (response.contains("ESCALATE")) {
    return EscalationBot.ask("Escalated issue: " + query);
  }
  return response;
}

{
  let query = input();
  let result = handle_request(query);
  emit result;
}
AWS Configuration #
# neam.toml for AWS
[project]
name = "customer-bot"
version = "1.0.0"
[project.entry_points]
main = "src/main.neam"
[state]
backend = "dynamodb"
connection-string = "dynamodb://us-east-1/customer-bot-state"
[llm]
default-provider = "openai"
default-model = "gpt-4o-mini"
[llm.rate-limits.openai]
requests-per-minute = 120
[secrets]
provider = "aws-secrets-manager"
region = "us-east-1"
prefix = "production/customer-bot/"
[deploy]
target = "ecs-fargate"
GCP Configuration #
# neam.toml for GCP
[project]
name = "customer-bot"
version = "1.0.0"
[project.entry_points]
main = "src/main.neam"
[state]
backend = "postgres"
connection-string = "secret://DATABASE_URL"
[llm]
default-provider = "openai"
default-model = "gpt-4o-mini"
[llm.rate-limits.openai]
requests-per-minute = 120
[secrets]
provider = "gcp-secret-manager"
project = "customer-bot-prod"
[deploy]
target = "cloud-run"
Azure Configuration #
# neam.toml for Azure
[project]
name = "customer-bot"
version = "1.0.0"
[project.entry_points]
main = "src/main.neam"
[state]
backend = "cosmosdb"
connection-string = "cosmosdb://customerbot.documents.azure.com/customer-bot-db"
[llm]
default-provider = "azure_openai"
default-model = "gpt-4o-mini"
[llm.rate-limits.azure_openai]
requests-per-minute = 120
[secrets]
provider = "azure-key-vault"
vault-url = "https://customerbot-vault.vault.azure.net"
[deploy]
target = "azure-container-apps"
Multi-Cloud Considerations #
Vendor lock-in: Using cloud-specific state backends (DynamoDB, CosmosDB) and LLM providers (Bedrock, Vertex AI, Azure OpenAI) creates coupling. If portability is a priority, use PostgreSQL for state and the standard OpenAI/Anthropic APIs for LLM calls. These work on any cloud.
Latency: Place your state backend in the same region as your compute. LLM API calls to OpenAI or Anthropic go to the public internet regardless of cloud provider. Cloud-specific LLM providers (Bedrock, Vertex AI, Azure OpenAI) keep traffic on the cloud provider's network, which can reduce latency by 20-50ms.
Cost: Serverless targets (Lambda, Cloud Run, Container Apps) are cheapest for sporadic workloads. For sustained load, container-based targets (ECS Fargate, Kubernetes) are more cost-effective. DynamoDB PAY_PER_REQUEST is cheapest for variable workloads; provisioned capacity is cheaper for predictable workloads.
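The serverless-versus-container break-even is worth estimating before choosing a target. The sketch below uses illustrative us-east-1 list prices; these change, so check current pricing before relying on the numbers:

```python
def lambda_monthly_cost(requests, avg_seconds, memory_gb,
                        gb_second_price=0.0000166667,
                        request_price=0.0000002):
    """Approximate monthly AWS Lambda cost (ignores the free tier)."""
    return (requests * avg_seconds * memory_gb * gb_second_price
            + requests * request_price)

def fargate_monthly_cost(vcpu, memory_gb, hours=730,
                         vcpu_hour_price=0.04048,
                         gb_hour_price=0.004445):
    """Approximate monthly cost of one always-on Fargate task."""
    return hours * (vcpu * vcpu_hour_price + memory_gb * gb_hour_price)

# 300,000 requests/month at 5 s on 512 MB, vs. a 0.5 vCPU / 1 GB task:
serverless = lambda_monthly_cost(300_000, 5, 0.5)
always_on = fargate_monthly_cost(0.5, 1.0)
```

At this volume the serverless option is still cheaper; as request volume or duration grows, the always-on task eventually wins because its cost is flat.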
Compliance: Some industries require data residency in specific regions. Cloud-specific LLM providers (Bedrock, Azure OpenAI) offer more data residency controls than the public OpenAI or Anthropic APIs. Azure OpenAI, in particular, processes data within Azure data centers, with no data leaving your subscription.
21.6 Cloud-Specific Build Pipeline #
Here is a CI/CD pipeline that builds and deploys to multiple clouds:
# .github/workflows/deploy.yml
name: Multi-Cloud Deploy

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build Docker image
        run: |
          docker build \
            --build-arg CMAKE_FLAGS="-DNEAM_BACKEND_POSTGRES=ON -DNEAM_BACKEND_AWS=ON -DNEAM_BACKEND_GCP=ON -DNEAM_BACKEND_AZURE=ON" \
            -t neam-agent:${{ github.sha }} .
      - name: Push to ECR (AWS)
        run: |
          aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
          docker tag neam-agent:${{ github.sha }} $ECR_REGISTRY/neam-agent:${{ github.sha }}
          docker push $ECR_REGISTRY/neam-agent:${{ github.sha }}
      - name: Push to GCR (GCP)
        run: |
          gcloud auth configure-docker
          docker tag neam-agent:${{ github.sha }} gcr.io/$GCP_PROJECT/neam-agent:${{ github.sha }}
          docker push gcr.io/$GCP_PROJECT/neam-agent:${{ github.sha }}
      - name: Push to ACR (Azure)
        run: |
          az acr login --name neamacr
          docker tag neam-agent:${{ github.sha }} neamacr.azurecr.io/neam-agent:${{ github.sha }}
          docker push neamacr.azurecr.io/neam-agent:${{ github.sha }}

  deploy-aws:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to ECS
        run: |
          aws ecs update-service \
            --cluster neam-cluster \
            --service neam-agent \
            --force-new-deployment

  deploy-gcp:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to Cloud Run
        run: |
          gcloud run deploy neam-agent \
            --image gcr.io/$GCP_PROJECT/neam-agent:${{ github.sha }} \
            --region us-central1

  deploy-azure:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to Container Apps
        run: |
          az containerapp update \
            --name neam-agent \
            --resource-group neam-rg \
            --image neamacr.azurecr.io/neam-agent:${{ github.sha }}
21.7 Terraform Integration #
For teams that manage infrastructure with Terraform, the neamc deploy command can
generate Terraform configurations instead of raw cloud manifests. This integrates
Neam deployments into existing Infrastructure-as-Code workflows.
Generating Terraform Configurations #
# Generate Terraform for AWS
neamc deploy --target terraform --cloud aws --output ./terraform/
# Generate Terraform for GCP
neamc deploy --target terraform --cloud gcp --output ./terraform/
# Generate Terraform for Azure
neamc deploy --target terraform --cloud azure --output ./terraform/
# Preview without writing files
neamc deploy --target terraform --cloud aws --dry-run
The generated Terraform files define the cloud resources your agent needs, such as the compute service, networking, IAM roles, and autoscaling policies.
Example: AWS Terraform Output #
# Generated: compute.tf (AWS ECS Fargate)
resource "aws_ecs_cluster" "neam" {
  name = var.project_name

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_ecs_service" "neam_agent" {
  name            = "${var.project_name}-agent"
  cluster         = aws_ecs_cluster.neam.id
  task_definition = aws_ecs_task_definition.neam_agent.arn
  desired_count   = var.min_replicas
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.neam_agent.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.neam_agent.arn
    container_name   = "neam-agent"
    container_port   = 8080
  }
}

resource "aws_appautoscaling_target" "neam_agent" {
  max_capacity       = var.max_replicas
  min_capacity       = var.min_replicas
  resource_id        = "service/${aws_ecs_cluster.neam.name}/${aws_ecs_service.neam_agent.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}
Applying Terraform #
cd terraform/
# Initialize and validate
terraform init
terraform validate
# Plan the changes
terraform plan -out=tfplan
# Apply
terraform apply tfplan
# Outputs show the deployment endpoint
terraform output api_url
Mixing neamc deploy with Existing Terraform #
If you already have Terraform-managed infrastructure, you can import the generated resources into your existing state or reference them as modules:
# In your existing Terraform project
module "neam_agent" {
  source = "./modules/neam-agent"

  project_name    = "customer-bot"
  image_tag       = var.neam_image_tag
  min_replicas    = 2
  max_replicas    = 10
  vpc_id          = module.networking.vpc_id
  private_subnets = module.networking.private_subnet_ids
}
21.8 Cloud-Specific Observability #
Each cloud provider offers its own monitoring service. Neam integrates with all three through the OpenTelemetry Collector, which translates OTLP data into cloud-native formats.
AWS: CloudWatch Integration #
On AWS, the OTel Collector forwards traces to AWS X-Ray and metrics to CloudWatch Metrics:
# OTel Collector config for AWS
exporters:
  awsxray:
    region: us-east-1
  awsemf:
    region: us-east-1
    namespace: Neam
    log_group_name: /neam/metrics

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [awsxray]
    metrics:
      receivers: [otlp]
      exporters: [awsemf]
For ECS Fargate deployments, the OTel Collector runs as a sidecar container in the same task definition. For Lambda deployments, the Neam runtime exports spans directly via the OTLP/HTTP endpoint — no collector is needed.
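A sketch of the sidecar wiring, as a second entry in the task definition's containerDefinitions array. The AWS Distro for OpenTelemetry collector image is shown here; the exact image tag and how you supply the collector config are deployment-specific assumptions:

```json
{
  "name": "otel-collector",
  "image": "public.ecr.aws/aws-observability/aws-otel-collector:latest",
  "essential": true,
  "portMappings": [
    {"containerPort": 4317, "protocol": "tcp"}
  ],
  "environment": [
    {"name": "AWS_REGION", "value": "us-east-1"}
  ]
}
```

The agent container then exports OTLP to localhost:4317, since sidecars in the same Fargate task share a network namespace.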
GCP: Cloud Trace and Cloud Monitoring #
On GCP, the OTel Collector exports to Cloud Trace and Cloud Monitoring:
# OTel Collector config for GCP
exporters:
  googlecloud:
    project: my-gcp-project
    trace:
      endpoint: cloudtrace.googleapis.com:443
    metric:
      endpoint: monitoring.googleapis.com:443

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [googlecloud]
    metrics:
      receivers: [otlp]
      exporters: [googlecloud]
Cloud Run deployments can use the built-in Cloud Trace integration. Set the
GOOGLE_CLOUD_PROJECT environment variable, and the OTel Collector sidecar
authenticates automatically via the service account.
Azure: Application Insights #
On Azure, the OTel Collector exports to Azure Monitor (Application Insights):
# OTel Collector config for Azure
exporters:
  azuremonitor:
    connection_string: ${APPLICATIONINSIGHTS_CONNECTION_STRING}

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [azuremonitor]
    metrics:
      receivers: [otlp]
      exporters: [azuremonitor]
Application Insights provides built-in dashboards, smart detection alerts, and an application map that visualizes service dependencies.
Exporter Comparison #
| Feature | Jaeger + Prometheus | AWS X-Ray + CloudWatch | GCP Cloud Trace | Azure App Insights |
|---|---|---|---|---|
| Self-hosted | Yes | No | No | No |
| Cost | Infrastructure only | Per trace/metric | Per trace/metric | Per GB ingested |
| Trace retention | Configurable | 30 days | 30 days | 90 days |
| Built-in alerting | Via Alertmanager | CloudWatch Alarms | Cloud Monitoring | Smart Detection |
| Multi-cloud | Yes | AWS only | GCP only | Azure only |
For multi-cloud deployments, use the self-hosted stack (Jaeger + Prometheus + Grafana) as the single pane of glass, with cloud-native exporters as secondary destinations for teams that prefer their cloud provider's tooling.
Summary #
In this chapter, you learned:
- AWS deployment: Lambda (serverless), ECS Fargate (containers), Bedrock (managed LLM), DynamoDB (state), Secrets Manager
- GCP deployment: Cloud Run (serverless containers), Vertex AI (managed LLM), Firestore (state), GCP Secret Manager
- Azure deployment: Container Apps (serverless containers), AKS (Kubernetes), Azure OpenAI (managed LLM), CosmosDB (state), Key Vault (secrets)
- Compile flags for cloud-specific backends
- How to write cloud-portable Neam agents with environment-specific neam.toml files
- Decision criteria for choosing between deployment targets
- Multi-cloud CI/CD pipelines
- Terraform integration for Infrastructure-as-Code workflows
- Cloud-specific observability: AWS X-Ray, GCP Cloud Trace, Azure Application Insights
In the next chapter, we will cover observability and monitoring -- how to see what your deployed agents are doing in production.
Exercises #
Exercise 21.1: Lambda Configuration #
Write a complete neam.toml for deploying a Neam agent to AWS Lambda with:
- DynamoDB state backend in us-west-2
- Bedrock as the LLM provider using Claude 3.5 Sonnet
- AWS Secrets Manager for credentials
- 1024 MB memory, 30-second timeout
Then write the Neam agent code that uses this configuration.
Exercise 21.2: Cloud Run vs. Container Apps #
Compare deploying the same Neam agent to Google Cloud Run and Azure Container Apps. For each platform:
- Write the neam.toml configuration
- List the CLI commands to deploy
- Describe how autoscaling is configured
- Explain how secrets are injected
Exercise 21.3: Multi-Cloud State Migration #
You have a Neam agent running on AWS with DynamoDB state that needs to be migrated to Azure with CosmosDB. Describe:
- The data model differences between DynamoDB and CosmosDB
- A migration strategy that minimizes downtime
- How you would verify data integrity after migration
- What changes are needed in neam.toml
Exercise 21.4: Cloud-Native LLM Provider Selection #
For each scenario, recommend an LLM provider and justify your choice:
- A healthcare agent that must keep all data within a specific AWS region
- A startup that wants the cheapest option for GPT-4-class models
- An enterprise that needs Azure AD integration for compliance
- A research team that wants access to open-source models (Llama 3) without managing GPU infrastructure
- A multi-cloud deployment where the LLM provider must work on all three clouds
Exercise 21.5: Cost Optimization #
A Neam agent receives 10,000 requests per day with the following pattern:
- 8 AM - 6 PM: 80% of traffic (800 requests/hour)
- 6 PM - 8 AM: 20% of traffic (140 requests/hour)
Calculate the monthly compute cost for each deployment target:
1. AWS Lambda (128 MB, 5s avg duration)
2. AWS ECS Fargate (0.5 vCPU, 1 GB, always-on)
3. Google Cloud Run (1 vCPU, 1 GB, min 0 instances)
4. Azure Container Apps (0.5 vCPU, 1 GB, min 0 instances)
Use current pricing from each provider's documentation. Which target is cheapest?
Exercise 21.6: Disaster Recovery #
Design a disaster recovery strategy for a Neam agent deployed on AWS that needs:
- RPO (Recovery Point Objective) of 1 hour
- RTO (Recovery Time Objective) of 15 minutes
- Automatic failover to a secondary region
Describe the architecture, including state replication, DNS failover, and the
neam.toml changes needed for each region.