Add runner.topology config selector for ARC/DinD mode #5591

@lpcox

Summary

Add a single declarative runner.topology selector to the AWF stdin config / JSON schema so callers (notably the gh-aw compiler) can put the firewall into ARC/DinD mode with one key, instead of hand-coordinating ~6 lower-level keys plus runtime probes.

Problem

There is currently no first-class concept of "AWF is running inside an ARC Kubernetes runner using the Docker-in-Docker sidecar pattern." Getting AWF to behave correctly there requires the caller to independently set several scattered keys, and the firewall can't apply sane defaults, validate the combination, or warn when it's half-configured:

| Key | Section | ARC/DinD role |
| --- | --- | --- |
| network.isolation | network | no NET_ADMIN/iptables (k8s) |
| container.dockerHost | container | point at the sibling dind daemon |
| container.dockerHostPathPrefix | container | translate bind-mount paths across the split filesystem |
| container.runnerToolCachePath | container | relocate the tool cache off /opt onto the shared _work volume |
| chroot.binariesSourcePath / chroot.identity | chroot | split-FS chroot overrides |
| dind.preStageDirs / dind.workDir | dind | stage dirs into the daemon-visible filesystem |

Because nothing ties these together, a workflow can end up with e.g. network isolation enabled but the tool cache still under /opt (invisible to the daemon), with no diagnostic.

Proposed config change

Add a new top-level runner object with a topology enum. An enum is used rather than a boolean because other topologies are likely to be added later:

{
  "runner": {
    "topology": "arc-dind"   // "standard" (default) | "arc-dind"
  }
}
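
As a rough illustration of how a caller might read this key, the sketch below extracts runner.topology from an already-parsed config object, falls back to "standard" when the key is absent, and rejects values outside the enum. The helper name readRunnerTopology and the choice to throw on unknown values are assumptions for illustration; in AWF itself the JSON schema would be expected to do this validation.

```typescript
// Allowed values, taken from the proposal above.
const RUNNER_TOPOLOGIES = ["standard", "arc-dind"] as const;
type RunnerTopology = (typeof RUNNER_TOPOLOGIES)[number];

// Hypothetical helper (not an AWF function): reads runner.topology from a
// parsed config object and defaults to "standard" when the key is absent.
function readRunnerTopology(config: unknown): RunnerTopology {
  const runner = (config as { runner?: { topology?: unknown } } | null | undefined)?.runner;
  const value = runner?.topology;
  if (value === undefined) return "standard";

  // Accept only values that appear in the enum.
  const match = RUNNER_TOPOLOGIES.find((t) => t === value);
  if (match === undefined) {
    throw new Error(
      `runner.topology must be one of ${RUNNER_TOPOLOGIES.join(", ")}; got ${JSON.stringify(value)}`,
    );
  }
  return match;
}
```

Using a string enum here means a future topology only needs a new entry in RUNNER_TOPOLOGIES, whereas a boolean would need a second key.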

Semantics: overridable defaults

When runner.topology === "arc-dind", the firewall applies overridable defaults — an explicit value anywhere else in the config always wins:

  1. network.isolation defaults to true (ARC k8s lacks NET_ADMIN).
  2. dind.preStageDirs defaults to true.

An explicit network.isolation: false or dind.preStageDirs: false still takes precedence.
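
The precedence rule above can be sketched as follows. This is a minimal illustration, not the actual mapAwfFileConfigToCliOptions code: the type names and applyArcDindDefaults are hypothetical, the config shape is reduced to the three keys involved, and it assumes both options default to false under the standard topology, which this issue does not state.

```typescript
// Hypothetical, reduced config shape; the real AwfFileConfig has more fields.
interface TopologyConfigSketch {
  runner?: { topology?: "standard" | "arc-dind" };
  network?: { isolation?: boolean };
  dind?: { preStageDirs?: boolean };
}

interface TopologyDefaultsResult {
  networkIsolation: boolean;
  dindPreStageDirs: boolean;
}

function applyArcDindDefaults(config: TopologyConfigSketch): TopologyDefaultsResult {
  const isArcDind = config.runner?.topology === "arc-dind";
  return {
    // `??` falls back only when the key is absent (undefined), so an
    // explicit `false` in the config still wins over the topology default.
    networkIsolation: config.network?.isolation ?? isArcDind,
    dindPreStageDirs: config.dind?.preStageDirs ?? isArcDind,
  };
}
```

For example, applyArcDindDefaults({ runner: { topology: "arc-dind" }, network: { isolation: false } }) keeps networkIsolation at false while dindPreStageDirs picks up the arc-dind default of true.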

Tool-cache validation warning

At config load, if runner.topology is arc-dind and the resolved tool cache (container.runnerToolCachePath or RUNNER_TOOL_CACHE) is unset or under /opt, AWF emits an actionable warning. In ARC/DinD, /opt/hostedtoolcache belongs to the runner image and is not visible to the sibling Docker daemon, so a cache there cannot be mounted into job containers. AWF cannot fix this itself — the runner pod spec must relocate RUNNER_TOOL_CACHE onto a volume mounted into both the runner and the daemon at the same path (conventionally under /home/runner/_work). The warning makes the misconfiguration visible instead of silent.
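
A minimal sketch of that check is shown below. It is not the proposed warnArcDindToolCache implementation: toolCacheWarningSketch is a hypothetical name, the environment is passed in as a plain object rather than read from the process, and the assumption that container.runnerToolCachePath takes precedence over RUNNER_TOOL_CACHE is mine. The sketch also compares path strings literally and does not normalize segments such as "..".

```typescript
// Hypothetical helper: returns warning text, or undefined when no warning applies.
function toolCacheWarningSketch(
  topology: string | undefined,
  runnerToolCachePath: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  if (topology !== "arc-dind") return undefined;

  // Assumed precedence: the config key first, then the environment variable.
  // `||` treats an empty string the same as an unset value.
  const resolved = runnerToolCachePath || env["RUNNER_TOOL_CACHE"];
  const remedy =
    "Relocate RUNNER_TOOL_CACHE in the runner pod spec onto a volume mounted into both " +
    "the runner and the Docker daemon at the same path (conventionally under /home/runner/_work).";

  if (!resolved) {
    return `runner.topology is arc-dind but no tool cache path is set. ${remedy}`;
  }
  // Match /opt itself and anything below it, but not e.g. /optional.
  if (resolved === "/opt" || resolved.startsWith("/opt/")) {
    return `runner.topology is arc-dind but the tool cache (${resolved}) is under /opt, ` +
      `which the sibling Docker daemon cannot see. ${remedy}`;
  }
  return undefined;
}
```

Returning the message instead of logging it keeps the check easy to unit-test; the caller decides how to emit it.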

Scope of this issue (config + schema only)

  • Add runner.topology enum (standard | arc-dind) to the canonical schema (docs/awf-config.schema.json) and regenerate src/awf-config-schema.json via scripts/generate-schema.mjs.
  • Add runner?: { topology? } to the AwfFileConfig interface in src/config-file.ts.
  • In mapAwfFileConfigToCliOptions, apply the overridable arc-dind defaults (networkIsolation, dindPreStageDirs).
  • Add a load-time tool-cache warning (warnArcDindToolCache).
  • Schema enum tests + mapping default/override tests + warning tests.

Out of scope (follow-up: runtime wiring)

The config/schema layer is independent of the runtime behavior that should eventually also key off runner.topology. Today the following runtime paths key off other signals and would be wired up separately:

  • The split-filesystem path-prefix probe / dindHint in src/option-parsers.ts keys off AWF_DIND=1, dockerHostPathPrefix, or a sibling daemon socket — not topology.
  • src/dind-bootstrap.ts fires on enableDind || dockerHostPathPrefix || AWF_DIND.

A follow-up should make runner.topology: arc-dind also drive those probes so the mode triggers split-FS handling without relying on env vars.
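
One possible shape for that follow-up is to treat the topology as one more trigger alongside the existing signals. The sketch below is hypothetical: DindSignalsSketch and shouldBootstrapDind are illustrative names, not code from src/dind-bootstrap.ts, and treating AWF_DIND as active only when it equals "1" is an assumption based on the AWF_DIND=1 wording above.

```typescript
// Hypothetical bundle of the signals listed above, plus the new topology key.
interface DindSignalsSketch {
  topology?: "standard" | "arc-dind";
  enableDind?: boolean;
  dockerHostPathPrefix?: string;
  awfDindEnv?: string; // value of the AWF_DIND environment variable, if set
}

function shouldBootstrapDind(signals: DindSignalsSketch): boolean {
  return (
    signals.topology === "arc-dind" || // new trigger proposed for the follow-up
    signals.enableDind === true ||
    Boolean(signals.dockerHostPathPrefix) || // a non-empty prefix counts as a signal
    signals.awfDindEnv === "1"
  );
}
```

With this shape, existing env-var and flag-based setups keep working unchanged, and runner.topology: arc-dind alone is enough to turn on the split-FS handling.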

Motivation

This gives the gh-aw compiler a single stable contract (runner.topology: arc-dind) to emit — analogous to how sudo: false maps to network isolation — rather than orchestrating six keys and runtime probes for standard ARC/DinD deployments.
