Skip to content

[aw-failures] Claude false-red — log_parser_bootstrap fails completed runs on empty logEntries (Avenger, Daily Rendering Scripts [Content truncated due to length] #41885

Description

@github-actions

Guard the Claude log_parser_bootstrap guardrail with safeOutputEntriesCount — it currently fails completed runs whenever logEntries reads empty, marking real successes red across multiple workflows.

This is a NEW P1 false-red gap. It is NOT covered by #41636 (Copilot CLI exit-1 / "permission denied", different engine), #41711 (github MCP "failed to launch"), or #39849 (No-Op Runs). Here the Claude engine runs to completion, emits safe outputs, and the parse step even prints Claude log parsed successfully — then fails the job anyway.

Problem statement

Fix the contradiction in actions/setup/js/log_parser_bootstrap.cjs: the parse step logs success and then fails the same step ~3ms later. Two distinct Claude-engine workflows hit this in the last 6h, both at the "Parse agent logs for step summary" step:

Claude log parsed successfully
##[error]ERR_CONFIG: Claude execution failed: no structured log entries were produced.
This usually indicates a startup or configuration error before tool execution.

The agent demonstrably executed real work in both runs (visible tool_use/tool_result JSON, passing vitest/Go test output, task retrieval success) and produced safe outputs — both jobs set output, output_types, and has_patch. The run is a false-red: the job is marked failure even though the agent succeeded.

Root cause

The Claude guardrail fires purely on an empty logEntries, with no completion/safe-output escape hatch:

// actions/setup/js/log_parser_bootstrap.cjs:331-335
if (parserName === "Claude" && (!logEntries || logEntries.length === 0)) {
  core.setFailed(`${ERR_CONFIG}: Claude execution failed: no structured log entries were produced. ...`);
}

When logEntries is empty, the parser also takes the fallback branch that prints ${parserName} log parsed successfully (line 301) — hence "parsed successfully" immediately followed by setFailed.

logEntries is empty here despite the agent producing structured output to the GitHub step stdout. The parser reads GH_AW_AGENT_OUTPUT=/tmp/gh-aw/agent-stdio.log; in both runs the sandbox/firewall teardown immediately before the parse step failed to repair artifact ownership, leaving that file empty/unreadable for the parse step:

chmod: cannot access '/tmp/gh-aw/sandbox/firewall/logs': No such file or directory
[WARN] Rootless artifact permission repair failed for /tmp/gh-aw/sandbox/firewall/logs (exit 1)
[WARN] Failed to clean up temporary files: Error: EACCES, Permission denied: /tmp/awf-...-chroot-home
    at Object.rmSync (node:fs:1222:18)

The adjacent MCP-failure handling (same file, lines 338–348) already treats post-completion infra failures as non-fatal when safeOutputEntriesCount > 0 or agentRanToCompletion(logEntries). The Claude guardrail was never given the same escape hatch.

Affected workflows and run IDs

Workflow Engine Failed run Nearest success
Avenger (.github/workflows/avenger.lock.yml) claude §28287279888 §28288382389
Daily Rendering Scripts Verifier (.github/workflows/daily-rendering-scripts-verifier.lock.yml) claude §28283864143

Stack: awf v0.27.11, mcpg v0.3.30. Pattern is intermittent — Avenger interleaves failures (01:19, 02:11, 03:36, 05:40, 11:02 UTC) with successes on the same config, consistent with a teardown race rather than a config error.

Evidence: audit-diff (failed §28287279888 vs success §28288382389)
  • firewall_diff: no anomalies (has_anomalies: false) — the EACCES/permission-repair warnings appear in passing runs too, so they are not the discriminator; the only divergence is the parse-step guardrail.
  • run_metrics_diff: failed run made 12 GitHub API calls (23 core consumed); token usage parsed as 0 on both — consistent with the structured-log read returning empty.

Proposed remediation

  1. Primary (cheap, high-confidence): gate the guardrail at log_parser_bootstrap.cjs:333 on completion signal, mirroring the existing MCP logic — do not setFailed when safeOutputEntriesCount > 0 (downgrade to core.warning). The agent provably completed and emitted outputs in both failing runs.
  2. Upstream: ensure agent-stdio.log (GH_AW_AGENT_OUTPUT) is readable by the parse step after rootless sandbox/firewall teardown — the "Rootless artifact permission repair failed (exit 1)" warning should not be able to strand the agent log.
  3. Hygiene: suppress the misleading ${parserName} log parsed successfully message on the empty-logEntries fallback path so logs stop claiming success right before failing.

Success criteria / verification

  • Re-running Avenger / Daily Rendering Scripts Verifier on the same config no longer flips to failure when the agent completed and emitted safe outputs.
  • Add a log_parser_bootstrap.test.cjs case: Claude parser, empty logEntries, safeOutputEntriesCount > 0setFailed NOT called (warning instead).
  • No ERR_CONFIG: ... no structured log entries emitted on runs where output_types/has_patch outputs were set.

Analyzed run IDs: 28287279888, 28283864143 (comparator 28288382389). The third in-window failure, §28288175888 "Daily Max AI Credits Test (Intentionally Fails)", is by-design and excluded.

Generated by 🔍 [aw] Failure Investigator (6h) · 202.7 AIC · ⌖ 36.1 AIC · ⊞ 5.6K ·

  • expires on Jul 4, 2026, 5:21 AM UTC-08:00

Metadata

Metadata

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions