[formal-spec] Add OTEL observability formal compatibility test suite (v0.4.0 L1/L2)#41906
[formal-spec] Add OTEL observability formal compatibility test suite (v0.4.0 L1/L2)#41906Copilot wants to merge 3 commits into
Conversation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
✅ Test Quality Sentinel completed test quality analysis. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
There was a problem hiding this comment.
Pull request overview
Adds a formal compatibility test suite for observability.otlp behavior in pkg/workflow, intended to continuously enforce spec-level invariants around endpoint normalization, header handling, if-missing policy semantics, service naming, and related OTEL/OTLP contracts.
Changes:
- Introduces
TestFormal_*coverage for OTLP endpoint normalization (string/object/array), fan-out ordering, and empty/absent handling. - Adds assertions for deterministic OTLP header serialization plus Sentry-specific
Authorization -> x-sentry-authrewriting behavior. - Adds contract checks for if-missing normalization/defaulting, service-name formation, static domain extraction, and mirror-path constant composition.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/otel_observability_formal_test.go | New formal-compatibility tests for OTLP observability normalization, policies, and contract invariants. |
Review details
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 1/1 changed files
- Comments generated: 3
- Review effort level: Low
| func TestFormal_ExpressionProducesNoAllowlistEntry(t *testing.T) { | ||
| assert.Empty(t, extractOTLPEndpointDomain("${{ secrets.OTLP_ENDPOINT }}")) | ||
| } |
| assert.Contains(t, validWorkflowData.Env, "GH_AW_OTLP_IF_MISSING") | ||
| assert.Contains(t, validWorkflowData.Env, "warn") |
| assert.Equal(t, | ||
| []otlpEndpointEntry{{URL: "https://string.example.com:4317"}}, | ||
| collectAllOTLPEndpoints(stringForm), | ||
| ) |
🧪 Test Quality Sentinel Report❌ Test Quality Score: 87/100 — Excellent (but failing)
📊 Metrics & Test Classification (15 tests analyzed)
Go: 15 (
|
There was a problem hiding this comment.
❌ Test Quality Sentinel: 87/100. Coding-guideline violation detected. pkg/workflow/otel_observability_formal_test.go is missing the required //go:build !integration (or //go:build integration) build tag on line 1. Additionally, all 42 assertions are bare (no descriptive message strings). Please review the flagged tests in the comment above.
…ility test suite Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (245 new lines in 📄 Draft ADR committed:
📋 What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. ❓ Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /tdd and /grill-with-docs — requesting changes on test-quality issues.
📋 Key Themes & Highlights
Key Themes
- Duplicate/redundant tests (3 instances):
TestFormal_ExpressionProducesNoAllowlistEntry,TestFormal_StringHeaderFormPreservedForNonSentry, and the overlap betweenTestFormal_AbsentObservabilityProducesNoEndpointsand the"empty and absent"subtest all re-assert something already covered elsewhere. In a formal spec suite each test should own exactly one predicate. - Fragmented
if-missingcoverage: Valid and invalid modes are split across two functions; theTestFormal_InvalidIfMissingFallsBackToDefaultalso mixes unit and integration assertions in one body. - Determinism loop doesn't actually vary map ordering: Go randomises iteration per map creation, not per call — the 10-iteration loop over the same map object is 9 no-ops.
- No spec predicate linkage in the file: The traceability from test → predicate → spec version exists only in the PR description, not in the source.
authPlaceholder = "******": Identical to the GitHub Actions masking output — use a clearly-fake token value instead.
Positive Highlights
- ✅ Clean
TestFormal_naming convention makes the formal-test layer immediately recognisable. - ✅ Good use of
require.Lenbefore index access to prevent panics on wrong-length slices. - ✅ Sentry auth-rewrite isolation test is well-structured and covers the security-sensitive rename.
- ✅ Mirror path constant test is a neat guard against accidental path drift.
- ✅ Fan-out order test correctly uses three-element slices to catch off-by-one reordering.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 100 AIC · ⌖ 10.9 AIC · ⊞ 6.6K
| assert.Empty(t, extractOTLPEndpointDomain("${{ secrets.OTLP_ENDPOINT }}")) | ||
| } | ||
|
|
||
| func TestFormal_ExpressionProducesNoAllowlistEntry(t *testing.T) { |
There was a problem hiding this comment.
[/tdd] TestFormal_ExpressionProducesNoAllowlistEntry is a duplicate — line 104 of TestFormal_StaticDomainExtraction already asserts assert.Empty(t, extractOTLPEndpointDomain("${{ secrets.OTLP_ENDPOINT }}")). Having a standalone test for the same single assertion adds noise without adding coverage.
💡 Suggestion
Either remove TestFormal_ExpressionProducesNoAllowlistEntry entirely (the predicate is already covered), or remove the expression case from TestFormal_StaticDomainExtraction and keep this as the sole location for that invariant to make the predicate mapping unambiguous.
@copilot please address this.
| assert.Equal(t, []otlpEndpointEntry{{URL: "https://valid.example.com:4317"}}, collectAllOTLPEndpoints(frontmatter)) | ||
| } | ||
|
|
||
| func TestFormal_StringHeaderFormPreservedForNonSentry(t *testing.T) { |
There was a problem hiding this comment.
[/tdd] TestFormal_StringHeaderFormPreservedForNonSentry is redundant — the same assertion is already made at line 84 inside TestFormal_SentryAuthHeaderRewrite (assert.Equal(t, "Authorization="+authPlaceholder, normalizedNonSentryHeaders)). Duplicate formal predicates dilute clarity about what each test uniquely covers.
💡 Suggestion
Remove this function. The non-Sentry preservation invariant is already a named assertion in TestFormal_SentryAuthHeaderRewrite. If isolation is desired, remove the non-Sentry case from the Sentry test and keep this function as the canonical location instead.
@copilot please address this.
| headers := map[string]any{"z": "3", "a": "1", "m": "2"} | ||
| want := "a=1,m=2,z=3" | ||
|
|
||
| for range determinismTestIterations { |
There was a problem hiding this comment.
[/tdd] The determinism loop calls the same map object 10 times, but Go randomises map iteration order at map creation, not per iteration. So all 10 calls see the same internal order and the loop adds no additional coverage — it's the same assertion repeated.
💡 Suggestion
The actual invariant being tested is that normalizeOTLPHeadersForEndpoint sorts keys before joining. A clearer test would assert the sorted order directly, or — if you want to guard against future regressions where sorting is removed — create a fresh map in each iteration:
for range determinismTestIterations {
h := map[string]any{"z": "3", "a": "1", "m": "2"}
assert.Equal(t, "a=1,m=2,z=3", normalizeOTLPHeadersForEndpoint(h, "(example.com/redacted)"))
}Alternatively, rename the constant to mapKeyOrderIterations to signal intent more precisely, and add a comment explaining the Go randomisation behaviour being guarded against.
@copilot please address this.
| assert.Empty(t, normalizeOTLPHeadersForEndpoint(map[string]any{}, "https://example.com:4317")) | ||
| } | ||
|
|
||
| func TestFormal_InvalidIfMissingFallsBackToDefault(t *testing.T) { |
There was a problem hiding this comment.
[/tdd] TestFormal_InvalidIfMissingFallsBackToDefault mixes two abstraction levels in one function: lines 193-195 test normalizeOTLPIfMissingMode (unit), while lines 197-238 test injectOTLPConfig (integration). A failure here gives no signal about which layer broke.
💡 Suggestion
Split into two focused tests:
func TestFormal_InvalidIfMissingModesReturnEmpty(t *testing.T) {
for _, mode := range []string{"fail", "silent", "skip", "abort"} {
t.Run(mode, func(t *testing.T) {
assert.Empty(t, normalizeOTLPIfMissingMode(mode))
})
}
}
func TestFormal_InvalidIfMissingNotInjected(t *testing.T) {
// ...injectOTLPConfig assertions...
}Using t.Run per mode also surfaces which exact value caused the failure.
@copilot please address this.
| const authPlaceholder = "******" | ||
|
|
||
| func TestFormal_EndpointFormNormalization(t *testing.T) { | ||
| t.Run("string object and array normalize to ordered entries", func(t *testing.T) { |
There was a problem hiding this comment.
[/tdd] All three endpoint forms (string, object, array) are bundled inside one t.Run("string object and array normalize to ordered entries"). If the array form assertion fails, the test name doesn't tell you which form was the culprit.
💡 Suggestion
Give each form its own subtest for isolated, self-describing failures:
func TestFormal_EndpointFormNormalization(t *testing.T) {
t.Run("string form", func(t *testing.T) { ... })
t.Run("object form", func(t *testing.T) { ... })
t.Run("array form", func(t *testing.T) { ... })
t.Run("empty and absent", func(t *testing.T) { ... })
}Each subtest name then appears directly in the failure output, matching the spec predicate language.
@copilot please address this.
| "github.com/stretchr/testify/require" | ||
| ) | ||
|
|
||
| const determinismTestIterations = 10 |
There was a problem hiding this comment.
[/grill-with-docs] No spec predicate references appear in the file. The PR description explicitly maps each test to a named predicate in specs/otel-observability-spec.md (v0.4.0), but none of those references appear inside the test file itself. Without them, the traceability that makes a "formal spec" test suite valuable is lost as soon as someone reads the file without the PR context.
💡 Suggestion
Add a package-level or per-test doc comment linking to the spec section. For example:
// otel_observability_formal_test.go implements formal compatibility checks for
// specs/otel-observability-spec.md v0.4.0 — Level 1 and Level 2 predicates.
// Each TestFormal_* function corresponds to a named predicate in that document.Or per-test: // Spec: endpoint-form-normalization (v0.4.0 §3.1)
@copilot please address this.
| ) | ||
|
|
||
| const determinismTestIterations = 10 | ||
| const authPlaceholder = "******" |
There was a problem hiding this comment.
[/tdd] authPlaceholder = "******" uses the same 6-asterisk pattern that GitHub Actions masking produces for redacted secrets. Using the mask output as a test input conflates "placeholder token" with "masked output" and could produce confusing test failure messages if masking logic changes.
💡 Suggestion
Use a realistic, clearly-fake token value that cannot be confused with a masked secret:
const authPlaceholder = "test-auth-token-placeholder"This also makes Sentry rewrite assertions easier to read: "x-sentry-auth=test-auth-token-placeholder" clearly communicates the rename without implying the value was masked.
@copilot please address this.
| assert.Contains(t, validWorkflowData.Env, "warn") | ||
| } | ||
|
|
||
| func TestFormal_AbsentObservabilityProducesNoEndpoints(t *testing.T) { |
There was a problem hiding this comment.
[/tdd] TestFormal_AbsentObservabilityProducesNoEndpoints duplicates two of the three assertions already in the "empty and absent normalize to empty" subtest (lines 60-62): collectAllOTLPEndpoints(nil) and collectAllOTLPEndpoints(map[string]any{}). The only unique case is map[string]any{"observability": nil}.
💡 Suggestion
Move the unique case (observability: nil) into the existing subtest inside TestFormal_EndpointFormNormalization, and remove this function entirely. Or, conversely, absorb the existing cases here and remove the overlap from TestFormal_EndpointFormNormalization.
@copilot please address this.
| assert.Empty(t, extractOTLPEndpointDomain("${{ secrets.OTLP_ENDPOINT }}")) | ||
| } | ||
|
|
||
| func TestFormal_TopLevelHeadersApplyToStringFormOnly(t *testing.T) { |
There was a problem hiding this comment.
[/tdd] TestFormal_TopLevelHeadersApplyToStringFormOnly covers string form and object form but omits the array form. The spec predicate likely also implies that top-level headers should not be silently lost or duplicated when combined with array-form endpoints — that case is untested.
💡 Suggestion
Add a third case for array form:
arrayForm := map[string]any{
"observability": map[string]any{
"otlp": map[string]any{
"endpoint": []any{
map[string]any{"url": "(array.example.com/redacted)"},
},
"headers": "Authorization=" + authPlaceholder,
},
},
}
arrayEntries := collectAllOTLPEndpoints(arrayForm)
require.Len(t, arrayEntries, 1)
// Assert: top-level headers do NOT propagate to array-form entries
assert.Empty(t, arrayEntries[0].Headers)(Adjust the assertion to match actual spec behaviour if the intent differs.)
@copilot please address this.
| assert.Equal(t, "x-sentry-auth="+authPlaceholder+",X-Tenant=acme", normalizedSentryMixedHeaders) | ||
| } | ||
|
|
||
| func TestFormal_IfMissingPolicyValidation(t *testing.T) { |
There was a problem hiding this comment.
[/tdd] TestFormal_IfMissingPolicyValidation tests only the three happy-path values (error, WARN, Ignore) — this is exactly what TestFormal_InvalidIfMissingFallsBackToDefault (line 193) also verifies indirectly. The split creates a fragmented picture: valid modes here, invalid modes there, with no clear canonical location.
💡 Suggestion
Consolidate all normalizeOTLPIfMissingMode assertions — valid and invalid — into a single table-driven test:
func TestFormal_IfMissingPolicy(t *testing.T) {
cases := []struct{ input, want string }{
{"error", "error"}, {"WARN", "warn"}, {" Ignore ", "ignore"},
{"fail", ""}, {"silent", ""}, {"skip", ""}, {"abort", ""},
}
for _, tc := range cases {
t.Run(tc.input, func(t *testing.T) {
assert.Equal(t, tc.want, normalizeOTLPIfMissingMode(tc.input))
})
}
}@copilot please address this.
|
Great work on this formal test suite, The PR is focused (single file, single purpose), the description clearly explains what each test group covers, and the assertions are written against public compiler/runtime behavior rather than implementation internals — all exactly right. This looks ready for review. 🚀
|
There was a problem hiding this comment.
REQUEST_CHANGES — spec-test quality gaps undermine the "formal conformance" claim
The new test file adds genuine value, but several issues weaken it as a formal compatibility test suite: a fragile substring assertion that can pass for the wrong reason, a missing form in a predicate that explicitly claims full coverage, and duplicate tests that inflate apparent coverage. See inline comments.
Blocking issues (2)
- Fragile
assert.Contains(t, validWorkflowData.Env, "warn")(line 239) — use the full key-value pair to avoid false positives. TestFormal_TopLevelHeadersApplyToStringFormOnlymissing array form (line 137) — the invariant is untested for array-form endpoints, leaving a regression blind spot.
Non-blocking but worth fixing
TestFormal_ExpressionProducesNoAllowlistEntryis a complete duplicate (line 108).TestFormal_StringHeaderFormPreservedForNonSentryduplicates an assertion already inTestFormal_SentryAuthHeaderRewrite(line 184).- Bare
forloop withoutt.Runobscures which invalid mode failed (line 195); also missing coverage for"error"mode throughinjectOTLPConfig.
🔎 Code quality review by PR Code Quality Reviewer · 124.3 AIC · ⌖ 7.48 AIC · ⊞ 5.2K
| } | ||
|
|
||
| func TestFormal_ExpressionProducesNoAllowlistEntry(t *testing.T) { | ||
| assert.Empty(t, extractOTLPEndpointDomain("${{ secrets.OTLP_ENDPOINT }}")) |
There was a problem hiding this comment.
Exact duplicate of TestFormal_StaticDomainExtraction line 104 — adds zero coverage and silently inflates the spec-test count.
💡 Suggested fix
Either delete this function or replace with a genuinely distinct input not yet covered (e.g. a Sentry-named secret expression, a bare hostname without scheme, or a URL with user-info that should still extract the domain):
func TestFormal_ExpressionProducesNoAllowlistEntry(t *testing.T) {
// Sentry-style secret expression also produces no domain
assert.Empty(t, extractOTLPEndpointDomain(
"${{ secrets."+constants.OTELSentryEndpointSecretName+" }}",
))
}As-is, if someone deletes the shared assertion from TestFormal_StaticDomainExtraction during a refactor, this duplicate gives false confidence that the predicate is still covered.
| } | ||
| objectEntries := collectAllOTLPEndpoints(objectForm) | ||
| require.Len(t, objectEntries, 1) | ||
| assert.Equal(t, "X-Per-Entry=v", objectEntries[0].Headers) |
There was a problem hiding this comment.
Array form is not tested — the function name claims "string form only" but only verifies string and object forms; a regression where array-form entries incorrectly inherit the top-level headers field would go undetected.
💡 Suggested fix
Add an array-form assertion to complete the spec contract:
// Array form: per-entry headers take effect; top-level headers are ignored
arrayForm := map[string]any{
"observability": map[string]any{
"otlp": map[string]any{
"endpoint": []any{
map[string]any{
"url": "(first.example.com/redacted)",
"headers": "X-Per-Entry=v1",
},
},
"headers": "Authorization=" + authPlaceholder, // top-level — must NOT apply
},
},
}
arrayEntries := collectAllOTLPEndpoints(arrayForm)
require.Len(t, arrayEntries, 1)
assert.Equal(t, "X-Per-Entry=v1", arrayEntries[0].Headers)Without this, the invariant that top-level headers is strictly scoped to the backward-compat string form has a blind spot for array input.
| (&Compiler{}).injectOTLPConfig(validWorkflowData) | ||
| assert.Contains(t, validWorkflowData.Env, "GH_AW_OTLP_IF_MISSING") | ||
| assert.Contains(t, validWorkflowData.Env, "warn") | ||
| } |
There was a problem hiding this comment.
Substring "warn" is too broad — this assertion passes if any injected env var value contains the four letters warn, not specifically GH_AW_OTLP_IF_MISSING: warn.
💡 Suggested fix
Pin both the key and the value together so a future rename or value-format change doesn't silently pass this check:
assert.Contains(t, validWorkflowData.Env, "GH_AW_OTLP_IF_MISSING: warn")For example, if injectOTLPConfig were changed to emit GH_AW_OTLP_IF_MISSING_MODE: warn instead, the current assertion would still pass while the spec invariant is violated.
| func TestFormal_InvalidIfMissingFallsBackToDefault(t *testing.T) { | ||
| for _, mode := range []string{"fail", "silent", "skip", "abort"} { | ||
| assert.Empty(t, normalizeOTLPIfMissingMode(mode)) | ||
| } |
There was a problem hiding this comment.
Loop without subtests loses per-iteration failure context — when normalizeOTLPIfMissingMode fails for one of these modes, the failure message says only "expected empty string" with no indication of which mode triggered it.
💡 Suggested fix
for _, mode := range []string{"fail", "silent", "skip", "abort"} {
t.Run(mode, func(t *testing.T) {
assert.Empty(t, normalizeOTLPIfMissingMode(mode))
})
}This is also a good place to add "error" — normalizeOTLPIfMissingMode("error") returns "error" (a valid normalized value), but injectOTLPConfig only injects GH_AW_OTLP_IF_MISSING for "warn" and "ignore". Explicitly asserting that an "error" mode does not emit that env var would close the integration gap between this unit test and TestFormal_InvalidIfMissingFallsBackToDefault.
| "Authorization="+authPlaceholder, | ||
| normalizeOTLPHeadersForEndpoint("Authorization="+authPlaceholder, "https://otlp.example.com:4317"), | ||
| ) | ||
| } |
There was a problem hiding this comment.
Duplicate of normalizedNonSentryHeaders assertion in TestFormal_SentryAuthHeaderRewrite — this function's single assertion is already present on line 84 of the same file.
💡 Suggested fix
Remove TestFormal_StringHeaderFormPreservedForNonSentry unless it can test a distinct input (e.g., a multi-pair non-Sentry string, or a string with whitespace around =). The current duplication means a failure here also fails in TestFormal_SentryAuthHeaderRewrite, doubling noise without adding signal.
|
@copilot please run the
|
🤖 PR Triage
Score breakdown: Impact 20 + Urgency 8 + Quality 13 = 41 Assessment: Pure test additions implementing formal compatibility checks for the OTEL observability spec v0.4.0 (L1/L2). Adds 245-line test file covering all 15 modeled predicates + 44-line ADR. No production code changed. CI: 1 check passing. No existing reviews.
Run §28298156610
|
This PR implements the formal compatibility checks defined in
specs/otel-observability-spec.md(v0.4.0) for Level 1/2observability.otlpbehavior. It adds an explicit predicate-to-test mapping inpkg/workflowso key normalization, secrecy, policy, and fan-out invariants are continuously enforced.Formal predicate coverage (
pkg/workflow/otel_observability_formal_test.go)TestFormal_*coverage for all 15 modeled predicates:Authorization -> x-sentry-authrewrite isolationif-missingpolicy validation/default semanticsgh-aw/gh-aw.<sanitized-id>)/tmp/gh-aw/otel.jsonl)Contract-focused assertions
observability_otlp.gohelpers, with explicit assertions for compatibility invariants rather than implementation internals.