Propagate enterprise host context into curated DIFC/CLI proxy env#41912
Propagate enterprise host context into curated DIFC/CLI proxy env#41912Copilot wants to merge 2 commits into
Conversation
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes enterprise/data-residency (*.ghe.com) routing for the DIFC/CLI proxy by ensuring the proxy start steps receive the necessary upstream GitHub host context (previously dropped by the curated env), and by deriving tenant-aware upstream API/GraphQL/Copilot endpoints in the proxy start scripts.
Changes:
- Forward a curated set of upstream GitHub environment variables into both “Start DIFC Proxy” and “Start CLI Proxy” compiler-generated steps.
- Enhance DIFC/CLI proxy startup scripts to normalize the effective GitHub host and derive tenant-aware upstream endpoints for
*.ghe.com(while keeping explicit overrides when present). - Extend compiler tests and update the
pigolden to cover the new env propagation.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/testdata/TestWasmGolden_AllEngines/pi.golden | Updates golden workflow output to include enterprise host context variables for the CLI proxy start step. |
| pkg/workflow/compiler_difc_proxy.go | Adds a shared helper to emit a consistent curated upstream env for both DIFC and CLI proxy start steps. |
| pkg/workflow/compiler_difc_proxy_test.go | Adds assertions that the proxy start steps include the curated upstream env variables. |
| actions/setup/sh/start_difc_proxy.sh | Derives tenant-aware upstream endpoints and passes enterprise host context env vars into the DIFC proxy container. |
| actions/setup/sh/start_cli_proxy.sh | Derives tenant-aware upstream endpoints and passes enterprise host context env vars into the CLI proxy container. |
Review details
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 5/5 changed files
- Comments generated: 3
- Review effort level: Low
| export GH_HOST="${GH_HOST:-$github_host}" | ||
|
|
||
| if [ "$github_host" != "github.com" ]; then | ||
| export GITHUB_HOST="${GITHUB_HOST:-$github_host}" | ||
| export GITHUB_ENTERPRISE_HOST="${GITHUB_ENTERPRISE_HOST:-$github_host}" | ||
| fi |
| export GH_HOST="${GH_HOST:-$github_host}" | ||
|
|
||
| if [ "$github_host" != "github.com" ]; then | ||
| export GITHUB_HOST="${GITHUB_HOST:-$github_host}" | ||
| export GITHUB_ENTERPRISE_HOST="${GITHUB_ENTERPRISE_HOST:-$github_host}" | ||
| fi |
| derive_proxy_upstream_env() { | ||
| local server_url="${GITHUB_SERVER_URL:-https://github.com}" | ||
| local server_host | ||
| local github_host="${GH_HOST:-${GITHUB_HOST:-${GITHUB_ENTERPRISE_HOST:-}}}" | ||
|
|
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Test Quality Sentinel completed test quality analysis. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #41912 does not have the 'implementation' label and has only 35 new lines of code in business logic directories (threshold: 100). |
🧪 Test Quality Sentinel Report
📊 Metrics & Test Classification (2 tests analyzed)
A new helper Note on edge case coverage (0%): The score is penalised for lack of error-path assertions, but this is appropriate for the change: the PR fixes env-var propagation, and the tests correctly verify the positive contract (all 7 env vars are present). No error-path testing was needed. Consider adding a negative test (e.g. assert the old missing vars no longer cause misrouting) if relevant scenarios exist. Verdict
|
Review Summary7 inline comments posted — Top issues to address:
The core fix (propagating 7 env vars into proxy startup) is correct and the Go-level tests are well-written. The issues above are about hardening the shell-side runtime logic. @copilot please address the review comments above.
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /improve-codebase-architecture, /diagnose, and /tdd — requesting changes on DRY violations, a concrete host-parsing bug, and missing shell-level test coverage.
📋 Key Themes & Highlights
Key Themes
- DRY violation (high):
normalize_github_host()now exists in 3 separate files (configure_gh_for_ghe.sh,start_cli_proxy.sh,start_difc_proxy.sh) andderive_proxy_upstream_env()is duplicated between both proxy scripts. A shared library would make future fixes safe and single-point. - Port-stripping bug (medium):
normalize_github_hostdoes not strip port numbers, so(ghes.example.com/redacted)→ghes.example.com:8443, which breaks*.ghe.compattern matching and produces an invalidGH_HOST. - Silent GH_HOST override (medium): When
GITHUB_SERVER_URLpoints to a non-github.comhost, an explicitly-passedGH_HOST=github.comis quietly replaced — surprising behaviour that could mask misconfiguration. - Missing shell unit tests (medium): The 5-branch derivation logic in
derive_proxy_upstream_envhas no shell-level tests. The existingconfigure_gh_for_ghe_test.shdemonstrates exactly how to do this. - Copilot URL gap (low):
GITHUB_COPILOT_BASE_URLis only auto-derived for*.ghe.com; the asymmetry vsGITHUB_API_URL/GITHUB_GRAPHQL_URL(which both have GHES fallbacks) is undocumented.
Positive Highlights
- ✅ Correct fix: forwarding 7 enterprise host context variables into proxy startup is exactly the right approach
- ✅
writeProxyUpstreamEnv()is a clean abstraction that applies consistently to both DIFC and CLI proxy steps - ✅ Go-level regression tests are well-structured and immediately verify the compile-time env shape
- ✅ Updated pi.golden clearly documents the new generated output
- ✅ Good use of
${VAR:-}safe-expansion throughout to avoidset -efailures on unset vars
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 96.1 AIC · ⌖ 10.6 AIC · ⊞ 6.6K
|
|
||
| set -e | ||
|
|
||
| normalize_github_host() { |
There was a problem hiding this comment.
[/improve-codebase-architecture] normalize_github_host() is now a third copy of the same function — identical implementations already live in configure_gh_for_ghe.sh and start_difc_proxy.sh. Drift between copies is a real risk here.
💡 Suggested fix
Extract the shared logic into a sourced library (e.g., actions/setup/sh/proxy_env_lib.sh) and replace all three copies with a single source call. The existing configure_gh_for_ghe_test.sh also inlines a fourth copy for testing, so a shared lib + one proxy_env_lib_test.sh would consolidate all instances.
@copilot please address this.
| echo "$host" | ||
| } | ||
|
|
||
| derive_proxy_upstream_env() { |
There was a problem hiding this comment.
[/improve-codebase-architecture] derive_proxy_upstream_env() is duplicated verbatim between start_cli_proxy.sh and start_difc_proxy.sh. Any future fix to the host-resolution logic must be applied in two places, which will be easy to miss.
💡 Suggested fix
Move both normalize_github_host() and derive_proxy_upstream_env() into a shared library script (e.g., actions/setup/sh/proxy_env_lib.sh) and source it in both proxy scripts:
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=./proxy_env_lib.sh
source "${SCRIPT_DIR}/proxy_env_lib.sh"This also gives a natural home for a proxy_env_lib_test.sh that can test derive_proxy_upstream_env in isolation (see the /tdd comment on line 32).
@copilot please address this.
|
|
||
| server_url="${server_url%/}" | ||
| server_host="$(normalize_github_host "$server_url")" | ||
| if [ -z "$github_host" ] || { [ "$server_host" != "github.com" ] && [ "$github_host" = "github.com" ]; }; then |
There was a problem hiding this comment.
[/diagnose] This condition silently overrides an explicitly-set GH_HOST=github.com when GITHUB_SERVER_URL points to a non-github.com host. If a caller deliberately passes GH_HOST=github.com (e.g., for a specific routing reason), this will quietly replace it with the server URL hostname.
💡 Explanation & suggested change
Current logic:
if [ -z "$github_host" ] || { [ "$server_host" != "github.com" ] && [ "$github_host" = "github.com" ]; }; then
github_host="$server_host"
fiThe second branch (server_host != github.com AND github_host == github.com) fires even when GH_HOST was explicitly provided by the caller. The intent seems to be "if the host resolved from the env looks like a stale default, prefer the server URL" — but "stale default" and "explicit override" are indistinguishable here.
Consider narrowing the condition to only override when github_host came from a lower-priority var than GITHUB_SERVER_URL, or document clearly that GITHUB_SERVER_URL takes precedence over GH_HOST when they disagree.
@copilot please address this.
| if [[ "$host" =~ ^https?:// ]]; then | ||
| host="${host#http://}" | ||
| host="${host#https://}" | ||
| host="${host%%/*}" |
There was a problem hiding this comment.
[/diagnose] normalize_github_host does not strip port numbers. A URL like (ghes.example.com/redacted) produces ghes.example.com:8443, which then gets exported as GH_HOST and used to pattern-match *.ghe.com. This would misroute GHES on non-standard ports.
💡 Suggested fix
Add a port strip after the path strip:
normalize_github_host() {
local host="$1"
host="${host%/}"
if [[ "$host" =~ ^https?:// ]]; then
host="${host#(redacted)
host="${host#https://}"
host="${host%%/*}"
fi
host="${host%%:*}" # strip port if present
echo "$host"
}Also add a test case in the test suite:
["(ghes.example.com/redacted)"]="ghes.example.com"@copilot please address this.
| if [ -z "${GITHUB_COPILOT_BASE_URL:-}" ] && [[ "$github_host" == *.ghe.com ]]; then | ||
| export GITHUB_COPILOT_BASE_URL="https://copilot-api.${github_host}" | ||
| fi | ||
| } |
There was a problem hiding this comment.
[/tdd] derive_proxy_upstream_env() contains 5 conditional branches (empty host, *.ghe.com, GHES fallback, public GitHub, override guard) with no shell-level unit tests. The Go tests only verify the compiler emits the env var names; they do not exercise the runtime derivation logic.
💡 Suggested tests
Following the configure_gh_for_ghe_test.sh pattern, add a start_cli_proxy_test.sh (or better, a proxy_env_lib_test.sh after extracting to a shared lib) with cases like:
# Case: *.ghe.com tenant — API URL should be tenant-scoped
GITHUB_SERVER_URL="(myorg.ghe.com/redacted)" derive_proxy_upstream_env
assert_eq "$GITHUB_API_URL" "(api.myorg.ghe.com/redacted)"
assert_eq "$GITHUB_COPILOT_BASE_URL" "(copilotapi.myorg.ghe.com/redacted)"
# Case: GHES custom host — API URL should use /api/v3
GITHUB_SERVER_URL="(ghes.example.com/redacted)" derive_proxy_upstream_env
assert_eq "$GITHUB_API_URL" "(ghes.example.com/redacted)
# Case: public GitHub — should not override
GITHUB_SERVER_URL="https://github.com" derive_proxy_upstream_env
assert_eq "$GITHUB_API_URL" "https://api.github.com"@copilot please address this.
| fi | ||
| fi | ||
|
|
||
| if [ -z "${GITHUB_COPILOT_BASE_URL:-}" ] && [[ "$github_host" == *.ghe.com ]]; then |
There was a problem hiding this comment.
[/diagnose] GITHUB_COPILOT_BASE_URL is only auto-derived for *.ghe.com tenants, not for GHES (non-*.ghe.com). This creates asymmetric behaviour: GITHUB_API_URL and GITHUB_GRAPHQL_URL both have GHES fallbacks (${server_url}/api/v3, ${server_url}/api/graphql), but Copilot does not.
💡 Suggested change
If GHES also exposes a Copilot endpoint, add a fallback. If it is not yet supported, document the gap explicitly with a comment:
# NOTE: GITHUB_COPILOT_BASE_URL is not auto-derived for GHES (non-*.ghe.com).
# GHES Copilot endpoint is not yet standardised; callers must set this explicitly.
if [ -z "${GITHUB_COPILOT_BASE_URL:-}" ] && [[ "$github_host" == *.ghe.com ]]; then
export GITHUB_COPILOT_BASE_URL="(copilotapi/redacted)${github_host}"
fiEither way, making the intent explicit prevents future confusion.
@copilot please address this.
| "github.com/stretchr/testify/require" | ||
| ) | ||
|
|
||
| func assertProxyCuratedUpstreamEnv(t *testing.T, result string) { |
There was a problem hiding this comment.
[/tdd] assertProxyCuratedUpstreamEnv uses only assert.Contains checks, which cannot detect duplicate env keys or verify that vars belong to a specific step. If GITHUB_SERVER_URL were accidentally written twice (once by writeProxyUpstreamEnv and once by a lingering call site), or if env vars leaked into the wrong step, these assertions would still pass.
💡 Suggested improvement
For the most critical fields, count occurrences or verify exact substrings with surrounding context:
func assertProxyCuratedUpstreamEnv(t *testing.T, result string) {
t.Helper()
assert.Contains(t, result, "GITHUB_SERVER_URL: ${{ github.server_url }}")
// Ensure exactly one occurrence to catch accidental duplication
assert.Equal(t, 1, strings.Count(result, "GITHUB_SERVER_URL:"), "GITHUB_SERVER_URL should appear exactly once")
assert.Equal(t, 1, strings.Count(result, "GITHUB_API_URL:"), "GITHUB_API_URL should appear exactly once")
// ... etc.
}This is a low-friction improvement that closes the "duplicate field" regression class.
@copilot please address this.
🤖 PR Triage
Score breakdown: Impact 38 + Urgency 22 + Quality 16 = 76 Assessment: Production fix for GHE tenants on Run §28298156610
|
On
*.ghe.comdata-residency tenants, the compiler started the mcpg DIFC proxy with a curated env that dropped the enterprise host context. The proxy then defaulted togithub.com/api.github.com, causing forwardedgh apicalls to fail before the agent could start.Compiler: forward enterprise host context into proxy startup
GH_HOSTGITHUB_HOSTGITHUB_ENTERPRISE_HOSTGITHUB_SERVER_URLGITHUB_API_URLGITHUB_GRAPHQL_URLGITHUB_COPILOT_BASE_URLStart DIFC ProxyStart CLI ProxyProxy startup scripts: derive tenant-aware endpoints
GITHUB_SERVER_URL/ host env.*.ghe.com, derive tenant-specific upstreams instead of falling back to public GitHub:https://api.<tenant>.ghe.comhttps://api.<tenant>.ghe.com/graphqlhttps://copilot-api.<tenant>.ghe.comRegression coverage
pigolden so the CLI proxy path exercises the new env propagation.Example of the generated proxy startup env: