Route-Share E2E Harness

Issue: #1051. Parent: #1041.

The reset is harness-first: local and preview route-share checks should fail before a human has to manually test WhatsApp or Garmin. The harness has two layers: contract checks for fixture/URL inputs, and a local live driver that uses the fake chat + fake Garmin devstack.

Launch gates and the fake-chat boundary are tracked in docs/route-share-launch-gates.md.

  • every route option is a separate replyable link preview, not a heavy media payload or final picker
  • each option bubble carries exactly one visible route URL; bare URL bodies are accepted only when the route presentation packet says delivery.mode = "bare_url", and "label: URL"-style synthesized copy is always rejected
  • channel link-preview descriptions include stats plus route prose; stats-only, prose-only, and request-echo descriptions fail before deploy
  • final route summaries do not require lettered selection unless the original user prompt explicitly asked for lettered options
  • public manifests do not include legacy SVG/PNG route artifacts, duplicated route geometry, signed URLs, or backend/provenance fields
  • public manifests expose sanitized route-page context (request_paraphrase) and sibling alternatives as public /r/{share_id}/{slug} links, without reviving the old sidecar/context reader
  • route-page URLs use the public share_id, never the internal route/content hash
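
The option-bubble rules above can be sketched as a small check. This is a minimal illustration, not the harness's actual implementation: the function name check_option_body, the regular expressions, and the 40-character label limit are assumptions made for the example. Only the delivery.mode = "bare_url" rule and the /r/{share_id}/{slug} URL shape come from the contract list.

```python
import re
from typing import List, Optional

# Hypothetical patterns for illustration; the real harness may match differently.
URL_RE = re.compile(r"https?://\S+")
ROUTE_PAGE_RE = re.compile(r"^https?://[^/\s]+/r/[^/\s]+/[^/\s]+$")
LABEL_URL_RE = re.compile(r"^[^\n:]{1,40}:\s*https?://\S+\s*$")


def check_option_body(body: str, delivery_mode: Optional[str] = None) -> List[str]:
    """Return contract violations for one route-option bubble (empty list = pass)."""
    urls = URL_RE.findall(body)
    if len(urls) != 1:
        return [f"expected exactly one visible URL, found {len(urls)}"]

    problems = []
    url = urls[0]
    stripped = body.strip()
    if not ROUTE_PAGE_RE.match(url):
        problems.append("URL is not a public /r/{share_id}/{slug} route page")
    # A body that is nothing but the URL needs explicit bare_url delivery.
    if stripped == url and delivery_mode != "bare_url":
        problems.append("bare URL body without delivery.mode = 'bare_url'")
    # "Route A: https://..." style copy is rejected regardless of delivery mode.
    if LABEL_URL_RE.match(stripped):
        problems.append("'label: URL' synthesized copy")
    return problems
```

For example, check_option_body("https://imperfect.co/r/abc123/park-loop") reports the bare-URL violation, while the same body with delivery_mode="bare_url" passes.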

Contract checkpoint command:

imperfect-cli route-share-e2e \
  --repeat 25 \
  --manifest-json /path/to/manifest.json \
  --route-options-json /path/to/route-options.json \
  --artifact-dir tmp/route-share-e2e

The same command accepts a preview or local manifest URL. Preview mode requires explicit service URLs so the run is reproducible from logs:

imperfect-cli route-share-e2e \
  --target preview \
  --manifest-url https://imperfect.co/r/<share_id>/manifest.json \
  --imperfect-api-base-url https://api-preview.example \
  --cheshire-base-url https://cheshire-preview.example \
  --alice-base-url https://alice-preview.example \
  --looking-glass-base-url https://web-preview.example \
  --footman-base-url https://footman-preview.example \
  --fake-chat-base-url https://chat-preview.example \
  --fake-garmin-base-url https://garmin-preview.example

Live mode is intentionally separate from fixture mode:

imperfect-cli route-share-e2e --live \
  --prompt "Create three scenic trail running routes from the Conservatory of Flowers in Golden Gate Park, San Francisco, about 5 miles, starting and ending there." \
  --artifact-dir tmp/route-share-e2e-live

--live rejects --manifest-json, --manifest-url, and --route-options-json. Route options must come from the fake-chat live runner after a real prompt. The quoted route page is fetched as HTML, and the manifest is fetched from that same generated page identity. The page fetch uses a WhatsApp-like crawler user agent, then verifies Open Graph title/description metadata plus the owned first-party preview.jpg response as a bounded JPEG.

The built-in live driver currently supports only --target localhost. It seeds a local harness user, posts the fake inbound WhatsApp webhook, drains the real channel workers, captures each outbound route option's platform message id, quotes the configured option, follows fake Garmin OAuth, and checks resend idempotency.

Local live evidence distinguishes replied_to_share_id from selected_share_id: the runner quotes a specific outbound option, then the BFF must resolve that quote to the same route page. The default live scenario replies to the second generated option so a fallback to "first/latest route" cannot pass by accident.
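
The quote-resolution rule can be expressed as a short assertion helper. This is a sketch under assumptions: the field names replied_to_share_id and selected_share_id come from the text above, but the flat evidence dict, the option_share_ids list, and the helper name are hypothetical.

```python
from typing import Dict, List


def check_quote_resolution(
    evidence: Dict[str, str],
    option_share_ids: List[str],
    quoted_index: int = 1,  # default scenario quotes the second option
) -> None:
    """Raise AssertionError if the quoted reply did not resolve to the quoted route."""
    expected = option_share_ids[quoted_index]
    replied = evidence["replied_to_share_id"]
    selected = evidence["selected_share_id"]

    # The runner must have quoted the option it was configured to quote.
    if replied != expected:
        raise AssertionError(f"runner quoted {replied!r}, expected {expected!r}")
    # The BFF must resolve the quote to the same route page, not first/latest.
    if selected != replied:
        raise AssertionError(f"BFF selected {selected!r} for a reply to {replied!r}")
```

Because the default index is 1, a backend that silently falls back to the first option produces a selected_share_id that differs from replied_to_share_id and the check fails.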

Run it against a booted local stack. Keep the devstack attached in one terminal and run the live harness from a second terminal; do not use --exit-after-smoke for live harness runs because it tears the stack down.

imperfect-cli route-share-devstack up --smoke
imperfect-cli route-share-e2e --live \
  --prompt "Create three scenic trail running routes from the Conservatory of Flowers in Golden Gate Park, San Francisco, about 5 miles, starting and ending there." \
  --artifact-dir tmp/route-share-e2e-live

When the live loop fails, --artifact-dir writes route-share-e2e-error.json; the error string should contain enough recent fake-chat, delivery, route-reference, route-page, or fake-Garmin state to keep debugging on the product contract instead of sending a human back to manual WhatsApp or Garmin testing.

Local live mode uses harness user route-share-e2e@imperfect.local and WhatsApp channel id +15555550100. Each run resets fake chat, fake Garmin, and prior channel/route artifacts for that harness identity before posting /whatsapp_webhooks, capturing message references, following fake Garmin OAuth, and resending the same route to verify idempotency.

The product loop under test is:

chat prompt -> replyable route-option links -> quoted option reply -> route page renders -> manifest validates -> route-send API start/callback -> fake Garmin auth -> success -> duplicate send/idempotency check.

Artifact shape: normal --live runs preserve the historical report["live"] = <single live evidence object> shape. With --live --stress-all-fake-garmin, report["live"] is a matrix wrapper: {"scenario_count": N, "runs": [<live evidence object>, ...]}. Each fixture or live report's route_options.transcript preserves the channel-visible route-option body text, link-preview metadata, route context, platform/message request ids, route presentation delivery mode when present, and final summary text. The built-in live driver captures that final summary from fake chat before reducing the run to status and message-count metadata.
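
A consumer of the report can handle both shapes with one small normalizer. This is a minimal sketch: the helper name live_runs is hypothetical, and it assumes the matrix wrapper is recognizable by its "scenario_count" and "runs" keys, as described above.

```python
from typing import Any, Dict, List


def live_runs(report: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the live evidence objects from a report as a flat list."""
    live = report.get("live")
    if live is None:
        # Fixture-mode reports are assumed to carry no live evidence.
        return []
    if isinstance(live, dict) and "runs" in live and "scenario_count" in live:
        # --live --stress-all-fake-garmin: matrix wrapper around many runs.
        return list(live["runs"])
    # Plain --live: the historical single-evidence shape.
    return [live]
```

Downstream tooling can then iterate over live_runs(report) without branching on whether the run used --stress-all-fake-garmin.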

Real Garmin Smoke

The localhost live harness intentionally uses fake Garmin so it can run quickly and deterministically in CI. It proves route-option fan-out, quoted-reply disambiguation, route-page rendering, manifest shape, and Alice idempotency. It does not prove Garmin Connect's real course URL behavior: Alice may emit a /modern/course/{id} URL while the browser canonicalizes to /app/course/{id}, and reused idempotency rows can point at stale or deleted course ids.

Run the real-provider smoke after a green localhost loop when validating the public provider path. Use local Alice, not prod Alice: start alice serve from ~/code/alice so its ngrok callback and Garmin dev credentials are active, complete alice garmin login if the local token has expired, then run the probe against the generated route page:

GARMIN_USER_ID=$(python3 - <<'PY'
from pathlib import Path
for line in Path.home().joinpath("code/alice/.env").read_text().splitlines():
    if line.startswith("GARMIN_USER="):
        print(line.split("=", 1)[1].strip())
        break
PY
)

imperfect-cli route-garmin-probe \
  http://127.0.0.1:3000/r/<share_id>/<slug> \
  --garmin-user-id "$GARMIN_USER_ID" \
  --alice-base-url http://127.0.0.1:8766 \
  --route-base-url http://127.0.0.1:3000 \
  --environment local \
  --no-dry-run \
  --availability-timeout-seconds 60 \
  --android-app-verify \
  --android-device-serial emulator-5554 \
  --android-artifact-dir tmp/garmin-android-probe

Success means Alice returns a Garmin course_id, the course URL probe stops returning bad states (404 or 5xx), and the emitted URL form is recorded for comparison against the logged-in Chrome page. With --android-app-verify, the same returned course_url is opened in the installed, logged-in Garmin Connect Android app via adb; the command stores a UI dump and screenshot, then requires the course title and activity label to be visible. A 403 from an anonymous public probe is still useful: it signals that the course exists but is auth-gated. Treat this smoke as real course creation plus app-open evidence, not proof that Garmin's app-native OAuth consent works for the local dev client.
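
The status and URL-form rules can be sketched as follows. This is an illustration only, not the probe's implementation: the function names, the "available" and "unknown" labels, and the treatment of 2xx/3xx responses are assumptions; the 404/5xx and 403 handling and the /modern/course/{id} versus /app/course/{id} forms come from the text above.

```python
import re
from typing import Optional, Tuple

COURSE_PATH_RE = re.compile(r"/(modern|app)/course/([^/?#]+)")


def course_url_form(url: str) -> Optional[Tuple[str, str]]:
    """Return (form, course_id) for a course URL, or None if it is not one."""
    match = COURSE_PATH_RE.search(url)
    return (match.group(1), match.group(2)) if match else None


def same_course(emitted_url: str, browser_url: str) -> bool:
    """True when two URLs name the same course id, whatever their form."""
    a, b = course_url_form(emitted_url), course_url_form(browser_url)
    return a is not None and b is not None and a[1] == b[1]


def classify_probe_status(status: int) -> str:
    """Map an anonymous course URL probe status to a coarse signal."""
    if status == 404 or 500 <= status <= 599:
        return "bad"  # missing course or provider error: keep waiting or fail
    if status == 403:
        return "auth_gated"  # course exists but needs a logged-in session
    if 200 <= status < 400:
        return "available"  # assumption: 2xx/3xx count as reachable
    return "unknown"
```

Comparing by course id rather than by full URL keeps the check stable when the browser canonicalizes the emitted form.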

The #1052 local devstack entry point is:

imperfect-cli route-share-devstack check
imperfect-cli route-share-devstack up --smoke
imperfect-cli route-share-devstack up --smoke --exit-after-smoke

The devstack entry point records the service URL table, boots local infra, starts fake Garmin and fake chat, and validates the fake-chat route-option link-preview contract rather than a retired two-turn fixture. Live generated Cheshire output remains part of the full #1051 Playwright loop.

Stress Matrix

Implemented in this harness:

  • --repeat N reloads fixture inputs and reruns the harness contracts N times without manual reset.
  • --artifact-dir writes route-share-e2e-report.json on success and route-share-e2e-error.json on harness failures.
  • --stress-all-fake-garmin cycles the fake Garmin scenario matrix in fixture mode and live mode: success, denied, permission missing, callback error, timeout-after-create, retry, already-sent, revoked, and disconnected.
  • Fake WhatsApp route-option payload checks fail on heavy media, require separate replyable link previews, and reject preview-duplicate option text. URL-only option text passes only with explicit bare_url presentation delivery.
  • Reset route-bundle checks fail on SVG/high PNG-like assets, manifest geometry, signed URLs, and provider/backend/provenance leakage; GeoJSON remains the expected source of truth outside the manifest.
  • Manifest reports expose whether request context is present and how many sibling alternatives were included, without logging the paraphrase text.
  • Live route-page checks fetch the public page with a WhatsApp-like user agent, require usable Open Graph title/description/image metadata, and record the preview JPG content type and byte count in the success artifact.
  • Open Graph copy gates reject route titles with trailing distance/unit suffixes, stats-only descriptions, and descriptions that duplicate the manifest request_paraphrase after normalization.
  • Channel transcript gates reject link-preview descriptions that omit stats or prose, reject request echoes when a presentation packet supplies the request paraphrase, and reject final summary picker copy unless the user asked for lettered options.
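
The description gates in the list above could look roughly like this. Everything here is a hedged sketch: the unit list, the three-word prose threshold, the normalization, and the function names are assumptions chosen for illustration, and the real gates may detect stats and prose differently.

```python
import re
from typing import List, Optional

# Hypothetical stat detector: a number followed by a distance/elevation unit.
STAT_RE = re.compile(r"\d+(?:\.\d+)?\s*(?:miles|mi|km|ft|m)\b", re.IGNORECASE)


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace for echo comparison."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def check_description(description: str, request_paraphrase: Optional[str] = None) -> List[str]:
    """Return gate failures for one link-preview description (empty list = pass)."""
    problems = []
    # Prose is whatever wording remains once the stats are removed.
    prose_words = re.findall(r"[A-Za-z]{3,}", STAT_RE.sub(" ", description))
    if not STAT_RE.search(description):
        problems.append("missing stats")  # prose-only description
    if len(prose_words) < 3:
        problems.append("missing prose")  # stats-only description
    if request_paraphrase and normalize(description) == normalize(request_paraphrase):
        problems.append("request echo")
    return problems
```

A description such as "5.1 mi, 420 ft gain. Rolling dirt paths past the lakes and back." passes, while "5.1 mi, 420 ft" alone fails as stats-only.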

Still to wire on top of this local loop:

  • Playwright screenshots/artifacts for each failed browser step.
  • Preview-mode live driving after the local loop is stable.