Skip to content

Home API Telemetry Queries

Run these against the imperfect-api Logfire project after #1196 is deployed. They cover API/server telemetry only; mobile Home First Content Shown remains owned by the mobile-side worker.

Plain Stream Opens

Daily latency and cache split for no-feedback POST /user_states/stream opens:

SELECT
  date_trunc('day', start_timestamp) AS day,
  attributes->>'cache_status' AS cache_status,
  attributes->>'stale_reason' AS stale_reason,
  count(*) AS requests,
  count(DISTINCT attributes->>'user_id') AS users,
  percentile_cont(0.50) WITHIN GROUP (
    ORDER BY (attributes->>'time_to_first_renderable_event_ms')::double precision
  ) AS p50_first_renderable_ms,
  percentile_cont(0.90) WITHIN GROUP (
    ORDER BY (attributes->>'time_to_first_renderable_event_ms')::double precision
  ) AS p90_first_renderable_ms,
  percentile_cont(0.95) WITHIN GROUP (
    ORDER BY (attributes->>'time_to_complete_ms')::double precision
  ) AS p95_complete_ms
FROM records
WHERE start_timestamp > now() - interval '7 days'
  AND message = 'user_states.stream.telemetry'
  AND attributes->>'has_feedback' = 'false'
  AND attributes->>'has_audio' = 'false'
  AND attributes->>'has_attachment' = 'false'
GROUP BY day, cache_status, stale_reason
ORDER BY day DESC, requests DESC;

GET Fallback

GET /user_states latency and cache split:

SELECT
  date_trunc('day', start_timestamp) AS day,
  attributes->>'cache_status' AS cache_status,
  attributes->>'stale_reason' AS stale_reason,
  count(*) AS requests,
  count(DISTINCT attributes->>'user_id') AS users,
  percentile_cont(0.50) WITHIN GROUP (
    ORDER BY (attributes->>'time_to_complete_ms')::double precision
  ) AS p50_complete_ms,
  percentile_cont(0.95) WITHIN GROUP (
    ORDER BY (attributes->>'time_to_complete_ms')::double precision
  ) AS p95_complete_ms
FROM records
WHERE start_timestamp > now() - interval '7 days'
  AND message = 'user_states.get.telemetry'
GROUP BY day, cache_status, stale_reason
ORDER BY day DESC, requests DESC;

ContextBuilder Subfetches

Break down all ContextBuilder subfetches by source, provider, data type, and status. Use this with the slow-open stream query above to identify which optional context families dominate cache-miss completion time.

SELECT
  attributes->>'source' AS source,
  attributes->>'provider' AS provider,
  attributes->>'data_type' AS data_type,
  attributes->>'status' AS status,
  count(*) AS fetches,
  percentile_cont(0.50) WITHIN GROUP (
    ORDER BY (attributes->>'duration_ms')::double precision
  ) AS p50_ms,
  percentile_cont(0.95) WITHIN GROUP (
    ORDER BY (attributes->>'duration_ms')::double precision
  ) AS p95_ms
FROM records
WHERE start_timestamp > now() - interval '7 days'
  AND message = 'context_builder.subfetch'
GROUP BY source, provider, data_type, status
ORDER BY p95_ms DESC NULLS LAST;

Disconnect Recovery

Count stream interruptions and recovery mode:

SELECT
  date_trunc('day', start_timestamp) AS day,
  attributes->>'outcome' AS outcome,
  attributes->>'client_disconnect_before_renderable' AS before_renderable,
  attributes->>'background_recovery_started' AS recovery_started,
  attributes->>'disconnect_retry_push_started' AS retry_push_started,
  count(*) AS requests
FROM records
WHERE start_timestamp > now() - interval '7 days'
  AND message = 'user_states.stream.telemetry'
  AND attributes->>'outcome' <> 'complete'
GROUP BY day, outcome, before_renderable, recovery_started, retry_push_started
ORDER BY day DESC, requests DESC;

Model Metadata

Streamed vs non-streamed board model usage:

SELECT
  date_trunc('day', start_timestamp) AS day,
  message,
  attributes->>'model' AS model,
  attributes->>'provider' AS provider,
  attributes->>'streamed' AS streamed,
  sum((attributes->>'input_tokens')::bigint) AS input_tokens,
  sum((attributes->>'output_tokens')::bigint) AS output_tokens,
  sum((attributes->>'cache_read_tokens')::bigint) AS cache_read_tokens,
  sum((attributes->>'cache_write_tokens')::bigint) AS cache_write_tokens,
  sum((attributes->>'request_count')::bigint) AS request_count
FROM records
WHERE start_timestamp > now() - interval '7 days'
  AND message IN (
    'User state agent response',
    'User state agent streamed response'
  )
GROUP BY day, message, model, provider, streamed
ORDER BY day DESC, message;