AI-Powered Testing Platform

6-Agent System for Legacy Modernization

Vertex AI · Claude Sonnet 4.6 · Playwright · FastAPI · Postgres + pgvector · React · Vite · Tailwind · Cloud Run · Cloud SQL · Identity-Aware Proxy · Workload Identity Federation
5/6
Agents Closed-Loop
9,420
Indexed Symbols
148
Specs / Run
AI-Powered Testing Platform — Dashboard View
AI-Powered Testing Platform — Architecture

Overview

A 6-agent system that keeps legacy systems safe to change. Five agents are shipped and one is designed but not yet built:

CODEX (codebase intelligence with 9,420 indexed symbols and 8 drift detectors) reads the existing code.
KNOWLEDGE (typed extraction across 704 sessions, with plain-language search) captures tribal knowledge from retiring SMEs before they walk out the door.
TESTGEN (Playwright generation at v2, plus an auto-confirm policy moving 146 drafts/day) turns that knowledge into executable specs.
GUARDIAN (regression watch with a queued-runner pattern) runs the specs and flags drift.
OPERATOR (the orchestrator, acting as Mounika's chief of staff) synthesizes cross-agent state into a daily briefing with actionable proposals.
INTEGRATION (Phase F) is designed but not built.

Privacy guardrails are verified on real data. A two-stage PHI classifier (heuristic short-circuit, then an LLM second pass) held 321 files for human review. After-the-fact audits found 11 leaks that the heuristic missed and the LLM caught; those files were purged, and the routing config was updated so they SKIP on re-ingest.

The architecture is multi-tenant from day one: RLS on every table keyed by tenant_id, per-agent service accounts, IAP gating, and Workload Identity Federation (no service-account JSON keys). UCR is tenant 1; ACHP onboards in Phase G.
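The two-stage PHI check described above can be sketched roughly as follows. This is a minimal illustration, not the platform's implementation: the regex patterns, the `Decision` type, and the `llm_flags_phi` callable are all hypothetical stand-ins, and the real heuristic rules and LLM prompt are not given here.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical heuristic patterns (assumption: the real rules are not published).
PHI_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                     # SSN-like number
    re.compile(r"\bMRN[:\s]*\d{6,}\b", re.I),                 # medical record number
    re.compile(r"\bDOB[:\s]*\d{1,2}/\d{1,2}/\d{4}\b", re.I),  # date of birth
]


@dataclass
class Decision:
    action: str      # "HOLD" (send to human review) or "INGEST"
    decided_by: str  # which stage made the final call: "heuristic" or "llm"


def classify(text: str, llm_flags_phi: Callable[[str], bool]) -> Decision:
    """Run the cheap heuristic first; only call the LLM when it finds nothing."""
    # Stage 1: a heuristic hit short-circuits, so the LLM is never invoked.
    if any(p.search(text) for p in PHI_PATTERNS):
        return Decision("HOLD", "heuristic")
    # Stage 2: the LLM second pass looks for leaks the patterns cannot express.
    if llm_flags_phi(text):
        return Decision("HOLD", "llm")
    return Decision("INGEST", "llm")
```

In this sketch the LLM stage is injected as a plain callable, so a real model client could be swapped in without changing the control flow. A file that the heuristic misses but the second pass flags (the kind of case behind the 11 catches) comes back as `Decision("HOLD", "llm")`.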

Impact & Results

5/6
Agents Closed-Loop
CODEX · KNOWLEDGE · TESTGEN · GUARDIAN · OPERATOR shipped
9,420
Indexed Symbols
CODEX codebase intelligence with 8 drift detectors
146 / day
Auto-Confirm Rate
drafts moved out of SME backlog (38% inbox clearance)
148 specs
Latest GUARDIAN Run
144 passed in 22 minutes against UCR QAT
11
PHI Leak Catches
missed by heuristic, caught by LLM second-pass, purged
1 → N
Tenants
UCR is tenant 1; ACHP onboards in Phase G
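The auto-confirm figure above comes from a pattern-match policy with an audit trail. The sketch below shows one way such a policy could look; the rule names, the regexes, and the `Draft` and `AuditLog` types are hypothetical, since the actual rule set is not listed here.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical allow-list: drafts whose title matches a rule are treated as low-risk.
AUTO_CONFIRM_PATTERNS = {
    "login-smoke": re.compile(r"^login .* succeeds$", re.I),
    "nav-menu": re.compile(r"^navigate to .* via hover menu$", re.I),
}


@dataclass
class Draft:
    id: int
    title: str
    status: str = "draft"


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, draft_id: int, rule: str) -> None:
        # Every automatic decision is logged with the rule that triggered it.
        self.entries.append({
            "draft_id": draft_id,
            "rule": rule,
            "at": datetime.now(timezone.utc).isoformat(),
        })


def auto_confirm(drafts: list, audit: AuditLog) -> int:
    """Confirm drafts that match an allow-listed pattern; return how many moved."""
    confirmed = 0
    for draft in drafts:
        if draft.status != "draft":
            continue
        for rule, pattern in AUTO_CONFIRM_PATTERNS.items():
            if pattern.search(draft.title):
                draft.status = "confirmed"
                audit.record(draft.id, rule)
                confirmed += 1
                break  # first matching rule wins
    return confirmed
```

Drafts that match no rule keep their `draft` status and stay in the SME review inbox, which is the conservative default for this kind of policy.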

Key Features

6 specialized agents: CODEX (codebase intelligence + 8 drift detectors), KNOWLEDGE (typed SME extraction), TESTGEN (Playwright generation), GUARDIAN (regression watch), OPERATOR (orchestrator), INTEGRATION (designed)
TESTGEN v2 prompt iteration after live QAT runs: 208 drafts at 98.1% compile-clean, generated under sequential-workflow and UCR hover-menu navigation discipline
Auto-confirm policy moves 146 drafts/day out of SME review backlog via pattern-match — 38% inbox clearance without SME involvement, fully audited
GUARDIAN queued-runner pattern — browser "Trigger run" button writes to Postgres work queue; polling runner claims jobs and executes Playwright (~35s end-to-end)
OPERATOR deterministic rule engine reads cross-agent state and produces a prioritized daily briefing for Mounika, plus Approve/Reject proposals. It closes the "who orchestrates the orchestrator?" gap that breaks 5-specialist fleets at 20 tenants
Two-stage PHI privacy guardrails (heuristic short-circuit + LLM second-pass) validated on real data — 321 files held for review, 11 missed leaks caught + purged + routing updated
Multi-tenant from day one — RLS keyed by tenant_id, per-agent service accounts, IAP gating, Workload Identity Federation (no service-account JSON keys)
Stakeholder hub with open-questions surface — Rikki, Margretta, Antonio, Michelle answer tribal-knowledge questions async; 50 of 77 resolved
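The GUARDIAN queued-runner pattern listed above (a button writes a job row, a polling runner claims and executes it) can be sketched as below. To keep the example runnable on its own, it uses Python's built-in `sqlite3` as a stand-in for Postgres. The table name, columns, and function names are hypothetical. In Postgres, the claim step would typically use `SELECT ... FOR UPDATE SKIP LOCKED` so that concurrent runners never block on, or double-claim, the same row.

```python
import sqlite3

# Hypothetical schema for the work queue (stand-in for the Postgres table).
SCHEMA = """
CREATE TABLE run_queue (
    id         INTEGER PRIMARY KEY,
    suite      TEXT NOT NULL,
    status     TEXT NOT NULL DEFAULT 'queued',
    claimed_by TEXT
)
"""


def enqueue(db, suite):
    """What the 'Trigger run' button would do: insert one queued job."""
    cur = db.execute("INSERT INTO run_queue (suite) VALUES (?)", (suite,))
    db.commit()
    return cur.lastrowid


def claim_next(db, runner_id):
    """Claim the oldest queued job, or return None if the queue is empty."""
    with db:  # commits on success, rolls back on error
        row = db.execute(
            "SELECT id, suite FROM run_queue "
            "WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard makes the claim a no-op if another runner won the race.
        updated = db.execute(
            "UPDATE run_queue SET status = 'running', claimed_by = ? "
            "WHERE id = ? AND status = 'queued'",
            (runner_id, row[0]),
        )
        return row if updated.rowcount == 1 else None


def poll_once(db, runner_id, execute):
    """One polling iteration: claim a job, run it, record the outcome."""
    job = claim_next(db, runner_id)
    if job is None:
        return None
    job_id, suite = job
    passed = execute(suite)  # assumption: a real runner would invoke Playwright here
    with db:
        db.execute(
            "UPDATE run_queue SET status = ? WHERE id = ?",
            ("passed" if passed else "failed", job_id),
        )
    return job_id
```

A runner process would call `poll_once` in a loop with a short sleep between empty polls. Decoupling the browser request from test execution this way means the web tier never has to hold a connection open for the duration of a Playwright run.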
