# Product Architecture
> [!summary]
> Eval Labs is a separate role-based human evaluation platform that tests Lucia through the deployed Engine, stores review evidence in Supabase, and exposes scoped testing, run-history, review, Team Review, and Global Analysis surfaces.
---
## High-level architecture
```text
Employee / Reviewer
→ Eval Labs web app
→ Clerk role and Supabase RLS scope
→ Lucia Engine /admin/operator-focus
→ Lucia response
→ Eval Labs Review Queue
→ Suggested selections plus human review
→ Quick Review / Human Guidance Evaluation
→ Lifecycle finalization
→ Supabase persistence
→ Run History / Team Review / Global Analysis / Exports
```
---
## Runtime responsibility split
Eval Labs owns:
- top app shell and route identity
- test launcher UX
- custom prompt suite UX
- auto-generated prompt tester UX
- guest-facing verification check and results UX
- controlled batch runner UX
- run orchestration
- Run History
- Team Review
- Global Analysis
- Single Run Analysis
- copy Session ID / copy Deep Link controls
- role-gated product access
- Clerk public metadata role behavior
- Supabase RLS role-claim requirements
- Review Queue
- suggested review generation
- human ratings
- semantic scoring sliders
- Quick Review
- Human Guidance Evaluation
- review lifecycle and finalization
- dirty / completion state
- tester identity capture
- exports
- Supabase persistence for eval data
- staged hydration from run summaries to recent/deep evidence
- localStorage compaction for completed cloud-backed runs
Lucia Engine owns:
- actual Lucia behavior
- intent/routing
- response generation
- emotional containment
- operational prioritization
- model gateway behavior
Eval Labs does not decide Lucia's response quality. It records and evaluates the response Lucia produced.
AI-reviewed platform evidence proves that the Eval Labs lifecycle works. Human reviewers still decide Lucia behavioral quality.
---
## Current Engine target
Eval Labs endpoint selection is environment-configured through `VITE_LUCIA_EVAL_ENDPOINT`.
The current Lucia v0.1.3.6 validation target for active dev refinement is:
```text
https://api-dev.hellolucia.ai/admin/operator-focus
```
Development is where active iteration happens.
Staging is for promoted validation only when intentionally configured.
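
Because the target is chosen by configuration, a misconfigured build can silently point at the wrong Engine. The sketch below shows one way to resolve and validate the endpoint at startup. The helper name `resolveEvalEndpoint` and the decision to throw on a missing value are assumptions, not the actual Eval Labs implementation; only the variable name `VITE_LUCIA_EVAL_ENDPOINT` comes from this document.

```typescript
// Env-like record so the helper can run outside Vite.
// In a Vite app the caller would pass `import.meta.env`.
type EnvLike = Record<string, string | undefined>;

// Hypothetical helper: read and validate the configured Engine endpoint.
function resolveEvalEndpoint(env: EnvLike): string {
  const raw = (env["VITE_LUCIA_EVAL_ENDPOINT"] ?? "").trim();
  if (raw === "") {
    // Assumption: failing loudly is preferable to falling back to a default.
    throw new Error("VITE_LUCIA_EVAL_ENDPOINT is not set");
  }
  // `new URL` throws a TypeError on malformed values.
  return new URL(raw).toString();
}
```

Logging the resolved value once at startup would make it easy to compare against the Browser Network request URL when debugging.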
---
## Source of truth hierarchy
When debugging Eval Labs platform behavior, trust evidence in this order, most authoritative first:
1. Browser Network request URL
2. Current route and role state
3. Supabase rows and counts
4. Run History / Analysis UI truth
5. localStorage diagnostics
6. Render service environment
7. Netlify environment variables
8. Lucia Engine deployed commit
9. Eval Labs deployed commit
10. Exported run metadata
11. Human memory
Human memory is useful. It is not the source of truth.
---
## Current route architecture
The current route map is documented in [[04 - Product Surfaces and Route Map|Product Surfaces and Route Map]].
Core canonical paths:
```text
/                            Owner/Admin Home dashboard
/lucia/launcher              workspace chooser
/lucia/custom                Custom prompt tester
/lucia/auto-generated        Auto-generated 50-prompt tester
/guest-facing/verification   Guest Facing Agent Verification Check
/lucia/batch-runner          Controlled Batch Runner
/lucia/automated/runs        Run History
/team-review                 Team Review
/analysis                    Global Analysis
/analysis/runs/:sessionId    Single Run Analysis
/runs/:sessionId/review      Review Queue
```
Legacy aliases:
```text
/lucia/automated   alias to /lucia/auto-generated
/experiments       alias to /analysis
```
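
The alias rules above can be sketched as an exact-match lookup. This is a minimal illustration, not the router Eval Labs actually uses; `canonicalPath` and `LEGACY_ALIASES` are hypothetical names. Exact matching matters here because `/lucia/automated/runs` is a canonical path in its own right and must not be rewritten by the `/lucia/automated` alias.

```typescript
// Legacy alias table taken from the route map above.
const LEGACY_ALIASES = new Map<string, string>([
  ["/lucia/automated", "/lucia/auto-generated"],
  ["/experiments", "/analysis"],
]);

// Hypothetical helper: map a pathname to its canonical form.
function canonicalPath(pathname: string): string {
  // Drop a single trailing slash, but leave the root path alone.
  const normalized =
    pathname.length > 1 && pathname.charAt(pathname.length - 1) === "/"
      ? pathname.slice(0, -1)
      : pathname;
  // Exact match only, so deeper canonical routes are never rewritten.
  return LEGACY_ALIASES.get(normalized) ?? normalized;
}
```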
---
## Current role architecture
Role gating is documented in [[05 - Role and Access Model|Role and Access Model]].
Current supported Clerk public metadata values:
```text
owner
admin
evaluator
tester
```
Owner and admin are privileged roles with full platform access, including Team Review, Global Analysis, shared persisted evidence, and all test surfaces.
Evaluator is the full evaluator workbench role. Evaluators can use evaluator-safe test surfaces and the run, review, and history routes for their own runs, but cannot use Team Review or Global Analysis.
Tester is the entry-level prompt-testing lane. Tester can use Custom Prompt Test and Auto-generated Prompt Test, but not verification, controlled batch, Team Review, Global Analysis, Registry Diagnostics, Behavioral Observatory, or owner/admin tools.
Missing or unknown role metadata should fail closed.
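
A fail-closed check can be sketched as follows. The `Surface` names, `parseRole`, and `canAccess` are hypothetical, and the matrix covers only surfaces whose access is stated in this document. The Controlled Batch Runner is left out because evaluator access to it is not specified here, and evaluator access to the two prompt testers is an assumption based on the "evaluator-safe test surfaces" wording.

```typescript
type Role = "owner" | "admin" | "evaluator" | "tester";
type Surface = "custom" | "autoGenerated" | "verification" | "teamReview" | "globalAnalysis";

const KNOWN_ROLES: readonly string[] = ["owner", "admin", "evaluator", "tester"];

// Anything that is not an exact, known role string becomes null.
function parseRole(value: unknown): Role | null {
  return typeof value === "string" && KNOWN_ROLES.indexOf(value) !== -1
    ? (value as Role)
    : null;
}

const ALL_SURFACES: readonly Surface[] = [
  "custom", "autoGenerated", "verification", "teamReview", "globalAnalysis",
];

// Access matrix derived from the role descriptions above.
const ACCESS: Record<Role, readonly Surface[]> = {
  owner: ALL_SURFACES,
  admin: ALL_SURFACES,
  evaluator: ["custom", "autoGenerated", "verification"], // partly assumed
  tester: ["custom", "autoGenerated"],
};

// Fail closed: missing or unknown metadata grants nothing.
function canAccess(roleValue: unknown, surface: Surface): boolean {
  const role = parseRole(roleValue);
  return role !== null && ACCESS[role].indexOf(surface) !== -1;
}
```

Matching role strings exactly means a value such as `Owner` or an empty string is treated like a missing role.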
Frontend role behavior is driven by Clerk public metadata. Persisted evidence access depends on the Clerk session token carrying `eval_labs_role` so Supabase RLS can recognize privileged owner/admin access.
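
When privileged rows are missing in the UI, one quick diagnostic is to inspect the session token for the claim. The sketch below decodes a JWT payload without verifying its signature, so it is suitable only for debugging and never for authorization. It assumes `eval_labs_role` is a top-level claim, which this document does not confirm; `readEvalLabsRole` is a hypothetical name.

```typescript
// Debug-only: decode a JWT payload and read the eval_labs_role claim.
// Does NOT verify the signature. Returns null if the claim is absent.
function readEvalLabsRole(jwt: string): string | null {
  const parts = jwt.split(".");
  if (parts.length !== 3) return null;
  try {
    // Convert base64url to standard base64 and restore padding.
    const b64 = parts[1].replace(/-/g, "+").replace(/_/g, "/");
    const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
    const payload: unknown = JSON.parse(atob(padded));
    if (typeof payload !== "object" || payload === null) return null;
    // Assumption: the claim sits at the top level of the payload.
    const claim = (payload as Record<string, unknown>)["eval_labs_role"];
    return typeof claim === "string" ? claim : null;
  } catch {
    return null; // malformed base64 or JSON
  }
}
```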
---
## Important design decision
The custom prompt feature did not require a separate database model because Eval Labs already had a general structure:
```text
Session
→ Run items
→ Lucia responses
→ Human reviews
```
Custom prompts are a new run source, not a new evaluation universe.
Reusing the existing structure keeps custom runs on the same review, persistence, and export paths as every other run.
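
The nesting can be sketched as types. Only the Session, run item, Lucia response, and human review hierarchy comes from this document; every field name below is hypothetical and does not reflect the actual Supabase schema. The point of the sketch is that code written against the shared shape, such as the progress counter, does not need to know where a run came from.

```typescript
// Hypothetical field names; only the nesting is taken from the document.
interface HumanReview { reviewerId: string; rating: number; }
interface LuciaResponse { text: string; reviews: HumanReview[]; }
interface RunItem { prompt: string; response?: LuciaResponse; }
interface Session { sessionId: string; items: RunItem[]; }

// Counts how far a session has progressed, regardless of run source.
function reviewProgress(session: Session): { total: number; responded: number; reviewed: number } {
  let responded = 0;
  let reviewed = 0;
  for (const item of session.items) {
    if (!item.response) continue; // Lucia has not answered this item yet
    responded += 1;
    if (item.response.reviews.length > 0) reviewed += 1;
  }
  return { total: session.items.length, responded, reviewed };
}
```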
---
## Current run source strategy
Custom runs use:
```text
mode: automated
runSource: custom
category: custom/prompts
templateKey: custom-prompt
```
This preserves compatibility with the existing run engine while clearly distinguishing custom runs from generated automated runs.
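
As a minimal sketch, the custom-run fields above can be held in one constant and the distinction made on `runSource` alone. The four values come from this document; the `RunDescriptor` type, the `CUSTOM_RUN` constant, and the `isCustomRun` helper are hypothetical.

```typescript
// Hypothetical shape for the run-source fields listed above.
interface RunDescriptor {
  mode: string;
  runSource: string;
  category: string;
  templateKey: string;
}

// Values taken from the custom-run listing in this section.
const CUSTOM_RUN: RunDescriptor = {
  mode: "automated", // shared with generated runs, so the run engine is reused
  runSource: "custom",
  category: "custom/prompts",
  templateKey: "custom-prompt",
};

// Custom runs share `mode` with generated runs, so only runSource is checked.
function isCustomRun(run: Pick<RunDescriptor, "runSource">): boolean {
  return run.runSource === "custom";
}
```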
Controlled batch runs use the same platform lifecycle to create, execute, review, finalize, persist, and verify runs. They are operational readiness evidence, not a separate human-review standard.
Guest Facing Agent Verification Check is a separate app surface for booked-guest verification behavior and results. It is evaluator-safe but not tester-facing.
---
## Review-layer architecture
Eval Labs now separates review responsibility into layers:
```text
Review Queue UI
→ Suggested review values
→ Employee Review fields
→ Human Guidance Evaluation
→ Review State / Escalation flags
→ Lifecycle / dirty / completion state
→ Adjudication metadata
→ Exports / Analysis
```
The schema supports high-resolution analysis while the employee UI remains simple.
This is intentional.
The user-facing review experience should remain calm and guided even when the exported data is detailed.