# Current System State
> [!summary]
> Eval Labs is an implemented, role-based human evaluation platform for Lucia. It supports controlled human onboarding, persisted evidence, owner/admin oversight, and evaluator-safe workflows, while some UX and rollout areas remain in active hardening.
---
## Current platform truth
Status: implemented.
Eval Labs is no longer only a founder or AI-agent testing tool.
It is Lucia's role-based human evaluation platform. What is true today:
- Clerk auth works.
- Clerk public metadata drives frontend role behavior through `eval_labs_role`.
- The Clerk session token includes `eval_labs_role` so Supabase RLS can recognize privileged owner/admin access.
- Supabase RLS protects persisted evidence.
- Real runs must persist to Supabase.
- Owner/admin see shared persisted Eval Labs evidence.
- Evaluator and tester data remains scoped to their own work except where owner/admin oversight applies.
- Team Review exists as the owner/admin oversight surface.
- Staged hydration loads run summaries first, then recent and deeper evidence, so dashboards can render faster without fake metrics.
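
The staged hydration in the last bullet can be sketched roughly as below. This is a minimal illustration, not the Eval Labs implementation: `Staged`, `HydrationState`, `loaded`, and the two `displayed...` helpers are hypothetical names. The idea it shows is that a metric stays `null` (rendered as loading) until its stage has arrived, instead of showing a placeholder number.

```typescript
// Hypothetical types for illustration; not taken from the Eval Labs codebase.
type Staged<T> = { status: "pending" } | { status: "loaded"; value: T };

interface HydrationState {
  summaries: Staged<{ runId: string }[]>; // stage 1: run summaries
  recentEvidence: Staged<string[]>;       // stage 2: recent evidence
  deepEvidence: Staged<string[]>;         // stage 3: deeper evidence
}

// Nothing has arrived yet, so every stage starts as pending.
const initialState: HydrationState = {
  summaries: { status: "pending" },
  recentEvidence: { status: "pending" },
  deepEvidence: { status: "pending" },
};

// Wrap a payload once its stage has finished loading.
function loaded<T>(value: T): Staged<T> {
  return { status: "loaded", value };
}

// Run count is known as soon as stage 1 arrives; null means "still loading".
function displayedRunCount(state: HydrationState): number | null {
  return state.summaries.status === "loaded" ? state.summaries.value.length : null;
}

// Total evidence count is only shown once both evidence stages have arrived,
// so a partial total is never presented as if it were final.
function displayedEvidenceCount(state: HydrationState): number | null {
  if (state.recentEvidence.status !== "loaded") return null;
  if (state.deepEvidence.status !== "loaded") return null;
  return state.recentEvidence.value.length + state.deepEvidence.value.length;
}
```

A dashboard built this way can render the run list after stage 1 while the evidence totals still show a loading state.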
---
## Current roles
Status: implemented.
Current roles:
- `owner`
- `admin`
- `evaluator`
- `tester`
- unassigned or missing role
Read the canonical matrix: [[08 - Eval Labs Roles and Access Matrix|Eval Labs Roles and Access Matrix]].
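
As a rough sketch of how the role list above might be handled in frontend code, the snippet below maps a decoded claims object to one of the four roles, or to `null` for an unassigned, missing, or unrecognized role. It assumes `eval_labs_role` appears as a top-level string key in the claims object; the actual claim shape, and the names `parseEvalLabsRole` and `isPrivileged`, are assumptions for illustration.

```typescript
const EVAL_LABS_ROLES = ["owner", "admin", "evaluator", "tester"] as const;
type EvalLabsRole = (typeof EVAL_LABS_ROLES)[number];

// Assumption: the role sits at the top level of the decoded claims object.
// Returns null for an unassigned, missing, or unrecognized role.
function parseEvalLabsRole(claims: Record<string, unknown>): EvalLabsRole | null {
  const raw = claims["eval_labs_role"];
  if (typeof raw !== "string") return null;
  const known = EVAL_LABS_ROLES as readonly string[];
  return known.indexOf(raw) !== -1 ? (raw as EvalLabsRole) : null;
}

// Owner and admin are the privileged oversight roles.
function isPrivileged(role: EvalLabsRole | null): boolean {
  return role === "owner" || role === "admin";
}
```

Treating every unrecognized value as `null` keeps a typo in the metadata from silently granting access.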
---
## Current test surfaces
Status: implemented.
Current test surfaces:
1. Custom Prompt Test
2. Auto-generated Prompt Test
3. Guest Facing Agent Verification Check
4. Controlled Batch Runner
Tester access is intentionally narrower than evaluator access. Tester is for clean prompt-testing onboarding cohorts. Evaluator is for the full evaluator workbench and evaluator-safe test types.
---
## Oversight and analysis
Status: implemented.
Owner/admin have full platform access: shared persisted evidence, Team Review, Global Analysis, and all test surfaces.
Team Review exists for owner/admin oversight of human evaluation work: evidence quality, reviewer activity, review gaps, and escalation readiness.
Global Analysis is owner/admin-only platform-wide evidence inspection. It is not a tester or evaluator onboarding surface.
---
## Human onboarding posture
Status: active hardening.
Eval Labs is ready for controlled human onboarding by role and assignment.
Do not describe the platform as broadly production-mature or open-access. Do not describe Lucia as human-approved on the grounds that the AI-reviewed platform readiness gate passed; that gate is not human approval of Lucia's quality.
Use only these status labels:
```text
implemented
active hardening
deferred
future
```
Avoid softer labels that imply more maturity than the source state proves.
---
## Active hardening
These areas are implemented but still being tightened, polished, or verified for rollout:
- evaluator onboarding/workspace polish
- first human cohort instructions
- role-specific route verification
- Clerk-to-Supabase role-claim verification after role or RLS changes
- staged hydration behavior across large evidence sets
- clear tester-vs-evaluator assignment guidance
---
## Deferred
Deferred means intentionally outside the current access model:
- tester access to Verification Check
- tester access to Verification Results
- tester access to Controlled Batch Runner
- tester access to Team Review
- tester access to Global Analysis
- tester access to Registry Diagnostics
- tester access to Behavioral Observatory
- evaluator access to Team Review
- evaluator access to Global Analysis
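
For role-specific route verification, the deferred list above could be encoded as a deny table along these lines. This is a sketch under assumptions: `DEFERRED_ACCESS`, `isDeferred`, and the surface name strings are illustrative, and the canonical access matrix remains the source of truth.

```typescript
type Role = "owner" | "admin" | "evaluator" | "tester";

// Surface names copied from the deferred list; the string spellings are illustrative.
const DEFERRED_ACCESS: Record<"tester" | "evaluator", readonly string[]> = {
  tester: [
    "Verification Check",
    "Verification Results",
    "Controlled Batch Runner",
    "Team Review",
    "Global Analysis",
    "Registry Diagnostics",
    "Behavioral Observatory",
  ],
  evaluator: ["Team Review", "Global Analysis"],
};

// True when the role/surface pair is on the deferred list.
// Owner and admin have no deferred surfaces.
function isDeferred(role: Role, surface: string): boolean {
  if (role === "owner" || role === "admin") return false;
  return DEFERRED_ACCESS[role].indexOf(surface) !== -1;
}
```

A `false` result only means the pair is not on the deferred list; whether access is actually granted is still decided by the access matrix.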
---
## Future
Future means possible later, not current behavior:
- broader public or external evaluator rollout
- expanded assignment management
- additional owner/admin management tooling
- deeper cohort analytics beyond current oversight surfaces
- final evaluator UX polish
---
## First human onboarding readiness criteria
Before a first human onboarding cohort starts:
1. Confirm every participant has Clerk auth access.
2. Confirm `eval_labs_role` is set in Clerk public metadata.
3. Confirm the session token carries the role claim used by Supabase RLS.
4. Confirm visible routes match the access matrix.
5. Run a real prompt test and verify Supabase persistence.
6. Confirm owner/admin can see shared persisted evidence where oversight applies.
7. Keep testers limited to Custom Prompt Test and Auto-generated Prompt Test.
8. Give evaluators only assignments that match evaluator-safe surfaces.
9. Name any active-hardening caveats before the work begins.
10. Restate that AI-reviewed platform readiness is not human approval of Lucia's quality.
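
The per-participant part of this checklist (criteria 1 to 4) could be tracked with something like the sketch below. `ParticipantChecks`, `unmetCriteria`, and `isReadyForCohort` are hypothetical names, and the booleans are assumed to be filled in from manual verification rather than from any existing Eval Labs API.

```typescript
// Hypothetical record of manual checks for one participant; not an Eval Labs type.
interface ParticipantChecks {
  hasClerkAccess: boolean;          // criterion 1
  roleInPublicMetadata: boolean;    // criterion 2
  roleClaimInSessionToken: boolean; // criterion 3
  routesMatchAccessMatrix: boolean; // criterion 4
}

// List the criteria that still block this participant, in checklist order.
function unmetCriteria(checks: ParticipantChecks): string[] {
  const unmet: string[] = [];
  if (!checks.hasClerkAccess) unmet.push("Clerk auth access");
  if (!checks.roleInPublicMetadata) unmet.push("eval_labs_role in Clerk public metadata");
  if (!checks.roleClaimInSessionToken) unmet.push("role claim in session token");
  if (!checks.routesMatchAccessMatrix) unmet.push("visible routes match access matrix");
  return unmet;
}

// A participant is ready only when no per-participant criterion is unmet.
function isReadyForCohort(checks: ParticipantChecks): boolean {
  return unmetCriteria(checks).length === 0;
}
```

Criteria 5 to 10 apply to the cohort or platform as a whole and are not covered by this sketch.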