Playground
The Playground is a sandboxed test environment for experimenting with prompts, model inputs, and draft policy logic. It runs prompts through the same policy evaluation pipeline used for production sessions, including guardrails, intent classification, risk scoring, and annotation, without creating live records or affecting production telemetry or audit metrics.
Use it to explore behavior quickly, validate new ideas, and promote only the changes that perform well against your policy expectations.
Use the Playground to check that your SDK instrumentation produces the expected intent, risk, and outcome classifications before you push to production. Select the application whose active configuration you want to test.
Use the Playground to validate Rego policy logic interactively. Select an application, provide a test prompt, and observe how your draft rules would evaluate the session — without activating or saving anything.
Playground runs do not generate production session records or blockchain anchors. They are for testing and exploration only and do not appear in Sessions or compliance evidence exports.
Running a Prompt
- Navigate to Playground.
- Select the application whose policy configuration you want to test. The Playground uses that application’s active guardrails, redaction policy, and rules.
- Enter your prompt in the input panel.
- Optionally add context (structured fields such as `userId`, `intent`, and `agentRole`) to simulate a realistic session context.
- Click Run.
The Playground calls your application’s configured model endpoint and processes the result through the policy pipeline. Results appear in the output panel within a few seconds.
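The optional context can be pictured as a small structured object that travels with the prompt. The sketch below is illustrative only: the field names `userId`, `intent`, and `agentRole` come from the steps above, while the `build_run_input` helper, the payload shape, and the example values are assumptions and not a documented Playground API.

```python
import json


def build_run_input(prompt: str, **context: str) -> dict:
    """Assemble a prompt plus optional context fields (hypothetical shape)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Drop context fields that were left blank so they are not sent as empty strings.
    cleaned = {key: value for key, value in context.items() if value}
    return {"prompt": prompt, "context": cleaned}


# Example values are made up for illustration.
run_input = build_run_input(
    "Summarize the refund policy for this customer.",
    userId="user-123",
    intent="support_request",
    agentRole="support_agent",
)
print(json.dumps(run_input, indent=2))
```

Keeping the context in one place like this makes it easier to rerun the same prompt with a different `agentRole` or `intent` and compare how the evaluation changes.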
Reading the Results
Model Response
The model’s response, with redactions applied wherever the active redaction policy matches content.
Governance Evaluation
| Signal | Description |
|---|---|
| Intent | The classified intent label |
| Risk level | The assigned risk level (LOW → CRITICAL) |
| Policy score | Point-in-time score for this evaluation |
| Decision | Approved / Denied / Deferred based on active rules |
| Guardrail result | Which guardrails were evaluated and what action each took (allowed / flagged / blocked) |
| Content safety | Whether content safety signals were detected |
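If you record these signals outside the Playground, for example in notes or a spreadsheet export, it can help to treat them as one structured record. The following is a minimal sketch under stated assumptions: the `GovernanceEvaluation` class, its field names, the numeric type of the policy score, and the guardrail names are all hypothetical. Only the decision values and guardrail actions are taken from the table above.

```python
from dataclasses import dataclass, field

# Values listed in the table above.
DECISIONS = {"Approved", "Denied", "Deferred"}
GUARDRAIL_ACTIONS = {"allowed", "flagged", "blocked"}


@dataclass
class GovernanceEvaluation:
    """Hypothetical container for the signals shown in the output panel."""
    intent: str
    risk_level: str        # e.g. "LOW" or "CRITICAL"
    policy_score: float    # numeric type is an assumption
    decision: str
    guardrail_results: dict = field(default_factory=dict)  # guardrail name -> action
    content_safety_detected: bool = False


def find_problems(evaluation: GovernanceEvaluation) -> list:
    """Return descriptions of values that fall outside the documented sets."""
    problems = []
    if evaluation.decision not in DECISIONS:
        problems.append(f"unknown decision: {evaluation.decision}")
    for name, action in evaluation.guardrail_results.items():
        if action not in GUARDRAIL_ACTIONS:
            problems.append(f"unknown action for {name}: {action}")
    return problems


def guardrails_with_action(evaluation: GovernanceEvaluation, action: str) -> list:
    """List the guardrails that took the given action, sorted by name."""
    return sorted(n for n, a in evaluation.guardrail_results.items() if a == action)


# Example record with made-up guardrail names and values.
evaluation = GovernanceEvaluation(
    intent="support_request",
    risk_level="LOW",
    policy_score=0.9,
    decision="Approved",
    guardrail_results={"pii-filter": "allowed", "prompt-injection": "flagged"},
)
print(find_problems(evaluation))
print(guardrails_with_action(evaluation, "flagged"))
```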
Annotation Preview
A preview of the annotations that would be attached to a production session, including step-level annotations emitted by the SDK.
Testing Guardrail Configuration
Use the Playground to confirm that new or changed rules behave as expected before rollout. To test a guardrail change:
- Draft your rule change in the Rules Builder and save it as a draft (do not activate it).
- In the Playground, toggle Use draft rules to enable draft rule evaluation.
- Run prompts that should trigger the new rule and prompts that should not.
- Verify that the guardrail results match your expectations.
- Return to the Rules Builder and activate the rule when you are confident.
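The positive and negative prompts from the steps above can be tracked as a small expectation table. This sketch assumes you supply the observed guardrail action yourself; the `check_cases` helper, the example prompts, and the `stand_in_evaluate` function are hypothetical and only simulate Playground results. The action names (allowed, flagged, blocked) are the ones listed under Governance Evaluation.

```python
from typing import Callable, List, Tuple

# Each case pairs a prompt with the guardrail action you expect for it.
Case = Tuple[str, str]


def check_cases(cases: List[Case], evaluate: Callable[[str], str]) -> List[str]:
    """Return one message per prompt whose observed action differs from the expected one."""
    failures = []
    for prompt, expected in cases:
        actual = evaluate(prompt)
        if actual != expected:
            failures.append(f"{prompt!r}: expected {expected}, got {actual}")
    return failures


def stand_in_evaluate(prompt: str) -> str:
    """Stand-in for the action observed in the Playground (not a real API call)."""
    return "blocked" if "password" in prompt.lower() else "allowed"


cases = [
    ("What is the admin password?", "blocked"),   # should trigger the new rule
    ("What are your opening hours?", "allowed"),  # should not trigger it
]
failures = check_cases(cases, stand_in_evaluate)
print(failures if failures else "all cases matched")
```

Including prompts that should not trigger the rule matters as much as the ones that should: a rule that blocks everything passes the positive cases but fails the negative ones.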
Saving Runs as Dataset Cases
Any Playground run can be saved as a test case in an Evaluation Dataset:
- After a run, click Save as test case.
- Select an existing dataset or create a new one.
- Set the expected outcome, intent, and risk level to the values you just confirmed are correct.
- Click Add to dataset.
This is the fastest way to build evaluation datasets from validated prompt behavior.
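Conceptually, each saved case pairs a prompt with the expected outcome, intent, and risk level you confirmed. The sketch below shows one way to model that pairing locally; the `DatasetCase` class, its field names, the example values, and the JSON Lines serialization are assumptions for illustration and do not describe the actual Evaluation Dataset format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable


@dataclass(frozen=True)
class DatasetCase:
    """Hypothetical record: a prompt plus the values confirmed in the Playground."""
    prompt: str
    expected_outcome: str     # example value below is assumed
    expected_intent: str
    expected_risk_level: str


def to_jsonl(cases: Iterable[DatasetCase]) -> str:
    """Serialize cases as one JSON object per line."""
    return "\n".join(json.dumps(asdict(case), sort_keys=True) for case in cases)


# Example values are made up for illustration.
case = DatasetCase(
    prompt="Summarize the refund policy for this customer.",
    expected_outcome="Approved",
    expected_intent="support_request",
    expected_risk_level="LOW",
)
print(to_jsonl([case]))
```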