💡
Quick overview

Policy Trends

Policy Trends helps you read policy posture over time instead of judging a single session in isolation. Use the 7-, 30-, and 90-day views to spot degradation, compare applications, and separate short-term noise from real control failures.

This page is especially useful for recurring reviews and board-level updates. It shows how to turn trend lines into follow-up actions and stronger evidence.


Policy trend analysis is the practice of examining how your AI policy posture changes over time, not just whether a single session passed or failed. This guide explains how to read trend data effectively and integrate it into your regular review cadence.


Navigate to Analytics → Trends. By default, the Trends view shows the last 30 days for all active applications.

Key controls:

  • Date range — use 7 days for tactical investigation, 30 days for weekly policy reviews, and 90 days for quarterly reporting
  • Application filter — compare specific applications or isolate a single system
  • Metric selector — choose which metrics to overlay (policy score, risk distribution, guardrail activity, anchor coverage)

What healthy looks like

A healthy policy score trend for a stable, well-instrumented application shows:

  • A score in the 80–95 range with low day-to-day variance
  • Gradual upward movement over weeks as instrumentation coverage improves
  • Brief dips (3–7 days) after model updates, followed by recovery
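As a rough illustration of the criteria above, the following sketch checks an exported list of daily policy scores. The function name and the `max_stdev` cutoff are assumptions made for this example, not product defaults; only the 80–95 range comes from this guide.

```python
from statistics import mean, pstdev


def looks_healthy(daily_scores, low=80.0, high=95.0, max_stdev=3.0):
    """Rough health check for a list of daily policy scores (oldest first).

    low/high mirror the 80-95 range in this guide; max_stdev is an
    assumed cutoff for "low day-to-day variance".
    """
    if len(daily_scores) < 2:
        return False
    # Use day-to-day changes so a slow upward drift is not counted as noise.
    deltas = [b - a for a, b in zip(daily_scores, daily_scores[1:])]
    return low <= mean(daily_scores) <= high and pstdev(deltas) <= max_stdev
```

A short post-update dip can push a series outside these bounds for a few days, so a failed check is a prompt to look at the chart, not a verdict.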

Recognizing concerning patterns

| Pattern | Likely cause | Recommended action |
| --- | --- | --- |
| Sudden step-down (score drops ≥ 10 points in 24 hours) | Model update, major prompt change, or instrumentation regression | Compare sessions before and after in Cohort view; check instrumentation health |
| Gradual decline (steady downward slope over 2–4 weeks) | Reduced annotation coverage: new code paths missing instrumentation | Review Coverage tab in Application workspace; check which annotations are declining |
| High variance (erratic daily swings) | Low session volume making the moving average sensitive to individual sessions, or inconsistent intent classification | Increase evaluation window; check intent label consistency in Vocabulary Browser |
| Plateau at low score (stuck below 70 for extended periods) | Systematic instrumentation gap: a major annotation type is consistently absent | Open the application coverage view and address the lowest-coverage dimension |
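The four patterns can be approximated in code when working from an exported daily score series. This is a minimal sketch under stated assumptions: the 10-point step and the below-70 plateau come from the table, while `window_days`, `decline_slope`, and `swing_stdev` are hypothetical values chosen for illustration, and the function names are not part of any product API.

```python
from statistics import mean, pstdev


def slope_per_day(values):
    """Least-squares slope of values against their index (points per day)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def classify_trend(daily_scores, step_drop=10.0, plateau_level=70.0,
                   window_days=14, decline_slope=-0.3, swing_stdev=5.0):
    """Return the concerning patterns found in daily scores (oldest first).

    step_drop and plateau_level follow the table; window_days,
    decline_slope, and swing_stdev are assumed thresholds.
    """
    patterns = []
    deltas = [b - a for a, b in zip(daily_scores, daily_scores[1:])]
    recent = daily_scores[-window_days:]
    has_window = len(daily_scores) >= window_days

    # Sudden step-down: one day-over-day drop of at least step_drop points.
    if any(d <= -step_drop for d in deltas):
        patterns.append("sudden step-down")
    # Gradual decline: sustained negative slope without a single large step.
    elif has_window and slope_per_day(recent) <= decline_slope:
        patterns.append("gradual decline")

    # High variance: erratic day-to-day changes.
    if len(deltas) >= 2 and pstdev(deltas) > swing_stdev:
        patterns.append("high variance")

    # Plateau at low score: every recent day below plateau_level.
    if has_window and all(s < plateau_level for s in recent):
        patterns.append("plateau at low score")
    return patterns
```

The step-down check takes precedence over the slope check because a single large drop also produces a negative fitted slope, and the two patterns call for different follow-up actions.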

Policy Score vs. Risk Distribution

Policy score and risk distribution are complementary signals. Read them together:

  • High policy score + low CRITICAL/HIGH rate = strong policy operations and low-risk sessions
  • High policy score + high CRITICAL/HIGH rate = strong policy operations, but the application is handling inherently risky interactions
  • Low policy score + low CRITICAL/HIGH rate = controls are present but underperforming; instrumentation coverage likely needs improvement
  • Low policy score + high CRITICAL/HIGH rate = the highest-priority improvement area; escalate to the responsible team
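The four combinations above amount to a simple two-by-two lookup. The sketch below assumes a score cutoff of 80 and a CRITICAL/HIGH rate cutoff of 5%; this guide does not define what counts as "high" or "low", so both cutoffs and the function name are hypothetical and should be replaced with your own targets.

```python
def read_quadrant(policy_score, critical_high_rate,
                  score_cutoff=80.0, rate_cutoff=0.05):
    """Map a policy score and CRITICAL/HIGH session rate to a reading.

    score_cutoff and rate_cutoff are assumed values, not product defaults.
    critical_high_rate is a fraction between 0 and 1.
    """
    high_score = policy_score >= score_cutoff
    high_risk = critical_high_rate >= rate_cutoff
    if high_score and not high_risk:
        return "strong policy operations and low-risk sessions"
    if high_score and high_risk:
        return "strong policy operations, inherently risky interactions"
    if not high_score and not high_risk:
        return "controls underperforming; improve instrumentation coverage"
    return "highest priority; escalate to the responsible team"
```

For example, `read_quadrant(62, 0.12)` falls in the last branch, which is the case to escalate first.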

Many frameworks require evidence of continual improvement, not just a snapshot of current posture.

For EU AI Act Article 72 (post-market monitoring)

Export the policy score time series for all applications in scope for a given period. The export from Analytics can be attached as evidence of your post-market monitoring system.

For ISO 42001 Clause 9.1 (performance evaluation)

The 90-day policy score chart with the fleet target line overlaid is a direct artifact for the monitoring, measurement, analysis, and evaluation requirement. Caption it with your target, the period covered, and your assessment of trend direction.

For SOC 2 CC7 (system operations)

The guardrail activity trend — blocked events over time — demonstrates that operational controls were active throughout the audit period.

Save a permalink after setting your filters. This creates a repeatable, consistent view for review meetings and reporting cycles.


Setting Trend Alerts

Rather than manually checking the Trends view on a schedule, configure alert rules to notify you when trends cross thresholds:

  • Score regression alert — fires when the rolling 7-day score drops ≥ 10 points in 48 hours
  • Score below target — fires when the rolling average stays below your configured target for 24 consecutive hours
  • Anchor coverage drop — fires when anchor rate drops below 98%
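To make the three conditions concrete, here is a minimal sketch of how they could be evaluated offline against exported data. It assumes hourly samples of the rolling 7-day score and an anchor rate expressed as a fraction; the function, parameter names, and returned labels are hypothetical and do not reflect the product's rule syntax.

```python
def evaluate_alerts(hourly_rolling_scores, anchor_rate, target,
                    drop_points=10.0, drop_window_h=48,
                    below_window_h=24, min_anchor_rate=0.98):
    """Return the alert names that would fire for the latest sample.

    hourly_rolling_scores: rolling 7-day score sampled hourly, oldest first
    (an assumed export format). anchor_rate is a fraction between 0 and 1.
    """
    fired = []
    s = hourly_rolling_scores

    # Score regression: latest value is >= drop_points below the value 48 hours ago.
    if len(s) > drop_window_h and s[-1 - drop_window_h] - s[-1] >= drop_points:
        fired.append("score_regression")

    # Score below target: every sample in the last 24 hours is under target.
    if len(s) >= below_window_h and all(v < target for v in s[-below_window_h:]):
        fired.append("score_below_target")

    # Anchor coverage drop: anchor rate under 98%.
    if anchor_rate < min_anchor_rate:
        fired.append("anchor_coverage_drop")
    return fired
```

Running this against a historical export before enabling a rule gives a sense of how often it would have fired with a given target.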

See Policy Rule Templates for pre-built rules for each of these scenarios.

