Article 10 — Data Governance
Article 10 of the EU AI Act imposes data governance requirements on providers of high-risk AI systems. It covers training, validation, and test data quality — and extends into production through the obligation to monitor data quality in real-world use.
What the Article Requires
Article 10 requires that training, validation, and test datasets:
- Are subject to appropriate data governance and management practices
- Are relevant, sufficiently representative, and, to the best extent possible, free of errors and complete in view of the intended purpose
- Have appropriate statistical properties for the system’s intended purpose
- Are examined for possible biases that could affect health, safety, or fundamental rights
Article 10(5) additionally requires providers to apply relevant data governance and management practices throughout the entire lifecycle of the system — which means monitoring production data quality post-deployment, not just at training time.
Article 10 is primarily a design and development-time obligation. VeriProof addresses the production monitoring subset: detecting when real-world inputs or outputs diverge from the distribution the system was designed for.
Training and Validation Data (Design Time)
VeriProof does not manage training datasets or validation pipelines. For these obligations, your data governance programme should address:
- Data source documentation (where data came from, when it was collected)
- Bias analysis methodology and results
- Statistical properties of the training distribution
- Documentation of data preparation, cleaning, and augmentation steps
These are typically documented as part of your model card or system card, which forms part of the Article 11 technical documentation package.
Production Data Monitoring (Article 10(5))
Article 10(5) is where VeriProof plays a direct role. The Act’s requirement to ensure data quality throughout the lifecycle means your governance system must detect when production inputs differ materially from the training distribution — a sign that the system is operating outside the conditions it was designed for.
What to Monitor
For LLM-based systems, useful production data quality signals include:
| Signal | What it detects | How to capture |
|---|---|---|
| Input token distribution | Unusual input lengths or vocabulary | Session metadata |
| Input language | Non-intended languages appearing in production | Adapter metadata |
| Topic drift | Inputs on topics not represented in training | Governance scoring dimension |
| Demographic patterns in inputs | Potential sampling bias in real-world usage | Metadata enrichment |
| Refusal rate shifts | Model encountering inputs it wasn’t trained on | Governance score signal |
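The signals in the table above are derived from each session before they are attached as metadata. A minimal sketch of that derivation step is shown below; the function name, metadata keys, and the language heuristic are all illustrative assumptions, not part of any real VeriProof API, and a production system would use a real tokenizer and language detector.

```python
# Illustrative sketch: deriving per-session data-quality signals from a raw
# prompt. All names and the `isascii` language heuristic are placeholders.

def build_data_quality_metadata(prompt: str, refused: bool) -> dict:
    """Derive Article 10 data-quality signals for one session."""
    tokens = prompt.split()  # stand-in for your real tokenizer
    return {
        "input_token_count": len(tokens),
        "input_char_count": len(prompt),
        # Placeholder heuristic; use a proper language detector in practice
        "detected_language": "en" if prompt.isascii() else "unknown",
        "refused": refused,
    }

meta = build_data_quality_metadata("Summarise this contract clause.", refused=False)
```

Whatever fields you choose, compute them consistently across sessions so that period-over-period comparisons remain meaningful.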
Configuring Production Data Monitoring
VeriProof’s current control surface is centered on captured session metadata plus shared governance settings, not a generic per-application drift-rule builder. To monitor Article 10 signals in practice:
- Emit the production data-quality signals you care about as session metadata from your SDK adapter, such as detected language, input length, topic category, refusal reason, or other domain-specific input characteristics.
- Use Settings → Governance Policies to require the declarations and process controls that should always be present for the monitored workflow.
- Use Settings → Governance Thresholds to alert when platform-wide oversight, grounding, or guardrail rates fall below your accepted operating level.
- Use Compliance → Scoring Settings to control how partial, critical, major, and minor findings contribute to framework scores reviewed by compliance teams.
Refer to the SDK adapter guide for instructions on enriching sessions with the metadata fields you want to analyse.
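As a rough illustration of the enrichment step described above, the sketch below attaches the metadata fields named earlier to a session object. The `Session` class and `set_metadata` method are hypothetical stand-ins; the actual adapter interface is defined in the SDK adapter guide.

```python
# Hypothetical sketch of enriching a session with Article 10 metadata.
# `Session` and `set_metadata` are illustrative names only; consult the
# SDK adapter guide for the real adapter interface.

class Session:
    """Minimal stand-in for an SDK session object."""

    def __init__(self):
        self.metadata: dict = {}

    def set_metadata(self, **fields):
        """Merge data-quality fields into the session's metadata."""
        self.metadata.update(fields)

session = Session()
session.set_metadata(
    detected_language="en",
    input_length=412,
    topic_category="contract-review",
    refusal_reason=None,
)
```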
Generating Data Quality Evidence
To produce Article 10 evidence for your annual conformity assessment:
- Go to Compliance → Evidence Exports
- Choose the EU AI Act framework and select Article 10
- Set the date range
- Enable Include blockchain proofs
- Click Download Evidence Pack (PDF)
The package includes:
- Input distribution summary (token counts, detected languages, topic signals)
- Drift detection summary (comparison to baseline distribution established at deployment)
- Sessions flagged for data quality concerns, with full payload for review
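To make the drift detection summary concrete, the sketch below computes a population stability index (PSI) between the baseline input-length distribution captured at deployment and the current reporting period. The bucketing, the example numbers, and the 0.2 threshold convention are assumptions for illustration, not VeriProof's actual drift method.

```python
import math

# Illustrative PSI drift check between a baseline distribution (established
# at deployment) and the current period. Bucket shares must sum to 1 and use
# the same bucket boundaries in both distributions.

def psi(baseline: list[float], current: list[float]) -> float:
    """Population stability index over matched buckets; higher = more drift."""
    eps = 1e-6  # guard against log(0) on empty buckets
    return sum(
        (c - b) * math.log((c + eps) / (b + eps))
        for b, c in zip(baseline, current)
    )

# Example: shares of input token counts in buckets [<50, 50-200, 200-1000, >1000]
baseline = [0.20, 0.50, 0.25, 0.05]
current = [0.10, 0.40, 0.35, 0.15]
drift = psi(baseline, current)  # ~0.235 here; >0.2 is often treated as material
```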
Bias Monitoring in Production
Article 10(2)(f) requires datasets to be examined for possible biases. While bias analysis typically happens at training time, detecting bias in production outputs is increasingly expected as part of the system’s ongoing risk management.
VeriProof supports production bias review when your SDK adapter emits the relevant grouping metadata alongside each session. Capture the user-context or demographic fields appropriate for your use case, then use evidence exports, session review, and your governance workflows to assess whether refusal rates, outcomes, or other monitored signals differ materially across groups. What constitutes a meaningful grouping depends entirely on your system’s use case and the population your model serves.
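A cross-group comparison over exported sessions can be sketched as below. The `group` and `refused` fields stand in for the hypothetical grouping metadata your adapter would emit, and the 10-point disparity threshold is an assumption you would replace with your own accepted level.

```python
from collections import defaultdict

# Illustrative cross-group refusal-rate check over exported session records.
# Field names and the disparity threshold are assumptions for illustration.

def refusal_rates(sessions: list[dict]) -> dict[str, float]:
    """Refusal rate per grouping value."""
    counts = defaultdict(lambda: [0, 0])  # group -> [refusals, total]
    for s in sessions:
        counts[s["group"]][0] += s["refused"]
        counts[s["group"]][1] += 1
    return {g: refusals / total for g, (refusals, total) in counts.items()}

sessions = [
    {"group": "region-a", "refused": False},
    {"group": "region-a", "refused": False},
    {"group": "region-b", "refused": True},
    {"group": "region-b", "refused": False},
]
rates = refusal_rates(sessions)
# Flag for review if the gap between groups exceeds your accepted disparity
flagged = max(rates.values()) - min(rates.values()) > 0.10
```

A flagged disparity is a trigger for human review, not a conclusion in itself; apparent gaps may reflect legitimate differences in the inputs each group submits.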
Documentation for Auditors
Your Article 10 documentation package should include:
- Training data provenance and governance summary (produced by your data team separately)
- VeriProof’s production data monitoring configuration and threshold rationale
- Production data quality report for the period under review
- Any flagged drift or bias incidents and the corrective actions taken
Next Steps
- Article 9 — Risk Management — monitoring thresholds and corrective action
- Article 11 — Technical Documentation — integrating data quality evidence into documentation packages
- Governance Scoring guide — advanced scoring configuration