SDK Health Monitoring
The VeriProof SDK is designed to be resilient: it exports telemetry data asynchronously, queues failures, and includes a circuit breaker that protects your application when the ingest API is unreachable. In normal operation you should not notice any of this. But in production, you want to know when exports start failing so you can investigate before you have gaps in your compliance record.
This guide explains how to monitor SDK health in production.
Built-in health signals
The SDK emits structured log lines and exposes a metrics interface that your observability stack can consume.
SDK log output
Enable SDK debug logging to see export success and failure events:
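```python
import os

# Assumption: configure_veriproof is exported by the veriproof package,
# alongside get_sdk_health.
from veriproof import configure_veriproof

# Option 1: enable debug logging through the environment
os.environ["VERIPROOF_DEBUG"] = "true"

# Option 2: enable it when configuring the SDK at runtime
configure_veriproof(
    api_key=os.environ["VERIPROOF_API_KEY"],
    application_id="your-app",
    debug=True,
)
```
With debug logging enabled, look for these log patterns: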
| Log pattern | Meaning |
|---|---|
| `[veriproof] export success` | Session batch exported successfully |
| `[veriproof] export failed attempt=N` | Export failed; will retry (N = attempt number) |
| `[veriproof] circuit breaker OPEN` | Consecutive failures exceeded threshold; exports suspended |
| `[veriproof] circuit breaker HALF_OPEN` | Recovery probe in progress |
| `[veriproof] circuit breaker CLOSED` | Exports resumed after successful probe |
| `[veriproof] queue depth=N` | Pending span queue size (N spans waiting to export) |
Do not leave debug=True enabled in production indefinitely. Debug logging is verbose and may log session metadata. Use it temporarily for investigation, then disable it.
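If you forward SDK logs to a log pipeline, the patterns above can be turned into structured events. The following is a minimal sketch, assuming the log lines contain exactly the text shown in the table; the function name `classify_log_line` and the event labels are hypothetical, and real log lines may carry extra prefixes or fields.

```python
import re
from typing import Optional

# Patterns copied from the log table; the exact formatting of real SDK
# log lines (timestamps, logger prefixes, extra fields) is an assumption.
_PATTERNS = [
    ("circuit_open", re.compile(r"\[veriproof\] circuit breaker OPEN\b")),
    ("circuit_half_open", re.compile(r"\[veriproof\] circuit breaker HALF_OPEN\b")),
    ("circuit_closed", re.compile(r"\[veriproof\] circuit breaker CLOSED\b")),
    ("export_failed", re.compile(r"\[veriproof\] export failed attempt=(?P<value>\d+)")),
    ("export_success", re.compile(r"\[veriproof\] export success\b")),
    ("queue_depth", re.compile(r"\[veriproof\] queue depth=(?P<value>\d+)")),
]


def classify_log_line(line: str) -> Optional[tuple[str, Optional[int]]]:
    """Return (event, numeric value or None) for a recognized line, else None."""
    for event, pattern in _PATTERNS:
        match = pattern.search(line)
        if match:
            value = match.groupdict().get("value")
            return event, int(value) if value is not None else None
    return None
```

A log shipper or test harness could call `classify_log_line` on each line and raise an alert whenever it returns a `circuit_open` event.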
Circuit breaker behavior
The SDK uses a circuit breaker to prevent export retries from impacting application performance when the ingest API is unavailable:
| State | Behavior |
|---|---|
| Closed (normal) | Exports proceed; failures trigger retries |
| Open (failing) | Exports are dropped; no retries attempted; queue drains |
| Half-open (recovering) | One probe export sent; if it succeeds, circuit closes |
The circuit opens after 5 consecutive export failures. After 60 seconds it moves to half-open and sends a single probe export; if the probe succeeds, the circuit closes and exports resume.
Sessions that the SDK tries to export while the circuit is open are lost; they are not queued for later delivery. If your application requires zero-gap compliance records, monitor for circuit breaker events and alert on them immediately.
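To make the state transitions concrete, here is a small illustrative state machine using the documented values (5 consecutive failures, 60-second recovery delay). It is a sketch for reasoning about the behavior, not the SDK's actual implementation: the class name, method names, and injectable clock are hypothetical, and reopening the circuit after a failed probe is an assumption the text above does not state.

```python
import time
from typing import Callable


class CircuitBreakerSketch:
    """Illustrative three-state breaker; not the SDK's real implementation."""

    def __init__(
        self,
        failure_threshold: int = 5,
        recovery_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock
        self.state = "CLOSED"
        self._consecutive_failures = 0
        self._opened_at = 0.0

    def allow_export(self) -> bool:
        """Return True if an export may be attempted right now."""
        if self.state == "OPEN":
            if self._clock() - self._opened_at >= self.recovery_seconds:
                self.state = "HALF_OPEN"  # let exactly one probe through
                return True
            return False  # export is dropped, not queued
        if self.state == "HALF_OPEN":
            return False  # a probe is already in flight
        return True

    def record_success(self) -> None:
        self._consecutive_failures = 0
        self.state = "CLOSED"

    def record_failure(self) -> None:
        self._consecutive_failures += 1
        # Assumption: a failed probe reopens the circuit immediately.
        if self.state == "HALF_OPEN" or self._consecutive_failures >= self.failure_threshold:
            self.state = "OPEN"
            self._opened_at = self._clock()
```

Passing a fake clock makes the 60-second recovery window easy to exercise in unit tests without sleeping.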
Metrics interface
```python
from veriproof import get_sdk_health

health = get_sdk_health()
print(f"Circuit breaker state: {health.circuit_state}")  # CLOSED, OPEN, or HALF_OPEN
print(f"Successful exports: {health.exports_success}")
print(f"Failed exports: {health.exports_failed}")
print(f"Pending queue depth: {health.queue_depth}")
print(f"Last export at: {health.last_export_at}")
```
Integrating with your health check endpoint
Add VeriProof SDK health to your application’s existing health check:
```python
from fastapi import FastAPI
from veriproof import get_sdk_health

app = FastAPI()


@app.get("/health")
def health_check():
    vp = get_sdk_health()
    # Note: if exports_failed is a lifetime total, this check stays degraded
    # once five failures have accumulated; compare against a recent delta
    # in that case.
    vp_healthy = vp.circuit_state == "CLOSED" and vp.exports_failed < 5
    return {
        "status": "ok" if vp_healthy else "degraded",
        "veriproof": {
            "circuit_state": vp.circuit_state,
            "exports_success": vp.exports_success,
            "exports_failed": vp.exports_failed,
            "queue_depth": vp.queue_depth,
            "last_export_at": vp.last_export_at.isoformat() if vp.last_export_at else None,
        },
    }
```
Alerting on SDK health
Using VeriProof’s built-in alerts
The VeriProof system alert “Sustained SDK export failure” fires automatically when your application has not exported any spans for more than 60 minutes. This requires no configuration — it is active by default for all accounts and notifies Admin portal users.
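If 60 minutes is too long a window for your compliance requirements, you can run a tighter staleness check locally against `last_export_at`. The sketch below assumes `last_export_at` is a `datetime` (or `None` before the first export), as the health check example suggests; the helper name `export_is_stale`, the 15-minute default, and treating naive timestamps as UTC are all assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def export_is_stale(
    last_export_at: Optional[datetime],
    now: Optional[datetime] = None,
    max_age: timedelta = timedelta(minutes=15),
) -> bool:
    """Return True if no export has completed within max_age."""
    if last_export_at is None:
        return True  # nothing has been exported yet
    if last_export_at.tzinfo is None:
        # Assumption: naive timestamps are in UTC.
        last_export_at = last_export_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - last_export_at > max_age
```

In practice you would call `export_is_stale(get_sdk_health().last_export_at)` from a periodic job. Applications with little traffic may report stale exports simply because there was nothing to export, so pick `max_age` with your traffic pattern in mind.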
Using your existing monitoring stack
Export SDK health metrics to your monitoring platform:
```python
import threading
import time

from prometheus_client import Gauge
from veriproof import get_sdk_health

vp_circuit_open = Gauge("veriproof_circuit_breaker_open", "1 if circuit is open, 0 if closed")
vp_queue_depth = Gauge("veriproof_queue_depth", "Number of pending spans in the export queue")
vp_exports_failed = Gauge("veriproof_exports_failed_total", "Cumulative export failures")


def _emit_veriproof_metrics():
    # Copy the SDK health snapshot into Prometheus gauges every 15 seconds.
    while True:
        health = get_sdk_health()
        vp_circuit_open.set(1 if health.circuit_state == "OPEN" else 0)
        vp_queue_depth.set(health.queue_depth)
        vp_exports_failed.set(health.exports_failed)
        time.sleep(15)


# Daemon thread: it does not block interpreter shutdown.
threading.Thread(target=_emit_veriproof_metrics, daemon=True).start()
```
Recommended Prometheus alert rules:
```yaml
- alert: VeriProofCircuitOpen
  expr: veriproof_circuit_breaker_open == 1
  for: 2m
  annotations:
    summary: "VeriProof SDK circuit breaker is open: compliance sessions are not being exported"
- alert: VeriProofExportQueueGrowing
  # deriv() instead of rate(): veriproof_queue_depth is a gauge, and rate() is only valid for counters
  expr: deriv(veriproof_queue_depth[5m]) > 0.5
  for: 5m
  annotations:
    summary: "VeriProof export queue is growing: possible ingest API connectivity issue"
```
Common production issues
| Symptom | Likely cause | Resolution |
|---|---|---|
| Circuit breaker opens at startup | API key incorrect or ingest API URL misconfigured | Verify `VERIPROOF_API_KEY` and the `ingest_endpoint` setting |
| Intermittent export failures | Network instability between application and ingest API | Check connection to `ingest.veriproof.app` on port 443; verify no egress filtering |
| Persistent queue growth | Ingest API returning 429 (rate limited) | Review your plan’s rate limits; consider batching more aggressively |
| Circuit keeps opening after recovery | Application handling requests faster than the circuit can probe | Consider reducing `max_concurrent_exports` in SDK options |
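For the network-related rows above, a quick TCP reachability check from the application host can rule out egress filtering. This is a minimal sketch using only the Python standard library; the helper name `can_reach` is hypothetical, and a successful TCP connection does not prove that TLS or authentication will succeed.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example, using the host named in the table:
# can_reach("ingest.veriproof.app")
```

Run it from the same host or container as the application, since egress rules often differ between environments.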
Next steps
- SDK Troubleshooting: diagnose missing traces and authentication errors
- Alert Rules: configure portal alerts triggered by SDK health events
- Python SDK Core Reference: full `get_sdk_health()` API details