P014 — Closed-Loop Validation of Telematics Pipelines via Synthetic Ground Truth
Abstract
Evaluating the accuracy of telematics data processing pipelines (cleaning, calibration, event detection, scoring) is difficult in the real world: there is no ground truth for the “correct” number of harsh braking events, the true device orientation, or the exact driving score. This paper presents a closed-loop validation framework that uses a simulator (RoadSimulator3) as a ground truth generator and an analysis platform (Telemachus) as the system under test.
The framework:
- Injects known perturbations into synthetic traces (device rotation, events, noise, multi-rate sensors),
- Processes the traces through the full D1→D3 pipeline,
- Compares detections against the injected truth using precision/recall/F1 metrics.
We demonstrate the method on IMU rectification (P013) and event detection (D2), producing quantitative accuracy metrics that are impossible to obtain from real-world data alone.
Key results: IMU rotation is recovered to within a few degrees (yaw within 2° via GPS correlation, pitch within 3°). Event detection achieves F1=0.04 on the current D2 detector, exposing systematic over-detection of braking events and failure to detect surface events (potholes, curbs). These findings directly inform threshold calibration.
Introduction
The Validation Gap
Telematics pipelines transform raw sensor data (GPS, accelerometer, gyroscope) into actionable intelligence: driving scores, event counts, quality indices. But how do we know the pipeline is correct?
In production:
- There is no “true” driving score to compare against
- Event labels (harsh brake, pothole) are subjective
- Device orientation is unknown after installation
- Sensor characteristics vary across hardware
Contribution
We propose a closed-loop validation methodology:
RoadSimulator3                             Telemachus Platform
┌──────────────┐                           ┌──────────────────┐
│ Ground Truth │ ───── CSV+manifest ─────→ │ D1→D3 Pipeline   │
│ - rotation R │                           │ - IMU calibration│
│ - events     │                           │ - event detect.  │
│ - noise σ    │                           │ - KPIs           │
│ - GPS Hz     │ ◄────── comparison ────── │ - artifacts      │
└──────────────┘                           └──────────────────┘
        │                                            │
        └──────────── Validation Report ─────────────┘
                      precision, recall, F1
                      rotation error (°)
                      overall score
This is analogous to unit testing for data pipelines — the simulator provides deterministic, reproducible test cases with known answers.
Framework
Ground Truth Generation (RS3)
RoadSimulator3 generates synthetic telematics data with configurable:
| Parameter | Range | Example |
|---|---|---|
| Device rotation (roll, pitch, yaw) | 0°–45° | 10°, 20°, 30° |
| Event injection (7 types) | 0–20 per type | 3 brake, 2 accel, 4 bump |
| Sensor noise σ_acc (σ_gyro analogous, in rad/s) | 0–0.1 m/s² | 0.03 m/s² |
| GPS frequency | 1–10 Hz | 1 Hz |
| IMU frequency | 5–10 Hz | 10 Hz |
| Gyroscope enabled/disabled | bool | false |
The manifest (config YAML) accompanies the data and serves as the ground truth reference.
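To make the ground-truth extraction concrete, here is a minimal sketch of reducing a manifest to the fields the validator compares against. The field names (rotation_deg, events, gyro_enabled, etc.) are illustrative assumptions, not the actual RS3 schema; in practice the YAML file would be parsed with a YAML loader first.

```python
# Ground truth as parsed from an RS3 manifest (field names are
# illustrative assumptions; the real YAML schema may differ).
manifest = {
    "rotation_deg": {"roll": 10.0, "pitch": 20.0, "yaw": 30.0},
    "events": {"harsh_brake": 3, "harsh_accel": 2, "speed_bump": 4},
    "noise": {"sigma_acc": 0.03},  # m/s^2
    "gps_hz": 1,
    "imu_hz": 10,
    "gyro_enabled": False,
}

def extract_ground_truth(cfg):
    """Keep only the fields the validation step compares against."""
    return {
        "rotation": cfg["rotation_deg"],
        "events": cfg["events"],
        "gyro_enabled": cfg["gyro_enabled"],
    }

gt = extract_ground_truth(manifest)
```

Because the manifest is the same file that configured the simulation, the reference values are exact by construction rather than annotated after the fact.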
Pipeline Under Test (Telemachus)
The full D1→D3 pipeline processes the synthetic trace:
- D1: GPS cleaning, upsampling, IMU calibration (P013), map matching, road context, DEM, SQS
- D2: Event detection (8 types), curve radius classification
- D3: KPI aggregation, driving score
Comparison Metrics
Rotation accuracy:
- Pitch error: |pitch_detected − pitch_GT|
- Yaw error: |yaw_detected − yaw_GT| (GPS correlation method)
- Gravity norm: ||g_measured|| ≈ 9.81 m/s²
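The three rotation checks above can be sketched as one small function. The function name mirrors the validation module's validate_rotation, but this body is a simplified sketch, not the production implementation; the 0.2 m/s² gravity tolerance is an assumption.

```python
import math

def validate_rotation(gt_deg, detected_deg, g_measured, tol_norm=0.2):
    """Absolute pitch/yaw error against ground truth, plus a check
    that the measured gravity vector has norm close to 9.81 m/s^2."""
    pitch_err = abs(detected_deg["pitch"] - gt_deg["pitch"])
    yaw_err = abs(detected_deg["yaw"] - gt_deg["yaw"])
    g_norm = math.sqrt(sum(v * v for v in g_measured))
    return {
        "pitch_error_deg": pitch_err,
        "yaw_error_deg": yaw_err,
        "gravity_ok": abs(g_norm - 9.81) < tol_norm,
    }

# First row of the rotation results: GT (pitch=20, yaw=30),
# detected (22.18, 28.0) -> errors 2.18 deg and 2.0 deg
r = validate_rotation({"pitch": 20.0, "yaw": 30.0},
                      {"pitch": 22.18, "yaw": 28.0},
                      g_measured=(0.0, 0.0, 9.80))
```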
Event detection:
- Precision = TP / (TP + FP) — how many detections are correct
- Recall = TP / (TP + FN) — how many injections are found
- F1 = 2 × P × R / (P + R)
- Per-type breakdown
Overall score = mean(rotation_score, events_F1 × 100)
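As a sketch of the event metrics, the following count-based version reproduces the aggregates reported in the Results section. It is an upper bound: TP per type is taken as min(injected, detected), whereas the real validator also matches event timestamps, which can only lower TP.

```python
def event_metrics(injected, detected):
    """Precision/recall/F1 from per-type injected vs detected counts.
    Count-only matching (TP = min per type) is an upper bound; the
    full validator additionally matches events by timestamp."""
    tp = sum(min(injected.get(k, 0), detected.get(k, 0))
             for k in set(injected) | set(detected))
    n_det = sum(detected.values())
    n_inj = sum(injected.values())
    precision = tp / n_det if n_det else 0.0
    recall = tp / n_inj if n_inj else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Counts from the event-detection results table below
injected = {"harsh_brake": 3, "harsh_accel": 2, "speed_bump": 4,
            "pothole": 2, "curb": 1, "sharp_turn": 3, "door_open": 2}
detected = {"harsh_brake": 84, "harsh_accel": 181, "speed_bump": 1}
p, r, f1 = event_metrics(injected, detected)
overall = (89.4 + f1 * 100) / 2  # mean(rotation_score, events_F1 x 100)
```

With these counts, TP = 3 + 2 + 1 = 6 over 266 detections and 17 injections, giving precision ≈ 0.023, recall ≈ 0.353, F1 ≈ 0.042, matching the reported figures.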
Results
Rotation Validation
| Input (roll, pitch, yaw) | Pitch detected | Yaw detected | Pitch error | Yaw error |
|---|---|---|---|---|
| (10°, 20°, 30°) | 22.18° | 28.0° | 2.18° | 2.0° |
| (5°, 5°, 15°) | — | 15.0° | — | 0.0° |
| (15°, 8°, 20°) | — | 20.0° | — | 0.0° |
Rotation score: 89.4/100 — the two-level method (P013) recovers pitch within 3° and yaw within 2° on realistic traces.
Event Detection Validation
| Event type | Injected | Detected | Status |
|---|---|---|---|
| harsh_brake | 3 | 84 | ✅ detected (over) |
| harsh_accel | 2 | 181 | ✅ detected (over) |
| speed_bump | 4 | 1 | ⚠️ under-detected |
| pothole | 2 | 0 | ❌ missed |
| curb | 1 | 0 | ❌ missed |
| sharp_turn | 3 | 0 | ❌ missed |
| door_open | 2 | 0 | ❌ missed |
Precision=0.023, Recall=0.353, F1=0.042 — the current D2 detector:
- Over-detects braking/acceleration (84 detected for 3 injected) → thresholds too sensitive
- Fails to detect surface events (pothole, curb) and turns → thresholds too high or detection logic incomplete
- Misses door_open → stop-segment detection not sensitive enough to gyro patterns
Diagnostic Value
The validation report directly identifies which D2 thresholds need adjustment:
| Issue | Root cause | Fix |
|---|---|---|
| 84 brake for 3 injected | ax threshold (-3.0) too sensitive on rotated data | Increase to -4.5 or normalize post-calibration |
| 0 pothole detected | az_delta threshold (5.0) too high for RS3 injection amplitude | Lower to 3.0 or increase injection amplitude |
| 0 sharp_turn | gz_rad_s threshold (0.25) but gyro disabled → no gyro data | Fall back to ay-based turn detection |
| 0 door_open | gy threshold (3.0) but gyro disabled | Cannot detect without gyro — expected failure |
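The ay-based fallback suggested in the table can be sketched as a simple run-length detector on lateral acceleration. This is a proposal, not existing Telemachus code: the 2.5 m/s² threshold and 3-sample minimum run are assumptions to be calibrated, and the input is assumed to be calibrated lateral acceleration in the vehicle frame.

```python
def detect_turns_from_ay(ay, thr=2.5, min_len=3):
    """Fallback sharp-turn detector using lateral acceleration only.
    Flags runs of >= min_len consecutive samples with |ay| > thr
    (thr in m/s^2; both values are illustrative, to be calibrated).
    Returns (start, end) sample-index pairs, end exclusive."""
    turns, start = [], None
    for i, a in enumerate(ay):
        if abs(a) > thr:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                turns.append((start, i))
            start = None
    if start is not None and len(ay) - start >= min_len:
        turns.append((start, len(ay)))
    return turns

# One 5-sample excursion above threshold -> one detected turn
ay = [0.1, 0.2, 3.0, 3.2, 3.1, 2.9, 2.8, 0.3, 0.1]
turns = detect_turns_from_ay(ay)
```

Running this variant through the same closed loop would quantify how much recall an accelerometer-only fallback can recover when the gyroscope is disabled.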
Implementation
The validation module is implemented in telemachus-platform/src/telemachus_platform/validation/ground_truth.py:
- load_rs3_ground_truth_from_config(cfg) — extracts GT from the RS3 YAML
- validate_rotation(gt, detected) — pitch/yaw error + gravity check
- validate_events(gt, detected) — precision/recall/F1 per event type
- full_validation(gt, artifacts) → ValidationReport
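The ValidationReport returned by full_validation can be pictured as a plain record of the metrics listed above. The sketch below shows an illustrative shape only; the actual field names and structure in the module may differ.

```python
from dataclasses import dataclass, field

@dataclass
class ValidationReport:
    """Illustrative shape of a validation report; field names are
    assumptions beyond the metrics described in the text."""
    pitch_error_deg: float
    yaw_error_deg: float
    precision: float
    recall: float
    f1: float
    per_type: dict = field(default_factory=dict)  # per-event-type counts

# The headline numbers from the Results section as one record
report = ValidationReport(pitch_error_deg=2.18, yaw_error_deg=2.0,
                          precision=0.023, recall=0.353, f1=0.042)
```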
Discussion
Why Synthetic Validation Matters
Real-world validation requires manual annotation (expensive, subjective, incomplete). Synthetic validation provides:
- Exact ground truth (rotation angles to 0.01°, event positions to the sample)
- Reproducibility (same YAML → same test case)
- Systematic coverage (sweep rotation angles, event counts, noise levels)
- Regression testing (CI/CD integration possible)
Limitations
- Synthetic data may not capture all real-world phenomena (multi-path GPS, vibration modes, temperature drift)
- Event injection profiles are simplified (bell-shaped vs real kinematic signatures)
- The mapping RS3_event_name → Telemachus_event_name assumes consistent labeling
Toward Continuous Validation
The framework can be automated as a CI pipeline:
git push → RS3 generate test traces → Telemachus process → compare → report
If F1 drops below threshold → pipeline regression detected.
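A minimal CI gate for this regression check might look as follows. The floor values are illustrative placeholders, not project policy, and the report dict keys are assumptions.

```python
# Minimal CI gate: fail the build when validation metrics fall below
# fixed floors. Both floors are illustrative placeholders.
F1_FLOOR = 0.80
ROTATION_FLOOR = 85.0

def gate(report):
    """Return (ok, messages) for a validation-report dict."""
    msgs = []
    if report["events_f1"] < F1_FLOOR:
        msgs.append(f"events F1 {report['events_f1']:.3f} < {F1_FLOOR}")
    if report["rotation_score"] < ROTATION_FLOOR:
        msgs.append(f"rotation {report['rotation_score']:.1f} < {ROTATION_FLOOR}")
    return (not msgs, msgs)

# With the current results the gate (correctly) fails on event F1
ok, msgs = gate({"events_f1": 0.042, "rotation_score": 89.4})
```

In a CI job, a failed gate would exit non-zero and attach the messages to the build log, turning the validation report into an automatic regression signal.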
Conclusion
We present a closed-loop validation framework for telematics data pipelines, using simulation as ground truth. The method is applicable to any pipeline that processes IMU/GPS data, and the validation report quantitatively identifies detection failures and threshold miscalibrations.
Applied to the Telemachus Platform, the framework reveals that IMU rectification achieves 89% accuracy while event detection needs significant threshold recalibration (F1=0.04). These findings would be invisible without synthetic ground truth.
References
- Edet, S. (2026). P013 — In-field IMU Rectification Without Gyroscope. Teleforge.
- Edet, S. (2025). RoadSimulator3: A Modular Framework for Inertial Vehicle Trajectory Simulation. PhD Thesis.
- Edet, S. (2026). P011 — 10 Hz Inertial Rectification for Mobility Data. Teleforge.