2. Scope

This RFC covers:

  • Validation workflow and architecture
  • Validation levels and modes
  • Command-line interface (telemachus validate)
  • Error, warning, and exit-code conventions
  • Integration with datasets, adapters, and schema registry

It does not redefine the Telemachus schema itself; it defines how compliance with that schema is verified programmatically.


3. Relationship to Other RFCs

| RFC      | Title                            | Dependency                                |
|----------|----------------------------------|-------------------------------------------|
| RFC-0001 | Telemachus Core 0.2              | Provides base schema for validation       |
| RFC-0003 | Dataset Specification 0.2        | Defines dataset manifest to validate      |
| RFC-0004 | Extended FieldGroups Schema      | Provides optional extended schema targets |
| RFC-0005 | Adapter Architecture             | Generates data that must be validated     |
| RFC-0011 | Versioning and Governance Policy | Describes lifecycle of validation rules   |

4. Validation Architecture Overview

The validation framework is composed of three main layers:

  1. Schema Layer — JSON Schemas describing structure and types (RFC-0001, RFC-0004).
  2. Semantic Layer — Domain-specific checks (timestamps, monotonicity, units).
  3. CLI/Automation Layer — User interface and integration within pipelines.

flowchart TD
  A[Dataset] --> B[Schema Validator]
  B --> C[Semantic Checks]
  C --> D[Validation Report]
  D --> E[CLI / CI Output]
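
The flow above can be sketched in Python as a chain of three functions. This is a minimal illustration under stated assumptions, not the reference implementation: the function names (check_schema, check_semantics, build_report), the REQUIRED_FIELDS set, the record layout, and the "passed"/"failed" status values are all hypothetical, and the schema layer is reduced to a required-field check standing in for a real JSON Schema validator.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the Core schema (RFC-0001); field names are assumptions.
REQUIRED_FIELDS = {"time", "speed.kmh"}


def check_schema(records):
    """Schema layer: report required fields that are absent from a record."""
    errors = []
    for index, record in enumerate(records):
        for field in sorted(REQUIRED_FIELDS - set(record)):
            errors.append({"record": index, "field": field, "type": "missing_field"})
    return errors


def check_semantics(records):
    """Semantic layer: report timestamps that do not strictly increase."""
    errors = []
    previous = None
    for index, record in enumerate(records):
        current = record.get("time")
        if current is None:
            continue  # records without a timestamp are not compared
        if previous is not None and current <= previous:
            errors.append({"record": index, "field": "time", "type": "non_monotonic"})
        previous = current
    return errors


def build_report(records):
    """Report layer: merge the findings of both layers into one dictionary."""
    errors = check_schema(records) + check_semantics(records)
    return {
        "validated_at": datetime.now(timezone.utc).isoformat(),
        "records_checked": len(records),
        "errors": errors,
        "status": "failed" if errors else "passed",  # assumed status values
    }


report = build_report([
    {"time": 1, "speed.kmh": 50.0},
    {"time": 3, "speed.kmh": 52.0},
    {"time": 2},  # missing speed.kmh and out of order
])
```

Keeping each layer as a separate function returning a list of findings lets the CLI layer decide later how to present them or when to stop.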

5. Validation Levels

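| Level    | Description                                           | Typical Use             |
|----------|-------------------------------------------------------|-------------------------|
| core     | Validate only fields from the Core schema (RFC-0001). | Minimal compliance      |
| extended | Validate Core + Extended FieldGroups (RFC-0004).      | Industrial datasets     |
| strict   | Enforce temporal, semantic, and numerical tolerances. | Research-grade datasets |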

Example:

telemachus validate --dataset path/to/data --level strict
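
One possible reading of the levels is that each one adds checks on top of the level below it. The sketch below encodes that reading; the cumulative ordering, the Level enum, and the check-group names are assumptions made for illustration and are not defined by this RFC.

```python
from enum import IntEnum


class Level(IntEnum):
    """Validation levels, ordered from least to most demanding (assumed ordering)."""
    CORE = 1
    EXTENDED = 2
    STRICT = 3


# Hypothetical check-group names, one group introduced per level.
CHECK_GROUPS = {
    Level.CORE: ["core_schema"],
    Level.EXTENDED: ["extended_schema"],
    Level.STRICT: ["semantic_tolerances"],
}


def checks_for(level_name):
    """Return every check group enabled at the given level, lowest level first."""
    level = Level[level_name.upper()]  # raises KeyError for an unknown level
    enabled = []
    for candidate in Level:
        if candidate <= level:
            enabled.extend(CHECK_GROUPS[candidate])
    return enabled


strict_checks = checks_for("strict")
```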

6. Validation Modes

| Mode      | Description                                             | Example                                                       |
|-----------|---------------------------------------------------------|---------------------------------------------------------------|
| --dataset | Validate a dataset directory with manifest and samples. | telemachus validate --dataset ./2025-10-01-v1.0               |
| --record  | Validate a single JSON or CSV record.                   | telemachus validate --record input.json                       |
| --schema  | Validate schema definitions themselves.                 | telemachus validate --schema schema/core/record.schema.json   |
| --adapter | Validate output from a provider adapter (RFC-0005).     | telemachus validate --adapter samsara                         |

7. Command-Line Interface

7.1 Basic Syntax

telemachus validate [OPTIONS]

7.2 Options

| Option                         | Description                             |
|--------------------------------|-----------------------------------------|
| --dataset PATH                 | Path to dataset directory               |
| --record FILE                  | Validate a single record file           |
| --schema FILE                  | Validate a schema file                  |
| --adapter NAME                 | Validate adapter output                 |
| --level {core,extended,strict} | Validation level                        |
| --all                          | Run all available validation tests      |
| --json                         | Output validation report as JSON        |
| --fail-fast                    | Stop at first error                     |
| --verbose                      | Detailed output of warnings and metrics |
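
The options above map naturally onto Python's argparse. The following is a sketch rather than the actual CLI code: it assumes that the four modes and --all are mutually exclusive and that exactly one of them is required, which this RFC does not state. The default level of "core" mirrors the validate_dataset signature given in the implementation guidelines.

```python
import argparse


def build_parser():
    """Build an argument parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="telemachus validate")

    # Assumption: one validation target per invocation.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--dataset", metavar="PATH", help="Path to dataset directory")
    mode.add_argument("--record", metavar="FILE", help="Validate a single record file")
    mode.add_argument("--schema", metavar="FILE", help="Validate a schema file")
    mode.add_argument("--adapter", metavar="NAME", help="Validate adapter output")
    mode.add_argument("--all", action="store_true", help="Run all available tests")

    parser.add_argument("--level", choices=["core", "extended", "strict"],
                        default="core", help="Validation level")
    parser.add_argument("--json", action="store_true",
                        help="Output validation report as JSON")
    parser.add_argument("--fail-fast", action="store_true",
                        help="Stop at first error")
    parser.add_argument("--verbose", action="store_true",
                        help="Detailed output of warnings and metrics")
    return parser


args = build_parser().parse_args(["--dataset", "path/to/data", "--level", "strict"])
```

argparse exposes --fail-fast as args.fail_fast, and rejects an unknown --level value before any validation code runs.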

8. Validation Report Structure

Validation results are always returned as a standardized JSON object:

{
  "dataset": "2025-10-01-v1.0",
  "validated_at": "2025-10-13T10:00:00Z",
  "level": "strict",
  "records_checked": 131186,
  "errors": [
    {"field": "time", "type": "missing_value", "count": 5},
    {"field": "speed.kmh", "type": "unit_violation", "count": 3}
  ],
  "warnings": [
    {"field": "heading.deg", "type": "alignment_tolerance_exceeded", "max_delta_ns": 9000000}
  ],
  "status": "failed"
}

This format ensures compatibility with automated pipelines and CI/CD dashboards.
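
As an illustration of how a pipeline might consume such a report, the sketch below parses the JSON and extracts a few headline numbers. The summarize function and the keys of its return value are hypothetical; only the input field names come from the example above, of which REPORT_TEXT is an abridged copy.

```python
import json

# Abridged copy of the example report shown above.
REPORT_TEXT = """
{
  "dataset": "2025-10-01-v1.0",
  "level": "strict",
  "records_checked": 131186,
  "errors": [
    {"field": "time", "type": "missing_value", "count": 5},
    {"field": "speed.kmh", "type": "unit_violation", "count": 3}
  ],
  "warnings": [
    {"field": "heading.deg", "type": "alignment_tolerance_exceeded", "max_delta_ns": 9000000}
  ]
}
"""


def summarize(report_text):
    """Parse a validation report and return headline numbers for a dashboard."""
    report = json.loads(report_text)
    errors = report["errors"]
    return {
        "dataset": report["dataset"],
        "level": report["level"],
        "records_checked": report["records_checked"],
        # Each error entry aggregates its occurrences in "count".
        "error_occurrences": sum(entry.get("count", 1) for entry in errors),
        "fields_with_errors": sorted({entry["field"] for entry in errors}),
        "warning_entries": len(report["warnings"]),
    }


summary = summarize(REPORT_TEXT)
```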


9. Error and Exit Codes

| Code | Meaning                             |
|------|-------------------------------------|
| 0    | Validation successful               |
| 1    | Validation failed (errors detected) |
| 2    | Manifest missing or corrupted       |
| 3    | Schema invalid or unavailable       |
| 4    | CLI or adapter misconfiguration     |

Warnings do not cause a non-zero exit code unless strict validation (--level strict) is specified.
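
The convention above could be encoded as follows. This is a sketch under assumptions: the member names of ExitCode are hypothetical, warnings under strict validation are assumed to map to code 1 (this RFC does not say which non-zero code applies), and codes 2 to 4 are assumed to be raised earlier, before any record is checked.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes from the table above; member names are illustrative."""
    SUCCESS = 0
    VALIDATION_FAILED = 1
    MANIFEST_ERROR = 2
    SCHEMA_ERROR = 3
    CONFIG_ERROR = 4


def exit_code_for(report, strict=False):
    """Map a validation report to a process exit code (0 or 1 only)."""
    if report["errors"]:
        return ExitCode.VALIDATION_FAILED
    if strict and report["warnings"]:
        # Assumption: warnings escalate to code 1 under strict validation.
        return ExitCode.VALIDATION_FAILED
    return ExitCode.SUCCESS


warnings_only = {
    "errors": [],
    "warnings": [{"field": "heading.deg", "type": "alignment_tolerance_exceeded"}],
}
lenient_code = exit_code_for(warnings_only)
strict_code = exit_code_for(warnings_only, strict=True)
```

A CLI entry point would typically pass the result to sys.exit so that CI systems see the code directly.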


10. Semantic Validation Rules

| Check                   | Description                                       | Applies To              |
|-------------------------|---------------------------------------------------|-------------------------|
| Timestamp monotonicity  | Ensures timestamps are strictly increasing        | All datasets            |
| Missing values          | Detects nulls in mandatory fields                 | Core & Extended         |
| Unit validation         | Checks values within expected unit range          | Numeric fields          |
| Alignment tolerance     | Warns when misalignment exceeds defined threshold | Multisensor data        |
| Sampling rate deviation | Detects irregular intervals                       | High-frequency datasets |
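
Four of these checks are simple enough to sketch as pure functions. The function names, the integer millisecond timestamps, and every threshold in the usage lines are assumptions chosen for illustration; this RFC does not fix concrete tolerances.

```python
def non_monotonic_indices(timestamps):
    """Indices whose timestamp is not strictly greater than its predecessor."""
    return [i for i in range(1, len(timestamps))
            if timestamps[i] <= timestamps[i - 1]]


def count_missing(records, field):
    """Number of records where a mandatory field is absent or null."""
    return sum(1 for record in records if record.get(field) is None)


def out_of_range(values, low, high):
    """Values falling outside the inclusive range expected for the unit."""
    return [value for value in values if not low <= value <= high]


def irregular_intervals(timestamps, expected, tolerance):
    """Indices whose gap to the previous sample deviates from the expected
    period by more than the tolerance."""
    return [i for i in range(1, len(timestamps))
            if abs((timestamps[i] - timestamps[i - 1]) - expected) > tolerance]


# Hypothetical sample: timestamps in milliseconds at a nominal 100 ms period.
timestamps_ms = [0, 100, 200, 200, 450]
repeated = non_monotonic_indices(timestamps_ms)                         # [3]
irregular = irregular_intervals(timestamps_ms, expected=100, tolerance=10)  # [3, 4]
missing = count_missing([{"time": 1}, {"time": None}, {}], "time")      # 2
bad_speeds = out_of_range([50.0, -3.0, 420.0], low=0.0, high=300.0)     # [-3.0, 420.0]
```

Returning indices or offending values rather than a boolean makes it straightforward to aggregate per-field counts for the validation report.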

11. Implementation Guidelines

  • Implement the validator in telemachus/core/validate.py
  • Expose a main entry point validate_dataset(path, level="core")
  • Integrate schema validation via jsonschema or fastjsonschema
  • Use the Python warnings module for soft alerts (AlignmentWarning, etc.)
  • Unit-test all validation rules in tests/test_validation.py
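
The soft-alert guideline could look like the sketch below, which uses only the standard warnings module. AlignmentWarning is the name given above; the ValidationWarning base class, check_alignment, collect_warnings, and the threshold values are hypothetical.

```python
import warnings


class ValidationWarning(UserWarning):
    """Base class for soft validation alerts (hypothetical name)."""


class AlignmentWarning(ValidationWarning):
    """Sensor streams are misaligned beyond the configured threshold."""


def check_alignment(field, delta_ns, max_delta_ns):
    """Emit an AlignmentWarning when the measured offset exceeds the threshold."""
    if delta_ns > max_delta_ns:
        warnings.warn(
            f"{field}: alignment offset {delta_ns} ns exceeds {max_delta_ns} ns",
            AlignmentWarning,
            stacklevel=2,
        )


def collect_warnings(check, *args):
    """Run a check and return the validation warnings it emitted as dicts."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        check(*args)
    return [
        {"type": item.category.__name__, "message": str(item.message)}
        for item in caught
        if issubclass(item.category, ValidationWarning)
    ]


entries = collect_warnings(check_alignment, "heading.deg", 9_000_000, 5_000_000)
```

Recording warnings through catch_warnings keeps the checks themselves free of report-building logic, and callers that want warnings to be fatal can switch the filter to "error" instead.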

12. Integration with CI/CD

A GitHub Actions workflow validates all datasets on every push and pull request:

name: Validate Telemachus Dataset
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Validate datasets
        run: |
          pip install telemachus-py
          telemachus validate --all --level strict

13. Future Extensions

  • Web dashboard for dataset validation reports
  • Integration with RS3 simulation output checks
  • Automatic schema discovery from datasets
  • Integration with OpenTelemetry logging and metrics (RFC-0014)

14. References

  • RFC-0001 — Telemachus Core 0.2
  • RFC-0003 — Dataset Specification 0.2
  • RFC-0004 — Extended FieldGroups Schema
  • RFC-0005 — Adapter Architecture & Provider Modules
  • RFC-0011 — Versioning & Governance Policy
  • JSON Schema — https://json-schema.org/
  • Python jsonschema library — https://pypi.org/project/jsonschema/

15. Conclusion

This RFC formalizes the validation backbone of the Telemachus ecosystem.
It provides clear operational rules for ensuring consistency, reliability, and traceability of datasets, enabling automated quality assurance for both simulated and real-world telematics data.
