The fault-injection campaign | Building a Steer-by-Wire System

Chapter 1 proved the actuator under nominal conditions. Chapter 2 asks what happens when things go wrong: 384 fault-injection runs, eleven acceptance metrics, and six failures clustered in three localised mechanisms.

Chapter 1 established the baseline: 27 characterisation runs, five sources per run, one harmonisation pipeline, and a spec check that surfaced five metric failures at temperature and voltage extremes. The system worked, but only under nominal conditions.

Chapter 2 asks the harder question: what happens when things go wrong?

The fault-injection campaign exercises the dual-channel steer-by-wire system across 384 runs and 1,920 source files:

324 single-point fault injections
48 commanded A→B handover runs
12 fault-free golden baselines

The analysis ingests five sources per run, harmonises them to a 100 Hz grid on the rig master clock, and extracts injection, detection, and emergency-op events from residuals, DTC latches, and channel-active flags. The eleven-metric acceptance verdict is 5 PASS / 6 FAIL.

The dataset at scale

Each run carries five synchronised sources: rig DAQ at 1 kHz (master clock), ECU-A and ECU-B at 100 Hz (nine monitors each), steering robot at 100 Hz, ambient sensors at 1 Hz. Single-point runs span nine fault types × three magnitudes (mild / moderate / severe) × three operating points (park / urban / high-speed) × two channels (ECU-A / ECU-B) × two repeats = 324. The operator log flags two truncated runs (inj_077, inj_244); both are processed on their available samples and their detection signatures recover normally inside the truncated window.

Campaign overview: 384 runs · 1,920 files

324 single-point injections · 48 handover-dedicated · 12 golden baselines

Source	Format · rate	Role
Rig DAQ	MF4 · 1 kHz	Master clock
ECU-A	MF4 · 100 Hz	Primary channel
ECU-B	MF4 · 100 Hz	Backup channel
Steering robot	CSV · 100 Hz	Input profile
Ambient	CSV · 1 Hz	Temp & voltage

Single-point run matrix: 324 runs

Operating points: high-speed (120 kph), urban (50 kph), parking (5 kph)
Severity: mild, moderate, severe
Fault types: motor driver short, motor driver open, angle sensor bias, angle sensor open, torque sensor drift, cross-channel freeze, CAN corruption, CAN timeout, channel supply dropout
Fault groups: motor drive, angle sensing, torque sensing, bus / channel, power

9 faults × 3 severities × 3 operating points × 2 channels (ECU-A / ECU-B) × 2 repeats = 324
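The single-point matrix can be enumerated as a Cartesian product to confirm the run count. This is a minimal sketch: the snake_case labels follow the identifiers used later in the text where they appear, and `angle_sensor_open`, `high_speed`, and the repeat indices are assumed spellings rather than the campaign's actual run labels.

```python
from itertools import product

# Labels are illustrative; the campaign's real run identifiers may differ.
FAULT_TYPES = [
    "motor_driver_short", "motor_driver_open",
    "angle_sensor_bias", "angle_sensor_open",
    "torque_sensor_drift", "cross_channel_signal_freeze",
    "can_corruption", "can_timeout", "channel_supply_dropout",
]
MAGNITUDES = ["mild", "moderate", "severe"]
OPERATING_POINTS = ["park", "urban", "high_speed"]
CHANNELS = ["ECU-A", "ECU-B"]
REPEATS = [1, 2]

# One tuple per single-point run: (fault, magnitude, operating point, channel, repeat).
run_matrix = list(product(FAULT_TYPES, MAGNITUDES, OPERATING_POINTS, CHANNELS, REPEATS))

# Every fault type should appear in the same number of runs.
runs_per_fault = {f: sum(1 for run in run_matrix if run[0] == f) for f in FAULT_TYPES}

print(len(run_matrix))  # 324
```

Each fault type lands in 36 runs, which is the denominator used in the per-fault coverage figures later in the chapter.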

Clock alignment uses MDF header timestamps relative to the rig master, the same approach from Chapter 1 scaled to the full campaign. Robot angles convert from radians to degrees; rack position in millimetres maps to road-wheel angle. ECU-B channel names reconcile to the shared schema (theta_cmd → HwaCmd_B, and so on).
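The harmonisation steps above can be sketched in a few small helpers. This is an illustration under stated assumptions, not the pipeline's actual code: the function names are hypothetical, only the `theta_cmd` → `HwaCmd_B` rename comes from the text, and the rack-to-road-wheel mapping is left out because the text does not give its ratio.

```python
import math
from bisect import bisect_left

# Only this one rename is named in the text; the full mapping is not shown there.
ECU_B_RENAMES = {"theta_cmd": "HwaCmd_B"}

def to_rig_time(t_local, header_start_s, rig_start_s):
    """Shift file-relative sample times onto the rig master clock."""
    offset = header_start_s - rig_start_s
    return [t + offset for t in t_local]

def harmonise_names(channels, renames):
    """Map source channel names to the shared schema, leaving unknown names as-is."""
    return {renames.get(name, name): values for name, values in channels.items()}

def robot_angles_deg(angles_rad):
    """Convert steering-robot angles from radians to degrees."""
    return [math.degrees(a) for a in angles_rad]

def resample_linear(t, y, grid):
    """Linearly interpolate (t, y) onto grid; t must be strictly increasing.

    Grid points outside [t[0], t[-1]] are clamped to the nearest end value.
    """
    out = []
    for g in grid:
        i = bisect_left(t, g)
        if i == 0:
            out.append(y[0])
        elif i == len(t):
            out.append(y[-1])
        else:
            w = (g - t[i - 1]) / (t[i] - t[i - 1])
            out.append(y[i - 1] + w * (y[i] - y[i - 1]))
    return out
```

For a 100 Hz grid, `grid` would be `[k * 0.01 for k in range(n)]` on rig time. Linear interpolation is one reasonable choice for the slow ambient channel; the 1 kHz rig DAQ would more likely be filtered and decimated, which this sketch does not attempt.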

What ran across every run

Every run goes through the same five-step check. For each one, the pipeline pulls the injection, detection, and handover timestamps off the rig clock and scores the result against the acceptance gates.

Injection: when the fault takes effect. Found from the targeted monitor's residual stepping outside its pre-fault baseline, with a fallback to the nominal injection instant at 3.0 s.
Detection: when the first new DTC latches on the injected channel (ignoring any code that was already active before the run started).
Handover: when the A→B channel swap completes.
FTTI check: did injection through handover finish within the hazard budget? (100 ms for self-steer at high speed, up to 500 ms in parking, depending on hazard class and operating point.)
Handover quality: after the swap settles, does achieved road-wheel angle stay within 1.0° of commanded?
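The detection rule in the list above (first new DTC, ignoring codes already active at the start) might look like the sketch below. The data layout is an assumption: one set of active DTC codes per 100 Hz sample on the injected channel. The function name and the placeholder codes in the usage note are hypothetical.

```python
def detection_time(times, active_codes):
    """Return (t_detect, new_codes) for the first sample with a newly latched DTC.

    times        : sample times in seconds on the rig clock
    active_codes : one set of active DTC codes per sample (assumed layout)
    Codes active in the first sample are treated as pre-existing and ignored.
    Returns None when no new code ever latches.
    """
    if not times:
        return None
    preexisting = set(active_codes[0])
    for t, codes in zip(times, active_codes):
        new_codes = set(codes) - preexisting
        if new_codes:
            return t, sorted(new_codes)
    return None
```

For example, with samples at 0.00, 0.01, 0.02 s and active sets `{"DTC_OLD"}`, `{"DTC_OLD"}`, `{"DTC_OLD", "DTC_NEW"}`, the function returns `(0.02, ["DTC_NEW"])`. A run that returns `None` is an undetected fault.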

Analysis pipeline: 5 stages · every run

1. Harmonise: 5 sources → 100 Hz grid on rig master clock (384 runs)
2. Extract events: t_inject, t_detect, t_eop from residuals & DTCs (per run)
3. Coverage: correct monitor DTC latched per fault (324 SP runs)
4. FTTI check: total_ms vs hazard-class gate (34 breach)
5. Handover QC: steady-state RWA dev after t_eop + 1 s (≤ 1.0°)

Injection time is refined from the targeted monitor's residual (a 5σ departure from its pre-fault baseline), with a fallback to t = 3.0 s. The FTTI check compares total_ms (injection to emergency-op completion) against the hazard-class gate.
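A minimal sketch of the 5σ injection-time refinement with its 3.0 s fallback follows. The length of the pre-fault baseline window (`baseline_end_s = 2.0`) and the use of a population standard deviation are assumptions; the text specifies only the 5σ departure and the fallback instant.

```python
from statistics import fmean, pstdev

def injection_time(times, residual, baseline_end_s=2.0, k=5.0, fallback_s=3.0):
    """Estimate t_inject as the first k-sigma departure of the monitor residual.

    The baseline mean and sigma come from samples before baseline_end_s
    (an assumed window). If the baseline is too short or the residual never
    departs, the nominal injection instant fallback_s is returned.
    """
    baseline = [r for t, r in zip(times, residual) if t < baseline_end_s]
    if len(baseline) < 2:
        return fallback_s
    mu = fmean(baseline)
    sigma = pstdev(baseline)
    for t, r in zip(times, residual):
        if t >= baseline_end_s and abs(r - mu) > k * sigma:
            return t
    return fallback_s
```

Searching only after the baseline window keeps a noisy pre-fault sample from being mistaken for the injection.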

Fault types map to three hazard classes with different timing budgets: self-steer (commission faults), incorrect-steer, and handover-omission. If a fault is never detected, it counts as a timing failure. All 384 runs complete in ~400 s on eight parallel workers. No runs failed to harmonise.
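The hazard-class gating can be sketched as a lookup plus one comparison. The gate values and most of the fault-to-class assignments come from the results reported in this chapter; placing `angle_sensor_open` in the handover-omission class is inferred by elimination (that class has 144 runs, i.e. four fault types) and is not stated in the text. The names are illustrative.

```python
HAZARD_CLASS = {
    "motor_driver_short": "self_steer",
    "cross_channel_signal_freeze": "self_steer",
    "can_corruption": "self_steer",
    "angle_sensor_bias": "incorrect_steer",
    "torque_sensor_drift": "incorrect_steer",
    "motor_driver_open": "handover_omission",
    "channel_supply_dropout": "handover_omission",
    "can_timeout": "handover_omission",
    "angle_sensor_open": "handover_omission",  # inferred, not stated in the text
}

# Self-steer gates depend on operating point; the other two classes use 150 ms.
SELF_STEER_GATE_MS = {"high_speed": 100.0, "urban": 200.0, "park": 500.0}
FLAT_GATE_MS = {"incorrect_steer": 150.0, "handover_omission": 150.0}

def gate_ms(fault_type, operating_point):
    """Return the FTTI budget in milliseconds for this fault and operating point."""
    hazard = HAZARD_CLASS[fault_type]
    if hazard == "self_steer":
        return SELF_STEER_GATE_MS[operating_point]
    return FLAT_GATE_MS[hazard]

def is_ftti_breach(fault_type, operating_point, total_ms):
    """True when the run misses its gate; total_ms=None means never detected."""
    if total_ms is None:
        return True  # an undetected fault counts as a timing failure
    return total_ms >= gate_ms(fault_type, operating_point)
```

Treating `None` as a breach is what makes the undetected ECU-B motor-driver-open runs appear in both the coverage gate and the timing gate.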

The acceptance verdict

Acceptance gates: 6 fail · 5 pass (11 total)

Metric	Observed	Gate	Verdict
Coverage (pooled)	93.21%	≥ 99%	fail
Coverage (per fault)	50.0% min	≥ 99%	fail
FTTI (all hazards)	34 / 324 breach	< FTTI	fail
Self-steer FTTI (high-speed)	0 / 36 breach	< 100 ms	pass
Self-steer FTTI (urban)	0 / 36 breach	< 200 ms	pass
Self-steer FTTI (parking)	0 / 36 breach	< 500 ms	pass
Incorrect-steer FTTI	16 / 72 breach	< 150 ms	fail
Handover-omission FTTI	18 / 144 breach	< 150 ms	fail
Handover RWA deviation	18 / 48 fail	≤ 1.0°	fail
False positives	0	0	pass
EOTTI window	1.17–3.49 s	≤ 5 s	pass

Self-steer FTTI passes at every operating point. The six failures cluster in coverage, incorrect-steer timing, handover-omission timing, and steady-state handover deviation. Zero false positives across 12 golden runs.

The headline is blunt: the campaign does not meet acceptance. Coverage falls short at 93.21% pooled (302 / 324) against a 99% gate. FTTI compliance is clean on the self-steer hazard at every operating point, but incorrect-steer and handover-omission both miss the 150 ms gate. Handover RWA deviation fails on 18 of 48 runs. The one clean robustness gate is false positives: zero DTC latches across all 12 golden runs on 18 monitors across both ECUs.

Three localised mechanisms explain most of the failures.

Coverage falls short

Diagnostic coverage by fault: pooled 93.21% · target ≥ 99%

All 18 ECU-A motor-driver-open runs detect; all 18 ECU-B runs miss. The four angle-sensor-bias misses are all mild magnitude (two urban, two park). Every other fault type reaches 100%.

Pooled single-point coverage is 93.21% (302 / 324). Two fault types drive the shortfall.

Motor driver open: 50% (18 / 36). All 18 ECU-A-injected runs detect; all 18 ECU-B-injected runs miss. The pattern is fully consistent with hot-standby behaviour: the standby motor carries no current, so an open-circuit monitor that relies on current/voltage residuals has no signal to detect.

Angle sensor bias: 88.89% (32 / 36). The four misses are all mild magnitude (two at urban, two at park). All moderate and severe runs detect. This is a sensitivity boundary, not a protocol error.

Every other fault type reaches 100% coverage.
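The pooled and per-fault coverage figures follow directly from the detection counts reported above. The sketch below reproduces the arithmetic; the snake_case fault labels are illustrative and the counts are the ones stated in the text (36 runs per fault type).

```python
RUNS_PER_FAULT = 36
COVERAGE_GATE = 0.99

# Detected runs per fault type: every type is at 36 except the two called out above.
detected = {
    "motor_driver_short": 36, "motor_driver_open": 18,
    "angle_sensor_bias": 32, "angle_sensor_open": 36,
    "torque_sensor_drift": 36, "cross_channel_signal_freeze": 36,
    "can_corruption": 36, "can_timeout": 36, "channel_supply_dropout": 36,
}

total_detected = sum(detected.values())
total_runs = RUNS_PER_FAULT * len(detected)
pooled = total_detected / total_runs

per_fault = {fault: n / RUNS_PER_FAULT for fault, n in detected.items()}
below_gate = sorted(f for f, c in per_fault.items() if c < COVERAGE_GATE)

print(f"pooled: {total_detected} / {total_runs} = {pooled:.2%}")
print("below gate:", below_gate)
```

The pooled figure hides how lopsided the shortfall is: seven of nine fault types are at 100%, so the per-fault minimum (50%) is the more informative gate.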

FTTI compliance

FTTI compliance by hazard: 34 / 324 breach overall

Self-steer timing is clean at every operating point. Incorrect-steer breaches concentrate on mild torque-sensor drift (210–230 ms vs 150 ms gate). Handover-omission breaches are the same 18 undetected ECU-B motor-driver-open runs from the coverage gap.

Self-steer passes cleanly. All motor_driver_short, cross_channel_signal_freeze, and can_corruption runs detect inside their operating-point budget: zero breaches at high-speed (100 ms), urban (200 ms), and park (500 ms).

Incorrect-steer fails (16 / 72 breach). The breach concentrates on mild torque_sensor_drift: all 12 mild runs sit at 210–230 ms, well above the 150 ms gate. Moderate (110–130 ms) and severe (70–80 ms) sit cleanly below. This is magnitude-dependent monitor sensitivity: the residual takes longer to cross threshold at mild drift amplitudes. The four undetected mild angle_sensor_bias runs also count here.

Handover-omission fails (18 / 144 breach). All 18 breaches are ECU-B-injected motor_driver_open runs that the standby-channel monitor cannot see. This is the coverage gap from above showing up in the timing gate.

Most other fault types resolve in 10–50 ms with low scatter. torque_sensor_drift is the outlier, with mean ~141 ms driven by the magnitude-dependent latency ladder (severe ~77 ms, moderate ~125 ms, mild ~221 ms).
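As a quick check on the latency ladder, the ~141 ms overall mean is the unweighted average of the three per-magnitude means, which holds if each magnitude contributes the same number of runs (12 of the 36, per the run matrix). The variable names are illustrative.

```python
from statistics import fmean

INCORRECT_STEER_GATE_MS = 150.0

# Approximate per-magnitude mean latencies for torque_sensor_drift, from the text.
mean_latency_ms = {"severe": 77.0, "moderate": 125.0, "mild": 221.0}

# Equal run counts per magnitude, so a plain mean of the three values applies.
overall_mean_ms = fmean(list(mean_latency_ms.values()))

# Magnitudes whose mean latency misses the "< 150 ms" gate.
breaching = [m for m, ms in mean_latency_ms.items() if ms >= INCORRECT_STEER_GATE_MS]

print(overall_mean_ms, breaching)
```

The overall mean sits below the 150 ms gate even though every mild run breaches it, which is why the gate is scored per run rather than on the fault-type average.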

Handover quality

Handover RWA deviation (failures only): 18 / 48 fail

Measured steady-state over [t_eop + 1.0 s, end], excluding the handover transient. Thirty runs sit at ~0.03° and pass cleanly. Failures split across three groups: severe motor-driver-short residual carry-over (up to 8.81°), a park + severe omission cluster at ~3.3°, and moderate motor driver short at ~3.4°.

30 of 48 runs sit at ~0.03° steady-state RWA deviation, well inside the 1.0° gate. 18 of 48 fail, and the failure population splits into three groups.

Severe motor driver short (6 runs): mean deviation ~5.4° at high-speed/urban and ~8.7° at park. The fault residual carries forward into B's tracking; deviation scales with operating amplitude. Maximum observed: 8.807°.

Park + severe omission faults (6 runs across motor_driver_open, channel_supply_dropout, and can_timeout): a tight cluster at ~3.2–3.4°, indistinguishable across the three fault types. Mechanism looks like a B-channel calibration/gain offset that only becomes observable post-handover under high-current conditions.

Moderate motor driver short (6 runs): ~3.40–3.46° across all operating points.
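The steady-state deviation metric can be sketched as below. The window [t_eop + 1.0 s, end] and the 1.0° gate come from the text; scoring the window by its maximum absolute deviation is an assumption (a mean or RMS would be equally plausible), and the function names are hypothetical.

```python
def steady_state_rwa_deviation(times, rwa_cmd_deg, rwa_act_deg, t_eop, settle_s=1.0):
    """Largest |commanded - achieved| road-wheel angle after the handover settles.

    Only samples at or after t_eop + settle_s are scored, which excludes the
    handover transient. Returns None if the run ends before the window opens.
    """
    window_start = t_eop + settle_s
    deviations = [
        abs(cmd - act)
        for t, cmd, act in zip(times, rwa_cmd_deg, rwa_act_deg)
        if t >= window_start
    ]
    return max(deviations) if deviations else None

def handover_passes(deviation_deg, gate_deg=1.0):
    """Pass when a deviation was measurable and sits within the gate."""
    return deviation_deg is not None and deviation_deg <= gate_deg
```

Returning `None` for a run with no post-settle samples keeps a truncated run from passing by default.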

Why this matters after kickoff

Chapter 1 paid the format tax once and proved the actuator mostly clean under nominal sweeps. Chapter 2 pays the volume tax: hundreds of fault runs, each needing the same harmonisation, event extraction, and gate checks, applied consistently across the whole campaign. Done by hand, that is weeks of review with no guarantee two engineers would score the same run the same way.

Six gates fail, clustered in three mechanisms an architect can act on. That is enough to prioritise fixes. It is not enough to sign off, and there is no sensible next step on a vehicle or a correlation drive until the system passes.

The agent does not sign the safety case. It ran every check the same way on every run and handed back a structured failure. What comes next is the regression run: the team ships fixes, re-runs the campaign, and something has to diff the two result sets and catch what the fix broke.

Next in the series: the regression run, where two full campaigns meet and MOVEcenter does the diff. If comparing 384 runs before and after a fix still happens in spreadsheets, we would like to talk: founders@movedot.com, or www.movedot.ai.