Checkpoint-backed run on steps 1–9 using step→expected-part constraints. Generated 2026-04-06 UTC.
| Step and clip(s) | Status | Expected parts | Hit frames | Parts hit (frames) | Missing |
|---|---|---|---|---|---|
| 1 step_001_correct_0.mov | partial | 30029, 3023 | 5/7 | 30029 (5/7) | 3023 |
| 2 step_002_correct_1.mov; step_002_corrected_2.mov | partial | 35480, 36841 | 2/14 | 36841 (2/14) | 35480 |
| 3 step_003_correct_3.mov | none | 20482 | 0/7 | none | 20482 |
| 4 step_004_correct_4.mov | partial | 3710, 99781 | 3/7 | 99781 (3/7) | 3710 |
| 5 step_005_correct_5.mov | full | 3023, 3666 | 4/7 | 3023 (2/7), 3666 (4/7) | — |
| 6 step_006_correct_6.mov | full | 36840 | 3/7 | 36840 (3/7) | — |
| 7 step_007_correct_7.mov | partial | 3022, 3710 | 7/7 | 3022 (7/7) | 3710 |
| 8 step_008_correct_8.mov | partial | 28802, 3024 | 1/7 | 28802 (1/7) | 3024 |
| 9 step_009_correct_9.mov | partial | 4032a, 99780 | 1/7 | 4032a (1/7) | 99780 |
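
Read: parts are ranked first by how many steps they block, then by missed frame count. That makes 3710 and 3023 the highest-leverage fixes because they hurt multiple steps: 3710 is missing from steps 4 and 7, and 3023 is missing from step 1 and detected in only 2/7 frames of step 5.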
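
The ranking rule can be sketched in Python under stated assumptions. The per-step hit counts are transcribed from the step table; `rank_parts` and its `block_at_or_below` threshold are hypothetical names, not part of the run. The report does not say whether a weakly detected part counts as blocking a step, so by default only parts with zero hit frames do.

```python
from collections import defaultdict

# (step, frames reviewed, {expected part: frames in which it hit}),
# transcribed from the step table. Step 2 covers two clips, hence 14 frames.
STEPS = [
    (1, 7, {"30029": 5, "3023": 0}),
    (2, 14, {"35480": 0, "36841": 2}),
    (3, 7, {"20482": 0}),
    (4, 7, {"3710": 0, "99781": 3}),
    (5, 7, {"3023": 2, "3666": 4}),
    (6, 7, {"36840": 3}),
    (7, 7, {"3022": 7, "3710": 0}),
    (8, 7, {"28802": 1, "3024": 0}),
    (9, 7, {"4032a": 1, "99780": 0}),
]


def rank_parts(steps, block_at_or_below=0.0):
    """Rank parts by steps blocked, then by missed frames (both descending).

    Assumption: a part blocks a step when its hit fraction in that step is
    at or below `block_at_or_below`. The default 0.0 means "never hit".
    """
    blocked = defaultdict(int)
    missed = defaultdict(int)
    for _step, frames, hits in steps:
        for part, hit in hits.items():
            missed[part] += frames - hit
            if hit / frames <= block_at_or_below:
                blocked[part] += 1
    # Only parts that block at least one step are ranked.
    rows = [(part, blocked[part], missed[part]) for part in missed if blocked[part]]
    return sorted(rows, key=lambda row: (-row[1], -row[2], row[0]))


if __name__ == "__main__":
    for part, n_steps, n_frames in rank_parts(STEPS):
        print(f"{part}: blocks {n_steps} step(s), {n_frames} missed frame(s)")
```

With the default threshold, 3710 ranks first with two blocked steps. Raising the hypothetical threshold to 0.3 also counts the 2/7 coverage of 3023 in step 5 as a block, which puts 3710 and 3023 in the top two positions.
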
| Design ID | Count | Share of top-10 FP mass |
|---|---|---|
| 18986 | 126 | 31.6% |
| 3022 | 93 | 23.3% |
| 6806 | 47 | 11.8% |
| 30029 | 34 | 8.5% |
| 36840 | 26 | 6.5% |
| 3005 | 18 | 4.5% |
| 28974 | 17 | 4.3% |
| 4032a | 13 | 3.3% |
| 20482 | 13 | 3.3% |
| 4006 | 12 | 3.0% |

The top-10 false-positive classes account for 399 raw detections. 18986 and 3022 alone contribute 219 of them (54.9%), so they drive most of the review noise and should be suppressed or rebalanced during retraining.
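
The share column can be reproduced from the raw counts with a short Python sketch. The counts are copied from the table; `share_table` and `smallest_cover` are hypothetical helper names, and the 50% target is an illustrative assumption rather than a threshold used by the run.

```python
# False-positive counts per design ID, copied from the table.
FP_COUNTS = {
    "18986": 126, "3022": 93, "6806": 47, "30029": 34, "36840": 26,
    "3005": 18, "28974": 17, "4032a": 13, "20482": 13, "4006": 12,
}


def share_table(counts):
    """Percentage share of each design ID, rounded to one decimal place."""
    total = sum(counts.values())
    return {design: round(100 * n / total, 1) for design, n in counts.items()}


def smallest_cover(counts, target=0.5):
    """Largest classes first, until they cover at least `target` of the total."""
    total = sum(counts.values())
    chosen, running = [], 0
    for design, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        chosen.append(design)
        running += n
        if running / total >= target:
            break
    return chosen, running


if __name__ == "__main__":
    print(sum(FP_COUNTS.values()))
    print(share_table(FP_COUNTS))
    print(smallest_cover(FP_COUNTS))
```

Under these assumptions, two classes are enough to cover more than half of the top-10 false-positive mass, which is one way to decide where suppression or rebalancing effort goes first.
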