Training splits for YOLOv8n ablation study. Source: outputs/phase2_framing_fixed/
Generated 2026-04-04 — splits.json: /root/.openclaw/workspace/assemble-anything/prototypes/synthetic_objects_yolo/outputs/phase3/splits.json
| Subset | Max scenes | Views / scene | Training images (final) | Distinct scenes used |
|---|---|---|---|---|
| 2view | 42 | 2 | 84 | 42 |
| 5view | 20 | 5 | 100 | 20 |
| 10view | 10 | 10 | 100 | 10 |
Val hold-out (fixed): 8 scenes × 10 views = 80 images. Same for all runs.
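
The image counts in the subset table above follow from a simple rule: each subset uses at most its scene cap, and every scene used contributes a fixed number of views. The sketch below reproduces that arithmetic. The function name `training_images` and the `SUBSETS` dictionary are illustrative assumptions, not names taken from the actual pipeline.

```python
# Hypothetical helper: reproduces the subset arithmetic from the table.
# Names here are illustrative and are not taken from the real pipeline.

POOL_SCENES = 42  # scenes available in the training pool

# subset name -> (max scenes, views per scene), copied from the table
SUBSETS = {
    "2view": (42, 2),
    "5view": (20, 5),
    "10view": (10, 10),
}


def training_images(pool_scenes: int, max_scenes: int, views_per_scene: int) -> int:
    """Images in a subset: scenes used (capped by the pool) times views per scene."""
    scenes_used = min(pool_scenes, max_scenes)
    return scenes_used * views_per_scene


for name, (max_scenes, views) in SUBSETS.items():
    print(name, training_images(POOL_SCENES, max_scenes, views))
```

With the full 42-scene pool this gives 84, 100, and 100 images for the 2view, 5view, and 10view subsets, matching the table.
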
44 classes remapped from original non-contiguous IDs (0–49, with gaps at 22, 35–37, 41, and 43) to contiguous 0–43. All label files in outputs/phase3/ use the remapped IDs.
| New ID | Original ID | Design ID |
|---|---|---|
| 0 | 0 | 11477 |
| 1 | 1 | 15573 |
| 2 | 2 | 1748 |
| 3 | 3 | 17485 |
| 4 | 4 | 18986 |
| 5 | 5 | 20482 |
| 6 | 6 | 25269 |
| 7 | 7 | 28802 |
| 8 | 8 | 28974 |
| 9 | 9 | 30029 |
| 10 | 10 | 3003 |
| 11 | 11 | 3005 |
| 12 | 12 | 3022 |
| 13 | 13 | 3023 |
| 14 | 14 | 3024 |
| 15 | 15 | 3070b |
| 16 | 16 | 3176 |
| 17 | 17 | 32123b |
| 18 | 18 | 3388 |
| 19 | 19 | 33909 |
| 20 | 20 | 3484 |
| 21 | 21 | 35480 |
| 22 | 23 | 3666 |
| 23 | 24 | 36840 |
| 24 | 25 | 36841 |
| 25 | 26 | 3710 |
| 26 | 27 | 3821 |
| 27 | 28 | 3822 |
| 28 | 29 | 4006 |
| 29 | 30 | 4032a |
| 30 | 31 | 41769 |
| 31 | 32 | 5091 |
| 32 | 33 | 5092 |
| 33 | 34 | 5095 |
| 34 | 38 | 65429 |
| 35 | 39 | 6806 |
| 36 | 40 | 73825 |
| 37 | 42 | 77844 |
| 38 | 44 | 85984 |
| 39 | 45 | 86996 |
| 40 | 46 | 98138 |
| 41 | 47 | 99563 |
| 42 | 48 | 99780 |
| 43 | 49 | 99781 |
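
The mapping in the table above is consistent with assigning new IDs in ascending order of original ID. The following is a minimal sketch of that remapping, assuming label files follow the usual YOLO text layout (`class x_center y_center width height` per line). The helper names `build_remap` and `remap_label_line` are hypothetical and are not taken from the actual scripts.

```python
# Hypothetical sketch of the class-ID remapping; helper names are illustrative.
# Assumes YOLO-style label lines: "<class_id> <x_center> <y_center> <width> <height>".

# Original IDs present in the table: 0-49 minus the six unused IDs.
MISSING_IDS = {22, 35, 36, 37, 41, 43}
ORIGINAL_IDS = [i for i in range(50) if i not in MISSING_IDS]


def build_remap(original_ids):
    """Map each original ID to a contiguous new ID, in ascending order."""
    return {orig: new for new, orig in enumerate(sorted(set(original_ids)))}


def remap_label_line(line, remap):
    """Rewrite the leading class ID of one label line; box fields are untouched."""
    parts = line.split()
    parts[0] = str(remap[int(parts[0])])
    return " ".join(parts)


REMAP = build_remap(ORIGINAL_IDS)

print(len(REMAP))                                      # number of classes
print(REMAP[23], REMAP[38], REMAP[49])                 # spot checks against the table
print(remap_label_line("38 0.5 0.5 0.2 0.1", REMAP))   # original ID 38 becomes 34
```

The spot checks correspond to the table rows 22 ← 23, 34 ← 38, and 43 ← 49.
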
Val hold-out: 8 scenes × 10 views = 80 images, fixed across all runs and never included in any training subset.
Training pool: 42 scenes, partitioned into 5 cross-validation folds. In each CV fold, the fold-val scenes are held out and the remaining scenes form the fold-train set.
| Fold | Fold-val scenes | Fold-train scenes | 2view images | 5view images | 10view images |
|---|---|---|---|---|---|
| fold_0 | 8 | 34 | 68 | 100 | 100 |
| fold_1 | 8 | 34 | 68 | 100 | 100 |
| fold_2 | 8 | 34 | 68 | 100 | 100 |
| fold_3 | 8 | 34 | 68 | 100 | 100 |
| fold_4 | 10 | 32 | 64 | 100 | 100 |
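The 2view image count varies across folds because that subset uses every scene in the fold-train set (34 × 2 = 68 or 32 × 2 = 64 images); the 5view and 10view subsets are capped at 20 and 10 scenes, so they stay at 100 images. fold_4 has the larger fold-val set (10 scenes instead of 8) because 42 mod 5 = 2 and it takes the two remainder scenes.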
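
The fold sizes and per-fold image counts in the table above can be reproduced with a short calculation. The sketch below assumes the last fold absorbs the remainder scenes, which is inferred from the table rather than confirmed by the source; the function names are hypothetical, and which scenes land in which fold is not modeled.

```python
# Hypothetical sketch: reproduces the fold table's counts, not the scene assignment.
# Assumption (inferred from the table): the last fold absorbs the remainder scenes.

# subset name -> (max scenes, views per scene)
SUBSETS = {"2view": (42, 2), "5view": (20, 5), "10view": (10, 10)}


def fold_val_sizes(n_scenes, n_folds):
    """Split n_scenes into n_folds fold-val sizes; the last fold takes the remainder."""
    base, remainder = divmod(n_scenes, n_folds)
    sizes = [base] * n_folds
    sizes[-1] += remainder
    return sizes


def fold_table(n_scenes=42, n_folds=5):
    """Return (fold name, fold-val scenes, fold-train scenes, image counts) rows."""
    rows = []
    for fold, n_val in enumerate(fold_val_sizes(n_scenes, n_folds)):
        n_train = n_scenes - n_val
        counts = {
            name: min(n_train, cap) * views
            for name, (cap, views) in SUBSETS.items()
        }
        rows.append((f"fold_{fold}", n_val, n_train, counts))
    return rows


for row in fold_table():
    print(row)
```

Note that scikit-learn's `KFold` distributes the remainder across the first folds instead (sizes 9, 9, 8, 8, 8 for 42 scenes), so it would not reproduce this table without adjustment.
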
Final training runs draw each subset from the full 42-scene pool (no fold-val scenes held out), subject to that subset's scene cap, and are evaluated on the 8-scene val hold-out.
| Subset | Scenes used | Training images | Scene IDs (preview) |
|---|---|---|---|
| 2view | 42 | 84 | scene_000, scene_001, scene_002, scene_003, scene_004, scene_005, scene_006, scene_007… |
| 5view | 20 | 100 | scene_000, scene_001, scene_002, scene_003, scene_004, scene_005, scene_006, scene_007… |
| 10view | 10 | 100 | scene_000, scene_001, scene_002, scene_003, scene_004, scene_005, scene_006, scene_007… |