YOLO Split Review — Phase 3

Training splits for YOLOv8n ablation study. Source: outputs/phase2_framing_fixed/

Generated 2026-04-04 — splits.json: /root/.openclaw/workspace/assemble-anything/prototypes/synthetic_objects_yolo/outputs/phase3/splits.json

✓ No val leakage
Total scenes: 50
Training pool: 42
Val hold-out: 8
Val images: 80
Classes (nc): 44
CV folds: 5

Subset Image Budget

Subset   Max scenes   Views / scene   Training images (final)   Distinct scenes used
2view    42           2               84                        42
5view    20           5               100                       20
10view   10           10              100                       10

Val hold-out (fixed): 8 scenes × 10 views = 80 images. Same for all runs.
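The budget arithmetic in the table above can be reproduced with a short sketch. It assumes each subset uses `min(max scenes, available scenes)` scenes and takes a fixed number of views from each; that rule is inferred from the numbers in this report, not taken from the generator script, and `SUBSETS` and `image_budget` are hypothetical names.

```python
# Subset definitions from the budget table: (max scenes, views per scene).
SUBSETS = {"2view": (42, 2), "5view": (20, 5), "10view": (10, 10)}


def image_budget(available_scenes: int) -> dict[str, int]:
    """Training images per subset, given how many scenes are available.

    Assumption: a subset uses at most `max_scenes` scenes, and never more
    scenes than are available.
    """
    budget = {}
    for name, (max_scenes, views) in SUBSETS.items():
        scenes_used = min(max_scenes, available_scenes)
        budget[name] = scenes_used * views
    return budget


print(image_budget(42))  # full 42-scene training pool
# {'2view': 84, '5view': 100, '10view': 100}
```

Under this assumption, only 2view is sensitive to the pool size, because its scene cap equals the full pool.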

Class Remapping

44 classes remapped from original (non-contiguous) IDs to contiguous 0–43. All label files in outputs/phase3/ use the remapped IDs.

Full remapping table (44 classes)

New ID   Original ID   Design ID
0        0             11477
1        1             15573
2        2             1748
3        3             17485
4        4             18986
5        5             20482
6        6             25269
7        7             28802
8        8             28974
9        9             30029
10       10            3003
11       11            3005
12       12            3022
13       13            3023
14       14            3024
15       15            3070b
16       16            3176
17       17            32123b
18       18            3388
19       19            33909
20       20            3484
21       21            35480
22       23            3666
23       24            36840
24       25            36841
25       26            3710
26       27            3821
27       28            3822
28       29            4006
29       30            4032a
30       31            41769
31       32            5091
32       33            5092
33       34            5095
34       38            65429
35       39            6806
36       40            73825
37       42            77844
38       44            85984
39       45            86996
40       46            98138
41       47            99563
42       48            99780
43       49            99781
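The remapping itself is simple to sketch: sort the original IDs and number them consecutively. The example below is an illustration, not the script that produced outputs/phase3/. It assumes the original IDs are 0 to 49 minus the six values that never appear in the Original ID column (22, 35, 36, 37, 41, 43), and that label files use the standard YOLO line layout with the class ID as the first field. `build_remap` and `remap_label_line` are hypothetical helper names.

```python
def build_remap(original_ids):
    """Map each original class ID to a contiguous index, in ascending order."""
    return {orig: new for new, orig in enumerate(sorted(set(original_ids)))}


def remap_label_line(line: str, remap: dict[int, int]) -> str:
    """Rewrite the class ID (first field) of one YOLO label line."""
    fields = line.split()
    fields[0] = str(remap[int(fields[0])])
    return " ".join(fields)


# Original IDs 0..49 without the IDs that are absent from the table.
unused = {22, 35, 36, 37, 41, 43}
original_ids = [i for i in range(50) if i not in unused]
remap = build_remap(original_ids)

print(len(remap), remap[23], remap[49])
# 44 22 43
print(remap_label_line("38 0.5 0.5 0.2 0.1", remap))
# 34 0.5 0.5 0.2 0.1
```

Keeping the mapping in one dictionary makes it easy to apply the same remap to every label file and to invert it later when reporting per-class results against the original IDs.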

Val Hold-Out Scenes

8 scenes × 10 views = 80 images. Fixed across all runs — never in any training subset.

scene_042 scene_043 scene_044 scene_045 scene_046 scene_047 scene_048 scene_049
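A leakage check of the kind summarized by "No val leakage" can be sketched as a set intersection over scene names. The scene ranges below follow this report (scene_000 to scene_041 for the training pool, scene_042 to scene_049 for the hold-out). The layout of splits.json is not shown here, so the sets are built in memory rather than read from that file, and `scene_ids` and `check_no_leakage` are hypothetical names.

```python
def scene_ids(start: int, stop: int) -> set[str]:
    """Scene names scene_<start> .. scene_<stop - 1>, zero-padded to 3 digits."""
    return {f"scene_{i:03d}" for i in range(start, stop)}


def check_no_leakage(train: set[str], val: set[str]) -> None:
    """Raise if any scene appears in both the training pool and the hold-out."""
    overlap = train & val
    if overlap:
        raise ValueError(f"val leakage: {sorted(overlap)}")


train_pool = scene_ids(0, 42)    # 42 training-pool scenes
val_holdout = scene_ids(42, 50)  # 8 held-out scenes

check_no_leakage(train_pool, val_holdout)
print(len(train_pool), len(val_holdout), len(val_holdout) * 10)
# 42 8 80
```

Checking at the scene level rather than the image level matters here, since views of the same scene are highly correlated.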

Training Pool — Fold Assignments

42 scenes in training pool. Color indicates which fold uses this scene as fold-val. In each CV fold, fold-val scenes are held out; remaining scenes form the fold-train set.

Fold 0 Fold 1 Fold 2 Fold 3 Fold 4
scene_000 scene_001 scene_002 scene_003 scene_004 scene_005 scene_006 scene_007 scene_008 scene_009 scene_010 scene_011 scene_012 scene_013 scene_014 scene_015 scene_016 scene_017 scene_018 scene_019 scene_020 scene_021 scene_022 scene_023 scene_024 scene_025 scene_026 scene_027 scene_028 scene_029 scene_030 scene_031 scene_032 scene_033 scene_034 scene_035 scene_036 scene_037 scene_038 scene_039 scene_040 scene_041

Per-Fold Image Counts

Fold     Fold-val scenes   Fold-train scenes   2view images   5view images   10view images
fold_0   8                 34                  68             100            100
fold_1   8                 34                  68             100            100
fold_2   8                 34                  68             100            100
fold_3   8                 34                  68             100            100
fold_4   10                32                  64             100            100

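Only the 2view counts vary across folds. 2view uses every fold-train scene (34 × 2 = 68 images, or 32 × 2 = 64 for fold 4), whereas 5view and 10view are capped at 20 and 10 scenes, so they stay at 100 images regardless of the fold. Fold 4 has the larger fold-val set (10 scenes rather than 8) because 42 = 5 × 8 + 2 and the 2 remainder scenes go to the last fold.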
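The fold sizes can be checked with a few lines of arithmetic. This sketch assumes the remainder of the integer division is added to the last fold, which matches the per-fold counts reported here; which scenes land in which fold is not reproduced. `fold_val_sizes` is a hypothetical name.

```python
def fold_val_sizes(pool_size: int, n_folds: int) -> list[int]:
    """Fold-val set sizes, with any remainder assigned to the last fold."""
    base, remainder = divmod(pool_size, n_folds)
    sizes = [base] * n_folds
    sizes[-1] += remainder
    return sizes


POOL = 42
for k, n_val in enumerate(fold_val_sizes(POOL, 5)):
    n_train = POOL - n_val
    # 2view uses every fold-train scene with 2 views each.
    print(f"fold_{k}: val={n_val} train={n_train} 2view_images={n_train * 2}")
# fold_0: val=8 train=34 2view_images=68
# fold_1: val=8 train=34 2view_images=68
# fold_2: val=8 train=34 2view_images=68
# fold_3: val=8 train=34 2view_images=68
# fold_4: val=10 train=32 2view_images=64
```

Spreading the remainder over the first folds instead (9, 9, 8, 8, 8) would give more even fold sizes; the last-fold rule is used here only because it matches the reported counts.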

Final Splits (full training pool)

Final training runs use the full 42-scene pool for each subset, evaluated on the 8-scene val hold-out.

Subset   Scenes   Images   Scenes (preview)
2view    42       84       scene_000, scene_001, scene_002, scene_003, scene_004, scene_005, scene_006, scene_007…
5view    20       100      scene_000, scene_001, scene_002, scene_003, scene_004, scene_005, scene_006, scene_007…
10view   10       100      scene_000, scene_001, scene_002, scene_003, scene_004, scene_005, scene_006, scene_007…