model: final_10view | IoU threshold: 0.5 | conf threshold: 0.25 | val images: 80 | classes: 44
Scoring method: Each image is scored by its F1 score at IoU ≥ 0.5: F1 = 2·TP / (2·TP + FP + FN). A predicted box is a True Positive (TP) if it overlaps a ground-truth box of the same class with IoU ≥ 0.5 (greedy highest-IoU matching, at most one GT per prediction). Unmatched predictions are False Positives (FP); unmatched GT boxes are False Negatives (FN). Images with no GT boxes and no predictions receive F1 = 1.0 (vacuously perfect); images with predictions but no GT receive F1 = 0.0.
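The scoring rule above can be sketched in Python as follows. This is a minimal illustration, not the script that produced this report: the box format `(x1, y1, x2, y2)`, the `(class_id, box)` tuples, and the function names `iou` and `image_f1` are all assumptions, and the greedy step is assumed to consume candidate pairs in descending IoU order.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)
Det = Tuple[int, Box]                    # assumed format: (class_id, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def image_f1(gts: List[Det], preds: List[Det], iou_thr: float = 0.5):
    """Return (f1, tp, fp, fn) for a single image."""
    if not gts and not preds:
        return 1.0, 0, 0, 0  # vacuously perfect

    # All same-class (prediction, GT) pairs, highest IoU first.
    pairs = sorted(
        ((iou(p[1], g[1]), pi, gi)
         for pi, p in enumerate(preds)
         for gi, g in enumerate(gts)
         if p[0] == g[0]),
        reverse=True,
    )

    used_preds, used_gts = set(), set()
    for overlap, pi, gi in pairs:
        if overlap < iou_thr:
            break  # remaining pairs are all below the threshold
        if pi in used_preds or gi in used_gts:
            continue  # each prediction and each GT box is matched at most once
        used_preds.add(pi)
        used_gts.add(gi)

    tp = len(used_gts)
    fp = len(preds) - tp  # unmatched predictions
    fn = len(gts) - tp    # unmatched GT boxes
    return 2 * tp / (2 * tp + fp + fn), tp, fp, fn
```

With one correct detection, one spurious prediction, and one missed part, `image_f1` returns F1 = 0.5, which follows directly from 2·1 / (2·1 + 1 + 1).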

Best examples are the images with F1 closest to 1: the model locates and classifies most parts correctly. Worst examples are the images with F1 closest to 0: most parts are missed or mislabelled.

Aggregate Metrics — Validation Set

Global F1@0.5: 0.327
Mean image F1: 0.337
Median image F1: 0.333
Total TP: 281
Total FP: 438
Total FN: 717
Images with F1 = 1.0: 0
Images with F1 = 0.0: 6
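The Global F1@0.5 value is consistent with pooling the TP, FP, and FN totals over all images and applying the same F1 formula once, whereas Mean image F1 averages the per-image scores. The short check below reproduces the global figure from the totals and two per-image scores from their counts; the helper name `f1_from_counts` is hypothetical, and treating the global score as pooled is an inference from the numbers rather than something the report states.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 1.0 when all counts are zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Pooled (micro) F1 from the validation-set totals.
global_f1 = f1_from_counts(tp=281, fp=438, fn=717)
print(round(global_f1, 3))  # 0.327

# Per-image scores use the same formula on that image's own counts.
print(round(f1_from_counts(tp=3, fp=0, fn=2), 3))   # 0.75  (scene_047_view_04.jpg)
print(round(f1_from_counts(tp=1, fp=11, fn=8), 3))  # 0.095 (scene_045_view_04.jpg)
```

The pooled score weights every box equally, so images with many parts influence it more, while the mean image F1 weights every image equally. That is why the two figures (0.327 and 0.337) can differ.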

Best 10 Examples

Highest F1@0.5: the images where the model comes closest to detecting the right parts in the right places.

Panels shown per example: Original | Ground Truth (detected / missed) | Prediction (TP / FP)

Rank      Image                  F1     GT  Pred  TP  FP  FN
Best #1   scene_047_view_04.jpg  0.750   5     3   3   0   2
Best #2   scene_047_view_03.jpg  0.667   5     4   3   1   2
Best #3   scene_049_view_08.jpg  0.625   8     8   5   3   3
Best #4   scene_047_view_05.jpg  0.600   6     4   3   1   3
Best #5   scene_047_view_02.jpg  0.600   6     4   3   1   3
Best #6   scene_047_view_01.jpg  0.600   5     5   3   2   2
Best #7   scene_048_view_09.jpg  0.571  11    10   6   4   5
Best #8   scene_042_view_07.jpg  0.538  16    10   7   3   9
Best #9   scene_049_view_04.jpg  0.533   8     7   4   3   4
Best #10  scene_048_view_04.jpg  0.500  10     6   4   2   6

Worst 10 Examples

Lowest F1@0.5 — model misses many parts or fires spurious predictions.

Panels shown per example: Original | Ground Truth (detected / missed) | Prediction (TP / FP)

Rank       Image                  F1     GT  Pred  TP  FP  FN
Worst #1   scene_045_view_01.jpg  0.000   9     5   0   5   9
Worst #2   scene_045_view_02.jpg  0.000   9     7   0   7   9
Worst #3   scene_045_view_03.jpg  0.000   8     5   0   5   8
Worst #4   scene_045_view_05.jpg  0.000   9     5   0   5   9
Worst #5   scene_045_view_06.jpg  0.000   9     4   0   4   9
Worst #6   scene_045_view_09.jpg  0.000   9     4   0   4   9
Worst #7   scene_045_view_04.jpg  0.095   9    12   1  11   8
Worst #8   scene_044_view_01.jpg  0.118  13     4   1   3  12
Worst #9   scene_045_view_08.jpg  0.118   9     8   1   7   8
Worst #10  scene_045_view_07.jpg  0.133   9     6   1   5   8