Fixed Camera Vision for Manufacturing Workflow Understanding (2025-2026 Review)

Query: 2025-2026 review papers and deployed systems on fixed/stationary camera computer vision for manufacturing workflow understanding. Focus on: overhead cameras, multi-camera setups, human activity recognition, assembly process monitoring, cycle time analysis, anomaly detection, and quality inspection using fixed camera installations. Industries: automotive, electronics assembly, aerospace, medical device manufacturing. Include both academic reviews (taxonomies, benchmarks, datasets) and practical deployments (case studies, commercial systems, edge inference). Prioritize open access sources: arXiv preprints, MDPI journals, ResearchGate, peer-reviewed open access. Exclude IEEE paywalled content.

Model: o4-mini-deep-research
Date: 2026-03-24
Searches performed: 39
Sources cited: 19

---

Taxonomy of Approaches

Fixed-camera vision in manufacturing can be categorized by camera setup (overhead/top-down vs. side-view, single vs. multi-camera) and by task (action recognition, assembly step monitoring, cycle timing, anomaly/defect detection, quality inspection). Traditional machine vision (edge filters, contour analysis) has been used for simple part detection and orientation, but it degrades under varying lighting or backgrounds. Modern systems predominantly use learning-based methods for perception: classical ML (object/feature classifiers) and deep learning (CNNs, Transformers, GNNs) (link.springer.com) (www.mdpi.com).

For example, real-time cycle-time measurement has been achieved by running a YOLOv8 object detector with ByteTrack tracking on video from an overhead Azure Kinect (link.springer.com). Human activity recognition (HAR) often uses pose estimation or skeleton-based graph networks; multi-modal RGB+D datasets (see below) are employed to train deep models (link.springer.com). Anomaly and defect detection typically use one-class models or supervised CNNs/ViTs on product images (www.sciencedirect.com) (www.mdpi.com). Multi-camera and 3D methods (e.g. multi-view stereo) enable volumetric reconstruction for assembly verification (chinarxiv.org). Recent trends integrate vision with IoT/edge architectures, e.g. vision transformers with federated learning and feedback loops (VITA-Net, www.sciencedirect.com) or FPGA+CPU co-design on Xilinx devices (www.mdpi.com).
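To make the detect-and-track route to cycle timing concrete, here is a minimal sketch of the post-processing step only. It assumes a detector and tracker have already produced per-frame box centres with stable track IDs; the `Detection` record, the rectangular `zone`, and the use of zone dwell time as a proxy for cycle time are all illustrative assumptions, not the method of the cited work.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    frame: int      # frame index in the video
    track_id: int   # identity assigned by the tracker (hypothetical field)
    cx: float       # box centre x in pixels
    cy: float       # box centre y in pixels


def in_zone(det, zone):
    """True if the box centre lies inside the axis-aligned zone (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = zone
    return x0 <= det.cx <= x1 and y0 <= det.cy <= y1


def zone_dwell_times(detections, zone, fps):
    """Seconds between first and last in-zone frame (inclusive) for each track."""
    first, last = {}, {}
    for det in sorted(detections, key=lambda d: d.frame):
        if in_zone(det, zone):
            first.setdefault(det.track_id, det.frame)  # keep earliest frame
            last[det.track_id] = det.frame             # overwrite with latest
    return {tid: (last[tid] - first[tid] + 1) / fps for tid in first}


# Synthetic example: one part moving left to right at 10 px per frame.
dets = [Detection(frame=f, track_id=1, cx=100 + 10 * f, cy=50) for f in range(30)]
station_zone = (150, 0, 300, 100)
dwell = zone_dwell_times(dets, station_zone, fps=8.0)
print(dwell)  # {1: 2.0}
```

In a real installation the zone would be drawn once in image coordinates, which is practical precisely because the camera is fixed. Track fragmentation (ID switches) would split one part into several short dwell times, so some merging or filtering step would likely be needed.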

Key method categories:

Key Datasets and Benchmarks

Several open datasets support manufacturing tasks. Common anomaly-detection benchmarks include MVTec AD (2019; 15 object/texture classes; 3,629 normal-only training images) (www.mdpi.com) and VisA (2021; ~19k images across 12 object categories) (www.mdpi.com). To bridge gaps in realism, new datasets have appeared. AutoVI (2024) is an automotive assembly dataset collected at real car assembly stations, with 6 product categories and 4,950 images in total (www.sciencedirect.com). ManuDefect-21 (2025) is a large-scale SMT electronics dataset (31k train + 13k test images, 11 component types, 82 defect types) with pixel-level labels, designed to reflect real defect ratios (www.mdpi.com). For HAR in assembly, the HA4M dataset (2022) provides multi-view (RGB+depth+skeleton) recordings of a manual assembly task (www.nature.com). The recent HARDAT dataset (2025) offers RGB-D and skeletal video of manual assembly actions labeled with MTM time units (link.springer.com). Table 1 lists representative datasets.

| Dataset | Year | Domain / Task | Details | Ref. |
|---------|------|---------------|---------|------|
| HA4M | 2022 | Human action recognition (assembly) | Multi-modal (RGB, depth, skeleton) video of a manual assembly task (www.nature.com) | Scientific Data (OA) |
| HARDAT | 2025 | HAR for assembly tasks | RGB-D and Azure Kinect skeleton data of staged assembly (MTM-labeled) (link.springer.com) | SCMA 2025 (OA Chapter) |
| AutoVI | 2024 | Visual anomaly detection (auto assembly) | 6 classes, 4,950 images from real car assembly stations (www.sciencedirect.com) | Comput. Ind. (OA) |
| MVTec AD | 2019 | Generic industrial anomaly | 15 object/texture categories, 3,629 train (normal) + 1,725 test (www.mdpi.com) | Bergmann et al., CVPR (ref) |
| VisA | 2021 | Industrial anomaly | 12 object classes, ~19k images (10,821 train, 9,621 test) (www.mdpi.com) | Zou et al., TPAMI (ref) |
| ManuDefect-21 | 2025 | Visual defect detection (electronics) | 11 SMT component types, 31k train + 13k test images, 82 defect types, pixel masks (www.mdpi.com) | Appl. Sci. (OA) |
| DAGM 2007 | 2007 | Synthetic defect detection | 10 classes, 1,500 images (billboard-like patterns with “metal” defects) | (classical benchmark) |
| NEU Steel | 2013 | Steel surface defects | 6 defect types of steel surfaces (300 images each) (www.sciencedirect.com) (open source) | Song & Yan, ArXiv (ref) |

Table 1: Key open-access datasets for fixed-camera manufacturing vision tasks (OA=open access).
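Benchmarks with normal-only training splits, such as MVTec AD, suit one-class scoring: a model sees only defect-free samples and flags anything far from them. The sketch below illustrates that idea with a nearest-neighbour distance to a stored bank of normal feature vectors. It is a toy under stated assumptions: the two-dimensional vectors stand in for features from an image encoder, and `fit_memory_bank` and `anomaly_score` are hypothetical names, not an API from any cited work.

```python
import math


def fit_memory_bank(normal_features):
    """Store feature vectors of defect-free samples (no anomalous data needed)."""
    return [tuple(f) for f in normal_features]


def anomaly_score(bank, feature, k=1):
    """Mean Euclidean distance from `feature` to its k nearest normal vectors."""
    dists = sorted(math.dist(feature, ref) for ref in bank)
    return sum(dists[:k]) / k


# Toy 2-D features; in practice these would come from an image encoder.
normal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
bank = fit_memory_bank(normal)

near = anomaly_score(bank, (0.5, 0.5))   # close to the normal cluster
far = anomaly_score(bank, (5.0, 5.0))    # far from anything seen in training

# A threshold would normally be calibrated on held-out normal samples;
# 2.0 here is an arbitrary illustrative value.
threshold = 2.0
print(near > threshold, far > threshold)  # False True
```

The design choice worth noting is that nothing here requires labeled defects, which is why this family of methods fits production lines where defects are rare and varied.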

State-of-the-Art Methods

Recent methods leverage deep learning and integrate cross-modal data, often with specialized architectures:

Deployments and Case Studies

Several real-world deployments demonstrate the above techniques in practice (Table 2). For instance, Calderon-Cordova et al. (2022) integrated a Basler fixed camera with an Epson robot to automate hinge assembly and packaging (www.mdpi.com). The vision system correctly identified 100% of parts and packaging, yielding ~92.5% assembly success (a 7.5% error rate). In the automotive sector, an experimental setup used an overhead Azure Kinect camera with YOLOv8 detection to measure cycle times of line processes, producing real-time cycle metrics that matched ground truth (link.springer.com). In electronics manufacturing, Shenvi & Sharma (2025) deployed the “PROSPECT” vision tool at assembly stations to record worker actions; the system flagged deviations from standard procedures and fed them into a failure-prediction model (www.syncsci.com). In medical device production, Guha et al. (2023) applied machine vision to catheter manufacturing, enabling 100% non-destructive in-line inspection of critical dimensions; this replaced destructive sampling and met strict quality/regulatory requirements (jeas.springeropen.com). Finally, Frustaci et al. (2022) built a heterogeneous (HW/SW) system on a Xilinx Zynq that inspects catalytic converter flanges in-line with sub-mm accuracy (www.mdpi.com). These cases illustrate edge/embedded inference and real-time integration in automotive, electronics, and medical assembly, typically with overhead or otherwise fixed-mount cameras.

| Domain | Task | Setup | Outcome | Ref. |
|--------|------|-------|---------|------|
| Automotive (assembly) | Flange alignment inspection | Fixed camera + Zynq FPGA (SoC) (www.mdpi.com) | <1 mm / <1° error; 23× faster than pure SW, enabling in-line inspection | Frustaci et al., 2022 (www.mdpi.com) |
| Automotive (cycle-time) | Cycle time computation | Overhead Azure Kinect; YOLOv8+ByteTrack (link.springer.com) | Real-time cycle metrics matching ground truth; non-invasive | Staudenrausch & Lüdemann-Ravit, 2025 (link.springer.com) |
| Electronics (assembly) | SOP compliance & yield monitoring | Fixed station cameras; deep HAR models (www.syncsci.com) | Operator actions recorded; SOP deviations flagged; feed to yield/failure prediction | Shenvi & Sharma, 2025 (www.syncsci.com) |
| Metal hinge assembly | Part ID and assembly verification | Basler area-camera + Epson robot (www.mdpi.com) | 100% part recognition; 92.5% assembly success (7.5% error) | Calderon-Cordova et al., 2022 (www.mdpi.com) |
| Medical devices | In-line quality inspection | Multi-angle camera rig, CV analysis (jeas.springeropen.com) | Achieved 100% real-time inspection (vs 5% destructive sampling); robust and precise | Guha et al., 2023 (jeas.springeropen.com) |

Table 2: Selected deployed vision systems and case studies. All systems use fixed cameras (often overhead or static) and deep models at the edge.
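The SOP-compliance case in Table 2 flags operator deviations from a standard procedure. One simple way to express that check, once an action recognizer has produced a sequence of step labels, is to align the observed sequence against the prescribed one and report the differences. The sketch below uses Python's standard `difflib` for the alignment; the step names and the `sop_deviations` helper are hypothetical, and nothing here is claimed to reflect how the cited system works.

```python
from difflib import SequenceMatcher


def sop_deviations(sop, observed):
    """Return (kind, expected_steps, observed_steps) for each mismatch.

    kind is 'delete' (step skipped), 'insert' (extra step), or
    'replace' (different step performed), as reported by difflib.
    """
    matcher = SequenceMatcher(a=sop, b=observed, autojunk=False)
    deviations = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            deviations.append((tag, sop[i1:i2], observed[j1:j2]))
    return deviations


# Hypothetical step labels for one station.
sop = ["pick_pcb", "place_pcb", "insert_connector", "fasten_screw", "inspect"]
observed = ["pick_pcb", "place_pcb", "fasten_screw", "inspect"]

print(sop_deviations(sop, observed))
# [('delete', ['insert_connector'], [])]
```

Sequence alignment tolerates a skipped or extra step without marking every later step as wrong, which a position-by-position comparison would do. It does assume the recognizer's labels are reliable; noisy labels would need smoothing before a check like this is meaningful.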

Gaps and Future Directions

Despite progress, several challenges remain:

In summary, fixed-camera vision in manufacturing is advancing rapidly (powered by CNNs/Transformers and edge AI), but will benefit from more open industrial datasets, adaptive learning methods, and tighter integration with industrial IoT and digital-twin frameworks. Addressing these gaps will enable robust, scalable vision systems for multi-industry workflows (automotive, electronics, aerospace, medical, etc.) (www.mdpi.com) (jeas.springeropen.com).

References: All citations are open-access sources (journal articles, conference papers, preprints) from 2022–2026. Each figure or dataset mentioned above is supported by the cited work’s content (www.nature.com) (www.syncsci.com) (link.springer.com) (www.sciencedirect.com) (www.mdpi.com) (link.springer.com) (www.mdpi.com) (www.mdpi.com) (jeas.springeropen.com).

---

Sources

1. Cycle Time Measurement Using AI-Based Object Detection and Tracking in Industrial Processes (Springer Nature Link)
2. Research on Surface Defect Detection of Camera Module Lens Based on YOLOv5s-Small-Target
3. HARDAT: Human Action Recognition Dataset for Manual Assembly Tasks (Springer Nature Link)
4. Optimization of Industrial Quality Inspection Systems in Computer Vision: (ScienceDirect)
5. Towards Realistic Industrial Anomaly Detection: MADE-Net Framework and ManuDefect-21 Benchmark
6. ChinaRxiv
7. Robust and High-Performance Machine Vision System for Automatic Quality Inspection in Assembly Processes
8. Praxis: a framework for AI-driven human action recognition in assembly (Journal of Intelligent Manufacturing, Springer Nature Link)
9. Towards Realistic Industrial Anomaly Detection: MADE-Net Framework and ManuDefect-21 Benchmark
10. Detecting visual anomalies in an industrial environment: Unsupervised methods put to the test on the AutoVI dataset (ScienceDirect)
11. The HA4M dataset: Multi-Modal Monitoring of an assembly task for Human Action recognition in Manufacturing (Scientific Data)
12. Detecting visual anomalies in an industrial environment: Unsupervised methods put to the test on the AutoVI dataset (ScienceDirect)
13. ChinaRxiv
14. ChinaRxiv
15. An Integrated System of Industrial Robotics and Machine Vision for the Automation of the Assembly and Packaging Process of Industrial Hinges
16. Integrating Manufacturing Intelligence, Computer Vision, and Process Observation for Yield Improvement and Failure Prediction in Electronics Manufacturing (Research on Intelligent Manufacturing and Assembly)
17. Application and validation of machine vision inspection for efficient in-process monitoring of complex biomechanical device manufacturing (Journal of Engineering and Applied Science)
18. Towards Realistic Industrial Anomaly Detection: MADE-Net Framework and ManuDefect-21 Benchmark
19. ChinaRxiv