Nasirahmadi, Abozar
- Department of Energy and Technology, Swedish University of Agricultural Sciences
- University of Kassel
Accurate and robust grape-cluster detection remains a persistent challenge in precision viticulture due to spectral variability, canopy occlusion, and lighting heterogeneity. Recent advancements in the YOLO series have focused on eliminating post-processing bottlenecks such as Non-Maximum Suppression (NMS) to improve inference speed. Furthermore, state-of-the-art models increasingly integrate attention-based mechanisms and hybrid transformer-CNN backbones to enhance feature representation and global context understanding, leading to greater accuracy. This study presents a comprehensive benchmark and error analysis of recent YOLO architectures (v8-v12), including an orientation-aware YOLOv8-OBB, across multispectral (RGB, NIR) vineyard datasets under both normal and degraded imaging conditions; here YOLOv11 and YOLOv12 are community implementations rather than official Ultralytics successors. Models were evaluated using standard metrics (Precision, Recall, F1, mAP@0.5, mAP@0.5:0.95) and a False Classification Rate (FCR) that integrates false positives and false negatives to capture field reliability. Results show that YOLOv10, YOLOv11, and YOLOv8-OBB deliver the highest overall stability and transfer performance, maintaining consistent F1 >= 0.85 across spectral regimes. RGB imagery outperforms NIR by approximately 8-10%, yet OBB regression markedly improves NIR localization, reducing FCR by up to 30% in poor-quality scenes. Cross-dataset experiments further reveal that YOLOv11 sustains the lowest metric variance, while YOLOv8-OBB achieves superior mAP@0.5:0.95 when object orientations vary. The findings emphasize that orientation-aware geometry, domain-robust feature balance, and variance-based reliability metrics are more predictive of field performance than absolute mAP values. The study provides actionable guidance for detector selection in vineyard monitoring and establishes a reproducible benchmark for multispectral object detection under real-world variability.
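As an illustration of the count-based metrics named in the abstract, the following minimal Python sketch computes Precision, Recall, and F1 from true-positive, false-positive, and false-negative counts. The abstract does not give the FCR formula, so the definition used here, (FP + FN) / (TP + FP + FN), is an assumption made only for illustration; the class and function names are likewise hypothetical and not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Matched detection counts at a fixed IoU threshold (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives


def precision(c: DetectionCounts) -> float:
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: DetectionCounts) -> float:
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1_score(c: DetectionCounts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def false_classification_rate(c: DetectionCounts) -> float:
    # ASSUMPTION: the abstract only says FCR "integrates false positives and
    # negatives"; this ratio is one plausible reading, not the paper's formula.
    total = c.tp + c.fp + c.fn
    return (c.fp + c.fn) / total if total else 0.0


if __name__ == "__main__":
    counts = DetectionCounts(tp=85, fp=10, fn=5)  # made-up example counts
    print(f"P={precision(counts):.3f} R={recall(counts):.3f} "
          f"F1={f1_score(counts):.3f} FCR={false_classification_rate(counts):.3f}")
```

Under this assumed definition, FCR rises whenever either error type grows, which is why it can diverge from Precision or Recall taken alone.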
Cross-domain model generalization; Perception reliability; Autonomous vineyard robotics; Spectral domain-aware training; Multispectral near-infrared data analytics
Results in Engineering
2026, volume: 29, article number: 108833
Other Engineering and Technologies
https://res.slu.se/id/publ/145888