WheatFormer3D: Segmentation and Phenotyping of Wheat Heads with Transformers

Singh, Ashutosh

2026 Other https://fordatis.fraunhofer.de/handle/fordatis/500
http://dx.doi.org/10.24406/fordatis/459

Singh, Ashutosh (Fraunhofer-Institut für Graphische Datenverarbeitung IGD)

Files in This Item:

File	Description	Size	Format
datasets.md	Readme file for data description. Contact and citation info.	5.15 kB	Unknown
datasets.zip	Data files.	1.98 GB	ZIP

Abstract

3D computer vision offers powerful tools for efficient agricultural analysis by enabling automated extraction of quantitative traits from point clouds of field scenes. For wheat, accurate analysis of head morphology is challenging because high-resolution point clouds are difficult to acquire and annotating them for instance segmentation requires substantial manual effort. While 3D instance segmentation has shown promise for such tasks by explicitly modeling geometric structure, existing approaches often rely on simulated data or data obtained in highly controlled indoor setups. As a result, they struggle to achieve reliable instance coverage in real field conditions. In this work, we study 3D instance segmentation of wheat heads in real in-field point clouds and introduce WheatFormer3D, a transformer-based framework designed to improve query coverage of individual wheat heads in crowded scenes. We further propose domain-specific geometric augmentations that increase data efficiency and robustness in data-scarce agricultural settings. Extensive experiments demonstrate that the proposed approach consistently outperforms recent transformer-based baselines, including OneFormer3D and Mask3D, on wheat head instance segmentation, achieving 87.96 AP@50 and 77.99 AP overall. In addition, we investigate the use of segmentation outputs for downstream phenotyping tasks and construct a reference organ-level dataset with paired indoor and in-field wheat head scans and reference volume measurements. Using this dataset, we explore the feasibility and current limitations of learning-based volume estimation from real-world point clouds, highlighting the challenges posed by noisy in-field reconstructions.

Classification

000 Computer science, information science, general works

Keywords

Wheat point clouds

Funder

Bundesministerium für Bildung und Forschung BMBF (Germany)

This item is licensed under a Creative Commons License