Masks-to-Skeleton: Multi-view mask-based tree skeleton extraction with 3D Gaussian splatting

Jul 11, 2025

Xinpeng Liu

Kanyu Xu

Risa Shinoda

Hiroaki Santo

Fumio Okura


Abstract

Accurately reconstructing tree skeletons from multi-view images is challenging. Most existing works skeletonize 3D point clouds, but thin branches with low texture contrast often cause multi-view stereo (MVS) to produce noisy and fragmented point clouds, which breaks branch connectivity. Leveraging recent advances in accurate mask extraction from images, we introduce a mask-guided graph optimization framework that estimates a 3D skeleton directly from multi-view segmentation masks, bypassing the reliance on point cloud quality. In our method, a skeleton is modeled as a graph whose nodes store positions and radii and whose adjacency matrix encodes branch connectivity. We use 3D Gaussian splatting (3DGS) to render silhouettes of the graph and directly optimize the nodes and the adjacency matrix to fit the given multi-view silhouettes in a differentiable manner. Furthermore, we apply a minimum spanning tree (MST) algorithm during the optimization loop to regularize the graph to a tree structure. Experiments on synthetic and real-world plants show consistent improvements in completeness and structural accuracy over existing point-cloud-based and heuristic baseline methods.
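To make the MST regularization step more concrete, the following is a minimal sketch of how a tree could be extracted from a graph with node positions and a soft adjacency matrix. It is an illustration under stated assumptions, not the authors' implementation: the function name `mst_edges`, the edge cost (Euclidean length divided by adjacency weight), and the toy data are all hypothetical, and the abstract does not specify how edge costs are defined.

```python
import math


def mst_edges(positions, adjacency, eps=1e-6):
    """Return the edges of a minimum spanning tree using Prim's algorithm.

    positions: list of (x, y, z) node coordinates.
    adjacency: symmetric matrix of soft edge weights in [0, 1].
    The cost below is an ASSUMPTION made for illustration:
    short edges with a high adjacency weight are cheap.
    """
    n = len(positions)
    if n == 0:
        return []

    def cost(i, j):
        # Hypothetical cost: Euclidean length scaled by inverse edge weight.
        return math.dist(positions[i], positions[j]) / (adjacency[i][j] + eps)

    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known connection to the tree
    best_parent = [-1] * n       # tree node that offers that connection
    best_cost[0] = 0.0           # grow the tree from node 0
    edges = []

    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_cost[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        # Relax the connection costs of the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                c = cost(u, v)
                if c < best_cost[v]:
                    best_cost[v] = c
                    best_parent[v] = u
    return edges


# Toy example: a vertical trunk (nodes 0, 1, 2) with one side branch (node 3).
positions = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (1, 0, 1)]
adjacency = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.8, 0.7],
    [0.1, 0.8, 0.0, 0.1],
    [0.1, 0.7, 0.1, 0.0],
]
tree = mst_edges(positions, adjacency)
print(tree)
```

A spanning tree over n nodes always has n - 1 edges and no cycles, which is why projecting the optimized graph onto an MST yields a valid tree topology. In a differentiable pipeline such a projection would typically be applied periodically between gradient steps, since the MST selection itself is discrete.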

Type

Journal article

Publication

Sensors, 25(14):4354

Last updated on Jul 11, 2025

Computer Vision, Plant Phenomics

Authors

Fumio Okura

Associate Professor
