GradiSeg: Gradient-Guided Gaussian Segmentation with Enhanced 3D Boundary Precision

Abstract

While 3D Gaussian Splatting enables high-quality real-time rendering, existing Gaussian-based frameworks for 3D semantic segmentation still face significant challenges in boundary recognition accuracy.

To address this, we propose a novel 3DGS-based framework named GradiSeg, incorporating Identity Encoding to construct a deeper semantic understanding of scenes. Our approach introduces two key modules: Identity Gradient Guided Densification (IGD) and Local Adaptive K-Nearest Neighbors (LA-KNN).

The IGD module supervises gradients of Identity Encoding to refine Gaussian distributions along object boundaries, aligning them closely with boundary contours. Meanwhile, the LA-KNN module employs position gradients to adaptively establish locality-aware propagation of Identity Encodings, preventing irregular Gaussian spreads near boundaries.

We validate the effectiveness of our method through comprehensive experiments. Results show that GradiSeg effectively addresses boundary-related issues, significantly improving segmentation accuracy without compromising scene reconstruction quality. Furthermore, our method's robust segmentation capability and decoupled Identity Encoding representation make it well suited to downstream scene editing tasks such as 3D object removal and swapping.

Overview of the proposed method. (a) We adopt Identity Encoding as a learnable vector to construct a semantic understanding of the scene. This vector is optimized through multi-view supervision to produce initial segmentation results. (b) To tackle boundary ambiguity, we introduce two boundary enhancement modules: IGD and LA-KNN. IGD refines Gaussians near object boundaries by monitoring Identity Encoding gradients. Complementarily, LA-KNN enables direction-aware feature propagation by leveraging position gradients for neighbor selection, preventing cross-instance feature contamination at boundaries.

IGD Module

The process of the IGD module. The first row shows Identity Encoding gradient monitoring: Gaussians near boundaries keep adjusting their Identity Encoding during optimization, so their gradients grow steadily and may become anomalous. The second row shows Identity Encoding densification: Gaussians with anomalous gradients are split, and the resulting Gaussians are moved to either side of the boundary, which resolves the optimization conflict during training.
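The monitor-then-split idea can be illustrated with a minimal sketch. It is not the GradiSeg implementation: the `Gaussian` fields, the fixed `grad_threshold`, the `shrink` factor, the isotropic `scale`, and the assumption that a boundary normal is already known for each Gaussian are all hypothetical simplifications chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    position: tuple      # (x, y, z) centre
    scale: float         # isotropic extent (a simplification)
    id_grad_norm: float  # accumulated Identity Encoding gradient norm (hypothetical field)


def split_anomalous(gaussians, normals, grad_threshold, shrink=0.5):
    """Split Gaussians whose Identity Encoding gradient norm exceeds the threshold.

    Each flagged Gaussian is replaced by two smaller children, offset by one
    scale length to either side along its (assumed known, non-zero) boundary normal.
    """
    result = []
    for g, n in zip(gaussians, normals):
        if g.id_grad_norm <= grad_threshold:
            result.append(g)  # gradient looks normal: keep unchanged
            continue
        length = math.sqrt(sum(c * c for c in n))
        unit = tuple(c / length for c in n)
        for sign in (1.0, -1.0):
            child_pos = tuple(p + sign * g.scale * u for p, u in zip(g.position, unit))
            # Children start with a fresh gradient statistic.
            result.append(Gaussian(child_pos, g.scale * shrink, 0.0))
    return result
```

In this sketch a Gaussian with a small gradient norm passes through untouched, while one above the threshold yields two children on opposite sides of the assumed boundary.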

LA-KNN Module

The process of the LA-KNN module. We first compute the neighboring direction by taking the opposite direction of the Gaussian position gradient. Then, we eliminate all Gaussians whose angle with the direction vector is greater than 180 degrees. For the remaining Gaussians, we sort them by their projection distance to the direction vector and select the $K$ nearest neighbors. Finally, we align the Identity Encoding features in the local space.
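The neighbor selection steps above can be sketched as follows. This is an illustrative reading rather than the authors' code: the function name `la_knn` and its parameters are hypothetical, "projection distance" is assumed to mean the scalar projection of the offset onto the direction vector, and the angular cut-off is exposed as `max_angle_deg` with a 90-degree default, which is an assumption made for the example (the angle between two vectors never exceeds 180 degrees). The final feature alignment step is not shown.

```python
import math


def la_knn(center, pos_grad, candidates, k, max_angle_deg=90.0):
    """Return indices of up to k candidates lying opposite to the position gradient.

    Candidates are filtered by their angle to the direction vector, then ranked
    by their scalar projection onto it (an assumed reading of "projection distance").
    """
    direction = [-g for g in pos_grad]  # neighboring direction = negative gradient
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        return []  # no direction available without a gradient
    direction = [d / norm for d in direction]
    cos_limit = math.cos(math.radians(max_angle_deg))

    scored = []
    for idx, p in enumerate(candidates):
        offset = [a - b for a, b in zip(p, center)]
        dist = math.sqrt(sum(o * o for o in offset))
        if dist == 0.0:
            continue  # skip a candidate coinciding with the query Gaussian
        proj = sum(o * d for o, d in zip(offset, direction))
        if proj / dist < cos_limit - 1e-12:
            continue  # angle to the direction vector exceeds the limit
        scored.append((proj, idx))

    scored.sort()  # smallest projection first
    return [idx for _, idx in scored[:k]]
```

With a gradient pointing along the negative x-axis, for example, only candidates on the positive-x side of the query Gaussian survive the angular filter, and the `k` with the smallest projections are returned.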