Historical Review — 2019 to 2026
In the global-latent design, a single low-dimensional vector conditions a shared MLP to represent an entire shape. Global latents proved the concept of neural fields but revealed capacity limits for complex geometry.
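To make the conditioning mechanism concrete, here is a minimal sketch of a shared decoder driven by one global code. It is an illustration under stated assumptions, not any particular paper's architecture: the names init_mlp and decode, the layer sizes, and the concatenation-based conditioning are all hypothetical choices, and the weights are random rather than trained.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights for a small fully connected network (untrained)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def decode(params, points, latent):
    """Predict one scalar (e.g. a signed distance) per query point.

    points: (N, 3) query coordinates; latent: (D,) global shape code.
    The same weights serve every shape; only the latent changes.
    """
    z = np.broadcast_to(latent, (points.shape[0], latent.shape[0]))
    h = np.concatenate([points, z], axis=1)   # condition by concatenation (assumed)
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)            # ReLU on hidden layers
    return h[:, 0]

rng = np.random.default_rng(0)
latent_dim = 16                               # hypothetical code size
params = init_mlp([3 + latent_dim, 64, 64, 1], rng)
points = rng.uniform(-1.0, 1.0, (5, 3))
sdf_a = decode(params, points, rng.normal(size=latent_dim))  # "shape A"
sdf_b = decode(params, points, rng.normal(size=latent_dim))  # "shape B"
```

Every detail of a shape has to pass through the latent_dim numbers in the code vector, which is where the capacity limit comes from.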
Researchers found that global latents lose fine detail. The remedy was to distribute many latents spatially on regular or multiscale grids. This solved the capacity problem but introduced memory and rigidity issues: a dense grid grows cubically with resolution and places latents uniformly, regardless of where the surface lies.
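A grid of latents is typically queried by interpolating the codes stored at the corners of the cell containing the query point. The sketch below assumes a dense grid over the unit cube with at least two vertices per axis and uses trilinear interpolation; the function name trilinear_lookup and the array layout are illustrative choices.

```python
import numpy as np

def trilinear_lookup(grid, points):
    """Interpolate per-vertex latents at continuous positions.

    grid: (R, R, R, C) latents on a regular lattice over [0, 1]^3, R >= 2.
    points: (N, 3) query positions in [0, 1].
    Returns (N, C) interpolated latents.
    """
    R = grid.shape[0]
    x = np.clip(points, 0.0, 1.0) * (R - 1)            # continuous vertex coords
    i0 = np.minimum(np.floor(x).astype(int), R - 2)    # lower corner of the cell
    t = x - i0                                         # offset inside the cell
    out = np.zeros((points.shape[0], grid.shape[-1]))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[:, 0] if dx else 1.0 - t[:, 0])
                     * (t[:, 1] if dy else 1.0 - t[:, 1])
                     * (t[:, 2] if dz else 1.0 - t[:, 2]))
                corner = grid[i0[:, 0] + dx, i0[:, 1] + dy, i0[:, 2] + dz]
                out += w[:, None] * corner
    return out

rng = np.random.default_rng(0)
grid = rng.normal(size=(8, 8, 8, 4))       # 8^3 vertices, 4 channels each
local_codes = trilinear_lookup(grid, rng.uniform(0.0, 1.0, (10, 3)))
```

The stored array has R^3 * C entries, so doubling the resolution multiplies memory by eight even if most cells are far from any surface.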
Hierarchy and sparsity emerged as key themes. Octree and hash-based approaches combined classical multigrid ideas with modern neural implicits, enabling adaptive resolution and dramatic training speedups.
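As a rough illustration of the hash-based idea, the sketch below stores one fixed-size feature table per resolution level and maps integer grid coordinates into it with a spatial hash. Several things here are assumptions made for brevity: the function name hash_encode, the prime constants (a commonly used spatial-hash choice), the level schedule, and the lookup itself, which reads only the lower corner of each cell instead of interpolating all eight corners.

```python
import numpy as np

# Primes for the spatial hash; an assumed, commonly used choice.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_encode(points, tables, base_res=4, growth=2):
    """Concatenate one feature vector per resolution level.

    points: (N, 3) in [0, 1]; tables: list of (T, F) arrays, one per level.
    Simplification: each point reads the lower corner vertex of its cell
    rather than interpolating the eight surrounding vertices.
    """
    feats = []
    for level, table in enumerate(tables):
        res = base_res * growth ** level
        vertex = np.floor(points * res).astype(np.uint64)  # integer cell coords
        h = vertex * PRIMES                                 # uint64 arithmetic
        slot = (h[:, 0] ^ h[:, 1] ^ h[:, 2]) % np.uint64(table.shape[0])
        feats.append(table[slot.astype(np.intp)])
    return np.concatenate(feats, axis=1)

rng = np.random.default_rng(0)
tables = [rng.normal(0.0, 1e-2, (2 ** 10, 2)) for _ in range(4)]  # 4 levels
points = rng.uniform(0.0, 1.0, (6, 3))
features = hash_encode(points, tables)      # shape (6, 4 levels * 2 features)
```

Memory is set by the number of levels and the table size, not by the finest grid resolution; the price is that distinct vertices can collide in the same slot.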
Explicit but adaptive latent positions, combined with transformer encoding, became the dominant approach for detail and efficiency. Farthest Point Sampling placed latents at surface-relevant locations, making representations sparse and transformer-compatible.
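Farthest Point Sampling itself is a short greedy procedure: start from one point, then repeatedly add the point that is farthest from everything chosen so far. The sketch below is a plain NumPy version for illustration; the function name, the start argument, and the random stand-in for surface samples are assumptions.

```python
import numpy as np

def farthest_point_sampling(points, k, start=0):
    """Greedily pick k well-spread indices from a point cloud.

    points: (N, 3) samples, e.g. drawn from a surface; k <= N.
    Returns an integer array of k indices into points.
    """
    chosen = np.empty(k, dtype=int)
    chosen[0] = start
    # Distance from every point to its nearest chosen point so far.
    dist = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, k):
        chosen[i] = int(np.argmax(dist))
        new_dist = np.linalg.norm(points - points[chosen[i]], axis=1)
        dist = np.minimum(dist, new_dist)
    return chosen

rng = np.random.default_rng(0)
surface = rng.normal(size=(2048, 3))
surface /= np.linalg.norm(surface, axis=1, keepdims=True)  # points on a sphere
anchor_idx = farthest_point_sampling(surface, 64)
anchors = surface[anchor_idx]               # 64 latent positions on the surface
```

Each selected position would then carry one latent, so the representation is a short set of tokens that a transformer can consume directly.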
The field shifted from "where the latent lives" to "what abstract feature the latent encodes." Position-free latent sets and axis-aligned feature planes emerged as powerful representations: the former suit transformer-native diffusion, while the latter allow diffusion on structured 2D feature maps.
Discretization fundamentally changed the nature of latent codes. VQ-VAE compression and autoregressive tokenization enabled transformer-based generation over discrete codebooks, bridging 3D shape understanding with language model architectures.
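The core of vector quantization is a nearest-neighbor lookup: each continuous latent is replaced by the index of its closest codebook entry, and those indices become the tokens a transformer models. The sketch below shows only that lookup; the function name quantize and the codebook size are illustrative, and the codebook is random rather than learned.

```python
import numpy as np

def quantize(latents, codebook):
    """Map continuous latents to discrete codebook indices.

    latents: (N, D) continuous codes; codebook: (K, D) entries.
    Returns (indices of shape (N,), quantized latents of shape (N, D)).
    """
    diff = latents[:, None, :] - codebook[None, :, :]   # (N, K, D)
    sq_dist = (diff ** 2).sum(axis=-1)                  # (N, K)
    idx = sq_dist.argmin(axis=1)
    return idx, codebook[idx]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(512, 8))        # hypothetical: 512 codes of dim 8
latents = rng.normal(size=(32, 8))          # e.g. 32 local latents of one shape
tokens, quantized = quantize(latents, codebook)
```

Here a shape becomes a sequence of 32 integers in the range 0 to 511, which is the form an autoregressive model expects.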
Gaussian primitives emerged as an alternative to implicit neural fields. Explicit point-based representations with learnable covariance enabled real-time rendering and faster optimization, bridging classical graphics with neural generation.
The frontier has returned to hierarchical representations, now with learned adaptivity. Geometry-aware local grids and unsupervised hierarchical transformers enable adaptive resolution that follows surface complexity, closing the loop from global codes back to structured hierarchies.