
Text-to-3D

Generative pipeline that turns a text prompt into a textured 3D mesh. VULK uses text-to-3D for hero objects, product mockups, and game props; assets are exported as GLB and rendered via React Three Fiber.

Text-to-3D is the generation of a textured 3D asset directly from a written description ("a stylized low-poly mountain", "a chrome chess knight on marble"). The pipeline typically chains a text-to-image stage (to lock in the silhouette and style), a multi-view diffusion stage (to produce consistent views of the object from several angles), and a reconstruction stage (NeRF, Gaussian splatting, or direct mesh extraction) that exports a GLB or USDZ file. Generation takes roughly 20–90 seconds per asset on current models.
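The three-stage chain above can be sketched as a simple composition. This is a hedged illustration only: the function names (textToImage, multiViewDiffusion, reconstructMesh, textTo3D) and the types are hypothetical placeholders standing in for real model calls, not an actual VULK or model API.

```typescript
// Hypothetical sketch of the text-to-3D pipeline. Every function here is a
// stub that threads data through the stages; real implementations would call
// generative models and return image/mesh data.
type ImageRef = { kind: "image"; prompt: string };
type MultiView = { kind: "views"; angles: number; source: ImageRef };
type MeshAsset = { kind: "glb"; sourcePrompt: string };

// Stage 1: text-to-image, locking in silhouette and style (stubbed).
function textToImage(prompt: string): ImageRef {
  return { kind: "image", prompt };
}

// Stage 2: multi-view diffusion, producing consistent angles (stubbed).
function multiViewDiffusion(image: ImageRef, angles = 6): MultiView {
  return { kind: "views", angles, source: image };
}

// Stage 3: reconstruction (NeRF / Gaussian splatting / mesh extraction)
// and export to GLB (stubbed).
function reconstructMesh(views: MultiView): MeshAsset {
  return { kind: "glb", sourcePrompt: views.source.prompt };
}

// The full pipeline is just the composition of the three stages.
function textTo3D(prompt: string): MeshAsset {
  return reconstructMesh(multiViewDiffusion(textToImage(prompt)));
}

const asset = textTo3D("a stylized low-poly mountain");
// asset.kind === "glb"
```

The point of the sketch is the data flow: each stage's output is the next stage's input, so swapping one stage (say, a different multi-view model) leaves the rest of the chain unchanged.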

In VULK, text-to-3D is invoked when a prompt requests a 3D object for which no reference image exists. The agent routes the request to TRELLIS-text or Hunyuan3D-2.1, receives the resulting GLB, uploads it to R2, and writes a React Three Fiber scene that places the mesh under environment lighting using <Environment preset="studio"> from Drei. The asset is automatically decimated and Draco-compressed for web delivery.

See /docs/ai-models/choosing-model.
