Video-to-App
Video-to-app is the workflow where the user provides a screen recording of an existing UI (a Loom of their workflow, a Figma prototype playthrough, a competitor walk-through) and the platform reverse-engineers it into source code. The system samples keyframes, runs each through a vision-language model to extract structure (routes, components, copy, color tokens), reconciles the scenes into a state graph, and emits the project.
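To make the reconciliation step concrete, here is a minimal sketch in TypeScript of what the per-frame extraction and the resulting state graph could look like. Every type, field, and function name is hypothetical (VULK's internal schema is not documented); the sketch only illustrates folding frame-level observations into screens and transition edges.

```ts
// Hypothetical shapes for per-frame extraction and the reconciled state
// graph; names are illustrative, not VULK's actual schema.
interface FrameExtraction {
  timestampMs: number;
  route: string;                        // best-guess route for the visible screen
  components: string[];                 // detected UI components, e.g. "Navbar"
  copy: string[];                       // visible text content
  colorTokens: Record<string, string>;  // e.g. { primary: "#4F46E5" }
}

interface ScreenNode {
  route: string;
  components: Set<string>;
  copy: Set<string>;
}

interface StateGraph {
  screens: Map<string, ScreenNode>;            // keyed by route
  transitions: Array<{ from: string; to: string }>;
}

// Fold the frame sequence into a graph: frames sharing a route merge into
// one screen node; a route change between adjacent frames becomes an edge.
function reconcile(frames: FrameExtraction[]): StateGraph {
  const screens = new Map<string, ScreenNode>();
  const transitions: Array<{ from: string; to: string }> = [];

  let prevRoute: string | null = null;
  for (const f of frames) {
    const node = screens.get(f.route) ?? {
      route: f.route,
      components: new Set<string>(),
      copy: new Set<string>(),
    };
    f.components.forEach((c) => node.components.add(c));
    f.copy.forEach((t) => node.copy.add(t));
    screens.set(f.route, node);

    if (prevRoute !== null && prevRoute !== f.route) {
      transitions.push({ from: prevRoute, to: f.route });
    }
    prevRoute = f.route;
  }
  return { screens, transitions };
}
```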
In VULK, video-to-app is one of three multimodal entry points (alongside screenshot-to-app and URL clone). The clip is uploaded, FFmpeg extracts ~12 keyframes, each frame is captioned by a vision model, and the resulting structured plan is fed into the regular generation pipeline. The output is a React or Next.js project that mirrors the recorded flow — every screen, every interaction, every transition — and is rendered in the live Firecracker preview.
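The extract-and-caption step could be sketched as follows, assuming ffmpeg and ffprobe are on PATH and using the OpenAI Node SDK as a stand-in for the vision model. The prompt, model name, and frame-naming scheme are assumptions, not VULK's actual implementation; only the shape of the pipeline (probe duration, sample ~12 evenly spaced frames, caption each) follows the description above.

```ts
// Sketch: sample ~12 evenly spaced frames with ffmpeg, then caption each
// with a vision model. Assumes ffmpeg/ffprobe on PATH and OPENAI_API_KEY set.
import { execFile } from "node:child_process";
import { promisify } from "node:util";
import { mkdir, readFile } from "node:fs/promises";
import OpenAI from "openai";

const run = promisify(execFile);
const openai = new OpenAI();

async function extractKeyframes(video: string, outDir: string, count = 12) {
  await mkdir(outDir, { recursive: true });
  // Probe the clip length so the sampling rate spreads frames evenly.
  const { stdout } = await run("ffprobe", [
    "-v", "error",
    "-show_entries", "format=duration",
    "-of", "csv=p=0",
    video,
  ]);
  const duration = parseFloat(stdout);
  // One frame every duration/count seconds, capped at `count` frames.
  await run("ffmpeg", [
    "-i", video,
    "-vf", `fps=${count / duration}`,
    "-frames:v", String(count),
    `${outDir}/frame-%02d.png`,
  ]);
}

async function captionFrame(path: string): Promise<string> {
  const b64 = (await readFile(path)).toString("base64");
  const res = await openai.chat.completions.create({
    model: "gpt-4o-mini", // model choice is an assumption
    messages: [{
      role: "user",
      content: [
        { type: "text", text: "Describe this UI screen: route, components, visible copy, color tokens." },
        { type: "image_url", image_url: { url: `data:image/png;base64,${b64}` } },
      ],
    }],
  });
  return res.choices[0].message.content ?? "";
}
```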
Voice-to-App
A generation flow where the user speaks the app description out loud and the AI builder transcribes, plans, and ships the code. VULK pipes microphone audio through Whisper, then into the standard generation agent.
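A minimal sketch of the transcription hop, assuming the OpenAI Node SDK's Whisper endpoint; the handoff to the generation agent is represented by a hypothetical buildApp() function, since that interface is internal to the platform.

```ts
// Sketch: transcribe recorded microphone audio with Whisper, then feed the
// text into the generation pipeline. Assumes OPENAI_API_KEY is set.
import fs from "node:fs";
import OpenAI from "openai";

const openai = new OpenAI();

async function voiceToApp(audioPath: string) {
  // Whisper turns the spoken description into a plain-text prompt.
  const transcript = await openai.audio.transcriptions.create({
    file: fs.createReadStream(audioPath),
    model: "whisper-1",
  });
  // Hypothetical: the transcript enters the same pipeline as a typed prompt.
  return buildApp(transcript.text);
}

// Stand-in for the standard generation agent; not a real VULK API.
declare function buildApp(prompt: string): Promise<void>;
```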
Screenshot-to-App
A generation mode where the user drops one or more screenshots of an existing UI and the AI rebuilds it as a working application. VULK pairs vision models with the brand engine to match colors, fonts, and layout.
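One way the brand-matching pass could work is to ask a vision model for structured design tokens directly. The BrandTokens shape, prompt, and model below are assumptions for illustration, not the brand engine's actual interface.

```ts
// Sketch: extract brand tokens (colors, fonts, layout) from a screenshot
// via a vision model that returns JSON. Assumes OPENAI_API_KEY is set.
import { readFile } from "node:fs/promises";
import OpenAI from "openai";

interface BrandTokens {
  colors: Record<string, string>; // e.g. { primary: "#0EA5E9", surface: "#FFFFFF" }
  fonts: { heading: string; body: string };
  layout: string;                 // e.g. "sidebar + card grid"
}

const openai = new OpenAI();

async function extractBrandTokens(screenshotPath: string): Promise<BrandTokens> {
  const b64 = (await readFile(screenshotPath)).toString("base64");
  const res = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    response_format: { type: "json_object" }, // force parseable output
    messages: [{
      role: "user",
      content: [
        { type: "text", text: "Return JSON with keys colors, fonts, layout describing this UI's design tokens." },
        { type: "image_url", image_url: { url: `data:image/png;base64,${b64}` } },
      ],
    }],
  });
  return JSON.parse(res.choices[0].message.content ?? "{}") as BrandTokens;
}
```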