Go-with-the-Track: Video Compositing and Motion Control from Netflix Research

June 27, 2026

Eyeline Labs, the research division of Netflix, has released Go-with-the-Track, an open source video generation model that gives filmmakers precise control over what appears in a video and exactly how the camera moves through it. The model was accepted to SIGGRAPH 2026, the top venue for visual effects and computer graphics research, and is available under the Apache 2.0 license for commercial use.

The model conditions generation on 1 to 4 reference keyframes combined with up to 15,000 point tracks per video clip. Those tracks anchor reference images precisely into the generated frames, enabling a level of spatial control over compositing and camera movement that has not been available in open source video generation before.

How Point Tracking Controls the Output

Traditional video generation models accept a text prompt and a starting image, then generate the rest of the sequence. Go-with-the-Track adds a third input layer: a set of point tracks that specify where objects and reference images should be positioned in every frame of the generated video.

A point track is a sequence of 2D coordinates, one per frame, that records where a specific point in the scene sits over the course of the clip. The model learns to respect those trajectories during generation, so a compositor can describe the path of a reference character, object, or camera target through the scene as a set of control points, and the model follows them. Netflix built this capability on a prior model from the same lab, Go-with-the-Flow (a CVPR 2025 Oral paper), extending its scope from optical flow conditioning to full reference-based generation with point-level control.
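To make the idea of a point track concrete, here is a minimal sketch in plain Python. It is not the model's actual input format: the `PointTrack` class, its `visible` field, and the `linear_track` helper are hypothetical names chosen for illustration. Only the 49-frame clip length comes from the release details.

```python
from dataclasses import dataclass

NUM_FRAMES = 49  # clip length reported for the model


@dataclass
class PointTrack:
    """One tracked point: an (x, y) pixel position for every frame."""
    positions: list  # list of (x, y) tuples, one per frame
    visible: list    # list of bools, one per frame (hypothetical field)


def linear_track(start, end, num_frames=NUM_FRAMES):
    """Build a track that moves in a straight line from start to end."""
    positions = []
    for f in range(num_frames):
        t = f / (num_frames - 1)  # 0.0 on the first frame, 1.0 on the last
        x = start[0] + t * (end[0] - start[0])
        y = start[1] + t * (end[1] - start[1])
        positions.append((x, y))
    return PointTrack(positions=positions, visible=[True] * num_frames)


# A control point sliding left to right across an assumed 1280x720 frame.
track = linear_track(start=(100.0, 360.0), end=(1180.0, 360.0))
print(len(track.positions), track.positions[0], track.positions[-1])
```

A real conditioning signal would carry thousands of such tracks per clip. The sketch only shows the shape of the data: one coordinate pair per point per frame.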

Blender Mesh Stylization

One of the most direct filmmaking applications is native Blender integration. A 3D mesh animated in Blender can serve as the structural backbone for generation: the model uses the mesh's surface motion as the point track source, then generates a fully stylized video that follows the 3D animation's spatial logic.

Keyframe driven mesh stylization

Multi reference mesh compositing

The first example, mesh stylization, uses keyframes to drive the output style while the 3D mesh controls motion. The second, mesh compositing, places multiple reference images into the scene and tracks them through the animation. Both results follow the mesh geometry exactly without needing a separate tracking pass.
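One way to picture how mesh motion becomes point tracks is to project each animated vertex into the image on every frame. The sketch below does this in plain Python for a spinning cube. It is an illustration under stated assumptions, not the released Blender integration: the pinhole camera, the focal length, the 1280x720 image center, and the function names are all hypothetical.

```python
import math


def project(point, focal=800.0, cx=640.0, cy=360.0):
    """Pinhole projection of a camera-space point (x, y, z) to pixels."""
    x, y, z = point
    return (cx + focal * x / z, cy - focal * y / z)  # image y grows downward


def rotate_y(point, angle):
    """Rotate a 3D point about the vertical axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def mesh_to_tracks(vertices, num_frames=49, depth=5.0):
    """Return one 2D track per vertex for a mesh spinning a quarter turn."""
    tracks = [[] for _ in vertices]
    for f in range(num_frames):
        angle = (math.pi / 2) * f / (num_frames - 1)
        for i, v in enumerate(vertices):
            x, y, z = rotate_y(v, angle)
            # Push the mesh `depth` units in front of a fixed camera.
            tracks[i].append(project((x, y, z + depth)))
    return tracks


# Corners of a cube standing in for exported mesh vertices.
cube = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
tracks = mesh_to_tracks(cube)
print(len(tracks), len(tracks[0]), tracks[0][0])
```

Because the tracks are computed from the geometry itself, no tracker has to estimate them from pixels, which is consistent with the note above that no separate tracking pass is needed.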

Reference Based Restylization and Compositing

Beyond Blender, the model handles video restylization from still reference images and compositing driven by keypoint positions. A reference image defines the visual style or subject to inject into the generated video; the point tracks tell the model where that subject sits in each frame.

Multi reference restylization

Keypoint driven compositing

The restylization example transfers the look of multiple reference images across an entire generated sequence while preserving its spatial structure. The keypoint-driven compositing example places a reference subject at specified positions frame by frame, following a trajectory defined in advance, without the model drifting from that path as the video progresses.

Camera Control and 360-Degree Orbits

Point tracking extends to camera movement. By defining point tracks that describe how a static scene's geometry shifts relative to a moving camera, the model can generate novel camera views from a single reference capture, including complete 360-degree orbital paths around a subject.

Multi reference camera control, static scene

360-degree camera orbit

These capabilities are comparable to what InfCam builds for precise camera control, but Go-with-the-Track achieves them through point-level trajectory specification rather than camera matrix conditioning. The 360-degree orbit example generates a full revolution around a static subject from a single reference, creating coverage that would require physical rig construction on a conventional set.
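The difference between the two kinds of conditioning is easier to see in code. In the sketch below, a camera orbits a few static 3D points, and the only thing kept is where each point lands in the image on each frame. Those 2D tracks stand in for the camera path. This is an assumed illustration, not the released pipeline: the orbit radius, focal length, 1280x720 image center, and function name are hypothetical.

```python
import math


def orbit_tracks(points, radius=4.0, num_frames=49,
                 focal=800.0, cx=640.0, cy=360.0):
    """Project static world points into a camera orbiting the origin."""
    tracks = [[] for _ in points]
    for f in range(num_frames):
        theta = 2.0 * math.pi * f / (num_frames - 1)  # one full revolution
        # Camera position on a horizontal circle, always facing the origin.
        cam = (radius * math.sin(theta), 0.0, -radius * math.cos(theta))
        right = (math.cos(theta), 0.0, math.sin(theta))
        forward = (-math.sin(theta), 0.0, math.cos(theta))
        for i, p in enumerate(points):
            d = (p[0] - cam[0], p[1] - cam[1], p[2] - cam[2])
            x_cam = d[0] * right[0] + d[2] * right[2]
            y_cam = d[1]  # the camera stays level, so up is the world y axis
            z_cam = d[0] * forward[0] + d[2] * forward[2]
            tracks[i].append((cx + focal * x_cam / z_cam,
                              cy - focal * y_cam / z_cam))
    return tracks


# Points must stay inside the orbit radius so they remain in front of the camera.
scene = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (0.0, -0.5, 1.0)]
tracks = orbit_tracks(scene)
print(tracks[1][0], tracks[1][-1])
```

The point at the orbit center stays fixed at the image center, while off-center points sweep across the frame and return to their starting pixels after the full revolution.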

Camera Retargeting in Dynamic Scenes

Camera retargeting is a distinct capability: it copies the camera behavior of one clip and applies it to a different scene. The model transfers the motion pattern of the source camera to a new subject or environment while keeping the target scene's content unchanged.

Camera retargeting in a dynamic scene

This is the workflow that prior camera motion models like Warp-as-History have approached through different technical means. Go-with-the-Track handles it through point tracks defined on the dynamic scene's moving elements, which allows the retargeted camera path to respect the motion of subjects in the target clip rather than treating the scene as static.

Material and Lighting Reconstruction

The model also supports albedo and shading estimation from video, separating the intrinsic material properties of a surface from its lighting. These capabilities enable compositors to relight footage or isolate material properties for use in downstream rendering.

Albedo reconstruction

Shading and lighting estimation

Separating albedo from shading is a core step in physically based rendering pipelines. Having this available in an open model built on a video generation backbone means the same framework that handles compositing and camera control can also produce the material data needed to integrate generated content into a CG pipeline.
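The value of the split can be shown with the standard intrinsic image model, in which each pixel is the product of an albedo color and a shading term. The sketch below assumes that simple multiplicative model and a scalar shading value per pixel. The article does not state which formulation the model uses, and the `compose` helper and sample values are hypothetical.

```python
def compose(albedo, shading):
    """Recombine per-pixel albedo (RGB) and scalar shading into an image."""
    return [
        [tuple(c * s for c in rgb) for rgb, s in zip(a_row, s_row)]
        for a_row, s_row in zip(albedo, shading)
    ]


# A 2x2 image: a red surface on the top row, a blue one on the bottom.
albedo = [[(0.8, 0.2, 0.2), (0.8, 0.2, 0.2)],
          [(0.1, 0.5, 0.9), (0.1, 0.5, 0.9)]]

# Light from the left: the left column is fully lit, the right is half lit.
shading_left = [[1.0, 0.5], [1.0, 0.5]]
# Relighting: swap in a shading layer lit from the right instead.
shading_right = [[0.5, 1.0], [0.5, 1.0]]

original = compose(albedo, shading_left)
relit = compose(albedo, shading_right)
print(original[0], relit[0])
```

Once the albedo layer is isolated, relighting reduces to swapping the shading layer, and the surface colors stay untouched.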

Available Now Under Apache 2.0

Go-with-the-Track generates 49-frame videos at 480p or 720p at 24fps. The codebase, model weights, and training dataset are all available under the Apache 2.0 license, permitting commercial use. This makes it one of the most permissively licensed camera control and compositing models available as of June 2026.
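For a sense of scale, the figures quoted in this article translate into a short clip and a modest amount of track data. The arithmetic below uses only the stated numbers (49 frames, 24fps, up to 15,000 tracks). Storing each coordinate as a 4-byte float is an assumption made for the size estimate.

```python
FRAMES = 49          # frames per generated clip
FPS = 24             # playback rate
MAX_TRACKS = 15_000  # upper bound on point tracks per clip

duration_s = FRAMES / FPS          # clip length in seconds
coords = MAX_TRACKS * FRAMES * 2   # one (x, y) pair per track per frame
bytes_f32 = coords * 4             # assumed 4-byte float per coordinate

print(f"{duration_s:.2f} s, {coords:,} coordinates, {bytes_f32 / 1e6:.2f} MB")
```

That works out to a clip of about 2.04 seconds and at most 1,470,000 track coordinates, or roughly 5.88 MB under the float assumption.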

What sets this release apart from prior open source camera control models is where it comes from: Netflix is building these tools in its own research lab rather than licensing them from third-party vendors. That suggests streaming platforms are starting to treat filmmaker tooling as part of their internal technical infrastructure, not just a capability to acquire. The model has an immediate predecessor in the same lab, Go-with-the-Flow, which reached 1,083 GitHub stars following its CVPR 2025 Oral presentation.

Filmmakers working on compositing, camera design, or reference based generation can try the video generation tools available in AI FILMS Studio's video workspace.