LTX-2.3: Lightricks Upgrades Its Open Source Audio Video Model

Lightricks has released LTX-2.3, a significant update to its open source audio video foundation model. The new version brings improved visual quality, better prompt adherence, and a suite of upscaler models that push generated content toward higher resolutions and smoother frame rates.
What Is LTX-2.3?
LTX-2.3 is a DiT-based (Diffusion Transformer) audio video foundation model that generates synchronized video and audio in a single pass. Built on the same architecture as LTX-2, the 2.3 release focuses on refinement rather than architectural overhaul. The core model has 22 billion parameters and ships in two variants: a full dev model and a distilled version.
The model supports a broad range of generation tasks in a single unified system:
- text-to-video
- image-to-video
- video-to-video
- audio-to-video and video-to-audio
- image and text to audio video
This multimodal flexibility makes LTX-2.3 one of the most capable open source video models available today.
What Changed in 2.3
The headline improvements in this release are audio and visual quality. Lightricks reports stronger prompt adherence across both modalities, meaning the model follows text descriptions more precisely when generating the visual scene and its accompanying sound.
The distilled variant now runs in just 8 steps with a classifier-free guidance value of 1, making inference substantially faster without a major quality penalty. For creators who need rapid iteration, this is the practical path to quick results.
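Part of the speedup follows directly from the guidance setting: with classifier-free guidance at 1, the sampler needs only one transformer forward pass per step instead of the two (conditional plus unconditional) that active guidance requires. A minimal sketch of that arithmetic, where the non-distilled step count and CFG scale are illustrative assumptions rather than official defaults:

```python
def forward_passes(steps: int, cfg_scale: float) -> int:
    """Count transformer forward passes for a diffusion sampling run.

    With classifier-free guidance active (cfg_scale > 1), each step runs
    two passes: one conditional and one unconditional.
    """
    passes_per_step = 2 if cfg_scale > 1 else 1
    return steps * passes_per_step

# Distilled model: 8 steps at CFG 1 -> 8 forward passes.
distilled = forward_passes(steps=8, cfg_scale=1.0)

# A hypothetical non-distilled run (step count and CFG are assumptions):
# 40 steps with guidance enabled -> 80 forward passes.
dev_like = forward_passes(steps=40, cfg_scale=4.0)

print(distilled, dev_like)  # 8 80
```

Under these assumed settings the distilled path does an order of magnitude less transformer work per clip, which is where the "rapid iteration" benefit comes from.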
Lightricks also introduced a set of upscaler models released alongside the main checkpoint:
| Upscaler | Function |
|---|---|
| ltx-2.3-spatial-upscaler-x2-1.0 | 2x spatial resolution increase |
| ltx-2.3-spatial-upscaler-x1.5-1.0 | 1.5x spatial resolution increase |
| ltx-2.3-temporal-upscaler-x2-1.0 | 2x frame rate increase |
The spatial upscalers allow creators to generate at a manageable resolution and scale up afterward, while the temporal upscaler doubles the frame rate of existing clips. Used in combination, these tools make high resolution, high frame rate output more accessible on consumer hardware.
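The two-stage workflow is easy to reason about as simple multiplication of the base clip's dimensions. A small sketch (the base resolution and frame rate below are illustrative, not official defaults):

```python
def upscaled_output(width: int, height: int, fps: float,
                    spatial_factor: float = 1.0,
                    temporal_factor: float = 1.0) -> tuple:
    """Output shape after applying LTX-2.3 upscalers to a generated clip.

    spatial_factor: 2.0 or 1.5, matching the two spatial upscalers.
    temporal_factor: 2.0 for the temporal upscaler.
    """
    return (int(width * spatial_factor),
            int(height * spatial_factor),
            int(fps * temporal_factor))

# Generate at a manageable base resolution, then upscale afterward.
base = (1280, 704, 25)  # assumed base settings for illustration
print(upscaled_output(*base, spatial_factor=2.0, temporal_factor=2.0))
# (2560, 1408, 50)
```

Chaining the 2x spatial and 2x temporal upscalers turns the assumed 1280x704 at 25 fps base clip into 2560x1408 at 50 fps, without the base model ever sampling at that cost.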
Model Variants
LTX-2.3 ships as four sets of checkpoints:
ltx-2.3-22b-dev: The full model in bf16 precision. This is the flexible, trainable base intended for fine-tuning, LoRA training, and research workflows.
ltx-2.3-22b-distilled: The 8-step distilled version for faster inference. Lower memory overhead and significantly quicker generation times compared to the dev model.
ltx-2.3-22b-distilled-lora-384: A LoRA adapter that applies distillation behavior to the dev model. Useful if you want the full model's quality ceiling with faster sampling.
Upscaler models: The three upscaler checkpoints described above, applied as a post-processing step after generation.
Training and Fine-Tuning
The dev model is fully trainable. Lightricks provides reproducible LoRA and IC-LoRA training through the LTX-2 Trainer, with the company noting that motion, style, and likeness training can complete in under an hour in many configurations. This puts custom model training within reach for individual creators and small studios, not just large teams with dedicated compute.
Technical Requirements
LTX-2.3 requires Python 3.12 or newer, a CUDA version above 12.7, and PyTorch 2.7. Input resolutions must be divisible by 32, and frame counts must be a multiple of 8 plus 1 (for example 25, 49, or 121 frames). The model can be run through the official PyTorch codebase or through ComfyUI using the built-in LTXVideo nodes.
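Catching shape mistakes before a long generation run saves time. A minimal sketch of a validator for the two constraints above (the function name is ours, not part of the official codebase):

```python
def check_generation_shape(width: int, height: int, num_frames: int) -> None:
    """Validate LTX-2.3 input constraints: spatial dimensions divisible
    by 32, frame count of the form 8n + 1."""
    if width % 32 != 0 or height % 32 != 0:
        raise ValueError(
            f"width and height must be divisible by 32, got {width}x{height}")
    if num_frames % 8 != 1:
        raise ValueError(
            f"frame count must be a multiple of 8 plus 1, got {num_frames}")

check_generation_shape(1280, 704, 121)  # OK: 121 = 8 * 15 + 1
```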
Diffusers support is listed as coming soon, which will broaden compatibility with the wider Python AI tooling ecosystem.
Running LTX-2.3 Locally
For ComfyUI users, the LTXVideo nodes are available through ComfyUI Manager and the official documentation at docs.ltx.video. For direct Python usage:
```shell
git clone https://github.com/Lightricks/LTX-2.git
cd LTX-2
uv sync
source .venv/bin/activate
```
From there, the inference scripts handle both the dev and distilled checkpoints, with the upscalers applied as a second stage.
A live demo is available at the LTX-2.3 API Playground for testing generation without a local setup.
Generate Video Without a Local Setup
If you want to generate AI video without installing anything locally, AI FILMS Studio lets you create videos in the browser with no environment setup required. The platform currently runs LTX-2, the previous version from Lightricks, which already delivers strong results for text-to-video and image-to-video workflows.
For context on what LTX-2 offers and how it compares to the 2.3 update, see our earlier coverage of LTX-2. If you want to run the model locally on consumer hardware, our LTX-2 4K RTX GPU setup guide walks through ComfyUI installation and configuration.
