LTX-2 is a 19B-parameter, DiT-based audio-video foundation model by Lightricks. It generates synchronized video and audio in a single pass, so motion, dialogue, background noise, and music are produced together as one cohesive result.
Make sure your ComfyUI is updated. Workflows in this guide can be found in the Workflow Templates. If you can't find them there, your ComfyUI may be outdated (updates to the Desktop version arrive with some delay).
If nodes are missing when loading a workflow, possible reasons are:
  1. You are not using the latest ComfyUI version (Nightly version)
  2. Some nodes failed to import at startup

Key features

  • Synchronized audio-video generation: Generates motion, dialogue, SFX, and music together in one pass
  • Multiple generation modes: Text-to-video, image-to-video, and video-to-video
  • Control options: Canny, Depth, and Pose video-to-video control via IC-LoRAs
  • Keyframe-driven generation: Interpolate between keyframe images
  • Native upscaling: Spatial (2x) and temporal (2x) upscalers for higher resolution and FPS
  • Prompt enhancement: Automatic prompt enhancement support
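
As a rough illustration of the native upscaling feature above, the following sketch computes the nominal output size and frame rate after applying the 2x spatial and 2x temporal upscalers. The function name `upscaled_spec` and the example input values are hypothetical and are not part of ComfyUI or LTX-2; the real nodes may round dimensions or handle frame counts differently.

```python
SCALE = 2  # both upscalers are described as 2x


def upscaled_spec(width, height, fps, spatial=False, temporal=False):
    """Return the nominal (width, height, fps) after optional 2x upscaling.

    Hypothetical helper for planning only; it does not run any model.
    """
    if spatial:
        # The spatial upscaler doubles each image dimension.
        width *= SCALE
        height *= SCALE
    if temporal:
        # The temporal upscaler doubles the frame rate.
        fps *= SCALE
    return width, height, fps


# Example with assumed base settings of 768x512 at 24 FPS.
print(upscaled_spec(768, 512, 24, spatial=True, temporal=True))
```

This kind of calculation can help when estimating the final resolution and FPS before chaining both upscalers in a workflow.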

Model checkpoints

Name                              Description
ltx-2-19b-dev                     Full model in bf16, flexible and trainable
ltx-2-19b-dev-fp8                 Full model in fp8 quantization
ltx-2-19b-distilled               Distilled version, 8 steps, CFG=1
ltx-2-spatial-upscaler-x2-1.0     2x spatial upscaler for higher resolution
ltx-2-temporal-upscaler-x2-1.0    2x temporal upscaler for higher FPS
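
The checkpoint list above can be kept as a small lookup for scripts that need to pick a model. This is only a sketch: the names `CHECKPOINTS`, `KNOWN_SAMPLER_SETTINGS`, and `checkpoint_info` are hypothetical, and sampler settings are recorded only for the distilled checkpoint because that is the only one for which the list states steps and CFG.

```python
# Checkpoint names and descriptions as listed above.
CHECKPOINTS = {
    "ltx-2-19b-dev": "Full model in bf16, flexible and trainable",
    "ltx-2-19b-dev-fp8": "Full model in fp8 quantization",
    "ltx-2-19b-distilled": "Distilled version, 8 steps, CFG=1",
    "ltx-2-spatial-upscaler-x2-1.0": "2x spatial upscaler for higher resolution",
    "ltx-2-temporal-upscaler-x2-1.0": "2x temporal upscaler for higher FPS",
}

# Only the distilled checkpoint has explicit sampler settings in the list.
KNOWN_SAMPLER_SETTINGS = {
    "ltx-2-19b-distilled": {"steps": 8, "cfg": 1.0},
}


def checkpoint_info(name):
    """Return description plus steps/cfg (None when the list does not state them)."""
    if name not in CHECKPOINTS:
        raise KeyError(f"unknown checkpoint: {name}")
    settings = KNOWN_SAMPLER_SETTINGS.get(name, {})
    return {
        "description": CHECKPOINTS[name],
        "steps": settings.get("steps"),
        "cfg": settings.get("cfg"),
    }


print(checkpoint_info("ltx-2-19b-distilled"))
```

For the other checkpoints, `steps` and `cfg` come back as `None`, which signals that the values should be taken from the workflow template rather than guessed.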

Getting started

LTX-2 is natively supported in ComfyUI. To get started:
  1. Update ComfyUI to the latest version
  2. Go to Template Library > Video and choose any LTX-2 workflow
  3. Follow the pop-up to download models and run the workflow

Workflows

Text-to-video

Generate videos from text prompts. Distilled version (faster, 8 steps):

Text-to-Video Distilled

Image-to-video

Generate videos from an input image. Distilled version (faster, 8 steps):

Image-to-Video Distilled

Control-to-video

Generate videos with structural control using IC-LoRAs. Three control types are available: depth, Canny, and pose.

Prompting tips

When writing prompts for LTX-2, focus on detailed, chronological descriptions of actions and scenes. Include specific movements, appearances, camera angles, and environmental details in a single flowing paragraph. Start directly with the action and keep descriptions literal and precise.
Structure your prompts with:
  • Main action in a single sentence
  • Specific details about movements and gestures
  • Character/object appearances
  • Background and environment details
  • Camera angles and movements
  • Lighting and colors
  • Any changes or sudden events
Keep prompts within 200 words for best results.
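
The structure above can be sketched as a small helper that joins the parts in the recommended order into one paragraph and enforces the 200-word guideline. The names `PromptParts` and `build_prompt` are hypothetical and are not part of ComfyUI or LTX-2; the example prompt text is also made up for illustration.

```python
from dataclasses import dataclass

MAX_WORDS = 200  # guideline from the prompting tips


@dataclass
class PromptParts:
    """Hypothetical container; field order mirrors the list in the tips."""
    main_action: str
    movements: str = ""
    appearances: str = ""
    environment: str = ""
    camera: str = ""
    lighting: str = ""
    changes: str = ""


def build_prompt(parts):
    """Join the non-empty parts, in order, into a single flowing paragraph."""
    ordered = [
        parts.main_action, parts.movements, parts.appearances,
        parts.environment, parts.camera, parts.lighting, parts.changes,
    ]
    sentences = []
    for text in ordered:
        text = text.strip()
        if not text:
            continue  # skip parts the author left out
        if text[-1] not in ".!?":
            text += "."  # make sure each part ends as a sentence
        sentences.append(text)
    prompt = " ".join(sentences)
    word_count = len(prompt.split())
    if word_count > MAX_WORDS:
        raise ValueError(
            f"prompt has {word_count} words; keep it within {MAX_WORDS}"
        )
    return prompt


print(build_prompt(PromptParts(
    main_action="A woman walks along a rainy street",
    camera="The camera tracks her from behind",
)))
```

The resulting string can be pasted into the prompt field of any LTX-2 workflow. Raising an error on overly long prompts is a design choice for this sketch; trimming or warning would work as well.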

Limitations

  • Not intended to provide factual information
  • May amplify existing societal biases
  • May fail to generate videos that match prompts perfectly
  • Prompt following is heavily influenced by prompting style
  • Audio without speech may be of lower quality