Queue a new video generation request.
Authentication: This endpoint accepts either a Bearer API key or an X-Sign-In-With-X header for x402 wallet-based authentication. When using x402, a 402 Payment Required response indicates insufficient balance and includes top-up instructions.
Call /video/quote to get a price estimate before queuing. After queuing, poll /video/retrieve with the returned queue_id until the video is complete.
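The quote, queue, and poll sequence described above could be sketched as follows. This is an illustration under stated assumptions, not the service's client: `post` is a hypothetical transport callable supplied by the caller, the `/video/queue` path and the shape of the `/video/retrieve` request body are assumed, and because the completion signal is not described here, the caller passes its own `is_done` check.

```python
import time
from typing import Any, Callable, Dict

Json = Dict[str, Any]
# Hypothetical transport: takes (path, json_body) and returns the parsed JSON response.
Post = Callable[[str, Json], Json]


def run_video_job(
    post: Post,
    body: Json,
    is_done: Callable[[Json], bool],
    queue_path: str = "/video/queue",  # assumed path; not stated in this reference
    interval_s: float = 5.0,
    max_polls: int = 120,
) -> Json:
    """Get a quote, queue the request, then poll until is_done(response) is true."""
    quote = post("/video/quote", body)   # price estimate before queuing
    queued = post(queue_path, body)      # queue the generation request
    queue_id = queued["queue_id"]        # identifier returned by the queue call

    for _ in range(max_polls):
        # Assumed request shape: only the queue_id is sent when polling.
        result = post("/video/retrieve", {"queue_id": queue_id})
        if is_done(result):
            return {"quote": quote, "queue_id": queue_id, "result": result}
        time.sleep(interval_s)

    raise TimeoutError(f"queue_id {queue_id!r} not complete after {max_polls} polls")
```

Injecting `post` keeps the polling logic independent of any particular HTTP library and makes it easy to exercise with a stub.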
For the topaz-video-upscale model, use upscale_factor (1, 2, or 4) instead of resolution, and provide a video_url. Duration and FPS are detected automatically from the video file. See the Video Upscaling Guide for full details and examples.
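As a rough sketch of the upscale case, the request body might be assembled like this. The helper name `build_upscale_body` is hypothetical, and the `model` key is an assumption about the field name for the model; `upscale_factor` and `video_url` are taken from the description above.

```python
from typing import Any, Dict

VALID_UPSCALE_FACTORS = (1, 2, 4)  # 1 = enhance, 2 = double, 4 = quadruple


def build_upscale_body(video_url: str, upscale_factor: int = 2) -> Dict[str, Any]:
    """Build a body for the upscale model: upscale_factor replaces resolution."""
    if upscale_factor not in VALID_UPSCALE_FACTORS:
        raise ValueError(f"upscale_factor must be one of {VALID_UPSCALE_FACTORS}")
    # The video must be given as an http(s) URL or a data URL.
    if not video_url.startswith(("http://", "https://", "data:video/")):
        raise ValueError("video_url must be an http(s) URL or a data:video/... URL")
    return {
        "model": "topaz-video-upscale",  # assumed key name for the model field
        "video_url": video_url,
        "upscale_factor": upscale_factor,
        # No resolution, duration, or FPS: those are detected from the video file.
    }
```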
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
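A minimal sketch of building that header in Python. The helper name `auth_headers` is hypothetical, and sending `Content-Type: application/json` assumes the request body is JSON.

```python
from typing import Dict


def auth_headers(token: str) -> Dict[str, str]:
    """Return request headers carrying the token as 'Bearer <token>'."""
    token = token.strip()
    if not token:
        raise ValueError("an auth token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",  # assumption: JSON request body
    }
```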
Request body for video generation. Available fields and valid values vary by model.
The model to use for video generation.
"wan-2-7-text-to-video"
The prompt to use for video generation. Required for most models. The maximum length varies by model (default 2500 characters, up to 3500 for some models such as Seedance 2.0).
Required string length: 1 - 3500
"Commerce being conducted in the city of Venice, Italy."
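A client-side length check can catch an over-long prompt before the request is sent. The sketch below assumes the limits quoted above (2500 by default, at most 3500); `check_prompt` is a hypothetical helper, and Python's `len` counts Unicode code points, which may not match how the service counts characters.

```python
DEFAULT_MAX_PROMPT_CHARS = 2500   # default limit quoted above
EXTENDED_MAX_PROMPT_CHARS = 3500  # upper limit for some models


def check_prompt(prompt: str, max_chars: int = DEFAULT_MAX_PROMPT_CHARS) -> str:
    """Return the prompt unchanged if its length is within 1..max_chars."""
    if not 1 <= max_chars <= EXTENDED_MAX_PROMPT_CHARS:
        raise ValueError(f"max_chars must be between 1 and {EXTENDED_MAX_PROMPT_CHARS}")
    if not 1 <= len(prompt) <= max_chars:
        raise ValueError(
            f"prompt length {len(prompt)} is outside the allowed range 1..{max_chars}"
        )
    return prompt
```

Pass `max_chars=3500` only for models that are documented to accept the longer limit.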
The duration of the video to generate. Available options vary by model.
Available options: 2s, 3s, 4s, 5s, 6s, 7s, 8s, 9s, 10s, 11s, 12s, 13s, 14s, 15s, 16s, 18s, 20s, 25s, 30s, Auto
"5s"
Optional negative prompt. The maximum length varies by model (default 2500 characters, up to 3500 for some models).
Maximum string length: 3500
"low resolution, error, worst quality, low quality, defects"
The aspect ratio of the video. Available options vary by model. Some models do not support aspect_ratio.
Available options: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9, 21:9
"16:9"
The resolution of the video. Available options vary by model. Some models do not support resolution. Use upscale_factor for upscale models.
Available options: 256p, 360p, 480p, 540p, 580p, 720p, 1080p, 1440p, 2160p, 4k, 2x, 4x
"720p"
For upscale models only. 1 = quality enhancement, 2 = double resolution (default), 4 = quadruple.
Available options: 1, 2, 4
2
Whether to generate audio, for models that support audio generation and configuration. Defaults to true.
true
For image-to-video models, the reference image. Must be a URL (http/https) or a data URL (data:image/...).
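When the image is a local file rather than a hosted URL, a data URL can be built with the standard library. This is a minimal sketch: `to_data_url` is a hypothetical helper name, and the MIME type is supplied by the caller rather than detected from the bytes.

```python
import base64


def to_data_url(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URL, e.g. data:image/png;base64,...."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_url(path: str, mime_type: str = "image/png") -> str:
    """Read a file from disk and return its contents as a data URL."""
    with open(path, "rb") as fh:
        return to_data_url(fh.read(), mime_type)
```

For PNG input the result starts with `data:image/png;base64,iVBORw0K`, the same prefix as the example value for this field.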
"data:image/png;base64,iVBORw0K..."
For models that support end images or transitions, the end frame image. Must be a URL or data URL.
"data:image/png;base64,iVBORw0K..."
For models that support audio input, background music. Must be a URL or data URL. Supported: WAV, MP3. Max: 30s, 15MB.
"data:audio/mpeg;base64,SUQzBAA..."
For models that support video input (video-to-video, upscale). Must be a URL or data URL. Supported: MP4, MOV, WebM.
"data:video/mp4;base64,AAAAFGZ0eXA..."
For models with reference image support, up to 9 images for character/style consistency. Each must be a URL or data URL.
Maximum array length: 9
["data:image/png;base64,iVBORw0K..."]
For models with advanced element support (e.g., Kling O3 R2V). Up to 4 elements defining characters/objects. Reference in prompt as @Element1, @Element2, etc.
Maximum array length: 4
[
  {
    "frontal_image_url": "data:image/png;base64,iVBORw0K...",
    "reference_image_urls": ["data:image/png;base64,iVBORw0K..."]
  }
]
For models with advanced element support. Up to 4 scene reference images. Reference in prompt as @Image1, @Image2, etc.
Maximum array length: 4
["data:image/png;base64,iVBORw0K..."]
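Because prompts refer to elements and scene images by position, it can help to check that every @ElementN and @ImageN in a prompt points at something that was actually supplied. The sketch below is one way to do that client-side; `unresolved_references` is a hypothetical helper, and it assumes numbering starts at 1, as the @Element1 and @Image1 examples suggest.

```python
import re
from typing import List

MAX_ELEMENTS = 4      # up to 4 elements
MAX_SCENE_IMAGES = 4  # up to 4 scene reference images


def unresolved_references(prompt: str, n_elements: int, n_scene_images: int) -> List[str]:
    """Return @ElementN / @ImageN tags in the prompt that have no matching input."""
    if not 0 <= n_elements <= MAX_ELEMENTS:
        raise ValueError(f"at most {MAX_ELEMENTS} elements are allowed")
    if not 0 <= n_scene_images <= MAX_SCENE_IMAGES:
        raise ValueError(f"at most {MAX_SCENE_IMAGES} scene images are allowed")

    missing: List[str] = []
    for kind, count in (("Element", n_elements), ("Image", n_scene_images)):
        for match in re.finditer(rf"@{kind}(\d+)", prompt):
            index = int(match.group(1))
            if not 1 <= index <= count:  # assumed 1-based numbering
                missing.append(match.group(0))
    return missing
```

An empty list means every reference in the prompt resolves to a supplied element or scene image.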