Interface VideoProjectParams

Video-specific parameters for video workflows (t2v, i2v, s2v, ia2v, a2v, animate). Only applicable when using video models like wan_v2.2-14b-fp8_t2v or ltx2-19b-fp8_t2v. Includes frame count, fps, shift, and reference assets (image, audio, video).

WAN 2.2 models:

  • Always generate video at 16fps internally
  • The fps parameter (16 or 32) only controls post-render frame interpolation
  • fps=32 doubles the frames via interpolation after generation
  • Frame count is always calculated as: duration * 16 + 1
  • Example: 5 seconds at 32fps = 81 frames generated, then interpolated to 161 output frames

LTX-2 models:

  • Generate video at the actual specified FPS (1-60 fps range)
  • No post-render interpolation - fps directly affects generation
  • Frame count is calculated as: duration * fps + 1
  • Frame count must follow the pattern: 1 + n*8 (i.e., 1, 9, 17, 25, 33, ...)
  • Example: 5 seconds at 24fps = 121 frames (since 121 = 1 + 15*8)
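The frame-count rules above can be sketched as follows. This is an illustration only; the function names are ours, not part of the SDK, and the snapping direction for LTX-2 (rounded down here) is the SDK's choice:

```typescript
// WAN 2.2: always generates at 16fps; fps=32 doubles output frames by interpolation.
function wanOutputFrames(durationSeconds: number, fps: 16 | 32): number {
  const generated = durationSeconds * 16 + 1; // internal generation frame count
  // Interpolation inserts one frame between each pair: 81 -> 161 for 5 seconds.
  return fps === 32 ? generated * 2 - 1 : generated;
}

// LTX-2: generates directly at the requested fps; the frame count must satisfy
// 1 + n*8, so the raw count is snapped to the nearest lower valid value.
function ltx2Frames(durationSeconds: number, fps: number): number {
  const raw = durationSeconds * fps + 1;
  return Math.floor((raw - 1) / 8) * 8 + 1;
}
```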
interface VideoProjectParams {
    audioDuration?: number;
    audioStart?: number;
    controlNet?: VideoControlNetParams;
    detailerStrength?: number;
    disableNSFWFilter?: boolean;
    duration?: number;
    firstFrameStrength?: number;
    fps?: number;
    frames?: number;
    guidance?: number;
    height?: number;
    lastFrameStrength?: number;
    loras?: string[];
    loraStrengths?: number[];
    modelId: string;
    negativePrompt?: string;
    network?: SupernetType;
    numberOfMedia: number;
    outputFormat?: "mp4";
    positivePrompt: string;
    referenceAudio?: InputMedia;
    referenceImage?: InputMedia;
    referenceImageEnd?: InputMedia;
    referenceVideo?: InputMedia;
    sam2Coordinates?: { x: number; y: number }[];
    sampler?: string;
    scheduler?: string;
    seed?: number;
    shift?: number;
    steps?: number;
    stylePrompt?: string;
    teacacheThreshold?: number;
    tokenType?: TokenType;
    trimEndFrame?: boolean;
    type: "video";
    videoStart?: number;
    width?: number;
}


Properties

audioDuration?: number

Audio duration in seconds for audio-driven workflows (s2v, ia2v, a2v). Specifies how many seconds of audio to use. If not provided, defaults to 30 seconds on the server.

audioStart?: number

Audio start position in seconds for audio-driven workflows (s2v, ia2v, a2v). Specifies where to begin reading from the audio file. Default: 0
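Together, audioStart and audioDuration select a window of the reference audio. A small sketch with illustrative values:

```typescript
// s2v: use 8 seconds of audio, starting 2 seconds into the reference file.
const audioWindow = {
  audioStart: 2,    // seconds into the reference audio (default: 0)
  audioDuration: 8, // seconds of audio to use (server default: 30)
};
```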

controlNet?: VideoControlNetParams

ControlNet parameters for LTX-2 v2v workflows. Specifies which control signal to extract from the reference video.

detailerStrength?: number

Detailer LoRA strength for LTX-2 v2v IC-Control workflows. The detailer LoRA is always loaded alongside the control LoRA (canny/pose/depth). Range: 0.0-1.0, default 0.6.

disableNSFWFilter?: boolean

Disables the NSFW filter for the project. Default is false, meaning the NSFW filter is enabled. If generated media triggers the NSFW filter, it will not be available for download.

duration?: number

Duration of the video in seconds. Supported range 1 to 10 (WAN) or 4 to 20 (LTX-2).

The SDK automatically calculates the correct frame count based on the model:

  • WAN 2.2: duration * 16 + 1 (always 16fps generation)
  • LTX-2: duration * fps + 1, snapped to frame step constraint
firstFrameStrength?: number

First frame strength for LTX-2 keyframe interpolation (when referenceImageEnd is provided). Controls how strictly the first frame is matched. Range: 0.0-1.0, default 0.6. Set to 0 to disable first frame (last-frame-only mode).

fps?: number

Frames per second for output video.

WAN 2.2 Models: Only 16 or 32 fps allowed. The 32fps option is post-render frame interpolation that doubles the output frames. Internal generation is always 16fps.

LTX-2 Models: Any value from 1-60 fps. This directly controls the generation frame rate - there is no post-render interpolation.

frames?: number

Number of frames to generate.

Prefer duration instead: when duration is used, the SDK automatically calculates the correct frame count based on the model type.

guidance?: number

Guidance scale. For most Stable Diffusion models, the optimal value is 7.5. For video models: regular models use a range of 0.7-8.0; the LoRA version (lightx2v) uses 0.7-1.6, in steps of 0.01. This maps to guidanceScale in the keyFrame for both image and video models.

height?: number

Output video height. Only used if sizePreset is "custom"

lastFrameStrength?: number

Last frame strength for LTX-2 keyframe interpolation (when referenceImageEnd is provided). Controls how strictly the last frame is matched. Range: 0.0-1.0, default 0.6.

loras?: string[]

Array of LoRA IDs to apply. Available LoRAs are model-specific. The worker will download the LoRA if not already present on the persistent volume. LoRA IDs are resolved to filenames via the worker config API. Example: ['multiple_angles']

loraStrengths?: number[]

Array of LoRA strengths corresponding to each LoRA in the loras array. Values should be between 0.0 and 2.0. Defaults to 1.0 if not specified. Example: [0.9]
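Entries in loraStrengths pair positionally with entries in loras. A sketch of the pairing, with a hypothetical second LoRA ID for illustration:

```typescript
const loras = ["multiple_angles", "film_grain"]; // "film_grain" is a made-up example ID
const loraStrengths = [0.9];                     // missing entries default to 1.0

// Helper (ours, not the SDK's) showing how IDs and strengths line up by index.
function pairLoras(ids: string[], strengths: number[]): Array<[string, number]> {
  return ids.map((id, i) => [id, strengths[i] ?? 1.0]);
}
```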

modelId: string

ID of the model to use. Available models are listed in the availableModels property of the ProjectsApi instance.

negativePrompt?: string

Prompt describing what should be avoided. If not provided, the server default is used.

network?: SupernetType

Override the current network type. The default value can be read from sogni.account.currentAccount.network

numberOfMedia: number

Number of media files to generate. Depending on the project type, this is the number of images or the number of videos.

outputFormat?: "mp4"

Output video format. For now only 'mp4' is supported, defaults to 'mp4'.

positivePrompt: string

Prompt describing what should be created

referenceAudio?: InputMedia

Reference audio for audio-driven video workflows (s2v, ia2v, a2v).

referenceImage?: InputMedia

Reference image for video workflows. Maps to: startImage (i2v), characterImage (animate), referenceImage (s2v, ia2v)

referenceImageEnd?: InputMedia

Optional end image for i2v interpolation workflows. When provided with referenceImage, the video will interpolate between the two images.
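A keyframe-interpolation fragment combining referenceImage, referenceImageEnd, and the frame-strength parameters might look like this. The image values are placeholders for whatever InputMedia accepts:

```typescript
const startImage = "start.png"; // placeholder for an InputMedia value
const endImage = "end.png";     // placeholder for an InputMedia value

// i2v interpolation between two keyframes (LTX-2); strengths default to 0.6.
const keyframeParams = {
  referenceImage: startImage,    // first keyframe
  referenceImageEnd: endImage,   // last keyframe
  firstFrameStrength: 0.6,       // 0 would enable last-frame-only mode
  lastFrameStrength: 0.6,
  trimEndFrame: true,            // drop the duplicated final frame when stitching
};
```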

referenceVideo?: InputMedia

Reference video for animate and v2v (ControlNet) workflows. Maps to: drivingVideo (animate-move), sourceVideo (animate-replace), referenceVideo (v2v)

sam2Coordinates?: { x: number; y: number }[]

SAM2 click coordinates for subject detection in animate-replace workflows. Array of {x, y} coordinate objects indicating where the subject is located in the reference image.

Coordinates can be normalized (0.0-1.0) or absolute pixel values. Normalized coordinates are automatically converted to pixel values by the server. If not provided, the server defaults to the center of the frame.

Example: [{ x: 0.5, y: 0.5 }] for center of frame
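The normalized-to-pixel conversion described above can be sketched as follows. This is our illustration of the documented behavior, not the server's actual code; in particular, treating both components ≤ 1 as "normalized" is a simplifying heuristic:

```typescript
// Convert normalized SAM2 click coordinates (0.0-1.0) to absolute pixels;
// coordinates that already look absolute are passed through unchanged.
function toPixels(
  coords: { x: number; y: number }[],
  width: number,
  height: number,
): { x: number; y: number }[] {
  return coords.map(({ x, y }) =>
    x <= 1 && y <= 1
      ? { x: Math.round(x * width), y: Math.round(y * height) } // normalized input
      : { x, y },                                               // already pixels
  );
}
```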

sampler?: string

Sampler, available options depend on the model. Use sogni.projects.getModelOptions(modelId) to get the list of available samplers.

scheduler?: string

Scheduler, available options depend on the model. Use sogni.projects.getModelOptions(modelId) to get the list of available schedulers.

seed?: number

Seed for one of the media items in the project; the others will get random seeds. Must be a Uint32.

shift?: number

Shift parameter for video diffusion models. Controls motion intensity. Range: 1.0-8.0, step 0.1. Default: 8.0 for regular models; 5.0 for the speed LoRA (lightx2v), except s2v and animate, which use 8.0.

steps?: number

Number of steps. For most Stable Diffusion models, optimal value is 20.

stylePrompt?: string

Image style prompt. If not provided, server default is used.

teacacheThreshold?: number

TeaCache optimization threshold for T2V and I2V models. Range: 0.0-1.0. 0.0 = disabled. Recommended: 0.15 for T2V (~1.5x speedup), 0.2 for I2V (conservative quality-focused)

tokenType?: TokenType

Select which tokens to use for the project. If not specified, the Sogni token will be used.

trimEndFrame?: boolean

Trim the last frame from the generated video. Used for seamless stitching of transition videos where the last frame duplicates the end reference image. Default: false

type: "video"

Project type discriminator; always "video" for video projects.

videoStart?: number

Video start position in seconds for animate workflows (animate-move, animate-replace). Specifies where to begin reading from the reference video file. Default: 0

width?: number

Output video width. Only used if sizePreset is "custom"