OptionalappOptional client app/source label to attach to the project request for server-side attribution.
OptionalaudioAudio duration in seconds for audio-driven workflows (s2v, ia2v, a2v). Specifies how many seconds of audio to use. If not provided, defaults to 30 seconds on the server.
OptionalaudioControls how strongly the speaker's vocal identity is applied. Uses an extra forward pass per denoising step to amplify identity features. Range: 0-10. Default: 3.0. Set to 0 to disable (skips extra forward pass). Only used when referenceAudioIdentity is provided.
OptionalaudioAudio start position in seconds for audio-driven workflows (s2v, ia2v, a2v). Specifies where to begin reading from the audio file. Default: 0
OptionalcontrolControlNet parameters for LTX-2.3 v2v workflows. Specifies which control signal to extract from the reference video.
OptionaldetailerDetailer LoRA strength for LTX-2.3 v2v IC-Control workflows. The detailer LoRA is always loaded alongside the control LoRA (canny/pose/depth). Range: 0.0-1.0, default 0.6.
OptionaldisableNSFWFilterDisable the NSFW filter for the project. Default is false, meaning the NSFW filter is enabled. If an image triggers the NSFW filter, it will not be available for download.
OptionaldurationDuration of the video in seconds. Supported range 1 to 10 (WAN), 4 to 20 (LTX-2.3), or 4 to 15 (Seedance direct SDK projects).
The SDK automatically calculates the correct frame count based on the model:
WAN 2.2: duration * 16 + 1 (always 16fps generation)
LTX-2.3: duration * fps + 1, snapped to frame step constraint
Seedance: duration * 24 + 1
OptionalfirstFirst frame strength for LTX-2.3 keyframe interpolation (when referenceImageEnd is provided). Controls how strictly the first frame is matched. Range: 0.0-1.0, default 0.6. Set to 0 to disable first frame (last-frame-only mode).
OptionalfpsFrames per second for output video.
WAN 2.2 Models: Only 16 or 32 fps allowed. The 32fps option is post-render frame interpolation that doubles the output frames. Internal generation is always 16fps.
LTX-2.3 Models: Any value from 1-60 fps. This directly controls the generation frame rate - there is no post-render interpolation.
Seedance Models: Fixed 24fps external API generation.
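The per-model fps rules above can be checked client-side before submitting a project. The following is a minimal sketch, not part of the SDK: the `ModelFamily` type and `isValidFps` function are hypothetical names introduced for illustration, and the sketch assumes the LTX-2.3 range of 1-60 is inclusive.

```typescript
// Hypothetical helper (not an SDK API): validates fps against the rules
// documented for each model family.
type ModelFamily = "wan" | "ltx" | "seedance";

function isValidFps(family: ModelFamily, fps: number): boolean {
  switch (family) {
    case "wan":
      // WAN 2.2: only 16 or 32; 32 is post-render interpolation.
      return fps === 16 || fps === 32;
    case "ltx":
      // LTX-2.3: any value from 1 to 60 (assumed inclusive).
      return fps >= 1 && fps <= 60;
    case "seedance":
      // Seedance: fixed 24fps.
      return fps === 24;
  }
}
```

For example, `isValidFps("wan", 24)` returns `false`, while `isValidFps("ltx", 24)` returns `true`.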
OptionalframesNumber of frames to generate.
OptionalgenerateEnable native audio generation for external API-backed video models that support it. Seedance defaults to audio enabled server-side; set to false to request a silent video.
OptionalguidanceGuidance scale. For most Stable Diffusion models, optimal value is 7.5.
For video models: Regular models range 0.7-8.0, LoRA version (lightx2v) range 0.7-1.6, step 0.01.
This maps to guidanceScale in the keyFrame for both image and video models.
OptionalheightOutput video height. Only used if sizePreset is "custom"
OptionallastLast frame strength for LTX-2.3 keyframe interpolation (when referenceImageEnd is provided). Controls how strictly the last frame is matched. Range: 0.0-1.0, default 0.6.
OptionallorasArray of LoRA IDs to apply. Available LoRAs are model-specific. The worker will download the LoRA if not already present on the persistent volume. LoRA IDs are resolved to filenames via the worker config API. Example: ['multiple_angles']
OptionalloraArray of LoRA strengths corresponding to each LoRA in the loras array. Values should be between 0.0 and 2.0. Defaults to 1.0 if not specified. Example: [0.9]
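To illustrate how the strengths array lines up with the loras array, here is a minimal sketch that pairs each LoRA ID with its strength, fills in the documented default of 1.0, and rejects values outside 0.0-2.0. `resolveLoraStrengths` is a hypothetical helper, not an SDK function; whether the server applies the default and range check in exactly this way is an assumption.

```typescript
// Hypothetical helper (not an SDK API): returns one strength per LoRA ID.
// Missing entries fall back to 1.0; out-of-range values throw.
function resolveLoraStrengths(loras: string[], strengths: number[] = []): number[] {
  return loras.map((id, i) => {
    const strength = strengths[i] ?? 1.0;
    if (strength < 0 || strength > 2) {
      throw new RangeError(
        `Strength for LoRA "${id}" must be between 0.0 and 2.0, got ${strength}`
      );
    }
    return strength;
  });
}
```

With the example values from above, `resolveLoraStrengths(['multiple_angles'], [0.9])` returns `[0.9]`, and omitting the second argument returns `[1]`.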
ID of the model to use. Available models are listed in the availableModels property of the ProjectsApi instance.
OptionalnegativePrompt describing what to avoid. If not provided, the server default is used.
OptionalnetworkOverride current network type. Default value can be read from sogni.account.currentAccount.network
Number of media files to generate. Depending on project type, this can be number of images or number of videos.
OptionaloutputOutput video format. For now only 'mp4' is supported, defaults to 'mp4'.
Prompt describing what to create.
OptionalreferenceReference audio for audio-driven video workflows (s2v, ia2v, a2v).
OptionalreferenceReference audio for ID-LoRA speaker identity transfer (LTX-2.3 only). Provide a ~5 second audio clip of the target speaker's voice. The model uses this to transfer vocal identity into the generated video. Available on t2v, i2v, and v2v LTX-2.3 workflows. Not compatible with audio-driven workflows (s2v, ia2v, a2v).
OptionalreferenceSeedance-only audio context references. These must be publicly accessible HTTPS URLs. Seedance does not support text+audio-only requests; include at least one image or video reference when using audio URL references.
OptionalreferenceReference image for video workflows. Maps to: startImage (i2v), characterImage (animate), referenceImage (s2v, ia2v)
OptionalreferenceOptional end image for i2v interpolation workflows. When provided with referenceImage, the video will interpolate between the two images.
OptionalreferenceSeedance-only loose image context references. These must be publicly accessible HTTPS URLs that the vendor can fetch. Use referenceImage / referenceImageEnd when the image should lock the first or last frame.
OptionalreferenceReference video for animate and v2v (ControlNet) workflows. Maps to: drivingVideo (animate-move), sourceVideo (animate-replace), referenceVideo (v2v)
OptionalreferenceSeedance-only video context references. These must be publicly accessible HTTPS URLs and map to Seedance reference_video assets.
Optionalsam2SAM2 click coordinates for subject detection in animate-replace workflows. Array of {x, y} coordinate objects indicating where the subject is located in the reference image.
Coordinates can be normalized (0.0-1.0) or absolute pixel values. Normalized coordinates are automatically converted to pixel values by the server. If not provided, the server defaults to the center of the frame.
Example: [{ x: 0.5, y: 0.5 }] for center of frame
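The normalized-to-pixel conversion described above happens on the server, and its exact rules are not documented here. The sketch below shows one plausible version for illustration only. It assumes a point counts as normalized when both x and y lie in 0.0-1.0, and that results are rounded to whole pixels; `Point` and `toPixelCoords` are hypothetical names, not SDK APIs.

```typescript
// Hypothetical illustration of normalized -> pixel conversion.
// Assumption: a point with both x and y in [0, 1] is treated as normalized.
interface Point {
  x: number;
  y: number;
}

function toPixelCoords(points: Point[] | undefined, width: number, height: number): Point[] {
  // No coordinates supplied: fall back to the center of the frame.
  if (!points || points.length === 0) {
    return [{ x: Math.round(width / 2), y: Math.round(height / 2) }];
  }
  return points.map((p) => {
    const normalized = p.x >= 0 && p.x <= 1 && p.y >= 0 && p.y <= 1;
    return normalized
      ? { x: Math.round(p.x * width), y: Math.round(p.y * height) }
      : { x: p.x, y: p.y };
  });
}
```

Under these assumptions, `[{ x: 0.5, y: 0.5 }]` on an 832x480 frame maps to `[{ x: 416, y: 240 }]`. Note that the heuristic is ambiguous for absolute points at pixel (0, 0) or (1, 1), which is one reason to prefer normalized input throughout.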
OptionalsamplerSampler. Available options depend on the model. Use sogni.projects.getModelOptions(modelId) to get the list of available samplers.
OptionalschedulerScheduler. Available options depend on the model. Use sogni.projects.getModelOptions(modelId) to get the list of available schedulers.
OptionalseedSeed for one of the images in the project. The others get random seeds. Must be a Uint32 value.
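Because the seed must fit in an unsigned 32-bit integer (0 to 4294967295), it can help to generate or validate it explicitly. A minimal sketch follows; `randomUint32Seed` and `isUint32` are hypothetical helpers, not SDK functions.

```typescript
// Hypothetical helpers (not SDK APIs) for working with Uint32 seeds.

// Returns a random integer in the range 0..4294967295.
function randomUint32Seed(): number {
  return Math.floor(Math.random() * 0x100000000);
}

// Checks that a value is an integer representable as Uint32.
function isUint32(value: number): boolean {
  return Number.isInteger(value) && value >= 0 && value <= 0xffffffff;
}
```

A caller could then reject a user-supplied seed with `isUint32(seed)` before passing it along, or fall back to `randomUint32Seed()` when none is given.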
OptionalshiftShift parameter for video diffusion models. Controls motion intensity. Range: 1.0-8.0, step 0.1. Default: 8.0 for regular models and 5.0 for the speed LoRA (lightx2v) versions, except s2v and animate, which use 8.0.
OptionalstepsNumber of steps. For most Stable Diffusion models, optimal value is 20.
OptionalstyleImage style prompt. If not provided, server default is used.
OptionalteacacheTeaCache optimization threshold for T2V and I2V models. Range: 0.0-1.0. 0.0 = disabled. Recommended: 0.15 for T2V (~1.5x speedup), 0.2 for I2V (conservative quality-focused)
OptionaltokenSelect which tokens to use for the project. If not specified, the Sogni token will be used.
OptionaltrimTrim the last frame from the generated video. Used for seamless stitching of transition videos where the last frame duplicates the end reference image. Default: false
OptionalvideoVideo start position in seconds for animate workflows (animate-move, animate-replace). Specifies where to begin reading from the reference video file. Default: 0
OptionalwidthOutput video width. Only used if sizePreset is "custom"
Video-specific parameters for video workflows (t2v, i2v, s2v, ia2v, a2v, animate). Only applicable when using video models like wan_v2.2-14b-fp8_t2v or ltx23-22b-fp8_t2v_distilled. Includes frame count, fps, shift, and reference assets (image, audio, video).
Important: FPS and Frame Count Behavior Differs by Model
WAN 2.2 Models (wan_v2.2-*)
The fps parameter (16 or 32) only controls post-render frame interpolation
Frame calculation: duration * 16 + 1
LTX-2.3 Models (ltx2-, ltx23-)
Frame calculation: duration * fps + 1
Frame count is snapped to the pattern 1 + n*8 (i.e., 1, 9, 17, 25, 33, ...)
Seedance 2.0 Models (seedance-2-0*)
Frame calculation: duration * 24 + 1
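The per-model frame formulas above can be sketched as a single function. This is an illustration only, not the SDK's implementation: `ModelFamily` and `calculateFrames` are hypothetical names, and the snapping direction for LTX-2.3 is an assumption (this sketch rounds to the nearest value of the form 1 + n*8, which the documentation does not specify).

```typescript
// Hypothetical sketch of the documented frame-count formulas.
type ModelFamily = "wan" | "ltx" | "seedance";

function calculateFrames(family: ModelFamily, duration: number, fps?: number): number {
  switch (family) {
    case "wan":
      // WAN 2.2 always generates at 16fps, regardless of the fps parameter.
      return duration * 16 + 1;
    case "ltx": {
      if (fps === undefined) {
        throw new Error("fps is required for LTX models");
      }
      const raw = duration * fps + 1;
      // Assumption: snap to the nearest frame count of the form 1 + n*8.
      return 1 + Math.round((raw - 1) / 8) * 8;
    }
    case "seedance":
      // Seedance generates at a fixed 24fps.
      return duration * 24 + 1;
  }
}
```

For a 5-second clip this gives 81 frames for WAN 2.2, 121 frames for Seedance, and 121 frames for LTX-2.3 at 24fps. At 25fps the raw LTX-2.3 count of 126 does not fit the 1 + n*8 pattern, so this sketch snaps it to 129.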