GenerationOptions - WhisperKit

Overview

GenerationOptions controls all aspects of the speech synthesis pipeline, including sampling parameters, chunking strategy, and concurrency. All fields have sensible defaults, so the zero-argument initializer works for most use cases.

public struct GenerationOptions: Codable, Sendable
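
Because the type is declared Codable, a set of options can in principle be saved and restored as JSON. The sketch below uses a hypothetical stand-in struct, OptionsSnapshot, that mirrors a few of the fields so the example runs without the library; the JSON layout of the real type is not specified here.

```swift
import Foundation

// Hypothetical stand-in mirroring a few GenerationOptions fields; not the library type.
struct OptionsSnapshot: Codable, Sendable, Equatable {
    var temperature: Float = 0.9
    var topK: Int = 50
    var maxNewTokens: Int = 245
}

// Override two fields and keep the default for the third.
let original = OptionsSnapshot(temperature: 0.5, topK: 30)

// Round-trip through JSON.
let data = try JSONEncoder().encode(original)
let restored = try JSONDecoder().decode(OptionsSnapshot.self, from: data)
```

The same JSONEncoder and JSONDecoder calls should apply to GenerationOptions itself, since it conforms to Codable.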

Initialization

public init(
    temperature: Float = GenerationOptions.defaultTemperature,
    topK: Int = GenerationOptions.defaultTopK,
    repetitionPenalty: Float = GenerationOptions.defaultRepetitionPenalty,
    maxNewTokens: Int = GenerationOptions.defaultMaxNewTokens,
    concurrentWorkerCount: Int = 0,
    chunkingStrategy: TextChunkingStrategy? = nil,
    targetChunkSize: Int? = nil,
    minChunkSize: Int? = nil,
    instruction: String? = nil,
    forceLegacyEmbedPath: Bool = false
)

temperature

Float

Default: 0.9

Sampling temperature. Higher values (e.g., 1.0) make output more random; lower values (e.g., 0.5) make it more deterministic.
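
As a rough illustration of what temperature does, here is a minimal sketch of a temperature-scaled softmax. It is a generic formulation, not the library's internal sampler, and the function name softmax(_:temperature:) is hypothetical.

```swift
import Foundation

// Hypothetical helper: divides the logits by the temperature, then normalizes to probabilities.
func softmax(_ logits: [Float], temperature: Float) -> [Float] {
    precondition(temperature > 0, "temperature must be positive")
    let scaled = logits.map { $0 / temperature }
    // Subtract the maximum before exponentiating for numerical stability.
    let maxLogit = scaled.max() ?? 0
    let exps = scaled.map { Float(exp(Double($0 - maxLogit))) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

let logits: [Float] = [2.0, 1.0, 0.0]
let sharp = softmax(logits, temperature: 0.5)  // more mass on the top token
let flat = softmax(logits, temperature: 1.0)   // mass spread more evenly
```

With the lower temperature, the most likely token receives a larger share of the probability, which is why output becomes more deterministic.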

topK

Int

Default: 50

Top-K sampling parameter. Only the K most likely tokens are considered at each step.
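
A minimal sketch of top-K filtering follows, assuming the usual approach of masking every logit below the K-th largest. The helper name topKFilter(_:k:) is hypothetical and the library's own implementation may differ.

```swift
// Hypothetical helper: keeps the k largest logits and masks the rest to -infinity.
func topKFilter(_ logits: [Float], k: Int) -> [Float] {
    // Nothing to filter when k is non-positive or already covers every logit.
    guard k > 0, k < logits.count else { return logits }
    // The k-th largest value is the cutoff; ties at the cutoff are kept.
    let threshold = logits.sorted(by: >)[k - 1]
    return logits.map { $0 >= threshold ? $0 : -Float.infinity }
}

// Only the two largest logits (3.0 and 1.5) stay finite.
let filtered = topKFilter([1.5, 0.2, 3.0, -1.0], k: 2)
```

Masked tokens get zero probability after softmax, so sampling only ever picks from the K survivors.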

repetitionPenalty

Float

Default: 1.05

Repetition penalty to discourage repeating tokens. Values > 1.0 penalize repetition.
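
One widely used way to apply a repetition penalty is to shrink positive logits and amplify negative logits for tokens that have already been generated. The sketch below shows that rule; it is an assumption that the library follows the same formulation, and applyRepetitionPenalty is a hypothetical name.

```swift
// Hypothetical helper: penalizes the logits of tokens that were already generated.
func applyRepetitionPenalty(_ logits: [Float], generated: Set<Int>, penalty: Float) -> [Float] {
    var out = logits
    for token in generated where logits.indices.contains(token) {
        // Positive logits are divided, negative logits are multiplied,
        // so both move toward "less likely" when penalty > 1.0.
        out[token] = out[token] > 0 ? out[token] / penalty : out[token] * penalty
    }
    return out
}

// Tokens 0 and 1 were generated before; token 2 is untouched.
let penalized = applyRepetitionPenalty([2.1, -1.0, 0.5], generated: [0, 1], penalty: 1.05)
```

A penalty of exactly 1.0 leaves the logits unchanged under this rule.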

maxNewTokens

Int

Default: 245

Maximum number of tokens to generate in the autoregressive loop.

concurrentWorkerCount

Int

Default: 0

Number of concurrent workers for multi-chunk generation:

0: all chunks run concurrently in one batch (default, fastest for non-streaming use cases)
1: sequential, one chunk at a time; required for real-time play() streaming
N: at most N chunks run concurrently
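
The three cases above can be pictured as grouping chunk indices into batches. The following sketch is only a model of the documented behavior: the helper batches(chunkCount:concurrentWorkerCount:) is hypothetical, and the real scheduler may use a sliding window of tasks rather than fixed batches.

```swift
// Hypothetical helper: groups chunk indices into batches whose members run concurrently.
func batches(chunkCount: Int, concurrentWorkerCount: Int) -> [[Int]] {
    guard chunkCount > 0 else { return [] }
    // 0 (or any non-positive value, in this sketch) means "no limit": one batch with every chunk.
    let width = concurrentWorkerCount <= 0 ? chunkCount : min(concurrentWorkerCount, chunkCount)
    return stride(from: 0, to: chunkCount, by: width).map { start in
        Array(start..<min(start + width, chunkCount))
    }
}

let allAtOnce = batches(chunkCount: 5, concurrentWorkerCount: 0)   // [[0, 1, 2, 3, 4]]
let sequential = batches(chunkCount: 3, concurrentWorkerCount: 1)  // [[0], [1], [2]]
let limited = batches(chunkCount: 5, concurrentWorkerCount: 2)     // [[0, 1], [2, 3], [4]]
```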

chunkingStrategy

TextChunkingStrategy?

Default: nil

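How to split long text into chunks. nil resolves to .sentence. Set to TextChunkingStrategy.none to force single-pass generation without sentence splitting; spell out the type, because a bare .none on an optional parameter is read by Swift as Optional.none (nil).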

targetChunkSize

Int?

Default: nil

Target chunk size in tokens for sentence chunking. nil resolves to TextChunker.defaultTargetChunkSize at the call site.

minChunkSize

Int?

Default: nil

Minimum chunk size in tokens. nil resolves to TextChunker.defaultMinChunkSize at the call site.
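
To show how a target size and a minimum size might interact, here is a minimal sketch of greedy sentence packing with nil falling back to defaults. Everything in it is an assumption for illustration: the fallback values, the AssumedDefaults namespace, the chunk(sentences:targetChunkSize:minChunkSize:) helper, and the use of word counts in place of real token counts. It is not the TextChunker algorithm.

```swift
// Hypothetical fallback values; the real ones live in TextChunker and are not stated here.
enum AssumedDefaults {
    static let targetChunkSize = 50
    static let minChunkSize = 10
}

// Hypothetical helper: greedily packs sentences into chunks, counting words as a stand-in for tokens.
func chunk(sentences: [String], targetChunkSize: Int?, minChunkSize: Int?) -> [String] {
    // nil resolves to the fallback at the call site.
    let target = targetChunkSize ?? AssumedDefaults.targetChunkSize
    let minimum = minChunkSize ?? AssumedDefaults.minChunkSize
    var chunks: [String] = []
    var current: [String] = []
    var size = 0
    for sentence in sentences {
        let words = sentence.split(separator: " ").count
        // Close the current chunk when the next sentence would overshoot the target,
        // but only once the chunk has reached the minimum size.
        if size + words > target && size >= minimum {
            chunks.append(current.joined(separator: " "))
            current = []
            size = 0
        }
        current.append(sentence)
        size += words
    }
    if !current.isEmpty { chunks.append(current.joined(separator: " ")) }
    return chunks
}

let parts = chunk(
    sentences: ["One two three.", "Four five.", "Six seven eight nine."],
    targetChunkSize: 5,
    minChunkSize: 2
)
```

In this run the first two sentences fit the target of 5 words together, so they form one chunk and the third sentence forms another.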

instruction

String?

Default: nil

Optional style instruction for controlling speech characteristics (e.g., "Very happy"). Prepended as a text-only user prompt before the main TTS segment. For Qwen3, this is only supported by the 1.7B model variant.

forceLegacyEmbedPath

Bool

Default: false

Force the legacy [FloatType] inference path even on macOS 15+ / iOS 18+. When false (default), the MLTensor path is taken on supported OS versions. Set to true in tests to exercise the pre-macOS-15 code path on current hardware.
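
The path selection described above can be sketched with Swift's #available check. The helper usesTensorPath(forceLegacyEmbedPath:) is hypothetical and only models the decision; it does not call into the library or Core ML.

```swift
// Hypothetical helper: returns true when the MLTensor path would be taken.
func usesTensorPath(forceLegacyEmbedPath: Bool) -> Bool {
    // The flag wins over the OS check, which is what makes it useful in tests.
    if forceLegacyEmbedPath { return false }
    if #available(macOS 15, iOS 18, *) {
        return true   // MLTensor path on supported OS versions
    }
    return false      // legacy [FloatType] path on older systems
}

let forcedLegacy = usesTensorPath(forceLegacyEmbedPath: true)
```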

Properties

Sampling Parameters

temperature

Float

Sampling temperature. Default: 0.9

topK

Int

Top-K sampling parameter. Default: 50

repetitionPenalty

Float

Repetition penalty to discourage repeating tokens. Default: 1.05

maxNewTokens

Int

Maximum number of tokens to generate. Default: 245

Chunking and Concurrency

concurrentWorkerCount

Int

Number of concurrent workers for multi-chunk generation. Default: 0 (all chunks concurrently)

chunkingStrategy

TextChunkingStrategy?

How to split long text into chunks. Default: nil (resolves to .sentence)

targetChunkSize

Int?

Target chunk size in tokens for sentence chunking. Default: nil (uses TextChunker.defaultTargetChunkSize)

minChunkSize

Int?

Minimum chunk size in tokens. Default: nil (uses TextChunker.defaultMinChunkSize)

Style Control

instruction

String?

Optional style instruction for controlling speech characteristics. Default: nil. Only supported by the Qwen3 1.7B model variant.

Advanced

forceLegacyEmbedPath

Bool

Force the legacy [FloatType] inference path. Default: false

Static Properties

defaultTemperature

public static let defaultTemperature: Float = 0.9

value

Float

Default sampling temperature: 0.9

defaultTopK

public static let defaultTopK: Int = 50

value

Int

Default Top-K sampling parameter: 50

defaultRepetitionPenalty

public static let defaultRepetitionPenalty: Float = 1.05

value

Float

Default repetition penalty: 1.05

defaultMaxNewTokens

public static let defaultMaxNewTokens: Int = 245

value

Int

Default maximum number of tokens to generate: 245

Example Usage

Default Options

let result = try await tts.generate(
    text: "Hello, world!",
    voice: "ryan"
)

Custom Sampling

let options = GenerationOptions(
    temperature: 0.7,
    topK: 30,
    maxNewTokens: 500
)
let result = try await tts.generate(
    text: "A longer piece of text.",
    voice: "ryan",
    options: options
)

Sequential Generation (for Streaming)

let options = GenerationOptions(
    concurrentWorkerCount: 1  // Required for play() streaming
)
let result = try await tts.play(
    text: "This will stream audio chunk by chunk.",
    voice: "ryan",
    options: options,
    playbackStrategy: .auto
)

With Style Instruction (1.7B only)

let options = GenerationOptions(
    instruction: "Very happy and excited"
)
let result = try await tts.generate(
    text: "I'm so glad to meet you!",
    voice: "ryan",
    options: options
)

Disable Chunking

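let options = GenerationOptions(
    // Spell out the type: a bare .none on an optional parameter means Optional.none (nil)
    chunkingStrategy: TextChunkingStrategy.none
)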
let result = try await tts.generate(
    text: "Generate this as a single chunk.",
    voice: "ryan",
    options: options
)

Custom Chunk Sizes

let options = GenerationOptions(
    targetChunkSize: 100,
    minChunkSize: 20
)
let result = try await tts.generate(
    text: "A very long piece of text that will be split into chunks...",
    voice: "ryan",
    options: options
)

Parallel Generation

let options = GenerationOptions(
    concurrentWorkerCount: 4  // Up to 4 chunks in parallel
)
let result = try await tts.generate(
    text: "Long text with multiple sentences. Each sentence becomes a chunk. They generate in parallel.",
    voice: "ryan",
    options: options
)
