Model Selection

WhisperKit supports all official OpenAI Whisper model variants, from tiny to large-v3. Choosing the right model involves balancing accuracy, speed, and memory usage based on your application’s requirements.

Available Models

Whisper models come in different sizes, each with multilingual and English-only variants:

Model Variants

Tiny

Best for: Real-time streaming, constrained devices, quick prototyping
  • Fastest inference
  • Smallest model (~75 MB) with the lowest memory use
  • Acceptable accuracy for clear audio
  • Available: tiny (multilingual), tiny.en (English-only)
let whisperKit = try await WhisperKit(model: "tiny")

Base

Best for: Mobile apps, moderate accuracy requirements
  • Good balance of speed and accuracy
  • Model size ~140 MB
  • Suitable for most mobile applications
  • Available: base, base.en
let whisperKit = try await WhisperKit(model: "base")

Small

Best for: Production applications, higher accuracy needs
  • Good accuracy for production use
  • Model size ~460 MB
  • Slower than base but more accurate
  • Available: small, small.en
let whisperKit = try await WhisperKit(model: "small")

Medium

Best for: High accuracy requirements, server-side processing
  • Very good accuracy
  • Model size ~1.5 GB
  • Slower inference
  • Available: medium, medium.en
let whisperKit = try await WhisperKit(model: "medium")

Large

Best for: Maximum accuracy, offline batch processing
  • Best accuracy
  • Model size ~3 GB
  • Slowest inference
  • Available: large, large-v2, large-v3 (multilingual only)
let whisperKit = try await WhisperKit(model: "large-v3")
See ModelVariant

ModelVariant Enum

public enum ModelVariant: CustomStringConvertible {
    case tiny
    case tinyEn
    case base
    case baseEn
    case small
    case smallEn
    case medium
    case mediumEn
    case large
    case largev2
    case largev3
    
    var isMultilingual: Bool {
        // Returns true for multilingual models
        // Returns false for .en variants
    }
}
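To make the relationship between model names and the multilingual flag concrete, here is a minimal, self-contained sketch. `SketchVariant` is a hypothetical stand-in written for this page, not WhisperKit's own `ModelVariant`; it assumes the name strings used elsewhere in this guide (`tiny.en`, `large-v3`, and so on) and treats the `.en` suffix as the marker for English-only models.

```swift
// Hypothetical mirror of the enum above, for illustration only.
// The raw values are the model name strings used throughout this guide.
enum SketchVariant: String, CaseIterable, CustomStringConvertible {
    case tiny = "tiny"
    case tinyEn = "tiny.en"
    case base = "base"
    case baseEn = "base.en"
    case small = "small"
    case smallEn = "small.en"
    case medium = "medium"
    case mediumEn = "medium.en"
    case large = "large"
    case largev2 = "large-v2"
    case largev3 = "large-v3"

    // English-only variants are exactly those whose name ends in ".en".
    var isMultilingual: Bool { !rawValue.hasSuffix(".en") }

    var description: String { rawValue }
}

// Failable lookup from a model name; unknown names yield nil.
if let variant = SketchVariant(rawValue: "small.en") {
    print("\(variant) multilingual: \(variant.isMultilingual)")
}
```

Deriving the flag from the name keeps the two from drifting apart if more variants are added to the sketch.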
WhisperKit provides device-specific recommendations:
// Get locally computed recommendations
let localSupport = WhisperKit.recommendedModels()
print("Default model: \(localSupport.default)")
print("Supported models: \(localSupport.supported)")

// Get recommendations from remote config
let remoteSupport = await WhisperKit.recommendedRemoteModels(
    from: "argmaxinc/whisperkit-coreml"
)
print("Recommended: \(remoteSupport.default)")
See WhisperKit.recommendedModels and WhisperKit.recommendedRemoteModels
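Once you have a list of supported models, a common next step is to pick the most capable one your app is willing to use. The sketch below is one way to do that in plain Swift. `pickModel` is a hypothetical helper, and it assumes the supported models are available as an array of strings that match your preferred names exactly; check the actual strings your build reports, since their format is not shown here.

```swift
// Hypothetical helper: return the first preferred model that is also supported.
// `preferred` is ordered from most to least desirable.
func pickModel(preferred: [String], supported: [String], fallback: String) -> String {
    let supportedSet = Set(supported)
    return preferred.first { supportedSet.contains($0) } ?? fallback
}

// Example with made-up lists: "large-v3" is not supported, so "small" wins.
let choice = pickModel(
    preferred: ["large-v3", "small", "base"],
    supported: ["tiny", "base", "small"],
    fallback: "tiny"
)
print("Chosen model: \(choice)")
```

In an app, the `supported` argument would come from the recommendation call above, and the fallback would typically be the reported default.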

Device-Specific Recommendations

Recommendations are based on device hardware:
let deviceName = WhisperKit.deviceName()
print("Running on: \(deviceName)")

// Example device identifiers:
// - "iPhone15,2" (iPhone 14 Pro)
// - "iPad13,16" (iPad Air, 5th generation)
// - "Mac14,2" (MacBook Air, M2)
See WhisperKit.deviceName
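If you want to branch on the device family or generation, the identifier string can be split into its parts. This is a minimal sketch that assumes identifiers follow the `Family<major>,<minor>` shape shown in the examples above; `DeviceIdentifier` and `parseDeviceIdentifier` are hypothetical names, and strings that do not fit the pattern (for example an architecture name reported by a simulator) return nil.

```swift
// Hypothetical value type for a parsed identifier such as "iPhone15,2".
struct DeviceIdentifier: Equatable {
    let family: String  // e.g. "iPhone"
    let major: Int      // e.g. 15
    let minor: Int      // e.g. 2
}

// Hypothetical parser: splits the letter prefix from the "major,minor" suffix.
func parseDeviceIdentifier(_ raw: String) -> DeviceIdentifier? {
    guard let firstDigit = raw.firstIndex(where: { $0.isNumber }) else { return nil }
    let family = String(raw[..<firstDigit])
    let numbers = raw[firstDigit...].split(separator: ",")
    guard !family.isEmpty, numbers.count == 2,
          let major = Int(numbers[0]), let minor = Int(numbers[1]) else { return nil }
    return DeviceIdentifier(family: family, major: major, minor: minor)
}

if let device = parseDeviceIdentifier("iPhone15,2") {
    print("\(device.family), generation \(device.major), model \(device.minor)")
}
```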

Downloading Models

Automatic Download

By default, WhisperKit downloads models automatically:
// Downloads and loads the default recommended model
let whisperKit = try await WhisperKit()

// Or: downloads and loads a specific model
let baseWhisperKit = try await WhisperKit(model: "base")
See WhisperKitConfig.download

Manual Download

Download a model without initializing WhisperKit:
let modelFolder = try await WhisperKit.download(
    variant: "large-v3",
    from: "argmaxinc/whisperkit-coreml",
    progressCallback: { progress in
        print("Downloaded: \(progress.fractionCompleted * 100)%")
    }
)

print("Model saved to: \(modelFolder.path)")
See WhisperKit.download
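Printing `fractionCompleted * 100` directly produces long decimals in the log. The sketch below rounds to a whole percentage instead. It assumes the callback receives a Foundation `Progress` value, which the `fractionCompleted` property suggests; `progressLabel` is a hypothetical helper.

```swift
import Foundation

// Hypothetical helper: format a Progress value as a whole-number percentage.
func progressLabel(_ progress: Progress) -> String {
    let percent = Int((progress.fractionCompleted * 100).rounded())
    return "Downloaded: \(percent)%"
}

// Stand-in Progress object so the helper can be tried without a real download.
let sampleProgress = Progress(totalUnitCount: 200)
sampleProgress.completedUnitCount = 50
print(progressLabel(sampleProgress))
```

Inside the download callback you would call `progressLabel(progress)` in place of the interpolated expression.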

List Available Models

let availableModels = try await WhisperKit.fetchAvailableModels(
    from: "argmaxinc/whisperkit-coreml"
)

print("Available models:")
for model in availableModels {
    print("  - \(model)")
}
See WhisperKit.fetchAvailableModels

Local Models

Use pre-downloaded or bundled models:
// Use a local model folder
let whisperKit = try await WhisperKit(
    modelFolder: "/path/to/model/folder",
    download: false  // Disable automatic download
)
See WhisperKitConfig.modelFolder
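With automatic download disabled, a wrong path only surfaces when loading fails, so it can help to check the folder first. This is a small sketch using Foundation only; `isUsableModelFolder` is a hypothetical helper, and it checks just that the path exists and is a directory, not that the folder contains valid model files.

```swift
import Foundation

// Hypothetical helper: true only if `path` exists and is a directory.
func isUsableModelFolder(_ path: String) -> Bool {
    var isDirectory: ObjCBool = false
    let exists = FileManager.default.fileExists(atPath: path, isDirectory: &isDirectory)
    return exists && isDirectory.boolValue
}

// The temporary directory always exists; the placeholder path does not.
print(isUsableModelFolder(NSTemporaryDirectory()))
print(isUsableModelFolder("/path/to/model/folder"))
```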

Bundle Models in App

// Get bundled model path
guard let modelPath = Bundle.main.path(
    forResource: "openai_whisper-base",
    ofType: nil
) else {
    fatalError("Model not found in bundle")
}

let whisperKit = try await WhisperKit(
    modelFolder: modelPath,
    download: false
)
Bundling large models increases app size significantly. Consider downloading on first launch instead.

Model Repositories

WhisperKit downloads models from Hugging Face repositories:

Default Repository

// Default: argmaxinc/whisperkit-coreml
let whisperKit = try await WhisperKit(model: "base")

Custom Repository

let whisperKit = try await WhisperKit(
    model: "base",
    modelRepo: "your-username/your-repo",
    modelToken: "hf_your_token_here"  // If repo is private
)
See WhisperKitConfig.modelRepo
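Hard-coding a token in source makes it easy to leak. For command-line tools or server processes, one option is to read it from the environment, sketched below. The variable name `HF_TOKEN` and the helper `accessToken` are assumptions made for this example; WhisperKit is not known to read that variable itself, and an iOS app would more likely use the Keychain.

```swift
import Foundation

// Hypothetical helper: look up a token by key, treating blank values as missing.
func accessToken(named key: String, in environment: [String: String]) -> String? {
    guard let value = environment[key]?.trimmingCharacters(in: .whitespacesAndNewlines),
          !value.isEmpty else { return nil }
    return value
}

// "HF_TOKEN" is an assumed variable name chosen for this sketch.
let token = accessToken(named: "HF_TOKEN", in: ProcessInfo.processInfo.environment)
print(token == nil ? "No token set; only public repos are reachable" : "Token loaded")
```

The resulting optional can then be passed as the `modelToken` argument shown above.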

Custom Endpoint

let config = WhisperKitConfig(
    model: "base",
    modelEndpoint: "https://your-custom-endpoint.com"
)

let whisperKit = try await WhisperKit(config)
See WhisperKitConfig.modelEndpoint

Download Configuration

Background Downloads

Enable background downloads for large models:
let whisperKit = try await WhisperKit(
    model: "large-v3",
    useBackgroundDownloadSession: true
)
See WhisperKitConfig.useBackgroundDownloadSession

Custom Download Location

let customBase = FileManager.default.urls(
    for: .documentDirectory,
    in: .userDomainMask
).first!

let whisperKit = try await WhisperKit(
    model: "base",
    downloadBase: customBase
)
See WhisperKitConfig.downloadBase
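It can be tidier to keep models in a dedicated subfolder and to make sure that folder exists before handing it over. The following sketch does that with Foundation only. `modelsDirectory` and the folder name `WhisperKitModels` are hypothetical; the example uses the temporary directory so it runs anywhere, whereas an app would pass its documents or Application Support directory.

```swift
import Foundation

// Hypothetical helper: create (if needed) and return a models folder under `parent`.
func modelsDirectory(under parent: URL, named name: String = "WhisperKitModels") throws -> URL {
    let directory = parent.appendingPathComponent(name, isDirectory: true)
    // No error is thrown if the directory already exists.
    try FileManager.default.createDirectory(at: directory, withIntermediateDirectories: true)
    return directory
}

// Demo location only; substitute your app's own base directory.
let exampleBase = try modelsDirectory(under: FileManager.default.temporaryDirectory)
print("Models will be stored in: \(exampleBase.path)")
```

The returned URL is what you would pass as `downloadBase`.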

Model States and Loading

Prewarming Models

Prewarm models to reduce peak memory usage:
let whisperKit = try await WhisperKit(
    model: "medium",
    prewarm: true  // Load and unload models sequentially
)
See WhisperKitConfig.prewarm
Prewarming loads and unloads each model one at a time, which triggers Core ML specialization without holding every model in memory at once. Load time can roughly double, but peak memory pressure drops.

Deferred Loading

// Download but don't load models yet
let whisperKit = try await WhisperKit(
    model: "base",
    load: false
)

// Load later when needed
try await whisperKit.loadModels()
See WhisperKitConfig.load

Unload Models

// Free memory when models aren't needed
await whisperKit.unloadModels()

// Reload when needed
try await whisperKit.loadModels()
See WhisperKit.unloadModels

Multilingual vs English-only

When to Use Multilingual Models

  • Transcribing content in multiple languages
  • Language is unknown in advance
  • Need automatic language detection
  • Translation to English (.translate task)
let whisperKit = try await WhisperKit(model: "base")  // Multilingual

let (language, _) = try await whisperKit.detectLanguage(
    audioPath: "audio.wav"
)
print("Detected: \(language)")

When to Use English-only Models

  • Only transcribing English audio
  • Slightly faster inference
  • Marginally better English accuracy
let whisperKit = try await WhisperKit(model: "base.en")

let options = DecodingOptions(language: "en")
let results = try await whisperKit.transcribe(
    audioPath: "audio.wav",
    decodeOptions: options
)
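The choice between the two families can be reduced to a small rule: use the `.en` variant only when the audio is known to be English and the size actually has one. A minimal sketch of that rule follows; `modelName(size:knownLanguage:)` is a hypothetical helper, and the set of sizes with an English-only variant is taken from the variant list earlier on this page.

```swift
// Sizes that have an English-only ".en" variant, per the variant list on this page.
let sizesWithEnglishVariant: Set<String> = ["tiny", "base", "small", "medium"]

// Hypothetical helper: pick the ".en" variant only when the audio is known to be English.
// Pass nil for `knownLanguage` when the language must be detected at runtime.
func modelName(size: String, knownLanguage: String?) -> String {
    if knownLanguage == "en", sizesWithEnglishVariant.contains(size) {
        return size + ".en"
    }
    return size
}

print(modelName(size: "base", knownLanguage: "en"))
print(modelName(size: "large-v3", knownLanguage: "en"))
print(modelName(size: "base", knownLanguage: nil))
```

The large models fall through to the multilingual name because no English-only variant is listed for them.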

Model Performance Comparison

Performance varies by device. These are approximate values for reference.
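| Model | Size | Parameters | Relative Speed | Memory | Accuracy |
|---|---|---|---|---|---|
| tiny | 75 MB | 39M | 32x | ~150 MB | Good |
| base | 140 MB | 74M | 16x | ~250 MB | Better |
| small | 460 MB | 244M | 6x | ~600 MB | Very Good |
| medium | 1.5 GB | 769M | 2x | ~1.8 GB | Excellent |
| large-v3 | 3 GB | 1550M | 1x | ~3.2 GB | Best |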
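The memory column can drive a simple selection rule: take the largest model whose approximate memory use fits a budget. The sketch below encodes the table's figures (converted to MB) and applies that rule. `largestModel(fittingMB:)` is a hypothetical helper, and because the figures are approximate and vary by device, treat the result as a starting point rather than a guarantee.

```swift
// Approximate runtime memory in MB, copied from the table above (1 GB taken as 1000 MB).
// Entries are ordered from smallest to largest model.
let approximateMemoryMB: [(model: String, memoryMB: Int)] = [
    ("tiny", 150), ("base", 250), ("small", 600), ("medium", 1800), ("large-v3", 3200),
]

// Hypothetical helper: the largest model whose approximate memory fits the budget,
// or nil if even the smallest model does not fit.
func largestModel(fittingMB budget: Int) -> String? {
    approximateMemoryMB.last(where: { $0.memoryMB <= budget })?.model
}

print(largestModel(fittingMB: 700) ?? "none")
print(largestModel(fittingMB: 100) ?? "none")
```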

Selection Guidelines

Real-time Streaming

Recommended: tiny, base
Fast enough to transcribe live audio without lag on most devices.

Mobile Apps

Recommended: base, small
Balance of accuracy and app size. Consider on-demand download instead of bundling.

High Accuracy

Recommended: medium, large-v3
Best for offline processing, server deployments, or high-end devices.

Constrained Devices

Recommended: tiny
Only option for devices with limited memory or older hardware.

Next Steps

Configuration

Configure compute options and advanced settings

Transcription

Start transcribing with your selected model