aiSpeak

Convert text to natural-sounding speech audio using an AI provider.

Syntax

aiSpeak( text, params={}, options={} )

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to synthesize into speech |
| params | struct | No | Provider API parameters sent directly to the AI provider (e.g. model, voice, speed) |
| options | struct | No | Module-level behavior options (provider, output format, file output, logging) |

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| provider | string | (config default) | AI provider name: openai, mistral, gemini, grok, elevenlabs |
| apiKey | string | (env var) | Provider API key. Falls back to the <PROVIDER>_API_KEY environment variable |
| voice | string | (config default) | Voice name or ID. Pass a provider-specific voice name (e.g. nova) or the gender keyword "male" or "female", which is resolved to the matching voice for the active provider using audio.voiceGenderMap from your module config. See the Voice Reference table below |
| outputFormat | string | mp3 | Audio format: mp3, wav, flac, opus, pcm |
| speed | numeric | 1.0 | Playback speed multiplier. Range: 0.25 to 4.0 |
| outputFile | string | "" | When set, saves the audio to this path and returns the file path string instead of an AiSpeechResponse |
| timeout | numeric | 30 | HTTP request timeout in seconds |
| logRequest | boolean | false | Log the request to the module log file |
| logRequestToConsole | boolean | false | Print the request payload to the console |
| logResponse | boolean | false | Log the response to the module log file |
| logResponseToConsole | boolean | false | Print the raw provider response to the console (useful for debugging) |
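The defaults and fallbacks in the options table can be modeled outside BoxLang. The following is a minimal Python sketch of that behavior, not the module's implementation: the function name `normalize_options` is hypothetical, the config-level provider default is not modeled, and raising on an out-of-range value is an assumption (the module may clamp or defer to the provider instead).

```python
import os

# Formats and speed range as listed in the options table.
ALLOWED_FORMATS = {"mp3", "wav", "flac", "opus", "pcm"}
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def normalize_options(options, env=None):
    """Apply documented defaults and the <PROVIDER>_API_KEY fallback (hypothetical helper)."""
    env = os.environ if env is None else env

    # Documented defaults; provider and voice defaults come from config and are not modeled.
    opts = {"outputFormat": "mp3", "speed": 1.0, "outputFile": "", "timeout": 30}
    opts.update(options)

    # An explicit apiKey wins; otherwise fall back to e.g. OPENAI_API_KEY.
    if not opts.get("apiKey"):
        provider = opts.get("provider", "")
        opts["apiKey"] = env.get(f"{provider.upper()}_API_KEY", "")

    # Assumption: invalid values are rejected rather than clamped.
    if opts["outputFormat"] not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported outputFormat: {opts['outputFormat']}")
    if not MIN_SPEED <= opts["speed"] <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return opts
```

Passing `env` explicitly keeps the sketch testable without touching real environment variables.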

Returns

| Condition | Returns |
| --- | --- |
| outputFile not set | AiSpeechResponse: wraps binary audio data with convenience methods |
| outputFile is set | string: absolute path to the saved audio file |
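Because the return type depends on outputFile, calling code needs to know which branch it asked for. A small Python model of the two branches, under stated assumptions: `deliver_audio` is a hypothetical name, and a plain dict stands in for AiSpeechResponse.

```python
import os


def deliver_audio(audio: bytes, output_file: str = ""):
    """Model the two documented return shapes (hypothetical helper)."""
    if output_file:
        # outputFile set: write the bytes and hand back the absolute path.
        path = os.path.abspath(output_file)
        with open(path, "wb") as fh:
            fh.write(audio)
        return path
    # outputFile not set: return a wrapper around the binary audio.
    # A dict is used here only as a stand-in for AiSpeechResponse.
    return {"audio": audio, "size": len(audio)}
```

A caller that always wants the response object can leave outputFile empty and save the audio afterwards with saveToFile( filePath ).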

AiSpeechResponse Methods

| Method | Returns | Description |
| --- | --- | --- |
| saveToFile( filePath ) | string | Saves the audio binary to the given path; returns the absolute path |
| getBase64() | string | Base64-encoded audio data |
| getMimeType() | string | MIME type, e.g. audio/mpeg |
| toDataURI() | string | data:audio/mpeg;base64,... URI for HTML <audio> elements |
| hasAudio() | boolean | true if audio data is present and non-empty |
| getSize() | numeric | Audio data size in bytes |
| getAudioFormat() | string | Format string: mp3, wav, flac, etc. |
| toStruct() | struct | Metadata struct (no binary data, so safe for logging) |
| toJSON() | string | JSON-serialized metadata |
| getMetadataValue( key ) | any | Read a value from the response metadata bag |
| setMetadataValue( key, value ) | this | Write a value to the metadata bag (chainable) |
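To make the method contracts concrete, here is a rough Python model of a response object with the same shape. It is illustrative only: the class name `SpeechResponseModel` and the keys returned by `to_struct` are assumptions, since the source does not list the fields of the metadata struct.

```python
import base64
import json


class SpeechResponseModel:
    """Illustrative stand-in for AiSpeechResponse (not the module's class)."""

    def __init__(self, audio: bytes, audio_format: str = "mp3", mime_type: str = "audio/mpeg"):
        self._audio = audio
        self._format = audio_format
        self._mime = mime_type
        self._metadata = {}

    def has_audio(self) -> bool:
        return len(self._audio) > 0

    def get_size(self) -> int:
        # Size in bytes of the raw audio, not of the base64 text.
        return len(self._audio)

    def get_base64(self) -> str:
        return base64.b64encode(self._audio).decode("ascii")

    def set_metadata_value(self, key, value):
        self._metadata[key] = value
        return self  # returning self is what makes the call chainable

    def get_metadata_value(self, key, default=None):
        return self._metadata.get(key, default)

    def to_struct(self) -> dict:
        # Assumed field names; the binary payload is deliberately left out.
        return {
            "audioFormat": self._format,
            "mimeType": self._mime,
            "size": self.get_size(),
            "hasAudio": self.has_audio(),
            "metadata": dict(self._metadata),
        }

    def to_json(self) -> str:
        return json.dumps(self.to_struct())
```

Keeping the binary data out of `to_struct` is the property that makes the result safe to log.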

Events Fired

| Event | When |
| --- | --- |
| beforeAISpeech | Before the TTS request is sent to the provider |
| afterAISpeech | After the TTS response is received |
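The ordering of the two events can be pictured with a tiny event bus. This Python sketch shows only the before/after pattern; `EventBus`, `speak_with_events`, and the payload shapes passed to listeners are all hypothetical, and the module's real interceptor registration API is not shown in this page.

```python
from collections import defaultdict


class EventBus:
    """Minimal publish/subscribe helper used only for this illustration."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def on(self, event, listener):
        self._listeners[event].append(listener)

    def announce(self, event, data):
        for listener in self._listeners[event]:
            listener(data)


def speak_with_events(bus, text, synthesize):
    """Fire beforeAISpeech, call the provider stand-in, then fire afterAISpeech."""
    request = {"text": text}  # assumed payload shape
    bus.announce("beforeAISpeech", request)
    audio = synthesize(request["text"])
    bus.announce("afterAISpeech", {"request": request, "size": len(audio)})
    return audio
```

A logging listener registered on both events would see the request before any network call and the response size afterwards.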

Examples

Synthesize and save to file

Shorthand: write directly to disk with outputFile

Custom voice and format

Gender keyword voices

Instead of hard-coding a provider-specific voice name, pass "male" or "female". The module resolves the keyword to the configured voice for the active provider using audio.voiceGenderMap in your module settings:

The default gender-to-voice mapping (overridable in config/boxlang.json):

| Provider | "male" | "female" |
| --- | --- | --- |
| OpenAI | ash | nova |
| Grok / xAI | onyx | nova |
| Gemini | Fenrir | Aoede |
| Mistral | (provider default) | Charlotte |
| ElevenLabs | (provider default) | 21m00Tcm4TlvDq8ikWAM |
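The keyword resolution described above amounts to a per-provider lookup. The Python sketch below models it with the default mapping from the table. It makes several assumptions: `resolve_voice` is a hypothetical name, the provider keys are written in lower case, keywords are matched exactly, overrides are merged per provider, and `None` stands for "(provider default)".

```python
# Default mapping copied from the table above; missing keys mean "(provider default)".
DEFAULT_VOICE_GENDER_MAP = {
    "openai": {"male": "ash", "female": "nova"},
    "grok": {"male": "onyx", "female": "nova"},
    "gemini": {"male": "Fenrir", "female": "Aoede"},
    "mistral": {"female": "Charlotte"},
    "elevenlabs": {"female": "21m00Tcm4TlvDq8ikWAM"},
}


def resolve_voice(voice, provider, overrides=None):
    """Resolve "male"/"female" to a provider voice; pass other names through."""
    if voice not in ("male", "female"):
        # A provider-specific voice name or ID is used unchanged.
        return voice
    mapping = dict(DEFAULT_VOICE_GENDER_MAP.get(provider, {}))
    # Assumed merge rule: configured entries replace defaults key by key.
    mapping.update((overrides or {}).get(provider, {}))
    return mapping.get(voice)  # None means fall back to the provider default
```

For example, `resolve_voice("female", "gemini")` yields `Aoede`, while `resolve_voice("male", "mistral")` yields `None` because the table lists no male voice for Mistral.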

To override any mapping, set audio.voiceGenderMap in your config/boxlang.json:

Base64 / data URI for web responses
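For readers who want to see what a data URI for audio consists of, the following Python sketch builds one from raw bytes and wraps it in an HTML audio element. It is a language-neutral illustration, not the module's code: `to_data_uri` and `audio_tag` are hypothetical helpers, and audio/mpeg is used because it is the MIME type the method table gives for mp3.

```python
import base64
import html


def to_data_uri(audio: bytes, mime_type: str = "audio/mpeg") -> str:
    """Build a data:<mime>;base64,<payload> URI from raw audio bytes."""
    payload = base64.b64encode(audio).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


def audio_tag(data_uri: str) -> str:
    """Wrap a data URI in an <audio> element, escaping it for the attribute."""
    return f'<audio controls src="{html.escape(data_uri, quote=True)}"></audio>'
```

Base64 encoding grows the payload by roughly one third, so inline data URIs suit short clips better than long recordings.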

ElevenLabs with voice ID

Custom interceptor for TTS logging

Voice Reference

| Provider | Available Voices | "male" keyword | "female" keyword |
| --- | --- | --- | --- |
| OpenAI | alloy, ash, echo, fable, onyx, nova, shimmer | ash | nova |
| Mistral | Charlotte | (provider default) | Charlotte |
| Gemini | Fenrir, Aoede, Kore (and others) | Fenrir | Aoede |
| Grok / xAI | alloy, echo, fable, onyx, nova, shimmer, eve | onyx | nova |
| ElevenLabs | Voice IDs from your voice library | (provider default) | 21m00Tcm4TlvDq8ikWAM |

For ElevenLabs, pass a voice_id in params for specific voices. The "male"/"female" keywords resolve to the IDs in audio.voiceGenderMap.
