# Text-to-Speech

`aiSpeak()` converts text to natural-sounding speech using cloud AI providers. It returns an `AiSpeechResponse` object containing the binary audio data, with convenience methods for saving to disk, encoding as Base64, generating data URIs for HTML playback, and inspecting metadata.

## 🔧 The `aiSpeak()` Function

### Syntax

```javascript
aiSpeak( text, params={}, options={} )
```

### Parameters

| Parameter | Type   | Required | Description                                                     |
| --------- | ------ | -------- | --------------------------------------------------------------- |
| `text`    | string | ✅ Yes    | The text to synthesize into speech                              |
| `params`  | struct | No       | Provider API parameters such as `model`, `voice`, `speed`       |
| `options` | struct | No       | Module-level options such as `provider`, `apiKey`, `outputFile` |

### Options

| Option                 | Type    | Default   | Description                                                                                                                                          |
| ---------------------- | ------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `provider`             | string  | (config)  | AI provider: `openai`, `mistral`, `gemini`, `grok`, `elevenlabs`                                                                                     |
| `apiKey`               | string  | (env var) | Provider API key (falls back to `<PROVIDER>_API_KEY` env var)                                                                                        |
| `voice`                | string  | (config)  | Voice name or ID for the provider, **or** the gender keyword `"male"` / `"female"` (resolved per provider via `audio.voiceGenderMap` in your config) |
| `outputFormat`         | string  | `mp3`     | Audio output format: `mp3`, `wav`, `flac`, `opus`, `pcm`                                                                                             |
| `speed`                | numeric | `1.0`     | Playback speed multiplier (range: 0.25–4.0)                                                                                                          |
| `outputFile`           | string  | `""`      | If non-empty, saves audio to this path and returns the file path string instead of `AiSpeechResponse`                                                |
| `timeout`              | numeric | `30`      | HTTP request timeout in seconds                                                                                                                      |
| `logRequest`           | boolean | `false`   | Log requests to the module log file                                                                                                                  |
| `logRequestToConsole`  | boolean | `false`   | Print request payload to console                                                                                                                     |
| `logResponse`          | boolean | `false`   | Log responses to the module log file                                                                                                                 |
| `logResponseToConsole` | boolean | `false`   | Print raw provider response to console                                                                                                               |
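
As a rough sketch of how these options combine, the call below raises the timeout and echoes the request and response to the console while debugging. The sample text and the 60-second value are illustrative choices; every option name comes from the table above.

```javascript
// Debugging sketch: longer timeout plus console logging for one call
audio = aiSpeak(
    "Checking the speech pipeline.",
    {},
    {
        provider: "openai",
        timeout: 60,                  // seconds; the default is 30
        logRequestToConsole: true,    // print the request payload
        logResponseToConsole: true    // print the raw provider response
    }
)
println( "Received #audio.getSize()# bytes" )
```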

## 📦 Return Value — `AiSpeechResponse`

When `outputFile` is **not** set, `aiSpeak()` returns an `AiSpeechResponse` object wrapping the binary audio data.

When `outputFile` **is** set, `aiSpeak()` saves the audio to that path and returns the **absolute file path string** instead of the response object.
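
One way to handle both return shapes is to check whether the result is a plain string. The sketch below assumes BoxLang's `isSimpleValue()` is a suitable test for that; `describeSpeech()` is a hypothetical helper, not part of the module.

```javascript
// Hypothetical helper: reports what aiSpeak() handed back
function describeSpeech( required string text, string outputFile = "" ) {
    var result = aiSpeak( text, {}, { outputFile: outputFile } )

    // A string means the audio was written to disk and we received its path
    if ( isSimpleValue( result ) ) {
        return "Saved to #result#"
    }

    // Otherwise we hold an AiSpeechResponse wrapping the audio bytes
    return "In memory: #result.getSize()# bytes (#result.getAudioFormat()#)"
}

println( describeSpeech( "Your order has shipped." ) )
println( describeSpeech( "Your order has shipped.", "/audio/notification.mp3" ) )
```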

### `AiSpeechResponse` Methods

| Method                           | Returns | Description                                                                      |
| -------------------------------- | ------- | -------------------------------------------------------------------------------- |
| `saveToFile( filePath )`         | string  | Saves audio binary to disk; returns the absolute path                            |
| `getBase64()`                    | string  | Returns the audio data as a Base64-encoded string                                |
| `getMimeType()`                  | string  | Returns the MIME type (e.g. `audio/mpeg` for `mp3`)                              |
| `toDataURI()`                    | string  | Returns a Base64 data URI (e.g. `data:audio/mpeg;base64,...` for `mp3`) for an HTML `<audio>` `src` attribute |
| `hasAudio()`                     | boolean | Returns `true` if audio binary data is present and non-empty                     |
| `getSize()`                      | numeric | Returns the size of the audio data in bytes                                      |
| `getAudioFormat()`               | string  | Returns the audio format string (e.g. `mp3`, `wav`)                              |
| `toStruct()`                     | struct  | Returns a metadata struct (no binary data — safe for logging)                    |
| `toJSON()`                       | string  | Returns JSON-serialized metadata                                                 |
| `getMetadataValue( key )`        | any     | Read a value from the response metadata bag                                      |
| `setMetadataValue( key, value )` | this    | Write a value to the metadata bag (fluent, chainable)                            |
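
The sketch below strings several of these methods together: it guards on `hasAudio()`, logs the metadata, and tags the response before saving. The `requestId` key and its value are hypothetical. Whether `toJSON()` includes custom metadata values is not stated here, so the example reads the value back with `getMetadataValue()` instead.

```javascript
audio = aiSpeak( "Your verification code is on its way." )

if ( audio.hasAudio() ) {
    // Metadata only, so the binary payload never reaches the log
    println( audio.toJSON() )

    // setMetadataValue() returns the response itself, so calls can be chained
    audio
        .setMetadataValue( "requestId", "abc-123" )  // hypothetical key/value
        .saveToFile( "verification.mp3" )

    println( "Tagged with: #audio.getMetadataValue( 'requestId' )#" )
} else {
    println( "Provider returned no audio" )
}
```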

## 💡 Examples

### Basic — synthesize and save to file

```javascript
audio = aiSpeak( "Welcome to BoxLang AI!" )
audio.saveToFile( "welcome.mp3" )
println( "Size: #audio.getSize()# bytes" )
```

### Shorthand with `outputFile` option

When you only need the file on disk, use `outputFile` to skip the response object entirely:

```javascript
path = aiSpeak(
    "Your order has shipped.",
    {},
    { outputFile: "/audio/notification.mp3" }
)
println( "Saved to: #path#" )
```

### Custom provider and voice

```javascript
audio = aiSpeak(
    "Hello, this is a custom voice.",
    { speed: 1.2 },
    { provider: "openai", voice: "nova", outputFormat: "wav" }
)
audio.saveToFile( "custom.wav" )
```

## 🧱 Fluent Builder API (v3.2.0+)

Calling `aiSpeak()` with **no arguments** returns an `AiSpeechRequest` builder object for method chaining. This provides a more readable, self-documenting way to configure speech synthesis.

### Basic Builder Usage

```javascript
audio = aiSpeak()
    .of( "Welcome to BoxLang AI!" )
    .voice( "nova" )
    .provider( "openai" )
    .asMP3()
    .speak()
```

### Builder Methods

| Method                   | Description                                            |
| ------------------------ | ------------------------------------------------------ |
| `.of( text )`            | Set the text to synthesize (static factory)            |
| `.text( text )`          | Alias for `of()`                                       |
| `.model( name )`         | Set the TTS model                                      |
| `.provider( name )`      | Set the provider                                       |
| `.apiKey( key )`         | Set the API key                                        |
| `.voice( name )`         | Set the voice name                                     |
| `.male()`                | Use male voice (resolved per provider)                 |
| `.female()`              | Use female voice (resolved per provider)               |
| `.speed( n )`            | Set playback speed (0.25–4.0)                          |
| `.instructions( text )`  | Set voice instructions                                 |
| `.outputFile( path )`    | Set output file path                                   |
| `.asMP3()`               | Set output format to MP3                               |
| `.asWav()`               | Set output format to WAV                               |
| `.asFlac()`              | Set output format to FLAC                              |
| `.asOpus()`              | Set output format to Opus                              |
| `.asPCM()`               | Set output format to PCM                               |
| `.withParams( struct )`  | Set provider params                                    |
| `.withOptions( struct )` | Set module options                                     |
| `.withLogging()`         | Enable request/response logging                        |
| `.speak()`               | **Terminator**: executes the request and returns `AiSpeechResponse`, or the file path when `outputFile` is set |

### Fluent Examples

```javascript
// Gender shortcuts
audio = aiSpeak()
    .of( "This uses a male voice." )
    .male()
    .speed( 1.2 )
    .speak()

// Save directly to file
path = aiSpeak()
    .of( "Notification alert!" )
    .asWav()
    .outputFile( "/audio/alert.wav" )
    .speak()

// Full configuration
audio = aiSpeak()
    .of( "Hello from BoxLang!" )
    .model( "tts-1-hd" )
    .provider( "openai" )
    .voice( "nova" )
    .speed( 1.0 )
    .asMP3()
    .withLogging()
    .speak()
```
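
The builder methods not exercised above, `.instructions()` and `.withOptions()`, might be combined as in the following sketch. It assumes `.withOptions()` is best called before the individual setters, since this page does not say whether it merges with or replaces previously set options. The instruction text is purely illustrative, and support for voice instructions may vary by provider and model.

```javascript
audio = aiSpeak()
    .of( "Thanks for calling. How can I help?" )
    .withOptions( { timeout: 60 } )   // module option from the Options table
    .provider( "openai" )
    .female()
    .instructions( "Speak warmly and at a relaxed pace." )
    .speak()

// Only write the file if the provider actually returned audio
if ( audio.hasAudio() ) {
    audio.saveToFile( "greeting.mp3" )
}
```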

> 💡 **Backward Compatible:** The traditional `aiSpeak( text, params, options )` syntax continues to work unchanged. The fluent builder is an **additional** option — no migration required.

### Gender keyword voices

Use `"male"` or `"female"` for provider-agnostic voice selection. The module resolves the keyword to the concrete voice configured in `audio.voiceGenderMap` for the active provider — no need to remember provider-specific names:

```javascript
// Works with any provider — resolves to the configured male/female voice automatically
audio = aiSpeak( "This message uses a male voice.", {}, { voice: "male" } )
audio.saveToFile( "male-greeting.mp3" )

audio = aiSpeak( "This message uses a female voice.", {}, { voice: "female" } )
audio.saveToFile( "female-greeting.mp3" )

// Useful when switching providers without updating every voice call
audio = aiSpeak(
    "Same code, different provider.",
    {},
    { provider: "gemini", voice: "female" }  // resolves to "Aoede" on Gemini
)
```

### Base64 / data URI for web responses

Embed audio directly in an HTML page or API response — no file I/O required:

```javascript
audio = aiSpeak( "Click to play this message" )

// Embed directly in HTML
htmlOutput = '<audio controls src="#audio.toDataURI()#"></audio>'

// Or pass the raw Base64 to a front-end
jsonResponse = { audio: audio.getBase64(), mimeType: audio.getMimeType() }
```

### ElevenLabs — high-quality multilingual voice

```javascript
audio = aiSpeak(
    "This is a premium voice synthesis example.",
    { voice_id: "21m00Tcm4TlvDq8ikWAM" },  // Rachel voice ID
    { provider: "elevenlabs" }
)
audio.saveToFile( "premium.mp3" )
println( "Format: #audio.getAudioFormat()#, Size: #audio.getSize()# bytes" )
```

### Generate comparison files across all voices

```javascript
voices = [ "alloy", "echo", "fable", "onyx", "nova", "shimmer" ]
sampleText = "BoxLang AI — where imagination meets voice."

voices.each( voice => {
    // With outputFile set, aiSpeak() returns the path it actually wrote to
    var path = aiSpeak(
        sampleText,
        { voice: voice },
        { outputFile: expandPath( "/tmp/voice-#voice#.mp3" ) }
    )
    println( "Generated: #path#" )
})
```

## 🎙️ Provider Voice Reference

| Provider       | Available Voices                                           | `"male"` keyword     | `"female"` keyword     |
| -------------- | ---------------------------------------------------------- | -------------------- | ---------------------- |
| **OpenAI**     | `alloy`, `ash`, `echo`, `fable`, `onyx`, `nova`, `shimmer` | `ash`                | `nova`                 |
| **Mistral**    | `Charlotte`                                                | *(provider default)* | `Charlotte`            |
| **Gemini**     | `Fenrir`, `Aoede`, `Kore` (and others via API)             | `Fenrir`             | `Aoede`                |
| **Grok / xAI** | `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`, `eve` | `onyx`               | `nova`                 |
| **ElevenLabs** | Voice IDs from your ElevenLabs voice library               | *(provider default)* | `21m00Tcm4TlvDq8ikWAM` |

> For ElevenLabs, pass a `voice_id` in `params` for specific voices. The `"male"`/`"female"` keywords resolve to the IDs configured in `audio.voiceGenderMap`.

### Customising the gender map

The defaults can be overridden per-provider in `config/boxlang.json`:

```json
{
    "modules": {
        "bxai": {
            "settings": {
                "audio": {
                    "voiceGenderMap": {
                        "openai": { "male": "echo", "female": "shimmer" },
                        "elevenlabs": { "male": "YOUR_MALE_VOICE_ID", "female": "YOUR_FEMALE_VOICE_ID" }
                    }
                }
            }
        }
    }
}
```
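
To illustrate the effect, the sketch below assumes the override above is active. With it, the `"male"` keyword on OpenAI should resolve to `echo` rather than the default `ash`. The output file name is an arbitrary choice.

```javascript
// Assumes voiceGenderMap.openai.male has been overridden to "echo"
audio = aiSpeak(
    "Testing the overridden voice.",
    {},
    { provider: "openai", voice: "male" }
)
audio.saveToFile( "override-check.mp3" )
```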

## 📡 Events

`aiSpeak()` fires two interception points you can hook into for logging, auditing, or modifying behavior.

| Event            | Data Available                       |
| ---------------- | ------------------------------------ |
| `beforeAISpeech` | `speechRequest`, `service`           |
| `afterAISpeech`  | `speechRequest`, `service`, `result` |

```javascript
// Log all TTS calls with their output size
BoxRegisterInterceptor( "afterAISpeech", event => {
    var sizeKB = event.result.getSize() / 1024
    println( "TTS: provider=#event.service.getName()# size=#numberFormat( sizeKB, '0.0' )#KB" )
})
```
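
A listener for the other interception point could look like the sketch below. It mirrors the registration form used in the example above and assumes `service` exposes `getName()` in `beforeAISpeech` just as it does in `afterAISpeech`.

```javascript
// Announce each TTS call before it is sent to the provider
BoxRegisterInterceptor( "beforeAISpeech", event => {
    println( "TTS starting: provider=#event.service.getName()#" )
})
```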

***

## 📖 Related Pages

* [Audio Overview](/main-components/audio.md)
* [Speech-to-Text](/main-components/audio/speech-to-text.md)
* [Audio Translation](/main-components/audio/audio-translation.md)
* [aiSpeak BIF Reference](/advanced/reference/built-in-functions/aispeak.md)


