Pipeline Streaming

Stream data through AI pipelines in real time for responsive applications. Streaming provides immediate feedback as the AI generates its response.

🚀 Basic Streaming

🔄 Streaming Flow

The flow is simple: the pipeline sends the request to the provider, the provider returns the response incrementally, and each chunk is handed to your callback as it arrives.

Stream Through Pipeline

// Build a pipeline bound to the default model
pipeline = aiMessage()
    .user( "Tell me a story" )
    .toDefaultModel()

// The callback fires once per chunk as the provider returns it
pipeline.stream( ( chunk ) => {
    content = chunk.choices?.first()?.delta?.content ?: ""
    print( content )
} )

With Bindings

pipeline = aiMessage()
    .system( "You are ${style}" )
    .user( "Write about ${topic}" )
    .toDefaultModel()

// stream( onChunk, input, params, options )
pipeline.stream(
    ( chunk ) => print( chunk.choices?.first()?.delta?.content ?: "" ),
    { style: "poetic", topic: "nature" }  // input bindings
)

With Options

Options in Streaming

Streamers accept the same options parameter as the run() methods. Default options are set when the pipeline is configured; options passed at stream time override them for that call.
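A minimal sketch using the documented stream( onChunk, input, params, options ) signature; the temperature param name is an assumption borrowed from common provider APIs:

pipeline.stream(
    ( chunk ) => print( chunk.choices?.first()?.delta?.content ?: "" ),
    {},                    // input bindings
    { temperature: 0.2 },  // assumed param name; overrides the pipeline's default
    {}                     // runtime options go here
)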

Note: Return format options don't apply to streaming; chunks are always delivered in the provider's streaming format.

Message Streaming

Messages can stream their content:
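A minimal sketch, assuming the message object itself exposes a stream() method that routes to the default model; verify the exact method against your version:

aiMessage()
    .user( "Summarize the plot of Hamlet" )
    .stream( ( chunk ) => {
        // Assumed: message-level stream() uses the default model
        print( chunk.choices?.first()?.delta?.content ?: "" )
    } )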

Collecting Stream Data

Full Response Collection
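For example, accumulate each delta into a single string while still printing live output:

fullResponse = ""

pipeline.stream( ( chunk ) => {
    piece = chunk.choices?.first()?.delta?.content ?: ""
    print( piece )          // live output
    fullResponse &= piece   // accumulate the complete text
} )

println( "" )
println( "Total length: " & fullResponse.len() )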

Structured Collection
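Or keep the raw chunks alongside the assembled text, which helps when debugging provider payloads:

collected = { content: "", chunks: [] }

pipeline.stream( ( chunk ) => {
    collected.chunks.append( chunk )  // keep the raw provider payloads
    collected.content &= chunk.choices?.first()?.delta?.content ?: ""
} )

println( "Assembled " & collected.content.len() & " characters from " & collected.chunks.len() & " chunks" )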

Streaming Patterns

Progress Indicator
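For long generations you may want a lightweight indicator instead of the content itself; a sketch:

chunkCount = 0

pipeline.stream( ( chunk ) => {
    chunkCount++
    print( "." )   // one dot per chunk received
} )

println( " done (" & chunkCount & " chunks)" )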

Real-Time Display
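In a web context, print each delta and flush so the client sees output immediately (see Best Practices below):

pipeline.stream( ( chunk ) => {
    print( chunk.choices?.first()?.delta?.content ?: "" )
    flush()   // push buffered output to the client right away
} )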

Chunk Processing
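Chunks can be inspected or transformed before display; this sketch skips empty deltas and collapses runs of whitespace:

pipeline.stream( ( chunk ) => {
    text = chunk.choices?.first()?.delta?.content ?: ""
    if( text.len() ) {
        // Collapse runs of whitespace before printing
        print( text.reReplace( "\s+", " ", "all" ) )
    }
} )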

Web Streaming

Server-Sent Events (SSE)
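A sketch of a Server-Sent Events endpoint. Setting the text/event-stream content type is environment-specific, so it is noted as a comment rather than shown:

// Before streaming, set the response Content-Type to text/event-stream
// using your server or framework's API (environment-specific)
pipeline.stream( ( chunk ) => {
    content = chunk.choices?.first()?.delta?.content ?: ""
    if( content.len() ) {
        // Each SSE event is a "data:" line followed by a blank line
        print( "data: " & jsonSerialize( { text: content } ) & chr( 10 ) & chr( 10 ) )
        flush()
    }
} )

// Signal completion to the client
print( "data: [DONE]" & chr( 10 ) & chr( 10 ) )
flush()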

WebSocket Streaming
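The same fan-out works over WebSockets; wsSend() below is a hypothetical stand-in for your WebSocket library's send method:

pipeline.stream( ( chunk ) => {
    content = chunk.choices?.first()?.delta?.content ?: ""
    if( content.len() ) {
        // wsSend() is hypothetical; substitute your WebSocket API
        wsSend( jsonSerialize( { type: "chunk", text: content } ) )
    }
} )

wsSend( jsonSerialize( { type: "done" } ) )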

JSON Streaming
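To emit newline-delimited JSON (NDJSON), serialize each raw chunk on its own line:

pipeline.stream( ( chunk ) => {
    println( jsonSerialize( chunk ) )  // one JSON object per line
    flush()
} )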

Advanced Streaming

Stream with Transforms
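A sketch assuming a transform() builder step on the pipeline; check the pipelines documentation for the exact method name:

pipeline = aiMessage()
    .user( "List three colors" )
    .toDefaultModel()
    .transform( ( response ) => response.uCase() )  // assumed builder method

pipeline.stream( ( chunk ) => print( chunk.choices?.first()?.delta?.content ?: "" ) )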

Note: Transforms run after streaming completes, not per-chunk.

Conditional Streaming
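Because streamers accept the same options as run(), you can switch between modes with a flag; this sketch assumes run() returns the response text:

useStreaming = true   // e.g. driven by a request parameter

if( useStreaming ) {
    pipeline.stream( ( chunk ) => print( chunk.choices?.first()?.delta?.content ?: "" ) )
} else {
    // Blocking call; prints the full result at once
    println( pipeline.run() )
}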

Stream with Timeout
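Using the documented stream( onChunk, input, params, options ) signature; the timeout key and its units are assumptions to verify against your provider:

pipeline.stream(
    ( chunk ) => print( chunk.choices?.first()?.delta?.content ?: "" ),
    {},              // input bindings
    {},              // model params
    { timeout: 30 }  // assumed option key, in seconds; verify against your provider
)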

Practical Examples

Interactive Chat
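A sketch of a chat turn that streams the reply while building up history; aiMessage( history ) accepting prior messages is an assumption to verify:

function chatTurn( required string userInput, array history = [] ) {
    history.append( { role: "user", content: arguments.userInput } )

    // Assumed: aiMessage() can seed a pipeline from prior messages
    var pipeline = aiMessage( history ).toDefaultModel()

    var reply = ""
    pipeline.stream( ( chunk ) => {
        var piece = chunk.choices?.first()?.delta?.content ?: ""
        print( piece )
        reply &= piece
    } )

    history.append( { role: "assistant", content: reply } )
    return history
}

history = chatTurn( "Hello there!" )
history = chatTurn( "Tell me more", history )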

Markdown Renderer
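Markdown constructs like headings and lists only make sense once complete, so a simple approach is to show raw text live and re-render at the end; renderMarkdown() is a hypothetical stand-in for your renderer:

markdown = ""

pipeline.stream( ( chunk ) => {
    piece = chunk.choices?.first()?.delta?.content ?: ""
    print( piece )      // show raw text while it streams
    markdown &= piece
} )

// Re-render the complete document once the stream ends;
// renderMarkdown() is hypothetical, substitute your renderer
println( renderMarkdown( markdown ) )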

Progress Tracker
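Track chunk count, character count, and elapsed time alongside the live output:

stats = { chunks: 0, characters: 0, startedAt: getTickCount() }

pipeline.stream( ( chunk ) => {
    text = chunk.choices?.first()?.delta?.content ?: ""
    stats.chunks++
    stats.characters += text.len()
    print( text )
} )

elapsed = getTickCount() - stats.startedAt
println( "" )
println( stats.chunks & " chunks, " & stats.characters & " chars in " & elapsed & " ms" )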

Stream Multiplexer
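One stream can feed several consumers; here the same delta goes to the console and to an in-memory collector:

collected = []
consumers = [
    ( text ) => print( text ),             // live console output
    ( text ) => collected.append( text )   // in-memory collector
]

pipeline.stream( ( chunk ) => {
    text = chunk.choices?.first()?.delta?.content ?: ""
    if( text.len() ) {
        consumers.each( ( consumer ) => consumer( text ) )
    }
} )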

Error Handling

Stream Error Handling
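Wrap the stream call so a mid-response failure is caught; remember that partial output may already have been emitted:

try {
    pipeline.stream( ( chunk ) => {
        print( chunk.choices?.first()?.delta?.content ?: "" )
    } )
} catch( any e ) {
    // Streaming can fail mid-response, so partial output
    // may already be on screen when this runs
    println( "" )
    println( "Stream failed: " & e.message )
}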

Graceful Degradation
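A sketch that tries streaming first and falls back to a blocking run(), assuming run() returns the response text:

function getResponseText( pipeline ) {
    try {
        var result = ""
        pipeline.stream( ( chunk ) => {
            var piece = chunk.choices?.first()?.delta?.content ?: ""
            print( piece )
            result &= piece
        } )
        return result
    } catch( any e ) {
        // Fall back to a blocking run() when streaming is unavailable
        return pipeline.run()
    }
}

text = getResponseText( pipeline )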

Best Practices

  1. Flush Output: Call flush() in web contexts

  2. Handle Errors: Streaming can fail mid-response

  3. Track State: Monitor stream progress

  4. Set Timeouts: Prevent infinite streams

  5. Buffer Appropriately: Balance responsiveness and performance

  6. Test Disconnects: Handle client disconnections

  7. Provide Feedback: Show progress indicators

Performance Tips

  1. Minimize Processing: Keep chunk callbacks fast

  2. Buffer When Needed: Don't flush every character

  3. Use Appropriate Models: Some stream better than others

  4. Monitor Memory: Long streams can accumulate

  5. Close Connections: Clean up resources

Next Steps
