HidsTech
Intelligent AI Studio
AI Development · 6 min read · 10 February 2026

Building Real-Time AI Applications with Streaming

Streaming transforms AI from batch processing to real-time interaction. Here's how to implement streaming in your AI application for a dramatically better user experience.

Waiting for an AI response to complete before showing anything to the user is a UX mistake. Streaming — sending tokens as they're generated — makes AI feel instant and dramatically improves perceived performance.

Why Streaming Matters

Without streaming: user waits 5-15 seconds, then sees the full response appear.

With streaming: user sees the response start appearing in under 500ms, with the rest arriving progressively.

The actual generation time is the same. The perceived speed is completely different.

Server-Side Streaming (Node.js / Next.js)

```typescript
// app/api/chat/route.ts
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

export async function POST(req: Request) {
  const { message } = await req.json();

  const stream = await anthropic.messages.stream({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    messages: [{ role: "user", content: message }],
  });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      // Forward only text deltas; other stream events are skipped.
      for await (const chunk of stream) {
        if (chunk.type === "content_block_delta" && chunk.delta.type === "text_delta") {
          controller.enqueue(encoder.encode(chunk.delta.text));
        }
      }
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "Transfer-Encoding": "chunked",
    },
  });
}
```
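
The handler above is really just "async iterator in, `ReadableStream` out", so the pattern can be exercised without calling the API at all. A minimal, self-contained sketch — `mockTokens`, `toReadableStream`, and `readAll` are hypothetical names standing in for the Anthropic stream and the route's plumbing:

```typescript
// Stand-in for the model stream: an async generator yielding text fragments.
async function* mockTokens(): AsyncGenerator<string> {
  for (const t of ["Hel", "lo, ", "world!"]) yield t;
}

// Same shape as the route handler: wrap an async iterable in a ReadableStream.
export function toReadableStream(tokens: AsyncIterable<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream({
    async start(controller) {
      for await (const text of tokens) {
        controller.enqueue(encoder.encode(text));
      }
      controller.close();
    },
  });
}

// Drain the stream and reassemble the text, mirroring what a client does.
export async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const decoder = new TextDecoder();
  const reader = stream.getReader();
  let out = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    out += decoder.decode(value, { stream: true });
  }
  return out;
}
```

Swapping `mockTokens()` for the real Anthropic stream is the only change the route needs.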

Client-Side Consumption (React)

```typescript
"use client";

import { useState } from "react";

export function Chat() {
  const [response, setResponse] = useState("");

  const sendMessage = async (message: string) => {
    setResponse("");

    const res = await fetch("/api/chat", {
      method: "POST",
      body: JSON.stringify({ message }),
      headers: { "Content-Type": "application/json" },
    });

    const reader = res.body!.getReader();
    const decoder = new TextDecoder();

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // { stream: true } keeps multi-byte characters intact across chunk boundaries.
      setResponse((prev) => prev + decoder.decode(value, { stream: true }));
    }
  };

  return (
    <div>
      <button onClick={() => sendMessage("Hello!")}>Send</button>
      <div>{response}</div>
    </div>
  );
}
```
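
One gap in the component above: if the user navigates away mid-stream, the read loop keeps running. A sketch of cancellation using `AbortController` — `streamChat` is a hypothetical extraction of the fetch-and-read logic; in React you would call it from `useEffect` and invoke `controller.abort()` in the cleanup function:

```typescript
// Hypothetical helper: stream a chat response, stopping when `signal` aborts.
export async function streamChat(
  message: string,
  onChunk: (text: string) => void,
  signal: AbortSignal,
): Promise<void> {
  const res = await fetch("/api/chat", {
    method: "POST",
    body: JSON.stringify({ message }),
    headers: { "Content-Type": "application/json" },
    signal, // aborting rejects the fetch and cancels the body stream
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  try {
    // Stop reading as soon as the stream ends or the caller aborts.
    while (!signal.aborted) {
      const { done, value } = await reader.read();
      if (done) break;
      onChunk(decoder.decode(value, { stream: true }));
    }
  } finally {
    reader.releaseLock();
  }
}
```

Passing the same `signal` to `fetch` and checking it in the loop covers both the pre-response and mid-stream cases.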

Streaming with the Vercel AI SDK

The Vercel AI SDK simplifies streaming significantly:

```typescript
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: anthropic("claude-sonnet-4-6"),
    messages,
  });

  return result.toDataStreamResponse();
}
```

```typescript
// Client
import { useChat } from "ai/react";

export function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <div key={m.id}>{m.content}</div>
      ))}
      <input value={input} onChange={handleInputChange} />
      <button type="submit">Send</button>
    </form>
  );
}
```

Streaming Best Practices

  • Show a loading indicator for the period before the first token arrives
  • Handle connection errors — implement reconnection logic for long streams
  • Cancel on unmount — abort the fetch when the component unmounts
  • Stream structured data carefully — parse JSON only when the stream is complete
  • Rate limit per user — streaming can be abused; protect your API

When Not to Stream

  • Background processing (batch jobs)
  • When you need the full response before doing anything with it
  • Structured data extraction (wait for complete JSON)
  • When latency to first byte doesn't matter
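
The advice to parse JSON only when the stream is complete can be made concrete: accumulate the raw text while streaming (fine to display progressively) and parse once the stream closes. A minimal sketch — `collectJson` is a hypothetical helper, and the generator stands in for a model stream emitting a JSON object in fragments:

```typescript
// Accumulate streamed text and parse it as JSON only once the stream ends.
export async function collectJson<T>(chunks: AsyncIterable<string>): Promise<T> {
  let raw = "";
  for await (const chunk of chunks) {
    raw += chunk; // safe to show to the user, but not yet valid JSON
  }
  return JSON.parse(raw) as T; // parse only after the final chunk
}

// Stand-in for a model stream that emits a JSON object in pieces.
async function* jsonFragments(): AsyncGenerator<string> {
  yield '{"name": "Ada"';
  yield ', "score": 42}';
}
```

Attempting `JSON.parse` on any intermediate fragment would throw, which is exactly why structured extraction is usually better served by a non-streaming call.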

Talk to us about building real-time AI experiences for your product.

Ready to implement AI in your business?

Book a free 30-minute strategy call — no commitment required.

Book a Free Call →