
Interactive Avatars

Earna AI Console integrates HeyGen Streaming Avatar technology to provide real-time interactive avatars that can speak responses from GPT-4o and other AI models through WebRTC streaming.

Overview

The interactive avatar system provides:

  • Real-time avatar rendering synchronized with AI responses
  • WebRTC peer-to-peer streaming for low latency
  • Multiple avatar options with different personalities
  • Voice synthesis and lip-sync animation
  • Session management and state handling

Architecture

System Architecture

Sequence Diagram

Data Flow

Setup

Get HeyGen API Keys

  1. Sign up at HeyGen Platform 
  2. Navigate to API Settings
  3. Generate your API key and secret

Configure Environment

Add to your .env.local:

    # HeyGen Configuration
    NEXT_PUBLIC_HEYGEN_API_KEY=your_api_key
    HEYGEN_SECRET_KEY=your_secret_key

    # Optional: Custom TURN server
    TURN_SERVER_URL=turn:your-server.com:3478
    TURN_USERNAME=username
    TURN_PASSWORD=password
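A missing variable tends to surface only when the first token request fails. As a minimal sketch, a startup check could read the variables listed above from an environment-like object and fail fast. `readHeygenConfig`, `HeygenConfig`, and `Env` are hypothetical names introduced for illustration; they are not part of the HeyGen SDK or the console.

```typescript
// Hypothetical helper: validates the HeyGen-related variables from .env.local.
interface HeygenConfig {
  apiKey: string;
  secretKey: string;
  // Present only when all three optional TURN values are set.
  turn?: { url: string; username: string; password: string };
}

type Env = Record<string, string | undefined>;

function readHeygenConfig(env: Env): HeygenConfig {
  const apiKey = env.NEXT_PUBLIC_HEYGEN_API_KEY;
  const secretKey = env.HEYGEN_SECRET_KEY;

  // Collect every missing required name so the error lists them all at once.
  const missing: string[] = [];
  if (!apiKey) missing.push('NEXT_PUBLIC_HEYGEN_API_KEY');
  if (!secretKey) missing.push('HEYGEN_SECRET_KEY');
  if (!apiKey || !secretKey) {
    throw new Error(`Missing HeyGen configuration: ${missing.join(', ')}`);
  }

  const config: HeygenConfig = { apiKey, secretKey };
  const {
    TURN_SERVER_URL: url,
    TURN_USERNAME: username,
    TURN_PASSWORD: password,
  } = env;
  if (url && username && password) {
    config.turn = { url, username, password };
  }
  return config;
}
```

On the server this could be called once as `readHeygenConfig(process.env)`, so a misconfigured deployment fails at startup rather than on the first avatar session.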

Install Dependencies

The HeyGen SDK is already included in the console:

    {
      "dependencies": {
        "@heygen/streaming-avatar": "^2.0.16"
      }
    }

Enable Avatar Mode

The avatar button appears automatically in the chat interface when HeyGen is configured.

Implementation

API Routes

Generate Access Token

    // app/api/heygen/access-token/route.ts
    import { NextResponse } from 'next/server';
    import { SignJWT } from 'jose';

    export async function POST(req: Request) {
      try {
        const apiKey = process.env.NEXT_PUBLIC_HEYGEN_API_KEY;
        const secretKey = process.env.HEYGEN_SECRET_KEY;

        if (!apiKey || !secretKey) {
          return NextResponse.json(
            { error: 'HeyGen not configured' },
            { status: 500 }
          );
        }

        // Create a JWT signed with the secret key; it expires after 1 hour
        const token = await new SignJWT({
          api_key: apiKey,
          exp: Math.floor(Date.now() / 1000) + 3600, // 1 hour
        })
          .setProtectedHeader({ alg: 'HS256' })
          .setIssuedAt()
          .sign(new TextEncoder().encode(secretKey));

        return NextResponse.json({ token });
      } catch (error) {
        console.error('Token generation failed:', error);
        return NextResponse.json(
          { error: 'Failed to generate token' },
          { status: 500 }
        );
      }
    }

Client Components

Avatar Modal Component

    // app/components/avatar/avatar-modal.tsx
    import { useState } from 'react';
    import { StreamingAvatar } from '@heygen/streaming-avatar';
    // UI primitives: these paths assume a shadcn/ui-style layout; adjust to your project.
    import { Button } from '@/components/ui/button';
    import { Dialog, DialogContent, DialogHeader, DialogTitle } from '@/components/ui/dialog';
    import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select';
    import { Textarea } from '@/components/ui/textarea';

    interface AvatarModalProps {
      isOpen: boolean;
      onClose: () => void;
    }

    export function AvatarModal({ isOpen, onClose }: AvatarModalProps) {
      const [avatar, setAvatar] = useState<StreamingAvatar | null>(null);
      const [isLoading, setIsLoading] = useState(false);
      const [selectedAvatar, setSelectedAvatar] = useState('josh_lite3_20230714');

      const initializeAvatar = async () => {
        setIsLoading(true);
        try {
          // Get access token from the server-side route
          const tokenResponse = await fetch('/api/heygen/access-token', {
            method: 'POST',
          });
          const { token } = await tokenResponse.json();

          // Initialize SDK
          const newAvatar = new StreamingAvatar({ token });

          // Attach the incoming media stream to the video element
          newAvatar.on('stream-ready', (stream) => {
            const videoElement = document.getElementById('avatar-video') as HTMLVideoElement;
            if (videoElement && stream) {
              videoElement.srcObject = stream;
              videoElement.play();
            }
          });

          newAvatar.on('stream-disconnected', () => {
            console.log('Stream disconnected');
          });

          setAvatar(newAvatar);
        } catch (error) {
          console.error('Failed to initialize avatar:', error);
        } finally {
          setIsLoading(false);
        }
      };

      const startSession = async () => {
        if (!avatar) return;
        try {
          await avatar.createStartAvatar({
            avatarName: selectedAvatar,
            quality: 'high',
            voice: {
              voiceId: 'en-US-BrianNeural',
              rate: 1.0,
            },
          });
        } catch (error) {
          console.error('Failed to start session:', error);
        }
      };

      const speak = async (text: string) => {
        if (!avatar) return;
        try {
          await avatar.speak({
            text,
            taskType: 'talk',
          });
        } catch (error) {
          console.error('Failed to speak:', error);
        }
      };

      return (
        <Dialog open={isOpen} onOpenChange={onClose}>
          <DialogContent className="max-w-4xl">
            <DialogHeader>
              <DialogTitle>Interactive Avatar</DialogTitle>
            </DialogHeader>
            <div className="grid grid-cols-2 gap-4">
              <div className="aspect-video bg-black rounded-lg overflow-hidden">
                <video
                  id="avatar-video"
                  className="w-full h-full object-cover"
                  autoPlay
                  playsInline
                />
              </div>
              <div className="space-y-4">
                <Select value={selectedAvatar} onValueChange={setSelectedAvatar}>
                  <SelectTrigger>
                    <SelectValue placeholder="Choose an avatar" />
                  </SelectTrigger>
                  <SelectContent>
                    <SelectItem value="josh_lite3_20230714">Josh</SelectItem>
                    <SelectItem value="anna_public">Anna</SelectItem>
                    <SelectItem value="wayne_lite">Wayne</SelectItem>
                  </SelectContent>
                </Select>
                <div className="flex gap-2">
                  <Button onClick={initializeAvatar} disabled={isLoading}>
                    Initialize
                  </Button>
                  <Button onClick={startSession} disabled={!avatar}>
                    Start Session
                  </Button>
                </div>
                <Textarea
                  placeholder="Enter text for avatar to speak..."
                  onKeyDown={(e) => {
                    // Enter sends the text; Shift+Enter inserts a newline
                    if (e.key === 'Enter' && !e.shiftKey) {
                      e.preventDefault();
                      speak(e.currentTarget.value);
                      e.currentTarget.value = '';
                    }
                  }}
                />
              </div>
            </div>
          </DialogContent>
        </Dialog>
      );
    }

Default avatars available in the system:

| Avatar ID | Name | Style | Voice Options |
|---|---|---|---|
| josh_lite3_20230714 | Josh | Professional | en-US-BrianNeural, en-US-GuyNeural |
| anna_public | Anna | Friendly | en-US-JennyNeural, en-US-AriaNeural |
| wayne_lite | Wayne | Casual | en-US-ChristopherNeural, en-US-EricNeural |
| monica | Monica | Corporate | en-US-MichelleNeural, en-US-MonicaNeural |
| kayla | Kayla | Energetic | en-US-SaraNeural, en-US-AmberNeural |
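The table above can be mirrored in code so that the avatar picker and the voice selection read from one source. The following is a sketch only: `AvatarOption`, `AVATARS`, and `defaultVoiceFor` are hypothetical names, and treating the first listed voice as the default is an assumption, not documented behavior.

```typescript
// Hypothetical typed catalog built from the default-avatar table.
interface AvatarOption {
  name: string;
  style: string;
  voices: readonly string[];
}

const AVATARS: Record<string, AvatarOption> = {
  josh_lite3_20230714: {
    name: 'Josh',
    style: 'Professional',
    voices: ['en-US-BrianNeural', 'en-US-GuyNeural'],
  },
  anna_public: {
    name: 'Anna',
    style: 'Friendly',
    voices: ['en-US-JennyNeural', 'en-US-AriaNeural'],
  },
  wayne_lite: {
    name: 'Wayne',
    style: 'Casual',
    voices: ['en-US-ChristopherNeural', 'en-US-EricNeural'],
  },
  monica: {
    name: 'Monica',
    style: 'Corporate',
    voices: ['en-US-MichelleNeural', 'en-US-MonicaNeural'],
  },
  kayla: {
    name: 'Kayla',
    style: 'Energetic',
    voices: ['en-US-SaraNeural', 'en-US-AmberNeural'],
  },
};

// Assumption: the first voice in each row is used as that avatar's default.
function defaultVoiceFor(avatarId: string): string | undefined {
  return AVATARS[avatarId]?.voices[0];
}
```

A lookup for an unknown avatar ID returns `undefined`, which lets the caller fall back to a voice of its own choosing instead of starting a session with an invalid pairing.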

Voice Configuration

HeyGen supports Azure Neural voices. You can customize voice parameters for each avatar.

    // Voice configuration options
    interface VoiceConfig {
      voiceId: string;   // Azure voice ID
      rate?: number;     // Speed: 0.5 to 2.0 (default: 1.0)
      pitch?: number;    // Pitch: -50 to 50 (default: 0)
      volume?: number;   // Volume: 0 to 100 (default: 100)
      emotion?: string;  // Optional emotion style
    }

    // Example configurations
    const voices = {
      professional: {
        voiceId: 'en-US-BrianNeural',
        rate: 0.95,
        pitch: -5,
      },
      friendly: {
        voiceId: 'en-US-JennyNeural',
        rate: 1.05,
        pitch: 5,
      },
      authoritative: {
        voiceId: 'en-US-GuyNeural',
        rate: 0.9,
        pitch: -10,
      },
    };
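Because the ranges above are only stated in comments, user-supplied values can drift outside them. A minimal sketch of a guard, assuming the documented ranges and defaults are accurate: `normalizeVoiceConfig` and `clamp` are hypothetical helpers, and how the service itself treats out-of-range values is not covered here.

```typescript
// Same shape as the VoiceConfig interface described above.
interface VoiceConfig {
  voiceId: string;
  rate?: number;
  pitch?: number;
  volume?: number;
  emotion?: string;
}

// Restrict a number to the closed interval [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: fills in the documented defaults and clamps each
// numeric field to its documented range before the config is used.
function normalizeVoiceConfig(config: VoiceConfig): VoiceConfig {
  return {
    ...config,
    rate: clamp(config.rate ?? 1.0, 0.5, 2.0),
    pitch: clamp(config.pitch ?? 0, -50, 50),
    volume: clamp(config.volume ?? 100, 0, 100),
  };
}
```

Clamping rather than rejecting keeps a slider or free-form input from breaking a session; an application that prefers strict validation could throw instead.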

Integration with Chat

Connect the avatar to your chat system:

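    // app/components/chat/chat-with-avatar.tsx
    import { useEffect, useState } from 'react';
    // useChat, useStreamingAvatarSession, ChatInterface, and AvatarModal are
    // app-local; import them from wherever they live in your project.

    export function ChatWithAvatar() {
      const { messages } = useChat();
      const { speak, isConnected } = useStreamingAvatarSession();
      const [isAvatarOpen, setIsAvatarOpen] = useState(true);

      useEffect(() => {
        // Make the avatar speak the latest GPT-4o response
        const lastMessage = messages[messages.length - 1];
        if (lastMessage?.role === 'assistant' && isConnected) {
          speak(lastMessage.content);
        }
      }, [messages, isConnected, speak]);

      return (
        <div className="flex gap-4">
          <div className="flex-1">
            <ChatInterface />
          </div>
          <div className="w-96">
            <AvatarModal
              isOpen={isAvatarOpen}
              onClose={() => setIsAvatarOpen(false)}
            />
          </div>
        </div>
      );
    }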
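The effect above re-runs whenever `messages` or `isConnected` changes, so a reconnect could make the avatar repeat the last reply. One way to guard against that, sketched under the assumption that each message carries a stable `id` (the `ChatMessage` shape and `nextUtterance` are hypothetical, not the console's actual types):

```typescript
// Assumed message shape; the real chat hook may differ.
interface ChatMessage {
  id: string;
  role: 'user' | 'assistant' | 'system';
  content: string;
}

// Returns the message the avatar should speak next, or null when the latest
// message is not an assistant reply, is empty, or has already been spoken.
function nextUtterance(
  messages: readonly ChatMessage[],
  lastSpokenId: string | null,
): ChatMessage | null {
  const last = messages[messages.length - 1];
  if (!last || last.role !== 'assistant') return null;
  if (last.id === lastSpokenId) return null;
  if (last.content.trim() === '') return null;
  return last;
}
```

Inside the component, the ID of the last spoken message could be kept in a `useRef` and updated after each successful `speak` call, so the check survives re-renders without triggering new ones.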

Performance Optimization

Connection Quality

    // Monitor WebRTC connection quality
    async function monitorConnectionQuality(pc: RTCPeerConnection) {
      const stats = await pc.getStats();
      stats.forEach((report) => {
        // `kind` is the standard field; `mediaType` is a deprecated alias
        if (report.type === 'inbound-rtp' && report.kind === 'video') {
          console.log({
            packetsLost: report.packetsLost,
            jitter: report.jitter,
            frameRate: report.framesPerSecond,
            bytesReceived: report.bytesReceived,
          });
        }
      });
    }
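Raw counters such as `packetsLost` are cumulative and hard to read on their own. As a sketch, they can be turned into a loss ratio and a coarse label. `packetsReceived` is a standard `inbound-rtp` field, while `packetLossRatio`, `classifyLoss`, and the 2% and 5% thresholds are assumptions chosen for illustration, not HeyGen recommendations.

```typescript
// Subset of an inbound-rtp stats report used for the calculation.
interface InboundVideoSample {
  packetsLost: number;
  packetsReceived: number;
}

// Fraction of expected packets that were lost, in the range 0 to 1.
// packetsLost can be reported as negative when duplicates arrive, so it is
// floored at zero before dividing.
function packetLossRatio(sample: InboundVideoSample): number {
  const lost = Math.max(0, sample.packetsLost);
  const total = lost + sample.packetsReceived;
  return total > 0 ? lost / total : 0;
}

// Illustrative thresholds only: under 2% good, under 5% degraded, else poor.
function classifyLoss(ratio: number): 'good' | 'degraded' | 'poor' {
  if (ratio < 0.02) return 'good';
  if (ratio < 0.05) return 'degraded';
  return 'poor';
}
```

Since the counters accumulate over the whole session, a long-running monitor would usually compute the ratio over the difference between two consecutive reports rather than over the lifetime totals.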

Bandwidth Management

    // Adjust quality based on network conditions
    function adjustQuality(bandwidth: number): 'low' | 'medium' | 'high' {
      if (bandwidth < 500000) return 'low';      // < 500 Kbps
      if (bandwidth < 1000000) return 'medium';  // < 1 Mbps
      return 'high';                             // >= 1 Mbps
    }
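`adjustQuality` expects a bandwidth figure in bits per second, which the stats API does not report directly. One rough proxy, sketched here, is the received bitrate derived from two `bytesReceived` samples. `ByteSample` and `estimateBitrate` are hypothetical names, and the result reflects what the sender actually delivered rather than the true available bandwidth.

```typescript
// One reading of an inbound-rtp report: cumulative bytes and the report's
// timestamp in milliseconds.
interface ByteSample {
  bytesReceived: number;
  timestampMs: number;
}

// Average received bitrate in bits per second between two samples.
// Returns 0 when the samples are out of order or taken at the same instant.
function estimateBitrate(previous: ByteSample, current: ByteSample): number {
  const elapsedMs = current.timestampMs - previous.timestampMs;
  if (elapsedMs <= 0) return 0;
  // Guard against counter resets, which would otherwise yield a negative rate.
  const bytes = Math.max(0, current.bytesReceived - previous.bytesReceived);
  return (bytes * 8 * 1000) / elapsedMs;
}
```

For example, 125,000 bytes received over one second corresponds to 1,000,000 bits per second, which the thresholds above would map to `'high'`.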

Troubleshooting

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| Avatar not loading | Missing API key | Check NEXT_PUBLIC_HEYGEN_API_KEY and HEYGEN_SECRET_KEY in the environment |
| No video stream | WebRTC blocked | Check browser permissions and firewall |
| Audio sync issues | Network latency | Use a closer TURN server or reduce quality |
| Session timeout | Token expired | Refresh the token before its one-hour expiry |
| Black video | CORS issues | Use the proxy route configuration |
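For the session-timeout case, the refresh can be scheduled slightly before the token's expiry rather than on a fixed interval. A minimal sketch, assuming the client knows the token's `exp` claim in seconds (for example because the token route returns it alongside the token, which the route shown earlier does not do); `refreshDelayMs` and the 60-second safety margin are illustrative choices.

```typescript
// Milliseconds to wait before requesting a new token. The refresh is
// scheduled `safetyMarginMs` ahead of expiry; a token already inside the
// margin (or expired) yields 0, meaning "refresh now".
function refreshDelayMs(
  expiresAtSeconds: number,
  nowMs: number,
  safetyMarginMs: number = 60000,
): number {
  const remainingMs = expiresAtSeconds * 1000 - nowMs;
  return Math.max(0, remainingMs - safetyMarginMs);
}
```

A caller could pass the result to `setTimeout` together with whatever function fetches a fresh token, using `Date.now()` for `nowMs`.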

Debug Mode

Enable debug logging:

    // Enable HeyGen SDK debug mode
    const avatar = new StreamingAvatar({
      token,
      debug: true,
      logLevel: 'verbose',
    });

    // Log WebRTC stats every 5 seconds
    setInterval(async () => {
      const stats = await avatar.getStats();
      console.log('Avatar stats:', stats);
    }, 5000);

Security Considerations

Never expose your HeyGen secret key in client-side code. Always generate access tokens server-side.

  1. Token Generation: Always generate tokens server-side with expiration
  2. CORS Policy: Configure proper CORS headers for WebRTC
  3. Rate Limiting: Implement rate limiting on session creation
  4. Session Management: Clean up abandoned sessions
  5. Content Filtering: Validate text before sending to avatar
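The rate-limiting item in the list above can be illustrated with a small in-memory sliding-window limiter. This is a sketch, not the console's implementation: `SlidingWindowLimiter` is a hypothetical class, the state lives in a single process, and a multi-instance deployment would need a shared store instead.

```typescript
// Allows at most `limit` calls per `windowMs` for each key (for example a
// user ID or IP address). State is kept in memory only.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the key is under its limit;
  // returns false without recording otherwise.
  allow(key: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop timestamps that have fallen out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(key, recent);
    return true;
  }
}
```

A session-creation route could call `allow(userId)` before requesting a token and respond with HTTP 429 when it returns `false`.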

Cost Optimization

  • Session Duration: Close sessions when not in use
  • Quality Settings: Use appropriate quality for use case
  • Caching: Cache avatar list to reduce API calls
  • Pooling: Reuse sessions when possible
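The caching suggestion above can be sketched as a single-value cache with a time-to-live. `TtlCache` is a hypothetical helper, and the injectable clock exists only to make expiry easy to test; nothing here depends on the HeyGen SDK.

```typescript
// Holds one value (for example the fetched avatar list) for `ttlMs`
// milliseconds. After that, get() returns undefined and the caller refetches.
class TtlCache<T> {
  private value: T | undefined = undefined;
  private expiresAt = 0;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value while it is still fresh, otherwise undefined.
  get(): T | undefined {
    return this.now() < this.expiresAt ? this.value : undefined;
  }

  // Stores a value and restarts the expiry timer.
  set(value: T): void {
    this.value = value;
    this.expiresAt = this.now() + this.ttlMs;
  }
}
```

A caller would check `get()` first and only hit the API, then call `set()`, when it returns `undefined`. The right TTL depends on how often the available avatars change.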
