Talk
Beginner

All I know about Voice Agents

Rejected

Session Description

In this talk I'll cover what I have learned from building conversational voice applications. The talk is divided into three sections:


1. Discussion of the "Voice Orchestration Layer" and "Speech-To-Speech Models"

1.1 In the "Voice Orchestration Layer" part, we'll talk about the multiple components involved in building a seamless speech-to-speech application, the potential bottlenecks and latency issues, and to what extent we can solve them. These components include third-party services like TTS, STT, and an LLM.

1.2 In the "Speech-To-Speech Models" part, we'll talk about how these models completely changed the way voice applications are built and how they solve some of the key problems of voice applications.


2. Conversational Flow (basically, how to make the agent conversational): In this section we'll talk about how to maintain context across the complete conversation and how to prompt the model so that the conversation flows the way we want.


3. Handling complex call workflows: Here we'll discuss a real-world scenario in which a voice agent needs to make a decision, based on the real-time conversation, about which call workflow to start or continue. E.g., start neutral but become rude if the call outcome is not met. We'll discuss the challenges involved and an engineering approach to solving the problem.
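
As a rough illustration of the workflow-switching problem in section 3, the sketch below models call workflows as states and picks the next one from a per-turn signal. The workflow names, the `outcome_met` flag, and the `patience` threshold are assumptions made for illustration, not details from the talk.

```python
from dataclasses import dataclass

# Hypothetical workflow names, mirroring the "start neutral, become rude" example.
NEUTRAL, RUDE, CLOSING = "neutral", "rude", "closing"


@dataclass
class CallState:
    workflow: str = NEUTRAL
    turns_without_outcome: int = 0


def next_workflow(state: CallState, outcome_met: bool, patience: int = 2) -> CallState:
    """Pick the workflow for the next turn from the latest conversation signal."""
    if outcome_met:
        # The call reached its goal, so move to the closing workflow.
        return CallState(CLOSING, 0)
    misses = state.turns_without_outcome + 1
    if state.workflow == NEUTRAL and misses >= patience:
        # Escalate tone only after `patience` turns without the outcome.
        return CallState(RUDE, misses)
    # Otherwise stay in the current workflow and keep counting.
    return CallState(state.workflow, misses)
```

In a real agent, `outcome_met` would presumably come from the LLM or a classifier running on the live transcript; here it is just a boolean passed in by the caller.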


I’ll wrap things up with a quick demo of a voice agent powered by the orchestration layer to show how it all comes together in real time.
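
For readers who want a picture of the orchestration layer from section 1.1 before the demo, here is a minimal sketch that chains STT, LLM, and TTS stages and records how long each one takes. The three stage functions are local stand-ins written for this example, not calls to any real service, and the per-stage timing is only one assumed way of locating latency bottlenecks.

```python
import time
from typing import Dict, Tuple


def stt(audio: bytes) -> str:
    # Stand-in for a speech-to-text service: treats the bytes as text.
    return audio.decode("utf-8")


def llm(text: str) -> str:
    # Stand-in for an LLM call: echoes the user's words.
    return f"You said: {text}"


def tts(text: str) -> bytes:
    # Stand-in for a text-to-speech service: returns the text as bytes.
    return text.encode("utf-8")


def run_turn(audio: bytes) -> Tuple[bytes, Dict[str, float]]:
    """Run one conversational turn and time each stage in seconds."""
    timings: Dict[str, float] = {}
    value = audio
    for name, stage in (("stt", stt), ("llm", llm), ("tts", tts)):
        start = time.perf_counter()
        value = stage(value)
        timings[name] = time.perf_counter() - start
    return value, timings
```

Calling `run_turn(b"hello")` returns the synthesized reply together with a dictionary of stage timings, which makes it easy to see which stage dominates the turn latency once real services are plugged in.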

Key Takeaways

  • Learn how to architect a real-time voice agent.
  • Understand how to design conversational flows that feel natural and context-aware.

References

Session Categories

Technology architecture, Other

Reviews

Approvability: 0 %
Approvals: 0
Rejections: 2
Not Sure: 1
Reviewer #1 (Not Sure): There's no FOSS mentioned.
Reviewer #2: Rejected
Reviewer #3: Rejected