---
title: Command_RTC
emoji: 🦀
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.32.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Text-to-speech using Gradio, FastAPI, and Chatterbox TTS
tags:
- chatterbox-tts
- text-to-speech
- voice-cloning
- gradio
- fastapi
---

# Voice Chat Assistant
A conversational voice assistant powered by AI that responds to your spoken queries with natural-sounding speech.

## Features

- Speech Recognition: Uses OpenAI's Whisper model to accurately transcribe your voice
- Natural Language Understanding: Leverages Cohere's LLM API for intelligent responses
- Text-to-Speech: Generates natural speech using Chatterbox-TTS
- Reply on Pause: Automatically responds when you finish speaking
- Conversation History: Maintains context throughout your dialogue

## Demo
Speak into your microphone and the assistant will respond with voice!

## How It Works
- Your voice is transcribed to text using Whisper
- The text is processed by Cohere's LLM to generate a response
- The response is converted to speech using Chatterbox-TTS
- The conversation continues with full context retention

## Technical Details

This project utilizes:

- Zero-GPU: Efficient GPU memory usage with Hugging Face's Zero-GPU technology
- FastRTC: Real-time communication for seamless voice interaction
- Gradio: Simple and intuitive user interface

## Setup

To run this locally, you'll need a Cohere API key and Python 3.8+.

## Acknowledgements

- OpenAI for the Whisper speech recognition model
- Cohere for the language model API
- Tortoise-TTS for the text-to-speech capabilities
- Hugging Face for the Spaces and Zero-GPU infrastructure