--- title: Command_RTC emoji: 🦀 colorFrom: yellow colorTo: purple sdk: gradio sdk_version: 5.32.1 app_file: app.py pinned: false license: apache-2.0 short_description: Text-to-speech using Gradio, FastAPI, and Chatterbox TTS tags: - chatterbox-tts - text-to-speech - voice-cloning - gradio - fastapi --- # Voice Chat Assistant A conversational voice assistant powered by AI that responds to your spoken queries with natural-sounding speech. ## Features - Speech Recognition: Uses OpenAI's Whisper model to accurately transcribe your voice - Natural Language Understanding: Leverages Cohere's LLM API for intelligent responses - Text-to-Speech: Generates natural speech using Chatterbox-TTS - Reply on Pause: Automatically responds when you finish speaking - Conversation History: Maintains context throughout your dialogue ## Demo Speak into your microphone and the assistant will respond with voice! ## How It Works - Your voice is transcribed to text using Whisper - The text is processed by Cohere's LLM to generate a response - The response is converted to speech using Chatterbox-TTS - The conversation continues with full context retention ## Technical Details This project utilizes: - Zero-GPU: Efficient GPU memory usage with Hugging Face's Zero-GPU technology - FastRTC: Real-time communication for seamless voice interaction - Gradio: Simple and intuitive user interface ## Setup To run this locally, you'll need a Cohere API key and Python 3.8+. ## Acknowledgements - OpenAI for the Whisper speech recognition model - Cohere for the language model API - Tortoise-TTS for the text-to-speech capabilities - Hugging Face for the Spaces and Zero-GPU infrastructure