Setup and Deployment Guide
This guide walks you through setting up and deploying the Hopsworks RAG ChatBot to HuggingFace Spaces.
Table of Contents
- Prerequisites
- Local Setup
- Indexing Documents
- Configuring Models
- Deploying to HuggingFace Spaces
- Syncing with GitHub
- Testing
- Troubleshooting
Prerequisites
Before you begin, ensure you have:
- Python 3.10 installed locally
- Git installed
- Hopsworks Account: Sign up at hopsworks.ai
- HuggingFace Account: Sign up at huggingface.co
- PDF Documents you want to index for RAG
Local Setup
1. Clone the Repository
```bash
git clone <your-repo-url>
cd rag_finetune_LLM
```
2. Create Virtual Environment
```bash
python3.10 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
3. Install Dependencies
```bash
pip install -r requirements.txt
```
4. Configure Environment Variables
Create a .env file in the root directory:
```
# .env
HOPSWORKS_API_KEY=your_hopsworks_api_key_here
```
Get your Hopsworks API Key:
- Go to Hopsworks
- Navigate to your project
- Click on your profile → Settings → API Keys
- Create a new API key and copy it
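The app reads `HOPSWORKS_API_KEY` from the environment at startup. As a minimal stdlib-only sketch of what a `.env` loader does (in practice a library such as python-dotenv handles this; `load_env_file` is a hypothetical helper, not part of the repo):

```python
import os
import tempfile

def load_env_file(path):
    """Minimal .env parser: put KEY=VALUE lines into os.environ.
    Sketch only; python-dotenv does this (and more) in real projects."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo with a temporary .env file
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("# .env\nHOPSWORKS_API_KEY=your_hopsworks_api_key_here\n")
load_env_file(f.name)
print(os.environ["HOPSWORKS_API_KEY"])
```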
Indexing Documents
1. Add Your PDF Document
Place your PDF file in the project directory (e.g., content/your_content.pdf)
2. Update the Indexing Notebook
Open index_content.ipynb and update the PDF path:
```python
PDF_PATH = "content/your_content.pdf"  # Update this
```
3. Run the Notebook
Execute all cells in index_content.ipynb:
```bash
jupyter notebook index_content.ipynb
```
This will:
- Load and chunk your PDF using Docling
- Generate embeddings with sentence-transformers
- Upload the embeddings to the Hopsworks Feature Store as the `content` feature group
Note: This only needs to be done once. The embeddings will be available for all deployments.
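The chunking step above can be sketched in pure Python. The following is an illustrative fixed-size, overlapping character chunker, not Docling's actual document-aware chunking; the sizes are arbitrary:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character chunks.
    Illustrative stand-in for Docling's document-aware chunking."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the last chunk reached the end of the text
    return chunks

doc = "word " * 100          # stand-in for extracted PDF text (500 chars)
chunks = chunk_text(doc)
print(len(chunks), len(chunks[0]))
```

Overlap matters for RAG: it keeps a sentence that straddles a chunk boundary fully visible in at least one chunk.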
Configuring Models
1. Edit Model Configuration
Update models_config.json with your models.
2. Model Format Requirements
- Models should be in GGUF format for CPU-optimized inference (the free CPU tier has no GPU)
- Hosted on HuggingFace Hub
Deploying to HuggingFace Spaces
Method 1: Direct Git Push (Recommended)
1. Create a New Space
- Go to HuggingFace Spaces
- Click "Create new Space"
- Configure:
  - Name: your-rag-chatbot
  - SDK: Gradio
  - Hardware: CPU basic (free tier works fine)
  - Visibility: Public or Private
2. Get Your HuggingFace Token
- Go to HuggingFace Settings → Tokens
- Click "New token"
- Give it a name (e.g., "spaces-deploy")
- Select Write permission
- Copy the token
3. Connect Your Repository
```bash
# Add your HuggingFace Space as a git remote
git remote add space https://YOUR_USERNAME:[email protected]/spaces/your-username/your-rag-chatbot

# Push your code to the Space to trigger the first build
git push space main
```
4. Configure Secrets
In your Space settings on HuggingFace:
- Go to Settings → Repository secrets
- Add the following secret:
  - Name: HOPSWORKS_API_KEY
  - Value: your Hopsworks API key
5. Wait for Build
The Space will automatically build and deploy. This may take a couple of minutes.
Method 2: GitHub Sync (Automatic)
1. Enable GitHub Actions
The repository includes .github/workflows/sync_to_huggingface.yaml for automatic syncing.
2. Add GitHub Secrets
In your GitHub repository:
- Go to Settings → Secrets and variables → Actions
- Add the following secret:
  - Name: HUGGINGFACE_SYNC_TOKEN (this must match the secret name the workflow reads; the included workflow references `secrets.HUGGINGFACE_SYNC_TOKEN`)
  - Value: your HuggingFace write token
Get your HuggingFace Token:
- Go to HuggingFace Settings → Tokens
- Create a new token with write permissions
- Copy the token
3. Update Workflow File (if needed)
Edit .github/workflows/sync_to_huggingface.yaml and update:
```yaml
env:
  HF_TOKEN: ${{ secrets.HUGGINGFACE_SYNC_TOKEN }}  # leave this as-is
  HF_SPACE_URL: https://huggingface.co/spaces/your-username/your-space-name
```
4. Automatic Syncing
Now, every push to your main branch will automatically sync to HuggingFace Spaces!
```bash
git add .
git commit -m "Update model configuration"
git push origin main  # Automatically syncs to HF Spaces
```
Testing
Local Testing
Before deploying, test locally:
```bash
python app.py
```
This will:
- Install llama-cpp-python at runtime
- Connect to Hopsworks and load embeddings
- Launch the Gradio interface on a local URL (printed to the terminal; by default http://127.0.0.1:7860)
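Under the hood, answering a question means embedding it and retrieving the most similar indexed chunks. Here is a toy cosine-similarity sketch with made-up 3-d vectors (the real app uses sentence-transformers embeddings and the Hopsworks index; `top_k` is a hypothetical helper):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, k=2):
    """Return indices of the k chunks most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 3-d embeddings standing in for sentence-transformers vectors
chunk_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query_vec = [1.0, 0.05, 0.0]
print(top_k(query_vec, chunk_vecs))  # → [0, 2]
```

The retrieved chunks are then prepended to the prompt as context before the model generates its answer.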
Testing on HuggingFace Spaces
- Open your Space URL: https://huggingface.co/spaces/your-username/your-space-name
- Select a model from the dropdown
- Click "Load Model" (wait 1-3 minutes for first load)
- Once loaded, ask a question related to your documents
- Verify the response uses context from your indexed documents
Configuration Reference
README.md (Space Configuration)
models_config.json
Defines available models in the dropdown:
```json
{
  "models": [
    {
      "name": "Display Name",
      "repo_id": "username/repo",
      "filename": "model.gguf",
      "description": "Model description"
    }
  ]
}
```
- name: display name shown in the dropdown
- repo_id: HuggingFace model repository
- filename: GGUF file within the repo
- description: shown in the UI
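Before pushing, it can help to sanity-check models_config.json. A hedged validation sketch (the `validate_models_config` helper and its checks are illustrative, not part of the repo):

```python
import json

REQUIRED_KEYS = {"name", "repo_id", "filename", "description"}

def validate_models_config(raw):
    """Check every model entry has the required keys and a .gguf filename.
    Returns a list of problem descriptions (empty means the config is OK)."""
    config = json.loads(raw)
    problems = []
    for i, model in enumerate(config.get("models", [])):
        missing = REQUIRED_KEYS - model.keys()
        if missing:
            problems.append(f"model {i}: missing {sorted(missing)}")
        if not model.get("filename", "").endswith(".gguf"):
            problems.append(f"model {i}: filename is not a .gguf file")
    return problems

raw = json.dumps({"models": [{
    "name": "Display Name",
    "repo_id": "username/repo",
    "filename": "model.gguf",
    "description": "Model description",
}]})
print(validate_models_config(raw))  # → [] (no problems)
```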
Happy Deploying! 🚀