---
title: MGZON Smart Assistant
emoji: 🏃
colorFrom: purple
colorTo: gray
sdk: docker
sdk_version: 0.115.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: API for T5 and Mistral-7B in Hugging Face Spaces
---

# MGZON Smart Assistant

This project provides a FastAPI-based API for integrating two language models:

- **MGZON-FLAN-T5**: A pre-trained T5 model fine-tuned to respond to questions containing keywords like "mgzon", "flan", or "t5".
- **Mistral-7B-GGUF**: A Mistral-7B model in GGUF format for answering general questions.

## Setup

- **Docker**: The image is built on `python:3.10-slim` with build tools (`gcc`, `g++`, `cmake`, `make`) installed to compile `llama-cpp-python`.
- **Permissions**: The application runs as a non-root user (`appuser`); the cache (`/app/.cache/huggingface`) and model (`models/`) directories are configured with appropriate permissions.
- **Dependencies**: Installed from `requirements.txt`, including `transformers`, `torch`, `fastapi`, and `llama-cpp-python`.
- **Model Download**: The Mistral-7B GGUF model is downloaded by `setup.sh` using `huggingface_hub` (see the download sketch at the end of this README).
- **Environment Variables**:
  - `HF_HOME` is set to `/app/.cache/huggingface`.
  - `HF_TOKEN` (secret) is required to access models from the Hugging Face Hub.

## How to Run

1. Build the Docker image from the provided `Dockerfile`.
2. Set `HF_TOKEN` in the Hugging Face Spaces settings.
3. Run the application with `uvicorn` on port 8080.

## Endpoint

- **POST /ask**:
  - **Input**: JSON with `question` (the query) and `max_new_tokens` (optional, default 150).
  - **Output**: JSON with `model` (the name of the model used) and `response` (the answer).

A sketch of the routing logic behind this endpoint appears at the end of this README.

## Example Usage

```bash
curl -X POST "https://mgzon-api-mg.hf.space/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is MGZON?", "max_new_tokens": 100}'
```
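The same request can be issued from Python with the `requests` library (not part of this project's `requirements.txt`; install it separately in the client environment):

```python
import requests

# Query the deployed Space; URL and payload match the curl example above.
resp = requests.post(
    "https://mgzon-api-mg.hf.space/ask",
    json={"question": "What is MGZON?", "max_new_tokens": 100},
    timeout=120,
)
resp.raise_for_status()

# The endpoint returns {"model": ..., "response": ...}.
data = resp.json()
print(data["model"], "->", data["response"])
```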
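## Download Sketch

For reference, the GGUF download performed by `setup.sh` can be reproduced in Python with `huggingface_hub`. This is a minimal sketch: the `repo_id` and `filename` below are placeholder examples, not values taken from this project, so substitute the repository that `setup.sh` actually targets.

```python
import os
from huggingface_hub import hf_hub_download

# Match the cache location configured in the Dockerfile.
os.environ.setdefault("HF_HOME", "/app/.cache/huggingface")

model_path = hf_hub_download(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",  # assumption, not this project's repo
    filename="mistral-7b-instruct-v0.2.Q4_K_M.gguf",   # assumption, not this project's file
    local_dir="models",                                # the models/ directory described in Setup
    token=os.environ.get("HF_TOKEN"),                  # secret set in the Spaces settings
)
print(f"Model downloaded to {model_path}")
```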
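## Routing Sketch

A minimal sketch of how `/ask` could route between the two models, assuming simple keyword matching as described above. The helper functions `run_t5` and `run_mistral` are hypothetical placeholders for illustration; the actual `app.py` may be structured differently.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AskRequest(BaseModel):
    question: str
    max_new_tokens: int = 150  # default documented under Endpoint

T5_KEYWORDS = ("mgzon", "flan", "t5")

def run_t5(question: str, max_new_tokens: int) -> str:
    # Placeholder: in the real app this would call the fine-tuned
    # MGZON-FLAN-T5 model via transformers.
    raise NotImplementedError

def run_mistral(question: str, max_new_tokens: int) -> str:
    # Placeholder: in the real app this would call the GGUF model
    # through llama-cpp-python.
    raise NotImplementedError

@app.post("/ask")
def ask(req: AskRequest):
    # Route to the fine-tuned T5 model when the question mentions one of
    # its keywords; otherwise fall back to Mistral-7B for general queries.
    if any(kw in req.question.lower() for kw in T5_KEYWORDS):
        return {"model": "MGZON-FLAN-T5",
                "response": run_t5(req.question, req.max_new_tokens)}
    return {"model": "Mistral-7B-GGUF",
            "response": run_mistral(req.question, req.max_new_tokens)}
```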