---
title: MGZON Smart Assistant
emoji: 🏃
colorFrom: purple
colorTo: gray
sdk: docker
sdk_version: 0.115.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: API for T5 and Mistral-7B in Hugging Face Spaces
---

# MGZON Smart Assistant

This project provides a FastAPI-based API for integrating two language models:

- **MGZON-FLAN-T5**: A pre-trained T5 model fine-tuned to respond to questions containing keywords like "mgzon", "flan", or "t5".
- **Mistral-7B-GGUF**: A Mistral-7B model in GGUF format for answering general questions.

## Setup

- **Docker**: The image is built on `python:3.10-slim` with build tools (`gcc`, `g++`, `cmake`, `make`) installed to compile `llama-cpp-python`.
- **Permissions**: The application runs as a non-root user (`appuser`); the cache (`/app/.cache/huggingface`) and model (`models/`) directories are configured with appropriate permissions.
- **Dependencies**: Installed from `requirements.txt`, including `transformers`, `torch`, `fastapi`, and `llama-cpp-python`.
- **Model Download**: The Mistral-7B GGUF model is downloaded by `setup.sh` using `huggingface_hub` (see the download sketch at the end of this README).
- **Environment Variables**:
  - `HF_HOME` is set to `/app/.cache/huggingface`.
  - `HF_TOKEN` (secret) is required to access models from the Hugging Face Hub.

## How to Run

1. Build the Docker image from the provided `Dockerfile`.
2. Set `HF_TOKEN` in the Hugging Face Spaces settings.
3. Run the application with `uvicorn` on port 8080.

## Endpoint

- **POST /ask**:
  - **Input**: JSON with `question` (the query) and `max_new_tokens` (optional, default 150).
  - **Output**: JSON with `model` (the name of the model used) and `response` (the answer).

A sketch of the routing logic behind this endpoint appears at the end of this README.

## Example Usage

```bash
curl -X POST "https://mgzon-api-mg.hf.space/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is MGZON?", "max_new_tokens": 100}'
```
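The same request can be issued from Python with the `requests` library (not part of this project's `requirements.txt`; install it separately in the client environment):

```python
import requests

# Query the deployed Space; URL and payload match the curl example above.
resp = requests.post(
    "https://mgzon-api-mg.hf.space/ask",
    json={"question": "What is MGZON?", "max_new_tokens": 100},
    timeout=120,
)
resp.raise_for_status()

# The endpoint returns {"model": ..., "response": ...}.
data = resp.json()
print(data["model"], "->", data["response"])
```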
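## Download Sketch

For reference, the GGUF download performed by `setup.sh` can be reproduced in Python with `huggingface_hub`. This is a minimal sketch: the `repo_id` and `filename` below are placeholder examples, not values taken from this project, so substitute the repository that `setup.sh` actually targets.

```python
import os
from huggingface_hub import hf_hub_download

# Match the cache location configured in the Dockerfile.
os.environ.setdefault("HF_HOME", "/app/.cache/huggingface")

model_path = hf_hub_download(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",  # assumption, not this project's repo
    filename="mistral-7b-instruct-v0.2.Q4_K_M.gguf",   # assumption, not this project's file
    local_dir="models",                                # the models/ directory described in Setup
    token=os.environ.get("HF_TOKEN"),                  # secret set in the Spaces settings
)
print(f"Model downloaded to {model_path}")
```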
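## Routing Sketch

A minimal sketch of how `/ask` could route between the two models, assuming simple keyword matching as described above. The helper functions `run_t5` and `run_mistral` are hypothetical placeholders for illustration; the actual `app.py` may be structured differently.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AskRequest(BaseModel):
    question: str
    max_new_tokens: int = 150  # default documented under Endpoint

T5_KEYWORDS = ("mgzon", "flan", "t5")

def run_t5(question: str, max_new_tokens: int) -> str:
    # Placeholder: in the real app this would call the fine-tuned
    # MGZON-FLAN-T5 model via transformers.
    raise NotImplementedError

def run_mistral(question: str, max_new_tokens: int) -> str:
    # Placeholder: in the real app this would call the GGUF model
    # through llama-cpp-python.
    raise NotImplementedError

@app.post("/ask")
def ask(req: AskRequest):
    # Route to the fine-tuned T5 model when the question mentions one of
    # its keywords; otherwise fall back to Mistral-7B for general queries.
    if any(kw in req.question.lower() for kw in T5_KEYWORDS):
        return {"model": "MGZON-FLAN-T5",
                "response": run_t5(req.question, req.max_new_tokens)}
    return {"model": "Mistral-7B-GGUF",
            "response": run_mistral(req.question, req.max_new_tokens)}
```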