---
title: MGZON Smart Assistant
emoji: π
colorFrom: purple
colorTo: gray
sdk: docker
sdk_version: 0.115.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: API for T5 and Mistral-7B in Hugging Face Spaces
---
# MGZON Smart Assistant
This project provides a FastAPI-based API for integrating two language models:
- **MGZON-FLAN-T5**: A fine-tuned T5 model that answers questions containing keywords such as "mgzon", "flan", or "t5".
- **Mistral-7B-GGUF**: A Mistral-7B model in GGUF format for answering general questions.
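The keyword-based routing described above can be sketched as a small helper. Note that `pick_model` is a hypothetical name for illustration; the actual dispatch logic lives in `app.py` and may differ:

```python
# Hypothetical routing helper: dispatch to the fine-tuned T5 model when
# the question mentions one of its keywords, otherwise use Mistral-7B.
def pick_model(question: str) -> str:
    keywords = ("mgzon", "flan", "t5")
    q = question.lower()
    if any(k in q for k in keywords):
        return "MGZON-FLAN-T5"
    return "Mistral-7B-GGUF"
```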
## Setup
- **Docker**: The image is built using `python:3.10-slim` with development tools (`gcc`, `g++`, `cmake`, `make`) installed to support building `llama-cpp-python`.
- **Permissions**: The application runs as a non-root user (`appuser`), with cache (`/app/.cache/huggingface`) and model (`models/`) directories configured with appropriate permissions.
- **Dependencies**: Python packages are installed from `requirements.txt`, including `transformers`, `torch`, `fastapi`, and `llama-cpp-python`.
- **Model Download**: The Mistral-7B GGUF model is downloaded via `setup.sh` using `huggingface_hub`.
- **Environment Variables**:
- `HF_HOME` is set to `/app/.cache/huggingface`.
- `HF_TOKEN` (secret) is required to access models from the Hugging Face Hub.
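The download step performed by `setup.sh` could look roughly like the sketch below. The repo ID and filename are assumptions for illustration; the real values are defined in `setup.sh`:

```python
import os

# Assumed repo/filename for illustration; the actual ones are in setup.sh.
REPO_ID = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
FILENAME = "mistral-7b-instruct-v0.2.Q4_K_M.gguf"

def download_model(target_dir: str = "models") -> str:
    """Fetch the GGUF file into target_dir, authenticating with HF_TOKEN."""
    from huggingface_hub import hf_hub_download  # lazy import: network call
    return hf_hub_download(
        repo_id=REPO_ID,
        filename=FILENAME,
        local_dir=target_dir,
        token=os.environ.get("HF_TOKEN"),
    )

if __name__ == "__main__":
    print(download_model())
```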
## How to Run
1. Build the Docker image using the provided `Dockerfile`.
2. Ensure the `HF_TOKEN` is set in the Hugging Face Spaces settings.
3. Run the application using `uvicorn` on port 8080.
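Step 3 might be implemented at the bottom of `app.py` along these lines (an assumption; the Dockerfile may invoke `uvicorn` from the command line instead, and the FastAPI instance is assumed to be named `app`):

```python
def main() -> None:
    import uvicorn  # imported lazily so the sketch stays self-contained
    # Bind on all interfaces on port 8080, as the Space expects.
    uvicorn.run("app:app", host="0.0.0.0", port=8080)

if __name__ == "__main__":
    main()
```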
## Endpoint
- **POST /ask**:
- **Input**: JSON containing `question` (the query) and `max_new_tokens` (optional, default=150).
- **Output**: JSON containing `model` (name of the model used) and `response` (the answer).
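The request/response contract above can be modeled as plain dataclasses; this is a stdlib-only sketch of the JSON shapes (`handle_ask` is a hypothetical stub, and in the real app these would likely be Pydantic models bound to the FastAPI route):

```python
from dataclasses import dataclass, asdict

@dataclass
class AskRequest:
    question: str
    max_new_tokens: int = 150  # optional; the API defaults to 150

@dataclass
class AskResponse:
    model: str      # name of the model that produced the answer
    response: str   # the generated answer

def handle_ask(req: AskRequest) -> dict:
    # Placeholder answer; the real app generates text with T5 or Mistral.
    answer = f"stub answer (up to {req.max_new_tokens} tokens)"
    return asdict(AskResponse(model="Mistral-7B-GGUF", response=answer))
```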
## Example Usage
```bash
curl -X POST "https://mgzon-api-mg.hf.space/ask" \
-H "Content-Type: application/json" \
-d '{"question": "What is MGZON?", "max_new_tokens": 100}'
```