---
title: MGZON Smart Assistant
emoji: π
colorFrom: purple
colorTo: gray
sdk: docker
sdk_version: 0.115.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: API for T5 and Mistral-7B in Hugging Face Spaces
---

# MGZON Smart Assistant

This project provides a FastAPI-based API for integrating two language models:
- **MGZON-FLAN-T5**: A T5 model fine-tuned to respond to questions containing keywords like "mgzon", "flan", or "t5".
- **Mistral-7B-GGUF**: A Mistral-7B model in GGUF format for answering general questions.
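
The routing between the two models is keyword-based. Here is a minimal sketch of that decision logic, assuming the keyword list above (the function name is hypothetical; `app.py` may implement it differently):

```python
# Sketch of the keyword routing described above.
KEYWORDS = ("mgzon", "flan", "t5")

def pick_model(question: str) -> str:
    """Return the name of the model that should answer this question."""
    q = question.lower()
    if any(keyword in q for keyword in KEYWORDS):
        return "MGZON-FLAN-T5"   # fine-tuned T5 handles MGZON/FLAN/T5 questions
    return "Mistral-7B-GGUF"     # Mistral-7B handles everything else
```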

## Setup

- **Docker**: The image is built from `python:3.10-slim` with development tools (`gcc`, `g++`, `cmake`, `make`) installed to support building `llama-cpp-python`.
- **Permissions**: The application runs as a non-root user (`appuser`), with the cache (`/app/.cache/huggingface`) and model (`models/`) directories configured with appropriate permissions.
- **Dependencies**: Installed from `requirements.txt`, including `transformers`, `torch`, `fastapi`, and `llama-cpp-python`.
- **Model Download**: The Mistral-7B GGUF model is downloaded via `setup.sh` using `huggingface_hub`; see the sketch after this list.
- **Environment Variables**: `HF_HOME` is set to `/app/.cache/huggingface`; `HF_TOKEN` (secret) is required to access models from the Hugging Face Hub.
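
As referenced above, the download step can be expressed with the `huggingface_hub` Python API. This is a sketch of what `setup.sh` likely does; the repo id and filename are assumptions and should be replaced with the ones the script actually uses:

```python
# Hedged sketch of the GGUF download performed by setup.sh.
import os
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",  # assumed source repo
    filename="mistral-7b-instruct-v0.1.Q4_K_M.gguf",   # assumed quantization
    local_dir="models",                                # the models/ directory above
    token=os.environ.get("HF_TOKEN"),                  # secret from Space settings
)
```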

## How to Run

- Build the Docker image using the provided `Dockerfile`.
- Ensure `HF_TOKEN` is set in the Hugging Face Spaces settings.
- Run the application using `uvicorn` on port 8080, as in the sketch below.
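
The final step could be wired into the application entrypoint like this (a sketch; the actual Dockerfile may invoke `uvicorn` from the command line instead):

```python
# Sketch of starting the FastAPI app on port 8080, matching the steps above.
import uvicorn

if __name__ == "__main__":
    uvicorn.run("app:app", host="0.0.0.0", port=8080)
```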

## Endpoint

- **POST /ask**:
  - Input: JSON containing `question` (the query) and `max_new_tokens` (optional, default: 150).
  - Output: JSON containing `model` (name of the model used) and `response` (the answer).
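
The schemas implied by this description could be modeled with Pydantic as follows. The field names match the JSON keys above; the class names are hypothetical:

```python
# Sketch of the request/response schemas for POST /ask.
from pydantic import BaseModel

class AskRequest(BaseModel):
    question: str              # the query
    max_new_tokens: int = 150  # optional, defaults to 150

class AskResponse(BaseModel):
    model: str     # name of the model used
    response: str  # the answer
```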

## Example Usage

```bash
curl -X POST "https://mgzon-api-mg.hf.space/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is MGZON?", "max_new_tokens": 100}'
```