---
title: MGZON Smart Assistant
emoji: 🏃
colorFrom: purple
colorTo: gray
sdk: docker
sdk_version: 0.115.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: API for T5 and Mistral-7B in Hugging Face Spaces
---

# MGZON Smart Assistant

This project provides a FastAPI-based API that serves two language models:

- **MGZON-FLAN-T5**: a fine-tuned T5 model that answers questions containing keywords such as `mgzon`, `flan`, or `t5`.
- **Mistral-7B-GGUF**: a Mistral-7B model in GGUF format that answers general questions.
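This keyword routing can be sketched as follows. The helper name and the exact matching rule are assumptions about `app.py`; only the keyword list and model names come from this README:

```python
# Hypothetical routing helper illustrating the model selection described
# above; a simple case-insensitive substring check is assumed.
KEYWORDS = ("mgzon", "flan", "t5")

def pick_model(question: str) -> str:
    """Return the name of the model that should handle this question."""
    q = question.lower()
    if any(keyword in q for keyword in KEYWORDS):
        return "MGZON-FLAN-T5"
    return "Mistral-7B-GGUF"
```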

## Setup

- **Docker**: the image is built from `python:3.10-slim` with build tools (`gcc`, `g++`, `cmake`, `make`) installed to compile `llama-cpp-python`.
- **Permissions**: the application runs as a non-root user (`appuser`); the cache (`/app/.cache/huggingface`) and model (`models/`) directories are created with the appropriate permissions.
- **Dependencies**: Python packages are installed from `requirements.txt`, including `transformers`, `torch`, `fastapi`, and `llama-cpp-python`.
- **Model Download**: the Mistral-7B GGUF model is downloaded by `setup.sh` using `huggingface_hub`.
- **Environment Variables**:
  - `HF_HOME` is set to `/app/.cache/huggingface`.
  - `HF_TOKEN` (secret) is required to access models on the Hugging Face Hub.
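The environment handling above can be sketched like this. `check_env` is a hypothetical helper; only the variable names and the default cache path are taken from this README:

```python
import os

def check_env(env=os.environ) -> str:
    """Resolve the Hugging Face cache dir and require the Hub token."""
    cache_dir = env.get("HF_HOME", "/app/.cache/huggingface")
    if not env.get("HF_TOKEN"):
        # The token is configured as a secret in the Space settings.
        raise RuntimeError("HF_TOKEN is required to access the Hugging Face Hub")
    return cache_dir
```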

## How to Run

1. Build the Docker image with the provided `Dockerfile`.
2. Set the `HF_TOKEN` secret in the Space settings on Hugging Face.
3. Run the application with `uvicorn` on port 8080.

## Endpoint

- **POST `/ask`**:
  - **Input**: JSON with `question` (the query) and `max_new_tokens` (optional, default `150`).
  - **Output**: JSON with `model` (the name of the model used) and `response` (the answer).
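A small sketch of the request and response shapes; the field names come from this README, while the helper functions themselves are illustrative:

```python
import json

def build_ask_body(question: str, max_new_tokens: int = 150) -> str:
    """Serialize the JSON body the /ask endpoint expects."""
    return json.dumps({"question": question, "max_new_tokens": max_new_tokens})

def parse_ask_reply(body: str) -> tuple:
    """Extract (model, response) from the JSON the endpoint returns."""
    data = json.loads(body)
    return data["model"], data["response"]
```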

## Example Usage

```shell
curl -X POST "https://mgzon-api-mg.hf.space/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is MGZON?", "max_new_tokens": 100}'
```
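The same call from Python using only the standard library; `make_ask_request` is a hypothetical helper mirroring the curl command above:

```python
import json
import urllib.request

API_URL = "https://mgzon-api-mg.hf.space/ask"

def make_ask_request(question: str, max_new_tokens: int = 150) -> urllib.request.Request:
    """Build a POST request equivalent to the curl example."""
    body = json.dumps({"question": question, "max_new_tokens": max_new_tokens}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it requires network access to the running Space:
# with urllib.request.urlopen(make_ask_request("What is MGZON?", 100)) as resp:
#     print(json.loads(resp.read())["response"])
```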