---
title: Caribbean Voices - OWSM v3.1 Platform
emoji: 🚀
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
hardware: gpu-a10g-large
license: mit
---

# Caribbean Voices Hackathon - OWSM v3.1 Platform

Hugging Face Space for OWSM v3.1 training and inference with progress tracking.

## ✅ Dataset Upload Complete

**Dataset**: `shaun3141/caribbean-voices-hackathon` (Private)

The dataset has been uploaded and the app is configured to use it automatically.

## Quick Start

### HF Space Configuration

**Environment Variable** (optional; already set in code):

```
HF_DATASET_NAME=shaun3141/caribbean-voices-hackathon
```

**For Private Dataset Access** (if needed):

```
HF_TOKEN=your_hf_token_here
```

Set these under **Settings → Variables** (use **Settings → Secrets** for the token).
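
As a rough illustration of how the two variables above might be read at startup, here is a minimal sketch. The function name `resolve_dataset_config` and the fallback logic are assumptions for illustration, not the app's actual code; only the variable names and the dataset name come from this README.

```python
import os

# Default mirrors the dataset name given in this README.
DEFAULT_DATASET = "shaun3141/caribbean-voices-hackathon"


def resolve_dataset_config(env=None):
    """Return (dataset_name, token) from an environment-style mapping.

    Hypothetical helper: the real app may resolve its settings differently.
    """
    env = os.environ if env is None else env
    # Fall back to the built-in default when the variable is unset or empty.
    name = env.get("HF_DATASET_NAME") or DEFAULT_DATASET
    # HF_TOKEN is optional; None means no token was configured.
    token = env.get("HF_TOKEN") or None
    return name, token
```

Calling `resolve_dataset_config()` with no arguments reads from `os.environ`; passing a plain dict makes the behavior easy to check without touching the real environment.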

### The Space Will Automatically:

- ✅ Load CSV files from the dataset on startup
- ✅ Load audio files on demand during inference
- ✅ Cache data locally for faster access
- ✅ Run without any manual uploads
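
The on-demand loading and local caching described above can be sketched as follows. This is an illustration of the general pattern only: `cached_fetch` and its `fetch` callable are hypothetical names, and the Space itself may rely on a library-level cache instead (for example, `huggingface_hub.hf_hub_download` with `repo_type="dataset"` keeps its own local cache).

```python
from pathlib import Path


def cached_fetch(filename, cache_dir, fetch):
    """Return a local path for `filename`, downloading only on a cache miss.

    `fetch` is a hypothetical callable that takes a filename and returns the
    file contents as bytes; it stands in for whatever downloader the app uses.
    """
    target = Path(cache_dir) / filename
    if not target.exists():
        # First request: create the directory tree and store the bytes.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(filename))
    # Later requests reuse the file already on disk.
    return target
```

Because the downloader is passed in, the same function works for CSV files loaded at startup and for audio files requested one at a time during inference.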

## Features

- **Data Loading**: Auto-load from HF Dataset (no uploads needed!)
- **Entity Extraction**: Extract Caribbean entities from training transcripts
- **Model Training**: Fine-tune OWSM v3.1 with entity-weighted loss
- **Batch Inference**: Generate transcriptions for test set with progress tracking
- **Single File Transcription**: Quick transcription with multiple models
- **Status Monitoring**: Real-time status of setup and training progress
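
This README does not spell out how the entity-weighted loss is defined. One plausible reading, sketched below in plain Python, is a token-level negative log-likelihood in which tokens belonging to extracted entities count more than other tokens. The function name, the `entity_weight` value, and the normalization by total weight are all assumptions, not the training code used in this Space.

```python
import math


def entity_weighted_nll(token_probs, is_entity, entity_weight=2.0):
    """Weighted mean negative log-likelihood over one transcript.

    token_probs:   probability the model assigned to each reference token.
    is_entity:     parallel list of booleans marking entity tokens.
    entity_weight: hypothetical up-weighting factor for entity tokens.
    """
    if len(token_probs) != len(is_entity):
        raise ValueError("token_probs and is_entity must have equal length")
    if not token_probs:
        return 0.0
    # Entity tokens get the larger weight; all other tokens get weight 1.
    weights = [entity_weight if flag else 1.0 for flag in is_entity]
    total = sum(-w * math.log(p) for w, p in zip(weights, token_probs))
    # Normalize by total weight so the loss stays on the scale of plain NLL.
    return total / sum(weights)
```

Under this formulation, a low probability on an entity token raises the loss more than the same low probability on an ordinary token, which pushes training toward getting names and places right.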

## Usage

1. **Deploy Space**: Data loads automatically from the dataset
2. **Check Status**: View system status and setup progress
3. **Extract Entities**: Run entity extraction on training data
4. **Train Model**: Fine-tune OWSM (requires ESPnet recipes for full training)
5. **Run Inference**: Generate test set transcriptions
6. **Download Results**: Get the submission CSV file
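
Steps 5 and 6 above amount to looping over the test set, reporting progress, and writing one row per file. A minimal sketch of that loop follows. The `transcribe` callable, the `report` hook, and the `ID` / `Transcription` column names are assumptions for illustration; check the hackathon's submission format for the real header.

```python
import csv


def run_batch(audio_ids, transcribe, out_file, report=print):
    """Transcribe each ID, report progress, and write a two-column CSV.

    `transcribe` is a hypothetical callable mapping an audio ID to text.
    """
    writer = csv.writer(out_file)
    writer.writerow(["ID", "Transcription"])  # assumed column names
    total = len(audio_ids)
    for done, audio_id in enumerate(audio_ids, start=1):
        writer.writerow([audio_id, transcribe(audio_id)])
        # Progress hook; a UI could update a progress bar here instead.
        report(f"{done}/{total} files transcribed")
    return total
```

When writing to disk, open the output with `open(path, "w", newline="")` so the `csv` module controls line endings itself.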

## Requirements

- ESPnet and espnet_model_zoo (for OWSM models)
- PyTorch
- Transformers
- Gradio
- Datasets (for HF Dataset loading)

## Model Support

- **Wav2Vec2 Models**: Fast baseline models
- **OWSM v3.1 Small**: Open Whisper-style model with ESPnet

## Dataset Information

- **Name**: `shaun3141/caribbean-voices-hackathon`
- **Privacy**: Private
- **Train**: 19,856 samples
- **Test**: 8,510 samples
- **URL**: https://huggingface.co/datasets/shaun3141/caribbean-voices-hackathon

## Documentation

- `DATASET_SETUP.md` - Dataset upload instructions
- `DATA_LOADING.md` - Data loading methods
- `SPACE_CONFIG.md` - Space configuration guide
- `FILES.md` - File organization