---
title: Caribbean Voices - OWSM v3.1 Platform
emoji: 🎀
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
hardware: gpu-a10g-large
license: mit
---
# Caribbean Voices Hackathon - OWSM v3.1 Platform
Hugging Face Space for OWSM v3.1 training and inference with progress tracking.
## ✅ Dataset Upload Complete
**Dataset**: `shaun3141/caribbean-voices-hackathon` (Private)
The dataset has been uploaded and the app is configured to use it automatically.
## Quick Start
### HF Space Configuration
**Environment Variable** (optional - already set in code):
```
HF_DATASET_NAME=shaun3141/caribbean-voices-hackathon
```
**For Private Dataset Access** (if needed):
```
HF_TOKEN=your_hf_token_here
```
Set these in: **Settings → Variables** (or **Settings → Secrets** for the token)
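The two settings above could be read at startup roughly as follows. This is a minimal sketch, not the app's actual code: `read_space_config` and `DEFAULT_DATASET` are hypothetical names, and the fallback to the dataset name is assumed from the note that the variable is already set in code.

```python
import os

# Default mirrors the value shown above; assumed to match what the app hardcodes.
DEFAULT_DATASET = "shaun3141/caribbean-voices-hackathon"


def read_space_config(env=None):
    """Return (dataset_name, token) from the environment.

    Hypothetical helper for illustration; not a function from the app.
    """
    env = os.environ if env is None else env
    # Fall back to the default when the variable is unset or empty.
    dataset_name = env.get("HF_DATASET_NAME") or DEFAULT_DATASET
    # An unset or empty token is normalized to None.
    token = env.get("HF_TOKEN") or None
    return dataset_name, token


if __name__ == "__main__":
    name, token = read_space_config()
    # Never print the token itself, only whether it is present.
    print(f"dataset={name} token_set={token is not None}")
```

Keeping the token in **Secrets** rather than **Variables** means it is still exposed to the process as an environment variable, so the same lookup works for both.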
### The Space Will Automatically:
- ✅ Load CSV files from dataset on startup
- ✅ Load audio files on-demand during inference
- ✅ Cache data locally for faster access
- ✅ No manual uploads needed!
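The on-demand loading and local caching described above can be illustrated with a small sketch. It assumes a hypothetical `fetch` callable that returns a file's bytes given its name; in a real Space this might wrap `hf_hub_download` from `huggingface_hub`, but whether the app does so is an assumption. `OnDemandCache` is an illustrative name, not part of the app.

```python
import tempfile
from pathlib import Path
from typing import Callable


class OnDemandCache:
    """Fetch a file the first time it is requested, then serve it from disk."""

    def __init__(self, cache_dir: str, fetch: Callable[[str], bytes]):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch  # hypothetical downloader: file name -> raw bytes

    def get(self, name: str) -> Path:
        local = self.cache_dir / name
        if not local.exists():
            # Cache miss: download once and store under the cache directory.
            local.parent.mkdir(parents=True, exist_ok=True)
            local.write_bytes(self.fetch(name))
        return local


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = OnDemandCache(tmp, fetch=lambda name: b"fake audio bytes")
        first = cache.get("audio/sample.wav")   # triggers the fetch
        second = cache.get("audio/sample.wav")  # served from disk
        print(first == second)
```

Injecting the downloader keeps the caching logic independent of the network, which also makes it easy to exercise without dataset access.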
## Features
- **Data Loading**: Auto-load from HF Dataset (no uploads needed!)
- **Entity Extraction**: Extract Caribbean entities from training transcripts
- **Model Training**: Fine-tune OWSM v3.1 with entity-weighted loss
- **Batch Inference**: Generate transcriptions for test set with progress tracking
- **Single File Transcription**: Quick transcription with multiple models
- **Status Monitoring**: Real-time status of setup and training progress
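This README does not define the entity-weighted loss, so the following is only one plausible reading: a weighted mean of per-token losses in which tokens belonging to an extracted entity count more heavily. The function name, the `entity_weight` parameter, and its default of 2.0 are all assumptions made for illustration.

```python
def entity_weighted_loss(token_losses, is_entity, entity_weight=2.0):
    """Weighted mean of per-token losses; entity tokens get `entity_weight`.

    Assumed formulation for illustration only, not the project's actual loss.
    """
    if len(token_losses) != len(is_entity):
        raise ValueError("token_losses and is_entity must have the same length")
    # Non-entity tokens keep weight 1.0; entity tokens are up-weighted.
    weights = [entity_weight if flag else 1.0 for flag in is_entity]
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0  # empty sequence: nothing to average
    weighted_sum = sum(w * loss for w, loss in zip(weights, token_losses))
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Second token is part of an entity, so its loss counts twice.
    print(entity_weighted_loss([1.0, 3.0], [False, True]))
```

With `entity_weight=1.0` this reduces to the ordinary mean, which gives a simple baseline for checking whether the weighting helps.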
## Usage
1. **Deploy Space**: Data loads automatically from the dataset
2. **Check Status**: View system status and setup progress
3. **Extract Entities**: Run entity extraction on training data
4. **Train Model**: Fine-tune OWSM (requires ESPnet recipes for full training)
5. **Run Inference**: Generate test set transcriptions
6. **Download Results**: Get submission CSV file
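Steps 5 and 6 can be sketched as a loop that transcribes each test item, reports progress, and writes a CSV. Everything named here is hypothetical: `transcribe` stands in for whichever model is selected, `on_progress` for whatever the UI uses to display progress (in a Gradio app this might wrap `gr.Progress`), and the `ID` and `Transcription` column names are assumed, since this README does not state the submission format.

```python
import csv
import io


def run_batch(ids, transcribe, on_progress=None):
    """Transcribe each item in order, reporting progress after each one."""
    rows = []
    total = len(ids)
    for done, item_id in enumerate(ids, start=1):
        rows.append((item_id, transcribe(item_id)))
        if on_progress is not None:
            on_progress(done, total)
    return rows


def write_submission(rows, out):
    """Write (id, text) rows as CSV to a file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ID", "Transcription"])  # hypothetical header
    writer.writerows(rows)


if __name__ == "__main__":
    # Stand-in transcriber; a real one would run the model on the audio file.
    rows = run_batch(
        ["clip_1", "clip_2"],
        transcribe=lambda item_id: f"text for {item_id}",
        on_progress=lambda done, total: print(f"{done}/{total}"),
    )
    buffer = io.StringIO()
    write_submission(rows, buffer)
    print(buffer.getvalue())
```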
## Requirements
- ESPnet and espnet_model_zoo (for OWSM models)
- PyTorch
- Transformers
- Gradio
- Datasets (for HF Dataset loading)
## Model Support
- **Wav2Vec2 Models**: Fast baseline models
- **OWSM v3.1 Small**: Open Whisper-style model with ESPnet
## Dataset Information
- **Name**: `shaun3141/caribbean-voices-hackathon`
- **Privacy**: Private
- **Train**: 19,856 samples
- **Test**: 8,510 samples
- **URL**: https://huggingface.co/datasets/shaun3141/caribbean-voices-hackathon
## Documentation
- `DATASET_SETUP.md` - Dataset upload instructions
- `DATA_LOADING.md` - Data loading methods
- `SPACE_CONFIG.md` - Space configuration guide
- `FILES.md` - File organization