Audio Classification
This repo contains code and notes for this tutorial.
Dataset
The GTZAN music genre dataset is used.
Usage
export HUGGINGFACE_TOKEN=<your_token>
python main.py
Performance
Accuracy: 0.81 (default settings)
Notes
- 🤗 Datasets supports a train_test_split() method to split the dataset.
- feature_extractor cannot handle resampling.
  - To resample, one can use dataset.map(), or cast the audio column to the target sampling rate:
      from datasets import Audio
      gtzan = gtzan.cast_column("audio", Audio(sampling_rate=feature_extractor.sampling_rate))
- feature_extractor does the normalization and returns input_values and attention_mask.
- .map() supports batched preprocessing.
- Why does AutoModelForAudioClassification.from_pretrained take label2id and id2label?
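On the last question, one likely reason is that the two mappings are stored in the model config, so the number of output classes can be derived from them and predicted class indices can be turned back into readable genre names. The sketch below builds both dictionaries in plain Python. The genre list and its order are an assumption for illustration (in practice the names would be read from the dataset's label feature), and decode_prediction is a hypothetical helper, not part of any library.

```python
# Assumed GTZAN genre names in an assumed order; in a real run, read them
# from the dataset's label feature instead of hard-coding them.
genres = [
    "blues", "classical", "country", "disco", "hiphop",
    "jazz", "metal", "pop", "reggae", "rock",
]

# Index -> name, used to turn a predicted class index into a readable label.
id2label = {i: name for i, name in enumerate(genres)}
# Name -> index, the inverse mapping.
label2id = {name: i for i, name in id2label.items()}
# The size of the classification head follows from the mapping.
num_labels = len(id2label)


def decode_prediction(logits):
    """Hypothetical helper: return the label name with the highest score."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best_index]


# The mappings would then be passed when loading the model, for example
# (model_id is a placeholder for whichever checkpoint main.py uses):
# model = AutoModelForAudioClassification.from_pretrained(
#     model_id, num_labels=num_labels, label2id=label2id, id2label=id2label
# )
```

With the mappings saved in the config, downstream code can report a genre name rather than a bare integer without needing access to the original dataset.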