anthony-wss/distilhubert-finetuned-gtzan

Audio Classification

This repo contains code and notes for this tutorial.

Dataset

The model is fine-tuned on the GTZAN music genre dataset.

Usage

export HUGGINGFACE_TOKEN=<your_token>
python main.py

Performance

Accuracy: 0.81 (default settings)

Notes

🤗 Datasets provides a train_test_split() method for splitting a dataset into train and test sets.
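feature_extractor cannot resample audio; it expects input already at its own sampling rate.
- To resample, cast the audio column to that rate with cast_column(); 🤗 Datasets then resamples each clip on the fly when it is accessed.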

from datasets import Audio

gtzan = gtzan.cast_column("audio", Audio(sampling_rate=feature_extractor.sampling_rate))

feature_extractor normalizes the audio and returns input_values and attention_mask.
.map() supports batched preprocessing.
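
The two notes above can be illustrated with a small pure-Python sketch. It is not the feature extractor itself: it assumes the normalization is the usual zero-mean, unit-variance scaling, and it uses a hypothetical `preprocess_batch` function and toy waveforms to show the dict-of-lists contract that a batched `.map()` function follows. It does not produce an attention_mask.

```python
import math

def zero_mean_unit_var(samples, eps=1e-7):
    """Scale one waveform to zero mean and (approximately) unit variance."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return [(s - mean) / math.sqrt(var + eps) for s in samples]

def preprocess_batch(batch):
    """Batched function: receives a dict of lists, returns a dict of lists."""
    # Assumption: batch["audio"] holds one plain list of samples per example.
    # Real audio columns carry more structure; this layout is illustrative only.
    return {"input_values": [zero_mean_unit_var(w) for w in batch["audio"]]}

# With 🤗 Datasets the function would be applied roughly like this (not run here):
#   gtzan = gtzan.map(preprocess_batch, batched=True, batch_size=100)

# Toy batch of two waveforms with different lengths.
toy_batch = {"audio": [[0.0, 1.0, 2.0, 3.0], [10.0, 10.5, 11.0]]}
encoded = preprocess_batch(toy_batch)
```

Because the function sees many examples per call, per-call overhead is paid once per batch rather than once per example, which is the main reason to pass batched=True.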
Why does AutoModelForAudioClassification.from_pretrained take label2id and id2label?
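
One likely answer: the two mappings are stored in the model config, so the classification head gets the right number of outputs and predictions can be reported as genre names instead of bare integers. The sketch below builds both mappings in plain Python. The genre order and the `decode` helper are assumptions for illustration; in practice the names would usually be read from the dataset's label feature.

```python
# GTZAN's ten genres; the order here is assumed, not taken from the dataset.
genres = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

# Class index -> human-readable name, and the inverse mapping.
id2label = {i: name for i, name in enumerate(genres)}
label2id = {name: i for i, name in id2label.items()}

def decode(class_id):
    """Hypothetical helper: turn a predicted class index into a genre name."""
    return id2label[class_id]

# The mappings would then be passed to the model (not run here; model_id is
# a placeholder for the pretrained checkpoint name):
#   model = AutoModelForAudioClassification.from_pretrained(
#       model_id,
#       num_labels=len(id2label),
#       label2id=label2id,
#       id2label=id2label,
#   )
```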
