Sharif_WavLM

for Speaker Verification

In this repository, the WavLM model is applied to speaker verification on both high-quality and poor-quality data, and the PyCM library is used for evaluation.

Table of Contents

  1. General Info
  2. How to Use
  3. Comparison
  4. Useful Links
  5. Thanks to

General Info


  • Datasets: 30 speakers were selected from the Farsdat dataset. 10 speakers are held out for testing as unknown speakers, and the remaining 20 serve as known speakers (target/non-target). Each speaker has 10 audio files, and the first file is used as the enrollment file. Audio files should be 6 seconds long (here we use ffmpeg to cut them).

  • Evaluation: The PyCM library is used for evaluation. PyCM is a multi-class confusion matrix library written in Python that accepts both input data vectors and a direct matrix. It is a post-classification model evaluation tool that supports most class and overall statistics parameters, aimed mainly at data scientists who need a broad array of metrics for predictive models and accurate evaluation of a large variety of classifiers.

  • System Config: This model was fine-tuned on an NVIDIA GeForce RTX 3060 (12 GB).

  • Link to model: https://huggingface.co/SaraSadeghi/Sharif-WavLM
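
The data preparation described in the Datasets item (6-second clips, first file per speaker as enrollment) could be scripted roughly as follows. This is a minimal sketch, not the repository's own script: the helper names `ffmpeg_trim_command` and `split_enrollment`, the `raw/` and `cut/` folders, and the `spk01_XX.wav` file names are hypothetical, and "first file" is assumed to mean first when sorted by name.

```python
from pathlib import Path

CLIP_SECONDS = 6  # clip length stated in the Datasets item


def ffmpeg_trim_command(src, dst, seconds=CLIP_SECONDS):
    """Build an ffmpeg command that keeps only the first `seconds` of `src`."""
    # -y overwrites dst if it already exists; -t limits the output duration.
    return ["ffmpeg", "-y", "-i", str(src), "-t", str(seconds), str(dst)]


def split_enrollment(files):
    """Return (enrollment_file, test_files) for one speaker.

    Assumption: the "first" file is the first one when sorted by name.
    """
    ordered = sorted(files)
    if not ordered:
        raise ValueError("speaker has no audio files")
    return ordered[0], ordered[1:]


# Hypothetical file names for one speaker with 10 recordings.
speaker_files = [f"spk01_{i:02d}.wav" for i in range(1, 11)]
enroll, tests = split_enrollment(speaker_files)

# Command that would trim the enrollment file (hypothetical folders).
cmd = ffmpeg_trim_command(Path("raw") / enroll, Path("cut") / enroll)
print(enroll, len(tests))
print(" ".join(cmd))
```

To actually trim a file, the command list can be passed to `subprocess.run(cmd, check=True)`, which requires ffmpeg to be installed and on the PATH.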

How to Use


  • For high-quality (microphone) data: use WavLM_base_AGP
  • For poor-quality (telephony) data: use WavLM_base_telephony
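
As a rough illustration of how the two models might fit into a verification step, the sketch below picks a model name by recording channel and compares an enrollment embedding with a test embedding. Everything except the two model names is an assumption: the channel labels, the helper functions, cosine scoring, and the 0.8 threshold are illustrative and not taken from this repository, and the embeddings are toy vectors standing in for whatever the chosen model produces.

```python
import math

# Model names from this README, keyed by a hypothetical channel label.
CHECKPOINTS = {
    "microphone": "WavLM_base_AGP",
    "telephony": "WavLM_base_telephony",
}


def pick_checkpoint(channel):
    """Return the model name suggested for the given recording channel."""
    try:
        return CHECKPOINTS[channel]
    except KeyError:
        raise ValueError(f"unknown channel: {channel!r}") from None


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / norm


def same_speaker(enroll_emb, test_emb, threshold=0.8):
    """Accept the trial when the score reaches the (assumed) threshold."""
    return cosine_similarity(enroll_emb, test_emb) >= threshold


# Toy embeddings; real ones would come from the selected model.
enroll_emb = [1.0, 0.0]
print(pick_checkpoint("telephony"))
print(same_speaker(enroll_emb, [1.0, 0.0]))  # identical direction
print(same_speaker(enroll_emb, [0.0, 1.0]))  # orthogonal direction
```

In practice the threshold would be tuned on held-out trials rather than fixed in advance.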

Comparison


Loading .... :hourglass_flowing_sand:

Useful Links


Thanks to


Thanks to Sadra Sabouri for his collaboration:handshake::handshake:

and also thanks to PyCM🔥🔥


Give us a star if you found this repo useful.

🙋‍♀️ Open an issue if you have any comments about it.

🥰 Feel free to open a pull request adding your feature. We'll be more than happy to accept it.