Real-time noise/voice recognition task with CNN-type models
Noise: self-labeled noise clips from the website freesound, covering several noises common in online meetings: crowd, dog, keyboard, lawnmower, mouse click, passing car
Voice: noisy voice with additive noise, generated by mixing the noises above with clean voice sampled from the clean dataset Aurora4
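The additive mixing described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names `rms` and `mix_at_snr` are hypothetical, and scaling the noise to a target signal-to-noise ratio (SNR) is an assumption, since the mixing level is not stated here. Samples are plain Python floats.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_at_snr(voice, noise, snr_db):
    """Add noise to voice, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper: assumes both inputs are lists of floats at the
    same sample rate.
    """
    if len(noise) < len(voice):
        # Loop the noise clip so it covers the whole utterance.
        noise = noise * (len(voice) // len(noise) + 1)
    noise = noise[:len(voice)]
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise clip is silent; cannot scale it to an SNR")
    # SNR_dB = 20 * log10(rms_voice / rms_noise), solved for the noise RMS.
    target_noise_rms = rms(voice) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [v + gain * n for v, n in zip(voice, noise)]
```

A lower `snr_db` gives a noisier mixture; drawing it from a range per utterance is one common way to vary the difficulty of the generated data.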
Models: PyTorch implementations of the CNN-type models from the paper https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43969.pdf
MobileNet tricks are applied so that the models are better suited to real-time tasks. (More data is still needed for this trick to work.)
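To show why such a trick helps in real-time use, here is a small parameter-count comparison. It assumes the "MobileNet tricks" refer to depthwise separable convolutions (a depthwise convolution followed by a 1x1 pointwise convolution), which this README does not state explicitly. The layer sizes are hypothetical and are not taken from the models in this repository.

```python
def conv2d_params(in_ch, out_ch, kernel_h, kernel_w, bias=False):
    """Parameter count of a standard 2D convolution layer."""
    weights = in_ch * out_ch * kernel_h * kernel_w
    return weights + (out_ch if bias else 0)

def depthwise_separable_params(in_ch, out_ch, kernel_h, kernel_w, bias=False):
    """Parameter count of a depthwise conv followed by a 1x1 pointwise conv."""
    # Depthwise: one kernel_h x kernel_w filter per input channel.
    depthwise = in_ch * kernel_h * kernel_w + (in_ch if bias else 0)
    # Pointwise: a 1x1 convolution that mixes channels.
    pointwise = in_ch * out_ch + (out_ch if bias else 0)
    return depthwise + pointwise

# Hypothetical layer: 64 input channels, 64 output channels, 3x3 kernel.
standard = conv2d_params(64, 64, 3, 3)                 # 36864
separable = depthwise_separable_params(64, 64, 3, 3)   # 4672
```

For this hypothetical layer the separable version has roughly 8 times fewer weights, which reduces both model size and per-frame compute.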
tensorflow (version >= 1.8.0) used for audio data preprocessing (some of its audio processing tools)
pytorch used for model training
pydub used for audio preprocessing (audio segmentation + normalization)
training: python3 train.py [model name]
testing: python3 test.py [model name] [percentage of the whole test data to use]
testing on a real recorded wav file: python3 testOnRealRecord.py [model name] [wav file path]