There is an interesting, fully open-source initiative called the “SpeechBrain Toolkit”. Somehow it was not on our radar, even though the project has been around for a couple of years already. So it may be useful to share an overview of the project here, too. In a nutshell, the goal is:
… create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition (both end-to-end and HMM-DNN), speaker recognition, speech separation, multi-microphone signal processing (e.g, beamforming), self-supervised and unsupervised learning, speech contamination / augmentation, and many others. The toolkit will be designed to be a stand-alone framework, but simple interfaces with well-known toolkits, such as Kaldi will also be implemented
The toolkit is developed by researchers at the University of Montréal and Avignon Université together with a diverse consortium that includes many big players in speech tech, such as NVIDIA and Cambridge, and even Yoshua Bengio, a deep learning pioneer and Turing Award winner!
The SpeechBrain team actively welcomes collaboration to extend application areas and use cases, so check them out! Here’s an overview on YouTube to get you started.