Can you recommend a free/libre and open source text-to-speech (TTS) solution that generates English audio of “near-human” quality, for inclusion in open educational resources (OER)? Ideally with support for the say-as element of #SSML? A model on #HuggingFace? #Coqui TTS? Something else?
I am thinking about integrating TTS into my OER CI/CD pipeline:
https://gitlab.com/oer/emacs-reveal/-/issues/20
#tts #texttospeech #floss #foss #oer #cicd