The iSpeech AI is a constantly evolving text-to-speech platform that keeps adding new voices, emotional tones, and language support.
Because it is built on Gemini 2.5 Flash and Pro with a 32,000-token context window, you get faster results and precise delivery for ...
Abstract: Zero-shot text-to-speech (TTS) has recently achieved remarkable performance by leveraging a speech prompt instead of a speaker embedding, as it provides richer information. However, ...
Abstract: We propose a novel generative speech enhancement (SE) framework that integrates a language model (LM) and a flow-matching model. To utilize an LM with discrete tokens, we introduce dMel, ...
Kokoro Web is powered by hexgrad/Kokoro-82M, an open-weight 82 million parameter Text-to-Speech model available on Hugging Face. Despite its lightweight architecture, it delivers comparable quality to ...
Dec 4 (Reuters) - Connecticut-based ITT Inc (ITT.N) is in advanced talks to buy Lone Star's SPX Flow in a transaction valuing the industrial equipment manufacturer at more than $4.5 ...
Finally, the code for the web UI client used in the Moshi demo is provided in the client/ directory. If you want to fine-tune Moshi, head over to kyutai-labs/moshi ...