Hacker News

Has anyone successfully run this on a Mac? The installation instructions appear to assume an NVIDIA GPU (CUDA, FlashAttention), and I’m not sure whether it works with PyTorch’s Metal/MPS backend.
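A quick way to check whether your PyTorch build has a usable Metal/MPS backend at all: `torch.backends.mps.is_available()` is the real API; the import guard here is just so the snippet runs even on machines without PyTorch installed.

```python
# Probe for PyTorch's Metal (MPS) backend on macOS.
try:
    import torch
    # True on Apple-silicon Macs with a recent torch build and macOS version.
    has_mps = torch.backends.mps.is_available()
except ImportError:
    has_mps = None  # PyTorch isn't installed at all
print(has_mps)
```

Note that MPS support is a separate question from FlashAttention: even with MPS working, any CUDA-only dependency still needs its own fallback path.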




FWIW you can run the demo without FlashAttention using the --no-flash-attn command-line parameter; I do that since I'm on Windows and haven't gotten FlashAttention 2 to work.
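Worth noting why a flag like that can exist: FlashAttention is only a faster kernel, and mathematically it computes the same scaled dot-product attention as the naive formulation, so a fallback path just uses the standard version. A dependency-free sketch of that standard formulation for a single head (toy values, not the repo's actual code):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(q, k, v):
    # Standard scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    # FlashAttention computes the same result, just tiled so the full
    # score matrix is never materialized in memory.
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[t] for w, vj in zip(weights, v))
                    for t in range(len(v[0]))])
    return out

q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(attention(q, k, v))
```

So disabling FlashAttention costs speed and memory, not correctness: the outputs match up to floating-point error.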

It seems to depend on FlashAttention, so the short answer is no. Hopefully someone does the work of porting the inference code over!


Thanks! Simon's example uses the custom voice model (creating a voice from instructions). But that comment led me eventually to this page, which shows how to use mlx-audio for custom voices:

https://huggingface.co/mlx-community/Qwen3-TTS-12Hz-0.6B-Bas...

  uv tool install --force git+https://github.com/Blaizzy/mlx-audio.git --prerelease=allow
    
  python -m mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-0.6B-Base-bf16 --text "Hello, this is a test." --ref_audio path_to_audio.wav --ref_text "Transcript of the reference audio." --play

I recommend using Modal for renting the metal.



