The official voice documentation narrows the check to three things: whether your user ID is in DISCORD_ALLOWED_USERS, whether the bot has the Connect and Speak permissions, and whether all of the Privileged Intents are enabled. If any one of these is missing, you can end up with a misleading "online" state in which the bot joins the channel but cannot hear you.
Shortest checklist
- Confirm that your own Discord user ID is listed in DISCORD_ALLOWED_USERS.
- Enable the Presence Intent, Server Members Intent, and Message Content Intent in the Discord Developer Portal.
- The bot's invite permissions must include at least Connect and Speak; adding Use Voice Activity is recommended.
- Confirm that you are not muted and that you are not in a voice channel the bot is not allowed to join.
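
The allowlist and permission items above can be sketched as a small preflight script. This is a minimal illustration, not part of Hermes: the function names `parse_allowlist` and `preflight` are hypothetical, and the comma-separated format for DISCORD_ALLOWED_USERS is an assumption to verify against the official documentation. The permission bit values are the ones Discord publishes for Connect, Speak, and Use Voice Activity. The three intents are toggled in the Developer Portal and cannot be read from a permissions integer, so they are not covered here.

```python
from __future__ import annotations

import os

# Discord permission bit flags, as listed in the Discord developer documentation.
CONNECT = 1 << 20
SPEAK = 1 << 21
USE_VAD = 1 << 25  # shown in the Discord UI as "Use Voice Activity"


def parse_allowlist(raw: str) -> set[str]:
    """Split an ID list into a set. Comma-separated format is an assumption."""
    return {part.strip() for part in raw.split(",") if part.strip()}


def preflight(user_id: str, allowlist_raw: str, permissions: int) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    if user_id not in parse_allowlist(allowlist_raw):
        problems.append("user ID missing from DISCORD_ALLOWED_USERS")
    if not permissions & CONNECT:
        problems.append("bot lacks Connect")
    if not permissions & SPEAK:
        problems.append("bot lacks Speak")
    if not permissions & USE_VAD:
        problems.append("bot lacks Use Voice Activity (recommended)")
    return problems


if __name__ == "__main__":
    issues = preflight(
        user_id="123456789012345678",  # placeholder, replace with your own ID
        allowlist_raw=os.environ.get("DISCORD_ALLOWED_USERS", ""),
        permissions=CONNECT | SPEAK,  # the integer from the bot's invite link
    )
    print(issues or "all checks passed")
```

The `permissions` value corresponds to the integer in the `permissions=` query parameter of the bot's invite URL, so the check can be run before the bot is ever invited.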
Why the bot can join the channel but still seem deaf
Joining a voice channel only means that the connection is established; it does not mean that Hermes can already map the speaker to the allowlist or receive a voice stream. Without the Server Members Intent in particular, the bot may not be able to identify who is speaking at all.
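
To make the failure mode concrete, here is a simplified model of allowlist gating. It is a hypothetical sketch and not Hermes's actual code: `accept_audio`, `stream_to_user`, and the idea of a per-stream ID are assumptions made for illustration. It shows why a connected bot can still ignore everything it receives: audio is dropped both when the speaker cannot be resolved to a user ID and when the resolved ID is not on the allowlist.

```python
from __future__ import annotations


def accept_audio(
    stream_id: int,
    stream_to_user: dict[int, str],
    allowed_users: set[str],
) -> bool:
    """Decide whether audio from one stream should be processed.

    stream_to_user is a hypothetical lookup from an audio stream to a
    Discord user ID; it stays incomplete if the bot cannot identify members.
    """
    speaker = stream_to_user.get(stream_id)
    if speaker is None:
        # Unknown speaker: nothing to compare against the allowlist.
        return False
    return speaker in allowed_users


# Connected and receiving audio, but the speaker was never resolved.
print(accept_audio(7, {}, {"123456789012345678"}))
# Speaker resolved and present on the allowlist.
print(accept_audio(7, {7: "123456789012345678"}, {"123456789012345678"}))
```

In both rejected cases the voice connection itself looks healthy, which matches the "joined but deaf" symptom.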
In short: the most common problem with Discord voice mode is not TTS, but an allowlist and intents that are not fully configured.
Official open-source repository: https://github.com/NousResearch/hermes-agent; official documentation: https://hermes-agent.nousresearch.com/.