fix: prevent discord voice self-feedback

This commit is contained in:
Peter Steinberger
2026-05-07 05:37:17 +01:00
parent 6009b86f0d
commit 1c2832526f
5 changed files with 37 additions and 4 deletions

View File

@@ -6,6 +6,7 @@ Docs: https://docs.openclaw.ai
### Changes
- Discord/voice: keep TTS playback running when another user starts speaking, ignore new capture during playback to avoid feedback loops, and downgrade expected receive-stream aborts to verbose diagnostics.
- Telegram: treat successful same-chat `message` tool outbound sends during an inbound telegram turn as delivered when deciding whether to emit the rewritten silent reply fallback (#78685). Thanks @neeravmakwana.
- Gateway/tasks: reconcile stale CLI run-context tasks whose live run context disappeared even when a child session row remains, and apply the default bounded reload deferral timeout to channel hot reloads so stale task records cannot block Discord/Slack/Telegram reloads forever.
- Discord/voice: make `openclaw channels capabilities --channel discord --target channel:<id>` and `channels status --probe` audit voice-channel permissions, including auto-join targets, so missing Connect/Speak/Read Message History permissions show up before `/vc join`.

View File

@@ -1205,8 +1205,10 @@ Notes:
- `@discordjs/voice` defaults are `daveEncryption=true` and `decryptionFailureTolerance=24` if unset.
- `voice.connectTimeoutMs` controls the initial `@discordjs/voice` Ready wait for `/vc join` and auto-join attempts. Default: `30000`.
- `voice.reconnectGraceMs` controls how long OpenClaw waits for a disconnected voice session to begin reconnecting before destroying it. Default: `15000`.
- Voice playback does not stop just because another user starts speaking. To avoid feedback loops, OpenClaw ignores new voice capture while TTS is playing; speak after playback finishes for the next turn.
- OpenClaw also watches receive decrypt failures and auto-recovers by leaving/rejoining the voice channel after repeated failures in a short window.
- If receive logs repeatedly show `DecryptionFailed(UnencryptedWhenPassthroughDisabled)` after updating, collect a dependency report and logs. The bundled `@discordjs/voice` line includes the upstream padding fix from discord.js PR #11449, which closed discord.js issue #11419.
- `The operation was aborted` receive events are expected when OpenClaw finalizes a captured speaker segment; they are verbose diagnostics, not warnings.
Voice channel pipeline:

View File

@@ -351,6 +351,7 @@ WhatsApp runs through the gateway's web channel (Baileys Web). It starts automat
- `channels.discord.voice.daveEncryption` and `channels.discord.voice.decryptionFailureTolerance` pass through to `@discordjs/voice` DAVE options (`true` and `24` by default).
- `channels.discord.voice.connectTimeoutMs` controls the initial `@discordjs/voice` Ready wait for `/vc join` and auto-join attempts (`30000` by default).
- `channels.discord.voice.reconnectGraceMs` controls how long a disconnected voice session may take to enter reconnect signalling before OpenClaw destroys it (`15000` by default).
- Discord voice playback is not interrupted by another user's speaking-start event. To avoid feedback loops, OpenClaw ignores new voice capture while TTS is playing.
- OpenClaw additionally attempts voice receive recovery by leaving/rejoining a voice session after repeated decrypt failures.
- `channels.discord.streaming` is the canonical stream mode key. Discord defaults to `streaming.mode: "progress"` so tool/work progress appears in one edited preview message; set `streaming.mode: "off"` to disable it. Legacy `streamMode` and boolean `streaming` values remain runtime aliases; run `openclaw doctor --fix` to rewrite persisted config.
- `channels.discord.autoPresence` maps runtime availability to bot presence (healthy => online, degraded => idle, exhausted => dnd) and allows optional status text overrides.

View File

@@ -393,6 +393,29 @@ describe("DiscordVoiceManager", () => {
expect(player.off).toHaveBeenCalledWith("error", expect.any(Function));
});
it("ignores new capture while playback is running", async () => {
const connection = createConnectionMock();
joinVoiceChannelMock.mockReturnValueOnce(connection);
const manager = createManager();
await manager.join({ guildId: "g1", channelId: "1001" });
const player = createAudioPlayerMock.mock.results.at(-1)?.value;
const entry = (manager as unknown as { sessions: Map<string, unknown> }).sessions.get("g1");
expect(entry).toBeDefined();
expect(player).toBeDefined();
player.state.status = "playing";
await (
manager as unknown as {
handleSpeakingStart: (entry: unknown, userId: string) => Promise<void>;
}
).handleSpeakingStart(entry, "u1");
expect(player.stop).not.toHaveBeenCalled();
expect(connection.receiver.subscribe).not.toHaveBeenCalled();
});
it("passes DAVE options to joinVoiceChannel", async () => {
const manager = createManager({
voice: {

View File

@@ -496,15 +496,17 @@ export class DiscordVoiceManager {
`capture start: guild ${entry.guildId} channel ${entry.channelId} user ${userId}`,
);
const voiceSdk = loadDiscordVoiceSdk();
if (entry.player.state.status === voiceSdk.AudioPlayerStatus.Playing) {
logVoiceVerbose(
`capture ignored during playback: guild ${entry.guildId} channel ${entry.channelId} user ${userId}`,
);
return;
}
this.enableDaveReceivePassthrough(
entry,
`speaker ${userId} start`,
DAVE_RECEIVE_PASSTHROUGH_REARM_EXPIRY_SECONDS,
);
if (entry.player.state.status === voiceSdk.AudioPlayerStatus.Playing) {
entry.player.stop(true);
}
const stream = entry.connection.receiver.subscribe(userId, {
end: {
behavior: voiceSdk.EndBehaviorType.Manual,
@@ -575,6 +577,10 @@ export class DiscordVoiceManager {
private handleReceiveError(entry: VoiceSessionEntry, err: unknown) {
const analysis = analyzeVoiceReceiveError(err);
if (analysis.isAbortLike && !analysis.countsAsDecryptFailure) {
logVoiceVerbose(`receive stream ended: ${analysis.message}`);
return;
}
logger.warn(`discord voice: receive error: ${analysis.message}`);
if (analysis.shouldAttemptPassthrough) {
this.enableDaveReceivePassthrough(