Voice Input

The Amurg UI supports voice dictation with two backends. The default mode uses the browser's built-in Web Speech API and requires no setup. For private, offline speech recognition, you can connect a local Whisper server.

Voice Modes

Mode	Backend	Setup	Privacy
Browser (default)	Web Speech API (Chrome, Edge, Safari)	None; works out of the box	Audio may be sent to the browser vendor's cloud service
Local Whisper	Self-hosted Whisper ASR server via WebSocket	Run a Whisper server and configure its URL in settings	Audio goes only to the server you configure; with a server on your own machine, it never leaves that machine

Switch modes via the gear icon next to the microphone button. Settings are saved in localStorage under the key amurg-voice.

How to Use

The microphone button supports two interaction styles:

Gesture	Action
Hold to talk	Press and hold the mic button for more than 200 ms. Recording stops when you release.
Tap to toggle	Tap quickly (under 200 ms) to start recording, then tap again to stop. Useful on mobile.
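
The 200 ms threshold in the table can be read as a classification of each press by its duration. The sketch below is illustrative only: `HOLD_THRESHOLD_MS`, `classifyPress`, and `recordingAfterRelease` are hypothetical names, not Amurg's actual code, and it assumes recording begins as soon as the button is pressed. The table does not say how a press of exactly 200 ms is handled; the sketch treats it as a tap.

```typescript
// Threshold from the gesture table; all names here are hypothetical.
const HOLD_THRESHOLD_MS = 200;

type Gesture = "hold" | "tap";

// Classify a completed press by how long the button was held down.
function classifyPress(pressedAtMs: number, releasedAtMs: number): Gesture {
  const heldForMs = releasedAtMs - pressedAtMs;
  // Assumption: exactly 200 ms counts as a tap (the docs leave this open).
  return heldForMs > HOLD_THRESHOLD_MS ? "hold" : "tap";
}

// Decide whether recording should continue once the button is released.
// `wasRecordingBeforePress` is the state before the press began.
function recordingAfterRelease(
  wasRecordingBeforePress: boolean,
  gesture: Gesture
): boolean {
  if (gesture === "hold") {
    // Hold to talk: releasing the button always stops recording.
    return false;
  }
  // Tap to toggle: the first tap starts recording, the next one stops it.
  return !wasRecordingBeforePress;
}
```

In a UI, the two timestamps would typically come from `pointerdown` and `pointerup` events on the mic button.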

Edit before send

Transcribed text is appended to the message input field and is never sent automatically, so you can review, edit, or add to it before pressing send. While you speak, a real-time interim preview is shown above the input field.

Visual Feedback

Indicator	Meaning
Red mic button	Recording is active
Pulsing ring around button	Audio level visualization (scales with input volume)
Italic text above input	Interim transcription (partial, live as you speak)
Green ring on input field	Final transcription received (flashes briefly)

Browser Mode

The default mode uses the browser's SpeechRecognition API (or the prefixed webkitSpeechRecognition on browsers that expose only that name, such as Safari). It requires no server or configuration.

Browser	Support
Chrome / Edge	Full support
Safari (iOS 14.5+, macOS)	Full support
Firefox	Not supported (mic button hidden)

Language Detection

Browser mode takes the recognition language from navigator.language and falls back to en-US when that value is unavailable. Recognition therefore follows your browser's language setting automatically.
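
The fallback described above can be sketched as a small helper. `resolveRecognitionLanguage` and `FALLBACK_LANGUAGE` are hypothetical names used for illustration, and treating an empty or whitespace-only value as "unavailable" is an assumption rather than documented behavior.

```typescript
// Fallback language named in the docs; the constant name is hypothetical.
const FALLBACK_LANGUAGE = "en-US";

// Return the browser language if one is set, otherwise the fallback.
function resolveRecognitionLanguage(
  browserLanguage: string | null | undefined
): string {
  const trimmed = (browserLanguage ?? "").trim();
  // Assumption: an empty string is treated the same as a missing value.
  return trimmed.length > 0 ? trimmed : FALLBACK_LANGUAGE;
}
```

In the browser, the result would be assigned to the recognizer, for example `recognition.lang = resolveRecognitionLanguage(navigator.language)`.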

Local Whisper Mode

For private, offline speech recognition, you can run a Whisper-compatible ASR server and point the UI at it. When the server runs on the same machine as the browser, audio never leaves that machine.

Setup

1. Run a Whisper ASR server that accepts WebSocket connections and receives audio/webm chunks.
2. Click the gear icon next to the microphone button in the Amurg UI.
3. Select Local Whisper.
4. Enter the WebSocket URL (e.g. ws://localhost:8000/asr).

Protocol

The UI records audio with MediaRecorder in the audio/webm;codecs=opus format and streams it to the Whisper server over the WebSocket in 250 ms chunks. The server is expected to respond with JSON messages containing transcription results.
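
For anyone building or debugging a compatible server, the client side of this protocol might look roughly like the sketch below. It uses the standard browser `MediaRecorder` and `WebSocket` APIs, but `streamMicrophone`, `CHUNK_MS`, and `MIME_TYPE` are hypothetical names. Starting the recorder only after the socket opens, and closing the socket once the recorder has stopped, are assumptions rather than a description of Amurg's actual code.

```typescript
// Values taken from the protocol description; constant names are hypothetical.
const CHUNK_MS = 250;
const MIME_TYPE = "audio/webm;codecs=opus";

// Stream microphone audio to a Whisper server. Returns a function that stops
// the recording. Browser-only: needs MediaRecorder and WebSocket.
function streamMicrophone(stream: MediaStream, url: string): () => void {
  const socket = new WebSocket(url);
  const recorder = new MediaRecorder(stream, { mimeType: MIME_TYPE });

  // Each chunk arrives as a Blob and is forwarded as a binary frame.
  recorder.ondataavailable = (event: BlobEvent) => {
    if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
      socket.send(event.data);
    }
  };

  // The last chunk is delivered before the stop event, so closing the
  // socket here does not drop it.
  recorder.onstop = () => socket.close();

  // Assumption: begin recording only once the connection is open.
  socket.onopen = () => recorder.start(CHUNK_MS);

  return () => {
    if (recorder.state !== "inactive") {
      recorder.stop();
    } else {
      // Recording never started, for example because the socket never opened.
      socket.close();
    }
  };
}
```

A caller would obtain `stream` from `navigator.mediaDevices.getUserMedia({ audio: true })` and attach its own `socket.onmessage` handling for the JSON replies.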

Expected server responses

The UI reads the transcription from a text or transcript field in each JSON message. It also recognizes partial (interim) results through the buffer, segments, and is_final fields and through type: "partial".

// Partial transcription (shown as interim preview)
{"type": "partial", "text": "hello wor"}

// Final transcription (appended to input field)
{"text": "hello world", "is_final": true}

// Alternative field names also accepted
{"transcript": "hello world"}
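
One way to interpret the three sample messages above is sketched below. `parseServerMessage` and the `Transcription` type are hypothetical. Treating a message with no partial marker as final is an assumption, and the `buffer` and `segments` fields are left out because their shape is not described here.

```typescript
// Result of interpreting one server message; the type name is hypothetical.
type Transcription =
  | { kind: "partial"; text: string }
  | { kind: "final"; text: string };

// Parse one WebSocket text frame. Returns null for anything unusable.
function parseServerMessage(raw: string): Transcription | null {
  let message: unknown;
  try {
    message = JSON.parse(raw);
  } catch {
    return null; // Not valid JSON.
  }
  if (typeof message !== "object" || message === null) {
    return null;
  }
  const fields = message as Record<string, unknown>;

  // Accept either documented field name, preferring "text".
  const text =
    typeof fields.text === "string"
      ? fields.text
      : typeof fields.transcript === "string"
        ? fields.transcript
        : null;
  if (text === null) {
    return null;
  }

  // Assumption: a message is partial only when it says so explicitly.
  const isPartial = fields.type === "partial" || fields.is_final === false;
  return { kind: isPartial ? "partial" : "final", text };
}
```

Partial results would feed the interim preview, while final results would be appended to the input field.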

Compatible Servers

Any Whisper ASR server that accepts WebSocket audio streaming and returns JSON with a text or transcript field will work. Popular options include whisper_streaming and WhisperLive.

Settings Storage

Voice settings are persisted in localStorage under the key amurg-voice as a JSON object:

{
  "mode": "browser",
  "whisperUrl": "ws://localhost:8000/asr"
}

Field	Type	Default	Description
`mode`	string	`"browser"`	`"browser"` or `"whisper"`
`whisperUrl`	string	`""`	WebSocket URL for the Whisper server
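
Reading these settings with the documented defaults could look like the sketch below. `loadVoiceSettings` and the `KeyValueStore` interface are hypothetical; the storage object is passed in so the function can be exercised without a browser. Falling back to the defaults on malformed or unexpected values is an assumption.

```typescript
type VoiceMode = "browser" | "whisper";

interface VoiceSettings {
  mode: VoiceMode;
  whisperUrl: string;
}

// Minimal storage shape; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
}

// Key documented above.
const STORAGE_KEY = "amurg-voice";

// Load voice settings, substituting the documented defaults where needed.
function loadVoiceSettings(store: KeyValueStore): VoiceSettings {
  const defaults: VoiceSettings = { mode: "browser", whisperUrl: "" };
  const stored = store.getItem(STORAGE_KEY);
  if (stored === null) {
    return defaults;
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(stored);
  } catch {
    return defaults; // Assumption: corrupt JSON falls back to defaults.
  }
  if (typeof parsed !== "object" || parsed === null) {
    return defaults;
  }
  const fields = parsed as Record<string, unknown>;
  return {
    // Any value other than "whisper" is treated as the default mode.
    mode: fields.mode === "whisper" ? "whisper" : defaults.mode,
    whisperUrl:
      typeof fields.whisperUrl === "string"
        ? fields.whisperUrl
        : defaults.whisperUrl,
  };
}
```

In the browser this would be called as `loadVoiceSettings(localStorage)`.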

Troubleshooting

Problem	Solution
No microphone button visible	Your browser does not support the Web Speech API (e.g. Firefox). Switch to Chrome, Edge, or Safari.
"Microphone access denied" toast	Grant microphone permission in your browser settings. On mobile, check app-level permissions too.
Recognition stops after ~60 seconds	Some mobile browsers stop long-running speech recognition. The UI restarts it automatically; tap the mic again if it does not resume.
Whisper mode shows no transcription	Check that the Whisper server is running and the WebSocket URL is correct. Open browser dev tools to inspect WebSocket frames.
Whisper mode: "Connection failed"	Ensure the Whisper server accepts WebSocket connections. If the UI is served over HTTPS, the Whisper URL must use `wss://`, because browsers block mixed content.
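
The mixed-content rule in the last row can be checked before connecting. The helper below is a hypothetical sketch, not part of Amurg: `whisperUrlProblem` is an invented name, and it applies only the rule stated in the table. Individual browsers may apply additional or more lenient checks.

```typescript
// Return a description of the problem with a Whisper URL, or null if the
// URL passes these basic checks. The function name is hypothetical.
function whisperUrlProblem(
  pageProtocol: string,
  whisperUrl: string
): string | null {
  const url = whisperUrl.trim();
  const isInsecure = /^ws:\/\//i.test(url);
  const isSecure = /^wss:\/\//i.test(url);

  if (!isInsecure && !isSecure) {
    return "The Whisper URL must start with ws:// or wss://.";
  }
  // Rule from the troubleshooting table: an HTTPS page needs wss://.
  if (pageProtocol === "https:" && isInsecure) {
    return "The UI is served over HTTPS, so the Whisper URL must use wss://.";
  }
  return null;
}
```

In the browser, `pageProtocol` would be `window.location.protocol`, which is `"https:"` or `"http:"` including the trailing colon.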