Audio

rtvoice ships ready-to-use implementations for microphone input and speaker output. For custom hardware or special requirements (e.g. file playback, virtual devices) you can implement the abstract base classes directly.

Quickstart

from rtvoice.audio import MicrophoneInput, SpeakerOutput

agent = RealtimeAgent(
    audio_input=MicrophoneInput(sample_rate=24000),
    audio_output=SpeakerOutput(sample_rate=24000),
)

Built-in devices

MicrophoneInput

Bases: AudioInputDevice

Default microphone input powered by PyAudio.

Streams raw 16-bit PCM audio from the system microphone in a non-blocking loop. Requires the `pyaudio` package; install it with `pip install rtvoice[audio]`.

Example

mic = MicrophoneInput(sample_rate=24000)
agent = RealtimeAgent(audio_input=mic)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `device_index` | `int \| None` | PyAudio device index. Defaults to the system default input device. | `None` |
| `sample_rate` | `int` | Sample rate in Hz. Must match the model's expected rate (24 000 Hz). | `24000` |
| `chunk_size` | `int` | Number of frames per read. Smaller values reduce latency; the default of 4800 frames holds 200 ms of audio at 24 000 Hz. | `4800` |
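To get a feel for the latency trade-off behind `chunk_size`, the duration and byte size of one chunk can be worked out from the sample rate. The sketch below assumes mono audio (the channel count is not stated here) and 2 bytes per 16-bit sample. The helper names are illustrative and not part of rtvoice.

```python
BYTES_PER_SAMPLE = 2  # 16-bit PCM


def chunk_duration_ms(chunk_size: int, sample_rate: int) -> float:
    """Milliseconds of audio held in one chunk of `chunk_size` frames."""
    return 1000.0 * chunk_size / sample_rate


def chunk_bytes(chunk_size: int, channels: int = 1) -> int:
    """Size in bytes of one chunk (mono is an assumption)."""
    return chunk_size * channels * BYTES_PER_SAMPLE


print(chunk_duration_ms(4800, 24000))  # 200.0
print(chunk_bytes(4800))               # 9600
```

Under these assumptions, halving `chunk_size` halves the time the reader waits before each chunk is available, at the cost of twice as many reads per second.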

SpeakerOutput

Bases: AudioOutputDevice

Default speaker output powered by PyAudio.

Plays raw 16-bit PCM audio through the system speaker using a dedicated background thread, keeping the async event loop unblocked. Requires the `pyaudio` package; install it with `pip install rtvoice[audio]`.
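The background-thread approach described above can be illustrated with a generic queue-and-worker pattern. This is a sketch of the general technique only, not the actual SpeakerOutput implementation: `ThreadedSink` and its `write` callable are hypothetical, and a plain list stands in for the blocking audio stream so the example runs without PyAudio.

```python
import asyncio
import queue
import threading


class ThreadedSink:
    """Illustrative only: drains queued chunks on a worker thread."""

    def __init__(self, write):
        self._write = write  # blocking callable, e.g. an audio stream's write
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        while True:
            chunk = self._queue.get()
            if chunk is None:  # sentinel: shut down the worker
                break
            self._write(chunk)  # the blocking write happens off the event loop

    async def start(self) -> None:
        self._thread.start()

    async def play_chunk(self, chunk: bytes) -> None:
        self._queue.put_nowait(chunk)  # returns immediately

    async def stop(self) -> None:
        self._queue.put(None)
        # Wait for the worker in a helper thread so the event loop stays free.
        await asyncio.to_thread(self._thread.join)


async def main() -> list[bytes]:
    written: list[bytes] = []
    sink = ThreadedSink(written.append)
    await sink.start()
    await sink.play_chunk(b"\x00\x00" * 4)
    await sink.play_chunk(b"\x01\x00" * 4)
    await sink.stop()
    return written


played = asyncio.run(main())
```

Because `play_chunk` only enqueues, the coroutine returns at once while the worker thread performs the slow write, which is what keeps other asyncio tasks responsive.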

Example

speaker = SpeakerOutput(sample_rate=24000)
agent = RealtimeAgent(audio_output=speaker)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `device_index` | `int \| None` | PyAudio device index. Defaults to the system default output device. | `None` |
| `sample_rate` | `int` | Sample rate in Hz. Must match the model's output rate (24 000 Hz). | `24000` |

Custom devices

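Each built-in class implements an abstract base class: MicrophoneInput implements AudioInputDevice, and SpeakerOutput implements AudioOutputDevice. Subclass these base classes if you need to bring your own audio source or sink.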

AudioInputDevice

Bases: ABC

Abstract base class for audio input devices.

Implement this interface to provide a custom microphone or audio source. The default implementation is MicrophoneInput.

is_active `abstractmethod` `property`

is_active: bool

Whether the device is currently capturing audio.

start `abstractmethod` `async`

start() -> None

Open the device and begin audio capture.

stop `abstractmethod` `async`

stop() -> None

Stop audio capture and release all device resources.

stream_chunks `abstractmethod`

stream_chunks() -> AsyncIterator[bytes]

Yield raw 16-bit PCM audio chunks as they become available.
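As a rough illustration of the four members listed above, here is a minimal custom source that replays PCM bytes already held in memory. To keep the example runnable without rtvoice installed, it defines a local stand-in ABC that mirrors the documented interface; in real code you would subclass the library's AudioInputDevice instead. `BufferInput` is a hypothetical name, and mono audio (2 bytes per frame) is an assumption.

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator


class AudioInputDevice(ABC):
    """Local stand-in mirroring the documented interface (not the rtvoice class)."""

    @property
    @abstractmethod
    def is_active(self) -> bool: ...

    @abstractmethod
    async def start(self) -> None: ...

    @abstractmethod
    async def stop(self) -> None: ...

    @abstractmethod
    def stream_chunks(self) -> AsyncIterator[bytes]: ...


class BufferInput(AudioInputDevice):
    """Hypothetical source that replays pre-recorded 16-bit PCM bytes."""

    def __init__(self, pcm: bytes, chunk_size: int = 4800) -> None:
        self._pcm = pcm
        self._chunk_bytes = chunk_size * 2  # 2 bytes per frame, assuming mono
        self._active = False

    @property
    def is_active(self) -> bool:
        return self._active

    async def start(self) -> None:
        self._active = True

    async def stop(self) -> None:
        self._active = False

    async def stream_chunks(self) -> AsyncIterator[bytes]:
        # An async generator: calling it returns an async iterator of chunks.
        for offset in range(0, len(self._pcm), self._chunk_bytes):
            if not self._active:
                break
            yield self._pcm[offset:offset + self._chunk_bytes]
            await asyncio.sleep(0)  # give other tasks a turn between chunks


async def collect() -> list[bytes]:
    source = BufferInput(b"\x00\x00" * 10, chunk_size=4)
    await source.start()
    chunks = [chunk async for chunk in source.stream_chunks()]
    await source.stop()
    return chunks


chunks = asyncio.run(collect())
```

This sketch emits chunks as fast as they are consumed. A file-playback source would probably want to sleep for roughly one chunk duration between yields so that audio arrives at real-time pace.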

AudioOutputDevice

Bases: ABC

Abstract base class for audio output devices.

Implement this interface to provide a custom speaker or audio sink. The default implementation is SpeakerOutput.

is_playing `abstractmethod` `property`

is_playing: bool

Whether audio is currently being played or queued.

clear_buffer `abstractmethod` `async`

clear_buffer() -> None

Discard all queued audio to stop playback immediately.

play_chunk `abstractmethod` `async`

play_chunk(chunk: bytes) -> None

Enqueue a raw 16-bit PCM audio chunk for playback.

start `abstractmethod` `async`

start() -> None

Open the device and prepare it for playback.

stop `abstractmethod` `async`

stop() -> None

Stop playback and release all device resources.
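A custom sink follows the same pattern. The sketch below keeps queued chunks in memory, which can be handy in tests where no speaker is available. As before, the ABC is a local stand-in that mirrors the documented interface rather than the rtvoice class itself, and `RecordingOutput` is a hypothetical name.

```python
import asyncio
from abc import ABC, abstractmethod
from collections import deque


class AudioOutputDevice(ABC):
    """Local stand-in mirroring the documented interface (not the rtvoice class)."""

    @property
    @abstractmethod
    def is_playing(self) -> bool: ...

    @abstractmethod
    async def start(self) -> None: ...

    @abstractmethod
    async def stop(self) -> None: ...

    @abstractmethod
    async def play_chunk(self, chunk: bytes) -> None: ...

    @abstractmethod
    async def clear_buffer(self) -> None: ...


class RecordingOutput(AudioOutputDevice):
    """Hypothetical sink that holds queued chunks in memory."""

    def __init__(self) -> None:
        self._pending: deque = deque()
        self._started = False

    @property
    def is_playing(self) -> bool:
        return bool(self._pending)  # true while anything is still queued

    async def start(self) -> None:
        self._started = True

    async def stop(self) -> None:
        self._pending.clear()
        self._started = False

    async def play_chunk(self, chunk: bytes) -> None:
        if not self._started:
            raise RuntimeError("call start() before play_chunk()")
        self._pending.append(chunk)

    async def clear_buffer(self) -> None:
        self._pending.clear()  # drop queued audio, e.g. when the user interrupts


async def demo() -> tuple[bool, bool]:
    sink = RecordingOutput()
    await sink.start()
    await sink.play_chunk(b"\x00\x00" * 4)
    queued = sink.is_playing
    await sink.clear_buffer()
    after_clear = sink.is_playing
    await sink.stop()
    return queued, after_clear


result = asyncio.run(demo())
```

Here `is_playing` simply reports whether chunks remain queued. A sink backed by real hardware would also need to account for audio that has left the queue but is still being played.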
