Realtime API Events
Speak
request.speak.text
Make the robot say something using text-to-speech.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | ✅ | — | The text to speak |
| abort | boolean | ❌ | false | Abort any earlier planned or ongoing speech. If false, new speech is queued |
| monitor_words | boolean | ❌ | false | Whether to send back response.speak.word while speaking |
request.speak.audio
Make the robot play/say some audio from a URL. The audio must be in WAV format (not MP3).
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | ✅ | — | The url from which to play audio |
| text | string | ❌ | "AUDIO" | The corresponding text (for display purposes) |
| lipsync | boolean | ❌ | true | Whether to perform lipsync |
| abort | boolean | ❌ | false | Whether to abort any earlier planned or ongoing speech |
request.speak.stop
Make the robot stop speaking and abort any speech that is planned or currently being generated.
response.speak.start
The robot has started speaking.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | ✅ | — | The synthesized text |
| gen_time | number | ✅ | — | Time to synthesize (ms) |
response.speak.end
The robot has stopped speaking.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | ✅ | — | The synthesized text (what was actually said if aborted) |
| aborted | boolean | ❌ | false | Whether the speech was stopped prematurely |
| failed | boolean | ❌ | false | Whether the speech synthesis failed |
response.speak.word
Sent while speaking, if monitor_words is true.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| word | string | ✅ | — | The word spoken |
| index | number | ✅ | — | The index of the word spoken |
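As an illustration, the sketch below sends a request.speak.text event and prints the resulting response.speak.word and response.speak.end events. Only the event names and parameters come from the tables above; the WebSocket URL and the "event" envelope field are assumptions made for the example.

```python
# Sketch: send request.speak.text and monitor word-level progress.
# Assumed (not from this reference): the ws:// endpoint and the "event"
# field carrying the event name in each JSON message.
import asyncio
import json
import websockets

async def main():
    async with websockets.connect("ws://localhost:9000/api") as ws:  # hypothetical endpoint
        await ws.send(json.dumps({
            "event": "request.speak.text",
            "text": "Hello! Nice to meet you.",
            "abort": True,           # cancel any queued or ongoing speech first
            "monitor_words": True,   # ask for response.speak.word events
        }))
        while True:
            msg = json.loads(await ws.recv())
            if msg.get("event") == "response.speak.word":
                print(f"word {msg['index']}: {msg['word']}")
            elif msg.get("event") == "response.speak.end":
                print("done, aborted:", msg.get("aborted", False))
                break

asyncio.run(main())
```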
Speak (Streaming)
request.speak.audio.start
Start speaking from streaming audio.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| sample_rate | int | ❌ | 16000 | The sample rate of the audio data |
| lipsync | boolean | ❌ | true | Whether to add automatic lipsync |
request.speak.audio.data
Send audio data for streaming speech.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| audio | string | ✅ | — | Base64-encoded audio data, 16-bit, mono, little-endian |
request.speak.audio.end
Marks the end of audio data transmission. This should be sent after all audio chunks have been sent.
request.speak.stop
Stop speaking immediately. This command can be used to halt audio playback at any time.
response.speak.end
This event is sent back when all audio has been played.
response.speak.audio.buffer
This event is sent back with the current audio buffer status while audio is being played.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| played | int | ❌ | — | The number of bytes of audio played |
| received | int | ❌ | — | The number of bytes of audio received |
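A minimal sketch of the streaming sequence, assuming a local 16-bit mono WAV file and the same hypothetical WebSocket endpoint and "event" envelope field as above; the raw PCM chunking and base64 encoding follow the parameter tables in this subsection.

```python
# Sketch: stream a local 16-bit mono WAV file as speech audio.
# Assumed: the ws:// endpoint, the "event" field, and that speech.wav
# is 16-bit mono PCM (little-endian, as WAV stores it).
import asyncio
import base64
import json
import wave
import websockets

async def main():
    async with websockets.connect("ws://localhost:9000/api") as ws:  # hypothetical endpoint
        with wave.open("speech.wav", "rb") as wav:
            await ws.send(json.dumps({
                "event": "request.speak.audio.start",
                "sample_rate": wav.getframerate(),
                "lipsync": True,
            }))
            while True:
                frames = wav.readframes(4096)   # raw 16-bit little-endian PCM
                if not frames:
                    break
                await ws.send(json.dumps({
                    "event": "request.speak.audio.data",
                    "audio": base64.b64encode(frames).decode("ascii"),
                }))
        await ws.send(json.dumps({"event": "request.speak.audio.end"}))
        # Wait for playback to finish, printing buffer status along the way.
        while True:
            msg = json.loads(await ws.recv())
            if msg.get("event") == "response.speak.audio.buffer":
                print("played", msg.get("played"), "of", msg.get("received"), "bytes")
            elif msg.get("event") == "response.speak.end":
                break

asyncio.run(main())
```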
Listen
request.listen.config
Configure speech recognition.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| languages | list | ❌ | ["en-US"] | List of languages to listen for |
| phrases | list | ❌ | [] | List of words or phrases to listen extra carefully for |
request.listen.start
Make the robot listen for speech. If it is already listening, it will reset the current result.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| partial | boolean | ❌ | false | Whether to produce results while the user is speaking |
| concat | boolean | ❌ | true | Whether to concatenate results during the same listening session |
| stop_no_speech | boolean | ❌ | true | Stop listening if the user is silent for longer than no_speech_timeout |
| stop_robot_start | boolean | ❌ | true | Stop listening when the robot starts speaking |
| stop_user_end | boolean | ❌ | true | Stop listening when end-of-speech is detected |
| resume_robot_end | boolean | ❌ | false | Resume listening when the robot stops speaking |
| no_speech_timeout | number | ❌ | 8.0 | Timeout to use for stop_no_speech (seconds) |
| end_speech_timeout | number | ❌ | 1.0 | Amount of silence needed to detect end-of-speech (seconds) |
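The sketch below builds the two request payloads a client might send to configure and start recognition. Parameter names come from the tables above; the "event" envelope field and the example language codes and phrases are assumptions.

```python
# Sketch: configure recognition for two languages, then start listening
# with partial results and a shorter silence timeout.
import json

configure = {
    "event": "request.listen.config",      # assumed envelope field
    "languages": ["en-US", "sv-SE"],
    "phrases": ["checkout", "receipt"],    # phrases to listen extra carefully for
}
start = {
    "event": "request.listen.start",
    "partial": True,            # emit response.hear.partial while the user speaks
    "no_speech_timeout": 5.0,   # stop after 5 s of silence
}
print(json.dumps(configure, indent=2))
print(json.dumps(start, indent=2))
```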
request.listen.stop
Force the robot to stop listening.
response.listen.start
The robot has started listening for speech.
response.listen.end
The robot has stopped listening for speech.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| cause | string | ✅ | — | The reason for stopping: 'stopped', 'robot_speak', 'speech_end', 'silence_timeout' |
response.hear.start
The user has started speaking, or has resumed speaking if listening continued.
response.hear.end
The user has stopped speaking and the speech has been recognized.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | ✅ | — | Recognized speech |
response.hear.partial
The robot has recognized partial speech from the user. These events are only sent if partial results are requested.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | ✅ | — | Recognized speech |
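A sketch of how a client might consume these listening events from an already open connection; `ws` is assumed to be a connected WebSocket client, and the "event" field name is an assumption.

```python
# Sketch: consume listening-related events until listening ends.
# `ws` is assumed to be an already connected WebSocket whose recv()
# returns one JSON-encoded event per call.
import json

async def handle_listening(ws):
    while True:
        msg = json.loads(await ws.recv())
        event = msg.get("event")
        if event == "response.hear.partial":
            print("partial:", msg["text"])
        elif event == "response.hear.end":
            print("heard:", msg["text"])
        elif event == "response.listen.end":
            print("stopped listening, cause:", msg["cause"])
            return
```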
Voice
request.voice.config
Set the current voice, using the voice ID, name, language, gender, or provider (or a combination of these).
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| voice_id | string | ❌ | — | Voice ID |
| name | string | ❌ | — | Voice name (partial match, case-insensitive) |
| provider | string | ❌ | — | Voice provider (partial match, case-insensitive) |
| language | string | ❌ | — | Voice language (e.g. en-US) |
| gender | string | ❌ | — | Voice gender (male/female) |
| input_language | boolean | ❌ | true | Set the input language to match the voice |
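For example, a client might select a voice by language and gender only; the sketch below builds such a payload (the "event" envelope field is an assumption, the selection fields come from the table above).

```python
# Sketch: pick a female en-GB voice and switch the input language accordingly.
import json

set_voice = {
    "event": "request.voice.config",   # assumed envelope field
    "language": "en-GB",
    "gender": "female",
    "input_language": True,            # set the input language to match the voice
}
print(json.dumps(set_voice, indent=2))
```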
request.voice.status
Gets the current and available voices.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| voice_id | boolean | ❌ | true | Whether to include the current voice ID in the response |
| voice_list | boolean | ❌ | true | Whether to include the list of available voices in the response |
response.voice.status
Returns current and available voices.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| voice_id | string | ❌ | null | Current voice id |
| voice_list | list | ❌ | null | List of available voices |
Attention
request.attend.user
Make the robot attend to a user. If the user is lost, the robot will attend to nobody; if the user comes back later, the robot will attend to that user again. If 'closest' is specified, the robot will always attend to the closest user.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| user_id | string | ❌ | closest | The ID of the user to attend, or 'closest' |
| slack_pitch | number | ❌ | 15 | Max difference (deg) between head pitch and gaze pitch |
| slack_yaw | number | ❌ | 5 | Max difference (deg) between head yaw and gaze yaw |
| slack_timeout | number | ❌ | 3000 | Max time (ms) head direction can diverge from gaze (-1 is infinite) |
| speed | string | ❌ | medium | Speed of head movement (xslow, slow, medium, fast, xfast) |
request.attend.location
Make the robot attend to a specific location, as specified in meters, relative to the robot.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | number | ✅ | — | Horizontal position |
| y | number | ✅ | — | Vertical position |
| z | number | ✅ | — | Distance from robot |
| slack_pitch | number | ❌ | 15 | Max difference (deg) between head pitch and gaze pitch |
| slack_yaw | number | ❌ | 5 | Max difference (deg) between head yaw and gaze yaw |
| slack_timeout | number | ❌ | 3000 | Max time (ms) head direction can diverge from gaze (-1 is infinite) |
| speed | string | ❌ | medium | Speed of head movement (xslow, slow, medium, fast, xfast) |
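Two example attention payloads are sketched below: one tracking the closest user and one targeting a fixed location one meter in front of the robot (coordinates in meters, relative to the robot, per the table above). The "event" envelope field is an assumption.

```python
# Sketch: attend to the closest user, or to a fixed point in space.
import json

attend_closest = {
    "event": "request.attend.user",   # assumed envelope field
    "user_id": "closest",
    "speed": "fast",
}
attend_point = {
    "event": "request.attend.location",
    "x": 0.0,    # horizontal position (m)
    "y": 0.0,    # vertical position (m)
    "z": 1.0,    # one meter in front of the robot
}
print(json.dumps(attend_closest, indent=2))
print(json.dumps(attend_point, indent=2))
```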
request.attend.nobody
Make the robot attend to nobody.
response.attend.status
The robot's attention status has changed.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| target | string | ✅ | — | The target attention (nobody, location, closest, user-id) |
| current | string | ✅ | — | The current attention (nobody, location, user-id) |
Gestures
request.gesture.start
Make the robot perform a gesture.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | ✅ | — | Name of the gesture (see API for values) |
| intensity | number | ❌ | 1.0 | Intensity of the gesture |
| duration | number | ❌ | 1.0 | Duration of gesture |
| monitor | boolean | ❌ | false | Whether to receive events when gesture starts and ends |
response.gesture.start
Received when the gesture starts (sent only if monitor is true).
response.gesture.end
Received when the gesture ends (sent only if monitor is true).
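A minimal sketch of triggering a gesture and waiting for its end event: the gesture name used here is a placeholder (see the API for valid names), `ws` is assumed to be a connected WebSocket, and the "event" field is an assumption.

```python
# Sketch: perform a gesture and wait until it has finished.
import json

async def perform_nod(ws):
    await ws.send(json.dumps({
        "event": "request.gesture.start",   # assumed envelope field
        "name": "Nod",                      # placeholder gesture name
        "intensity": 0.8,
        "monitor": True,                    # ask for response.gesture.start/.end
    }))
    while True:
        msg = json.loads(await ws.recv())
        if msg.get("event") == "response.gesture.end":
            return
```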
Face
request.face.params
Set facial animation parameters directly.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| params | object | ✅ | — | Face parameters |
request.face.headpose
Control the head pose of the robot, as specified in degrees along yaw, pitch and roll.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| yaw | number | ❌ | 0 | Yaw (sideways) angle in degrees |
| pitch | number | ❌ | 0 | Pitch (up/down) angle in degrees |
| roll | number | ❌ | 0 | Roll angle in degrees |
| relative | boolean | ❌ | false | Whether to use relative or absolute control |
| speed | string | ❌ | medium | Speed of head movement (xslow, slow, medium, fast, xfast) |
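As an example, the payload below turns the head 20 degrees sideways and tilts it 10 degrees, using absolute control at slow speed. The "event" envelope field is an assumption; the sign conventions for pitch and roll are not specified in this reference.

```python
# Sketch: set an absolute head pose in degrees.
import json

head_pose = {
    "event": "request.face.headpose",   # assumed envelope field
    "yaw": 20,          # sideways angle in degrees
    "pitch": -10,       # up/down angle in degrees (sign convention not specified here)
    "roll": 0,
    "relative": False,  # absolute control
    "speed": "slow",
}
print(json.dumps(head_pose, indent=2))
```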
request.face.config
Set the current mask and character (face_id), and/or face visibility.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| face_id | string | ❌ | KEEP | face ID |
| visibility | boolean | ❌ | true | Turn on or off face visibility |
| microexpressions | boolean | ❌ | true | Turn on or off facial microexpressions |
| blinking | boolean | ❌ | true | Turn on or off blinking |
| head_sway | boolean | ❌ | false | Turn on or off automatic minor head sways |
request.face.status
Gets the current and available masks and characters (face_id).
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| face_id | boolean | ❌ | true | Whether to include the current face ID in the response |
| face_list | boolean | ❌ | true | Whether to include the list of available faces in the response |
request.face.reset
Resets all facial parameters to default.
response.face.status
Returns current and available masks and characters (face_id).
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| face_id | string | ❌ | null | Current face id |
| face_list | list | ❌ | null | List of available faces |
LED
request.led.set
Set the color of the LED.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| color | string | ✅ | — | Color in hex format |
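A one-line example payload is sketched below. The exact hex notation (with or without a leading "#") is not specified above, so treat the value as a guess; the "event" envelope field is an assumption.

```python
# Sketch: set the LED to red.
import json

set_led = {
    "event": "request.led.set",   # assumed envelope field
    "color": "#FF0000",           # hex color; exact format is a guess
}
print(json.dumps(set_led, indent=2))
```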
Users
request.users.once
Detect users once.
request.users.start
Start detecting users.
request.users.stop
Stop detecting users.
response.users.data
Sent when new users are detected, users are lost, or users move.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| users | list | ✅ | — | List of user objects, sorted by proximity |
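The sketch below starts user detection and prints the closest user from each response.users.data event. The fields of a user object are not documented above, so only the list itself is used; `ws` and the "event" field are assumptions.

```python
# Sketch: start user detection and report the closest user.
import json

async def watch_users(ws):
    await ws.send(json.dumps({"event": "request.users.start"}))  # assumed envelope field
    while True:
        msg = json.loads(await ws.recv())
        if msg.get("event") == "response.users.data":
            users = msg["users"]            # sorted by proximity
            print(f"{len(users)} user(s) visible")
            if users:
                print("closest user:", users[0])
```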
Audio
request.audio.start
Start capturing audio from the microphone and/or speaker. By default, only the microphone is captured.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| sample_rate | int | ❌ | 16000 | The sample rate of the audio data to receive |
| microphone | boolean | ❌ | true | Whether to capture the microphone |
| speaker | boolean | ❌ | false | Whether to capture the speaker |
request.audio.stop
Stop capturing audio from the microphone and/or speaker.
response.audio.data
Audio data captured from the microphone and/or speaker.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| microphone | string | ❌ | — | Base64-encoded audio data, 16-bit, mono, little-endian |
| speaker | string | ❌ | — | Base64-encoded audio data, 16-bit, mono, little-endian |
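The sketch below captures a few seconds of microphone audio and collects the raw PCM bytes. The base64 payload decodes to 16-bit mono little-endian samples, per the table above; `ws` and the "event" field are assumptions.

```python
# Sketch: record microphone audio into a raw PCM buffer.
import base64
import json

async def record_microphone(ws, seconds=5, sample_rate=16000):
    await ws.send(json.dumps({
        "event": "request.audio.start",   # assumed envelope field
        "sample_rate": sample_rate,
        "microphone": True,
    }))
    pcm = bytearray()
    target = seconds * sample_rate * 2    # 2 bytes per 16-bit sample
    while len(pcm) < target:
        msg = json.loads(await ws.recv())
        if msg.get("event") == "response.audio.data" and "microphone" in msg:
            pcm.extend(base64.b64decode(msg["microphone"]))
    await ws.send(json.dumps({"event": "request.audio.stop"}))
    return bytes(pcm)
```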
Camera
request.camera.once
Capture one frame from the camera.
request.camera.start
Start capturing video from the camera.
request.camera.stop
Stop capturing video from the camera.
response.camera.data
A captured camera frame.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| image | string | ❌ | — | Base64-encoded jpeg image |
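Finally, a sketch of grabbing a single frame and saving it to disk: the base64-encoded JPEG payload is documented in the table above, while `ws` and the "event" field are assumptions.

```python
# Sketch: capture one camera frame and write it to a JPEG file.
import base64
import json

async def save_snapshot(ws, path="frame.jpg"):
    await ws.send(json.dumps({"event": "request.camera.once"}))  # assumed envelope field
    while True:
        msg = json.loads(await ws.recv())
        if msg.get("event") == "response.camera.data":
            with open(path, "wb") as f:
                f.write(base64.b64decode(msg["image"]))
            return path
```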