This is the REST API and WebSocket interface for the Creature Server. All endpoints are prefixed with the server’s base URL. The server communicates over HTTP/1.1 and WebSocket on a LAN.
Most endpoints accept and return application/json. A few accept raw binary audio or raw JSON strings. Async operations return a job ID immediately and report progress over the WebSocket.
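As a rough illustration of the JSON convention described above, the sketch below builds (but does not send) a request with Python's standard library. The base URL and the helper name build_json_request are assumptions made for this example; substitute your own server's address.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical base URL; replace with the address of your own server.
BASE_URL = "http://creature-server.local:8000"


def build_json_request(method: str, path: str,
                       body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an application/json request for the given path."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


list_creatures = build_json_request("GET", "/api/v1/creature")
# To actually send it: urllib.request.urlopen(list_creatures, timeout=5)
```

Building the request separately from sending it makes the payloads easy to inspect before a creature is attached.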
Creatures
Manage creature configurations. A creature’s JSON definition file is the source of truth — the database is a cache. Universe assignment is runtime-only state.
GET /api/v1/creature — List all creatures known to the server.
GET /api/v1/creature/{creatureId} — Get a single creature by its UUID.
POST /api/v1/creature — Upsert a creature’s configuration. Accepts a raw creature JSON config string. Returns the creature DTO, or 400 if the config is invalid.
POST /api/v1/creature/validate — Validate a creature configuration without saving it. Useful for checking a config before deploying it to a controller.
POST /api/v1/creature/register — Register a creature controller with a specific universe. This is how a controller announces itself to the server on startup.
{
"creature_config": "<raw creature JSON string>",
"universe": 1
}
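Because creature_config is a JSON string nested inside a JSON body, the config is effectively serialized twice. A minimal sketch of that encoding follows; the fields in example_config are hypothetical placeholders, not the real creature schema.

```python
import json


def build_register_body(creature_config: dict, universe: int) -> str:
    """Serialize a register request whose creature_config is itself a JSON string."""
    raw_config = json.dumps(creature_config)  # inner document, kept as a string
    return json.dumps({"creature_config": raw_config, "universe": universe})


# Hypothetical, heavily abbreviated config; real definition files have more fields.
example_config = {"id": "00000000-0000-0000-0000-000000000000", "name": "Beaky"}
register_body = build_register_body(example_config, universe=1)
```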
PATCH /api/v1/creature/{creatureId}/idle — Enable or disable the idle animation loop for a creature. Returns 409 if the creature isn’t registered to a universe.
{
"enabled": true
}
Animations
Animations are multi-track recordings of servo positions over time. They can be recorded with the joystick and stored permanently, or generated on the fly as ad-hoc animations that expire after 24 hours.
GET /api/v1/animation — List all stored animations.
GET /api/v1/animation/{animationId} — Get a single animation by its MongoDB ObjectId.
POST /api/v1/animation — Create or update a stored animation. Accepts a raw animation JSON string.
DELETE /api/v1/animation/{animationId} — Delete an animation and all of its tracks.
POST /api/v1/animation/play — Play a stored animation on a creature’s universe. If resumePlaylist is true, the active playlist resumes after this animation finishes.
{
"animation_id": "ObjectId string",
"universe": 1,
"resumePlaylist": false
}
POST /api/v1/animation/interrupt — Interrupt the currently playing animation with a new one. Uses the cooperative scheduler — the running animation yields gracefully. Same request format as /animation/play.
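Since play and interrupt share a request format, a client can build both from one helper. The sketch below is one way to do that; the function name and the placeholder ObjectId are assumptions for illustration only.

```python
from typing import Tuple


def build_play_request(animation_id: str, universe: int,
                       resume_playlist: bool = False,
                       interrupt: bool = False) -> Tuple[str, dict]:
    """Return (path, body) for playing, or interrupting with, a stored animation."""
    path = "/api/v1/animation/interrupt" if interrupt else "/api/v1/animation/play"
    body = {
        "animation_id": animation_id,
        "universe": universe,
        "resumePlaylist": resume_playlist,  # camelCase, as in the request shown above
    }
    return path, body


# Placeholder ObjectId string, not a real animation.
path, body = build_play_request("0123456789abcdef01234567", universe=1, interrupt=True)
```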
Ad-Hoc Animations
Ad-hoc animations are generated on the fly from text. The server calls ElevenLabs for TTS, generates lip sync data, blends it with a body animation, and plays the result. These are stored in a TTL collection and expire after 24 hours.
GET /api/v1/animation/ad-hoc — List all ad-hoc animations currently in the TTL collection.
GET /api/v1/animation/ad-hoc/{animationId} — Get a specific ad-hoc animation by its ID.
POST /api/v1/animation/ad-hoc — Create and immediately play an ad-hoc speech animation. This is an async job — returns 202 with a job ID and reports progress over the WebSocket.
{
"creature_id": "uuid",
"text": "Hello, I'm Beaky!",
"resume_playlist": false
}
POST /api/v1/animation/ad-hoc/prepare — Create an ad-hoc animation but don’t play it yet. Use this when timing matters — prepare ahead of time, then trigger playback manually. Same request format as above. Returns 202 with a job ID; the animation ID is delivered via the WebSocket when the job completes.
POST /api/v1/animation/ad-hoc/play — Play a previously prepared ad-hoc animation.
{
"animation_id": "ObjectId string",
"resume_playlist": false
}
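The prepare-then-play flow can be sketched as a small function that turns a job completion message into the play request body. The message envelope used here (command, payload, job_id, animation_id) is an assumption; this document does not specify the WebSocket message layout, so check the actual JobComplete format before relying on it.

```python
from typing import Optional


def play_body_from_job_result(message: dict, job_id: str,
                              resume_playlist: bool = False) -> Optional[dict]:
    """Return the /animation/ad-hoc/play body once our prepare job has completed.

    Assumed (hypothetical) message shape:
    {"command": "JobComplete", "payload": {"job_id": ..., "animation_id": ...}}
    """
    if message.get("command") != "JobComplete":
        return None
    payload = message.get("payload", {})
    if payload.get("job_id") != job_id:
        return None  # completion for some other job
    animation_id = payload.get("animation_id")
    if animation_id is None:
        return None
    return {"animation_id": animation_id, "resume_playlist": resume_playlist}
```

A client would call this for each incoming message after the 202 response and POST the returned body to /api/v1/animation/ad-hoc/play when it is not None.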
Lip Sync Generation
POST /api/v1/animation/generate-lipsync — Regenerate lip sync data for an existing animation. Runs as an async job, returns 202 with a job ID.
{
"animation_id": "ObjectId string"
}
Streaming Sessions
Streaming sessions enable real-time conversation. Text arrives sentence by sentence (typically from an LLM), and the server pipelines TTS, lip sync generation, and playback so the creature starts talking within a couple of seconds — even while the LLM is still generating the rest of the response.
POST /api/v1/animation/ad-hoc-stream/start — Open a new streaming session. The server loads the creature’s config and prepares for incoming text. Returns a session_id.
{
"creature_id": "uuid",
"resume_playlist": false
}
POST /api/v1/animation/ad-hoc-stream/text — Send a sentence to the session. Each chunk kicks off an async pipeline: TTS, lip sync, blend, queue for playback. Returns chunks_received count.
{
"session_id": "uuid",
"text": "This is one sentence."
}
POST /api/v1/animation/ad-hoc-stream/finish — Signal that no more text is coming. The playback thread drains the remaining queue. Returns the final animation_id.
{
"session_id": "uuid"
}
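To make the session lifecycle concrete, here is a minimal sketch that splits a block of text into sentences and lists the text and finish calls for an already opened session. The regex splitter and both function names are assumptions for this example; in a real conversation the sentences would arrive incrementally from the LLM rather than from one string.

```python
import re
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def stream_requests(session_id: str, text: str) -> List[Tuple[str, dict]]:
    """Return the (path, body) calls to make after /ad-hoc-stream/start."""
    prefix = "/api/v1/animation/ad-hoc-stream"
    calls = [(f"{prefix}/text", {"session_id": session_id, "text": sentence})
             for sentence in split_sentences(text)]
    calls.append((f"{prefix}/finish", {"session_id": session_id}))
    return calls
```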
Sounds
Manage sound files stored on the server. These are audio files (MP3, WAV, OGG) that can be played through the creature’s speaker independently of animations.
GET /api/v1/sound — List all stored sound files.
GET /api/v1/sound/{filename} — Download a sound file. Returns binary audio with the appropriate content type (audio/mpeg, audio/wav, or audio/ogg).
GET /api/v1/sound/ad-hoc — List ad-hoc generated sounds (TTS output stored in the TTL collection).
GET /api/v1/sound/ad-hoc/{filename} — Download an ad-hoc sound file. Returns audio/wav.
POST /api/v1/sound/play — Queue a sound file for playback on the creature’s speaker.
{
"file_name": "squawk.wav"
}
POST /api/v1/sound/generate-lipsync — Generate lip sync data from a stored sound file using whisper.cpp. Runs as an async job, returns 202 with a job ID.
{
"sound_file": "hello.wav",
"allow_overwrite": false
}
POST /api/v1/sound/generate-lipsync/upload — Upload a WAV file and generate lip sync data synchronously. Send raw WAV binary as the request body with a filename query parameter. Returns lip sync mouth cue data.
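The upload endpoint differs from the JSON endpoints in that the body is raw bytes and the filename travels in the query string. A sketch of building such a request follows; the base URL and the audio/wav Content-Type header are assumptions, since this document does not say which header the server expects.

```python
import urllib.parse
import urllib.request

# Hypothetical base URL; replace with the address of your own server.
BASE_URL = "http://creature-server.local:8000"


def build_lipsync_upload(wav_bytes: bytes, filename: str) -> urllib.request.Request:
    """Build (but do not send) a raw WAV upload with a filename query parameter."""
    query = urllib.parse.urlencode({"filename": filename})
    url = f"{BASE_URL}/api/v1/sound/generate-lipsync/upload?{query}"
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},  # assumed header value
        method="POST",
    )
```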
Playlists
Playlists are ordered sequences of animations that play one after another on a universe. They’re useful for setting up a creature to perform a scripted show.
GET /api/v1/playlist — List all playlists.
GET /api/v1/playlist/id/{playlistId} — Get a playlist by its UUID.
POST /api/v1/playlist — Create or update a playlist. Accepts a raw playlist JSON string.
POST /api/v1/playlist/start — Start playing a playlist on a universe.
{
"universe": 1,
"playlist_id": "uuid"
}
POST /api/v1/playlist/stop — Stop the currently running playlist on a universe.
{
"universe": 1
}
GET /api/v1/playlist/status — Get the status of all playlists across all universes.
GET /api/v1/playlist/status/{universe} — Get the playlist status for a specific universe.
Voice
Voice generation via ElevenLabs. Each creature has its own voice settings in its definition file.
GET /api/v1/voice/list-available — List all voices available from ElevenLabs.
GET /api/v1/voice/subscription — Get the current ElevenLabs API subscription status (remaining characters, tier, etc.).
POST /api/v1/voice — Generate a sound file from text using a specific voice.
{
"text": "Polly wants a cracker!",
"voice_name": "Beaky",
"creature_id": "uuid (optional)"
}
Speech-to-Text
Transcription powered by whisper.cpp. The Creature Listener uses this to offload transcription from the Pi 5 to the server.
POST /api/v1/stt/transcribe — Transcribe raw audio to text. Accepts 16 kHz mono float32 PCM audio as a raw binary request body.
Response:
{
"transcript": "Hey Beaky, what's for dinner?",
"audio_duration_sec": 2.5,
"transcription_time_ms": 340
}
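Microphones commonly deliver signed 16-bit samples, so a client may need to convert them before calling the transcribe endpoint. The sketch below shows one such conversion. Little-endian byte order and scaling by 32768 are assumptions; this document only specifies 16 kHz mono float32 PCM.

```python
import struct
from typing import Sequence

SAMPLE_RATE_HZ = 16_000  # from the endpoint description
BYTES_PER_SAMPLE = 4     # one float32 per mono sample


def int16_to_float32_pcm(samples: Sequence[int]) -> bytes:
    """Scale signed 16-bit samples to [-1.0, 1.0) and pack as little-endian float32."""
    floats = [s / 32768.0 for s in samples]
    return struct.pack(f"<{len(floats)}f", *floats)


def duration_sec(pcm: bytes) -> float:
    """Length of a float32 mono buffer in seconds at 16 kHz."""
    return len(pcm) / BYTES_PER_SAMPLE / SAMPLE_RATE_HZ
```

The resulting bytes would be sent as the raw request body, and duration_sec can be compared against the audio_duration_sec field of the response as a sanity check.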
Metrics
GET /api/v1/metric/counters — Get system performance counters — frames processed, events dispatched, WebSocket messages sent, etc.
Debug
Utility endpoints for development and debugging. Each one broadcasts a test message, in most cases a cache invalidation, to connected clients.
GET /api/v1/debug/cache-invalidate/creature — Broadcast a creature cache invalidation to all connected clients.
GET /api/v1/debug/cache-invalidate/animation — Broadcast an animation cache invalidation to all connected clients.
GET /api/v1/debug/cache-invalidate/playlist — Broadcast a playlist cache invalidation to all connected clients.
GET /api/v1/debug/playlist/update — Test playlist update broadcast to connected clients.
System
GET /api/v1/health — Health check endpoint. Returns {"status": "OK"}.
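A monitoring script could interpret the health response along these lines. This is a minimal sketch; the function name is hypothetical and the check relies only on the {"status": "OK"} body shown above.

```python
import json


def is_healthy(status_code: int, body: bytes) -> bool:
    """Return True only for a 200 response whose JSON body has status "OK"."""
    if status_code != 200:
        return False
    try:
        return json.loads(body).get("status") == "OK"
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return False
```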
WebSocket
The WebSocket provides real-time bidirectional communication between the server and its clients (the Creature Console, Creature Controller, etc.).
GET /api/v1/websocket — Upgrade to a WebSocket connection.
Client → Server Messages
- Notice — General notice messages from clients
- StreamFrame — DMX frame data for streaming playback
- BoardSensorReport — Board-level sensor data from a Raspberry Pi (temperature, voltage, etc.)
- MotorSensorReport — Motor sensor data from a Raspberry Pi (current draw, position feedback, etc.)
Server → Client Messages
- Database — Database change notifications
- LogMessage — Server log messages
- ServerCounters — Periodic system metrics
- VirtualStatusLights — Status light state updates (the virtual version of the physical LEDs from the Pi hat era)
- UpsertCreature — Creature configuration change notifications
- CacheInvalidation — Cache invalidation signals (creature, animation, or playlist)
- PlaylistStatus — Playlist state changes
- JobProgress — Progress updates for async jobs (lip sync generation, ad-hoc animations)
- JobComplete — Job completion notifications with results
- IdleStateChanged — Idle loop enable/disable notifications
- CreatureActivity — Creature activity reports (what each creature is currently doing)
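Clients typically route incoming messages by type. The sketch below shows one way to do that with a small dispatcher. The envelope layout ({"command": ..., "payload": ...}) is an assumption made for this example, as this document lists the message types but not their wire format.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], None]


class MessageRouter:
    """Route decoded WebSocket text frames to handlers keyed by message type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, message_type: str, handler: Handler) -> None:
        self._handlers[message_type] = handler

    def dispatch(self, raw: str) -> bool:
        """Decode one frame; return True if a registered handler consumed it."""
        message = json.loads(raw)
        # Assumed envelope: {"command": "<type>", "payload": {...}}
        handler = self._handlers.get(message.get("command"))
        if handler is None:
            return False
        handler(message.get("payload", {}))
        return True


progress_log: list = []
router = MessageRouter()
router.on("JobProgress", progress_log.append)
```

Unhandled types such as LogMessage are simply ignored here; a real client would likely register a handler for each type it cares about.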
Status Codes
The server uses these HTTP status codes consistently:
- 200 — Success
- 202 — Accepted (async job started, check WebSocket for progress)
- 400 — Bad Request (invalid input — client’s fault)
- 403 — Forbidden (path traversal attempt on file endpoints)
- 404 — Not Found
- 409 — Conflict (e.g., creature not registered to a universe)
- 422 — Unprocessable Entity (missing required fields for processing)
- 500 — Internal Server Error
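A client can encode the table above directly. The following sketch maps each documented code to a short description and flags the two cases that need special handling; the helper names are hypothetical.

```python
# Descriptions paraphrase the status code list above.
STATUS_MEANINGS = {
    200: "success",
    202: "accepted; watch the WebSocket for job progress",
    400: "bad request; fix the input",
    403: "forbidden; path traversal attempt on a file endpoint",
    404: "not found",
    409: "conflict; e.g. creature not registered to a universe",
    422: "unprocessable; required fields are missing",
    500: "internal server error",
}


def describe_status(code: int) -> str:
    """Return the documented meaning, or a marker for codes not listed above."""
    return STATUS_MEANINGS.get(code, "undocumented status")


def is_async_job(code: int) -> bool:
    """202 means the result arrives later over the WebSocket."""
    return code == 202


def is_client_error(code: int) -> bool:
    """4xx codes indicate a problem with the request itself."""
    return 400 <= code < 500
```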