API Reference

This is the REST API and WebSocket interface for the Creature Server. All endpoints are prefixed with the server’s base URL. The server communicates over HTTP/1.1 and WebSocket on the local network.

Most endpoints accept and return application/json. A few accept raw binary audio or raw JSON strings. Async operations return a job ID immediately and report progress over the WebSocket.
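As a rough illustration of these conventions, the sketch below builds a JSON request with only the Python standard library. The base URL is a placeholder assumption, and the helper names build_json_request and send are hypothetical; nothing is sent until the request is handed to send.

```python
import json
import urllib.request

# Assumption: placeholder address; substitute your server's real base URL.
BASE_URL = "http://creature-server.local:8000"


def build_json_request(method, path, body=None, base_url=BASE_URL):
    """Build (but do not send) an application/json request for the API."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )


def send(request, timeout=10):
    """Send a prepared request and return (status, parsed JSON or None).

    urllib raises urllib.error.HTTPError for 4xx and 5xx responses, so
    callers that care about 400/404/409 should catch that exception.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        return response.status, (json.loads(raw) if raw else None)
```

For example, send(build_json_request("GET", "/api/v1/creature")) would list creatures, assuming the server is reachable at the placeholder address.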

Creatures

Manage creature configurations. A creature’s JSON definition file is the source of truth — the database is a cache. Universe assignment is runtime-only state.

GET /api/v1/creature — List all creatures known to the server.

GET /api/v1/creature/{creatureId} — Get a single creature by its UUID.

POST /api/v1/creature — Upsert a creature’s configuration. Accepts a raw creature JSON config string. Returns the creature DTO, or 400 if the config is invalid.

POST /api/v1/creature/validate — Validate a creature configuration without saving it. Useful for checking a config before deploying it to a controller.

POST /api/v1/creature/register — Register a creature controller with a specific universe. This is how a controller announces itself to the server on startup.

{
  "creature_config": "<raw creature JSON string>",
  "universe": 1
}
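Because creature_config is a JSON string carried inside a JSON object, the config ends up serialized twice. The sketch below shows one way to build that body; build_register_body is a hypothetical helper, and the example config passed to it is illustrative rather than a real creature definition.

```python
import json


def build_register_body(creature_config, universe):
    """Return the JSON body for POST /api/v1/creature/register.

    creature_config is a dict here. It is serialized to a string first
    because the endpoint expects the raw creature JSON as a string value,
    not as a nested object.
    """
    return json.dumps({
        "creature_config": json.dumps(creature_config),
        "universe": universe,
    })


# Illustrative config only; real definition files contain more fields.
body = build_register_body({"name": "Beaky"}, universe=1)
```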

PATCH /api/v1/creature/{creatureId}/idle — Enable or disable the idle animation loop for a creature. Returns 409 if the creature isn’t registered to a universe.

{
  "enabled": true
}

Animations

Animations are multi-track recordings of servo positions over time. They can be recorded with the joystick and stored permanently, or generated on the fly as ad-hoc animations that expire after 24 hours.

GET /api/v1/animation — List all stored animations.

GET /api/v1/animation/{animationId} — Get a single animation by its MongoDB ObjectId.

POST /api/v1/animation — Create or update a stored animation. Accepts a raw animation JSON string.

DELETE /api/v1/animation/{animationId} — Delete an animation and all of its tracks.

POST /api/v1/animation/play — Play a stored animation on a creature’s universe. If resumePlaylist is true, the active playlist resumes after this animation finishes.

{
  "animation_id": "ObjectId string",
  "universe": 1,
  "resumePlaylist": false
}

POST /api/v1/animation/interrupt — Interrupt the currently playing animation with a new one. Uses the cooperative scheduler — the running animation yields gracefully. Same request format as /animation/play.

Ad-Hoc Animations

Ad-hoc animations are generated on the fly from text. The server calls ElevenLabs for TTS, generates lip sync data, blends it with a body animation, and plays the result. These are stored in a TTL collection and expire after 24 hours.

GET /api/v1/animation/ad-hoc — List all ad-hoc animations currently in the TTL collection.

GET /api/v1/animation/ad-hoc/{animationId} — Get a specific ad-hoc animation by its ID.

POST /api/v1/animation/ad-hoc — Create and immediately play an ad-hoc speech animation. This is an async job — returns 202 with a job ID and reports progress over the WebSocket.

{
  "creature_id": "uuid",
  "text": "Hello, I'm Beaky!",
  "resume_playlist": false
}

POST /api/v1/animation/ad-hoc/prepare — Create an ad-hoc animation but don’t play it yet. Use this when timing matters: prepare ahead of time, then trigger playback manually. Same request format as POST /api/v1/animation/ad-hoc. Returns 202 with a job ID; the animation ID is delivered via the WebSocket when the job completes.

POST /api/v1/animation/ad-hoc/play — Play a previously prepared ad-hoc animation.

{
  "animation_id": "ObjectId string",
  "resume_playlist": false
}
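The prepare-then-play flow can be sketched as below. The transport is injected so the example stays independent of any HTTP or WebSocket library: post and wait_for_job are hypothetical callables. The response keys "job_id" (in the 202 response) and "animation_id" (in the JobComplete result) are assumptions about the payload shape, not confirmed field names.

```python
def prepare_then_play(post, wait_for_job, creature_id, text,
                      resume_playlist=False):
    """Prepare an ad-hoc animation, wait for the job, then play it.

    post(path, body) -> dict
        Hypothetical helper: POSTs JSON and returns the parsed response.
    wait_for_job(job_id) -> dict
        Hypothetical helper: blocks until the WebSocket delivers
        JobComplete for job_id and returns the job's result.
    """
    accepted = post("/api/v1/animation/ad-hoc/prepare", {
        "creature_id": creature_id,
        "text": text,
        "resume_playlist": resume_playlist,
    })
    # Assumption: the 202 response carries the job ID under "job_id".
    result = wait_for_job(accepted["job_id"])
    # Assumption: the JobComplete result carries "animation_id".
    animation_id = result["animation_id"]

    # The caller decides when this line runs, which is the point of prepare.
    post("/api/v1/animation/ad-hoc/play", {
        "animation_id": animation_id,
        "resume_playlist": resume_playlist,
    })
    return animation_id
```

In practice the play call would usually be deferred until a cue fires; it is inlined here only to keep the sketch short.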

Lip Sync Generation

POST /api/v1/animation/generate-lipsync — Regenerate lip sync data for an existing animation. Runs as an async job, returns 202 with a job ID.

{
  "animation_id": "ObjectId string"
}

Streaming Sessions

Streaming sessions enable real-time conversation. Text arrives sentence by sentence (typically from an LLM), and the server pipelines TTS, lip sync generation, and playback so the creature starts talking within a couple of seconds — even while the LLM is still generating the rest of the response.

POST /api/v1/animation/ad-hoc-stream/start — Open a new streaming session. The server loads the creature’s config and prepares for incoming text. Returns a session_id.

{
  "creature_id": "uuid",
  "resume_playlist": false
}

POST /api/v1/animation/ad-hoc-stream/text — Send a sentence to the session. Each chunk kicks off an async pipeline: TTS, lip sync, blend, queue for playback. Returns the chunks_received count.

{
  "session_id": "uuid",
  "text": "This is one sentence."
}

POST /api/v1/animation/ad-hoc-stream/finish — Signal that no more text is coming. The playback thread drains the remaining queue. Returns the final animation_id.

{
  "session_id": "uuid"
}
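The start / text / finish sequence above might be driven from an LLM token stream roughly as follows. This is a sketch under assumptions: post is a hypothetical callable that POSTs JSON and returns the parsed response, the response keys are taken to match the names session_id and animation_id used above, and the regular-expression sentence splitter is a simple stand-in rather than the server's own logic.

```python
import re

# Sentence boundary: terminal punctuation followed by whitespace.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")

STREAM = "/api/v1/animation/ad-hoc-stream"


def stream_speech(post, creature_id, text_chunks, resume_playlist=False):
    """Feed text to a streaming session one complete sentence at a time."""
    session = post(f"{STREAM}/start", {
        "creature_id": creature_id,
        "resume_playlist": resume_playlist,
    })
    session_id = session["session_id"]

    buffer = ""
    for chunk in text_chunks:
        buffer += chunk
        parts = _BOUNDARY.split(buffer)
        # All parts except the last end in punctuation and are complete.
        for sentence in parts[:-1]:
            if sentence.strip():
                post(f"{STREAM}/text",
                     {"session_id": session_id, "text": sentence.strip()})
        buffer = parts[-1]

    # Flush whatever is left once the source is exhausted.
    if buffer.strip():
        post(f"{STREAM}/text",
             {"session_id": session_id, "text": buffer.strip()})

    finished = post(f"{STREAM}/finish", {"session_id": session_id})
    return finished["animation_id"]
```

Buffering until a sentence boundary keeps each request a whole sentence, which is what the text endpoint expects, even when the upstream model emits partial words.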

Sounds

Manage sound files stored on the server. These are audio files (MP3, WAV, OGG) that can be played through the creature’s speaker independently of animations.

GET /api/v1/sound — List all stored sound files.

GET /api/v1/sound/{filename} — Download a sound file. Returns binary audio with the appropriate content type (audio/mpeg, audio/wav, or audio/ogg).

GET /api/v1/sound/ad-hoc — List ad-hoc generated sounds (TTS output stored in the TTL collection).

GET /api/v1/sound/ad-hoc/{filename} — Download an ad-hoc sound file. Returns audio/wav.

POST /api/v1/sound/play — Queue a sound file for playback on the creature’s speaker.

{
  "file_name": "squawk.wav"
}

POST /api/v1/sound/generate-lipsync — Generate lip sync data from a stored sound file using whisper.cpp. Runs as an async job, returns 202 with a job ID.

{
  "sound_file": "hello.wav",
  "allow_overwrite": false
}

POST /api/v1/sound/generate-lipsync/upload — Upload a WAV file and generate lip sync data synchronously. Send raw WAV binary as the request body with a filename query parameter. Returns lip sync mouth cue data.
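A minimal sketch of building the upload request with the Python standard library follows. The base URL is a placeholder, the helper name is hypothetical, and the audio/wav Content-Type header is an assumption; the section above only specifies a raw WAV body and a filename query parameter.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder address; substitute your server's real base URL.
BASE_URL = "http://creature-server.local:8000"


def build_lipsync_upload_request(wav_bytes, filename, base_url=BASE_URL):
    """Build (but do not send) the synchronous lip sync upload request."""
    # urlencode escapes spaces and other unsafe characters in the filename.
    query = urllib.parse.urlencode({"filename": filename})
    url = f"{base_url}/api/v1/sound/generate-lipsync/upload?{query}"
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},  # assumption, see lead-in
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it; the response body should then contain the mouth cue data.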

Playlists

Playlists are ordered sequences of animations that play one after another on a universe. They’re useful for setting up a creature to perform a scripted show.

GET /api/v1/playlist — List all playlists.

GET /api/v1/playlist/id/{playlistId} — Get a playlist by its UUID.

POST /api/v1/playlist — Create or update a playlist. Accepts a raw playlist JSON string.

POST /api/v1/playlist/start — Start playing a playlist on a universe.

{
  "universe": 1,
  "playlist_id": "uuid"
}

POST /api/v1/playlist/stop — Stop the currently running playlist on a universe.

{
  "universe": 1
}

GET /api/v1/playlist/status — Get the status of all playlists across all universes.

GET /api/v1/playlist/status/{universe} — Get the playlist status for a specific universe.

Voice

Voice generation via ElevenLabs. Each creature has its own voice settings in its definition file.

GET /api/v1/voice/list-available — List all voices available from ElevenLabs.

GET /api/v1/voice/subscription — Get the current ElevenLabs API subscription status (remaining characters, tier, etc.).

POST /api/v1/voice — Generate a sound file from text using a specific voice.

{
  "text": "Polly wants a cracker!",
  "voice_name": "Beaky",
  "creature_id": "uuid (optional)"
}

Speech-to-Text

Transcription powered by whisper.cpp. The Creature Listener uses this to offload transcription from the Pi 5 to the server.

POST /api/v1/stt/transcribe — Transcribe raw audio to text. Accepts 16 kHz mono float32 PCM audio as a raw binary request body.

Response:
{
  "transcript": "Hey Beaky, what's for dinner?",
  "audio_duration_sec": 2.5,
  "transcription_time_ms": 340
}
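Many capture pipelines produce signed 16-bit samples, so a client may need to convert them before posting. The sketch below shows one way to produce the float32 body; it assumes the input is already 16 kHz mono and that the server expects little-endian floats, neither of which is stated above. The function names are hypothetical.

```python
import struct

SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 4  # float32


def pcm16_to_float32_bytes(samples):
    """Convert signed 16-bit mono samples to raw float32 PCM bytes.

    Each sample is scaled into the range [-1.0, 1.0).
    Assumption: little-endian byte order ("<" in the struct format).
    """
    floats = [s / 32768.0 for s in samples]
    return struct.pack(f"<{len(floats)}f", *floats)


def duration_seconds(body):
    """Audio duration implied by a float32 mono body at 16 kHz."""
    return len(body) / BYTES_PER_SAMPLE / SAMPLE_RATE_HZ
```

duration_seconds gives a quick client-side sanity check against the audio_duration_sec value in the response.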

Metrics

GET /api/v1/metric/counters — Get system performance counters — frames processed, events dispatched, WebSocket messages sent, etc.

Debug

Utility endpoints for development and debugging. Each one broadcasts a test message (a cache invalidation or a playlist update) to connected clients.

GET /api/v1/debug/cache-invalidate/creature — Broadcast a creature cache invalidation to all connected clients.

GET /api/v1/debug/cache-invalidate/animation — Broadcast an animation cache invalidation to all connected clients.

GET /api/v1/debug/cache-invalidate/playlist — Broadcast a playlist cache invalidation to all connected clients.

GET /api/v1/debug/playlist/update — Send a test playlist update broadcast to all connected clients.

System

GET /api/v1/health — Health check endpoint. Returns {"status": "OK"}.

WebSocket

The WebSocket provides real-time bidirectional communication between the server and its clients (the Creature Console, Creature Controller, etc.).

GET /api/v1/websocket — Upgrade to a WebSocket connection.

Client → Server Messages

  • Notice — General notice messages from clients
  • StreamFrame — DMX frame data for streaming playback
  • BoardSensorReport — Board-level sensor data from a Raspberry Pi (temperature, voltage, etc.)
  • MotorSensorReport — Motor sensor data from a Raspberry Pi (current draw, position feedback, etc.)

Server → Client Messages

  • Database — Database change notifications
  • LogMessage — Server log messages
  • ServerCounters — Periodic system metrics
  • VirtualStatusLights — Status light state updates (the virtual version of the physical LEDs from the Pi hat era)
  • UpsertCreature — Creature configuration change notifications
  • CacheInvalidation — Cache invalidation signals (creature, animation, or playlist)
  • PlaylistStatus — Playlist state changes
  • JobProgress — Progress updates for async jobs (lip sync generation, ad-hoc animations)
  • JobComplete — Job completion notifications with results
  • IdleStateChanged — Idle loop enable/disable notifications
  • CreatureActivity — Creature activity reports (what each creature is currently doing)
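
A client typically routes incoming messages by type. The sketch below shows one way to do that. The wire format is not described in this section, so the envelope used here (a JSON object with "command" and "payload" keys) is purely an assumption for illustration, as is the MessageRouter class itself.

```python
import json


class MessageRouter:
    """Route server messages to handlers keyed by message type."""

    def __init__(self):
        self._handlers = {}

    def on(self, message_type, handler):
        """Register a handler for one message type, e.g. "JobComplete"."""
        self._handlers[message_type] = handler

    def dispatch(self, raw):
        """Decode one text frame and call the matching handler.

        Returns True if a handler ran, False if the type was unhandled.
        Assumption: frames look like {"command": ..., "payload": ...}.
        """
        message = json.loads(raw)
        handler = self._handlers.get(message.get("command"))
        if handler is None:
            return False
        handler(message.get("payload"))
        return True
```

A client would call dispatch for each text frame received from its WebSocket library, registering handlers only for the message types it cares about.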

Status Codes

The server uses these HTTP status codes consistently:

  • 200 — Success
  • 202 — Accepted (async job started, check WebSocket for progress)
  • 400 — Bad Request (invalid input — client’s fault)
  • 403 — Forbidden (path traversal attempt on file endpoints)
  • 404 — Not Found
  • 409 — Conflict (e.g., creature not registered to a universe)
  • 422 — Unprocessable Entity (missing required fields for processing)
  • 500 — Internal Server Error
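
On the client side, the codes above can be folded into a few coarse outcomes. The sketch below is one possible grouping; the function name and the outcome labels are illustrative choices, not part of the API.

```python
def classify_status(status):
    """Map an HTTP status code from the list above to a coarse outcome."""
    if status == 200:
        return "success"
    if status == 202:
        # Async job started; progress arrives over the WebSocket.
        return "accepted"
    if status in (400, 403, 404, 409, 422):
        # The request itself needs to change before retrying.
        return "client_error"
    if status == 500:
        return "server_error"
    # Anything else is outside the documented set.
    return "unexpected"
```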