Live The Console

One token. The whole fleet, wrangled.

Send model:"auto" to one OpenAI-compatible endpoint. Host.Rodeo routes every request to the fit-for-purpose source (cheapest that clears the bar, local when it matters, paid only when needed) and absorbs every provider's outages and rate limits before your agent ever feels them. The savings, source tier, and fallback proof ride back on every receipt.

free local · owned paid

Read the manifest

Access is invite-only while we dogfood.

Clint — your host on the floor

“Name’s Clint. I ride herd on the models so your agents don’t get bucked off. Pull up a rail — I’ll show you the whole fleet working, and I won’t hide a thing.”

01 / Capabilities

Everything behind the one token

One OpenAI-compatible endpoint, one bearer token — and a whole platform under model:"auto". Here's what's wrangling on your behalf.

Smart auto, no knobs.

A local classifier reads each request, a cascade tries cheap-then-verifies, and a learned router picks the model. You never hand-pick a model again.
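The cheap-then-verify step can be pictured with a minimal sketch. It is not Host.Rodeo's implementation: the `Source` type, the `cost_rank` field, and the `verify` callback are all hypothetical names chosen for illustration, and the classifier and learned router are left out.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Source:
    name: str                        # hypothetical source label
    cost_rank: int                   # lower means cheaper (assumed ordering key)
    generate: Callable[[str], str]   # stand-in for a real model call


def cascade(prompt: str,
            sources: Sequence[Source],
            verify: Callable[[str, str], bool]) -> Tuple[str, str]:
    """Try sources cheapest-first and return (source_name, answer)
    for the first answer that passes the verifier."""
    ordered = sorted(sources, key=lambda s: s.cost_rank)
    if not ordered:
        raise ValueError("no sources to try")
    answer = ""
    for source in ordered:
        answer = source.generate(prompt)
        if verify(prompt, answer):
            return source.name, answer
    # Nothing passed verification: keep the most expensive source's answer.
    return ordered[-1].name, answer
```

The verifier is the part that decides whether a cheap answer is good enough; in this sketch it is just a callback, so any check (a length test, a second model, a schema validator) can be plugged in.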

Local & private — your call.

Set your account posture once: Cost (cheapest), Local-aware (the default ladder: free and local before paid), or Private (your data never leaves eligible self-hosted hardware; it fails closed, with no silent third-party fallback). Override any single call with a header.
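As a rough illustration of how the three postures could order a fleet, here is a sketch under stated assumptions. The `Source` fields, the posture strings, and the `NoEligibleSource` error are hypothetical; the only behavior taken from the text above is the ladder order and the fail-closed rule for Private.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Ladder order for the Local-aware posture: free, then local, then paid.
TIER_ORDER = {"free": 0, "local": 1, "paid": 2}


@dataclass(frozen=True)
class Source:
    name: str
    tier: str          # "free", "local", or "paid"
    self_hosted: bool  # assumption: marks hardware eligible for Private
    price: float       # hypothetical cost per request


class NoEligibleSource(RuntimeError):
    """Raised instead of falling back to a third party."""


def candidates(posture: str, sources: Sequence[Source]) -> List[Source]:
    """Return the sources a request may use, in the order to try them."""
    if posture == "private":
        eligible = [s for s in sources if s.self_hosted]
        if not eligible:
            # Fail closed: no silent fallback to anything not self-hosted.
            raise NoEligibleSource("private posture: no self-hosted source available")
        return sorted(eligible, key=lambda s: s.price)
    if posture == "cost":
        return sorted(sources, key=lambda s: s.price)
    if posture == "local-aware":
        return sorted(sources, key=lambda s: (TIER_ORDER[s.tier], s.price))
    raise ValueError(f"unknown posture: {posture!r}")
```

Raising an error for Private, rather than returning an empty list, makes the fail-closed behavior hard to ignore by accident.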

It doesn't go down.

Invisible failover across a diverse fleet, plus an engine that predicts each provider's rate-limit and latency walls and routes around them before they hit. Outages absorbed, not forwarded.
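One simple way to picture "route around the wall before it hits" is a sliding-window counter per provider combined with ordinary failover. This is only a sketch: the real prediction engine is not described here, and `RateWindow`, its `headroom` parameter, and `call_with_failover` are hypothetical names.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, List, Sequence, Tuple


class RateWindow:
    """Tracks recent request times for one provider against an assumed limit."""

    def __init__(self, limit: int, window_s: float = 60.0, headroom: float = 0.9):
        self.limit = limit
        self.window_s = window_s
        self.headroom = headroom       # fraction of the limit treated as "the wall"
        self.sent: Deque[float] = deque()

    def near_wall(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window_s:
            self.sent.popleft()
        return len(self.sent) >= self.limit * self.headroom

    def record(self, now: float) -> None:
        self.sent.append(now)


def call_with_failover(prompt: str,
                       providers: Sequence[Tuple[str, Callable[[str], str]]],
                       windows: Dict[str, RateWindow],
                       clock: Callable[[], float] = time.monotonic
                       ) -> Tuple[str, str, List[str]]:
    """Return (provider_name, answer, failed_over_from)."""
    skipped: List[str] = []
    for name, fn in providers:
        now = clock()
        if windows[name].near_wall(now):
            skipped.append(name)   # predicted rate-limit wall: do not even try
            continue
        windows[name].record(now)
        try:
            return name, fn(prompt), skipped
        except Exception:
            skipped.append(name)   # outage or error: absorb it and move on
    raise RuntimeError(f"all providers unavailable: {skipped}")
```

The third return value is the list of providers that were skipped or failed, which is the kind of information a "failed over from" receipt field could carry.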

It gets smarter.

The learned router improves from real outcomes — poison-resistant by construction: it learns from a frozen first-party benchmark, never from gameable live traffic. Gated; never auto-promoted.

All inference, one door.

Chat, embeddings, rerank, images, speech, and speech-to-text, all through the same token, same failover, same receipt. Pin a model for stable vectors; let auto pick when you don't care.

Built for agents.

One token self-describes the whole fleet at /v1/contract, and a two-way feedback spine lets your app flag problems — adjudicated autonomously, pushed back when resolved. Plus a bounded research loop behind model:"auto-deep-research".

Proof on every call.

Counterfactual savings, the model + source that served you, the tier (free, local, or paid), and what we failed over from — all on X-Rodeo-* headers. Paid actual-cost reads "unknown" rather than inflate a number.
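A client can collect the receipt without knowing every header name in advance by filtering on the X-Rodeo- prefix. The sketch below assumes nothing about the exact header names; the names used in the usage note (`X-Rodeo-Tier`, `X-Rodeo-Actual-Cost`) are hypothetical. The only documented behavior it relies on is that a cost value may read "unknown".

```python
from decimal import Decimal
from typing import Dict, Mapping, Optional


def rodeo_receipt(headers: Mapping[str, str]) -> Dict[str, str]:
    """Collect every X-Rodeo-* response header, keyed by its lowercased suffix."""
    prefix = "x-rodeo-"
    receipt: Dict[str, str] = {}
    for key, value in headers.items():
        lowered = key.lower()   # HTTP header names are case-insensitive
        if lowered.startswith(prefix):
            receipt[lowered[len(prefix):]] = value
    return receipt


def parse_cost(value: str) -> Optional[Decimal]:
    """Return the cost as a Decimal, or None when the header reads "unknown"."""
    if value.strip().lower() == "unknown":
        return None
    return Decimal(value)
```

For example, `rodeo_receipt({"X-Rodeo-Tier": "local", "X-Rodeo-Actual-Cost": "unknown"})` yields a dict with the keys `tier` and `actual-cost`, and `parse_cost` maps the "unknown" value to `None` instead of guessing a number.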

02 / Quickstart

Start in three lines

No SDK lock-in, no model picker, no fallback plumbing. One call, and the whole fleet is behind it.

one call · any OpenAI client

curl https://api.host.rodeo/v1/chat/completions \
  -H "Authorization: Bearer $KEY" \
  -d '{"model":"auto","messages":[{"role":"user","content":"hi"}]}'

Point any OpenAI SDK at https://api.host.rodeo/v1. Steer with one header — X-Rodeo-Prefer: latency | quality | cost or X-Rodeo-Sensitivity: secret. Discover everything — models, modalities, your account profile — at GET /v1/contract.
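The same call from Python, using only the standard library, might look like the sketch below. The base URL, the `model` value, and the two steering headers come from the text above; the `build_chat_request` helper and the explicit `Content-Type: application/json` header are assumptions (the latter is what OpenAI-compatible endpoints usually expect).

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.host.rodeo/v1"
PREFER_VALUES = ("latency", "quality", "cost")


def build_chat_request(prompt: str,
                       api_key: str,
                       prefer: Optional[str] = None,
                       secret: bool = False) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for model "auto"."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",   # assumption, see lead-in
    }
    if prefer is not None:
        if prefer not in PREFER_VALUES:
            raise ValueError(f"prefer must be one of {PREFER_VALUES}")
        headers["X-Rodeo-Prefer"] = prefer
    if secret:
        headers["X-Rodeo-Sensitivity"] = "secret"
    body = json.dumps({
        "model": "auto",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions", data=body, headers=headers, method="POST"
    )

# To send: urllib.request.urlopen(build_chat_request("hi", key, prefer="cost"))
```

Building the request separately from sending it keeps the steering logic easy to inspect and test without network access.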

03 / Savings

The odometers

What auto-routing saved, and what it would have cost at a premium anchor model. Real money, proven per call — and it compounds with every request.

Saved so far

$0.000000

Summed across every request served on a free or local source instead of paid. Real money, full precision — it climbs as the fleet works.

What it would’ve cost $0.000000 — the counterfactual at a premium anchor model (not money spent)

Not yet audited. These figures are computed from our own receipts; they are not reconciled to a provider billing API yet (auditable: false). We’ll show the badge the day they are. We would rather under-claim than overstate.
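The two odometers can be reproduced from receipts with a short calculation. This is a sketch, not the gateway's accounting: the anchor prices, the `Receipt` fields, and the assumption that free and local requests cost nothing are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable, Tuple

# Hypothetical premium-anchor prices, in USD per million tokens.
ANCHOR_INPUT_PER_MTOK = Decimal("3.00")
ANCHOR_OUTPUT_PER_MTOK = Decimal("15.00")
MTOK = Decimal(1_000_000)
SIX_PLACES = Decimal("0.000001")


@dataclass(frozen=True)
class Receipt:
    tier: str            # "free", "local", or "paid"
    input_tokens: int
    output_tokens: int


def counterfactual(r: Receipt) -> Decimal:
    """What this request would have cost at the anchor model."""
    return (Decimal(r.input_tokens) * ANCHOR_INPUT_PER_MTOK
            + Decimal(r.output_tokens) * ANCHOR_OUTPUT_PER_MTOK) / MTOK


def odometers(receipts: Iterable[Receipt]) -> Tuple[Decimal, Decimal]:
    """Return (saved, would_have_cost), both rounded to six decimal places."""
    items = list(receipts)
    would_have_cost = sum((counterfactual(r) for r in items), Decimal(0))
    # Assumption: free and local requests cost $0, so the whole
    # counterfactual counts as saved. Paid requests add nothing here.
    saved = sum((counterfactual(r) for r in items if r.tier in ("free", "local")),
                Decimal(0))
    return saved.quantize(SIX_PLACES), would_have_cost.quantize(SIX_PLACES)
```

Using `Decimal` rather than floats keeps the six-decimal display exact, and an empty receipt list produces `0.000000` for both values.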

Requests served

—

The fleet is warming up. Numbers appear as real traffic arrives.

Success rate

—

04 / Fleet

The live registry

Every source the gateway can reach right now, generated live from the self-describing manifest. Sources are grouped by what they can do and whether they are free, local, or paid.

— sources · — available now · — local tracked

05 / Governance

The platform, governing itself

A live feed of the autonomous loop at work — corroborating a failure, acting on roster drift, turning away a manipulation attempt. No human in the queue. No other gateway shows you this.

Action log · adjudicated autonomously

— decisions

Watching the bus — the first decision will appear here the moment the loop acts.

How to read it: the adjudicator is intentionally redacted, and a caller’s claim is treated as evidence to verify — never a command to obey. A flag means a human review was queued; the fleet kept running.

Clint — the host on the floor

“The fleet’s saddled and the gate’s open. Send me anything and I’ll bring you back the best answer on the cheapest honest path — and tell you exactly what it cost and what it saved.”

No knobs required. Point your app at model: "auto" and let me pick. Need it kept close? Mark it sensitive and it never leaves our own hardware.

06 / Account

Your account

You’re signed in.

You’re in. Welcome to the floor.

Your account is active. Point your app at model: "auto" and Clint wrangles the rest — free first, then local, then paid only when forced.

What it’s worth to you

—

The proof, on your own traffic — real money saved, what stayed local, what stayed private, and the outages you never felt.

People

— on the roster

Feature flags

live · takes effect immediately

Router intelligence

— models · by route

What auto-routing actually did, per intent and tier — measured from real receipts. Sorted by volume. This is the “smart auto, measured” view.

Intent	Tier	Model	Requests	Success	Avg ms	Saved

Observability only. Learned routing is gated to the frozen-benchmark firewall — the router can’t promote a model off live traffic alone; it has to clear the benchmark first.

Learned routing

—

What auto learned — the ranked model order it earned per intent, scored against the frozen benchmark, not live traffic. This is the policy the router uses now.

The frozen-benchmark firewall. A model can only rank here by clearing a frozen, held-out benchmark — never off live traffic. A candidate is promoted deliberately, by hand, from the CLI.
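The firewall rule can be expressed as a small gate. This is an illustrative sketch only: the `Candidate` fields, the comparison against an incumbent score, and the `operator_confirmed` flag are hypothetical stand-ins for the benchmark check and the manual CLI promotion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    model: str
    benchmark_score: float    # score on the frozen, held-out benchmark
    live_success_rate: float  # observed on live traffic; never consulted below


def may_promote(candidate: Candidate,
                incumbent_score: float,
                operator_confirmed: bool) -> bool:
    """A candidate is promoted only if it beats the incumbent on the frozen
    benchmark AND an operator explicitly confirms it. Live traffic is ignored."""
    clears_benchmark = candidate.benchmark_score > incumbent_score
    return clears_benchmark and operator_confirmed
```

Because `live_success_rate` never enters the decision, a model that looks perfect on live traffic still cannot rank without clearing the benchmark, and a model that clears it still waits for a deliberate promotion.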

Audit trail