
Choosing a model

How provider routing works — pick a provider by prefixing the model ID.

The gateway routes each request to a provider by reading the prefix of the model ID. There is no automatic fallback and no cross-provider load balancing: you choose the exact model, and the gateway calls that provider.

Model ID format

All model IDs follow the format <provider>/<model-name>:

gemini/gemini-2.5-flash
bedrock/nova-pro
bedrock/claude-haiku-4-5-20251001
kimi/kimi-k2.5
minimax/MiniMax-M2.5

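The gateway splits on the first / to select the provider, then forwards the rest as the upstream model name. For example, bedrock/claude-haiku-4-5-20251001 goes to the bedrock provider with the upstream model name claude-haiku-4-5-20251001.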
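The split rule can be sketched client-side, for instance to validate an ID before sending a request. This is a minimal illustration, not the gateway's own code: `parseModelId` and `ParsedModelId` are hypothetical names, and rejecting malformed IDs with an error is an assumption, since the gateway's behavior for invalid IDs is not described here.

```typescript
// Hypothetical helper mirroring the documented "split on the first /" rule.
interface ParsedModelId {
  provider: string;      // text before the first "/"
  upstreamModel: string; // everything after the first "/", kept as-is
}

function parseModelId(modelId: string): ParsedModelId {
  const slash = modelId.indexOf("/");
  // Assumption: an ID with no "/" or with an empty half is treated as invalid.
  if (slash <= 0 || slash === modelId.length - 1) {
    throw new Error(`Expected <provider>/<model-name>, got "${modelId}"`);
  }
  return {
    provider: modelId.slice(0, slash),
    // Only the first "/" separates the parts; later slashes stay in the model name.
    upstreamModel: modelId.slice(slash + 1),
  };
}

const parsed = parseModelId("bedrock/claude-haiku-4-5-20251001");
console.log(parsed.provider, parsed.upstreamModel);
```

Splitting with `indexOf` rather than `split("/")` keeps any later slashes in the upstream model name, which matches the "first /" wording above.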

Switching models

Because the provider is encoded in the model ID string, switching models or providers is a one-line change:

// Google Gemini …
model: 'gemini/gemini-2.5-flash'
// … or Moonshot Kimi — same request shape
model: 'kimi/kimi-k2.5'
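One way to keep that change to a single line is to pass the model ID in as a parameter and build the rest of the request once. The sketch below assumes a chat-style body with a `messages` array; `ChatRequest` and `buildRequest` are hypothetical, and only the `model` field is taken from this page.

```typescript
// Hypothetical request shape: only `model` is documented on this page.
interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

// Builds the same body for any provider; only the model ID varies.
function buildRequest(model: string, prompt: string): ChatRequest {
  return { model, messages: [{ role: "user", content: prompt }] };
}

const viaGemini = buildRequest("gemini/gemini-2.5-flash", "Hello");
const viaKimi = buildRequest("kimi/kimi-k2.5", "Hello");
console.log(viaGemini.model, viaKimi.model);
```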

Picking the right one

  • Cheapest general-purpose: gemini/gemini-2.5-flash
  • Long context: Gemini models (up to 1M+ tokens)
  • Anthropic Claude: via bedrock/claude-* (see Integrations → Anthropic)
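These recommendations could be captured as a small lookup so application code names a use case instead of a model ID. This is only a sketch: `UseCase`, `DEFAULT_MODEL`, and `pickModel` are hypothetical, and the concrete IDs chosen for long context and for Claude are assumptions picked from the example IDs on this page.

```typescript
// Hypothetical use-case labels; not part of the gateway API.
type UseCase = "cheap-general" | "long-context" | "claude";

const DEFAULT_MODEL: Record<UseCase, string> = {
  "cheap-general": "gemini/gemini-2.5-flash",
  // Assumption: any Gemini model qualifies; this specific pick is illustrative.
  "long-context": "gemini/gemini-2.5-flash",
  // Assumption: one concrete bedrock/claude-* ID taken from the examples above.
  "claude": "bedrock/claude-haiku-4-5-20251001",
};

function pickModel(useCase: UseCase): string {
  return DEFAULT_MODEL[useCase];
}

console.log(pickModel("claude"));
```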

Browse everything with live pricing and context windows in the Model catalog, and check Provider notes for per-model quirks.
