Token Kiosk

Provider notes

Some providers and models behave differently from the generic OpenAI request shape. The documented differences are listed below.

Kimi (kimi/kimi-k2.5)

kimi-k2.5 ignores temperature, top_p, and penalty parameters.

MiniMax

MiniMax models ignore presence_penalty and frequency_penalty parameters.
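
Because ignored parameters fail silently, a client may want to flag them before sending a request. The following is a minimal sketch covering the Kimi and MiniMax notes above, not part of the gateway itself. The function name `ignored_params` is hypothetical, the `minimax/` model prefix is an assumption (this page does not list MiniMax model IDs), and "penalty parameters" for kimi-k2.5 is assumed to mean `presence_penalty` and `frequency_penalty`.

```python
# Parameters each model is documented to ignore.
# Assumption: "penalty parameters" for kimi-k2.5 means the two
# OpenAI-style penalties listed here.
KIMI_IGNORED = {"temperature", "top_p", "presence_penalty", "frequency_penalty"}
MINIMAX_IGNORED = {"presence_penalty", "frequency_penalty"}


def ignored_params(model: str, payload: dict) -> list[str]:
    """Return the payload keys that the given model will silently ignore."""
    if model == "kimi/kimi-k2.5":
        ignored = KIMI_IGNORED
    elif model.startswith("minimax/"):  # assumed prefix, not confirmed here
        ignored = MINIMAX_IGNORED
    else:
        ignored = set()
    return sorted(key for key in payload if key in ignored)


if __name__ == "__main__":
    request = {"temperature": 0.2, "top_p": 0.9, "max_tokens": 256}
    for name in ignored_params("kimi/kimi-k2.5", request):
        print(f"warning: {name} has no effect on kimi/kimi-k2.5")
```

Warning rather than deleting the keys keeps the request unchanged, which seems reasonable since these parameters are ignored, not rejected.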

Bedrock — kimi-k2-thinking

bedrock/kimi-k2-thinking is a reasoning model that uses an internal thinking budget. Use max_tokens ≥ 1000 to ensure output is produced.

Bedrock — gpt-oss-20b

bedrock/gpt-oss-20b requires max_tokens ≥ 500 to produce output.
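
The two Bedrock minimums above can be enforced on the client side before a request is sent. This is a sketch under stated assumptions: the helper name `with_token_floor` is hypothetical, the payload is assumed to be an OpenAI-style dict with `model` and `max_tokens` keys, and filling in the minimum when `max_tokens` is absent is a design choice of this example rather than documented gateway behavior.

```python
# Minimum max_tokens values taken from the notes above.
MIN_MAX_TOKENS = {
    "bedrock/kimi-k2-thinking": 1000,
    "bedrock/gpt-oss-20b": 500,
}


def with_token_floor(payload: dict) -> dict:
    """Return a payload whose max_tokens meets the model's documented minimum.

    Payloads for models without a documented minimum are returned unchanged.
    """
    floor = MIN_MAX_TOKENS.get(payload.get("model"))
    if floor is None:
        return payload
    requested = payload.get("max_tokens")
    # Assumption: a missing max_tokens is treated like a too-small value.
    if requested is None or requested < floor:
        return {**payload, "max_tokens": floor}
    return payload


if __name__ == "__main__":
    raised = with_token_floor({"model": "bedrock/gpt-oss-20b", "max_tokens": 100})
    print(raised["max_tokens"])
```

The helper returns a new dict instead of mutating its argument, so the caller's original request stays available for logging.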

Not supported anywhere

Requests that include the thinking or reasoning_effort parameters are rejected with HTTP 400. Strip both parameters before calling the gateway, even for reasoning models, which manage their thinking budget internally.
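
Stripping the two parameters can be done with a small filter applied to every outgoing request. This is a minimal sketch, assuming the request body is a flat OpenAI-style dict. The function name `strip_unsupported` is hypothetical.

```python
# Request parameters the gateway rejects with HTTP 400, per the note above.
UNSUPPORTED = ("thinking", "reasoning_effort")


def strip_unsupported(payload: dict) -> dict:
    """Return a copy of the payload without the unsupported parameters."""
    return {key: value for key, value in payload.items() if key not in UNSUPPORTED}


if __name__ == "__main__":
    request = {
        "model": "bedrock/kimi-k2-thinking",
        "max_tokens": 1500,
        "reasoning_effort": "high",
    }
    print(sorted(strip_unsupported(request)))
```

Applying the filter unconditionally, rather than per model, matches the note that these parameters are not supported anywhere.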
