Pricing
How model prices map to what you're billed.
How prices work
Prices reported by GET /v1/models (and shown in the catalog) are upstream costs: what the gateway pays the provider. They are quoted in USD per 1M tokens, separately for prompt and completion tokens. You're billed at those rates.
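As a rough illustration of the per-1M-token arithmetic, the sketch below computes the cost of one request from its token counts. The function name, parameter names, and the example prices are made up for illustration; they are not fields or values from the API.

```python
from decimal import Decimal

# Prices are quoted per 1M tokens, so token counts are scaled by this factor.
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def request_cost_usd(prompt_tokens, completion_tokens,
                     prompt_price_per_1m, completion_price_per_1m):
    """Return the USD cost of one request.

    Hypothetical helper: prices are passed as strings or Decimals to avoid
    floating-point rounding on small amounts.
    """
    prompt_cost = (Decimal(prompt_tokens) * Decimal(prompt_price_per_1m)
                   / TOKENS_PER_PRICE_UNIT)
    completion_cost = (Decimal(completion_tokens) * Decimal(completion_price_per_1m)
                       / TOKENS_PER_PRICE_UNIT)
    return prompt_cost + completion_cost


# Example with made-up prices: $3.00 per 1M prompt tokens and
# $15.00 per 1M completion tokens.
cost = request_cost_usd(1200, 350, "3.00", "15.00")
print(cost)  # 0.00885
```

Prompt and completion tokens are priced separately, so a completion-heavy request can cost more than a longer prompt with a short answer.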
What you're charged
- Charges are computed from actual token usage returned by the provider.
- Balance is reserved before the call based on max_tokens, then reconciled to the real cost. See Credits & Billing.
- On a provider error, the reservation is released and no charge is made.
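The reserve, reconcile, and release flow described above can be sketched as follows. This is a minimal model and not the gateway's implementation: the Balance class, ProviderError, billed_call, and the exact reservation formula (estimated prompt cost plus max_tokens at the completion price) are all assumptions made for illustration. The text only states that the reservation is based on max_tokens.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)


class ProviderError(Exception):
    """Hypothetical error type standing in for a failed provider call."""


class Balance:
    """Hypothetical account balance with a separate reserved amount."""

    def __init__(self, available):
        self.available = Decimal(available)
        self.reserved = Decimal(0)

    def reserve(self, amount):
        if amount > self.available:
            raise ValueError("insufficient balance for reservation")
        self.available -= amount
        self.reserved += amount

    def settle(self, held, actual_cost):
        # Reconcile: return the hold, then charge only the real cost.
        self.reserved -= held
        self.available += held - actual_cost

    def release(self, held):
        # Provider error: return the full hold, charge nothing.
        self.reserved -= held
        self.available += held


def billed_call(balance, call_provider, prompt_tokens_estimate, max_tokens,
                prompt_price, completion_price):
    """Reserve a worst-case amount, call the provider, then reconcile.

    call_provider is a hypothetical callable returning the actual
    (prompt_tokens, completion_tokens) usage reported by the provider.
    """
    # Assumed reservation formula: worst case if all max_tokens are generated.
    hold = (Decimal(prompt_tokens_estimate) * prompt_price
            + Decimal(max_tokens) * completion_price) / PER_MILLION
    balance.reserve(hold)
    try:
        prompt_tokens, completion_tokens = call_provider()
    except ProviderError:
        balance.release(hold)
        raise
    actual = (Decimal(prompt_tokens) * prompt_price
              + Decimal(completion_tokens) * completion_price) / PER_MILLION
    balance.settle(hold, actual)
    return actual


# Example with made-up prices ($3.00 / $15.00 per 1M tokens) and usage.
balance = Balance("1.00")
charged = billed_call(balance, lambda: (1200, 350), 1200, 1000,
                      Decimal("3.00"), Decimal("15.00"))
print(charged, balance.available)  # 0.00885 0.99115
```

In this sketch a lower max_tokens means a smaller hold during the call, but the final charge depends only on the usage the provider actually reports.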