2026-04-16 10:55:58 +00:00
# Model Compatibility Guide
This document describes model-specific handling in the OpenAI-compatible provider. When adding new models or providers, review this guide to ensure proper compatibility.
## Table of Contents
- [Overview ](#overview )
- [Model-Specific Handling ](#model-specific-handling )
- [Kimi Models (is_error Exclusion) ](#kimi-models-is_error-exclusion )
- [Reasoning Models (Tuning Parameter Stripping) ](#reasoning-models-tuning-parameter-stripping )
- [GPT-5 (max_completion_tokens) ](#gpt-5-max_completion_tokens )
2026-05-15 10:45:18 +09:00
- [Qwen and Kimi Models (DashScope Routing) ](#qwen-and-kimi-models-dashscope-routing )
2026-05-15 10:33:54 +09:00
- [Custom Gateway Slugs and Extra Body Parameters ](#custom-gateway-slugs-and-extra-body-parameters )
2026-04-16 10:55:58 +00:00
- [Implementation Details ](#implementation-details )
- [Adding New Models ](#adding-new-models )
- [Testing ](#testing )
## Overview
The `openai_compat.rs` provider translates Claude Code's internal message format to OpenAI-compatible chat completion requests. Different models have varying requirements for:
- Tool result message fields (`is_error` )
- Sampling parameters (temperature, top_p, etc.)
- Token limit fields (`max_tokens` vs `max_completion_tokens` )
- Base URL routing
2026-05-15 10:33:54 +09:00
- Provider-specific extra body parameters (`web_search_options` , `parallel_tool_calls` , local-server switches, etc.)
- Provider diagnostics for status/doctor-style surfaces
2026-04-16 10:55:58 +00:00
## Model-Specific Handling
### Kimi Models (is_error Exclusion)
**Affected models:** `kimi-k2.5` , `kimi-k1.5` , `kimi-moonshot` , and any model with `kimi` in the name (case-insensitive)
**Behavior:** The `is_error` field is **excluded** from tool result messages.
**Rationale:** Kimi models (via Moonshot AI and DashScope) reject the `is_error` field with a 400 Bad Request error:
```json
{
"error": {
"type": "invalid_request_error",
"message": "Unknown field: is_error"
}
}
```
**Detection:**
```rust
fn model_rejects_is_error_field(model: & str) -> bool {
let lowered = model.to_ascii_lowercase();
let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
2026-05-15 10:23:37 +09:00
canonical.starts_with("kimi")
2026-04-16 10:55:58 +00:00
}
```
**Testing:** See `model_rejects_is_error_field_detects_kimi_models` and related tests in `openai_compat.rs` .
---
### Reasoning Models (Tuning Parameter Stripping)
**Affected models:**
- OpenAI: `o1` , `o1-*` , `o3` , `o3-*` , `o4` , `o4-*`
- xAI: `grok-3-mini`
- Alibaba DashScope: `qwen-qwq-*` , `qwq-*` , `qwen3-*-thinking`
**Behavior:** The following tuning parameters are **stripped** from requests:
- `temperature`
- `top_p`
- `frequency_penalty`
- `presence_penalty`
**Rationale:** Reasoning/chain-of-thought models use fixed sampling strategies and reject these parameters with 400 errors.
**Exception:** `reasoning_effort` is included for compatible models when explicitly set.
**Detection:**
```rust
fn is_reasoning_model(model: & str) -> bool {
let canonical = model.to_ascii_lowercase()
.rsplit('/')
.next()
.unwrap_or(model);
canonical.starts_with("o1")
|| canonical.starts_with("o3")
|| canonical.starts_with("o4")
|| canonical == "grok-3-mini"
|| canonical.starts_with("qwen-qwq")
|| canonical.starts_with("qwq")
|| (canonical.starts_with("qwen3") & & canonical.contains("-thinking"))
}
```
**Testing:** See `reasoning_model_strips_tuning_params` , `grok_3_mini_is_reasoning_model` , and `qwen_reasoning_variants_are_detected` tests.
---
### GPT-5 (max_completion_tokens)
**Affected models:** All models starting with `gpt-5`
**Behavior:** Uses `max_completion_tokens` instead of `max_tokens` in the request payload.
**Rationale:** GPT-5 models require the `max_completion_tokens` field. Legacy `max_tokens` causes request validation failures:
```json
{
"error": {
"message": "Unknown field: max_tokens"
}
}
```
**Implementation:**
```rust
let max_tokens_key = if wire_model.starts_with("gpt-5") {
"max_completion_tokens"
} else {
"max_tokens"
};
```
**Testing:** See `gpt5_uses_max_completion_tokens_not_max_tokens` and `non_gpt5_uses_max_tokens` tests.
---
2026-05-15 10:23:37 +09:00
### Qwen and Kimi Models (DashScope Routing)
2026-04-16 10:55:58 +00:00
2026-05-15 10:23:37 +09:00
**Affected models:** All models with `qwen` or `kimi` prefixes, including `qwen/` , `qwen-` , `kimi/` , and `kimi-` forms.
2026-04-16 10:55:58 +00:00
2026-05-15 10:23:37 +09:00
**Behavior:** Routed to DashScope (`https://dashscope.aliyuncs.com/compatible-mode/v1` ) rather than ambient-credential fallback providers. Known routing prefixes are stripped before sending the wire model.
2026-04-16 10:55:58 +00:00
2026-05-15 10:23:37 +09:00
**Rationale:** Qwen and Kimi compatible-mode models are hosted through Alibaba Cloud's DashScope service, not OpenAI or Anthropic.
2026-04-16 10:55:58 +00:00
**Configuration:**
```rust
pub const DEFAULT_DASHSCOPE_BASE_URL: & str = "https://dashscope.aliyuncs.com/compatible-mode/v1";
```
**Authentication:** Uses `DASHSCOPE_API_KEY` environment variable.
**Note:** Some Qwen models are also reasoning models (see [Reasoning Models ](#reasoning-models-tuning-parameter-stripping ) above) and receive both treatments.
2026-05-15 10:23:37 +09:00
---
2026-05-15 10:33:54 +09:00
### Custom Gateway Slugs and Extra Body Parameters
2026-05-15 10:23:37 +09:00
2026-05-15 10:33:54 +09:00
**Affected models:** Slash-containing model IDs routed through the OpenAI-compatible provider, especially custom gateways configured with `OPENAI_BASE_URL` such as OpenRouter, local routers, or other `/v1/chat/completions` services.
2026-05-15 10:23:37 +09:00
2026-05-15 10:33:54 +09:00
**Behavior:**
2026-06-03 23:16:46 +09:00
- The default OpenAI API and local/private OpenAI-compatible base URLs treat `openai/` as a routing prefix and send the bare model name on the wire.
- Non-local custom OpenAI-compatible base URLs preserve slash-containing slugs such as `openai/gpt-4.1-mini` so gateways like OpenRouter receive the exact model ID they expect. Local slash-containing model IDs can use `local/` , which strips only that escape-hatch prefix and sends the remainder verbatim.
2026-05-15 10:33:54 +09:00
- `MessageRequest::extra_body` passes through custom request JSON after core fields are populated. This supports provider-specific options such as `web_search_options` and `parallel_tool_calls` .
- Protected core fields (`model` , `messages` , `stream` , `tools` , `tool_choice` , `max_tokens` , `max_completion_tokens` ) cannot be overridden through `extra_body` .
2026-05-15 10:23:37 +09:00
2026-06-03 23:16:46 +09:00
**Testing:** See `custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params` in `openai_compat_integration.rs` , `wire_model_strips_openai_prefix_for_default_and_local_preserves_custom_gateways` , `local_routing_prefix_strips_only_escape_hatch` , and `extra_body_params_are_passed_through_without_overriding_core_fields` in `openai_compat.rs` .
2026-05-15 10:23:37 +09:00
2026-04-16 10:55:58 +00:00
## Implementation Details
### File Location
All model-specific logic is in:
```
rust/crates/api/src/providers/openai_compat.rs
```
### Key Functions
| Function | Purpose |
|----------|---------|
| `model_rejects_is_error_field()` | Detects models that don't support `is_error` in tool results |
| `is_reasoning_model()` | Detects reasoning models that need tuning param stripping |
| `translate_message()` | Converts internal messages to OpenAI format (applies `is_error` logic) |
2026-05-15 10:33:54 +09:00
| `build_chat_completion_request()` | Constructs full request payload (applies all model-specific logic and safe `extra_body` passthrough) |
| `provider_diagnostics_for_model()` | Produces provider/status diagnostics including auth/base-url vars, reasoning behavior, proxy support, extra-body support, and slash-model preservation |
2026-04-16 10:55:58 +00:00
### Provider Prefix Handling
All model detection functions strip provider prefixes (e.g., `dashscope/kimi-k2.5` → `kimi-k2.5` ) before matching:
```rust
let canonical = model.to_ascii_lowercase()
.rsplit('/')
.next()
.unwrap_or(model);
```
2026-05-15 10:33:54 +09:00
This ensures consistent detection regardless of whether models are referenced with or without provider prefixes. Wire-model handling is more specific: known routing prefixes are stripped for provider-native defaults, while custom OpenAI-compatible base URLs preserve slash-containing gateway slugs.
2026-04-16 10:55:58 +00:00
## Adding New Models
When adding support for new models:
1. **Check if the model is a reasoning model**
- Does it reject temperature/top_p parameters?
- Add to `is_reasoning_model()` detection
2. **Check tool result compatibility**
- Does it reject the `is_error` field?
- Add to `model_rejects_is_error_field()` detection
3. **Check token limit field**
- Does it require `max_completion_tokens` instead of `max_tokens` ?
- Update the `max_tokens_key` logic
2026-05-15 10:33:54 +09:00
4. **Check custom gateway behavior**
- Should slash-containing IDs be preserved for custom `OPENAI_BASE_URL` gateways?
- Does the feature belong in a typed request field or `extra_body` passthrough?
5. **Add tests**
2026-04-16 10:55:58 +00:00
- Unit test for detection function
- Integration test in `build_chat_completion_request`
2026-05-15 10:33:54 +09:00
6. **Update this documentation**
2026-04-16 10:55:58 +00:00
- Add the model to the affected lists
- Document any special behavior
## Testing
### Running Model-Specific Tests
```bash
# All OpenAI compatibility tests
cargo test --package api providers::openai_compat
# Specific test categories
cargo test --package api model_rejects_is_error_field
cargo test --package api reasoning_model
cargo test --package api gpt5
cargo test --package api qwen
2026-05-15 10:33:54 +09:00
cargo test --package api custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params
cargo test --package api provider_diagnostics_explain_openai_compatible_capabilities
2026-04-16 10:55:58 +00:00
```
### Test Files
- Unit tests: `rust/crates/api/src/providers/openai_compat.rs` (in `mod tests` )
- Integration tests: `rust/crates/api/tests/openai_compat_integration.rs`
### Verifying Model Detection
To verify a model is detected correctly without making API calls:
```rust
#[test]
fn my_new_model_is_detected() {
// is_error handling
assert!(model_rejects_is_error_field("my-model"));
// Reasoning model detection
assert!(is_reasoning_model("my-model"));
// Provider prefix handling
assert!(model_rejects_is_error_field("provider/my-model"));
}
```
---
2026-05-15 10:33:54 +09:00
*Last updated: 2026-05-15*
2026-04-16 10:55:58 +00:00
For questions or updates, see the implementation in `rust/crates/api/src/providers/openai_compat.rs` .