37 changes: 37 additions & 0 deletions dotnet/src/Types.cs
@@ -1528,6 +1528,43 @@ public class ProviderConfig
/// </summary>
[JsonPropertyName("headers")]
public IDictionary<string, string>? Headers { get; set; }

/// <summary>
/// Well-known model ID used to look up agent configuration (tools, prompts,
/// reasoning behavior) and default token limits from the capability catalog.
/// Useful for fine-tuned models that should inherit the configuration of a
/// known base model.
/// Defaults to the session's configured model (see <see cref="SessionConfig.Model"/>)
/// when not explicitly set.
/// </summary>
[JsonPropertyName("modelId")]
public string? ModelId { get; set; }
Collaborator:

So there's:
SessionConfig.Model
ProviderConfig.ModelId
ProviderConfig.WireModel

Help me understand the relationship? If I specify SessionConfig.Model, it's used as the default for both options on ProviderConfig, and those options on provider config then represent the two different groupings in which a model would be used, such that I can override one of them? Is there any situation where I would specify the same model ID for both ModelId and WireModel?

Collaborator (Author):

If only ProviderConfig.ModelId is specified, then that controls multiple things:

  • The model name used by the runtime to determine model limits and model-specific agent configuration
  • The model name sent to the custom provider for inference

If the model provider recognizes a model name that doesn't match the model ID known by the runtime, then ProviderConfig.WireModel can specify that.

SessionConfig.Model acts as a default in case neither option is specified.

Is there any situation where I would specify the same model ID for both ModelId and WireModel?

It has the same effect as just specifying ModelId, so it's not really necessary.
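The fallback order described above can be sketched in Python. This is an illustrative helper, not SDK API: the function name, the dict-based config, and the exact resolution logic are assumptions based on the explanation in this thread.

```python
def resolve_model_names(session_model: str, provider: dict) -> tuple[str, str]:
    """Hypothetical sketch of the fallback order described above.

    Returns (config_model, wire_model): the name used for capability/limit
    lookup, and the name sent to the provider API.
    """
    # model_id drives config lookup; falls back to the session's model.
    config_model = provider.get("model_id") or session_model
    # wire_model is what the provider API receives; falls back to model_id
    # (and therefore to the session's model when neither is set).
    wire_model = provider.get("wire_model") or config_model
    return config_model, wire_model


# A fine-tune that should inherit gpt-4o's configuration:
resolve_model_names("gpt-4o", {"wire_model": "my-finetune-v3"})
```

As noted above, setting both fields to the same value behaves the same as setting only `model_id`.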


/// <summary>
/// Model identifier sent to the provider API for inference.
/// Use this when the name your provider knows (e.g. an Azure deployment name
/// or a custom fine-tune name) differs from the well-known model ID used for
/// configuration lookup.
/// Defaults to the session's configured model (see <see cref="SessionConfig.Model"/>)
/// when not explicitly set.
/// </summary>
[JsonPropertyName("wireModel")]
public string? WireModel { get; set; }
Collaborator:

Out of the three:
SessionConfig.Model
ProviderConfig.ModelId
ProviderConfig.WireModel

what's the meaning behind ProviderConfig.ModelId having an "Id" suffix and the other two not?

Collaborator (Author):

I figured the "ID" more strongly implied that the value was identifying a well-known model kind. We could also consider ModelFamily? It would just require changing the runtime as well.


/// <summary>
/// Maximum number of tokens allowed in the prompt for a single LLM API request.
/// Used by the runtime to trigger conversation compaction before sending a request
/// when the prompt (system message, history, tool definitions, user message) exceeds this limit.
/// </summary>
[JsonPropertyName("maxPromptTokens")]
public int? MaxPromptTokens { get; set; }
Collaborator:

Should this be called MaxInputTokens instead of MaxPromptTokens?

Does this include cached tokens?

Same question as above... is this about one request or across a sequence of calls?

Collaborator (Author):

Per-request, but not sent to the API. The runtime uses this internally to decide when to truncate or compact conversation history before each LLM call. "Prompt" here means everything sent to the model in one request: system message, full conversation history up to that point, tool definitions, and the new user message. Cached tokens are counted toward the limit.

The name matches the upstream CAPI /models field (max_prompt_tokens), though MaxInputTokens would also be reasonable.
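The per-request budget described above can be illustrated with a small sketch. The function name and the breakdown of token counts are hypothetical; this is not the runtime's actual code.

```python
def should_compact(prompt_token_counts: dict[str, int], max_prompt_tokens: int) -> bool:
    """Hypothetical sketch: decide whether to compact conversation history
    before a single LLM call.

    The "prompt" is everything sent in one request (system message, history,
    tool definitions, new user message); cached tokens still count toward
    the limit.
    """
    total = sum(prompt_token_counts.values())
    return total > max_prompt_tokens


# Example: 1_200 + 95_000 + 3_500 + 800 = 100_500 tokens against a
# 100_000 limit, so the runtime would compact before sending.
request = {"system": 1_200, "history": 95_000, "tools": 3_500, "user": 800}
should_compact(request, max_prompt_tokens=100_000)
```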

Collaborator:

I'd prefer MaxInputTokens. That's the more modern terminology, right? And it maps to what we show in the CLI UI?

Collaborator (Author):

We could change this; it would just require changing the runtime representation as well (including the COPILOT_PROVIDER_MAX_PROMPT_TOKENS environment variable, which would probably need to be renamed).

Collaborator:

Changing the sdk name wouldn't, right? Only if we also wanted to change the wire name?

Collaborator (Author):

Oh, if we just changed the public API but configured it to serialize with the maxInputTokens naming? Yeah that would work.
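The Python SDK in this PR already decouples the two names this way: the public snake_case field and the camelCase wire key are mapped independently, so a public rename leaves the wire format untouched. A sketch of that mapping (the table is abbreviated and the helper is illustrative, not the SDK's actual `_convert_provider_to_wire_format`):

```python
# Wire keys stay fixed even if the public field names change; e.g. a public
# rename to max_input_tokens would only change the left-hand key here while
# still serializing as "maxPromptTokens".
WIRE_KEYS = {
    "base_url": "baseUrl",
    "model_id": "modelId",
    "wire_model": "wireModel",
    "max_prompt_tokens": "maxPromptTokens",
    "max_output_tokens": "maxOutputTokens",
}


def to_wire(provider: dict) -> dict:
    """Hypothetical sketch of public-to-wire name mapping."""
    return {WIRE_KEYS.get(key, key): value for key, value in provider.items()}
```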


/// <summary>
/// Maximum number of tokens the model can generate in a single response.
/// When hit, the model stops generating and returns a truncated response.
/// </summary>
[JsonPropertyName("maxOutputTokens")]
public int? MaxOutputTokens { get; set; }
}
MackinnonBuck marked this conversation as resolved.

/// <summary>
14 changes: 13 additions & 1 deletion dotnet/test/Unit/SerializationTests.cs
@@ -20,19 +20,31 @@ public void ProviderConfig_CanSerializeHeaders_WithSdkOptions()
var original = new ProviderConfig
{
BaseUrl = "https://example.com/provider",
Headers = new Dictionary<string, string> { ["Authorization"] = "Bearer provider-token" }
Headers = new Dictionary<string, string> { ["Authorization"] = "Bearer provider-token" },
ModelId = "gpt-4o",
WireModel = "my-finetune-v3",
MaxPromptTokens = 100_000,
MaxOutputTokens = 4096
};

var json = JsonSerializer.Serialize(original, options);
using var document = JsonDocument.Parse(json);
var root = document.RootElement;
Assert.Equal("https://example.com/provider", root.GetProperty("baseUrl").GetString());
Assert.Equal("Bearer provider-token", root.GetProperty("headers").GetProperty("Authorization").GetString());
Assert.Equal("gpt-4o", root.GetProperty("modelId").GetString());
Assert.Equal("my-finetune-v3", root.GetProperty("wireModel").GetString());
Assert.Equal(100_000, root.GetProperty("maxPromptTokens").GetInt32());
Assert.Equal(4096, root.GetProperty("maxOutputTokens").GetInt32());

var deserialized = JsonSerializer.Deserialize<ProviderConfig>(json, options);
Assert.NotNull(deserialized);
Assert.Equal("https://example.com/provider", deserialized.BaseUrl);
Assert.Equal("Bearer provider-token", deserialized.Headers!["Authorization"]);
Assert.Equal("gpt-4o", deserialized.ModelId);
Assert.Equal("my-finetune-v3", deserialized.WireModel);
Assert.Equal(100_000, deserialized.MaxPromptTokens);
Assert.Equal(4096, deserialized.MaxOutputTokens);
}

[Fact]
23 changes: 23 additions & 0 deletions go/types.go
@@ -859,6 +859,29 @@ type ProviderConfig struct {
Azure *AzureProviderOptions `json:"azure,omitempty"`
// Headers are custom HTTP headers included in outbound provider requests.
Headers map[string]string `json:"headers,omitempty"`
// ModelID is the well-known model ID used to look up agent configuration
// (tools, prompts, reasoning behavior) and default token limits from the
// capability catalog. Useful for fine-tuned models that should inherit the
// configuration of a known base model.
// Defaults to the session's configured model (SessionConfig.Model) when
// not explicitly set.
ModelID string `json:"modelId,omitempty"`
// WireModel is the model identifier sent to the provider API for inference.
// Use this when the name your provider knows (e.g. an Azure deployment name
// or a custom fine-tune name) differs from the well-known model ID used for
// configuration lookup.
// Defaults to the session's configured model (SessionConfig.Model) when
// not explicitly set.
WireModel string `json:"wireModel,omitempty"`
// MaxPromptTokens is the maximum number of tokens allowed in the prompt for
// a single LLM API request. Used by the runtime to trigger conversation
// compaction before sending a request when the prompt (system message,
// history, tool definitions, user message) exceeds this limit.
MaxPromptTokens int `json:"maxPromptTokens,omitempty"`
// MaxOutputTokens is the maximum number of tokens the model can generate in
// a single response. When hit, the model stops generating and returns a
// truncated response.
MaxOutputTokens int `json:"maxOutputTokens,omitempty"`
}

// AzureProviderOptions contains Azure-specific provider configuration
65 changes: 65 additions & 0 deletions go/types_test.go
@@ -151,3 +151,68 @@ func TestSessionSendRequest_JSONIncludesRequestHeaders(t *testing.T) {
t.Fatalf("expected Authorization header, got %v", headers["Authorization"])
}
}

func TestProviderConfig_JSONIncludesAllFields(t *testing.T) {
cfg := ProviderConfig{
BaseURL: "https://example.com/provider",
APIKey: "test-key",
Headers: map[string]string{"Authorization": "Bearer provider-token"},
ModelID: "gpt-4o",
WireModel: "my-finetune-v3",
MaxPromptTokens: 100000,
MaxOutputTokens: 4096,
}

data, err := json.Marshal(cfg)
if err != nil {
t.Fatalf("failed to marshal ProviderConfig: %v", err)
}

var decoded map[string]any
if err := json.Unmarshal(data, &decoded); err != nil {
t.Fatalf("failed to unmarshal ProviderConfig: %v", err)
}

if decoded["baseUrl"] != "https://example.com/provider" {
t.Errorf("expected baseUrl to round-trip, got %v", decoded["baseUrl"])
}
if decoded["modelId"] != "gpt-4o" {
t.Errorf("expected modelId 'gpt-4o', got %v", decoded["modelId"])
}
if decoded["wireModel"] != "my-finetune-v3" {
t.Errorf("expected wireModel 'my-finetune-v3', got %v", decoded["wireModel"])
}
if decoded["maxPromptTokens"] != float64(100000) {
t.Errorf("expected maxPromptTokens 100000, got %v", decoded["maxPromptTokens"])
}
if decoded["maxOutputTokens"] != float64(4096) {
t.Errorf("expected maxOutputTokens 4096, got %v", decoded["maxOutputTokens"])
}
headers, ok := decoded["headers"].(map[string]any)
if !ok {
t.Fatalf("expected headers object, got %T", decoded["headers"])
}
if headers["Authorization"] != "Bearer provider-token" {
t.Errorf("expected Authorization header, got %v", headers["Authorization"])
}
}

func TestProviderConfig_JSONOmitsUnsetTokenFields(t *testing.T) {
cfg := ProviderConfig{BaseURL: "https://example.com/provider"}

data, err := json.Marshal(cfg)
if err != nil {
t.Fatalf("failed to marshal ProviderConfig: %v", err)
}

var decoded map[string]any
if err := json.Unmarshal(data, &decoded); err != nil {
t.Fatalf("failed to unmarshal ProviderConfig: %v", err)
}

for _, field := range []string{"modelId", "wireModel", "maxPromptTokens", "maxOutputTokens", "headers"} {
if _, present := decoded[field]; present {
t.Errorf("expected %q to be omitted when unset, got %v", field, decoded[field])
}
}
}
33 changes: 33 additions & 0 deletions nodejs/src/types.ts
@@ -1503,6 +1503,39 @@ export interface ProviderConfig {
* Custom HTTP headers to include in outbound provider requests.
*/
headers?: Record<string, string>;

/**
* Well-known model ID used to look up agent configuration (tools, prompts,
* reasoning behavior) and default token limits from the capability catalog.
* Useful for fine-tuned models that should inherit the configuration of a
* known base model.
* Defaults to the session's configured model (see {@link SessionConfig.model})
* when not explicitly set.
*/
modelId?: string;

/**
* Model identifier sent to the provider API for inference.
* Use this when the name your provider knows (e.g. an Azure deployment name
* or a custom fine-tune name) differs from the well-known model ID used
* for configuration lookup.
* Defaults to the session's configured model (see {@link SessionConfig.model})
* when not explicitly set.
*/
wireModel?: string;

/**
* Maximum number of tokens allowed in the prompt for a single LLM API request.
* Used by the runtime to trigger conversation compaction before sending a request
* when the prompt (system message, history, tool definitions, user message) exceeds this limit.
*/
maxPromptTokens?: number;

/**
* Maximum number of tokens the model can generate in a single response.
* When hit, the model stops generating and returns a truncated response.
*/
maxOutputTokens?: number;
}

/**
16 changes: 16 additions & 0 deletions nodejs/test/client.test.ts
@@ -224,6 +224,10 @@ describe("CopilotClient", () => {
provider: {
baseUrl: "https://example.com/provider",
headers: { Authorization: "Bearer provider-token" },
modelId: "gpt-4o",
wireModel: "my-finetune-v3",
maxPromptTokens: 100_000,
maxOutputTokens: 4096,
},
});

@@ -232,6 +236,10 @@ describe("CopilotClient", () => {
expect.objectContaining({
baseUrl: "https://example.com/provider",
headers: { Authorization: "Bearer provider-token" },
modelId: "gpt-4o",
wireModel: "my-finetune-v3",
maxPromptTokens: 100_000,
maxOutputTokens: 4096,
})
);
spy.mockRestore();
@@ -255,6 +263,10 @@
provider: {
baseUrl: "https://example.com/provider",
headers: { Authorization: "Bearer resume-token" },
modelId: "gpt-4o",
wireModel: "my-finetune-v3",
maxPromptTokens: 100_000,
maxOutputTokens: 4096,
},
});

@@ -263,6 +275,10 @@
expect.objectContaining({
baseUrl: "https://example.com/provider",
headers: { Authorization: "Bearer resume-token" },
modelId: "gpt-4o",
wireModel: "my-finetune-v3",
maxPromptTokens: 100_000,
maxOutputTokens: 4096,
})
);
spy.mockRestore();
8 changes: 8 additions & 0 deletions python/copilot/client.py
@@ -2275,6 +2275,14 @@ def _convert_provider_to_wire_format(
wire_provider["bearerToken"] = provider["bearer_token"]
if "headers" in provider:
wire_provider["headers"] = provider["headers"]
if "model_id" in provider:
wire_provider["modelId"] = provider["model_id"]
if "wire_model" in provider:
wire_provider["wireModel"] = provider["wire_model"]
if "max_prompt_tokens" in provider:
wire_provider["maxPromptTokens"] = provider["max_prompt_tokens"]
if "max_output_tokens" in provider:
wire_provider["maxOutputTokens"] = provider["max_output_tokens"]
if "azure" in provider:
azure = provider["azure"]
wire_azure: dict[str, Any] = {}
22 changes: 22 additions & 0 deletions python/copilot/session.py
@@ -832,6 +832,28 @@ class ProviderConfig(TypedDict, total=False):
bearer_token: str
azure: AzureProviderOptions # Azure-specific options
headers: dict[str, str]
# Well-known model ID used to look up agent configuration (tools, prompts,
# reasoning behavior) and default token limits from the capability catalog.
# Useful for fine-tuned models that should inherit the configuration of a
# known base model.
# Defaults to the session's configured model (SessionConfig.model) when
# not explicitly set.
model_id: str
# Model identifier sent to the provider API for inference. Use this when the
# name your provider knows (e.g. an Azure deployment name or a custom
# fine-tune name) differs from the well-known model ID used for
# configuration lookup.
# Defaults to the session's configured model (SessionConfig.model) when
# not explicitly set.
wire_model: str
# Maximum number of tokens allowed in the prompt for a single LLM API
# request. Used by the runtime to trigger conversation compaction before
# sending a request when the prompt (system message, history, tool
# definitions, user message) exceeds this limit.
max_prompt_tokens: int
# Maximum number of tokens the model can generate in a single response.
# When hit, the model stops generating and returns a truncated response.
max_output_tokens: int


class SessionConfig(TypedDict, total=False):
16 changes: 16 additions & 0 deletions python/test_client.py
@@ -564,12 +564,20 @@ async def mock_request(method, params):
provider={
"base_url": "https://example.com/provider",
"headers": {"Authorization": "Bearer provider-token"},
"model_id": "gpt-4o",
"wire_model": "my-finetune-v3",
"max_prompt_tokens": 100_000,
"max_output_tokens": 4096,
},
)

provider = captured["session.create"]["provider"]
assert provider["baseUrl"] == "https://example.com/provider"
assert provider["headers"] == {"Authorization": "Bearer provider-token"}
assert provider["modelId"] == "gpt-4o"
assert provider["wireModel"] == "my-finetune-v3"
assert provider["maxPromptTokens"] == 100_000
assert provider["maxOutputTokens"] == 4096
finally:
await client.force_stop()

@@ -599,12 +607,20 @@ async def mock_request(method, params):
provider={
"base_url": "https://example.com/provider",
"headers": {"Authorization": "Bearer resume-token"},
"model_id": "gpt-4o",
"wire_model": "my-finetune-v3",
"max_prompt_tokens": 100_000,
"max_output_tokens": 4096,
},
)

provider = captured["session.resume"]["provider"]
assert provider["baseUrl"] == "https://example.com/provider"
assert provider["headers"] == {"Authorization": "Bearer resume-token"}
assert provider["modelId"] == "gpt-4o"
assert provider["wireModel"] == "my-finetune-v3"
assert provider["maxPromptTokens"] == 100_000
assert provider["maxOutputTokens"] == 4096
finally:
await client.force_stop()
