37 changes: 37 additions & 0 deletions dotnet/src/Types.cs
@@ -1528,6 +1528,43 @@ public class ProviderConfig
/// </summary>
[JsonPropertyName("headers")]
public IDictionary<string, string>? Headers { get; set; }

/// <summary>
/// Well-known model ID used to look up agent configuration (tools, prompts,
/// reasoning behavior) and default token limits from the capability catalog.
/// Useful for fine-tuned models that should inherit the configuration of a
/// known base model.
/// Defaults to the session's configured model (see <see cref="SessionConfig.Model"/>)
/// when not explicitly set.
/// </summary>
[JsonPropertyName("modelId")]
public string? ModelId { get; set; }
Collaborator:

So there's:
SessionConfig.Model
ProviderConfig.ModelId
ProviderConfig.WireModel

Help me understand the relationship. If I specify SessionConfig.Model, is it used as the default for both options on ProviderConfig, and do those options then represent the two different contexts in which a model name is used, such that I can override either one? Is there any situation where I would specify the same model ID for both ModelId and WireModel?

Collaborator Author:

If only ProviderConfig.ModelId is specified, then that controls multiple things:

  • The model name used by the runtime to determine model limits and model-specific agent configuration
  • The model name sent to the custom provider for inference

If the model provider recognizes a model name that doesn't match the model ID known by the runtime, then ProviderConfig.WireModel can specify that.

SessionConfig.Model acts as a default in case neither option is specified.

Is there any situation where I would specify the same model ID for both ModelId and WireModel?

It has the same effect as just specifying ModelId, so it's not really necessary.
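
To make the relationship concrete, here's a minimal sketch using the types added in this PR (the base URL and the Azure-style deployment name are hypothetical placeholder values):

```csharp
var config = new SessionConfig
{
    // Fallback for both the runtime model and the wire model when neither
    // ProviderConfig option below is set.
    Model = "gpt-4o",
    Provider = new ProviderConfig
    {
        Type = "openai",
        BaseUrl = "https://example.openai.azure.com/v1", // placeholder endpoint
        ApiKey = "<provider-api-key>",                    // placeholder key

        // Runtime lookup: agent configuration, prompts, default token limits.
        ModelId = "gpt-4o",

        // Name the provider actually recognizes (e.g. an Azure deployment or
        // fine-tune name); only needed when it differs from ModelId.
        WireModel = "my-gpt-4o-deployment",
    },
};
```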


/// <summary>
/// Model identifier sent to the provider API for inference.
/// Use this when the name your provider knows (e.g. an Azure deployment name
/// or a custom fine-tune name) differs from the well-known model ID used for
/// configuration lookup.
/// Defaults to the session's configured model (see <see cref="SessionConfig.Model"/>)
/// when not explicitly set.
/// </summary>
[JsonPropertyName("wireModel")]
public string? WireModel { get; set; }
Collaborator:

Out of the three:
SessionConfig.Model
ProviderConfig.ModelId
ProviderConfig.WireModel

what's the meaning behind ProviderConfig.ModelId having an "Id" suffix and the other two not?

Collaborator Author:

I figured the "Id" suffix more strongly implied that the value identifies a well-known model kind. We could also consider ModelFamily? It would just require changing the runtime as well.


/// <summary>
/// Maximum number of tokens allowed in the prompt for a single LLM API request.
/// Used by the runtime to trigger conversation compaction before sending a request
/// when the prompt (system message, history, tool definitions, user message) exceeds this limit.
/// </summary>
[JsonPropertyName("maxPromptTokens")]
public int? MaxPromptTokens { get; set; }
Collaborator:

Should this be called MaxInputTokens instead of MaxPromptTokens?

Does this include cached tokens?

Same question as above... is this about one request or across a sequence of calls?

Collaborator Author:

Per-request, but not sent to the API. The runtime uses this internally to decide when to truncate or compact conversation history before each LLM call. "Prompt" here means everything sent to the model in one request: system message, full conversation history up to that point, tool definitions, and the new user message. Cached tokens are counted toward the limit.

The name matches the upstream CAPI /models field (max_prompt_tokens), though MaxInputTokens would also be reasonable.
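
Roughly, the check looks like this (a hypothetical sketch: CountTokens and Compact are made-up helpers, and the real logic lives in the runtime, not the SDK):

```csharp
// Everything sent to the model in one request counts toward the limit,
// including tokens that will be served from the prompt cache.
int promptTokens =
    CountTokens(systemMessage) +
    CountTokens(toolDefinitions) +
    CountTokens(history) +
    CountTokens(newUserMessage);

if (maxPromptTokens is int limit && promptTokens > limit)
{
    // Summarize/compact older turns before issuing the LLM request.
    history = Compact(history, limit);
}
```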

Collaborator:

I'd prefer MaxInputTokens. That's the more modern terminology, right? And it maps to what we show in the CLI UI?

Collaborator Author:

We could change this; it would just require changing the runtime representation as well (including the COPILOT_PROVIDER_MAX_PROMPT_TOKENS environment variable, which would probably need to be renamed).

Collaborator:

Changing the SDK name wouldn't, right? Only if we also wanted to change the wire name?

Collaborator Author:

Oh, if we just changed the public API but configured it to serialize with the maxPromptTokens naming? Yeah, that would work.
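
For example, a sketch of that option (not what this PR currently does):

```csharp
/// <summary>
/// Maximum number of tokens allowed in the input (prompt) for a single LLM API request.
/// </summary>
[JsonPropertyName("maxPromptTokens")] // wire name unchanged, so the CLI/runtime is unaffected
public int? MaxInputTokens { get; set; }
```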


/// <summary>
/// Maximum number of tokens the model can generate in a single response.
/// When hit, the model stops generating and returns a truncated response.
/// </summary>
[JsonPropertyName("maxOutputTokens")]
public int? MaxOutputTokens { get; set; }
}
MackinnonBuck marked this conversation as resolved.

/// <summary>
56 changes: 56 additions & 0 deletions dotnet/test/E2E/SessionConfigE2ETests.cs
@@ -177,6 +177,62 @@ public async Task Should_Forward_Custom_Provider_Headers_On_Resume()
await session2.DisposeAsync();
}

[Fact]
public async Task Should_Forward_Provider_Wire_Model()
{
// Verifies that ProviderConfig.WireModel overrides the model name sent to
// the provider API, while SessionConfig.Model still drives runtime
// configuration lookup (capabilities, prompts, reasoning behavior).
// MaxOutputTokens is also set here to confirm the SDK accepts it without
// serialization errors; the CLI does not echo it as `max_tokens` on the
// OpenAI-style wire request, so we don't assert on it directly (see unit
// tests for serialization coverage).
var session = await CreateSessionAsync(new SessionConfig
{
Model = "claude-sonnet-4.5",
Provider = new ProviderConfig
{
Type = "openai",
BaseUrl = Ctx.ProxyUrl,
ApiKey = "test-provider-key",
WireModel = "test-wire-model",
MaxOutputTokens = 1024,
},
});

await session.SendAndWaitAsync(new MessageOptions { Prompt = "What is 1+1?" });

var exchange = Assert.Single(await Ctx.GetExchangesAsync());
Assert.Equal("test-wire-model", exchange.Request.Model);

await session.DisposeAsync();
}

[Fact]
public async Task Should_Use_Provider_Model_Id_As_Wire_Model()
{
// ProviderConfig.ModelId drives both the runtime resolved model AND the wire model
// when WireModel is not specified. Here SessionConfig.Model is intentionally omitted
// so that ModelId is the only model source.
var session = await CreateSessionAsync(new SessionConfig
{
Provider = new ProviderConfig
{
Type = "openai",
BaseUrl = Ctx.ProxyUrl,
ApiKey = "test-provider-key",
ModelId = "claude-sonnet-4.5",
},
});

await session.SendAndWaitAsync(new MessageOptions { Prompt = "What is 1+1?" });

var exchange = Assert.Single(await Ctx.GetExchangesAsync());
Assert.Equal("claude-sonnet-4.5", exchange.Request.Model);

await session.DisposeAsync();
}

[Fact]
public async Task Should_Use_WorkingDirectory_For_Tool_Execution()
{
14 changes: 13 additions & 1 deletion dotnet/test/Unit/SerializationTests.cs
@@ -20,19 +20,31 @@ public void ProviderConfig_CanSerializeHeaders_WithSdkOptions()
var original = new ProviderConfig
{
BaseUrl = "https://example.com/provider",
Headers = new Dictionary<string, string> { ["Authorization"] = "Bearer provider-token" },
ModelId = "gpt-4o",
WireModel = "my-finetune-v3",
MaxPromptTokens = 100_000,
MaxOutputTokens = 4096
};

var json = JsonSerializer.Serialize(original, options);
using var document = JsonDocument.Parse(json);
var root = document.RootElement;
Assert.Equal("https://example.com/provider", root.GetProperty("baseUrl").GetString());
Assert.Equal("Bearer provider-token", root.GetProperty("headers").GetProperty("Authorization").GetString());
Assert.Equal("gpt-4o", root.GetProperty("modelId").GetString());
Assert.Equal("my-finetune-v3", root.GetProperty("wireModel").GetString());
Assert.Equal(100_000, root.GetProperty("maxPromptTokens").GetInt32());
Assert.Equal(4096, root.GetProperty("maxOutputTokens").GetInt32());

var deserialized = JsonSerializer.Deserialize<ProviderConfig>(json, options);
Assert.NotNull(deserialized);
Assert.Equal("https://example.com/provider", deserialized.BaseUrl);
Assert.Equal("Bearer provider-token", deserialized.Headers!["Authorization"]);
Assert.Equal("gpt-4o", deserialized.ModelId);
Assert.Equal("my-finetune-v3", deserialized.WireModel);
Assert.Equal(100_000, deserialized.MaxPromptTokens);
Assert.Equal(4096, deserialized.MaxOutputTokens);
}

[Fact]
79 changes: 79 additions & 0 deletions go/internal/e2e/session_config_e2e_test.go
@@ -323,6 +323,85 @@ func TestSessionConfigExtrasE2E(t *testing.T) {
}
})

t.Run("should forward provider wire model", func(t *testing.T) {
// Verifies that ProviderConfig.WireModel overrides the model name sent to
// the provider API, while SessionConfig.Model still drives runtime
// configuration lookup (capabilities, prompts, reasoning behavior).
// MaxOutputTokens is also set here to confirm the SDK accepts it without
// serialization errors; the CLI does not echo it as `max_tokens` on the
// OpenAI-style wire request, so we don't assert on it directly (see unit
// tests for serialization coverage).
ctx.ConfigureForTest(t)

maxOutputTokens := 1024
session, err := client.CreateSession(t.Context(), &copilot.SessionConfig{
OnPermissionRequest: copilot.PermissionHandler.ApproveAll,
Model: "claude-sonnet-4.5",
Provider: &copilot.ProviderConfig{
Type: "openai",
BaseURL: ctx.ProxyURL,
APIKey: "test-provider-key",
WireModel: "test-wire-model",
MaxOutputTokens: maxOutputTokens,
},
})
if err != nil {
t.Fatalf("CreateSession failed: %v", err)
}

_, err = session.SendAndWait(t.Context(), copilot.MessageOptions{Prompt: "What is 1+1?"})
if err != nil {
t.Fatalf("SendAndWait failed: %v", err)
}

exchanges, err := ctx.GetExchanges()
if err != nil {
t.Fatalf("GetExchanges failed: %v", err)
}
if len(exchanges) != 1 {
t.Fatalf("Expected exactly 1 exchange, got %d", len(exchanges))
}
if exchanges[0].Request.Model != "test-wire-model" {
t.Errorf("Expected request model to be 'test-wire-model', got %q", exchanges[0].Request.Model)
}
})

t.Run("should use provider model id as wire model", func(t *testing.T) {
// ProviderConfig.ModelID drives both the runtime resolved model AND the wire
// model when WireModel is not specified. SessionConfig.Model is intentionally
// omitted so that ModelID is the only model source.
ctx.ConfigureForTest(t)

session, err := client.CreateSession(t.Context(), &copilot.SessionConfig{
OnPermissionRequest: copilot.PermissionHandler.ApproveAll,
Provider: &copilot.ProviderConfig{
Type: "openai",
BaseURL: ctx.ProxyURL,
APIKey: "test-provider-key",
ModelID: "claude-sonnet-4.5",
},
})
if err != nil {
t.Fatalf("CreateSession failed: %v", err)
}

_, err = session.SendAndWait(t.Context(), copilot.MessageOptions{Prompt: "What is 1+1?"})
if err != nil {
t.Fatalf("SendAndWait failed: %v", err)
}

exchanges, err := ctx.GetExchanges()
if err != nil {
t.Fatalf("GetExchanges failed: %v", err)
}
if len(exchanges) != 1 {
t.Fatalf("Expected exactly 1 exchange, got %d", len(exchanges))
}
if exchanges[0].Request.Model != "claude-sonnet-4.5" {
t.Errorf("Expected request model to be 'claude-sonnet-4.5', got %q", exchanges[0].Request.Model)
}
})

t.Run("should use workingDirectory for tool execution", func(t *testing.T) {
ctx.ConfigureForTest(t)

23 changes: 23 additions & 0 deletions go/types.go
@@ -859,6 +859,29 @@ type ProviderConfig struct {
Azure *AzureProviderOptions `json:"azure,omitempty"`
// Headers are custom HTTP headers included in outbound provider requests.
Headers map[string]string `json:"headers,omitempty"`
// ModelID is the well-known model ID used to look up agent configuration
// (tools, prompts, reasoning behavior) and default token limits from the
// capability catalog. Useful for fine-tuned models that should inherit the
// configuration of a known base model.
// Defaults to the session's configured model (SessionConfig.Model) when
// not explicitly set.
ModelID string `json:"modelId,omitempty"`
// WireModel is the model identifier sent to the provider API for inference.
// Use this when the name your provider knows (e.g. an Azure deployment name
// or a custom fine-tune name) differs from the well-known model ID used for
// configuration lookup.
// Defaults to the session's configured model (SessionConfig.Model) when
// not explicitly set.
WireModel string `json:"wireModel,omitempty"`
// MaxPromptTokens is the maximum number of tokens allowed in the prompt for
// a single LLM API request. Used by the runtime to trigger conversation
// compaction before sending a request when the prompt (system message,
// history, tool definitions, user message) exceeds this limit.
MaxPromptTokens int `json:"maxPromptTokens,omitempty"`
// MaxOutputTokens is the maximum number of tokens the model can generate in
// a single response. When hit, the model stops generating and returns a
// truncated response.
MaxOutputTokens int `json:"maxOutputTokens,omitempty"`
}

// AzureProviderOptions contains Azure-specific provider configuration
65 changes: 65 additions & 0 deletions go/types_test.go
@@ -151,3 +151,68 @@ func TestSessionSendRequest_JSONIncludesRequestHeaders(t *testing.T) {
t.Fatalf("expected Authorization header, got %v", headers["Authorization"])
}
}

func TestProviderConfig_JSONIncludesAllFields(t *testing.T) {
cfg := ProviderConfig{
BaseURL: "https://example.com/provider",
APIKey: "test-key",
Headers: map[string]string{"Authorization": "Bearer provider-token"},
ModelID: "gpt-4o",
WireModel: "my-finetune-v3",
MaxPromptTokens: 100000,
MaxOutputTokens: 4096,
}

data, err := json.Marshal(cfg)
if err != nil {
t.Fatalf("failed to marshal ProviderConfig: %v", err)
}

var decoded map[string]any
if err := json.Unmarshal(data, &decoded); err != nil {
t.Fatalf("failed to unmarshal ProviderConfig: %v", err)
}

if decoded["baseUrl"] != "https://example.com/provider" {
t.Errorf("expected baseUrl to round-trip, got %v", decoded["baseUrl"])
}
if decoded["modelId"] != "gpt-4o" {
t.Errorf("expected modelId 'gpt-4o', got %v", decoded["modelId"])
}
if decoded["wireModel"] != "my-finetune-v3" {
t.Errorf("expected wireModel 'my-finetune-v3', got %v", decoded["wireModel"])
}
if decoded["maxPromptTokens"] != float64(100000) {
t.Errorf("expected maxPromptTokens 100000, got %v", decoded["maxPromptTokens"])
}
if decoded["maxOutputTokens"] != float64(4096) {
t.Errorf("expected maxOutputTokens 4096, got %v", decoded["maxOutputTokens"])
}
headers, ok := decoded["headers"].(map[string]any)
if !ok {
t.Fatalf("expected headers object, got %T", decoded["headers"])
}
if headers["Authorization"] != "Bearer provider-token" {
t.Errorf("expected Authorization header, got %v", headers["Authorization"])
}
}

func TestProviderConfig_JSONOmitsUnsetTokenFields(t *testing.T) {
cfg := ProviderConfig{BaseURL: "https://example.com/provider"}

data, err := json.Marshal(cfg)
if err != nil {
t.Fatalf("failed to marshal ProviderConfig: %v", err)
}

var decoded map[string]any
if err := json.Unmarshal(data, &decoded); err != nil {
t.Fatalf("failed to unmarshal ProviderConfig: %v", err)
}

for _, field := range []string{"modelId", "wireModel", "maxPromptTokens", "maxOutputTokens", "headers"} {
if _, present := decoded[field]; present {
t.Errorf("expected %q to be omitted when unset, got %v", field, decoded[field])
}
}
}
33 changes: 33 additions & 0 deletions nodejs/src/types.ts
@@ -1503,6 +1503,39 @@ export interface ProviderConfig {
* Custom HTTP headers to include in outbound provider requests.
*/
headers?: Record<string, string>;

/**
* Well-known model ID used to look up agent configuration (tools, prompts,
* reasoning behavior) and default token limits from the capability catalog.
* Useful for fine-tuned models that should inherit the configuration of a
* known base model.
* Defaults to the session's configured model (see {@link SessionConfig.model})
* when not explicitly set.
*/
modelId?: string;

/**
* Model identifier sent to the provider API for inference.
* Use this when the name your provider knows (e.g. an Azure deployment name
* or a custom fine-tune name) differs from the well-known model ID used
* for configuration lookup.
* Defaults to the session's configured model (see {@link SessionConfig.model})
* when not explicitly set.
*/
wireModel?: string;

/**
* Maximum number of tokens allowed in the prompt for a single LLM API request.
* Used by the runtime to trigger conversation compaction before sending a request
* when the prompt (system message, history, tool definitions, user message) exceeds this limit.
*/
maxPromptTokens?: number;

/**
* Maximum number of tokens the model can generate in a single response.
* When hit, the model stops generating and returns a truncated response.
*/
maxOutputTokens?: number;
}

/**