Add provider model and token limit overrides to ProviderConfig #966
MackinnonBuck wants to merge 2 commits into main
Conversation
Pull request overview
This PR adds four optional token-limit configuration fields to ProviderConfig across the Node.js, Python, Go, and .NET SDKs so BYOK/custom-provider users can override model token limits and/or specify an alternate model ID for capability-catalog limit lookup.
Changes:
- Added `maxOutputTokens`, `maxPromptTokens`, `maxContextWindowTokens`, and `modelLimitsId` to `ProviderConfig` in Node.js, Go, and .NET.
- Added corresponding snake_case fields to the Python `ProviderConfig` and mapped them to camelCase wire keys in the Python client.
- Documented the behavior/precedence of these fields via inline comments/XML docs.
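The snake_case-to-camelCase mapping described above can be sketched roughly as follows. This is a minimal illustration of what the Python client's `_convert_provider_to_wire_format` does for these fields; the `to_wire_format` helper and key table here are assumptions for demonstration, not the SDK's actual code.

```python
# Illustrative sketch (not the SDK's actual helper) of mapping Python-style
# snake_case ProviderConfig keys to the camelCase names used on the wire.

_WIRE_KEYS = {
    "max_output_tokens": "maxOutputTokens",
    "max_prompt_tokens": "maxPromptTokens",
    "max_context_window_tokens": "maxContextWindowTokens",
    "model_limits_id": "modelLimitsId",
}

def to_wire_format(provider: dict) -> dict:
    """Return a wire-format dict, renaming known keys and dropping unset fields."""
    wire = {}
    for key, value in provider.items():
        if value is None:
            continue  # omit unset optional fields, mirroring Go's omitempty behavior
        wire[_WIRE_KEYS.get(key, key)] = value
    return wire
```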
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `nodejs/src/types.ts` | Extends TS `ProviderConfig` interface with optional token limit fields and `modelLimitsId`. |
| `python/copilot/session.py` | Extends Python `ProviderConfig` TypedDict with optional token-limit and model-limit lookup fields. |
| `python/copilot/client.py` | Maps new snake_case Python fields onto the expected camelCase wire names. |
| `go/types.go` | Extends Go `ProviderConfig` struct with new JSON fields (`omitempty`). |
| `dotnet/src/Types.cs` | Extends .NET `ProviderConfig` with nullable properties and JSON property names. |
```csharp
/// This is useful for fine-tuned models that share the same limits as a base model.
/// </summary>
[JsonPropertyName("modelLimitsId")]
public string? ModelLimitsId { get; set; }
```
Curious to know what others think about this name. We might want to consider extending this to also affect inferred model capabilities (streaming, tool calling, multimodality, etc.), at which point this name is a bit too specific.
We could consider something like `ModelCatalogId` or `ModelProfileId`.
`ModelCatalogId` is good except if you think it's identifying the catalog rather than the model. `ModelIdFromCatalog` would be more explicit but feels clunky.
Conceptually I like it though. The only alternative I can think of is adding a different property `ModelDeploymentId` and then having the rules:
- If `ModelId` and `ModelDeploymentId` are both set, then we use the deployment ID when identifying the model in inference requests, and the model ID when looking up capabilities/limits, etc.
- If only one is set, then we use that for both things.

What do you think? The main drawback would be that if neither is set, we end up with a cumbersome error like "At least one of ModelId or ModelDeploymentId is required".
Or we could keep `ModelId` as required, and say `ModelDeploymentId` is the optional override on inference requests.
Good ideas!
I think the right separation would be:
- `Model` means "the model used in inference requests"
- `ModelCapabilitiesId` (or something) means "the model used to look up capabilities and limits"

That way, all configuration around BYOK model limits is in the same place (`ProviderConfig`) rather than split between `ProviderConfig` and `SessionConfig`.
That said, we've also had requests to support multiple models in BYOK mode (for the CLI specifically, but we should probably expose similar functionality via the SDK). So I'm wondering if it would actually make sense to allow configuring a list of model configurations in ProviderConfig, where each model config defines both the ID used in inference as well as the ID used to look up model capabilities.
I'll mark this PR as a draft while I think about how such a design could look.
```csharp
/// When set, takes precedence over the default limit resolved from the model's capability catalog entry.
/// </summary>
[JsonPropertyName("maxPromptTokens")]
public int? MaxPromptTokens { get; set; }
```
Should this be called `MaxInputTokens` instead of `MaxPromptTokens`?
Does this include cached tokens?
Same question as above... is this about one request or across a sequence of calls?
Per-request, but not sent to the API. The runtime uses this internally to decide when to truncate or compact conversation history before each LLM call. "Prompt" here means everything sent to the model in one request: system message, full conversation history up to that point, tool definitions, and the new user message. Cached tokens are counted toward the limit.
The name matches the upstream CAPI `/models` field (`max_prompt_tokens`), though `MaxInputTokens` would also be reasonable.
I'd prefer `MaxInputTokens`. That's the more modern terminology, right? And it maps to what we show in the CLI UI?
Hi @MackinnonBuck! Just wanted to follow up on this PR — is it still pending review? Adding token limit fields to ProviderConfig across all SDKs seems like a key capability for BYOK users who need to configure custom providers with specific limits. Would be great to see this move forward. Is there anything blocking it from being marked as ready for review? Thanks.
@MackinnonBuck, are you still working on this?
@stephentoub Thanks for the ping. I was holding off on this because we've had requests on the CLI side to allow for multiple provider configs / models per provider config, and I wanted to make sure we were properly accounting for that on the SDK APIs. However, adding new fields to the existing provider config SDK type probably doesn't make it any harder for us to adjust the API later, so it's probably fine to take this now. I'll work on getting this in a ready-to-review state tomorrow.
Adds the following optional fields to ProviderConfig across all SDKs:
- modelId: well-known model ID for agent config + token limit lookup
- wireModel: model name sent to the provider API for inference
- maxPromptTokens: prompt token cap (triggers compaction)
- maxOutputTokens: response token cap

Both modelId and wireModel default to the session's configured model when unset.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
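The defaulting rule stated in the commit message above can be sketched as follows. This is an illustration only — `resolve_models` is a hypothetical helper, and the model/deployment names in the test are made up.

```python
# Illustrative sketch: both modelId (capability/config lookup) and wireModel
# (the name sent to the provider API) fall back to the session's configured
# model when unset.

def resolve_models(session_model: str, provider: dict) -> tuple[str, str]:
    """Return (catalog_lookup_id, wire_model) for a request."""
    model_id = provider.get("modelId") or session_model
    wire_model = provider.get("wireModel") or session_model
    return model_id, wire_model
```

Setting only `wireModel` (e.g. to an Azure deployment name) keeps capability lookup on the well-known model, while setting only `modelId` changes the lookup without touching what the provider sees.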
Force-pushed from 47d0781 to f993631.
Extends existing provider-forwarding/serialization tests across all 4 SDKs to cover modelId, wireModel, maxPromptTokens, and maxOutputTokens. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Force-pushed from 4778c24 to 09c17f7.
Cross-SDK Consistency Review ✅

This PR maintains strong cross-SDK consistency. All four SDKs — Node.js, Python, Go, and .NET — have been updated in lockstep with equivalent fields, tests, and wire-format mappings.

Field naming (wire format → SDK)
All naming follows established language conventions (Go uses …).

Optionality
Go's approach (…).

Test coverage
All four SDKs have extended tests covering serialization/wire-format round-trips for the new fields. No gaps found.

No consistency issues identified. 🎉
```csharp
/// when not explicitly set.
/// </summary>
[JsonPropertyName("modelId")]
public string? ModelId { get; set; }
```
So there's:
- `SessionConfig.Model`
- `ProviderConfig.ModelId`
- `ProviderConfig.WireModel`

Help me understand the relationship? If I specify `SessionConfig.Model`, it's used as the default for both options on `ProviderConfig`, and those options on provider config then represent the two different groupings in which a model would be used, such that I can override one of them? Is there any situation where I would specify the same model ID for both `ModelId` and `WireModel`?
```csharp
/// when not explicitly set.
/// </summary>
[JsonPropertyName("wireModel")]
public string? WireModel { get; set; }
```
Out of the three:
- `SessionConfig.Model`
- `ProviderConfig.ModelId`
- `ProviderConfig.WireModel`

what's the meaning behind `ProviderConfig.ModelId` having an "Id" suffix and the other two not?
```csharp
/// </summary>
[JsonPropertyName("maxOutputTokens")]
public int? MaxOutputTokens { get; set; }
}
```
Is it possible to add some E2E tests that cover all of these?
Summary
Extends `ProviderConfig` across all four language SDKs with optional fields for model overrides and token limits. These fields match the runtime's wire format and let BYOK users decouple the model name sent to their provider from the well-known model ID used for agent configuration and capability lookup.

New fields
| Wire field | Node.js | Python | .NET | Go |
|---|---|---|---|---|
| `modelId` | `modelId?: string` | `model_id: str` | `string? ModelId` | `ModelID string` |
| `wireModel` | `wireModel?: string` | `wire_model: str` | `string? WireModel` | `WireModel string` |
| `maxPromptTokens` | `maxPromptTokens?: number` | `max_prompt_tokens: int` | `int? MaxPromptTokens` | `MaxPromptTokens int` |
| `maxOutputTokens` | `maxOutputTokens?: number` | `max_output_tokens: int` | `int? MaxOutputTokens` | `MaxOutputTokens int` |

All fields are optional.
- `modelId` — Well-known model ID used to look up agent configuration (tools, prompts, reasoning behavior) and default token limits from the capability catalog. Useful for fine-tunes that should inherit a base model's configuration. Defaults to the session's configured model (`SessionConfig.model`) when unset.
- `wireModel` — Model identifier sent to the provider API for inference. Use this when the name your provider knows (e.g. an Azure deployment name or custom fine-tune name) differs from the well-known model ID. Defaults to the session's configured model when unset.
- `maxPromptTokens` — Maximum tokens allowed in the prompt for a single request. The runtime triggers conversation compaction before sending a request when the prompt exceeds this limit.
- `maxOutputTokens` — Maximum tokens the model can generate in a single response.

`modelId` and `wireModel` are independent knobs — set just one or both depending on whether you need to override config lookup, wire transmission, or both.

Changes
- Node.js (`nodejs/src/types.ts`) — Added fields to the `ProviderConfig` interface
- Python (`python/copilot/session.py`) — Added fields to the `ProviderConfig` TypedDict
- Python (`python/copilot/client.py`) — Updated `_convert_provider_to_wire_format` to map new snake_case fields to camelCase
- .NET (`dotnet/src/Types.cs`) — Added nullable properties with `[JsonPropertyName]` attributes
- Go (`go/types.go`) — Added fields with `json:"...,omitempty"` tags

Testing
Extended existing provider-forwarding/serialization unit tests across all four SDKs to cover the new fields:
- Node.js (`nodejs/test/client.test.ts`) — Existing `forwards provider headers` tests for `session.create`/`session.resume` now also assert all four new fields on the wire payload
- Python (`python/test_client.py`) — Extended `test_create_session_forwards_provider_headers`/`_resume` to assert camelCase mapping for all four new fields
- .NET (`dotnet/test/Unit/SerializationTests.cs`) — Extended `ProviderConfig_CanSerializeHeaders_WithSdkOptions` to round-trip all four new fields with correct JSON property names
- Go (`go/types_test.go`) — Added `TestProviderConfig_JSONIncludesAllFields` (correct JSON tags) and `TestProviderConfig_JSONOmitsUnsetTokenFields` (`omitempty` works)

All four SDKs build clean (`tsc --noEmit`, `go build ./...`, `dotnet build`, `python -c "import copilot"`) and all unit tests pass.