Description
Copilot-generated bug report
@azure-tools/typespec-ts emitter generates incorrect base64 decoding for bytes responses with non-binary content types (e.g. application/xml)
- Package Name: @azure/planetarycomputer
- Package Version: 1.0.0-beta.1
- Operating system: Windows 11
- nodejs
  - version: 22.x
- browser
  - name/version:
- typescript
  - version: 5.x
- Is the bug related to documentation in
  - README.md
  - source code documentation
  - SDK API docs on https://learn.microsoft.com
Describe the bug
When a TypeSpec operation declares a response body as bytes with a non-binary content type like application/xml, the @azure-tools/typespec-ts v0.49.0 modular emitter generates a deserializer that incorrectly treats the response body as base64-encoded data. The generated code calls stringToUint8Array(result.body, "base64") on what is actually plain UTF-8 text (XML), producing garbled binary output.
This affects the Planetary Computer SDK's WMTS capabilities endpoints, whose TypeSpec definition is:
// From models.tiler.common.tsp in azure-rest-api-specs
model WmtsCapabilitiesXmlResponse {
  @statusCode statusCode: 200;
  @body body: bytes;
  @header contentType: "application/xml";
}
These endpoints return XML text (a WMTS capabilities document), not base64-encoded binary data.
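To make the failure mode concrete, here is a minimal sketch of what happens when XML text is run through a base64 decoder instead of being UTF-8 encoded. `lenientBase64Decode` is a hypothetical stand-in for `stringToUint8Array(value, "base64")`; it assumes a lenient decoder that skips characters outside the base64 alphabet, which may not match the real helper on every runtime (a strict decoder would throw instead). The sample XML string is illustrative, not an actual service response.

```typescript
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical stand-in for stringToUint8Array(value, "base64").
// Assumption: the decoder is lenient and skips characters outside the alphabet.
function lenientBase64Decode(value: string): Uint8Array {
  const out: number[] = [];
  let buffer = 0;
  let bits = 0;
  for (let i = 0; i < value.length; i++) {
    const index = BASE64_ALPHABET.indexOf(value.charAt(i));
    if (index === -1) continue; // e.g. '<', '?', '=', quotes, spaces
    buffer = ((buffer << 6) | index) & 0xffff; // keep only the bits still needed
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out.push((buffer >> bits) & 0xff);
    }
  }
  return new Uint8Array(out);
}

// Illustrative body, as it would arrive as a string with Content-Type: application/xml.
const xml = '<?xml version="1.0" encoding="UTF-8"?><Capabilities/>';

const wrongBytes = lenientBase64Decode(xml); // what the generated code effectively does
const rightBytes = new TextEncoder().encode(xml); // what the caller expects

const wrongText = new TextDecoder().decode(wrongBytes);
const rightText = new TextDecoder().decode(rightBytes);

console.log(rightText === xml); // the UTF-8 path round-trips
console.log(wrongText === xml); // the base64 path does not
```

The base64 path can never round-trip here: every 4 accepted characters become 3 bytes and every rejected character is dropped, so the result is always shorter than the input text.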
To Reproduce
- Generate the @azure/planetarycomputer SDK from the TypeSpec spec at specification/orbital/Microsoft.PlanetaryComputer (commit e19c31f75d537aa3fb4bd926e02d4968ee83910b)
- Call getMosaicsWmtsCapabilities(...) or getWmtsCapabilities(...) (any operation returning WmtsCapabilitiesXmlResponse)
- Observe that the returned Uint8Array contains garbled data instead of valid XML
Expected behavior
The returned Uint8Array should contain the UTF-8 bytes of the XML capabilities document. Decoding it with new TextDecoder().decode(result) should produce a valid XML string starting with <?xml version="1.0" or <Capabilities.
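A small check along these lines could be used to verify the expected behavior in a test. This is a sketch: `looksLikeXmlCapabilities` is a hypothetical helper name, and the two accepted prefixes are simply the ones mentioned above.

```typescript
// Hypothetical helper: returns true when the bytes decode to XML-looking text.
function looksLikeXmlCapabilities(bytes: Uint8Array): boolean {
  // fatal: true makes decode() throw on invalid UTF-8 instead of
  // silently substituting U+FFFD replacement characters.
  const decoder = new TextDecoder("utf-8", { fatal: true });
  let text: string;
  try {
    text = decoder.decode(bytes);
  } catch {
    return false;
  }
  const trimmed = text.replace(/^\s+/, ""); // tolerate leading whitespace
  return trimmed.indexOf("<?xml") === 0 || trimmed.indexOf("<Capabilities") === 0;
}

// Illustrative inputs, not real service responses.
const goodBytes = new TextEncoder().encode('<?xml version="1.0"?><Capabilities/>');
const badBytes = new Uint8Array([0xff, 0xfe, 0x00]); // not valid UTF-8

console.log(looksLikeXmlCapabilities(goodBytes));
console.log(looksLikeXmlCapabilities(badBytes));
```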
Screenshots
N/A
Additional context
Generated code (incorrect)
The emitter generates:
// In src/api/data/operations.ts
export async function _getMosaicsWmtsCapabilitiesDeserialize(
  result: PathUncheckedResponse,
): Promise<Uint8Array> {
  const expectedStatuses = ["200"];
  if (!expectedStatuses.includes(result.status)) {
    throw createRestError(result);
  }
  return typeof result.body === "string"
    ? stringToUint8Array(result.body, "base64") // ❌ WRONG: treats XML text as base64
    : result.body;
}
Expected generated code
return typeof result.body === "string"
  ? new TextEncoder().encode(result.body) // ✅ CORRECT: encodes XML text as UTF-8 bytes
  : result.body;
Root cause analysis
The bug is in the deserializeResponseValue function in packages/typespec-ts/src/modular/helpers/operationHelpers.ts:
case "bytes":
  if (format !== "binary" && format !== "bytes") {
    return `${nullOrUndefinedPrefix}typeof ${restValue} === 'string'
      ? ${stringToUint8ArrayReference}(${restValue}, "${format ?? "base64"}")
      : ${restValue}`;
  }
  return restValue;
When the format is not "binary" or "bytes", the code defaults to base64 decoding (format ?? "base64"). The format is determined upstream by isBinaryPayload(), which only returns true for content types classified as KnownMediaType.Binary (audio, image, video, octet-stream). Since application/xml is classified as KnownMediaType.Xml (not Binary), the emitter passes a non-"binary" format to deserializeResponseValue, which then defaults to base64.
The logic chain is:
- getDeserializePrivateFunction (line ~362) checks isBinaryPayload(context, response.type!.__raw!, contentTypes) to determine the format
- isBinaryPayload (in operationUtil.ts) only returns true for KnownMediaType.Binary content types
- application/xml → KnownMediaType.Xml → not binary → format is undefined
- deserializeResponseValue gets format = undefined → falls into the bytes case → uses format ?? "base64" → generates stringToUint8Array(body, "base64")
However, when the response is received over HTTP with Content-Type: application/xml, the body is plain text (not base64-encoded). The correct behavior would be to encode the string as UTF-8 bytes using new TextEncoder().encode().
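The chain can be modelled in a few lines to show where the base64 default comes from. This is a simplified sketch, not the emitter's code: `classify`, `formatFor`, and `emittedBytesExpression` are hypothetical names, the classifier is far cruder than the real KnownMediaType logic, and `result.body` and `stringToUint8Array` are hard-coded where the emitter would interpolate references.

```typescript
type MediaKind = "binary" | "xml" | "json" | "text" | "unknown";

// Hypothetical, simplified classifier standing in for the KnownMediaType logic.
function classify(contentType: string): MediaKind {
  const type = contentType.split(";")[0].trim().toLowerCase();
  if (/^(audio|image|video)\//.test(type) || type === "application/octet-stream") {
    return "binary";
  }
  if (type === "application/xml" || type === "text/xml" || /\+xml$/.test(type)) return "xml";
  if (type === "application/json" || /\+json$/.test(type)) return "json";
  if (/^text\//.test(type)) return "text";
  return "unknown";
}

// Models the isBinaryPayload check: only binary payloads yield a format.
function formatFor(contentType: string): "binary" | undefined {
  return classify(contentType) === "binary" ? "binary" : undefined;
}

// Models the bytes case: anything that is not "binary"/"bytes" falls back to base64.
function emittedBytesExpression(format: string | undefined): string {
  if (format === "binary" || format === "bytes") return "result.body";
  return `typeof result.body === 'string' ? stringToUint8Array(result.body, "${format ?? "base64"}") : result.body`;
}

const xmlFormat = formatFor("application/xml");
const xmlExpression = emittedBytesExpression(xmlFormat);
console.log(xmlFormat); // undefined, because XML is not classified as binary
console.log(xmlExpression); // contains the "base64" default
```

Nothing in this model ever looks at whether the non-binary content type is textual, which is the gap described above.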
Scope of impact
This affects any TypeSpec API that declares @body body: bytes with a non-binary content type such as application/xml, text/plain, text/html, etc. The generated deserializer will corrupt the response data by attempting to base64-decode plain text.
Binary content types (like image/png, application/octet-stream) are handled correctly because they go through the getBinaryResponse streaming path, bypassing this deserializer entirely.
Current workaround
The current workaround is a manual post-generation patch that replaces stringToUint8Array(result.body, "base64") with new TextEncoder().encode(result.body) in the generated operations file. This patch is fragile and will be overwritten on the next TypeSpec regeneration.
Both getMosaicsWmtsCapabilities and getWmtsCapabilities deserializers require this patch.
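If the patch has to be reapplied after each regeneration, it could be scripted roughly as follows. This is only a sketch: `patchDeserializer` is a hypothetical helper that works on the file contents as a string, it assumes the generated call text matches exactly, and `_getThumbnailDeserialize` in the sample is a made-up neighbour used to show that other deserializers are left alone.

```typescript
const BASE64_CALL = 'stringToUint8Array(result.body, "base64")';
const UTF8_CALL = "new TextEncoder().encode(result.body)";

// Hypothetical helper: patches the base64 call inside one named deserializer only.
function patchDeserializer(source: string, functionName: string): string {
  const start = source.indexOf(`function ${functionName}(`);
  if (start === -1) throw new Error(`Deserializer ${functionName} not found`);
  // Limit the search to this function: stop at the next top-level export, if any.
  const next = source.indexOf("\nexport ", start);
  const end = next === -1 ? source.length : next;
  const callAt = source.indexOf(BASE64_CALL, start);
  if (callAt === -1 || callAt >= end) return source; // already patched or not affected
  return source.slice(0, callAt) + UTF8_CALL + source.slice(callAt + BASE64_CALL.length);
}

// Illustrative stand-in for the generated operations file.
const generated = [
  "export async function _getWmtsCapabilitiesDeserialize(result: PathUncheckedResponse) {",
  '  return typeof result.body === "string" ? stringToUint8Array(result.body, "base64") : result.body;',
  "}",
  "export async function _getThumbnailDeserialize(result: PathUncheckedResponse) {",
  '  return typeof result.body === "string" ? stringToUint8Array(result.body, "base64") : result.body;',
  "}",
].join("\n");

const patched = patchDeserializer(generated, "_getWmtsCapabilitiesDeserialize");
console.log(patched);
```

Scoping the replacement to named functions avoids touching deserializers whose bodies really are base64-encoded, and the early return makes the patch safe to run more than once.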
Prior fix attempt in the API spec
A previous attempt was made to fix this at the spec level (in azure-rest-api-specs), but it was reverted on Jan 26, 2025 with the message "Revert 'Fix WMTS Capabilities Response Type for JavaScript SDK'". This suggests the fix should be in the emitter, not the spec.
Comparison with http-client-js emitter
The newer http-client-js emitter (in the typespec repo) handles this case correctly by setting encoding to "none" for non-JSON content types:
// From typespec repo: packages/http-client-js/src/common/serialization/encode.ts
if (isNonJsonTextualFormat(contentType)) {
  return "none"; // No encoding needed for text content types
}
This suggests the fix for @azure-tools/typespec-ts would be similar: when the content type is a text-based format (XML, plain text, etc.) and the body type is bytes, the deserializer should use TextEncoder instead of base64 decoding.
Suggested fix
In deserializeResponseValue, the bytes case should account for the response content type. When the content type is a text-based format (XML, text, etc.), the body string should be encoded as UTF-8 bytes rather than base64-decoded. For example:
case "bytes":
  if (format === "binary" || format === "bytes") {
    return restValue;
  }
  if (format === "text" || format === "xml") {
    // Text-based content: encode the string as UTF-8 bytes
    return `${nullOrUndefinedPrefix}typeof ${restValue} === 'string'
      ? new TextEncoder().encode(${restValue})
      : ${restValue}`;
  }
  // Default: base64 decode (for JSON-embedded bytes, etc.)
  return `${nullOrUndefinedPrefix}typeof ${restValue} === 'string'
    ? ${stringToUint8ArrayReference}(${restValue}, "${format ?? "base64"}")
    : ${restValue}`;
Alternatively, the isBinaryPayload check upstream could be extended to also recognize XML/text content types as requiring special handling when paired with bytes body type.
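For the upstream alternative, the missing piece is a predicate that tells text-based content types apart from everything else. A rough sketch of one possible shape follows. `isTextualContentType` and `bytesFormatFor` are hypothetical names, not existing functions in the emitter or in http-client-js, and the "text" format value is the assumed one from the suggestion above.

```typescript
// Hypothetical predicate: true for content types whose body arrives as plain text.
function isTextualContentType(contentType: string): boolean {
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mediaType = contentType.split(";")[0].trim().toLowerCase();
  if (mediaType.indexOf("text/") === 0) return true;
  if (mediaType === "application/xml") return true;
  // Structured syntax suffix, e.g. application/atom+xml.
  return /\+xml$/.test(mediaType);
}

// Hypothetical format selection: the binary check runs first, then the text check;
// undefined keeps the existing base64 default for everything else.
function bytesFormatFor(
  contentType: string,
  isBinary: boolean,
): "binary" | "text" | undefined {
  if (isBinary) return "binary";
  return isTextualContentType(contentType) ? "text" : undefined;
}

console.log(bytesFormatFor("application/xml; charset=utf-8", false));
console.log(bytesFormatFor("application/json", false));
```

Leaving JSON out of the predicate is deliberate in this sketch: bytes embedded in a JSON payload are the case where base64 decoding is still the right default.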