Validate llm-token-limit and llm-emit-token-metric for Anthropic on v1 APIM SKUs #18

@tjsullivan1

Context

The AI Gateway billing sample uses two APIM policies: llm-token-limit (product-level rate limiting) and llm-emit-token-metric (token metering). Support for non-OpenAI backends, including Anthropic Claude, is currently rolling out for these policies on v2 SKUs (BasicV2, StandardV2).

By June 2026, we should check whether these capabilities are also available on v1 SKUs (Developer, Basic, Standard, Premium).

Action Items

  • Check if llm-token-limit correctly counts consumed tokens from Anthropic responses (input_tokens/output_tokens format) on v1 SKUs
  • Check if llm-emit-token-metric can parse Anthropic response format natively — if so, the manual emit-metric workaround in api-anthropic.xml can be replaced
  • If native support works, simplify the Anthropic policy to use llm-emit-token-metric instead of custom emit-metric calls
  • If v1 support is confirmed, update the Known Limitations section in README.md to remove the SKU requirement note
  • Test with streaming responses (text/event-stream) to see if the streaming metering gap is also resolved
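
To check the counts that APIM reports against the backend's own numbers, it may help to extract the token usage directly from a captured Anthropic response. The sketch below is only an illustration: the function names are hypothetical, and the event shapes (usage on message_start and message_delta for streaming, a top-level usage object otherwise) reflect my understanding of the Anthropic Messages format and should be verified against the current API docs before relying on them.

```python
import json
from typing import Dict, Iterable


def usage_from_response(body: str) -> Dict[str, int]:
    """Token counts from a non-streaming Anthropic Messages response body."""
    usage = json.loads(body).get("usage", {})
    return {
        "input_tokens": int(usage.get("input_tokens", 0)),
        "output_tokens": int(usage.get("output_tokens", 0)),
    }


def usage_from_sse(lines: Iterable[str]) -> Dict[str, int]:
    """Token counts from the lines of a text/event-stream response.

    Assumption: usage appears under message.usage on "message_start" and
    under usage on "message_delta", and the reported values are cumulative,
    so the most recent value for each field is kept.
    """
    totals = {"input_tokens": 0, "output_tokens": 0}
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments, and blank separators
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore non-JSON payloads
        if not isinstance(event, dict):
            continue
        if event.get("type") == "message_start":
            usage = event.get("message", {}).get("usage", {})
        elif event.get("type") == "message_delta":
            usage = event.get("usage", {})
        else:
            continue
        for key in totals:
            if usage.get(key) is not None:
                totals[key] = int(usage[key])
    return totals
```

Running both helpers on responses captured behind a v1 gateway, and comparing the totals with the values emitted by llm-emit-token-metric and counted by llm-token-limit, would show whether the policies read the input_tokens/output_tokens format correctly for both regular and streaming calls.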

Related Files

  • infra/policies/api-anthropic.xml — custom emit-metric workaround
  • infra/policies/product-standard.xml / product-premium.xml — llm-token-limit
  • README.md — Known Limitations section
