10 changes: 10 additions & 0 deletions .specs/coding-plans.md
@@ -74,6 +74,8 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "S

3.5. While an Installed BYOK Configuration still contains Kilo's issued credential, user-facing BYOK surfaces **MUST** identify its Coding Plan origin. Ordinary BYOK test, enable/disable, update, and delete operations **MUST** remain available. Before updating, disabling, or deleting that configuration, Kilo **MUST** warn that the operation changes routing but does not cancel subscription billing and **MUST** direct cancellation to the Subscription Center. Updating the credential **MUST** mark the entry as user-managed and detach it from later Coding Plan cleanup; deleting it **MUST NOT** cancel or pause the subscription. Testing or re-enabling the key does not require this warning.

3.6. Cloud **MAY** query current Upstream Provider quota for an authenticated owner of an active or `past_due` Coding Plan by using the retained assigned Managed Plan Credential. Decryption and provider access **MUST** remain server-side. The Managed Plan Credential, inventory identity, Upstream Plan ID, fingerprint, ciphertext, and authorization metadata **MUST NOT** leave Cloud.

## 4. Credential provisioning and inventory

4.1. Kilo **MUST** acquire or provision Managed Plan Credentials before accepting a purchase that depends on them. For an offering initially provisioned by operator upload, only authorized administrative tooling **MAY** insert credentials into inventory.
@@ -118,6 +120,8 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "S

5.10. The initial pilot **MAY** leave an unchanged Kilo-installed BYOK configuration routable between its paid-period or grace deadline and the next scheduled billing lifecycle sweep. Once that sweep processes termination, local Kilo-installed access **MUST** be deleted regardless of whether manual upstream revocation is complete.

5.11. An active or `past_due` Coding Plan, including one pending cancellation at period end, **MUST** remain eligible for current quota presentation until Effective Cancellation. Replacing, disabling, or deleting its Installed BYOK Configuration **MUST NOT** hide the subscription or prevent quota lookup through the originally assigned Managed Plan Credential.
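A minimal sketch of the eligibility rule in 5.11, assuming a hypothetical `CodingPlanSubscription` shape. Only the `active` and `past_due` status names come from this specification; the `canceled` status and all field names are illustrative assumptions, not the schema used by Cloud.

```typescript
// Hypothetical shapes for illustration; only `active` and `past_due` appear in the spec.
type CodingPlanStatus = 'active' | 'past_due' | 'canceled';

interface CodingPlanSubscription {
  status: CodingPlanStatus;
  // True when cancellation is scheduled for period end but is not yet effective.
  cancelAtPeriodEnd: boolean;
  // null models a deleted Installed BYOK Configuration.
  installedByokConfig: { enabled: boolean; userManaged: boolean } | null;
}

// Eligibility reads the subscription status only. The pending-cancellation flag
// and the Installed BYOK Configuration are deliberately ignored.
function isQuotaEligible(sub: CodingPlanSubscription): boolean {
  return sub.status === 'active' || sub.status === 'past_due';
}
```

Under these assumptions, a `past_due` plan whose configuration was deleted still qualifies, while a plan that has reached Effective Cancellation does not.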

## 6. Traffic routing

6.1. Initial Token Plan Plus setup **MUST** route through the Kilo Gateway using the existing ordinary personal MiniMax BYOK provider identity. The initial release **MUST NOT** expose saved raw credential values through Kilo UI or API responses.
@@ -126,6 +130,8 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "S

6.3. Purchase **MUST** reject an occupied personal MiniMax BYOK slot before a charge or issued credential assignment commits. Once subscribed, a user's ordinary MiniMax BYOK actions affect routing configuration only; Coding Plan billing and revocation of Kilo's originally issued credential remain independent.

6.4. Current routing state **MUST** be reported separately from subscription and provider-quota state. Quota authorization **MUST NOT** depend on the existence, contents, or enabled state of the Installed BYOK Configuration.
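One way to keep the three states in 6.4 apart is to report them as separate fields that are derived independently. The sketch below is an assumption about a possible response shape; `RoutingState`, `QuotaState`, `CurrentPlanView`, and both function names are hypothetical and do not come from the codebase.

```typescript
// Hypothetical types for illustration only.
type SubscriptionStatus = 'active' | 'past_due';
type RoutingState = 'kilo_installed' | 'user_managed' | 'disabled' | 'missing';
type QuotaState =
  | { kind: 'available'; remainingPercent: number }
  | { kind: 'unavailable' };

interface InstalledByokConfig {
  enabled: boolean;
  userManaged: boolean;
}

interface CurrentPlanView {
  subscriptionStatus: SubscriptionStatus;
  routing: RoutingState;
  quota: QuotaState;
}

// Routing is derived from the Installed BYOK Configuration alone.
function describeRouting(config: InstalledByokConfig | null): RoutingState {
  if (config === null) return 'missing';
  if (!config.enabled) return 'disabled';
  return config.userManaged ? 'user_managed' : 'kilo_installed';
}

function buildCurrentPlanView(
  subscriptionStatus: SubscriptionStatus,
  config: InstalledByokConfig | null,
  quota: QuotaState
): CurrentPlanView {
  // Quota is passed through untouched: routing never gates or rewrites it.
  return { subscriptionStatus, routing: describeRouting(config), quota };
}
```

With this shape, a deleted configuration yields `routing: 'missing'` while the quota field still carries whatever the quota lookup returned.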

## 7. User-facing behavior

7.1. Users **MUST** be able to view catalog offerings, purchase a Coding Plan, view their subscription status and paid-period dates, and request cancellation from Kilo surfaces.
@@ -140,10 +146,14 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "S

7.6. A sold-out offering **MUST** display its unavailable state and **MUST** offer an authenticated user a way to record an Availability Notification Intent. Recording the same intent again **MUST** be idempotent, **MUST NOT** reserve capacity or initiate billing, and **MUST** show the saved intent state. A successful activation **MUST** clear the activated user's intent for that Plan ID.
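The idempotency requirement in 7.6 can be illustrated with an in-memory sketch. `AvailabilityIntentStore` and its methods are hypothetical names, and a real implementation would presumably rely on a unique database constraint rather than a `Set`.

```typescript
// Hypothetical in-memory store; illustrates idempotent recording only.
class AvailabilityIntentStore {
  private readonly intents = new Set<string>();

  // JSON-encode the pair so distinct (userId, planId) pairs never collide.
  private key(userId: string, planId: string): string {
    return JSON.stringify([userId, planId]);
  }

  // Recording is idempotent: a repeat call leaves exactly one saved intent.
  // It reserves no capacity and starts no billing; it only stores the pair.
  record(userId: string, planId: string): { saved: true } {
    this.intents.add(this.key(userId, planId));
    return { saved: true };
  }

  has(userId: string, planId: string): boolean {
    return this.intents.has(this.key(userId, planId));
  }

  // Called after a successful activation for this user and Plan ID.
  clearOnActivation(userId: string, planId: string): void {
    this.intents.delete(this.key(userId, planId));
  }

  get size(): number {
    return this.intents.size;
  }
}
```

Recording the same intent twice leaves one entry, and clearing it on activation does not touch intents saved by other users or for other Plan IDs.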

7.7. Authenticated Kilo clients **MAY** reuse Cloud's current personal billing and Coding Plan data to present current plans, routing state, and current provider quota. Ended plans, charged-term history, invoices, and billing history **MUST** remain in the Subscription Center rather than the current-plan response.

## 8. Security and observability

8.1. Logs and monitoring **MUST NOT** contain raw Managed Plan Credentials, credential-bearing authorization headers, provider-management secrets, or unfiltered provider/SDK key-test error content.

8.2. General administrative credential inventory responses **MUST** return non-secret status and remediation metadata only. For a `revocation_pending` or `revocation_failed` item, the manual-revocation admin console **MAY** display its Upstream Plan ID to authorized staff. Raw credential values **MUST NOT** be returned by queue, list, or remediation APIs or appear on customer surfaces.

8.3. The initial pilot does not require a Coding Plans audit-log history for admin inventory upload or manual revocation actions. Inventory lifecycle state, Upstream Plan ID, request/completion timestamps, attempt count, and sanitized failure information **MUST** record current disposition without retaining raw credentials after remediation starts.

8.4. Current quota responses and logs **MUST NOT** contain Managed Plan Credentials, authorization headers, raw provider bodies or messages, inventory metadata, Upstream Plan IDs, fingerprints, or ciphertext. Provider responses **MUST** be bounded, validated, and projected to an explicit non-secret field allowlist before leaving the Cloud boundary.
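The allowlist projection required by 8.4 can be sketched without any dependency. The implementation in this change set uses a zod schema for this step; the version below is a simplified illustration, and `ALLOWED_MODEL_FIELDS` and `projectModel` are hypothetical names covering only a subset of the provider fields.

```typescript
// Hypothetical allowlist; a subset of the fields named in the provider schema.
const ALLOWED_MODEL_FIELDS = [
  'model_name',
  'current_interval_remaining_percent',
  'current_weekly_remaining_percent',
] as const;

type ProjectedModel = Record<string, string | number>;

// Copy only allowlisted fields with primitive, finite values.
// Anything else in the provider object is dropped rather than forwarded.
function projectModel(raw: Record<string, unknown>): ProjectedModel {
  const out: ProjectedModel = {};
  for (const field of ALLOWED_MODEL_FIELDS) {
    const value = raw[field];
    if (typeof value === 'string') {
      out[field] = value;
    } else if (typeof value === 'number' && Number.isFinite(value)) {
      out[field] = value;
    }
  }
  return out;
}
```

Building the output from the allowlist, rather than deleting known-bad keys from the input, means a new provider field is excluded by default until someone adds it deliberately.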
20 changes: 19 additions & 1 deletion .specs/subscription-center.md
@@ -25,6 +25,7 @@ Updated 2026-05-28 -- Credit-funded payment source label.
Updated 2026-05-28 -- Coding Plans API key configuration summary.
Updated 2026-05-28 -- Coding Plans billing history USD amount display.
Updated 2026-06-05 -- KiloClaw final Commit term continuation behavior.
Updated 2026-06-19 -- Current Coding Plan quota presentation and routing independence.

## Conventions

@@ -273,7 +274,11 @@ historical Commit names, prices, invoices, and credit deductions.
27. A user MAY have multiple Coding Plans subscriptions — one per
configured Plan ID. The Coding Plans group MUST display one
Subscription Card for each non-terminal coding plan subscription,
including a `past_due` subscription in its warning state. Authenticated
Kilo clients MAY reuse the same current personal subscription data for
current-plan presentation outside the Subscription Center. These clients
MUST NOT include terminal history, invoices, or billing history in that
current-plan response.

28. The Coding Plans detail page MUST be served at
`/subscriptions/coding-plans/[subscriptionId]`.
@@ -294,9 +299,17 @@ historical Commit names, prices, invoices, and credit deductions.
linking to `/byok` when a managed key is installed
- Traffic routing information (Kilo Gateway through the ordinary
MiniMax BYOK provider setup)
- Current Upstream Provider quota for an `active` or `past_due` Coding Plan
when available, authorized through the retained Managed Plan Credential
without exposing it to the client
- Inline billing history showing credit transactions with amounts in USD
(see Billing History rules)

Current quota state and Installed BYOK Configuration routing state MUST be
presented separately. An active or `past_due` Coding Plan remains visible
and billable, and its current quota remains queryable, after its Installed
BYOK Configuration is replaced, disabled, or deleted.

Before update, disable, or delete, `/byok` MUST warn that routing changes
do not cancel or pause Token Plan Plus billing and cancellation is managed
in Subscription Center; customer surfaces MUST NOT include saved raw-key
@@ -463,6 +476,11 @@ not yet enforced in the current codebase:

## Changelog

### 2026-06-19 -- Current Coding Plan quota presentation

- Allowed authenticated Kilo clients to reuse current personal subscription data without moving billing history out of Subscription Center.
- Kept provider quota authorization on the retained Managed Plan Credential and independent from current BYOK routing state.

### 2026-06-05 -- KiloClaw final Commit continuation

- Replaced two-way post-cutoff plan switching with explicit final Commit continuation into lineage-priced Standard.
1 change: 1 addition & 0 deletions apps/web/src/lib/autoTopUpConstants.ts
@@ -2,6 +2,7 @@ import * as z from 'zod';

// Personal auto-top-up settings
export const AUTO_TOP_UP_THRESHOLD_DOLLARS = 5;
export const AUTO_TOP_UP_THRESHOLD_CENTS = AUTO_TOP_UP_THRESHOLD_DOLLARS * 100;
export const AUTO_TOP_UP_AMOUNTS_CENTS = [2000, 5000, 10000] as const;
export const DEFAULT_AUTO_TOP_UP_AMOUNT_CENTS: AutoTopUpAmountCents = 5000;
export const AutoTopUpAmountCentsSchema = z.union(AUTO_TOP_UP_AMOUNTS_CENTS.map(n => z.literal(n)));
116 changes: 116 additions & 0 deletions apps/web/src/lib/coding-plans/minimax-usage.test.ts
@@ -0,0 +1,116 @@
import { getMiniMaxUsage, MiniMaxUsageError } from '@/lib/coding-plans/minimax-usage';

const API_KEY = 'sk-cp-managed-secret';

function payload(overrides: Record<string, unknown> = {}) {
  return {
    base_resp: { status_code: 0, status_msg: 'provider message' },
    model_remains: [
      {
        model_name: 'general',
        current_interval_remaining_percent: 83,
        current_interval_status: 1,
        end_time: 1_781_280_000_000,
        current_weekly_remaining_percent: 72,
        current_weekly_status: 1,
        weekly_end_time: 1_781_884_800_000,
        unknown_secret: 'strip me',
      },
    ],
    unknown_top_level: 'strip me',
    ...overrides,
  };
}

function jsonResponse(body: unknown, init?: ResponseInit) {
  return new Response(JSON.stringify(body), {
    status: 200,
    headers: { 'content-type': 'application/json' },
    ...init,
  });
}

afterEach(() => {
  jest.restoreAllMocks();
});

describe('MiniMax managed usage transport', () => {
  it('uses the fixed endpoint and returns only allowlisted native fields', async () => {
    const request = jest.spyOn(global, 'fetch').mockResolvedValue(jsonResponse(payload()));

    const result = await getMiniMaxUsage(API_KEY);

    expect(request).toHaveBeenCalledWith(
      'https://api.minimax.io/v1/token_plan/remains',
      expect.objectContaining({
        method: 'GET',
        cache: 'no-store',
        redirect: 'error',
        signal: expect.any(AbortSignal),
        headers: {
          Accept: 'application/json',
          Authorization: `Bearer ${API_KEY}`,
        },
      })
    );
    expect(result).toEqual({
      base_resp: { status_code: 0 },
      model_remains: [
        {
          model_name: 'general',
          current_interval_remaining_percent: 83,
          current_interval_status: 1,
          end_time: 1_781_280_000_000,
          current_weekly_remaining_percent: 72,
          current_weekly_status: 1,
          weekly_end_time: 1_781_884_800_000,
        },
      ],
    });
    expect(JSON.stringify(result)).not.toContain('provider message');
    expect(JSON.stringify(result)).not.toContain('unknown_secret');
  });

  it('rejects declared and streamed oversized responses', async () => {
    jest
      .spyOn(global, 'fetch')
      .mockResolvedValueOnce(
        new Response('{}', { status: 200, headers: { 'content-length': String(64 * 1024 + 1) } })
      )
      .mockResolvedValueOnce(new Response('x'.repeat(64 * 1024 + 1), { status: 200 }));

    await expect(getMiniMaxUsage(API_KEY)).rejects.toMatchObject({ code: 'too_large' });
    await expect(getMiniMaxUsage(API_KEY)).rejects.toMatchObject({ code: 'too_large' });
  });

  it.each([
    ['http', new Response('raw upstream body', { status: 429 })],
    ['invalid_json', new Response('raw invalid json', { status: 200 })],
    ['invalid_schema', jsonResponse({ base_resp: { status_code: 0 }, model_remains: 'wrong' })],
    [
      'application',
      jsonResponse(payload({ base_resp: { status_code: 1004, status_msg: 'secret' } })),
    ],
  ] as const)('maps %s failures without exposing provider data', async (code, response) => {
    jest.spyOn(global, 'fetch').mockResolvedValue(response);

    await expect(getMiniMaxUsage(API_KEY)).rejects.toEqual(
      expect.objectContaining({
        code,
        message: 'MiniMax usage is temporarily unavailable.',
      })
    );
  });

  it('maps request failures to a safe network error', async () => {
    jest.spyOn(global, 'fetch').mockRejectedValue(new Error(`network body ${API_KEY}`));

    const error = await getMiniMaxUsage(API_KEY).catch(value => value);
    expect(error).toBeInstanceOf(MiniMaxUsageError);
    expect(error).toMatchObject({
      code: 'network',
      message: 'MiniMax usage is temporarily unavailable.',
    });
    expect(JSON.stringify(error)).not.toContain(API_KEY);
  });
});
139 changes: 139 additions & 0 deletions apps/web/src/lib/coding-plans/minimax-usage.ts
@@ -0,0 +1,139 @@
import 'server-only';

import * as z from 'zod';

const MINIMAX_USAGE_URL = 'https://api.minimax.io/v1/token_plan/remains';
const MINIMAX_USAGE_TIMEOUT_MS = 5_000;
const MINIMAX_USAGE_MAX_BYTES = 64 * 1024;

const NativePercentSchema = z.number().finite().min(0).max(100);
const NativeIntegerSchema = z.number().int().safe();

const MiniMaxModelRemainsSchema = z.object({
  model_name: z.string().min(1).max(128),
  current_interval_total_count: NativeIntegerSchema.nonnegative().optional(),
  current_interval_usage_count: NativeIntegerSchema.nonnegative().optional(),
  start_time: NativeIntegerSchema.nonnegative().optional(),
  end_time: NativeIntegerSchema.nonnegative().optional(),
  remains_time: NativeIntegerSchema.nonnegative().optional(),
  interval_boost_permill: NativeIntegerSchema.nonnegative().optional(),
  interval_boost_permille: NativeIntegerSchema.nonnegative().optional(),
  current_interval_remaining_percent: NativePercentSchema.optional(),
  current_interval_status: NativeIntegerSchema.optional(),
  current_weekly_total_count: NativeIntegerSchema.nonnegative().optional(),
  current_weekly_usage_count: NativeIntegerSchema.nonnegative().optional(),
  weekly_start_time: NativeIntegerSchema.nonnegative().optional(),
  weekly_end_time: NativeIntegerSchema.nonnegative().optional(),
  weekly_remains_time: NativeIntegerSchema.nonnegative().optional(),
  weekly_boost_permill: NativeIntegerSchema.nonnegative().optional(),
  weekly_boost_permille: NativeIntegerSchema.nonnegative().optional(),
  current_weekly_remaining_percent: NativePercentSchema.optional(),
  current_weekly_status: NativeIntegerSchema.optional(),
});

export const MiniMaxUsageNativeSchema = z.object({
  base_resp: z.object({
    status_code: NativeIntegerSchema,
  }),
  model_remains: z.array(MiniMaxModelRemainsSchema).max(64),
});

export type MiniMaxUsageNative = z.infer<typeof MiniMaxUsageNativeSchema>;

type MiniMaxUsageErrorCode =
  | 'network'
  | 'http'
  | 'too_large'
  | 'invalid_json'
  | 'invalid_schema'
  | 'application';

export class MiniMaxUsageError extends Error {
  readonly code: MiniMaxUsageErrorCode;

  constructor(code: MiniMaxUsageErrorCode) {
    super('MiniMax usage is temporarily unavailable.');
    this.name = 'MiniMaxUsageError';
    this.code = code;
  }
}

async function readBoundedText(response: Response): Promise<string> {
  // Reject early when the declared Content-Length already exceeds the cap.
  const declared = Number(response.headers.get('content-length'));
  if (Number.isFinite(declared) && declared > MINIMAX_USAGE_MAX_BYTES) {
    throw new MiniMaxUsageError('too_large');
  }

  // No stream available: buffer the body, then check its size.
  if (!response.body) {
    const buffer = await response.arrayBuffer();
    if (buffer.byteLength > MINIMAX_USAGE_MAX_BYTES) {
      throw new MiniMaxUsageError('too_large');
    }
    return new TextDecoder().decode(buffer);
  }

  // Stream the body and stop as soon as the running total exceeds the cap.
  const reader = response.body.getReader();
  const chunks: Uint8Array[] = [];
  let size = 0;

  while (true) {
    const chunk = await reader.read();
    if (chunk.done) break;
    if (!chunk.value) continue;

    size += chunk.value.byteLength;
    if (size > MINIMAX_USAGE_MAX_BYTES) {
      await reader.cancel().catch(() => undefined);
      throw new MiniMaxUsageError('too_large');
    }
    chunks.push(chunk.value);
  }

  // Concatenate the collected chunks into one buffer before decoding.
  const body = new Uint8Array(size);
  let offset = 0;
  for (const chunk of chunks) {
    body.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return new TextDecoder().decode(body);
}

export async function getMiniMaxUsage(apiKey: string): Promise<MiniMaxUsageNative> {
  // Any fetch rejection (timeout, blocked redirect, transport failure) becomes
  // a generic network error so the original message never propagates.
  const response = await fetch(MINIMAX_USAGE_URL, {
    method: 'GET',
    headers: {
      Accept: 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    cache: 'no-store',
    redirect: 'error',
    signal: AbortSignal.timeout(MINIMAX_USAGE_TIMEOUT_MS),
  }).catch(() => {
    throw new MiniMaxUsageError('network');
  });

  if (!response.ok) {
    // Discard the upstream error body unread so it cannot reach logs or callers.
    response.body?.cancel().catch(() => undefined);
    throw new MiniMaxUsageError('http');
  }

  const text = await readBoundedText(response).catch(error => {
    if (error instanceof MiniMaxUsageError) throw error;
    throw new MiniMaxUsageError('network');
  });
  const json = (() => {
    try {
      return JSON.parse(text);
    } catch {
      throw new MiniMaxUsageError('invalid_json');
    }
  })();
  const result = MiniMaxUsageNativeSchema.safeParse(json);
  if (!result.success) {
    throw new MiniMaxUsageError('invalid_schema');
  }
  // A non-zero status_code is a provider-level failure; status_msg is not part
  // of the schema, so it is never returned.
  if (result.data.base_resp.status_code !== 0) {
    throw new MiniMaxUsageError('application');
  }
  return result.data;
}