Assessment: Assessment Pipeline by vprashrex · Pull Request #122 · ProjectTech4DevAI/kaapi-frontend

vprashrex · 2026-04-21T10:28:10Z

Adds the Assessment module, a feature-gated workflow for running LLM evaluations across datasets and multiple model configurations side by side.

Features
Multi-step evaluation setup

Dataset tab — upload or select an existing dataset (CSV/Excel); preview sample rows
Column Mapper — assign each dataset column a role: Text, Attachment (image/PDF with url or base64 format), Ground Truth, or Skip
Prompt & Config — write a prompt template, define a structured output schema (field name + type), and select one or more model configs to run against
Review — summarise all inputs before submitting

Results tab

Lists all assessment runs grouped by experiment, with per-run status (pending / processing / completed / failed)
Drill into a run to see per-item results in a table; supports retry for failed runs
Export results as JSON, CSV, or XLSX (multi-run exports are zipped)

Real-time updates

SSE stream (/assessment/events) with a circuit-breaker and auto-reconnect so the results tab stays live without polling

Feature flag gating

The /assessment route is gated behind the assessment feature flag read from a server-set cookie — users without the flag are redirected to home

Introduced two files for re-usability

withQueryParams for consistency and to avoid hand-building query strings in each proxy route.
proxyJsonResponse is also in the same reusable proxy utility file: It wraps this repeated pattern:

const { status, data } = await apiClient(request, endpoint, init);
return NextResponse.json(data, { status });

File	Change
`middleware.ts`	Feature-gate added — `/assessment` redirects to home unless the `assessment` flag is in the user's cookie
`app/lib/authCookie.ts`	`kaapi_features` cookie helpers added (`set`, `clear`)
`app/lib/context/AuthContext.tsx`	`features` + `hasFeature()` exposed on auth context; listens to `FEATURES_UPDATED_EVENT`
`app/api/auth/logout/route.ts`	Clears the features cookie on logout
`app/api/users/me/route.ts`	Sets the features cookie on login from `user.features`
`app/(main)/datasets/page.tsx`	Import path fix: `types/dataset` → `types/datasets`
`app/(main)/keystore/page.tsx`	Removed duplicate `STORAGE_KEY` constant (moved to `app/lib/constants/keystore.ts`)

Summary by CodeRabbit

New Features
- Added comprehensive assessment functionality with multi-step workflow for dataset management, configuration creation, and evaluation result tracking.
- Introduced dataset upload and management with file parsing and preview capabilities.
- Added configuration system for managing prompt templates, model parameters, and output schemas.
- Implemented assessment results viewing, export, and retry capabilities.
- Integrated feature-flag gating for assessment access control.
- Enhanced navigation sidebar with assessment menu item.

…luation - Added AssessmentPage component with nested steps for dataset selection, column mapping, prompt editing, configuration, output schema, and review. - Integrated Zustand for state management of assessment dataset. - Created utility functions for schema conversion and event handling. - Enhanced Sidebar component to include a link to the new Assessment page. - Added necessary types for assessment evaluation and schema properties. - Implemented polling for evaluation status and indicators for processing, success, and failure. - Included loading states and error handling for API interactions. - Updated package dependencies to include zustand and xlsx.

coderabbitai · 2026-04-21T10:28:29Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ca0cec48-f63a-46c2-aaa2-d15a0033a51a

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

A comprehensive Assessment feature is introduced, including API routes for dataset, evaluation, and assessment management; a multi-step client-side workflow UI (dataset selection, column mapping, prompt editing, config selection, review); feature-flag infrastructure for access control; and updated auth context and navigation to support feature-gated Assessment access.

Changes

Cohort / File(s)	Summary
Assessment API Routes (Datasets) `app/api/assessment/datasets/route.ts`, `app/api/assessment/datasets/[dataset_id]/route.ts`	New endpoints for listing/creating datasets and retrieving/deleting individual datasets; includes optional file-content fetching and base64 encoding.
Assessment API Routes (Evaluations) `app/api/assessment/evaluations/route.ts`, `app/api/assessment/evaluations/[evaluation_id]/retry/route.ts`, `app/api/assessment/evaluations/[evaluation_id]/results/route.ts`	New endpoints for managing evaluation runs: list/create evaluations, retry failed runs, and fetch results with optional download conversion.
Assessment API Routes (Assessments) `app/api/assessment/assessments/route.ts`, `app/api/assessment/assessments/[assessment_id]/retry/route.ts`, `app/api/assessment/assessments/[assessment_id]/results/route.ts`	New endpoints for assessment-level operations: list assessments, retry, and retrieve results with download streaming support.
Config API Routes `app/api/configs/[config_id]/versions/route.ts`, `app/api/configs/route.ts`	Updated config endpoints to preserve query parameters in `GET` and add whitespace normalization; no functional logic changes.
Assessment Utilities `app/api/assessment/utils.ts`	New utility module for proxying assessment API calls, parsing JSON responses, and streaming file downloads.
Assessment Main Page & Client `app/assessment/page.tsx`, `app/assessment/AssessmentPageClient.tsx`	New assessment root page and client component managing three-tab workflow (Datasets, Config, Results); handles API-key selection, forbidden-access redirects, and form state persistence.
Assessment Workflow Steps `app/assessment/components/DatasetStep.tsx`, `app/assessment/components/ColumnMapperStep.tsx`, `app/assessment/components/PromptEditorStep.tsx`, `app/assessment/components/PromptAndConfigStep.tsx`, `app/assessment/config/ConfigurationStep.tsx`, `app/assessment/components/OutputSchemaStep.tsx`, `app/assessment/components/ReviewStep.tsx`	Multi-step form components covering dataset upload/selection, column role assignment, prompt templating with mention dropdown, configuration selection and creation, output schema definition, and final review/submission.
Assessment Results & Evaluation Display `app/assessment/components/EvaluationsTab.tsx`, `app/assessment/components/DataViewModal.tsx`	Components for displaying assessment/evaluation run results, supporting polling, nested status filtering, lazy config metadata loading, results preview/export, and retry actions.
Assessment Support Components `app/assessment/components/Stepper.tsx`, `app/assessment/components/JsonEditor.tsx`, `app/assessment/components/MultiConfigStep.tsx`, `app/assessment/components/CompactToggleSwitch.tsx`, `app/assessment/components/icons/ToggleThumbIcons.tsx`	Reusable UI components: multi-step navigator, JSON schema editor with syntax highlighting, multi-config selector, toggle switch, and toggle state icons.
Assessment Configuration & Types `app/assessment/config/api.ts`, `app/assessment/config/constants.ts`, `app/assessment/types.ts`, `app/assessment/store.ts`, `app/assessment/schemaUtils.ts`, `app/assessment/errorUtils.ts`	Configuration management API, model constants (OpenAI models with parameter definitions), assessment domain types, Zustand dataset store, schema-to-JSON-schema conversion, and forbidden-error detection utilities.
Feature Flag Infrastructure `app/lib/constants/featureFlags.ts`, `app/lib/featureState.ts`, `middleware.ts`	New feature flag constants and client-side feature state management; middleware updated to gate routes by feature flags and redirect unauthorized access.
Auth & Cookie Handling `app/lib/authCookie.ts`, `app/lib/types/auth.ts`, `app/lib/context/AuthContext.tsx`, `app/api/auth/logout/route.ts`, `app/api/users/me/route.ts`	New feature-cookie infrastructure; auth context extended with feature flags and event listeners; auth routes updated to set/clear feature cookies and emit auth-change events.
Navigation & Icons `app/lib/navConfig.ts`, `app/components/Sidebar.tsx`, `app/components/icons/index.tsx`, `app/components/icons/sidebar/AssessmentIcon.tsx`, `app/lib/types/nav.ts`	New Assessment navigation item with feature-flag gating; sidebar updated to conditionally render feature-flagged items; new AssessmentIcon SVG component added.
Type Definitions & Constants `app/lib/types/datasets.ts`, `app/lib/configTypes.ts`, `app/lib/constants/keystore.ts`, `app/lib/constants.ts`, `app/lib/colors.ts`	New Dataset type interface, config-related API response types, storage key constant, feature-update event constant, and white color value.
File Updates (Type & Import Changes) `app/(main)/datasets/page.tsx`	Updated Dataset type import from `@/app/lib/types/dataset` to `@/app/lib/types/datasets`.
Removed Constants `app/(main)/keystore/page.tsx`	Removed exported `STORAGE_KEY` constant (moved to `app/lib/constants/keystore.ts`).
Dependencies `package.json`	Added runtime dependencies: `cockatiel` (retry logic), `xlsx` (spreadsheet parsing), `zustand` (state management).
Speech-to-Text Component `app/components/speech-to-text/ModelComparisonCard.tsx`	New card component for displaying speech-to-text model evaluation results with expandable metrics and diff comparison.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Support new Open AI model configuration #87: Centralized model configuration data and helper functions for model selection and configuration parameter defaults.
UI: STT Evals #44: Sidebar navigation component modifications including feature-gated menu item handling.
Auth: User invitation flow #111: Auth context and cookie-handling infrastructure updates for feature flag management across session/storage.

Suggested reviewers

Ayush8923
nishika26

Poem

🐰 Hops through assessment's verdant garden,
Datasets bloom, columns map with pardon,
Prompts and configs dance on display,
Features gated, evaluations on their way!
Schemas emerge, results in review—
A complete workflow, fresh and new! 🌟

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 6.54% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Assessment: Assessment Pipeline' is vague and redundant, using 'Assessment' twice without clarifying the main change or scope.	Revise to a more descriptive title that specifies the primary feature, e.g., 'Add feature-gated assessment module with evaluation pipeline' or 'Implement assessment workflow with dataset/config/results tabs'.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/assesment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

app/lib/context/AuthContext.tsx (1)

126-139: ⚠️ Potential issue | 🟠 Major

Notify feature-flag listeners during logout too.

Line 138 clears apiKeys directly, bypassing persist, so the new kaapi-auth-changed event from Line 98 is not emitted on logout. That can leave same-tab feature-gated UI using stale flags after auth is cleared.

🐛 Proposed fix

     setSession(null);
     setCurrentUser(null);
     clearAllStorage();
-    setApiKeys([]);
+    persist([]);
   }, [persist]);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/lib/context/AuthContext.tsx` around lines 126 - 139, The logout function
currently clears apiKeys by calling setApiKeys([]) which bypasses the persist
hook and prevents the "kaapi-auth-changed" feature-flag event from firing;
change the logout flow in the logout callback to clear api keys via the existing
persist mechanism (use the same persist call used elsewhere to update/clear
"apiKeys" so it emits the kaapi-auth-changed event) instead of calling
setApiKeys directly, keeping the rest of the cleanup (setSession(null),
setCurrentUser(null), clearAllStorage()) intact and ensuring persist remains in
the dependency array.

🟠 Major comments (26)

app/components/speech-to-text/ModelComparisonCard.tsx-149-172 (1)
149-172: ⚠️ Potential issue | 🟠 Major

Add type="button" to non-submitting buttons and ensure the expand toggle is accessible.

Both buttons lack explicit type="button" attributes, causing them to default to type="submit" per HTML spec, which can accidentally trigger form submission if the component is used within a form. Additionally, the icon-only expand button needs an accessible name (aria-label) and expanded state indicator (aria-expanded), and both SVG icons should use aria-hidden="true".
Suggested changes
             {hasExpandedContent && (
               <button
+                type="button"
                 onClick={handleExpandToggle}
+                aria-expanded={isExpanded}
+                aria-label={
+                  isExpanded ? "Collapse model details" : "Expand model details"
+                }
                 className="p-1 rounded hover:bg-gray-100"
                 style={{ color: colors.text.secondary }}
               >
                 <svg
                   className="w-4 h-4"
+                  aria-hidden="true"
                   fill="none"
                   viewBox="0 0 24 24"
                   stroke="currentColor"
           {onClick && (
             <button
+              type="button"
               onClick={onClick}
               className="w-full py-1.5 rounded text-xs font-medium flex items-center justify-center gap-1"
               <svg
                 className="w-3 h-3"
+                aria-hidden="true"
                 fill="none"
                 viewBox="0 0 24 24"
                 stroke="currentColor"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/speech-to-text/ModelComparisonCard.tsx` around lines 149 -
172, The expand toggle currently defined inside the hasExpandedContent
conditional should be made non-submitting and accessible: add type="button" to
the button element that calls handleExpandToggle, add an accessible name (e.g.,
aria-label="Toggle details" or similar) and an aria-expanded attribute bound to
isExpanded, and mark the SVG icon as presentational with aria-hidden="true";
update any other nearby buttons lacking type attributes the same way to prevent
accidental form submission.
app/api/assessment/evaluations/route.ts-8-88 (1)
8-88: ⚠️ Potential issue | 🟠 Major

Use apiClient so auth relay and response status stay consistent.

These handlers manually enforce X-API-KEY, bypassing the shared cookie/header forwarding path, and they return 200 for every successful backend response. For evaluation submission, preserving backend statuses like 201 or 202 is important for clients tracking async work.
♻️ Proposed direction
 import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from "@/app/lib/apiClient";
 
 export async function GET(request: NextRequest) {
   try {
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
     const searchParams = request.nextUrl.searchParams.toString();
     const queryString = searchParams ? `?${searchParams}` : '';
 
-    const response = await fetch(`${backendUrl}/api/v1/assessment/evaluations${queryString}`, {
-      method: 'GET',
-      headers: {
-        'X-API-KEY': apiKey,
-      },
+    const { status, data } = await apiClient(
+      request,
+      `/api/v1/assessment/evaluations${queryString}`,
+      {
+        method: "GET",
+      },
+    );
-    });
-
-    const data = await response.json();
-
-    if (!response.ok) {
-      return NextResponse.json(data, { status: response.status });
-    }
-
-    return NextResponse.json(data, { status: 200 });
+    return NextResponse.json(data, { status });
Apply the same pattern in POST, passing the JSON body through apiClient and returning its status.
As per coding guidelines, app/api/**/*.{ts,tsx}: Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/evaluations/route.ts` around lines 8 - 88, The GET and
POST handlers in route.ts manually enforce X-API-KEY and call fetch directly,
bypassing the shared header/cookie relay and always returning 200; replace those
fetch flows with apiClient(request, endpoint, options) so auth/cookie forwarding
is preserved and actual backend statuses are returned: for GET call
apiClient(request,
`/api/v1/assessment/evaluations${request.nextUrl.searchParams.toString() ?
`?${request.nextUrl.searchParams.toString()}` : ''}`, { method: 'GET' }) and
return NextResponse.json(response.data, { status: response.status, headers:
response.headers }), and for POST call apiClient(request,
'/api/v1/assessment/evaluations', { method: 'POST', body: await request.json()
}) and return NextResponse.json(response.data, { status: response.status,
headers: response.headers }); remove the manual X-API-KEY checks and
console.error proxy messages in GET and POST so the handlers rely on apiClient
behavior.
app/api/features/route.ts-6-32 (1)
6-32: 🛠️ Refactor suggestion | 🟠 Major

Use the shared server proxy client here.

This route manually reconstructs auth forwarding instead of using the project’s apiClient, so it can drift from the shared X-API-KEY/Cookie relay and error-handling behavior used by other server routes.
♻️ Proposed direction
-import { cookies } from "next/headers";
 import { NextRequest, NextResponse } from "next/server";
+import { apiClient } from "@/app/lib/apiClient";
 
 export const dynamic = "force-dynamic";
 
 export async function GET(request: NextRequest) {
   try {
-    const cookieStore = await cookies();
-    const apiKey =
-      request.headers.get("X-API-KEY") ??
-      cookieStore.get("kaapi_api_key")?.value;
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: "Missing X-API-KEY header" },
-        { status: 401 },
-      );
-    }
-
-    const backendUrl = process.env.BACKEND_URL || "http://localhost:8000";
-    const response = await fetch(`${backendUrl}/api/v1/features`, {
+    const { status, data } = await apiClient(request, "/api/v1/features", {
       method: "GET",
-      headers: { "X-API-KEY": decodeURIComponent(apiKey) },
       cache: "no-store",
     });
 
-    const data = await response.json();
     return NextResponse.json(data, {
-      status: response.status,
+      status,
       headers: {
         "Cache-Control": "no-store, no-cache, must-revalidate",
       },
As per coding guidelines, app/api/**/*.{ts,tsx}: Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/features/route.ts` around lines 6 - 32, The GET handler currently
does its own cookie/header extraction and fetch; replace that logic to call the
shared apiClient(request, "/api/v1/features", { method: "GET", cache: "no-store"
}) so auth (X-API-KEY and Cookie) and error handling are forwarded consistently;
use the returned { status, data, headers } from apiClient to construct the
NextResponse (preserving Cache-Control from response.headers or setting
"no-store, no-cache, must-revalidate" as before) and remove manual cookies() and
decodeURIComponent(apiKey) handling in the GET function.
app/api/assessment/datasets/route.ts-8-87 (1)
8-87: ⚠️ Potential issue | 🟠 Major

Route through apiClient and preserve backend statuses.

Both handlers manually require X-API-KEY, so cookie-authenticated requests are rejected before the shared proxy layer can relay cookies. They also collapse every successful backend response to 200, which can hide 201 Created/202 Accepted from dataset uploads.
♻️ Proposed direction
 import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from "@/app/lib/apiClient";
 
 export async function GET(request: NextRequest) {
   try {
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
-
-    const response = await fetch(`${backendUrl}/api/v1/assessment/datasets`, {
-      method: 'GET',
-      headers: {
-        'X-API-KEY': apiKey,
-      },
+    const { status, data } = await apiClient(request, "/api/v1/assessment/datasets", {
+      method: "GET",
     });
-
-    const data = await response.json();
-
-    if (!response.ok) {
-      return NextResponse.json(data, { status: response.status });
-    }
-
-    return NextResponse.json(data, { status: 200 });
+    return NextResponse.json(data, { status });
Apply the same pattern in POST, passing the parsed formData as the request body and returning the status from apiClient.
As per coding guidelines, app/api/**/*.{ts,tsx}: Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/datasets/route.ts` around lines 8 - 87, The GET and POST
handlers in route.ts should use the shared apiClient(request, endpoint, options)
so cookies and X-API-KEY are relayed and backend status/headers are preserved;
remove manual request.headers.get('X-API-KEY') checks, call apiClient(request,
'/api/v1/assessment/datasets', { method: 'GET' }) for GET and apiClient(request,
'/api/v1/assessment/datasets', { method: 'POST', body: formData }) for POST (use
await request.formData() before calling), then return NextResponse.json(data, {
status, headers }) using the status and headers returned by apiClient instead of
always mapping successes to 200.
app/assessment/components/DataViewModal.tsx-24-51 (1)
24-51: ⚠️ Potential issue | 🟠 Major

Add dialog semantics and keyboard-close support.

This modal has no role="dialog", aria-modal, labelled title, Escape handler, or accessible name on the icon-only close button. That makes the data preview hard to operate with assistive tech/keyboard-only flows.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DataViewModal.tsx` around lines 24 - 51, The modal
markup in DataViewModal.tsx is missing dialog semantics and keyboard support;
update the outer modal container element (the fixed inset-0 wrapper or the inner
dialog container used for rendering) to include role="dialog" and
aria-modal="true", give the title element (the h3 that renders {title}) a stable
id (e.g., dataViewModalTitle) and add aria-labelledby referencing that id on the
dialog container, add an accessible label to the icon-only close button (e.g.,
aria-label="Close preview") instead of relying on color-only styling, and wire
an Escape key handler (onKeyDown on the container or a useEffect keyboard
listener) that calls the existing onClose function so keyboard users can close
with Esc. Ensure the stopPropagation click handler remains on the inner
container so clicks inside do not close the dialog.
app/api/assessment/assessments/[assessment_id]/results/route.ts-13-29 (1)
13-29: ⚠️ Potential issue | 🟠 Major

Use the shared proxy path for this backend call.

This route bypasses apiClient and forwards only X-API-KEY, so backend calls won’t receive cookies that the shared route-handler proxy normally relays. If binary passthrough is the blocker, centralize that support in apiClient instead of duplicating proxy logic.

As per coding guidelines, app/api/**/*.{ts,tsx} routes should use apiClient(request, endpoint, options) to forward backend requests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/assessments/`[assessment_id]/results/route.ts around lines
13 - 29, Replace the direct fetch in route.ts with the central apiClient call:
instead of building backendUrl and calling fetch with only 'X-API-KEY', call
apiClient(request,
`/api/v1/assessment/assessments/${assessment_id}/results${queryString}`, {
method: 'GET', headers: { 'X-API-KEY': apiKey } }) so the shared proxy will
forward cookies and handle binary passthrough; if apiClient doesn't yet support
binary passthrough, add that support there and remove duplicate proxy logic from
this route, keeping references to request, assessment_id, apiClient, and
queryString to locate where to change.
app/api/assessment/evaluations/[evaluation_id]/results/route.ts-13-29 (1)
13-29: ⚠️ Potential issue | 🟠 Major

Keep backend forwarding behind apiClient.

This direct fetch path only forwards X-API-KEY and drops cookies that the shared proxy helper is supposed to relay. If downloads need raw-response handling, consider extending apiClient for passthrough responses rather than bypassing it per route.

As per coding guidelines, app/api/**/*.{ts,tsx} routes should use apiClient(request, endpoint, options) to forward backend requests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/evaluations/`[evaluation_id]/results/route.ts around lines
13 - 29, This route is directly calling fetch and only forwards the X-API-KEY
header, dropping cookies and bypassing the shared proxy; replace the direct
fetch with the shared apiClient(request, endpoint, options) call so cookies and
other proxy logic are preserved (use apiClient(request,
`/api/v1/assessment/evaluations/${evaluation_id}/results${queryString}`, {
method: 'GET', headers: { 'X-API-KEY': apiKey } })). If you need raw-response
handling for downloads, extend apiClient to support a passthrough/raw response
mode and call that new option from this handler instead of bypassing apiClient.
app/api/assessment/assessments/[assessment_id]/retry/route.ts-7-32 (1)
7-32: ⚠️ Potential issue | 🟠 Major

Use the shared apiClient proxy here.

This route bypasses the server-side proxy helper and only forwards X-API-KEY, so any backend auth/session behavior that depends on forwarded cookies will be lost. Prefer the shared helper so header forwarding and response handling stay consistent.
Proposed direction
 import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from '@/app/lib/apiClient';

@@
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
     const { assessment_id } = await context.params;
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
-
-    const response = await fetch(
-      `${backendUrl}/api/v1/assessment/assessments/${assessment_id}/retry`,
-      {
-        method: 'POST',
-        headers: {
-          'X-API-KEY': apiKey,
-        },
-      }
-    );
-
-    const data = await response.json();
-    return NextResponse.json(data, { status: response.status });
+    const { status, data } = await apiClient(
+      request,
+      `/api/v1/assessment/assessments/${assessment_id}/retry`,
+      { method: 'POST' },
+    );
+
+    return NextResponse.json(data, { status });
As per coding guidelines, app/api/**/*.{ts,tsx} routes should use apiClient(request, endpoint, options) to forward backend requests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/assessments/`[assessment_id]/retry/route.ts around lines 7
- 32, The POST handler is calling fetch directly and bypasses the shared proxy;
update the POST function to call the existing apiClient(request, endpoint,
options) instead of fetch so cookies and forwarded headers are preserved. Build
the endpoint as `/api/v1/assessment/assessments/${assessment_id}/retry`, call
apiClient(request, endpoint, { method: 'POST', headers: { 'X-API-KEY': apiKey }
}) (or omit explicit headers if apiClient already forwards them), await the
response JSON and return it with NextResponse.json(data, { status:
response.status }); ensure you reference the POST function and the assessment_id
param when making the change.
app/assessment/schemaUtils.ts-19-20 (1)
19-20: ⚠️ Potential issue | 🟠 Major

Do not emit empty enum schemas.

If all enum values are blank, this produces enum: [], which makes the field impossible to satisfy. Trim/filter values first and skip or downgrade invalid enum properties instead.
Proposed fix
     } else if (p.type === 'enum') {
-      def = { type: 'string', enum: p.enumValues.filter(v => v.trim()) };
+      const enumValues = p.enumValues.map((v) => v.trim()).filter(Boolean);
+      if (enumValues.length === 0) return;
+      def = { type: 'string', enum: enumValues };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/schemaUtils.ts` around lines 19 - 20, When handling p.type ===
'enum' in schemaUtils.ts, trim and filter p.enumValues first (e.g., const vals =
p.enumValues.map(v => v.trim()).filter(Boolean)); if vals is non-empty set def =
{ type: 'string', enum: vals } otherwise do not emit an empty enum—fallback to a
plain string type (def = { type: 'string' }) or omit the enum constraint for
that property so you avoid producing enum: [] for property p.
package.json-25-25 (1)
25-25: ⚠️ Potential issue | 🟠 Major

Avoid adding vulnerable xlsx@0.18.5 for parsing user-uploaded files.

Version 0.18.5 is affected by two high-severity advisories: GHSA-4r6h-8v6p-xvw6 (Prototype Pollution) and GHSA-5pgg-2g8v-p4x9 (Regular Expression Denial of Service). Since DatasetStep parses user-provided spreadsheet content with this package, this introduces unnecessary supply-chain and input-processing risk. Use a maintained parser, implement server-side constraints on parsing, or choose a non-vulnerable distribution strategy before accepting arbitrary uploads.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package.json` at line 25, Your package currently depends on a vulnerable xlsx
version (xlsx@0.18.5) which is used by DatasetStep to parse user uploads; remove
or replace this dependency with a non-vulnerable, actively maintained parser (or
upgrade to a patched release) and update usages in DatasetStep to the new API.
Additionally, enforce server-side constraints when parsing untrusted
spreadsheets: validate MIME/type and extension before processing, enforce strict
file size and row/column limits, parse inside an isolated worker/process with
timeouts, and add input sanitization and error handling to DatasetStep to avoid
prototype pollution or ReDoS risks. Ensure tests cover the new parser and the
defensive checks.
app/api/assessment/assessments/[assessment_id]/results/route.ts-31-39 (1)
31-39: ⚠️ Potential issue | 🟠 Major

Stream file exports instead of materializing a Blob.

The misleading comment "stream the response through" contradicts the implementation: await response.blob() buffers the entire CSV/XLSX/ZIP payload in memory before responding. Use response.body directly instead to stream the response without buffering.

Same buffering issue exists in app/api/assessment/evaluations/[evaluation_id]/results/route.ts at line 33—both routes should be fixed.
Proposed streaming change
-      const blob = await response.blob();
       const headers = new Headers();
       headers.set('Content-Type', contentType);
       const disposition = response.headers.get('content-disposition');
       if (disposition) headers.set('Content-Disposition', disposition);
-      return new NextResponse(blob, { status: response.status, headers });
+      return new NextResponse(response.body, { status: response.status, headers });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/assessments/`[assessment_id]/results/route.ts around lines
31 - 39, The code currently buffers file exports by calling response.blob() in
the route handler (see the block that reads response.headers and calls await
response.blob()), which contradicts the "stream the response through" intent;
change the implementation to pipe the upstream response.body stream directly
into the NextResponse without materializing a Blob, e.g. use response.body as
the body for new NextResponse and copy relevant headers (Content-Type and
Content-Disposition) and status; apply the same change to the analogous handler
in app/api/assessment/evaluations/[evaluation_id]/results/route.ts where
response.blob() is used.
app/assessment/page.tsx-1-19 (1)
1-19: ⚠️ Potential issue | 🟠 Major

Move the Assessment route into the (main) route group.

This authenticated application route is currently at app/assessment/page.tsx; place it at app/(main)/assessment/page.tsx so it follows the repo's App Router grouping without changing the public /assessment URL.

As per coding guidelines, app/{(auth),(main)}/**/*.{ts,tsx} routes should be organized using Next.js App Router route groups: app/(auth)/ for unauthenticated flows and app/(main)/ for authenticated application routes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/page.tsx` around lines 1 - 19, The Assessment page route is in
the wrong route group; move the file that exports default async function
AssessmentPage (the component importing AssessmentPageClient, FeatureRouteGuard,
FeatureFlag, and getServerFeatureFlags) into the (main) route group so it lives
under the app/(main)/assessment route group while keeping the public /assessment
URL; after moving, verify and update any relative imports (e.g.,
AssessmentPageClient, FeatureRouteGuard, getServerFeatureFlags, FeatureFlag) so
they resolve from the new location and ensure the exported function signature
and behavior (checking initialFlags[FeatureFlag.ASSESSMENT] and calling
notFound()) remain unchanged.
app/api/assessment/evaluations/[evaluation_id]/retry/route.ts-7-42 (1)
7-42: 🛠️ Refactor suggestion | 🟠 Major

Use apiClient for this proxy route.

Same issue as app/api/assessment/assessments/route.ts: the handler reimplements header forwarding and JSON parsing that apiClient handles. Also address the Prettier formatting CI warning.

As per coding guidelines: "Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }".
♻️ Proposed refactor
-import { NextRequest, NextResponse } from 'next/server';
+import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from '@/app/lib/apiClient';
 
 interface RouteContext {
   params: Promise<{ evaluation_id: string }>;
 }
 
 export async function POST(request: NextRequest, context: RouteContext) {
   try {
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
     const { evaluation_id } = await context.params;
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
-
-    const response = await fetch(
-      `${backendUrl}/api/v1/assessment/evaluations/${evaluation_id}/retry`,
-      {
-        method: 'POST',
-        headers: {
-          'X-API-KEY': apiKey,
-        },
-      }
-    );
-
-    const data = await response.json();
-    return NextResponse.json(data, { status: response.status });
+    const { status, data } = await apiClient(
+      request,
+      `/api/v1/assessment/evaluations/${evaluation_id}/retry`,
+      { method: 'POST' }
+    );
+    return NextResponse.json(data, { status });
   } catch (error: unknown) {
     return NextResponse.json(
       {
         error: 'Failed to forward evaluation retry request',
         details: error instanceof Error ? error.message : String(error),
       },
       { status: 500 }
     );
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/evaluations/`[evaluation_id]/retry/route.ts around lines 7
- 42, The POST handler in route.ts reimplements header forwarding and JSON
parsing; replace the manual fetch logic with the shared apiClient utility: call
apiClient(request, `/api/v1/assessment/evaluations/${evaluation_id}/retry`, {
method: 'POST' }), then return NextResponse.json(response.data, { status:
response.status }) (or forward headers if needed) instead of manually extracting
X-API-KEY and parsing JSON; update the function POST and remove the custom
headers block and try/catch parsing duplication, and ensure the file formatting
matches Prettier (run formatter) to resolve the CI warning.
app/api/assessment/assessments/route.ts-3-32 (1)
3-32: 🛠️ Refactor suggestion | 🟠 Major

Use apiClient instead of raw fetch for backend forwarding.

Route handlers under app/api/** must use apiClient(request, endpoint, options) which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }. This eliminates manual key validation, query reconstruction, and JSON parsing boilerplate. Also run npx prettier --write on this file to resolve CI warnings.
♻️ Proposed refactor
-import { NextRequest, NextResponse } from 'next/server';
-
-export async function GET(request: NextRequest) {
-  try {
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
-    const searchParams = request.nextUrl.searchParams.toString();
-    const queryString = searchParams ? `?${searchParams}` : '';
-
-    const response = await fetch(`${backendUrl}/api/v1/assessment/assessments${queryString}`, {
-      method: 'GET',
-      headers: {
-        'X-API-KEY': apiKey,
-      },
-    });
-
-    const data = await response.json();
-    return NextResponse.json(data, { status: response.status });
-  } catch (error: unknown) {
-    return NextResponse.json(
-      { error: 'Failed to forward request to backend', details: error instanceof Error ? error.message : String(error) },
-      { status: 500 }
-    );
-  }
-}
+import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from '@/app/lib/apiClient';
+
+export async function GET(request: NextRequest) {
+  try {
+    const qs = request.nextUrl.searchParams.toString();
+    const endpoint = `/api/v1/assessment/assessments${qs ? `?${qs}` : ''}`;
+    const { status, data } = await apiClient(request, endpoint, { method: 'GET' });
+    return NextResponse.json(data, { status });
+  } catch (error: unknown) {
+    return NextResponse.json(
+      {
+        error: 'Failed to forward request to backend',
+        details: error instanceof Error ? error.message : String(error),
+      },
+      { status: 500 }
+    );
+  }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/assessments/route.ts` around lines 3 - 32, Replace the
manual fetch flow inside the exported GET function: remove the manual X-API-KEY
header check, backendUrl/queryString construction, and raw fetch/response.json
logic, and instead call apiClient(request, '/api/v1/assessment/assessments' +
request.nextUrl.search, { method: 'GET' }) (or appropriate options), then return
NextResponse.json(result.data, { status: result.status, headers: result.headers
}) using the returned { status, data, headers } from apiClient; also run npx
prettier --write on this file to fix CI formatting warnings.
app/lib/FeatureFlagProvider.tsx-38-84 (1)

38-84: ⚠️ Potential issue | 🟠 Major

SSR-provided flags can be wiped on first client render.

getServerFeatureFlags() resolves flags from the kaapi_api_key cookie, while fetchFlags here gates on the STORAGE_KEY localStorage entry. These two auth sources can disagree:

A signed-in user whose API key only lives in the HttpOnly cookie (or whose localStorage hasn’t populated yet at first render) will hit lines 57–61, setting flags to {} and overwriting the server snapshot that initialFlags just seeded on line 39.

Consumers like FeatureRouteGuard / Sidebar that gate on isEnabled will briefly (or persistently) show “feature disabled” even though the server already determined it was enabled.

Consider either (a) skipping the localStorage gate when hasServerSnapshot is true and trusting the server until a real fetch returns a new answer, or (b) attempting the /api/features call unconditionally (the route already returns 401 when unauthenticated, which you can treat as “no change”). This preserves the SSR→CSR continuity the initialFlags prop was designed for.
app/api/assessment/datasets/[dataset_id]/route.ts-42-54 (1)
42-54: ⚠️ Potential issue | 🟠 Major

S3 download has no timeout and buffers the full file in memory.

Two reliability/scalability concerns in the fetch_content=true branch:

No timeout on line 47. If the signed URL hangs (slow S3 region, network partition, dead DNS), this fetch has no AbortSignal.timeout(...) and will tie up the serverless/Node handler indefinitely. On platforms with concurrency-per-instance caps, a handful of such requests can exhaust the pool.

Whole-file base64 in memory on line 51–52. arrayBuffer() + Buffer.from(...).toString('base64') allocates the full file plus ~1.33× for the base64 copy. For even moderately large assessment datasets (tens of MB), this is a large transient allocation on every call and the response body is then serialized through NextResponse.json. Given DatasetStep.tsx just base64-decodes it back to parse with XLSX, a much better contract would be to return the signed_url to the client and let the browser download + parse directly, or to stream the bytes through as application/octet-stream.
🛡️ Minimal fix (timeout), plus streaming suggestion
-      const fileResponse = await fetch(signedUrl);
+      const fileResponse = await fetch(signedUrl, {
+        signal: AbortSignal.timeout(30_000),
+      });
Longer term, consider returning the signed URL directly (client-side fetch already works for presigned S3 URLs without CORS issues if the bucket is configured) or streaming via new Response(fileResponse.body, { headers: { 'content-type': ... } }) instead of buffering + base64.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/datasets/`[dataset_id]/route.ts around lines 42 - 54, The
fetch_content=true branch currently does an untimeouted fetch(signedUrl) and
buffers the entire file (arrayBuffer + Buffer.from(...).toString('base64')),
which can hang handlers and OOM; change the logic in the fetchContent block to
(1) use an AbortController with a reasonable timeout (e.g., 10s) and pass
controller.signal to fetch(signedUrl) to avoid hanging requests, and (2) stop
base64-buffering large files on the server — instead return the signedUrl back
to the client (or stream the fileResponse.body with new Response(...) and
appropriate content-type) via NextResponse.json({ ...data, signed_url: signedUrl
}) so DatasetStep.tsx can fetch/parse in the browser rather than using
arrayBuffer/Buffer.from on the server. Ensure you update references to
signedUrl, fileResponse, and remove the arrayBuffer/Buffer.from usage in that
branch.
app/api/assessment/datasets/[dataset_id]/route.ts-10-64 (1)
10-64: ⚠️ Potential issue | 🟠 Major

Use apiClient to forward the backend request per coding guidelines.

This route uses raw fetch() to call the backend instead of the required apiClient(request, endpoint, options). The coding guideline for app/api/** routes states:

"Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }"

Currently this route:

Only reads X-API-KEY from the request headers (line 15), missing the cookie fallback that apiClient provides — callers using cookie-based auth will 401

Mirrors the entire upstream response body on error (line 38) instead of extracting a normalized message using the pattern body.error || body.message || body.detail

Re-implements query-string passthrough manually

Additionally:

The S3 signed URL fetch (line 47) has no timeout, risking indefinite thread pins if the connection hangs

Loading the entire file into memory as an ArrayBuffer and base64-encoding it (lines 51–53) causes memory spikes and ~33% size overhead, which can exhaust memory on larger datasets or constrained runtimes

Refactor to use apiClient() and consider returning the signed URL to the client instead of fetching the file server-side.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/datasets/`[dataset_id]/route.ts around lines 10 - 64, The
GET route handler currently calls fetch() directly inside the exported GET
function and manually parses query params, reads only X-API-KEY, mirrors
upstream bodies on error, and downloads the signed S3 file into memory; replace
that call with apiClient(request, `/assessment/datasets/${dataset_id}`, {
method: 'GET', params: <forwarded searchParams with include_signed_url when
fetch_content> }) so headers/cookies are relayed automatically, stop manual
URL/query-string handling, and use the apiClient response shape ({ status, data,
headers }) to return normalized errors (use data.error || data.message ||
data.detail) instead of mirroring the backend body; do not fetch the signed URL
server-side in GET (remove the fileResponse/arrayBuffer/base64 logic) — return
the signed_url to the client or, if you must fetch server-side, implement a
bounded timeout and stream-to-client approach instead of Buffer.from to avoid
memory spikes.
app/assessment/components/DatasetStep.tsx-188-204 (1)
188-204: ⚠️ Potential issue | 🟠 Major

Clear stale columns when dataset parsing fails.

setDatasetId(id) runs before fetchAndParseFile() succeeds. If parsing fails or returns no headers, the selected dataset ID can remain paired with columns/sample data from the previous dataset, and Line 285 can enable “Next” with invalid mapping context.

Track the parsed dataset ID, reset columns on failure, and gate Next on successful parsing.
🐛 Suggested direction
+const [parsedDatasetId, setParsedDatasetId] = useState("");

 const handleDatasetSelect = async (id: string) => {
   setDatasetId(id);
+  setParsedDatasetId("");
   if (!id) return;

   setIsLoadingColumns(true);
   try {
     const parsed = await fetchAndParseFile(id);
-    if (parsed?.headers) {
+    if (parsed?.headers?.length) {
       const firstRow = parsed.rows[0] || [];
       const sampleRow = Object.fromEntries(
         parsed.headers.map((header, index) => [
           header,
           String(firstRow[index] ?? ""),
         ]),
       );
       onColumnsLoaded(parsed.headers, sampleRow);
+      setParsedDatasetId(id);
+    } else {
+      onColumnsLoaded([], {});
+      toast.error("Could not read columns from this dataset");
     }
   } catch (e) {
     console.error("Failed to fetch dataset columns:", e);
+    onColumnsLoaded([], {});
   } finally {
     setIsLoadingColumns(false);
   }
 };

-const canProceed = datasetId && !isLoadingColumns;
+const canProceed = Boolean(datasetId) && parsedDatasetId === datasetId && !isLoadingColumns;
Also applies to: 285-285
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 188 - 204,
handleDatasetSelect currently sets setDatasetId(id) before fetchAndParseFile
succeeds, so if parsing fails or returns no headers the UI can keep stale
columns; modify handleDatasetSelect to clear/reset columns/sample data (call
onColumnsLoaded with empty arrays/object or a dedicated reset handler) and clear
any parsedDatasetId state before calling fetchAndParseFile, then only call
setDatasetId and onColumnsLoaded with real headers/sampleRow after a successful
parse; ensure setIsLoadingColumns is used around the async call and gate the
"Next" button (the logic that checks parsedDatasetId or columns) to require a
successful parse (use a parsedDatasetId or a boolean like isColumnsLoaded) so
Next is disabled unless parsing produced headers.
app/assessment/AssessmentPageClient.tsx-95-144 (1)
95-144: 🛠️ Refactor suggestion | 🟠 Major

Use clientFetch() for assessment API requests.

The polling and submit requests are made with raw fetch(), bypassing centralized 401 refresh handling and auth-expiry dispatch.
♻️ Suggested pattern
-const response = await fetch("/api/assessment/assessments?limit=10", {
+const response = await clientFetch("/api/assessment/assessments?limit=10", {
   headers: { "X-API-KEY": selectedKey.key },
 });
As per coding guidelines, **/*.{ts,tsx} should use clientFetch(endpoint, options) on the client-side for API requests, which handles token refresh on 401 errors and dispatches AUTH_EXPIRED_EVENT when refresh fails.

Also applies to: 190-224
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 95 - 144, Replace raw
fetch calls in the client-side assessment API logic with the centralized helper
clientFetch to ensure token refresh/401 handling; specifically update the
pollEvalStatus function (and the related submit function referenced around lines
190-224) to call clientFetch("/api/assessment/assessments?limit=10", { headers:
{ "X-API-KEY": selectedKey.key }, /* include same options as before */ })
instead of fetch, then preserve the existing response.ok check, JSON parsing,
and subsequent logic (runs detection, hasProcessing check, setEvalIndicator
branches, and dismissedRef usage). Ensure you import/usage-reference clientFetch
where pollEvalStatus and the submit handler are defined and keep the same error
handling behavior (catch block) semantics.
app/assessment/components/PromptAndConfigStep.tsx-512-537 (1)
512-537: ⚠️ Potential issue | 🟠 Major

Check MAX_CONFIGS before saving a new behavior.

When the selection is already full, this still persists a new config, addSelection() rejects it, and the UI then resets + shows “saved and added”. Add the cap check before saveAssessmentConfig() and only show success after the selection is actually added.
🐛 Suggested guard
 const handleCreateAndAdd = async () => {
   if (!configName.trim()) {
     toast.error("Configuration name is required");
     return;
   }
+  if (configs.length >= MAX_CONFIGS) {
+    toast.error(`You can select up to ${MAX_CONFIGS} configurations`);
+    return;
+  }
   setIsSaving(true);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/PromptAndConfigStep.tsx` around lines 512 - 537,
Before calling saveAssessmentConfig in handleCreateAndAdd, check the current
selection count against MAX_CONFIGS (use whatever selection state or selector
the component uses) and abort with toast.error if full; only proceed to save
after confirming there is room. Move the persistence and UI-reset logic
(saveAssessmentConfig, invalidateAssessmentConfigCache,
setDraft(buildInitialDraft()), setConfigName(""), setCommitMessage(""),
setConfigMode("existing"), and toast.success) to occur after addSelection
succeeds, and handle addSelection failure by showing an error and not resetting
the form. Ensure you still call loadConfigs(0, true) after successful
addSelection and keep references to addSelection, saveAssessmentConfig,
handleCreateAndAdd, and MAX_CONFIGS when updating the flow.
app/assessment/config/ConfigurationStep.tsx-217-438 (1)
217-438: ⚠️ Potential issue | 🟠 Major

Avoid stale selection state after async config loads.

toggleVersionSelection() awaits fetchConfigSelection(), then addSelection() writes using the configs array captured before the request. Parallel selections can overwrite each other or apply max/duplicate checks against stale state.

Use a functional setter type for setConfigs, or a latest-configs ref, so async completions merge with the current selection list.
🐛 Suggested direction
 interface ConfigurationStepProps {
   configs: ConfigSelection[];
-  setConfigs: (configs: ConfigSelection[]) => void;
+  setConfigs: React.Dispatch<React.SetStateAction<ConfigSelection[]>>;
 }

 const removeSelection = useCallback(
   (configId: string, version: number) => {
-    setConfigs(
-      configs.filter(
+    setConfigs((prev) =>
+      prev.filter(
         (item) =>
           !(item.config_id === configId && item.config_version === version),
       ),
     );
   },
-  [configs, setConfigs],
+  [setConfigs],
 );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/ConfigurationStep.tsx` around lines 217 - 438, The
async path in toggleVersionSelection leads to stale writes because addSelection
uses the captured configs array; change addSelection to use the functional state
updater form (call setConfigs(prev => { perform duplicate/max checks against
prev and return [...prev, selection] })) so duplicate detection and MAX_CONFIGS
enforcement operate on the latest list; ensure any other places that call
setConfigs from async callbacks (e.g., toggleVersionSelection) follow the same
pattern or read from a latest-configs ref.
app/assessment/components/EvaluationsTab.tsx-186-431 (1)
186-431: 🛠️ Refactor suggestion | 🟠 Major

Use clientFetch() for client-side API calls.

These direct fetch() calls bypass the project’s token-refresh and auth-expiry handling. Keep the X-API-KEY header, but route the request through clientFetch(endpoint, options).
♻️ Suggested pattern
-const response = await fetch('/api/assessment/assessments', {
+const response = await clientFetch('/api/assessment/assessments', {
   headers: { 'X-API-KEY': apiKey },
 });
As per coding guidelines, **/*.{ts,tsx} should use clientFetch(endpoint, options) on the client-side for API requests, which handles token refresh on 401 errors and dispatches AUTH_EXPIRED_EVENT when refresh fails.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 186 - 431,
Multiple client-side fetch calls (loadAssessments, loadChildRuns,
loadConfigDetail, triggerDownload, handleRerun, handleRetryAssessment,
handlePreview) bypass the project's token-refresh/auth-expiry logic; replace
each direct fetch(...) with clientFetch(endpoint, options) while preserving the
existing headers (including 'X-API-KEY'), HTTP methods, query strings, response
checks, JSON/blob parsing, and error handling. Ensure you import/ensure
clientFetch is available in this module and use it in the same places and shapes
as the original fetch calls so behavior (response.ok checks, parsing, thrown
errors, and finally blocks) remains identical but benefits from token refresh
and AUTH_EXPIRED_EVENT handling.
app/assessment/components/DatasetStep.tsx-58-255 (1)
58-255: 🛠️ Refactor suggestion | 🟠 Major

Use clientFetch() for client-side API calls.

Dataset list/load/upload/delete requests currently use raw fetch(), so 401 refresh and AUTH_EXPIRED_EVENT behavior is skipped.
♻️ Suggested pattern
-const response = await fetch("/api/assessment/datasets", {
+const response = await clientFetch("/api/assessment/datasets", {
   headers: { "X-API-KEY": apiKey },
 });
As per coding guidelines, **/*.{ts,tsx} should use clientFetch(endpoint, options) on the client-side for API requests, which handles token refresh on 401 errors and dispatches AUTH_EXPIRED_EVENT when refresh fails.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 58 - 255, Several
client-side API calls use raw fetch (notably in fetchAndParseFile, loadDatasets,
handleCreateDataset, handleDatasetSelect indirectly, handleViewDataset, and
handleDeleteDataset), which bypasses token refresh and AUTH_EXPIRED_EVENT;
replace those fetch calls with clientFetch(endpoint, options) and preserve all
existing options (method, body/formData, headers including "X-API-KEY": apiKey)
and existing response/error handling logic, import clientFetch at the top of the
file, and ensure you still call await response.json() and check response.ok
exactly as before so behavior remains the same except for automatic
refresh/AUTH_EXPIRED_EVENT handling.
app/assessment/components/PromptAndConfigStep.tsx-345-394 (1)
345-394: ⚠️ Potential issue | 🟠 Major

Avoid stale configs writes after async selection fetches.

addSelection() closes over the configs array from the render that started the request. If users select multiple versions quickly, later async completions can call setConfigs([...configs, selection]) with stale state and drop an earlier selection.

Prefer a functional state update by widening the prop type to React.Dispatch<React.SetStateAction<ConfigSelection[]>>, or keep a synchronized ref for the latest configs before applying async results.
🐛 Suggested direction
 interface PromptAndConfigStepProps {
-  setConfigs: (configs: ConfigSelection[]) => void;
+  setConfigs: React.Dispatch<React.SetStateAction<ConfigSelection[]>>;
 }

 const addSelection = useCallback(
   (selection: ConfigSelection) => {
-    if (
-      configs.some(
+    setConfigs((prev) => {
+      if (
+        prev.some(
           (c) =>
             c.config_id === selection.config_id &&
             c.config_version === selection.config_version,
-        )
-      ) {
-        toast.error("This configuration version is already selected");
-        return;
+        )
+      ) {
+        return prev;
       }
-      if (configs.length >= MAX_CONFIGS) {
-        toast.error(`You can select up to ${MAX_CONFIGS} configurations`);
-        return;
+
+      if (prev.length >= MAX_CONFIGS) {
+        return prev;
       }
-      setConfigs([...configs, selection]);
+
+      return [...prev, selection];
+    });
   },
-  [configs, setConfigs, toast],
+  [setConfigs],
 );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/PromptAndConfigStep.tsx` around lines 345 - 394,
addSelection currently closes over the stale configs array and uses
setConfigs([...configs, selection]) which can drop selections when async fetches
(in toggleVersionSelection -> fetchConfigSelection) resolve out of order; change
addSelection to use the functional state updater form (call setConfigs(prev =>
[...prev, selection])) or change the configs prop type to
React.Dispatch<React.SetStateAction<ConfigSelection[]>> and update all callers
accordingly so asynchronous completions in toggleVersionSelection/addSelection
safely append without relying on a captured configs variable.
app/assessment/config/api.ts-95-100 (1)
95-100: 🛠️ Refactor suggestion | 🟠 Major

Use the exported CACHE_KEY constant instead of hardcoding the string.

constants.ts already exports CACHE_KEY = 'kaapi_configs_cache' (line 276), but here the same literal is duplicated. If the key is ever renamed in constants.ts, cache invalidation will silently stop working. Import and use CACHE_KEY to stay in sync with the rest of the module.
♻️ Proposed fix
-import { CACHE_INVALIDATED_EVENT } from "./constants";
+import { CACHE_INVALIDATED_EVENT, CACHE_KEY } from "./constants";
@@
 export function invalidateAssessmentConfigCache(): void {
   if (typeof window === "undefined") return;

-  localStorage.removeItem("kaapi_configs_cache");
+  localStorage.removeItem(CACHE_KEY);
   window.dispatchEvent(new CustomEvent(CACHE_INVALIDATED_EVENT));
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/api.ts` around lines 95 - 100, The function
invalidateAssessmentConfigCache currently hardcodes the cache key string; import
and use the exported CACHE_KEY constant (from constants.ts) instead of the
literal "kaapi_configs_cache" inside invalidateAssessmentConfigCache and in any
localStorage.removeItem calls so the key stays in sync; update the top-of-file
imports to include CACHE_KEY and replace the hardcoded string in
localStorage.removeItem with CACHE_KEY.
app/assessment/config/api.ts-178-201 (1)
178-201: ⚠️ Potential issue | 🟠 Major

Client-side duplicate-name check is O(N) pagination with unhandled TOCTOU race condition.

findConfigByExactName paginates through all configs sequentially (100 per page) to verify a name doesn't exist before creating, yet another client/tab can still create that same name between the check (line 215) and the POST request (line 256). The backend proxy at app/api/configs/route.ts passes query parameters through to the backend, so if the backend /api/v1/configs endpoint supports a name filter parameter, use it to fetch directly instead of client-side pagination. If the backend returns a 409 on duplicate creates, requestJson will throw on non-2xx status, but currently there is no retry logic or special handling for duplicate-name errors—add explicit handling for the duplicate case as a retry that performs a new version POST instead of failing outright.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/api.ts` around lines 178 - 201, The client-side
pagination in findConfigByExactName is inefficient and racey; change it to call
the backend configs endpoint with a name filter (use the same query param
pass-through that app/api/configs/route.ts supports) so fetchConfigPage is not
needed — i.e., query /api/v1/configs?name=<normalizedName> and return the single
match or null. Also add explicit handling around the POST request that creates a
config (the requestJson call used for creates): if the POST fails with a 409
duplicate-name response, catch that error and implement a retry path that
performs a “create new version” POST (instead of surfacing the error) so
concurrent creates result in a versioned config rather than an unhandled
failure.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f50ed920-8ac2-4b8e-9512-2e83818393bc

📥 Commits

Reviewing files that changed from the base of the PR and between a52ee8e and 5d8d9bc.

⛔ Files ignored due to path filters (2)

package-lock.json is excluded by !**/package-lock.json
yarn.lock is excluded by !**/yarn.lock, !**/*.lock

📒 Files selected for processing (51)

app/(main)/datasets/page.tsx
app/(main)/keystore/page.tsx
app/api/assessment/assessments/[assessment_id]/results/route.ts
app/api/assessment/assessments/[assessment_id]/retry/route.ts
app/api/assessment/assessments/route.ts
app/api/assessment/datasets/[dataset_id]/route.ts
app/api/assessment/datasets/route.ts
app/api/assessment/evaluations/[evaluation_id]/results/route.ts
app/api/assessment/evaluations/[evaluation_id]/retry/route.ts
app/api/assessment/evaluations/route.ts
app/api/assessment/events/route.ts
app/api/configs/[config_id]/versions/route.ts
app/api/configs/route.ts
app/api/features/route.ts
app/assessment/AssessmentPageClient.tsx
app/assessment/components/ColumnMapperStep.tsx
app/assessment/components/DataViewModal.tsx
app/assessment/components/DatasetStep.tsx
app/assessment/components/EvaluationsTab.tsx
app/assessment/components/JsonEditor.tsx
app/assessment/components/MultiConfigStep.tsx
app/assessment/components/OutputSchemaStep.tsx
app/assessment/components/PromptAndConfigStep.tsx
app/assessment/components/PromptEditorStep.tsx
app/assessment/components/ReviewStep.tsx
app/assessment/components/Stepper.tsx
app/assessment/config/ConfigurationStep.tsx
app/assessment/config/api.ts
app/assessment/config/constants.ts
app/assessment/page.tsx
app/assessment/schemaUtils.ts
app/assessment/store.ts
app/assessment/types.ts
app/assessment/useAssessmentEvents.ts
app/components/FeatureRouteGuard.tsx
app/components/Sidebar.tsx
app/components/icons/index.tsx
app/components/icons/sidebar/AssessmentIcon.tsx
app/components/speech-to-text/ModelComparisonCard.tsx
app/layout.tsx
app/lib/FeatureFlagProvider.tsx
app/lib/colors.ts
app/lib/configTypes.ts
app/lib/constants/featureFlags.ts
app/lib/constants/keystore.ts
app/lib/context/AuthContext.tsx
app/lib/featureFlags.server.ts
app/lib/navConfig.ts
app/lib/types/datasets.ts
app/lib/types/nav.ts
package.json

💤 Files with no reviewable changes (1)

app/(main)/keystore/page.tsx

- Added feature flag handling in various components and API routes. - Introduced cookies for managing feature flags and updated authentication flows. - Removed deprecated feature flag provider and related components. - Enhanced sidebar and assessment page to utilize feature flags for conditional rendering. - Updated middleware to enforce feature-based access control.

- Updated OutputSchemaStep to remove eslint-disable comments for dependencies. - Enhanced PromptAndConfigStep to require at least one configuration and a response format to proceed, with updated UI elements and messages. - Modified ReviewStep to make the onEdit prop optional and improved dataset display. - Adjusted Stepper logic to allow sequential unlocking of steps. - Simplified ConfigurationStep by removing Google (Gemini) configurations and updating related UI elements. - Updated constants to remove references to Google provider and related configurations. - Enhanced Zustand store to include datasetName and updated dataset management methods. - Updated types to include datasetName in AssessmentFormState.

…add "use client" directive in ModelComparisonCard

coderabbitai

Actionable comments posted: 10

🧹 Nitpick comments (12)

app/assessment/store.ts (2)

2-2: Use the @/ path alias for the types import.
-import { ColumnMapping } from "./types";
+import { ColumnMapping } from "@/app/assessment/types";
As per coding guidelines: "Use the @/ import alias configured in tsconfig.json to resolve imports from the project root (e.g., import { apiClient } from '@/app/lib/apiClient')".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/store.ts` at line 2, Replace the relative import of the
ColumnMapping type in app/assessment/store.ts with the project root alias;
change the import that currently references "./types" so it imports
ColumnMapping from "@/app/assessment/types" (update the import statement that
declares ColumnMapping).
23-27: Optional: avoid sharing a single DEFAULT_MAPPING reference across initial state and resets.

DEFAULT_MAPPING (and its inner arrays) is reused by reference for the initial state, setDataset's reset, and clearDataset. Today's consumers replace columnMapping with a brand-new object (see ColumnMapperStep.tsx), so this is safe — but any future code that mutates state.columnMapping.textColumns (or one of the other arrays) in place would corrupt the module-level default and every subsequent reset.

A small factory makes this robust against future regressions:
♻️ Proposed refactor
-const DEFAULT_MAPPING: ColumnMapping = {
-  textColumns: [],
-  attachments: [],
-  groundTruthColumns: [],
-};
+const createDefaultMapping = (): ColumnMapping => ({
+  textColumns: [],
+  attachments: [],
+  groundTruthColumns: [],
+});
Then use createDefaultMapping() in the initial state, setDataset, and clearDataset.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/store.ts` around lines 23 - 27, DEFAULT_MAPPING and its arrays
are shared by reference causing potential future mutation bugs; add a factory
function (e.g. createDefaultMapping()) that returns a fresh ColumnMapping object
with new arrays and replace usages of DEFAULT_MAPPING in the initial state, in
setDataset, and in clearDataset so each state/init/reset gets its own
independent object (leave ColumnMapperStep.tsx unchanged but ensure it still
receives a new object).

app/assessment/components/Stepper.tsx (1)

62-67: Use colors.text.white for consistency.

Per the AI summary, this PR introduces colors.text.white (already used in ColumnMapperStep.tsx line 434). Replacing the hardcoded "#ffffff" here keeps the design tokens centralized.

♻️ Proposed refactor

-                color: isActive
-                  ? "#ffffff"
-                  : isCompleted
-                    ? colors.text.primary
-                    : colors.text.secondary,
+                color: isActive
+                  ? colors.text.white
+                  : isCompleted
+                    ? colors.text.primary
+                    : colors.text.secondary,

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/Stepper.tsx` around lines 62 - 67, The color value
in Stepper's render uses a hardcoded "#ffffff"; replace that literal with the
design token colors.text.white for consistency (update the ternary expression
setting color, e.g., inside the component or function rendering the step circle
where isActive/isCompleted are evaluated). Ensure the border expression remains
the same but if it references the same white anywhere else in this component,
swap to colors.text.white as well so design tokens are centralized (look for the
color: ... and border: `1px solid ${...}` expressions in Stepper.tsx).

app/assessment/components/DatasetStep.tsx (1)

286-304: Avoid synthesizing a native change event on a hidden file input.

handleDrop constructs a DataTransfer, mutates fileInputRef.current.files, and dispatches a native change event to piggyback on handleFileSelect. While this happens to work because React delegates events at the root, it duplicates the extension allowlist (lines 133 vs. 290) and is fragile — e.g., it silently no-ops when the dropped file fails the allowlist check (no toast like handleFileSelect produces).

Extract a shared acceptFile(file) and call it directly in both handlers.

♻️ Suggested refactor

+  const ALLOWED_EXTS = [".csv", ".xlsx", ".xls"];
+  const acceptFile = (file: File) => {
+    const hasValidExt = ALLOWED_EXTS.some((ext) =>
+      file.name.toLowerCase().endsWith(ext),
+    );
+    if (!hasValidExt) {
+      toast.error("Please select a CSV or Excel (.xlsx, .xls) file");
+      return;
+    }
+    setUploadedFile(file);
+    if (!datasetName) {
+      setDatasetName(file.name.replace(/\.(csv|xlsx|xls)$/i, ""));
+    }
+  };

   const handleFileSelect = (event: React.ChangeEvent<HTMLInputElement>) => {
     const file = event.target.files?.[0];
     if (!file) return;
-    const allowedExts = [".csv", ".xlsx", ".xls"];
-    ...
+    acceptFile(file);
+    event.target.value = "";
   };

   const handleDrop = (e: React.DragEvent) => {
     e.preventDefault();
     setIsDragging(false);
     const file = e.dataTransfer.files?.[0];
-    const allowedExts = [".csv", ".xlsx", ".xls"];
-    if (file && allowedExts.some(...)) {
-      const dt = new DataTransfer();
-      ...
-    }
+    if (file) acceptFile(file);
   };

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 286 - 304, handleDrop
currently synthesizes a native change event and duplicates the allowlist logic;
extract a shared function (e.g., acceptFile(file: File)) that encapsulates the
extension allowlist check, state updates, toasts, and any file-input
assignment/processing performed by handleFileSelect, then replace the
DataTransfer + dispatch logic in handleDrop with a direct call to
acceptFile(file); update handleFileSelect to call acceptFile(file) as well so
the allowlist and side-effects live in one place (refer to handleDrop and
handleFileSelect to locate where to extract and call acceptFile).

app/assessment/components/ReviewStep.tsx (2)

18-26: Avoid nesting interactive elements; remove the unused stepNumber prop.

Two minor items in AccordionSection:

The header is a <div role="button" tabIndex={0}> (line 44–60) that contains an inner <button> (line 99–112) for the Edit action. Nested interactive elements are non-conformant per the WAI-ARIA spec and can confuse assistive tech even though stopPropagation prevents the visual click bubbling. Consider rendering the header as <button type="button"> and placing Edit outside the toggle (e.g., as a sibling positioned via flex), or rework the layout so only one element is interactive.

stepNumber is declared on AccordionSectionProps (line 20) and threaded through every call site but never read — safe to drop.

Also applies to: 44-112
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/ReviewStep.tsx` around lines 18 - 26,
AccordionSection currently nests an inner interactive Edit button inside a
header implemented as a div with role="button" and declares a never-used
stepNumber prop; remove stepNumber from AccordionSectionProps and all call
sites, and refactor the header so the toggle is a single native control (change
the header container to <button type="button"> and keep onToggle on that button)
while moving the Edit action outside the toggle (render the Edit control as a
sibling element — e.g., a flex-aligned button or icon button — so it is not a
child of the toggle), and ensure keyboard handlers and aria attributes are
preserved on the new toggle button and that stopPropagation/handlers are updated
accordingly.
200-203: Prefer CSS :focus over imperative onFocus/onBlur for border styling.

Inline event handlers that mutate e.target.style work but defeat memoization and can desync with controlled re-renders. A Tailwind focus:border-[var(--accent)] (or a small CSS class) achieves the same result declaratively.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/ReviewStep.tsx` around lines 200 - 203, The JSX
currently uses imperative onFocus and onBlur handlers that mutate
e.target.style.borderColor; replace those handlers on the affected input
element(s) in ReviewStep.tsx with a declarative focus style (e.g., a Tailwind
class like focus:border-[var(--accent)] or a small CSS class using :focus) so
border color is handled by CSS instead of the onFocus/onBlur functions, and
remove the inline event handlers (search for the onFocus={(e)=>
(e.target.style.borderColor = colors.accent.primary)} and onBlur={(e)=>
(e.target.style.borderColor = colors.border)} occurrences to update the
element’s className accordingly).

app/assessment/components/OutputSchemaStep.tsx (2)

118-156: Unsafe type casts when ingesting external JSON.

fromJsonSchema casts actualDef.enum as string[] (line 142) and actualDef.type as SchemaPropertyType (line 144) without runtime validation. A user pasting a valid-looking JSON Schema with "enum": [1, 2, 3] or "type": "null" would silently produce a broken SchemaProperty. Consider filtering enum values to strings and falling back to "string" when actualDef.type is not in the supported SchemaPropertyType union.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/OutputSchemaStep.tsx` around lines 118 - 156,
fromJsonSchema currently unsafely casts actualDef.enum and actualDef.type;
update it to validate and sanitize inputs: when handling enum, set enumValues =
Array.isArray(actualDef.enum) ? (actualDef.enum.filter(v => typeof v ===
"string") as string[]) : []; and when deriving type, check actualDef.type
against the supported SchemaPropertyType union (e.g., allowedTypes = new
Set<SchemaPropertyType>(["string","number","boolean","object","enum","integer","..."]))
and use actualDef.type only if it is in allowedTypes, otherwise fallback to
"string"; keep all other behavior (isArray, children, required, genId()) intact.

8-20: Replace the Google Fonts dependency with an inline SVG asterisk.

useMaterialSymbols lazily injects a <link> to fonts.googleapis.com solely to render one glyph (asterisk at line 334). This adds:

A third-party network request on first mount (privacy/GDPR consideration when self-hosted otherwise).
A render flash where the text "asterisk" is visible before the font loads.
A failure mode in restricted/offline environments.

An inline SVG (or even the * character) would remove the external dependency entirely.

♻️ Suggested refactor

-/* ── Lazy-load Material Symbols ── */
-const MATERIAL_SYMBOLS_HREF =
-  "https://fonts.googleapis.com/css2?family=Material+Symbols+Outlined:opsz,wght,FILL,GRAD@24,400,0,0&icon_names=asterisk";
-
-function useMaterialSymbols() {
-  useEffect(() => {
-    if (document.querySelector(`link[href="${MATERIAL_SYMBOLS_HREF}"]`)) return;
-    const link = document.createElement("link");
-    link.rel = "stylesheet";
-    link.href = MATERIAL_SYMBOLS_HREF;
-    document.head.appendChild(link);
-  }, []);
-}

-          <span
-            className="material-symbols-outlined"
-            style={{ fontSize: "18px" }}
-          >
-            asterisk
-          </span>
+          <svg className="w-[18px] h-[18px]" viewBox="0 0 24 24" fill="currentColor" aria-hidden="true">
+            <path d="M12 2v20M4.93 6.93l14.14 10.14M19.07 6.93L4.93 17.07" stroke="currentColor" strokeWidth="2" strokeLinecap="round"/>
+          </svg>

You can then remove the useMaterialSymbols() call at line 561.

Also applies to: 330-336

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/OutputSchemaStep.tsx` around lines 8 - 20, The
current useMaterialSymbols hook injects an external Google Fonts link
(MATERIAL_SYMBOLS_HREF) just to render a single "asterisk" glyph; remove the
external dependency by deleting MATERIAL_SYMBOLS_HREF and the useMaterialSymbols
hook and removing its invocation, then replace the JSX that currently renders
the Material Symbols "asterisk" (where the icon name "asterisk" is used) with a
small inline SVG asterisk or the plain "*" character (update the
OutputSchemaStep JSX where that glyph is rendered); ensure any imports or
references solely for the font/hook are cleaned up.

app/assessment/page.tsx (1)

1-5: Organize the assessment route under the (main) route group.

app/assessment/page.tsx is an authenticated route but lives at the App Router root instead of under app/(main)/, inconsistent with all other authenticated feature routes in the codebase. Move to app/(main)/assessment/page.tsx (along with the colocated AssessmentPageClient.tsx and components/ directory). Route groups are URL-transparent, so the /assessment path is preserved while conforming to the project's routing convention: "Organize routes using Next.js App Router route groups: app/(auth)/ for unauthenticated flows (invite, verify) and app/(main)/ for authenticated application routes."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/page.tsx` around lines 1 - 5, The assessment route file lives
at the App Router root but should be inside the authenticated route group; move
app/assessment/page.tsx plus its colocated AssessmentPageClient.tsx and
components/ into app/(main)/assessment/, preserve the URL path (/assessment)
since route groups are transparent, and update any imports that reference the
old paths to the new location (e.g., import AssessmentPageClient from
"@/app/(main)/assessment/AssessmentPageClient" or use the appropriate alias).
Ensure the default export function AssessmentPage still returns
<AssessmentPageClient /> and adjust any relative imports inside
AssessmentPageClient or components to reflect the new directory structure.

app/assessment/config/constants.ts (3)

266-273: Silent GPT4_STYLE_CONFIG fallback can mislead the UI for unknown models.

When modelName doesn't match any entry, this returns the GPT-4 family params (top_p, temperature). For a reasoning-only model that hasn't been added to ASSESSMENT_MODEL_CONFIGS yet, the configuration UI will silently render irrelevant sliders, and buildDefaultParams will seed top_p/temperature into CompletionParams — values that may be invalid for that model. Consider either returning an empty definition {} (fail-closed) or logging a console.warn so missing entries surface during development.

♻️ Suggested change

 export function getModelConfigDefinition(
   modelName: string,
 ): Record<string, ConfigParamDefinition> {
-  return (
-    ASSESSMENT_MODEL_CONFIGS.find((item) => item.model_name === modelName)
-      ?.config ?? GPT4_STYLE_CONFIG
-  );
+  const entry = ASSESSMENT_MODEL_CONFIGS.find(
+    (item) => item.model_name === modelName,
+  );
+  if (!entry) {
+    if (process.env.NODE_ENV !== "production") {
+      console.warn(
+        `[assessment] Unknown model "${modelName}"; falling back to empty config.`,
+      );
+    }
+    return {};
+  }
+  return entry.config;
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/constants.ts` around lines 266 - 273, The function
getModelConfigDefinition currently falls back to GPT4_STYLE_CONFIG when
ASSESSMENT_MODEL_CONFIGS has no matching model, which silently surfaces
inappropriate GPT-4 params in the UI; change it to fail-closed by returning an
empty definition {} (or at minimum emit a console.warn including the unknown
modelName) instead of GPT4_STYLE_CONFIG so the UI and downstream
buildDefaultParams don't seed invalid top_p/temperature values; update the logic
in getModelConfigDefinition and add a warning referencing
ASSESSMENT_MODEL_CONFIGS, GPT4_STYLE_CONFIG, and buildDefaultParams so missing
model entries are visible during development.

14-18: Tighten provider typing for type-safe lookups.

AssessmentModelConfig.provider is the literal "openai", but getModelsByProvider/getDefaultModelForProvider accept arbitrary string. Deriving a Provider union from PROVIDER_OPTIONS (or vice versa) would catch typos at the call sites and remove the need for the hardcoded "gpt-4o-mini" fallback string in getDefaultModelForProvider.

export type Provider = (typeof PROVIDER_OPTIONS)[number]["value"]; // "openai"

export function getModelsByProvider(provider: Provider): ModelOption[] { /* ... */ }

export function getDefaultModelForProvider(provider: Provider): string {
  // ASSESSMENT_MODEL_CONFIGS guaranteed non-empty for any valid Provider
  const entry = ASSESSMENT_MODEL_CONFIGS.find((m) => m.provider === provider);
  return entry?.model_name ?? ASSESSMENT_MODEL_CONFIGS[0].model_name;
}

Also applies to: 250-264

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/constants.ts` around lines 14 - 18, Change the loose
provider typing to a derived union and update callers: derive a Provider type
from PROVIDER_OPTIONS (e.g., Provider = (typeof
PROVIDER_OPTIONS)[number]["value"]), change AssessmentModelConfig.provider from
the literal "openai" to this Provider type, and update getModelsByProvider and
getDefaultModelForProvider to accept Provider rather than string; then remove
the hardcoded fallback model string ("gpt-4o-mini") in
getDefaultModelForProvider and return ASSESSMENT_MODEL_CONFIGS[0].model_name
when an entry isn't found (use ASSESSMENT_MODEL_CONFIGS and model_name for the
guaranteed default).

53-216: Reduce duplication: hoist the shared reasoning config like GPT4_STYLE_CONFIG.

The effort + summary parameter blocks are repeated almost verbatim across o3, o3-mini, o4-mini, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, and gpt-5.2. Only the effort.options differ (["low","medium","high"], ["minimal","low","medium","high"], ["none","low","medium","high"], ["none","low","medium","high","xhigh"]). Extracting helpers makes future additions (e.g., new effort tiers, copy tweaks) a one-line change and prevents drift like the current "null"/no-"null" inconsistency.

♻️ Proposed refactor

const SUMMARY_PARAM = {
  type: "enum",
  default: "auto",
  options: ["auto", "detailed", "concise"],
  description: "Summarize the reasoning result.",
} as const satisfies ConfigParamDefinition;

const reasoningConfig = (
  effortOptions: readonly string[],
): Record<string, ConfigParamDefinition> => ({
  effort: {
    type: "enum",
    default: "medium",
    options: [...effortOptions],
    description:
      "How long the model spends reasoning. Higher = better but slower.",
  },
  summary: { ...SUMMARY_PARAM },
});

// usage
{ provider: "openai", model_name: "o3", config: reasoningConfig(["low", "medium", "high"]) },
{ provider: "openai", model_name: "gpt-5", config: reasoningConfig(["minimal", "low", "medium", "high"]) },
{ provider: "openai", model_name: "gpt-5.2", config: reasoningConfig(["none", "low", "medium", "high", "xhigh"]) },
// "*-chat-latest" / "*-pro" models that only need summary:
{ provider: "openai", model_name: "gpt-5.1-chat-latest", config: { summary: { ...SUMMARY_PARAM } } },

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/constants.ts` around lines 53 - 216, Hoist the
duplicated effort+summary blocks by creating a shared SUMMARY_PARAM constant and
a reasoningConfig(effortOptions) factory, then replace the repeated config
blocks for model_name values like "o3-mini", "o3", "o4-mini", "gpt-5",
"gpt-5-mini", "gpt-5-nano", "gpt-5.1", and "gpt-5.2" to call
reasoningConfig(...) with the appropriate effort options (e.g.,
["low","medium","high"], ["minimal","low","medium","high"],
["none","low","medium","high","xhigh"]) and replace the "gpt-5.1-chat-latest"
entry to use only SUMMARY_PARAM; ensure the types satisfy ConfigParamDefinition
and that the shared SUMMARY_PARAM options do not include the stray "null" value
so all model entries inherit the consistent summary options.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/assessment/AssessmentPageClient.tsx`:
- Around line 367-379: The payload being built in AssessmentPageClient.tsx omits
the tracked columnMapping.groundTruthColumns, so ground-truth labels are never
sent; update the payload construction (where payload is created) to include the
groundTruthColumns mapping (e.g., add a ground_truth_columns or
groundTruthColumns field) by mapping columnMapping.groundTruthColumns into the
same simple shape as attachments/text_columns (extracting the column and any
needed type/format), ensuring it follows the server's expected key name and
structure so reference labels are included in the POST body.

In `@app/assessment/components/ColumnMapperStep.tsx`:
- Around line 74-103: The current Record<string, ColumnConfig> state
(columnConfigs) keyed by column name causes collisions when columns contains
duplicate names; change columnConfigs to be an array aligned with the columns
array (e.g., ColumnConfig[] managed by setColumnConfigs) so each column instance
is tracked by index, update any places using columnConfigs (initialization loop
that builds configs from columns and columnMapping, state updates when toggling
roles via setColumnConfigs, and handleNext which currently iterates
Object.entries(columnConfigs)) to use the index-aligned array, and ensure the
rendered list uses key={index} (or a stable index-based id) instead of
key={column} to eliminate duplicate-key warnings and preserve per-column state
for duplicates.

In `@app/assessment/components/DatasetStep.tsx`:
- Around line 60-94: fetchAndParseFile currently swallows all failure cases by
returning null, which lets handleDatasetSelect silently continue and leaves
isLoadingColumns true/columns empty so canProceed (which checks datasetId &&
!isLoadingColumns) can allow advancing with zero columns; change
fetchAndParseFile to throw descriptive errors (or return an error object) for
each failure path (non-ok response, missing file_content, missing sheet, empty
data), then update handleDatasetSelect to catch those specific errors, call the
toast/error UI with the error message, set isLoadingColumns to false and ensure
columns state is cleared or marked invalid so users cannot proceed, and keep the
canProceed logic (datasetId && !isLoadingColumns) intact so it blocks forward
progression when loading/parse fails; reference functions: fetchAndParseFile,
handleDatasetSelect and the canProceed/isLoadingColumns/columns state variables.

In `@app/assessment/components/OutputSchemaStep.tsx`:
- Around line 452-456: The useEffect that sets draft state when the modal opens
(useEffect that calls setDraftSchema(JSON.parse(JSON.stringify(schema))))
currently lists only [open] but also reads schema — either include schema in the
dependency array or intentionally silence the linter with an explanatory
eslint-disable-next-line comment; prefer keeping the effect triggered only on
open by using a ref to hold the latest schema (e.g., schemaRef.current updated
elsewhere) or add "// eslint-disable-next-line react-hooks/exhaustive-deps -- we
only want to run this when `open` changes, not when `schema` changes" above the
effect. Similarly, the effect that calls setSchema (the effect depending on
codeValue and editorMode) must include setSchema in its dependency array or be
wrapped so a stable setter is used (e.g., useCallback/memoize in parent) or add
a clear eslint-disable-next-line with comment explaining why setSchema is
intentionally excluded to avoid stale-closure bugs.

In `@app/assessment/components/PromptAndConfigStep.tsx`:
- Around line 1066-1165: The primary CTA always targets version 1 causing stale
selections; change the logic to compute the "latestVersion" from the loaded
version state (e.g. derive latestVersionId or latestVersionIndex from
versionStateByConfig[config.id].items[0] or the highest-version item) and use
that instead of the hard-coded 1 when computing defaultSelected, building the
loading key in loadingSelectionKeys (currently `${config.id}:1`), and when
calling toggleVersionSelection(config, 1); update those references to use the
computed latestVersion identifier so the button and selected state target the
newest saved version.

In `@app/assessment/config/ConfigurationStep.tsx`:
- Around line 827-927: The primary CTA is hard-coded to version "1"; instead
compute the default version from the latest available version in versions.items
and use that id everywhere instead of 1. For example, derive const
latestVersionId = versions.items[0]?.id ?? 1 and replace uses of the literal 1
in isSelected(config.id, 1), loadingSelectionKeys[`${config.id}:1`], and the
onClick handler void toggleVersionSelection(config, 1) with
isSelected(config.id, latestVersionId),
loadingSelectionKeys[`${config.id}:${latestVersionId}`], and void
toggleVersionSelection(config, latestVersionId) respectively (keeping a safe
fallback when versions.items is empty).

In `@app/assessment/config/constants.ts`:
- Line 86: Remove the invalid "null" string from the reasoning.summary options
arrays in the model config constants so the only accepted values are "auto",
"concise", and "detailed"; locate each config object that contains an options:
["auto", "detailed", "concise", "null"] (e.g., the reasoning.summary options
entries for the o3, o4-mini, gpt-5*, and gpt-5.1/5.2 variants) and change those
arrays to ["auto", "detailed", "concise"] to match the o3-mini entry and the
OpenAI Responses API allowed values.

In `@app/assessment/store.ts`:
- Around line 40-47: setDataset currently overwrites columnMapping with
DEFAULT_MAPPING every call; change it so columnMapping is only reset when the
dataset actually changes by comparing the incoming datasetId to the current
state.datasetId — if they differ, set columnMapping to DEFAULT_MAPPING,
otherwise preserve state.columnMapping (and continue to update datasetId,
datasetName, columns, sampleRow as already done). Refer to the setDataset setter
and the symbols columnMapping, DEFAULT_MAPPING and state.datasetId to implement
this conditional logic.

In `@middleware.ts`:
- Around line 44-45: The middleware currently reads feature flags from a
client-writable cookie via
parseFeatures(request.cookies.get(FEATURES_COOKIE)?.value), which is untrusted;
change the check so feature gating is based on a server-trusted signal: either
read an HttpOnly/signed feature cookie issued by the server and verify its
signature (implement a verifyFeatureCookie function) or fetch the user’s feature
state from trusted server storage (e.g., getFeaturesForUser(userId) or
validateFeaturesAgainstServer(userId, features)) before allowing routes that
require ASSESSMENT; update the code paths that use parseFeatures and
FEATURES_COOKIE (including the other usage around the later block) to call the
new verification method and fall back to denying the feature if verification
fails.

---

Nitpick comments:
In `@app/assessment/components/DatasetStep.tsx`:
- Around line 286-304: handleDrop currently synthesizes a native change event
and duplicates the allowlist logic; extract a shared function (e.g.,
acceptFile(file: File)) that encapsulates the extension allowlist check, state
updates, toasts, and any file-input assignment/processing performed by
handleFileSelect, then replace the DataTransfer + dispatch logic in handleDrop
with a direct call to acceptFile(file); update handleFileSelect to call
acceptFile(file) as well so the allowlist and side-effects live in one place
(refer to handleDrop and handleFileSelect to locate where to extract and call
acceptFile).

In `@app/assessment/components/OutputSchemaStep.tsx`:
- Around line 118-156: fromJsonSchema currently unsafely casts actualDef.enum
and actualDef.type; update it to validate and sanitize inputs: when handling
enum, set enumValues = Array.isArray(actualDef.enum) ? (actualDef.enum.filter(v
=> typeof v === "string") as string[]) : []; and when deriving type, check
actualDef.type against the supported SchemaPropertyType union (e.g.,
allowedTypes = new
Set<SchemaPropertyType>(["string","number","boolean","object","enum","integer","..."]))
and use actualDef.type only if it is in allowedTypes, otherwise fallback to
"string"; keep all other behavior (isArray, children, required, genId()) intact.
- Around line 8-20: The current useMaterialSymbols hook injects an external
Google Fonts link (MATERIAL_SYMBOLS_HREF) just to render a single "asterisk"
glyph; remove the external dependency by deleting MATERIAL_SYMBOLS_HREF and the
useMaterialSymbols hook and removing its invocation, then replace the JSX that
currently renders the Material Symbols "asterisk" (where the icon name
"asterisk" is used) with a small inline SVG asterisk or the plain "*" character
(update the OutputSchemaStep JSX where that glyph is rendered); ensure any
imports or references solely for the font/hook are cleaned up.

In `@app/assessment/components/ReviewStep.tsx`:
- Around line 18-26: AccordionSection currently nests an inner interactive Edit
button inside a header implemented as a div with role="button" and declares a
never-used stepNumber prop; remove stepNumber from AccordionSectionProps and all
call sites, and refactor the header so the toggle is a single native control
(change the header container to <button type="button"> and keep onToggle on that
button) while moving the Edit action outside the toggle (render the Edit control
as a sibling element — e.g., a flex-aligned button or icon button — so it is not
a child of the toggle), and ensure keyboard handlers and aria attributes are
preserved on the new toggle button and that stopPropagation/handlers are updated
accordingly.
- Around line 200-203: The JSX currently uses imperative onFocus and onBlur
handlers that mutate e.target.style.borderColor; replace those handlers on the
affected input element(s) in ReviewStep.tsx with a declarative focus style
(e.g., a Tailwind class like focus:border-[var(--accent)] or a small CSS class
using :focus) so border color is handled by CSS instead of the onFocus/onBlur
functions, and remove the inline event handlers (search for the onFocus={(e)=>
(e.target.style.borderColor = colors.accent.primary)} and onBlur={(e)=>
(e.target.style.borderColor = colors.border)} occurrences to update the
element’s className accordingly).

In `@app/assessment/components/Stepper.tsx`:
- Around line 62-67: The color value in Stepper's render uses a hardcoded
"#ffffff"; replace that literal with the design token colors.text.white for
consistency (update the ternary expression setting color, e.g., inside the
component or function rendering the step circle where isActive/isCompleted are
evaluated). Ensure the border expression remains the same but if it references
the same white anywhere else in this component, swap to colors.text.white as
well so design tokens are centralized (look for the color: ... and border: `1px
solid ${...}` expressions in Stepper.tsx).

In `@app/assessment/config/constants.ts`:
- Around line 266-273: The function getModelConfigDefinition currently falls
back to GPT4_STYLE_CONFIG when ASSESSMENT_MODEL_CONFIGS has no matching model,
which silently surfaces inappropriate GPT-4 params in the UI; change it to
fail-closed by returning an empty definition {} (or at minimum emit a
console.warn including the unknown modelName) instead of GPT4_STYLE_CONFIG so
the UI and downstream buildDefaultParams don't seed invalid top_p/temperature
values; update the logic in getModelConfigDefinition and add a warning
referencing ASSESSMENT_MODEL_CONFIGS, GPT4_STYLE_CONFIG, and buildDefaultParams
so missing model entries are visible during development.
- Around line 14-18: Change the loose provider typing to a derived union and
update callers: derive a Provider type from PROVIDER_OPTIONS (e.g., Provider =
(typeof PROVIDER_OPTIONS)[number]["value"]), change
AssessmentModelConfig.provider from the literal "openai" to this Provider type,
and update getModelsByProvider and getDefaultModelForProvider to accept Provider
rather than string; then remove the hardcoded fallback model string
("gpt-4o-mini") in getDefaultModelForProvider and return
ASSESSMENT_MODEL_CONFIGS[0].model_name when an entry isn't found (use
ASSESSMENT_MODEL_CONFIGS and model_name for the guaranteed default).
- Around line 53-216: Hoist the duplicated effort+summary blocks by creating a
shared SUMMARY_PARAM constant and a reasoningConfig(effortOptions) factory, then
replace the repeated config blocks for model_name values like "o3-mini", "o3",
"o4-mini", "gpt-5", "gpt-5-mini", "gpt-5-nano", "gpt-5.1", and "gpt-5.2" to call
reasoningConfig(...) with the appropriate effort options (e.g.,
["low","medium","high"], ["minimal","low","medium","high"],
["none","low","medium","high","xhigh"]) and replace the "gpt-5.1-chat-latest"
entry to use only SUMMARY_PARAM; ensure the types satisfy ConfigParamDefinition
and that the shared SUMMARY_PARAM options do not include the stray "null" value
so all model entries inherit the consistent summary options.

In `@app/assessment/page.tsx`:
- Around line 1-5: The assessment route file lives at the App Router root but
should be inside the authenticated route group; move app/assessment/page.tsx
plus its colocated AssessmentPageClient.tsx and components/ into
app/(main)/assessment/, preserve the URL path (/assessment) since route groups
are transparent, and update any imports that reference the old paths to the new
location (e.g., import AssessmentPageClient from
"@/app/(main)/assessment/AssessmentPageClient" or use the appropriate alias).
Ensure the default export function AssessmentPage still returns
<AssessmentPageClient /> and adjust any relative imports inside
AssessmentPageClient or components to reflect the new directory structure.

In `@app/assessment/store.ts`:
- Line 2: Replace the relative import of the ColumnMapping type in
app/assessment/store.ts with the project root alias; change the import that
currently references "./types" so it imports ColumnMapping from
"@/app/assessment/types" (update the import statement that declares
ColumnMapping).
- Around line 23-27: DEFAULT_MAPPING and its arrays are shared by reference
causing potential future mutation bugs; add a factory function (e.g.
createDefaultMapping()) that returns a fresh ColumnMapping object with new
arrays and replace usages of DEFAULT_MAPPING in the initial state, in
setDataset, and in clearDataset so each state/init/reset gets its own
independent object (leave ColumnMapperStep.tsx unchanged but ensure it still
receives a new object).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 513748a0-05aa-428e-ad41-acfebb76fbd9

📥 Commits

Reviewing files that changed from the base of the PR and between 5d8d9bc and fcbf161.

📒 Files selected for processing (21)

app/api/auth/logout/route.ts
app/api/users/me/route.ts
app/assessment/AssessmentPageClient.tsx
app/assessment/components/ColumnMapperStep.tsx
app/assessment/components/DatasetStep.tsx
app/assessment/components/OutputSchemaStep.tsx
app/assessment/components/PromptAndConfigStep.tsx
app/assessment/components/ReviewStep.tsx
app/assessment/components/Stepper.tsx
app/assessment/config/ConfigurationStep.tsx
app/assessment/config/constants.ts
app/assessment/page.tsx
app/assessment/store.ts
app/assessment/types.ts
app/components/Sidebar.tsx
app/lib/authCookie.ts
app/lib/constants.ts
app/lib/context/AuthContext.tsx
app/lib/featureState.ts
app/lib/types/auth.ts
middleware.ts

✅ Files skipped from review due to trivial changes (2)

app/lib/constants.ts
app/assessment/types.ts

🚧 Files skipped from review as they are similar to previous changes (1)

app/lib/context/AuthContext.tsx

Ayush8923

I’ve added a few comments that will mostly require updates across multiple files. It would be better to address these first so the next round of review becomes easier. I’ll continue reviewing further accordingly.

coderabbitai

Actionable comments posted: 15

🧹 Nitpick comments (2)

app/assessment/components/EvaluationsTab.tsx (2)

6-6: Switch this import to the project alias.

Using a relative path here drifts from the repo import rule and makes later file moves noisier than they need to be.

Suggested change

-import DataViewModal, { jsonResultsToTableData } from "./DataViewModal";
+import DataViewModal, {
+  jsonResultsToTableData,
+} from "@/app/assessment/components/DataViewModal";

As per coding guidelines: Use the @/ import alias configured in tsconfig.json to resolve imports from the project root.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` at line 6, Replace the relative
import of DataViewModal and jsonResultsToTableData in EvaluationsTab.tsx with
the project root alias import; locate the import line that references
"./DataViewModal" and change it to use the
"@/assessment/components/DataViewModal" path so DataViewModal and
jsonResultsToTableData are loaded via the configured tsconfig alias instead of a
relative path.

80-94: Prefer date-fns for relative time formatting.

This hand-rolled formatter bypasses the app’s date handling standard and will drift from the rest of the UI over time.

Suggested change

+import { formatDistanceToNow } from "date-fns";
@@
 function formatRelativeTime(dateStr: string): string {
-  const date = new Date(dateStr);
-  const now = new Date();
-  const diffMs = now.getTime() - date.getTime();
-  const diffMins = Math.floor(diffMs / 60000);
-  const diffHours = Math.floor(diffMs / 3600000);
-  const diffDays = Math.floor(diffMs / 86400000);
-
-  if (diffMins < 1) return "just now";
-  if (diffMins < 60) return `${diffMins}m ago`;
-  if (diffHours < 24) return `${diffHours}h ago`;
-  if (diffDays < 30) return `${diffDays} day${diffDays !== 1 ? "s" : ""} ago`;
-  const diffMonths = Math.floor(diffDays / 30);
-  return `about ${diffMonths} month${diffMonths !== 1 ? "s" : ""} ago`;
+  return formatDistanceToNow(new Date(dateStr), { addSuffix: true });
 }

As per coding guidelines: Use date-fns 4.1.0 and date-fns-tz 3.2.0 for date/time handling across the application.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 80 - 94, The
custom function formatRelativeTime bypasses the app's date utilities; replace
its logic by using date-fns (v4.1.0) and date-fns-tz (v3.2.0) to produce
consistent relative times—specifically swap out the body of formatRelativeTime
to call date-fns' formatDistanceToNow (or formatRelative with proper tz
conversion via date-fns-tz) to compute the human-friendly string, ensure you
parse the incoming dateStr into a Date with the app timezone functions, and
maintain the same return semantics ("just now", shortened units) or map
formatDistanceToNow results to match existing UI text.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/api/assessment/datasets/route.ts`:
- Around line 37-41: The response currently returns raw exception details in
NextResponse.json (see the catch blocks around the error variable), which can
leak internal information; instead, log the full error server-side (e.g.,
console.error or your server logger) inside those catch blocks and change the
JSON response to return a generic, client-safe message (e.g., "An internal error
occurred") and remove or replace the details field; update both places using
NextResponse.json so only the generic message is sent to clients while full
error details remain in server logs.
- Around line 21-32: Replace the manual fetch call in this route (and the
similar block at lines ~68-80) with the shared apiClient(request,
"/api/v1/assessment/datasets", { method: "GET", headers: { "X-API-KEY": apiKey }
}) so the server proxy handles header/cookie forwarding; use the returned body
and status to construct the NextResponse and when returning errors extract the
message as body.error || body.message || body.detail and include that in the
JSON response. Ensure you pass the incoming request object to apiClient and
replace usage of backendUrl/apiKey + fetch with apiClient and
NextResponse.json(body, { status }).

In `@app/api/assessment/evaluations/`[evaluation_id]/results/route.ts:
- Around line 21-32: Replace the manual fetch that builds backendUrl/apiKey with
a call to apiClient so cookie and API-key forwarding are preserved: construct
the endpoint using evaluation_id and the existing searchParams (e.g.,
`/api/v1/assessment/evaluations/${evaluation_id}/results${queryString}`), call
apiClient(request, endpoint, { method: "GET" }), then use the returned { status,
data, headers } to build and return the Response (preserving status and relevant
headers). Remove direct use of backendUrl and apiKey in this handler and ensure
the function names referenced are apiClient and evaluation_id to locate the code
to change.

In `@app/api/assessment/evaluations/`[evaluation_id]/retry/route.ts:
- Around line 18-32: This route is using raw fetch which drops cookies and
duplicates forwarding logic; replace the fetch call with the shared apiClient
call: call apiClient(request,
`/api/v1/assessment/evaluations/${evaluation_id}/retry`, { method: "POST" })
(referencing evaluation_id and apiClient) and use the returned { status, data,
headers } to build the NextResponse (e.g., NextResponse.json(data, { status,
headers })) so X-API-KEY and Cookie are automatically forwarded and response
headers/status are preserved.

In `@app/api/assessment/events/route.ts`:
- Around line 15-23: The handler currently builds backendUrl and calls fetch
directly (using backendUrl and manually setting "X-API-KEY") which bypasses
shared forwarding and auth; replace that fetch call with the shared apiClient by
calling apiClient(request, "/api/v1/assessment/events", { method: "GET",
headers: { Accept: "text/event-stream" }, cache: "no-store" }) so Cookie and API
key forwarding and centralized error handling are preserved; remove backendUrl
and the manual "X-API-KEY" header and ensure apiClient is imported/used in this
route handler.
- Around line 25-29: The handler currently returns raw upstream response text
(variable text) in NextResponse.json which can leak internals; update the
failure branch that checks response.ok/response.body to log the full response
text to server logs (e.g., use console.error or your request logger) and return
a sanitized JSON error to the client via NextResponse.json (keep status from
response.status || 500 and replace details with a generic message like "Upstream
service error" or omit details entirely). Locate the failing block that calls
NextResponse.json and the response.text() call to make these changes (preserve
status handling but remove/raw-text exposure).
- Around line 16-24: The upstream fetch to
`${backendUrl}/api/v1/assessment/events` can throw and currently bubbles into an
unstructured 500; wrap the call to fetch (the code that assigns response) in a
try/catch inside the route handler (the exported GET handler or default route
function) and on error return a controlled 502 JSON Response with an explanatory
message and minimal error info. Preserve existing headers/behavior on success,
but in the catch return new Response(JSON.stringify({ error: "Bad Gateway",
message: "Failed to connect to upstream assessment events" , detail: /* short
error message */ }), { status: 502, headers: { "Content-Type":
"application/json" } }). Ensure you reference backendUrl and apiKey variables
unchanged.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Around line 926-928: The checks that set isCompletedChild only match
"completed", causing runs with status "completed_with_errors" to lose
preview/export actions; update the status checks (where isCompletedChild is
computed and the similar logic between the isFailedChild/isCompletedChild blocks
around the isCompletedChild and childRun.status references) to treat
"completed_with_errors" as completed too (e.g., include it in the condition or
use an array/Set of completed-like statuses) so preview/export actions are
enabled for partially successful runs.
- Around line 238-240: Replace direct window fetch calls in EvaluationsTab
(e.g., the request to "/api/assessment/assessments" and the other occurrences
flagged) with the shared clientFetch utility so client-side requests go through
the token-refresh/401 handling; locate the fetch invocations in
EvaluationsTab.tsx (the one creating "response" around
"/api/assessment/assessments" and the other fetches at the ranges called out)
and swap them to call clientFetch(endpoint, options) passing the same
headers/body as before, keeping the surrounding await/response handling
unchanged so AUTH_EXPIRED_EVENT and refresh logic are used.
- Around line 824-832: The badge currently uses run.status.replace("_", " ")
which only replaces the first underscore; replace that with the app's existing
status formatter used elsewhere (e.g., formatStatusLabel or formatStatus) so all
underscores are handled correctly: in EvaluationsTab change
{run.status.replace("_", " ")} to the canonical formatter call (e.g.,
{formatStatusLabel(run.status)}) and add the necessary import/usage while
keeping statusStyle/bg/text as-is.

In `@app/assessment/components/JsonEditor.tsx`:
- Around line 168-173: The textarea in JsonEditor lacks an accessible name and
ARIA invalid state; update the textarea element used in the JsonEditor component
to include an accessible label (either by adding an associated <label> with
htmlFor matching the textarea id or adding aria-label/aria-labelledby) and set
aria-invalid={!!error} so assistive tech knows when the input is invalid; also
ensure the error span has role="alert" or is referenced via aria-describedby on
the textarea (use the error variable/id) so screen readers announce the error.

In `@app/components/speech-to-text/ModelComparisonCard.tsx`:
- Around line 151-153: The button elements (e.g., the one using
onClick={handleExpandToggle} and the other at the later occurrence) lack
explicit types and may act as type="submit" inside forms; update both button JSX
elements (the one referencing handleExpandToggle and the other button around
lines ~368) to include type="button" to prevent accidental form submission and
keep their onClick behavior unchanged.
- Around line 151-173: The expand/collapse icon button in ModelComparisonCard is
missing an accessible name and state; update the button (the element using
onClick={handleExpandToggle}) to include an accessible label (e.g., aria-label
or aria-labelledby) and expose its expanded state with
aria-expanded={isExpanded}, and if there is a collapsible panel add
aria-controls pointing to that panel's id; ensure the label text clearly
indicates the action (e.g., "Expand details" / "Collapse details") and that the
aria attributes reference the existing isExpanded state and the panel element
rendered by this component.
- Around line 74-78: The effect currently only resets expansion when status ===
"pending", so changing modelId while status is "success" leaves stale state;
update the logic in the useEffect (or add a separate useEffect) to also reset
expansion when modelId changes — e.g., track previous modelId with a ref and in
useEffect compare prevModelId !== modelId then call setIsExpanded(false) (and
update the ref), or add a useEffect that depends solely on modelId and calls
setIsExpanded(false); ensure you reference the existing useEffect, status,
modelId, and setIsExpanded symbols when making the change.
- Around line 156-172: The inline SVG used in ModelComparisonCard (the chevron
SVG controlled by isExpanded) should be replaced with a static SVG asset per
repo guidelines: move the SVG markup into a file under /public (e.g.,
chevron.svg) or import it as a static asset via Next.js, then replace the inline
<svg> block inside the ModelComparisonCard component with an <img> (or
next/image) reference to that asset and keep the transform/transition styling by
applying the same rotation logic to the image element or its wrapper; apply the
same change for the other inline SVG at the other location mentioned (lines
~377-389) so all inline SVGs are served as static assets.

---

Nitpick comments:
In `@app/assessment/components/EvaluationsTab.tsx`:
- Line 6: Replace the relative import of DataViewModal and
jsonResultsToTableData in EvaluationsTab.tsx with the project root alias import;
locate the import line that references "./DataViewModal" and change it to use
the "@/assessment/components/DataViewModal" path so DataViewModal and
jsonResultsToTableData are loaded via the configured tsconfig alias instead of a
relative path.
- Around line 80-94: The custom function formatRelativeTime bypasses the app's
date utilities; replace its logic by using date-fns (v4.1.0) and date-fns-tz
(v3.2.0) to produce consistent relative times—specifically swap out the body of
formatRelativeTime to call date-fns' formatDistanceToNow (or formatRelative with
proper tz conversion via date-fns-tz) to compute the human-friendly string,
ensure you parse the incoming dateStr into a Date with the app timezone
functions, and maintain the same return semantics ("just now", shortened units)
or map formatDistanceToNow results to match existing UI text.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0cb313bf-88d2-4263-bfb4-5ef971355683

📥 Commits

Reviewing files that changed from the base of the PR and between fcbf161 and fd34470.

📒 Files selected for processing (14)

app/api/assessment/assessments/[assessment_id]/results/route.ts
app/api/assessment/assessments/[assessment_id]/retry/route.ts
app/api/assessment/assessments/route.ts
app/api/assessment/datasets/[dataset_id]/route.ts
app/api/assessment/datasets/route.ts
app/api/assessment/evaluations/[evaluation_id]/results/route.ts
app/api/assessment/evaluations/[evaluation_id]/retry/route.ts
app/api/assessment/evaluations/route.ts
app/api/assessment/events/route.ts
app/assessment/components/DataViewModal.tsx
app/assessment/components/EvaluationsTab.tsx
app/assessment/components/JsonEditor.tsx
app/assessment/schemaUtils.ts
app/components/speech-to-text/ModelComparisonCard.tsx

✅ Files skipped from review due to trivial changes (3)

app/assessment/schemaUtils.ts
app/api/assessment/assessments/route.ts
app/assessment/components/DataViewModal.tsx

🚧 Files skipped from review as they are similar to previous changes (3)

app/api/assessment/assessments/[assessment_id]/results/route.ts
app/api/assessment/evaluations/route.ts
app/api/assessment/datasets/[dataset_id]/route.ts

…nd enhance accessibility - Replaced inline SVG icons with imported icon components in DataViewModal and DatasetStep. - Added error handling for forbidden API responses across various components. - Improved accessibility by adding aria attributes and roles to relevant elements in JsonEditor and EvaluationsTab. - Refactored API request logic to use a centralized apiFetch function for consistency. - Enhanced state management and effects in ModelComparisonCard for better user experience.

coderabbitai

Actionable comments posted: 5

♻️ Duplicate comments (5)

app/api/assessment/events/route.ts (1)

19-24: ⚠️ Potential issue | 🟠 Major

Sanitize SSE proxy failures before returning them.

details: text echoes the upstream response body back to the browser. That can leak backend internals from the event service. Log the full body server-side and return a generic error payload to the client instead.

Suggested fix

     if (!response.ok || !response.body) {
       const text = await response.text();
+      console.error("Assessment events upstream error:", response.status, text);
       return NextResponse.json(
-        { error: "Failed to connect assessment event stream", details: text },
+        { error: "Failed to connect assessment event stream" },
         { status: response.status || 500 },
       );
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/events/route.ts` around lines 19 - 24, The handler
currently returns the upstream SSE body directly to the client (response.text()
-> details: text); change this to log the full body server-side (use the
existing logger or processLogger/console.error to record the text and
response.status) and return a sanitized JSON to the client via NextResponse.json
(e.g., { error: "Failed to connect assessment event stream" }) with the same
status code; keep using the existing response variable and NextResponse.json
call but remove echoing `text` in the client payload while preserving
server-side logging for debugging.

app/assessment/components/EvaluationsTab.tsx (2)

786-786: ⚠️ Potential issue | 🟡 Minor

Use the existing status formatter for the assessment badge.

replace("_", " ") only changes the first underscore, so completed_with_errors renders as completed with_errors. Use the formatStatusLabel function defined at line 540 which uses a global regex.

🐛 Suggested fix

-                          {run.status.replace("_", " ")}
+                          {formatStatusLabel(run.status)}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` at line 786, The badge
currently uses run.status.replace("_", " ") which only replaces the first
underscore; update the badge to call the existing formatStatusLabel function
(defined as formatStatusLabel) with run.status so all underscores are replaced
via the global regex and the label is consistently formatted; locate the usage
near the EvaluationsTab component where run.status is rendered and replace the
replace call with formatStatusLabel(run.status).

881-883: ⚠️ Potential issue | 🟠 Major

completed_with_errors child runs lose preview/export actions.

isCompletedChild only matches "completed", so partially successful runs with status "completed_with_errors" cannot be previewed or exported. These runs have valid results that users should be able to access.

🐛 Suggested fix

                             const isFailedChild = childRun.status === "failed";
                             const isCompletedChild =
-                              childRun.status === "completed";
+                              childRun.status === "completed" ||
+                              childRun.status === "completed_with_errors";

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 881 - 883, The
isCompletedChild check only matches childRun.status === "completed", causing
"completed_with_errors" runs to miss preview/export actions; update the logic
that defines isCompletedChild (the variable using childRun.status) to include
"completed_with_errors" as a valid completed state (e.g., check for status ===
"completed" || status === "completed_with_errors" or use a startsWith/includes
test) so those child runs show the preview/export actions.

app/assessment/AssessmentPageClient.tsx (1)

354-367: ⚠️ Potential issue | 🟠 Major

Submission drops the mapped ground-truth columns.

columnMapping.groundTruthColumns is tracked in the component state (line 305), but it is not included in the POST body. Any assessment that depends on reference labels will be submitted without them.

🛡️ Suggested fix

       const payload = {
         experiment_name: experimentName.trim(),
         dataset_id: parseInt(datasetId, 10),
         prompt_template: promptTemplate || null,
         text_columns: columnMapping.textColumns,
+        ground_truth_columns: columnMapping.groundTruthColumns,
         attachments: columnMapping.attachments.map(
           ({ column, type, format }) => ({ column, type, format }),
         ),
         output_schema: schemaToJsonSchema(outputSchema) || null,
         configs: configs.map(({ config_id, config_version }) => ({
           config_id,
           config_version,
         })),
       };

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 354 - 367, The POST
payload in AssessmentPageClient.tsx is missing the mapped ground-truth columns
(state symbol columnMapping.groundTruthColumns) so reference labels are not
sent; update the payload construction (the payload object near the code creating
experiment_name, text_columns, attachments) to include a ground_truth_columns
property that serializes columnMapping.groundTruthColumns in the same shape as
text_columns/attachments (e.g., map each entry to the expected { column, ... }
structure or pass the raw mapping if the API accepts it), ensuring the key name
matches the API contract used by submit/create handlers.

app/assessment/components/DatasetStep.tsx (1)

76-105: ⚠️ Potential issue | 🟠 Major

Silent failure when dataset parse/fetch fails — user gets stuck.

The past review concern remains valid. fetchAndParseFile returns null on every failure path (missing file_content at line 85, missing sheet at line 91, empty rawData at line 97). In handleDatasetSelect, when parsed is null, no error feedback is shown, yet canProceed (line 321) checks only datasetId && !isLoadingColumns, potentially allowing navigation forward with zero columns loaded.

🛡️ Suggested fix

   const fetchAndParseFile = async (
     id: string | number,
   ): Promise<{ headers: string[]; rows: string[][] } | null> => {
     const json = await apiFetch<DatasetFileResponse>(
       `/api/assessment/datasets/${id}?fetch_content=true`,
       apiKey,
     );
     const base64 = json?.file_content;
-    if (!base64) return null;
+    if (!base64) throw new Error("Dataset content is unavailable.");

     // Decode base64 to binary and parse with XLSX
     const binary = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
     const workbook = XLSX.read(binary, { type: "array" });
     const sheet = workbook.Sheets[workbook.SheetNames[0]];
-    if (!sheet) return null;
+    if (!sheet) throw new Error("No worksheet found in the file.");

     const rawData: string[][] = XLSX.utils.sheet_to_json(sheet, {
       header: 1,
       defval: "",
     });
-    if (rawData.length === 0) return null;
+    if (rawData.length === 0) throw new Error("Dataset appears to be empty.");

Then in handleDatasetSelect:

     try {
       const parsed = await fetchAndParseFile(id);
-      if (parsed?.headers) {
+      if (!parsed?.headers?.length) {
+        toast.error("Could not read columns from this dataset.");
+        return;
+      }
         const firstRow = parsed.rows[0] || [];
         ...
         onColumnsLoaded(parsed.headers, sampleRow);
-      }
     } catch (e) {
       if (handleForbiddenApiError(e, onForbidden)) return;
-      console.error("Failed to fetch dataset columns:", e);
+      toast.error(e instanceof Error ? e.message : "Failed to fetch dataset columns");
     } finally {

Also applies to: 224-242

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 76 - 105,
fetchAndParseFile currently returns null on multiple failure paths which leaves
handleDatasetSelect unaware of failures; update fetchAndParseFile to throw
descriptive errors (e.g., "no file_content", "no sheet", "empty data") or return
a {error: string} result so callers can react, and ensure handleDatasetSelect
wraps the call in try/catch (or checks the error) to set
setIsLoadingColumns(false), clear/set columns via setColumns([]) on failure, and
surface a user-visible message (via setAlert or the existing error UI) instead
of silently doing nothing; also update canProceed to include a check on
columns.length (or !isLoadingColumns && columns.length > 0) so navigation is
blocked when no columns are loaded.

🧹 Nitpick comments (7)

app/components/speech-to-text/ModelComparisonCard.tsx (1)
70-70: Consider stabilizing detailsId with useId() to avoid collisions.

Line 70 builds DOM ids directly from modelId; if duplicate model ids are rendered, aria-controls can become ambiguous. useId() + model suffix is safer.
🧩 Proposed refactor
-import React, { useState, useEffect } from "react";
+import React, { useState, useEffect, useId } from "react";
@@
-  const detailsId = `model-card-details-${modelId}`;
+  const reactId = useId();
+  const detailsId = `model-card-details-${modelId}-${reactId}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/speech-to-text/ModelComparisonCard.tsx` at line 70, Replace
the unstable DOM id generation in ModelComparisonCard (detailsId built from
modelId) with a stable, React-generated id using useId(): import useId from
React, call const reactId = useId(), then compose detailsId by combining reactId
with the model-specific suffix (e.g.,
`${reactId}-model-card-details-${modelId}`) and update any
aria-controls/aria-labelledby usages to use this new detailsId so ids remain
unique even when modelId values collide; ensure useId() is called inside the
ModelComparisonCard component function.
app/assessment/components/JsonEditor.tsx (2)
16-23: Prefer shared theme tokens over local hardcoded hex values.

C currently hardcodes syntax colors, which makes future theming/dark-mode consistency harder. Consider moving these to shared tokens (CSS vars or centralized color map) and reading from there.

As per coding guidelines, "Use Tailwind CSS 4.x for styling, with custom color system and styles defined in /app/globals.css for cases not supported by Tailwind".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/JsonEditor.tsx` around lines 16 - 23, Replace the
local hardcoded color map C in JsonEditor.tsx with shared theme tokens: remove
hex literals and reference the centralized color system (CSS custom properties
defined in /app/globals.css or the project-wide color map) using the token names
used by our Tailwind 4.x custom color system; update the JsonEditor component to
read from those tokens (e.g., via getComputedStyle/CSS variables or importing
the shared color map) so syntax colors (keys: key, string, number, boolean,
null, punct) come from the global theme rather than local hex values, preserving
existing property names in C to minimize downstream changes.
3-3: Memoize highlighted HTML to reduce avoidable recomputation.

highlight(value) runs during every render. Memoizing the computed HTML keeps this cheaper when rerenders happen without value changes.
♻️ Proposed refactor
-import { useRef, useCallback, useId } from "react";
+import { useRef, useCallback, useId, useMemo } from "react";
...
   const textareaId = useId();
   const errorId = `${textareaId}-error`;
+  const highlightedHtml = useMemo(() => `${highlight(value)}\n`, [value]);
...
-          dangerouslySetInnerHTML={{ __html: highlight(value) + "\n" }}
+          dangerouslySetInnerHTML={{ __html: highlightedHtml }}
Please verify with React Profiler on a large JSON payload (e.g., several thousand characters) to confirm reduced render cost during non-text updates.

Also applies to: 83-85, 234-234
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/JsonEditor.tsx` at line 3, The call to
highlight(value) is recomputing on every render; wrap the highlighted HTML
computation in React.useMemo so it only recalculates when the input value
changes (e.g., const highlightedHtml = useMemo(() => highlight(value),
[value])). Update all occurrences where highlight(value) is used (including the
places around the existing useRef/useCallback hooks and the usage near the
render that injects the highlighted HTML) to use the memoized highlightedHtml
variable instead; run React Profiler afterwards with a large JSON payload to
confirm reduced work during renders that do not change value.
app/assessment/config/api.ts (1)
16-16: Use the root import alias here.

Switch this relative import to @/app/assessment/types for consistency with the repo standard.
Suggested fix
-import { ConfigSelection } from "../types";
+import { ConfigSelection } from "@/app/assessment/types";
As per coding guidelines, **/*.{ts,tsx,js,jsx}: Use the @/ import alias configured in tsconfig.json to resolve imports from the project root.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/api.ts` at line 16, The import in api.ts currently uses
a relative path for ConfigSelection; change it to the project root alias by
importing ConfigSelection from "@/app/assessment/types" instead of "../types" so
it follows the repo's tsconfig alias convention and keeps imports consistent
(update the import statement that references ConfigSelection).
app/assessment/AssessmentPageClient.tsx (1)
48-52: Global window property for navigation lock is acceptable but consider alternatives.

Using window.__assessmentForbiddenNavLock works for preventing duplicate redirects, but this pattern can be fragile. A module-scoped variable or a ref would be cleaner and wouldn't pollute the global namespace.
♻️ Alternative using module-scoped variable
+let assessmentForbiddenNavLock = false;
+
-declare global {
-  interface Window {
-    __assessmentForbiddenNavLock?: boolean;
-  }
-}

// Then in handleAssessmentForbidden:
-      if (
-        typeof window !== "undefined" &&
-        window.__assessmentForbiddenNavLock
-      ) {
+      if (assessmentForbiddenNavLock) {
         return;
       }
       ...
-      if (typeof window !== "undefined") {
-        window.__assessmentForbiddenNavLock = true;
-      }
+      assessmentForbiddenNavLock = true;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 48 - 52, The code uses
the global flag window.__assessmentForbiddenNavLock to prevent duplicate
redirects; replace this global with a module-scoped variable or a React ref to
avoid polluting window. Locate uses of window.__assessmentForbiddenNavLock in
AssessmentPageClient.tsx (e.g., in the redirect/navigation handler) and change
them to a top-level let (module-scoped) or a useRef inside the component (e.g.,
navigationLockRef) that you check/set instead of reading/writing
window.__assessmentForbiddenNavLock; ensure the logic for checking and setting
the lock is preserved and remove the global declaration block.
app/assessment/components/EvaluationsTab.tsx (2)
278-345: Potential stale closure in loadConfigDetail deduplication check.

The check at line 283 (if (configDetailsByKey[key] || configLoadingKeys[key]) return) reads from state objects that are dependencies of the useCallback. Since object references change on every state update, this should work correctly, but the deduplication logic relies on the closure capturing the latest state. Consider using a ref for the loading keys set if you encounter duplicate requests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 278 - 345, The
deduplication check inside loadConfigDetail can use stale closure values for
configLoadingKeys/configDetailsByKey; switch to using a mutable ref for at least
the loading set (e.g., configLoadingKeysRef) that you keep in sync whenever
setConfigLoadingKeys runs (update ref in the same setter or via useEffect), read
configLoadingKeysRef.current in the early-return check inside loadConfigDetail,
and update both state (setConfigLoadingKeys) and the ref when marking/clearing
keys so UI state stays consistent while dedupe uses the always-current ref; keep
references to loadConfigDetail, configLoadingKeys state setter, and
configDetailsByKey updates intact.
398-401: Raw fetch bypasses shared error handling.

triggerDownload uses raw fetch instead of apiFetch. While this is needed for blob downloads, the 403 handling at lines 402-405 is manual. Consider extracting a download-specific helper or documenting why raw fetch is required here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 398 - 401, The raw
fetch in triggerDownload bypasses shared API error handling; extract or use a
download-specific helper (e.g., apiFetchBlob) that accepts the URL, headers
(buildAuthHeaders()), and export_format, performs the fetch for a blob, and
invokes the same centralized error handling used elsewhere (e.g., the existing
apiFetch or handleApiError logic) so 403/unauthorized responses are handled
consistently instead of duplicating the 403 check in triggerDownload; update
triggerDownload to call that helper and return the blob/trigger the download.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/api/assessment/datasets/`[dataset_id]/route.ts:
- Around line 79-85: The DELETE response handling should special-case HTTP 204:
after calling apiClient(request, `/api/v1/assessment/datasets/${dataset_id}`, {
method: "DELETE" }) inspect the returned status and if it equals 204 return an
empty NextResponse with that status instead of calling NextResponse.json(data, {
status }), otherwise continue to return NextResponse.json(data, { status });
update the route handler in route.ts where apiClient and dataset_id are used to
implement this conditional.
- Around line 45-55: The route currently calls fetch(signedUrl) then awaits
fileResponse.arrayBuffer() and base64-encodes it (using
Buffer.from(...).toString("base64")), which can OOM on large objects; update the
logic in the handler in route.ts to first inspect
fileResponse.headers.get('content-length') (or otherwise determine size) and
enforce a strict max size (e.g., reject with NextResponse.json({ error: "File
too large" }, { status: 413 })) before calling arrayBuffer(), or alternatively
return the signedUrl to the client instead of downloading and encoding
server-side; ensure you change the branch that currently performs arrayBuffer()
and Buffer.from(...) to use the size check or return the signed URL so the
server never buffers large files into memory.

In `@app/assessment/config/api.ts`:
- Around line 25-33: The helper getApiKey currently always returns keys[0].key
which breaks when multiple keys exist; update getApiKey to resolve the same
selected/active keystore entry the rest of the assessment uses (instead of
blindly picking index 0) by reading the selection marker the app stores (e.g.,
an activeKeyId or selectedIndex stored alongside STORAGE_KEY) or accept the
selected key id as an argument; locate the function getApiKey and STORAGE_KEY
usage, read the selection field the rest of the assessment uses (or add a
parameter to getApiKey to receive selectedKeyId), then return the key object
matching that id (or null if not found) rather than keys[0].key.

In `@app/assessment/useAssessmentEvents.ts`:
- Around line 90-95: The SSE connection error handling in useAssessmentEvents
currently calls onForbidden() and then throws, which causes the reconnect loop
to keep retrying; instead, make 403 a terminal case by invoking onForbidden()
and returning early (do not throw) so the reconnect/breaker isn't fed a
retryable error. Update the block that checks response.status === 403 (the SSE
response handling in useAssessmentEvents) to call onForbidden() and exit the
function/connection setup immediately for 403, while continuing to throw for
other non-ok statuses.

In `@app/components/speech-to-text/ModelComparisonCard.tsx`:
- Around line 130-137: The visual spinner rendered in ModelComparisonCard.tsx
when status === "pending" is not announced to assistive tech; update the spinner
container (the div that currently has className "w-4 h-4 ... animate-spin") to
include an accessible status by adding role="status" and an appropriate text
alternative such as aria-label="Loading" or an offscreen <span> with
aria-live="polite" (or aria-hidden on purely decorative span) so screen readers
receive a clear "loading" announcement; ensure the change is applied inside the
component where the status variable is checked so it only appears during the
pending state.

---

Duplicate comments:
In `@app/api/assessment/events/route.ts`:
- Around line 19-24: The handler currently returns the upstream SSE body
directly to the client (response.text() -> details: text); change this to log
the full body server-side (use the existing logger or
processLogger/console.error to record the text and response.status) and return a
sanitized JSON to the client via NextResponse.json (e.g., { error: "Failed to
connect assessment event stream" }) with the same status code; keep using the
existing response variable and NextResponse.json call but remove echoing `text`
in the client payload while preserving server-side logging for debugging.

In `@app/assessment/AssessmentPageClient.tsx`:
- Around line 354-367: The POST payload in AssessmentPageClient.tsx is missing
the mapped ground-truth columns (state symbol columnMapping.groundTruthColumns)
so reference labels are not sent; update the payload construction (the payload
object near the code creating experiment_name, text_columns, attachments) to
include a ground_truth_columns property that serializes
columnMapping.groundTruthColumns in the same shape as text_columns/attachments
(e.g., map each entry to the expected { column, ... } structure or pass the raw
mapping if the API accepts it), ensuring the key name matches the API contract
used by submit/create handlers.

In `@app/assessment/components/DatasetStep.tsx`:
- Around line 76-105: fetchAndParseFile currently returns null on multiple
failure paths which leaves handleDatasetSelect unaware of failures; update
fetchAndParseFile to throw descriptive errors (e.g., "no file_content", "no
sheet", "empty data") or return a {error: string} result so callers can react,
and ensure handleDatasetSelect wraps the call in try/catch (or checks the error)
to set setIsLoadingColumns(false), clear/set columns via setColumns([]) on
failure, and surface a user-visible message (via setAlert or the existing error
UI) instead of silently doing nothing; also update canProceed to include a check
on columns.length (or !isLoadingColumns && columns.length > 0) so navigation is
blocked when no columns are loaded.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Line 786: The badge currently uses run.status.replace("_", " ") which only
replaces the first underscore; update the badge to call the existing
formatStatusLabel function (defined as formatStatusLabel) with run.status so all
underscores are replaced via the global regex and the label is consistently
formatted; locate the usage near the EvaluationsTab component where run.status
is rendered and replace the replace call with formatStatusLabel(run.status).
- Around line 881-883: The isCompletedChild check only matches childRun.status
=== "completed", causing "completed_with_errors" runs to miss preview/export
actions; update the logic that defines isCompletedChild (the variable using
childRun.status) to include "completed_with_errors" as a valid completed state
(e.g., check for status === "completed" || status === "completed_with_errors" or
use a startsWith/includes test) so those child runs show the preview/export
actions.

---

Nitpick comments:
In `@app/assessment/AssessmentPageClient.tsx`:
- Around line 48-52: The code uses the global flag
window.__assessmentForbiddenNavLock to prevent duplicate redirects; replace this
global with a module-scoped variable or a React ref to avoid polluting window.
Locate uses of window.__assessmentForbiddenNavLock in AssessmentPageClient.tsx
(e.g., in the redirect/navigation handler) and change them to a top-level let
(module-scoped) or a useRef inside the component (e.g., navigationLockRef) that
you check/set instead of reading/writing window.__assessmentForbiddenNavLock;
ensure the logic for checking and setting the lock is preserved and remove the
global declaration block.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Around line 278-345: The deduplication check inside loadConfigDetail can use
stale closure values for configLoadingKeys/configDetailsByKey; switch to using a
mutable ref for at least the loading set (e.g., configLoadingKeysRef) that you
keep in sync whenever setConfigLoadingKeys runs (update ref in the same setter
or via useEffect), read configLoadingKeysRef.current in the early-return check
inside loadConfigDetail, and update both state (setConfigLoadingKeys) and the
ref when marking/clearing keys so UI state stays consistent while dedupe uses
the always-current ref; keep references to loadConfigDetail, configLoadingKeys
state setter, and configDetailsByKey updates intact.
- Around line 398-401: The raw fetch in triggerDownload bypasses shared API
error handling; extract or use a download-specific helper (e.g., apiFetchBlob)
that accepts the URL, headers (buildAuthHeaders()), and export_format, performs
the fetch for a blob, and invokes the same centralized error handling used
elsewhere (e.g., the existing apiFetch or handleApiError logic) so
403/unauthorized responses are handled consistently instead of duplicating the
403 check in triggerDownload; update triggerDownload to call that helper and
return the blob/trigger the download.

In `@app/assessment/components/JsonEditor.tsx`:
- Around line 16-23: Replace the local hardcoded color map C in JsonEditor.tsx
with shared theme tokens: remove hex literals and reference the centralized
color system (CSS custom properties defined in /app/globals.css or the
project-wide color map) using the token names used by our Tailwind 4.x custom
color system; update the JsonEditor component to read from those tokens (e.g.,
via getComputedStyle/CSS variables or importing the shared color map) so syntax
colors (keys: key, string, number, boolean, null, punct) come from the global
theme rather than local hex values, preserving existing property names in C to
minimize downstream changes.
- Line 3: The call to highlight(value) is recomputing on every render; wrap the
highlighted HTML computation in React.useMemo so it only recalculates when the
input value changes (e.g., const highlightedHtml = useMemo(() =>
highlight(value), [value])). Update all occurrences where highlight(value) is
used (including the places around the existing useRef/useCallback hooks and the
usage near the render that injects the highlighted HTML) to use the memoized
highlightedHtml variable instead; run React Profiler afterwards with a large
JSON payload to confirm reduced work during renders that do not change value.

In `@app/assessment/config/api.ts`:
- Line 16: The import in api.ts currently uses a relative path for
ConfigSelection; change it to the project root alias by importing
ConfigSelection from "@/app/assessment/types" instead of "../types" so it
follows the repo's tsconfig alias convention and keeps imports consistent
(update the import statement that references ConfigSelection).

In `@app/components/speech-to-text/ModelComparisonCard.tsx`:
- Line 70: Replace the unstable DOM id generation in ModelComparisonCard
(detailsId built from modelId) with a stable, React-generated id using useId():
import useId from React, call const reactId = useId(), then compose detailsId by
combining reactId with the model-specific suffix (e.g.,
`${reactId}-model-card-details-${modelId}`) and update any
aria-controls/aria-labelledby usages to use this new detailsId so ids remain
unique even when modelId values collide; ensure useId() is called inside the
ModelComparisonCard component function.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a8bc382f-67d4-4a19-bfe1-93e4b7b924b0

📥 Commits

Reviewing files that changed from the base of the PR and between fd34470 and c5640ad.

📒 Files selected for processing (19)

app/api/assessment/assessments/[assessment_id]/results/route.ts
app/api/assessment/assessments/[assessment_id]/retry/route.ts
app/api/assessment/assessments/route.ts
app/api/assessment/datasets/[dataset_id]/route.ts
app/api/assessment/datasets/route.ts
app/api/assessment/evaluations/[evaluation_id]/results/route.ts
app/api/assessment/evaluations/[evaluation_id]/retry/route.ts
app/api/assessment/evaluations/route.ts
app/api/assessment/events/route.ts
app/api/assessment/utils.ts
app/assessment/AssessmentPageClient.tsx
app/assessment/components/DataViewModal.tsx
app/assessment/components/DatasetStep.tsx
app/assessment/components/EvaluationsTab.tsx
app/assessment/components/JsonEditor.tsx
app/assessment/config/api.ts
app/assessment/errorUtils.ts
app/assessment/useAssessmentEvents.ts
app/components/speech-to-text/ModelComparisonCard.tsx

✅ Files skipped from review due to trivial changes (1)

app/assessment/components/DataViewModal.tsx

🚧 Files skipped from review as they are similar to previous changes (5)

app/api/assessment/assessments/route.ts
app/api/assessment/datasets/route.ts
app/api/assessment/assessments/[assessment_id]/retry/route.ts
app/api/assessment/assessments/[assessment_id]/results/route.ts
app/api/assessment/evaluations/route.ts

… UI elements, and introduce a new toggle switch component

coderabbitai

Actionable comments posted: 5

♻️ Duplicate comments (7)

app/assessment/components/DatasetStep.tsx (1)

77-105: ⚠️ Potential issue | 🟠 Major

Parse failures still leave the step in a valid-looking state.

fetchAndParseFile() still collapses every bad path to null. In handleDatasetSelect(), that bypasses the catch block, keeps the dataset selected, and makes canProceed truthy again once loading finishes, so users can advance with no loaded columns. Please surface a real error here and clear/mark the selection invalid on parse failure.

Suggested fix

 const fetchAndParseFile = async (
   id: string | number,
-): Promise<{ headers: string[]; rows: string[][] } | null> => {
+): Promise<{ headers: string[]; rows: string[][] }> => {
   const json = await apiFetch<DatasetFileResponse>(
     `/api/assessment/datasets/${id}?fetch_content=true`,
     apiKey,
   );
   const base64 = json?.file_content;
-  if (!base64) return null;
+  if (!base64) {
+    throw new Error("Dataset content is unavailable.");
+  }

   const binary = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
   const workbook = XLSX.read(binary, { type: "array" });
   const sheet = workbook.Sheets[workbook.SheetNames[0]];
-  if (!sheet) return null;
+  if (!sheet) {
+    throw new Error("No readable sheet found in this dataset.");
+  }

   const rawData: string[][] = XLSX.utils.sheet_to_json(sheet, {
     header: 1,
     defval: "",
   });
-  if (rawData.length === 0) return null;
+  if (rawData.length === 0) {
+    throw new Error("Dataset is empty.");
+  }

     try {
       const parsed = await fetchAndParseFile(id);
-      if (parsed?.headers) {
-        const firstRow = parsed.rows[0] || [];
-        const sampleRow = Object.fromEntries(
-          parsed.headers.map((header, index) => [
-            header,
-            String(firstRow[index] ?? ""),
-          ]),
-        );
-        onColumnsLoaded(parsed.headers, sampleRow);
-      }
+      if (!parsed.headers.length) {
+        onColumnsLoaded([], {});
+        toast.error("Could not read columns from this dataset.");
+        return;
+      }
+      const firstRow = parsed.rows[0] || [];
+      const sampleRow = Object.fromEntries(
+        parsed.headers.map((header, index) => [
+          header,
+          String(firstRow[index] ?? ""),
+        ]),
+      );
+      onColumnsLoaded(parsed.headers, sampleRow);
     } catch (e) {
       if (handleForbiddenApiError(e, onForbidden)) return;
-      console.error("Failed to fetch dataset columns:", e);
+      onColumnsLoaded([], {});
+      toast.error(
+        e instanceof Error ? e.message : "Failed to fetch dataset columns",
+      );
     } finally {
       setIsLoadingColumns(false);
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 77 - 105,
fetchAndParseFile currently collapses all failure paths to null which prevents
handleDatasetSelect from entering its catch and leaves the step in a
valid-looking state; change fetchAndParseFile (in DatasetStep) to throw a
descriptive Error when the file is missing or parsing fails (e.g., "Failed to
parse dataset file" including underlying info) instead of returning null, and
update handleDatasetSelect to catch that Error, clear/reset the selected dataset
(remove the id/selection), mark the step invalid (set canProceed to false or
call the existing invalidation routine), and surface the error to the user
(e.g., via setError / toast) so a parse failure cannot leave the UI selectable.

app/assessment/AssessmentPageClient.tsx (1)

251-259: ⚠️ Potential issue | 🟠 Major

Ground-truth mappings are still dropped on submit.

The request body still omits columnMapping.groundTruthColumns, so assessments that rely on reference labels will be submitted without them.

Suggested fix

       const payload = {
         experiment_name: experimentName.trim(),
         dataset_id: parseInt(datasetId, 10),
         prompt_template: promptTemplate || null,
         text_columns: columnMapping.textColumns,
+        ground_truth_columns: columnMapping.groundTruthColumns,
         attachments: columnMapping.attachments.map(
           ({ column, type, format }) => ({ column, type, format }),
         ),
         output_schema: schemaToJsonSchema(outputSchema) || null,

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 251 - 259, The payload
being sent omits ground-truth mappings so reference labels are lost; update the
payload construction in AssessmentPageClient (where payload is built) to include
groundTruthColumns from columnMapping (e.g., add a ground_truth_columns field
set to columnMapping.groundTruthColumns or null if empty), ensure the key naming
matches backend expectations (snake_case like ground_truth_columns) and that any
transformation (mapping objects to { column, type, format } shapes) mirrors how
attachments are handled so the server receives the ground-truth column info.

app/assessment/config/ConfigurationStep.tsx (1)

830-907: ⚠️ Potential issue | 🟠 Major

The default card action is still pinned to version 1.

“Use this behavior” is still selecting version 1, not the latest saved version represented by the card. That makes the primary CTA apply stale behavior settings by default.

Suggested fix

-                          const defaultSelected = isSelected(config.id, 1);
-                          const defaultLoading =
-                            loadingSelectionKeys[`${config.id}:1`];
+                          const latestVersion =
+                            versions.items.reduce(
+                              (max, item) => Math.max(max, item.version),
+                              1,
+                            );
+                          const defaultSelected = isSelected(
+                            config.id,
+                            latestVersion,
+                          );
+                          const defaultLoading =
+                            loadingSelectionKeys[
+                              `${config.id}:${latestVersion}`
+                            ];
...
-                                      void toggleVersionSelection(config, 1);
+                                      void toggleVersionSelection(
+                                        config,
+                                        latestVersion,
+                                      );

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/ConfigurationStep.tsx` around lines 830 - 907, The card
is hardcoded to version 1 (see defaultSelected = isSelected(config.id, 1) and
the onClick calling toggleVersionSelection(config, 1)); change these to use the
card's actual latest/saved version instead of 1—derive the version id from the
available data (e.g., prefer config.latest_version_id or the first item in
versions.items: const latestVersionId = config.latest_version_id ??
versions.items[0]?.id) and replace both uses of the literal 1 with
latestVersionId (also use the same latestVersionId when computing
defaultLoading: loadingSelectionKeys[`${config.id}:${latestVersionId}`]) so the
primary CTA targets the most recent saved version.

app/assessment/components/PromptAndConfigStep.tsx (1)

1076-1175: ⚠️ Potential issue | 🟠 Major

The primary CTA still targets version 1.

The card action is still wired to version 1, so “Use this behavior” selects the oldest saved version unless the user opens the versions panel first. Derive the default version from the latest loaded version instead of pinning it to 1.

Suggested fix

-                          const defaultSelected = isSelected(config.id, 1);
-                          const defaultLoading =
-                            loadingSelectionKeys[`${config.id}:1`];
+                          const latestVersion =
+                            versions.items.reduce(
+                              (max, item) => Math.max(max, item.version),
+                              1,
+                            );
+                          const defaultSelected = isSelected(
+                            config.id,
+                            latestVersion,
+                          );
+                          const defaultLoading =
+                            loadingSelectionKeys[
+                              `${config.id}:${latestVersion}`
+                            ];
...
-                                  onClick={() =>
-                                    void toggleVersionSelection(config, 1)
-                                  }
+                                  onClick={() =>
+                                    void toggleVersionSelection(
+                                      config,
+                                      latestVersion,
+                                    )
+                                  }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/PromptAndConfigStep.tsx` around lines 1076 - 1175,
The primary CTA is hardcoded to version 1; compute a defaultVersion from the
loaded versions instead (e.g. latest = versions.items[0] or fallback to 1) and
replace the hardcoded `1` in isSelected(config.id, 1), the loading key lookup
`loadingSelectionKeys[\`${config.id}:1\`]`, and the toggle call
`toggleVersionSelection(config, 1)` with this defaultVersion variable so the
button targets the most recently loaded version when available. Ensure you also
use the same defaultVersion when building any labels/disabled/loading checks so
behavior is consistent.

app/assessment/components/EvaluationsTab.tsx (3)

407-410: ⚠️ Potential issue | 🟠 Major

Run exports through the shared client fetch path.

This request still bypasses the client-side 401 refresh / AUTH_EXPIRED_EVENT flow, so exports will fail once the session expires even though the rest of the UI can recover.

As per coding guidelines: Use clientFetch(endpoint, options) on the client-side for API requests, which handles token refresh on 401 errors and dispatches AUTH_EXPIRED_EVENT when refresh fails.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 407 - 410, The
export fetch bypasses the client-side auth handling so expired sessions won't
refresh; replace the direct fetch call in EvaluationsTab (the block using
buildAuthHeaders() and credentials) with the shared clientFetch(endpoint,
options) call so token refresh and AUTH_EXPIRED_EVENT are handled; keep the same
URL and options (include export_format) but pass headers/payload via clientFetch
instead of fetch to reuse the centralized 401/refresh logic.

890-892: ⚠️ Potential issue | 🟠 Major

Treat completed_with_errors child runs as preview/exportable.

These runs can still have successful results, but isCompletedChild only matches "completed", which hides Preview and Export for partially successful child runs.

Suggested fix

-                            const isCompletedChild =
-                              childRun.status === "completed";
+                            const isCompletedChild =
+                              childRun.status === "completed" ||
+                              childRun.status === "completed_with_errors";

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 890 - 892, The
check that determines whether a child run is considered "completed" only matches
childRun.status === "completed" (isCompletedChild), which excludes
"completed_with_errors" runs that should still allow Preview/Export; update the
condition in EvaluationsTab (the isCompletedChild boolean) to treat both
"completed" and "completed_with_errors" as completed (e.g., check
childRun.status === "completed" || childRun.status === "completed_with_errors")
so preview/export UI logic catches partially successful child runs; ensure any
downstream uses of isCompletedChild continue to work with the expanded
predicate.

788-796: ⚠️ Potential issue | 🟡 Minor

Use the existing status formatter for the assessment badge.

replace("_", " ") only fixes the first underscore, so statuses like completed_with_errors still render incorrectly here.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 788 - 796, In
EvaluationsTab (the JSX that renders the badge using run.status and
statusStyle), replace the fragile run.status.replace("_", " ") expression with
the project's centralized status formatter (e.g., call the existing function
such as formatStatus or formatAssessmentStatus) so multi-underscore statuses
like "completed_with_errors" render correctly; import or reference that
formatter where needed and guard for null/undefined run.status (e.g., fallback
to an empty string) so the badge always displays a formatted label.

🧹 Nitpick comments (4)

app/assessment/components/CompactToggleSwitch.tsx (1)
36-37: Prefer design tokens over hardcoded hex values in thumb styles.

Lines 36-37 mix hardcoded colors with tokenized palette. Consider using colors.text.white / colors.text.primary for consistency and easier theming.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/CompactToggleSwitch.tsx` around lines 36 - 37, In
CompactToggleSwitch update the thumb style to use design tokens instead of
hardcoded hexes: replace the backgroundColor set when checked from "#ffffff" to
colors.text.white and replace the color set when checked from "#111111" to
colors.text.primary; make these substitutions in the thumb style block (where
backgroundColor and color are computed from the checked variable) so theming
remains consistent.
app/assessment/AssessmentPageClient.tsx (1)
74-75: Use the shared app context for sidebar state.

Owning sidebarCollapsed locally here means the shell state resets on remount and can drift from the rest of the app. Wire this through useApp() instead of page-local state. As per coding guidelines, "Manage sidebar and app-level UI state using the useApp() hook from @/app/lib/context/AppContext to access sidebarCollapsed, setSidebarCollapsed, and toggleSidebar".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 74 - 75, The component
is managing sidebarCollapsed locally; replace this with the shared app context
by importing and calling useApp() inside AssessmentPageClient and using the
returned sidebarCollapsed, setSidebarCollapsed (and/or toggleSidebar) instead of
the local useState variables; remove the const [sidebarCollapsed,
setSidebarCollapsed] = useState(...) declaration and update any references and
event handlers to call the context's setSidebarCollapsed or toggleSidebar so the
page uses global app UI state.
app/assessment/components/EvaluationsTab.tsx (2)
93-107: Use the shared date utilities for relative time formatting.

This hand-rolled formatter bypasses the repo-standard date stack and will be harder to keep consistent for edge cases and future localization.

As per coding guidelines: Use date-fns 4.1.0 and date-fns-tz 3.2.0 for date/time handling across the application.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 93 - 107, Replace
the hand-rolled formatRelativeTime function with the repo-standard date utility:
remove the manual Date math in function formatRelativeTime and call the shared
relative-time helper (e.g., use the project's exported function like
formatRelative or formatRelativeFromNow from the date utilities) which is
implemented using date-fns 4.1.0 and date-fns-tz 3.2.0; ensure the function
signature stays the same (accepts dateStr: string and returns string) so callers
are unchanged and add any necessary import for the shared util at the top of
EvaluationsTab.tsx.
14-16: Switch these internal imports to the @/ alias.

This file mixes root-alias imports with relative app imports, which makes moves/renames more fragile than they need to be.
Suggested cleanup
-import DataViewModal, { jsonResultsToTableData } from "./DataViewModal";
+import DataViewModal, {
+  jsonResultsToTableData,
+} from "@/app/assessment/components/DataViewModal";
 import { ConfigResponse, ConfigVersionResponse } from "@/app/lib/configTypes";
-import { handleForbiddenApiError } from "../errorUtils";
+import { handleForbiddenApiError } from "@/app/assessment/errorUtils";
As per coding guidelines: Use the @/ import alias configured in tsconfig.json to resolve imports from the project root.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 14 - 16, Replace
the relative imports with root-alias imports: change the DataViewModal import
(symbols DataViewModal and jsonResultsToTableData) from "./DataViewModal" to
"@/assessment/components/DataViewModal", and change the handleForbiddenApiError
import from "../errorUtils" to "@/assessment/errorUtils"; leave the existing
ConfigResponse/ConfigVersionResponse import as-is if it already uses the "@/..."
alias. Ensure the imported symbols and paths match the exported names in those
target modules.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/assessment/components/CompactToggleSwitch.tsx`:
- Line 4: Replace the relative import of the icon module with the project-root
alias: update the import that currently references "./icons/ToggleThumbIcons"
(bringing in ToggleOffIcon and ToggleOnIcon) to use the
"@/assessment/components/icons/ToggleThumbIcons" alias so the file
CompactToggleSwitch.tsx follows the repo rule requiring `@/` root imports.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Around line 333-349: The catch block in EvaluationsTab.tsx that updates
setConfigErrorKeys needs to propagate forbidden (403) errors through the
existing forbidden handler instead of only storing the error locally; update the
catch to detect a 403 (by inspecting the thrown error/response) and call
handleForbiddenApiError(error) (the same function used elsewhere) before or
instead of setting setConfigErrorKeys for that case, leaving current behavior
for non-403 errors and preserving the finally block that updates
configLoadingKeys; ensure you reference the same error object/shape used by
other calls to handleForbiddenApiError so the forbidden flow is triggered
consistently.

In `@app/assessment/components/OutputSchemaStep.tsx`:
- Around line 287-306: The remove/delete buttons (the button that calls
onRemove(property.id) and the similar icon-only button further down) lack
accessible names; replace or augment the unreliable title attribute by adding
explicit aria-label attributes (e.g., aria-label={`Delete ${property.name ||
'field'}`} or aria-label="Remove enum value") on the button elements so screen
readers can announce them, keeping the existing onClick handlers (onRemove) and
styling intact; update both the button referencing onRemove(property.id) and the
corresponding button in the 327-345 block to use appropriate, descriptive
aria-label text.

In `@app/assessment/components/PromptAndConfigStep.tsx`:
- Around line 37-38: The helpers addSelection and removeSelection close over the
configs array and use non-functional setConfigs which can cause race conditions
when toggleVersionSelection awaits fetchConfigSelection; change setConfigs
usages to the functional updater form (e.g., setConfigs(prev => [...prev,
selection]) and setConfigs(prev => prev.filter(...))) inside
addSelection/removeSelection so updates are based on the latest state, and
ensure setConfigs signature accepts the updater. Also replace the hard-coded
version 1 in the "Use this behavior" CTA logic so it targets the latest known
version (derive the latest version from the config/version list or state and
inject that value instead of 1) in the component code that builds the CTA.

In `@app/assessment/config/ConfigurationStep.tsx`:
- Around line 33-36: addSelection and removeSelection capture the configs array
from a stale closure and can overwrite concurrent updates when
fetchConfigSelection resolves; change both places that call setConfigs to use
the functional updater form so updates are applied to the latest state (e.g., in
addSelection use setConfigs(prev => [...prev, selection]) after the async fetch
resolves, and in removeSelection use setConfigs(prev => prev.filter(c => c.id
!== idToRemove))). Also audit the analogous code in PromptAndConfigStep (the
selection add/remove flow) and apply the same functional-update pattern to avoid
lost selections.

---

Duplicate comments:
In `@app/assessment/AssessmentPageClient.tsx`:
- Around line 251-259: The payload being sent omits ground-truth mappings so
reference labels are lost; update the payload construction in
AssessmentPageClient (where payload is built) to include groundTruthColumns from
columnMapping (e.g., add a ground_truth_columns field set to
columnMapping.groundTruthColumns or null if empty), ensure the key naming
matches backend expectations (snake_case like ground_truth_columns) and that any
transformation (mapping objects to { column, type, format } shapes) mirrors how
attachments are handled so the server receives the ground-truth column info.

In `@app/assessment/components/DatasetStep.tsx`:
- Around line 77-105: fetchAndParseFile currently collapses all failure paths to
null which prevents handleDatasetSelect from entering its catch and leaves the
step in a valid-looking state; change fetchAndParseFile (in DatasetStep) to
throw a descriptive Error when the file is missing or parsing fails (e.g.,
"Failed to parse dataset file" including underlying info) instead of returning
null, and update handleDatasetSelect to catch that Error, clear/reset the
selected dataset (remove the id/selection), mark the step invalid (set
canProceed to false or call the existing invalidation routine), and surface the
error to the user (e.g., via setError / toast) so a parse failure cannot leave
the UI selectable.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Around line 407-410: The export fetch bypasses the client-side auth handling
so expired sessions won't refresh; replace the direct fetch call in
EvaluationsTab (the block using buildAuthHeaders() and credentials) with the
shared clientFetch(endpoint, options) call so token refresh and
AUTH_EXPIRED_EVENT are handled; keep the same URL and options (include
export_format) but pass headers/payload via clientFetch instead of fetch to
reuse the centralized 401/refresh logic.
- Around line 890-892: The check that determines whether a child run is
considered "completed" only matches childRun.status === "completed"
(isCompletedChild), which excludes "completed_with_errors" runs that should
still allow Preview/Export; update the condition in EvaluationsTab (the
isCompletedChild boolean) to treat both "completed" and "completed_with_errors"
as completed (e.g., check childRun.status === "completed" || childRun.status ===
"completed_with_errors") so preview/export UI logic catches partially successful
child runs; ensure any downstream uses of isCompletedChild continue to work with
the expanded predicate.
- Around line 788-796: In EvaluationsTab (the JSX that renders the badge using
run.status and statusStyle), replace the fragile run.status.replace("_", " ")
expression with the project's centralized status formatter (e.g., call the
existing function such as formatStatus or formatAssessmentStatus) so
multi-underscore statuses like "completed_with_errors" render correctly; import
or reference that formatter where needed and guard for null/undefined run.status
(e.g., fallback to an empty string) so the badge always displays a formatted
label.

In `@app/assessment/components/PromptAndConfigStep.tsx`:
- Around line 1076-1175: The primary CTA is hardcoded to version 1; compute a
defaultVersion from the loaded versions instead (e.g. latest = versions.items[0]
or fallback to 1) and replace the hardcoded `1` in isSelected(config.id, 1), the
loading key lookup `loadingSelectionKeys[\`${config.id}:1\`]`, and the toggle
call `toggleVersionSelection(config, 1)` with this defaultVersion variable so
the button targets the most recently loaded version when available. Ensure you
also use the same defaultVersion when building any labels/disabled/loading
checks so behavior is consistent.

In `@app/assessment/config/ConfigurationStep.tsx`:
- Around line 830-907: The card is hardcoded to version 1 (see defaultSelected =
isSelected(config.id, 1) and the onClick calling toggleVersionSelection(config,
1)); change these to use the card's actual latest/saved version instead of
1—derive the version id from the available data (e.g., prefer
config.latest_version_id or the first item in versions.items: const
latestVersionId = config.latest_version_id ?? versions.items[0]?.id) and replace
both uses of the literal 1 with latestVersionId (also use the same
latestVersionId when computing defaultLoading:
loadingSelectionKeys[`${config.id}:${latestVersionId}`]) so the primary CTA
targets the most recent saved version.

---

Nitpick comments:
In `@app/assessment/AssessmentPageClient.tsx`:
- Around line 74-75: The component is managing sidebarCollapsed locally; replace
this with the shared app context by importing and calling useApp() inside
AssessmentPageClient and using the returned sidebarCollapsed,
setSidebarCollapsed (and/or toggleSidebar) instead of the local useState
variables; remove the const [sidebarCollapsed, setSidebarCollapsed] =
useState(...) declaration and update any references and event handlers to call
the context's setSidebarCollapsed or toggleSidebar so the page uses global app
UI state.

In `@app/assessment/components/CompactToggleSwitch.tsx`:
- Around line 36-37: In CompactToggleSwitch update the thumb style to use design
tokens instead of hardcoded hexes: replace the backgroundColor set when checked
from "#ffffff" to colors.text.white and replace the color set when checked from
"#111111" to colors.text.primary; make these substitutions in the thumb style
block (where backgroundColor and color are computed from the checked variable)
so theming remains consistent.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Around line 93-107: Replace the hand-rolled formatRelativeTime function with
the repo-standard date utility: remove the manual Date math in function
formatRelativeTime and call the shared relative-time helper (e.g., use the
project's exported function like formatRelative or formatRelativeFromNow from
the date utilities) which is implemented using date-fns 4.1.0 and date-fns-tz
3.2.0; ensure the function signature stays the same (accepts dateStr: string and
returns string) so callers are unchanged and add any necessary import for the
shared util at the top of EvaluationsTab.tsx.
- Around line 14-16: Replace the relative imports with root-alias imports:
change the DataViewModal import (symbols DataViewModal and
jsonResultsToTableData) from "./DataViewModal" to
"@/assessment/components/DataViewModal", and change the handleForbiddenApiError
import from "../errorUtils" to "@/assessment/errorUtils"; leave the existing
ConfigResponse/ConfigVersionResponse import as-is if it already uses the "@/..."
alias. Ensure the imported symbols and paths match the exported names in those
target modules.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0641a33b-bcd9-4582-bd88-c6258ea5c400

📥 Commits

Reviewing files that changed from the base of the PR and between c5640ad and df62836.

📒 Files selected for processing (8)

app/assessment/AssessmentPageClient.tsx
app/assessment/components/CompactToggleSwitch.tsx
app/assessment/components/DatasetStep.tsx
app/assessment/components/EvaluationsTab.tsx
app/assessment/components/OutputSchemaStep.tsx
app/assessment/components/PromptAndConfigStep.tsx
app/assessment/components/icons/ToggleThumbIcons.tsx
app/assessment/config/ConfigurationStep.tsx

Ayush8923

Left some comments. Please check.

…e icon usage - Updated PromptAndConfigStep, PromptEditorStep, ReviewStep, and ConfigurationStep components to accept and utilize an API key for fetching configurations and selections. - Replaced inline SVG icons with reusable icon components (CloseIcon, ChevronLeftIcon, ChevronRightIcon, PlayIcon, DownloadIcon) for better maintainability and consistency. - Enhanced state management in ConfigurationStep and PromptAndConfigStep to use functional updates for setting configurations. - Removed unused API key retrieval logic from api.ts and adjusted fetch functions to require an API key as a parameter. - Cleaned up constants in constants.ts to remove unnecessary options from assessment model configurations. - Removed cockatiel dependency from package.json and package-lock.json as it is no longer needed.

…p GET and POST methods

…nsistency

…ve authentication handling

… API query handling

…button states

Co-authored-by: Ayush <80516839+Ayush8923@users.noreply.github.com>

Ayush8923

Added the comments for refactoring and cleanups, we can handle this in the separate PR . Will create the one issue for this.

Ayush8923 · 2026-05-04T16:51:38Z

- Added comments to clarify the purpose of various components and API routes. - Introduced utility functions for handling dataset files and managing assessment results. - Refactored imports for better organization and clarity. - Removed unused access handling code to streamline the assessment logic.

…ss multiple components and routes

…l data more gracefully

- Updated imports in ResponseSchema, SystemPrompt, and UserPrompt components to use the correct module paths. - Removed duplicate imports of ReviewSection in ColumnsReview, ConfigsReview, DatasetReview, InputReview, and SchemaReview components. - Removed unused parseCsvRow function from DatasetsTab and TextToSpeech DatasetsTab components, and moved it to utils. - Added new hook useAssessmentResults for managing assessment run list state, including fetching, polling, and error handling. - Introduced jsonResultsToTableData utility for transforming JSON results into table format. - Created assessmentFetcher for handling config fetchers and model helpers. - Added feature flag management utilities for handling user features in cookies and local storage. - Implemented output schema utilities for managing schema properties and validation.

…ty file

…zation and clarity

…n and clarity; introduce LoadingSpinner component and enhance dataset handling

…uery parameter handling

Co-authored-by: Ayush <80516839+Ayush8923@users.noreply.github.com>

vprashrex added 3 commits March 31, 2026 09:32

Merge remote-tracking branch 'origin/main' into feature/assesment

4b0e674

Merge branch 'main' into feature/assesment

5d8d9bc

coderabbitai Bot reviewed Apr 21, 2026

View reviewed changes

Comment thread app/components/speech-to-text/ModelComparisonCard.tsx Outdated

Comment thread app/layout.tsx Outdated

vprashrex added 3 commits April 27, 2026 22:13

Refactor JsonEditor and schemaUtils for consistency and readability; …

4c43136

…add "use client" directive in ModelComparisonCard

coderabbitai Bot reviewed Apr 27, 2026

View reviewed changes

vprashrex changed the title ~~Feature/assesment~~ Assessment: Assessment Pipeline Apr 28, 2026

vprashrex self-assigned this Apr 28, 2026

vprashrex added the enhancement New feature or request label Apr 28, 2026

vprashrex added this to Kaapi-dev Apr 28, 2026

vprashrex linked an issue Apr 28, 2026 that may be closed by this pull request

AI Assessment: Multi-step LLM workflow #124

Open

vprashrex moved this to In Review in Kaapi-dev Apr 28, 2026

vprashrex requested review from Ayush8923 and nishika26 April 28, 2026 05:26

vprashrex added the ready-for-review label Apr 28, 2026

Ayush8923 requested changes Apr 28, 2026

View reviewed changes

Ayush8923 reviewed Apr 28, 2026

View reviewed changes

Comment thread app/assessment/AssessmentPageClient.tsx Outdated

Merge branch 'main' into feature/assesment

fd34470

coderabbitai Bot reviewed Apr 28, 2026

View reviewed changes

vprashrex requested a review from Ayush8923 April 29, 2026 06:47

Refactor assessment components: remove unused event handling, enhance…

df62836

… UI elements, and introduce a new toggle switch component

coderabbitai Bot reviewed Apr 29, 2026

View reviewed changes

Ayush8923 requested changes Apr 29, 2026

View reviewed changes

Ayush8923 and others added 2 commits April 29, 2026 19:47

Merge branch 'main' into feature/assesment

e6ef810

vprashrex and others added 8 commits May 4, 2026 14:06

Refactor API routes by removing unused DELETE endpoint and cleaning u…

133b412

…p GET and POST methods

Refactor dataset imports to unify type definition and improve code co…

759e5aa

…nsistency

Refactor assessment components to remove API key dependency and impro…

202eb43

…ve authentication handling

Evaluation: UI/UX Revamp (#142)

c61a2b8

Refactor assessment components to remove unused variables and improve…

6b89fbb

… API query handling

Merge branch 'main' into feature/assesment

ad38e3c

Refactor ColumnMapperStep to remove ground truth handling and update …

2a4a2a0

…button states

Apply suggestions from code review

e061622

Co-authored-by: Ayush <80516839+Ayush8923@users.noreply.github.com>

Ayush8923 requested changes May 4, 2026

View reviewed changes

Ayush8923 mentioned this pull request May 4, 2026

PR Review: Refactor and resolve comments #146

Open

Ayush8923 changed the base branch from main to dev May 4, 2026 17:10

vprashrex added 9 commits May 4, 2026 23:36

Refactor assessment feature comments for clarity and conciseness acro…

407468a

…ss multiple components and routes

Refactor dataset API routes to improve type safety and handle optiona…

9bffd89

…l data more gracefully

Add SavedConfigCard component to manage version selection and display

57f4548

Refactor dataset deletion modal and move related logic to a new utili…

7ca3cd1

…ty file

Remove ModelComparisonCard component and its associated logic

56740b7

Refactor assessment-related imports and constants for improved organi…

04db1e2

…zation and clarity

Refactor assessment components and utilities for improved organizatio…

2b7d103

…n and clarity; introduce LoadingSpinner component and enhance dataset handling

vprashrex requested a review from Ayush8923 May 5, 2026 03:15

Refactor API route handlers to utilize withQueryParams for improved q…

4b6f4e3

…uery parameter handling

Ayush8923 reviewed May 5, 2026

View reviewed changes

Comment thread app/(main)/assessment/page.tsx Outdated

Apply suggestions from code review

8a055a7

Co-authored-by: Ayush <80516839+Ayush8923@users.noreply.github.com>

Ayush8923 merged commit 5927a7d into dev May 5, 2026
2 checks passed

Ayush8923 deleted the feature/assesment branch May 5, 2026 04:00

github-project-automation Bot moved this from In Review to Closed in Kaapi-dev May 5, 2026

This was referenced May 5, 2026

Chore: Assessment Fix #147

Merged

Assessment: Delete Modal and Card UI Fix #149

Merged

Assessment (HotFix): Gemini Model list added #153

Open

Conversation

vprashrex commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Ayush8923 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Ayush8923 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

vprashrex commented Apr 21, 2026 •

edited

Loading

coderabbitai Bot commented Apr 21, 2026 •

edited

Loading