diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..b8e32cf
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,144 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Commands
+
+```bash
+npm run dev              # Start dev server at http://localhost:3000
+npm run lint             # Run ESLint
+npm run build            # Production build
+npm run format           # Format TS/TSX with Prettier
+```
+
+## Architecture Overview
+
+### Tech Stack
+
+- **Framework**: Next.js 16.0.7 (App Router)
+- **React 19.2.0** + **TypeScript** (strict mode)
+- **Routing**: Next.js App Router (react-router-dom is listed as a dependency but currently unused)
+- **Styling**: Tailwind CSS 4.x + custom color system and styles defined in `/app/globals.css` for cases not supported by Tailwind
+- **Data Fetching**: Native Fetch API + SWR 2.3.6 (used selectively where required)
+- **Date/Time**: date-fns 4.1.0, date-fns-tz 3.2.0
+- **ESLint**: Used for maintaining code quality, enforcing consistent coding standards, and catching potential issues during development and build time
+
+### Directory Structure
+
+| Path                           | Purpose                                                                                       |
+| ------------------------------ | --------------------------------------------------------------------------------------------- |
+| `app/(auth)/`                  | Authentication-related routes (e.g., invite, verify flows)                                    |
+| `app/(main)/`                  | Main application routes (dashboard-level features like datasets, evaluations, settings, etc.) |
+| `app/api/`                     | Backend API route handlers (Next.js route handlers acting as BFF layer)                       |
+| `app/components/`              | App-scoped components used within routes/pages                                                |
+| `app/components/icons/`        | Hand-authored React icon components                                                           |
+| `app/hooks/`                   | Custom React hooks specific to app features                                                   |
+| `app/lib/`                     | Core shared logic and utilities across the application                                        |
+| `app/lib/context/`             | React context providers (global state handling)                                               |
+| `app/lib/store/`               | State management logic (custom/global store)                                                  |
+| `app/lib/types/`               | TypeScript type definitions (shared across modules)                                           |
+| `app/lib/utils/`               | Domain-specific utility modules (e.g., evaluation, guardrails)                                |
+| `app/lib/data/`                | Static data and validators (e.g., guardrails validators)                                      |
+| `app/lib/apiClient.ts`         | Centralized API client for forwarding requests to the backend                                 |
+| `app/lib/authCookie.ts`        | Authentication cookie utilities (get/set/remove tokens)                                       |
+| `app/lib/configFetchers.ts`    | API fetchers related to configuration modules                                                 |
+| `app/lib/constants.ts`         | Global constants used across the app                                                          |
+| `app/lib/guardrailsClient.ts`  | Client-side API helpers for guardrails features                                               |
+| `app/lib/models.ts`            | Data models/interfaces for structured data handling                                           |
+| `app/lib/navConfig.ts`         | Navigation configuration (sidebar/menu structure)                                             |
+| `app/lib/promptEditorUtils.ts` | Utility functions for prompt editor logic                                                     |
+| `app/lib/utils.ts`             | General utility/helper functions                                                              |
+| `public/favicon.ico`           | Application favicon                                                                           |
+
+## Import Aliases
+
+[tsconfig.json](./tsconfig.json) sets paths: `{ "@/*": ["./*"] }`, so imports are resolved from the project root using the `@/` prefix. Use:
+
+```ts
+import { apiClient } from '@/app/lib/apiClient';
+import { Providers } from '@/app/components/providers';
+import { APP_NAME } from '@/app/lib/constants';
+```
+
+SVGs follow Next.js defaults (imported as static assets via `next/image` or referenced from `/public`).
+
+## Routing & Role-Based Access
+
+Routing uses the **Next.js App Router** exclusively. Routes are organized via route groups:
+
+- `app/(auth)/` - unauthenticated flows (`/invite`, `/verify`)
+- `app/(main)/` - authenticated app surface (`/evaluations`, `/datasets`, `/configurations`, `/guardrails`, `/knowledge-base`, `/settings`, etc.)
+
+Role gating lives in `middleware.ts`, which reads a `kaapi_role` cookie that takes one of two values:
+
+- `user` - standard authenticated user
+- `superuser` - admin; required for `/settings/*`
+
+The cookie is issued server-side by [authCookie.ts](app/lib/authCookie.ts) after login/verify, based on `user.is_superuser`. Middleware classifies each request into one of:
+
+- `PUBLIC_ROUTES` — open to everyone (`/evaluations`, `/invite`, `/verify`, `/coming-soon/*`)
+- `GUEST_ONLY_ROUTES` — unauthenticated only (`/keystore`); authenticated users are redirected to `/evaluations`
+- `/settings/*` — superuser only
+- Everything else — any authenticated user
+
+There is no dynamic/custom role system; only the two static roles above.
+
+## Toast Notifications
+
+Toasts are managed via a React Context provider ([Toast.tsx](app/components/Toast.tsx)), mounted once in [Providers.tsx](app/components/providers/Providers.tsx). Consume them from any client component:
+
+```tsx
+import { useToast } from '@/app/components/Toast';
+// or the re-export: import { useToast } from '@/app/hooks/useToast';
+
+function MyComponent() {
+  const toast = useToast();
+
+  toast.success('Saved successfully');          // success toast
+  toast.error('Something went wrong');          // error toast
+  toast.warning('Heads up');                    // warning toast
+  toast.info('FYI');                            // info toast
+
+  // Optional: override the default 5000ms auto-dismiss
+  toast.success('Saved', 3000);
+
+  // Low-level API (type + duration)
+  toast.addToast('Custom message', 'success', 4000);
+}
+```
+
+## Authentication [AuthContext.tsx](app/lib/context/AuthContext.tsx)
+
+There is no `AuthService` class. Auth state is owned by a React Context provider (`AuthProvider`) mounted in [Providers.tsx](app/components/providers/Providers.tsx), and consumed via the `useAuth()` hook:
+
+```tsx
+import { useAuth } from '@/app/lib/context/AuthContext';
+
+function MyComponent() {
+  const {
+    isAuthenticated, isHydrated,
+    session, currentUser, googleProfile,
+    apiKeys, activeKey, addKey, removeKey, setKeys,
+    loginWithToken, logout,
+  } = useAuth();
+}
+```
+
+## App Context [AppContext.tsx](app/lib/context/AppContext.tsx)
+
+Sidebar state is managed via `AppProvider`, consumed with `useApp()`:
+
+```ts
+import { useApp } from '@/app/lib/context/AppContext';
+
+const { sidebarCollapsed, setSidebarCollapsed, toggleSidebar } = useApp();
+```
+
+## API Client & Error Handling
+
+The BFF layer uses [apiClient.ts](app/lib/apiClient.ts), which forwards requests from Next.js route handlers to the backend at `BACKEND_URL` (defaults to `http://localhost:8000`). Key patterns:
+
+- **Server-side (route handlers)**: Use `apiClient(request, endpoint, options)` — it relays `X-API-KEY` and `Cookie` headers automatically and returns `{ status, data, headers }`.
+- **Client-side**: Use `clientFetch(endpoint, options)` — handles token refresh on 401, dispatches `AUTH_EXPIRED_EVENT` when refresh fails, and throws with a message extracted from `error`, `message`, or `detail` fields in the response body.
+- **Error extraction**: `extractErrorMessage(body, fallback)` reads `body.error || body.message || body.detail` — follow this pattern when adding new API routes.
+- **Auth expiry**: On 401 with failed refresh, a `CustomEvent(AUTH_EXPIRED_EVENT)` is dispatched on `window`, which `AuthContext` listens to for automatic logout.
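Review note: the error-extraction rule documented at the end of CLAUDE.md (`error`, then `message`, then `detail`, then a fallback) can be sketched as below. This is an illustrative sketch only, not the actual helper in `app/lib/apiClient.ts`, whose implementation is not shown in this diff. It assumes the response body has already been parsed as JSON, and it is slightly stricter than a plain `||` chain because it skips values that are not non-empty strings.

```typescript
// Hypothetical sketch of the documented precedence: error, message, detail.
// The real extractErrorMessage in app/lib/apiClient.ts may differ.
function extractErrorMessage(body: unknown, fallback: string): string {
  // Non-object bodies (null, strings, numbers) carry no known error field.
  if (body === null || typeof body !== "object") return fallback;

  const record = body as Record<string, unknown>;
  for (const key of ["error", "message", "detail"] as const) {
    const value = record[key];
    // Assumption: only non-empty strings count as usable messages.
    if (typeof value === "string" && value.trim() !== "") return value;
  }
  return fallback;
}

// Usage: a body that only sets `detail` yields that string.
console.log(extractErrorMessage({ detail: "Not found" }, "Request failed")); // prints "Not found"
```

A new route handler following this pattern would return its error text under one of those three keys so that client code can surface it without special-casing.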
diff --git a/app/(main)/configurations/page.tsx b/app/(main)/configurations/page.tsx
index 2a68c20..a36ab48 100644
--- a/app/(main)/configurations/page.tsx
+++ b/app/(main)/configurations/page.tsx
@@ -13,7 +13,7 @@ import { colors } from "@/app/lib/colors";
 import { usePaginatedList, useInfiniteScroll } from "@/app/hooks";
 import ConfigCard from "@/app/components/ConfigCard";
 import Loader, { LoaderBox } from "@/app/components/Loader";
-import { EvalJob } from "@/app/components/types";
+import { EvalJob } from "@/app/lib/types/evaluation";
 import {
   ConfigPublic,
   ConfigVersionItems,
diff --git a/app/(main)/evaluations/[id]/page.tsx b/app/(main)/evaluations/[id]/page.tsx
index d2e583f..517c927 100644
--- a/app/(main)/evaluations/[id]/page.tsx
+++ b/app/(main)/evaluations/[id]/page.tsx
@@ -10,19 +10,21 @@ import { useRouter, useParams } from "next/navigation";
 import { apiFetch } from "@/app/lib/apiClient";
 import { useAuth } from "@/app/lib/context/AuthContext";
 import { useApp } from "@/app/lib/context/AppContext";
-import {
+import type {
   EvalJob,
   AssistantConfig,
+  GroupedTraceItem,
+} from "@/app/lib/types/evaluation";
+import {
   hasSummaryScores,
   isNewScoreObjectV2,
   getScoreObject,
   normalizeToIndividualScores,
-  GroupedTraceItem,
   isGroupedFormat,
-} from "@/app/components/types";
+} from "@/app/lib/utils/evaluation";
 import ConfigModal from "@/app/components/ConfigModal";
 import Sidebar from "@/app/components/Sidebar";
-import DetailedResultsTable from "@/app/components/DetailedResultsTable";
+import DetailedResultsTable from "@/app/components/evaluations/DetailedResultsTable";
 import { colors } from "@/app/lib/colors";
 import { useToast } from "@/app/components/Toast";
 import Loader from "@/app/components/Loader";
@@ -126,7 +128,6 @@ export default function EvaluationReport() {
     if (isAuthenticated && jobId) fetchJobDetails();
   }, [isAuthenticated, jobId, fetchJobDetails]);
 
-  // Export grouped format CSV
   const exportGroupedCSV = (traces: GroupedTraceItem[]) => {
     if (!job) return;
     try {
@@ -391,9 +392,9 @@ export default function EvaluationReport() {
               >
                 <ChevronLeftIcon />
               </button>
-              <div className="min-w-0 flex items-center gap-3">
+              <div className="min-w-0 flex-1 flex items-center gap-3 overflow-hidden">
                 <h1
-                  className="text-base font-semibold truncate"
+                  className="text-base font-semibold truncate min-w-0"
                   style={{
                     color: colors.text.primary,
                     letterSpacing: "-0.01em",
@@ -411,99 +412,33 @@ export default function EvaluationReport() {
               </div>
             </div>
 
-            {/* Actions */}
-            <div className="flex items-center gap-3 flex-shrink-0">
+            <div className="flex items-center gap-3 flex-shrink-0 relative z-10">
               <div
                 className="inline-flex rounded-lg p-0.5"
                 style={{ backgroundColor: colors.bg.secondary }}
               >
                 <button
+                  type="button"
                   onClick={() => setExportFormat("row")}
-                  className="inline-flex items-center gap-1.5 px-3 py-1.5 rounded-md text-xs font-medium transition-all cursor-pointer"
-                  style={{
-                    backgroundColor:
-                      exportFormat === "row"
-                        ? colors.bg.primary
-                        : "transparent",
-                    color:
-                      exportFormat === "row"
-                        ? colors.text.primary
-                        : colors.text.primary,
-                    boxShadow:
-                      exportFormat === "row"
-                        ? "0 1px 2px rgba(0,0,0,0.08)"
-                        : "none",
-                    border:
-                      exportFormat === "row"
-                        ? `1px solid ${colors.border}`
-                        : "1px solid transparent",
-                  }}
-                  onMouseEnter={(e) => {
-                    if (exportFormat !== "row") {
-                      e.currentTarget.style.backgroundColor =
-                        "rgba(0,0,0,0.04)";
-                      e.currentTarget.style.boxShadow =
-                        "0 0 0 1px rgba(0,0,0,0.06)";
-                    }
-                  }}
-                  onMouseLeave={(e) => {
-                    if (exportFormat !== "row") {
-                      e.currentTarget.style.backgroundColor = "transparent";
-                      e.currentTarget.style.boxShadow = "none";
-                    }
-                  }}
+                  data-selected={exportFormat === "row"}
+                  className="inline-flex items-center gap-1.5 px-3 py-1.5 rounded-md text-xs font-medium transition-all cursor-pointer border border-transparent text-text-primary hover:bg-black/4 hover:shadow-[0_0_0_1px_rgba(0,0,0,0.06)] data-[selected=true]:bg-bg-primary data-[selected=true]:border-border data-[selected=true]:shadow-[0_1px_2px_rgba(0,0,0,0.08)] data-[selected=true]:hover:bg-bg-primary data-[selected=true]:hover:shadow-[0_1px_2px_rgba(0,0,0,0.08)]"
                 >
-                  <MenuIcon className="w-3.5 h-3.5" />
+                  <MenuIcon className="w-3.5 h-3.5 pointer-events-none" />
                   Individual Rows
                 </button>
                 <button
+                  type="button"
                   onClick={() => setExportFormat("grouped")}
-                  className="inline-flex items-center gap-1.5 px-3 py-1.5 rounded-md text-xs font-medium transition-all cursor-pointer"
-                  style={{
-                    backgroundColor:
-                      exportFormat === "grouped"
-                        ? colors.bg.primary
-                        : "transparent",
-                    color:
-                      exportFormat === "grouped"
-                        ? colors.text.primary
-                        : colors.text.primary,
-                    boxShadow:
-                      exportFormat === "grouped"
-                        ? "0 1px 2px rgba(0,0,0,0.08)"
-                        : "none",
-                    border:
-                      exportFormat === "grouped"
-                        ? `1px solid ${colors.border}`
-                        : "1px solid transparent",
-                  }}
-                  onMouseEnter={(e) => {
-                    if (exportFormat !== "grouped") {
-                      e.currentTarget.style.backgroundColor =
-                        "rgba(0,0,0,0.04)";
-                      e.currentTarget.style.boxShadow =
-                        "0 0 0 1px rgba(0,0,0,0.06)";
-                    }
-                  }}
-                  onMouseLeave={(e) => {
-                    if (exportFormat !== "grouped") {
-                      e.currentTarget.style.backgroundColor = "transparent";
-                      e.currentTarget.style.boxShadow = "none";
-                    }
-                  }}
+                  data-selected={exportFormat === "grouped"}
+                  className="inline-flex items-center gap-1.5 px-3 py-1.5 rounded-md text-xs font-medium transition-all cursor-pointer border border-transparent text-text-primary hover:bg-black/4 hover:shadow-[0_0_0_1px_rgba(0,0,0,0.06)] data-[selected=true]:bg-bg-primary data-[selected=true]:border-border data-[selected=true]:shadow-[0_1px_2px_rgba(0,0,0,0.08)] data-[selected=true]:hover:bg-bg-primary data-[selected=true]:hover:shadow-[0_1px_2px_rgba(0,0,0,0.08)]"
                 >
-                  <GroupIcon />
+                  <GroupIcon className="pointer-events-none" />
                   Group by Questions
                 </button>
               </div>
               <button
                 onClick={() => setIsConfigModalOpen(true)}
-                className="px-3 py-1.5 rounded-md text-xs font-medium border"
-                style={{
-                  backgroundColor: "transparent",
-                  borderColor: colors.border,
-                  color: colors.text.primary,
-                }}
+                className="px-3 py-1.5 rounded-md text-xs font-medium border bg-transparent border-border text-text-primary"
               >
                 View Config
               </button>
diff --git a/app/(main)/evaluations/page.tsx b/app/(main)/evaluations/page.tsx
index d7900f3..13ca97c 100644
--- a/app/(main)/evaluations/page.tsx
+++ b/app/(main)/evaluations/page.tsx
@@ -49,12 +49,8 @@ function SimplifiedEvalContent() {
   const [duplicationFactor, setDuplicationFactor] = useState("1");
   const [uploadedFile, setUploadedFile] = useState<File | null>(null);
   const [isUploading, setIsUploading] = useState(false);
-
-  // Stored datasets
   const [storedDatasets, setStoredDatasets] = useState<Dataset[]>([]);
   const [isDatasetsLoading, setIsDatasetsLoading] = useState(false);
-
-  // Evaluation config state
   const [selectedDatasetId, setSelectedDatasetId] = useState<string>(() => {
     return searchParams.get("dataset") || "";
   });
@@ -235,6 +231,10 @@ function SimplifiedEvalContent() {
       });
 
       setIsEvaluating(false);
+      setExperimentName("");
+      setSelectedDatasetId("");
+      setSelectedConfigId("");
+      setSelectedConfigVersion(0);
       toast.success(`Evaluation created!`);
       return true;
     } catch (error: unknown) {
diff --git a/app/components/CodeBlock.tsx b/app/components/CodeBlock.tsx
new file mode 100644
index 0000000..e76d9a3
--- /dev/null
+++ b/app/components/CodeBlock.tsx
@@ -0,0 +1,13 @@
+import type { ReactNode } from "react";
+
+interface CodeBlockProps {
+  children: ReactNode;
+}
+
+export default function CodeBlock({ children }: CodeBlockProps) {
+  return (
+    <div className="text-sm font-mono px-3 py-2.5 rounded-md whitespace-pre-wrap max-h-60 overflow-y-auto leading-[1.6] bg-bg-secondary text-text-primary">
+      {children}
+    </div>
+  );
+}
diff --git a/app/components/ConfigModal.tsx b/app/components/ConfigModal.tsx
index 817b24f..0f3f412 100644
--- a/app/components/ConfigModal.tsx
+++ b/app/components/ConfigModal.tsx
@@ -7,7 +7,11 @@
 
 import React, { useState, useEffect } from "react";
 import { colors } from "@/app/lib/colors";
-import { EvalJob, AssistantConfig } from "./types";
+import CopyableCodeBlock from "@/app/components/CopyableCodeBlock";
+import CodeBlock from "@/app/components/CodeBlock";
+import Tag from "@/app/components/Tag";
+import { CloseIcon } from "@/app/components/icons";
+import { EvalJob, AssistantConfig } from "@/app/lib/types/evaluation";
 import { useAuth } from "@/app/lib/context/AuthContext";
 import { apiFetch } from "@/app/lib/apiClient";
 import {
@@ -35,6 +39,24 @@ interface ConfigVersionInfo {
   knowledge_base_ids?: string[];
 }
 
+const ConfigField = ({
+  label,
+  children,
+}: {
+  label: string;
+  children: React.ReactNode;
+}) => (
+  <div>
+    <div
+      className="text-xs font-medium mb-1.5"
+      style={{ color: colors.text.secondary }}
+    >
+      {label}
+    </div>
+    {children}
+  </div>
+);
+
 export default function ConfigModal({
   isOpen,
   onClose,
@@ -80,15 +102,14 @@ export default function ConfigModal({
           const params: CompletionParams =
             blob?.completion?.params || ({} as CompletionParams);
 
-          // Extract knowledge base IDs from multiple sources
           const knowledgeBaseIds: string[] = [];
 
-          // 1. Check direct params.knowledge_base_ids
+          // Check direct params.knowledge_base_ids
           if (Array.isArray(params.knowledge_base_ids)) {
             knowledgeBaseIds.push(...params.knowledge_base_ids);
           }
 
-          // 2. Check tools array for knowledge_base_ids
+          // Check tools array for knowledge_base_ids
           if (params.tools) {
             const toolKbIds = params.tools
               .filter(
@@ -100,7 +121,6 @@ export default function ConfigModal({
             knowledgeBaseIds.push(...toolKbIds);
           }
 
-          // Remove duplicates
           const uniqueKbIds = [...new Set(knowledgeBaseIds)];
 
           setConfigVersionInfo({
@@ -128,51 +148,9 @@ export default function ConfigModal({
 
   if (!isOpen) return null;
 
-  const ConfigField = ({
-    label,
-    children,
-  }: {
-    label: string;
-    children: React.ReactNode;
-  }) => (
-    <div>
-      <div
-        className="text-xs font-medium mb-1.5"
-        style={{ color: colors.text.secondary }}
-      >
-        {label}
-      </div>
-      {children}
-    </div>
-  );
-
-  const CodeBlock = ({ children }: { children: React.ReactNode }) => (
-    <div
-      className="text-sm font-mono px-3 py-2.5 rounded-md whitespace-pre-wrap max-h-[240px] overflow-y-auto leading-[1.6]"
-      style={{
-        backgroundColor: colors.bg.secondary,
-        color: colors.text.primary,
-      }}
-    >
-      {children}
-    </div>
-  );
-
-  const Tag = ({ children }: { children: React.ReactNode }) => (
-    <span
-      className="inline-flex px-2.5 py-1 rounded-md text-xs font-medium"
-      style={{
-        backgroundColor: colors.bg.secondary,
-        color: colors.text.primary,
-      }}
-    >
-      {children}
-    </span>
-  );
-
   return (
     <div
-      className="fixed inset-0 z-[60] flex items-center justify-center bg-black/50"
+      className="fixed inset-0 z-60 flex items-center justify-center bg-black/50"
       onClick={onClose}
     >
       <div
@@ -180,7 +158,6 @@ export default function ConfigModal({
         style={{ backgroundColor: colors.bg.primary, maxHeight: "80vh" }}
         onClick={(e) => e.stopPropagation()}
       >
-        {/* Header */}
         <div
           className="flex items-center justify-between px-6 py-4 border-b flex-shrink-0"
           style={{ borderColor: colors.border }}
@@ -214,23 +191,10 @@ export default function ConfigModal({
             className="p-1.5 rounded"
             style={{ color: colors.text.secondary }}
           >
-            <svg
-              className="w-5 h-5"
-              fill="none"
-              viewBox="0 0 24 24"
-              stroke="currentColor"
-            >
-              <path
-                strokeLinecap="round"
-                strokeLinejoin="round"
-                strokeWidth={2}
-                d="M6 18L18 6M6 6l12 12"
-              />
-            </svg>
+            <CloseIcon className="w-5 h-5" />
           </button>
         </div>
 
-        {/* Content */}
         <div className="flex-1 overflow-y-auto px-6 py-5 space-y-5">
           {isLoadingConfig ? (
             <div className="py-8 text-center">
@@ -295,9 +259,11 @@ export default function ConfigModal({
               {configVersionInfo?.knowledge_base_ids &&
                 configVersionInfo.knowledge_base_ids.length > 0 && (
                   <ConfigField label="Knowledge Base IDs">
-                    <CodeBlock>
+                    <CopyableCodeBlock
+                      copyText={configVersionInfo.knowledge_base_ids.join("\n")}
+                    >
                       {configVersionInfo.knowledge_base_ids.join("\n")}
-                    </CodeBlock>
+                    </CopyableCodeBlock>
                   </ConfigField>
                 )}
 
@@ -305,11 +271,17 @@ export default function ConfigModal({
                 assistantConfig?.instructions ||
                 job.config?.instructions) && (
                 <ConfigField label="Instructions">
-                  <CodeBlock>
+                  <CopyableCodeBlock
+                    copyText={
+                      (configVersionInfo?.instructions ||
+                        assistantConfig?.instructions ||
+                        job.config?.instructions) as string
+                    }
+                  >
                     {configVersionInfo?.instructions ||
                       assistantConfig?.instructions ||
                       job.config?.instructions}
-                  </CodeBlock>
+                  </CopyableCodeBlock>
                 </ConfigField>
               )}
 
diff --git a/app/components/CopyableCodeBlock.tsx b/app/components/CopyableCodeBlock.tsx
new file mode 100644
index 0000000..7578033
--- /dev/null
+++ b/app/components/CopyableCodeBlock.tsx
@@ -0,0 +1,49 @@
+"use client";
+
+import React, { useState, useCallback } from "react";
+import { useToast } from "@/app/hooks/useToast";
+import { CheckIcon, CopyIcon } from "@/app/components/icons";
+
+interface CopyableCodeBlockProps {
+  children: React.ReactNode;
+  copyText: string;
+}
+
+export default function CopyableCodeBlock({
+  children,
+  copyText,
+}: CopyableCodeBlockProps) {
+  const toast = useToast();
+  const [copied, setCopied] = useState(false);
+
+  const handleCopy = useCallback(async () => {
+    try {
+      await navigator.clipboard.writeText(copyText);
+      setCopied(true);
+      toast.success("Copied to clipboard");
+      setTimeout(() => setCopied(false), 2000);
+    } catch {
+      toast.error("Failed to copy");
+    }
+  }, [copyText, toast]);
+
+  return (
+    <div className="relative">
+      <div className="text-sm font-mono pl-3 pr-10 py-2.5 rounded-md whitespace-pre-wrap max-h-60 overflow-y-auto leading-[1.6] bg-bg-secondary text-text-primary">
+        {children}
+      </div>
+      <button
+        type="button"
+        onClick={handleCopy}
+        className="absolute top-2 right-2 p-1.5 rounded-md cursor-pointer hover:bg-neutral-200"
+        title="Copy to clipboard"
+      >
+        {copied ? (
+          <CheckIcon className="w-4 h-4 text-status-success" />
+        ) : (
+          <CopyIcon className="w-4 h-4 text-text-secondary" />
+        )}
+      </button>
+    </div>
+  );
+}
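Review note: `handleCopy` above starts a new 2000 ms timer on every click without cancelling the previous one, so after two quick clicks the first timer can reset the check icon earlier than intended. A framework-free sketch of one way to handle this follows. The name `createCopiedFlag` and its shape are hypothetical and not part of this PR; in the component the same idea would typically live in a ref plus an effect cleanup.

```typescript
// Hypothetical helper: tracks a "copied" flag that resets after a delay,
// cancelling any earlier pending reset so rapid clicks restart the countdown.
function createCopiedFlag(
  onChange: (copied: boolean) => void,
  resetMs: number = 2000,
) {
  let timer: ReturnType<typeof setTimeout> | undefined;

  return {
    // Call after a successful clipboard write.
    markCopied(): void {
      if (timer !== undefined) clearTimeout(timer);
      onChange(true);
      timer = setTimeout(() => {
        timer = undefined;
        onChange(false);
      }, resetMs);
    },
    // Call on teardown (for example, component unmount) to drop the pending reset.
    dispose(): void {
      if (timer !== undefined) clearTimeout(timer);
      timer = undefined;
    },
  };
}
```

Wired into the component, `onChange` would be `setCopied`, `markCopied()` would replace the `setCopied(true)` / `setTimeout` pair, and `dispose()` would run from an effect cleanup.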
diff --git a/app/components/DetailedResultsTable.tsx b/app/components/DetailedResultsTable.tsx
deleted file mode 100644
index c4b9de0..0000000
--- a/app/components/DetailedResultsTable.tsx
+++ /dev/null
@@ -1,639 +0,0 @@
-/**
- * DetailedResultsTable.tsx - Table view for evaluation results
- *
- * Displays Q&A pairs with scores in a tabular format
- * Supports both row format (individual traces) and grouped format (multiple answers per question)
- */
-
-import React, { useState, useEffect } from "react";
-import {
-  TraceScore,
-  getScoreObject,
-  normalizeToIndividualScores,
-  hasSummaryScores,
-  isNewScoreObjectV2,
-  isGroupedFormat,
-  GroupedTraceItem,
-  EvalJob,
-} from "@/app/components/types";
-
-// Helper function to format score value with color
-const formatScoreValue = (score: TraceScore | undefined) => {
-  if (!score) return { value: "N/A", color: "#737373", bg: "transparent" };
-
-  if (score.data_type === "CATEGORICAL") {
-    const catValue = String(score.value);
-    let color = "#171717";
-    let bg = "#fafafa";
-
-    if (catValue === "CORRECT") {
-      color = "#15803d";
-      bg = "#dcfce7";
-    } else if (catValue === "PARTIAL") {
-      color = "#92400e";
-      bg = "#fef3c7";
-    } else if (catValue === "INCORRECT") {
-      color = "#dc2626";
-      bg = "#fee2e2";
-    }
-
-    return { value: catValue, color, bg };
-  }
-
-  // NUMERIC
-  const numValue = Number(score.value);
-  const formattedValue = numValue.toFixed(2);
-  let color = "#171717";
-  let bg = "transparent";
-
-  // Color based on value
-  if (numValue >= 0.7) {
-    color = "#15803d";
-    bg = "#dcfce7";
-  } else if (numValue >= 0.5) {
-    color = "#92400e";
-    bg = "#fef3c7";
-  } else {
-    color = "#dc2626";
-    bg = "#fee2e2";
-  }
-
-  return { value: formattedValue, color, bg };
-};
-
-interface DetailedResultsTableProps {
-  job: EvalJob;
-}
-
-export default function DetailedResultsTable({
-  job,
-}: DetailedResultsTableProps) {
-  const [openCommentId, setOpenCommentId] = useState<string | null>(null);
-  const [commentPos, setCommentPos] = useState({ top: 0, left: 0 });
-
-  useEffect(() => {
-    if (!openCommentId) return;
-    const handleScroll = () => setOpenCommentId(null);
-    window.addEventListener("scroll", handleScroll, true);
-    return () => {
-      window.removeEventListener("scroll", handleScroll, true);
-    };
-  }, [openCommentId]);
-
-  const scoreObject = getScoreObject(job);
-
-  // 1. First check: Does it have summary_scores at all?
-  if (!scoreObject || !hasSummaryScores(scoreObject)) {
-    return (
-      <div
-        className="border rounded-lg p-6 text-center"
-        style={{ backgroundColor: "#fef3c7", borderColor: "#fbbf24" }}
-      >
-        <p className="text-sm" style={{ color: "#92400e" }}>
-          No detailed results available or using legacy format
-        </p>
-      </div>
-    );
-  }
-
-  // 2. Second check: Does it have traces? (NewScoreObjectV2)
-  if (isNewScoreObjectV2(scoreObject)) {
-    // Check if grouped format
-    if (isGroupedFormat(scoreObject.traces)) {
-      return (
-        <GroupedResultsTable
-          traces={scoreObject.traces as GroupedTraceItem[]}
-        />
-      );
-    }
-    // Otherwise show row format
-  }
-
-  // 3. Try to normalize to IndividualScore format
-  // This handles NewScoreObjectV2 (with traces)
-  const individual_scores = normalizeToIndividualScores(scoreObject);
-
-  // 4. If no individual scores available (e.g., BasicScoreObject with only summary_scores)
-  if (!individual_scores || individual_scores.length === 0) {
-    return (
-      <div
-        className="border rounded-lg p-6 text-center"
-        style={{ backgroundColor: "#fef3c7", borderColor: "#fbbf24" }}
-      >
-        <p className="text-sm" style={{ color: "#92400e" }}>
-          No individual scores available. Only summary metrics are available for
-          this evaluation.
-        </p>
-      </div>
-    );
-  }
-
-  // Get all unique score names from the first item
-  const scoreNames =
-    individual_scores[0]?.trace_scores?.map((s) => s.name) || [];
-
-  // Helper function to get score value by name
-  const getScoreByName = (
-    scores: TraceScore[],
-    name: string,
-  ): TraceScore | undefined => {
-    if (!scores || !Array.isArray(scores)) return undefined;
-    return scores.find((s) => s?.name === name);
-  };
-
-  return (
-    <div
-      className="border rounded-lg overflow-hidden"
-      style={{ backgroundColor: "#ffffff", borderColor: "#e5e5e5" }}
-    >
-      {/* Table Container */}
-      <div className="overflow-x-auto">
-        <table className="w-full border-collapse min-w-[800px] table-fixed">
-          {/* Table Header */}
-          <thead>
-            <tr
-              style={{
-                backgroundColor: "#fafafa",
-                borderBottom: "1px solid #e5e5e5",
-              }}
-            >
-              <th
-                className="px-4 py-3 text-left text-xs font-semibold uppercase"
-                style={{ color: "#171717", width: "5%" }}
-              ></th>
-              <th
-                className="px-4 py-3 text-left text-xs font-semibold uppercase"
-                style={{ color: "#171717", width: "25%" }}
-              >
-                Question
-              </th>
-              <th
-                className="px-4 py-3 text-left text-xs font-semibold uppercase"
-                style={{ color: "#171717", width: "25%" }}
-              >
-                Ground Truth
-              </th>
-              <th
-                className="px-4 py-3 text-left text-xs font-semibold uppercase"
-                style={{ color: "#171717", width: "25%" }}
-              >
-                Answer
-              </th>
-              {scoreNames.map((scoreName) => (
-                <th
-                  key={scoreName}
-                  className="px-4 py-3 text-center text-xs font-semibold uppercase"
-                  style={{
-                    color: "#171717",
-                    width: `${20 / scoreNames.length}%`,
-                  }}
-                >
-                  {scoreName}
-                </th>
-              ))}
-            </tr>
-          </thead>
-
-          {/* Table Body */}
-          <tbody>
-            {individual_scores.map((item, index) => {
-              const question = item.input?.question || "N/A";
-              const answer = item.output?.answer || "N/A";
-              const groundTruth = item.metadata?.ground_truth || "N/A";
-
-              return (
-                <tr
-                  key={item.trace_id || index}
-                  className="border-b"
-                  style={{
-                    borderColor: "#e5e5e5",
-                    transition: "background-color 0.15s ease",
-                  }}
-                  onMouseEnter={(e) => {
-                    const row = e.currentTarget;
-                    row.style.backgroundColor = "#fafafa";
-                  }}
-                  onMouseLeave={(e) => {
-                    const row = e.currentTarget;
-                    row.style.backgroundColor = "#ffffff";
-                  }}
-                >
-                  <td className="px-4 py-3 text-sm font-medium align-top text-text-secondary">
-                    {index + 1}
-                  </td>
-
-                  {/* Question */}
-                  <td
-                    className="px-4 py-3 align-top"
-                    style={{ backgroundColor: "#fafafa" }}
-                  >
-                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
-                      {question}
-                    </div>
-                  </td>
-
-                  {/* Ground Truth */}
-                  <td
-                    className="px-4 py-3 align-top"
-                    style={{ backgroundColor: "#fafafa" }}
-                  >
-                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
-                      {groundTruth}
-                    </div>
-                  </td>
-
-                  {/* Answer */}
-                  <td className="px-4 py-3 align-top">
-                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
-                      {answer}
-                    </div>
-                  </td>
-
-                  {/* Score Columns */}
-                  {scoreNames.map((scoreName) => {
-                    const score = getScoreByName(item.trace_scores, scoreName);
-                    const { value, color, bg } = formatScoreValue(score);
-
-                    return (
-                      <td
-                        key={scoreName}
-                        className="px-4 py-3 text-center align-top"
-                      >
-                        <div className="flex items-center justify-center gap-2">
-                          <div
-                            className="inline-block px-2 py-1 rounded text-xs font-medium"
-                            style={{
-                              color,
-                              backgroundColor: bg,
-                              borderWidth: bg === "transparent" ? "1px" : "0",
-                              borderColor: "#e5e5e5",
-                            }}
-                          >
-                            {value}
-                          </div>
-                          {score?.comment && (
-                            <>
-                              <div
-                                className="inline-flex items-center justify-center w-4 h-4 rounded-full text-xs font-normal"
-                                style={{
-                                  backgroundColor:
-                                    openCommentId === `${index}-${scoreName}`
-                                      ? "#171717"
-                                      : "#fafafa",
-                                  color:
-                                    openCommentId === `${index}-${scoreName}`
-                                      ? "#ffffff"
-                                      : "#737373",
-                                }}
-                                onMouseEnter={(e) => {
-                                  const rect =
-                                    e.currentTarget.getBoundingClientRect();
-                                  const tooltipWidth = 300;
-                                  const centerX = rect.left + rect.width / 2;
-                                  const clampedLeft = Math.min(
-                                    Math.max(centerX - tooltipWidth / 2, 8),
-                                    window.innerWidth - tooltipWidth - 8,
-                                  );
-                                  setCommentPos({
-                                    top: rect.top - 8,
-                                    left: clampedLeft,
-                                  });
-                                  setOpenCommentId(`${index}-${scoreName}`);
-                                }}
-                                onMouseLeave={() => setOpenCommentId(null)}
-                              >
-                                i
-                              </div>
-                              {openCommentId === `${index}-${scoreName}` && (
-                                <div
-                                  className="fixed z-50 px-3 py-2 rounded-md text-xs whitespace-normal pointer-events-none"
-                                  style={{
-                                    backgroundColor: "#171717",
-                                    color: "#ffffff",
-                                    width: "300px",
-                                    boxShadow: "0 4px 6px rgba(0, 0, 0, 0.1)",
-                                    top: commentPos.top,
-                                    left: commentPos.left,
-                                    transform: "translateY(-100%)",
-                                  }}
-                                >
-                                  {score.comment}
-                                </div>
-                              )}
-                            </>
-                          )}
-                        </div>
-                      </td>
-                    );
-                  })}
-                </tr>
-              );
-            })}
-          </tbody>
-        </table>
-      </div>
-    </div>
-  );
-}
-
-function GroupedResultsTable({ traces }: { traces: GroupedTraceItem[] }) {
-  const [openCommentId, setOpenCommentId] = useState<string | null>(null);
-  const [commentPos, setCommentPos] = useState({ top: 0, left: 0 });
-
-  useEffect(() => {
-    if (!openCommentId) return;
-    const handleScroll = () => setOpenCommentId(null);
-    window.addEventListener("scroll", handleScroll, true);
-    return () => {
-      window.removeEventListener("scroll", handleScroll, true);
-    };
-  }, [openCommentId]);
-
-  if (!traces || traces.length === 0) {
-    return (
-      <div
-        className="border rounded-lg p-6 text-center"
-        style={{ backgroundColor: "#fef3c7", borderColor: "#fbbf24" }}
-      >
-        <p className="text-sm" style={{ color: "#92400e" }}>
-          No grouped results available
-        </p>
-      </div>
-    );
-  }
-
-  // Get max answers count
-  const maxAnswers = Math.max(...traces.map((t) => t.llm_answers.length));
-
-  // Fixed column widths (in pixels) for predictable layout
-  const COLUMN_WIDTHS = {
-    qId: 60,
-    question: 200,
-    groundTruth: 200,
-    answer: 250,
-  };
-
-  // Calculate minimum table width based on number of answers
-  // This ensures horizontal scroll activates at the right point
-  const fixedColumnsWidth =
-    COLUMN_WIDTHS.qId + COLUMN_WIDTHS.question + COLUMN_WIDTHS.groundTruth;
-  const tableMinWidth = fixedColumnsWidth + maxAnswers * COLUMN_WIDTHS.answer;
-
-  return (
-    <div
-      className="border rounded-lg overflow-hidden"
-      style={{ backgroundColor: "#ffffff", borderColor: "#e5e5e5" }}
-    >
-      {/* Table Container - overflow-x-auto enables horizontal scroll when table exceeds viewport */}
-      <div className="overflow-x-auto">
-        <table
-          className="w-full border-collapse table-fixed"
-          style={{ minWidth: `${tableMinWidth}px` }}
-        >
-          {/* Table Header - matching row format styling */}
-          <thead>
-            <tr
-              style={{
-                backgroundColor: "#fafafa",
-                borderBottom: "1px solid #e5e5e5",
-              }}
-            >
-              <th
-                className="px-4 py-3 text-left text-xs font-semibold uppercase"
-                style={{
-                  color: "#171717",
-                  width: `${COLUMN_WIDTHS.qId}px`,
-                  minWidth: `${COLUMN_WIDTHS.qId}px`,
-                }}
-              >
-                Q.ID
-              </th>
-              <th
-                className="px-4 py-3 text-left text-xs font-semibold uppercase"
-                style={{
-                  color: "#171717",
-                  width: `${COLUMN_WIDTHS.question}px`,
-                  minWidth: `${COLUMN_WIDTHS.question}px`,
-                }}
-              >
-                Question
-              </th>
-              <th
-                className="px-4 py-3 text-left text-xs font-semibold uppercase"
-                style={{
-                  color: "#171717",
-                  width: `${COLUMN_WIDTHS.groundTruth}px`,
-                  minWidth: `${COLUMN_WIDTHS.groundTruth}px`,
-                }}
-              >
-                Ground Truth
-              </th>
-              {Array.from({ length: maxAnswers }, (_, i) => (
-                <th
-                  key={`answer-${i}`}
-                  className="px-4 py-3 text-left text-xs font-semibold uppercase"
-                  style={{
-                    color: "#171717",
-                    width: `${COLUMN_WIDTHS.answer}px`,
-                    minWidth: `${COLUMN_WIDTHS.answer}px`,
-                  }}
-                >
-                  Answer {i + 1}
-                </th>
-              ))}
-            </tr>
-          </thead>
-
-          {/* Table Body */}
-          <tbody>
-            {traces.map((group, index) => (
-              <React.Fragment key={group.question_id || index}>
-                {/* Text row */}
-                <tr
-                  key={`${group.question_id || index}-text`}
-                  style={{ backgroundColor: "#ffffff" }}
-                >
-                  {/* Question ID */}
-                  <td
-                    className="px-4 pt-3 pb-1 text-sm font-medium align-top"
-                    style={{ color: "#737373" }}
-                  >
-                    {group.question_id}
-                  </td>
-
-                  {/* Question */}
-                  <td className="px-4 pt-3 pb-1 align-top bg-[#fafafa]">
-                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
-                      {group.question}
-                    </div>
-                  </td>
-
-                  {/* Ground Truth */}
-                  <td className="px-4 pt-3 pb-1 align-top bg-bg-secondary">
-                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
-                      {group.ground_truth_answer}
-                    </div>
-                  </td>
-
-                  {/* Answer text only */}
-                  {Array.from({ length: maxAnswers }, (_, answerIndex) => {
-                    const answer = group.llm_answers[answerIndex];
-                    return (
-                      <td
-                        key={answerIndex}
-                        className="px-4 pt-3 pb-1 align-top"
-                      >
-                        {answer ? (
-                          <div className="text-sm overflow-auto text-[#171717] leading-6 max-h-[150px] wrap-break-word">
-                            {answer}
-                          </div>
-                        ) : (
-                          <span className="text-sm text-[#171717]">-</span>
-                        )}
-                      </td>
-                    );
-                  })}
-                </tr>
-                {/* Scores row */}
-                <tr
-                  key={`${group.question_id || index}-scores`}
-                  className="border-b border-[#e5e5e5]"
-                >
-                  {/* Empty cells for Q.ID, Question, Ground Truth */}
-                  <td className="px-4 pt-1 pb-3" />
-                  <td
-                    className="px-4 pt-1 pb-3"
-                    style={{ backgroundColor: "#fafafa" }}
-                  />
-                  <td
-                    className="px-4 pt-1 pb-3"
-                    style={{ backgroundColor: "#fafafa" }}
-                  />
-
-                  {/* Score cells */}
-                  {Array.from({ length: maxAnswers }, (_, answerIndex) => {
-                    const answerScores: TraceScore[] =
-                      group.scores?.[answerIndex] || [];
-                    const answer = group.llm_answers[answerIndex];
-
-                    return (
-                      <td
-                        key={answerIndex}
-                        className="px-4 pt-1 pb-3 align-bottom"
-                      >
-                        {answer && answerScores.length > 0 ? (
-                          <div className="space-y-1">
-                            {answerScores.map(
-                              (score: TraceScore, scoreIdx: number) => {
-                                if (!score) return null;
-                                const { value, color, bg } =
-                                  formatScoreValue(score);
-                                return (
-                                  <div
-                                    key={score.name || scoreIdx}
-                                    className="flex items-center justify-between gap-1"
-                                  >
-                                    <span
-                                      className="text-xs truncate min-w-0"
-                                      style={{ color: "#737373" }}
-                                    >
-                                      {score.name}:
-                                    </span>
-                                    <div className="flex items-center gap-1 flex-shrink-0">
-                                      <div
-                                        className="inline-block px-2 py-0.5 rounded text-xs font-medium"
-                                        style={{
-                                          color,
-                                          backgroundColor: bg,
-                                          borderWidth:
-                                            bg === "transparent" ? "1px" : "0",
-                                          borderColor: "#e5e5e5",
-                                        }}
-                                      >
-                                        {value}
-                                      </div>
-                                      {score?.comment &&
-                                        (() => {
-                                          const commentId = `g${index}-a${answerIndex}-s${scoreIdx}`;
-                                          return (
-                                            <>
-                                              <div
-                                                className="inline-flex items-center justify-center w-4 h-4 rounded-full text-xs font-normal"
-                                                style={{
-                                                  backgroundColor:
-                                                    openCommentId === commentId
-                                                      ? "#171717"
-                                                      : "#fafafa",
-                                                  color:
-                                                    openCommentId === commentId
-                                                      ? "#ffffff"
-                                                      : "#737373",
-                                                }}
-                                                onMouseEnter={(e) => {
-                                                  const rect =
-                                                    e.currentTarget.getBoundingClientRect();
-                                                  const tooltipWidth = 300;
-                                                  const centerX =
-                                                    rect.left + rect.width / 2;
-                                                  const clampedLeft = Math.min(
-                                                    Math.max(
-                                                      centerX -
-                                                        tooltipWidth / 2,
-                                                      8,
-                                                    ),
-                                                    window.innerWidth -
-                                                      tooltipWidth -
-                                                      8,
-                                                  );
-                                                  setCommentPos({
-                                                    top: rect.top - 8,
-                                                    left: clampedLeft,
-                                                  });
-                                                  setOpenCommentId(commentId);
-                                                }}
-                                                onMouseLeave={() =>
-                                                  setOpenCommentId(null)
-                                                }
-                                              >
-                                                i
-                                              </div>
-                                              {openCommentId === commentId && (
-                                                <div
-                                                  className="fixed z-50 px-3 py-2 rounded-md text-xs whitespace-normal pointer-events-none"
-                                                  style={{
-                                                    backgroundColor: "#171717",
-                                                    color: "#ffffff",
-                                                    width: "300px",
-                                                    boxShadow:
-                                                      "0 4px 6px rgba(0, 0, 0, 0.1)",
-                                                    top: commentPos.top,
-                                                    left: commentPos.left,
-                                                    transform:
-                                                      "translateY(-100%)",
-                                                  }}
-                                                >
-                                                  {score.comment}
-                                                </div>
-                                              )}
-                                            </>
-                                          );
-                                        })()}
-                                    </div>
-                                  </div>
-                                );
-                              },
-                            )}
-                          </div>
-                        ) : null}
-                      </td>
-                    );
-                  })}
-                </tr>
-              </React.Fragment>
-            ))}
-          </tbody>
-        </table>
-      </div>
-    </div>
-  );
-}
diff --git a/app/components/InfoTooltip.tsx b/app/components/InfoTooltip.tsx
index d070496..902841d 100644
--- a/app/components/InfoTooltip.tsx
+++ b/app/components/InfoTooltip.tsx
@@ -11,15 +11,16 @@ export default function InfoTooltip({ text }: InfoTooltipProps) {
     <span className="relative inline-flex items-center ml-1 align-text-bottom group">
       <button
         type="button"
-        className="w-3.5 h-3.5 rounded-full text-[10px] font-bold flex items-center justify-center leading-none select-none bg-bg-secondary text-text-secondary border border-border"
+        className="w-4 h-4 rounded-full text-[10px] font-bold flex items-center justify-center leading-none select-none bg-neutral-200 text-neutral-600 border border-neutral-300 cursor-pointer hover:bg-neutral-300 hover:text-neutral-700 transition-colors"
       >
         i
       </button>
       <div
         role="tooltip"
-        className="absolute z-50 left-5 top-0 w-64 text-xs rounded-lg p-2.5 shadow-lg bg-bg-primary border border-border text-text-secondary leading-relaxed hidden group-hover:block group-focus-within:block"
+        className="absolute z-50 bottom-full left-1/2 -translate-x-1/2 mb-2 w-64 text-xs rounded-lg p-3 shadow-lg bg-neutral-900 text-neutral-100 leading-relaxed hidden group-hover:block group-focus-within:block"
       >
         {text}
+        <span className="absolute top-full left-1/2 -translate-x-1/2 border-[5px] border-transparent border-t-neutral-900" />
       </div>
     </span>
   );
diff --git a/app/components/StatusBadge.tsx b/app/components/StatusBadge.tsx
index 48df1e5..12b3704 100644
--- a/app/components/StatusBadge.tsx
+++ b/app/components/StatusBadge.tsx
@@ -13,20 +13,14 @@ interface StatusBadgeProps {
 }
 
 export default function StatusBadge({ status, size = "sm" }: StatusBadgeProps) {
-  const colors = getStatusColor(status);
+  const statusColor = getStatusColor(status);
 
   const sizeClasses =
     size === "md" ? "px-3 py-1.5 text-sm" : "px-2 py-1 text-xs";
 
   return (
     <div
-      className={`inline-block ${sizeClasses} rounded font-semibold`}
-      style={{
-        backgroundColor: colors.bg,
-        borderWidth: "1px",
-        borderColor: colors.border,
-        color: colors.text,
-      }}
+      className={`inline-block ${sizeClasses} rounded font-semibold border ${statusColor.bg} ${statusColor.border} ${statusColor.text}`}
     >
       {status.toUpperCase()}
     </div>
diff --git a/app/components/Tag.tsx b/app/components/Tag.tsx
new file mode 100644
index 0000000..6932e29
--- /dev/null
+++ b/app/components/Tag.tsx
@@ -0,0 +1,13 @@
+import type { ReactNode } from "react";
+
+interface TagProps {
+  children: ReactNode;
+}
+
+export default function Tag({ children }: TagProps) {
+  return (
+    <span className="inline-flex px-2.5 py-1 rounded-md text-xs font-medium bg-bg-secondary text-text-primary">
+      {children}
+    </span>
+  );
+}
diff --git a/app/components/Toast.tsx b/app/components/Toast.tsx
index 2951ae8..bf3217b 100644
--- a/app/components/Toast.tsx
+++ b/app/components/Toast.tsx
@@ -88,7 +88,7 @@ function ToastContainer({
   removeToast: (id: string) => void;
 }) {
   return (
-    <div className="fixed top-4 right-4 z-[9999] flex flex-col gap-3 pointer-events-none">
+    <div className="fixed top-4 right-4 z-9999 flex flex-col gap-3 pointer-events-none">
       {toasts.map((toast) => (
         <ToastItem
           key={toast.id}
@@ -169,8 +169,8 @@ function ToastItem({ toast, onClose }: { toast: Toast; onClose: () => void }) {
         </div>
 
         <button
-          onClick={() => setExiting(true)}
-          className="shrink-0 self-start p-2 opacity-50 hover:opacity-100 transition-opacity"
+          onClick={onClose}
+          className="shrink-0 self-start p-2 opacity-50 hover:opacity-100 transition-opacity cursor-pointer"
         >
           <CloseIcon className="w-3.5 h-3.5 text-[#757575]" />
         </button>
diff --git a/app/components/evaluations/DetailedResultsTable.tsx b/app/components/evaluations/DetailedResultsTable.tsx
new file mode 100644
index 0000000..9d50ebd
--- /dev/null
+++ b/app/components/evaluations/DetailedResultsTable.tsx
@@ -0,0 +1,235 @@
+/**
+ * DetailedResultsTable.tsx - Table view for evaluation results
+ *
+ * Displays Q&A pairs with scores in a tabular format.
+ * Supports both row format (individual traces) and grouped format (multiple answers per question).
+ */
+
+import { useState, useEffect } from "react";
+import type { GroupedTraceItem, EvalJob } from "@/app/lib/types/evaluation";
+import {
+  getScoreObject,
+  normalizeToIndividualScores,
+  hasSummaryScores,
+  isNewScoreObjectV2,
+  isGroupedFormat,
+} from "@/app/lib/utils/evaluation";
+import { formatScoreValue, getScoreByName } from "@/app/lib/utils";
+import GroupedResultsTable from "@/app/components/evaluations/GroupedResultsTable";
+
+interface DetailedResultsTableProps {
+  job: EvalJob;
+}
+
+export default function DetailedResultsTable({
+  job,
+}: DetailedResultsTableProps) {
+  const [openCommentId, setOpenCommentId] = useState<string | null>(null);
+  const [commentPos, setCommentPos] = useState({ top: 0, left: 0 });
+
+  useEffect(() => {
+    if (!openCommentId) return;
+    const handleScroll = () => setOpenCommentId(null);
+    window.addEventListener("scroll", handleScroll, true);
+    return () => {
+      window.removeEventListener("scroll", handleScroll, true);
+    };
+  }, [openCommentId]);
+
+  const scoreObject = getScoreObject(job);
+
+  if (!scoreObject || !hasSummaryScores(scoreObject)) {
+    return (
+      <div className="border rounded-lg p-6 text-center bg-[#fef3c7] border-[#fbbf24]">
+        <p className="text-sm text-[#92400e]">
+          No detailed results available or using legacy format
+        </p>
+      </div>
+    );
+  }
+
+  if (isNewScoreObjectV2(scoreObject)) {
+    if (isGroupedFormat(scoreObject.traces)) {
+      return (
+        <GroupedResultsTable
+          traces={scoreObject.traces as GroupedTraceItem[]}
+        />
+      );
+    }
+  }
+
+  const individual_scores = normalizeToIndividualScores(scoreObject);
+
+  if (!individual_scores || individual_scores.length === 0) {
+    return (
+      <div className="border rounded-lg p-6 text-center bg-[#fef3c7] border-[#fbbf24]">
+        <p className="text-sm text-[#92400e]">
+          No individual scores available. Only summary metrics are available for
+          this evaluation.
+        </p>
+      </div>
+    );
+  }
+
+  // Derive score column names from the first item's trace scores (assumes every item shares the same score set)
+  const scoreNames =
+    individual_scores[0]?.trace_scores?.map((s) => s.name) || [];
+
+  const COLUMN_WIDTHS = {
+    index: 50,
+    question: 250,
+    groundTruth: 250,
+    answer: 250,
+    score: 160,
+  };
+  const tableMinWidth =
+    COLUMN_WIDTHS.index +
+    COLUMN_WIDTHS.question +
+    COLUMN_WIDTHS.groundTruth +
+    COLUMN_WIDTHS.answer +
+    scoreNames.length * COLUMN_WIDTHS.score;
+
+  return (
+    <div className="border rounded-lg overflow-hidden bg-white border-gray-200">
+      <div className="overflow-x-auto">
+        <table
+          className="w-full border-collapse table-fixed"
+          style={{ minWidth: `${tableMinWidth}px` }}
+        >
+          <thead>
+            <tr className="bg-bg-secondary border-b border-border">
+              <th
+                className="px-4 py-3 text-left text-xs font-semibold uppercase text-[#171717]"
+                style={{ width: `${COLUMN_WIDTHS.index}px` }}
+              ></th>
+              <th
+                className="px-4 py-3 text-left text-xs font-semibold uppercase text-[#171717]"
+                style={{ width: `${COLUMN_WIDTHS.question}px` }}
+              >
+                Question
+              </th>
+              <th
+                className="px-4 py-3 text-left text-xs font-semibold uppercase text-[#171717]"
+                style={{ width: `${COLUMN_WIDTHS.groundTruth}px` }}
+              >
+                Ground Truth
+              </th>
+              <th
+                className="px-4 py-3 text-left text-xs font-semibold uppercase text-[#171717]"
+                style={{ width: `${COLUMN_WIDTHS.answer}px` }}
+              >
+                Answer
+              </th>
+              {scoreNames.map((scoreName) => (
+                <th
+                  key={scoreName}
+                  className="px-4 py-3 text-center text-xs font-semibold uppercase text-[#171717] whitespace-normal wrap-break-word"
+                  style={{ width: `${COLUMN_WIDTHS.score}px` }}
+                >
+                  {scoreName}
+                </th>
+              ))}
+            </tr>
+          </thead>
+
+          <tbody>
+            {individual_scores.map((item, index) => {
+              const question = item.input?.question || "N/A";
+              const answer = item.output?.answer || "N/A";
+              const groundTruth = item.metadata?.ground_truth || "N/A";
+
+              return (
+                <tr
+                  key={item.trace_id || index}
+                  className="border-b border-border bg-bg-primary hover:bg-bg-secondary transition-colors duration-150"
+                >
+                  <td className="px-4 py-3 text-sm font-medium align-top text-text-secondary">
+                    {index + 1}
+                  </td>
+
+                  <td className="px-4 py-3 align-top bg-bg-primary">
+                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
+                      {question}
+                    </div>
+                  </td>
+
+                  <td className="px-4 py-3 align-top bg-bg-primary">
+                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
+                      {groundTruth}
+                    </div>
+                  </td>
+
+                  <td className="px-4 py-3 align-top bg-bg-primary">
+                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
+                      {answer}
+                    </div>
+                  </td>
+
+                  {scoreNames.map((scoreName) => {
+                    const score = getScoreByName(item.trace_scores, scoreName);
+                    const { value, color, bg } = formatScoreValue(score);
+
+                    return (
+                      <td
+                        key={scoreName}
+                        className="px-4 py-3 text-center align-top"
+                      >
+                        <div className="flex items-center justify-center gap-2">
+                          <div
+                            className={`inline-block px-2 py-1 rounded text-xs font-medium border-border ${bg === "transparent" ? "border" : ""}`}
+                            style={{
+                              color,
+                              backgroundColor: bg,
+                            }}
+                          >
+                            {value}
+                          </div>
+                          {score?.comment && (
+                            <>
+                              <div
+                                className={`inline-flex items-center justify-center w-4 h-4 rounded-full text-xs font-normal ${openCommentId === `${index}-${scoreName}` ? "bg-[#171717] text-bg-primary" : "bg-bg-secondary text-text-secondary"}`}
+                                onMouseEnter={(e) => {
+                                  const rect =
+                                    e.currentTarget.getBoundingClientRect();
+                                  const tooltipWidth = 300;
+                                  const centerX = rect.left + rect.width / 2;
+                                  const clampedLeft = Math.min(
+                                    Math.max(centerX - tooltipWidth / 2, 8),
+                                    window.innerWidth - tooltipWidth - 8,
+                                  );
+                                  setCommentPos({
+                                    top: rect.top - 8,
+                                    left: clampedLeft,
+                                  });
+                                  setOpenCommentId(`${index}-${scoreName}`);
+                                }}
+                                onMouseLeave={() => setOpenCommentId(null)}
+                              >
+                                i
+                              </div>
+                              {openCommentId === `${index}-${scoreName}` && (
+                                <div
+                                  className="fixed z-50 px-3 py-2 rounded-md text-xs whitespace-normal pointer-events-none bg-[#171717] text-white border border-gray-700 w-[300px] shadow-md -translate-y-full"
+                                  style={{
+                                    top: commentPos.top,
+                                    left: commentPos.left,
+                                  }}
+                                >
+                                  {score.comment}
+                                </div>
+                              )}
+                            </>
+                          )}
+                        </div>
+                      </td>
+                    );
+                  })}
+                </tr>
+              );
+            })}
+          </tbody>
+        </table>
+      </div>
+    </div>
+  );
+}
diff --git a/app/components/evaluations/EvalDatasetDescription.tsx b/app/components/evaluations/EvalDatasetDescription.tsx
index 4579101..b0b99a0 100644
--- a/app/components/evaluations/EvalDatasetDescription.tsx
+++ b/app/components/evaluations/EvalDatasetDescription.tsx
@@ -15,7 +15,7 @@ export default function EvalDatasetDescription({
 
   return (
     <div
-      className="mt-2 text-xs leading-relaxed break-words overflow-hidden"
+      className="mt-2 text-xs leading-relaxed wrap-break-word overflow-hidden"
       style={{ color: colors.text.secondary }}
     >
       <span>
diff --git a/app/components/evaluations/EvalRunCard.tsx b/app/components/evaluations/EvalRunCard.tsx
index 65990d1..64dba5b 100644
--- a/app/components/evaluations/EvalRunCard.tsx
+++ b/app/components/evaluations/EvalRunCard.tsx
@@ -2,18 +2,14 @@
 
 import { useState } from "react";
 import { useRouter } from "next/navigation";
-import { colors } from "@/app/lib/colors";
-import {
-  EvalJob,
-  AssistantConfig,
-  getScoreObject,
-} from "@/app/components/types";
-import { getStatusColor, formatCostUSD } from "@/app/components/utils";
-import { timeAgo } from "@/app/lib/utils";
-import ConfigModal from "@/app/components/ConfigModal";
-import ScoreDisplay from "@/app/components/ScoreDisplay";
+import type { EvalJob, AssistantConfig } from "@/app/lib/types/evaluation";
+import { getScoreObject } from "@/app/lib/utils/evaluation";
+import { getStatusColor } from "@/app/components/utils";
+import { timeAgo, formatCostUSD } from "@/app/lib/utils";
+import { ConfigModal, InfoTooltip } from "@/app/components";
+import ScoreDisplay from "@/app/components/evaluations/ScoreDisplay";
 import CostIcon from "@/app/components/icons/evaluations/CostIcon";
-import InfoTooltip from "@/app/components/InfoTooltip";
+import DatabaseIcon from "@/app/components/icons/evaluations/DatabaseIcon";
 
 export interface EvalRunCardProps {
   job: EvalJob;
@@ -33,91 +29,55 @@ export default function EvalRunCard({
 
   return (
     <div
-      className="rounded-lg overflow-hidden"
-      style={{
-        backgroundColor: colors.bg.primary,
-        boxShadow: "0 1px 3px rgba(0, 0, 0, 0.04)",
-        borderLeft: `3px solid ${statusColor.border}`,
-      }}
+      className={`rounded-lg overflow-hidden bg-bg-primary shadow-sm border-l-3 ${statusColor.border}`}
     >
       <div className="px-5 py-4">
-        {/* Row 1: Run Name (left) | Status (right) */}
         <div className="flex items-start justify-between gap-4">
           <div className="min-w-0 flex-1">
-            <div
-              className="text-sm font-semibold truncate"
-              style={{ color: colors.text.primary }}
-            >
+            <div className="text-sm font-semibold truncate text-text-primary">
               {job.run_name}
             </div>
             {job.inserted_at && (
-              <div
-                className="text-xs mt-0.5"
-                style={{ color: colors.text.secondary }}
-              >
+              <div className="text-xs mt-0.5 text-text-secondary">
                 {timeAgo(job.inserted_at)}
               </div>
             )}
             {/* Error message (if failed) */}
             {job.error_message && (
-              <div
-                className="mt-2 text-xs break-words overflow-hidden"
-                style={{ color: "hsl(8, 86%, 40%)" }}
-              >
+              <div className="mt-2 text-xs wrap-break-word overflow-hidden text-status-error-text">
                 {job.error_message}
               </div>
             )}
           </div>
           <span
-            className="px-2.5 py-1 rounded text-xs font-semibold uppercase tracking-wide flex-shrink-0"
-            style={{ backgroundColor: statusColor.bg, color: statusColor.text }}
+            className={`px-2.5 py-1 rounded text-xs font-semibold uppercase tracking-wide shrink-0 ${statusColor.bg} ${statusColor.text}`}
           >
             {job.status}
           </span>
         </div>
 
-        {/* Row 2: Scores */}
         {scoreObj && (
           <div className="mt-3">
             <ScoreDisplay score={scoreObj} errorMessage={job.error_message} />
           </div>
         )}
 
-        {/* Row 3: Dataset + Config + Cost (left) | Actions (right) */}
         <div className="flex items-center justify-between gap-4 mt-3">
-          <div
-            className="flex items-center gap-3 text-xs"
-            style={{ color: colors.text.secondary }}
-          >
+          <div className="flex items-center gap-3 text-xs text-text-secondary">
             {job.dataset_name && (
               <span className="flex items-center gap-1.5">
-                <svg
-                  className="w-3.5 h-3.5 flex-shrink-0"
-                  fill="none"
-                  viewBox="0 0 24 24"
-                  stroke="currentColor"
-                  strokeWidth={2}
-                >
-                  <path
-                    strokeLinecap="round"
-                    strokeLinejoin="round"
-                    d="M4 7v10c0 2 3.6 3 8 3s8-1 8-3V7M4 7c0 2 3.6 3 8 3s8-1 8-3M4 7c0-2 3.6-3 8-3s8 1 8 3M4 12c0 2 3.6 3 8 3s8-1 8-3"
-                  />
-                </svg>
+                <DatabaseIcon className="shrink-0" />
                 {job.dataset_name}
               </span>
             )}
             {job.assistant_id && assistantConfig?.name && (
-              <span
-                className="px-1.5 py-0.5 rounded"
-                style={{ backgroundColor: colors.bg.secondary }}
-              >
+              <span className="px-1.5 py-0.5 rounded bg-bg-secondary">
                 {assistantConfig.name}
               </span>
             )}
             {job.cost?.total_cost_usd != null && (
               <span className="flex items-center gap-1.5">
-                <CostIcon className="flex-shrink-0" />
+                <CostIcon className="shrink-0" />
                 {formatCostUSD(job.cost.total_cost_usd)}
                 <InfoTooltip
                   text={
@@ -144,30 +104,21 @@ export default function EvalRunCard({
               </span>
             )}
           </div>
-          <div className="flex items-center gap-3 flex-shrink-0">
+          <div className="flex items-center gap-3 shrink-0">
             <button
               onClick={() => setIsConfigModalOpen(true)}
-              className="px-3 py-1.5 rounded-lg text-xs font-medium border"
-              style={{
-                backgroundColor: "transparent",
-                borderColor: colors.border,
-                color: colors.text.primary,
-              }}
+              className="px-3 py-1.5 rounded-lg text-xs font-medium border border-border bg-transparent text-text-primary"
             >
               View Config
             </button>
             <button
               onClick={() => router.push(`/evaluations/${job.id}`)}
               disabled={!isCompleted}
-              className="px-3 py-1.5 rounded-lg text-xs font-medium border cursor-pointer disabled:cursor-not-allowed"
-              style={{
-                backgroundColor: "transparent",
-                borderColor: colors.border,
-                color: isCompleted
-                  ? colors.text.primary
-                  : colors.text.secondary,
-                opacity: isCompleted ? 1 : 0.5,
-              }}
+              className={`px-3 py-1.5 rounded-lg text-xs font-medium border border-border bg-transparent cursor-pointer disabled:cursor-not-allowed ${
+                isCompleted
+                  ? "text-text-primary opacity-100"
+                  : "text-text-secondary opacity-50"
+              }`}
             >
               View Results
             </button>
diff --git a/app/components/evaluations/EvaluationsTab.tsx b/app/components/evaluations/EvaluationsTab.tsx
index 3fce988..d57ea05 100644
--- a/app/components/evaluations/EvaluationsTab.tsx
+++ b/app/components/evaluations/EvaluationsTab.tsx
@@ -4,12 +4,13 @@ import { useState, useEffect, useCallback } from "react";
 import { apiFetch } from "@/app/lib/apiClient";
 import { colors } from "@/app/lib/colors";
 import { Dataset } from "@/app/lib/types/dataset";
-import { EvalJob, AssistantConfig } from "@/app/components/types";
+import { EvalJob, AssistantConfig } from "@/app/lib/types/evaluation";
 import ConfigSelector from "@/app/components/ConfigSelector";
 import Loader from "@/app/components/Loader";
 import EvalRunCard from "./EvalRunCard";
 import EvalDatasetDescription from "./EvalDatasetDescription";
 import { useAuth } from "@/app/lib/context/AuthContext";
+import { RefreshIcon } from "@/app/components/icons";
 
 type Tab = "datasets" | "evaluations";
 
@@ -390,23 +391,12 @@ export default function EvaluationsTab({
               <button
                 onClick={fetchEvaluations}
                 disabled={isLoading}
-                className="p-1.5 rounded"
-                style={{ color: colors.text.secondary }}
+                className="p-1.5 rounded text-text-secondary cursor-pointer"
                 aria-label="Refresh evaluations"
               >
-                <svg
-                  className={`w-4 h-4 ${isLoading ? "animate-spin" : ""}`}
-                  fill="none"
-                  viewBox="0 0 24 24"
-                  stroke="currentColor"
-                >
-                  <path
-                    strokeLinecap="round"
-                    strokeLinejoin="round"
-                    strokeWidth={2}
-                    d="M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15"
-                  />
-                </svg>
+                <RefreshIcon
+                  className={`w-4 h-4 -scale-x-100 ${isLoading ? "animate-spin" : ""}`}
+                />
               </button>
             </div>
           </div>
@@ -418,14 +408,12 @@ export default function EvaluationsTab({
               boxShadow: "0 1px 3px rgba(0, 0, 0, 0.04)",
             }}
           >
-            {/* Loading */}
             {isLoading && evalJobs.length === 0 && (
               <div className="p-16">
                 <Loader size="md" message="Loading evaluation runs..." />
               </div>
             )}
 
-            {/* Error */}
             {error && (
               <div className="p-4">
                 <div
@@ -439,7 +427,6 @@ export default function EvaluationsTab({
               </div>
             )}
 
-            {/* Empty State */}
             {!isLoading && evalJobs.length === 0 && !error && (
               <div className="p-16 text-center">
                 <svg
@@ -469,7 +456,6 @@ export default function EvaluationsTab({
               </div>
             )}
 
-            {/* Runs List */}
             {evalJobs.length > 0 &&
               (() => {
                 const filteredJobs =
diff --git a/app/components/evaluations/GroupedResultsTable.tsx b/app/components/evaluations/GroupedResultsTable.tsx
new file mode 100644
index 0000000..1943d22
--- /dev/null
+++ b/app/components/evaluations/GroupedResultsTable.tsx
@@ -0,0 +1,261 @@
+/**
+ * GroupedResultsTable.tsx - Grouped view for evaluation results
+ *
+ * Displays multiple LLM answers per question in a grouped table format
+ */
+
+import { useState, useEffect, Fragment } from "react";
+import { TraceScore, GroupedTraceItem } from "@/app/lib/types/evaluation";
+import { formatScoreValue } from "@/app/lib/utils";
+
+export default function GroupedResultsTable({
+  traces,
+}: {
+  traces: GroupedTraceItem[];
+}) {
+  const [openCommentId, setOpenCommentId] = useState<string | null>(null);
+  const [commentPos, setCommentPos] = useState({ top: 0, left: 0 });
+
+  useEffect(() => {
+    if (!openCommentId) return;
+    const handleScroll = () => setOpenCommentId(null);
+    window.addEventListener("scroll", handleScroll, true);
+    return () => {
+      window.removeEventListener("scroll", handleScroll, true);
+    };
+  }, [openCommentId]);
+
+  if (!traces || traces.length === 0) {
+    return (
+      <div className="border rounded-lg p-6 text-center bg-[#fef3c7] border-[#fbbf24]">
+        <p className="text-sm text-[#92400e]">No grouped results available</p>
+      </div>
+    );
+  }
+
+  // Largest number of LLM answers in any group; sets how many answer columns are rendered
+  const maxAnswers = Math.max(...traces.map((t) => t.llm_answers.length));
+
+  // Fixed column widths (in pixels) for predictable layout
+  const COLUMN_WIDTHS = {
+    qId: 60,
+    question: 200,
+    groundTruth: 200,
+    answer: 250,
+  };
+
+  // Minimum table width: the three fixed columns plus one answer column per answer.
+  // When the container is narrower, the overflow-x-auto wrapper scrolls instead of squeezing columns.
+  const fixedColumnsWidth =
+    COLUMN_WIDTHS.qId + COLUMN_WIDTHS.question + COLUMN_WIDTHS.groundTruth;
+  const tableMinWidth = fixedColumnsWidth + maxAnswers * COLUMN_WIDTHS.answer;
+
+  return (
+    <div className="border rounded-lg overflow-hidden bg-white border-border">
+      <div className="overflow-x-auto">
+        <table
+          className="w-full border-collapse table-fixed"
+          style={{ minWidth: `${tableMinWidth}px` }}
+        >
+          <thead>
+            <tr className="bg-bg-secondary border-b border-border">
+              <th
+                className="px-4 py-3 text-left text-xs font-semibold uppercase text-[#171717]"
+                style={{
+                  width: `${COLUMN_WIDTHS.qId}px`,
+                  minWidth: `${COLUMN_WIDTHS.qId}px`,
+                }}
+              >
+                Q.ID
+              </th>
+              <th
+                className="px-4 py-3 text-left text-xs font-semibold uppercase text-[#171717]"
+                style={{
+                  width: `${COLUMN_WIDTHS.question}px`,
+                  minWidth: `${COLUMN_WIDTHS.question}px`,
+                }}
+              >
+                Question
+              </th>
+              <th
+                className="px-4 py-3 text-left text-xs font-semibold uppercase text-[#171717]"
+                style={{
+                  width: `${COLUMN_WIDTHS.groundTruth}px`,
+                  minWidth: `${COLUMN_WIDTHS.groundTruth}px`,
+                }}
+              >
+                Ground Truth
+              </th>
+              {Array.from({ length: maxAnswers }, (_, i) => (
+                <th
+                  key={`answer-${i}`}
+                  className="px-4 py-3 text-left text-xs font-semibold uppercase text-[#171717]"
+                  style={{
+                    width: `${COLUMN_WIDTHS.answer}px`,
+                    minWidth: `${COLUMN_WIDTHS.answer}px`,
+                  }}
+                >
+                  Answer {i + 1}
+                </th>
+              ))}
+            </tr>
+          </thead>
+
+          <tbody>
+            {traces.map((group, index) => (
+              <Fragment key={group.question_id || index}>
+                <tr
+                  key={`${group.question_id || index}-text`}
+                  className="bg-white"
+                >
+                  <td className="px-4 pt-3 pb-1 text-sm font-medium align-top text-text-secondary">
+                    {group.question_id}
+                  </td>
+
+                  <td className="px-4 pt-3 pb-1 align-top bg-bg-secondary">
+                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
+                      {group.question}
+                    </div>
+                  </td>
+
+                  <td className="px-4 pt-3 pb-1 align-top bg-bg-secondary">
+                    <div className="text-sm overflow-auto text-[#171717] leading-normal max-h-[150px] wrap-break-word">
+                      {group.ground_truth_answer}
+                    </div>
+                  </td>
+
+                  {/* Answer */}
+                  {Array.from({ length: maxAnswers }, (_, answerIndex) => {
+                    const answer = group.llm_answers[answerIndex];
+                    return (
+                      <td
+                        key={answerIndex}
+                        className="px-4 pt-3 pb-1 align-top"
+                      >
+                        {answer ? (
+                          <div className="text-sm overflow-auto text-[#171717] leading-6 max-h-[150px] wrap-break-word">
+                            {answer}
+                          </div>
+                        ) : (
+                          <span className="text-sm text-[#171717]">-</span>
+                        )}
+                      </td>
+                    );
+                  })}
+                </tr>
+                <tr
+                  key={`${group.question_id || index}-scores`}
+                  className="border-b border-border"
+                >
+                  <td className="px-4 pt-1 pb-3" />
+                  <td className="px-4 pt-1 pb-3 bg-bg-secondary" />
+                  <td className="px-4 pt-1 pb-3 bg-bg-secondary" />
+
+                  {Array.from({ length: maxAnswers }, (_, answerIndex) => {
+                    const answerScores: TraceScore[] =
+                      group.scores?.[answerIndex] || [];
+                    const answer = group.llm_answers[answerIndex];
+
+                    return (
+                      <td
+                        key={answerIndex}
+                        className="px-4 pt-1 pb-3 align-bottom"
+                      >
+                        {answer && answerScores.length > 0 ? (
+                          <div className="space-y-1">
+                            {answerScores.map(
+                              (score: TraceScore, scoreIdx: number) => {
+                                if (!score) return null;
+                                const { value, color, bg } =
+                                  formatScoreValue(score);
+                                return (
+                                  <div
+                                    key={score.name || scoreIdx}
+                                    className="flex items-center justify-between gap-1"
+                                  >
+                                    <span className="text-xs truncate min-w-0 text-text-secondary">
+                                      {score.name}:
+                                    </span>
+                                    <div className="flex items-center gap-1 shrink-0">
+                                      <div
+                                        className={`inline-block px-2 py-0.5 rounded text-xs font-medium ${
+                                          bg === "transparent"
+                                            ? "border border-border"
+                                            : ""
+                                        }`}
+                                        style={{ color, backgroundColor: bg }}
+                                      >
+                                        {value}
+                                      </div>
+                                      {score?.comment &&
+                                        (() => {
+                                          const commentId = `g${index}-a${answerIndex}-s${scoreIdx}`;
+                                          return (
+                                            <>
+                                              <div
+                                                className={`inline-flex items-center justify-center w-4 h-4 rounded-full text-xs font-normal ${
+                                                  openCommentId === commentId
+                                                    ? "bg-[#171717] text-white"
+                                                    : "bg-bg-secondary text-text-secondary"
+                                                }`}
+                                                onMouseEnter={(e) => {
+                                                  const rect =
+                                                    e.currentTarget.getBoundingClientRect();
+                                                  const tooltipWidth = 300;
+                                                  const centerX =
+                                                    rect.left + rect.width / 2;
+                                                  const clampedLeft = Math.min(
+                                                    Math.max(
+                                                      centerX -
+                                                        tooltipWidth / 2,
+                                                      8,
+                                                    ),
+                                                    window.innerWidth -
+                                                      tooltipWidth -
+                                                      8,
+                                                  );
+                                                  setCommentPos({
+                                                    top: rect.top - 8,
+                                                    left: clampedLeft,
+                                                  });
+                                                  setOpenCommentId(commentId);
+                                                }}
+                                                onMouseLeave={() =>
+                                                  setOpenCommentId(null)
+                                                }
+                                              >
+                                                i
+                                              </div>
+                                              {openCommentId === commentId && (
+                                                <div
+                                                  className="fixed z-50 px-3 py-2 rounded-md text-xs whitespace-normal pointer-events-none bg-[#171717] text-white w-[300px] shadow-[0_4px_6px_rgba(0,0,0,0.1)] -translate-y-full"
+                                                  style={{
+                                                    top: commentPos.top,
+                                                    left: commentPos.left,
+                                                  }}
+                                                >
+                                                  {score.comment}
+                                                </div>
+                                              )}
+                                            </>
+                                          );
+                                        })()}
+                                    </div>
+                                  </div>
+                                );
+                              },
+                            )}
+                          </div>
+                        ) : null}
+                      </td>
+                    );
+                  })}
+                </tr>
+              </Fragment>
+            ))}
+          </tbody>
+        </table>
+      </div>
+    </div>
+  );
+}
diff --git a/app/components/ScoreDisplay.tsx b/app/components/evaluations/ScoreDisplay.tsx
similarity index 94%
rename from app/components/ScoreDisplay.tsx
rename to app/components/evaluations/ScoreDisplay.tsx
index 2f8b1db..68efa33 100644
--- a/app/components/ScoreDisplay.tsx
+++ b/app/components/evaluations/ScoreDisplay.tsx
@@ -5,7 +5,8 @@
 
 "use client";
 
-import { ScoreObject, hasSummaryScores } from "./types";
+import type { ScoreObject } from "@/app/lib/types/evaluation";
+import { hasSummaryScores } from "@/app/lib/utils/evaluation";
 
 interface ScoreDisplayProps {
   score: ScoreObject | null;
@@ -16,7 +17,6 @@ export default function ScoreDisplay({
   score,
   errorMessage,
 }: ScoreDisplayProps) {
-  // No score available
   if (!score) {
     return (
       <div className="inline-flex items-center gap-2 px-3 py-1.5 rounded-md text-sm bg-[hsl(0,0%,95%)] border border-[hsl(0,0%,85%)] text-[hsl(330,3%,49%)]">
@@ -42,7 +42,6 @@ export default function ScoreDisplay({
       );
     }
 
-    // Separate numeric and categorical scores
     const numericScores = summaryScores.filter(
       (s) => s.data_type === "NUMERIC",
     );
@@ -83,7 +82,6 @@ export default function ScoreDisplay({
     );
   }
 
-  // Fallback for unsupported format
   return (
     <div className="inline-flex items-center gap-2 px-3 py-1.5 rounded-md text-sm bg-[hsl(0,0%,95%)] border border-[hsl(0,0%,85%)] text-[hsl(330,3%,49%)]">
       <span className="font-medium">Score:</span>
diff --git a/app/components/icons/common/CopyIcon.tsx b/app/components/icons/common/CopyIcon.tsx
new file mode 100644
index 0000000..ac3b372
--- /dev/null
+++ b/app/components/icons/common/CopyIcon.tsx
@@ -0,0 +1,20 @@
+interface IconProps {
+  className?: string;
+  style?: React.CSSProperties;
+}
+
+export default function CopyIcon({ className, style }: IconProps) {
+  return (
+    <svg
+      className={className}
+      fill="none"
+      viewBox="0 0 24 24"
+      stroke="currentColor"
+      strokeWidth={2}
+      style={style}
+    >
+      <rect x="9" y="9" width="13" height="13" rx="2" ry="2" />
+      <path d="M5 15H4a2 2 0 01-2-2V4a2 2 0 012-2h9a2 2 0 012 2v1" />
+    </svg>
+  );
+}
diff --git a/app/components/icons/common/RefreshIcon.tsx b/app/components/icons/common/RefreshIcon.tsx
index fedb9e2..e244959 100644
--- a/app/components/icons/common/RefreshIcon.tsx
+++ b/app/components/icons/common/RefreshIcon.tsx
@@ -13,11 +13,13 @@ export default function RefreshIcon({ className, style }: IconProps) {
       strokeWidth={2}
       style={style}
     >
-      <path
-        strokeLinecap="round"
-        strokeLinejoin="round"
-        d="M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15"
-      />
+      <g transform="scale(-1,1) translate(-24,0)">
+        <path
+          strokeLinecap="round"
+          strokeLinejoin="round"
+          d="M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15"
+        />
+      </g>
     </svg>
   );
 }
diff --git a/app/components/icons/index.tsx b/app/components/icons/index.tsx
index e46af15..450a0ed 100644
--- a/app/components/icons/index.tsx
+++ b/app/components/icons/index.tsx
@@ -2,6 +2,7 @@
 export { default as ArrowLeftIcon } from "./common/ArrowLeftIcon";
 export { default as ChevronDownIcon } from "./common/ChevronDownIcon";
 export { default as CheckIcon } from "./common/CheckIcon";
+export { default as CopyIcon } from "./common/CopyIcon";
 export { default as EyeIcon } from "./common/EyeIcon";
 export { default as EyeOffIcon } from "./common/EyeOffIcon";
 export { default as RefreshIcon } from "./common/RefreshIcon";
diff --git a/app/components/index.ts b/app/components/index.ts
index 318a498..9f5fbc4 100644
--- a/app/components/index.ts
+++ b/app/components/index.ts
@@ -1,5 +1,10 @@
 export { default as Button } from "./Button";
+export { default as CodeBlock } from "./CodeBlock";
+export { default as ConfigModal } from "./ConfigModal";
+export { default as CopyableCodeBlock } from "./CopyableCodeBlock";
 export { default as Field } from "./Field";
+export { default as InfoTooltip } from "./InfoTooltip";
 export { default as Modal } from "./Modal";
 export { default as PageHeader } from "./PageHeader";
 export { default as Sidebar } from "./Sidebar";
+export { default as Tag } from "./Tag";
diff --git a/app/components/speech-to-text/EvaluationsTab.tsx b/app/components/speech-to-text/EvaluationsTab.tsx
index 81dbebc..119e955 100644
--- a/app/components/speech-to-text/EvaluationsTab.tsx
+++ b/app/components/speech-to-text/EvaluationsTab.tsx
@@ -10,7 +10,8 @@ import Loader, { LoaderBox } from "@/app/components/Loader";
 import StatusBadge from "@/app/components/StatusBadge";
 import { computeWordDiff } from "./TranscriptionDiffViewer";
 import { getStatusColor } from "@/app/components/utils";
-import AudioPlayerFromUrl from "./AudioPlayerFromUrl";
+import AudioPlayerFromUrl from "@/app/components/speech-to-text/AudioPlayerFromUrl";
+import { RefreshIcon } from "@/app/components/icons";
 
 export interface EvaluationsTabProps {
   leftPanelWidth: number;
@@ -442,22 +443,11 @@ export default function EvaluationsTab({
                 <button
                   onClick={loadRuns}
                   disabled={isLoadingRuns}
-                  className="p-1.5 rounded"
-                  style={{ color: colors.text.secondary }}
+                  className="p-1.5 rounded cursor-pointer text-text-secondary"
                 >
-                  <svg
-                    className={`w-4 h-4 ${isLoadingRuns ? "animate-spin" : ""}`}
-                    fill="none"
-                    viewBox="0 0 24 24"
-                    stroke="currentColor"
-                  >
-                    <path
-                      strokeLinecap="round"
-                      strokeLinejoin="round"
-                      strokeWidth={2}
-                      d="M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15"
-                    />
-                  </svg>
+                  <RefreshIcon
+                    className={`w-4 h-4 -scale-x-100 ${isLoadingRuns ? "animate-spin" : ""}`}
+                  />
                 </button>
               </div>
             )}
@@ -1213,27 +1203,18 @@ export default function EvaluationsTab({
                       return (
                         <div
                           key={run.id}
-                          className="rounded-lg overflow-hidden bg-bg-primary shadow-sm border-l-3"
-                          style={{
-                            borderLeftColor: statusColor.border,
-                          }}
+                          className={`rounded-lg overflow-hidden bg-bg-primary shadow-sm border-l-3 ${statusColor.border}`}
                         >
                           <div className="px-5 py-4">
                             {/* Row 1: Run Name + Status */}
                             <div className="flex items-start justify-between gap-4">
                               <div className="min-w-0 flex-1">
-                                <div
-                                  className="text-sm font-semibold truncate"
-                                  style={{ color: colors.text.primary }}
-                                >
+                                <div className="text-sm font-semibold truncate text-text-primary">
                                   {run.run_name}
                                 </div>
                                 {/* Error message */}
                                 {run.error_message && (
-                                  <div
-                                    className="mt-2 text-xs break-words overflow-hidden"
-                                    style={{ color: "hsl(8, 86%, 40%)" }}
-                                  >
+                                  <div className="mt-2 text-xs wrap-break-word overflow-hidden text-status-error-text">
                                     {run.error_message}
                                   </div>
                                 )}
diff --git a/app/components/text-to-speech/EvaluationsTab.tsx b/app/components/text-to-speech/EvaluationsTab.tsx
index b4a1cce..0caa46f 100644
--- a/app/components/text-to-speech/EvaluationsTab.tsx
+++ b/app/components/text-to-speech/EvaluationsTab.tsx
@@ -15,6 +15,7 @@ import { useAuth } from "@/app/lib/context/AuthContext";
 import { apiFetch } from "@/app/lib/apiClient";
 import Loader, { LoaderBox } from "@/app/components/Loader";
 import { getStatusColor } from "@/app/components/utils";
+import { RefreshIcon } from "@/app/components/icons";
 import AudioPlayerFromUrl from "./AudioPlayerFromUrl";
 import { useToast } from "@/app/components/Toast";
 
@@ -442,22 +443,11 @@ export default function EvaluationsTab({
                 <button
                   onClick={loadRuns}
                   disabled={isLoadingRuns}
-                  className="p-1.5 rounded"
-                  style={{ color: colors.text.secondary }}
+                  className="p-1.5 rounded text-text-secondary cursor-pointer"
                 >
-                  <svg
+                  <RefreshIcon
                     className={`w-4 h-4 ${isLoadingRuns ? "animate-spin" : ""}`}
-                    fill="none"
-                    viewBox="0 0 24 24"
-                    stroke="currentColor"
-                  >
-                    <path
-                      strokeLinecap="round"
-                      strokeLinejoin="round"
-                      strokeWidth={2}
-                      d="M4 4v5h.582m15.356 2A8.001 8.001 0 004.582 9m0 0H9m11 11v-5h-.581m0 0a8.003 8.003 0 01-15.357-2m15.357 2H15"
-                    />
-                  </svg>
+                  />
                 </button>
               </div>
             )}
@@ -1134,39 +1124,24 @@ export default function EvaluationsTab({
                       return (
                         <div
                           key={run.id}
-                          className="rounded-lg overflow-hidden"
-                          style={{
-                            backgroundColor: colors.bg.primary,
-                            boxShadow: "0 1px 3px rgba(0, 0, 0, 0.04)",
-                            borderLeft: `3px solid ${statusColor.border}`,
-                          }}
+                          className={`rounded-lg overflow-hidden bg-bg-primary shadow-sm border-l-3 ${statusColor.border}`}
                         >
                           <div className="px-5 py-4">
                             {/* Row 1: Run Name + Status */}
                             <div className="flex items-start justify-between gap-4">
                               <div className="min-w-0 flex-1">
-                                <div
-                                  className="text-sm font-semibold truncate"
-                                  style={{ color: colors.text.primary }}
-                                >
+                                <div className="text-sm font-semibold truncate text-text-primary">
                                   {run.run_name}
                                 </div>
                                 {/* Error message */}
                                 {run.error_message && (
-                                  <div
-                                    className="mt-2 text-xs break-words overflow-hidden"
-                                    style={{ color: "hsl(8, 86%, 40%)" }}
-                                  >
+                                  <div className="mt-2 text-xs wrap-break-word overflow-hidden text-status-error-text">
                                     {run.error_message}
                                   </div>
                                 )}
                               </div>
                               <span
-                                className="px-2.5 py-1 rounded text-xs font-semibold uppercase tracking-wide flex-shrink-0"
-                                style={{
-                                  backgroundColor: statusColor.bg,
-                                  color: statusColor.text,
-                                }}
+                                className={`px-2.5 py-1 rounded text-xs font-semibold uppercase tracking-wide shrink-0 ${statusColor.bg} ${statusColor.text}`}
                               >
                                 {run.status}
                               </span>
diff --git a/app/components/types.ts b/app/components/types.ts
deleted file mode 100644
index b2bbc73..0000000
--- a/app/components/types.ts
+++ /dev/null
@@ -1,234 +0,0 @@
-/**
- * Shared TypeScript types for evaluation components
- */
-
-export interface TraceScore {
-  name: string;
-  value: number | string;
-  data_type: "NUMERIC" | "CATEGORICAL";
-  comment?: string;
-}
-
-// New trace format (from evaluation-sample-3.json)
-export interface TraceItem {
-  trace_id: string;
-  question: string;
-  llm_answer: string;
-  ground_truth_answer: string;
-  scores: TraceScore[];
-}
-
-export interface GroupedTraceItem {
-  question_id: number;
-  question: string;
-  ground_truth_answer: string;
-  llm_answers: string[];
-  trace_ids: string[];
-  scores: TraceScore[][];
-}
-
-// Legacy individual score format (nested structure)
-export interface IndividualScore {
-  trace_id: string;
-  input?: {
-    question: string;
-  };
-  output?: {
-    answer: string;
-  };
-  metadata?: {
-    ground_truth?: string;
-    item_id?: string;
-    response_id?: string;
-  };
-  trace_scores: TraceScore[];
-}
-
-export interface SummaryScore {
-  name: string;
-  avg?: number;
-  std?: number;
-  total_pairs: number;
-  data_type: "NUMERIC" | "CATEGORICAL";
-  distribution?: Record<string, number>; // For categorical data
-}
-
-// New score object with traces array
-export interface NewScoreObjectV2 {
-  summary_scores: SummaryScore[];
-  traces: TraceItem[] | GroupedTraceItem[];
-}
-
-// Legacy score structure (for backward compatibility)
-export interface PerItemScore {
-  trace_id: string;
-  cosine_similarity: number;
-}
-
-export interface CosineSimilarity {
-  avg: number;
-  std: number;
-  total_pairs: number;
-  per_item_scores: PerItemScore[];
-}
-
-export interface LegacyScoreObject {
-  cosine_similarity: CosineSimilarity;
-}
-
-// Basic score object with only summary scores (no individual scores or traces)
-export interface BasicScoreObject {
-  summary_scores: SummaryScore[];
-}
-
-// Union type to support both old and new structures
-export type ScoreObject =
-  | NewScoreObjectV2
-  | BasicScoreObject
-  | LegacyScoreObject;
-
-export interface AssistantConfig {
-  name: string;
-  model: string;
-  knowledge_base_ids: string[];
-  project_id: number;
-  organization_id: number;
-  updated_at: string;
-  deleted_at: string | null;
-  instructions: string;
-  assistant_id: string;
-  temperature: number;
-  max_num_results: number;
-  id: number;
-  inserted_at: string;
-  is_deleted: boolean;
-}
-
-export interface EvalCostEntry {
-  model: string;
-  cost_usd: number;
-  input_tokens?: number;
-  output_tokens?: number;
-  prompt_tokens?: number;
-  total_tokens: number;
-}
-
-export interface EvalCost {
-  response?: EvalCostEntry;
-  embedding?: EvalCostEntry;
-  total_cost_usd: number;
-}
-
-export interface EvalJob {
-  id: number;
-  run_name: string;
-  dataset_name: string;
-  dataset_id: number;
-  batch_job_id: number;
-  embedding_batch_job_id: number | null;
-  status: string;
-  object_store_url: string | null;
-  total_items: number;
-  score?: ScoreObject | null;
-  scores?: ScoreObject | null; // Alternative field name
-  error_message: string | null;
-  config?: {
-    model?: string;
-    instructions?: string;
-    tools?: unknown[];
-    include?: string[];
-    temperature?: number;
-  };
-  config_id?: string;
-  config_version?: number;
-  model?: string;
-  assistant_id?: string;
-  organization_id: number;
-  project_id: number;
-  cost?: EvalCost | null;
-  inserted_at: string;
-  updated_at: string;
-}
-
-// Type guard functions
-
-// Shared guard: Check if score has summary_scores and intelligently narrow to NewScoreObjectV2 or BasicScoreObject
-// Priority: If it has traces → NewScoreObjectV2, otherwise → BasicScoreObject
-export function hasSummaryScores(
-  score: ScoreObject | null | undefined,
-): score is NewScoreObjectV2 | BasicScoreObject {
-  if (!score) return false;
-  if (!("summary_scores" in score)) return false;
-
-  // Prioritize traces format if available
-  if ("traces" in score) {
-    return true;
-  }
-
-  // Otherwise, it's BasicScoreObject (summary_scores only, no traces, no individual_scores)
-  return true;
-}
-
-export function isNewScoreObjectV2(
-  score: ScoreObject | null | undefined,
-): score is NewScoreObjectV2 {
-  if (!score) return false;
-  return "summary_scores" in score && "traces" in score;
-}
-
-export function isBasicScoreObject(
-  score: ScoreObject | null | undefined,
-): score is BasicScoreObject {
-  if (!score) return false;
-  return "summary_scores" in score && !("traces" in score);
-}
-
-export function isLegacyScoreObject(
-  score: ScoreObject | null | undefined,
-): score is LegacyScoreObject {
-  if (!score) return false;
-  return "cosine_similarity" in score;
-}
-
-// Helper to get score object from job
-export function getScoreObject(job: EvalJob): ScoreObject | null {
-  return job.scores || job.score || null;
-}
-
-export function isGroupedFormat(
-  traces: TraceItem[] | GroupedTraceItem[],
-): traces is GroupedTraceItem[] {
-  if (!traces || traces.length === 0) return false;
-  return "llm_answers" in traces[0] && Array.isArray(traces[0].llm_answers);
-}
-
-// Normalize traces to IndividualScore format for table display
-export function normalizeToIndividualScores(
-  score: ScoreObject | null | undefined,
-): IndividualScore[] {
-  if (!score) return [];
-
-  if (isNewScoreObjectV2(score)) {
-    // Convert TraceItem[] to IndividualScore[] for table display
-    // Note: Grouped traces should be detected earlier and handled separately
-    return score.traces.map((trace: TraceItem | GroupedTraceItem) => {
-      // Handle regular TraceItem format
-      if ("llm_answer" in trace) {
-        return {
-          trace_id: trace.trace_id,
-          input: { question: trace.question },
-          output: { answer: trace.llm_answer },
-          metadata: { ground_truth: trace.ground_truth_answer },
-          trace_scores: trace.scores,
-        };
-      }
-      // Should not reach here if grouped format is handled properly
-      return {
-        trace_id: "",
-        trace_scores: [],
-      };
-    });
-  }
-
-  return [];
-}
diff --git a/app/components/utils.ts b/app/components/utils.ts
index f1386e6..90912ae 100644
--- a/app/components/utils.ts
+++ b/app/components/utils.ts
@@ -27,9 +27,11 @@ export const formatDate = (dateString?: string): string => {
 };
 
 /**
- * Returns color scheme based on job/evaluation status
+ * Returns Tailwind class names based on job/evaluation status.
+ * The color tokens are defined in app/globals.css inside `@theme inline` blocks.
+ *
  * @param status - Status string (completed, processing, failed, etc.)
- * @returns Object with bg, border, and text HSL color values
+ * @returns Object with bg, border, and text Tailwind class names
  */
 export const getStatusColor = (
   status: string,
@@ -38,50 +40,35 @@ export const getStatusColor = (
     case "completed":
     case "success":
       return {
-        bg: "hsl(134, 61%, 95%)",
-        border: "hsl(134, 61%, 70%)",
-        text: "hsl(134, 61%, 25%)",
+        bg: "bg-status-success-bg",
+        border: "border-status-success-border",
+        text: "text-status-success-text",
       };
     case "processing":
     case "pending":
     case "queued":
     case "running":
       return {
-        bg: "hsl(46, 100%, 95%)",
-        border: "hsl(46, 100%, 80%)",
-        text: "hsl(46, 100%, 25%)",
+        bg: "bg-status-warning-bg",
+        border: "border-status-warning-border",
+        text: "text-status-warning-text",
       };
     case "failed":
     case "error":
       return {
-        bg: "hsl(8, 86%, 95%)",
-        border: "hsl(8, 86%, 80%)",
-        text: "hsl(8, 86%, 40%)",
+        bg: "bg-status-error-bg",
+        border: "border-status-error-border",
+        text: "text-status-error-text",
       };
     default:
       return {
-        bg: "hsl(0, 0%, 100%)",
-        border: "hsl(0, 0%, 85%)",
-        text: "hsl(330, 3%, 49%)",
+        bg: "bg-status-default-bg",
+        border: "border-status-default-border",
+        text: "text-status-default-text",
       };
   }
 };
 
-/**
- * Formats a USD cost value for display
- * @param cost - Cost in USD
- * @returns Formatted cost string (e.g., "$0.0013", "$1.25")
- */
-export const formatCostUSD = (cost: number): string => {
-  if (!Number.isFinite(cost)) {
-    return "N/A";
-  }
-  if (cost < 0.01) {
-    return `$${cost.toFixed(4)}`;
-  }
-  return `$${cost.toFixed(2)}`;
-};
-
 /**
  * Calculates dynamic thresholds for color coding based on score distribution
  * @param scores - Array of similarity scores
diff --git a/app/globals.css b/app/globals.css
index 1d67ebc..05165ab 100644
--- a/app/globals.css
+++ b/app/globals.css
@@ -49,6 +49,34 @@
   --color-status-warning: #f59e0b;
 }
 
+/* Status badge colors — success */
+@theme inline {
+  --color-status-success-bg: hsl(134, 61%, 95%);
+  --color-status-success-border: hsl(134, 61%, 70%);
+  --color-status-success-text: hsl(134, 61%, 25%);
+}
+
+/* Status badge colors — warning */
+@theme inline {
+  --color-status-warning-bg: hsl(46, 100%, 95%);
+  --color-status-warning-border: hsl(46, 100%, 80%);
+  --color-status-warning-text: hsl(46, 100%, 25%);
+}
+
+/* Status badge colors — error */
+@theme inline {
+  --color-status-error-bg: hsl(8, 86%, 95%);
+  --color-status-error-border: hsl(8, 86%, 80%);
+  --color-status-error-text: hsl(8, 86%, 40%);
+}
+
+/* Status badge colors — default */
+@theme inline {
+  --color-status-default-bg: hsl(0, 0%, 100%);
+  --color-status-default-border: hsl(0, 0%, 85%);
+  --color-status-default-text: hsl(330, 3%, 49%);
+}
+
 @media (prefers-color-scheme: dark) {
   :root {
     --background: #000000;
diff --git a/app/lib/types/evaluation.ts b/app/lib/types/evaluation.ts
new file mode 100644
index 0000000..8b01a15
--- /dev/null
+++ b/app/lib/types/evaluation.ts
@@ -0,0 +1,141 @@
+export interface TraceScore {
+  name: string;
+  value: number | string;
+  data_type: "NUMERIC" | "CATEGORICAL";
+  comment?: string;
+}
+
+export interface TraceItem {
+  trace_id: string;
+  question: string;
+  llm_answer: string;
+  ground_truth_answer: string;
+  scores: TraceScore[];
+}
+
+export interface GroupedTraceItem {
+  question_id: number;
+  question: string;
+  ground_truth_answer: string;
+  llm_answers: string[];
+  trace_ids: string[];
+  scores: TraceScore[][];
+}
+
+export interface IndividualScore {
+  trace_id: string;
+  input?: {
+    question: string;
+  };
+  output?: {
+    answer: string;
+  };
+  metadata?: {
+    ground_truth?: string;
+    item_id?: string;
+    response_id?: string;
+  };
+  trace_scores: TraceScore[];
+}
+
+export interface SummaryScore {
+  name: string;
+  avg?: number;
+  std?: number;
+  total_pairs: number;
+  data_type: "NUMERIC" | "CATEGORICAL";
+  distribution?: Record<string, number>; // For categorical data
+}
+
+export interface NewScoreObjectV2 {
+  summary_scores: SummaryScore[];
+  traces: TraceItem[] | GroupedTraceItem[];
+}
+
+export interface PerItemScore {
+  trace_id: string;
+  cosine_similarity: number;
+}
+
+export interface CosineSimilarity {
+  avg: number;
+  std: number;
+  total_pairs: number;
+  per_item_scores: PerItemScore[];
+}
+
+export interface LegacyScoreObject {
+  cosine_similarity: CosineSimilarity;
+}
+
+export interface BasicScoreObject {
+  summary_scores: SummaryScore[];
+}
+
+export type ScoreObject =
+  | NewScoreObjectV2
+  | BasicScoreObject
+  | LegacyScoreObject;
+
+export interface AssistantConfig {
+  name: string;
+  model: string;
+  knowledge_base_ids: string[];
+  project_id: number;
+  organization_id: number;
+  updated_at: string;
+  deleted_at: string | null;
+  instructions: string;
+  assistant_id: string;
+  temperature: number;
+  max_num_results: number;
+  id: number;
+  inserted_at: string;
+  is_deleted: boolean;
+}
+
+export interface EvalCostEntry {
+  model: string;
+  cost_usd: number;
+  input_tokens?: number;
+  output_tokens?: number;
+  prompt_tokens?: number;
+  total_tokens: number;
+}
+
+export interface EvalCost {
+  response?: EvalCostEntry;
+  embedding?: EvalCostEntry;
+  total_cost_usd: number;
+}
+
+export interface EvalJob {
+  id: number;
+  run_name: string;
+  dataset_name: string;
+  dataset_id: number;
+  batch_job_id: number;
+  embedding_batch_job_id: number | null;
+  status: string;
+  object_store_url: string | null;
+  total_items: number;
+  score?: ScoreObject | null;
+  scores?: ScoreObject | null; // Alternative field name
+  error_message: string | null;
+  config?: {
+    model?: string;
+    instructions?: string;
+    tools?: unknown[];
+    include?: string[];
+    temperature?: number;
+  };
+  config_id?: string;
+  config_version?: number;
+  model?: string;
+  assistant_id?: string;
+  organization_id: number;
+  project_id: number;
+  cost?: EvalCost | null;
+  inserted_at: string;
+  updated_at: string;
+}
diff --git a/app/lib/utils.ts b/app/lib/utils.ts
index 27c28c5..1ac04eb 100644
--- a/app/lib/utils.ts
+++ b/app/lib/utils.ts
@@ -10,6 +10,7 @@ import {
 import { SavedConfig, ConfigGroup } from "./types/configs";
 import { isGpt5Model } from "@/app/lib/models";
 import { STORAGE_KEYS } from "@/app/lib/constants";
+import { TraceScore } from "@/app/lib/types/evaluation";
 
 export function timeAgo(dateStr: string): string {
   const date =
@@ -193,3 +194,67 @@ export const sanitizeCSVCell = (
   }
   return `"${sanitized}"`;
 };
+
+export const formatScoreValue = (score: TraceScore | undefined) => {
+  if (!score) return { value: "N/A", color: "#737373", bg: "transparent" };
+
+  if (score.data_type === "CATEGORICAL") {
+    const catValue = String(score.value);
+    let color = "#171717";
+    let bg = "#fafafa";
+
+    if (catValue === "CORRECT") {
+      color = "#15803d";
+      bg = "#dcfce7";
+    } else if (catValue === "PARTIAL") {
+      color = "#92400e";
+      bg = "#fef3c7";
+    } else if (catValue === "INCORRECT") {
+      color = "#dc2626";
+      bg = "#fee2e2";
+    }
+
+    return { value: catValue, color, bg };
+  }
+
+  const numValue = Number(score.value);
+  const formattedValue = numValue.toFixed(2);
+  let color = "#171717";
+  let bg = "transparent";
+
+  if (numValue >= 0.7) {
+    color = "#15803d";
+    bg = "#dcfce7";
+  } else if (numValue >= 0.5) {
+    color = "#92400e";
+    bg = "#fef3c7";
+  } else {
+    color = "#dc2626";
+    bg = "#fee2e2";
+  }
+
+  return { value: formattedValue, color, bg };
+};
+
+export const getScoreByName = (
+  scores: TraceScore[],
+  name: string,
+): TraceScore | undefined => {
+  if (!scores || !Array.isArray(scores)) return undefined;
+  return scores.find((s) => s?.name === name);
+};
+
+/**
+ * Formats a USD cost value for display
+ * @param cost - Cost in USD
+ * @returns Formatted cost string (e.g., "$0.0013", "$1.25")
+ */
+export const formatCostUSD = (cost: number): string => {
+  if (!Number.isFinite(cost)) {
+    return "N/A";
+  }
+  if (cost < 0.01) {
+    return `$${cost.toFixed(4)}`;
+  }
+  return `$${cost.toFixed(2)}`;
+};
diff --git a/app/lib/utils/evaluation.ts b/app/lib/utils/evaluation.ts
new file mode 100644
index 0000000..441fe18
--- /dev/null
+++ b/app/lib/utils/evaluation.ts
@@ -0,0 +1,53 @@
+import type {
+  EvalJob,
+  GroupedTraceItem,
+  IndividualScore,
+  NewScoreObjectV2,
+  BasicScoreObject,
+  ScoreObject,
+  TraceItem,
+} from "@/app/lib/types/evaluation";
+
+export function hasSummaryScores(
+  score: ScoreObject | null | undefined,
+): score is NewScoreObjectV2 | BasicScoreObject {
+  if (!score) return false;
+  return "summary_scores" in score;
+}
+
+export function isNewScoreObjectV2(
+  score: ScoreObject | null | undefined,
+): score is NewScoreObjectV2 {
+  if (!score) return false;
+  return "summary_scores" in score && "traces" in score;
+}
+
+export function getScoreObject(job: EvalJob): ScoreObject | null {
+  return job.scores || job.score || null;
+}
+
+export function isGroupedFormat(
+  traces: TraceItem[] | GroupedTraceItem[],
+): traces is GroupedTraceItem[] {
+  if (!traces || traces.length === 0) return false;
+  return "llm_answers" in traces[0] && Array.isArray(traces[0].llm_answers);
+}
+
+export function normalizeToIndividualScores(
+  score: ScoreObject | null | undefined,
+): IndividualScore[] {
+  if (!score || !isNewScoreObjectV2(score)) return [];
+
+  return score.traces.map((trace: TraceItem | GroupedTraceItem) => {
+    if ("llm_answer" in trace) {
+      return {
+        trace_id: trace.trace_id,
+        input: { question: trace.question },
+        output: { answer: trace.llm_answer },
+        metadata: { ground_truth: trace.ground_truth_answer },
+        trace_scores: trace.scores,
+      };
+    }
+    return { trace_id: "", trace_scores: [] };
+  });
+}
diff --git a/app/page.tsx b/app/page.tsx
index 50724ca..060f549 100644
--- a/app/page.tsx
+++ b/app/page.tsx
@@ -7,7 +7,6 @@ import { RefreshIcon } from "@/app/components/icons";
 export default function Home() {
   const router = useRouter();
 
-  // Auto-redirect to evaluations page
   useEffect(() => {
     router.push("/evaluations");
   }, [router]);
diff --git a/instructions/CLAUDE.md b/instructions/CLAUDE.md
deleted file mode 100644
index dbe1573..0000000
--- a/instructions/CLAUDE.md
+++ /dev/null
@@ -1,327 +0,0 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Project Overview
-
-Kaapi Konsole is a Next.js 16 application by Tech4Dev for LLM development and evaluation. It provides:
-
-- LLM response evaluation against QnA datasets
-- Git-like version control for prompt templates
-- Configuration management with A/B testing
-- Dataset and API key management
-
-The application has evolved from a simple evaluation tool into a full-featured LLM development platform.
-
-## Technology Stack
-
-- **Framework**: Next.js 16.0.7 (App Router)
-- **React**: 19.2.0 (with hooks-based state management)
-- **Routing**: Next.js App Router + React Router DOM 7.9.5 (dual system)
-- **Styling**: Tailwind CSS 4.x + centralized color system in `/app/lib/colors.ts`
-- **TypeScript**: 5.x (strict mode disabled)
-- **Data Fetching**: SWR 2.3.6 (not widely used yet)
-- **Date/Time**: date-fns 4.1.0, date-fns-tz 3.2.0
-
-## Development Commands
-
-```bash
-# Start development server (http://localhost:3000)
-npm run dev
-
-# Build for production
-npm run build
-
-# Start production server
-npm start
-
-# Run linter
-npm run lint
-```
-
-## Application Architecture
-
-### Route Structure
-
-```
-/                              → Redirects to /evaluations
-/evaluations                   → Main eval interface (upload & results)
-/evaluations/[id]              → Detailed evaluation report
-/datasets                      → Dataset upload and management
-/keystore                      → API key management (localStorage-based)
-/configurations/prompt-editor  → Git-like prompt version control
-/test-evaluation               → Mock data testing page
-```
-
-**Coming Soon Routes** (placeholders):
-
-- `/model-testing`, `/speech-to-text`, `/text-to-speech`, `/guardrails`, `/redteaming`
-
-### Component Organization
-
-**Shared Components** (`/app/components/`):
-
-- `Sidebar.tsx` - Main navigation (240px collapsible)
-- `TabNavigation.tsx` - Reusable tab switcher
-- `ConfigModal.tsx` - Modal for viewing evaluation configs
-- `DetailedResultsTable.tsx` - Evaluation traces table
-- `ScoreDisplay.tsx`, `StatusBadge.tsx` - Display primitives
-- `types.ts` - Shared TypeScript interfaces
-- `utils.ts` - Date formatting, color utilities
-
-**Prompt Editor Components** (`/app/components/prompt-editor/`):
-
-- `Header.tsx` - Top nav with branch controls
-- `EditorView.tsx` - WYSIWYG prompt editor
-- `DiffView.tsx` - Side-by-side diff visualization
-- `HistorySidebar.tsx` - Commit history tree
-- `ConfigDrawer.tsx` - Right-side configuration drawer
-- `CurrentConfigTab.tsx`, `HistoryTab.tsx`, `ABTestTab.tsx` - Drawer tabs
-- `BranchModal.tsx`, `MergeModal.tsx` - Dialogs
-
-### State Management Pattern
-
-**No global state library** - uses React `useState` exclusively:
-
-- Component-level state with props drilling
-- LocalStorage for persistence (API keys, sidebar state)
-- No Context API or Redux/Zustand
-
-**LocalStorage Keys:**
-
-- `kaapi_api_keys` - API key storage
-- `sidebar-expanded-menus` - Sidebar expansion state
-
-### API Integration Architecture
-
-**Proxy Pattern**: All backend calls route through Next.js API handlers in `/app/api/`:
-
-```
-GET/POST /api/evaluations             → List/create eval jobs
-GET      /api/evaluations/[id]         → Get job details
-GET/POST /api/evaluations/datasets    → List/upload datasets
-GET      /api/evaluations/datasets/[dataset_id]
-GET      /api/assistant/[assistant_id] → Fetch assistant config
-```
-
-**Backend URL**: Configured via `BACKEND_URL` (default: `http://localhost:8000`)
-
-**Authentication**: Custom header `X-API-KEY` passed from client
-
-**Mock Data System**: Toggle via `USE_MOCK_DATA` flag in API routes. Mock files in `/public/mock-data/`.
-
-### Type System
-
-**Complex Type Hierarchies** in `/app/components/types.ts`:
-
-**Evaluation Types:**
-
-- `EvalJob` - Main evaluation job entity
-- `ScoreObject` - Union type supporting 3 formats:
-  - `NewScoreObjectV2` (with `traces[]` array)
-  - `NewScoreObject` (with `individual_scores[]`)
-  - `LegacyScoreObject` (old cosine similarity format)
-- `TraceItem` - Individual Q&A evaluation trace
-- `SummaryScore` - Aggregate metrics (NUMERIC/CATEGORICAL)
-
-**Type Guards**: `isNewScoreObjectV2()`, `isLegacyScoreObject()` for runtime type checking
-
-**Prompt Editor Types** in `/app/configurations/prompt-editor/types.ts`:
-
-- `Commit` - Git-like commit with branch/parent relationships
-- `Config` - LLM configuration blob with versioning
-- `Tool` - Vector store tool definition
-- `Variant` - A/B test variant configuration
-- `DiffLine` - Myers diff algorithm output
-
-### Styling System
-
-**Current Design**: Vercel-style minimalist black/white theme
-
-**Color Management**:
-
-- All colors defined in `/app/lib/colors.ts` as TypeScript object
-- Synchronized with CSS variables in `globals.css`
-- Dark mode support via `prefers-color-scheme` media query
-- See `COLOR_SCHEME.md` for quick preset options
-
-**Styling Approach**:
-
-1. Tailwind CSS for layout and spacing
-2. Inline styles for colors (referencing `colors` object)
-3. Hover states managed via React event handlers
-4. No custom Tailwind classes or extended theme
-
-**Color Palette**:
-
-```typescript
-bg: { primary: '#ffffff', secondary: '#fafafa' }
-text: { primary: '#171717', secondary: '#737373' }
-border: '#e5e5e5'
-accent: { primary: '#171717', hover: '#404040' }
-status: { success: '#16a34a', error: '#dc2626', warning: '#f59e0b' }
-```
-
-## Key Features
-
-### 1. LLM Evaluation Pipeline
-
-**Workflow**:
-
-1. Upload CSV with `question,answer` columns
-2. Configure experiment (model, instructions, vector stores)
-3. Backend creates evaluation job
-4. Job status polled every 10 seconds
-5. Results displayed with detailed metrics
-
-**Evaluation Modes**:
-
-- Config-based: Specify model, instructions, tools
-- Assistant-based: Use pre-configured assistant ID
-
-**Metrics Display**:
-
-- Summary scores (avg ± std for numeric, distribution for categorical)
-- Per-item traces with expandable Q&A pairs
-- Color-coded scores with dynamic thresholds
-- CSV export functionality
-
-### 2. Git-like Prompt Version Control
-
-**Core Concepts** (see `/configurations/prompt-editor/page.tsx`):
-
-- **Commits**: Versioned prompt snapshots with author/message/timestamp
-- **Branches**: Parallel development streams (e.g., main, experiment-v2)
-- **Diffs**: Myers algorithm for side-by-side change visualization
-- **Merges**: Branch integration with duplicate commit detection
-
-**Implementation Details**:
-
-- All commits stored in-memory (no backend persistence yet)
-- `createBranch()` preserves uncommitted changes when branching from HEAD
-- `switchBranch()` loads latest commit from target branch
-- `commitVersion()` creates new commit on current branch
-- `mergeBranch()` prevents duplicate merges
-
-**IMPORTANT**: When creating a new branch from current HEAD (not a specific historical commit), uncommitted changes in the editor must persist. This matches git behavior.
-
-### 3. Configuration Management & A/B Testing
-
-**Config Structure**:
-
-```javascript
-{
-  id: string,
-  name: string,
-  version: number,  // Auto-incremented per name
-  config_blob: {
-    completion: {
-      provider: 'openai' | 'anthropic' | 'google',
-      params: { model, instructions, temperature, tools[] }
-    }
-  }
-}
-```
-
-**Features**:
-
-- Multi-version configs (auto-incremented)
-- "Use Current Prompt" syncs from editor
-- History tab shows all saved configs
-- A/B testing with 2-4 variants
-- Simulated test runs (1.5s delay, random scores)
-
-See `CONFIG_AB.md` for complete feature specification.
-
-## Key Implementation Patterns
-
-### TypeScript Configuration
-
-- Path alias `@/*` maps to project root
-- Strict mode disabled (`strict: false`)
-- JSX uses `react-jsx` transform
-- Module resolution: `bundler`
-
-### Date/Time Handling
-
-- IST (Indian Standard Time) used throughout
-- Timezone offsets manually added to UTC dates
-- Format: `date-fns` with `date-fns-tz`
-
-### Component Patterns
-
-1. **Client-Side Components**: Most pages use `"use client"` for hooks and browser APIs
-2. **Props Drilling**: Deep component trees pass 10+ props (no Context API)
-3. **Inline Validation**: Error handling with alerts (no toast library)
-4. **Loading States**: Skeleton loaders with Tailwind pulse animation
-
-### Data Fetching
-
-- Direct `fetch()` calls (no axios/react-query)
-- SWR installed but minimally used
-- Polling intervals for job status (10s)
-- Mock data toggle for development
-
-## File Path Conventions
-
-- Use `@/` prefix for imports: `import Component from '@/app/components/Component'`
-- All application code in `/app/` (App Router structure)
-- Shared components: `/app/components/`
-- Feature components: `/app/components/[feature]/`
-- API routes: `/app/api/`
-- Utilities: `/app/lib/`
-
-## Development Workflow Guidelines
-
-1. **Styling**: Use centralized colors from `/app/lib/colors.ts`, not hardcoded hex values
-2. **State**: Keep state in component hierarchy, not global stores
-3. **Types**: Use shared types from `/app/components/types.ts` for evaluations
-4. **Colors**: Reference `colors` object for inline styles, Tailwind for layout
-5. **API Calls**: Route through `/app/api/` handlers, not direct backend calls
-6. **Date Formatting**: Use `formatDateTime()` from `/app/components/utils.ts`
-
-## Backend Integration
-
-**Environment Variables**:
-
-```bash
-BACKEND_URL=http://localhost:8000  # Backend API base URL
-```
-
-**Authentication**:
-
-- API keys stored in localStorage
-- Passed via `X-API-KEY` header
-- No JWT/OAuth implementation
-
-**Dataset Upload**:
-
-- CSV format: `question,expected_answer` columns
-- Duplication factor supported (1-10)
-- Backend handles file processing
-
-## Technical Debt & Known Patterns
-
-1. **Dual Routing**: Next.js App Router + React Router DOM coexist (avoid confusion)
-2. **Prop Drilling**: Consider Context API for deeply nested props
-3. **Magic Strings**: Status values, localStorage keys hardcoded
-4. **Mixed Styling**: Tailwind + inline styles + CSS modules (prefer consistency)
-5. **No Testing**: No test files exist (add tests for critical paths)
-6. **Large Files**: Some components exceed 1000 lines (consider splitting)
-7. **Type Safety**: Strict mode disabled (many `any` types exist)
-
-## Important Notes
-
-1. **React 19**: Uses bleeding-edge React version (expect occasional breaking changes)
-2. **LocalStorage**: API keys stored client-side (not production-ready for sensitive data)
-3. **Mock Data**: Production code includes mock system (toggle via flags)
-4. **IST Timezone**: All timestamps assume Indian Standard Time
-5. **No Testing**: No test infrastructure exists yet
-6. **Component Location**: Check both `/app/components/` and feature folders for components
-
-## Documentation Files
-
-- `/CLAUDE.md` - This file (architectural guidance)
-- `/COLOR_SCHEME.md` - Quick color preset guide
-- `/CONFIG_AB.md` - A/B testing feature specification
-- `/README.md` - Standard Next.js boilerplate
diff --git a/instructions/COLOR_SCHEME.md b/instructions/COLOR_SCHEME.md
deleted file mode 100644
index 616d9d2..0000000
--- a/instructions/COLOR_SCHEME.md
+++ /dev/null
@@ -1,65 +0,0 @@
-# Color Scheme Configuration
-
-This app uses a centralized color configuration for easy experimentation.
-
-## Configuration File
-
-Edit `/app/lib/colors.ts` to change the entire app's color scheme.
-
-## Current Colors
-
-```typescript
-{
-  bg: {
-    primary: '#ffffff',    // Main background (white)
-    secondary: '#fafafa',  // Secondary background (light gray)
-  },
-  text: {
-    primary: '#171717',    // Main text (near black)
-    secondary: '#737373',  // Muted text (gray)
-  },
-  border: '#e5e5e5',       // All borders
-  accent: {
-    primary: '#0070f3',    // Primary buttons, links, active states (Vercel blue)
-    hover: '#0761d1',      // Hover state for accent
-  },
-  status: {
-    success: '#16a34a',    // Success states (green)
-    error: '#dc2626',      // Error states (red)
-    warning: '#f59e0b',    // Warning states (orange)
-  }
-}
-```
-
-## Quick Color Scheme Presets
-
-### Vercel Style (Current)
-
-- Accent: `#0070f3` (blue)
-
-### Linear Style
-
-- Accent: `#5E6AD2` (purple-blue)
-- Update `colors.accent.primary` to `#5E6AD2`
-- Update `colors.accent.hover` to `#4F5CC0`
-
-### GitHub Style
-
-- Accent: `#2DA44E` (green)
-- Update `colors.accent.primary` to `#2DA44E`
-- Update `colors.accent.hover` to `#238636`
-
-### Minimal Black
-
-- Accent: `#171717` (black)
-- Update `colors.accent.primary` to `#171717`
-- Update `colors.accent.hover` to `#404040`
-
-## How to Change
-
-1. Open `/app/lib/colors.ts`
-2. Modify the color values
-3. Save the file
-4. Refresh your browser
-
-That's it! All components use these centralized values.
diff --git a/instructions/CONFIG_AB.md b/instructions/CONFIG_AB.md
deleted file mode 100644
index 8b16150..0000000
--- a/instructions/CONFIG_AB.md
+++ /dev/null
@@ -1,277 +0,0 @@
-I need to implement a configuration drawer and A/B testing feature for a prompt version control system. Here's what needs to be built:
-
-## Context
-
-We have a React-based version control system for prompt templates (similar to Git). Users can commit prompts, create branches, view diffs, and merge. Now we need to add configuration management and A/B testing.
-
-## Requirements
-
-### 1. Configuration Drawer (Right Side, 420px width)
-
-**Trigger:** Floating Action Button (FAB) - "⚙️" icon, bottom-right corner, 56x56px circle, blue background
-
-**Drawer Structure:**
-
-- Slides in from right when FAB clicked
-- 3 tabs: "Current" | "History" | "A/B Test"
-- Close button (X) top-right
-- Box shadow for depth
-
-### 2. Current Config Tab
-
-**Fields (top to bottom):**
-
-1. **Config Name Selector**
-   - Dropdown to select existing configs (shows: "Name (vX)")
-   - "+ New" button next to it
-   - If New clicked: show text input for new config name
-
-2. **Provider Dropdown**
-   - Options: OpenAI, Anthropic, Google
-   - Default: openai
-
-3. **Model Dropdown**
-   - Options: gpt-4o-mini, gpt-4o, gpt-4-turbo, gpt-3.5-turbo
-   - Default: gpt-4o-mini
-
-4. **Instructions Section**
-   - Label: "Instructions"
-   - Button: "Use Current Prompt" (copies from main editor)
-   - Textarea: multiline, monospace font, 120px min-height
-
-5. **Temperature Slider**
-   - Label: "Temperature: {value}"
-   - Range: 0 to 1, step 0.1
-   - Labels below: "Focused (0)" | "Balanced (0.5)" | "Creative (1)"
-
-6. **Tools Section**
-   - Label: "Tools" with "+ Add Tool" button
-   - Each tool shows:
-     - Type: File Search (hardcoded for now)
-     - Input: Vector Store ID
-     - Input: Max Results (number)
-     - Remove button
-
-7. **Commit Message**
-   - Optional text input
-   - Placeholder: "Describe this configuration..."
-
-8. **Save Button**
-   - Full width, green (#2da44e)
-   - Text: "Save Configuration"
-
-**Data Structure for Config:**
-
-```javascript
-{
-  id: 'cfg1',
-  name: 'Main Config',
-  version: 1,
-  timestamp: Date.now(),
-  config_blob: {
-    completion: {
-      provider: 'openai',
-      params: {
-        model: 'gpt-4o-mini',
-        instructions: '...',
-        temperature: 0.7,
-        tools: [
-          {
-            type: 'file_search',
-            knowledge_base_ids: ['vs_abc123'],
-            max_num_results: 20
-          }
-        ]
-      }
-    }
-  },
-  commitMessage: 'Optional message'
-}
-```
-
-### 3. History Tab
-
-**Display:**
-
-- List of all saved configs (reverse chronological)
-- Each card shows:
-  - Config name (vX)
-  - Model • temp: X
-  - Timestamp (formatted like "2h ago", "3d ago")
-  - Commit message (if exists, italicized)
-- Click card to load that config into Current tab
-- Active config highlighted
-
-### 4. A/B Test Tab
-
-**Variant Configuration:**
-
-- Show 2 variants by default (A and B)
-- Each variant card contains:
-  - Header: "Variant A/B/C/D"
-  - Config dropdown: Select from saved configs
-  - Prompt dropdown: Select from commit history (show: "#ID: message (branch)")
-  - Preview box (readonly): Shows model, temp, first line of prompt
-- "+ Add Variant" button (max 4 variants)
-
-**Test Input Section:**
-
-- Label: "Test Input"
-- Textarea for test prompt
-
-**Run Test Button:**
-
-- Full width, green
-- Text: "▶ Run Test"
-- Disabled if no test input
-
-**Results Section (appears after running):**
-
-- Card for each variant showing:
-  - Variant name
-  - Score (0.00-1.00 format)
-  - Config name • Commit message
-  - Latency in ms
-- Highlight best performer with "🏆 Best: Variant X" in green box
-
-**Test Simulation:**
-
-```javascript
-// For PoC, simulate API call:
-await new Promise((resolve) => setTimeout(resolve, 1500));
-const score = 0.7 + Math.random() * 0.25;
-const latency = 200 + Math.random() * 400;
-```
-
-### 5. State Management
-
-**New State Variables Needed:**
-
-```javascript
-// Drawer
-const [drawerOpen, setDrawerOpen] = useState(false);
-const [drawerTab, setDrawerTab] = useState("config");
-
-// Configs
-const [configs, setConfigs] = useState([]);
-const [selectedConfigId, setSelectedConfigId] = useState("");
-const [configName, setConfigName] = useState("");
-const [provider, setProvider] = useState("openai");
-const [model, setModel] = useState("gpt-4o-mini");
-const [instructions, setInstructions] = useState("");
-const [temperature, setTemperature] = useState(0.7);
-const [tools, setTools] = useState([]);
-const [configCommitMsg, setConfigCommitMsg] = useState("");
-
-// A/B Testing
-const [variants, setVariants] = useState([
-  { id: "A", configId: "", commitId: "", name: "Variant A" },
-  { id: "B", configId: "", commitId: "", name: "Variant B" },
-]);
-const [testInput, setTestInput] = useState("");
-const [testResults, setTestResults] = useState(null);
-const [isRunningTest, setIsRunningTest] = useState(false);
-```
-
-### 6. Key Functions to Implement
-
-```javascript
-// Save new config version
-const saveConfig = () => {
-  // Validate config name exists
-  // Create new config object with incremented version
-  // Add to configs array
-  // Show success alert
-};
-
-// Load existing config
-const loadConfig = (configId) => {
-  // Find config by ID
-  // Populate all form fields
-  // Set as selected config
-};
-
-// Add/remove/update tools
-const addTool = () => {
-  /* Add empty tool */
-};
-const removeTool = (index) => {
-  /* Remove by index */
-};
-const updateTool = (index, field, value) => {
-  /* Update specific field */
-};
-
-// Run A/B test
-const runABTest = async () => {
-  // Validate test input exists
-  // Set loading state
-  // Simulate API calls (1.5s delay)
-  // Generate mock scores and latencies
-  // Display results
-};
-
-// Manage variants
-const addVariant = () => {
-  /* Max 4 variants */
-};
-const updateVariant = (index, field, value) => {
-  /* Update variant config */
-};
-```
-
-### 7. UI/UX Details
-
-**Colors:**
-
-Use the current black-and-white color scheme, and make sure new styles do not diverge from the existing design system.
-
-**Spacing:**
-
-- Drawer padding: 20px
-- Section spacing: 16px bottom margin
-- Input padding: 8px
-- Label font: 12px, weight 600
-
-**Interactions:**
-
-- FAB hover: scale(1.1) transform
-- Drawer animation: slide in from right (can use conditional render for MVP)
-- Close drawer on: X button click, overlay click (optional)
-
-### 8. Integration Points
-
-**With Existing System:**
-
-- Access `currentContent` from main editor for "Use Current Prompt"
-- Access `commits` array for A/B test prompt selection
-- Add "▶ Run A/B Test" button in header (opens drawer to A/B tab)
-
-### 9. Starting Point
-
-If you have the existing version control code, add:
-
-1. FAB button positioned fixed bottom-right
-2. Conditional render of drawer when `drawerOpen === true`
-3. Tab switching logic
-4. Form fields with controlled inputs
-5. A/B test variant management
-
-The drawer should NOT affect the existing version control tree, editor, or diff views. It's purely additive.
-
-## File Structure
-
-- Single React component (or can split into sub-components)
-- Keep all state in parent component for MVP
-- No external dependencies beyond React
-
-## Success Criteria
-
-✅ FAB opens/closes drawer
-✅ Can create and save configs with all fields
-✅ Can load previous configs from history
-✅ Can set up 2-4 A/B test variants
-✅ Can run test and see simulated results
-✅ Results show winner clearly
-✅ "Use Current Prompt" syncs editor content
-✅ UI is clean and uncluttered
diff --git a/instructions/CONFIG_API.md b/instructions/CONFIG_API.md
deleted file mode 100644
index c9ade1a..0000000
--- a/instructions/CONFIG_API.md
+++ /dev/null
@@ -1,215 +0,0 @@
-# Config Management API Integration Instructions
-
-## Overview
-
-Integrate the Config Management APIs into an existing Next.js UI. The API manages LLM configurations with version control (similar to git commits for config changes).
-
-## Base URL & Auth
-
-- Base: `/api/v1/configs`
-- Auth: Bearer token via `Authorization` header OR API key via `X-API-KEY` header
-
----
-
-## API Endpoints
-
-### 1. Configs (Parent Entity)
-
-#### List Configs
-
-```
-GET /api/v1/configs/
-Query: skip (default 0), limit (default 100, max 100)
-Response: { success: boolean, data: ConfigPublic[], error?: string }
-```
-
-#### Create Config
-
-```
-POST /api/v1/configs/
-Body: ConfigCreate
-Response 201: { success: boolean, data: ConfigWithVersion }
-```
-
-#### Get Config
-
-```
-GET /api/v1/configs/{config_id}
-Response: { success: boolean, data: ConfigPublic }
-```
-
-#### Update Config (metadata only)
-
-```
-PATCH /api/v1/configs/{config_id}
-Body: ConfigUpdate
-Response: { success: boolean, data: ConfigPublic }
-```
-
-#### Delete Config
-
-```
-DELETE /api/v1/configs/{config_id}
-Response: { success: boolean, data: { message: string } }
-```
-
-### 2. Config Versions (Child Entity)
-
-#### List Versions
-
-```
-GET /api/v1/configs/{config_id}/versions
-Query: skip, limit
-Response: { success: boolean, data: ConfigVersionItems[] }
-```
-
-#### Create Version
-
-```
-POST /api/v1/configs/{config_id}/versions
-Body: ConfigVersionCreate
-Response 201: { success: boolean, data: ConfigVersionPublic }
-```
-
-#### Get Specific Version
-
-```
-GET /api/v1/configs/{config_id}/versions/{version_number}
-Response: { success: boolean, data: ConfigVersionPublic }
-```
-
-#### Delete Version
-
-```
-DELETE /api/v1/configs/{config_id}/versions/{version_number}
-Response: { success: boolean, data: { message: string } }
-```
-
----
-
-## TypeScript Types
-
-```typescript
-// Request Types
-interface ConfigCreate {
-  name: string; // 1-128 chars, unique per project
-  description?: string | null; // max 512 chars
-  config_blob: ConfigBlob;
-  commit_message?: string | null; // max 512 chars
-}
-
-interface ConfigUpdate {
-  name?: string | null; // 1-128 chars
-  description?: string | null; // max 512 chars
-}
-
-interface ConfigVersionCreate {
-  config_blob: ConfigBlob;
-  commit_message?: string | null; // max 512 chars
-}
-
-interface ConfigBlob {
-  completion: CompletionConfig;
-}
-
-interface CompletionConfig {
-  provider: "openai"; // currently only "openai"
-  params: Record<string, any>; // provider-specific params (model, temperature, etc.)
-}
-
-// Response Types
-interface ConfigPublic {
-  id: string; // UUID
-  name: string;
-  description: string | null;
-  project_id: number;
-  inserted_at: string; // ISO datetime
-  updated_at: string; // ISO datetime
-}
-
-interface ConfigWithVersion extends ConfigPublic {
-  version: ConfigVersionPublic;
-}
-
-interface ConfigVersionPublic {
-  id: string; // UUID
-  config_id: string; // UUID
-  version: number; // starts at 1, auto-increments
-  config_blob: Record<string, any>;
-  commit_message: string | null;
-  inserted_at: string;
-  updated_at: string;
-}
-
-interface ConfigVersionItems {
-  id: string; // UUID
-  config_id: string; // UUID
-  version: number;
-  commit_message: string | null;
-  inserted_at: string;
-  updated_at: string;
-  // Note: config_blob excluded for list performance
-}
-
-interface APIResponse<T> {
-  success: boolean;
-  data: T | null;
-  error?: string | null;
-  metadata?: Record<string, any> | null;
-}
-```
-
----
-
-## Example config_blob
-
-```json
-{
-  "completion": {
-    "provider": "openai",
-    "params": {
-      "model": "gpt-4o-mini",
-      "instructions": "You are a helpful assistant...",
-      "temperature": 1,
-      "tools": [
-        {
-          "type": "file_search",
-          "knowledge_base_ids": ["vs_692d71f3f5708191b1c46525f3c1e196"],
-          "max_num_results": 20
-        }
-      ]
-    }
-  }
-}
-```
-
----
-
-## UI Implementation Notes
-
-1. **Config List View**: Display name, description, updated_at. Click to view versions.
-
-2. **Config Create Form**:
-   - name (required, unique)
-   - description (optional)
-   - config_blob JSON editor or structured form
-   - commit_message (optional, for initial version)
-
-3. **Version History View**:
-   - Show versions in descending order (newest first)
-   - Display version number, commit_message, timestamps
-   - Click version to view full config_blob
-
-4. **Create New Version**:
-   - Load current version's config_blob as starting point
-   - Allow editing config_blob
-   - Add commit_message to describe changes
-   - Auto-increments version number
-
-5. **Diff View** (optional enhancement):
-   - Compare config_blob between versions
-   - Highlight changes
-
-6. **Error Handling**:
-   - 422: Validation errors (check response.error)
-   - Duplicate name error when creating config
diff --git a/instructions/TESTING_MOCK_DATA.md b/instructions/TESTING_MOCK_DATA.md
deleted file mode 100644
index f62f758..0000000
--- a/instructions/TESTING_MOCK_DATA.md
+++ /dev/null
@@ -1,222 +0,0 @@
-# Testing with Mock Evaluation Data
-
-This guide explains how to test the new evaluation report UI with mock data.
-
-## Quick Start
-
-### Option 1: Using the Test Page (Easiest)
-
-1. Start the development server:
-
-   ```bash
-   npm run dev
-   ```
-
-2. Navigate to: **http://localhost:3000/test-evaluation**
-
-3. Click on either evaluation card to view the mock data
-
-### Option 2: Direct URL Access
-
-Navigate directly to the evaluation detail pages:
-
-- **Evaluation #43 (Hindi)**: http://localhost:3000/evaluations/43
-- **Evaluation #44 (English)**: http://localhost:3000/evaluations/44
-
-## Mock Data Files
-
-Located in `/public/mock-data/`:
-
-### `evaluation-sample-1.json` (ID: 43)
-
-- **Language**: Hindi
-- **Items**: 4 Q&A pairs
-- **Scores**:
-  - cosine_similarity (NUMERIC)
-  - SNEHA correctness (NUMERIC)
-  - llm_judge_relevance (NUMERIC)
-  - response_category (CATEGORICAL)
-- **Features**: Mix of CORRECT, PARTIAL, and INCORRECT responses
-
-### `evaluation-sample-2.json` (ID: 44)
-
-- **Language**: English
-- **Items**: 3 Q&A pairs
-- **Scores**: Same as above
-- **Features**: Higher average scores, includes assistant config
-- **Special**: 2 CORRECT, 1 PARTIAL (no INCORRECT)
-
-## What to Test
-
-### 1. Table View
-
-- ✅ Question, Answer, Ground Truth columns display properly
-- ✅ All score columns appear dynamically
-- ✅ Long text truncates with expand/collapse (details/summary)
-- ✅ Score values are color-coded (green/yellow/red)
-- ✅ Comments appear below scores
-- ✅ No trace IDs visible (as requested)
-- ✅ Row hover effects work
-
-### 2. Metrics Overview
-
-- ✅ All NUMERIC metrics show avg ± std
-- ✅ CATEGORICAL metrics show distribution
-- ✅ Responsive grid layout
-- ✅ Proper formatting (3 decimal places for scores)
-
-### 3. CSV Export
-
-- ✅ Click "Export CSV" button
-- ✅ File downloads with all columns
-- ✅ Q&A pairs and scores included
-- ✅ Proper CSV escaping
-
-### 4. Navigation
-
-- ✅ Back button returns to /evaluations?tab=results
-- ✅ View Config button opens modal
-- ✅ Sidebar navigation works
-
-### 5. Assistant Info
-
-- ✅ Evaluation #44 shows assistant badge
-- ✅ Evaluation #43 shows no assistant
-
-## Switching Between Mock and Real Data
-
-### Enable Mock Data (Default)
-
-In `/app/api/evaluations/[id]/route.ts`:
-
-```typescript
-const USE_MOCK_DATA = true;
-```
-
-### Disable Mock Data (Use Real Backend)
-
-```typescript
-const USE_MOCK_DATA = false;
-```
-
-**Note**: After changing this, restart your dev server.
-
-## ID Mapping
-
-The mock API maps IDs to files:
-
-- **ID 43, 1, or any other number** → `evaluation-sample-1.json`
-- **ID 44 or 2** → `evaluation-sample-2.json`
-
-You can modify this mapping in `/app/api/evaluations/[id]/route.ts`
-
-## Adding More Mock Data
-
-1. Create a new JSON file in `/public/mock-data/`
-2. Follow the structure in existing samples
-3. Update the ID mapping in the API route:
-
-```typescript
-let mockFileName = "evaluation-sample-1.json";
-if (id === "44" || id === "2") {
-  mockFileName = "evaluation-sample-2.json";
-} else if (id === "45") {
-  mockFileName = "your-new-file.json"; // Add your mapping
-}
-```
-
-## Expected Response Structure
-
-The mock data follows this structure:
-
-```json
-{
-  "id": 43,
-  "run_name": "...",
-  "dataset_name": "...",
-  "status": "completed",
-  "total_items": 4,
-  "scores": {
-    "summary_scores": [
-      {
-        "name": "cosine_similarity",
-        "avg": 0.453,
-        "std": 0.06,
-        "total_pairs": 4,
-        "data_type": "NUMERIC"
-      },
-      {
-        "name": "response_category",
-        "distribution": { "CORRECT": 1, "PARTIAL": 2, "INCORRECT": 1 },
-        "total_pairs": 4,
-        "data_type": "CATEGORICAL"
-      }
-    ],
-    "individual_scores": [
-      {
-        "trace_id": "...",
-        "input": { "question": "..." },
-        "output": { "answer": "..." },
-        "metadata": { "ground_truth": "..." },
-        "trace_scores": [
-          {
-            "name": "cosine_similarity",
-            "value": 0.452,
-            "data_type": "NUMERIC"
-          },
-          {
-            "name": "response_category",
-            "value": "INCORRECT",
-            "data_type": "CATEGORICAL"
-          }
-        ]
-      }
-    ]
-  }
-}
-```
-
-## Troubleshooting
-
-### Mock data not loading
-
-- Check console for `[MOCK MODE]` logs
-- Verify files exist in `/public/mock-data/`
-- Ensure `USE_MOCK_DATA = true`
-
-### Table not showing
-
-- Check browser console for errors
-- Verify `scores.individual_scores` exists in JSON
-- Check that all required fields are present
-
-### Scores not color-coded
-
-- Verify `data_type` is set correctly
-- Check that NUMERIC values are numbers, not strings
-- Ensure CATEGORICAL values match expected values
-
-## Production Deployment
-
-**IMPORTANT**: Before deploying to production:
-
-1. Set `USE_MOCK_DATA = false` in `/app/api/evaluations/[id]/route.ts`
-2. Delete or hide `/app/test-evaluation/page.tsx` (optional)
-3. Test with real backend to ensure everything works
-
-## Next Steps
-
-After testing with mock data and confirming the UI works:
-
-1. Update the backend API to return the new structure
-2. Set `USE_MOCK_DATA = false`
-3. Test with real evaluation data
-4. Deploy to production
-
----
-
-**Need Help?** Check the implementation files:
-
-- Type definitions: `/app/components/types.ts`
-- Table component: `/app/components/DetailedResultsTable.tsx`
-- Detail page: `/app/evaluations/[id]/page.tsx`
diff --git a/instructions/VERCEL_DESIGN_SYSTEM.md b/instructions/VERCEL_DESIGN_SYSTEM.md
deleted file mode 100644
index b1c1ba9..0000000
--- a/instructions/VERCEL_DESIGN_SYSTEM.md
+++ /dev/null
@@ -1,708 +0,0 @@
-# Vercel/shadcn Design System Aesthetics
-
-A comprehensive guide to reproducing the minimalist, modern design aesthetic inspired by Vercel and shadcn/ui.
-
-## Philosophy
-
-**Minimalism First**: Every element serves a purpose. No decorative flourishes, no unnecessary effects. The design is invisible until it needs to be visible.
-
-**Subtle Interactions**: Transitions are quick (0.15-0.2s) and purposeful. Hover states provide immediate feedback without being distracting.
-
-**Hierarchy Through Restraint**: Visual hierarchy comes from careful use of weight, spacing, and subtle color variations—not bold colors or heavy effects.
-
----
-
-## Color Palette
-
-### Core Colors
-
-**Light Mode**
-
-```
-Backgrounds:
-- Primary:   #ffffff (pure white)
-- Secondary: #fafafa (barely-there gray)
-
-Text:
-- Primary:   #171717 (near-black, not pure black)
-- Secondary: #737373 (muted gray for less important text)
-
-Borders:
-- Standard: #e5e5e5 (very light gray, barely visible)
-
-Accent:
-- Primary: #171717 (same as text primary—unified system)
-- Hover:   #404040 (slightly lighter on hover)
-```
-
-**Dark Mode**
-
-```
-Backgrounds:
-- Primary:   #000000 (pure black)
-- Secondary: #0a0a0a (barely-there lighter)
-
-Text:
-- Primary:   #ededed (off-white)
-- Secondary: #a1a1a1 (muted gray)
-
-Borders:
-- Standard: #262626 (subtle dark gray)
-```
-
-### Semantic Colors
-
-Used sparingly for status and feedback:
-
-```
-Success: #16a34a (green-600)
-Error:   #dc2626 (red-600)
-Warning: #f59e0b (amber-500)
-```
-
-### Color Usage Rules
-
-1. **Never use pure black (#000) for text** in light mode—use #171717 instead
-2. **Borders should be barely visible**—#e5e5e5 is the standard
-3. **Background variations are subtle**—primary (#fff) vs secondary (#fafafa)
-4. **Accent colors match text colors**—creates unified, cohesive system
-5. **Status colors only appear when needed**—success/error states
-
----
-
-## Typography
-
-### Font Stack
-
-- **Sans-serif**: System font stack or Geist Sans (Vercel's font)
-- **Monospace**: Geist Mono for code
-
-### Text Sizing
-
-```
-Extra Small:  10px (badges, labels)
-Small:        12px (secondary UI, submenus)
-Base:         14px (primary UI, body text)
-Medium:       16px (headings, emphasized text)
-Large:        20px+ (page titles, hero text)
-```
-
-### Font Weights
-
-```
-Regular: 400 (default text)
-Medium:  500 (interactive elements, subheadings)
-Semibold: 600 (active states, emphasis)
-```
-
-### Typography Rules
-
-1. **Use font weight for hierarchy**, not size differences
-2. **Active/selected states use weight 500-600**
-3. **Secondary text uses lighter weight AND color**
-4. **Letter spacing**: -0.01em for headings (tight tracking)
-5. **Line height**: Tight for UI (1.2-1.4), comfortable for body (1.5-1.6)
-
----
-
-## Spacing System
-
-### Scale (based on 4px grid)
-
-```
-0.5 → 2px   (tight gaps)
-1   → 4px   (minimal spacing)
-1.5 → 6px   (small gaps)
-2   → 8px   (standard small)
-2.5 → 10px  (compact spacing)
-3   → 12px  (standard medium)
-4   → 16px  (comfortable spacing)
-5   → 20px  (generous spacing)
-6   → 24px  (section spacing)
-```
-
-### Padding Patterns
-
-```
-Buttons:     px-3 py-2 (12px × 8px)
-Inputs:      px-3 py-2 (12px × 8px)
-Cards:       p-4 to p-6 (16px-24px)
-Containers:  px-6 py-6 (24px all sides)
-Sections:    py-8 to py-12 (32px-48px vertical)
-```
-
-### Margin Patterns
-
-```
-Between elements: 8-12px (space-y-2 to space-y-3)
-Between sections: 24-32px (my-6 to my-8)
-Page margins:     24px minimum (px-6)
-```
-
----
-
-## Components
-
-### Buttons
-
-**Primary Button**
-
-```
-Background:  #171717
-Text:        #ffffff
-Padding:     12px 16px
-Border:      none
-Radius:      6px
-Font:        14px, weight 500
-Transition:  all 0.2s ease
-
-Hover:
-- Background: #404040
-- No scale/shadow effects
-
-Disabled:
-- Background: #e5e5e5
-- Text: #a1a1a1
-- Cursor: not-allowed
-```
-
-**Secondary Button**
-
-```
-Background:  transparent
-Text:        #171717
-Border:      1px solid #e5e5e5
-Padding:     12px 16px
-Radius:      6px
-Font:        14px, weight 500
-
-Hover:
-- Background: #fafafa
-- Border: #d4d4d4
-```
-
-**Ghost Button**
-
-```
-Background:  transparent
-Text:        #737373
-Border:      none
-Padding:     8px 12px
-
-Hover:
-- Text: #171717
-- Background: #fafafa
-```
-
-### Input Fields
-
-```
-Background:  #ffffff
-Border:      1px solid #e5e5e5
-Padding:     12px
-Radius:      6px
-Font:        14px
-Text:        #171717
-
-Focus:
-- Border: #171717
-- No glow/shadow
-- Outline: none (use border instead)
-
-Placeholder:
-- Color: #a1a1a1
-- Font style: normal (not italic)
-```
-
-### Cards
-
-```
-Background:  #ffffff
-Border:      1px solid #e5e5e5
-Radius:      8px
-Padding:     16-24px
-Shadow:      none (or very subtle: 0 1px 2px rgba(0,0,0,0.05))
-
-Hover (if interactive):
-- Border: #d4d4d4
-- No shadow increase
-```
-
-### Navigation Items
-
-**Sidebar Item**
-
-```
-Default:
-- Background: transparent
-- Text: #737373
-- Font weight: 400-500
-- Padding: 8px 12px
-- Radius: 6px
-
-Hover:
-- Background: #ffffff (or primary bg)
-- Text: #171717
-
-Active:
-- Background: #ffffff
-- Text: #171717
-- Font weight: 600
-- Border: 1px solid #e5e5e5
-```
-
-**Tab Navigation**
-
-```
-Default:
-- Border bottom: 2px transparent
-- Text: #737373
-- Font weight: 400
-- Padding: 12px 16px
-
-Active:
-- Border bottom: 2px #171717
-- Text: #171717
-- Font weight: 500
-```
-
-### Badges/Pills
-
-```
-Background:  #fafafa
-Text:        #171717
-Padding:     4px 8px
-Radius:      4px (or 9999px for fully rounded pills)
-Font:        11-12px
-Font weight: 500
-
-Status Variants:
-- Success: bg #dcfce7, text #15803d
-- Error:   bg #fee2e2, text #dc2626
-- Warning: bg #fef3c7, text #92400e
-```
-
-### Modals/Dialogs
-
-```
-Backdrop:
-- Background: rgba(0, 0, 0, 0.4)
-- Animation: fade in 0.2s
-
-Container:
-- Background: #ffffff
-- Border: 1px solid #e5e5e5
-- Radius: 12px
-- Padding: 24px
-- Max width: 500px
-- Shadow: 0 4px 12px rgba(0, 0, 0, 0.1)
-- Animation: fade + scale (0.95 → 1.0) 0.3s
-
-Close button:
-- Position: top-right
-- Size: 32px
-- Icon: X mark
-- Color: #737373
-- Hover: #171717
-```
-
-### Tables
-
-```
-Container:
-- Border: 1px solid #e5e5e5
-- Radius: 8px
-- Overflow: hidden
-
-Header:
-- Background: #fafafa
-- Text: #171717
-- Font weight: 600
-- Padding: 12px 16px
-- Border bottom: 1px solid #e5e5e5
-
-Row:
-- Background: #ffffff
-- Border bottom: 1px solid #e5e5e5
-- Padding: 12px 16px
-
-Row Hover:
-- Background: #fafafa
-
-Last row:
-- No border bottom
-```
-
----
-
-## Layout Patterns
-
-### Sidebar Navigation
-
-```
-Width:       240px
-Background:  #fafafa
-Border:      1px solid #e5e5e5 (right)
-Height:      100vh
-Flex:        column
-
-Collapse:
-- Width: 0px
-- Overflow: hidden
-- Transition: 0.3s ease
-```
-
-### Page Container
-
-```
-Max width:   1280px (or 100% for full-width)
-Padding:     24px
-Margin:      0 auto
-```
-
-### Content Sections
-
-```
-Background:  #ffffff
-Border:      1px solid #e5e5e5
-Radius:      8px
-Padding:     24px
-Margin:      16px 0
-```
-
----
-
-## Animation & Transitions
-
-### Timing Functions
-
-```
-Standard: ease-in-out
-Quick:    ease (for micro-interactions)
-Entry:    ease-out
-Exit:     ease-in
-```
-
-### Duration Scale
-
-```
-Instant:   50ms  (color changes)
-Quick:     150ms (hover states, text color)
-Standard:  200ms (backgrounds, borders)
-Medium:    300ms (modals, drawers)
-Slow:      500ms (layout changes)
-```
-
-### Common Animations
-
-**Fade In**
-
-```css
-@keyframes fadeIn {
-  from {
-    opacity: 0;
-    transform: translateY(-4px);
-  }
-  to {
-    opacity: 1;
-    transform: translateY(0);
-  }
-}
-/* Apply with: animation: fadeIn 0.2s ease-out; */
-```
-
-**Modal Entry**
-
-```css
-@keyframes modalSlideUp {
-  from {
-    opacity: 0;
-    transform: translateY(20px) scale(0.95);
-  }
-  to {
-    opacity: 1;
-    transform: translateY(0) scale(1);
-  }
-}
-/* Apply with: animation: modalSlideUp 0.3s ease-out; */
-```
-
-**Page Transition**
-
-```css
-@keyframes pageIn {
-  from {
-    opacity: 0;
-    transform: translateY(8px);
-  }
-  to {
-    opacity: 1;
-    transform: translateY(0);
-  }
-}
-/* Apply with: animation: pageIn 0.3s ease-out; */
-```
-
-### Animation Rules
-
-1. **Hover transitions are 150-200ms**—fast enough to feel instant
-2. **No easing more elaborate than a single cubic-bezier curve**; keep it simple
-3. **Entrance animations are subtle**—4-8px movement max
-4. **Never animate on exit unless closing**—just fade out
-5. **No bounce, elastic, or attention-seeking effects**
-
----
-
-## Interaction Patterns
-
-### Hover States
-
-**General Rules**
-
-- Background lightens slightly (#fafafa)
-- Text darkens to primary color (#171717)
-- Border darkens one shade
-- No scale/transform effects
-- Transition: 150ms
-
-### Focus States
-
-**Keyboard Navigation**
-
-- Use border color change, not glow
-- Border: 2px solid #171717
-- No box-shadow outline
-- Visible and clear
-
-### Active/Pressed States
-
-**On Click**
-
-- Slightly darker background
-- No scale down
-- 100ms transition (faster than hover)
-
-### Loading States
-
-**Skeleton Loaders**
-
-```
-Background:  #fafafa
-Animation:   pulse (opacity 1 → 0.5 → 1)
-Duration:    2s infinite
-Border:      same as element would have
-Radius:      match final element
-```
-
-**Spinners**
-
-```
-Size:        16-24px
-Color:       #171717
-Animation:   spin 1s linear infinite
-Line width:  2px
-```
-
----
-
-## Iconography
-
-### Icon Style
-
-- **Outline style** (not filled)
-- **2px stroke width**
-- **24px default size** (scale down to 16px for compact UI)
-- **Rounded line caps and joins**
-- **Match text color** of surrounding context
-
-### Icon Spacing
-
-- **Gap from text**: 8-10px (0.5rem to 0.625rem)
-- **Icon-only buttons**: 32px × 32px touch target minimum
-
----
-
-## Shadows (Use Sparingly)
-
-```
-None:    (default—no shadow)
-Subtle:  0 1px 2px rgba(0, 0, 0, 0.05)
-Light:   0 1px 3px rgba(0, 0, 0, 0.1)
-Medium:  0 4px 6px rgba(0, 0, 0, 0.1)
-Heavy:   0 10px 15px rgba(0, 0, 0, 0.1)
-```
-
-**When to Use Shadows**
-
-- Modals/dialogs: medium
-- Dropdown menus: light
-- Cards: none or subtle
-- Buttons: never
-- Popovers: light
-
----
-
-## Border Radius Scale
-
-```
-Small:   4px  (badges, pills)
-Default: 6px  (buttons, inputs)
-Medium:  8px  (cards, containers)
-Large:   12px (modals, large panels)
-Full:    9999px (circular buttons, pills)
-```
-
----
-
-## Responsive Breakpoints
-
-```
-Mobile:       < 640px
-Tablet:       640px - 1023px
-Desktop:      1024px - 1279px
-Wide:         1280px+
-```
-
-### Mobile Adaptations
-
-- Reduce padding: 16px instead of 24px
-- Collapse sidebar to overlay/drawer
-- Stack horizontal layouts vertically
-- Reduce font sizes slightly (13px base instead of 14px)
-- Increase touch targets to 44px minimum
-
----
-
-## Dark Mode Considerations
-
-### Automatic Switching
-
-```css
-@media (prefers-color-scheme: dark) {
-  /* Apply dark theme */
-}
-```
-
-### Dark Mode Colors
-
-**Backgrounds**
-
-- Pure black (#000) for drama
-- Slightly lighter (#0a0a0a) for panels
-- Very subtle borders (#262626)
-
-**Text**
-
-- Off-white (#ededed) not pure white
-- Gray (#a1a1a1) for secondary
-
-**Borders**
-
-- Much darker but still subtle (#262626)
-
-**Key Difference**: Dark mode has higher contrast between elements to maintain readability.
-
----
-
-## Common Mistakes to Avoid
-
-1. ❌ **Heavy drop shadows**—use subtle borders instead
-2. ❌ **Bold accent colors**—keep it monochrome with rare color use
-3. ❌ **Complex gradients**—solid colors only
-4. ❌ **Slow animations**—keep interactions at 300ms or less (500ms only for layout changes)
-5. ❌ **Scale/transform on hover**—just color/background changes
-6. ❌ **Too much border radius**—8px is usually the max
-7. ❌ **Pure black text**—use #171717 in light mode
-8. ❌ **Thick borders**—1px is standard, 2px for focus only
-9. ❌ **Colorful UI elements**—status colors only when needed
-10. ❌ **Overly tight spacing**—respect the 4px grid
-
----
-
-## Design Checklist
-
-When implementing a new component, ensure:
-
-- [ ] Uses colors from centralized palette
-- [ ] Border is 1px solid #e5e5e5 (or transparent)
-- [ ] Border radius is 6-8px
-- [ ] Padding follows 4px grid
-- [ ] Font size is 14px (or 12px for compact)
-- [ ] Font weight is 400-600 range
-- [ ] Hover transition is 150-200ms
-- [ ] No drop shadows (except modals, dropdowns, and popovers)
-- [ ] Text color is #171717 or #737373
-- [ ] Background is #ffffff or #fafafa
-- [ ] Icons are 16-24px outline style
-- [ ] Touch targets are 32px+ for interactive elements
-- [ ] Animation is subtle and quick
-- [ ] Responsive on mobile (16px padding minimum)
-
----
-
-## Implementation Notes
-
-### CSS Variables Approach
-
-```css
-:root {
-  --bg-primary: #ffffff;
-  --bg-secondary: #fafafa;
-  --text-primary: #171717;
-  --text-secondary: #737373;
-  --border: #e5e5e5;
-  --radius: 8px;
-  --transition: 0.2s ease;
-}
-```
-
-### Tailwind CSS Approach
-
-```javascript
-// tailwind.config.js (excerpt: the `theme` key of the exported config object)
-theme: {
-  colors: {
-    bg: { primary: '#ffffff', secondary: '#fafafa' },
-    text: { primary: '#171717', secondary: '#737373' },
-    border: '#e5e5e5',
-  },
-  borderRadius: {
-    DEFAULT: '6px',
-    lg: '8px',
-    xl: '12px',
-  },
-  transitionDuration: {
-    DEFAULT: '200ms',
-    fast: '150ms',
-  }
-}
-```
-
----
-
-## Inspiration Sources
-
-- **Vercel Dashboard**: vercel.com/dashboard
-- **shadcn/ui**: ui.shadcn.com
-- **Linear**: linear.app
-- **GitHub**: github.com (2023+ design)
-- **Raycast**: raycast.com
-
----
-
-## Summary
-
-The Vercel/shadcn aesthetic is defined by:
-
-1. **Extreme minimalism**—every pixel has purpose
-2. **Near-monochrome palette**—black, white, grays
-3. **Subtle borders and backgrounds**—barely visible until needed
-4. **Quick, purposeful transitions**—150-200ms standard
-5. **Typography-driven hierarchy**—weight and spacing over color
-6. **No decorative effects**—no shadows, gradients, or transforms
-7. **System fonts**—fast loading, native feel
-8. **Generous whitespace**—let content breathe
-9. **Status colors used sparingly**—only when semantically needed
-10. **Dark mode as first-class**—not an afterthought
-
-This creates interfaces that feel fast, professional, and get out of the user's way.
diff --git a/public/mock-data/evaluation-sample-1.json b/public/mock-data/evaluation-sample-1.json
deleted file mode 100644
index 75dc123..0000000
--- a/public/mock-data/evaluation-sample-1.json
+++ /dev/null
@@ -1,211 +0,0 @@
-{
-  "id": 43,
-  "run_name": "Hindi FAQ Evaluation - Run 1",
-  "dataset_name": "hindi_policy_qa_5_rows",
-  "config": {
-    "model": "gpt-4",
-    "instructions": "You are a helpful FAQ assistant for policy questions.",
-    "temperature": 0.7
-  },
-  "assistant_id": null,
-  "dataset_id": 50,
-  "batch_job_id": 71,
-  "embedding_batch_job_id": 72,
-  "status": "completed",
-  "object_store_url": "s3://ai-platform-documents-staging/evaluations/43",
-  "total_items": 4,
-  "scores": {
-    "summary_scores": [
-      {
-        "name": "cosine_similarity",
-        "avg": 0.45267303673682135,
-        "std": 0.06016189626290471,
-        "total_pairs": 4,
-        "data_type": "NUMERIC"
-      },
-      {
-        "name": "SNEHA correctness",
-        "avg": 0.25,
-        "std": 0.4330127018922193,
-        "total_pairs": 4,
-        "data_type": "NUMERIC"
-      },
-      {
-        "name": "llm_judge_relevance",
-        "avg": 0.75,
-        "std": 0.25,
-        "total_pairs": 4,
-        "data_type": "NUMERIC"
-      },
-      {
-        "name": "response_category",
-        "distribution": {
-          "CORRECT": 1,
-          "PARTIAL": 2,
-          "INCORRECT": 1
-        },
-        "total_pairs": 4,
-        "data_type": "CATEGORICAL"
-      }
-    ],
-    "individual_scores": [
-      {
-        "trace_id": "97ec280e-883e-450e-a3e9-6cd8cfc0741a",
-        "input": {
-          "question": "सीएलएफ में उपसमिति के कार्य की समीक्षा कौन करता है?"
-        },
-        "output": {
-          "answer": "सीएलएफ (CLF) में उपसमिति के कार्य की समीक्षा मुख्य रूप से सीएलएफ की प्रबंधन समिति या संबंधित अधिकृत प्रतिनिधि द्वारा की जाती है।"
-        },
-        "metadata": {
-          "ground_truth": "सीएलएफ (Cluster Level Federation) में उपसमिति के कार्य की समीक्षा कार्यकारिणी समिति (Executive Committee) करती है।",
-          "item_id": "f444e132-6a25-403e-b0fa-9fd0c5452248",
-          "response_id": "resp_0a2a9f840105b5ca006923fa5855f8819680d7c280c082011f"
-        },
-        "trace_scores": [
-          {
-            "name": "cosine_similarity",
-            "value": 0.45159810574988546,
-            "data_type": "NUMERIC"
-          },
-          {
-            "name": "SNEHA correctness",
-            "value": 0,
-            "data_type": "NUMERIC",
-            "comment": "The generation incorrectly claims there is no information about the specific committee"
-          },
-          {
-            "name": "llm_judge_relevance",
-            "value": 0.5,
-            "data_type": "NUMERIC",
-            "comment": "Response is somewhat relevant but misses key details"
-          },
-          {
-            "name": "response_category",
-            "value": "INCORRECT",
-            "data_type": "CATEGORICAL"
-          }
-        ]
-      },
-      {
-        "trace_id": "f1833206-e9b6-49ac-864c-90dea4076fa9",
-        "input": {
-          "question": "ग्राम संगठन की बैठक में कितने सदस्यों की उपस्थिति आवश्यक है?"
-        },
-        "output": {
-          "answer": "ग्राम संगठन की बैठक में कम से कम 50% सदस्यों की उपस्थिति आवश्यक है। यह गणपूर्ति सुनिश्चित करती है कि निर्णय लोकतांत्रिक और प्रतिनिधि हों।"
-        },
-        "metadata": {
-          "ground_truth": "ग्राम संगठन की बैठक में कम से कम 50% सदस्यों की उपस्थिति आवश्यक है।",
-          "item_id": "a234b567-8901-234c-567d-890123456789",
-          "response_id": "resp_1b3c5d7e9f11a13b15c17d19e21f23a25b27c29d31e33f35a37b"
-        },
-        "trace_scores": [
-          {
-            "name": "cosine_similarity",
-            "value": 0.4654791998066623,
-            "data_type": "NUMERIC"
-          },
-          {
-            "name": "SNEHA correctness",
-            "value": 1,
-            "data_type": "NUMERIC",
-            "comment": "Response correctly identifies all key points"
-          },
-          {
-            "name": "llm_judge_relevance",
-            "value": 1,
-            "data_type": "NUMERIC",
-            "comment": "Highly relevant and accurate response"
-          },
-          {
-            "name": "response_category",
-            "value": "CORRECT",
-            "data_type": "CATEGORICAL"
-          }
-        ]
-      },
-      {
-        "trace_id": "b4a85769-96e2-402b-ab79-c1338e85c8b1",
-        "input": {
-          "question": "स्वयं सहायता समूह में कितनी महिलाएं होनी चाहिए?"
-        },
-        "output": {
-          "answer": "स्वयं सहायता समूह में आमतौर पर 10-20 महिलाएं होनी चाहिए। यह आदर्श संख्या समूह की प्रभावी कार्यप्रणाली के लिए उपयुक्त मानी जाती है।"
-        },
-        "metadata": {
-          "ground_truth": "स्वयं सहायता समूह में 10-20 महिलाएं होनी चाहिए। यह आदर्श संख्या प्रभावी संचालन और प्रबंधन के लिए उपयुक्त है। **Policy Source:** SHG Formation Guidelines",
-          "item_id": "c345d678-9012-345e-678f-901234567890",
-          "response_id": "resp_2c4d6e8f10a12b14c16d18e20f22a24b26c28d30e32f34a36c"
-        },
-        "trace_scores": [
-          {
-            "name": "cosine_similarity",
-            "value": 0.5311989216166836,
-            "data_type": "NUMERIC"
-          },
-          {
-            "name": "SNEHA correctness",
-            "value": 0,
-            "data_type": "NUMERIC",
-            "comment": "Missing policy source and reference"
-          },
-          {
-            "name": "llm_judge_relevance",
-            "value": 0.75,
-            "data_type": "NUMERIC",
-            "comment": "Good coverage but incomplete"
-          },
-          {
-            "name": "response_category",
-            "value": "PARTIAL",
-            "data_type": "CATEGORICAL"
-          }
-        ]
-      },
-      {
-        "trace_id": "51c505d8-efe0-454a-a339-f42a13fe94b8",
-        "input": {
-          "question": "समुदाय संसाधन व्यक्ति (CRP) का मुख्य कार्य क्या है?"
-        },
-        "output": {
-          "answer": "समुदाय संसाधन व्यक्ति (CRP) का मुख्य कार्य समूहों को प्रशिक्षण देना और उन्हें तकनीकी सहायता प्रदान करना है।"
-        },
-        "metadata": {
-          "ground_truth": "समुदाय संसाधन व्यक्ति (CRP) का मुख्य कार्य SHG सदस्यों को प्रशिक्षण, क्षमता निर्माण, और सामुदायिक संगठन में सहायता प्रदान करना है। वे फील्ड-स्तरीय सहायता और मेंटरशिप भी प्रदान करते हैं।",
-          "item_id": "d456e789-0123-456f-789g-012345678901",
-          "response_id": "resp_3d5e7f9g11a13b15c17d19e21f23a25b27c29d31e33f35a37d"
-        },
-        "trace_scores": [
-          {
-            "name": "cosine_similarity",
-            "value": 0.36241591977405424,
-            "data_type": "NUMERIC"
-          },
-          {
-            "name": "SNEHA correctness",
-            "value": 0,
-            "data_type": "NUMERIC",
-            "comment": "Factually incomplete - misses key responsibilities"
-          },
-          {
-            "name": "llm_judge_relevance",
-            "value": 0.5,
-            "data_type": "NUMERIC",
-            "comment": "Tangentially related but misses main point"
-          },
-          {
-            "name": "response_category",
-            "value": "PARTIAL",
-            "data_type": "CATEGORICAL"
-          }
-        ]
-      }
-    ]
-  },
-  "error_message": null,
-  "organization_id": 1,
-  "project_id": 1,
-  "inserted_at": "2025-11-17T11:07:44.609916",
-  "updated_at": "2025-11-17T11:18:44.235194"
-}
diff --git a/public/mock-data/evaluation-sample-2.json b/public/mock-data/evaluation-sample-2.json
deleted file mode 100644
index bff4d0f..0000000
--- a/public/mock-data/evaluation-sample-2.json
+++ /dev/null
@@ -1,173 +0,0 @@
-{
-  "id": 44,
-  "run_name": "English FAQ Evaluation - Test Run",
-  "dataset_name": "english_policy_qa_3_rows",
-  "config": {
-    "model": "gpt-4-turbo",
-    "instructions": "You are a helpful assistant answering policy-related questions.",
-    "temperature": 0.3
-  },
-  "assistant_id": "asst_abc123xyz",
-  "dataset_id": 51,
-  "batch_job_id": 73,
-  "embedding_batch_job_id": 74,
-  "status": "completed",
-  "object_store_url": "s3://ai-platform-documents-staging/evaluations/44",
-  "total_items": 3,
-  "scores": {
-    "summary_scores": [
-      {
-        "name": "cosine_similarity",
-        "avg": 0.782,
-        "std": 0.123,
-        "total_pairs": 3,
-        "data_type": "NUMERIC"
-      },
-      {
-        "name": "SNEHA correctness",
-        "avg": 0.667,
-        "std": 0.471,
-        "total_pairs": 3,
-        "data_type": "NUMERIC"
-      },
-      {
-        "name": "llm_judge_relevance",
-        "avg": 0.833,
-        "std": 0.236,
-        "total_pairs": 3,
-        "data_type": "NUMERIC"
-      },
-      {
-        "name": "response_category",
-        "distribution": {
-          "CORRECT": 2,
-          "PARTIAL": 1,
-          "INCORRECT": 0
-        },
-        "total_pairs": 3,
-        "data_type": "CATEGORICAL"
-      }
-    ],
-    "individual_scores": [
-      {
-        "trace_id": "aaa11111-1111-1111-1111-111111111111",
-        "input": {
-          "question": "What is the minimum attendance required for a Village Organization meeting?"
-        },
-        "output": {
-          "answer": "The minimum attendance required for a Village Organization meeting is 50% of the total members. This quorum ensures that decisions are representative and democratic."
-        },
-        "metadata": {
-          "ground_truth": "A minimum of 50% of members must be present for a Village Organization meeting to proceed with decision-making. This is specified in the Community Operational Manual.",
-          "item_id": "item-001",
-          "response_id": "resp-eng-001"
-        },
-        "trace_scores": [
-          {
-            "name": "cosine_similarity",
-            "value": 0.89,
-            "data_type": "NUMERIC"
-          },
-          {
-            "name": "SNEHA correctness",
-            "value": 1,
-            "data_type": "NUMERIC",
-            "comment": "Accurate and complete response"
-          },
-          {
-            "name": "llm_judge_relevance",
-            "value": 1,
-            "data_type": "NUMERIC",
-            "comment": "Fully relevant with additional context"
-          },
-          {
-            "name": "response_category",
-            "value": "CORRECT",
-            "data_type": "CATEGORICAL"
-          }
-        ]
-      },
-      {
-        "trace_id": "bbb22222-2222-2222-2222-222222222222",
-        "input": {
-          "question": "How many women should be in a Self Help Group?"
-        },
-        "output": {
-          "answer": "A Self Help Group should typically have between 10 to 20 women members."
-        },
-        "metadata": {
-          "ground_truth": "A Self Help Group (SHG) should have 10-20 women members for effective operation and management.",
-          "item_id": "item-002",
-          "response_id": "resp-eng-002"
-        },
-        "trace_scores": [
-          {
-            "name": "cosine_similarity",
-            "value": 0.85,
-            "data_type": "NUMERIC"
-          },
-          {
-            "name": "SNEHA correctness",
-            "value": 1,
-            "data_type": "NUMERIC",
-            "comment": "Correct range provided"
-          },
-          {
-            "name": "llm_judge_relevance",
-            "value": 0.9,
-            "data_type": "NUMERIC",
-            "comment": "Directly answers the question"
-          },
-          {
-            "name": "response_category",
-            "value": "CORRECT",
-            "data_type": "CATEGORICAL"
-          }
-        ]
-      },
-      {
-        "trace_id": "ccc33333-3333-3333-3333-333333333333",
-        "input": {
-          "question": "What are the responsibilities of a Community Resource Person?"
-        },
-        "output": {
-          "answer": "A Community Resource Person (CRP) is responsible for providing training and technical support to groups."
-        },
-        "metadata": {
-          "ground_truth": "A Community Resource Person (CRP) provides training, capacity building, field-level support, mentorship, and assists in community organization activities for SHG members.",
-          "item_id": "item-003",
-          "response_id": "resp-eng-003"
-        },
-        "trace_scores": [
-          {
-            "name": "cosine_similarity",
-            "value": 0.601,
-            "data_type": "NUMERIC"
-          },
-          {
-            "name": "SNEHA correctness",
-            "value": 0,
-            "data_type": "NUMERIC",
-            "comment": "Incomplete - missing key responsibilities like mentorship and capacity building"
-          },
-          {
-            "name": "llm_judge_relevance",
-            "value": 0.6,
-            "data_type": "NUMERIC",
-            "comment": "Partially relevant but lacks detail"
-          },
-          {
-            "name": "response_category",
-            "value": "PARTIAL",
-            "data_type": "CATEGORICAL"
-          }
-        ]
-      }
-    ]
-  },
-  "error_message": null,
-  "organization_id": 1,
-  "project_id": 1,
-  "inserted_at": "2025-11-18T09:30:15.123456",
-  "updated_at": "2025-11-18T09:42:30.654321"
-}