Commit bbb523f

Update README.md
Update competitor table
1 parent e379f6a commit bbb523f

1 file changed

Lines changed: 2 additions & 2 deletions

README.md

@@ -50,8 +50,8 @@ Stratix is built differently. It gives you production-grade evaluation infrastru
 
 | Capability              | **Stratix**                                    | LangSmith                  | Langfuse                | DeepEval            | Phoenix (Arize)        |
 | ----------------------- | ---------------------------------------------- | -------------------------- | ----------------------- | ------------------- | ---------------------- |
-| Pre-built benchmarks    | 100+ benchmarks, 200+ models                   | No public benchmarks       | No public benchmarks    | ~14 metrics         | Bring your own         |
-| Prompt-level comparison | Native head-to-head with outcome filters       | Side-by-side runs (manual) | Not built-in            | Manual setup        | Not built-in           |
+| Pre-built benchmarks    | 100+ benchmarks, 200+ models                   | No public benchmarks       | No public benchmarks    | 30+ metrics         | Bring your own         |
+| Prompt-level comparison | Native head-to-head with outcome filters       | Side-by-side runs (manual) | Prompt experiments + side-by-side (UI) | Manual setup        | Not built-in           |
 | Custom judge builder    | Auto-optimized GEPA judges with budget control | LLM-as-judge (manual)      | LLM-as-judge (manual)   | Basic LLM judges    | LLM-as-judge templates |
 | Agent trace evaluation  | Upload, replay, judge every step               | Trace logging + annotation | Trace logging + scoring | Trace logging only  | Trace visualization    |
 | Eval generation ladder  | Heuristic > model-graded > deliberation > GEPA | Single generation          | Single generation       | Single generation   | Single generation      |
