feat(iris): implement polygon tools in iris and refactor handler structure #3495
Wire generate.Task to TestcaseManager so generated input/output pairs are saved after the generation loop. Rename GenerateRequest.Language to GeneratorLanguage, add testcaseCount upper bound and arg length validation, and report TotalTestcases alongside GeneratedTestcases in the result. Refactor judge and run tasks to call BuildUnit.Run() instead of sandbox.Run() directly, removing the redundant dir variable. Fix run factory's taskType match (userTestCase), add context cancellation in the generate loop, buffer the connector result channel, and inject logger into postgres/testcase-manager for partial-save warnings.
Pull request overview
This PR refactors Iris’ request handling into a task/factory-based router/runner structure and adds new Polygon-style tooling flows (generate/validate), along with loader changes to support saving testcases to Postgres and passing extra runtime args into the sandbox.
Changes:
- Refactor: replace monolithic judge handler with `TaskRunner` + per-feature task factories (`judge`, `run`, `generate`, `validate`).
- Feature: add generator/validator task flows and extend sandbox run arguments (`ExtraArgs`).
- Storage: add Postgres testcase `Save` path and split loader element types into `ElementIn`/`ElementOut`.
Reviewed changes
Copilot reviewed 41 out of 43 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| apps/iris/src/router/router.go | Routes message types to task factories and executes them via TaskRunner. |
| apps/iris/src/handler/task_runner.go | New orchestration layer for setting up build units and running task actions. |
| apps/iris/src/handler/build_unit.go | Encapsulates compile/setup + run for a unit of code (solution/generator/validator). |
| apps/iris/src/handler/judge/* | New judge task + request validation + factory and tests. |
| apps/iris/src/handler/run/* | New run task + request validation + factory and tests. |
| apps/iris/src/handler/generate/* | New generator task + request validation + factory and tests. |
| apps/iris/src/handler/validate/* | New validator task + request validation + factory and tests. |
| apps/iris/src/handler/interface.go | Adds task interfaces and generalizes ParseError. |
| apps/iris/src/service/sandbox/* | Adds ExtraArgs to run requests and lang config plumbing. |
| apps/iris/src/service/testcase/* | Adds SaveTestcase and switches testcase element type to ElementOut. |
| apps/iris/src/loader/postgres.go | Adds Save() for testcases and adds logging around DB operations. |
| apps/iris/src/loader/s3.go | Adjusts MinIO/AWS env behavior and switches to ElementOut results. |
| apps/iris/src/loader/element.go | Splits into ElementIn (DB insert) and ElementOut (response). |
| apps/iris/src/connector/rabbitmq/connector.go | Adjusts context usage and result channel buffering. |
| apps/iris/main.go | Wires up new runner + factories and updated loader constructors. |
| apps/iris/tests/data/submission/1_generate.json | Adds fixture for generate flow. |
| apps/iris/example/request/* | Adds example payloads for generate/validate. |
| apps/iris/example/response/* | Adds example responses for generate/validate. |
| .devcontainer/Dockerfile | Installs testlib.h into the devcontainer image. |
| .gitignore | Ignores agent-related directories/files. |
The +8,000 added lines are because of testlib.h...
/gemini review
Code Review
This pull request refactors the Iris application to introduce a new task-based architecture for handling different types of requests, including judging, running, generating, and validating code. This involves creating new handler factories, a TaskRunner, and a BuildUnit for managing code compilation and execution. The router has been updated to dispatch tasks to this new runner. Additionally, the loader package has been enhanced with new ElementIn and ElementOut types, and the Postgres data source now supports saving test cases with improved logging. Example C++ tools and request/response JSONs for generation and validation have also been added. Feedback from the review highlights that hardcoded time and memory limits in the generate task should be replaced with constants, and there is an opportunity to refactor duplicate request/result structs and RunAction logic between the judge and run packages into shared components for better maintainability.
/gemini review
Code Review
This pull request significantly refactors the Iris judging system, transitioning from a single JudgeHandler to a more modular, task-based architecture. It introduces a TaskRunner to orchestrate various task types, including Judge, Run, Generate, and Validate, each managed by its own factory and BuildUnits. New functionalities for generating and validating test cases are added, along with support for competitive programming tools like testlib.h and corresponding example files. Database (Postgres) and S3 interactions are updated to handle new ElementIn/ElementOut types and incorporate enhanced logging. The RabbitMQ connector also received minor cleanup and optimization. The review comments suggest improving the configurability of time and memory limits in the generate and validate tasks, optimizing the extractSignalAndRealTime function by replacing reflection with a type switch, and addressing potential side effects of unsetting global AWS environment variables in the S3 loader.

    TimeLimit:   2000,
    MemoryLimit: 512 * 1024 * 1024,

The TimeLimit and MemoryLimit for the generator build unit are hardcoded. These limits should ideally be configurable or dynamically determined based on the problem's requirements. Hardcoding them can lead to generators failing prematurely or consuming excessive resources if the defaults are not appropriate for a given problem.

    - TimeLimit:   2000,
    - MemoryLimit: 512 * 1024 * 1024,
    + TimeLimit:   validReq.TimeLimit,   // Or a configurable default
    + MemoryLimit: validReq.MemoryLimit, // Or a configurable default

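One way to read "or a configurable default" is to take the limits from the request when present and fall back to defaults otherwise. The sketch below assumes a hypothetical `limits` type and treats a zero or negative value as "not specified"; the default constants simply mirror the hardcoded values in the diff, and their units are whatever the real `TimeLimit` and `MemoryLimit` fields use.

```go
package main

import "fmt"

// Hypothetical defaults mirroring the hardcoded values in the diff.
const (
	defaultTimeLimit   = 2000
	defaultMemoryLimit = 512 * 1024 * 1024
)

// limits is an illustrative stand-in for the request fields; it is not
// the PR's actual type.
type limits struct {
	TimeLimit   int // <= 0 means "not specified"
	MemoryLimit int // <= 0 means "not specified"
}

// withDefaults returns l with unspecified fields replaced by the defaults.
func withDefaults(l limits) limits {
	if l.TimeLimit <= 0 {
		l.TimeLimit = defaultTimeLimit
	}
	if l.MemoryLimit <= 0 {
		l.MemoryLimit = defaultMemoryLimit
	}
	return l
}

func main() {
	fmt.Println(withDefaults(limits{}))                // both fields fall back
	fmt.Println(withDefaults(limits{TimeLimit: 5000})) // only memory falls back
}
```

The same helper could serve the generator, solution, and validator build units, so the three call sites would share one source of defaults.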

    TimeLimit:   2000,
    MemoryLimit: 512 * 1024 * 1024,

The TimeLimit and MemoryLimit for the solution build unit are hardcoded. These limits should ideally be configurable or dynamically determined based on the problem's requirements. Hardcoding them can lead to solutions failing prematurely or consuming excessive resources if the defaults are not appropriate for a given problem.

    - TimeLimit:   2000,
    - MemoryLimit: 512 * 1024 * 1024,
    + TimeLimit:   validReq.TimeLimit,   // Or a configurable default
    + MemoryLimit: validReq.MemoryLimit, // Or a configurable default

apps/iris/src/handler/interface.go
Outdated

    func extractSignalAndRealTime(res any) (int, int, bool) {
        v := reflect.ValueOf(res)
        if !v.IsValid() {
            return 0, 0, false
        }

        if v.Kind() == reflect.Pointer {
            if v.IsNil() {
                return 0, 0, false
            }
            v = v.Elem()
        }

        if v.Kind() == reflect.Struct {
            signalField := v.FieldByName("Signal")
            realTimeField := v.FieldByName("RealTime")
            if signalField.IsValid() && realTimeField.IsValid() &&
                signalField.Kind() == reflect.Int && realTimeField.Kind() == reflect.Int {
                return int(signalField.Int()), int(realTimeField.Int()), true
            }
        }

        if m, ok := res.(map[string]any); ok {
            signal, hasSignal := m["signal"].(float64)
            realTime, hasRealTime := m["realTime"].(float64)
            if hasSignal && hasRealTime {
                return int(signal), int(realTime), true
            }
        }

        return 0, 0, false
    }

The extractSignalAndRealTime function uses reflection (reflect.ValueOf, FieldByName) to extract Signal and RealTime fields from an any type. While flexible, reflection can be less performant and introduces runtime type safety concerns compared to compile-time type assertions. For a component that might be called frequently in a performance-sensitive system, this could introduce overhead.
If the possible result types are limited and known (e.g., JudgeResult, RunResult, GenerateResult, ValidateResult), consider using a type switch to handle each concrete type and extract the fields directly, avoiding reflection.
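A minimal sketch of the type-switch alternative follows. The `JudgeResult` and `RunResult` structs here are trimmed-down stand-ins with only the two relevant fields (the real types are not shown in this thread), and the `map[string]any` case is kept to match the reflection-based version.

```go
package main

import "fmt"

// Illustrative stand-ins; the real result types have more fields.
type JudgeResult struct{ Signal, RealTime int }
type RunResult struct{ Signal, RealTime int }

// extractSignalAndRealTime handles each known concrete type explicitly,
// so field access is checked at compile time instead of via reflection.
func extractSignalAndRealTime(res any) (signal, realTime int, ok bool) {
	switch r := res.(type) {
	case JudgeResult:
		return r.Signal, r.RealTime, true
	case *JudgeResult:
		if r != nil {
			return r.Signal, r.RealTime, true
		}
	case RunResult:
		return r.Signal, r.RealTime, true
	case *RunResult:
		if r != nil {
			return r.Signal, r.RealTime, true
		}
	case map[string]any:
		// JSON-decoded numbers arrive as float64.
		s, hasSignal := r["signal"].(float64)
		t, hasRealTime := r["realTime"].(float64)
		if hasSignal && hasRealTime {
			return int(s), int(t), true
		}
	}
	return 0, 0, false
}

func main() {
	fmt.Println(extractSignalAndRealTime(&JudgeResult{Signal: 9, RealTime: 120}))
	fmt.Println(extractSignalAndRealTime("unsupported"))
}
```

The trade-off is that every new result type needs its own case (or a small shared interface), whereas the reflection version picks up any struct with matching field names.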

    TimeLimit:   2000,
    MemoryLimit: 512 * 1024 * 1024,

The TimeLimit and MemoryLimit for the validator build unit are hardcoded. These limits should ideally be configurable or dynamically determined based on the problem's requirements. Hardcoding them can lead to validators failing prematurely or consuming excessive resources if the defaults are not appropriate for a given problem.

    - TimeLimit:   2000,
    - MemoryLimit: 512 * 1024 * 1024,
    + TimeLimit:   validReq.TimeLimit,   // Or a configurable default
    + MemoryLimit: validReq.MemoryLimit, // Or a configurable default


    _ = os.Unsetenv("AWS_PROFILE")
    _ = os.Unsetenv("AWS_DEFAULT_PROFILE")

Unsetting global environment variables AWS_PROFILE and AWS_DEFAULT_PROFILE using os.Unsetenv can have unintended side effects on other parts of the application or other libraries that might rely on these variables. A more robust approach would be to configure the AWS SDK client directly to ignore shared credentials or to check if these variables are set before unsetting them, rather than modifying the global environment.

    - _ = os.Unsetenv("AWS_PROFILE")
    - _ = os.Unsetenv("AWS_DEFAULT_PROFILE")
    + // _ = os.Unsetenv("AWS_PROFILE") // Consider more targeted configuration instead of global unset
    + // _ = os.Unsetenv("AWS_DEFAULT_PROFILE") // Consider more targeted configuration instead of global unset

Description
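I drafted this with Claude and plan to revise it so the intent and the problem statement come through more clearly.
About +6,000 of the +8,000 added lines come from testlib.h.
Test files will be added.

## feat(iris): Polygon-style tools + handler architecture redesign

### Summary

Adds Polygon-style Generator/Validator tools to Iris. Cramming them into the existing `judge-handler.go` (401 lines) would have made it unmanageable, so the handler structure itself was rebuilt.

### Background

A single `judge-handler.go` handled both Judge and Run. Adding Generate and Validate on top would bloat the file and erase the boundaries of responsibility. So instead of adding two more handlers, the direction chosen was to split the code into four layers: Factory → Task → BuildUnit → TaskRunner.

### Changes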
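#### 1. Introduce testlib.h

The Polygon tools depend on testlib.h. Seeded random numbers for the Generator and input validation for the Validator both need it.

- Added `lib/testlib.h` (6,252 lines)
- Added generator, validator, and checker example code plus request/response JSON under `example/`
- Added `1_generate.json`

#### 2. Handler architecture redesign

Split the monolith into four layers.

    graph LR
      subgraph Before["Before"]
        OLD["judge-handler.go<br/>(401 lines, handles everything)"]
      end
      subgraph After["After"]
        direction TB
        F["Factory<br/>JSON parsing → Task construction"]
        T["Task<br/>per-testcase execution"]
        BU["BuildUnit<br/>source → compile → run"]
        TR["TaskRunner<br/>parallel Setup + dispatch"]
        F --> TR --> BU --> T
      end
      OLD -.->|"refactoring"| After

`judge`, `run`, `generate`, and `validate` each have their own factory.go / task.go / models.go.

    graph TD
      Router["Router<br/>branch by path"]
      Router -->|"judge"| JF["judge.Factory"]
      Router -->|"run"| RF["run.Factory"]
      Router -->|"generate"| GF["generate.Factory"]
      Router -->|"validate"| VF["validate.Factory"]
      JF & RF & GF & VF -->|"Task"| TR["TaskRunner"]
      TR -->|"parallel Setup"| BU1["BuildUnit #1"] & BU2["BuildUnit #2"]
      BU1 & BU2 -->|"when done"| TA["Task.RunAction()"]
      TA -->|"sendResult()"| Router

BuildUnit was cleaned up as well. Previously `BuildUnit` and `PreparedBuildUnit` existed separately, representing the same thing twice along the time axis. They are now merged into one type whose state is filled in before and after `Setup()`. `PreparedBuildUnit`, `BuildContext`, and `BuildAwareTask` are all deleted.

#### 3. Parallel build-unit compilation

Generate has two build units, Generator and Solution. Compiling them sequentially is wasteful, so `TaskRunner.Run()` handles them in parallel with `sync.WaitGroup` and a buffered error channel. Each BuildUnit creates its own directory, so there is no shared state.

#### 4. Callback-based result streaming

Previously the Task held `outChan` directly, which caused three problems, among them:

- A mistaken `close(out)` panics
- Sharing `out` causes a race condition

    graph LR
      subgraph Before["Before: Task owns the channel"]
        T1["Task"] -->|"outChan (shared)"| TR1["TaskRunner"]
        T1 -.->|"mistaken close(out) → panic"| TR1
      end
      subgraph After["After: TaskRunner injects a closure"]
        TR2["TaskRunner"] -->|"sendResult (per-call closure)"| T2["Task.RunAction()"]
        TR2 -->|"defer close(out)"| CH["channel"]
      end

`sendResult` is a per-call closure, so concurrent requests are isolated from each other. Nested goroutines are gone. `close(out)` happens only in the TaskRunner's `defer`.

#### 5. Router refactoring

- Previously knew only a single `JudgeHandler` and used a generic type parameter
- Added `"generate"` and `"validate"` routes

#### 6. Write support in the testcase layer

Previously this layer was read-only. Generate needs to save the testcases it creates.

- Split `Element` → `ElementOut` (read) + `ElementIn` (write)
- Added `Save([]ElementIn)` bulk INSERT
- Added `SaveTestcase` to `TestcaseManager`

#### 7. Sandbox ExtraArgs

Polygon Generators take command-line arguments (e.g. `./generator 100 5`). The existing RunRequest had no way to pass them, so `ExtraArgs []string` was added and wired through to langConfig.

#### 8. RabbitMQ Connector fixes

Fixed two bugs.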
#### 9. Compile error message normalization

C/C++ compile errors were exposing sandbox-internal paths (e.g. `/app/sandbox/results/abcd1234-default/main.cpp:5:1`). `normalizeCompileError()` now strips them.

#### 10. Polymorphic ParseError

Previously it accepted only `JudgeResult`, but there are now several result types. It was changed to `ParseError(any, ResultCode)` and extracts the Signal/RealTime fields with `reflect`.

### Overall flow (Generate)

    sequenceDiagram
      participant MQ as RabbitMQ
      participant Router
      participant Factory as generate.Factory
      participant TR as TaskRunner
      participant Gen as BuildUnit<br/>(generator)
      participant Sol as BuildUnit<br/>(solution)
      participant Sandbox
      participant Task as generate.Task
      participant DB as Postgres
      Note over MQ,DB: 1. Receive request
      MQ->>Router: "generate" message
      Router->>Factory: Create("generate", data)
      Factory->>Factory: JSON parsing + Validate()
      Factory-->>Router: return Task
      Note over MQ,DB: 2. Parallel compilation
      Router->>TR: go Run(task)
      par Generator compile
        TR->>Gen: Setup()
        Gen->>Sandbox: create directory → save source → compile
      and Solution compile
        TR->>Sol: Setup()
        Sol->>Sandbox: create directory → save source → compile
      end
      TR->>TR: WaitGroup.Wait()
      Note over MQ,DB: 3. Generate testcases
      TR->>Task: RunAction(ctx, sendResult)
      loop repeat N times
        Task->>Gen: Run() → generate input
        Task->>Sol: Run(input) → generate output
      end
      Note over MQ,DB: 4. Save + respond
      Task->>DB: SaveTestcase(problemId, pairs)
      Task->>Router: sendResult(result)
      Router->>MQ: Publish

### Breaking Changes

Everything is in internal packages, so there is no external impact.

- `ParseError`: `ParseError(JudgeResult, ResultCode)` → `ParseError(any, ResultCode)`
- `NewRouter`: generics removed; 4 Factories + TaskRunner injected
- `NewTestcaseManager`: added `logger` parameter
- `LangConfig.ToRunExecArgs`: added `extraArgs []string` parameter

### Change size

43 files, +8,359 / -480

### Additional context
Before submitting the PR, please make sure you do the following
fixes #123).