Commit 6bd460d

feat: Add reuse support for translation stage
- Add load_translation_outputs_from_run() function in resume.py
- Implement full reuse logic in translation phase (graph.py)
- Update stage lists in main.py and resume.py to include translation
- Unify stage naming to use "translation" consistently
- Translation outputs can now be reused from previous runs, similar to other stages
1 parent 4c8b529 commit 6bd460d

File tree

10 files changed

+2912
-2325
lines changed

docs/pages/mindmaps/neetcode_ontology_agent_evolved_en.html

Lines changed: 555 additions & 511 deletions
Large diffs are not rendered by default.

docs/pages/mindmaps/neetcode_ontology_agent_evolved_zh-TW.html

Lines changed: 616 additions & 574 deletions
Large diffs are not rendered by default.

tools/ai-markmap-agent/README.md

Lines changed: 49 additions & 7 deletions
@@ -108,9 +108,23 @@ This system refines existing high-quality Markmaps through multi-expert review a
108108
│ Apply adopted improvements to baseline → Refined Markmap │
109109
│ │ │
110110
│ ════════════════════════════════════════════════════════════════════════ │
111-
│ Phase 5-6: Translation & Post-Processing
111+
│ Phase 4: Writer
112112
│ ════════════════════════════════════════════════════════════════════════ │
113113
│ │
114+
│ Apply improvements → Raw markdown (saved to debug) │
115+
│ │ │
116+
│ ════════════════════════════════════════════════════════════════════════ │
117+
│ Phase 5: Translation │
118+
│ ════════════════════════════════════════════════════════════════════════ │
119+
│ │ │
120+
│ Translate raw markdown → Translated raw markdown (saved to debug) │
121+
│ │ │
122+
│ ════════════════════════════════════════════════════════════════════════ │
123+
│ Phase 6: Post-Processing │
124+
│ ════════════════════════════════════════════════════════════════════════ │
125+
│ │ │
126+
│ Normalize links for BOTH English and translated outputs │
127+
│ │
114128
└─────────────────────────────────────────────────────────────────────────────┘
115129
```
116130

@@ -158,13 +172,25 @@ Apply adopted improvements surgically to the baseline:
158172
- Preserve existing quality
159173
- Verify links and formatting
160174

161-
### Phase 5-6: Post-Processing
175+
### Phase 4: Writer
176+
177+
- Applies adopted improvements to baseline
178+
- Outputs raw markdown (no post-processing)
179+
- Saves to debug output for inspection
180+
181+
### Phase 5: Translation
182+
183+
- Translates writer outputs (raw markdown)
184+
- Both original and translated outputs saved to debug
185+
- Outputs raw markdown (no post-processing)
186+
187+
### Phase 6: Post-Processing
162188

163-
- Translation (en → zh-TW)
189+
- Processes both English and translated outputs
164190
- Link validation and normalization
165-
- Automatic LeetCode URL generation
191+
- Automatic LeetCode URL generation for all languages
166192
- GitHub solution link addition
167-
- Comparison file generation
193+
- Comparison file generation (before/after for each language)
168194

169195
---
170196

@@ -928,18 +954,34 @@ Post-processing automatically converts LeetCode problem references to standardiz
928954
929955
**Priority:** Local TOML > API cache
930956
957+
### Processing Flow
958+
959+
1. **Writer Phase**: Generates raw markdown (no post-processing)
960+
- Saved to debug output: `llm_output_writer_write.md`
961+
- Used as input for translation
962+
963+
2. **Translation Phase**: Translates raw markdown (no post-processing)
964+
- Debug output: `translation_before_*.md` and `translation_after_*.md`
965+
- Outputs translated raw markdown
966+
967+
3. **Post-Processing Phase**: Processes both English and translated outputs
968+
- Input: Raw markdown from writer (English) + translations (e.g., Chinese)
969+
- Output: Post-processed markdown with normalized links for all languages
970+
- Debug output: `post_processing_before_*.md` and `post_processing_after_*.md` for each language
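
The ordering above can be sketched as a small driver. This is only an illustration under stated assumptions: `run_pipeline` and its three callables are hypothetical names, not functions from the repository. The point it shows is that the writer and translation stages pass raw markdown through untouched, and link normalization runs once, last, over every language.

```python
from typing import Callable, Dict, Iterable


def run_pipeline(
    write: Callable[[], str],
    translate: Callable[[str, str], str],
    post_process: Callable[[str], str],
    target_langs: Iterable[str] = ("zh-TW",),
) -> Dict[str, str]:
    """Hypothetical driver: writer -> translation -> post-processing."""
    # Phase 4: the writer emits raw markdown; nothing is normalized yet.
    raw = {"en": write()}
    # Phase 5: each translation is derived from the raw English markdown.
    for lang in target_langs:
        raw[lang] = translate(raw["en"], lang)
    # Phase 6: post-processing runs once per language, English included.
    return {lang: post_process(markdown) for lang, markdown in raw.items()}
```

Because post-processing is the only stage that rewrites links, a reused translation output from a previous run can be fed into the last step without being regenerated.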
971+
931972
### Comparison Files
932973
933974
After each post-processing run, a comparison file is automatically generated:
934975
935976
**Location:** `outputs/final/post_processing_comparison_{timestamp}.md`
936977
937978
**Contents:**
938-
- Before: Original AI-generated content
979+
- Before/After comparison for each language (English, Chinese, etc.)
980+
- Before: Raw content from Writer/Translation (no post-processing)
939981
- After: Post-processed content with normalized links
940982
941983
**Usage:**
942-
- Verify link generation correctness
984+
- Verify link generation correctness for all languages
943985
- Check format compliance
944986
- Identify improvements needed
945987
Lines changed: 141 additions & 71 deletions
@@ -1,120 +1,191 @@
1-
# 後處理連結處理說明
1+
# Post-Processing Link Handling
22

3-
## 概述
3+
## Overview
44

5-
後處理模組 (`post_processing.py`) 負責將 AI 生成的 mindmap 內容中的 LeetCode 問題引用轉換為標準化的連結格式。
5+
The post-processing module (`post_processing.py`) is responsible for converting LeetCode problem references in AI-generated mindmap content into standardized link formats.
66

7-
## 連結格式
7+
## Link Format
88

9-
### 目標格式
9+
### Target Format
1010

1111
```
1212
[LeetCode 11](leetcode_url) | [Solution](github_url)
1313
```
1414

15-
**特點:**
16-
- 只使用題號,不包含標題
17-
- 格式簡潔統一
18-
- 自動添加 GitHub solution 連結(如果有)
15+
**Features:**
16+
- Uses only problem numbers, excludes titles
17+
- Concise and unified format
18+
- Automatically adds GitHub solution links (if available)
1919

20-
### 處理的輸入格式
20+
### Input Formats Handled
2121

22-
後處理會處理以下多種 AI 可能產生的格式:
22+
Post-processing handles the following formats that the AI may generate:
2323

24-
1. **純文字格式**
24+
1. **Plain Text Format**
2525
- `LeetCode 11`
2626
- `LeetCode 11 - Container With Most Water`
2727
- `LC 11`
2828

29-
2. **Markdown 連結格式**
29+
2. **Markdown Link Format**
3030
- `[LeetCode 11](url)`
3131
- `[LeetCode 11 - Container With Most Water](url)`
3232
- `[LC 11](url)`
3333

34-
3. **錯誤的 URL**
35-
- `[LeetCode 11](wrong_url)`自動修正為正確的 URL
34+
3. **Incorrect URLs**
35+
- `[LeetCode 11](wrong_url)` → automatically corrected to the correct URL
3636

37-
## 處理流程
37+
## Processing Flow
3838

39-
### 步驟 1: 文字替換
39+
### Overall Flow
40+
41+
1. **Writer Phase**: Produces raw markdown (**no post-processing**)
42+
- Saved to debug output
43+
- Used for translation phase
44+
45+
2. **Translation Phase**: Translates raw markdown (**no post-processing**)
46+
- Saved to debug output
47+
- Produces translated raw markdown
48+
49+
3. **Post-Processing Phase**: Processes links for English and Chinese
50+
- Simultaneously processes `writer_outputs` (English) and `translated_outputs` (Chinese)
51+
- Generates standardized links for all languages
52+
53+
### Post-Processing Steps
54+
55+
### Step 1: Text Replacement
4056

4157
- `LC 11` → `LeetCode 11`
4258
- `LC-11` → `LeetCode 11`
4359
- `LeetCode11` → `LeetCode 11`
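
A minimal sketch of this replacement step, assuming a single regular expression covers the three shorthand forms listed above. The pattern and the name `normalize_references` are assumptions made for illustration; the patterns actually used are configured under `workflow.post_processing` and may differ.

```python
import re

# Hypothetical pattern: "LC" or "LeetCode", optional spaces or hyphens, then the number.
_SHORTHAND = re.compile(r"\b(?:LC|LeetCode)[\s-]*(\d+)\b")


def normalize_references(text: str) -> str:
    """Rewrite 'LC 11', 'LC-11' and 'LeetCode11' as 'LeetCode 11'."""
    return _SHORTHAND.sub(r"LeetCode \1", text)
```

The match is case-sensitive, so lowercase `leetcode.com` inside a URL is left alone.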
4460

45-
### 步驟 2: 連結轉換
61+
### Step 2: Link Conversion
4662

47-
將純文字或現有連結轉換為標準格式:
63+
Convert plain text or existing links to standard format:
4864

49-
**輸入:**
65+
**Input:**
5066
```
5167
LeetCode 11 - Container With Most Water
5268
```
5369

54-
**輸出:**
70+
**Output:**
5571
```
5672
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/)
5773
```
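
The conversion can be sketched as follows. This is an illustration, not the module's code: the `SLUGS` table and `to_standard_link` are hypothetical stand-ins for the lookup that the real module performs against its problem data.

```python
import re

# Hypothetical slug table standing in for the TOML metadata / API cache lookup.
SLUGS = {
    3: "longest-substring-without-repeating-characters",
    11: "container-with-most-water",
}

# Either an existing markdown link or a plain reference, each with an optional " - Title".
_REF = re.compile(
    r"\[LeetCode (?P<linked>\d+)(?: - [^\]]*)?\]\([^)]*\)"
    r"|LeetCode (?P<plain>\d+)(?: - [^\n|\[\]]*)?"
)


def to_standard_link(text: str) -> str:
    """Replace each reference with '[LeetCode N](url)', dropping the title."""

    def repl(match: re.Match) -> str:
        number = int(match.group("linked") or match.group("plain"))
        slug = SLUGS.get(number)
        if slug is None:
            return match.group(0)  # unknown problem: leave the text untouched
        return f"[LeetCode {number}](https://leetcode.com/problems/{slug}/description/)"

    return _REF.sub(repl, text)
```

In this sketch a plain-text title is assumed to run to the end of the line, so text that follows a title on the same line would be swallowed; a stricter pattern would be needed for inline references.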
5874

59-
### 步驟 3: URL 正規化
75+
### Step 3: URL Normalization
6076

61-
確保所有 LeetCode URL 使用正確的格式:
62-
- 移除檔案名稱格式的 slug(如 `0011_container_with_most_water`
63-
- 轉換為標準 slug(如 `container-with-most-water`
64-
- 確保以 `/description/` 結尾
77+
Ensure all LeetCode URLs use the correct format:
78+
- Remove file name format slugs (e.g., `0011_container_with_most_water`)
79+
- Convert to standard slugs (e.g., `container-with-most-water`)
80+
- Ensure the URL ends with `/description/`
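
A rough sketch of these three rules. It assumes the standard slug can be derived from the file-name form by dropping the numeric prefix and swapping underscores for hyphens, which holds for the example above but is not guaranteed for every problem; the real module may instead look the slug up in its data sources. `normalize_leetcode_url` is a hypothetical name.

```python
import re

_PROBLEM_URL = re.compile(r"https://leetcode\.com/problems/([^/\s)]+)(?:/[^\s)]*)?")


def normalize_leetcode_url(url: str) -> str:
    """Return the URL with a standard slug and a trailing '/description/'."""
    match = _PROBLEM_URL.fullmatch(url)
    if match is None:
        return url  # not a LeetCode problem URL: leave unchanged
    slug = re.sub(r"^\d+_", "", match.group(1))  # drop a "0011_" style prefix
    slug = slug.replace("_", "-").lower()
    return f"https://leetcode.com/problems/{slug}/description/"
```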
6581

66-
### 步驟 4: 添加 GitHub Solution 連結
82+
### Step 4: Add GitHub Solution Links
6783

68-
如果問題有對應的 solution 檔案,自動添加 GitHub 連結:
84+
If a problem has a corresponding solution file, a GitHub link is added automatically:
6985

70-
**輸入:**
86+
**Input:**
7187
```
7288
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/)
7389
```
7490

75-
**輸出:**
91+
**Output:**
7692
```
7793
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
7894
```
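
One way to sketch this step, assuming a mapping from problem number to solution path. The `SOLUTION_FILES` table and `add_solution_links` are hypothetical; the base URL is taken from the example output above.

```python
import re

GITHUB_BASE = "https://github.com/lufftw/neetcode/blob/main/"
# Hypothetical table; the real module reads solution paths from problem metadata.
SOLUTION_FILES = {11: "solutions/0011_container_with_most_water.py"}

# A standard link that is not already followed by a Solution link.
_LINK = re.compile(r"\[LeetCode (\d+)\]\([^)]*\)(?! \| \[Solution\])")


def add_solution_links(text: str) -> str:
    """Append ' | [Solution](...)' to links whose problem has a solution file."""

    def repl(match: re.Match) -> str:
        path = SOLUTION_FILES.get(int(match.group(1)))
        if path is None:
            return match.group(0)  # no solution file: keep the link as is
        return f"{match.group(0)} | [Solution]({GITHUB_BASE}{path})"

    return _LINK.sub(repl, text)
```

The negative lookahead keeps the step idempotent, so running post-processing twice on the same text does not duplicate the Solution link.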
7995

80-
## 資料來源
96+
## Data Sources
97+
98+
### Local TOML Files
99+
100+
Problem metadata is loaded from the `meta/problems/` directory, including:
101+
- Problem titles
102+
- Solution file paths
103+
- Other metadata
104+
105+
### LeetCode API Cache
81106

82-
### 本地 TOML 檔案
107+
Load from `tools/.cache/leetcode_problems.json`:
108+
- LeetCode URLs
109+
- Slugs
110+
- Problem titles (as supplement)
83111

84-
`meta/problems/` 目錄載入問題元資料,包含:
85-
- 問題標題
86-
- Solution 檔案路徑
87-
- 其他元資料
112+
**Priority:**
113+
1. Local TOML data (priority)
114+
2. API cache data (supplement)
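
The priority rule amounts to a merge in which local values win and the cache fills the gaps. The sketch below assumes each source yields a flat dictionary per problem; the field names and `merge_problem_data` are hypothetical.

```python
from typing import Any, Dict


def merge_problem_data(local: Dict[str, Any], cached: Dict[str, Any]) -> Dict[str, Any]:
    """Start from the API cache, then overlay non-empty local TOML values."""
    merged = dict(cached)
    merged.update({key: value for key, value in local.items() if value not in (None, "")})
    return merged
```

Empty local fields are skipped so that a missing URL in the TOML data does not overwrite a usable one from the cache.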
88115

89-
### LeetCode API 快取
116+
## Comparison Files
90117

91-
`tools/.cache/leetcode_problems.json` 載入:
92-
- LeetCode URL
93-
- Slug
94-
- 問題標題(作為補充)
118+
After each post-processing execution, a comparison file is automatically generated:
95119

96-
**優先順序:**
97-
1. 本地 TOML 資料(優先)
98-
2. API 快取資料(補充)
120+
**Location:** `outputs/final/post_processing_comparison_{timestamp}.md`
99121

100-
## 對比檔案
122+
**Content:**
123+
- Before/After comparison for each language (English, Chinese, etc.)
124+
- Before: Original content (Writer/Translation output, unprocessed)
125+
- After: Post-processed content (links standardized)
101126

102-
每次執行後處理後,會自動生成對比檔案:
127+
**Purpose:**
128+
- Check post-processing effectiveness
129+
- Verify links are correctly generated (English and Chinese)
130+
- Compare differences before and after processing
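
As a sketch of what such a file might contain, the helpers below render one Before/After section per language. The section headings, the timestamp format, and both function names are assumptions; the documentation only specifies the `post_processing_comparison_{timestamp}.md` file name pattern.

```python
from datetime import datetime
from typing import Dict, Optional


def comparison_filename(now: Optional[datetime] = None) -> str:
    # The timestamp format is an assumption; the docs only show "{timestamp}".
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"post_processing_comparison_{stamp}.md"


def build_comparison(before: Dict[str, str], after: Dict[str, str]) -> str:
    """Render one Before/After section per language key found in `before`."""
    parts = ["# Post-Processing Comparison"]
    for lang, raw in before.items():
        parts += [
            f"## {lang}",
            "### Before",
            raw,
            "### After",
            after.get(lang, ""),
        ]
    return "\n\n".join(parts) + "\n"
```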
103131

104-
**位置:** `outputs/final/post_processing_comparison_{timestamp}.md`
132+
## Flow Confirmation
105133

106-
**內容:**
107-
- Before: 原始內容(AI 生成)
108-
- After: 後處理後的內容
134+
### Writer Phase Output
109135

110-
**用途:**
111-
- 檢查後處理效果
112-
- 驗證連結是否正確生成
113-
- 比較處理前後的差異
136+
**Output:** Raw markdown (no post-processing)
137+
```
138+
- LeetCode 11 - Container With Most Water
139+
- LeetCode 3 - Longest Substring
140+
```
114141

115-
## 範例
142+
**Debug Output:** `llm_output_writer_write.md` (original content)
116143

117-
### 範例 1: 純文字轉換
144+
### Translation Phase Output
145+
146+
**Input:** Writer's raw markdown (no post-processing)
147+
148+
**Output:** Translated raw markdown (no post-processing)
149+
```
150+
- LeetCode 11 - 盛最多水的容器
151+
- LeetCode 3 - 無重複字符的最長子串
152+
```
153+
154+
**Debug Output:**
155+
- `translation_before_general_en_general_zh-TW.md` (before translation)
156+
- `translation_after_general_en_general_zh-TW.md` (after translation)
157+
158+
### Post-Processing Phase Output
159+
160+
**Input:**
161+
- Writer raw markdown (English)
162+
- Translated raw markdown (Chinese)
163+
164+
**Output:** Post-processed markdown (English and Chinese)
165+
166+
**English:**
167+
```
168+
- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](...)
169+
- [LeetCode 3](https://leetcode.com/problems/longest-substring-without-repeating-characters/description/) | [Solution](...)
170+
```
171+
172+
**Chinese:**
173+
```
174+
- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](...)
175+
- [LeetCode 3](https://leetcode.com/problems/longest-substring-without-repeating-characters/description/) | [Solution](...)
176+
```
177+
178+
**Debug Output:**
179+
- `post_processing_before_general_en.md` (English before processing)
180+
- `post_processing_after_general_en.md` (English after processing)
181+
- `post_processing_before_general_zh-TW.md` (Chinese before processing)
182+
- `post_processing_after_general_zh-TW.md` (Chinese after processing)
183+
184+
**Comparison File:** `post_processing_comparison_{timestamp}.md` (contains comparisons for all languages)
185+
186+
## Examples
187+
188+
### Example 1: Plain Text Conversion
118189

119190
**Before:**
120191
```markdown
@@ -128,7 +199,7 @@ LeetCode 11 - Container With Most Water
128199
- [LeetCode 3](https://leetcode.com/problems/longest-substring-without-repeating-characters/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0003_longest_substring_without_repeating_characters.py)
129200
```
130201

131-
### 範例 2: 修正錯誤 URL
202+
### Example 2: Correcting Incorrect URLs
132203

133204
**Before:**
134205
```markdown
@@ -140,7 +211,7 @@ LeetCode 11 - Container With Most Water
140211
- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
141212
```
142213

143-
### 範例 3: 處理多種格式
214+
### Example 3: Handling Multiple Formats
144215

145216
**Before:**
146217
```markdown
@@ -156,9 +227,9 @@ LeetCode 11 - Container With Most Water
156227
- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
157228
```
158229

159-
## 配置
230+
## Configuration
160231

161-
後處理行為由 `config/config.yaml` 中的 `workflow.post_processing` 配置控制:
232+
Post-processing behavior is controlled by the `workflow.post_processing` configuration in `config/config.yaml`:
162233

163234
```yaml
164235
workflow:
@@ -168,17 +239,16 @@ workflow:
168239
replacement: "LeetCode \\1"
169240
```
170241
171-
## 相關檔案
172-
173-
- `src/post_processing.py` - 後處理主模組
174-
- `src/leetcode_api.py` - LeetCode API 資料載入
175-
- `src/graph.py` - 工作流程整合
176-
- `tools/sync_leetcode_data.py` - API 資料同步工具
242+
## Related Files
177243
178-
## 注意事項
244+
- `src/post_processing.py` - Post-processing main module
245+
- `src/leetcode_api.py` - LeetCode API data loading
246+
- `src/graph.py` - Workflow integration
247+
- `tools/sync_leetcode_data.py` - API data synchronization tool
179248

180-
1. **格式簡化**:只使用題號,不包含標題,因為 AI 產生的格式很多元
181-
2. **自動補充**:如果本地資料缺少 URL,自動從 API 快取補充
182-
3. **對比檔案**:每次執行都會生成對比檔案,方便檢查效果
183-
4. **向後相容**:不影響現有功能,只做補充和標準化
249+
## Notes
184250

251+
1. **Format Simplification**: Uses only problem numbers and excludes titles, because AI-generated formats vary widely
252+
2. **Automatic Supplementation**: If local data lacks a URL, it is filled in automatically from the API cache
253+
3. **Comparison Files**: A comparison file is generated after each run so the effect of post-processing is easy to check
254+
4. **Backward Compatibility**: Does not affect existing functionality; it only supplements and standardizes
