Commit bd9874a

fix(translate_only): ensure MD and HTML outputs use consistent directories
Problem:
- MD output was saved to input file's parent directory
- HTML output used final_dirs.html from config
- This caused MD/HTML files to be out of sync when input came from version history (outputs/versions/v1/)

Solution:
- Both outputs now use final_dirs from config
- MD → final_dirs.markdown (docs/mindmaps/)
- HTML → final_dirs.html (docs/pages/mindmaps/)

Impact:
- translate_only.py --html now produces synchronized outputs
- Consistent with main pipeline behavior
- Files are always in expected final directories
1 parent 982520c commit bd9874a

8 files changed: +1777 -574 lines changed

tools/ai-markmap-agent/outputs/versions/v1/neetcode_ontology_agent_evolved_en.html

Lines changed: 385 additions & 92 deletions
Large diffs are not rendered by default.

tools/ai-markmap-agent/outputs/versions/v1/neetcode_ontology_agent_evolved_en.md

Lines changed: 385 additions & 92 deletions
Large diffs are not rendered by default.

tools/ai-markmap-agent/outputs/versions/v1/neetcode_ontology_agent_evolved_zh-TW.html

Lines changed: 325 additions & 158 deletions
Large diffs are not rendered by default.

tools/ai-markmap-agent/outputs/versions/v1/neetcode_ontology_agent_evolved_zh-TW.md

Lines changed: 319 additions & 148 deletions
Large diffs are not rendered by default.

tools/ai-markmap-agent/prompts/translator/zh_tw_translator_behavior.md

Lines changed: 134 additions & 71 deletions
@@ -2,42 +2,97 @@
 
 Translate the following Markmap content to **Traditional Chinese (Taiwan)**.
 
-## CRITICAL: Use Taiwan's Algorithm & Data Structure Terminology
-
-### ⚠️ Taiwan vs Mainland China Terminology (MUST use Taiwan terms)
-
-The following terms differ between Taiwan (台灣) and Mainland China (中國大陸).
-**You MUST use the Taiwan column. NEVER use Mainland China terms.**
-
-| English | 台灣 (USE THIS) | 中國大陸 (NEVER USE) |
-|---------|-----------------|---------------------|
-| Pointer | 指標 | ~~指針~~ |
-| Two Pointers | 雙指標 | ~~雙指針~~ |
-| Fast-Slow Pointers | 快慢指標 | ~~快慢指針~~ |
-| In-place | 原地 | ~~就地~~ |
-| Enumerate | 列舉 | ~~枚舉~~ |
-| Boolean | 布林 / Boolean | ~~布爾~~ |
-| Function | 函式 | ~~函數~~ |
-| Variable | 變數 | ~~變量~~ |
-| Parameter | 參數 | ~~參數~~ (same) |
-| Memory | 記憶體 | ~~內存~~ |
-| Program | 程式 | ~~程序~~ |
-| Object | 物件 | ~~對象~~ |
-| Interface | 介面 | ~~接口~~ |
-| Implementation | 實作 | ~~實現~~ |
-| Information | 資訊 | ~~信息~~ |
-| Data | 資料 | ~~數據~~ |
-| Network | 網路 | ~~網絡~~ |
-| Software | 軟體 | ~~軟件~~ |
-| Hardware | 硬體 | ~~硬件~~ |
-| Default | 預設 | ~~默認~~ |
-| Support | 支援 | ~~支持~~ |
-| Recursive | 遞迴 | ~~遞歸~~ |
-| Iterate | 迭代 | ~~迭代~~ (same) |
-| Loop | 迴圈 | ~~循環~~ |
-| Execute | 執行 | ~~執行~~ (same) |
-
-### Standard Taiwan CS Terminology
+## ⚠️ CRITICAL: Taiwan DSA Terminology Standards
+
+You are translating for **Taiwan's Computer Science community**. Taiwan uses different terminology from Mainland China. Using Mainland terms will immediately mark the document as "非台灣體系" (non-Taiwan system).
+
+---
+
+## 🚨 A-Level: ZERO TOLERANCE (Must Replace)
+
+These terms will **100% be identified as Mainland Chinese** by Taiwan CS readers. **NEVER use the left column.**
+
+| ❌ 禁用 (NEVER USE) | ✅ 台灣標準 (USE THIS) | English |
+|---------------------|------------------------|---------|
+| 字符串 | **字串** | String |
+| 字符 | **字元** | Character |
+| 指针 / 指針 | **指標** | Pointer |
+| 就地 | **原地** | In-place |
+| 枚举 / 枚舉 | **列出 / 逐一產生** (動詞); **窮舉** (名詞) | Enumerate |
+| 搜索 | **搜尋** | Search |
+| 修剪 | **剪枝** | Prune/Pruning |
+| 映射 | **對應表 / 對照表** | Mapping |
+| 窗口 | **視窗** | Window |
+| 運行 | **執行** | Run/Execute |
+| 單元格 | **格子** | Cell (grid) |
+| 前沿 | **frontier / 邊界** | Frontier |
+| 链表 / 鏈表 | **鏈結串列** | Linked List |
+| 数组 / 數組 | **陣列** | Array |
+| 哈希 / 哈希表 | **雜湊 / 雜湊表** | Hash / Hash Table |
+| 堆栈 | **堆疊** | Stack |
+| 布尔 / 布爾 | **布林** | Boolean |
+| 函数 / 函數 | **函式** | Function |
+| 变量 / 變量 | **變數** | Variable |
+| 内存 / 內存 | **記憶體** | Memory |
+| 程序 | **程式** | Program |
+| 对象 / 對象 | **物件** | Object |
+| 接口 | **介面** | Interface |
+| 实现 / 實現 | **實作** | Implementation |
+| 信息 | **資訊** | Information |
+| 数据 / 數據 | **資料** | Data |
+| 网络 / 網絡 | **網路** | Network |
+| 软件 / 軟件 | **軟體** | Software |
+| 硬件 / 硬件 | **硬體** | Hardware |
+| 默认 / 默認 | **預設** | Default |
+| 支持 | **支援** | Support |
+| 递归 / 遞歸 | **遞迴** | Recursive |
+| 循环 / 循環 | **迴圈** | Loop |
+| 调用 / 調用 | **呼叫** | Call (function) |
+
+---
+
+## ⚠️ B-Level: SHOULD REPLACE (Taiwan Preference)
+
+These won't break the document but will make it "sound like Mainland notes." **Prefer Taiwan terms.**
+
+| 🔶 中國偏用 (Avoid) | ✅ 台灣慣用 (Prefer) | English |
+|---------------------|----------------------|---------|
+| 遍历 / 遍歷 (as noun) | **走訪 / 逐一處理** | Traversal |
+| 搜索树 / 搜索樹 | **搜尋樹** | Search Tree |
+| 子串 | **子字串** | Substring |
+| 区间 / 區間 | **區間** (OK, but 範圍 also works) | Interval |
+| 前缀 / 前綴 | **前綴** | Prefix |
+| 后缀 / 後綴 | **後綴** | Suffix |
+| 队列 / 隊列 | **佇列** | Queue |
+| 入队 / 入隊 | **加入佇列 / enqueue** | Enqueue |
+| 出队 / 出隊 | **移出佇列 / dequeue** | Dequeue |
+| 权重 / 權重 | **權重 / weight** | Weight |
+| 覆盖 / 覆蓋 (cover) | **涵蓋 / 包含** | Cover |
+| 边界情况 / 邊界情況 | **邊界情況 / edge case** | Edge Case |
+| 节点 / 節點 | **節點** (OK, ensure consistent) | Node |
+
+---
+
+## ⚠️ C-Level: 語感問題 (Sounds Like Mainland Teaching Materials)
+
+These are not "wrong" but will make Taiwan readers feel the text is "not local." **Strongly recommend replacing.**
+
+| 🔶 陸系語感 (Avoid) | ✅ 台灣自然說法 (Prefer) | Context |
+|---------------------|-------------------------|---------|
+| 變體 | **變形 / 延伸題 / 變化題 / 進階題** | Problem variants |
+| 列舉 (名詞化) | **列出 / 找出** | "列舉所有解" → "列出所有解" |
+| 系統映射 | **系統對應 / 系統對照** | System mapping |
+| 防護欄 | **注意事項 / 限制 / 實作注意** | Guardrails |
+| 有效性 | **成立條件 / 判定條件** | Validity |
+| 有效 (狀態) | **成立 / 合法** | "當有效時" → "當成立時" |
+| 無效 (狀態) | **不成立 / 不合法** | Invalid state |
+| 取捨 | **權衡** | Trade-offs |
+| 目標 (列表式) | **求解目標 / 要求** | "目標:存在" → "求解目標:存在" |
+| 實作不變量 | **實作時的不變量** | Implementation invariant |
+
+---
+
+## ✅ Taiwan Standard CS Terminology Reference
 
 | English | 台灣繁體中文 |
 |---------|-------------|
@@ -55,17 +110,17 @@ The following terms differ between Taiwan (台灣) and Mainland China (中國大
 | Sorting | 排序 |
 | Sliding Window | 滑動視窗 |
 | Dynamic Programming | 動態規劃 |
-| Backtracking | 回溯 |
+| Backtracking | 回溯法 |
 | Greedy | 貪婪法 |
 | Divide and Conquer | 分治法 |
-| BFS (Breadth-First Search) | 廣度優先搜尋 (BFS) |
-| DFS (Depth-First Search) | 深度優先搜尋 (DFS) |
+| BFS | 廣度優先搜尋 (BFS) |
+| DFS | 深度優先搜尋 (DFS) |
 | Traversal | 走訪 |
 | Node | 節點 |
 | Edge | 邊 |
 | Vertex | 頂點 |
 | Index | 索引 |
-| Invariant | 不變量 |
+| Invariant | 不變量 / 不變式 |
 | Complexity | 複雜度 |
 | Time Complexity | 時間複雜度 |
 | Space Complexity | 空間複雜度 |
@@ -80,20 +135,25 @@ The following terms differ between Taiwan (台灣) and Mainland China (中國大
 | Frequency | 頻率 |
 | Counter | 計數器 |
 | Window | 視窗 |
+| Sliding Window | 滑動視窗 |
 | Shrink | 收縮 |
 | Expand | 擴展 |
-| Valid | 有效 |
-| Invalid | 無效 |
+| Cell (grid) | 格子 |
+| Frontier | frontier / 邊界 |
+| Run/Execute | 執行 |
+| Valid | 有效 / 合法 |
+| Invalid | 無效 / 不合法 |
 | Target | 目標 |
 | Template | 模板 |
 | Pattern | 模式 |
 | State Machine | 狀態機 |
-| Wavefront | 波前 |
-| Streaming | 流式 |
+| Pointer | 指標 |
+| Two Pointers | 雙指標 |
+| Fast-Slow Pointers | 快慢指標 |
 
 ---
 
-## DO NOT Translate (Keep in English)
+## 🔒 DO NOT Translate (Keep in English)
 
 ### 1. API Kernel Names (Class-style identifiers)
 Keep these EXACTLY as-is:
@@ -116,26 +176,11 @@ Keep these EXACTLY as-is:
 - `sliding_window_cost_bounded`
 - `two_pointer_opposite_maximize`
 - `two_pointer_three_sum`
-- `two_pointer_opposite_palindrome`
-- `two_pointer_writer_dedup`
-- `two_pointer_writer_remove`
-- `two_pointer_writer_compact`
-- `fast_slow_cycle_detect`
-- `fast_slow_cycle_start`
-- `fast_slow_midpoint`
-- `fast_slow_implicit_cycle`
 - `dutch_flag_partition`
-- `two_way_partition`
 - `quickselect_partition`
 - `merge_two_sorted_lists`
-- `merge_two_sorted_arrays`
-- `merge_sorted_from_ends`
-- `merge_k_sorted_heap`
-- `merge_k_sorted_divide`
 - `heap_kth_element`
-- `linked_list_k_group_reversal`
-- `backtracking_n_queens`
-- `grid_bfs_propagation`
+- `fast_slow_cycle_detect`
 - Any other `snake_case` pattern identifiers
 
 ### 3. Code Elements
@@ -153,27 +198,45 @@ Keep these EXACTLY as-is:
 - Keep link text that contains problem names: "[LeetCode 3 - Longest Substring...]"
 
 ### 6. Table Headers with Technical Terms
-- Keep column headers like "Invariant", "State", "Goal" in the pattern tables
+- Keep column headers like "Invariant", "State", "Goal" in pattern tables
 - These are technical terms that match code concepts
 
 ---
 
 ## Translation Rules
 
 1. **Preserve Formatting**: Keep ALL Markdown formatting exactly (headers, lists, links, checkboxes, code blocks, tables)
-2. **Translate**:
-   - Section headings (but keep API Kernel names in English)
-   - Descriptive text and explanations
-   - Emoji labels are fine to keep
-3. **Hybrid Headers**: For headers like "### SubstringSlidingWindow — *1D window state machine*"
+2. **Hybrid Headers**: For headers like "### SubstringSlidingWindow — *1D window state machine*"
    - Keep `SubstringSlidingWindow` in English
    - Translate the description part: "一維視窗狀態機"
-4. **Preserve Structure**: Maintain the same tree structure and indentation
-5. **Style**: Use Taiwan's technical documentation style - concise and professional
+3. **Preserve Structure**: Maintain the same tree structure and indentation
+4. **Style**: Use Taiwan's technical documentation style - concise, professional, academic tone
 
 ---
 
-## Output
+## Self-Check Before Output
 
-Output ONLY the translated Markdown content. No explanations, no code fence wrappers.
+Scan your translation for these terms. If ANY appear, you have failed:
+
+**A-Level (零容忍):**
+```
+字符串, 字符, 指针, 指針, 就地, 枚举, 枚舉, 搜索, 修剪,
+映射, 数组, 數組, 链表, 鏈表, 哈希, 堆栈, 布尔, 布爾,
+函数, 函數, 变量, 變量, 内存, 內存, 程序, 对象, 對象,
+接口, 实现, 實現, 信息, 数据, 數據, 网络, 網絡,
+软件, 軟件, 硬件, 默认, 默認, 支持, 递归, 遞歸, 循环, 循環,
+窗口, 運行, 單元格, 前沿
+```
+
+**C-Level (語感問題 - 強烈建議避免):**
+```
+變體, 系統映射, 防護欄, 有效性, 取捨
+```
+- 「列舉」只能當動詞用,不要名詞化
+- 「有效/無效」改用「成立/不成立」或「合法/不合法」
+
+---
+
+## Output
 
+Output ONLY the translated Markdown content. No explanations, no code fence wrappers around the output.
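
The self-check list above asks the model to scan its own output. The same scan could also be run on the translated text after the model returns it. The following is a minimal sketch of that idea, not part of this commit: `find_banned_terms` and `BANNED_A_LEVEL` are hypothetical names, and the list holds only a subset of the A-Level terms.

```python
# Hypothetical post-translation check; names are illustrative, not from the repository.
# Subset of the A-Level terms listed in the prompt's self-check section.
BANNED_A_LEVEL = [
    "字符串", "字符", "指针", "指針", "就地", "搜索",
    "数组", "數組", "哈希", "函数", "函數", "窗口", "運行",
]


def find_banned_terms(text, banned=BANNED_A_LEVEL):
    """Return the banned terms that occur in text, in list order."""
    return [term for term in banned if term in text]


flagged = find_banned_terms("用雙指針掃描數組")   # contains two Mainland terms
clean = find_banned_terms("用雙指標掃描陣列")     # Taiwan terms only
print(flagged, clean)
```

Plain substring matching is the simplest choice here and can over-flag: a term such as 程序 or 支持 may appear legitimately inside a longer phrase, so hits are probably best treated as review prompts rather than hard failures.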

tools/ai-markmap-agent/src/agents/expert.py

Lines changed: 28 additions & 9 deletions
@@ -271,16 +271,35 @@ def _parse_adoption_list(self, response: str) -> AdoptionList:
         """Parse adoption list from discussion response."""
         adopted_ids = []
 
-        # Look for adoption list section
-        adoption_section = re.search(
-            r'(?:Final Adoption List|My Final Adoption|I recommend adopting).*?(?=##|$)',
-            response,
-            re.IGNORECASE | re.DOTALL
-        )
-
-        if adoption_section:
-            section_text = adoption_section.group(0)
+        # Strategy 1: Look for explicit adoption section
+        # The regex was failing because "###" contains "##"
+        # Use a more robust pattern: find the adoption header and take everything after it
+        adoption_patterns = [
+            r'(?:^|\n)#+\s*(?:My\s+)?Final\s+Adoption\s+List.*',  # "### My Final Adoption List"
+            r'I\s+recommend\s+adopting\s+(?:these\s+)?suggestions?:?\s*\n.*',  # "I recommend adopting..."
+            r'(?:^|\n)#+\s*Part\s*2\s*:?\s*Final\s+Adoption.*',  # "## Part 2: Final Adoption..."
+        ]
+
+        section_text = ""
+        for pattern in adoption_patterns:
+            match = re.search(pattern, response, re.IGNORECASE | re.DOTALL)
+            if match:
+                # Take from match position to end of response
+                section_text = response[match.start():]
+                break
+
+        # Strategy 2: If no explicit section found, look for all ✅ Agree votes
+        if not section_text:
+            # Fallback: collect IDs from ✅ Agree vote lines
+            agree_pattern = r'\*\*Vote\*\*:\s*✅\s*Agree.*?(?:^|\n)#+\s*([APE]\d+)'
+            agrees = re.findall(agree_pattern, response, re.IGNORECASE | re.DOTALL | re.MULTILINE)
+            if agrees:
+                adopted_ids = list(dict.fromkeys(agrees))
+
+        # Extract IDs from section text
+        if section_text:
             # Find all suggestion IDs (A1, P2, E3, etc.)
+            # Match IDs that appear in list items or bold text
             ids = re.findall(r'\b([APE]\d+)\b', section_text)
             adopted_ids = list(dict.fromkeys(ids))  # Remove duplicates, preserve order
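
To see what the two strategies in this hunk do, the sketch below runs the old pattern, the first new pattern, and the fallback pattern against a made-up response. The response layout (an ID heading followed by a `**Vote**` line) is an assumption, since the real expert output is not shown in this commit, and `agreed_ids_by_block` is a hypothetical alternative rather than code from the repository.

```python
import re

# Hypothetical expert response; the real response format is not shown in this diff.
DISCUSSION = (
    "### A1: Add complexity notes\n"
    "**Vote**: ✅ Agree\n\n"
    "### P2: Merge sections\n"
    "**Vote**: ❌ Disagree\n\n"
    "### E3: Rename heading\n"
    "**Vote**: ✅ Agree\n"
)
ADOPTION = "\n### My Final Adoption List\n\n#### Must adopt\n- A1\n- E3\n"
RESPONSE = DISCUSSION + ADOPTION
FLAGS = re.IGNORECASE | re.DOTALL
ID_PATTERN = r'\b([APE]\d+)\b'

# Old pattern: the lazy match stops at the first "##", including the one inside "####".
old = re.search(
    r'(?:Final Adoption List|My Final Adoption|I recommend adopting).*?(?=##|$)',
    RESPONSE, FLAGS)
old_ids = re.findall(ID_PATTERN, old.group(0))

# New Strategy 1 (first pattern only): slice from the header to the end of the response.
new = re.search(r'(?:^|\n)#+\s*(?:My\s+)?Final\s+Adoption\s+List.*', RESPONSE, FLAGS)
new_ids = list(dict.fromkeys(re.findall(ID_PATTERN, RESPONSE[new.start():])))

# Strategy 2 as committed: captures the first ID heading AFTER each Agree vote.
agree_pattern = r'\*\*Vote\*\*:\s*✅\s*Agree.*?(?:^|\n)#+\s*([APE]\d+)'
fallback_ids = re.findall(agree_pattern, DISCUSSION, FLAGS | re.MULTILINE)


def agreed_ids_by_block(response):
    """Hypothetical alternative: credit each Agree vote to the ID heading above it."""
    ids, current = [], None
    for line in response.splitlines():
        heading = re.match(r'#+\s*([APE]\d+)\b', line)
        if heading:
            current = heading.group(1)
        elif current and re.search(r'\*\*Vote\*\*:\s*✅\s*Agree', line):
            ids.append(current)
    return list(dict.fromkeys(ids))


print(old_ids, new_ids, fallback_ids, agreed_ids_by_block(DISCUSSION))
```

On this sample the old pattern finds no IDs, the new header pattern recovers `A1` and `E3`, and the fallback pattern returns `P2`, the heading that follows the first Agree vote. If real responses put the vote line before the ID heading, the committed fallback is correct as written; if the heading comes first, as assumed here, a per-block scan like `agreed_ids_by_block` may be closer to the intent.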

tools/ai-markmap-agent/translate_only.py

Lines changed: 7 additions & 4 deletions
@@ -184,6 +184,9 @@ def main() -> int:
             return 1
         print(f"\n📂 Found latest output: {input_path}")
 
+    # Create converter for output path resolution
+    converter = MarkMapHTMLConverter(config)
+
     # Determine output file
     if args.output:
         output_path = Path(args.output)
@@ -196,7 +199,9 @@ def main() -> int:
             new_stem = stem[:-len(suffix)] + f"_{args.target}"
         else:
             new_stem = f"{stem}_{args.target}"
-        output_path = input_path.parent / f"{new_stem}.md"
+
+        # Use final_dirs.markdown from config for consistency with HTML output
+        output_path = converter.md_output_dir / f"{new_stem}.md"
 
     # Determine model
     model = args.model
@@ -219,14 +224,12 @@ def main() -> int:
     # Generate HTML if requested
     if args.html:
         print("\n📊 Generating HTML...")
-        converter = MarkMapHTMLConverter(config)
         html_content = converter.convert(
             translated,
             title=f"NeetCode Agent Evolved Mindmap ({args.target.upper()})"
         )
         # Use correct HTML output directory from config
-        html_dir = converter.html_output_dir
-        html_path = html_dir / f"{output_path.stem}.html"
+        html_path = converter.html_output_dir / f"{output_path.stem}.html"
         html_path.write_text(html_content, encoding="utf-8")
         print(f"   ✓ Saved: {html_path}")
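
The path logic after this change can be sketched in isolation. The function below is illustrative only: `resolve_outputs` is a hypothetical helper, the `_{source}` suffix rule is an assumption (the diff shows only a `suffix` variable), and the two directories are the ones named in the commit message rather than values read from the real config.

```python
from pathlib import Path


def resolve_outputs(input_path, md_dir, html_dir, source="en", target="zh-TW"):
    """Return (markdown_path, html_path): one shared stem, both under configured dirs."""
    stem = input_path.stem
    suffix = f"_{source}"  # assumed form of the source-language suffix
    # Swap the source-language suffix for the target one, or append the target.
    if stem.endswith(suffix):
        new_stem = stem[:-len(suffix)] + f"_{target}"
    else:
        new_stem = f"{stem}_{target}"
    md_path = md_dir / f"{new_stem}.md"
    # The HTML name is derived from the Markdown stem, so the pair stays in sync.
    html_path = html_dir / f"{md_path.stem}.html"
    return md_path, html_path


md_path, html_path = resolve_outputs(
    Path("outputs/versions/v1/neetcode_ontology_agent_evolved_en.md"),
    md_dir=Path("docs/mindmaps"),          # final_dirs.markdown per the commit message
    html_dir=Path("docs/pages/mindmaps"),  # final_dirs.html per the commit message
)
print(md_path.as_posix())
print(html_path.as_posix())
```

Neither result depends on `input_path.parent`, which is the point of the fix: an input taken from `outputs/versions/v1/` no longer pulls the Markdown output into the version-history directory.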
