
Commit 4c8b529

docs: streamline LeetCode API discussion and merge post-processing summary into README
- Rewrite leetcode_api_discussion.md in concise English
  - Reduce content from extensive draft to a focused overview
  - Clarify API structure, field mapping, and URL generation logic
  - Retain only implementation-relevant details and usage notes
- Merge POST_PROCESSING_UPDATE_SUMMARY.md into tools/ai-markmap-agent/README.md
  - Add "Post-Processing Link Generation" section with standardized link formats
  - Document LeetCode API integration, data sources, and comparison file generation
  - Describe automatic URL normalization and GitHub solution link insertion
- Update README module responsibilities
  - Add post_processing.py for link normalization and generation
  - Add leetcode_api.py for LeetCode API data loading
- Remove temporary documentation file
  - Delete POST_PROCESSING_UPDATE_SUMMARY.md after consolidation
1 parent 1b5aec6 commit 4c8b529

5 files changed: +389 -470 lines changed


tools/ai-markmap-agent/README.md

Lines changed: 66 additions & 2 deletions
@@ -161,8 +161,10 @@ Apply adopted improvements surgically to the baseline:
 ### Phase 5-6: Post-Processing
 
 - Translation (en → zh-TW)
-- Link validation
-- HTML generation
+- Link validation and normalization
+- Automatic LeetCode URL generation
+- GitHub solution link addition
+- Comparison file generation
 
 ---
 
@@ -902,6 +904,66 @@ The system automatically loads:
 
 ---
 
+## Post-Processing Link Generation
+
+### Link Format
+
+Post-processing automatically converts LeetCode problem references to standardized links:
+
+**Format:**
+```
+[LeetCode 11](leetcode_url) | [Solution](github_url)
+```
+
+**Features:**
+- Simple format: Only problem ID, no title
+- Handles multiple AI-generated formats
+- Auto-generates LeetCode URLs from API cache
+- Adds GitHub solution links when available
+
+### Data Sources
+
+1. **Local TOML files** (`meta/problems/`) - Primary source
+2. **LeetCode API cache** (`tools/.cache/leetcode_problems.json`) - Auto-supplement
+
+**Priority:** Local TOML > API cache
+
+### Comparison Files
+
+After each post-processing run, a comparison file is automatically generated:
+
+**Location:** `outputs/final/post_processing_comparison_{timestamp}.md`
+
+**Contents:**
+- Before: Original AI-generated content
+- After: Post-processed content with normalized links
+
+**Usage:**
+- Verify link generation correctness
+- Check format compliance
+- Identify improvements needed
+
+### LeetCode API Integration
+
+The system automatically syncs with LeetCode API:
+
+```bash
+# Sync LeetCode problem data (7-day cache)
+python tools/sync_leetcode_data.py
+
+# Check cache status
+python tools/sync_leetcode_data.py --check
+```
+
+**Integration:**
+- `PostProcessor` automatically loads and merges API cache data
+- Missing URLs are auto-generated from API data
+- No configuration required
+
+See [Post-Processing Links Documentation](docs/POST_PROCESSING_LINKS.md) for details.
+
+---
+
 ## Module Responsibilities
 
 | Module | Responsibility |
@@ -910,6 +972,8 @@ The system automatically loads:
 | `consensus.py` | Programmatic majority voting |
 | `writer.py` | Refinement-mode writer |
 | `graph.py` | LangGraph workflow orchestration |
+| `post_processing.py` | Link normalization and generation |
+| `leetcode_api.py` | LeetCode API data loading |
 | `config_loader.py` | Configuration management |
 
 ---
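
The data-source priority (local TOML over API cache) and the target link format described in the README diff above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `post_processing.py`: the function names `merge_problem_data` and `format_link`, the dictionary keys `url`, `title`, and `solution_file`, and the `GITHUB_BASE` constant are all hypothetical and chosen only for the example.

```python
from typing import Optional

# Assumed base URL for solution links, modeled on the examples in the docs.
GITHUB_BASE = "https://github.com/lufftw/neetcode/blob/main/"


def merge_problem_data(local: dict[str, dict], api_cache: dict[str, dict]) -> dict[str, dict]:
    """Merge two metadata sources keyed by problem ID; local values win."""
    merged: dict[str, dict] = {}
    for pid in set(local) | set(api_cache):
        entry = dict(api_cache.get(pid, {}))  # start from the supplement
        # Local fields override, but empty local values do not erase API data.
        entry.update({k: v for k, v in local.get(pid, {}).items() if v})
        merged[pid] = entry
    return merged


def format_link(pid: str, entry: dict) -> Optional[str]:
    """Build '[LeetCode N](url) | [Solution](github_url)' for one problem."""
    url = entry.get("url")
    if not url:
        return None  # no LeetCode URL known, so nothing to link to
    link = f"[LeetCode {int(pid)}]({url})"
    solution = entry.get("solution_file")
    if solution:
        link += f" | [Solution]({GITHUB_BASE}{solution})"
    return link
```

Skipping empty local values is one possible reading of "auto-supplement": a local record without a URL still picks up the URL from the cache, while any field the local record does define takes precedence.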
Lines changed: 184 additions & 0 deletions
@@ -0,0 +1,184 @@
+# Post-Processing Link Handling
+
+## Overview
+
+The post-processing module (`post_processing.py`) converts LeetCode problem references in AI-generated mind map content into a standardized link format.
+
+## Link Format
+
+### Target Format
+
+```
+[LeetCode 11](leetcode_url) | [Solution](github_url)
+```
+
+**Features:**
+- Uses only the problem number, without the title
+- Concise, uniform format
+- Automatically adds a GitHub solution link when one exists
+
+### Supported Input Formats
+
+Post-processing handles the following formats that the AI may produce:
+
+1. **Plain text**
+   - `LeetCode 11`
+   - `LeetCode 11 - Container With Most Water`
+   - `LC 11`
+
+2. **Markdown links**
+   - `[LeetCode 11](url)`
+   - `[LeetCode 11 - Container With Most Water](url)`
+   - `[LC 11](url)`
+
+3. **Incorrect URLs**
+   - `[LeetCode 11](wrong_url)` → automatically corrected to the proper URL
+
## 處理流程
38+
39+
### 步驟 1: 文字替換
40+
41+
- `LC 11``LeetCode 11`
42+
- `LC-11``LeetCode 11`
43+
- `LeetCode11``LeetCode 11`
44+
45+
### 步驟 2: 連結轉換
46+
47+
將純文字或現有連結轉換為標準格式:
48+
49+
**輸入:**
50+
```
51+
LeetCode 11 - Container With Most Water
52+
```
53+
54+
**輸出:**
55+
```
56+
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/)
57+
```
58+
59+
### 步驟 3: URL 正規化
60+
61+
確保所有 LeetCode URL 使用正確的格式:
62+
- 移除檔案名稱格式的 slug(如 `0011_container_with_most_water`
63+
- 轉換為標準 slug(如 `container-with-most-water`
64+
- 確保以 `/description/` 結尾
65+
66+
### 步驟 4: 添加 GitHub Solution 連結
67+
68+
如果問題有對應的 solution 檔案,自動添加 GitHub 連結:
69+
70+
**輸入:**
71+
```
72+
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/)
73+
```
74+
75+
**輸出:**
76+
```
77+
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
78+
```
79+
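+## Data Sources
+
+### Local TOML Files
+
+Problem metadata is loaded from the `meta/problems/` directory, including:
+- Problem title
+- Solution file path
+- Other metadata
+
+### LeetCode API Cache
+
+The following are loaded from `tools/.cache/leetcode_problems.json`:
+- LeetCode URL
+- Slug
+- Problem title (as a supplement)
+
+**Priority:**
+1. Local TOML data (preferred)
+2. API cache data (supplement)
+
+## Comparison Files
+
+A comparison file is generated automatically after each post-processing run:
+
+**Location:** `outputs/final/post_processing_comparison_{timestamp}.md`
+
+**Contents:**
+- Before: original content (AI-generated)
+- After: content after post-processing
+
+**Usage:**
+- Review the effect of post-processing
+- Verify that links were generated correctly
+- Compare the differences before and after processing
+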
+## Examples
+
+### Example 1: Plain Text Conversion
+
+**Before:**
+```markdown
+- LeetCode 11 - Container With Most Water
+- LeetCode 3 - Longest Substring
+```
+
+**After:**
+```markdown
+- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
+- [LeetCode 3](https://leetcode.com/problems/longest-substring-without-repeating-characters/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0003_longest_substring_without_repeating_characters.py)
+```
+
+### Example 2: Fixing an Incorrect URL
+
+**Before:**
+```markdown
+- [LeetCode 11](https://leetcode.com/problems/0011_container_with_most_water/)
+```
+
+**After:**
+```markdown
+- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
+```
+
+### Example 3: Handling Multiple Formats
+
+**Before:**
+```markdown
+- LC 11
+- LeetCode 11 - Container With Most Water
+- [LeetCode 11](wrong_url)
+```
+
+**After:**
+```markdown
+- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
+- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
+- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
+```
+
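+## Configuration
+
+Post-processing behavior is controlled by the `workflow.post_processing` settings in `config/config.yaml`:
+
+```yaml
+workflow:
+  post_processing:
+    text_replacements:
+      - pattern: "\\bLC[-\\s]?(\\d+)"
+        replacement: "LeetCode \\1"
+```
+
+## Related Files
+
+- `src/post_processing.py` - Main post-processing module
+- `src/leetcode_api.py` - LeetCode API data loading
+- `src/graph.py` - Workflow integration
+- `tools/sync_leetcode_data.py` - API data sync tool
+
+## Notes
+
+1. **Simplified format**: Uses only the problem number, without the title, because the AI produces many different formats
+2. **Automatic supplementing**: If local data lacks a URL, it is filled in automatically from the API cache
+3. **Comparison files**: A comparison file is generated on every run, making it easy to review the results
+4. **Backward compatible**: Existing functionality is unaffected; the changes only supplement and standardize
+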

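Steps 1 and 3 of the processing flow documented above (text replacement and URL normalization) can be sketched as below. This is a hedged illustration, not the contents of `post_processing.py`. `LC_PATTERN` reuses the regex from the `text_replacements` config shown in the docs; `JOINED_PATTERN`, `URL_PATTERN`, and the three function names are assumptions introduced for the example. The sketch rewrites slugs purely by string manipulation, whereas the documented module is said to draw slugs from local and API data.

```python
import re

# Pattern taken from the `text_replacements` config shown in the docs.
LC_PATTERN = re.compile(r"\bLC[-\s]?(\d+)")
# Assumed extra pattern for the `LeetCode11` case listed under Step 1.
JOINED_PATTERN = re.compile(r"\bLeetCode(\d+)")
# Assumed pattern: matches a LeetCode problem URL and captures its slug.
URL_PATTERN = re.compile(r"https://leetcode\.com/problems/([^/)\s]+)(?:/[^)\s]*)?")


def normalize_text(text: str) -> str:
    """Step 1: rewrite `LC 11`, `LC-11`, and `LeetCode11` as `LeetCode 11`."""
    text = LC_PATTERN.sub(r"LeetCode \1", text)
    return JOINED_PATTERN.sub(r"LeetCode \1", text)


def normalize_slug(slug: str) -> str:
    """Turn `0011_container_with_most_water` into `container-with-most-water`."""
    slug = re.sub(r"^\d+_", "", slug)  # drop the numeric filename prefix
    return slug.replace("_", "-").lower()


def normalize_url(url: str) -> str:
    """Step 3: use the canonical slug and end the URL with `/description/`."""
    match = URL_PATTERN.match(url)
    if not match:
        return url  # leave non-LeetCode URLs untouched
    slug = normalize_slug(match.group(1))
    return f"https://leetcode.com/problems/{slug}/description/"
```

The numeric prefix is stripped only when it is followed by an underscore, so a slug that legitimately begins with a digit and has no underscore is left alone.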
tools/ai-markmap-agent/src/graph.py

Lines changed: 81 additions & 0 deletions
@@ -32,6 +32,7 @@
     ConsensusResult,
 )
 from .output.html_converter import save_all_markmaps, MarkMapHTMLConverter
+from datetime import datetime
 from .post_processing import clean_translated_content
 
 __all__ = [
@@ -98,6 +99,74 @@ class WorkflowState(TypedDict, total=False):
     _resume_config: dict[str, Any]
 
 
+def _save_post_processing_comparison(
+    comparison_data: dict[str, dict[str, str]],
+    config: dict[str, Any]
+) -> None:
+    """
+    Save post-processing before/after comparison to markdown file.
+
+    Args:
+        comparison_data: Dict mapping output key to {"before": str, "after": str}
+        config: Configuration dictionary
+    """
+    if not comparison_data:
+        return
+
+    # Get output directory from config
+    output_config = config.get("output", {})
+    final_dirs = output_config.get("final_dirs", {})
+    markdown_dir = final_dirs.get("markdown", "outputs/final")
+
+    base_dir = Path(__file__).parent.parent.parent.parent
+    output_path = base_dir / markdown_dir
+    output_path.mkdir(parents=True, exist_ok=True)
+
+    # Create comparison file
+    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+    comparison_file = output_path / f"post_processing_comparison_{timestamp}.md"
+
+    content_parts = [
+        "# Post-Processing Link Comparison",
+        "",
+        f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
+        "",
+        "This file shows the before/after comparison of post-processing link generation.",
+        "",
+        "---",
+        "",
+    ]
+
+    for key, data in comparison_data.items():
+        before = data.get("before", "")
+        after = data.get("after", "")
+
+        content_parts.extend([
+            f"## {key}",
+            "",
+            "### Before (原始內容)",
+            "",
+            "```markdown",
+            before[:5000] + ("..." if len(before) > 5000 else ""),  # Limit length
+            "```",
+            "",
+            "### After (後處理後)",
+            "",
+            "```markdown",
+            after[:5000] + ("..." if len(after) > 5000 else ""),  # Limit length
+            "```",
+            "",
+            "---",
+            "",
+        ])
+
+    try:
+        comparison_file.write_text("\n".join(content_parts), encoding="utf-8")
+        print(f" 📄 Post-processing comparison saved: {comparison_file.name}")
+    except Exception as e:
+        print(f" ⚠ Failed to save comparison: {e}")
+
+
 def load_baseline_markmap(config: dict[str, Any]) -> str:
     """
     Load the baseline Markmap from file.
@@ -873,17 +942,29 @@ def run_post_processing(state: WorkflowState) -> WorkflowState:
 
     # Apply post-processing
     final_outputs = {}
+    post_processing_comparison = {}  # Store before/after for comparison
+
     for key, content in all_outputs.items():
         if debug.enabled:
             debug.save_post_processing(content, key, is_before=True)
 
         processed = processor.process(content)
         final_outputs[key] = processed
+
+        # Store comparison for later saving
+        post_processing_comparison[key] = {
+            "before": content,
+            "after": processed
+        }
+
         print(f" ✓ Processed: {key}")
 
         if debug.enabled:
             debug.save_post_processing(processed, key, is_before=False)
 
+    # Save post-processing comparison to markdown file
+    _save_post_processing_comparison(post_processing_comparison, config)
+
     state["final_outputs"] = final_outputs
     return state
 
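In `_save_post_processing_comparison`, the same 5000-character limit expression is applied to both the before and after text. One way that logic could be factored out is sketched below. The helper name `truncate_preview` and its parameters are hypothetical and not part of this commit; the defaults mirror the literal values used in the diff.

```python
def truncate_preview(text: str, limit: int = 5000, marker: str = "...") -> str:
    """Return text unchanged if it fits, else its first `limit` chars plus a marker."""
    if len(text) <= limit:
        return text
    return text[:limit] + marker


# Usage: a short string passes through, a long one is cut and marked.
short_preview = truncate_preview("- LeetCode 11")
long_preview = truncate_preview("x" * 6000)
```

With the defaults, this behaves the same as `before[:5000] + ("..." if len(before) > 5000 else "")`. A cut at a fixed character offset can land inside a link or a fenced block, so the comparison file is best read as a preview rather than a complete copy.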