# Post-Processing Link Handling

## Overview

The post-processing module (`post_processing.py`) converts LeetCode problem references in AI-generated mindmap content into a standardized link format.

## Link Format

### Target Format

```
[LeetCode 11](leetcode_url) | [Solution](github_url)
```

**Features:**
- Uses only the problem number, not the title
- Keeps the format concise and uniform
- Automatically appends a GitHub solution link when one is available

### Input Formats Handled

Post-processing handles the various formats the AI may generate:

1. **Plain Text Format**
   - `LeetCode 11`
   - `LeetCode 11 - Container With Most Water`
   - `LC 11`

2. **Markdown Link Format**
   - `[LeetCode 11](url)`
   - `[LeetCode 11 - Container With Most Water](url)`
   - `[LC 11](url)`

3. **Incorrect URLs**
   - `[LeetCode 11](wrong_url)` → automatically replaced with the correct URL

## Processing Flow

### Overall Flow

1. **Writer Phase**: Produces raw markdown (**no post-processing**)
   - Saved to debug output
   - Used as input to the translation phase

2. **Translation Phase**: Translates the raw markdown (**no post-processing**)
   - Saved to debug output
   - Produces translated raw markdown

3. **Post-Processing Phase**: Processes links for English and Chinese
   - Processes both `writer_outputs` (English) and `translated_outputs` (Chinese)
   - Generates standardized links for all languages

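The third phase can be pictured as one function applied to every output of both earlier phases. The sketch below is illustrative only: it assumes `writer_outputs` and `translated_outputs` are dictionaries mapping an output key to markdown text, and `post_process_all` and `process` are hypothetical names, not the module's actual API.

```python
def post_process_all(writer_outputs, translated_outputs, process):
    """Apply the same link processing to English and translated outputs.

    Assumes both arguments are dicts of {output_key: markdown_text};
    `process` is any callable that maps markdown text to markdown text.
    """
    results = {}
    for outputs in (writer_outputs, translated_outputs):
        for key, markdown in outputs.items():
            results[key] = process(markdown)
    return results
```

Because the writer and translation phases skip post-processing, the same `process` callable sees raw markdown in every language, which keeps the generated links identical across languages.
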
### Post-Processing Steps

### Step 1: Text Replacement

Normalize shorthand references to the canonical `LeetCode N` form:

- `LC 11` → `LeetCode 11`
- `LC-11` → `LeetCode 11`
- `LeetCode11` → `LeetCode 11`

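A minimal sketch of this normalization with a single regular expression follows. The pattern and the function name `normalize_references` are assumptions made for illustration; the actual patterns are configured under `workflow.post_processing`.

```python
import re

# Assumed pattern: "LC" or "LeetCode", an optional space or hyphen, then the number.
_ALIAS = re.compile(r"\b(?:LC|LeetCode)[ -]?(\d+)\b")


def normalize_references(text: str) -> str:
    """Rewrite every shorthand problem reference as 'LeetCode <number>'."""
    return _ALIAS.sub(r"LeetCode \1", text)
```

The function is idempotent: text that already reads `LeetCode 11` matches the pattern and is rewritten to itself.
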
### Step 2: Link Conversion

Convert plain text or existing links to the standard format:

**Input:**
```
LeetCode 11 - Container With Most Water
```

**Output:**
```
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/)
```

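One way to sketch this conversion is a single pass that matches either an existing markdown link or a plain-text reference and rebuilds the link from the problem number. The `SLUGS` table, the pattern, and `convert_links` are hypothetical; the real module resolves slugs from its data sources. The plain-text branch treats everything after ` - ` up to the end of the line as the title, which is a simplifying assumption.

```python
import re

# Hypothetical lookup table: problem number -> LeetCode slug.
SLUGS = {11: "container-with-most-water"}

_REF = re.compile(
    r"\[LeetCode (\d+)(?: - [^\]]*)?\]\([^)]*\)"  # existing markdown link
    r"|LeetCode (\d+)(?: - [^\n]*)?"              # plain text, title runs to end of line
)


def _to_link(match: re.Match) -> str:
    number = int(match.group(1) or match.group(2))
    slug = SLUGS.get(number)
    if slug is None:
        return match.group(0)  # unknown problem: leave the text untouched
    return f"[LeetCode {number}](https://leetcode.com/problems/{slug}/description/)"


def convert_links(text: str) -> str:
    """Replace plain references and existing links with the standard link."""
    return _REF.sub(_to_link, text)
```

Using one alternation instead of two separate passes avoids re-matching the `LeetCode 11` label inside a link that was just generated.
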
### Step 3: URL Normalization

Ensure all LeetCode URLs use the correct format:
- Remove file-name-style slugs (e.g., `0011_container_with_most_water`)
- Convert them to standard slugs (e.g., `container-with-most-water`)
- Ensure the URL ends with `/description/`

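The three rules above could be sketched as follows. This is an assumption-laden illustration: it derives the standard slug from the file-name-style slug by string manipulation, whereas the real module may look the slug up in the API cache, and `normalize_slug` and `normalize_url` are hypothetical names.

```python
import re

# Assumed URL shape: https://leetcode.com/problems/<slug>[/anything]
_PROBLEM_URL = re.compile(r"https://leetcode\.com/problems/([^/\s)]+)/?[^\s)]*")


def normalize_slug(slug: str) -> str:
    """Turn '0011_container_with_most_water' into 'container-with-most-water'."""
    slug = re.sub(r"^\d+_", "", slug)  # drop the zero-padded number prefix
    return slug.replace("_", "-").lower()


def normalize_url(url: str) -> str:
    """Return the URL with a standard slug and a trailing '/description/'."""
    match = _PROBLEM_URL.fullmatch(url)
    if match is None:
        return url  # not a LeetCode problem URL: leave it alone
    return f"https://leetcode.com/problems/{normalize_slug(match.group(1))}/description/"
```
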
### Step 4: Add GitHub Solution Links

If a problem has a corresponding solution file, a GitHub link is appended automatically:

**Input:**
```
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/)
```

**Output:**
```
[LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
```

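A small sketch of this step is shown below. The base URL is taken from the example output above; the `SOLUTION_FILES` table and `add_solution_link` are hypothetical stand-ins for the metadata the module loads from `meta/problems/`.

```python
# Base URL as it appears in the example output above.
GITHUB_BASE = "https://github.com/lufftw/neetcode/blob/main/"

# Hypothetical metadata: problem number -> solution path relative to the repo root.
SOLUTION_FILES = {11: "solutions/0011_container_with_most_water.py"}


def add_solution_link(link: str, number: int) -> str:
    """Append ' | [Solution](...)' when a solution file is known for the problem."""
    path = SOLUTION_FILES.get(number)
    if path is None or "| [Solution](" in link:
        return link  # no solution file, or the link was already added
    return f"{link} | [Solution]({GITHUB_BASE}{path})"
```

Checking for an existing `[Solution]` suffix keeps the step safe to run more than once on the same text.
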
## Data Sources

### Local TOML Files

Problem metadata is loaded from the `meta/problems/` directory, including:
- Problem titles
- Solution file paths
- Other metadata

### LeetCode API Cache

Loaded from `tools/.cache/leetcode_problems.json`:
- LeetCode URLs
- Slugs
- Problem titles (as a supplement)

**Priority:**
1. Local TOML data (takes precedence)
2. API cache data (fills in what is missing)

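The priority rule can be sketched as a per-problem merge in which non-empty local values override cached ones. The record layout (dictionaries keyed by problem number with fields such as `title`, `url`, and `slug`) and the name `merge_problem_data` are assumptions for illustration.

```python
def merge_problem_data(local: dict, cache: dict) -> dict:
    """Merge per-problem records: local TOML values win, the cache fills gaps."""
    merged = {}
    for number in local.keys() | cache.keys():
        record = dict(cache.get(number, {}))
        # Only non-empty local values override what the cache provides.
        record.update({k: v for k, v in local.get(number, {}).items() if v})
        merged[number] = record
    return merged
```

With this rule, a problem whose local record has a title but no URL keeps its local title and picks up the URL from the cache.
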
## Comparison Files

After each post-processing run, a comparison file is generated automatically:

**Location:** `outputs/final/post_processing_comparison_{timestamp}.md`

**Content:**
- Before/After comparison for each language (English, Chinese, etc.)
- Before: Original content (Writer/Translation output, unprocessed)
- After: Post-processed content (links standardized)

**Purpose:**
- Check the effect of post-processing
- Verify that links are generated correctly (English and Chinese)
- Compare the content before and after processing

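As a rough illustration of what such a file might contain, the sketch below builds a file name and a Before/After section per language. The timestamp format, the section headings, and the function `build_comparison` are all assumptions; the source only specifies the `post_processing_comparison_{timestamp}.md` naming scheme.

```python
from datetime import datetime


def build_comparison(before: dict, after: dict, now: datetime = None):
    """Return (filename, markdown) comparing content before and after processing.

    `before` and `after` are assumed to be dicts keyed by the same output keys.
    """
    now = now or datetime.now()
    # Assumed timestamp layout: YYYYMMDD_HHMMSS.
    filename = f"post_processing_comparison_{now:%Y%m%d_%H%M%S}.md"
    sections = [
        f"## {key}\n\n### Before\n\n{before[key]}\n\n### After\n\n{after[key]}\n"
        for key in before
    ]
    return filename, "\n".join(sections)
```
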
## Flow Confirmation

### Writer Phase Output

**Output:** Raw markdown (no post-processing)
```
- LeetCode 11 - Container With Most Water
- LeetCode 3 - Longest Substring
```

**Debug Output:** `llm_output_writer_write.md` (original content)

### Translation Phase Output

**Input:** Writer's raw markdown (no post-processing)

**Output:** Translated raw markdown (no post-processing)
```
- LeetCode 11 - 盛最多水的容器
- LeetCode 3 - 無重複字符的最長子串
```

**Debug Output:**
- `translation_before_general_en_general_zh-TW.md` (before translation)
- `translation_after_general_en_general_zh-TW.md` (after translation)

### Post-Processing Phase Output

**Input:**
- Writer raw markdown (English)
- Translated raw markdown (Chinese)

**Output:** Post-processed markdown (English and Chinese)

**English:**
```
- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](...)
- [LeetCode 3](https://leetcode.com/problems/longest-substring-without-repeating-characters/description/) | [Solution](...)
```

**Chinese:**
```
- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](...)
- [LeetCode 3](https://leetcode.com/problems/longest-substring-without-repeating-characters/description/) | [Solution](...)
```

**Debug Output:**
- `post_processing_before_general_en.md` (English before processing)
- `post_processing_after_general_en.md` (English after processing)
- `post_processing_before_general_zh-TW.md` (Chinese before processing)
- `post_processing_after_general_zh-TW.md` (Chinese after processing)

**Comparison File:** `post_processing_comparison_{timestamp}.md` (contains comparisons for all languages)

## Examples

### Example 1: Plain Text Conversion

**Before:**
```markdown
@@ -128,7 +199,7 @@ LeetCode 11 - Container With Most Water
- [LeetCode 3](https://leetcode.com/problems/longest-substring-without-repeating-characters/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0003_longest_substring_without_repeating_characters.py)
```

### Example 2: Correcting Incorrect URLs

**Before:**
```markdown
@@ -140,7 +211,7 @@ LeetCode 11 - Container With Most Water
- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
```

### Example 3: Handling Multiple Formats

**Before:**
```markdown
@@ -156,9 +227,9 @@ LeetCode 11 - Container With Most Water
- [LeetCode 11](https://leetcode.com/problems/container-with-most-water/description/) | [Solution](https://github.com/lufftw/neetcode/blob/main/solutions/0011_container_with_most_water.py)
```

## Configuration

Post-processing behavior is controlled by the `workflow.post_processing` section of `config/config.yaml`:

```yaml
workflow:
@@ -168,17 +239,16 @@ workflow:
      replacement: "LeetCode \\1"
```

## Related Files

- `src/post_processing.py` - Main post-processing module
- `src/leetcode_api.py` - LeetCode API data loading
- `src/graph.py` - Workflow integration
- `tools/sync_leetcode_data.py` - API data synchronization tool

## Notes

1. **Format Simplification**: Links use only the problem number and omit the title, because the AI writes references in many different formats
2. **Automatic Supplementation**: If local data lacks a URL, it is filled in from the API cache
3. **Comparison Files**: A comparison file is generated on every run so the results are easy to check
4. **Backward Compatibility**: Existing functionality is unaffected; post-processing only supplements and standardizes