Refactor the `is_duplicate` check logic to log only when the deduplication state changes
Reviewer's Guide
Implements stateful deduplication logging for screenshots, fixes the pending task count to use unprocessed screenshots, and adds a homepage navigation button to the chat interface.
Entity relationship diagram for updated pending screenshot statistics
erDiagram
    SCREENSHOT {
        int id
        bool is_processed
    }
    PROCESSINGQUEUE {
        int id
        string status
    }
    SCREENSHOT ||--o{ PROCESSINGQUEUE : "related to"
Class diagram for updated screenshot deduplication logic
classDiagram
    class Recorder {
        last_hashes: Dict[int, str]
        _last_duplicate_status: Dict[int, bool]
        hash_threshold: int
        _is_duplicate(screen_id: int, image_hash: str) -> bool
    }
    Recorder : +_is_duplicate(screen_id, image_hash)
    Recorder : +last_hashes
    Recorder : +_last_duplicate_status
    Recorder : +hash_threshold
Hey there - I've reviewed your changes - here's some feedback:
- The deduplication logic currently uses both print and logger calls—consider removing prints and relying solely on structured logging with appropriate levels.
- You repeatedly check and initialize _last_duplicate_status inside _is_duplicate; moving that initialization into the class constructor would simplify the method and avoid repeated hasattr checks.
- Switching pending_tasks to count Screenshot entries instead of using the ProcessingQueue model could change the intended behavior—double-check that this still reflects the correct number of pending jobs.
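To illustrate the last point, here is a minimal sketch of how the two counts can diverge. It uses plain dataclasses rather than the project's ORM models; the `screenshot_id` field and the `"pending"` status value are assumptions made for illustration and are not confirmed by the PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Screenshot:
    id: int
    is_processed: bool


@dataclass
class QueueItem:
    id: int
    screenshot_id: int  # hypothetical link field, not taken from the PR
    status: str         # "pending" is an assumed status value


def count_unprocessed(screenshots: List[Screenshot]) -> int:
    """New behaviour: every screenshot not yet processed counts as pending."""
    return sum(1 for s in screenshots if not s.is_processed)


def count_pending_queue(queue: List[QueueItem]) -> int:
    """Previous style of count: only queue entries still marked pending."""
    return sum(1 for q in queue if q.status == "pending")


screenshots = [
    Screenshot(id=1, is_processed=True),
    Screenshot(id=2, is_processed=False),
    Screenshot(id=3, is_processed=False),  # never enqueued in this example
]
queue = [
    QueueItem(id=10, screenshot_id=1, status="done"),
    QueueItem(id=11, screenshot_id=2, status="pending"),
]

unprocessed = count_unprocessed(screenshots)
queued = count_pending_queue(queue)
```

In this made-up data set the two numbers differ because one unprocessed screenshot has no queue entry, which is the kind of case worth checking against the real schema.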
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `lifetrace_backend/recorder.py:358-366` </location>
<code_context>
def _is_duplicate(self, screen_id: int, image_hash: str) -> bool:
    """Check whether the image is a duplicate."""
    if not self.deduplicate:
        return False
    is_duplicate = False
    if screen_id not in self.last_hashes:
        # First screenshot for this screen; not a duplicate
        self.last_hashes[screen_id] = image_hash
        # Initialize state tracking
        if not hasattr(self, '_last_duplicate_status'):
            self._last_duplicate_status = {}
        self._last_duplicate_status[screen_id] = False
        return False
    try:
        # Compute the Hamming distance
        current = imagehash.hex_to_hash(image_hash)
        previous = imagehash.hex_to_hash(self.last_hashes[screen_id])
        distance = current - previous
        is_duplicate = distance <= self.hash_threshold
        # Check whether the state has changed
        if not hasattr(self, '_last_duplicate_status'):
            self._last_duplicate_status = {}
        last_status = self._last_duplicate_status.get(screen_id, False)
        # Log only when the state changes
        if is_duplicate and not last_status:
            logger.info(f"Screen {screen_id}: duplicate screenshots started")
            print(f"[dedup] Screen {screen_id}: duplicate screenshots started")
        elif not is_duplicate and last_status:
            logger.info(f"Screen {screen_id}: duplicates ended, capture resumed")
            print(f"[dedup] Screen {screen_id}: duplicates ended, capture resumed")
        elif is_duplicate and last_status:
            # Consecutive duplicates: optionally log every N times, or not at all
            pass
        # Update the recorded state
        self._last_duplicate_status[screen_id] = is_duplicate
        self.last_hashes[screen_id] = image_hash
    except Exception as e:
        logger.error(f"Failed to compare image hashes: {e}")
        is_duplicate = False
    return is_duplicate
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Remove redundant conditional ([`remove-redundant-if`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-redundant-if/))
- Remove empty elif clause ([`remove-pass-elif`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-pass-elif/))
</issue_to_address>
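One possible way to address these points, together with the earlier suggestions about constructor initialization and dropping `print`, is sketched below. This is not the project's code: `DedupTracker` is a hypothetical, trimmed-down stand-in for `Recorder`, and `hamming_distance` replaces the `imagehash` subtraction with a plain bit count over hex strings so the sketch has no third-party dependencies.

```python
import logging
from typing import Dict

logger = logging.getLogger("recorder_sketch")  # hypothetical logger name


def hamming_distance(hex_a: str, hex_b: str) -> int:
    """Stand-in for imagehash subtraction: number of differing bits."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


class DedupTracker:
    """Hypothetical, simplified stand-in for the Recorder class in this PR."""

    def __init__(self, hash_threshold: int = 5, deduplicate: bool = True) -> None:
        self.hash_threshold = hash_threshold
        self.deduplicate = deduplicate
        self.last_hashes: Dict[int, str] = {}
        # Created once here, so the method needs no hasattr checks.
        self._last_duplicate_status: Dict[int, bool] = {}

    def is_duplicate(self, screen_id: int, image_hash: str) -> bool:
        if not self.deduplicate:
            return False

        previous = self.last_hashes.get(screen_id)
        if previous is None:
            # First screenshot for this screen is never a duplicate.
            self.last_hashes[screen_id] = image_hash
            self._last_duplicate_status[screen_id] = False
            return False

        try:
            is_dup = hamming_distance(image_hash, previous) <= self.hash_threshold
        except ValueError as exc:
            # Malformed hex string; treat as "not a duplicate", as the original does.
            logger.error("Failed to compare image hashes: %s", exc)
            return False

        was_dup = self._last_duplicate_status.get(screen_id, False)
        if is_dup != was_dup:
            # Log only on a state transition; no empty branch is needed.
            if is_dup:
                logger.info("Screen %s: duplicate screenshots started", screen_id)
            else:
                logger.info("Screen %s: duplicates ended, capture resumed", screen_id)

        self._last_duplicate_status[screen_id] = is_dup
        self.last_hashes[screen_id] = image_hash
        return is_dup
```

Comparing `is_dup` with `was_dup` directly collapses the three branches into a single transition check, which is roughly what the `remove-pass-elif` rule is pointing at.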
### Comment 2
<location> `lifetrace_backend/storage.py:664` </location>
<code_context>
def get_statistics(self) -> Dict[str, Any]:
    """Get statistics."""
    try:
        with self.get_session() as session:
            total_screenshots = session.query(Screenshot).count()
            processed_screenshots = session.query(Screenshot).filter_by(is_processed=True).count()
            pending_tasks = session.query(Screenshot).filter_by(is_processed=False).count()
            # Today's statistics
            today = datetime.now().date()
            today_start = datetime.combine(today, datetime.min.time())
            today_screenshots = session.query(Screenshot).filter(
                Screenshot.created_at >= today_start
            ).count()
            return {
                'total_screenshots': total_screenshots,
                'processed_screenshots': processed_screenshots,
                'pending_tasks': pending_tasks,
                'today_screenshots': today_screenshots,
                'processing_rate': processed_screenshots / max(total_screenshots, 1) * 100
            }
    except SQLAlchemyError as e:
        logging.error(f"Failed to get statistics: {e}")
        return {}
</code_context>
<issue_to_address>
**issue (code-quality):** Extract code out into method ([`extract-method`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/extract-method/))
</issue_to_address>
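As a rough illustration of what `extract-method` could look like here, the sketch below pulls the date arithmetic and the result dictionary out into two small helpers. The helper names `start_of_today` and `build_statistics` are hypothetical, and the database queries are deliberately left out so the example stays self-contained.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def start_of_today(now: Optional[datetime] = None) -> datetime:
    """Return midnight of the current day (hypothetical helper)."""
    now = now or datetime.now()
    return datetime.combine(now.date(), datetime.min.time())


def build_statistics(total: int, processed: int, pending: int, today: int) -> Dict[str, Any]:
    """Assemble the statistics dictionary from already-computed counts."""
    return {
        'total_screenshots': total,
        'processed_screenshots': processed,
        'pending_tasks': pending,
        'today_screenshots': today,
        # max(..., 1) avoids division by zero when there are no screenshots.
        'processing_rate': processed / max(total, 1) * 100,
    }


stats = build_statistics(total=4, processed=1, pending=3, today=2)
```

With helpers like these, `get_statistics` would only need to run the four counts inside the session and pass the results along, and the pure functions can be unit-tested without a database.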
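1. Fix the "pending" screenshot count shown on the server side
2. Improve the log output for deduplication
3. Add a back-to-home button to the server-side chat page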
Summary by Sourcery
Optimize screenshot deduplication logging, fix pending screenshot count, and add a Home button to the chat page
Enhancements: