Skip to content

Conversation

@IMMOSkipper
Copy link

No description provided.

@codecov
Copy link

codecov bot commented Dec 15, 2025

Codecov Report

❌ Patch coverage is 34.61538% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.54%. Comparing base (b32ef4b) to head (c36c11c).
⚠️ Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
blog/views.py 32.00% 17 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #797      +/-   ##
==========================================
- Coverage   84.30%   83.54%   -0.77%     
==========================================
  Files          73       73              
  Lines        3263     3288      +25     
==========================================
- Hits         2751     2747       -4     
- Misses        512      541      +29     
Flag Coverage Δ
unittests 83.54% <34.61%> (-0.77%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
blog/models.py 94.28% <100.00%> (ø)
blog/views.py 86.38% <32.00%> (-5.43%) ⬇️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements enhanced search functionality for a Django blog application, including keyword highlighting, multiple sort options (relevance, time, views), search history with localStorage, and a custom Chinese tokenizer for Whoosh. The changes also switch from MySQL to SQLite for the database and modify the slug generation library from django-uuslug to python-slugify.

Key Changes:

  • Added search result sorting by relevance, time, and views with visual feedback
  • Implemented keyword highlighting in search results with XSS protection attempts
  • Added search history tracking using localStorage with auto-suggestions and popular searches
  • Replaced jieba's built-in ChineseAnalyzer with custom ChineseTokenizer implementation
  • Changed database from MySQL to SQLite and reconfigured Haystack to use Whoosh by default

Reviewed changes

Copilot reviewed 6 out of 8 changed files in this pull request and generated 19 comments.

Show a summary per file
File Description
whoosh_index/_MAIN_0.toc Generated Whoosh index file - binary data for the search index schema
templates/search/search.html Added sorting controls UI and CSS styles for highlighting search keywords
templates/blog/tags/sidebar.html Added search history, suggestions dropdown with JavaScript, active state styling for categories/tags, and extensive inline CSS
djangoblog/whoosh_cn_backend.py Replaced jieba.analyse.ChineseAnalyzer import with custom ChineseTokenizer class implementation
djangoblog/settings.py Switched database from MySQL to SQLite, added Whoosh configuration block, commented out mdeditor app
blog/views.py Enhanced EsSearchView with sorting logic and keyword highlighting in titles/excerpts
blog/models.py Changed import from uuslug.slugify to slugify.slugify (python-slugify library)
Comments suppressed due to low confidence (2)

djangoblog/settings.py:157

  • This assignment to 'HAYSTACK_CONNECTIONS' is unnecessary as it is redefined before this value is used.
    This assignment to 'HAYSTACK_CONNECTIONS' is unnecessary as it is redefined before this value is used.
HAYSTACK_CONNECTIONS = {

djangoblog/settings.py:164

  • This assignment to 'HAYSTACK_SIGNAL_PROCESSOR' is unnecessary as it is redefined before this value is used.
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +314 to +315
# 替换HTML特殊字符以避免XSS攻击
query = query.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XSS protection is applied to the query before escaping it in the regex, but the escaped query is then used in regex substitution which injects HTML directly into article titles and excerpts. If the original query contains double quotes or single quotes, they are not escaped, which could still lead to attribute injection vulnerabilities when the highlighted content is rendered in HTML attributes. Consider using a proper HTML escaping library or escaping quotes as well.

Copilot uses AI. Check for mistakes.
Comment on lines +325 to +341
# 高亮标题中的关键词
if article.title:
article.title = regex.sub(r'<em class="highlight">\1</em>', article.title)

# 高亮摘要中的关键词
if hasattr(article, 'excerpt') and article.excerpt:
article.excerpt = regex.sub(r'<em class="highlight">\1</em>', article.excerpt)

# 如果没有摘要,从正文中提取部分内容并高亮
elif article.body:
# 提取前200个字符作为摘要
excerpt = article.body[:200]
if len(article.body) > 200:
excerpt += '...'

# 高亮摘要中的关键词
article.excerpt = regex.sub(r'<em class="highlight">\1</em>', excerpt)
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modifying the article.title and article.excerpt attributes directly in the search view mutates the model instances with HTML markup. This is problematic because: 1) It permanently modifies the model objects in memory, potentially affecting other views or code that accesses these same objects, 2) The highlighted HTML will be double-escaped if the template already applies escaping (which Django does by default), and 3) It violates separation of concerns by mixing presentation logic with data retrieval. Consider storing the highlighted versions as separate variables in the context or result objects instead of mutating the model attributes.

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +95
setTimeout(() => {
suggestionsDiv.style.display = 'none';
}, 200);
}

// 搜索框事件
searchInput.addEventListener('focus', showSuggestions);
searchInput.addEventListener('input', showSuggestions);
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a 200ms setTimeout to hide suggestions is a workaround that can lead to race conditions. A better approach would be to check if the newly focused element is within the suggestions div using relatedTarget in the blur event, or use proper focus management.

Suggested change
setTimeout(() => {
suggestionsDiv.style.display = 'none';
}, 200);
}
// 搜索框事件
searchInput.addEventListener('focus', showSuggestions);
searchInput.addEventListener('input', showSuggestions);
suggestionsDiv.style.display = 'none';
}
// 搜索框事件
searchInput.addEventListener('focus', showSuggestions);
searchInput.addEventListener('input', showSuggestions);
searchInput.addEventListener('blur', function(e) {
// If focus moves to an element inside suggestionsDiv, do not hide
if (!suggestionsDiv.contains(e.relatedTarget)) {
hideSuggestions();
}
}, true);

Copilot uses AI. Check for mistakes.
Comment on lines +311 to +341
# 关键词高亮处理
query = self.query
if query:
# 替换HTML特殊字符以避免XSS攻击
query = query.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')

# 创建正则表达式,不区分大小写
import re
regex = re.compile(r'(' + re.escape(query) + r')', re.IGNORECASE)

# 对搜索结果中的标题和摘要进行关键词高亮
for result in page.object_list:
article = result.object

# 高亮标题中的关键词
if article.title:
article.title = regex.sub(r'<em class="highlight">\1</em>', article.title)

# 高亮摘要中的关键词
if hasattr(article, 'excerpt') and article.excerpt:
article.excerpt = regex.sub(r'<em class="highlight">\1</em>', article.excerpt)

# 如果没有摘要,从正文中提取部分内容并高亮
elif article.body:
# 提取前200个字符作为摘要
excerpt = article.body[:200]
if len(article.body) > 200:
excerpt += '...'

# 高亮摘要中的关键词
article.excerpt = regex.sub(r'<em class="highlight">\1</em>', excerpt)
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The keyword highlighting logic in get_context lacks test coverage. Tests should verify that: 1) keywords are correctly highlighted in titles and excerpts, 2) HTML entities are properly escaped to prevent XSS, 3) case-insensitive matching works correctly, and 4) the excerpt truncation logic handles various body lengths appropriately.

Copilot uses AI. Check for mistakes.
<div>
<label class="screen-reader-text" for="s">{% trans 'search' %}:</label>
<input type="text" value="" name="q" id="q"/>
<input type="text" value="" name="q" id="search-input" placeholder="搜索文章..."/>
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input field's ID attribute has changed from 'q' to 'search-input', which may break other JavaScript code or tests that rely on the original 'q' ID to interact with this search input field.

Copilot uses AI. Check for mistakes.
Comment on lines +30 to +34
# 使用jieba分词
words = jieba.cut(text)
for word in words:
token = Token()
token.text = word
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ChineseTokenizer implementation doesn't set important Token attributes like 'pos' (position in the stream), 'startchar', 'endchar', which are used by Whoosh for features like highlighting and phrase queries. Without these attributes, search highlighting and position-dependent features may not work correctly.

Suggested change
# 使用jieba分词
words = jieba.cut(text)
for word in words:
token = Token()
token.text = word
# 使用jieba.tokenize获取分词及其在原文中的位置
pos = 0
for word, start, end in jieba.tokenize(text):
token = Token()
token.text = word
token.pos = pos
token.startchar = start
token.endchar = end
pos += 1

Copilot uses AI. Check for mistakes.
} else {
history.forEach(query => {
const li = document.createElement('li');
li.innerHTML = `<a href="/search/?q=${encodeURIComponent(query)}">${query}</a>`;
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The search history rendering uses encodeURIComponent in JavaScript but doesn't sanitize the query text itself before inserting it into the innerHTML. If a user enters malicious content containing HTML/JavaScript, it will be stored in localStorage and could execute when rendered. Consider using textContent instead of innerHTML or sanitizing the query display text.

Suggested change
li.innerHTML = `<a href="/search/?q=${encodeURIComponent(query)}">${query}</a>`;
const a = document.createElement('a');
a.href = `/search/?q=${encodeURIComponent(query)}`;
a.textContent = query;
li.appendChild(a);

Copilot uses AI. Check for mistakes.
Comment on lines +111 to +112
'ENGINE': 'django.db.backends.sqlite3',
'NAME': BASE_DIR / 'db.sqlite3',
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing from MySQL to SQLite is a significant downgrade for production environments. SQLite lacks concurrent write support, full-text search capabilities, and doesn't scale well. This change should only be for development/testing purposes and should not be committed to the main branch without proper configuration management (e.g., using environment variables to switch between databases).

Suggested change
'ENGINE': 'django.db.backends.sqlite3',
'NAME': BASE_DIR / 'db.sqlite3',
'ENGINE': os.environ.get('DJANGO_DB_ENGINE', 'django.db.backends.sqlite3'),
'NAME': os.environ.get('DJANGO_DB_NAME', BASE_DIR / 'db.sqlite3'),
'USER': os.environ.get('DJANGO_DB_USER', ''),
'PASSWORD': os.environ.get('DJANGO_DB_PASSWORD', ''),
'HOST': os.environ.get('DJANGO_DB_HOST', ''),
'PORT': os.environ.get('DJANGO_DB_PORT', ''),

Copilot uses AI. Check for mistakes.
}
else:
# 默认使用Whoosh作为搜索引擎
import os
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The os module is imported again locally within the else block, but it's already imported at the top of the file (line 12). This redundant import should be removed.

Suggested change
import os

Copilot uses AI. Check for mistakes.
'django.contrib.sites',
'django.contrib.sitemaps',
'mdeditor',
# 'mdeditor',
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mdeditor app is commented out in INSTALLED_APPS, but the model field MDTextField is still imported and used in blog/models.py (line 11). This will cause an ImportError when the app tries to start. Either keep mdeditor enabled or replace MDTextField with an alternative field type.

Suggested change
# 'mdeditor',
'mdeditor',

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant