Description
Overview
This proposal outlines a feature to automatically detect and flag potential duplicate issues in the repository. The system will analyze newly created issues and compare them with existing ones to reduce redundancy, improve maintainability, and streamline issue management.
Problem Statement
Currently, contributors may unknowingly create duplicate issues due to:
- Not searching existing issues before filing
- Similar issues with differently worded titles
- A large issue backlog
This leads to:
- Increased maintainer workload
- Fragmented discussions
- Wasted developer effort
- Slower triaging process
A structured, automated duplicate detection mechanism is required to mitigate these inefficiencies.
Proposed Solution
The system will perform the following steps when a new issue is created:
Issue Ingestion
Capture the new issue's title and description.
Similarity Analysis
Compare the new issue against:
- Open issues
- Recently closed issues
Using:
- Text similarity (cosine similarity over TF-IDF vectors or embeddings)
- Fuzzy matching (Levenshtein distance or Fuse.js)
- Keyword matching
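As one illustration of the text-similarity option, the sketch below ranks existing issues against a new one using TF-IDF vectors and cosine similarity, with only the Python standard library. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_similar`) and the smoothed IDF formula are assumptions made for this example, not part of the proposal; a real implementation might use embeddings or a library instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(new_text, existing, top_n=3):
    """Return up to top_n (issue_number, score) pairs, best match first.

    `existing` maps issue numbers to their title plus description text.
    """
    numbers = list(existing)
    vectors = tfidf_vectors([new_text] + [existing[n] for n in numbers])
    query = vectors[0]
    scored = [(n, cosine(query, v)) for n, v in zip(numbers, vectors[1:])]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Calling `rank_similar(title + " " + body, existing)` returns the candidates to compare against the threshold. Because this version matches exact tokens only, "crash" and "crashes" count as different terms; stemming or fuzzy matching would be needed to catch such variants.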
Duplicate Detection Mechanism
If a potential duplicate is detected:
- Automatically comment on the issue with “Possible duplicate of #XX”
- Tag the issue with the label potential-duplicate
- Provide a list of the top 3 similar issues
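The comment described above could be assembled along these lines. This is a minimal sketch: `build_duplicate_comment` is a hypothetical helper, and the exact wording and layout of the list of similar issues are assumptions, since the proposal only fixes the "Possible duplicate of" line and the top-3 limit.

```python
def build_duplicate_comment(matches, top_n=3):
    """Format a comment body from (issue_number, similarity_score) pairs.

    The highest-scoring issue is named as the possible duplicate and the
    top_n best matches are listed below it.
    """
    if not matches:
        raise ValueError("at least one match is required")
    ranked = sorted(matches, key=lambda m: m[1], reverse=True)[:top_n]
    best_number = ranked[0][0]
    lines = [f"Possible duplicate of #{best_number}", "", "Most similar issues:"]
    lines += [f"- #{number} (similarity {score:.2f})" for number, score in ranked]
    return "\n".join(lines)
```

Showing the score next to each candidate is a design choice of this sketch; it lets maintainers judge quickly how strong each match is.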
Technical Approach
GitHub Actions
Trigger:
on:
  issues:
    types: [opened]
Workflow:
- Fetch existing issues via the GitHub API
- Compute a similarity score against each existing issue
- If the similarity exceeds a threshold (e.g., 0.75), comment on and label the issue
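The workflow steps might translate into a script like the following, run by the action. It is a sketch under several assumptions: the helper names (`github_request`, `drop_pull_requests`, `plan_actions`) are hypothetical, the similarity scores are computed elsewhere and passed in, and pagination and error handling are omitted. The REST paths used (`/repos/{owner}/{repo}/issues`, `.../comments`, `.../labels`) are the standard GitHub issue endpoints.

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://api.github.com"
THRESHOLD = 0.75  # example value from the proposal; tune per repository


def github_request(method: str, path: str, token: str, payload: Optional[dict] = None):
    """Send one authenticated request to the GitHub REST API and decode the JSON reply."""
    data = json.dumps(payload).encode() if payload is not None else None
    request = urllib.request.Request(
        API_ROOT + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def drop_pull_requests(items):
    """The issues endpoint also returns pull requests; keep real issues only."""
    return [item for item in items if "pull_request" not in item]


def plan_actions(repo, issue_number, scores, threshold=THRESHOLD):
    """Return the (method, path, payload) calls to make for a new issue.

    `scores` maps existing issue numbers to similarity scores. An empty list
    means no score exceeded the threshold, so nothing is posted.
    """
    hits = sorted(
        ((n, s) for n, s in scores.items() if s > threshold and n != issue_number),
        key=lambda pair: pair[1],
        reverse=True,
    )
    if not hits:
        return []
    base = f"/repos/{repo}/issues/{issue_number}"
    return [
        ("POST", f"{base}/comments", {"body": f"Possible duplicate of #{hits[0][0]}"}),
        ("POST", f"{base}/labels", {"labels": ["potential-duplicate"]}),
    ]
```

In a workflow run, existing issues could be fetched with `drop_pull_requests(github_request("GET", f"/repos/{repo}/issues?state=all&per_page=100", token))`, and each planned action sent with `github_request(method, path, token, payload)`. Separating the planning step from the network calls keeps the threshold logic testable without touching the API.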
Record
- I agree to follow this project's Code of Conduct
- I want to work on this issue