Commit 5edb8bc

Merge branch 'main' of github.com:opensensor/codediff
2 parents 74921d8 + 7be04bb commit 5edb8bc

1 file changed: +275 -42 lines changed

README.md

Lines changed: 275 additions & 42 deletions
@@ -1,87 +1,320 @@
11
# Smart Code Diff
22

3-
A next-generation code diffing tool that performs structural and semantic comparison of source code files. Unlike traditional line-based diff tools, Smart Code Diff understands code structure at the Abstract Syntax Tree (AST) level, enabling intelligent comparison of functions, classes, and other code elements regardless of their position in files.
3+
A next-generation code comparison tool that performs structural and semantic analysis of source code by comparing files at the Abstract Syntax Tree (AST) level rather than line by line.
44

5-
## Features
5+
## 🚀 Features
66

7-
- **Multi-Language Support**: Support for Java, Python, JavaScript, C++, C#, and more
8-
- **Structural Comparison**: Compare code at function/method level, ignoring order and formatting
9-
- **Semantic Understanding**: Identify renamed identifiers, moved code blocks, and refactoring patterns
10-
- **Cross-File Tracking**: Track code moved between files
11-
- **Multiple Interfaces**: Command-line tool, web interface, and REST API
12-
- **Customizable Rules**: User-defined comparison rules and similarity thresholds
7+
### Advanced Code Analysis
8+
- **Structural Comparison**: AST-level analysis instead of line-by-line comparison
9+
- **Semantic Understanding**: Symbol resolution and type information extraction
10+
- **Multi-Language Support**: Java, Python, JavaScript, C++, C
11+
- **Function Matching**: Intelligent function correspondence across file versions
12+
- **Refactoring Detection**: Automatic identification of common refactoring patterns
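
To illustrate what AST-level comparison means in contrast to line-by-line comparison, here is a minimal sketch. It uses Python's standard `ast` module as a stand-in for the project's parser (the README names Tree-sitter, which is not used here), and the helper names `line_diff_count` and `same_structure` are hypothetical.

```python
import ast
import difflib

old_src = """
def add(a, b):
    return a + b
"""

new_src = """
def add(a,
        b):  # reformatted, with a comment
    return (a + b)
"""


def line_diff_count(a: str, b: str) -> int:
    """Count the added and removed lines a plain line-based diff reports."""
    diff = difflib.ndiff(a.splitlines(), b.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def same_structure(a: str, b: str) -> bool:
    """True when both sources parse to identical ASTs.

    ast.dump omits line and column attributes by default, so layout,
    comments, and redundant parentheses do not affect the result.
    """
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


print(line_diff_count(old_src, new_src))  # nonzero: the lines differ textually
print(same_structure(old_src, new_src))   # the parsed structure is identical
```

A line-based diff flags the reformatted function as changed, while the structural check treats both versions as the same code.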
1313

14-
## Quick Start
14+
### Multiple Interfaces
15+
- **Command Line Interface**: Comprehensive CLI with multiple output formats
16+
- **Web Interface**: Modern React-based UI with interactive visualizations
17+
- **REST API**: Full-featured API for programmatic integration
1518

16-
### Installation
19+
### Visualization Modes
20+
- **Side-by-Side View**: Synchronized code comparison with change highlighting
21+
- **Unified Diff View**: Traditional diff format with enhanced context
22+
- **Structure View**: AST-level structural comparison
23+
- **Function-Centric View**: Detailed function-level analysis with similarity scores
1724

25+
## 📦 Installation
26+
27+
### Prerequisites
28+
- Rust 1.70+ (for building from source)
29+
- Node.js 18+ (for web interface)
30+
31+
### From Source
1832
```bash
1933
# Clone the repository
20-
git clone https://github.com/your-org/smart-code-diff.git
34+
git clone https://github.com/smart-code-diff/smart-code-diff.git
2135
cd smart-code-diff
2236

23-
# Build the project
24-
cargo build --release
37+
# Build the CLI tool
38+
cargo build --release -p smart-diff-cli
39+
40+
# Build the web server
41+
cargo build --release -p smart-diff-web
42+
43+
# Install frontend dependencies and build
44+
cd frontend
45+
npm install
46+
npm run build
47+
```
48+
49+
### Using Docker
50+
```bash
51+
# Pull and run the Docker image
52+
docker pull smartcodediff/smart-code-diff:latest
53+
docker run -p 3000:3000 smartcodediff/smart-code-diff:latest
2554
```
2655

27-
### Usage
56+
## 🎯 Quick Start
2857

29-
#### Command Line
58+
### CLI Usage
3059

60+
**Compare two files:**
3161
```bash
32-
# Compare two files
33-
smart-diff compare file1.py file2.py
62+
smart-diff-cli compare Calculator.java Calculator_refactored.java
63+
```
64+
65+
**Compare directories:**
66+
```bash
67+
smart-diff-cli compare-dir src/ src-refactored/ --recursive
68+
```
3469

35-
# Compare directories
36-
smart-diff compare --recursive src1/ src2/
70+
**Generate HTML report:**
71+
```bash
72+
smart-diff-cli compare --output html old.py new.py > report.html
73+
```
3774

38-
# Output in JSON format
39-
smart-diff compare --format json file1.js file2.js
75+
**JSON output for automation:**
76+
```bash
77+
smart-diff-cli compare --output json file1.js file2.js | jq '.similarity'
4078
```
4179

42-
#### Web Interface
80+
### Web Interface
4381

82+
1. Start the web server:
4483
```bash
45-
# Start the web server
4684
smart-diff-server
85+
```
4786

48-
# Open http://localhost:3000 in your browser
87+
2. Open your browser to `http://localhost:3000`
88+
89+
3. Upload files and explore different visualization modes
90+
91+
### API Integration
92+
93+
```bash
94+
# Compare files via REST API
95+
curl -X POST http://localhost:3000/api/compare \
96+
-H "Content-Type: application/json" \
97+
-d '{
98+
"file1": {"path": "old.java", "content": "..."},
99+
"file2": {"path": "new.java", "content": "..."},
100+
"options": {"threshold": 0.7, "detect_moves": true}
101+
}'
49102
```
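
The same request can be prepared from a script. The sketch below builds the payload shown in the `curl` example with Python's standard library. The function name `build_compare_request` is hypothetical, and the shape of the server's response is not described in this README, so the snippet stops at constructing the request.

```python
import json
import urllib.request


def build_compare_request(old_path, old_content, new_path, new_content,
                          base_url="http://localhost:3000",
                          threshold=0.7, detect_moves=True):
    """Build a POST request mirroring the curl example above."""
    payload = {
        "file1": {"path": old_path, "content": old_content},
        "file2": {"path": new_path, "content": new_content},
        "options": {"threshold": threshold, "detect_moves": detect_moves},
    }
    return urllib.request.Request(
        f"{base_url}/api/compare",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_compare_request("old.java", "class A {}", "new.java", "class A { }")
print(req.get_method(), req.full_url)

# To send it against a running server (response format is an assumption):
# with urllib.request.urlopen(req) as resp:
#     report = json.load(resp)
```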
50103

51-
## Architecture
104+
## 🏗️ Architecture
52105

53-
The project is organized as a Rust workspace with the following crates:
106+
Smart Code Diff follows a modular architecture with clear separation of concerns:
54107

55-
- **parser**: Multi-language parser engine using tree-sitter
56-
- **semantic-analysis**: Symbol resolution and type information extraction
57-
- **diff-engine**: Core diff computation with tree edit distance algorithms
58-
- **cli**: Command-line interface
59-
- **web-ui**: Web server and REST API
108+
```
109+
 ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
110+
 │     Web UI      │     │       CLI       │     │    REST API     │
111+
 │   (React/TS)    │     │     (Rust)      │     │     (Axum)      │
112+
 └─────────────────┘     └─────────────────┘     └─────────────────┘
113+
          │                       │                       │
114+
          └───────────────────────┼───────────────────────┘
115+
116+
          ┌───────────────────────────────────────────────┐
117+
          │                  Core Engine                  │
118+
          └───────────────────────────────────────────────┘
119+
120+
      ┌─────────────┬─────────────┼─────────────┬─────────────┐
121+
      │             │             │             │             │
122+
┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
123+
│  Parser   │ │ Semantic  │ │   Diff    │ │ Function  │ │  Change   │
124+
│  Engine   │ │ Analysis  │ │  Engine   │ │  Matcher  │ │Classifier │
125+
└───────────┘ └───────────┘ └───────────┘ └───────────┘ └───────────┘
126+
```
60127

61-
## Development
128+
### Core Components
62129

63-
### Prerequisites
130+
- **Parser Engine**: Tree-sitter based multi-language parsing
131+
- **Semantic Analysis**: Symbol resolution and type extraction
132+
- **Diff Engine**: Zhang-Shasha tree edit distance algorithm
133+
- **Function Matcher**: Hungarian algorithm for optimal matching
134+
- **Change Classifier**: Intelligent change categorization
135+
- **Refactoring Detector**: Pattern recognition for common refactorings
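
As a rough illustration of the Function Matcher step, the sketch below pairs old and new functions so that the total similarity is maximal, then drops pairs below a threshold. It uses an exhaustive search over permutations instead of the Hungarian algorithm named above (same optimal result, but only practical for a handful of functions). The similarity scores and the name `best_assignment` are made up for the example.

```python
from itertools import permutations


def best_assignment(similarity, threshold=0.7):
    """Return (row, col) pairs maximizing total similarity.

    similarity[i][j] is the score between old function i and new function j.
    Pairs scoring below `threshold` are discarded after the search.
    """
    if not similarity or not similarity[0]:
        return []
    rows, cols = len(similarity), len(similarity[0])
    if rows > cols:
        # More old than new functions: solve the transposed problem.
        transposed = [list(col) for col in zip(*similarity)]
        return sorted((i, j) for j, i in best_assignment(transposed, threshold))
    best_total, best_perm = -1.0, ()
    for perm in permutations(range(cols), rows):
        total = sum(similarity[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return [(i, j) for i, j in enumerate(best_perm)
            if similarity[i][j] >= threshold]


old_funcs = ["add", "multiply", "isEven"]
new_funcs = ["add", "multiply", "isNumberEven", "subtract"]
scores = [  # hypothetical similarity scores
    [1.0, 0.2, 0.1, 0.3],
    [0.2, 1.0, 0.1, 0.2],
    [0.1, 0.1, 0.75, 0.1],
]

pairs = best_assignment(scores)
matched = {old_funcs[i]: new_funcs[j] for i, j in pairs}
added = [name for j, name in enumerate(new_funcs)
         if j not in {j for _, j in pairs}]
print(matched)
print(added)
```

New functions left without a partner are reported as added, and old functions without a partner would be reported as removed in the same way.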
136+
137+
## 📊 Example Output
64138

65-
- Rust 1.70+
66-
- Node.js 18+ (for web UI development)
139+
### Text Output
140+
```
141+
File Comparison: Calculator.java → Calculator_refactored.java
142+
Language: java
143+
Overall Similarity: 87.5%
144+
145+
Function Analysis:
146+
├── add (100% match) - unchanged
147+
├── multiply (100% match) - unchanged
148+
├── isEven → isNumberEven (75% match) - renamed
149+
└── subtract (new) - added
150+
151+
Changes Detected:
152+
├── Function renamed: isEven → isNumberEven
153+
├── Method extracted: checkEvenness
154+
└── Function added: subtract
155+
156+
Refactoring Patterns:
157+
└── Extract Method (92% confidence)
158+
└── Logic extracted from isNumberEven to checkEvenness
159+
```
160+
161+
### JSON Output
162+
```json
163+
{
164+
"similarity": 0.875,
165+
"analysis": {
166+
"functions": {
167+
"total_functions": 4,
168+
"matched_functions": 3,
169+
"average_similarity": 0.92
170+
},
171+
"changes": {
172+
"total_changes": 3,
173+
"change_types": {
174+
"renamed": 1,
175+
"added": 1,
176+
"extracted": 1
177+
}
178+
},
179+
"refactoring_patterns": [
180+
{
181+
"pattern": "extract_method",
182+
"confidence": 0.92,
183+
"description": "Logic extracted from isNumberEven to checkEvenness"
184+
}
185+
]
186+
}
187+
}
188+
```
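
For automation beyond `jq`, a report like the one above can be consumed from a script. This is a minimal sketch that relies only on the fields shown in the sample; the real schema may contain more, and the helper name `summarize` is hypothetical.

```python
import json

# The sample report from this section, embedded so the sketch is self-contained.
report_text = """
{
  "similarity": 0.875,
  "analysis": {
    "functions": {
      "total_functions": 4,
      "matched_functions": 3,
      "average_similarity": 0.92
    },
    "changes": {
      "total_changes": 3,
      "change_types": {"renamed": 1, "added": 1, "extracted": 1}
    },
    "refactoring_patterns": [
      {
        "pattern": "extract_method",
        "confidence": 0.92,
        "description": "Logic extracted from isNumberEven to checkEvenness"
      }
    ]
  }
}
"""


def summarize(report: dict) -> dict:
    """Pull out the headline numbers from a parsed report."""
    functions = report["analysis"]["functions"]
    return {
        "similarity_percent": report["similarity"] * 100,
        "unmatched_functions": (functions["total_functions"]
                                - functions["matched_functions"]),
        "patterns": [(p["pattern"], p["confidence"])
                     for p in report["analysis"]["refactoring_patterns"]],
    }


summary = summarize(json.loads(report_text))
print(summary)
```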
67189

68-
### Building
190+
## 🔧 Configuration
69191

192+
Smart Code Diff supports flexible configuration through multiple methods:
193+
194+
### Global Configuration (`~/.smart-diff/config.toml`)
195+
```toml
196+
[parser]
197+
max_file_size = 10485760 # 10MB
198+
parse_timeout = 30
199+
200+
[diff_engine]
201+
default_similarity_threshold = 0.7
202+
enable_refactoring_detection = true
203+
204+
[output]
205+
default_format = "text"
206+
enable_colors = true
207+
```
208+
209+
### Project Configuration (`.smart-diff.toml`)
210+
```toml
211+
[project]
212+
name = "My Project"
213+
exclude_patterns = ["**/test/**", "**/*.generated.*"]
214+
215+
[analysis]
216+
complexity_threshold = 15
217+
duplicate_threshold = 0.85
218+
```
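
To get a feel for what `exclude_patterns` would filter out, the sketch below checks paths against the two patterns from the example. It approximates glob matching with Python's `fnmatch`, where `*` also matches `/`; how Smart Code Diff itself interprets these patterns is not specified here, so treat the semantics as an assumption. The helper name `is_excluded` is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the project configuration example above.
exclude_patterns = ["**/test/**", "**/*.generated.*"]


def is_excluded(path: str, patterns=exclude_patterns) -> bool:
    """True when a POSIX-style relative path matches any exclude pattern.

    Caveat: with fnmatch, "**/test/**" requires a directory in front of
    "test", so a top-level "test/Foo.java" does not match in this sketch.
    """
    return any(fnmatchcase(path, pattern) for pattern in patterns)


for candidate in ["src/test/FooTest.java",
                  "src/main/Foo.generated.java",
                  "src/main/Foo.java"]:
    print(candidate, is_excluded(candidate))
```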
219+
220+
## 🧪 Use Cases
221+
222+
### Code Review
223+
- Analyze pull request changes with structural understanding
224+
- Identify refactoring patterns vs. functional changes
225+
- Generate comprehensive change reports
226+
227+
### Refactoring Analysis
228+
- Track large-scale refactoring impacts
229+
- Verify refactoring tool outputs
230+
- Measure code evolution over time
231+
232+
### Migration Projects
233+
- Compare implementations across languages
234+
- Analyze architectural changes
235+
- Validate migration completeness
236+
237+
### Quality Assessment
238+
- Detect code duplication
239+
- Measure complexity changes
240+
- Track technical debt evolution
241+
242+
## 📚 Documentation
243+
244+
- **[User Guide](docs/user-guide.md)**: Comprehensive usage documentation
245+
- **[API Documentation](docs/api/)**: REST API reference and integration guide
246+
- **[Developer Guide](docs/developer-guide.md)**: Architecture and contribution guidelines
247+
- **[Configuration Reference](docs/configuration.md)**: Detailed configuration options
248+
- **[Examples](examples/)**: Practical usage examples and tutorials
249+
250+
## 🤝 Contributing
251+
252+
We welcome contributions! Please see our [Contributing Guidelines](docs/developer-guide.md#contributing-guidelines) for details.
253+
254+
### Development Setup
70255
```bash
71-
# Build all crates
256+
# Clone and setup
257+
git clone https://github.com/smart-code-diff/smart-code-diff.git
258+
cd smart-code-diff
259+
260+
# Install dependencies
72261
cargo build
262+
cd frontend && npm install
73263

74264
# Run tests
75265
cargo test
266+
npm test
76267

77-
# Run benchmarks
78-
cargo bench
268+
# Start development server
269+
cargo run -p smart-diff-web &
270+
cd frontend && npm run dev
79271
```
80272

81-
### Contributing
273+
### Code Style
274+
- Follow Rust formatting: `cargo fmt`
275+
- Use Clippy for linting: `cargo clippy`
276+
- Write comprehensive tests
277+
- Follow conventional commits
278+
279+
## 📈 Performance
280+
281+
Smart Code Diff is optimized for performance with:
282+
283+
- **Parallel Processing**: Multi-threaded analysis for large codebases
284+
- **Intelligent Caching**: Multi-level caching for repeated operations
285+
- **Memory Efficiency**: Streaming processing for large files
286+
- **Algorithmic Optimizations**: Heuristic pruning and early termination
287+
288+
### Benchmarks
289+
- **Large Files**: 10MB+ files processed in seconds
290+
- **Directory Comparison**: 1000+ files analyzed in parallel
291+
- **Memory Usage**: Efficient memory management with configurable limits
82292

83-
Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct and the process for submitting pull requests.
293+
## 🛡️ Security
84294

85-
## License
295+
- Input validation and sanitization
296+
- Resource limits to prevent DoS attacks
297+
- Sandboxed execution environments
298+
- No code execution, only analysis
299+
300+
## 📄 License
86301

87302
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
303+
304+
## 🙏 Acknowledgments
305+
306+
- [Tree-sitter](https://tree-sitter.github.io/) for parsing infrastructure
307+
- [Zhang-Shasha Algorithm](https://doi.org/10.1137/0218082) for tree edit distance
308+
- [Hungarian Algorithm](https://en.wikipedia.org/wiki/Hungarian_algorithm) for optimal matching
309+
- The Rust and React communities for excellent tooling and libraries
310+
311+
## 📞 Support
312+
313+
- **Documentation**: Check the [docs/](docs/) directory
314+
- **Issues**: Report bugs on [GitHub Issues](https://github.com/smart-code-diff/smart-code-diff/issues)
315+
- **Discussions**: Join [GitHub Discussions](https://github.com/smart-code-diff/smart-code-diff/discussions)
316+
- **Email**: support@smartcodediff.com
317+
318+
---
319+
320+
**Smart Code Diff** - Revolutionizing code comparison with structural and semantic analysis. 🚀
