|
1 | 1 | # Smart Code Diff |
2 | 2 |
|
3 | | -A next-generation code diffing tool that performs structural and semantic comparison of source code files. Unlike traditional line-based diff tools, Smart Code Diff understands code structure at the Abstract Syntax Tree (AST) level, enabling intelligent comparison of functions, classes, and other code elements regardless of their position in files. |
| 3 | +A code comparison tool that analyzes source files structurally and semantically at the Abstract Syntax Tree (AST) level, instead of comparing them line by line as traditional diff tools do. |
4 | 4 |
|
5 | | -## Features |
| 5 | +## 🚀 Features |
6 | 6 |
|
7 | | -- **Multi-Language Support**: Support for Java, Python, JavaScript, C++, C#, and more |
8 | | -- **Structural Comparison**: Compare code at function/method level, ignoring order and formatting |
9 | | -- **Semantic Understanding**: Identify renamed identifiers, moved code blocks, and refactoring patterns |
10 | | -- **Cross-File Tracking**: Track code moved between files |
11 | | -- **Multiple Interfaces**: Command-line tool, web interface, and REST API |
12 | | -- **Customizable Rules**: User-defined comparison rules and similarity thresholds |
| 7 | +### Advanced Code Analysis |
| 8 | +- **Structural Comparison**: AST-level analysis instead of line-by-line comparison |
| 9 | +- **Semantic Understanding**: Symbol resolution and type information extraction |
| 10 | +- **Multi-Language Support**: Java, Python, JavaScript, C++, C |
| 11 | +- **Function Matching**: Intelligent function correspondence across file versions |
| 12 | +- **Refactoring Detection**: Automatic identification of common refactoring patterns |
13 | 13 |
|
14 | | -## Quick Start |
| 14 | +### Multiple Interfaces |
| 15 | +- **Command Line Interface**: Comprehensive CLI with multiple output formats |
| 16 | +- **Web Interface**: Modern React-based UI with interactive visualizations |
| 17 | +- **REST API**: Full-featured API for programmatic integration |
15 | 18 |
|
16 | | -### Installation |
| 19 | +### Visualization Modes |
| 20 | +- **Side-by-Side View**: Synchronized code comparison with change highlighting |
| 21 | +- **Unified Diff View**: Traditional diff format with enhanced context |
| 22 | +- **Structure View**: AST-level structural comparison |
| 23 | +- **Function-Centric View**: Detailed function-level analysis with similarity scores |
17 | 24 |
|
| 25 | +## 📦 Installation |
| 26 | + |
| 27 | +### Prerequisites |
| 28 | +- Rust 1.70+ (for building from source) |
| 29 | +- Node.js 18+ (for web interface) |
| 30 | + |
| 31 | +### From Source |
18 | 32 | ```bash |
19 | 33 | # Clone the repository |
20 | | -git clone https://github.com/your-org/smart-code-diff.git |
| 34 | +git clone https://github.com/smart-code-diff/smart-code-diff.git |
21 | 35 | cd smart-code-diff |
22 | 36 |
|
23 | | -# Build the project |
24 | | -cargo build --release |
| 37 | +# Build the CLI tool |
| 38 | +cargo build --release -p smart-diff-cli |
| 39 | + |
| 40 | +# Build the web server |
| 41 | +cargo build --release -p smart-diff-web |
| 42 | + |
| 43 | +# Install frontend dependencies and build |
| 44 | +cd frontend |
| 45 | +npm install |
| 46 | +npm run build |
| 47 | +``` |
| 48 | + |
| 49 | +### Using Docker |
| 50 | +```bash |
| 51 | +# Pull and run the Docker image |
| 52 | +docker pull smartcodediff/smart-code-diff:latest |
| 53 | +docker run -p 3000:3000 smartcodediff/smart-code-diff:latest |
25 | 54 | ``` |
26 | 55 |
|
27 | | -### Usage |
| 56 | +## 🎯 Quick Start |
28 | 57 |
|
29 | | -#### Command Line |
| 58 | +### CLI Usage |
30 | 59 |
|
| 60 | +**Compare two files:** |
31 | 61 | ```bash |
32 | | -# Compare two files |
33 | | -smart-diff compare file1.py file2.py |
| 62 | +smart-diff-cli compare Calculator.java Calculator_refactored.java |
| 63 | +``` |
| 64 | + |
| 65 | +**Compare directories:** |
| 66 | +```bash |
| 67 | +smart-diff-cli compare-dir src/ src-refactored/ --recursive |
| 68 | +``` |
34 | 69 |
|
35 | | -# Compare directories |
36 | | -smart-diff compare --recursive src1/ src2/ |
| 70 | +**Generate HTML report:** |
| 71 | +```bash |
| 72 | +smart-diff-cli compare --output html old.py new.py > report.html |
| 73 | +``` |
37 | 74 |
|
38 | | -# Output in JSON format |
39 | | -smart-diff compare --format json file1.js file2.js |
| 75 | +**JSON output for automation:** |
| 76 | +```bash |
| 77 | +smart-diff-cli compare --output json file1.js file2.js | jq '.similarity' |
40 | 78 | ``` |
41 | 79 |
|
42 | | -#### Web Interface |
| 80 | +### Web Interface |
43 | 81 |
|
| 82 | +1. Start the web server: |
44 | 83 | ```bash |
45 | | -# Start the web server |
46 | 84 | smart-diff-server |
| 85 | +``` |
47 | 86 |
|
48 | | -# Open http://localhost:3000 in your browser |
| 87 | +2. Open your browser to `http://localhost:3000` |
| 88 | + |
| 89 | +3. Upload files and explore different visualization modes |
| 90 | + |
| 91 | +### API Integration |
| 92 | + |
| 93 | +```bash |
| 94 | +# Compare files via REST API |
| 95 | +curl -X POST http://localhost:3000/api/compare \ |
| 96 | + -H "Content-Type: application/json" \ |
| 97 | + -d '{ |
| 98 | + "file1": {"path": "old.java", "content": "..."}, |
| 99 | + "file2": {"path": "new.java", "content": "..."}, |
| 100 | + "options": {"threshold": 0.7, "detect_moves": true} |
| 101 | + }' |
49 | 102 | ``` |
50 | 103 |
|
51 | | -## Architecture |
| 104 | +## 🏗️ Architecture |
52 | 105 |
|
53 | | -The project is organized as a Rust workspace with the following crates: |
| 106 | +Smart Code Diff follows a modular architecture with clear separation of concerns: |
54 | 107 |
|
55 | | -- **parser**: Multi-language parser engine using tree-sitter |
56 | | -- **semantic-analysis**: Symbol resolution and type information extraction |
57 | | -- **diff-engine**: Core diff computation with tree edit distance algorithms |
58 | | -- **cli**: Command-line interface |
59 | | -- **web-ui**: Web server and REST API |
| 108 | +``` |
| 109 | +┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐ |
| 110 | +│     Web UI      │     │       CLI       │     │    REST API     │ |
| 111 | +│   (React/TS)    │     │     (Rust)      │     │     (Axum)      │ |
| 112 | +└─────────────────┘     └─────────────────┘     └─────────────────┘ |
| 113 | +         │                       │                       │ |
| 114 | +         └───────────────────────┼───────────────────────┘ |
| 115 | +                                 │ |
| 116 | +         ┌───────────────────────────────────────────────┐ |
| 117 | +         │                  Core Engine                  │ |
| 118 | +         └───────────────────────────────────────────────┘ |
| 119 | +                                 │ |
| 120 | +     ┌─────────────┬─────────────┼─────────────┬─────────────┐ |
| 121 | +     │             │             │             │             │ |
| 122 | +┌────▼─────┐  ┌────▼─────┐  ┌────▼─────┐  ┌────▼─────┐  ┌────▼─────┐ |
| 123 | +│  Parser  │  │ Semantic │  │   Diff   │  │ Function │  │  Change  │ |
| 124 | +│  Engine  │  │ Analysis │  │  Engine  │  │ Matcher  │  │Classifier│ |
| 125 | +└──────────┘  └──────────┘  └──────────┘  └──────────┘  └──────────┘ |
| 126 | +``` |
60 | 127 |
|
61 | | -## Development |
| 128 | +### Core Components |
62 | 129 |
|
63 | | -### Prerequisites |
| 130 | +- **Parser Engine**: Tree-sitter based multi-language parsing |
| 131 | +- **Semantic Analysis**: Symbol resolution and type extraction |
| 132 | +- **Diff Engine**: Zhang-Shasha tree edit distance algorithm |
| 133 | +- **Function Matcher**: Hungarian algorithm for optimal matching |
| 134 | +- **Change Classifier**: Intelligent change categorization |
| 135 | +- **Refactoring Detector**: Pattern recognition for common refactorings |
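
As a rough illustration of what the Function Matcher has to do, the sketch below pairs functions from the old and new file so that the total similarity is as high as possible. It is not the project's implementation: `best_matching` and `search` are hypothetical names, and it uses exhaustive search as a stand-in for the Hungarian algorithm, which reaches the same optimum in polynomial time. Pairs that score below the threshold are left unmatched.

```rust
/// Hypothetical helper: choose the pairing of old functions (rows) with new
/// functions (columns) that maximizes total similarity. Pairs scoring below
/// `threshold` are never matched. Assumes a rectangular matrix. Exhaustive
/// search, so only suitable for a handful of functions.
pub fn best_matching(sim: &[Vec<f64>], threshold: f64) -> Vec<Option<usize>> {
    let cols = sim.first().map_or(0, |row| row.len());
    let mut used = vec![false; cols];
    let mut current: Vec<Option<usize>> = vec![None; sim.len()];
    let mut best = (f64::NEG_INFINITY, current.clone());
    search(sim, threshold, 0, 0.0, &mut used, &mut current, &mut best);
    best.1
}

fn search(
    sim: &[Vec<f64>],
    threshold: f64,
    row: usize,
    score: f64,
    used: &mut [bool],
    current: &mut [Option<usize>],
    best: &mut (f64, Vec<Option<usize>>),
) {
    if row == sim.len() {
        // Every old function has been decided; keep the best total so far.
        if score > best.0 {
            *best = (score, current.to_vec());
        }
        return;
    }
    // Option 1: leave this old function unmatched (treated as removed).
    current[row] = None;
    search(sim, threshold, row + 1, score, used, current, best);
    // Option 2: pair it with any still-free new function above the threshold.
    for col in 0..used.len() {
        if !used[col] && sim[row][col] >= threshold {
            used[col] = true;
            current[row] = Some(col);
            search(sim, threshold, row + 1, score + sim[row][col], used, current, best);
            used[col] = false;
            current[row] = None;
        }
    }
}
```

With the rows `[0.9, 0.8]` and `[0.85, 0.1]` and a threshold of 0.5, a greedy matcher would take the 0.9 pair first and leave the second function unmatched. The optimal assignment instead pairs function 0 with 1 and function 1 with 0, which is why an assignment algorithm is a better fit here than picking the best pair one at a time.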
| 136 | + |
| 137 | +## 📊 Example Output |
64 | 138 |
|
65 | | -- Rust 1.70+ |
66 | | -- Node.js 18+ (for web UI development) |
| 139 | +### Text Output |
| 140 | +``` |
| 141 | +File Comparison: Calculator.java → Calculator_refactored.java |
| 142 | +Language: java |
| 143 | +Overall Similarity: 87.5% |
| 144 | +
|
| 145 | +Function Analysis: |
| 146 | +├── add (100% match) - unchanged |
| 147 | +├── multiply (100% match) - unchanged |
| 148 | +├── isEven → isNumberEven (75% match) - renamed |
| 149 | +└── subtract (new) - added |
| 150 | +
|
| 151 | +Changes Detected: |
| 152 | +├── Function renamed: isEven → isNumberEven |
| 153 | +├── Method extracted: checkEvenness |
| 154 | +└── Function added: subtract |
| 155 | +
|
| 156 | +Refactoring Patterns: |
| 157 | +└── Extract Method (92% confidence) |
| 158 | + └── Logic extracted from isNumberEven to checkEvenness |
| 159 | +``` |
| 160 | + |
| 161 | +### JSON Output |
| 162 | +```json |
| 163 | +{ |
| 164 | + "similarity": 0.875, |
| 165 | + "analysis": { |
| 166 | + "functions": { |
| 167 | + "total_functions": 4, |
| 168 | + "matched_functions": 3, |
| 169 | + "average_similarity": 0.92 |
| 170 | + }, |
| 171 | + "changes": { |
| 172 | + "total_changes": 3, |
| 173 | + "change_types": { |
| 174 | + "renamed": 1, |
| 175 | + "added": 1, |
| 176 | + "extracted": 1 |
| 177 | + } |
| 178 | + }, |
| 179 | + "refactoring_patterns": [ |
| 180 | + { |
| 181 | + "pattern": "extract_method", |
| 182 | + "confidence": 0.92, |
| 183 | + "description": "Logic extracted from isNumberEven to checkEvenness" |
| 184 | + } |
| 185 | + ] |
| 186 | + } |
| 187 | +} |
| 188 | +``` |
67 | 189 |
|
68 | | -### Building |
| 190 | +## 🔧 Configuration |
69 | 191 |
|
| 192 | +Smart Code Diff supports flexible configuration through multiple methods: |
| 193 | + |
| 194 | +### Global Configuration (`~/.smart-diff/config.toml`) |
| 195 | +```toml |
| 196 | +[parser] |
| 197 | +max_file_size = 10485760 # 10MB |
| 198 | +parse_timeout = 30 |
| 199 | + |
| 200 | +[diff_engine] |
| 201 | +default_similarity_threshold = 0.7 |
| 202 | +enable_refactoring_detection = true |
| 203 | + |
| 204 | +[output] |
| 205 | +default_format = "text" |
| 206 | +enable_colors = true |
| 207 | +``` |
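
The global settings above could map onto a typed structure along these lines. This is a sketch only: `Config` and `validate` are hypothetical names, not the project's actual types, and loading from TOML (which would need a parsing crate) is left out. The defaults repeat the values shown in the example file.

```rust
/// Hypothetical typed view of the `[parser]`, `[diff_engine]` and `[output]`
/// tables; field names follow the keys in the example file.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub max_file_size: u64, // bytes (10485760 = 10 MiB)
    pub parse_timeout: u64, // unit is not stated in the README
    pub default_similarity_threshold: f64,
    pub enable_refactoring_detection: bool,
    pub default_format: String,
    pub enable_colors: bool,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            max_file_size: 10 * 1024 * 1024,
            parse_timeout: 30,
            default_similarity_threshold: 0.7,
            enable_refactoring_detection: true,
            default_format: "text".to_string(),
            enable_colors: true,
        }
    }
}

impl Config {
    /// Assumed sanity checks: a similarity threshold only makes sense within
    /// 0..=1, and a zero size limit would reject every file.
    pub fn validate(&self) -> Result<(), String> {
        if !(0.0..=1.0).contains(&self.default_similarity_threshold) {
            return Err(format!(
                "default_similarity_threshold must be within 0..=1, got {}",
                self.default_similarity_threshold
            ));
        }
        if self.max_file_size == 0 {
            return Err("max_file_size must be greater than zero".to_string());
        }
        Ok(())
    }
}
```

Starting from `Config::default()` and overriding single fields with struct update syntax (`Config { enable_colors: false, ..Config::default() }`) keeps unspecified keys at the documented defaults.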
| 208 | + |
| 209 | +### Project Configuration (`.smart-diff.toml`) |
| 210 | +```toml |
| 211 | +[project] |
| 212 | +name = "My Project" |
| 213 | +exclude_patterns = ["**/test/**", "**/*.generated.*"] |
| 214 | + |
| 215 | +[analysis] |
| 216 | +complexity_threshold = 15 |
| 217 | +duplicate_threshold = 0.85 |
| 218 | +``` |
| 219 | + |
| 220 | +## 🧪 Use Cases |
| 221 | + |
| 222 | +### Code Review |
| 223 | +- Analyze pull request changes with structural understanding |
| 224 | +- Identify refactoring patterns vs. functional changes |
| 225 | +- Generate comprehensive change reports |
| 226 | + |
| 227 | +### Refactoring Analysis |
| 228 | +- Track large-scale refactoring impacts |
| 229 | +- Verify refactoring tool outputs |
| 230 | +- Measure code evolution over time |
| 231 | + |
| 232 | +### Migration Projects |
| 233 | +- Compare implementations across languages |
| 234 | +- Analyze architectural changes |
| 235 | +- Validate migration completeness |
| 236 | + |
| 237 | +### Quality Assessment |
| 238 | +- Detect code duplication |
| 239 | +- Measure complexity changes |
| 240 | +- Track technical debt evolution |
| 241 | + |
| 242 | +## 📚 Documentation |
| 243 | + |
| 244 | +- **[User Guide](docs/user-guide.md)**: Comprehensive usage documentation |
| 245 | +- **[API Documentation](docs/api/)**: REST API reference and integration guide |
| 246 | +- **[Developer Guide](docs/developer-guide.md)**: Architecture and contribution guidelines |
| 247 | +- **[Configuration Reference](docs/configuration.md)**: Detailed configuration options |
| 248 | +- **[Examples](examples/)**: Practical usage examples and tutorials |
| 249 | + |
| 250 | +## 🤝 Contributing |
| 251 | + |
| 252 | +We welcome contributions! Please see our [Contributing Guidelines](docs/developer-guide.md#contributing-guidelines) for details. |
| 253 | + |
| 254 | +### Development Setup |
70 | 255 | ```bash |
71 | | -# Build all crates |
| 256 | +# Clone and setup |
| 257 | +git clone https://github.com/smart-code-diff/smart-code-diff.git |
| 258 | +cd smart-code-diff |
| 259 | + |
| 260 | +# Install dependencies |
72 | 261 | cargo build |
| 262 | +(cd frontend && npm install) |
73 | 263 |
|
74 | 264 | # Run tests |
75 | 265 | cargo test |
| 266 | +(cd frontend && npm test) |
76 | 267 |
|
77 | | -# Run benchmarks |
78 | | -cargo bench |
| 268 | +# Start development server |
| 269 | +cargo run -p smart-diff-web & |
| 270 | +(cd frontend && npm run dev) |
79 | 271 | ``` |
80 | 272 |
|
81 | | -### Contributing |
| 273 | +### Code Style |
| 274 | +- Follow Rust formatting: `cargo fmt` |
| 275 | +- Use Clippy for linting: `cargo clippy` |
| 276 | +- Write comprehensive tests |
| 277 | +- Follow conventional commits |
| 278 | + |
| 279 | +## 📈 Performance |
| 280 | + |
| 281 | +Smart Code Diff is optimized for performance with: |
| 282 | + |
| 283 | +- **Parallel Processing**: Multi-threaded analysis for large codebases |
| 284 | +- **Intelligent Caching**: Multi-level caching for repeated operations |
| 285 | +- **Memory Efficiency**: Streaming processing for large files |
| 286 | +- **Algorithmic Optimizations**: Heuristic pruning and early termination |
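
One way to picture the parallel processing mentioned above is to split the list of file pairs into chunks and score each chunk on its own thread. The sketch below uses only the standard library (`std::thread::scope`). `compare_all` and `line_similarity` are hypothetical names, and the line-by-line score is a placeholder for the real structural comparison, so treat this as an illustration of the fan-out pattern and not as the tool's implementation.

```rust
use std::thread;

/// Placeholder score: fraction of line positions whose text is identical.
/// The real tool compares syntax trees; this only keeps the example small.
pub fn line_similarity(old: &str, new: &str) -> f64 {
    let old_lines: Vec<&str> = old.lines().collect();
    let new_lines: Vec<&str> = new.lines().collect();
    let longest = old_lines.len().max(new_lines.len());
    if longest == 0 {
        return 1.0; // two empty files are identical
    }
    let same = old_lines.iter().zip(&new_lines).filter(|(a, b)| a == b).count();
    same as f64 / longest as f64
}

/// Score every (old, new) pair, spreading the work over up to `workers`
/// threads. Results come back in the same order as `pairs`.
pub fn compare_all(pairs: &[(String, String)], workers: usize) -> Vec<f64> {
    let workers = workers.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk_size = ((pairs.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        // Scoped threads may borrow `pairs` without cloning the file contents.
        let handles: Vec<_> = pairs
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|(old, new)| line_similarity(old, new))
                        .collect::<Vec<f64>>()
                })
            })
            .collect();
        // Joining in spawn order preserves the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Chunking by index keeps the output aligned with the input, so a caller can zip the scores back onto file names without extra bookkeeping.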
| 287 | + |
| 288 | +### Benchmarks |
| 289 | +- **Large Files**: 10MB+ files processed in seconds |
| 290 | +- **Directory Comparison**: 1000+ files analyzed in parallel |
| 291 | +- **Memory Usage**: Efficient memory management with configurable limits |
82 | 292 |
|
83 | | -Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct and the process for submitting pull requests. |
| 293 | +## 🛡️ Security |
84 | 294 |
|
85 | | -## License |
| 295 | +- Input validation and sanitization |
| 296 | +- Resource limits to prevent DoS attacks |
| 297 | +- Sandboxed execution environments |
| 298 | +- No code execution, only analysis |
| 299 | + |
| 300 | +## 📄 License |
86 | 301 |
|
87 | 302 | This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. |
| 303 | + |
| 304 | +## 🙏 Acknowledgments |
| 305 | + |
| 306 | +- [Tree-sitter](https://tree-sitter.github.io/) for parsing infrastructure |
| 307 | +- [Zhang-Shasha Algorithm](https://doi.org/10.1137/0218082) for tree edit distance |
| 308 | +- [Hungarian Algorithm](https://en.wikipedia.org/wiki/Hungarian_algorithm) for optimal matching |
| 309 | +- The Rust and React communities for excellent tooling and libraries |
| 310 | + |
| 311 | +## 📞 Support |
| 312 | + |
| 313 | +- **Documentation**: Check the [docs/](docs/) directory |
| 314 | +- **Issues**: Report bugs on [GitHub Issues](https://github.com/smart-code-diff/smart-code-diff/issues) |
| 315 | +- **Discussions**: Join [GitHub Discussions](https://github.com/smart-code-diff/smart-code-diff/discussions) |
| 316 | +- **Email**: support@smartcodediff.com |
| 317 | + |
| 318 | +--- |
| 319 | + |
| 320 | +**Smart Code Diff** - Revolutionizing code comparison with structural and semantic analysis. 🚀 |