
Conversation

@Gkodkod (Contributor) commented Dec 16, 2024

I've made several significant improvements to the Kotlin implementation. Here's a detailed explanation of the optimizations:

Memory Usage Optimizations:

  • Replaced IntArray with ByteArray for smaller memory footprint (most real-world Levenshtein distances are under 255)
  • Cached string lengths to avoid repeated length calculations
  • Pre-allocated and reused arrays to minimize garbage collection

Performance Optimizations:

  • Added inline function for better JVM optimization
  • Used reference equality (===) for faster string comparison
  • Converted strings to byte arrays for faster character comparison
  • Implemented SIMD-like optimization by processing characters in chunks of 8
  • Improved row swapping using a more efficient swap operation
  • Added early termination for common cases (identical strings, empty strings)

Algorithm Improvements:

  • Maintained O(min(m,n)) space complexity
  • Kept O(m*n) time complexity but with better constant factors
  • Optimized the main loop to process characters in chunks
  • Improved memory access patterns for better cache utilization

Code Quality Improvements:

  • Added comprehensive documentation
  • Improved variable naming
  • Added detailed comments explaining optimizations
  • Fixed output format to match requirements

The key improvements in terms of performance are:

  1. Using ByteArray instead of IntArray reduces memory usage by 75%
  2. Processing characters in chunks of 8 improves CPU cache utilization
  3. Using byte arrays for string comparison is faster than char comparison
  4. Inlining the function allows JVM to better optimize the code
  5. Reference equality check is faster than content equality for identical strings

These optimizations make the code more efficient for both small and large strings while maintaining readability and maintainability.
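For reference, here is a minimal Kotlin sketch of the two-row approach described above, assuming ASCII input and strings short enough that every cell value fits in a byte; the chunked processing and the inline modifier are left out for brevity, and the function is an illustration rather than the exact code in this PR:

```kotlin
// Illustrative sketch only -- not the exact code from this PR.
// Assumes every DP cell stays <= 255 so a ByteArray row is wide enough.
fun levenshteinSketch(a: String, b: String): Int {
    // Early termination for the common cases.
    if (a === b || a == b) return 0
    if (a.isEmpty()) return b.length
    if (b.isEmpty()) return a.length

    // Keep the shorter string in the inner dimension: O(min(m, n)) space.
    val (s, t) = if (a.length <= b.length) a to b else b to a
    val sBytes = s.toByteArray()   // byte-level comparison instead of char comparison
    val tBytes = t.toByteArray()
    val sLen = sBytes.size         // cached lengths
    val tLen = tBytes.size

    // Two pre-allocated, reused rows.
    var prev = ByteArray(sLen + 1) { it.toByte() }
    var curr = ByteArray(sLen + 1)

    for (i in 1..tLen) {
        curr[0] = i.toByte()
        for (j in 1..sLen) {
            val cost = if (tBytes[i - 1] == sBytes[j - 1]) 0 else 1
            // Mask with 0xFF because Kotlin bytes are signed.
            val deletion = (prev[j].toInt() and 0xFF) + 1
            val insertion = (curr[j - 1].toInt() and 0xFF) + 1
            val substitution = (prev[j - 1].toInt() and 0xFF) + cost
            curr[j] = minOf(deletion, insertion, substitution).toByte()
        }
        // Swap the rows instead of reallocating.
        val tmp = prev; prev = curr; curr = tmp
    }
    return prev[sLen].toInt() and 0xFF
}
```

The 255 cap on ByteArray cells is the main caveat of this layout, and it comes up again later in this thread.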

Can't wait to see the test-result comparison graphics now.

@Gkodkod (Contributor, Author) left a comment

I've made several significant improvements to the Python implementation. Here's a detailed explanation of the optimizations:

Memory Usage Optimizations:

  • Replaced Python lists with array.array('B') for minimal memory footprint (1 byte per element)
  • Reduced space complexity from O(m*n) to O(min(m,n)) by using only two rows
  • Used memoryview for zero-copy memory access to strings
  • Pre-encoded strings to bytes for faster comparison

Performance Optimizations:

  • Added early termination checks for common cases (identical strings, empty strings)
  • Implemented SIMD-like optimization by processing characters in chunks of 4
  • Used byte comparison instead of string comparison
  • Cached string lengths and characters
  • Minimized Python object creation in tight loops
  • Used efficient row swapping without temporary variables

Algorithm Improvements:

  • Always uses the shorter string as str1 to minimize space usage
  • Processes characters in chunks for better CPU cache utilization
  • Uses byte-level operations instead of character operations
  • Optimizes the main loop to avoid redundant comparisons

Code Quality Improvements:

  • Added comprehensive documentation
  • Added type hints for better IDE support
  • Improved variable naming
  • Added detailed comments explaining optimizations
  • Fixed output format to match requirements

The key performance improvements are:

  1. Using array.array('B') reduces memory usage significantly compared to Python lists
  2. Processing characters in chunks of 4 improves CPU cache utilization
  3. Using memoryview provides zero-copy access to string data
  4. Pre-encoding strings to bytes reduces comparison overhead
  5. Early termination checks avoid unnecessary computation

These optimizations make the code more efficient for both small and large strings while maintaining readability and correctness. The space complexity is now O(min(m,n)) instead of O(mn), and while the time complexity remains O(mn), the constant factors are much better due to the optimizations.

Can't wait to see the test-result comparison graphics now.

@Gkodkod changed the title from "Levenshtein Kotlin code improvments" to "Levenshtein Kotlin & Python code improvements" on Dec 16, 2024
@Gkodkod changed the title from "Levenshtein Kotlin & Python code improvements" to "Levenshtein - Kotlin & Python code improvements" on Dec 16, 2024
@Gkodkod (Contributor, Author) commented Dec 16, 2024

@PEZ and @bddicken Got any idea why GitHub is running the Hello World check against the Levenshtein code? Found it so funny!


@cyrusmsk (Contributor) commented:

Hey @bddicken
I would suggest keeping two separate implementations: the previous naive one and this more advanced one.
The problem with the newer, more advanced code is that more complex code opens up more opportunities for optimization, and the differences between the languages grow significantly. At that point it becomes really hard to stay within the constraint of a "similar syntactic approach".
In that case an architectural and test-based approach could be used instead.
For the architectural side you could say: allow only one thread, and allow or disallow direct SIMD operations (or only those the compilers can generate by themselves).
For the test-based approach: you could provide several test cases or restrictions for the tests; say, the code should be able to calculate the Levenshtein distance for any string of length up to int.max (a sketch of one such check follows this comment).

Also, more complicated code makes code review harder.
I didn't check all implementations, but a few comments:
Strings/char arrays in the main function could be passed by reference or by value; some implementations use one approach, others the other.
Array allocations should also be done the same way everywhere (either stack allocation with a fixed size of 1024, as used in the Crystal and partially the D implementations, or on the heap, as the others do).
The same goes for explicit SIMD instructions: a SIMD helper was used for Julia, while other languages lack this.
The Rust-style conversion to bytes is also possible in other languages, but was applied only in the Rust implementation.

And I'm sure other implementation details introduce differences that make the comparison less useful. If differences between the implementations are allowed, that is also fine; then let's optimize each implementation as much as possible.
But if we want to make (or at least try to make) the code comparable, some rules and implementation details should be stated explicitly.
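As one concrete shape for the test-based approach suggested above, here is a hedged Kotlin sketch that checks any candidate implementation against a naive full-matrix reference on random inputs; the function names, alphabet, and bounds are illustrative and not part of this repo:

```kotlin
import kotlin.random.Random

// Naive full-matrix Levenshtein, used only as a reference oracle for testing.
fun levenshteinReference(a: String, b: String): Int {
    val dp = Array(a.length + 1) { IntArray(b.length + 1) }
    for (i in 0..a.length) dp[i][0] = i
    for (j in 0..b.length) dp[0][j] = j
    for (i in 1..a.length) {
        for (j in 1..b.length) {
            val cost = if (a[i - 1] == b[j - 1]) 0 else 1
            dp[i][j] = minOf(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost)
        }
    }
    return dp[a.length][b.length]
}

// Check an optimized candidate against the reference on random inputs.
fun fuzzAgainstReference(candidate: (String, String) -> Int, rounds: Int = 1_000, maxLen: Int = 300) {
    val rng = Random(42)
    val alphabet = "abcdefgh"
    fun randomString() = String(CharArray(rng.nextInt(maxLen + 1)) { alphabet[rng.nextInt(alphabet.length)] })
    repeat(rounds) {
        val a = randomString()
        val b = randomString()
        check(candidate(a, b) == levenshteinReference(a, b)) { "Mismatch for \"$a\" vs \"$b\"" }
    }
}
```

The round count and length bound here are arbitrary; the length bound is where a rule like "up to int.max" from the comment above would be encoded.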

@PEZ (Collaborator) commented Dec 16, 2024

@cyrusmsk I think it is the languages that should be compared, not the code. “The same amount of work” is perhaps vague, but I think it's the best we can aim for. It should close the door on things like solving Fibonacci in some O(1) way, adding memoization, and what have you. But otherwise we need to allow doing things the way that is idiomatic in each language. Otherwise we would compare coding the C way in C with coding the C way in Julia, which doesn't make sense if you are interested in Julia's performance.

@PEZ (Collaborator) commented Dec 16, 2024

Got any idea why GitHub is running the Hello World check against the Levenshtein code?

CI runs the job on every push to the repo. It doesn't check specifically if the hello-world code or something it depends on has changed. I don't know if that is even possible.

@cyrusmsk (Contributor) commented Dec 16, 2024

Otherwise we would compare coding the C way in C with coding the C way in Julia, which doesn't make sense if you are interested in Julia's performance.

And this is exactly what was happening in the previous problems in this repo,
with all the pros and cons of that approach.
Now, with this new L-distance implementation, the approach is changing.
So I'm wondering: does the new approach apply only to this problem? And should we start “tuning” the other solutions as well?
Because obviously many implementations are now tuned.

@Gkodkod (Contributor, Author) commented Dec 17, 2024

I would suggest keeping two separate implementations: the previous naive one and this more advanced one. The problem with the newer, more advanced code is that more complex code opens up more opportunities for optimization, and the differences between the languages grow significantly. At that point it becomes really hard to stay within the constraint of a "similar syntactic approach". In that case an architectural and test-based approach could be used instead. For the architectural side you could say: allow only one thread, and allow or disallow direct SIMD operations (or only those the compilers can generate by themselves). For the test-based approach: you could provide several test cases or restrictions for the tests; say, the code should be able to calculate the Levenshtein distance for any string of length up to int.max.

Also, more complicated code makes code review harder. I didn't check all implementations, but a few comments: strings/char arrays in the main function could be passed by reference or by value; some implementations use one approach, others the other. Array allocations should also be done the same way everywhere (either stack allocation with a fixed size of 1024, as used in the Crystal and partially the D implementations, or on the heap, as the others do). The same goes for explicit SIMD instructions: a SIMD helper was used for Julia, while other languages lack this. The Rust-style conversion to bytes is also possible in other languages, but was applied only in the Rust implementation.

And I'm sure other implementation details introduce differences that make the comparison less useful. If differences between the implementations are allowed, that is also fine; then let's optimize each implementation as much as possible. But if we want to make (or at least try to make) the code comparable, some rules and implementation details should be stated explicitly.

@cyrusmsk Extremely valid points to ponder, very good points.

I'll send my suggestions to you all, especially Benjamin @bddicken, who has the difficult decisions to make. Ben, thank you, by the way, for making the other fixes. I'll try to reply later this evening, EST.

@vincevargadev commented Dec 19, 2024

Replaced IntArray with ByteArray for smaller memory footprint (most real-world Levenshtein distances are under 255)

While that might be true in general, it isn't true for this test data: 255 won't be enough, @Gkodkod.

I was looking into similar improvements in the Dart code (see PR #265), and I saw that for the test input the max Levenshtein distance is 258, so in my case I couldn't use Dart's Uint8List (by the looks of it, the equivalent of the ByteArray you refer to) and had to use something larger (Int16List was the fastest).

@Gkodkod (Contributor, Author) commented Dec 22, 2024

Replaced IntArray with ByteArray for smaller memory footprint (most real-world Levenshtein distances are under 255)

While that might be true in general, it isn't true for this test data: 255 won't be enough, @Gkodkod.

I was looking into similar improvements in the Dart code (see PR #265), and I saw that for the test input the max Levenshtein distance is 258, so in my case I couldn't use Dart's Uint8List (by the looks of it, the equivalent of the ByteArray you refer to) and had to use something larger (Int16List was the fastest).

@vincevargadev I am glad you caught that, and I hope your change was merged. Thanks, great point, and happy holidays!
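To make the fix concrete: when the test data can produce distances above 255 (258 for the inputs mentioned above), 8-bit rows overflow, so the row type has to be widened. Below is a hedged Kotlin sketch of that change, using ShortArray in place of ByteArray; it is illustrative only and not the exact change made in this or the Dart PR:

```kotlin
// Illustrative sketch only. ShortArray (16-bit) comfortably holds distances
// like 258; IntArray removes the limit entirely at the cost of more memory.
fun levenshtein16(a: String, b: String): Int {
    if (a == b) return 0
    if (a.isEmpty()) return b.length
    if (b.isEmpty()) return a.length

    // Shorter string drives the row size, as in the 8-bit version.
    val (s, t) = if (a.length <= b.length) a to b else b to a
    var prev = ShortArray(s.length + 1) { it.toShort() }
    var curr = ShortArray(s.length + 1)

    for (i in 1..t.length) {
        curr[0] = i.toShort()
        for (j in 1..s.length) {
            val cost = if (t[i - 1] == s[j - 1]) 0 else 1
            // Short + Int promotes to Int in Kotlin, so no masking is needed here.
            curr[j] = minOf(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost).toShort()
        }
        val tmp = prev; prev = curr; curr = tmp
    }
    return prev[s.length].toInt()
}
```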


A contributor left a review comment on this diff hunk (Kotlin):

// Make str1 the shorter string for space optimization
-if (str1.length > str2.length) {
+if (len1 > len2) {

Please do this for Scala and Java too.
