Skip to content

Conversation

@justDemo-hjw
Copy link

Fixes #2281

Summary

Currently, all packages are stored in ~/.cache/nm/ without registry dimension isolation. This causes package confusion and integrity risks when switching between different registries (e.g., registry.npmjs.org and registry.npmmirror.com).

This PR adds registry dimension isolation to the cache directory structure by including the registry host in the cache path. The cache structure changes from ~/.cache/nm/package-name/version/ to ~/.cache/nm/registry-host/package-name/version/.

Changes:

  • Added registry_to_dir_name() function to convert registry URL to directory name
  • Modified get_cache_dir() to return path with registry isolation
  • Updated clean command to use get_cache_dir() instead of hardcoded path

All existing code using get_cache_dir() automatically adapts to the new structure without modification, ensuring full backward compatibility.

Test Plan

  1. Unit tests: Added tests for registry_to_dir_name() with various registry URL formats and verified get_cache_dir() includes registry dimension.

  2. Manual testing:

    • Install packages with default registry (registry.npmmirror.com), verify cache path is ~/.cache/nm/registry.npmmirror.com/package-name/version/
    • Switch to registry.npmjs.org and install packages, verify cache path is ~/.cache/nm/registry.npmjs.org/package-name/version/
    • Run utoo clean, verify only current registry's cache is cleaned
    • Verify packages from different registries are stored in separate directories

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @justDemo-hjw, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the cache management system to introduce registry-level isolation for stored packages. By embedding the registry host into the cache directory path, it resolves potential conflicts and integrity issues that arise when using multiple package registries. This enhancement improves the reliability and predictability of the caching mechanism, ensuring that packages from different sources are kept distinct.

Highlights

  • Cache Directory Isolation: Implemented registry isolation in the cache directory structure to prevent package confusion and integrity risks when switching between different registries.
  • New Cache Path Structure: The cache path now includes the registry host, changing from "/.cache/nm/package-name/version/" to "/.cache/nm/registry-host/package-name/version/".
  • registry_to_dir_name() Function: Introduced a new utility function "registry_to_dir_name()" to convert registry URLs into a clean directory name format by stripping protocols and trailing slashes.
  • get_cache_dir() Update: Modified "get_cache_dir()" to dynamically construct the cache path, incorporating the current registry's host for isolation.
  • clean Command Adaptation: The "clean" command now utilizes the updated "get_cache_dir()", ensuring that it only cleans the cache relevant to the currently configured registry.
  • Backward Compatibility: The changes are designed to be fully backward compatible, with existing code automatically adapting to the new cache structure.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces registry isolation in the cache directory, which is a valuable enhancement for users working with multiple package registries. The implementation is well-structured, and the changes are logical. I've identified a potential cross-platform compatibility issue in how registry URLs are converted to directory names, particularly when a port is present. Additionally, I've suggested an improvement to a test case to make it more robust in verifying the new functionality. Overall, these are great changes that improve the tool's flexibility.

.strip_prefix("https://")
.or_else(|| registry_url.strip_prefix("http://"))
.unwrap_or(registry_url);
url.trim_end_matches('/').to_string()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of registry_to_dir_name doesn't sanitize characters that are invalid in directory names on certain operating systems. For example, if a registry URL contains a port (e.g., http://localhost:4873), the resulting directory name localhost:4873 would be invalid on Windows due to the colon :. To ensure cross-platform compatibility, it's best to replace such characters.

You could also strengthen test_registry_to_dir_name by adding a test case for a URL with a port, assuming the fix is applied:

assert_eq!(
    registry_to_dir_name("http://localhost:4873"),
    "localhost_4873"
);
Suggested change
url.trim_end_matches('/').to_string()
url.trim_end_matches('/').replace(':', "_").to_string()

Comment on lines +313 to +318
let cache_dir = get_cache_dir();
let cache_dir_str = cache_dir.to_string_lossy();

// Should contain registry directory name
// The exact structure depends on current registry setting
assert!(cache_dir_str.contains("nm") || cache_dir_str.contains("cache"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This test's assertion is a bit too lenient. assert!(cache_dir_str.contains("nm") || cache_dir_str.contains("cache")) would likely pass even with the old implementation, as the path already contained .cache/nm. A more robust test would be to verify that the generated cache directory path actually ends with the directory-sanitized registry name. This would more accurately confirm that registry isolation is working as intended.

Suggested change
let cache_dir = get_cache_dir();
let cache_dir_str = cache_dir.to_string_lossy();
// Should contain registry directory name
// The exact structure depends on current registry setting
assert!(cache_dir_str.contains("nm") || cache_dir_str.contains("cache"));
let cache_dir = get_cache_dir();
let registry = super::config::get_registry();
let registry_dir_name = registry_to_dir_name(&registry);
assert!(cache_dir.ends_with(registry_dir_name));

@justDemo-hjw
Copy link
Author

#2281

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant