Change package URL output to Google Maven format#310

Merged
woocheol-lge merged 1 commit into main from m_test
Apr 27, 2026
Contributor

@woocheol-lge woocheol-lge commented Apr 24, 2026

Description

Change package URL output to Google Maven format

Summary by CodeRabbit

  • New Features
    • Smarter resolution of Maven artifact download locations using mapped URLs.
    • Added detection and HTTP verification for Google Android Maven artifacts.
    • Falls back to Maven Central paths when Google verification fails, improving accuracy of dependency download links.

@woocheol-lge woocheol-lge self-assigned this Apr 24, 2026
@woocheol-lge woocheol-lge added the chore [PR/Issue] Refactoring, maintenance the code label Apr 24, 2026

coderabbitai Bot commented Apr 24, 2026

📝 Walkthrough

Reworked download URL resolution: get_download_location now directly reads download_url_map and branches based on the mapped URL; added get_google_maven_url helper that probes Google Maven POM URLs via HEAD (with redirects) and returns a web-index URL on HTTP 200 or falls back to mvnrepository path.
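
As a rough illustration of the two URL shapes mentioned in the walkthrough, the sketch below builds the POM probe URL and the web-index URL from Maven coordinates. The helper names `google_pom_url` and `google_index_url` are hypothetical and do not appear in the PR; only the URL patterns are taken from the diff under review.

```python
def google_pom_url(group_id: str, artifact_id: str, version: str) -> str:
    """Build the POM location probed on Google's Maven host (pattern from the PR diff)."""
    # Maven repository layout turns the dotted group id into a directory path.
    group_path = group_id.replace(".", "/")
    return (
        "https://dl.google.com/dl/android/maven2"
        f"/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.pom"
    )


def google_index_url(group_id: str, artifact_id: str, version: str) -> str:
    """Build the human-facing web-index URL returned when the probe succeeds."""
    return f"https://maven.google.com/web/index.html#{group_id}:{artifact_id}:{version}"
```

The coordinates passed in (for example `androidx.core`, `core`, `1.9.0`) are purely illustrative; neither helper touches the network.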

Changes

| Cohort / File(s) | Summary |
|---|---|
| Maven URL Resolution: src/fosslight_dependency/_package_manager.py | get_download_location now directly looks up actual_url from download_url_map and routes: return mvnrepo path when mapped URL contains Maven Central host substrings, return the mapped URL when it's neither Maven nor Google-like, or call new get_google_maven_url for Google Maven probes. Added get_google_maven_url(mvnrepo_url, group_id, artifact_id, version) which builds a Google Maven POM URL, performs a HEAD request (follow redirects), returns maven.google.com/web/index.html#<group>:<artifact>:<version> on HTTP 200, otherwise falls back to the mvnrepo path. |
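
The routing described in the summary can be sketched as a pure classification step. This is a minimal illustration, not the PR's code: `classify_mapped_url` and the two tuple constants are hypothetical names, while the host substrings are copied from the diff. In the PR itself, both the "unmapped" and "google" cases end up calling get_google_maven_url.

```python
# Host substrings copied from the diff; the constant names are illustrative.
CENTRAL_HOSTS = ("repo1.maven.org", "repo.maven.apache.org")
GOOGLE_HOSTS = (
    "maven.google.com",
    "dl.google.com/android/maven2",
    "dl.google.com/dl/android/maven2",
)


def classify_mapped_url(actual_url):
    """Return 'unmapped', 'central', 'google', or 'other' for a mapped URL."""
    if not actual_url:
        return "unmapped"  # no entry in download_url_map
    if any(host in actual_url for host in CENTRAL_HOSTS):
        return "central"   # PR returns the mvnrepo path
    if any(host in actual_url for host in GOOGLE_HOSTS):
        return "google"    # PR probes Google Maven
    return "other"         # PR returns the mapped URL unchanged
```

Keeping the classification separate from the network probe makes the branch logic testable without HTTP access.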

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: modifying package URL output to use Google Maven format instead of the previous approach. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/fosslight_dependency/_package_manager.py (1)

580-590: Use a module-level Session, cache on artifact key, and narrow exception handling.

get_google_maven_url is called during resolution for each artifact, issuing a blocking HTTP HEAD (5s timeout, redirects followed) with no connection reuse. For projects with many Android/Maven dependencies, this causes repeated TLS handshakes to the same host (dl.google.com). A couple of straightforward improvements:

  • Reuse a requests.Session (module-level) to maintain persistent TCP connections across calls, yielding significant latency reduction for repeated requests to the same host.
  • Wrap the function with functools.lru_cache on (mvnrepo_url, group_id, artifact_id, version) to avoid re-probing identical keys across modules/configurations.
  • Narrow the blanket except Exception to requests.RequestException to avoid silently swallowing unrelated errors (Ruff BLE001).
♻️ Proposed change
+from functools import lru_cache
+
+_google_maven_session = requests.Session()
+
+@lru_cache(maxsize=4096)
 def get_google_maven_url(mvnrepo_url, group_id, artifact_id, version):
     group_path = group_id.replace('.', '/')
     pom_url = (f"https://dl.google.com/dl/android/maven2"
                f"/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.pom")
     try:
-        resp = requests.head(pom_url, timeout=5, allow_redirects=True)
+        resp = _google_maven_session.head(pom_url, timeout=5, allow_redirects=True)
         if resp.status_code == 200:
             return f"https://maven.google.com/web/index.html#{group_id}:{artifact_id}:{version}"
-    except Exception:
+    except requests.RequestException:
         logger.debug(f"Failed to check Google Maven URL: {pom_url}")
     return f"{mvnrepo_url}{group_id}/{artifact_id}/{version}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fosslight_dependency/_package_manager.py` around lines 580 - 590,
get_google_maven_url issues repeated plain requests.head calls and a broad
except; fix by introducing a module-level requests.Session (e.g. _HTTP_SESSION)
and use that session's head method inside get_google_maven_url, decorate
get_google_maven_url with functools.lru_cache(maxsize=None) so results are
cached on the (mvnrepo_url, group_id, artifact_id, version) tuple, and replace
the bare except Exception with except requests.RequestException to only catch
network-related errors; keep the same timeout and allow_redirects parameters
when calling session.head and return the same URLs on success/failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fosslight_dependency/_package_manager.py`:
- Around line 568-590: The helper get_google_maven_url can return a Maven
Central–style URL on HEAD failure which is incorrect for artifacts already
resolved from Google; update get_download_location and get_google_maven_url so
that when actual_url contains a Google host (checked in get_download_location),
we either (a) call get_google_maven_url with an explicit fallback equal to
actual_url and return that fallback on any non-200/exception, or (b) skip the
HEAD probe and directly return the Google web index URL (constructed from
group_id:artifact_id:version) — implement option (a) by adding an optional
fallback parameter to get_google_maven_url and using it whenever requests.head
fails, and ensure get_download_location passes actual_url as fallback when the
host is Google.
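
Option (a) above might look roughly like the following sketch. It is an assumption-laden illustration rather than the merged code: the function name `google_maven_url_with_fallback`, the `fallback` and `probe` parameters, and the `head_ok` helper are all hypothetical, and the probe uses the standard library's `urllib` in place of `requests` so the block stays self-contained.

```python
import urllib.request


def head_ok(url, timeout=5.0):
    """Return True if a HEAD request to url ends in HTTP 200 (redirects followed)."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False


def google_maven_url_with_fallback(mvnrepo_url, group_id, artifact_id, version,
                                   fallback=None, probe=head_ok):
    """Return the Google web-index URL if the POM probe succeeds, else a fallback.

    `fallback` is the hypothetical optional parameter from option (a); when it
    is None the function keeps the mvnrepo-style path used by the PR.
    """
    group_path = group_id.replace(".", "/")
    pom_url = ("https://dl.google.com/dl/android/maven2"
               f"/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.pom")
    if probe(pom_url):
        return f"https://maven.google.com/web/index.html#{group_id}:{artifact_id}:{version}"
    if fallback is not None:
        return fallback
    return f"{mvnrepo_url}{group_id}/{artifact_id}/{version}"
```

A caller that already knows the mapped URL is Google-hosted would pass `fallback=actual_url`, so a failed probe returns the original Google location instead of a Maven Central-style path.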


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2fb5168c-fde7-425f-9fcd-3cdbd45efc63

📥 Commits

Reviewing files that changed from the base of the PR and between 89e9574 and 189be30.

📒 Files selected for processing (1)
  • src/fosslight_dependency/_package_manager.py

Comment thread src/fosslight_dependency/_package_manager.py
Signed-off-by: woocheol <jayden6659@gmail.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/fosslight_dependency/_package_manager.py (1)

580-580: Trailing comma in function signature.

def get_google_maven_url(mvnrepo_url, group_id, artifact_id, version, ): is valid Python but the dangling , ) looks like an editing leftover. Drop it for clarity.

-def get_google_maven_url(mvnrepo_url, group_id, artifact_id, version, ):
+def get_google_maven_url(mvnrepo_url, group_id, artifact_id, version):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fosslight_dependency/_package_manager.py` at line 580, The function
signature for get_google_maven_url has an extraneous trailing comma before the
closing parenthesis; update the def get_google_maven_url(...) declaration to
remove the dangling comma so it reads a normal parameter list (i.e., def
get_google_maven_url(mvnrepo_url, group_id, artifact_id, version):), then run
linter/formatting to ensure no other signature style issues remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fosslight_dependency/_package_manager.py`:
- Around line 568-590: get_download_location/get_google_maven_url currently
issues a synchronous requests.head per artifact which can cause large cumulative
delays; add a simple in-memory memoization keyed by (group_id, artifact_id,
version) in the module and have get_google_maven_url check that cache before
issuing requests.head and store both positive (Google URL) and negative
(fallback) outcomes; additionally track a short-lived failure counter (or
timestamped negative cache) to short‑circuit repeated probes after several
failures and return the mvnrepo fallback immediately; update
get_google_maven_url to use the cache and failure threshold and keep logging via
logger when a probe is skipped.
- Line 14: The module contains an unconditional "import requests" which means
the project must declare requests as a runtime dependency; add "requests" to the
dependencies section of pyproject.toml (with an appropriate version spec, e.g.
"requests>=2.28,<3") and also add it to requirements.txt if your repo maintains
one so fresh installs won't raise ImportError; verify the package is listed
under the same environment (main/runtime) and run a local install to confirm the
import resolves.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b5c95f52-6446-40ad-a6ca-fb00dcacd110

📥 Commits

Reviewing files that changed from the base of the PR and between 189be30 and 9455945.

📒 Files selected for processing (1)
  • src/fosslight_dependency/_package_manager.py

import subprocess
import shutil
import stat
import requests
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Locate dependency manifests and check for a `requests` declaration.
fd -t f -H -E .git '^(setup\.py|setup\.cfg|pyproject\.toml|requirements.*\.txt|Pipfile|Pipfile\.lock)$' \
  | xargs -I{} sh -c 'echo "=== {} ==="; grep -nE "(^|[^a-zA-Z0-9_])requests([^a-zA-Z0-9_]|$)" "{}" || echo "(no match)"'

Repository: fosslight/fosslight_dependency_scanner

Length of output: 390


🏁 Script executed:

#!/bin/bash

# 1. Check if setup.py exists at root
echo "=== Checking for setup.py ==="
find . -maxdepth 1 -name "setup.py" -type f

# 2. Read the main dependency files
echo -e "\n=== pyproject.toml ==="
cat pyproject.toml 2>/dev/null || echo "(file not found)"

echo -e "\n=== setup.py ==="
cat setup.py 2>/dev/null || echo "(file not found)"

echo -e "\n=== requirements.txt ==="
cat requirements.txt 2>/dev/null || echo "(file not found)"

# 3. Confirm the import statement at line 14 of _package_manager.py
echo -e "\n=== _package_manager.py lines 1-20 ==="
head -20 src/fosslight_dependency/_package_manager.py

Repository: fosslight/fosslight_dependency_scanner

Length of output: 2198


Add requests to the project's runtime dependencies.

requests is imported unconditionally at line 14 of this module. It is not currently declared in pyproject.toml or requirements.txt, which will cause fresh installs to fail with an ImportError. Add requests to the dependencies list in pyproject.toml (and requirements.txt if maintained).
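
A check along the lines of the grep script above could also be written in Python. The sketch below is only an illustration: `declares_requirement` and `normalize` are hypothetical helpers, and the parser handles plain requirement lines (as in requirements.txt) rather than every form a dependency manifest can take.

```python
import re

# Leading distribution name of a requirement line, e.g. "requests" in "requests>=2.28,<3".
_NAME_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name):
    """Normalize a distribution name (lowercase, runs of -, _, . become '-')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def declares_requirement(lines, package):
    """Return True if any requirement line declares `package`."""
    wanted = normalize(package)
    for line in lines:
        stripped = line.split("#", 1)[0].strip()  # drop trailing comments
        if not stripped or stripped.startswith("-"):
            continue  # skip blanks and option lines such as "-r base.txt"
        match = _NAME_RE.match(stripped)
        if match and normalize(match.group(1)) == wanted:
            return True
    return False
```

Matching on the normalized name, rather than a substring, avoids counting packages such as `requests-toolbelt` as a declaration of `requests`.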


Comment on lines 568 to +590
 def get_download_location(download_url_map, group_id, artifact_id, version, mvnrepo_url):
     actual_key = f"{group_id}:{artifact_id}:{version}"
-    if download_url_map:
-        try:
-            actual_url = download_url_map.get(actual_key)
-
-            use_mvnrepo = True
-            if actual_url:
-                central_like = ("repo1.maven.org" in actual_url) or ("repo.maven.apache.org" in actual_url)
-                google_like = (("maven.google.com" in actual_url) or
-                               ("dl.google.com/android/maven2" in actual_url) or
-                               ("dl.google.com/dl/android/maven2" in actual_url))
-                if central_like or google_like:
-                    use_mvnrepo = True
-                else:
-                    use_mvnrepo = False
-        except Exception as e:
-            logger.debug(f"Failed to get download location from download_url_map: {e}")
-            use_mvnrepo = True
-    else:
-        use_mvnrepo = True
-    if use_mvnrepo:
-        return f"{mvnrepo_url}{group_id}/{artifact_id}/{version}"
-    else:
-        return actual_url
+    actual_url = download_url_map.get(actual_key) if download_url_map else None
+    if actual_url:
+        if any(host in actual_url for host in ("repo1.maven.org", "repo.maven.apache.org")):
+            return f"{mvnrepo_url}{group_id}/{artifact_id}/{version}"
+        if not any(host in actual_url for host in (
+                "maven.google.com", "dl.google.com/android/maven2", "dl.google.com/dl/android/maven2")):
+            return actual_url
+    return get_google_maven_url(mvnrepo_url, group_id, artifact_id, version)
+
+
+def get_google_maven_url(mvnrepo_url, group_id, artifact_id, version, ):
+    group_path = group_id.replace('.', '/')
+    pom_url = (f"https://dl.google.com/dl/android/maven2"
+               f"/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.pom")
+    try:
+        resp = requests.head(pom_url, timeout=5, allow_redirects=True)
+        if resp.status_code == 200:
+            return f"https://maven.google.com/web/index.html#{group_id}:{artifact_id}:{version}"
+    except Exception:
+        logger.debug(f"Failed to check Google Maven URL: {pom_url}")
+    return f"{mvnrepo_url}{group_id}/{artifact_id}/{version}"
⚠️ Potential issue | 🟠 Major

Per‑artifact HEAD probe can substantially slow large scans.

get_download_location is invoked once per oss_item by Android/Gradle/Maven callers. Whenever the mapped URL is None or Google‑hosted, get_google_maven_url issues a synchronous requests.head with a 5 s timeout. For projects with hundreds of dependencies, or when dl.google.com is unreachable (offline / corporate proxy), the cumulative wait time grows linearly and the same (group_id, artifact_id, version) may be probed repeatedly across duplicates. Consider memoizing the probe result and/or short‑circuiting when several consecutive probes have failed.

♻️ Suggested cache to avoid repeated network round‑trips
+_GOOGLE_MAVEN_CACHE: dict[tuple[str, str, str], str] = {}
+
 def get_google_maven_url(mvnrepo_url, group_id, artifact_id, version, ):
+    cache_key = (group_id, artifact_id, version)
+    if cache_key in _GOOGLE_MAVEN_CACHE:
+        return _GOOGLE_MAVEN_CACHE[cache_key]
     group_path = group_id.replace('.', '/')
     pom_url = (f"https://dl.google.com/dl/android/maven2"
                f"/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.pom")
+    result = f"{mvnrepo_url}{group_id}/{artifact_id}/{version}"
     try:
         resp = requests.head(pom_url, timeout=5, allow_redirects=True)
         if resp.status_code == 200:
-            return f"https://maven.google.com/web/index.html#{group_id}:{artifact_id}:{version}"
+            result = f"https://maven.google.com/web/index.html#{group_id}:{artifact_id}:{version}"
     except Exception:
         logger.debug(f"Failed to check Google Maven URL: {pom_url}")
-    return f"{mvnrepo_url}{group_id}/{artifact_id}/{version}"
+    _GOOGLE_MAVEN_CACHE[cache_key] = result
+    return result
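
The cache above covers repeated keys but not the "several consecutive probes have failed" case. One possible shape for that short-circuit is sketched below. `ProbeCache`, its `check` method, and the `max_consecutive_failures` threshold are hypothetical names chosen for illustration. The probe is injected as a callable that returns True/False for a definitive answer and raises `OSError` on a network failure; `requests.RequestException` derives from `IOError`/`OSError`, so a `requests`-based probe would fit this contract.

```python
class ProbeCache:
    """Memoize probe results and stop probing after repeated network failures."""

    def __init__(self, probe, max_consecutive_failures=3):
        self._probe = probe            # callable(url) -> bool, may raise OSError
        self._results = {}             # key -> bool, positive and negative outcomes
        self._failures = 0             # consecutive network failures seen so far
        self._max = max_consecutive_failures

    def check(self, key, url):
        """Return True if the artifact was confirmed, False otherwise."""
        if key in self._results:
            return self._results[key]
        if self._failures >= self._max:
            # Host looks unreachable: skip the probe and do not cache the guess.
            return False
        try:
            ok = bool(self._probe(url))
            self._failures = 0         # any completed probe resets the counter
        except OSError:
            ok = False
            self._failures += 1
        self._results[key] = ok
        return ok
```

With a threshold of 3 and a 5 s timeout, an offline scan would spend at most about 15 s waiting before every remaining artifact falls back immediately. The right threshold for real projects is a judgment call.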
🧰 Tools
🪛 Ruff (0.15.11)

[warning] 588-588: Do not catch blind exception: Exception (BLE001)


@woocheol-lge woocheol-lge merged commit 4edf330 into main Apr 27, 2026
14 checks passed
@woocheol-lge woocheol-lge deleted the m_test branch April 27, 2026 06:39