Skip to content

Conversation

@Ulthran
Copy link
Contributor

@Ulthran Ulthran commented Dec 15, 2025

Summary

  • display full data URIs and update archive links on run detail pages
  • show machine type alongside the kit on the runs listing
  • add an Archive dashboard with Chart.js powered usage stats and warnings backed by a new archive size API

Testing

  • python -m pytest (fails: ModuleNotFoundError: No module named 'flask_sqlalchemy')

Codex Task

Copilot AI review requested due to automatic review settings December 15, 2025 21:54
@codecov
Copy link

codecov bot commented Dec 15, 2025

Codecov Report

❌ Patch coverage is 0% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.37%. Comparing base (f85a7f9) to head (79a6bee).

Files with missing lines Patch % Lines
sample_registry/app.py 0.00% 59 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #60      +/-   ##
==========================================
- Coverage   64.30%   60.37%   -3.94%     
==========================================
  Files           8        8              
  Lines         706      752      +46     
==========================================
  Hits          454      454              
- Misses        252      298      +46     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds an Archive dashboard to visualize storage usage and makes improvements to run listing pages. The Archive page features a Chart.js-powered bar chart showing archive sizes grouped by month, backed by a new /api/archive_sizes endpoint that scans filesystem directories to calculate storage totals.

Key changes:

  • New /archive route and template displaying storage statistics with warnings about missing or empty archives
  • New /api/archive_sizes API endpoint that recursively scans archive directories to calculate sizes per month
  • Simplified platform display logic in run listings (removed conditional Illumina prefix handling)
  • Updated run detail page to show full data_uri paths instead of just filenames

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
sample_registry/templates/show_run.html Updated data file link to display full data_uri path and point to local filesystem path
sample_registry/templates/browse_runs.html Simplified platform display by removing conditional Illumina prefix logic
sample_registry/templates/base.html Added Archive navigation link to header menu
sample_registry/templates/archive.html New template with Chart.js visualization and collapsible warnings list for archive statistics
sample_registry/app.py Added archive functionality including helper functions for date parsing and directory size calculation, plus new API endpoint and archive route

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +302 to +308
def _parsed_month(date_str: str):
for fmt in ("%Y-%m-%d", "%Y/%m/%d", "%m/%d/%y", "%m/%d/%Y"):
try:
return datetime.strptime(date_str, fmt).strftime("%Y-%m")
except ValueError:
continue
return None
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper function lacks a docstring explaining its purpose, parameters, and return value. Since it handles multiple date formats and returns None on failure, documentation would help future maintainers understand the expected behavior and supported date formats.

Copilot uses AI. Check for mistakes.
Comment on lines +23 to +25
<button class="btn btn-link p-0" type="button" data-bs-toggle="collapse" data-bs-target="#warningsCollapse" aria-expanded="true" aria-controls="warningsCollapse">
Warnings
</button>
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The collapse button for warnings does not have proper ARIA attributes to indicate its toggle state. While it has aria-expanded="true", this value is static and won't update when the collapse state changes. Consider using Bootstrap's data attributes properly or adding JavaScript to update the aria-expanded attribute dynamically when the collapse state changes, ensuring screen reader users are aware of the current state.

Copilot uses AI. Check for mistakes.
db = SQLAlchemy(model_class=Base)
db.init_app(app)

ARCHIVE_ROOT = Path("/mnt/isilon/microbiome/")
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded constant ARCHIVE_ROOT = Path("/mnt/isilon/microbiome/") makes the application inflexible for different deployment environments. Consider making this configurable through an environment variable or configuration file (similar to how SQLALCHEMY_DATABASE_URI is handled) to support different deployment scenarios such as development, testing, and production environments.

Suggested change
ARCHIVE_ROOT = Path("/mnt/isilon/microbiome/")
# Make ARCHIVE_ROOT configurable via environment variable, defaulting to the original path
ARCHIVE_ROOT = Path(os.environ.get("ARCHIVE_ROOT", "/mnt/isilon/microbiome/"))

Copilot uses AI. Check for mistakes.
Comment on lines +302 to +308
def _parsed_month(date_str: str):
for fmt in ("%Y-%m-%d", "%Y/%m/%d", "%m/%d/%y", "%m/%d/%Y"):
try:
return datetime.strptime(date_str, fmt).strftime("%Y-%m")
except ValueError:
continue
return None
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name _parsed_month uses past tense which is inconsistent with Python naming conventions for functions that perform an action. Consider renaming to _parse_month to follow the convention of using imperative verbs for function names.

Copilot uses AI. Check for mistakes.
Comment on lines +311 to +334
def _archive_size_for_run(run, warnings):
archive_path = (ARCHIVE_ROOT / run.data_uri).parent
run_label = f"CMR{run.run_accession:06d}"

if not archive_path.exists():
warnings.append(f"{run_label}: Archive path {archive_path} does not exist")
return 0

if not archive_path.is_dir():
warnings.append(f"{run_label}: Archive path {archive_path} is not a directory")
return 0

total_size = 0
for entry in archive_path.rglob("*"):
try:
if entry.is_file():
total_size += entry.stat().st_size
except OSError as exc:
warnings.append(f"{run_label}: Error accessing {entry}: {exc}")

if total_size == 0:
warnings.append(f"{run_label}: Archive at {archive_path} has size 0 bytes")

return total_size
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function lacks a docstring explaining its purpose, parameters, and return value. Given that it performs complex filesystem operations, modifies the warnings list passed by reference, and has specific error handling behavior, documentation would greatly improve maintainability.

Copilot uses AI. Check for mistakes.
Comment on lines +337 to +360
@app.route("/api/archive_sizes")
def archive_sizes():
runs = db.session.query(Run).all()
warnings = []
totals_by_month = defaultdict(int)

for run in runs:
month_label = _parsed_month(run.run_date)
if not month_label:
warnings.append(
f"CMR{run.run_accession:06d}: Unable to parse run_date '{run.run_date}'"
)

archive_size = _archive_size_for_run(run, warnings)

if month_label:
totals_by_month[month_label] += archive_size

by_month = [
{"month": month, "size_bytes": totals_by_month[month]}
for month in sorted(totals_by_month.keys())
]

return jsonify({"by_month": by_month, "warnings": warnings})
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This endpoint performs potentially expensive filesystem operations (recursive directory traversal with rglob) for all runs in the database synchronously within a single HTTP request handler. This could cause severe performance issues and request timeouts if there are many runs or large archive directories. Consider implementing this as an asynchronous background job with cached results, or adding pagination/limiting to process only a subset of runs at a time.

Copilot uses AI. Check for mistakes.
Comment on lines +94 to +96
.catch((error) => {
renderWarnings([`Error loading archive data: ${error}`]);
});
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling converts the error object to a string and displays it to the user, but this could potentially expose sensitive information about the server's internal structure or filesystem. Consider using a generic error message for users while logging the detailed error server-side for debugging purposes.

Copilot uses AI. Check for mistakes.
<li><strong>Lane:</strong> {{ run.lane }}</li>
<li><strong>Platform:</strong> {{ run.machine_type }} {{ run.machine_kit }}</li>
<li><strong>Data file:</strong> <a href="respublica.research.chop.edu:/mnt/isilon/microbiome/{{ run.data_uri }}">{{ run.data_uri.split('/')|last }}</a></li>
<li><strong>Data file:</strong> <a href="/mnt/isilon/microbiome/{{ run.data_uri }}">{{ run.data_uri }}</a></li>
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The anchor tag uses an href with a local filesystem path (/mnt/isilon/microbiome/). This will not work as a clickable link in a web browser since browsers cannot navigate to local filesystem paths on the server. Consider either making this plain text (removing the anchor tag) or implementing a file download endpoint if users need to access these files through the web interface.

Copilot uses AI. Check for mistakes.
Comment on lines +302 to +365
def _parsed_month(date_str: str):
for fmt in ("%Y-%m-%d", "%Y/%m/%d", "%m/%d/%y", "%m/%d/%Y"):
try:
return datetime.strptime(date_str, fmt).strftime("%Y-%m")
except ValueError:
continue
return None


def _archive_size_for_run(run, warnings):
archive_path = (ARCHIVE_ROOT / run.data_uri).parent
run_label = f"CMR{run.run_accession:06d}"

if not archive_path.exists():
warnings.append(f"{run_label}: Archive path {archive_path} does not exist")
return 0

if not archive_path.is_dir():
warnings.append(f"{run_label}: Archive path {archive_path} is not a directory")
return 0

total_size = 0
for entry in archive_path.rglob("*"):
try:
if entry.is_file():
total_size += entry.stat().st_size
except OSError as exc:
warnings.append(f"{run_label}: Error accessing {entry}: {exc}")

if total_size == 0:
warnings.append(f"{run_label}: Archive at {archive_path} has size 0 bytes")

return total_size


@app.route("/api/archive_sizes")
def archive_sizes():
runs = db.session.query(Run).all()
warnings = []
totals_by_month = defaultdict(int)

for run in runs:
month_label = _parsed_month(run.run_date)
if not month_label:
warnings.append(
f"CMR{run.run_accession:06d}: Unable to parse run_date '{run.run_date}'"
)

archive_size = _archive_size_for_run(run, warnings)

if month_label:
totals_by_month[month_label] += archive_size

by_month = [
{"month": month, "size_bytes": totals_by_month[month]}
for month in sorted(totals_by_month.keys())
]

return jsonify({"by_month": by_month, "warnings": warnings})


@app.route("/archive")
def archive():
return render_template("archive.html")
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new archive-related functionality lacks test coverage. There are no tests for the archive_sizes API endpoint, _parsed_month helper function, or _archive_size_for_run function. Given that the repository has comprehensive test coverage for other modules (as seen in tests/test_registrar.py, tests/test_register.py), this new functionality should have corresponding tests to verify date parsing, filesystem operations, error handling, and API response structure.

Copilot uses AI. Check for mistakes.
Comment on lines +337 to +360
@app.route("/api/archive_sizes")
def archive_sizes():
runs = db.session.query(Run).all()
warnings = []
totals_by_month = defaultdict(int)

for run in runs:
month_label = _parsed_month(run.run_date)
if not month_label:
warnings.append(
f"CMR{run.run_accession:06d}: Unable to parse run_date '{run.run_date}'"
)

archive_size = _archive_size_for_run(run, warnings)

if month_label:
totals_by_month[month_label] += archive_size

by_month = [
{"month": month, "size_bytes": totals_by_month[month]}
for month in sorted(totals_by_month.keys())
]

return jsonify({"by_month": by_month, "warnings": warnings})
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API endpoint /api/archive_sizes does not handle errors gracefully. If the database query fails or if there are critical filesystem errors, the endpoint will return a 500 error without a proper JSON response. Consider adding try-except blocks around the main logic and returning appropriate error responses with meaningful error messages in JSON format (e.g., {"error": "message"}) and appropriate HTTP status codes.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants