Add literature freshness review assistant #394
Pull request overview
Adds a new “literature freshness review assistant” slice to the SCIBASE AI-Powered Research Assistant Suite, intended to screen reviewer-facing manuscript claims for staleness and evidence drift prior to releasing reviewer packets.
Changes:
- Introduces a new literature-freshness-review-assistant/ module with a freshness evaluator, deterministic audit digests, and Markdown/SVG renderers.
- Adds dependency-free tests plus deterministic demo artifact generation (JSON/MD/SVG) and an optional ffmpeg-based MP4 demo renderer.
- Updates the repo README to link to the new slice.
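
The overview mentions deterministic audit digests but does not show how they are computed. A minimal sketch of one common approach, assuming SHA-256 over key-sorted JSON, is below. The names `canonicalize` and `auditDigest` are hypothetical and are not taken from the PR; the actual digest scheme in index.js may differ.

```javascript
// Sketch only: the helper names and the choice of SHA-256 are assumptions,
// not the PR's confirmed implementation.
const { createHash } = require("node:crypto");

// Recursively rebuild objects with sorted keys so that logically equal
// reports serialize to the same string regardless of key insertion order.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value && typeof value === "object") {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value;
}

// Hash the canonical JSON form; the result is a 64-character hex string.
function auditDigest(report) {
  const canonicalJson = JSON.stringify(canonicalize(report));
  return createHash("sha256").update(canonicalJson).digest("hex");
}
```

With this shape, two reports that differ only in key order produce the same digest, which is what makes a committed demo artifact reproducible across runs.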
Reviewed changes
Copilot reviewed 10 out of 12 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| README.md | Adds a “Bounty Slices” section linking to the new assistant. |
| literature-freshness-review-assistant/README.md | Documents the slice scope, files, and validation commands. |
| literature-freshness-review-assistant/package.json | Adds per-slice scripts (check, test, demo, demo:video). |
| literature-freshness-review-assistant/index.js | Implements freshness evaluation logic, digests, and Markdown/SVG rendering. |
| literature-freshness-review-assistant/sample-data.js | Adds synthetic policy, evidence ledger, and manuscript packets fixtures. |
| literature-freshness-review-assistant/test.js | Adds assertion-based tests for evaluator + renderers + digest stability. |
| literature-freshness-review-assistant/demo.js | Generates deterministic demo JSON/MD/SVG artifacts. |
| literature-freshness-review-assistant/scripts/render-demo-video.js | Optional ffmpeg script to render an MP4 demo. |
| literature-freshness-review-assistant/reports/demo.json | Committed example JSON output. |
| literature-freshness-review-assistant/reports/demo.md | Committed example Markdown output. |
| literature-freshness-review-assistant/reports/demo.svg | Committed example SVG output. |
    if (signal.currentVersion && claim.datasetVersion && claim.datasetVersion !== signal.currentVersion) {
      addFinding(findings, {
        severity: "major",
        code: "DATASET_VERSION_DRIFT",
        claimId: claim.id,
        topic: claim.topic,
        detail: `Claim uses dataset ${claim.datasetVersion}; current ledger version is ${signal.currentVersion}.`,
        requiredAction: signal.requirement,
        evidenceSignalId: signal.id
      });
    }

    if (signal.currentVersion && claim.benchmarkVersion && claim.benchmarkVersion !== signal.currentVersion) {
      addFinding(findings, {
        severity: "major",
        code: "BENCHMARK_VERSION_DRIFT",
        claimId: claim.id,
        topic: claim.topic,
        detail: `Claim uses benchmark ${claim.benchmarkVersion}; current ledger version is ${signal.currentVersion}.`,
        requiredAction: signal.requirement,
        evidenceSignalId: signal.id
      });
    }
Addressed in 1cddc7d. Dataset drift now only fires for dataset_release signals and benchmark drift only fires for benchmark_update signals, and both are gated on current/latest/state-of-the-art wording or an explicit currentEvidenceClaim flag. I added a regression test for a historical versioned claim that should not receive a drift finding.
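
The gating described in this reply could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from 1cddc7d: `claim.text` and `signal.type` are assumed field names, and `assertsCurrency` and `versionDriftCode` are hypothetical helpers. The finding codes, the version fields, and the `currentEvidenceClaim` flag come from the thread.

```javascript
// Sketch only. `claim.text` and `signal.type` are assumed field names;
// `currentEvidenceClaim`, `datasetVersion`, `benchmarkVersion`, and
// `currentVersion` appear in the PR discussion.
const CURRENT_WORDING = /\b(current|latest|state[- ]of[- ]the[- ]art)\b/i;

// A claim asserts currency if it is flagged explicitly or uses "current" wording.
function assertsCurrency(claim) {
  return claim.currentEvidenceClaim === true || CURRENT_WORDING.test(claim.text || "");
}

// Returns the drift finding code for this claim/signal pair, or null.
function versionDriftCode(claim, signal) {
  if (!signal.currentVersion || !assertsCurrency(claim)) return null;
  if (
    signal.type === "dataset_release" &&
    claim.datasetVersion &&
    claim.datasetVersion !== signal.currentVersion
  ) {
    return "DATASET_VERSION_DRIFT";
  }
  if (
    signal.type === "benchmark_update" &&
    claim.benchmarkVersion &&
    claim.benchmarkVersion !== signal.currentVersion
  ) {
    return "BENCHMARK_VERSION_DRIFT";
  }
  return null;
}
```

A historical claim such as "We evaluated on release v1 in 2019." has no currency wording and no flag, so it yields `null` even when the ledger lists a newer dataset version.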
    function evaluateClaimFreshness(claim, ledger, policy) {
      const findings = [];
      const signals = evidenceForTopic(ledger, claim.topic);
      const latestCitation = latestDate(claim.citationDates || []);
      const reviewDate = policy.reviewDate;
Addressed in 1cddc7d. The evaluator now filters topic signals through policy.requiredEvidenceTypes, so unsupported signal types are ignored for policy-driven freshness holds. I added coverage for an unsupported blog_post signal.
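
A minimal sketch of that filter follows, assuming the ledger is a flat array of signals that each carry `topic` and `type` fields. The real `evidenceForTopic` is not shown in this thread, so `signalsForPolicy` is a hypothetical stand-in.

```javascript
// Sketch only: assumes `ledger` is an array of { id, topic, type, ... }.
// `policy.requiredEvidenceTypes` is named in the PR reply; the rest is illustrative.
function signalsForPolicy(ledger, topic, policy) {
  const allowedTypes = new Set(policy.requiredEvidenceTypes || []);
  // Keep only signals on this topic whose type the policy recognizes.
  return ledger.filter(
    (signal) => signal.topic === topic && allowedTypes.has(signal.type)
  );
}
```

In this sketch a policy with no `requiredEvidenceTypes` matches nothing, so an unsupported type such as `blog_post` can never trigger a hold by default. Whether the PR makes the same choice for a missing list is not stated.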
    lines.push("| Severity | Code | Claim | Detail | Required action |");
    lines.push("| --- | --- | --- | --- | --- |");
    for (const finding of manuscript.findings) {
      lines.push(
        `| ${finding.severity} | ${finding.code} | ${finding.claimId} | ${finding.detail} | ${finding.requiredAction} |`
      );
Addressed in 1cddc7d. Markdown table cells now escape pipes and normalize newlines to <br>, with a regression test covering both cases.
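
For reference, the escaping described here can be sketched as a single helper. `escapeTableCell` is a hypothetical name; the function in index.js may be named and structured differently.

```javascript
// Sketch only: hypothetical helper, not the PR's actual function.
function escapeTableCell(value) {
  return String(value ?? "")
    // Newlines would end the table row, so render them as inline breaks.
    .replace(/\r\n|\r|\n/g, "<br>")
    // A bare pipe would start a new column, so backslash-escape it.
    .replace(/\|/g, "\\|");
}
```

Applied to each interpolated field in the row template, a detail such as `"uses a|b\nsee note"` becomes `uses a\|b<br>see note` and stays inside one cell.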
    function yearsBetween(olderDate, newerDate) {
      const older = new Date(`${olderDate}T00:00:00Z`);
      const newer = new Date(`${newerDate}T00:00:00Z`);
      return (newer.getTime() - older.getTime()) / (365.25 * 24 * 60 * 60 * 1000);
    }

    function latestDate(dates) {
      const validDates = dates.filter(Boolean).sort();
Addressed in 1cddc7d. Date parsing now validates ISO YYYY-MM-DD strings and finite timestamps. Invalid support dates produce an INVALID_SUPPORT_DATE finding and, if no valid support date remains, the existing missing-date hold still fires. Added regression coverage.
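
One way to implement the validation described in this reply is sketched below. `parseIsoDate` and `splitSupportDates` are hypothetical names, and the round-trip check is an assumption about how calendar-invalid dates such as 2024-02-30 might be rejected; the commit itself is not shown here.

```javascript
// Sketch only: hypothetical helpers illustrating strict YYYY-MM-DD validation.
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Returns the UTC timestamp in milliseconds, or null if the input is invalid.
function parseIsoDate(value) {
  if (typeof value !== "string" || !ISO_DATE.test(value)) return null;
  const ms = Date.parse(`${value}T00:00:00Z`);
  if (!Number.isFinite(ms)) return null;
  // Some engines roll 02-30 over into March; a round trip catches that.
  return new Date(ms).toISOString().slice(0, 10) === value ? ms : null;
}

// Partitions support dates so the caller can raise INVALID_SUPPORT_DATE
// for each invalid entry and still pick the latest valid one.
function splitSupportDates(dates) {
  const valid = [];
  const invalid = [];
  for (const date of dates || []) {
    (parseIsoDate(date) === null ? invalid : valid).push(date);
  }
  // Validated ISO strings sort chronologically under plain string order.
  return { valid: valid.sort(), invalid };
}
```

If `valid` comes back empty, the caller can fall through to the existing missing-date hold, which matches the behavior described in the reply.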
Follow-up commit
Re-ran:
/claim #16
Summary
Adds a distinct literature freshness review assistant for the AI-Powered Research Assistant Suite.
This slice compares manuscript claims against a synthetic current-evidence ledger before AI reviewer packets are released, checking:
Non-overlap
This is not another broad assistant suite, evidence/protocol trace, statistics review, research-gap planner, rebuttal pack, ethics/data availability check, citation-context reconciler, reporting-guideline compliance module, benchmark-leakage audit, figure/table consistency assistant, analysis-variable provenance assistant, domain-template selector, grant-fit module, limitations disclosure assistant, uncertainty calibration assistant, supplement-readiness assistant, prompt-safety guard, study-power checker, COI/funding disclosure checker, retraction sentinel, preregistration deviation assistant, external-validity assistant, image-integrity assistant, or assay control/calibration assistant. It focuses specifically on whether reviewer-facing claims are current against newer evidence and temporal-drift signals.
Safety
- literature-freshness-review-assistant/sample-data.js
Demo artifacts
- literature-freshness-review-assistant/reports/demo.json
- literature-freshness-review-assistant/reports/demo.md
- literature-freshness-review-assistant/reports/demo.svg
- literature-freshness-review-assistant/reports/demo.mp4
Validation
- npm run check
- npm test
- npm run demo
- npm run demo:video with FFMPEG_PATH pointing to a temporary ffmpeg-static binary outside the repo
- ffmpeg -v error -i literature-freshness-review-assistant/reports/demo.mp4 -f null -
- git diff --check
- git diff --cached --check
- rg -n "token|secret|password|private key|BEGIN|sk-|ghp_|github_pat|wallet|seed phrase" README.md literature-freshness-review-assistant -> no matches
AI-assisted with Codex; reviewed and locally verified before submission.