feat(varpro): unsupervised lasso importance — gg_beta_uvarpro() + gg_sdependent() #120
Closed
ehrlinger wants to merge 3 commits into
Scaffold the unsupervised analogue of gg_beta_varpro(): a tidy wrapper + plot method around varPro::get.beta.entropy() for uvarpro() fits. From the (released-variable x variable) lasso-coefficient matrix it computes beta_mean = colMeans(|beta|) per variable (most-important first, factor levels reversed for the coord_flip top-at-top convention) and flags variables above a selection cutoff (default mean(beta_mean)). Optional beta_fit accepts a precomputed get.beta.entropy() matrix so the expensive CV-lasso runs once.

Follows the get.beta.entropy + sdependent "lasso importance" workflow from the varPro::uvarpro() help (iowa-housing example), per Lu/Ishwaran.

Tests: deterministic logic via the precomputed beta_fit path with a mock matrix (CRAN-safe, no live grow) + one skip_on_cran() live integration test. check_man() clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
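The aggregation described in this commit can be sketched in base R. This is a minimal illustration, not the package's implementation: `tidy_beta_importance()` and the toy matrix are hypothetical, and the strict `>` comparison against the cutoff is an assumption about how "above" is meant.

```r
# Hypothetical stand-in for a get.beta.entropy()-style matrix:
# rows = released variables, columns = variables, NA = not estimated.
beta <- matrix(
  c(  NA, 0.8, -0.1,
     0.6,  NA,  0.0,
    -0.2, 0.4,   NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("x1", "x2", "x3"), c("x1", "x2", "x3"))
)

# Illustrative helper (name is an assumption): aggregate |beta| per column,
# rank most-important first, and flag variables above the cutoff.
tidy_beta_importance <- function(beta_fit, cutoff = NULL) {
  beta_mean <- colMeans(abs(beta_fit), na.rm = TRUE)
  if (is.null(cutoff)) cutoff <- mean(beta_mean)  # default cutoff
  ord  <- order(beta_mean, decreasing = TRUE)
  vars <- names(beta_mean)[ord]
  data.frame(
    # Reversed levels so the top variable is drawn at the top after coord_flip()
    variable  = factor(vars, levels = rev(vars)),
    beta_mean = unname(beta_mean[ord]),
    selected  = unname(beta_mean[ord] > cutoff)
  )
}

imp <- tidy_beta_importance(beta)
print(imp)
```

Because the helper takes the matrix as an argument, the same precomputed matrix can be passed to several wrappers, which is the point of the beta_fit path.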
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## dev #120 +/- ##
==========================================
+ Coverage 87.26% 87.84% +0.58%
==========================================
Files 48 52 +4
Lines 4090 4352 +262
==========================================
+ Hits 3569 3823 +254
- Misses 521 529 +8
Add the standard gg_* S3 companions, matching gg_beta_varpro:
- print.gg_beta_uvarpro: one-line header + selected/region summary
- summary.gg_beta_uvarpro: summary.gg with top variables by mean |beta|
- autoplot.gg_beta_uvarpro: dispatches to plot.gg_beta_uvarpro

Methods live in gg_beta_uvarpro.R; @rdname routes them onto the shared print.gg / summary.gg pages. Tests + check_man clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
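For readers unfamiliar with the S3 companion pattern, here is a rough sketch of print and summary methods on a classed data frame. The class name `gg_beta_demo`, the header wording, and the `n` argument are assumptions for illustration and do not reproduce the package's methods; the autoplot companion is omitted to keep the sketch free of ggplot2.

```r
# Hypothetical object: a data frame carrying an extra S3 class.
obj <- structure(
  data.frame(
    variable  = c("x2", "x1", "x3"),
    beta_mean = c(0.6, 0.4, 0.05),
    selected  = c(TRUE, TRUE, FALSE)
  ),
  class = c("gg_beta_demo", "data.frame")
)

# print method: one-line header, returns the object invisibly.
print.gg_beta_demo <- function(x, ...) {
  cat(sprintf("<gg_beta_demo> %d variables, %d selected above cutoff\n",
              nrow(x), sum(x$selected)))
  invisible(x)
}

# summary method: top n variables by mean |beta|.
summary.gg_beta_demo <- function(object, n = 2, ...) {
  df <- as.data.frame(object)  # drops the leading demo class
  df <- df[order(df$beta_mean, decreasing = TRUE), , drop = FALSE]
  utils::head(df, n)
}

print(obj)
top <- summary(obj)
```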
Wrap varPro::sdependent() (signal-variable detection on the get.beta.entropy() matrix) as a tidy gg_sdependent frame: one row per candidate variable with imp_score, graph degree, and a signal flag (membership in sdependent()$signal.vars), ranked by imp_score. Default plot is a ranked lollipop coloured by the signal flag. print/summary/autoplot companions follow the gg_* conventions; beta_fit accepts a precomputed entropy matrix shared with gg_beta_uvarpro()/gg_udependent().

Complements gg_udependent() (the dependency graph) with the "which variables are signal" ranking.

Matrix-driven tests run CRAN-safe (sdependent does not grow a forest); the live uvarpro() grow is skip_on_cran(). check_man() clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
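The frame shape described above can be sketched from mock inputs. Only the `signal.vars` element name comes from the commit text; the helper `build_signal_frame()`, the score and degree values, and how those two quantities are derived are assumptions for illustration.

```r
# Invented inputs: per-variable importance score and graph degree.
imp_score <- c(x1 = 0.10, x2 = 0.75, x3 = 0.40, x4 = 0.05)
degree    <- c(x1 = 1L,   x2 = 3L,   x3 = 2L,   x4 = 0L)

# Mock of an sdependent()-style result; only signal.vars is used here.
sdep_mock <- list(signal.vars = c("x2", "x3"))

# Illustrative builder (name is hypothetical): one row per candidate
# variable, flagged by membership in signal_vars, ranked by imp_score.
build_signal_frame <- function(imp_score, degree, signal_vars) {
  vars <- names(imp_score)
  out <- data.frame(
    variable  = vars,
    imp_score = unname(imp_score),
    degree    = unname(degree[vars]),
    signal    = vars %in% signal_vars
  )
  out <- out[order(out$imp_score, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

sig <- build_signal_frame(imp_score, degree, sdep_mock$signal.vars)
print(sig)
```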
ehrlinger added a commit that referenced this pull request Jun 12, 2026
Fold the unsupervised varPro wrappers (originally PR #120 on dev) into the 3.1.2 release line alongside the #118 fix:
- gg_beta_uvarpro() / plot/print/summary/autoplot: tidy wrapper for varPro::get.beta.entropy() (unsupervised lasso importance), the analogue of gg_beta_varpro(); colMeans(|beta|) per variable + cutoff/selected, with a precomputed beta_fit cache path.
- gg_sdependent() / plot/print/summary/autoplot: tidy wrapper for varPro::sdependent() signal-variable detection (imp_score / degree / signal flag), complementing gg_udependent()'s graph. Frame-building split into .gg_sdependent_build() to keep cyclomatic complexity under the lint cap.

Tests are CRAN-safe (constructed matrices, no varPro grow) with skip_on_cran live-integration cases. lint_package() = 0; NEWS entries under v3.1.2.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
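The split between CRAN-safe tests on constructed matrices and skipped live cases might look roughly like the following testthat sketch. `beta_mean_from()` and `slow_live_fit()` are hypothetical stand-ins: the first mimics the aggregation step, the second replaces the real forest grow and CV-lasso with random numbers so the example stays self-contained.

```r
library(testthat)

# Stand-in for the wrapper's aggregation step (name is an assumption).
beta_mean_from <- function(beta_fit) colMeans(abs(beta_fit), na.rm = TRUE)

# Stand-in for the expensive live fit; random numbers, not a real grow.
slow_live_fit <- function() {
  set.seed(1)
  matrix(rnorm(9), nrow = 3, dimnames = list(NULL, c("a", "b", "c")))
}

# Always runs: deterministic input, no model fitting.
test_that("aggregation is deterministic on a constructed matrix", {
  mock <- matrix(c(1, -3, NA, 2), nrow = 2,
                 dimnames = list(NULL, c("a", "b")))
  expect_equal(beta_mean_from(mock), c(a = 2, b = 2))
})

# Skipped on CRAN: exercises the slow path end to end.
test_that("live path returns one non-negative value per variable", {
  skip_on_cran()
  out <- beta_mean_from(slow_live_fit())
  expect_length(out, 3)
  expect_true(all(out >= 0))
})
```

Keeping the deterministic case free of any fit means the logic stays covered on CRAN even though the live case is skipped there.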
Adds two `uvarpro`-side wrappers (the "new methods" Hemant/Lu pointed to in the `varPro::uvarpro()` help, iowa housing: illustrates lasso importance), each with `plot`/`print`/`summary`/`autoplot` companions.

`gg_beta_uvarpro()`: unsupervised lasso importance

Wraps `varPro::get.beta.entropy()`. Aggregates the (released-variable × variable) |β| matrix into `beta_mean = colMeans(|β|, na.rm=TRUE)` per variable (most-important first, reversed factor levels for top-at-top), with `cutoff`/`selected` and a `beta_fit` cache path. The unsupervised analogue of `gg_beta_varpro()`.

`gg_sdependent()`: signal-variable detection

Wraps `varPro::sdependent()`. One row per candidate variable (`imp_score`, graph `degree`, and a `signal` flag for membership in `sdependent()$signal.vars`), ranked by `imp_score` and plotted as a lollipop coloured by signal. Complements `gg_udependent()` (which already wraps `sdependent()` as the dependency graph on `dev`) with the "which variables are signal" ranking. Shares the `beta_fit` entropy matrix with `gg_beta_uvarpro()`/`gg_udependent()`.

Tests

- `gg_beta_uvarpro`: deterministic logic via mock matrix (CRAN-safe, no live grow) + `skip_on_cran` live integration.
- `gg_sdependent`: matrix-driven tests run CRAN-safe (`sdependent()` does not grow a forest) + `skip_on_cran` live `uvarpro()` integration.
- `check_man()` clean; new files only.

Follow-ups

- `get.beta.entropy()` (`pre.filter`, `second.stage`, `lambda.sel`): left at varPro defaults pending the analytical goal.
- `sdependent`-style overlay on `gg_udependent` (node colour = signal).

Targets the v4.0.0 (`dev`) line. Self-contained, so it can also ride a quick 3.1.2 off `main` if you want it on CRAN sooner.

🤖 Generated with Claude Code