v3.0.0b3 – Notebook Demo Update & Bug Fixes #1196
Open
TransformerBridge.add_hook() now accepts a callable filter `(str) -> bool` as the name parameter, matching the HookedTransformer API. When a callable is passed, the hook is added to every hook point whose name satisfies the filter. This was already supported in run_with_hooks() but missing from add_hook(), causing AttributeError when migrating notebooks that use filter-based hook registration.
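The name-or-filter dispatch described above can be sketched with a toy registry. This is an illustration only, not the TransformerBridge implementation: `ToyHookRegistry`, its `hooks` dict, the hook point names, and the return value of `add_hook` are all hypothetical and chosen just to show how a single `name` parameter can accept either a string or a `(str) -> bool` filter.

```python
from typing import Callable, Dict, Iterable, List, Union

HookFn = Callable[[object, str], object]
NameOrFilter = Union[str, Callable[[str], bool]]


class ToyHookRegistry:
    """Hypothetical stand-in for a model that exposes named hook points."""

    def __init__(self, hook_names: Iterable[str]) -> None:
        # One list of registered hook functions per hook point.
        self.hooks: Dict[str, List[HookFn]] = {name: [] for name in hook_names}

    def add_hook(self, name: NameOrFilter, hook: HookFn) -> List[str]:
        """Register `hook` on one hook point (str) or on every match (callable)."""
        if callable(name):
            # Filter form: attach to every hook point whose name passes the filter.
            matched = [n for n in self.hooks if name(n)]
        else:
            # String form: attach to exactly one hook point.
            if name not in self.hooks:
                raise KeyError(f"unknown hook point: {name}")
            matched = [name]
        for n in matched:
            self.hooks[n].append(hook)
        return matched


def noop_hook(activation: object, hook_name: str) -> object:
    """A hook that returns the activation unchanged."""
    return activation


registry = ToyHookRegistry(
    [
        "blocks.0.hook_resid_pre",
        "blocks.0.attn.hook_pattern",
        "blocks.1.attn.hook_pattern",
    ]
)
matched = registry.add_hook(lambda n: n.endswith("hook_pattern"), noop_hook)
single = registry.add_hook("blocks.0.hook_resid_pre", noop_hook)
```

Checking `callable(name)` first keeps the string path unchanged, so existing call sites that pass a plain hook name are unaffected.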
…lding after Layer Norm application (#1188)
* Fixed bug where stale joint QKV is being used instead of the correct split weights
* Format fixes
* Fixing typing issues
* HF optimizes that batch size information, this converts it to the true batch size to ensure replicable information
* Fixed bug for read and clear only
* Fixing issue with storing 3D tensor for hook_result when a 4D tensor is expected
* Restore hook_result
#1014)
* updated loading in exploratory analysis demo to use transformer bridge
* updated loading in exploratory analysis demo to use transformer bridge
* Fix boot_transformers kwargs and clear stale outputs
  - Move weight processing args (center_unembed, fold_ln, etc.) from boot_transformers() to enable_compatibility_mode() where they belong
  - Clear stale outputs from cell with execution_count=None

  Notebook blocked on missing TransformerBridge features: W_U property delegation (Bug 6), tokens_to_residual_directions (Bug 7), and pos_embed batch dim mismatch (Bug 3). See .claude/plans/transformer_bridge_bugs.md.
* Work on exploratory analysis demo updates
* Removed inline bug fix code in favor of systemic fixes
* Additional bug resolution
* More bug fixes
---------
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
…dge (#1021)
* updated loading in patchscopes generation demo to use transformer bridge
* Migrate Patchscopes Generation Demo to TransformerBridge
  - Replace HookedTransformer with TransformerBridge.boot_transformers()
  - Fix deprecated ipython.magic() to ipython.run_line_magic()
  - Clear stale outputs from unrun cells

  All 20 cells pass locally.
* Fixes to ensure functionality with v3.x
---------
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
* updated loading in exploratory analysis demo to use transformer bridge
* updated loading in exploratory analysis demo to use transformer bridge
* Fix boot_transformers kwargs and clear stale outputs
  - Move weight processing args (center_unembed, fold_ln, etc.) from boot_transformers() to enable_compatibility_mode() where they belong
  - Clear stale outputs from cell with execution_count=None

  Notebook blocked on missing TransformerBridge features: W_U property delegation (Bug 6), tokens_to_residual_directions (Bug 7), and pos_embed batch dim mismatch (Bug 3). See .claude/plans/transformer_bridge_bugs.md.
* Work on exploratory analysis demo updates
* Removed inline bug fix code in favor of systemic fixes
* Additional bug resolution
* More bug fixes
---------
Co-authored-by: degenfabian <fabian.degen@tuta.com>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
* updated loading in attribution patching demo to use transformer bridge
* updated loading in bert demo to use transformer bridge
* Update to allow NSP via bridge
* Format and type fixes
* Add import
* Attribution Patching moved to own branch
* Hiding Attribution patching until its own PR
---------
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
* updating loading in qwen demo to use transformer bridge
* add qwen demo to CI
* Updating Qwen Notebook for TransformerLens 3.x
* Changing model to fit in CI
---------
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
…#1011)
* updated loading in Activation Patching in TL Demo to use transformer bridge
* use undeprecated ipython code to avoid deprecation warnings
* revert metadata changes
* updated installation source
* Fix notebook CI: skip widget MIME type comparison and clear stale cell output

  Add application/vnd.jupyter.widget-view+json to conftest.py skip_compare to avoid false failures from random widget model_id values. Clear outputs from unrun cell (execution_count=null) in Activation Patching demo.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Neel Plotly import does not run in CI
* Extend processing time for slow cell
* Added NBVAL_SKIP for long running process that can't pass CI
---------
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
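For readers unfamiliar with the `skip_compare` change mentioned above, the following is a minimal sketch of how a `conftest.py` can extend the set of output types that nbval ignores when comparing notebook cells. It assumes nbval's notebook collector exposes a `skip_compare` tuple and relies on pytest's `pytest_collectstart` hook; the constant name `WIDGET_VIEW_MIME` is made up for the example, and the actual conftest.py in this PR may be written differently.

```python
# Sketch of a conftest.py fragment (assumed layout, not copied from the repo).

# MIME type of Jupyter widget outputs; the embedded model_id differs on every run.
WIDGET_VIEW_MIME = "application/vnd.jupyter.widget-view+json"


def pytest_collectstart(collector):
    """Tell nbval-style notebook collectors to skip widget output comparison."""
    # Newer pytest versions expose `path`; older ones only have `fspath`.
    path = getattr(collector, "path", None) or getattr(collector, "fspath", None)
    is_notebook = path is not None and str(path).endswith(".ipynb")
    # Only touch collectors that actually carry a skip_compare tuple.
    if is_notebook and hasattr(collector, "skip_compare"):
        if WIDGET_VIEW_MIME not in collector.skip_compare:
            collector.skip_compare += (WIDGET_VIEW_MIME,)
```

Guarding on `hasattr` keeps the hook harmless for collectors that do not come from nbval.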
* updating loading in t5 demo to use transformer bridge
* add T5 demo to CI
* Adjusting system to properly account for encoder-decoder
* Cleanup
* Small repairs
* Updating doc_sanitize
* Activation patching update
* Final cleanup for activation patching
* device cleanup
---------
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
#1013)
* updated loading in attribution patching demo to use transformer bridge
* updated loading in attribution patching demo to use transformer bridge
* Replace deprecated torchtyping import and clear stale cell outputs

  Replace `from torchtyping import TensorType as TT` with a lightweight stub class since torchtyping is not in project dependencies. Clear outputs from cells with execution_count=null.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Replace pysvelte with circuitsvis for attention visualization

  pysvelte was never imported in the notebook. Replace pysvelte.AttentionMulti with cv.attention.attention_heads from circuitsvis, which is already a project dependency.
* Use run_with_cache for forward caching, clean up stale outputs
  - Replace manual forward cache hooks with model.run_with_cache() which handles hook alias resolution automatically
  - Keep manual backward hooks for gradient caching (no built-in method)
  - Add alias entries for grad_cache to fix hook.name mismatch
  - Clear stale stderr output (DeprecationWarning for ipython.magic)
  - Clear stale error output (torchtyping ModuleNotFoundError)
  - Clear stale Cell 18 output (cache counts differ with TransformerBridge)

  Note: Notebook is blocked on TransformerBridge bugs documented in .claude/plans/transformer_bridge_bugs.md (pos_embed batch dim, cache aliasing, MPS placeholder storage). Will revisit after upstream fixes.
* Updates to Attribution Patching notebook for TransformerLens v5
* Skip excessive forward pass test, too long for CI
* Fixing output bug
* Additional notebook changes
* Rerunning the notebook
* Running the notebook again to get correct outputs
* Another attempt
---------
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
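The "lightweight stub class" mentioned above for replacing `from torchtyping import TensorType as TT` could look roughly like the sketch below. This is an assumption about the shape of such a stub, not the code in the notebook: the idea is simply that `TT[...]` must remain subscriptable so existing annotations keep parsing, while no shape or dtype checking is performed.

```python
class TT:
    """Hypothetical annotation-only stand-in for torchtyping.TensorType."""

    def __class_getitem__(cls, item):
        # Accept any subscript (e.g. TT["batch", "pos", "d_model"]) and ignore it.
        return cls


def residual_norm(resid: TT["batch", "pos", "d_model"]) -> TT["batch", "pos"]:
    """Example signature that still evaluates without torchtyping installed."""
    return resid


annotation = TT["batch", "pos", "d_model"]
```

Because the stub discards its subscript, the annotations serve as documentation only; nothing is validated at runtime.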
a08043b to fc68fe6
fc68fe6 to e6b72eb
Description
Updating notebooks to work with TransformerLens v3, resolving any discovered bugs, and confirming that all architectures still perform as expected.
Type of change
Checklist: