
v3.0.0b3 – Notebook Demo Update & Bug Fixes#1196

Open
jlarson4 wants to merge 21 commits into dev-3.x from dev-3.x-canary

Conversation

@jlarson4
Collaborator

Description

Updating notebooks to work with TransformerLens v3, resolving discovered bugs, and confirming that all architectures still perform as expected.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

jlarson4 and others added 20 commits February 27, 2026 13:28
TransformerBridge.add_hook() now accepts a callable filter
`(str) -> bool` as the name parameter, matching the HookedTransformer
API. When a callable is passed, the hook is added to every hook point
whose name satisfies the filter. This was already supported in
run_with_hooks() but missing from add_hook(), causing AttributeError
when migrating notebooks that use filter-based hook registration.
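The callable-filter behavior can be pictured with a minimal name-resolution sketch (pure Python, with hypothetical hook-point names; not the actual TransformerBridge implementation):

```python
def resolve_hook_points(hook_point_names, name_or_filter):
    """Mimic add_hook name resolution: an exact string matches one hook
    point; a callable (str) -> bool selects every matching hook point."""
    if callable(name_or_filter):
        return [n for n in hook_point_names if name_or_filter(n)]
    return [n for n in hook_point_names if n == name_or_filter]

# Hypothetical hook-point names, for illustration only
names = [
    "blocks.0.attn.hook_z",
    "blocks.0.hook_resid_post",
    "blocks.1.attn.hook_z",
]

# String form: one exact match
exact = resolve_hook_points(names, "blocks.0.hook_resid_post")

# Callable form: every hook point whose name satisfies the filter
filtered = resolve_hook_points(names, lambda name: name.endswith("hook_z"))
```

With this change, the same filter callables work in both `add_hook()` and `run_with_hooks()`, so notebooks can register filter-based hooks either way.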
…lding after Layer Norm application (#1188)

* Fixed bug where stale joint QKV is being used instead of the correct split weights

* Format fixes

* Fixing typing issues
* HF optimizes away the batch size information; this converts it back to the true batch size to ensure reproducible results

* Fixed bug for read and clear only
* Fixing issue with storing a 3D tensor for hook_result when a 4D tensor is expected

* Restore hook_result
#1014)

* updated loading in exploratory analysis demo to use transformer bridge

* updated loading in exploratory analysis demo to use transformer bridge

* Fix boot_transformers kwargs and clear stale outputs

- Move weight processing args (center_unembed, fold_ln, etc.) from
  boot_transformers() to enable_compatibility_mode() where they belong
- Clear stale outputs from cell with execution_count=None

Notebook blocked on missing TransformerBridge features: W_U property
delegation (Bug 6), tokens_to_residual_directions (Bug 7), and
pos_embed batch dim mismatch (Bug 3). See .claude/plans/transformer_bridge_bugs.md.

* Work on exploratory analysis demo updates

* Removed inline bug fix code in favor of systematic fixes

* Additional bug resolution

* More bug fixes

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
…dge (#1021)

* updated loading in patchscopes generation demo to use transformer bridge

* Migrate Patchscopes Generation Demo to TransformerBridge

- Replace HookedTransformer with TransformerBridge.boot_transformers()
- Fix deprecated ipython.magic() to ipython.run_line_magic()
- Clear stale outputs from unrun cells

All 20 cells pass locally.
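The deprecation fix above swaps `ipython.magic("%pip install x")` for `ipython.run_line_magic("pip", "install x")`. A small compatibility helper (`run_magic` is a hypothetical name, used only to illustrate the translation) sketches the mapping:

```python
def run_magic(ipython, magic_line):
    """Run a line magic like '%pip install foo' via the non-deprecated
    run_line_magic(name, rest) API, falling back to magic() if absent."""
    name, _, rest = magic_line.lstrip("%").partition(" ")
    if hasattr(ipython, "run_line_magic"):
        return ipython.run_line_magic(name, rest)
    return ipython.magic(magic_line)  # deprecated path, old IPython only
```

In a notebook this would be called as `run_magic(get_ipython(), "%pip install circuitsvis")`, which also silences the DeprecationWarning that `magic()` emits.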

* Fixes to ensure functionality with v3.x

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
* updated loading in exploratory analysis demo to use transformer bridge

* updated loading in exploratory analysis demo to use transformer bridge

* Fix boot_transformers kwargs and clear stale outputs

- Move weight processing args (center_unembed, fold_ln, etc.) from
  boot_transformers() to enable_compatibility_mode() where they belong
- Clear stale outputs from cell with execution_count=None

Notebook blocked on missing TransformerBridge features: W_U property
delegation (Bug 6), tokens_to_residual_directions (Bug 7), and
pos_embed batch dim mismatch (Bug 3). See .claude/plans/transformer_bridge_bugs.md.

* Work on exploratory analysis demo updates

* Removed inline bug fix code in favor of systematic fixes

* Additional bug resolution

* More bug fixes

---------

Co-authored-by: degenfabian <fabian.degen@tuta.com>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
* updated loading in attribution patching demo to use transformer bridge

* updated loading in bert demo to use transformer bridge

* Update to allow NSP via bridge

* Format and type fixes

* Add import

* Attribution Patching moved to own branch

* Hiding Attribution patching until its own PR

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
* updating loading in qwen demo to use transformer bridge

* add qwen demo to CI

* Updating Qwen Notebook for TransformerLens 3.x

* Changing model to fit in CI

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
…#1011)

* updated loading in Activation Patching in TL Demo to use transformer bridge

* use undeprecated ipython code to avoid deprecation warnings

* revert metadata changes

* updated installation source

* Fix notebook CI: skip widget MIME type comparison and clear stale cell output

Add application/vnd.jupyter.widget-view+json to conftest.py skip_compare
to avoid false failures from random widget model_id values. Clear outputs
from unrun cell (execution_count=null) in Activation Patching demo.
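The conftest.py change described above follows nbval's documented pattern for skipping output MIME types; a sketch:

```python
# conftest.py (sketch) -- tell nbval not to diff widget-view outputs,
# whose random model_id values cause spurious notebook CI failures.
def pytest_collectstart(collector):
    if collector.fspath and collector.fspath.ext == ".ipynb":
        collector.skip_compare += ("application/vnd.jupyter.widget-view+json",)
```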

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Neel Plotly import does not run in CI

* Extend processing time for slow cell

* Added NBVAL_SKIP for a long-running process that can't pass CI
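nbval's skip marker is a comment on the first line of a notebook cell; a sketch (the computation below is a stand-in for the slow process, not the notebook's actual code):

```python
# NBVAL_SKIP
# nbval sees this first-line marker and skips executing the cell
# during `pytest --nbval` runs, so CI never times out on it.
result = sum(range(10_000_000))  # stand-in for the long-running computation
```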

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* updating loading in t5 demo to use transformer bridge

* add T5 demo to CI

* Adjusting system to properly account for encoder-decoder

* Cleanup

* Small repairs

* Updating doc_sanitize

* Activation patching update

* Final cleanup for activation patching

* device cleanup

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
#1013)

* updated loading in attribution patching demo to use transformer bridge

* updated loading in attribution patching demo to use transformer bridge

* Replace deprecated torchtyping import and clear stale cell outputs

Replace `from torchtyping import TensorType as TT` with a lightweight
stub class since torchtyping is not in project dependencies. Clear
outputs from cells with execution_count=null.
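A lightweight stub along these lines keeps the `TT[...]` annotations importable without adding torchtyping as a dependency (a sketch of the change described above; the annotated function is a hypothetical example):

```python
try:
    from torchtyping import TensorType as TT  # use the real package if present
except ImportError:
    class TT:
        """Minimal stand-in so annotations like TT["batch", "pos"] still
        parse; it carries no runtime type checking."""
        def __class_getitem__(cls, item):
            return cls

def logit_diff_stub(logits: TT["batch", "pos", "d_vocab"]):
    # Annotation is documentation only; the value passes through unchanged.
    return logits
```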

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Replace pysvelte with circuitsvis for attention visualization

pysvelte was never imported in the notebook. Replace
pysvelte.AttentionMulti with cv.attention.attention_heads from
circuitsvis, which is already a project dependency.

* Use run_with_cache for forward caching, clean up stale outputs

- Replace manual forward cache hooks with model.run_with_cache() which
  handles hook alias resolution automatically
- Keep manual backward hooks for gradient caching (no built-in method)
- Add alias entries for grad_cache to fix hook.name mismatch
- Clear stale stderr output (DeprecationWarning for ipython.magic)
- Clear stale error output (torchtyping ModuleNotFoundError)
- Clear stale Cell 18 output (cache counts differ with TransformerBridge)
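The grad_cache alias fix can be pictured with a small sketch (hypothetical names; not the actual bridge code): gradients arriving from backward hooks under legacy `hook.name` values are stored under the canonical names the notebook looks up.

```python
class GradCache(dict):
    """Store backward-hook gradients keyed by canonical hook name,
    resolving alias names (e.g. legacy hook.name values) first."""
    def __init__(self, aliases=None):
        super().__init__()
        self.aliases = dict(aliases or {})

    def store(self, hook_name, grad):
        # Resolve the alias so lookups by canonical name succeed
        self[self.aliases.get(hook_name, hook_name)] = grad

# Hypothetical alias: a legacy hook name maps onto the canonical one
grad_cache = GradCache(
    aliases={"blocks.0.attn.hook_result_legacy": "blocks.0.attn.hook_result"}
)
grad_cache.store("blocks.0.attn.hook_result_legacy", "grad-tensor")
```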

Note: Notebook is blocked on TransformerBridge bugs documented in
.claude/plans/transformer_bridge_bugs.md (pos_embed batch dim, cache
aliasing, MPS placeholder storage). Will revisit after upstream fixes.

* Updates to Attribution Patching notebook for TransformerLens v3

* Skip excessive forward pass test, too long for CI

* Fixing output bug

* Additional notebook changes

* Rerunning the notebook

* Running the notebook again to get correct outputs

* Another attempt

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: jlarson4 <jonahalarson@comcast.net>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@jlarson4 changed the title from "Dev 3.x canary" to "v3.0.0b3 – Notebook Demo Update & Bug Fixes" on Mar 11, 2026