
Add support for LoRA with Transformer Engine #3048

Open
balvisio wants to merge 5 commits into huggingface:main from balvisio:dev/ba/support-te-lora

Conversation

@balvisio

This PR adds support for recognizing TransformerEngine layers (https://github.com/NVIDIA/TransformerEngine) as valid target layers for LoRA adapters.

@BenjaminBossan
Member

Thanks for this PR to add support for TE in PEFT. Before proceeding further, do you have a practical example of how to use it with a transformers model? I assume one way would be for the user to employ accelerate to apply TE. Also, I saw that you intend to replace the nn.Linear layers used for lora_A and lora_B with TE's low FP precision layers. Did you test if that works well? Usually, we would keep these layers in fp32 (or fp16/bf16), even if the base layer uses lower precision (say, int4).
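For context on the precision point above: LoRA adds a low-rank update on top of the frozen base layer's output, and the lora_A/lora_B matrices are usually kept in higher precision than the base weights. A minimal NumPy sketch of that arithmetic (illustrative only, not PEFT's actual code; the names and shapes are chosen for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 8, 4, 2, 16

# Frozen base weight in low precision (stand-in for a quantized/TE layer).
W = rng.standard_normal((d_out, d_in)).astype(np.float16)

# LoRA adapters kept in float32: A is random, B starts at zero,
# so the adapted model initially matches the base model exactly.
A = rng.standard_normal((r, d_in)).astype(np.float32)
B = np.zeros((d_out, r), dtype=np.float32)

x = rng.standard_normal((3, d_in)).astype(np.float32)
base = x @ W.astype(np.float32).T
update = (alpha / r) * (x @ A.T) @ B.T
y = base + update
```

Because B is zero-initialized, `y` equals `base` at initialization; training then updates only A and B while W stays frozen in its low-precision format.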

@balvisio
Author

Thank you for taking a look at this. The primary goal of this PR is to support models that already contain TE layers. In fact, using te.Linear for the adapters is not strictly necessary; however, by default these TE layers will use the default torch dtype.
For the example, we have created and validated a "recipe" to fine-tune an ESM2 model using LoRA here: https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes/recipes/esm2_peft_te

@BenjaminBossan
Member

The primary goal of this PR is to support models with TE layers already inside them.

I see, could you please provide a small example of how this looks in practice? We should eventually add a unit test with a real model architecture anyway, so this example would also help for that.

using te.Linear for the adapters is not strictly necessary.

I think for consistency, we should stick with float32 by default. We could think of an option to use 4bit and 8bit floats but I'd have to think about how the API for that should look.

For the example, we have created and validated a "recipe" to fine-tune an ESM2 model using LoRA here: https://github.com/NVIDIA/bionemo-framework/tree/main/bionemo-recipes/recipes/esm2_peft_te

I think it would be great to include such an example here. Maybe simplified if possible (using Trainer, no DDP, no extra logging). WDYT?

@balvisio
Author

We should eventually add a unit test with a real model architecture anyway, so this example would also help for that.
There is a unit test that creates a toy model that uses TE layers. Is that what you meant? Otherwise I can change the test to: AutoModelForTokenClassification.from_pretrained("nvidia/esm2_t6_8M_UR50D", config=config, trust_remote_code=True, dtype="bfloat16"). Let me know.

I think for consistency, we should stick with float32 by default.
Do you mean to remove using TE layers as adapters? I can do that, but just to make sure: currently the TE adapters will use the default torch dtype, not necessarily a low precision.

I think it would be great to include such an example here
Where should I add the example exactly? Is there a doc with examples?

@BenjaminBossan
Member

There is a unit test that creates a toy model that uses TE layers. Is that what you meant? Otherwise I can change the test to do

Yeah, we can use somewhat heavier models in test_gpu_examples.py. My goal would be to use a real model there, or at least a "tiny" version of a real model, instead of a toy model.

Do you mean to remove using TE layers as adapters?

Yes, so keep the base layers as they are and use normal nn.Linear for LoRA.
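To illustrate the direction suggested here with a hypothetical sketch (not the PR's actual code): the base layer is left untouched whatever its type, and the adapters are plain nn.Linear modules, which default to float32. It assumes the base layer exposes in_features/out_features, as both te.Linear and nn.Linear do:

```python
import torch
import torch.nn as nn

class LoraWrapper(nn.Module):
    """Hypothetical sketch: wrap a linear-like base layer (e.g. te.Linear)
    and add plain nn.Linear adapters, which default to float32."""

    def __init__(self, base_layer, r=8, alpha=16):
        super().__init__()
        self.base_layer = base_layer
        for p in self.base_layer.parameters():
            p.requires_grad = False  # freeze the base layer
        self.lora_A = nn.Linear(base_layer.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base_layer.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)  # standard LoRA init: B starts at zero
        self.scaling = alpha / r

    def forward(self, x):
        out = self.base_layer(x)
        # Run the adapters in their own (float32) dtype, then cast back.
        update = self.lora_B(self.lora_A(x.to(self.lora_A.weight.dtype)))
        return out + self.scaling * update.to(out.dtype)

layer = LoraWrapper(nn.Linear(16, 16))
y = layer(torch.randn(2, 16))
```

With this shape, only lora_A.weight and lora_B.weight are trainable, and at initialization the wrapped layer's output matches the base layer's output exactly.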

I can do that but just to make sure, currently the TE adapters will use the default torch dtype, not necessarily a low precision.

Is there any advantage then?

Where should I add the example exactly? Is there a doc with examples?

Docs would be nice, but I was thinking of the examples/ folder.

balvisio force-pushed the dev/ba/support-te-lora branch from 0a19da8 to ffb0f7b on February 25, 2026 at 12:57
balvisio force-pushed the dev/ba/support-te-lora branch from ffb0f7b to d4984b1 on February 25, 2026 at 19:41
@balvisio
Author

@BenjaminBossan : I changed the tests to use a TE-based model and added an example.

@BenjaminBossan
Member
Thanks for updating the PR. I could successfully run the unit tests and overall the changes look good. However, I still have a few comments, please check. Most notably, we should avoid running anything with trust_remote_code=True by default.

@BenjaminBossan
Member

Thanks for the updates, not much is missing at this point.

Besides the comments I made, could you please also add an entry to the docs?

https://github.com/huggingface/peft/blob/main/docs/source/developer_guides/quantization.md

It doesn't have to be long, but having it helps users discover the feature.

balvisio force-pushed the dev/ba/support-te-lora branch from 3b57840 to 22b4527 on February 27, 2026 at 20:07
@balvisio
Copy link
Author

@BenjaminBossan Thanks for looking at this. I have addressed your comments.
