feat(compression): update tooling to use DECODE operators #3400
Conversation
Implement unified module for creating, reading, and modifying TFLite models with a clean API. The module eliminates manual index tracking and buffer management through automatic bookkeeping, supporting both declarative and imperative construction styles. Wrapper classes (Tensor, Operator, Subgraph, Model) hold the underlying flatbuffer T objects as backing storage rather than copying fields into dataclasses. This ensures all schema fields are preserved during read-modify-write cycles, even fields not explicitly handled by model_editor. Future schema additions will be preserved automatically. Add comprehensive test coverage including field preservation tests that verify unhandled schema fields survive read-modify-write. BUG=implements tensorflow#3256
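For illustration, a minimal sketch of what the declarative construction style might look like. Only Operator's opcode/custom_code/inputs/outputs keywords appear elsewhere in this PR; the Tensor, Subgraph, and Model arguments shown here are assumptions, not the actual API.

```python
# Hypothetical sketch of declarative construction with model_editor; keyword
# arguments other than Operator's are assumptions for illustration.
import numpy as np

input_tensor = model_editor.Tensor(shape=(1, 8), dtype=np.int8)
weights = model_editor.Tensor(
    shape=(4, 8), dtype=np.int8, data=np.zeros((4, 8), dtype=np.int8))
output = model_editor.Tensor(shape=(1, 4), dtype=np.int8)

fc = model_editor.Operator(
    opcode=tflite.BuiltinOperator.FULLY_CONNECTED,
    inputs=[input_tensor, weights],
    outputs=[output],
)

model = model_editor.Model(
    subgraphs=[model_editor.Subgraph(operators=[fc])])
```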
…_editor Replace model_facade with model_editor in compress.py and tests. model_editor provides a cleaner API with better buffer and metadata handling. Update BUILD dependencies accordingly. BUG=implements tensorflow#3256
Remove model_facade module and its tests, now superseded by model_editor. BUG=implements tensorflow#3256
…ess_test Replace dictionary-based test_models.build() with model_editor's declarative API for building test models. BUG=implements tensorflow#3256
Remove test_models module and its tests, now superseded by model_editor. BUG=implements tensorflow#3256
Add decode module with DecodeType constants and DecodeCommonMetadata, per the TFLM DECODE Operator Design document. BUG=implements tensorflow#3256
Define the plugin interface for compression methods. Each compressor implements the Compressor protocol with a compress() method that returns encoded data and ancillary data. BUG=implements tensorflow#3256
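As a sketch, the plugin interface described above might be expressed roughly like this in Python. The parameter and return types are assumptions; only the Compressor name and a compress() method returning encoded plus ancillary data come from the commit message.

```python
# Sketch of the plugin interface; parameter and return types are assumptions.
from typing import Protocol

import numpy as np


class Compressor(Protocol):
  """Protocol implemented by each compression method plugin."""

  def compress(self, tensor_data: np.ndarray) -> tuple[bytes, bytes]:
    """Compresses tensor_data, returning (encoded_data, ancillary_data)."""
    ...
```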
Implement LutCompressor using the Compressor protocol. Lookup table compression replaces tensor values with indices into a table of unique values, producing packed indices and ancillary data in the format expected by the TFLM DECODE kernel. Supports per-tensor and per-channel compression, sizes value tables to actual unique count, and handles unquantized tensors. BUG=implements tensorflow#3256
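The core of lookup-table compression can be shown in a few lines of NumPy. This is a conceptual illustration only, not the LutCompressor implementation, which additionally packs the indices at a minimal bitwidth and emits the ancillary data format expected by the DECODE kernel.

```python
import numpy as np

tensor = np.array([3, 7, 3, 3, 7, 1, 1, 7], dtype=np.int8)

# Each tensor value is replaced by an index into a table of unique values.
value_table, indices = np.unique(tensor, return_inverse=True)
# value_table == [1, 3, 7]; indices == [1, 2, 1, 1, 2, 0, 0, 2]

# Three unique values fit in 2-bit indices; the real compressor packs these
# indices into a dense bitstream and sizes the value table to the unique count.
index_bitwidth = max(1, int(np.ceil(np.log2(len(value_table)))))
```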
Add spec types, YAML parser support, and plugin stubs for Huffman and Pruning compression methods. The plugins raise CompressionError when invoked, to be replaced with working implementations later. BUG=implements tensorflow#3256
Add alt_decompression_memory_size parameter to the Python interpreter API. When non-zero, allocates a separate memory region for DECODE operator outputs and calls SetDecompressionMemory before AllocateTensors. BUG=implements tensorflow#3256
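A hedged usage sketch: the module path and constructor shown below are assumptions about the TFLM Python interpreter API; only the alt_decompression_memory_size parameter itself comes from this change.

```python
# Illustrative only: the runtime module path and from_bytes() constructor are
# assumptions; alt_decompression_memory_size is the parameter added here.
from tflite_micro.python.tflite_micro import runtime

interpreter = runtime.Interpreter.from_bytes(
    compressed_model_bytes,
    # A non-zero size allocates a separate memory region for DECODE operator
    # outputs and calls SetDecompressionMemory before AllocateTensors.
    alt_decompression_memory_size=4096,
)
```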
Insert DECODE operators before consumers of compressed tensors. Each consumer gets its own DECODE operator to support alternate decompression memory, which resets allocations between DECODE invocations. After insertion, compressed tensors are rewritten to hold encoded data as UINT8 with shape matching byte count. BUG=implements tensorflow#3256
Replace monolithic compression logic with a dispatch table that routes compression requests to plugin modules based on the spec's compression method type. After compressing tensors, insert DECODE operators into the model graph. Warn when compression expands data, helping users identify tensors that don't benefit from compression. BUG=implements tensorflow#3256
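A conceptual sketch of such a dispatch table. Module and attribute names here are assumptions (LutCompressor and CompressionError appear elsewhere in the PR; the rest is illustrative).

```python
# Illustrative dispatch from spec method type to plugin module; names assumed.
_COMPRESSORS = {
    "lut": lut.LutCompressor,
    "huffman": huffman.HuffmanCompressor,
    "pruning": pruning.PruningCompressor,
}


def _compress_tensor(spec, tensor_data):
  try:
    compressor = _COMPRESSORS[spec.method]()
  except KeyError:
    raise CompressionError(f"unknown compression method: {spec.method}")
  return compressor.compress(tensor_data)
```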
Add tests that compress models with LUT compression, run them through the TFLM Python interpreter, and verify outputs match uncompressed originals. Cover per-tensor and per-channel quantization, various index bitwidths, unquantized weights, and alternate decompression memory. BUG=implements tensorflow#3256
Add a manual test for verifying compression on proprietary models that can't be checked into the repository. See the module docstring for usage instructions. BUG=implements tensorflow#3256
```cpp
#ifdef USE_TFLM_COMPRESSION
  AddDecode();
#endif
```
AddDecode shouldn't be dependent on USE_TFLM_COMPRESSION (none of the DECODE code is conditionally compiled)
Addressed in commit d2ac3ce. The #ifdef USE_TFLM_COMPRESSION guard around AddDecode() is removed, so DECODE is registered unconditionally. The compression and proprietary integration tests also drop their with_compression_enabled gating since DECODE-based models no longer require the flag.
Explicit inheritance from Protocol enables static type checking at definition time and makes the interface self-documenting. BUG=implements tensorflow#3256
```python
# Create DECODE operator
decode_op = model_editor.Operator(
    opcode=tflite.BuiltinOperator.CUSTOM,
    custom_code=DECODE_CUSTOM_OP_NAME,
    inputs=[info.tensor, ancillary_tensor],
    outputs=[output_tensor],
)

# Insert DECODE immediately before this consumer
insert_pos = subgraph.operators.index(consumer)
subgraph.operators.insert(insert_pos, decode_op)
```
This being located here does not allow for a single DECODE operator to have multiple encoded inputs and ancillary tensors. An example is CONCATENATION, which takes multiple inputs, several of which might be encoded tensors.
The compressor design currently uses one DECODE operator per compressed tensor. Looking at the C++ kernel, I see it already supports multiple input/output pairs. Were you expecting us to batch them into a single DECODE for cases like CONCATENATION? Would that run into the same alt decompression memory problems as reusing the output of a single DECODE? Does using multiple DECODEs also run into those issues?
If a single operator (for example CONCATENATION) has multiple compressed tensors, then those tensors and their DCMs should be batched into a single DECODE operator. The DECODE kernel already handles this (as you observed) and handles the alternate decompression memory correctly in this case.
Or to put it more succinctly: multiple encoded tensors for a single operator, MUST be passed as multiple inputs to a single DECODE operator.
Addressed. Multiple compressed tensor inputs to the same operator are now batched into a single DECODE. The grouping is per-consumer, so a tensor shared across different consumers still gets a separate DECODE before each one to avoid clobbering the alternate decompression memory.
Perhaps add a test where the compression spec is empty? The original model and the no-spec "compressed" model should give the same results.
Rather than pass through an empty spec silently, the compressor now rejects an empty spec as an error, since it's almost certainly a mistake. There's a corresponding test.
Will need a test where the simple model has an operator with two inputs that are compressed (FULLY_CONNECTED weight + bias, or two inputs of CONCATENATION). Each encoded tensor could have different bit-width, thus generating different DCM for each. Or perhaps extend an existing test?
Addressed. test_multiple_compressed_inputs_batched tests a CONCATENATION with two compressed tensor inputs at different bitwidths, verifying a single DECODE with 4 inputs and 2 outputs where each ancillary tensor carries its own distinct data. test_mixed_compressed_and_uncompressed_inputs covers the case where only one of two CONCATENATION inputs is compressed.
The DECODE kernel and its dependencies are already compiled unconditionally -- none are guarded by USE_TFLM_COMPRESSION. Remove the #ifdef around AddDecode() in PythonOpsResolver so DECODE-based compressed models work in a default Python build. Remove the with_compression_enabled gating from compression and proprietary integration tests, since they use DECODE-based models that no longer require the flag.
Now that DECODE is always registered, compress() produces models that load successfully, making the old test wrong. Rewrite to inject raw COMPRESSION_METADATA into the flatbuffer metadata via model_editor, directly exercising the HasCompressionMetadata() detection path for legacy-compressed models.
Add test_multiple_compressed_inputs_batched: a CONCATENATION with two compressed tensor inputs, each with a different bitwidth, should produce a single DECODE with 4 inputs and 2 outputs, each ancillary tensor carrying its own distinct data. Marked expectedFailure until the implementation lands. Add test_mixed_compressed_and_uncompressed_inputs: a CONCATENATION with one compressed and one plain input leaves the plain input untouched. This already passes with the current code.
When a single operator (e.g., CONCATENATION) has multiple compressed tensor inputs, group them into one DECODE instead of creating a separate DECODE for each. Grouping is per-consumer, so a tensor shared across different consumers still gets a separate DECODE before each one to avoid clobbering the alternate decompression memory.
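Sketch of the resulting operator layout for a CONCATENATION with two compressed inputs; the interleaved input ordering is an assumption, chosen to be consistent with the "4 inputs and 2 outputs" described in the test above.

```python
# Two compressed inputs of one consumer batched into a single DECODE
# (the input ordering shown is an assumption for illustration).
decode_op = model_editor.Operator(
    opcode=tflite.BuiltinOperator.CUSTOM,
    custom_code=DECODE_CUSTOM_OP_NAME,
    inputs=[encoded_0, ancillary_0, encoded_1, ancillary_1],
    outputs=[decoded_0, decoded_1],
)
```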
An empty spec list passed to compress() previously returned an unmodified model silently. Fail early with a clear error instead, since an empty spec is almost certainly a mistake.
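A minimal sketch of the early rejection; the compress() signature and message text are assumptions, and CompressionError is the exception type named earlier in this PR.

```python
def compress(model_bytes, specs):
  # An empty spec is almost certainly a mistake; fail early rather than
  # silently returning the model unchanged (message text is illustrative).
  if not specs:
    raise CompressionError("no compression specs provided")
  ...
```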
```python
# Suppress TensorFlow C++ logging below the ERROR level ('2' hides the INFO
# and WARNING messages that otherwise clutter test output).
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Disable oneDNN optimizations, which can introduce small floating-point
# differences between runs and platforms.
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
```
Should add a comment on the meaning of these environment vars
Addressed in d40a84e.
This is a draft PR for running CI, review, and seeing the commits in
context. The commits along this branch will be individually submitted
for merge.
This obsoletes the original feat-decode branch (#3257), which has been
reworked to address review feedback.
See the linked issue for a description of the change.
BUG=implements #3256