Add GEP RTL support for 1D and 2D address generation #280

Merged
ShangkunLi merged 3 commits into tancheng:master from ShangkunLi:gep-rtl
Mar 31, 2026
@ShangkunLi
Collaborator

Thanks for the Claude Code!

Overview

Implements the GepRTL functional unit for GetElementPtr (GEP) operations. 1D and 2D address generation are handled directly in RTL; decomposing GEPs of three or more dimensions into basic 1D/2D GEPs plus additional mul/shift operations is left to the compiler pass.
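To illustrate the kind of decomposition meant here, the following is a minimal software sketch, not the compiler pass itself. The helper names (`gep_1d`, `gep_2d`, `gep_3d_decomposed`) and the particular split (one extra multiply, then a 1D GEP feeding a 2D GEP) are assumptions made for illustration; the actual pass may lower higher-dimensional GEPs differently.

```python
def gep_1d(base, index):
    """1D GEP as described in this PR: base + index."""
    return base + index


def gep_2d(base, index0, index1, stride):
    """2D GEP as described in this PR: base + index0 * stride + index1."""
    return base + index0 * stride + index1


def gep_3d_decomposed(base, i, j, k, stride0, stride1):
    """Hypothetical lowering of base + i*stride0 + j*stride1 + k."""
    # Extra multiply that would be emitted outside the GEP FU (assumption).
    plane_offset = i * stride0
    # A 1D GEP folds the outermost offset into the base address.
    plane_base = gep_1d(base, plane_offset)
    # A 2D GEP handles the two innermost dimensions.
    return gep_2d(plane_base, j, k, stride1)
```

For example, `gep_3d_decomposed(1000, 2, 3, 1, 64, 8)` evaluates `1000 + 2*64 + 3*8 + 1`.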

Changes

lib/opt_type.py — 4 new opcodes:

| Opcode | Value | Semantics |
| --- | --- | --- |
| OPT_GEP | 88 | result = base(in0) + index(in1) |
| OPT_GEP_CONST | 89 | result = base(const) + index(in0) |
| OPT_GEP_2D | 90 | result = base(in0) + index0(in1) * stride + index1(in2) |
| OPT_GEP_2D_CONST | 91 | result = base(const) + index0(in0) * stride + index1(in1) |
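The four semantics above can be summarized in a small software reference model. This is a sketch for reading the table, not the RTL: the function name `gep_reference` and its argument layout are hypothetical, and it ignores the data-path bit width (no wraparound is modeled).

```python
# Opcode values as listed in the PR description.
OPT_GEP = 88
OPT_GEP_CONST = 89
OPT_GEP_2D = 90
OPT_GEP_2D_CONST = 91


def gep_reference(opcode, inputs, const=0, stride=0):
    """Return the address for one GEP opcode (hypothetical software model)."""
    if opcode == OPT_GEP:
        # base(in0) + index(in1)
        return inputs[0] + inputs[1]
    if opcode == OPT_GEP_CONST:
        # base(const) + index(in0)
        return const + inputs[0]
    if opcode == OPT_GEP_2D:
        # base(in0) + index0(in1) * stride + index1(in2)
        return inputs[0] + inputs[1] * stride + inputs[2]
    if opcode == OPT_GEP_2D_CONST:
        # base(const) + index0(in0) * stride + index1(in1)
        return const + inputs[0] * stride + inputs[1]
    raise ValueError(f"not a GEP opcode: {opcode}")
```

A model like this could serve as the expected-value oracle when writing parametrized tests.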

lib/cmd_type.py — 1 new command:

  • CMD_CONFIG_GEP_STRIDE (43): pre-configures the stride register in the GEP FU via recv_from_ctrl_mem before execution, following the same CMD-based configuration pattern used in LoopCounterRTL.

fu/single/GepRTL.py (new): implements all four GEP variants. The 2D stride is stored in a per-FU register latched on CMD_CONFIG_GEP_STRIDE. The const_queue is used for the base address in *_CONST variants.
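The stride-latching behavior can be pictured with a plain-Python behavioral model. This is only a sketch: the class `GepStrideModel`, its method names, the reset value of the stride register, and the assumption that each *_CONST operation consumes one entry from the const queue are all hypothetical and not taken from GepRTL.py.

```python
from collections import deque

# Values as listed in the PR description.
CMD_CONFIG_GEP_STRIDE = 43
OPT_GEP_2D = 90
OPT_GEP_2D_CONST = 91


class GepStrideModel:
    """Behavioral stand-in for the per-FU stride register (not the RTL)."""

    def __init__(self, consts=()):
        self.stride = 0                   # assumed reset value
        self.const_queue = deque(consts)  # base addresses for *_CONST ops

    def recv_cmd(self, cmd, data):
        # The stride is latched only on the dedicated configuration command.
        if cmd == CMD_CONFIG_GEP_STRIDE:
            self.stride = data

    def execute(self, opcode, inputs):
        if opcode == OPT_GEP_2D:
            base, index0, index1 = inputs
        elif opcode == OPT_GEP_2D_CONST:
            # Assumption: one const is consumed per operation.
            base = self.const_queue.popleft()
            index0, index1 = inputs
        else:
            raise ValueError(f"unsupported opcode in this sketch: {opcode}")
        return base + index0 * self.stride + index1
```

Usage: configure once with `recv_cmd(CMD_CONFIG_GEP_STRIDE, 16)`, then issue any number of 2D operations; the latched stride stays in effect until the next configuration command.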

fu/single/test/GepRTL_test.py (new): 20 test cases covering:

  • 1D GEP with parametrized (base, index) combinations
  • 1D GEP with const base, multiple sequential operations
  • 2D GEP with CMD-configured stride
  • 2D GEP with const base and CMD-configured stride
  • Mixed 1D + 2D operations in sequence
  • Predicate propagation (zero predicate on any input propagates to output)
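The predicate rule in the last item can be sketched as follows. Operands are modeled as `(payload, predicate)` tuples; the function name `gep_2d_with_predicate` is hypothetical, and the sketch assumes the payload is still computed when the predicate is zero, which the PR description does not specify.

```python
def gep_2d_with_predicate(base, index0, index1, stride):
    """2D GEP on (payload, predicate) operands; illustrative model only."""
    # The output predicate is the AND of all input predicates, so a zero
    # predicate on any input yields a zero predicate on the output.
    predicate = base[1] & index0[1] & index1[1]
    payload = base[0] + index0[0] * stride + index1[0]
    return (payload, predicate)
```

With all predicates set, `gep_2d_with_predicate((100, 1), (2, 1), (3, 1), 8)` carries predicate 1; clearing the predicate on any one operand clears it on the result.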

@Jackcuii
Collaborator

LGTM! Thank you so much Shangkun~

@ShangkunLi ShangkunLi merged commit d914f48 into tancheng:master Mar 31, 2026
2 checks passed