This is a great study; I have never seen such an efficient algorithm. I would appreciate it if you could develop code to adapt it to pnnx.