JIT Code Generation for Arrow-DataFusion

This project is a code generation tool for the Arrow-DataFusion project.

With JIT codegen, we can generate code specialized for each query, which reduces the branching overhead of the generalized, interpreted execution mode. We can also reduce the memory footprint during execution by chaining multiple Arrow compute kernels together and reusing the intermediate vectors.
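To make the two claims above concrete, here is a minimal sketch in plain Rust. It does not use this project's API or any Arrow kernel: `Expr`, `eval_interpreted`, and `eval_fused` are hypothetical names invented for illustration, and `eval_fused` is hand-written to stand in for what generated code for the expression `(a + b) * 2.0` might look like.

```rust
/// A tiny expression tree over f64 columns (hypothetical, for illustration only).
enum Expr {
    Col(usize),
    Lit(f64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Interpreted, kernel-at-a-time evaluation: each node is dispatched through
/// the `match` and materializes a full intermediate vector.
fn eval_interpreted(expr: &Expr, cols: &[Vec<f64>], len: usize) -> Vec<f64> {
    match expr {
        Expr::Col(i) => cols[*i].clone(),
        Expr::Lit(v) => vec![*v; len],
        Expr::Add(l, r) => {
            let a = eval_interpreted(l, cols, len);
            let b = eval_interpreted(r, cols, len);
            a.iter().zip(&b).map(|(x, y)| x + y).collect()
        }
        Expr::Mul(l, r) => {
            let a = eval_interpreted(l, cols, len);
            let b = eval_interpreted(r, cols, len);
            a.iter().zip(&b).map(|(x, y)| x * y).collect()
        }
    }
}

/// Stand-in for code specialized to `(a + b) * 2.0`: a single loop with no
/// per-node dispatch and no intermediate vectors, only the output buffer.
fn eval_fused(a: &[f64], b: &[f64]) -> Vec<f64> {
    a.iter().zip(b).map(|(x, y)| (x + y) * 2.0).collect()
}
```

For this expression the interpreted path allocates a vector for every node of the tree, while the fused path allocates only the result. A real JIT would emit the fused loop at query time instead of having it written by hand.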

I've just finished the proof of concept. The first goal is to accelerate the row and columnar data transformation introduced in apache/datafusion#1782.

Development

TODOs:

  • Function registration and reuse
  • Hook JIT codegen into DataFusion's RuntimeEnv
  • Support unsigned int types
  • ...