This regressed in 1.5.0.
This affects queries that use window functions but do not reference the arrow stream table more than once, e.g.:
SELECT g, v, sum(v) OVER (PARTITION BY g) AS s FROM %s
1.4.4 always planned this as a single ARROW_SCAN_DUMB node. 1.5.0 rewrites it to a self-join with multiple ARROW_SCAN_DUMB nodes, which doesn't work because the nodes all read from the same stream. It manifests as:
java.sql.SQLException: Invalid Input Error: This stream has been released
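To illustrate why a plan with more than one scan node over the same stream would fail this way, here is a toy model in plain Java. It is not DuckDB or Arrow code: `OneShotStream`, `singleScanPlan`, and `selfJoinPlan` are hypothetical names made up for this sketch, and it assumes only that the stream can be consumed once and is released afterwards.

```java
import java.util.Iterator;
import java.util.List;

public class OneShotStreamDemo {
    /** Hypothetical stand-in for an arrow stream: it can be drained exactly once. */
    static final class OneShotStream {
        private Iterator<Integer> rows;

        OneShotStream(List<Integer> data) {
            this.rows = data.iterator();
        }

        /** Reads every row, then releases the stream; any later call fails. */
        int scanSum() {
            if (rows == null) {
                throw new IllegalStateException("This stream has been released");
            }
            int sum = 0;
            while (rows.hasNext()) {
                sum += rows.next();
            }
            rows = null; // released after the first full scan
            return sum;
        }
    }

    /** Models a plan with one scan node: the stream is read once. */
    static int singleScanPlan(OneShotStream stream) {
        return stream.scanSum();
    }

    /** Models a self-join plan: two scan nodes over the same stream. */
    static int selfJoinPlan(OneShotStream stream) {
        int left = stream.scanSum();
        int right = stream.scanSum(); // second scan hits the released stream
        return left + right;
    }

    public static void main(String[] args) {
        System.out.println(singleScanPlan(new OneShotStream(List.of(1, 2, 3))));
        try {
            selfJoinPlan(new OneShotStream(List.of(1, 2, 3)));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the query text references the stream once in both cases; only the shape of the plan differs, which is the distinction this report is about.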
Related to: duckdb/duckdb-python#70
I'm filing a separate issue because that preexisting issue describes a query that references the arrow stream multiple times. If that were the only limitation, it would be a much clearer boundary: there would still be situations where you can use arrow stream inputs, and it would be easy to understand when you can and when you can't (multiple references no, one reference yes). This new regression breaks that contract.
(PR: duckdb/duckdb#23323)