Support Materialized Views (to_table) by hadia206 · Pull Request #493 · bodo-ai/PyDough

hadia206 · 2026-02-13T22:00:39Z

Summary
This PR implements the to_table functionality for PyDough, allowing users to materialize PyDough queries as database tables or views, and then use them in subsequent queries.

Workflow
PyDough Query -> to_table() -> DDL executed -> ViewGeneratedCollection -> use in new PyDough Query

User writes PyDough query
User calls to_table() to materialize it
PyDough generates DDL (CREATE TABLE AS SELECT...)
DDL is executed on the database
Returns a collection reference to the new table (ViewGeneratedCollection)
User can use that reference in new PyDough queries

Example

# Step 1: PyDough query
asian_nations = nations.WHERE(region.name == 'ASIA')

# Steps 2-5: Materialize it as a temp table
asian_tmp = pydough.to_table(asian_nations, name='asian_nations', temp=True)

# Step 6: Use the materialized table in subsequent queries
result = asian_tmp.CALCULATE(name).ORDER_BY(name)

# Use with other collections via CROSS
result = regions.CROSS(asian_tmp).WHERE(asian_tmp.region_key == regions.key).CALCULATE(
    nation_name=asian_tmp.name,
    region_name=regions.name
)
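To make the materialize-then-reuse pattern concrete, here is a minimal sketch that performs the same two steps by hand with Python's built-in sqlite3 module. It does not use PyDough and is not the SQL that to_table() emits; the table layout, column names, and sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory database with a tiny, made-up schema (an assumption, not TPCH).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE regions (region_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE nations (nation_key INTEGER PRIMARY KEY, name TEXT, region_key INTEGER);
    INSERT INTO regions VALUES (1, 'ASIA'), (2, 'EUROPE');
    INSERT INTO nations VALUES (10, 'JAPAN', 1), (11, 'FRANCE', 2), (12, 'INDIA', 1);
    """
)

# Materialize the filtered query as a temporary table (CREATE TABLE AS SELECT).
conn.execute(
    """
    CREATE TEMP TABLE asian_nations AS
    SELECT n.nation_key, n.name, n.region_key
    FROM nations AS n
    JOIN regions AS r ON n.region_key = r.region_key
    WHERE r.name = 'ASIA'
    """
)

# A later query reads from the materialized table instead of re-running the filter.
names = [row[0] for row in conn.execute("SELECT name FROM asian_nations ORDER BY name")]
print(names)  # ['INDIA', 'JAPAN']
```

The temporary table lives only as long as the connection, which is the behavior temp=True relies on.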

Main Changes

Added to_table() function:
- Generates appropriate DDL statements for each database dialect (SQLite, MySQL, PostgreSQL, Snowflake) and returns a collection reference that can be used in subsequent PyDough queries
- Support for as_view=True to create views instead of tables
- Support for replace=True to replace existing tables/views
- Support for temp=True to create temporary tables
ViewGeneratedCollection:
- New collection type representing a user-created table/view
Added execute_ddl() method to DatabaseConnection:
- Execute DDL statements (CREATE [OR REPLACE TEMP] TABLE/VIEW, DROP TABLE/VIEW IF EXISTS)
Test Infrastructure
- Added a reset_active_session fixture that automatically resets the global active session after each test, avoiding session overlap that led to duplicate-write errors
- Tests for different PyDough queries
- Tests for different DDL statements
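As a rough sketch of how the as_view, replace, and temp options could map onto DDL text, the helper below builds the statements for one materialization. The function build_ddl and its supports_or_replace parameter are hypothetical names invented for this illustration; they are not PyDough's implementation, and the real per-dialect rules are more involved.

```python
def build_ddl(name, select_sql, *, as_view=False, replace=False, temp=False,
              supports_or_replace=True):
    """Return the DDL statements for one materialization (hypothetical helper)."""
    kind = "VIEW" if as_view else "TABLE"
    temp_kw = "TEMPORARY " if temp else ""
    statements = []
    if replace and supports_or_replace:
        # Dialects with CREATE OR REPLACE need a single statement.
        create = f"CREATE OR REPLACE {temp_kw}{kind}"
    else:
        if replace:
            # Otherwise emulate replacement with DROP followed by CREATE.
            statements.append(f"DROP {kind} IF EXISTS {name}")
        create = f"CREATE {temp_kw}{kind}"
    statements.append(f"{create} {name} AS {select_sql}")
    return statements


# Replacement emulated with DROP + CREATE:
print(build_ddl("asian_nations", "SELECT 1", replace=True, supports_or_replace=False))
# Single-statement replacement of a view:
print(build_ddl("asian_nations", "SELECT 1", as_view=True, replace=True))
```

Returning a list rather than one string keeps the DROP and CREATE steps separate, so each can be passed to something like execute_ddl() on its own.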

closes #499

hadia206 · 2026-03-09T20:27:30Z

tests/conftest.py

            schema=schema_name,
        )

+        # Sqlite's datetime functions operate in UTC,


Unrelated to the PR.

The defog Snowflake e2e tests compare PyDough results on Snowflake against reference SQL run on SQLite. SQLite's datetime functions always operate in UTC, but Snowflake defaults to Pacific Time, so time-relative queries ("last week", "today", etc.) diverge when the tests run at certain days and times. This fix makes the Snowflake test connection set TIMEZONE = 'UTC' to match SQLite's behavior.
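A minimal sketch of the idea, assuming a DB-API style cursor: issue Snowflake's ALTER SESSION statement once, right after connecting. The helper name force_utc_session and the RecordingCursor stand-in are hypothetical and exist only so the snippet runs without a Snowflake account; the actual change lives in tests/conftest.py and may be wired differently.

```python
class RecordingCursor:
    """Stand-in for a real DB-API cursor; it only records executed SQL."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


def force_utc_session(cursor):
    # Pin the session time zone so date arithmetic matches SQLite's UTC behavior.
    cursor.execute("ALTER SESSION SET TIMEZONE = 'UTC'")


cursor = RecordingCursor()
force_utc_session(cursor)
print(cursor.statements)
```

With a real connection, the same call would go through the cursor obtained from the Snowflake connector before any test query runs.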

john-sanchez31

LGTM! Just some comments below

john-sanchez31 · 2026-03-30T14:38:19Z

documentation/usage.md

+| SQLite     | No (uses DROP + CREATE)| Yes        | No (uses DROP + CREATE)| Yes       |
+| Snowflake  | Yes                    | Yes        | Yes                    | No        |
+| PostgreSQL | No (uses DROP + CREATE)| Yes        | Yes                    | No        |
+| MySQL      | No (uses DROP + CREATE)| Yes        | Yes                    | No        |


Don't forget to add Oracle here

john-sanchez31 · 2026-03-30T15:25:48Z

pydough/sqlglot/execute_relational.py

+# double-quoted when used as column aliases (especially in CTAS, where Oracle
+# creates actual column names). Sourced from Oracle 19c+ reserved word list
+# and confirmed issues with TPCH column names.
+_ORACLE_RESERVED_ALIASES: frozenset[str] = frozenset(


There is a list in error_utils.py called SQL_RESERVED_KEYWORDS with all the words that need to be quoted. Can't we add these there and use _is_sql_keyword?

They serve different purposes.

SQL_RESERVED_KEYWORDS is used for name validation: a name found there raises the error "must be a valid identifier and not a reserved word".

_ORACLE_RESERVED_ALIASES allows the names but adds double-quotes when emitting Oracle SQL. The words there (comment, date, number, key, size) are perfectly valid identifiers in SQLite, Snowflake, Postgres, and MySQL. Only Oracle doesn't like them as unquoted column aliases in CTAS.

If we merged them into SQL_RESERVED_KEYWORDS, those words would be rejected for all dialects at the name validation step i.e. users couldn't name a column key or date even on other dialects. That's too restrictive.
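The difference between the two lists can be sketched as follows. Everything here is illustrative: the set contents are small made-up subsets, and validate_name and emit_alias are hypothetical helpers, not the functions in error_utils.py or execute_relational.py.

```python
# Illustrative subset: names rejected outright, for every dialect.
VALIDATION_RESERVED = frozenset({"select", "from", "where"})
# Illustrative subset: names that are allowed but quoted when emitting Oracle SQL.
ORACLE_QUOTED_ALIASES = frozenset({"comment", "date", "number", "key", "size"})


def validate_name(name):
    """Reject invalid names at validation time, regardless of dialect."""
    if not name.isidentifier() or name.lower() in VALIDATION_RESERVED:
        raise ValueError(f"{name!r} must be a valid identifier and not a reserved word")
    return name


def emit_alias(name, dialect):
    """Accept the name everywhere; add double quotes only for Oracle."""
    if dialect == "oracle" and name.lower() in ORACLE_QUOTED_ALIASES:
        return f'"{name}"'
    return name


print(emit_alias(validate_name("key"), "sqlite"))  # key
print(emit_alias(validate_name("key"), "oracle"))  # "key"
```

Moving "key" into the first set would make validate_name raise for every dialect, which is the over-restriction described above.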

tests/test_pipeline_tpch_custom.py

knassre-bodo

LGTM, just a few comments to address before merging. One of them might be a longer matter to address.

tests/test_pipeline_tpch_custom.py

tests/testing_utilities.py

pydough/database_connectors/database_connector.py

john-sanchez31 and others added 30 commits January 8, 2026 08:41

Initial documentation

dfad733

base df collection implementation for sqlite, ansi and mysql

3d10cc0

types fixed

4ea44f8

ref sql added

486d785

implementation df collections for postgres and snowflake

0eef40d

datatypes fixed

036f39a

datatypes, numbers and inf test added

c882800

string and cross df collection tests (no fix for partition yet)

4e84394

WIP: patition with user generated collections

60b9703

Adding more range colleciton partition tests and fixing qualification…

4d6ca40

… SQLGlot bug discovered along the way [RUN CI]

fixing comments and deleting unneccesary case

63fb854

Merge branch 'John/df_collection' of https://github.com/bodo-ai/PyDough…

b0d942e

… into John/df_collection

partition 2 df collection test

f3f22ae

df collection where date test added

7a49b16

df collection top_k test added

0318e25

dataframe_collection_best test added

a2d3dd6

bad test and window function test added

c500b54

docstring and refactored code [run all]

2038159

pyarrow dependency added [run all]

cb77bd8

pyarrow dependency changed [run all]

78ec168

testing [run all]

3c2d350

testing [run all]

f15d3a1

reverting [run all]

eb53764

connectors version locked [run all]

29276b1

range test updated [run all]

6ec7c96

conflicts solved

3c9c571

dataframe test changed to tpch_custom file

62c32ce

defog deealership_adv8 test fixed

01f94d2

refsol sql added

c246449

defog dealership_adv13 added

4ffe44f

hadia206 added 2 commits March 9, 2026 13:11

update related SQL files

c3a090f

[run CI][run dialects][run s3]

1d04bf6

hadia206 added 19 commits March 25, 2026 10:37

merge conflict

27ad52f

Merge branch 'main' of https://github.com/bodo-ai/PyDough into Hadia/…

532c049

…materialize_view

Merge branch 'main' of https://github.com/bodo-ai/PyDough into Hadia/…

971e990

…materialize_view

[run CI][run dialects] Oracle support and explain

710342b

[run oracle] fix typo

a61ad7f

use private temp Oracle

7363ba9

[run CI][run dialects] cleanup tpch_custom to use all_dialects fixture

0fdede5

add missed tpch_custom_test_data_dialect_replacements

23695e5

[run CI][run SF] fix drop with temp table oracle

16379f7

[run CI][run dialects][run s3]

4680eb0

[run oracle] replace limit 0

e8d9bd6

[run CI][run dialects] make Oracle temp table not supported

432fac7

[run oracle] reserved keywords

57dcaff

[run CI][run oracle] quote reserved keyword column names

e63ece0

[run CI][run oracle] add test files. Fix date harmonize

85bab63

[run oracle] try again

c8548ca

[run oracle] update test

c004de6

[run oracle] again

68f8c6d

[run CI][run S3][run dialects] cleanup

95aab1e

hadia206 requested review from john-sanchez31 and knassre-bodo March 27, 2026 16:57

john-sanchez31 approved these changes Mar 30, 2026

knassre-bodo approved these changes Apr 8, 2026

hadia206 added 2 commits April 8, 2026 15:07

[run CI][run dialects] address Kian and John comments

3de79e8

[run CI] update dealership_adv8 SQL files

1a51245

hadia206 merged commit 43b9adf into main Apr 8, 2026
15 checks passed

hadia206 deleted the Hadia/materialize_view branch April 8, 2026 22:57

hadia206 merged 130 commits into main from Hadia/materialize_view

3 participants
