… SQLGlot bug discovered along the way [RUN CI]
… into John/df_collection
tests/conftest.py
schema=schema_name,
)
# Sqlite's datetime functions operate in UTC,
Unrelated to the PR.
The defog Snowflake e2e tests compare PyDough results on Snowflake against reference SQL run on SQLite. SQLite's datetime functions always operate in UTC, but Snowflake defaults to Pacific Time, so time-relative queries ("last week", "today", etc.) diverge on some runs, for example when the UTC and Pacific calendar dates differ at the moment the tests execute. This fix sets TIMEZONE = 'UTC' on the Snowflake test connection to match SQLite's behavior.
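To make the divergence concrete, here is a minimal standard-library sketch, not code from this PR. It assumes a fixed UTC-8 offset as a stand-in for Pacific Time (ignoring daylight saving) and an arbitrarily chosen instant; it only shows that "today" can resolve to different calendar dates in the two zones.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset stands in for Pacific Time (no DST handling).
PACIFIC_STANDARD = timezone(timedelta(hours=-8), "PST")

# An arbitrary instant shortly after midnight UTC.
instant = datetime(2024, 1, 15, 2, 30, tzinfo=timezone.utc)

# "Today" as a UTC-based engine would see it.
utc_today = instant.date()

# "Today" as a session defaulting to Pacific Time would see it.
pacific_today = instant.astimezone(PACIFIC_STANDARD).date()

print(utc_today, pacific_today)
```

Any query that filters relative to the current date would use different anchors in the two systems during those hours, which is why pinning the session to UTC removes the mismatch.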
…materialize_view
john-sanchez31
left a comment
LGTM! Just some comments below
| SQLite | No (uses DROP + CREATE) | Yes | No (uses DROP + CREATE) | Yes |
| Snowflake | Yes | Yes | Yes | No |
| PostgreSQL | No (uses DROP + CREATE) | Yes | Yes | No |
| MySQL | No (uses DROP + CREATE) | Yes | Yes | No |
Don't forget to add Oracle here
# double-quoted when used as column aliases (especially in CTAS, where Oracle
# creates actual column names). Sourced from Oracle 19c+ reserved word list
# and confirmed issues with TPCH column names.
_ORACLE_RESERVED_ALIASES: frozenset[str] = frozenset(
There is a list in error_utils.py called SQL_RESERVED_KEYWORDS with all the words that need to be quoted. Can't we add these there and use _is_sql_keyword?
They serve different purposes.
SQL_RESERVED_KEYWORDS rejects a name during validation, raising an error that it "must be a valid identifier and not a reserved word".
_ORACLE_RESERVED_ALIASES allows the names but adds double quotes when emitting Oracle SQL. The words there (comment, date, number, key, size) are perfectly valid identifiers in SQLite, Snowflake, Postgres, and MySQL; only Oracle rejects them as unquoted column aliases in CTAS.
If we merged them into SQL_RESERVED_KEYWORDS, those words would be rejected for all dialects at the name validation step, i.e. users couldn't name a column key or date even on other dialects. That's too restrictive.
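The distinction can be sketched roughly as follows. This is an illustration, not the PR's code: the function names validate_name and emit_alias are hypothetical, the contents of RESERVED_KEYWORDS are an invented stand-in for SQL_RESERVED_KEYWORDS, and only the five Oracle words come from the comment above.

```python
# Hypothetical stand-in for SQL_RESERVED_KEYWORDS (real contents not shown here).
RESERVED_KEYWORDS = frozenset({"select", "from", "where"})

# The five words named in the discussion; the real set may be larger.
ORACLE_RESERVED_ALIASES = frozenset({"comment", "date", "number", "key", "size"})


def validate_name(name: str) -> str:
    """Dialect-independent check: reject the name outright."""
    if not name.isidentifier() or name.lower() in RESERVED_KEYWORDS:
        raise ValueError(
            f"{name!r} must be a valid identifier and not a reserved word"
        )
    return name


def emit_alias(name: str, dialect: str) -> str:
    """Dialect-specific emission: accept the name, quote it only for Oracle."""
    validate_name(name)
    if dialect == "oracle" and name.lower() in ORACLE_RESERVED_ALIASES:
        return f'"{name}"'
    return name


print(emit_alias("key", "oracle"))  # quoted for Oracle only
print(emit_alias("key", "sqlite"))  # left as-is elsewhere
```

Moving the Oracle words into the first set would make validate_name fail for key or date on every dialect, which is the restriction the reply argues against.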
knassre-bodo
left a comment
LGTM, just a few comments to address before merging. One of them might be a longer matter to address.
Summary
This PR implements the to_table functionality for PyDough, allowing users to materialize PyDough queries as database tables or views, and then use them in subsequent queries.
Workflow
PyDough Query -> to_table() -> DDL executed -> ViewGeneratedCollection -> use in new PyDough Query
- … to_table() to materialize it
- … (CREATE TABLE AS SELECT...)
- … (ViewGeneratedCollection)
Example
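A minimal sketch of the same materialize-then-reuse workflow, written against the standard-library sqlite3 module rather than PyDough. The helper materialize, its parameters, and the orders table are hypothetical; they only mirror the as_view, replace, and temp options described in this PR and are not PyDough's API. Replacement is done with DROP + CREATE because SQLite has no CREATE OR REPLACE.

```python
import sqlite3


def materialize(conn, name, select_sql, *, as_view=False, replace=False, temp=False):
    """Hypothetical helper: run CREATE [TEMP] TABLE/VIEW ... AS <select>."""
    if not name.isidentifier():
        raise ValueError(f"invalid object name: {name!r}")
    kind = "VIEW" if as_view else "TABLE"
    if replace:
        # SQLite lacks CREATE OR REPLACE, so emulate it with DROP + CREATE.
        conn.execute(f'DROP {kind} IF EXISTS "{name}"')
    temp_kw = "TEMP " if temp else ""
    conn.execute(f'CREATE {temp_kw}{kind} "{name}" AS {select_sql}')
    return name


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 250.0), (3, 75.5)]
)

# Materialize a query as a table, then do it again to exercise replace=True.
query = "SELECT id, total FROM orders WHERE total > 50"
materialize(conn, "big_orders", query, replace=True)
materialize(conn, "big_orders", query, replace=True)

# Use the materialized table in a follow-up query.
rows = conn.execute("SELECT id FROM big_orders ORDER BY id").fetchall()
print(rows)
```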
Main Changes
Added to_table() function:
- as_view=True to create views instead of tables
- replace=True to replace existing tables/views
- temp=True to create temporary tables
ViewGeneratedCollection:
Added execute_ddl() method to DatabaseConnection:
- … (CREATE [OR REPLACE TEMP] TABLE/VIEW, DROP TABLE/VIEW IF EXISTS)
Test Infrastructure
- reset_active_session fixture that automatically resets the global active session after each test, avoiding session overlap that led to some duplicate-write errors
closes #499