docs/guides/configuration.md
+104 -2 (104 additions & 2 deletions)
@@ -320,10 +320,14 @@ The cache directory is automatically created if it doesn't exist. You can clear
320
320
321
321
SQLMesh creates schemas, physical tables, and views in the data warehouse/engine. Learn more about why and how SQLMesh creates schemas in the ["Why does SQLMesh create schemas?" FAQ](../faq/faq.md#schema-question).
322
322
323
-
The default SQLMesh behavior described in the FAQ is appropriate for most deployments, but you can override where SQLMesh creates physical tables and views with the `physical_schema_mapping`, `environment_suffix_target`, and `environment_catalog_mapping` configuration options. These options are in the [environments](../reference/configuration.md#environments) section of the configuration reference page.
323
+
The default SQLMesh behavior described in the FAQ is appropriate for most deployments, but you can override *where* SQLMesh creates physical tables and views with the `physical_schema_mapping`, `environment_suffix_target`, and `environment_catalog_mapping` configuration options.
324
+
325
+
You can also override *what* the physical tables are called by using the `physical_table_naming_convention` option.
326
+
327
+
These options are in the [environments](../reference/configuration.md#environments) section of the configuration reference page.
324
328
325
329
#### Physical table schemas
326
-
By default, SQLMesh creates physical tables for a model with a naming convention of `sqlmesh__[model schema]`.
330
+
By default, SQLMesh creates physical schemas for a model with a naming convention of `sqlmesh__[model schema]`.
327
331
328
332
This can be overridden on a per-schema basis using the `physical_schema_mapping` option, which removes the `sqlmesh__` prefix and uses the [regex pattern](https://docs.python.org/3/library/re.html#regular-expression-syntax) you provide to map the schemas defined in your model to their corresponding physical schemas.
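As a rough illustration of the mapping described above, the sketch below resolves a model schema to a physical schema. It assumes that patterns are tried in order with `re.match` and that the first match wins; `resolve_physical_schema` is a hypothetical helper written for this example, not a SQLMesh function.

```python
import re


def resolve_physical_schema(model_schema: str, mapping: dict[str, str]) -> str:
    """Hypothetical helper: return the physical schema for a model schema."""
    for pattern, physical_schema in mapping.items():
        # Assumption: the first matching pattern wins.
        if re.match(pattern, model_schema):
            return physical_schema
    # No override matched, so fall back to the documented default.
    return f"sqlmesh__{model_schema}"


mapping = {"^staging.*": "my_staging"}
print(resolve_physical_schema("staging_orders", mapping))  # my_staging
print(resolve_physical_schema("finance_mart", mapping))    # sqlmesh__finance_mart
```

Schemas that match no pattern keep the `sqlmesh__` prefix, while matched schemas use the mapped name as-is.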
329
333
@@ -436,6 +440,104 @@ Given the example of a model called `my_schema.users` with a default catalog of
436
440
- Using `environment_suffix_target: catalog` only works on engines that support querying across different catalogs. If your engine does not support cross-catalog queries then you will need to use `environment_suffix_target: schema` or `environment_suffix_target: table` instead.
437
441
- Automatic catalog creation is not supported on all engines even if they support cross-catalog queries. For engines where it is not supported, the catalogs must be managed externally from SQLMesh and exist prior to invoking SQLMesh.
438
442
443
+
#### Physical table naming convention
444
+
445
+
Out of the box, SQLMesh has the following defaults set:
- no `physical_schema_mapping` overrides, so a `sqlmesh__<model schema>` physical schema will be created for each model schema
450
+
451
+
This means that given a catalog of `warehouse` and a model named `finance_mart.transaction_events_over_threshold`, SQLMesh will create physical tables using the following convention:
The *model* schema is deliberately redundant here: it is repeated at the physical layer in both the physical schema name and the physical table name.
460
+
461
+
This default exists to make the physical table names portable between different configurations. If you define a `physical_schema_mapping` that maps all models to the same physical schema, no naming conflicts arise because the model schema is also included in the table name.
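The portability argument can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SQLMesh's implementation: `physical_table_name` is a hypothetical helper, the `<schema>__<table>` format is assumed, and any version suffix SQLMesh appends is omitted.

```python
def physical_table_name(model_name: str, include_schema: bool = True) -> str:
    """Hypothetical helper: build a physical table name from 'schema.table'."""
    schema, table = model_name.split(".")
    # Assumption: the real name format and any version suffix may differ.
    return f"{schema}__{table}" if include_schema else table


models = ["sales.users", "finance.users"]
physical_schema = "sqlmesh"  # every model schema mapped to one physical schema

with_schema = {f"{physical_schema}.{physical_table_name(m)}" for m in models}
without_schema = {
    f"{physical_schema}.{physical_table_name(m, include_schema=False)}" for m in models
}

print(len(with_schema))     # 2 distinct names, no conflict
print(len(without_schema))  # 1 name, both models collide
```

Keeping the model schema in the table name is what lets two models with the same table name share one physical schema.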
462
+
463
+
##### Table only
464
+
465
+
Some engines have object name length limitations which cause them to [silently truncate](https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS) table and view names that exceed this limit. This behaviour breaks SQLMesh, so we raise a runtime error if we detect the engine would silently truncate the name of the table we are trying to create.
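To make the limit concrete, here is a minimal sketch of a length check using PostgreSQL's default identifier limit of 63 bytes. `would_truncate` is a hypothetical helper for illustration; it is not the check SQLMesh performs internally, and other engines have different limits.

```python
POSTGRES_MAX_IDENTIFIER_BYTES = 63  # PostgreSQL default (NAMEDATALEN - 1)


def would_truncate(identifier: str, max_bytes: int = POSTGRES_MAX_IDENTIFIER_BYTES) -> bool:
    """Hypothetical helper: True if the engine would silently truncate the name."""
    # PostgreSQL counts the limit in bytes, so encode before measuring.
    return len(identifier.encode("utf-8")) > max_bytes


print(would_truncate("a" * 63))  # False
print(would_truncate("a" * 64))  # True
```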
466
+
467
+
Having redundancy in the physical table names does reduce the number of characters that can be utilised in model names. To increase the number of characters available to model names, you can use `physical_table_naming_convention` like so:
468
+
469
+
=== "YAML"
470
+
471
+
```yaml linenums="1"
472
+
physical_table_naming_convention: table_only
473
+
```
474
+
475
+
=== "Python"
476
+
477
+
```python linenums="1"
478
+
from sqlmesh.core.config import Config, ModelDefaultsConfig, TableNamingConvention
Notice that the model schema name is no longer part of the physical table name. This allows for slightly longer model names on engines with low identifier length limits, which may be useful for your project.
494
+
495
+
In this configuration, it is your responsibility to ensure that any schema overrides in `physical_schema_mapping` result in each model schema getting mapped to a unique physical schema.
496
+
497
+
For example, the following configuration will cause **data corruption**:
498
+
499
+
```yaml
500
+
physical_table_naming_convention: table_only
501
+
physical_schema_mapping:
502
+
'.*': sqlmesh
503
+
```
504
+
505
+
This is because every model schema is mapped to the same physical schema but the model schema name is omitted from the physical table name.
506
+
507
+
##### MD5 hash
508
+
509
+
If you *still* need more characters, you can set `physical_table_naming_convention: hash_md5` like so:
510
+
511
+
=== "YAML"
512
+
513
+
```yaml linenums="1"
514
+
physical_table_naming_convention: hash_md5
515
+
```
516
+
517
+
=== "Python"
518
+
519
+
```python linenums="1"
520
+
from sqlmesh.core.config import Config, ModelDefaultsConfig, TableNamingConvention
The downside is that it is now much more difficult to determine which table corresponds to which model just by looking at the database with a SQL client. However, the table names have a predictable length, so there are no more surprises with identifiers exceeding the maximum length at the physical layer.
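The predictable length follows from how MD5 works: the hex digest is always 32 characters, regardless of input size. The sketch below shows this with Python's `hashlib`; `md5_table_name` is a hypothetical helper, and the exact hash input and any prefix SQLMesh adds are assumptions not covered here.

```python
import hashlib


def md5_table_name(model_name: str) -> str:
    """Hypothetical helper: derive a fixed-length name from a model name."""
    # Assumption: SQLMesh's real hash input and any prefix may differ.
    return hashlib.md5(model_name.encode("utf-8")).hexdigest()


short = md5_table_name("a.b")
long = md5_table_name("finance_mart.transaction_events_over_threshold")
print(len(short), len(long))  # 32 32
```

The trade-off is visible in the output: the hashed names carry no readable trace of the model they belong to.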
540
+
439
541
#### Environment view catalogs
440
542
441
543
By default, SQLMesh creates an environment view in the same [catalog](../concepts/glossary.md#catalog) as the physical table the view points to. The physical table's catalog is determined by either the catalog specified in the model name or the default catalog defined in the connection.
|`ignore_patterns`| Files that match glob patterns specified in this list are ignored when scanning the project folder (Default: `[]`) | list[string]| N |
22
-
|`project`| The project name of this config. Used for [multi-repo setups](../guides/multi_repo.md). | string | N |
|`ignore_patterns`| Files that match glob patterns specified in this list are ignored when scanning the project folder (Default: `[]`) | list[string]| N |
22
+
|`project`| The project name of this config. Used for [multi-repo setups](../guides/multi_repo.md). | string | N |
23
23
|`cache_dir`| The directory to store the SQLMesh cache. Can be an absolute path or relative to the project directory. (Default: `.cache`) | string | N |
24
+
|`log_limit`| The default number of historical log files to keep (Default: `20`) | int | N |
24
25
25
-
### Environments
26
+
### Database (Physical Layer)
26
27
27
-
Configuration options for SQLMesh environment creation and promotion.
28
+
Configuration options for how SQLMesh manages database objects in the [physical layer](../concepts/glossary.md#physical-layer).
|`snapshot_ttl`| The period of time that a model snapshot not a part of any environment should exist before being deleted. This is defined as a string with the default `in 1 week`. Other [relative dates](https://dateparser.readthedocs.io/en/latest/) can be used, such as `in 30 days`. (Default: `in 1 week`) | string | N |
33
+
|`physical_schema_override`| (Deprecated) Use `physical_schema_mapping` instead. A mapping from model schema names to names of schemas in which physical tables for the corresponding models will be placed. | dict[string, string]| N |
34
+
|`physical_schema_mapping`| A mapping from regular expressions to names of schemas in which physical tables for the corresponding models [will be placed](../guides/configuration.md#physical-table-schemas). (Default physical schema name: `sqlmesh__[model schema]`) | dict[string, string]| N |
35
+
|`physical_table_naming_convention`| Sets which parts of the model name are included in the physical table names. Options are `schema_and_table`, `table_only` or `hash_md5` - [additional details](../guides/configuration.md#physical-table-naming-convention). (Default: `schema_and_table`) | string | N |
36
+
37
+
### Environments (Virtual Layer)
38
+
39
+
Configuration options for how SQLMesh manages environment creation and promotion in the [virtual layer](../concepts/glossary.md#virtual-layer).
|`environment_ttl`| The period of time that a development environment should exist before being deleted. This is defined as a string with the default `in 1 week`. Other [relative dates](https://dateparser.readthedocs.io/en/latest/) can be used, such as `in 30 days`. (Default: `in 1 week`) | string | N |
33
44
|`pinned_environments`| The list of development environments that are exempt from deletion due to expiration | list[string]| N |
34
-
|`time_column_format`| The default format to use for all model time columns. This time format uses [python format codes](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) (Default: `%Y-%m-%d`) | string | N |
35
45
|`default_target_environment`| The name of the environment that will be the default target for the `sqlmesh plan` and `sqlmesh run` commands. (Default: `prod`) | string | N |
36
-
|`physical_schema_override`| (Deprecated) Use `physical_schema_mapping` instead. A mapping from model schema names to names of schemas in which physical tables for the corresponding models will be placed. | dict[string, string]| N |
37
-
|`physical_schema_mapping`| A mapping from regular expressions to names of schemas in which physical tables for the corresponding models [will be placed](../guides/configuration.md#physical-table-schemas). (Default physical schema name: `sqlmesh__[model schema]`) | dict[string, string]| N |
38
-
|`environment_suffix_target`| Whether SQLMesh views should append their environment name to the `schema` or `table` - [additional details](../guides/configuration.md#view-schema-override). (Default: `schema`) | string | N |
39
-
|`gateway_managed_virtual_layer`| Whether SQLMesh views of the virtual layer will be created by the default gateway or model specified gateways - [additional details](../guides/multi_engine.md#gateway-managed-virtual-layer). (Default: False) | boolean | N |
40
-
|`infer_python_dependencies`| Whether SQLMesh will statically analyze Python code to automatically infer Python package requirements. (Default: True) | boolean | N |
46
+
|`environment_suffix_target`| Whether SQLMesh views should append their environment name to the `schema`, `table` or `catalog` - [additional details](../guides/configuration.md#view-schema-override). (Default: `schema`) | string | N |
47
+
|`gateway_managed_virtual_layer`| Whether SQLMesh views of the virtual layer will be created by the default gateway or model specified gateways - [additional details](../guides/multi_engine.md#gateway-managed-virtual-layer). (Default: False) | boolean | N |
41
48
|`environment_catalog_mapping`| A mapping from regular expressions to catalog names. The catalog name is used to determine the target catalog for a given environment. | dict[string, string]| N |
42
-
|`log_limit`| The default number of logs to keep (Default: `20`) | int | N |
|`time_column_format`| The default format to use for all model time columns. This time format uses [python format codes](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) (Default: `%Y-%m-%d`) | string | N |
55
+
|`infer_python_dependencies`| Whether SQLMesh will statically analyze Python code to automatically infer Python package requirements. (Default: True) | boolean | N |
56
+
|`model_defaults`| Default [properties](./model_configuration.md#model-defaults) to set on each model. At a minimum, `dialect` must be set. | dict[string, any]| Y |
45
57
46
58
The `model_defaults` key is **required** and must contain a value for the `dialect` key.