diff --git a/docs/docs/spark-configuration.md b/docs/docs/spark-configuration.md
index 01bb773680fd..e8e4f7e3c8c1 100644
--- a/docs/docs/spark-configuration.md
+++ b/docs/docs/spark-configuration.md
@@ -112,6 +112,16 @@ Spark's built-in catalog supports existing v1 and v2 tables tracked in a Hive Me
 
 This configuration can use same Hive Metastore for both Iceberg and non-Iceberg tables.
 
+`SparkSessionCatalog` is useful when you want `spark_catalog` to work with both Iceberg and non-Iceberg
+tables in the same metastore.
+
+!!! note
+    Spark before 4.2.0 does not support `V2Function` in the session catalog. See
+    [SPARK-54760](https://issues.apache.org/jira/browse/SPARK-54760) ([apache/spark#53531](https://github.com/apache/spark/pull/53531)) for details. As a result,
+    catalog-scoped SQL functions such as `system.bucket`, `system.days`, and `system.iceberg_version`
+    are not available through `spark_catalog`. To work around this limitation, configure a separate
+    Iceberg catalog with `org.apache.iceberg.spark.SparkCatalog` and call them through that catalog.
+
 ### Using catalog specific Hadoop configuration values
 
 Similar to configuring Hadoop properties by using `spark.hadoop.*`, it's possible to set per-catalog Hadoop configuration values when using Spark by adding the property for the catalog with the prefix `spark.sql.catalog.(catalog-name).hadoop.*`. These properties will take precedence over values configured globally using `spark.hadoop.*` and will only affect Iceberg tables.
diff --git a/docs/docs/spark-queries.md b/docs/docs/spark-queries.md
index 54c3b1572dac..91f1759f568c 100644
--- a/docs/docs/spark-queries.md
+++ b/docs/docs/spark-queries.md
@@ -51,6 +51,14 @@ writing filters that match Iceberg partition transforms. These functions are ava
 [Iceberg catalog](spark-configuration.md#catalog-configuration); they are not registered in Spark's
 built-in catalog.
 
+!!! note
+    Spark before 4.2.0 does not support `V2Function` in the session catalog.
+    Queries such as `SELECT spark_catalog.system.bucket(16, id)` fail even when
+    `spark_catalog` is configured with `org.apache.iceberg.spark.SparkSessionCatalog`.
+    See [SPARK-54760](https://issues.apache.org/jira/browse/SPARK-54760) ([apache/spark#53531](https://github.com/apache/spark/pull/53531)) for details.
+    To use Iceberg SQL functions, call them through a catalog configured with
+    `org.apache.iceberg.spark.SparkCatalog`.
+
 Use the `system` namespace when calling these functions:
 
 ```sql
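
The workaround described in both notes can be sketched as follows. This is an illustrative fragment, not part of the diff: the catalog name `iceberg`, the `type = hive` setting, and the table `db.events` are assumptions chosen for the example.

```sql
-- Register a separate Iceberg catalog, e.g. in spark-defaults.conf
-- (catalog name `iceberg` and `type = hive` are assumptions for this sketch):
--   spark.sql.catalog.iceberg      = org.apache.iceberg.spark.SparkCatalog
--   spark.sql.catalog.iceberg.type = hive

-- Fails on Spark before 4.2.0: V2Function is not resolved in the session catalog
SELECT spark_catalog.system.bucket(16, id) FROM db.events;

-- Works: the function is resolved through the SparkCatalog-backed catalog
SELECT iceberg.system.bucket(16, id) FROM db.events;
```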