From de01572c0dc823fbe0448702855d42b035c24d18 Mon Sep 17 00:00:00 2001
From: rwitzel
Date: Tue, 16 Nov 2021 14:51:45 +0100
Subject: [PATCH] addresses #1: Documented what to do when the Spark session
 cannot be created as described in the user's specific Spark environment.

The comments in #1 indicate that the current documentation does not
emphasise enough the importance of having the JARs installed.
---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/README.md b/README.md
index 8e605fc..f23ebfe 100644
--- a/README.md
+++ b/README.md
@@ -45,7 +45,13 @@ spark = (SparkSession
     .config("spark.jars.packages", pydeequ.deequ_maven_coord)
     .config("spark.jars.excludes", pydeequ.f2j_maven_coord)
     .getOrCreate())
+```
+
+In case you can't programmatically configure the Spark session
+this way (e.g. in Databricks), check your vendor's documentation
+about installing JARs from Maven Central.
+```
 
 df = spark.sparkContext.parallelize([
     Row(a="foo", b=1, c=5),
     Row(a="bar", b=2, c=6),
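
For context, a minimal, self-contained sketch of the session setup that the patched README snippet belongs to might look like the following. The imports, the `appName` value, and the closing `.toDF()` call are illustrative assumptions for this sketch; `pydeequ.deequ_maven_coord` and `pydeequ.f2j_maven_coord` are the attributes already referenced in the hunk:

```python
from pyspark.sql import SparkSession, Row

import pydeequ

# Create a Spark session that pulls the Deequ JAR from Maven Central.
# pydeequ exposes the Maven coordinate of the matching Deequ release
# (deequ_maven_coord) and an exclusion pattern (f2j_maven_coord) that
# keeps a conflicting f2j artifact out of the classpath.
spark = (SparkSession
    .builder
    .appName("pydeequ-sketch")  # illustrative name, not part of the README snippet
    .config("spark.jars.packages", pydeequ.deequ_maven_coord)
    .config("spark.jars.excludes", pydeequ.f2j_maven_coord)
    .getOrCreate())

# Small example DataFrame mirroring the rows shown in the hunk.
df = spark.sparkContext.parallelize([
    Row(a="foo", b=1, c=5),
    Row(a="bar", b=2, c=6)]).toDF()
```

If the session is created without `spark.jars.packages` set (for example on a managed platform where session options are fixed), the Deequ classes are simply not on the classpath at runtime; that is the situation the added README paragraph addresses by pointing readers to their vendor's documentation on installing JARs from Maven Central.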