The project uses Java 11, with Zulu OpenJDK as the preferred JDK distribution. To switch to the project's JDK whenever working with this repository, install SDKMAN and run the following command in the project root directory:

```sh
sdk env
```

The project uses sbt v1.7.+ for building and running tests; sbt can be downloaded by following the instructions here.
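For reference, `sdk env` reads the `.sdkmanrc` file in the project root and switches the shell to the JDK pinned there. Its contents take the following form (the exact version identifier shown is illustrative; `sdk list java` shows the identifiers available locally):

```
# .sdkmanrc — consumed by `sdk env` when run in the project root
java=11.0.21-zulu
```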
The plugin is built and tested against Spark v3.3.0 and Hadoop v3.3.+. Instructions for setting up a Hadoop and Spark installation on a machine with VEs attached can be found on the project website, [here](https://sparkcyclone.io/docs/spark-sql/getting-started/hadoop-and-spark-installation-guide) and here. In addition, instructions for configuring a local (custom) installation of Spark against an established Hadoop cluster can be found here.
For Windows, make sure Hadoop is configured as per Hadoop on Windows, and set the appropriate `HADOOP_HOME` (use winutils as needed). The files should look like this:

```
C:/hadoop-3.2.1/bin/hadoop.dll
...
```

Also add the `bin` directory to the `PATH`.
For cluster-mode/detection tests that run in the `VectorEngine` scope, make sure that `$SPARK_HOME/work` is writable:

```sh
$ mkdir -p /opt/spark/work && chmod -R 777 /opt/spark/work
```

Instructions can be found here to lower the latency of SSH connections, which is likely needed for software development involving VEs on a remote server (in general, a roughly 40% decrease in latency can be observed).
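The linked latency instructions are not reproduced here, but one common way to cut the per-connection overhead of repeated SSH, scp, and rsync sessions to a remote VE server is OpenSSH connection multiplexing. A sketch of the relevant `~/.ssh/config` entry (the host alias is illustrative, and `~/.ssh/sockets` must exist, e.g. via `mkdir -p ~/.ssh/sockets`):

```
# ~/.ssh/config — reuse one TCP/auth handshake across sessions to the same host
Host ve-server
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 10m
```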
The sbt console should be launched with a large amount of heap memory available:

```sh
SBT_OPTS="-Xmx16g" sbt
```

To build the plugin, simply run the following in the sbt console:

```
show assembly
```

The location of the assembled fat JAR will be displayed.
A shortcut is provided in the sbt console to copy the built plugin JAR to a pre-determined directory in the filesystem:

```
// Copy the JAR to /opt/cyclone/${USER}/
deploy
```

See Testing and CI for more information on how to run Spark Cyclone tests at different levels.
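As a usage sketch, the deployed JAR can then be attached to a Spark application via `--jars`; the JAR filename and application JAR below are illustrative placeholders, not names taken from this document:

```sh
# Attach the assembled plugin JAR to a Spark job (filenames illustrative)
$SPARK_HOME/bin/spark-submit \
  --jars /opt/cyclone/${USER}/spark-cyclone-sql-plugin.jar \
  your-application.jar
```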