On Premise Hadoop Cluster

Running on Hadoop is similar to running on a local filesystem. You only need to point Starlake at the correct filesystem (SL_FS) and, on HDP, add the jackson-dataformat-yaml library to the classpath with --jars:

HDP example
spark-submit --deploy-mode cluster --master yarn \
--jars hdfs://my-namespace/libraires/jackson-dataformat-yaml-2.12.3.jar \
--conf spark.yarn.appMasterEnv.SL_ROOT=/user/userguide \
--conf spark.yarn.appMasterEnv.SL_FS=hdfs://my-namespace \
--class ai.starlake.job.Main \
hdfs://my-namespace/libraires/comet-spark2_2.12-0.2.8-assembly.jar import
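The namespace and library directory are repeated several times in the command above, so it can help to assemble the command from variables and review it before submitting. The following is a minimal sketch, not part of Starlake itself: the variable `LIB_DIR` and the function `build_submit_cmd` are hypothetical names introduced for illustration, and the values simply mirror the HDP example.

```shell
#!/bin/sh
# Hypothetical helper: values mirror the HDP example above; adjust to your cluster.
SL_FS="hdfs://my-namespace"      # filesystem Starlake should use
SL_ROOT="/user/userguide"        # Starlake root folder on that filesystem
LIB_DIR="$SL_FS/libraires"       # assumed HDFS directory holding both jars

# Print the spark-submit command on one line without running it.
build_submit_cmd() {
  echo "spark-submit --deploy-mode cluster --master yarn" \
    "--jars $LIB_DIR/jackson-dataformat-yaml-2.12.3.jar" \
    "--conf spark.yarn.appMasterEnv.SL_ROOT=$SL_ROOT" \
    "--conf spark.yarn.appMasterEnv.SL_FS=$SL_FS" \
    "--class ai.starlake.job.Main" \
    "$LIB_DIR/comet-spark2_2.12-0.2.8-assembly.jar import"
}

# Dry run: show the command for review.
build_submit_cmd
```

Once the printed command looks right, it can be submitted with `eval "$(build_submit_cmd)"`. This assumes none of the values contain spaces or shell metacharacters.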