To run your Spark job on a Hadoop-based YARN cluster, you need to give the Hadoop daemon JVMs enough heap space. First, edit the etc/hadoop/hadoop-env.sh file and set the following properties (uncommenting them if necessary):
export HADOOP_HEAPSIZE="500"                 # maximum heap size, in MB, for the Hadoop daemons
export HADOOP_NAMENODE_INIT_HEAPSIZE="500"   # initial heap size, in MB, for the NameNode
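To confirm that the new values are actually picked up, the following is a minimal sketch assuming a typical single-node installation where HADOOP_HOME points at your Hadoop directory; jps -v is a standard JDK tool that prints the JVM flags of each running Java process:
cd $HADOOP_HOME                          # assumption: HADOOP_HOME is set for your installation
sbin/stop-dfs.sh && sbin/start-dfs.sh    # restart HDFS so the NameNode rereads hadoop-env.sh
jps -v | grep -i namenode                # the NameNode JVM should now list -Xmx500m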
Next, edit the etc/hadoop/mapred-env.sh file and set the heap for the MapReduce JobHistoryServer:
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=250   # heap size, in MB, for the JobHistoryServer
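As a rough check, and assuming the JobHistoryServer is managed with the stock sbin/mr-jobhistory-daemon.sh script, restart it and inspect its JVM flags:
sbin/mr-jobhistory-daemon.sh stop historyserver
sbin/mr-jobhistory-daemon.sh start historyserver
jps -v | grep -i jobhistoryserver        # should now report a heap limit of roughly 250 MB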
Finally, edit etc/hadoop/yarn-env.sh so that the YARN daemons (ResourceManager and NodeManager) also run with the new heap limits:
JAVA_HEAP_MAX=-Xmx500m   # default maximum heap passed to the YARN daemon JVMs
YARN_HEAPSIZE=500        # YARN daemon heap size in MB; when set, it overrides JAVA_HEAP_MAX
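After saving all three files, restart YARN so the ResourceManager and NodeManager start with the new heap size; the sketch below assumes the standard sbin scripts of a single-node cluster:
sbin/stop-yarn.sh && sbin/start-yarn.sh
jps -v | grep -iE 'resourcemanager|nodemanager'   # both daemons should report -Xmx500m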