I have created and run a %pyspark program in Apache Zeppelin, running on a Spark cluster in yarn-client mode. The program reads a file from HDFS into a DataFrame, performs a simple groupBy, and prints the output successfully. I am using Zeppelin version 0.6.2 and Spark 2.0.0.
I can see the job running in YARN (see application_1480590511892_0007):
But when I check the Spark UI at the same time, there is nothing at all for this job:
Question 1: Shouldn't this job appear in both of these windows?
Also, the completed applications in the Spark UI image just above were Zeppelin jobs run with the %python interpreter that simply initialize a SparkSession and stop it:
1st Zeppelin block:
from pyspark.sql import SparkSession
from pyspark.sql import Row
spark = SparkSession.builder.appName("SparkSQL").getOrCreate()
2nd Zeppelin block:
Question 2: This job, in turn, has not appeared in the YARN UI. Is it the case that whenever a job appears in the Spark UI, it is running under Spark's own (standalone) resource manager?
Any insights into these questions are highly appreciated.