Configure JanusGraph 0.6.0 for Spark

OLAP traversals are supported through TinkerPop's SparkGraphComputer engine.

Spark Local

:plugin use tinkerpop.hadoop
:plugin use tinkerpop.spark
graph = GraphFactory.open('conf/hadoop-graph/read-cql.properties')
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().count()
Running a Spark traversal in Spark local mode
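The traversal above opens conf/hadoop-graph/read-cql.properties, which ships with the JanusGraph distribution. The exact contents depend on the Cassandra setup; a trimmed sketch, assuming Cassandra listens on 127.0.0.1:9042 with the default janusgraph keyspace, looks roughly like this (the bundled file contains a few more Cassandra input settings):

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.janusgraph.hadoop.formats.cql.CqlInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output

# where JanusGraph reads the graph from
janusgraphmr.ioformat.conf.storage.backend=cql
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
janusgraphmr.ioformat.conf.storage.port=9042
janusgraphmr.ioformat.conf.storage.cql.keyspace=janusgraph

# Spark runs inside the Gremlin Console JVM, using local threads
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator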

Spark Standalone Cluster

➜ sbin ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /Users/liboxuan/Downloads/spark-3.0.0-bin-hadoop2.7/logs/spark-liboxuan-org.apache.spark.deploy.master.Master-1-liboxuans-MacBook-Pro.local.out
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /Users/liboxuan/Downloads/spark-3.0.0-bin-hadoop2.7/logs/spark-liboxuan-org.apache.spark.deploy.worker.Worker-1-liboxuans-MacBook-Pro.local.out
Spark Standalone Cluster
spark.master=spark://liboxuans-MacBook-Pro.local:7077
spark.executor.memory=1g
spark.executor.extraClassPath=/Users/liboxuan/Downloads/janusgraph-0.6.0/lib/*
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator
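The read-cql-standalone-cluster.properties file opened in the next snippet may not ship with the distribution. One way to produce it, assuming the stock read-cql.properties as a starting point, is to copy that file and swap in the standalone-cluster values above:

cd /Users/liboxuan/Downloads/janusgraph-0.6.0/conf/hadoop-graph
cp read-cql.properties read-cql-standalone-cluster.properties
# edit the copy: replace the existing spark.* entries with the lines shown above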
:plugin use tinkerpop.hadoop
:plugin use tinkerpop.spark
graph = GraphFactory.open('conf/hadoop-graph/read-cql-standalone-cluster.properties')
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().count()
Running a Spark traversal in Spark standalone cluster mode
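A quick sanity check, not strictly required: the Spark master web UI (port 8080 by default) should list the registered worker and, once the traversal finishes, the completed application:

open http://localhost:8080   # macOS; open the URL in a browser on other platforms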

Spark on YARN Cluster

export HADOOP_HOME="/Users/liboxuan/Downloads/hadoop-2.7.0"
etc/hadoop/core-site.xml:
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://localhost:9000</value></property>
</configuration>

etc/hadoop/hdfs-site.xml:
<configuration>
  <property><name>dfs.replication</name><value>1</value></property>
</configuration>

etc/hadoop/mapred-site.xml:
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.application.classpath</name><value>$HADOOP_HOME/share/hadoop/mapreduce/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*</value></property>
</configuration>

etc/hadoop/yarn-site.xml:
<configuration>
  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
</configuration>
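If HDFS and YARN are not already running, a single-node Hadoop can be brought up with the standard scripts; this is a sketch assuming a fresh setup with passwordless SSH to localhost, and formatting the NameNode is only needed (and only safe) on first use:

cd /Users/liboxuan/Downloads/hadoop-2.7.0
bin/hdfs namenode -format   # first run only; erases any existing HDFS data
sbin/start-dfs.sh           # starts NameNode, DataNode, SecondaryNameNode
sbin/start-yarn.sh          # starts ResourceManager and NodeManager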
cd /Users/liboxuan/Downloads/janusgraph-0.6.0
mkdir tmp
cd /Users/liboxuan/Downloads/spark-3.0.0-bin-hadoop2.7/jars
cp * /Users/liboxuan/Downloads/janusgraph-0.6.0/tmp
cd /Users/liboxuan/Downloads/janusgraph-0.6.0/tmp
# replace Spark's bundled guava and commons-text with the versions shipped in JanusGraph's lib/
rm guava-14.0.1.jar commons-text-1.6.jar
cp ../lib/guava-29.0-jre.jar .
cp ../lib/commons-text-1.9.jar .
zip spark-gremlin.zip *.jar
spark.master=yarn
spark.submit.deployMode=client
spark.yarn.archive=/Users/liboxuan/Downloads/janusgraph-0.6.0/tmp/spark-gremlin.zip
spark.yarn.appMasterEnv.CLASSPATH=./__spark_libs__/*:/Users/liboxuan/Downloads/janusgraph-0.6.0/lib/*:/Users/liboxuan/Downloads/hadoop-2.7.0/etc/hadoop
spark.executor.extraClassPath=./__spark_libs__/*:/Users/liboxuan/Downloads/janusgraph-0.6.0/lib/*:/Users/liboxuan/Downloads/hadoop-2.7.0/etc/hadoop
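As in the standalone case, read-cql-yarn.properties is assumed here to be derived from the stock read-cql.properties, with the YARN-specific settings above appended and the original spark.master entry removed:

cd /Users/liboxuan/Downloads/janusgraph-0.6.0/conf/hadoop-graph
cp read-cql.properties read-cql-yarn.properties
# append the spark.master=yarn block shown above and delete the old spark.master line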
export HADOOP_CONF_DIR="${HADOOP_HOME}/etc/hadoop"
export CLASSPATH="${HADOOP_CONF_DIR}"
:plugin use tinkerpop.hadoop
:plugin use tinkerpop.spark
graph = GraphFactory.open('conf/hadoop-graph/read-cql-yarn.properties')
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().count()
Running a Spark traversal in Spark on YARN mode
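Since yarn.log-aggregation-enable was turned on earlier, executor logs from a finished job can be pulled back through the YARN CLI; the application ID below is a placeholder:

yarn application -list -appStates FINISHED               # look up the application ID
yarn logs -applicationId application_1234567890123_0001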
