spark-submit: the Spark application deployment tool
Packaging the Spark application
Build the Spark application into an assembly (uber) jar
Build tools (example build commands are sketched after this list):
1. Maven: maven-shade-plugin
2. sbt
Package only the dependencies that are actually needed
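As a minimal sketch, assuming the project already declares maven-shade-plugin (for Maven) or the sbt-assembly plugin (for sbt; the plugin choice is an assumption, the original only names sbt), the assembly jar can be produced from the command line:

# Maven: with the shade plugin bound to the package phase, the shaded jar ends up under target/
mvn -DskipTests clean package

# sbt: with the sbt-assembly plugin enabled, the assembly jar ends up under target/scala-<version>/
sbt assembly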
Launching a Spark application with spark-submit:
./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]
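A concrete invocation, shown here as an illustrative sketch (the example class, jar path, and resource sizes are not from the original):

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 2g \
  --num-executors 4 \
  /path/to/spark-examples.jar \
  1000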
spark-submit usage:
Usage: spark-submit [options] <app jar | python file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
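The --kill and --status forms apply to drivers submitted in cluster mode; for example, against a standalone master's REST endpoint (the submission ID and host below are illustrative):

./bin/spark-submit --status driver-20240101123456-0001 --master spark://master:6066
./bin/spark-submit --kill driver-20240101123456-0001 --master spark://master:6066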
spark-submit options: run mode
Set the Spark run mode (master and deploy mode) according to your needs
Typical master URLs (see the sketch below):
Note: --deploy-mode is not exclusive to Spark on YARN
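The master URL table itself did not survive extraction; as a sketch, the same application can be pointed at different cluster managers just by changing --master (host names and ports are illustrative):

./bin/spark-submit --master local[4] ...                    # local mode with 4 worker threads
./bin/spark-submit --master spark://master:7077 ...         # standalone cluster
./bin/spark-submit --master yarn --deploy-mode cluster ...  # YARN, driver runs inside the cluster
./bin/spark-submit --master mesos://master:5050 ...         # Mesos cluster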
spark-submit options: general:
spark-submit options: classpath, driver, and executor related:
spark-submit options: resources and configuration (a sketch covering these option groups follows below):
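The option tables for these categories did not survive extraction; as a sketch, commonly used flags from these groups look like this (all values and paths are illustrative):

./bin/spark-submit \
  --driver-memory 4g \
  --driver-class-path /opt/libs/extra-driver-dep.jar \
  --executor-memory 8g \
  --executor-cores 4 \
  --conf spark.sql.shuffle.partitions=400 \
  --properties-file conf/my-app.conf \
  ...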
spark-submit options: YARN-only
The following options are effective only in Spark on YARN mode (see the sketch below):
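A sketch of a YARN submission using some of the YARN-only options (queue name, counts, and paths are illustrative):

./bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --queue production \
  --num-executors 20 \
  --archives hdfs:///deps/python-env.tar.gz#env \
  --keytab /etc/security/keytabs/app.keytab \
  --principal app@EXAMPLE.COM \
  ...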
spark-submit options: other:
Advanced Dependency Management
Ways of distributing dependency packages:
1. file: an absolute path, file:/xxxx
2. hdfs, http, https, ftp
3. local
--repositories, --packages
--py-files (Python apps only; a combined example is sketched below)
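As a sketch of how these dependency options combine in a single submission (the Maven coordinate, repository URL, and file names are illustrative):

./bin/spark-submit \
  --master yarn \
  --jars hdfs:///libs/shared-utils.jar,file:/opt/libs/local-dep.jar \
  --packages org.apache.kafka:kafka-clients:2.8.0 \
  --repositories https://repo.example.com/maven2 \
  --py-files deps.zip,helpers.py \
  my_app.py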
Clean up
Jars and files are copied to the working directory of each executor and need to be cleaned up periodically:
Spark on YARN cleans up automatically (spark.yarn.preserve.staging.files set to false, which is the default)
Spark standalone: spark.worker.cleanup.appDataTtl (a configuration sketch follows below)
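A minimal sketch of the standalone cleanup settings, set via SPARK_WORKER_OPTS in conf/spark-env.sh on each worker; the interval and TTL shown match the documented defaults, and cleanup is off by default, so spark.worker.cleanup.enabled must be set to true for the TTL to take effect:

# conf/spark-env.sh on each standalone worker
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
  -Dspark.worker.cleanup.interval=1800 \
  -Dspark.worker.cleanup.appDataTtl=604800"   # drop application work dirs older than 7 days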