How to integrate Spark with Huawei Cloud Cloud Container Engine (CCE)
This article describes how to create a CCE cluster, install Spark in all-in-one mode on a CCE node, and verify that Spark jobs can be submitted to the CCE cluster for execution.
1. Create a CCE cluster
2. After the CCE cluster is created, log in to a node with PuTTY
Obtain the node's public IP address from the CCE console.
3. Check whether kubectl commands can be executed
# kubectl get pods
The connection to the server localhost:8080 was refused - did you specify the right host or port?
If this fails, you need to configure the environment that kubectl requires. The detailed procedure is as follows:
In the console, click VM Clusters -> donotdel…spark -> Kubectl.
Console URL:
https://console.otc.t-systems.com/cce2.0/
Download the configuration file from Step 2, "Download the kubectl configuration file", and upload it to the /home directory of the node VM. After installing kubectl, run the commands under "Install and set up kubectl" to configure the environment that kubectl needs.
cd /home
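The "Install and set up kubectl" step in the console essentially places the downloaded file where kubectl looks for its configuration. A minimal sketch, assuming the downloaded file is named kubeconfig.json (the exact file name and commands are the ones shown in your own console):

# Assumption: the kubeconfig downloaded from the console is /home/kubeconfig.json.
mkdir -p $HOME/.kube
mv -f /home/kubeconfig.json $HOME/.kube/config
# Verify that kubectl can now reach the cluster.
kubectl cluster-info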
After the kubectl configuration is complete, run the kubectl command again:
# kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
icagent-vp6gp   1/1     Running   0          3d
Pod information can now be retrieved.
4. Install Spark
4.1 Install the Java JDK
# sudo yum install java-1.8.0-openjdk
# java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
4.2 Install Scala
wget http://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.rpm
sudo yum install scala-2.11.8.rpm
scala -version
4.3 Install Spark
tar xf spark-2.2.0-k8s-0.5.0-bin-with-hadoop-2.7.3.tgz
mkdir /usr/local/spark
cp -r spark-2.2.0-k8s-0.5.0-bin-2.7.3/* /usr/local/spark
4.4 Set Spark environment variables
Add the following lines to ~/.bash_profile, then source it:
export SPARK_EXAMPLES_JAR=/usr/local/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
PATH=$PATH:$HOME/bin:/usr/local/spark/bin
source ~/.bash_profile
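A quick check (not part of the original steps) that the variables are in effect in the current shell:

# The JAR path should be printed and both commands should resolve to /usr/local/spark/bin.
echo $SPARK_EXAMPLES_JAR
which spark-shell spark-submit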
4.5 Check that the spark commands work
# spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2019-04-26 06:06:04 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-04-26 06:06:04 WARN Utils:66 - Your hostname, spark-test-16259.novalocal resolves to a loopback address: 127.0.0.1; using 192.168.1.102 instead (on interface eth0)
2019-04-26 06:06:04 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2019-04-26 06:06:11 WARN ObjectStore:6666 - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
2019-04-26 06:06:11 WARN ObjectStore:568 - Failed to get database default, returning NoSuchObjectException
2019-04-26 06:06:12 WARN ObjectStore:568 - Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://192.168.1.102:4040
Spark context available as 'sc' (master = local[*], app id = local-1556258765615).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0-k8s-0.5.0
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_191)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
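Before involving CCE, you can also run the bundled SparkPi example locally to confirm the installation works end to end. A minimal sketch using the SPARK_EXAMPLES_JAR variable set in step 4.4 (the trailing 100 is just the number of partitions used for the estimate):

# Run SparkPi on the node itself (local mode), not on CCE.
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[2] \
  $SPARK_EXAMPLES_JAR 100
# The result is printed as a line such as "Pi is roughly 3.14...".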
4.6 Check that Spark jobs can be submitted to CCE containers
4.6.1 Create a service account named spark and grant it the edit role
kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
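Optionally, you can confirm that the account and the binding exist before submitting anything (these checks are not part of the original steps):

# Confirm the service account and the cluster role binding were created.
kubectl get serviceaccount spark -n default
kubectl get clusterrolebinding spark-role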
4.6.2 Obtain the CCE cluster address
# kubectl cluster-info
Kubernetes master is running at https://192.168.0.250:5443
4.6.3 Find the path of the example JAR inside the Spark image
Pull the Spark image with Docker and open a shell inside it to find the full path of the example JAR, for example /opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.4.0.jar in the commands below.
# docker pull kubespark/spark-driver:v2.2.0-kubernetes-0.4.0
# docker run -it kubespark/spark-driver:v2.2.0-kubernetes-0.4.0 /bin/sh
++ id -u
+ myuid=0
++ id -g
+ mygid=0
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/ash
+ '[' -z root:x:0:0:root:/root:/bin/ash ']'
+ /sbin/tini -s -- /bin/sh
sh-4.3# cd /opt/spark/examples/jars/
sh-4.3# ls
scopt_2.11-3.3.0.jar  spark-examples_2.11-2.2.0-k8s-0.4.0.jar
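If you only need the file listing, a one-shot variant may also work; whether it does depends on how the image's entrypoint forwards extra arguments, so treat this as an optional shortcut rather than part of the procedure:

# List the example JARs without opening an interactive shell (assumes the
# entrypoint passes the extra arguments through, as the trace above suggests).
docker run --rm kubespark/spark-driver:v2.2.0-kubernetes-0.4.0 ls /opt/spark/examples/jars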
4.6.4 Submit a Spark job to the CCE cluster
1. Set the --master parameter below to the Kubernetes master address obtained above, prefixed with k8s:// (here https://192.168.0.250:5443).
2. Set spark.kubernetes.authenticate.driver.serviceAccountName to the newly created service account name (spark).
3. Set the last line of the command to the example JAR path inside the Spark image, prefixed with local://.
spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://192.168.0.250:5443 \
  --kubernetes-namespace default \
  --conf spark.executor.instances=5 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.4.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.4.0 \
  --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.4.0 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.4.0.jar
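While the job runs you can watch the driver and executor pods being created and torn down. These are standard kubectl commands and not part of the original steps; the pod name used below is just the example one shown in the output further down:

# Watch pods change state as the job runs (Ctrl+C to stop watching).
kubectl get pods -w
# If the driver stays in Pending or fails to pull its image, describe it for details.
kubectl describe pod spark-pi-1556531834261-driver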
After the job completes, check the pod status. A status of Completed means the job has finished:
kubectl get pods
NAME                            READY   STATUS      RESTARTS   AGE
spark-pi-1556531834261-driver   0/1     Completed   0          8m
You can also run kubectl logs spark-pi-1556531834261-driver to view the detailed run logs.
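For the SparkPi example, the computed value appears in the driver log; one way to pull it out (a sketch, using the pod name from the output above):

# Grep the driver log for the SparkPi result line, e.g. "Pi is roughly 3.14...".
kubectl logs spark-pi-1556531834261-driver | grep -i "pi is roughly"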