90. Spark-SparkSQL (SQL queries)
Reading a text file with textFile
Input data
Code
package org.example.SQL

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object Test4 { // SQL queries

  def main(args: Array[String]): Unit = {

    Logger.getLogger("org").setLevel(Level.ERROR) // only log errors, hiding Spark's INFO/WARN output
    val spark: SparkSession = SparkSession.builder().appName("test4").master("local").getOrCreate()
    val sc: SparkContext = spark.sparkContext
    val lines = sc.textFile("data/input/person.txt")

    // Each line is split on a single space into three fields: id, name, age
    val rdd: RDD[person] = lines.map { line =>
      val arr: Array[String] = line.split(" ")
      person(arr(0).toInt, arr(1), arr(2).toInt)
    }

    import spark.implicits._
    val personDF: DataFrame = rdd.toDF() // convert the RDD to a DataFrame

    personDF.printSchema()
    personDF.show()

    //--------------------SQL----------------------
    // Register the DataFrame as a temporary view named "student"
    personDF.createOrReplaceTempView("student")

    // Select the name column
    spark.sql("select name from student").show()

    // Select the name and age columns
    spark.sql("select name,age from student").show()

    // Select name and age, plus age + 1 as a computed column
    spark.sql("select name,age,age+1 from student").show()

    // Keep only rows with age < 25 (rows with age >= 25 are filtered out)
    spark.sql("select name,age from student where age<25").show()

    // Count the people older than 35
    spark.sql("select count(*) from student where age>35").show()

    // Group by age and count the people of each age
    spark.sql("select age,count(*) from student group by age").show()

    // Select the rows whose name equals 'zhangsan'
    spark.sql("select name from student where name = 'zhangsan' ").show()
  }

  // Row type for person.txt; it is declared outside main so that toDF() can derive the schema
  case class person(id: Int, name: String, age: Int)
}
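The map step above throws on any blank or malformed line, because arr(2) or toInt fails. Below is a minimal sketch of a more defensive parser, assuming the same three-field, whitespace-separated layout. PersonRow, ParseSketch and parseLine are hypothetical names used only for illustration, and the sample lines (including the ids) are made up.

```scala
import scala.util.Try

// Hypothetical stand-in for the article's `person` case class.
final case class PersonRow(id: Int, name: String, age: Int)

object ParseSketch {
  // Returns None instead of throwing when a line is blank, has the wrong
  // number of fields, or has a non-numeric id or age.
  def parseLine(line: String): Option[PersonRow] =
    line.trim.split("\\s+") match {
      case Array(id, name, age) =>
        Try(PersonRow(id.toInt, name, age.toInt)).toOption
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample lines: two valid, one blank, one with a bad age.
    val sample = Seq("1 zhangsan 20", "", "2 lisi abc", "3 wangwu 25")
    // flatMap drops the lines that failed to parse.
    sample.flatMap(parseLine).foreach(println)
  }
}
```

In the Spark job the same function could be applied as lines.flatMap(line => parseLine(line)), so that bad lines are skipped rather than failing the whole job. Whether silently dropping bad lines is acceptable depends on the data; logging or counting them may be preferable.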
Schema (constraints)
Data table
Query results
+--------+
|    name|
+--------+
|zhangsan|
|    lisi|
|  wangwu|
| zhaoliu|
|  tianqi|
|    kobe|
+--------+

+--------+---+
|    name|age|
+--------+---+
|zhangsan| 20|
|    lisi| 29|
|  wangwu| 25|
| zhaoliu| 30|
|  tianqi| 35|
|    kobe| 40|
+--------+---+

+--------+---+---------+
|    name|age|(age + 1)|
+--------+---+---------+
|zhangsan| 20|       21|
|    lisi| 29|       30|
|  wangwu| 25|       26|
| zhaoliu| 30|       31|
|  tianqi| 35|       36|
|    kobe| 40|       41|
+--------+---+---------+

+--------+---+
|    name|age|
+--------+---+
|zhangsan| 20|
+--------+---+

+--------+
|count(1)|
+--------+
|       1|
+--------+

+---+--------+
|age|count(1)|
+---+--------+
| 20|       1|
| 40|       1|
| 35|       1|
| 25|       1|
| 29|       1|
| 30|       1|
+---+--------+

+--------+
|    name|
+--------+
|zhangsan|
+--------+
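The same queries can also be written with the DataFrame API instead of SQL strings. Below is a minimal sketch, assuming an in-memory copy of the data: the names and ages are taken from the results shown above, while the ids 1 to 6 are assumed because the input file itself is not shown. DslSketch, people and olderThan are hypothetical names used only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object DslSketch {
  // In-memory stand-in for data/input/person.txt (the ids are assumed).
  def people(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      (1, "zhangsan", 20), (2, "lisi", 29), (3, "wangwu", 25),
      (4, "zhaoliu", 30), (5, "tianqi", 35), (6, "kobe", 40)
    ).toDF("id", "name", "age")
  }

  // Counterpart of: select count(*) from student where age > minAge
  def olderThan(df: DataFrame, minAge: Int): Long =
    df.filter(col("age") > minAge).count()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("dsl-sketch").master("local").getOrCreate()
    val personDF = people(spark)

    personDF.select("name").show()                                     // select name
    personDF.select("name", "age").show()                              // select name, age
    personDF.select(col("name"), col("age"), col("age") + 1).show()    // select name, age, age + 1
    personDF.filter(col("age") < 25).select("name", "age").show()      // where age < 25
    println(olderThan(personDF, 35))                                   // count where age > 35
    personDF.groupBy("age").count().show()                             // group by age
    personDF.filter(col("name") === "zhangsan").select("name").show()  // where name = 'zhangsan'

    spark.stop()
  }
}
```

With this form, a misspelled method name is rejected at compile time, whereas a typo inside a SQL string only fails when the query runs. Column names are still plain strings in both forms, and the two compile to the same kind of query plan, so the choice is largely one of style.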
Source: tuomasi.blog.csdn.net. Author: 托马斯-酷涛. Copyright belongs to the original author; please contact the author before reposting.
Original link: tuomasi.blog.csdn.net/article/details/123975210