Joining the "21-Day Big Data Bootcamp": DAY17, Easily Exploring the Value Behind Data - Data Lake Insight Lab Summary
1. Download the test data
http://obs-salepredict.obs.cn-north-1.myhwclouds.com/index.html
Click chicago.csv to download the raw data.
2. OBS object storage
3. Create a bucket
4. Upload the data
5. DLI
6. Queue
7. Create a database
8. Create the OBS table
CREATE TABLE chicago(
room_id long,
survey_id int,
host_id long,
room_type string,
country string,
city string,
borough string,
neighborhood string,
reviews int,
overall_satisfaction float,
accommodates int,
bedrooms float,
bathrooms string,
price float,
minstay string,
last_modified string,
latitude double,
longitude double,
location string
) USING csv OPTIONS (path "s3a://dli-demo-richblue88/")
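Before uploading the file and creating the table, it can help to confirm that every row of the downloaded CSV has the 19 fields the statement declares. The following is a minimal Python sketch, not part of the lab itself. It assumes the file was saved locally as chicago.csv (a hypothetical path) and is comma-separated. Whether the file carries a header row is not stated here, so the check only counts fields per row.

```python
import csv

# Number of columns declared in the CREATE TABLE statement above.
EXPECTED_COLUMNS = 19


def find_bad_rows(path, expected=EXPECTED_COLUMNS):
    """Return (row_number, field_count) for every row whose width differs from expected."""
    bad = []
    with open(path, newline="", encoding="utf-8") as f:
        # csv.reader handles quoted fields that contain commas.
        for row_number, row in enumerate(csv.reader(f), start=1):
            if len(row) != expected:
                bad.append((row_number, len(row)))
    return bad


if __name__ == "__main__":
    # "chicago.csv" is an assumed local file name, adjust as needed.
    print(find_bad_rows("chicago.csv"))
```

An empty list means every row has 19 fields. Any entries point at rows that would not line up with the declared columns.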
9. SQL editing and querying
-- Share of listings in each price band
select
  case
    when price <= 50 then '1 (<50)'
    when price > 50 and price <= 100 then '2 (50-100)'
    when price > 100 and price <= 150 then '3 (100-150)'
    when price > 150 and price <= 200 then '4 (150-200)'
    when price > 200 and price <= 250 then '5 (200-250)'
    when price > 250 and price <= 300 then '6 (250-300)'
    when price > 300 and price <= 500 then '7 (300-500)'
    else '8 (>500)'
  end as price_level,
  -- divide each band's row count by the total row count of the table
  count(*) / (select count(*) from db1.chicago) as percentage
from
  db1.chicago
group by
  price_level
order by
  price_level
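To cross-check the query result on a small sample, the same bucketing can be reproduced locally. The following is a minimal Python sketch of the logic in the CASE expression. The function names `price_level` and `price_distribution` are made up for illustration and are not part of DLI.

```python
from collections import Counter

# (inclusive upper bound, label) pairs, mirroring the `price <= N` conditions.
BOUNDS = [
    (50, "1 (<50)"), (100, "2 (50-100)"), (150, "3 (100-150)"),
    (200, "4 (150-200)"), (250, "5 (200-250)"), (300, "6 (250-300)"),
    (500, "7 (300-500)"),
]


def price_level(price):
    """Map a price to its labelled band, following the CASE expression."""
    if price is not None:
        for upper, label in BOUNDS:
            if price <= upper:
                return label
    # As in SQL, missing (NULL) prices and prices above 500 reach the ELSE branch.
    return "8 (>500)"


def price_distribution(prices):
    """Return {price_level: share of rows}, ordered by level like ORDER BY."""
    counts = Counter(price_level(p) for p in prices)
    total = sum(counts.values())
    return {level: counts[level] / total for level in sorted(counts)}
```

For example, `price_distribution([10, 20, 120, 600])` returns `{'1 (<50)': 0.5, '3 (100-150)': 0.25, '8 (>500)': 0.25}`. Note that a price of exactly 50 lands in band 1, because the upper bounds are inclusive.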
Chart display