Data Visualization: Seaborn Plotting Summary, Part 2
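Seaborn is a higher-level API built on top of matplotlib that makes plotting easier. It also works closely with NumPy and pandas data structures and with the statistical routines in SciPy and statsmodels.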
Plot types
- Relational plots
  - relplot(): figure-level interface for relational plots; it wraps the two functions below, selected with the kind parameter
  - scatterplot(): scatter plot
  - lineplot(): line plot
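- Categorical plots
  - catplot(): figure-level interface for categorical plots; it wraps the eight functions below, selected with the kind parameter
  - stripplot(): categorical scatter plot
  - swarmplot(): categorical scatter plot with non-overlapping points, which reveals the distribution density
  - boxplot(): box plot
  - violinplot(): violin plot
  - boxenplot(): enhanced box plot (letter-value plot)
  - pointplot(): point plot
  - barplot(): bar plot
  - countplot(): count plot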
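- Distribution plots
  - jointplot(): bivariate relationship plot with marginal distributions
  - pairplot(): matrix of pairwise relationship plots
  - distplot(): histogram with a density estimate (deprecated since seaborn 0.11 in favor of histplot() and displot())
  - kdeplot(): kernel density estimate plot
  - rugplot(): draws each data point in an array as a tick along an axis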
- Regression plots
  - lmplot(): regression model plot
  - regplot(): linear regression plot
  - residplot(): residual plot for a linear regression
- Matrix plots
  - heatmap(): heat map
  - clustermap(): hierarchically clustered heat map
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple seaborn
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
pointplot
Point plot
"""
Conditional means with observations
===================================
"""
sns.set(style="whitegrid")
iris = pd.read_csv("./seaborn-data-master/iris.csv")
# "Melt" the dataset to "long-form" or "tidy" representation
iris = pd.melt(iris, "species", var_name="measurement")
# Initialize the figure
f, ax = plt.subplots()
sns.despine(bottom=True, left=True)
# Show each observation with a scatterplot
sns.stripplot(x="value", y="measurement", hue="species",
              data=iris, dodge=True, jitter=True,
              alpha=.25, zorder=1)
# Show the conditional means
sns.pointplot(x="value", y="measurement", hue="species",
              data=iris, dodge=.532, join=False, palette="dark",
              markers="d", scale=.75, ci=None)
# Improve the legend
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[3:], labels[3:], title="species",
          handletextpad=0, columnspacing=1,
          loc="lower right", ncol=3, frameon=True)
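The "melt" step in the example above is what turns the wide iris table into one row per observation. The sketch below shows the reshaping on a tiny hand-made table (the values are invented, not taken from iris.csv) and then computes the per-group means, which is the quantity `pointplot()` marks when its default mean estimator is used.

```python
import pandas as pd

# Hypothetical wide-form table with the same layout idea as iris.csv
wide = pd.DataFrame({
    "species": ["setosa", "setosa", "virginica", "virginica"],
    "sepal_length": [5.0, 5.2, 6.4, 6.6],
    "sepal_width": [3.4, 3.6, 2.8, 3.0],
})

# Long form: one row per (species, measurement) observation;
# the measured numbers land in a column named "value"
long = pd.melt(wide, "species", var_name="measurement")
print(long.columns.tolist())  # ['species', 'measurement', 'value']

# Conditional means per measurement and species
means = long.groupby(["measurement", "species"])["value"].mean()
print(means)
```

Long-form data is what lets a single `x="value", y="measurement", hue="species"` call cover every measurement at once.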
scatterplot
Scatter plot
"""
Scatterplot with categorical and numerical semantics
====================================================
"""
sns.set(style="whitegrid")
# Load the example diamonds dataset
diamonds = pd.read_csv("./seaborn-data-master/diamonds.csv")
# Draw a scatter plot while assigning point colors and sizes to different
# variables in the dataset
f, ax = plt.subplots(figsize=(6.5, 6.5))
sns.despine(f, left=True, bottom=True)
clarity_ranking = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]
sns.scatterplot(x="carat", y="price",
                hue="clarity", size="depth",
                palette="ch:r=-.2,d=.3_r",
                hue_order=clarity_ranking,
                sizes=(1, 8), linewidth=0,
                data=diamonds, ax=ax)
boxenplot
Enhanced box plot (letter-value plot)
"""
Plotting large distributions
============================
"""
sns.set(style="whitegrid")
diamonds = pd.read_csv("./seaborn-data-master/diamonds.csv")
clarity_ranking = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]
sns.boxenplot(x="clarity", y="carat",
              color="b", order=clarity_ranking,
              scale="linear", data=diamonds)
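A boxen plot, also known as a letter-value plot, extends the box plot by drawing a box for each successively more extreme pair of quantiles, which is why it suits large datasets. The sketch below computes the first few such pairs with NumPy on synthetic data. It is illustrative only: the data is randomly generated as a stand-in for a skewed column, and the code does not reproduce how seaborn decides how many levels to draw.

```python
import numpy as np

rs = np.random.RandomState(0)
values = rs.lognormal(size=10_000)  # synthetic skewed data (assumption)

# Letter values: median, then fourths, eighths, sixteenths, ...
letter_values = []
for depth in range(1, 5):
    tail = 0.5 ** depth  # 0.5, 0.25, 0.125, 0.0625
    lower, upper = np.quantile(values, [tail, 1 - tail])
    letter_values.append((lower, upper))
    print(f"depth {depth}: lower={lower:.3f}, upper={upper:.3f}")
```

At depth 1 both numbers are the median; each later depth gives the edges of a wider box that reaches further into the tails.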
Scatterplot
Scatter plot
"""
Scatterplot with continuous hues and sizes
==========================================
"""
sns.set()
# Load the example planets dataset
planets = pd.read_csv("./seaborn-data-master/planets.csv")
cmap = sns.cubehelix_palette(rot=-.2, as_cmap=True)
ax = sns.scatterplot(x="distance", y="orbital_period",
                     hue="year", size="mass",
                     palette=cmap, sizes=(10, 200),
                     data=planets)
"""
Scatterplot with marginal ticks
===============================
"""
sns.set(style="white", color_codes=True)
# Generate a random bivariate dataset
rs = np.random.RandomState(9)
mean = [0, 0]
cov = [(1, 0), (0, 2)]
x, y = rs.multivariate_normal(mean, cov, 100).T
# Use JointGrid directly to draw a custom plot
# Pass x and y by keyword; recent seaborn releases no longer accept them positionally
grid = sns.JointGrid(x=x, y=y, space=0, height=6, ratio=50)
grid.plot_joint(plt.scatter, color="g")
grid.plot_marginals(sns.rugplot, height=1, color="g")
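To make the random bivariate dataset in the example above more concrete, the following sketch repeats only the data-generation step and inspects the result. The variable `samples` is introduced here for illustration; everything else mirrors the code above.

```python
import numpy as np

rs = np.random.RandomState(9)
mean = [0, 0]
cov = [(1, 0), (0, 2)]  # independent variables with variances 1 and 2

# multivariate_normal returns an array of shape (n, 2);
# transposing lets it unpack into separate x and y vectors
samples = rs.multivariate_normal(mean, cov, 100)
x, y = samples.T
print(samples.shape, x.shape, y.shape)  # (100, 2) (100,) (100,)

# Sample covariance matrix: close to cov, up to sampling noise at n=100
print(np.cov(x, y))
```

The zero off-diagonal entries in `cov` are why the joint scatter shows no tilt, while the larger variance of `y` spreads the rug ticks further along the vertical margin.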
PairGrid
A grid of subplots for plotting pairwise relationships in a dataset.
"""
Paired density and scatterplot matrix
=====================================
"""
sns.set(style="white")
df = pd.read_csv("./seaborn-data-master/iris.csv")
g = sns.PairGrid(df, diag_sharey=False)
g.map_lower(sns.kdeplot)
g.map_upper(sns.scatterplot)
g.map_diag(sns.kdeplot, lw=3)
"""
Paired categorical plots
========================
"""
sns.set(style="whitegrid")
# Load the example Titanic dataset
titanic = pd.read_csv("./seaborn-data-master/titanic.csv")
# Set up a grid to plot survival probability against several variables
g = sns.PairGrid(titanic, y_vars="survived",
                 x_vars=["class", "sex", "who", "alone"],
                 height=5, aspect=.5)
# Draw a seaborn pointplot onto each Axes
g.map(sns.pointplot, scale=1.3, errwidth=4, color="xkcd:plum")
g.set(ylim=(0, 1))
sns.despine(fig=g.fig, left=True)
residplot
Residual plot for a linear regression
"""
Plotting model residuals
========================
"""
sns.set(style="whitegrid")
# Make an example dataset with y ~ x
rs = np.random.RandomState(7)
x = rs.normal(2, 1, 75)
y = 2 + 1.5 * x + rs.normal(0, 2, 75)
# Plot the residuals after fitting a linear model
# Pass x and y by keyword; recent seaborn releases no longer accept them positionally
sns.residplot(x=x, y=y, lowess=True, color="g")
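For readers who want to see what a residual plot is built from, here is a minimal sketch that fits the same synthetic data with `np.polyfit` and computes the residuals by hand. Using `np.polyfit` is an assumption made for illustration; it is not a claim about how seaborn performs the fit internally.

```python
import numpy as np

rs = np.random.RandomState(7)
x = rs.normal(2, 1, 75)
y = 2 + 1.5 * x + rs.normal(0, 2, 75)

# Ordinary least-squares line: y is approximated by slope * x + intercept
slope, intercept = np.polyfit(x, y, deg=1)

# Residual = observed value minus fitted value
residuals = y - (slope * x + intercept)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"mean residual={residuals.mean():.2e}")  # numerically zero
```

A residual plot scatters these residuals against `x`. If the linear model is adequate, the points show no trend, and a smoothed curve through them (the `lowess=True` option above) stays close to zero.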
"""
Scatterplot with varying point sizes and hues
==============================================
"""
sns.set(style="white")
# Load the example mpg dataset
mpg = pd.read_csv("./seaborn-data-master/mpg.csv")
# Plot miles per gallon against horsepower with other semantics
sns.relplot(x="horsepower", y="mpg", hue="origin", size="weight",
            sizes=(40, 400), alpha=.5, palette="muted",
            height=6, data=mpg)
swarmplot
Categorical scatter plot with non-overlapping points, which reveals the distribution density
"""
Scatterplot with categorical variables
======================================
"""
sns.set(style="whitegrid", palette="muted")
# Load the example iris dataset
iris = pd.read_csv("./seaborn-data-master/iris.csv")
# "Melt" the dataset to "long-form" or "tidy" representation
iris = pd.melt(iris, "species", var_name="measurement")
# Draw a categorical scatterplot to show each observation
sns.swarmplot(x="measurement", y="value", hue="species",
              palette=["r", "c", "y"], data=iris)
pairplot
Matrix of pairwise relationship plots
"""
Scatterplot Matrix
==================
"""
sns.set(style="ticks")
df = pd.read_csv("./seaborn-data-master/iris.csv")
sns.pairplot(df, hue="species")
clustermap
Hierarchically clustered heat map
"""
Discovering structure in heatmap data
=====================================
"""
sns.set()
# Load the brain networks example dataset
df = pd.read_csv("./seaborn-data-master/brain_networks.csv", header=[0, 1, 2], index_col=0)
# Select a subset of the networks
used_networks = [1, 5, 6, 7, 8, 12, 13, 17]
used_columns = (df.columns.get_level_values("network")
                          .astype(int)
                          .isin(used_networks))
df = df.loc[:, used_columns]
# Create a categorical palette to identify the networks
network_pal = sns.husl_palette(8, s=.45)
network_lut = dict(zip(map(str, used_networks), network_pal))
# Convert the palette to vectors that will be drawn on the side of the matrix
networks = df.columns.get_level_values("network")
network_colors = pd.Series(networks, index=df.columns).map(network_lut)
# Draw the full plot
sns.clustermap(df.corr(), center=0, cmap="vlag",
               row_colors=network_colors, col_colors=network_colors,
               linewidths=.75, figsize=(13, 13))
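The column selection and color lookup in the clustermap example can be hard to follow without seeing the data. The sketch below replays those two steps on a tiny, made-up DataFrame with a three-level column index. The level names "node" and "hemi", the values, and the color names are placeholders chosen for illustration; the column labels are strings here, which is why the integer conversion is needed, as in the code above.

```python
import pandas as pd

# Hypothetical 3-level column index with string labels
columns = pd.MultiIndex.from_tuples(
    [("1", "1", "lh"), ("1", "1", "rh"), ("2", "1", "lh"), ("5", "1", "rh")],
    names=["network", "node", "hemi"],
)
df = pd.DataFrame([[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]], columns=columns)

# Keep only the columns whose "network" label is in the chosen subset
used_networks = [1, 5]
mask = df.columns.get_level_values("network").astype(int).isin(used_networks)
df = df.loc[:, mask]

# Lookup table from network label (string) to a color (placeholder names)
network_lut = dict(zip(map(str, used_networks), ["tab:blue", "tab:orange"]))

# One color per remaining column, aligned on the column index
networks = df.columns.get_level_values("network")
network_colors = pd.Series(networks, index=df.columns).map(network_lut)
print(network_colors.tolist())  # ['tab:blue', 'tab:blue', 'tab:orange']
```

Because `network_colors` is indexed by the same columns as the data, it lines up with both the rows and the columns of `df.corr()`, so the same Series can serve as `row_colors` and `col_colors`.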