12 - Extended groupby operations -- Data analysis
[Abstract]
Extended groupby operations
In [3]:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': np.random.randn(8),
                   'D': np.random.randn(8)})
df
Out[3]:
In [5]:
grouped = df.groupby('A')
grouped
Out[5]:
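Displaying `grouped` only shows a `DataFrameGroupBy` object, because nothing is computed until an aggregation is requested. As a minimal sketch (the small `demo` frame and its values are made up for illustration), iterating over the object shows what each group holds:

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible
demo = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo'],
                     'C': [1, 2, 3, 4, 5]})

grouped_demo = demo.groupby('A')

# Iterating yields (key, sub-DataFrame) pairs; keys are sorted by default
keys = []
rows_per_key = {}
for key, frame in grouped_demo:
    keys.append(key)
    rows_per_key[key] = len(frame)

print(keys)          # ['bar', 'foo']
print(rows_per_key)  # {'bar': 2, 'foo': 3}
```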
In [7]:
grouped.count()
Out[7]:
In [8]:
grouped = df.groupby(['A','B','C'])
grouped.count()
Out[8]:
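Because `C` holds random floats, grouping by `['A','B','C']` will most likely put every row in its own group. The sketch below uses a made-up `demo` frame with fixed values to contrast one grouping key with two, so the shape of the result is easier to see:

```python
import pandas as pd

# Hypothetical fixed data (not the random frame used above)
demo = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'],
                     'B': ['one', 'one', 'two', 'one'],
                     'C': [1.0, 2.0, 3.0, 4.0]})

# One key: one result row per distinct value of A
by_a = demo.groupby('A').count()

# Two keys: one result row per distinct (A, B) pair, indexed by a MultiIndex
by_ab = demo.groupby(['A', 'B']).count()

print(by_a)
print(by_ab)
```

With two keys the result has three rows, one each for `(bar, one)`, `(foo, one)` and `(foo, two)`.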
In [9]:
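def get_letter_type(letter):
    # Classify a column label as a vowel ('a') or a consonant ('b')
    if letter.lower() in 'aeiou':
        return 'a'
    else:
        return 'b'

# Group the columns by label. groupby(..., axis=1) is deprecated since pandas 2.1,
# so transpose, group the rows, and transpose the result back.
grouped = df.T.groupby(get_letter_type)
grouped.count().T.iloc[0]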
Out[9]:
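When a function is passed to `groupby`, it is called on each label of the axis being grouped. To check which column lands in which group, the following sketch (with a made-up `demo` frame) transposes the frame so that the column names become row labels, then inspects `.groups`:

```python
import pandas as pd

def get_letter_type(letter):
    # 'a' for vowels, 'b' for everything else
    return 'a' if letter.lower() in 'aeiou' else 'b'

# Hypothetical data; only the column names matter here
demo = pd.DataFrame({'A': ['x', 'y'],
                     'B': ['p', 'q'],
                     'C': [1.0, 2.0],
                     'D': [3.0, 4.0]})

# After transposing, the column names A..D are the row labels being classified
groups = demo.T.groupby(get_letter_type).groups

print(list(groups['a']))  # ['A']
print(list(groups['b']))  # ['B', 'C', 'D']
```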
In [10]:
s = pd.Series([1,2,3,1,2,3],[8,7,5,8,7,5])
s
Out[10]:
In [13]:
grouped = s.groupby(level=0,sort=False)
In [14]:
grouped.first()
Out[14]:
In [15]:
grouped.last()
Out[15]:
In [16]:
grouped.sum()
Out[16]:
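The series above has the repeated index labels 8, 7, 5, so `level=0` groups rows that share a label. As a small sketch of what `sort=False` changes, the code below compares the order of the group keys with and without sorting:

```python
import pandas as pd

s_demo = pd.Series([1, 2, 3, 1, 2, 3], index=[8, 7, 5, 8, 7, 5])

# sort=False keeps groups in order of first appearance: 8, 7, 5
unsorted_sum = s_demo.groupby(level=0, sort=False).sum()

# The default (sort=True) orders the group keys: 5, 7, 8
sorted_sum = s_demo.groupby(level=0).sum()

print(unsorted_sum.index.tolist(), unsorted_sum.tolist())  # [8, 7, 5] [2, 4, 6]
print(sorted_sum.index.tolist(), sorted_sum.tolist())      # [5, 7, 8] [6, 4, 2]
```

In this data `first()` and `last()` return the same values, since each label maps to the same number both times it occurs.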
In [17]:
df2 = pd.DataFrame({'X':['A','B','A','B'],
'Y':[1,2,3,4]})
df2
Out[17]:
In [19]:
# Group by the scalar key 'X'; with a one-element list, newer pandas expects a tuple in get_group
df2.groupby('X').get_group('A')
Out[19]:
In [20]:
df2.groupby('X').get_group('B')
Out[20]:
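`get_group` returns the rows of a single group as an ordinary DataFrame. A short sketch on the same small frame, also showing the `groups` mapping from each key to its row labels:

```python
import pandas as pd

df2_demo = pd.DataFrame({'X': ['A', 'B', 'A', 'B'],
                         'Y': [1, 2, 3, 4]})

grouped_demo = df2_demo.groupby('X')

# Rows whose X value is 'A', with their original row labels
group_a = grouped_demo.get_group('A')

# groups maps each key to the row labels that belong to it
labels_b = list(grouped_demo.groups['B'])

print(group_a['Y'].tolist())  # [1, 3]
print(labels_b)               # [1, 3]
```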
In [21]:
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
In [23]:
index = pd.MultiIndex.from_arrays(arrays,names=['first','second'])
In [24]:
s = pd.Series(np.random.randn(8),index=index)
s
Out[24]:
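### groupby(level=0): the level argument selects which index level to group by. It can be an integer position (0 is the first, outermost index level) or the name of the level.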
In [25]:
grouped = s.groupby(level = 0)
grouped.sum()
Out[25]:
In [26]:
grouped = s.groupby(level = 1)
grouped.sum()
Out[26]:
In [27]:
grouped = s.groupby(level = 'first')
grouped.sum()
Out[27]:
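Since the series above is random, its sums differ on every run. The sketch below repeats the same idea on a smaller, made-up MultiIndex series with fixed values, so that grouping by position and by name can be compared directly:

```python
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz'],
          ['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])

# Hypothetical fixed values instead of np.random.randn
demo = pd.Series([1, 2, 3, 4], index=index)

by_position = demo.groupby(level=0).sum()       # outer level, by position
by_name = demo.groupby(level='first').sum()     # same level, by name
by_second = demo.groupby(level='second').sum()  # inner level

print(by_position.to_dict())  # {'bar': 3, 'baz': 7}
print(by_second.to_dict())    # {'one': 4, 'two': 6}
```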
In [28]:
df
Out[28]:
In [29]:
grouped = df.groupby(['A','B'])
# Pass the name 'sum'; recent pandas versions warn when given np.sum here
grouped.aggregate('sum')
Out[29]:
In [30]:
grouped = df.groupby(['A','B'],as_index=False)
# as_index=False keeps A and B as ordinary columns instead of a MultiIndex
grouped.aggregate('sum')
Out[30]:
In [33]:
grouped = df.groupby(['A','B']).sum().reset_index()
grouped
Out[33]:
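The last two cells reach the same flat layout in two ways: `as_index=False` at grouping time, or `reset_index()` after aggregating. A minimal sketch on made-up fixed data to compare the two results:

```python
import pandas as pd

# Hypothetical fixed data
demo = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'],
                     'B': ['one', 'one', 'one', 'two'],
                     'C': [1, 2, 3, 4]})

flat_a = demo.groupby(['A', 'B'], as_index=False).sum()
flat_b = demo.groupby(['A', 'B']).sum().reset_index()

print(flat_a.to_dict('records'))
print(flat_a.to_dict('records') == flat_b.to_dict('records'))  # True
```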
In [34]:
grouped = df.groupby(['A','B'])
grouped.size()
Out[34]:
In [35]:
grouped.describe().head()
Out[35]:
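`size()` and `count()` are easy to confuse: `size()` reports the number of rows in each group, while `count()` reports the non-missing values per column. The sketch below uses a made-up frame with one missing value to show the difference, and also lists the statistics that `describe()` returns:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in the 'foo' group
demo = pd.DataFrame({'A': ['foo', 'foo', 'bar'],
                     'C': [1.0, np.nan, 3.0]})

grouped_demo = demo.groupby('A')

sizes = grouped_demo.size()          # rows per group, NaN rows included
counts = grouped_demo['C'].count()   # non-missing values per group
stats = grouped_demo['C'].describe() # one row of summary statistics per group

print(sizes.to_dict())   # {'bar': 1, 'foo': 2}
print(counts.to_dict())  # {'bar': 1, 'foo': 1}
print(list(stats.columns))
```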
In [38]:
grouped = df.groupby('A')
# A list of function names yields one output column per function
grouped['C'].agg(['sum', 'mean', 'std'])
Out[38]:
In [39]:
# Named aggregation (keyword = output column name); renaming via a dict was removed in pandas 1.0
grouped['C'].agg(total='sum', mean='mean', std='std')
Out[39]:
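To give aggregated columns custom names on a single column, pandas supports named aggregation, where each keyword becomes an output column. A minimal sketch on made-up fixed data (the output names `total`, `average` and `spread` are arbitrary choices for illustration):

```python
import pandas as pd

# Hypothetical fixed data
demo = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar'],
                     'C': [1.0, 3.0, 2.0, 6.0]})

# keyword = output column name, value = aggregation to apply
summary = demo.groupby('A')['C'].agg(total='sum', average='mean', spread='std')

print(summary)
```

Note that `'std'` is the sample standard deviation (`ddof=1`), not the variance, so for the `foo` group (values 1 and 3) it equals the square root of 2.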
Source: brucelong.blog.csdn.net. Author: Bruce小鬼. Copyright belongs to the original author; please contact the author before reposting.
Original link: brucelong.blog.csdn.net/article/details/80764534