11 - Common pandas Operations - Data Analysis
In [3]:
import pandas as pd
data = pd.DataFrame({'group':['a','a','a','b','b','b','c','c','c'],
                     'data':[4,3,2,1,12,3,4,5,7]})
data
Out[3]:
In [5]:
# Sort by group descending, then by data ascending within each group (modifies data in place)
data.sort_values(by=['group','data'],ascending=[False,True],inplace=True)
data
Out[5]:
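The in-place sort above overwrites data. As a minimal sketch, assuming the same sample frame, a non-mutating variant could look like the following; the names sorted_data and sorted_reset are illustrative and do not come from the original notebook.

```python
import pandas as pd

data = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
                     'data': [4, 3, 2, 1, 12, 3, 4, 5, 7]})

# Without inplace=True, sort_values returns a new DataFrame and leaves data untouched.
sorted_data = data.sort_values(by=['group', 'data'], ascending=[False, True])

# Sorted rows keep their original index labels; reset_index(drop=True) renumbers them from 0.
sorted_reset = sorted_data.reset_index(drop=True)
print(sorted_reset)
```

Keeping the original frame intact makes it easier to rerun a cell without the result depending on earlier executions.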
In [7]:
data = pd.DataFrame({'k1':['one']*3+['two']*4,
                     'k2':[3,2,1,3,3,4,4]})
data
Out[7]:
In [8]:
# Sort rows by k2 ascending; returns a new DataFrame and leaves data unchanged
data.sort_values(by='k2')
Out[8]:
In [9]:
# Drop rows that repeat an earlier row across all columns
data.drop_duplicates()
Out[9]:
In [10]:
# Keep only the first row for each distinct k1 value
data.drop_duplicates('k1')
Out[10]:
In [11]:
# Keep only the first row for each distinct k2 value
data.drop_duplicates('k2')
Out[11]:
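The three drop_duplicates calls above all keep the first occurrence. The sketch below, which assumes the same k1/k2 sample frame, shows which index labels survive and how the keep parameter and DataFrame.duplicated change the picture; the variable names are illustrative.

```python
import pandas as pd

data = pd.DataFrame({'k1': ['one'] * 3 + ['two'] * 4,
                     'k2': [3, 2, 1, 3, 3, 4, 4]})

# Full-row duplicates: rows 4 and 6 repeat rows 3 and 5.
full = data.drop_duplicates()

# Deduplicate on one column, keeping the first or the last row per k1 value.
first_k1 = data.drop_duplicates(subset='k1')
last_k1 = data.drop_duplicates(subset='k1', keep='last')

# duplicated() returns the boolean mask that drop_duplicates filters on.
dup_mask = data.duplicated()

print(full.index.tolist(), first_k1.index.tolist(), last_k1.index.tolist())
```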
In [21]:
data = pd.DataFrame({'food':['A1','A2','B1','B2','B3','C1','C2'],'data':[1,2,3,4,5,6,7]})
data
Out[21]:
In [22]:
# Map each row's food code to its category letter
def food_map(series):
    if series['food'] == 'A1':
        return 'A'
    elif series['food'] == 'A2':
        return 'A'
    elif series['food'] == 'B1':
        return 'B'
    elif series['food'] == 'B2':
        return 'B'
    elif series['food'] == 'B3':
        return 'B'
    elif series['food'] == 'C1':
        return 'C'
    elif series['food'] == 'C2':
        return 'C'
# axis='columns' passes one row at a time to food_map
data['food_map'] = data.apply(food_map,axis = 'columns')
data
Out[22]:
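The if/elif chain above returns None implicitly for any food code it does not list. A minimal sketch of one alternative, assuming a lookup table is acceptable, is shown below; FOOD_GROUPS, the extra 'D1' row, and the 'unknown' fallback are hypothetical additions for illustration.

```python
import pandas as pd

# Same sample as above plus one hypothetical unlisted code, 'D1'.
data = pd.DataFrame({'food': ['A1', 'A2', 'B1', 'B2', 'B3', 'C1', 'C2', 'D1'],
                     'data': [1, 2, 3, 4, 5, 6, 7, 8]})

# Hypothetical lookup table standing in for the if/elif chain.
FOOD_GROUPS = {'A1': 'A', 'A2': 'A', 'B1': 'B', 'B2': 'B', 'B3': 'B',
               'C1': 'C', 'C2': 'C'}

def food_map(row):
    # dict.get supplies an explicit fallback for codes missing from the table.
    return FOOD_GROUPS.get(row['food'], 'unknown')

# axis='columns' calls food_map once per row.
data['food_map'] = data.apply(food_map, axis='columns')
print(data)
```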
In [23]:
food2Upper = {
    'A1':'A',
    'A2':'A',
    'B1':'B',
    'B2':'B',
    'B3':'B',
    'C1':'C',
    'C2':'C'
}
# Map each food code to its category via dictionary lookup
data['upper'] = data['food'].map(food2Upper)
data
Out[23]:
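Series.map with a dictionary yields NaN for values that have no key. The sketch below, assuming the same food sample, shows that behaviour with a deliberately incomplete dictionary and, since every category in this sample happens to be the first character of the code, a string-slicing shortcut; partial, filled, and first_char are illustrative names.

```python
import pandas as pd

data = pd.DataFrame({'food': ['A1', 'A2', 'B1', 'B2', 'B3', 'C1', 'C2'],
                     'data': [1, 2, 3, 4, 5, 6, 7]})

# Deliberately incomplete mapping: only the A codes are listed.
partial = {'A1': 'A', 'A2': 'A'}
mapped = data['food'].map(partial)   # unlisted codes become NaN

# fillna supplies a default for the unmapped entries.
filled = mapped.fillna('other')

# Shortcut that only works because the category is the first character here.
first_char = data['food'].str[0]
print(filled.tolist(), first_char.tolist())
```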
In [31]:
import numpy as np
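df = pd.DataFrame({'data1':np.random.randn(5),
                   'data2':np.random.randn(5)})
# Add a derived column holding the element-wise ratio data1 / data2
df2 = df.assign(ratio = df['data1']/df['data2'])
df2
Out[31]:
In [29]:
# Remove the derived column again, modifying df2 in place
df2.drop('ratio',axis='columns',inplace=True)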
In [30]:
df2
Out[30]:
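Because np.random.randn produces different numbers on every run, the cells above are not reproducible. A minimal sketch with a seeded generator follows; the seed value 0, the column name ratio, and the variable df3 are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# A seeded generator makes the random columns repeatable between runs.
rng = np.random.default_rng(0)
df = pd.DataFrame({'data1': rng.standard_normal(5),
                   'data2': rng.standard_normal(5)})

# assign returns a new DataFrame with the extra column; df itself is unchanged.
df2 = df.assign(ratio=df['data1'] / df['data2'])

# drop(columns=...) without inplace also returns a new DataFrame.
df3 = df2.drop(columns='ratio')
print(df2.columns.tolist(), df3.columns.tolist())
```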
In [33]:
data = pd.Series([1,2,3,4,5,6,7,8,9])
data
Out[33]:
In [34]:
# Replace the value 9 with NaN in place
data.replace(9,np.nan,inplace=True)
In [35]:
data
Out[35]:
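Replacing an integer with NaN changes the Series dtype, and replace also accepts a dictionary for several substitutions at once. The sketch below assumes the same 1 to 9 sample; the substitution values in multi are made up for illustration.

```python
import numpy as np
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

# NaN is a float, so the integer Series is upcast to float64.
replaced = data.replace(9, np.nan)

# A dict maps several old values to new ones in a single call.
multi = data.replace({8: 0, 9: 0})

print(replaced.dtype, multi.tolist())
```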
In [41]:
ages = [13,23,34,45,56,67,79]
bins = [10,40,80]
# Bin the ages into the right-closed intervals (10, 40] and (40, 80]
bins_res = pd.cut(ages,bins)
bins_res
Out[41]:
In [43]:
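# Count the ages in each bin (the top-level pd.value_counts is deprecated in recent pandas)
pd.Series(bins_res).value_counts()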
Out[43]:
In [44]:
# Three bins: (10, 30], (30, 50], (50, 80]
pd.cut(ages,[10,30,50,80])
Out[44]:
In [47]:
group_names = ['Youth','Middle','Old']
pd.cut(ages,[10,20,50,80],labels=group_names)
Out[47]:
In [48]:
group_names = ['Youth','Middle','Old']
pd.Series(pd.cut(ages,[10,20,50,80],labels=group_names)).value_counts()
Out[48]:
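pd.cut builds right-closed intervals by default, so a value equal to a bin edge falls into the lower bin. The sketch below assumes the same ages list; the label spellings and the edge-case value 20 are illustrative choices.

```python
import pandas as pd

ages = [13, 23, 34, 45, 56, 67, 79]
labels = ['Youth', 'Middle', 'Old']

# Bins are (10, 20], (20, 50], (50, 80]; each age receives the matching label.
groups = pd.cut(ages, [10, 20, 50, 80], labels=labels)
counts = pd.Series(groups).value_counts()

# Edge case: 20 lands in the first bin by default, in the second with right=False.
default_code = pd.cut([20], [10, 20, 50]).codes.tolist()
left_closed_code = pd.cut([20], [10, 20, 50], right=False).codes.tolist()

print(list(groups), default_code, left_closed_code)
```

The codes attribute gives the zero-based bin position, which is a convenient way to check where a boundary value ended up.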
In [49]:
df = pd.DataFrame([range(3),[0,np.nan,0],[0,0,np.nan],range(3)])
df
Out[49]:
In [50]:
# Boolean mask: True wherever a value is missing
df.isnull()
Out[50]:
In [51]:
# Per column: does it contain at least one missing value?
df.isnull().any()
Out[51]:
In [53]:
# Per row: does it contain at least one missing value?
df.isnull().any(axis=1)
Out[53]:
In [54]:
# Fill every missing value with 5; returns a new DataFrame
df.fillna(5)
Out[54]:
In [55]:
# Select only the rows that contain at least one missing value
df[df.isnull().any(axis = 1)]
Out[55]:
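Beyond locating missing values, the usual next steps are counting them, dropping incomplete rows, or filling with a per-column statistic. A minimal sketch on the same 4x3 frame follows; filling with the column mean is one possible choice, not something the original notebook does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([range(3), [0, np.nan, 0], [0, 0, np.nan], range(3)])

# Number of missing values in each column.
missing_per_column = df.isnull().sum()

# Keep only the rows without any missing value.
complete_rows = df.dropna()

# Fill each gap with the mean of its own column (NaN is skipped when averaging).
mean_filled = df.fillna(df.mean())

print(missing_per_column.tolist(), complete_rows.index.tolist())
print(mean_filled)
```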
Source: brucelong.blog.csdn.net. Author: Bruce小鬼. Copyright belongs to the original author; please contact the author before reposting.
Original link: brucelong.blog.csdn.net/article/details/80763617