How do I sort the final output data?

I want to group a DataFrame by two columns and then sort the aggregated results within each group.

In [167]: df

count   job source
0   2   sales   A
1   4   sales   B
2   6   sales   C
3   3   sales   D
4   7   sales   E
5   5   market  A
6   3   market  B
7   2   market  C
8   4   market  D
9   1   market  E
In [168]: df.groupby(['job','source']).agg({'count':'sum'})
Out[168]:

job     source  count
market  A   5
        B   3
        C   2
        D   4
        E   1
sales   A   2
        B   4
        C   6
        D   3
        E   7

Now I want to sort the count column in descending order within each group, and then keep only the top three rows, to get something like:

job     source  count
market  A   5
        D   4
        B   3
sales   E   7
        C   6
        B   4

I also want to order the groups by job: if the total count for sales is larger, the data should be printed as

job     source  count
sales   E   7
        C   6
        B   4
market  A   5
        D   4
        B   3

I can't get the top five to work.

Comments
Dulle

You can do a further groupby and use .nlargest(3):

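# Sum per (job, source), keep the 3 largest counts per job,
# then sort all rows by count (this interleaves the jobs)
df.groupby(['job','source']).agg({'count':'sum'}).groupby(level=0)['count']\
.nlargest(3).reset_index(0,drop=True).to_frame().sort_values('count',ascending=False)
# Out: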
               count
job    source       
sales  E           7
       C           6
market A           5
       D           4
sales  B           4
market B           3
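
If the rows for each job should stay together, as in the desired output in the question, one possible sketch is to sort first and then take the first three rows per group. It assumes that "the sum of count" for a job means the total over all of its sources, not just the top three, and the helper column `job_total` is a name made up here for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# Sum count per (job, source), keeping job and source as ordinary columns
agg = df.groupby(["job", "source"], as_index=False)["count"].sum()

# Hypothetical helper column: total count per job, repeated on every row of that job
agg["job_total"] = agg.groupby("job")["count"].transform("sum")

# Order jobs by their total (largest first, job name as tie-breaker),
# order rows by count within each job, then keep the first 3 rows per job
top3 = (
    agg.sort_values(["job_total", "job", "count"], ascending=[False, True, False])
       .groupby("job", sort=False)
       .head(3)
       .drop(columns="job_total")
       .set_index(["job", "source"])
)
print(top3)
```

With the sample data, sales (total 22) comes before market (total 15), and each job keeps its three largest counts in descending order. `groupby(...).head(3)` preserves the row order produced by `sort_values`, which is why no second sort is needed.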