python:基于数据帧中位置的特定行值的求和

我有以下数据框:

df

    Name    A   B   C   D
0   Americas
1   Mark    1   2   3   4
2   Mat     5   6   7   8
3   ANZ
4   Pat     9   1   2   3
5   Pacific
6   Zack    4   5   6   7
7   Zoey    8   9   1   2
8   Zee     3   4   5   6

在这里,美洲的计算公式为行1:2的总和

ANZ计算为第4行的总和

太平洋按行6:8的总和计算

注意:这些行号是常量,不依赖于任何条件,而是按原样使用。

计算各列总计的方法如下:

df.at[5, 'A'] = df['A'].loc[6:8].sum()
df.at[5, 'B'] = df['B'].loc[6:8].sum()
df.at[5, 'C'] = df['C'].loc[6:8].sum()
df.at[5, 'D'] = df['D'].loc[6:8].sum()

有没有一种方法可以在不使用3行代码的情况下做到这一点?

结果数据框:

df

    Name        A   B   C   D
0   Americas    6   8   10  12
1   Mark        1   2   3   4
2   Mat         5   6   7   8
3   ANZ         9   1   2   3
4   Pat         9   1   2   3
5   Pacific     15  18  12  15
6   Zack        4   5   6   7
7   Zoey        8   9   1   2
8   Zee         3   4   5   6
评论
  • pporro
    pporro 回复

    First find lets find columns we need to fill by using a isnull boolean.

    然后我们创建一个条件列进行分组。

    since you want your original dataframe back, we can use iloc to assign the values back along their index.

    idx = df[df['A'].isnull()].index
    
    #Int64Index([0, 3, 5], dtype='int64')
    
    df.loc[idx,'Key'] = df['Name']
    df['Key'] = df['Key'].ffill()
    
    
    df.iloc[idx,1:-1] = df.drop('Name',axis=1).groupby('Key').transform('sum').loc[idx]
    
    print(df.drop('Key',axis=1))
    
           Name     A     B     C     D
    0  Americas  15.0   9.0  12.0  15.0
    1      Mark   1.0   2.0   3.0   4.0
    2       Mat   5.0   6.0   7.0   8.0
    3       ANZ   9.0   1.0   2.0   3.0
    4       Pat   9.0   1.0   2.0   3.0
    5   Pacific  15.0  18.0  12.0  15.0
    6      Zack   4.0   5.0   6.0   7.0
    7      Zoey   8.0   9.0   1.0   2.0
    8       Zee   3.0   4.0   5.0   6.0
    

  • 念念不忘
    念念不忘 回复

    I would go with finding missing rows with isna and all, and create groups with cumsum. Then use this in a groupby and transform with the sum. Use fillna to fill the missing value.

    g = df[['A','B','C','D']].isna().all(axis=1).cumsum()
    df = df.fillna(df.groupby(g)[['A','B','C','D']].transform('sum'))
    print (df)
           Name     A     B     C     D
    0  Americas   6.0   8.0  10.0  12.0
    1      Mark   1.0   2.0   3.0   4.0
    2       Mat   5.0   6.0   7.0   8.0
    3       ANZ   9.0   1.0   2.0   3.0
    4       Pat   9.0   1.0   2.0   3.0
    5   Pacific  15.0  18.0  12.0  15.0
    6      Zack   4.0   5.0   6.0   7.0
    7      Zoey   8.0   9.0   1.0   2.0
    8       Zee   3.0   4.0   5.0   6.0