How do I merge two data frames that have repeated rows?

I have two data frames, df1 and df2. In df1, the values in the name column repeat while the hobby column changes. df2 also has repeated values in its name column. I want to merge the two data frames and keep every row, including the repeats.

df1:
name   hobby

mike   cricket 
mike   football
jack   chess
jack   football
jack   vollyball
pieter sleeping
pieter cyclying

My df2 is:

df2:
name

mike
pieter 
jack  
mike
pieter 

Now I have to merge df2 with df1 on the name column, keeping the row order of df2, so the resulting df3 should look like this:

df3:
name   hobby

mike   cricket 
mike   football
pieter sleeping
pieter cyclying
jack   chess
jack   football
jack   vollyball
mike   cricket 
mike   football
pieter sleeping
pieter cyclying


Comments
  • 零零柒

    If I understand correctly, you want to record the row order of df2, merge on name, and then sort by that recorded order:

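    import numpy as np

    # Tag each df2 row with its position, merge, then restore that order.
    # kind='stable' keeps df1's hobby order within each df2 row.
    (df2.assign(rank=np.arange(len(df2)))
        .merge(df1, on='name')
        .sort_values('rank', kind='stable')
        .drop('rank', axis=1)
    )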
    

    Output:

          name      hobby
    0     mike    cricket
    1     mike   football
    4   pieter   sleeping
    5   pieter   cyclying
    8     jack      chess
    9     jack   football
    10    jack  vollyball
    2     mike    cricket
    3     mike   football
    6   pieter   sleeping
    7   pieter   cyclying
    
  • 光腚局

    You can do it like this.

    This keeps the duplicates:

    In [438]: pd.merge(df1,df2, on='name')
    Out[438]: 
          name      hobby
    0     mike    cricket
    1     mike    cricket
    2     mike   football
    3     mike   football
    4     jack      chess
    5     jack   football
    6     jack  vollyball
    7   pieter   sleeping
    8   pieter   sleeping
    9   pieter   cyclying
    10  pieter   cyclying
    

    If you do the following instead, the duplicates shown above are removed:

    pd.merge(df1,df2, on='name').drop_duplicates()
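
To see the difference between the two calls above, the following sketch rebuilds the question's frames and compares row counts. The names `merged` and `deduped` are introduced here for illustration only.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({
    "name": ["mike", "mike", "jack", "jack", "jack", "pieter", "pieter"],
    "hobby": ["cricket", "football", "chess", "football", "vollyball",
              "sleeping", "cyclying"],
})
df2 = pd.DataFrame({"name": ["mike", "pieter", "jack", "mike", "pieter"]})

merged = pd.merge(df1, df2, on="name")   # every df1 row paired with every matching df2 row
deduped = merged.drop_duplicates()       # collapse identical (name, hobby) rows

print(len(merged))    # 11
print(len(deduped))   # 7
```

With this data the plain merge has 11 rows, while the deduplicated version falls back to the 7 distinct (name, hobby) pairs of df1, so it no longer contains the repeated blocks that the df3 in the question asks for.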