Let's say we have the following DataFrame:
import pandas as pd

df = pd.DataFrame(
    data={
        'from': [103, 444, 104, 999230],
        'to': [104, 999230, 103, 444],
        'id': [1] * 4,
        'p': [415, 1203.11, -414.35, -1197.37],
        'q': [0, -395.44, 62.23, 489.83]
    })
or, printed as a table:
from to id p q
0 103 104 1 415.00 0.00
1 444 999230 1 1203.11 -395.44
2 104 103 1 -414.35 62.23
3 999230 444 1 -1197.37 489.83
The goal is to combine the rows that share the same pair of "from" and "to" values, in either order. In the example above, rows 0 and 2 need to be combined, and so do rows 1 and 3.
The output should look like this:
from to id p q p1 q1
0 103 104 1 415.00 0.00 -414.35 62.23
1 444 999230 1 1203.11 -395.44 -1197.37 489.83
Of course, the following is also acceptable:
from to id p q p1 q1
0 104 103 1 -414.35 62.23 415.00 0.00
1 999230 444 1 -1197.37 489.83 1203.11 -395.44
Any help is appreciated :)
First sort both columns, "from" and "to", row-wise with numpy.sort, then create a counter Series with GroupBy.cumcount. Reshape with DataFrame.set_index and DataFrame.unstack, sorting the second column level with DataFrame.sort_index. Last, flatten the MultiIndex in the columns with f-strings and convert the MultiIndex in the index back to columns with DataFrame.reset_index:
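A minimal sketch of those steps, assuming the sample data from the question. The function name combine_pairs and the helper column 'g' are hypothetical names chosen for illustration, and the choice to leave the first occurrence unsuffixed (p, q) while suffixing later ones (p1, q1) is an assumption made to match the requested output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data={
        'from': [103, 444, 104, 999230],
        'to': [104, 999230, 103, 444],
        'id': [1] * 4,
        'p': [415, 1203.11, -414.35, -1197.37],
        'q': [0, -395.44, 62.23, 489.83]
    })


def combine_pairs(frame):
    """Hypothetical helper: merge rows whose (from, to) pair matches in either order."""
    out = frame.copy()
    # Sort each row's pair so that (104, 103) and (103, 104) become identical.
    out[['from', 'to']] = np.sort(out[['from', 'to']].to_numpy(), axis=1)
    # Counter per pair: 0 for the first occurrence, 1 for the second, and so on.
    out['g'] = out.groupby(['from', 'to']).cumcount()
    # Move the counter into the columns, then order the columns by counter.
    wide = (out.set_index(['from', 'to', 'id', 'g'])
               .unstack()
               .sort_index(axis=1, level=1))
    # Flatten the column MultiIndex: ('p', 0) -> 'p', ('p', 1) -> 'p1'.
    wide.columns = [f'{name}{n}' if n else name for name, n in wide.columns]
    # Turn the index levels (from, to, id) back into ordinary columns.
    return wide.reset_index()


result = combine_pairs(df)
print(result)
```

Because the pair is sorted before grouping, this yields the first of the two acceptable outputs (smaller value in "from"). A pair that appears only once would get NaN in the p1 and q1 columns.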