I want to build triples of the form source -> target -> edge and store them in a new data frame.
I have two data frames:

       Accident_ID  Location  CarID_1  CarID_2  DriverID_1  DriverID_2
    0            1     Tartu     1000     1001           1           3
    1            2    Tallin     1002     1003           2           5
    2            3     Tartu     1004     1005           4           6
    3            4    Tallin     1006     1007           7           8

       User_ID  First Name  Last Name  Age           Address  Accident_ID     ROLE
    0        1     Chester     Murphy   25  Narva 108, Tartu            1   Driver
    1        2      Walter     Turner   26   Tilgi 49, Tartu            2   Driver
    2        3       Daryl     Fowler   25    Piik 67, Tartu            1   Driver
    3        4         Ted     Nelson   45   Herne 20, Tartu            3   Driver
    4        5      Olivia   Crawford   38  Kalevi 25, Tartu            2   Driver
    5        1     Chester     Murphy   25  Narva 108, Tartu            2  Witness
    6        6         Amy     Miller   27   Riia 408, Tartu            3   Driver
    7        7         Tes      Smith   25  Narva 108, Tartu            4   Driver
    8        8        Josh      Blake   36  Parnu 37, Tallin            4   Driver
    9        3       Daryl     Fowler   25    Piik 67, Tartu            4  Witness

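
To reproduce the problem, the two frames can be rebuilt roughly as follows. This is a minimal sketch: the names `df1` and `df2` follow the code further down in the question, the intermediate list `people_rows` is a hypothetical helper, and the dtypes (integer IDs, string addresses) are assumed from the printed values.

```python
import pandas as pd

# Accidents: one row per accident, with both cars and their drivers.
df1 = pd.DataFrame({
    "Accident_ID": [1, 2, 3, 4],
    "Location": ["Tartu", "Tallin", "Tartu", "Tallin"],
    "CarID_1": [1000, 1002, 1004, 1006],
    "CarID_2": [1001, 1003, 1005, 1007],
    "DriverID_1": [1, 2, 4, 7],
    "DriverID_2": [3, 5, 6, 8],
})

# People: one row per (person, accident) pair, with the person's role.
people_rows = [
    (1, "Chester", "Murphy", 25, "Narva 108, Tartu", 1, "Driver"),
    (2, "Walter", "Turner", 26, "Tilgi 49, Tartu", 2, "Driver"),
    (3, "Daryl", "Fowler", 25, "Piik 67, Tartu", 1, "Driver"),
    (4, "Ted", "Nelson", 45, "Herne 20, Tartu", 3, "Driver"),
    (5, "Olivia", "Crawford", 38, "Kalevi 25, Tartu", 2, "Driver"),
    (1, "Chester", "Murphy", 25, "Narva 108, Tartu", 2, "Witness"),
    (6, "Amy", "Miller", 27, "Riia 408, Tartu", 3, "Driver"),
    (7, "Tes", "Smith", 25, "Narva 108, Tartu", 4, "Driver"),
    (8, "Josh", "Blake", 36, "Parnu 37, Tallin", 4, "Driver"),
    (3, "Daryl", "Fowler", 25, "Piik 67, Tartu", 4, "Witness"),
]
df2 = pd.DataFrame(
    people_rows,
    columns=["User_ID", "First Name", "Last Name", "Age",
             "Address", "Accident_ID", "ROLE"],
)
```
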
The triples I have to form follow this pattern: [![enter image description here][2]][2]
What is the Python code for this? I have written the following, but I get an error saying that `Witness` is not defined:

    df3 = df1.merge(df2, on='Accident_ID')
    df3["train"] = df3.Accident_ID < 5
    df3["train"].value_counts()
    triples = []
    for _, row in df3[df3["train"]].iterrows():
        if row["ROLE"] == "Driver":
            if row["User_ID"] == row["DriverID_1"]:
                Drives = (row["User_ID"], row["CarID_1"], "Drives")
            elif row["User_ID"] == row["DriverID_2"]:
                Drives = (row["User_ID"], row["CarID_2"], "Drives")
        else:
            Witness = (row["User_ID"], row["Accident_ID"], "Witness")
        Involved_in_first = (row["CarID_1"], row["Accident_ID"], "Involved in")
        Involved_in_second = (row["CarID_2"], row["Accident_ID"], "Involved in")
        Happened_in = (row["Accident_ID"], row["Location"], "Happened in")
        Lives_in = (row["User_ID"], row["Address"], "Lives in")
        triples.extend((Drives, Witness, Involved_in_first, Involved_in_second, Happened_in, Lives_in))
    triples_df = pd.DataFrame(triples, columns=["Source", "Target", "Edge"])
    triples_df.shape

You should do it like this, and repeat the same process for the remaining edges:
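
A minimal sketch of that idea, not a definitive implementation: the error arises because `triples.extend(...)` references both `Drives` and `Witness` on every row, although each row assigns only one of them. Appending each triple inside the branch that creates it avoids the problem. The helper name `build_triples`, the final `drop_duplicates()` step, and the reduced sample frames are assumptions added for illustration; the `train` filter from the question is left out for brevity.

```python
import pandas as pd


def build_triples(accidents: pd.DataFrame, people: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a (Source, Target, Edge) frame."""
    merged = accidents.merge(people, on="Accident_ID")
    triples = []
    for _, row in merged.iterrows():
        # Role-specific edge: append it in the branch where it is defined.
        if row["ROLE"] == "Driver":
            if row["User_ID"] == row["DriverID_1"]:
                triples.append((row["User_ID"], row["CarID_1"], "Drives"))
            elif row["User_ID"] == row["DriverID_2"]:
                triples.append((row["User_ID"], row["CarID_2"], "Drives"))
        else:
            triples.append((row["User_ID"], row["Accident_ID"], "Witness"))
        # Edges that exist for every row of the merged frame.
        triples.append((row["CarID_1"], row["Accident_ID"], "Involved in"))
        triples.append((row["CarID_2"], row["Accident_ID"], "Involved in"))
        triples.append((row["Accident_ID"], row["Location"], "Happened in"))
        triples.append((row["User_ID"], row["Address"], "Lives in"))
    result = pd.DataFrame(triples, columns=["Source", "Target", "Edge"])
    # Accident-level edges repeat once per person, so keep one copy of each.
    return result.drop_duplicates().reset_index(drop=True)


# Reduced sample taken from the question's data (accidents 1 and 2 only).
accidents = pd.DataFrame({
    "Accident_ID": [1, 2],
    "Location": ["Tartu", "Tallin"],
    "CarID_1": [1000, 1002],
    "CarID_2": [1001, 1003],
    "DriverID_1": [1, 2],
    "DriverID_2": [3, 5],
})
people = pd.DataFrame({
    "User_ID": [1, 2, 3, 5, 1],
    "Address": ["Narva 108, Tartu", "Tilgi 49, Tartu", "Piik 67, Tartu",
                "Kalevi 25, Tartu", "Narva 108, Tartu"],
    "Accident_ID": [1, 2, 1, 2, 2],
    "ROLE": ["Driver", "Driver", "Driver", "Driver", "Witness"],
})

triples_df = build_triples(accidents, people)
print(triples_df)
```

Because the merged frame has one row per person, the accident-level edges ("Involved in", "Happened in") would otherwise be emitted once for each person in the same accident; remove the `drop_duplicates()` call if repeated edges are wanted.
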
Output: