I have a piece of code that I want to optimize:
nan_rows = df.loc[df.Open.isna()].index
for i in nan_rows:
    df.Open.iloc[i] = df.Close.iloc[i-1]
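It assigns to each NaN value the previous row's value from another column. I find this code slow, and I often have to apply this method to much larger DataFrames. Is there any way to optimize it? Thanks.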
The fillna() method is probably what you're looking for. Used in forward-fill mode, it fills the NaN values with the last valid value from a previous row.

limit = n specifies the number of consecutive NaN values to replace with the previous valid value. The default is None, which implies that forward-filling will continue indefinitely, as long as NaN-s are encountered consecutively.

Read more about it in the docs.
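As a rough sketch of how this could look in practice, the snippet below shows both a forward fill with a limit and a vectorized version of the loop from the question. The sample data is made up for illustration; only the column names Open and Close come from the question. It uses ffill() rather than fillna(method="ffill"), since recent pandas releases deprecate the method argument, and it assumes the DataFrame has a default integer index.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Open":  [1.0, np.nan, 3.0, np.nan, np.nan],
    "Close": [1.5, 2.5, 3.5, 4.5, 5.5],
})

# Forward fill within the same column. limit=1 fills at most one
# consecutive NaN after each valid value, so the last NaN stays NaN.
ffilled = df["Open"].ffill(limit=1)

# Vectorized version of the loop in the question: where Open is NaN,
# take the Close value from the previous row (shift(1) moves Close down by one).
filled = df["Open"].fillna(df["Close"].shift(1))

print(ffilled.tolist())
print(filled.tolist())
```

Note that a plain forward fill reuses values from the same column, so the shift-based variant is the one that matches the loop in the question. One difference to be aware of: if the first row of Open is NaN, the shifted column has no previous row and leaves it as NaN, whereas the original loop would wrap around to the last row via iloc[-1].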