I have a generator, something like this:
import numpy as np

attn = [[1, 2, 3, 4, 5, 6], [11, 2, 23, 4, 5, 6], [1, 12, 3, 4, 5, 6], [1, 21, 3, 4, 51, 6], [1, 12, 13, 4, 5, 6]]

def get_weights():
    for i in range(10000000):
        yield np.array(attn[i%5]) * (i**2)%103//3
I need to get the value at a random index.
i_f = get_weights()
The fastest solution, it seems to me, is to convert the generator to a list and then access the specific index in O(1), but for a longer list that is not feasible. (Also, won't the list() operation itself be O(N) to build the list in the first place?)
I found from some relevant answers that itertools.islice is a better approach.
So I tried:
import time
from itertools import islice

t1 = time.time()
print(list(islice(i_f, 9000000,9000001,1)))
t2 = time.time()
print(t2-t1)
[array([10, 20, 30, 5, 15, 25], dtype=int64)]
43.81341481208801
It took about 43 seconds, which is very slow. So I thought that if I could make the generator bi-directional (or, more generally, create an iterator that can start from any location and move in either direction), things would be faster. If the location is above the middle index (N//2), I can iterate in reverse; otherwise I use the regular generator (in a simple sense).
My naive approach is as follows:
import time
from itertools import islice

import numpy as np

attn = [[1, 2, 3, 4, 5, 6], [11, 2, 23, 4, 5, 6], [1, 12, 3, 4, 5, 6], [1, 21, 3, 4, 51, 6], [1, 12, 13, 4, 5, 6]]

def get_weights():
    for i in range(10000000):
        yield np.array(attn[i%5]) * (i**2)%103//3

def get_weights_rev():
    for i in range(10000000,-1,-1):
        yield np.array(attn[i%5]) * (i**2)%103//3

i_f = get_weights()
i_b = get_weights_rev()

# time the forward generator
t1 = time.time()
print(list(islice(i_f, 9000000,9000001,1)))
t2 = time.time()
print(t2-t1)

# time the reverse generator
t1 = time.time()
print(list(islice(i_b, 9000000,9000001,1)))
t2 = time.time()
print(t2-t1)
[array([10, 20, 30, 5, 15, 25], dtype=int64)]
43.81341481208801
[array([ 2, 5, 8, 10, 13, 16], dtype=int64)]
43.59048891067505
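Surprisingly, the time did not go down at all. I expected the second run to be about 9 times faster. Can anyone explain this behavior, and how to index a generator efficiently in a more natural/Pythonic way?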
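N.B.: I actually have a different problem, about accessing a random layer of a network through a generator, but this seemed like a good dummy example to illustrate the issue.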
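
To make the direction-picking idea described above concrete, here is a minimal sketch on a small stand-in generator that simply yields its own index. The names `forward`, `backward`, and `item_at` are hypothetical and not part of my code above. It assumes the reverse generator starts at n - 1, and that the number of items to skip is counted from whichever end is used (so n - 1 - idx for the reverse one), since islice still has to step through every skipped item.

```python
from itertools import islice

N = 20  # small stand-in for the 10,000,000 items in the example above


def forward(n=N):
    # Yields 0, 1, ..., n - 1
    for i in range(n):
        yield i


def backward(n=N):
    # Yields n - 1, n - 2, ..., 0
    for i in range(n - 1, -1, -1):
        yield i


def item_at(idx, n=N):
    """Return (value, items_skipped) for position idx, starting from the nearer end."""
    if idx < n // 2:
        skip, gen = idx, forward(n)
    else:
        # Position idx is (n - 1 - idx) steps away from the end
        skip, gen = n - 1 - idx, backward(n)
    # islice consumes `skip` items one by one before yielding the target
    return next(islice(gen, skip, None)), skip


print(item_at(3))   # (3, 3)
print(item_at(18))  # (18, 1)
```

With this offset, the worst case is about n // 2 skipped items rather than n, though each lookup still needs a fresh generator and is linear in the number of skipped items.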