After calling flush() on a stream, how do I know when the disk is ready?

I have a file called foo.txt with 1 billion (1B) rows.

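I want to apply an operation to each row that produces 10 new rows, which will be written to bar.txt. The expected output is about 10B rows.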

To improve speed and I/O throughput, foo.txt is on DiskA and bar.txt is on DiskB (physically separate drives).

DiskB will be the limiting factor. Because there are so many rows to write, I added a large buffer for the writes to DiskB.

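My question: when I call flush() on the file on DiskB, the file handle's buffer is flushed to the hard drive. It looks like a non-blocking call, since the command returns, yet I can still see the disk writing with its busy indicator at 100%. A few seconds later the indicator drops back to 0%. Is there a way in Python to wait for the disk to finish? Ideally, I would like flush() to be a blocking call. The only solution I see right now is to add an arbitrary sleep() and hope the disk is ready.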

Here's a snippet to illustrate (in practice it's a bit more complicated, since bar.txt is not a single file but thousands of files, so the I/O efficiency is very poor):

import io

with open('bar.txt', 'w', buffering=100 * io.DEFAULT_BUFFER_SIZE) as w:
    with open('foo.txt') as r:
        for line in r:
            # writes each line of foo 10 times in bar.
            for i in range(10):
                w.write(line)
            # w.flush()
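
For reference, here is a minimal sketch of what a blocking flush might look like, assuming os.fsync() is the relevant call here. flush() only hands Python's buffer to the operating system, while os.fsync() asks the OS to write its cached data for that file to the device and returns once that request completes. The function name write_and_sync and the repeat parameter are hypothetical, made up for this illustration. Whether the drive's own write cache is also emptied depends on the OS and hardware, so I can't be sure this covers every case.

```python
import io
import os


def write_and_sync(path, lines, repeat=10):
    """Write each line `repeat` times, then wait for the OS to sync the file.

    Hypothetical helper for illustration only.
    """
    with open(path, 'w', buffering=100 * io.DEFAULT_BUFFER_SIZE) as w:
        for line in lines:
            for _ in range(repeat):
                w.write(line)
        # flush() moves data from Python's buffer into the OS cache.
        w.flush()
        # os.fsync() blocks until the OS reports the file's data as written.
        os.fsync(w.fileno())
```

With the files from the snippet above, this could be called as write_and_sync('bar.txt', open('foo.txt')). Calling os.fsync() after every line would probably be very slow, so syncing once per file (or once per large batch) seems like the more sensible trade-off.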
