How do I write and encode a file as UTF-8 in Python?

import io

def write_ngrams(table, filename):

    with io.open(filename, "w") as file:
        for i in table:
            outputstring=(('%d %s\n' % (table[i], i)))
            encoded = outputstring.encode("utf-8")
            file.write(encoded)

tabel = ngram_table('hiep, hiep, hoera!', 3, 0)  # these are not really interesting for now

write_ngrams(tabel, "testfile3.txt")

I get an error on the file.write(encoded) line, which says:

TypeError: write() argument must be str, not bytes.

But my assignment says the output must be UTF-8 encoded,

which means the output should be of the form b'....'

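With the approaches I have tried, I only get an unencoded string or an error. When I use print(encoded) I do get UTF-8 encoded output, but when I write it to the file, either the encoding disappears or I get an error.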

Any hints would be much appreciated.

Comments
  • dquis

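    Pass the str straight to write() and open the file with encoding set to 'utf-8'. A file opened in text mode accepts only str and encodes it to UTF-8 bytes on disk for you, so the explicit .encode() call is what triggers the TypeError.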

    import io
    
    def write_ngrams(table, filename):
    
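        # Text mode with an explicit encoding: write() takes str and the
        # file object encodes it to UTF-8 as it is written to disk.
        with io.open(filename, "w", encoding='utf-8') as file:
            for i in table:
                outputstring = '%d %s\n' % (table[i], i)
                file.write(outputstring)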
    
    tabel = ngram_table('hiep, hiep, hoera!', 3, 0)  # these are not really interesting for now
    
    write_ngrams(tabel, "testfile3.txt")
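
To see why a text-mode file with encoding='utf-8' already satisfies a "must be UTF-8 encoded" requirement, it may help to read the file back in binary mode and compare it with a binary-mode write. The following is only a minimal sketch: the small dict stands in for the result of ngram_table() (whose real output is not shown in the question), and the function names, file names and temporary directory are made up for illustration.

```python
import os
import tempfile

# Hypothetical stand-in for the n-gram table; the real one comes from ngram_table().
# "\u00e9" is the character e-acute, which takes two bytes in UTF-8.
table = {"hi\u00e9": 2, "oer": 1}

def write_ngrams_text(table, filename):
    # Text mode: write() takes str, and the file object encodes it to UTF-8.
    # newline="\n" turns off platform newline translation so the bytes are predictable.
    with open(filename, "w", encoding="utf-8", newline="\n") as f:
        for ngram, count in table.items():
            f.write("%d %s\n" % (count, ngram))

def write_ngrams_binary(table, filename):
    # Binary mode: write() takes bytes, so each line is encoded by hand.
    with open(filename, "wb") as f:
        for ngram, count in table.items():
            f.write(("%d %s\n" % (count, ngram)).encode("utf-8"))

tmpdir = tempfile.mkdtemp()
text_path = os.path.join(tmpdir, "text_mode.txt")
binary_path = os.path.join(tmpdir, "binary_mode.txt")
write_ngrams_text(table, text_path)
write_ngrams_binary(table, binary_path)

# Reading in binary mode shows the raw bytes that ended up on disk.
with open(text_path, "rb") as f:
    text_bytes = f.read()
with open(binary_path, "rb") as f:
    binary_bytes = f.read()

print(text_bytes)                  # b'2 hi\xc3\xa9\n1 oer\n'
print(text_bytes == binary_bytes)  # True
```

Both approaches put the same bytes on disk. The b'...' form is just how Python displays a bytes object when printed; it is not something that is stored in the file, so the encoding does not "disappear" when the text-mode version is used.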