Hi, I am trying to use unicodedata in Python 3.7 on Linux, but it fails. Any help is highly appreciated.
I have been searching online for the same problem, but could not find any hints pointing in the right direction.
My problem: I call unicodedata.name(string) and get this error:
TypeError: name() argument 1 must be a unicode character, not str
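For reference, the same TypeError can be reproduced without the emoji package. This is a minimal sketch using only the standard library; describe is a hypothetical helper name, not part of unicodedata. It suggests that unicodedata.name() accepts a string of exactly one code point and rejects anything longer:

```python
import unicodedata


def describe(text):
    """Return the Unicode name of text, or the TypeError message if it is rejected."""
    try:
        return unicodedata.name(text)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


print(describe("\U0001F600"))  # one code point: accepted
print(describe("ab"))          # two code points: rejected with a TypeError
```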
Minimal working example:
#!/usr/bin/env python3

import re
import emoji
import unicodedata


def replace_emoji(document):
    emoji_all = emoji.EMOJI_ALIAS_UNICODE.items()
    emoji_items = []
    emoji_pattern = re.compile(u'|'.join(
        re.escape(u[1]) for u in emoji_all), flags=re.UNICODE)
    emoji_items = re.findall(emoji_pattern, document)
    for item in emoji_items:
        unicodes = []
        unicode_values = []
        for char in range(len(item)):
            if not len(item) > 1:
                unicodes.append(r'{:x}'.format(ord(item[char])).upper())
            unicode_values.append([hex(ord(x)) for x in item[char]][0])
        char_length = len(unicode_values)
        chars = [chr(int(u, 16)) for u in unicode_values]
        if char_length == 2:
            print(chars)
            value = u'\\U{:x}\\U{:x}'.format(
                ord(chars[0]), ord(chars[1])).upper()
            unicodedata.name(value)
    return document
My test run:
print(replace_emoji(u'?????????????????????????????'))
I believe you can treat all emoji characters as ordinary characters in Python 3.
I can't test the code atm, but I think that should do it.
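A rough sketch of that idea, assuming the goal is simply to get a name for each part of a multi-code-point emoji. emoji_names and the "UNKNOWN" placeholder are hypothetical names chosen for illustration. Instead of building a '\\U...' text string, it passes each character of the matched emoji to unicodedata.name() one at a time:

```python
import unicodedata


def emoji_names(text):
    """Return the Unicode name of every code point in text."""
    # The second argument is a fallback, so unnamed code points
    # do not raise a ValueError.
    return [unicodedata.name(ch, "UNKNOWN") for ch in text]


# A flag emoji is made of two regional indicator symbols (here: D + E).
flag = "\U0001F1E9\U0001F1EA"
print(emoji_names(flag))
```

In the question's code, value is a string of literal backslashes and hex digits, not a single character, which is probably why name() rejects it. Iterating over item directly avoids that conversion altogether.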