Removing whitespace with a regular expression in Python

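I'm trying to modify each line of a file to remove any part that starts with the character '(' or that has digits/characters inside square brackets, e.g. '[2]':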

f = open('/Users/name/Desktop/university_towns.txt',"r")
listed = []
import re 
for i in f.readlines():
    if i.find(r'\(.*?\)\n'): 
       here = re.sub(r'\(.*?\)\[.*?\]\n', "", i)
       listed.append(here)
    elif i.find(r' \(.*?\)\n'):
       here = re.sub(r' \(.*?\)\[.*?\]\n', "", i)
       listed.append(here)
    elif i.find(r' \[.*?\]\n'): 
       here = re.sub(r' \[.*?\]\n', "", i)
       listed.append(here) 
    else:
       here = re.sub(r'\[.*?\]\n', "", i)
       listed.append(here)

A sample of my input data:

Platteville (University of Wisconsin–Platteville)[2]
River Falls (University of Wisconsin–River Falls)[2]
Stevens Point (University of Wisconsin–Stevens Point)[2]
Waukesha (Carroll University)
Whitewater (University of Wisconsin–Whitewater)[2]
Wyoming[edit]
Laramie (University of Wyoming)[5]

A sample of my output data:

Platteville 
River Falls 
Stevens Point 
Waukesha (Carroll University)
Whitewater 
Wyoming[edit]
Laramie 

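However, I don't want parts such as "(Carroll University)" or "[edit]" in the output either.

How should I modify the pattern?

If anyone could give me any advice, I would be very grateful!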

昔其雨

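You can do:

import re

# ur_file is a placeholder for the path to your input file
with open(ur_file) as f_in:
    for line in f_in:
        # capture everything before the first '(' or '['
        if m := re.search(r'^([^(\[]+)', line):  # Python 3.8+
            print(m.group(1))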

Prints:

Platteville 
River Falls 
Stevens Point 
Waukesha 
Whitewater 
Wyoming
Laramie 
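
Note that the captured text still ends with the space that preceded the '(' (for example "Platteville "). The following is only a minimal sketch of one way to collect trimmed names into a list, assuming the lines look like the sample in the question; `sample` and `clean` are hypothetical names introduced here for illustration, and the input is an in-memory list rather than the real file:

```python
import re

# Hypothetical sample, copied from the input data in the question
sample = [
    "Platteville (University of Wisconsin–Platteville)[2]\n",
    "Waukesha (Carroll University)\n",
    "Wyoming[edit]\n",
    "Laramie (University of Wyoming)[5]\n",
]

def clean(line):
    """Return the text before the first '(' or '[', trimmed of whitespace."""
    m = re.match(r'[^(\[]+', line)
    # No match means the line starts with '(' or '[' (or is empty)
    return m.group(0).strip() if m else ""

listed = [clean(line) for line in sample]
print(listed)  # ['Platteville', 'Waukesha', 'Wyoming', 'Laramie']
```

Calling `.strip()` on the match removes both the trailing space and, for lines without any bracket, the trailing newline.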
透心凉

Use this regex instead:

\(.*\)|\[.*\]

Like this:

re.sub(r'\(.*\)|\[.*\]', '', i)

This replaces anything in parentheses (\(.*\)) or (|) anything in square brackets (\[.*\]) with an empty string.
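
As a rough sketch of how that substitution might be used on the sample lines (the function name `strip_annotations` is hypothetical, chosen only for this example), the result can be trimmed afterwards, since the substitution leaves the space before the '(' and the newline in place:

```python
import re

# Pattern from the answer: a parenthesised span or a bracketed span
pattern = re.compile(r'\(.*\)|\[.*\]')

def strip_annotations(line):
    # Remove the matched spans, then trim the leftover space and newline
    return pattern.sub('', line).strip()

print(strip_annotations("Platteville (University of Wisconsin–Platteville)[2]\n"))
print(strip_annotations("Waukesha (Carroll University)\n"))
print(strip_annotations("Wyoming[edit]\n"))
```

Because `.*` is greedy, a line containing two separate parenthesised groups would also lose the text between them; `.*?` avoids that if such lines can occur. This approach also makes the `i.find(...)` branches in the question unnecessary: `str.find` looks for a literal substring rather than a regex and returns -1 (which is truthy) when nothing is found, so for the sample data the first branch was always taken.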
