Python Question: Cannot save data from a website to a .txt file

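I'm having trouble using Python to write raw data scraped from a website to a .txt file. I've looked through different questions here but still can't get my code to work. Any suggestions? I can get it to print what I want, but for the life of me I can't figure out how to simply write it to a .txt file.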

#PACKAGES WE WILL NEED FOR THIS PROJECT
import csv
import re
import requests
import pprint
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

#CREATE VARIABLE FOR LINK TO PAGE WE WILL WEBSCRAPE
base_censuspage = "https://www.census.gov/programs-surveys/popest.html"

#EXTRACT DATA FROM WEBPAGE
r = requests.get(base_censuspage)
htmlcontent = r.text
soup = BeautifulSoup(htmlcontent,'html.parser')
links_array = []

#FIND LINKS TO OTHER PAGES AND ADD THEM TO LIST
for link in soup.find_all('a',attrs={'href':re.compile(r'html')}):
    links_array.append(urljoin(base_censuspage,link.get('href')))

#REMOVE DUPLICATES AND PRINT LIST TO VERIFY DUPLICATES WERE REMOVED
unique_links = set(links_array)
pprint.pprint(unique_links)
pprint.pprint(htmlcontent)

#SAVE UNIQUE LINKS TO CSV FILE, ONE LINK PER ROW
with open("C996PROJECTASSESSMENTCSVFILE.CSV","w",newline="") as f:
    wr = csv.writer(f)
    wr.writerows([link] for link in sorted(unique_links))

#SAVE TO TXT FILE
with open('webscrapeddata.txt','w') as f:
    f.write(htmlcontent)

24K纯贱

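I ran the program, and it throws a UnicodeEncodeError. htmlcontent contains characters that the file's default encoding cannot represent: when open() is called without an encoding argument, Python uses the platform's locale encoding (for example cp1252 on many Windows setups), not necessarily UTF-8. To fix it, pass an explicit encoding argument to open():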

#SAVE TO TXT FILE
with open('webscrapeddata.txt','w', encoding='utf-8') as f: # <-- HERE
    f.write(htmlcontent)
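
To see the failure and the fix without hitting the network, here is a minimal sketch. The sample string, the save_text helper, and the temp-file path are made up for illustration and are not part of the original program. ASCII stands in for any narrow default encoding that cannot represent a character in the page text.

```python
import os
import tempfile

# Hypothetical stand-in for the scraped page text: a str that contains
# one character (a right arrow) outside the ASCII range.
htmlcontent = "<p>Population estimate \u2192 2019</p>"


def save_text(path, text, encoding="utf-8"):
    """Write text to path with an explicit encoding (illustrative helper)."""
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


# Hypothetical output location in the system temp directory.
out_path = os.path.join(tempfile.gettempdir(), "webscrapeddata_demo.txt")

# A narrow encoding cannot represent the arrow, so the write fails the same
# way the original script did under a non-UTF-8 default encoding.
try:
    save_text(out_path, htmlcontent, encoding="ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# With UTF-8 the same text is written without error.
save_text(out_path, htmlcontent)

# Read the file back with the same encoding to confirm nothing was lost.
with open(out_path, encoding="utf-8") as f:
    round_trip = f.read()

print(ascii_failed, round_trip == htmlcontent)
```

Whatever reads webscrapeddata.txt later should also open it with encoding='utf-8', otherwise the same mismatch can show up as a UnicodeDecodeError or as garbled characters.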