两部分的问题。我正在尝试从互联网档案下载多个存档的Cory Doctorow播客。我的iTunes提要中未包含的旧内容。我已经编写了脚本,但是下载的文件格式不正确。
问题1-如何更改以下载zip mp3文件? 问题2-将变量传递到URL的更好方法是什么?
# and the base url.
def dlfile(file_name,file_mode,base_url):
from urllib2 import Request, urlopen, URLError, HTTPError
#create the url and the request
url = base_url + file_name + mid_url + file_name + end_url
req = Request(url)
# Open the url
try:
f = urlopen(req)
print "downloading " + url
# Open our local file for writing
local_file = open(file_name, "wb" + file_mode)
#Write to our local file
local_file.write(f.read())
local_file.close()
#handle errors
except HTTPError, e:
print "HTTP Error:",e.code , url
except URLError, e:
print "URL Error:",e.reason , url
# Set the range
var_range = range(150,153)
# Iterate over image ranges
for index in var_range:
base_url = 'http://www.archive.org/download/Cory_Doctorow_Podcast_'
mid_url = '/Cory_Doctorow_Podcast_'
end_url = '_64kb_mp3.zip'
#create file name based on known pattern
file_name = str(index)
dlfile(file_name,"wb",base_url
该脚本是从这里改编的
最佳答案
Here's how I'd deal with the url building and downloading. I'm making sure to name the file as the basename of the url (the last bit after the trailing slash) and I'm also using the with
clause for opening the file to write to. This uses a ContextManager which is nice because it will close that file when the block exits. In addition, I use a template to build the string for the url. urlopen
doesn't need a request object, just a string.
import os
from urllib2 import urlopen, URLError, HTTPError
def dlfile(url):
# Open the url
try:
f = urlopen(url)
print "downloading " + url
# Open our local file for writing
with open(os.path.basename(url), "wb") as local_file:
local_file.write(f.read())
#handle errors
except HTTPError, e:
print "HTTP Error:", e.code, url
except URLError, e:
print "URL Error:", e.reason, url
def main():
# Iterate over image ranges
for index in range(150, 151):
url = ("http://www.archive.org/download/"
"Cory_Doctorow_Podcast_%d/"
"Cory_Doctorow_Podcast_%d_64kb_mp3.zip" %
(index, index))
dlfile(url)
if __name__ == '__main__':
main()