Current code: finds the URLs of all the gyms and writes them to a CSV like this:
https://www.lifetime.life/life-time-locations/al-vestavia-hills.html
https://www.lifetime.life/life-time-locations/az-biltmore.html
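What I want it to do: I'm having trouble extracting the address from each URL. My attempt at the address part is on the 4th and 5th lines from the bottom of the code below. The exact error is: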
gymrow.append(address_line1[0].text)
IndexError: list index out of range
Code:
import csv
import time
import urllib2
import urlparse

import BeautifulSoup
import requests

initial_url = "https://www.lifetime.life"
request = urllib2.Request("https://www.lifetime.life/view-all-locations.html")
response = urllib2.urlopen(request)
soup = BeautifulSoup.BeautifulSoup(response)

with open('gyms2.csv', 'w') as gf:
    gymwriter = csv.writer(gf)
    for a in soup.findAll('a'):
        if '/life-time-locations/' in a['href']:
            gymurl1 = (urlparse.urljoin(initial_url, a.get('href')))
            sitemap_content = requests.get(gymurl1).content
            gymrow = [gymurl1]
            address_line1 = soup.select('p[class~=small m-b-sm p-t-1] > span[class~=btn-icon-text]')
            gymrow.append(address_line1[0].text)
            print(gymrow)
            gymwriter.writerow(gymrow)
            time.sleep(3)
[Image: inspect-element view showing the p class, the span class, and the address I want to scrape]
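For reference, here is a minimal sketch of one way the address lookup could be written. It is not a confirmed fix. Two things in the code above can leave the result list empty: `select` is called on `soup`, which holds the listing page rather than the gym page downloaded into `sitemap_content`, and `[class~=small m-b-sm p-t-1]` can never match, because `~=` compares against a single whitespace-separated word. The sketch assumes Python 3 with the `bs4` package; `extract_address` and `SAMPLE_HTML` are hypothetical names, and the sample markup is a guess modeled on the class names in the question, not the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the class names in the question;
# the real page structure may differ, and the address is a placeholder.
SAMPLE_HTML = """
<p class="small m-b-sm p-t-1">
  <span class="btn-icon-text">123 Example St, Sample City, AL 00000</span>
</p>
"""


def extract_address(html):
    """Return the first matching address text, or None if nothing matches."""
    # Parse the HTML of the individual gym page, not the listing page.
    soup = BeautifulSoup(html, "html.parser")
    # Chain the classes with dots so the <p> must carry all three of them.
    matches = soup.select("p.small.m-b-sm.p-t-1 > span.btn-icon-text")
    if not matches:
        # An empty list is what triggers the IndexError on matches[0].
        return None
    return matches[0].get_text(strip=True)


print(extract_address(SAMPLE_HTML))
print(extract_address("<p>no address here</p>"))
```

Inside the loop, this would be called as `extract_address(sitemap_content)`. A `None` result can then be written as an empty cell instead of crashing the run. If the function returns `None` for every gym page, the address may not be present in the downloaded HTML at all.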
Thank you very much!