In Python, the result file does not contain all of the values

I use this code to extract data from a website, but it does not get all of the data I need.

Example website link

https://www.noon.com/saudi-en/accessories-and-supplies?f[is_fbn]=1&sort[by]=price&sort[dir]=asc&limit=150&page=1

Result of running the code

price,title,sku, 
0.75,Lightning Cable For Apple iPhone7 6centimeter WhiteSAR 0.75,
1.30,AirPods Strap WhiteSAR 1.30SAR 470% Off,
1.35,Anti-Lost Sport Silicone Strap Cable For Apple AirPods WhiteSAR 1.35SAR 354% Off,
1.40,Micro USB Fast Charging Cable 1meter WhiteSAR 1.40,
.
.
.
.
.
. #rest of results

Please note that the result is not neat: there is extra information that should be in separate columns. For example, the title should be only AirPods Strap White, not AirPods Strap WhiteSAR 1.30SAR 470% Off.
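A minimal sketch of the title cleanup meant here, assuming the unwanted price text always starts at the first "SAR" and that no product name contains "SAR" itself. The helper name clean_title is hypothetical and is not part of the code further down.

```python
def clean_title(raw_title):
    # Keep only the text before the first "SAR" price marker.
    # Assumption: the product name itself never contains "SAR".
    name, _separator, _rest = raw_title.partition("SAR")
    return name.strip()


print(clean_title("AirPods Strap WhiteSAR 1.30SAR 470% Off"))
```

If a title has no "SAR" in it, partition returns the whole string unchanged, so plain titles pass through as they are.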

Also, I cannot capture the sku value, which is buried at the end of the page along with other valuable data (and I do not know how to get it):

{"offer_code":"dd3125025109fb4d","sku":"N15614801A","sku_config":"N15614801A","brand":null,"name":"AirPods Strap White","plp_specifications":{},"price":4.4,"sale_price":1.3,"url":"airpods-strap-white","image_key":"v1532025662/N15614801A_1","is_buyable":true,"flags":["fbn","prepaid"]},
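To illustrate the values I am after, here is a rough sketch that parses one such object with the standard json module and picks out the wanted fields. It assumes the object has already been isolated as a string (without the trailing comma); how to locate it in the page source is exactly the part I do not know. The helper name pick_fields is hypothetical.

```python
import json

# One product object copied from the end of the page (trailing comma removed).
sample = '{"offer_code":"dd3125025109fb4d","sku":"N15614801A","sku_config":"N15614801A","brand":null,"name":"AirPods Strap White","plp_specifications":{},"price":4.4,"sale_price":1.3,"url":"airpods-strap-white","image_key":"v1532025662/N15614801A_1","is_buyable":true,"flags":["fbn","prepaid"]}'


def pick_fields(record):
    # Map the JSON keys onto the wanted columns; "name" becomes "title".
    # dict.get returns None for a missing key instead of raising KeyError.
    return {
        "price": record.get("price"),
        "title": record.get("name"),
        "sku": record.get("sku"),
        "offer_code": record.get("offer_code"),
        "brand": record.get("brand"),
        "sale_price": record.get("sale_price"),
    }


row = pick_fields(json.loads(sample))
print(row)
```

Note that "brand" is null in this sample, so it comes back as None in Python.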

It would be very helpful if the result file contained all of these values (this is what I am looking for):

price,title,sku,offer_code,brand,sale_price
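A small sketch of how rows with these columns could be written using the standard csv module, which also quotes any title that contains a comma. The sample row is taken from the JSON shown above, and the helper name rows_to_csv is hypothetical. It writes to an in-memory buffer here only to keep the example self-contained.

```python
import csv
import io

COLUMNS = ["price", "title", "sku", "offer_code", "brand", "sale_price"]


def rows_to_csv(rows):
    # Build the CSV text in memory; a real script would pass an open file
    # (opened with newline="") to DictWriter instead of a StringIO buffer.
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample_rows = [
    {
        "price": 4.4,
        "title": "AirPods Strap White",
        "sku": "N15614801A",
        "offer_code": "dd3125025109fb4d",
        "brand": None,  # written as an empty cell
        "sale_price": 1.3,
    }
]

csv_text = rows_to_csv(sample_rows)
print(csv_text)
```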

This is the Python code I use

from bs4 import BeautifulSoup as soup
from concurrent.futures import ThreadPoolExecutor
import requests

number_of_threads = 6
out_filename = "noonresult.csv"
headers = "price,title,sku, \n"

def extract_data_from_url_func(url):
    # Fetch one listing page and return its products as comma-joined lines
    print(url)
    response = requests.get(url)
    page_soup = soup(response.text, "html.parser")

    # Each product card on the listing page
    containers = page_soup.find_all('div',{'class' : 'jsx-3152181095 productContainer'})
    output = ''
    for container in containers:
        # Look each element up once; fall back to "" when it is missing
        price_tag = container.find('span',{'class':'value'})
        title_tag = container.find('div',{'class':'jsx-866269109 detailsContainer'})
        link_tag = container.find('div',{'class':'jsx-866269109 wrapper listView'})

        price = price_tag.text if price_tag else ""
        title = title_tag.text if title_tag else ""
        # This is the product link (href), used here in place of the real sku
        sku = link_tag.a['href'] if link_tag else ""

        output_list = [price,title,sku,]
        output = output + ",".join(output_list) + "\n"
        print(output)

    return output

with open("speednoon.txt", "r") as fr:
    URLS = list(map(lambda x: x.strip(), fr.readlines()))

with ThreadPoolExecutor(max_workers=number_of_threads) as executor:
    results = executor.map( extract_data_from_url_func, URLS)
    responses = []
    for result in results:
        responses.append(result)


with open(out_filename, "w", encoding='utf-8-sig') as fw:
  fw.write(headers)
  for response in responses:
      fw.write(response + "\n")