Dynamic data-reactid when web scraping Yahoo Finance

I'm new to web scraping with BS4 and need your help.

My goal is to scrape the "Total Debt" row from Yahoo Finance at the following URL:

https://finance.yahoo.com/quote/AAPL/key-statistics/

I tried the following code without success:

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

import requests
from bs4 import BeautifulSoup

res = requests.get("https://finance.yahoo.com/quote/AAPL")
html = res.text
soup = BeautifulSoup(html, "html.parser")
# Look up the span by its data-reactid value (591 on the AAPL page)
total_debt = soup.find("span", {"data-reactid": "591"})
print("total debt is", total_debt)

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Another thing I noticed is that if I try to scrape a different ticker (Facebook, FB) using the following URL:

https://finance.yahoo.com/quote/FB/key-statistics/

the data-reactid for that row is no longer 591.

Can anyone offer some insight? Thanks for your help!

Comments
  • 妾随

    That reactid will always change. Either find a class you can use to identify that row (or similar rows), or use a package called yahooquery, which can retrieve that data pretty easily:

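    from yahooquery import Ticker

    aapl = Ticker('aapl')
    data = aapl.financial_data
    # Recent yahooquery versions key the result by symbol, so this
    # lookup may need to be data['aapl']['totalDebt'] instead.
    data['totalDebt']
    118760996864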
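
For the first option in the reply above (identifying the row by something more stable than data-reactid), one possible approach is to match on the visible row label instead. The following is only a minimal sketch: `SAMPLE_HTML` is hypothetical, simplified markup with placeholder values, the helper name `find_stat` is made up for illustration, and the real Yahoo Finance page may be structured differently or may not include this table in the raw HTML at all.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the key-statistics table.
# The data-reactid values and cell contents are placeholders.
SAMPLE_HTML = """
<table>
  <tr><td><span data-reactid="580">Total Cash (mrq)</span></td><td data-reactid="583">N/A</td></tr>
  <tr><td><span data-reactid="588">Total Debt (mrq)</span></td><td data-reactid="591">118.76B</td></tr>
</table>
"""


def find_stat(html, label):
    """Return the value cell text for the first row whose label starts with `label`."""
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.find_all("tr"):
        cells = row.find_all("td")
        # The first cell holds the label, the second holds the value.
        if len(cells) >= 2 and cells[0].get_text(strip=True).startswith(label):
            return cells[1].get_text(strip=True)
    return None  # label not found in this document


print(find_stat(SAMPLE_HTML, "Total Debt"))  # prints 118.76B
```

Because the lookup depends only on the label text, it would not be affected if the data-reactid numbers differ between tickers, which is the situation described in the question.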