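I'm trying to write some code that will scrape data for different asset classes. The user enters the name of the stock they want information on, and the code then pulls the relevant quantitative data for that particular stock.
I was thinking of using VBA, since outputting the data to an Excel spreadsheet is easy. But I could use another language if that would be easier (I know a little Python).
The problem I'm running into is that there doesn't seem to be a unique tag I can look up to get at the data table I want.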
E.g. I could call GetElementsByClassName("clear"), but many of these exist. I thought maybe I could access the table by its own class name instead. I have two issues with this:
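1) I can't use it in VBA, because it doesn't seem to like the spaces in this class name.
2) Even if I could use it, I don't know whether the class name is the same for overstocks...
Any advice is much appreciated. This is my most complicated VBA project so far.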
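That element has 3 classes; a CSS class name never contains spaces, so the spaces in the attribute separate one class name from the next.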
You could call GetElementsByClassName("companyFinancialSumaryTbl"); that would get you a collection of nodes that includes the <table> element, presumably as the first and only item. From there you can get the <tbody> child element, then iterate its <tr> children, and in each row iterate the <td> child nodes; when a <td> has the bold class, you know you're looking at a row heading.
We don't know either! If there's another table to read data from, it probably doesn't have the companyFinancialSummaryTbl class and likely has some overstocksTbl class instead; either way, it'll be a <table> element with child nodes that you can navigate and iterate.
I'm not super familiar with web scraping, but say you have the <table> element in an object e; then you could conceivably get the <tbody> element like this:
If that works, then this should work too:
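For comparison, since the question mentions Python as an alternative, the same walk can be sketched with Python's standard-library html.parser. Treat this as a sketch under assumptions rather than a tested scraper: the sample markup, the extra class names in it, and the helper names TableRowCollector and extract_rows are all made up for illustration; only the companyFinancialSummaryTbl and bold class names come from the discussion above. It splits the class attribute on whitespace (which is why a "class name with spaces" is really several class names), then collects each row's cells and flags the bold ones as row headings.

```python
from html.parser import HTMLParser


class TableRowCollector(HTMLParser):
    """Collect the rows of every <table> that carries a given class name."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.in_table = False  # True while inside a matching <table>
        self.row = None        # row being built, or None outside a <tr>
        self.cell = None       # cell being read, or None outside a <td>/<th>
        self.rows = []         # finished rows: lists of (text, is_heading)

    def handle_starttag(self, tag, attrs):
        # class="a b c" means three separate class names, so split on whitespace.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "table" and self.wanted_class in classes:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.row = []
        elif self.row is not None and tag in ("td", "th"):
            self.cell = {"text": [], "heading": "bold" in classes}

    def handle_data(self, data):
        # Only keep text that appears inside a cell of a matching table.
        if self.cell is not None:
            self.cell["text"].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell is not None:
            text = "".join(self.cell["text"]).strip()
            self.row.append((text, self.cell["heading"]))
            self.cell = None
        elif tag == "tr" and self.row is not None:
            self.rows.append(self.row)
            self.row = None
            self.cell = None
        elif tag == "table":
            # Simplification: nested tables are not handled.
            self.in_table = False
            self.row = None
            self.cell = None


def extract_rows(html, wanted_class):
    """Return the rows of all tables in `html` that have `wanted_class`."""
    parser = TableRowCollector(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample markup; a real page will differ.
SAMPLE_HTML = """
<table class="clear companyFinancialSummaryTbl example">
  <tbody>
    <tr><td class="bold">Revenue</td><td>100</td><td>120</td></tr>
    <tr><td class="bold">Net income</td><td>10</td><td>15</td></tr>
  </tbody>
</table>
<table class="clear other"><tr><td>ignored</td></tr></table>
"""

for row in extract_rows(SAMPLE_HTML, "companyFinancialSummaryTbl"):
    heading = [text for text, is_heading in row if is_heading]
    values = [text for text, is_heading in row if not is_heading]
    print(heading, values)
```

Asking for "clear" instead would match both tables in the sample, which is the "many of these exist" problem from the question; asking for the more specific class narrows it to the one table.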