How do I capture the URL while the loop is running?

    library(rvest)
    library(tidyverse)
    urls <- str_c("https://news.ycombinator.com/news?p=", seq(1, 2, 1))

    title <- urls %>% 
      map(
        gettitle <- function(df){
          read_html(df) %>% 
            html_nodes("a.storylink") %>% 
            html_text() %>% 
            enframe(name = NULL)
        }
      ) %>%  
      bind_rows()

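This gives a data frame with a single column. I want to add a second column holding the URL of the page that each row's title came from, like this: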

# A tibble: 6 x 2
  value                                                       url                                  
  <chr>                                                       <chr>                                
1 1k True Fans? Try 100                                       https://news.ycombinator.com/news?p=1
2 FLIF – Free Lossless Image Format                           https://news.ycombinator.com/news?p=1
3 Critical Bluetooth Vulnerability in Android (CVE-2020-0022) https://news.ycombinator.com/news?p=1
4 The Rapid Growth of Io_uring                                https://news.ycombinator.com/news?p=1
5 Show HN: Building an open-source language-learning platform https://news.ycombinator.com/news?p=1
6 TV Backlight Compensation                                   https://news.ycombinator.com/news?p=1
Comments
  • Wolf

    Here is one way. As you loop over the pages, build a two-column data frame for each one (the page URL and the titles found on it). map_dfr() then row-binds the per-page data frames into a single result.

    library(rvest)
    library(tidyverse)
    
    # For each page, return a tibble pairing the page URL with its story titles;
    # map_dfr() row-binds the per-page tibbles into one.
    map_dfr(.x = paste0("https://news.ycombinator.com/news?p=", 1:2),
            .f = function(x) {
              tibble(url = x,
                     title = read_html(x) %>%
                       html_nodes("a.storylink") %>%
                       html_text())
            })
    
       url                                   title
       <chr>                                 <chr>
     1 https://news.ycombinator.com/news?p=1 1k True Fans? Try 100
     2 https://news.ycombinator.com/news?p=1 Critical Bluetooth Vulnerability in Android (CVE-2020-0022)
     3 https://news.ycombinator.com/news?p=1 FLIF – Free Lossless Image Format
     4 https://news.ycombinator.com/news?p=1 The Rapid Growth of Io_uring
     5 https://news.ycombinator.com/news?p=1 Show HN: Building an open-source language-learning platform
     6 https://news.ycombinator.com/news?p=1 Why Google Might Prefer Dropping a $22B Business
     7 https://news.ycombinator.com/news?p=1 TV Backlight Compensation
     8 https://news.ycombinator.com/news?p=1 This person does not exist
     9 https://news.ycombinator.com/news?p=1 Angular 9.0
    10 https://news.ycombinator.com/news?p=1 Before the DNS: how yours truly upstaged the NIC's official HOSTS.TXT (2004)
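
If you would rather keep the map() followed by bind_rows() shape from the question, another option is to name each element of the URL vector after itself and let bind_rows() turn those names into a column through its .id argument. The following is only a minimal sketch: fake_titles() is a hypothetical stand-in for the real scraping step, used so the example runs without network access, and the story titles it produces are made up.

```r
library(purrr)
library(tibble)
library(dplyr)

# Hypothetical stub replacing the real scraping call, which would be roughly:
#   read_html(url) %>% html_nodes("a.storylink") %>% html_text() %>% enframe(name = NULL)
fake_titles <- function(url) {
  tibble(value = paste("story", 1:3, "from", url))
}

urls <- paste0("https://news.ycombinator.com/news?p=", 1:2)

titles <- urls %>%
  set_names() %>%          # name each element after its own URL
  map(fake_titles) %>%     # one tibble per page, stored in a named list
  bind_rows(.id = "url")   # the list names become the `url` column

titles
```

With the stub swapped for the real scraping function, each row should carry the URL of the page it was read from, because the identifier comes from the list names rather than from anything inside the scraping function.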