Passing a custom function with conditionals to dplyr::mutate

I have a dataset like the following:

library(dplyr)  # provides tibble(), %>%, lag() and lead()

seq <- tibble(REF  = c("A","C","G","T","C","G"),
              REF2 = c("A","G","G","A","C","G")) %>%
  dplyr::mutate(UP   = dplyr::lag(REF, n=1),
                DOWN = dplyr::lead(REF, n=1))

# A tibble: 6 x 4
#  REF   REF2  UP    DOWN 
#  <chr> <chr> <chr> <chr>
#1 A     A     NA    C    
#2 C     G     A     G    
#3 G     G     C     T    
#4 T     A     G     C    
#5 C     C     T     G    
#6 G     G     C     NA 

I would like to swap some of these letters (A with T, and G with C) in the UP and DOWN columns whenever the REF and REF2 columns differ. To do so, I wrote a small function and ran it with dplyr::mutate as follows:

switch_strand <- function(base) {
  if (base=="A") return ("T")
  else if (base=="T") return ("A")
  else if (base=="G") return ("C")
  else if (base=="C") return ("G")
  else if (is.na(base)) return (NA) 
  else stop("Error, base does not exist")
}

seq %>% dplyr::mutate(UP = ifelse(REF!=REF2,switch_strand(UP),UP),
                      DOWN = ifelse(REF!=REF2,switch_strand(DOWN),DOWN))

But I get the following error:

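Error in if (base == "A") return("T") else if (base == "T") return("A") else if (base == :
  missing value where TRUE/FALSE needed
In addition: Warning message:
In if (base == "A") return("T") else if (base == "T") return("A") else if (base == :
  the condition has length > 1 and only the first element will be used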

I don't understand this: aren't the values referenced inside dplyr::mutate processed row by row? The function works as expected when given a single letter, but I do not understand why the full column is being passed as the argument. How can this be fixed?

The expected output is:

# A tibble: 6 x 4
#  REF   REF2  UP    DOWN 
#  <chr> <chr> <chr> <chr>
#1 A     A     NA    C    
#2 C     G     T     C    
#3 G     G     C     T    
#4 T     A     C     G    
#5 C     C     T     G    
#6 G     G     C     NA

EDIT: I have updated the switch_strand function so that it should return NA when base is NA, but it still seems to fail in that case. The error might be related to this.
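One likely reason the NA case still fails: in switch_strand the is.na(base) test comes after the == comparisons, and NA == "A" evaluates to NA, which if() cannot use as a condition. A minimal sketch, assuming a hypothetical variant named switch_strand_na_first (not part of the original post) that only moves the NA check to the top:

```r
# Hypothetical variant of switch_strand: same mapping, but the NA check runs first.
# It still expects a single value, not a whole column.
switch_strand_na_first <- function(base) {
  if (is.na(base)) return(NA_character_)  # must precede any `==` comparison
  if (base == "A") return("T")
  if (base == "T") return("A")
  if (base == "G") return("C")
  if (base == "C") return("G")
  stop("Error, base does not exist")
}

switch_strand_na_first(NA)   # NA
switch_strand_na_first("G")  # "C"
```

This only addresses the missing-value error. The function is still not vectorized, so passing a whole column to it inside mutate would still trip over the length > 1 condition.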

Comments
  • 听风在唱歌

    Add dplyr::rowwise() before the mutate call:

    seq %>%
      dplyr::rowwise() %>%
      dplyr::mutate(UP = ifelse(REF != REF2, switch_strand(UP), UP),
                    DOWN = ifelse(REF != REF2, switch_strand(DOWN), DOWN))
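mutate passes each column to the function as a whole vector, which is why the scalar if/else chain fails without rowwise(). As an alternative to rowwise(), the function itself can be made vectorized. The sketch below is one possible way to do that, using a named lookup vector; the names complement, switch_strand_vec and seq_df are hypothetical and do not come from the original post.

```r
library(dplyr)

# Named lookup vector: names are the input bases, values are their complements.
complement <- c(A = "T", T = "A", G = "C", C = "G")

# Hypothetical vectorized helper: accepts a whole character column at once.
switch_strand_vec <- function(base) {
  unknown <- !is.na(base) & !(base %in% names(complement))
  if (any(unknown)) stop("Error, base does not exist")
  # Indexing by name maps each base; NA inputs stay NA.
  unname(complement[base])
}

# Same data as in the question, under a name that does not mask base::seq.
seq_df <- tibble(REF  = c("A", "C", "G", "T", "C", "G"),
                 REF2 = c("A", "G", "G", "A", "C", "G")) %>%
  mutate(UP = lag(REF, n = 1),
         DOWN = lead(REF, n = 1))

result <- seq_df %>%
  mutate(UP   = if_else(REF != REF2, switch_strand_vec(UP), UP),
         DOWN = if_else(REF != REF2, switch_strand_vec(DOWN), DOWN))

print(result)
```

With this approach no rowwise() is needed, so the result is an ordinary (non-rowwise) tibble, and NA values in UP or DOWN pass through unchanged even in rows where REF and REF2 differ.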