I have a vector containing an ordered sequence of repeated integers:
x <- c(1, 1, 1, 2, 2, 2, 2, 3, 3, 5, 5, 5, 5, 6, 6, 9, 9, 9, 9)
I want to create a "run ID" (I assume using data.table::rleid()) for numbers that are in sequence. That is, numbers which are either equal to the previous value or exactly 1 greater than it.
So the expected output is:
x
#> [1] 1 1 1 2 2 2 2 3 3 5 5 5 5 6 6 9 9 9 9
data.table::rleid(???)
#> [1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3
My first thought was to simply check whether each value is the same as, or 1 greater than, the previous one. That doesn't work, because each gap is then treated as a run of its own (a single FALSE surrounded by TRUEs):
x
#> [1] 1 1 1 2 2 2 2 3 3 5 5 5 5 6 6 9 9 9 9
data.table::rleid((x - dplyr::lag(x, default = 1)) %in% 0:1)
#> [1] 1 1 1 1 1 1 1 1 1 2 3 3 3 3 3 4 5 5 5
I clearly need something that lets me compare each value with the last distinct value, but I can't think of how to do that efficiently. Any pointers?
How about using lag from dplyr with cumsum?
Or the data.table way with shift:
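The cumsum idea can be sketched in base R as follows. This is only an illustration, not necessarily the code the answer had in mind: the helper name run_id is hypothetical, and it assumes x is sorted in non-decreasing order, as in the question. A new run starts wherever a value exceeds its predecessor by more than 1, and the cumulative sum of those flags gives the run ID.

```r
# Hypothetical helper (the name run_id is not from the original post).
# Assumes x is sorted in non-decreasing order.
run_id <- function(x) {
  if (length(x) == 0L) return(integer(0))
  # TRUE wherever the value jumps by more than 1 from its predecessor;
  # the leading TRUE starts the first run at ID 1.
  cumsum(c(TRUE, diff(x) > 1))
}

x <- c(1, 1, 1, 2, 2, 2, 2, 3, 3, 5, 5, 5, 5, 6, 6, 9, 9, 9, 9)
run_id(x)
#> [1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3

# The same idea with the lag/shift helpers mentioned above,
# run only if the packages are installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(cumsum(x - dplyr::lag(x, default = x[1]) > 1) + 1)
}
if (requireNamespace("data.table", quietly = TRUE)) {
  print(cumsum(x - data.table::shift(x, fill = x[1]) > 1) + 1)
}
```

Unlike the rleid attempt in the question, this counts only the gaps rather than every change in the TRUE/FALSE pattern, so a gap does not become a one-element run of its own.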