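I have a data frame in which each entry relates to a job posting in the NHS, specifying the week of the posting and the NHS Trust (and region) the job belongs to.
Currently, my data frame looks like this: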
set.seed(1)
df1 <- data.frame(
  NHS_Trust = sample(1:30, 20, TRUE),
  Week = sample(1:10, 20, TRUE),
  Region = sample(1:15, 20, TRUE))
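I want to count the number of jobs for each NHS Trust in each week and assign that value to a new column "Jobs", so that my data frame looks like this: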
set.seed(1)
df2 <- data.frame(
  NHS_Trust = rep(1:30, each = 10),
  Week = rep(seq(1, 10), 30),
  Region = rep(as.integer(runif(30, 1, 15)), each = 10),
  Jobs = rpois(10 * 30, lambda = 2))
The data frame can then be used to fit a longitudinal multilevel Poisson model, in which I can model the number of jobs.
Using the data.table package you can group by, count and assign to a new column in a single expression. The syntax for data.tables is dt[i, j, by]. Here i is the "with" part, i.e. the subset of the data specified by i, or the data in the order of i; it is empty in this case, so all the data is used in its original order. The j tells what is to be done, here counting the number of occurrences using .N, which is then assigned to the new variable count using the assignment operator :=. The by takes a list of variables, and the j operation is performed on each group.
The tidyverse approach would be:
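A minimal sketch of what this could look like, using add_count from dplyr next to the data.table expression described above. It assumes both packages are installed and reuses df1 from the question; the new column is named Jobs as in the question rather than count.

```r
library(dplyr)
library(data.table)

# Example data from the question
set.seed(1)
df1 <- data.frame(
  NHS_Trust = sample(1:30, 20, TRUE),
  Week = sample(1:10, 20, TRUE),
  Region = sample(1:15, 20, TRUE))

# dplyr: add a column holding the number of rows per Trust and week
jobs_dplyr <- df1 %>%
  add_count(NHS_Trust, Week, name = "Jobs")

# data.table: i is empty, j assigns the group size .N, by defines the groups
dt <- as.data.table(df1)
dt[, Jobs := .N, by = .(NHS_Trust, Week)]
```

Both calls keep all 20 rows and repeat the group count on every row of the group. To get one row per Trust and week instead, count(NHS_Trust, Week, name = "Jobs") or dt[, .(Jobs = .N), by = .(NHS_Trust, Week)] could be used.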
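It is really hard to answer this question because I cannot follow you. I seem to be missing something, and the discussion in the comments supports that feeling.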
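However, you may want to have a look at the aggregate function. It would also help if you edited the question to provide more details, for example a small matrix with an input and its associated output.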
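As a rough illustration of the aggregate suggestion, here is a base R sketch on df1 from the question. The helper column Jobs (one posting per row) is an assumption made for the example, not something the question defines at this stage.

```r
# Example data from the question
set.seed(1)
df1 <- data.frame(
  NHS_Trust = sample(1:30, 20, TRUE),
  Week = sample(1:10, 20, TRUE),
  Region = sample(1:15, 20, TRUE))

# Assumed helper column: every row represents exactly one job posting
df1$Jobs <- 1L

# Sum the postings within each Trust and week combination
jobs <- aggregate(Jobs ~ NHS_Trust + Week, data = df1, FUN = sum)
```

The result has one row per Trust and week that actually occurs in df1; combinations without any posting do not appear and would have to be added as zeros before fitting a count model.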
I hope this helps!