My R loop and bootstrap script takes too long to run: how can I rewrite it to run faster?

I have the R script below, which takes more than 24 hours to run but does eventually finish on a Windows 10 machine with 10 GB of RAM and a Core M7 processor. Here is what I want the script to do:

  • A. I simulate a time series of 50 observations.

  • B. I slice that same time series into blocks of the following sizes: 2, 3, ..., 48, 49, which gives me 48 different series formed from the series in step A.

  • C. I split each of the 48 series into train and test sets so that I can use the rmse function from the Metrics package to get the Root Mean Squared Error (RMSE) for each of the 48 series formed in step B.

  • D. The RMSE for each series is then tabulated by block size.

  • E. I obtain the best ARIMA model for each of the 48 series.
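To make steps B and C concrete, here is a minimal sketch on a toy series. The names `x`, `blocks`, and `n_train` are hypothetical and used only for this illustration; the block size of 3 and the 60/40 split mirror what the script below does for each block size.

```r
# Toy series of length 10 (hypothetical values, for illustration only)
x <- 1:10
l <- 3                          # block size
m <- ceiling(length(x) / l)     # number of blocks (the last one may be shorter)

# Step B: cut the series into consecutive blocks of size l
blocks <- split(x, rep(1:m, each = l, length.out = length(x)))

# Step C: use the first 60% of a series for training and the rest for testing
n_train <- round(length(x) * 0.6)
train <- head(x, n_train)
test <- tail(x, length(x) - n_train)
```

In the actual script the train/test split is applied to the resampled (bootstrapped) series rather than to the original one.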

My R script

# simulate arima(1,0,0)
library(forecast)
library(Metrics)
n=50
phi <- 0.5
set.seed(1)
wn <- rnorm(n, mean=0, sd=1)
ar1 <- sqrt((wn[1])^2/(1-phi^2))
for(i in 2:n){
  ar1[i] <- ar1[i - 1] * phi + wn[i]
}
ts <- ar1

t <- length(ts) # the length of the time series
li <- seq(n - 2) + 1 # vector of block sizes l with 1 < l < n (i.e. strictly between 1 and n)

RMSEblk <- matrix(nrow = 1, ncol = length(li)) # one-row matrix to store the mean RMSE for each block size
colnames(RMSEblk)<-li
for (b in 1:length(li)){
    l<- li[b]# block size
    m <- ceiling(t / l) # number of blocks
    blk<-split(ts, rep(1:m, each=l, length.out = t)) # divides the series into blocks
    singleblock <- vector() #initialize vector to receive result from for loop
    for(i in 1:1000){
        res<-sample(blk, replace=T, 10000) # resamples the blocks
        res.unlist<-unlist(res, use.names = F) # unlist the bootstrap series
        # Split the series into train and test set
        train <- head(res.unlist, round(length(res.unlist) * 0.6))
        h <- length(res.unlist) - length(train)
        test <- tail(res.unlist, h)

        # Forecast for train set
        model <- auto.arima(train)
        future <- forecast(test, model=model,h=h)
        nfuture <- as.numeric(future$mean) # makes the `future` object a vector
        # use the `rmse` function from `Metrics` package
        RMSE <- rmse(test, nfuture)
        singleblock[i] <- RMSE # Assign RMSE value to final result vector element i
    }
    #singleblock
    RMSEblk[b]<-mean(singleblock) #store into matrix
}
RMSEblk

The R script does run, but it takes more than 24 hours to complete. The loop counts (10000 and 1000) are the minimum needed for the task.
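For a rough sense of the workload, the loop bounds in the script above can be turned into a count of model fits and an upper bound on the length of each resampled series. This is only back-of-the-envelope arithmetic; the names `block_sizes`, `n_boot`, `n_blocks_drawn`, `total_fits`, and `max_len` are hypothetical and do not appear in the script.

```r
n <- 50
block_sizes <- 2:(n - 1)    # the 48 block sizes used in the outer loop
n_boot <- 1000              # iterations of the inner loop
n_blocks_drawn <- 10000     # blocks drawn by sample() in each iteration

# Number of auto.arima() calls over the whole run
total_fits <- length(block_sizes) * n_boot

# Each block has at most l points (the last block can be shorter),
# so a resampled series has at most n_blocks_drawn * l points
max_len <- n_blocks_drawn * block_sizes
range(max_len)
```

So the script fits 48,000 models, each on a series that can be tens or hundreds of thousands of points long.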

What can I do to make the script finish in less time?
