We choose up the place the first publish on this sequence left us: confronting the duty of multi-step time-series forecasting.

Our first try was a workaround of types. The mannequin had been educated to ship a single prediction, equivalent to the very subsequent cut-off date. Thus, if we would have liked an extended forecast, all we may do is use that prediction and feed it again to the mannequin, shifting the enter sequence by one worth (from ([x_{t-n}, …, x_t]) to ([x_{t-n-1}, …, x_{t+1}]), say).

In distinction, the brand new mannequin might be designed – and educated – to forecast a configurable variety of observations without delay. The structure will nonetheless be primary – about as primary as doable, given the duty – and thus, can function a baseline for later makes an attempt.

We work with the identical knowledge as earlier than, `vic_elec`

from `tsibbledata`

.

In comparison with final time although, the `dataset`

class has to alter. Whereas, beforehand, for every batch merchandise the goal (`y`

) was a single worth, it now’s a vector, similar to the enter, `x`

. And similar to `n_timesteps`

was (and nonetheless is) used to specify the size of the enter sequence, there’s now a second parameter, `n_forecast`

, to configure goal dimension.

In our instance, `n_timesteps`

and `n_forecast`

are set to the identical worth, however there isn’t a want for this to be the case. You possibly can equally nicely practice on week-long sequences after which forecast developments over a single day, or a month.

Aside from the truth that `.getitem()`

now returns a vector for `y`

in addition to `x`

, there’s not a lot to be mentioned about dataset creation. Right here is the whole code to arrange the information enter pipeline:

```
n_timesteps <- 7 * 24 * 2
n_forecast <- 7 * 24 * 2
batch_size <- 32
vic_elec_get_year <- operate(12 months, month = NULL) {
vic_elec %>%
filter(12 months(Date) == 12 months, month(Date) == if (is.null(month)) month(Date) else month) %>%
as_tibble() %>%
choose(Demand)
}
elec_train <- vic_elec_get_year(2012) %>% as.matrix()
elec_valid <- vic_elec_get_year(2013) %>% as.matrix()
elec_test <- vic_elec_get_year(2014, 1) %>% as.matrix()
train_mean <- imply(elec_train)
train_sd <- sd(elec_train)
elec_dataset <- dataset(
identify = "elec_dataset",
initialize = operate(x, n_timesteps, n_forecast, sample_frac = 1) {
self$n_timesteps <- n_timesteps
self$n_forecast <- n_forecast
self$x <- torch_tensor((x - train_mean) / train_sd)
n <- size(self$x) - self$n_timesteps - self$n_forecast + 1
self$begins <- type(pattern.int(
n = n,
dimension = n * sample_frac
))
},
.getitem = operate(i) {
begin <- self$begins[i]
finish <- begin + self$n_timesteps - 1
pred_length <- self$n_forecast
record(
x = self$x[start:end],
y = self$x[(end + 1):(end + pred_length)]$squeeze(2)
)
},
.size = operate() {
size(self$begins)
}
)
train_ds <- elec_dataset(elec_train, n_timesteps, n_forecast, sample_frac = 0.5)
train_dl <- train_ds %>% dataloader(batch_size = batch_size, shuffle = TRUE)
valid_ds <- elec_dataset(elec_valid, n_timesteps, n_forecast, sample_frac = 0.5)
valid_dl <- valid_ds %>% dataloader(batch_size = batch_size)
test_ds <- elec_dataset(elec_test, n_timesteps, n_forecast)
test_dl <- test_ds %>% dataloader(batch_size = 1)
```

The mannequin replaces the only linear layer that, within the earlier publish, had been tasked with outputting the ultimate prediction, with a small community, full with two linear layers and – non-compulsory – dropout.

In `ahead()`

, we first apply the RNN, and similar to within the earlier publish, we make use of the `outputs`

solely; or extra particularly, the `output`

equivalent to the ultimate time step. (See that earlier publish for a detailed dialogue of what a `torch`

RNN returns.)

```
mannequin <- nn_module(
initialize = operate(sort, input_size, hidden_size, linear_size, output_size,
num_layers = 1, dropout = 0, linear_dropout = 0) {
self$sort <- sort
self$num_layers <- num_layers
self$linear_dropout <- linear_dropout
self$rnn <- if (self$sort == "gru") {
nn_gru(
input_size = input_size,
hidden_size = hidden_size,
num_layers = num_layers,
dropout = dropout,
batch_first = TRUE
)
} else {
nn_lstm(
input_size = input_size,
hidden_size = hidden_size,
num_layers = num_layers,
dropout = dropout,
batch_first = TRUE
)
}
self$mlp <- nn_sequential(
nn_linear(hidden_size, linear_size),
nn_relu(),
nn_dropout(linear_dropout),
nn_linear(linear_size, output_size)
)
},
ahead = operate(x) {
x <- self$rnn(x)
x[[1]][ ,-1, ..] %>%
self$mlp()
}
)
```

For mannequin instantiation, we now have a further configuration parameter, associated to the quantity of dropout between the 2 linear layers.

```
web <- mannequin(
"gru", input_size = 1, hidden_size = 32, linear_size = 512, output_size = n_forecast, linear_dropout = 0
)
# coaching RNNs on the GPU at present prints a warning which will muddle
# the console
# see https://github.com/mlverse/torch/points/461
# alternatively, use
# machine <- "cpu"
machine <- torch_device(if (cuda_is_available()) "cuda" else "cpu")
web <- web$to(machine = machine)
```

The coaching process is totally unchanged.

```
optimizer <- optim_adam(web$parameters, lr = 0.001)
num_epochs <- 30
train_batch <- operate(b) {
optimizer$zero_grad()
output <- web(b$x$to(machine = machine))
goal <- b$y$to(machine = machine)
loss <- nnf_mse_loss(output, goal)
loss$backward()
optimizer$step()
loss$merchandise()
}
valid_batch <- operate(b) {
output <- web(b$x$to(machine = machine))
goal <- b$y$to(machine = machine)
loss <- nnf_mse_loss(output, goal)
loss$merchandise()
}
for (epoch in 1:num_epochs) {
web$practice()
train_loss <- c()
coro::loop(for (b in train_dl) {
loss <-train_batch(b)
train_loss <- c(train_loss, loss)
})
cat(sprintf("nEpoch %d, coaching: loss: %3.5f n", epoch, imply(train_loss)))
web$eval()
valid_loss <- c()
coro::loop(for (b in valid_dl) {
loss <- valid_batch(b)
valid_loss <- c(valid_loss, loss)
})
cat(sprintf("nEpoch %d, validation: loss: %3.5f n", epoch, imply(valid_loss)))
}
```

```
# Epoch 1, coaching: loss: 0.65737
#
# Epoch 1, validation: loss: 0.54586
#
# Epoch 2, coaching: loss: 0.43991
#
# Epoch 2, validation: loss: 0.50588
#
# Epoch 3, coaching: loss: 0.42161
#
# Epoch 3, validation: loss: 0.50031
#
# Epoch 4, coaching: loss: 0.41718
#
# Epoch 4, validation: loss: 0.48703
#
# Epoch 5, coaching: loss: 0.39498
#
# Epoch 5, validation: loss: 0.49572
#
# Epoch 6, coaching: loss: 0.38073
#
# Epoch 6, validation: loss: 0.46813
#
# Epoch 7, coaching: loss: 0.36472
#
# Epoch 7, validation: loss: 0.44957
#
# Epoch 8, coaching: loss: 0.35058
#
# Epoch 8, validation: loss: 0.44440
#
# Epoch 9, coaching: loss: 0.33880
#
# Epoch 9, validation: loss: 0.41995
#
# Epoch 10, coaching: loss: 0.32545
#
# Epoch 10, validation: loss: 0.42021
#
# Epoch 11, coaching: loss: 0.31347
#
# Epoch 11, validation: loss: 0.39514
#
# Epoch 12, coaching: loss: 0.29622
#
# Epoch 12, validation: loss: 0.38146
#
# Epoch 13, coaching: loss: 0.28006
#
# Epoch 13, validation: loss: 0.37754
#
# Epoch 14, coaching: loss: 0.27001
#
# Epoch 14, validation: loss: 0.36636
#
# Epoch 15, coaching: loss: 0.26191
#
# Epoch 15, validation: loss: 0.35338
#
# Epoch 16, coaching: loss: 0.25533
#
# Epoch 16, validation: loss: 0.35453
#
# Epoch 17, coaching: loss: 0.25085
#
# Epoch 17, validation: loss: 0.34521
#
# Epoch 18, coaching: loss: 0.24686
#
# Epoch 18, validation: loss: 0.35094
#
# Epoch 19, coaching: loss: 0.24159
#
# Epoch 19, validation: loss: 0.33776
#
# Epoch 20, coaching: loss: 0.23680
#
# Epoch 20, validation: loss: 0.33974
#
# Epoch 21, coaching: loss: 0.23070
#
# Epoch 21, validation: loss: 0.34069
#
# Epoch 22, coaching: loss: 0.22761
#
# Epoch 22, validation: loss: 0.33724
#
# Epoch 23, coaching: loss: 0.22390
#
# Epoch 23, validation: loss: 0.34013
#
# Epoch 24, coaching: loss: 0.22155
#
# Epoch 24, validation: loss: 0.33460
#
# Epoch 25, coaching: loss: 0.21820
#
# Epoch 25, validation: loss: 0.33755
#
# Epoch 26, coaching: loss: 0.22134
#
# Epoch 26, validation: loss: 0.33678
#
# Epoch 27, coaching: loss: 0.21061
#
# Epoch 27, validation: loss: 0.33108
#
# Epoch 28, coaching: loss: 0.20496
#
# Epoch 28, validation: loss: 0.32769
#
# Epoch 29, coaching: loss: 0.20223
#
# Epoch 29, validation: loss: 0.32969
#
# Epoch 30, coaching: loss: 0.20022
#
# Epoch 30, validation: loss: 0.33331
```

From the best way loss decreases on the coaching set, we conclude that, sure, the mannequin is studying one thing. It in all probability would proceed enhancing for fairly some epochs nonetheless. We do, nonetheless, see much less of an enchancment on the validation set.

Naturally, now we’re interested in test-set predictions. (Keep in mind, for testing we’re selecting the “significantly laborious” month of January, 2014 – significantly laborious due to a heatwave that resulted in exceptionally excessive demand.)

With no loop to be coded, analysis now turns into fairly easy:

```
web$eval()
test_preds <- vector(mode = "record", size = size(test_dl))
i <- 1
coro::loop(for (b in test_dl) {
enter <- b$x
output <- web(enter$to(machine = machine))
preds <- as.numeric(output)
test_preds[[i]] <- preds
i <<- i + 1
})
vic_elec_jan_2014 <- vic_elec %>%
filter(12 months(Date) == 2014, month(Date) == 1)
test_pred1 <- test_preds[[1]]
test_pred1 <- c(rep(NA, n_timesteps), test_pred1, rep(NA, nrow(vic_elec_jan_2014) - n_timesteps - n_forecast))
test_pred2 <- test_preds[[408]]
test_pred2 <- c(rep(NA, n_timesteps + 407), test_pred2, rep(NA, nrow(vic_elec_jan_2014) - 407 - n_timesteps - n_forecast))
test_pred3 <- test_preds[[817]]
test_pred3 <- c(rep(NA, nrow(vic_elec_jan_2014) - n_forecast), test_pred3)
preds_ts <- vic_elec_jan_2014 %>%
choose(Demand) %>%
add_column(
mlp_ex_1 = test_pred1 * train_sd + train_mean,
mlp_ex_2 = test_pred2 * train_sd + train_mean,
mlp_ex_3 = test_pred3 * train_sd + train_mean) %>%
pivot_longer(-Time) %>%
update_tsibble(key = identify)
preds_ts %>%
autoplot() +
scale_colour_manual(values = c("#08c5d1", "#00353f", "#ffbf66", "#d46f4d")) +
theme_minimal()
```

Evaluate this to the forecast obtained by feeding again predictions. The demand profiles over the day look much more practical now. How concerning the phases of utmost demand? Evidently, these should not mirrored within the forecast, not any greater than within the “loop method”. The truth is, the forecast permits for attention-grabbing insights into this mannequin’s character: Apparently, it actually likes fluctuating across the imply – “prime” it with inputs that oscillate round a considerably increased stage, and it’ll rapidly shift again to its consolation zone.

Seeing how, above, we offered an choice to make use of dropout contained in the MLP, chances are you’ll be questioning if this might assist with forecasts on the take a look at set. Seems it didn’t, in my experiments. Possibly this isn’t so unusual both: How, absent exterior cues (temperature), ought to the community know that prime demand is developing?

In our evaluation, we will make a further distinction. With the primary week of predictions, what we see is a failure to anticipate one thing that *couldn’t* moderately have been anticipated (two, or two-and-a-half, say, days of exceptionally excessive demand). Within the second, all of the community would have needed to do was keep on the present, elevated stage. It will likely be attention-grabbing to see how that is dealt with by the architectures we talk about subsequent.

Lastly, a further concept you’ll have had is – what if we used temperature as a second enter variable? As a matter of truth, coaching efficiency certainly improved, however no efficiency impression was noticed on the validation and take a look at units. Nonetheless, chances are you’ll discover the code helpful – it’s simply prolonged to datasets with extra predictors. Subsequently, we reproduce it within the appendix.

Thanks for studying!

```
# Information enter code modified to accommodate two predictors
n_timesteps <- 7 * 24 * 2
n_forecast <- 7 * 24 * 2
vic_elec_get_year <- operate(12 months, month = NULL) {
vic_elec %>%
filter(12 months(Date) == 12 months, month(Date) == if (is.null(month)) month(Date) else month) %>%
as_tibble() %>%
choose(Demand, Temperature)
}
elec_train <- vic_elec_get_year(2012) %>% as.matrix()
elec_valid <- vic_elec_get_year(2013) %>% as.matrix()
elec_test <- vic_elec_get_year(2014, 1) %>% as.matrix()
train_mean_demand <- imply(elec_train[ , 1])
train_sd_demand <- sd(elec_train[ , 1])
train_mean_temp <- imply(elec_train[ , 2])
train_sd_temp <- sd(elec_train[ , 2])
elec_dataset <- dataset(
identify = "elec_dataset",
initialize = operate(knowledge, n_timesteps, n_forecast, sample_frac = 1) {
demand <- (knowledge[ , 1] - train_mean_demand) / train_sd_demand
temp <- (knowledge[ , 2] - train_mean_temp) / train_sd_temp
self$x <- cbind(demand, temp) %>% torch_tensor()
self$n_timesteps <- n_timesteps
self$n_forecast <- n_forecast
n <- nrow(self$x) - self$n_timesteps - self$n_forecast + 1
self$begins <- type(pattern.int(
n = n,
dimension = n * sample_frac
))
},
.getitem = operate(i) {
begin <- self$begins[i]
finish <- begin + self$n_timesteps - 1
pred_length <- self$n_forecast
record(
x = self$x[start:end, ],
y = self$x[(end + 1):(end + pred_length), 1]
)
},
.size = operate() {
size(self$begins)
}
)
### relaxation equivalent to single-predictor code above
```

Photograph by Monica Bourgeau on Unsplash