Forecasting occurs at time $t$ before a new value $y_t$ has been observed. There are two types of forecasts:

**Marginal**- A forecast made at*k*time steps into the future only. For example, marginal forecasting with $k=1$ will simulate from the forecast distribution for $y_t$. Marginal forecasting with $k=2$ will simulate from the forecast distribution for $y_{t+1}$, and so on.

**Path**- A joint forecast made from*1*to*k*steps into the future. Path forecasting with $k=3$ will simulate*dependent*observations $y_t, y_{t+1}, y_{t+2}$. Path forecasting is important when computing functions across multiple forecast horizons. The most common example is forecasting the*sum*$\sum_{h=1:k} y_{t+k-1}$. This is accomplished by path forecasting $k$ steps ahead, and then summing each path to get a random sample from the forecast distribution for the sum.

Naturally, forecasting $k-$steps ahead requires knowing all predictors $X$ that far into the future as well. Marginal forecasting $k-$steps ahead accepts $X_{t+k-1}$ as an argument. Path forecasting accepts the predictors in an array. Each row contains the predictors at one time step, starting with $X_t$ in the first row, down to $X_{t+k-1}$ in the last row.

This marginal forecast method works for Poisson and Bernoulli DGLMs. Below we run through a basic usage with the following steps:

- Manually define a simple DGLM with an intercept and 2 regression predictors
- Define the future predictors, $X_{t+k-1}$
- Forecast by drawing a random sample at $k=5$, and getting a $95\%$ credible interval
- Forecast the mean only, without drawing a random sample

```
import numpy as np
from pybats.dglm import dlm, pois_dglm, bern_dglm, bin_dglm
from pybats.analysis import analysis
# Define Poisson and Bernoulli DGLMs, along with a normal DLM, with 1 trend term and 2 regression predictors
a0 = np.array([1, 1, 1])
R0 = np.eye(3)/10
mod_n = dlm(a0, R0, ntrend=1, nregn=2, deltrend=1, delregn=.9, discount_forecast=True)
mod_p = pois_dglm(a0, R0, ntrend=1, nregn=2, deltrend=1, delregn=.9)
mod_bern = bern_dglm(a0, R0, ntrend=1, nregn=2, deltrend=1, delregn=.9)
# New predictors for k=5 steps into the future
X = np.array([1,2])
```

First we're going to forecast from the Poisson DGLM, which will draw a random sample from the forecast distribution. We'll use the forecast samples to calculate a $95\%$ credible interval.

```
forecast_samples = mod_p.forecast_marginal(k = 5, X=X, nsamps=1000)
# Get the 95% credible intervals
np.percentile(forecast_samples, [2.5, 97.5])
```

We can repeat this analysis using the normal DLM:

```
forecast_samples = mod_n.forecast_marginal(k = 5, X=X, nsamps=1000)
# Get the 95% credible intervals
np.percentile(forecast_samples, [2.5, 97.5])
```

Next we'll test the models by setting `mean_only=True`

. This will return only the mean value, without simulation from the forecast distribution. It is much faster if the mean is the only value of interest.

Notice that we skipped simulating from the Bernoulli DGLM, because Bernoulli outcomes are only $0$ or $1$ anyway. The mean of the forecast distribution in a Bernoulli DGLM is equal to the probability of the outcome being $1$.

```
# Setting the flag mean_only=True returns the exact mean, without simulation
m_p = mod_p.forecast_marginal(k = 5, X=X, mean_only=True)
ans = [70.38455503971952]
assert (np.equal(np.round(ans[0], 5), np.round(m_p, 5)).all())
# Test the Bernoulli DGLM forecast mean is correct
# Setting the flag mean_only=True returns the exact mean, without simulation
m_bern = mod_bern.forecast_marginal(k=5, X=X, mean_only=True)
ans = [0.9771442828394032]
assert (np.equal(np.round(ans[0], 5), np.round(m_bern, 5)).all())
# Test the Bernoulli DGLM forecast mean is correct
# Setting the flag state_mean_var=True returns the exact mean and variance, without simulation
m_n, v = mod_n.forecast_marginal(k = 5, X=X, state_mean_var=True)
ans = [4., 1.82222222]
assert(np.equal(np.round(ans[0], 5), np.round(m_n, 5)).all())
assert(np.equal(np.round(ans[1], 5), np.round(v, 5)).all())
```

`dglm.forecast_marginal`

has three special features beyond returning just a random sample from the forecast distribution:

- Setting
`mean_only = True`

makes the function return the mean of the forecast distribution, instead of a random sample. - Setting
`state_mean_and_var = True`

makes the function return the mean and variance of the linear predictor, $\lambda_t = F_t^{'} \theta_t$. - Passing in a value for
`y`

makes the function return the*log*probability density of`y`

under the forecast distribution, $p(y)$.

We'll repeat the same tests from above on a binomial DGLM by specifying both the predictors `X`

and the number of trials `n`

.

```
mod_b = bin_dglm(a0, R0, ntrend=1, nregn=2, deltrend=1, delregn=.9)
# Test the Binomial DGLM
m_bin = mod_b.forecast_marginal(n = 10, k=5, X=X, mean_only=True)
ans = [9.771442828394033]
assert (np.equal(np.round(ans[0], 5), np.round(m_bin, 5)).all())
```

There are two functions for path forecasting with DGLMs. `forecast_path_copula`

is the default method, and is part of my new research. It dramatically speeds up path forecasting in DGLMs. The traditional `forecast_path`

function is only used when `dglm.forecast_path`

is called with the flag `copula=FALSE`

.

Below we'll test path forecasting - with a copula, as is the default - for $k=2$ steps into the future. The Poisson model will have $2$ trend terms - an intercept and a slope - and 1 regression term. This test will confirm that marginal forecasting and path forecasting lead to identical forecast means for $k=1$ and $k=2$ steps ahead.

```
a0 = np.array([1, 1, 1])
R0 = np.eye(3)/100
mod_p = pois_dglm(a0, R0, ntrend=2, nregn=1, deltrend=1, delregn=.9)
# The predictors for the next 2 time steps, X_t and X_(t+1), in a matrix with k=2 rows
X = np.array([[1], [0.5]])
# Test the Poisson DGLM by comparing the path and marginal forecast distributions
samps_copula = mod_p.forecast_path_copula(k = 2, X=X, nsamps=100000)
m_samp = samps_copula.mean(axis=0)
m_marg = np.array([mod_p.forecast_marginal(k=1, X = X[0], mean_only=True),
mod_p.forecast_marginal(k=2, X=X[1], mean_only=True)]).reshape(-1)
assert (np.equal(np.round(m_samp - m_marg, 1), np.zeros(2)).all())
samps_path = mod_p.forecast_path(k = 2, X=X, nsamps=2000, copula=False)
m_samp = samps_path.mean(axis=0)
assert (np.equal(np.round(m_samp - m_marg, 0), np.zeros(2)).all())
```

`dglm.forecast_path`

has two special features beyond returning just a random sample from the forecast distribution. These only work when using `forecast_path_copula`

, and are not yet available for a normal DLM.

- Passing in a value (a vector of length $k$) for
`y`

makes the function return the*log*probability density of`y`

under the forecast distribution, $p(y)$. - Setting
`return_cov = True`

makes the function return the covariance in the copula distribution. This will be different than the covariance in the forecast distribution, which can be calculated directly from a random sample.