This vignette shows how to model discretely observed diffusion models
with bssm
. We assume that the state equation is defined as
a continuous time diffusion model of form dαt = μ(αt, θ)dt + σ(αt, θ)dBt, t ≥ 0,
where Bt
is a Brownian motion and where μ and σ are scalar-valued functions, with
the univariate observation density g(yk|αk)
defined at integer times k = 1…, n. As these
transition densities are generally unavailable for non-linear
diffusions, we use Milstein time-discretisation scheme for approximate
simulation with bootstrap particle filter. Fine discretisation mesh
gives less bias than the coarser one, with increased computational
complexity. Here IS-MCMC approach (Vihola, Helske, and Franks
2020) can provide substantial computational savings.
Discretely observed latent diffusion models can be constructed using
the ssm_sde
function, which takes pointers to
C++
functions defining the drift, diffusion, the derivative
of the diffusion function, the log-densities of the observations, and
the log-prior. As an example, let us consider an Ornstein–Uhlenbeck
process dαt = ρ(ν − αt)dt + σdBt,
with parameters θ = (log ρ, ν, log σ) = (log (0.5), 2, log (0.2))
and the initial condition α0 = 1. For observation
density, we use Poisson distribution with parameter exp (αk). We
first simulate a trajectory x0, …, x40
using the sde.sim
function from the sde
package (Iacus
2016) and use that for the simulation of observations y:
set.seed(1)
n <- 40
suppressMessages(library("sde"))
x <- sde.sim(t0 = 0, T = n, X0 = 1, N = n * 2^5,
drift = expression(0.5 * (2 - x)),
sigma = expression(0.2),
sigma.x = expression(0),
method = "milstein")
integer_x <- x[seq(frequency(x) + 1, length(x), frequency(x))]
y <- rpois(n, exp(integer_x))
We then modify the C++
functions (see Appendix) which
define the terms of the stochastic differential equation, the
observation density, and the priors for the unknown parameter vector
θ. After compilation with the
help of Rcpp::sourceCpp
, we input pointers to these
functions to ssm_sde
function:
library("bssm")
Rcpp::sourceCpp("ssm_sde_template.cpp")
pntrs <- create_xptrs()
sde_model <- ssm_sde(y, pntrs$drift, pntrs$diffusion,
pntrs$ddiffusion, pntrs$obs_density, pntrs$prior,
theta = c(log_rho = log(0.5), mu = 2, log_sigma = log(0.2)),
x0 = 1, positive = FALSE)
We then run IS-MCMC with 20,000 iterations (with first half discarded as burn-in by default), using coarse mesh with Lc = 22 discretization points, finer mesh with Lf = 25 points, and 30 particles. We also use two parallel threads for faster post-processing step with finer mesh (note that for accurate inference, more iterations should be used, but here we keep the run short and use only two threads due to CRAN check requirements).
Finally, we can draw our estimated state trajectory and the the corresponding 95 % posterior intervals, together with true process (dashed line, with points corresponding to integer times):
suppressMessages(library("ggplot2"))
suppressMessages(library("dplyr"))
suppressMessages(library("diagis"))
d <- as.data.frame(out, variable = "states")
state_fit <- d |>
group_by(time) |>
summarise(state = weighted_mean(value, weight),
lwr = weighted_quantile(value, weight, 0.025),
upr = weighted_quantile(value, weight, 0.975))
ggplot(state_fit, aes(x = time, y = state)) +
geom_ribbon(aes(ymin = lwr, ymax = upr),
fill = "#7570b3", alpha = 0.25) +
geom_line(data = data.frame(
state = x,
time = time(x)),
colour = "#d95f02", linetype = "dashed") +
geom_line(colour = "#7570b3") +
geom_point(colour = "#7570b3") +
geom_point(data=data.frame(state=integer_x,time=1:n), colour = "#d95f02") +
theme_bw()
This is the full ssm_sde_template.cpp
file:
// A template for building a univariate discretely observed diffusion model
// Here we define a latent Ornstein–Uhlenbeck process with Poisson observations
// d\alpha_t = \rho (\nu - \alpha_t) dt + \sigma dB_t, t>=0
// y_k ~ Poisson(exp(\alpha_k)), k = 1,...,n
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::interfaces(r, cpp)]]
// x: state
// theta: vector of parameters
// theta(0) = log_rho
// theta(1) = nu
// theta(2) = log_sigma
// Drift function
// [[Rcpp::export]]
double drift(const double x, const arma::vec& theta) {
return exp(theta(0)) * (theta(1) - x);
}
// diffusion function
// [[Rcpp::export]]
double diffusion(const double x, const arma::vec& theta) {
return exp(theta(2));
}
// Derivative of the diffusion function
// [[Rcpp::export]]
double ddiffusion(const double x, const arma::vec& theta) {
return 0.0;
}
// log-density of the prior
// [[Rcpp::export]]
double log_prior_pdf(const arma::vec& theta) {
// rho ~ gamma(2, 0.5) // shape-scale parameterization
// nu ~ N(0, 4)
// sigma ~ half-N(0,1) (theta(2) is log(sigma))
double log_pdf =
R::dgamma(exp(theta(0)), 2, 0.5, 1) +
R::dnorm(theta(1), 0, 4, 1) +
R::dnorm(exp(theta(2)), 0, 1, 1) +
theta(0) + theta(2); // jacobians of transformations
return log_pdf;
}
// log-density of observations
// given vector of sampled states alpha
// [[Rcpp::export]]
arma::vec log_obs_density(const double y,
const arma::vec& alpha, const arma::vec& theta) {
arma::vec log_pdf(alpha.n_elem);
for (unsigned int i = 0; i < alpha.n_elem; i++) {
log_pdf(i) = R::dpois(y, exp(alpha(i)), 1);
}
return log_pdf;
}
// Function which returns the pointers to above functions (no need to modify)
// [[Rcpp::export]]
Rcpp::List create_xptrs() {
// typedef for a pointer of drift/volatility function
typedef double (*fnPtr)(const double x, const arma::vec& theta);
// typedef for log_prior_pdf
typedef double (*prior_fnPtr)(const arma::vec& theta);
// typedef for log_obs_density
typedef arma::vec (*obs_fnPtr)(const double y,
const arma::vec& alpha, const arma::vec& theta);
return Rcpp::List::create(
Rcpp::Named("drift") = Rcpp::XPtr<fnPtr>(new fnPtr(&drift)),
Rcpp::Named("diffusion") = Rcpp::XPtr<fnPtr>(new fnPtr(&diffusion)),
Rcpp::Named("ddiffusion") = Rcpp::XPtr<fnPtr>(new fnPtr(&ddiffusion)),
Rcpp::Named("prior") = Rcpp::XPtr<prior_fnPtr>(new prior_fnPtr(&log_prior_pdf)),
Rcpp::Named("obs_density") = Rcpp::XPtr<obs_fnPtr>(new obs_fnPtr(&log_obs_density)));
}