LowLevelParticleFilters

CI codecov

This is a library for state estimation, that is, given measurements $y(t)$ from a dynamical system, estimate the state vector $x(t)$. Throughout, we assume dynamics on the form

\[\begin{aligned} x(t+1) &= f(x(t), u(t), p, t, w(t))\\ y(t) &= g(x(t), u(t), p, t, e(t)) \end{aligned}\]

or the linear version

\[\begin{aligned} x(t+1) &= Ax(t) + Bu(t) + w(t)\\ y(t) &= Cx(t) + Du(t) + e(t) \end{aligned}\]

where $x$ is the state vector, $u$ an input, $p$ some form of parameters, $t$ is the time and $w,e$ are disturbances (noise). Throughout the documentation, we often call the function $f$ dynamics and the function $g$ measurement.

The dynamics above describe a discrete-time system, i.e., the function $f$ takes the current state and produces the next state. This is in contrast to a continuous-time system, where $f$ takes the current state but produces the time derivative of the state. A continuous-time system can be discretized, described in detail in Discretization.

The parameters $p$ can be anything, or left out. You may write the dynamics functions such that they depend on $p$ and include parameters when you create a filter object. You may also override the parameters stored in the filter object when you call any function on the filter object. This behavior is modeled after the SciML ecosystem.

Depending on the nature of $f$ and $g$, the best method of estimating the state may vary. If $f,g$ are linear and the disturbances are additive and Gaussian, the KalmanFilter is an optimal state estimator. If any of the above assumptions fail to hold, we may need to resort to more advanced estimators. This package provides several filter types, outlined below.

Estimator types

We provide a number of filter types

  • KalmanFilter. A standard Kalman filter. Is restricted to linear dynamics (possibly time varying) and Gaussian noise.
  • SqKalmanFilter. A standard Kalman filter on square-root form (slightly slower but more numerically stable with ill-conditioned covariance).
  • ExtendedKalmanFilter: For nonlinear systems, the EKF runs a regular Kalman filter on linearized dynamics. Uses ForwardDiff.jl for linearization. The noise model must be Gaussian.
  • UnscentedKalmanFilter: The Unscented Kalman filter often performs slightly better than the Extended Kalman filter but may be slightly more computationally expensive. The UKF handles nonlinear dynamics and measurement model, but still requires an additive Gaussian noise model and assumes all posterior distributions are Gaussian, i.e., can not handle multi-modal posteriors.
  • ParticleFilter: The particle filter is a nonlinear estimator. This version of the particle filter is simple to use and assumes that both dynamics noise and measurement noise are additive. Particle filters handle multi-modal posteriors.
  • AdvancedParticleFilter: This filter gives you more flexibility, at the expense of having to define a few more functions. This filter does not require the noise to be additive and is thus the most flexible filter type.
  • AuxiliaryParticleFilter: This filter is identical to ParticleFilter, but uses a slightly different proposal mechanism for new particles.
  • DAEUnscentedKalmanFilter: An Unscented Kalman filter for differential-algebraic systems (DAE).

Functionality

This package provides

  • Filtering, estimating $x(t)$ given measurements up to and including time $t$. We call the filtered estimate $x(t|t)$ (read as $x$ at $t$ given $t$).
  • Smoothing, estimating $x(t)$ given data up to $T > t$, i.e., $x(t|T)$.
  • Parameter estimation.

All filters work in two distinct steps.

  1. The prediction step (predict!). During prediction, we use the dynamics model to form $x(t|t-1) = f(x(t-1), ...)$
  2. The correction step (correct!). In this step, we adjust the predicted state $x(t|t-1)$ using the measurement $y(t)$ to form $x(t|t)$.

In general, all filters represent not only a point estimate of $x(t)$, but a representation of the complete posterior probability distribution over $x$ given all the data available up to time $t$. One major difference between different filter types is how they represent these probability distributions.

Particle filter

A particle filter represents the probability distribution over the state as a collection of samples, each sample is propagated through the dynamics function $f$ individually. When a measurement becomes available, the samples, called particles, are given a weight based on how likely the particle is given the measurement. Each particle can thus be seen as representing a hypothesis about the current state of the system. After a few time steps, most weights are inevitably going to be extremely small, a manifestation of the curse of dimensionality, and a resampling step is incorporated to refresh the particle distribution and focus the particles on areas of the state space with high posterior probability.

Defining a particle filter is straightforward, one must define the distribution of the noise df in the dynamics function, dynamics(x,u,p,t) and the noise distribution dg in the measurement function measurement(x,u,p,t). Both of these noise sources are assumed to be additive, but can have any distribution. The distribution of the initial state d0 must also be provided. In the example below, we use linear Gaussian dynamics so that we can easily compare both particle and Kalman filters. (If we have something close to linear Gaussian dynamics in practice, we should of course use a Kalman filter and not a particle filter.)

using LowLevelParticleFilters, LinearAlgebra, StaticArrays, Distributions, Plots

Define problem

n = 2   # Dimension of state
m = 1   # Dimension of input
p = 1   # Dimension of measurements
N = 500 # Number of particles

const dg = MvNormal(p,0.2)          # Measurement noise Distribution
const df = MvNormal(n,0.1)          # Dynamics noise Distribution
const d0 = MvNormal(randn(n),2.0)   # Initial state Distribution

Define linear state-space system (using StaticArrays for maximum performance)

const A = SA[0.97043   -0.097368
             0.09736    0.970437]
const B = SA[0.1; 0;;]
const C = SA[0 1.0]

Next, we define the dynamics and measurement equations, they both take the signature (x,u,p,t) = (state, input, parameters, time)

dynamics(x,u,p,t) = A*x .+ B*u
measurement(x,u,p,t) = C*x
vecvec_to_mat(x) = copy(reduce(hcat, x)') # Helper function

the parameter p can be anything, and is often optional. If p is not provided when performing operations on filters, any p stored in the filter objects (if supported) is used. The default if none is provided and none is stored in the filter is p = LowLevelParticleFilters.NullParameters().

We are now ready to define and use a filter

pf = ParticleFilter(N, dynamics, measurement, df, dg, d0)
ParticleFilter{PFstate{StaticArraysCore.SVector{2, Float64}, Float64}, typeof(Main.dynamics), typeof(Main.measurement), Distributions.ZeroMeanIsoNormal{Tuple{Base.OneTo{Int64}}}, Distributions.ZeroMeanIsoNormal{Tuple{Base.OneTo{Int64}}}, Distributions.IsoNormal, DataType, Random.Xoshiro, LowLevelParticleFilters.NullParameters}
  state: PFstate{StaticArraysCore.SVector{2, Float64}, Float64}
  dynamics: dynamics (function of type typeof(Main.dynamics))
  measurement: measurement (function of type typeof(Main.measurement))
  dynamics_density: Distributions.ZeroMeanIsoNormal{Tuple{Base.OneTo{Int64}}}
  measurement_density: Distributions.ZeroMeanIsoNormal{Tuple{Base.OneTo{Int64}}}
  initial_density: Distributions.IsoNormal
  resample_threshold: Float64 0.1
  resampling_strategy: LowLevelParticleFilters.ResampleSystematic <: LowLevelParticleFilters.ResamplingStrategy
  rng: Random.Xoshiro
  p: LowLevelParticleFilters.NullParameters LowLevelParticleFilters.NullParameters()
  threads: Bool false

With the filter in hand, we can simulate from its dynamics and query some properties

du = MvNormal(m,1.0)         # Random input distribution for simulation
xs,u,y = simulate(pf,200,du) # We can simulate the model that the pf represents
pf(u[1], y[1])               # Perform one filtering step using input u and measurement y
particles(pf)                # Query the filter for particles, try weights(pf) or expweights(pf) as well
x̂ = weighted_mean(pf)        # using the current state
2-element Vector{Float64}:
 -0.0144051626509101
  0.905576326097814

If you want to perform filtering using vectors of inputs and measurements, try any of the functions

sol = forward_trajectory(pf, u, y) # Filter whole vectors of signals
x̂,ll = mean_trajectory(pf, u, y)
plot(sol, xreal=xs, markersize=2)
Example block output

u ad y are then assumed to be vectors of vectors. StaticArrays is recommended for maximum performance.

If MonteCarloMeasurements.jl is loaded, you may transform the output particles to Matrix{MonteCarloMeasurements.Particles} with the layout T × n_state using Particles(x,we). Internally, the particles are then resampled such that they all have unit weight. This is conventient for making use of the plotting facilities of MonteCarloMeasurements.jl.

For a full usage example, see the benchmark section below or example_lineargaussian.jl

Resampling

The particle filter will perform a resampling step whenever the distribution of the weights has become degenerate. The resampling is triggered when the effective number of samples is smaller than pf.resample_threshold $\in [0, 1]$, this value can be set when constructing the filter. How the resampling is done is governed by pf.resampling_strategy, we currently provide ResampleSystematic <: ResamplingStrategy as the only implemented strategy. See https://en.wikipedia.org/wiki/Particle_filter for more info.

Particle Smoothing

Smoothing is the process of finding the best state estimate given both past and future data. Smoothing is thus only possible in an offline setting. This package provides a particle smoother, based on forward filtering, backward simulation (FFBS), example usage follows:

N     = 2000 # Number of particles
T     = 80   # Number of time steps
M     = 100  # Number of smoothed backwards trajectories
pf    = ParticleFilter(N, dynamics, measurement, df, dg, d0)
du    = MvNormal(m,1)     # Control input distribution
x,u,y = simulate(pf,T,du) # Simulate trajectory using the model in the filter
tosvec(y) = reinterpret(SVector{length(y[1]),Float64}, reduce(hcat,y))[:] |> copy
x,u,y = tosvec.((x,u,y)) # It's good for performance to use StaticArrays to the extent possible

xb,ll = smooth(pf, M, u, y) # Sample smoothing particles
xbm   = smoothed_mean(xb)   # Calculate the mean of smoothing trajectories
xbc   = smoothed_cov(xb)    # And covariance
xbt   = smoothed_trajs(xb)  # Get smoothing trajectories
xbs   = [diag(xbc) for xbc in xbc] |> vecvec_to_mat .|> sqrt
plot(xbm', ribbon=2xbs, lab="PF smooth")
plot!(vecvec_to_mat(x), l=:dash, lab="True")
Example block output

We can plot the particles themselves as well

downsample = 5
plot(vecvec_to_mat(x), l=(4,), layout=(2,1), show=false)
scatter!(xbt[1, 1:downsample:end, :]', subplot=1, show=false, m=(1,:black, 0.5), lab="")
scatter!(xbt[2, 1:downsample:end, :]', subplot=2, m=(1,:black, 0.5), lab="")
Example block output

Kalman filter

The KalmanFilter (wiki) assumes that $f$ and $g$ are linear functions, i.e., that they can be written on the form

\[\begin{aligned} x(t+1) &= Ax(t) + Bu(t) + w(t)\\ y(t) &= Cx(t) + Du(t) + e(t) \end{aligned}\]

for some matrices $A,B,C,D$ where $w \sim N(0, R_1)$ and $e \sim N(0, R_2)$ are zero mean and Gaussian. The Kalman filter represents the posterior distributions over $x$ by the mean and a covariance matrix. The magic behind the Kalman filter is that linear transformations of Gaussian distributions remain Gaussian, and we thus have a very efficient way of representing them.

A Kalman filter is easily created using the constructor. Many of the functions defined for particle filters, are defined also for Kalman filters, e.g.:

R1 = cov(df)
R2 = cov(dg)
kf = KalmanFilter(A, B, C, 0, R1, R2, d0)
sol = forward_trajectory(kf, u, y) # filtered, prediction, pred cov, filter cov, loglik

It can also be called in a loop like the pf above

for t = 1:T
    kf(u,y) # Performs both correct and predict!!
    # alternatively
    ll, e = correct!(kf, y, nothing, t) # Returns loglikelihood and prediction error
    x     = state(kf)
    R     = covariance(kf)
    predict!(kf, u, nothing, t)
end

The matrices in the Kalman filter may be time varying, such that A[:, :, t] is $A(t)$. They may also be provided as functions on the form $A(t) = A(x, u, p, t)$. This works for both dynamics and covariance matrices.

The numeric type used in the Kalman filter is determined from the mean of the initial state distribution, so make sure that this has the correct type if you intend to use, e.g., Float32 or ForwardDiff.Dual for automatic differentiation.

Smoothing using KF

Kalman filters can also be used for smoothing

kf = KalmanFilter(A, B, C, 0, cov(df), cov(dg), d0)
xT,R,lls = smooth(kf, u, y, p) # Smoothed state, smoothed cov, loglik

Plot and compare PF and KF

plot(vecvec_to_mat(xT), lab="Kalman smooth", layout=2)
plot!(xbm', lab="pf smooth")
plot!(vecvec_to_mat(x), lab="true")
Example block output

Kalman filter tuning tutorial

The tutorial "How to tune a Kalman filter" details how to figure out appropriate covariance matrices for the Kalman filter, as well as how to add disturbance models to the system model.

Unscented Kalman Filter

The UnscentedKalmanFilter represents posterior distributions over $x$ as Gaussian distributions, but propagate them through a nonlinear function $f$ by a deterministic sampling of a small number of particles called sigma points (this is referred to as the unscented transform). This UKF thus handles nonlinear functions $f,g$, but only Gaussian disturbances and unimodal posteriors.

The UKF takes the same arguments as a regular KalmanFilter, but the matrices defining the dynamics are replaced by two functions, dynamics and measurement, working in the same way as for the ParticleFilter above.

ukf = UnscentedKalmanFilter(dynamics, measurement, cov(df), cov(dg), MvNormal([1.,1.]), nu=m, ny=p)
UnscentedKalmanFilter{false, false, false, false, typeof(Main.dynamics), typeof(Main.measurement), Matrix{Float64}, Matrix{Float64}, Distributions.ZeroMeanDiagNormal{Tuple{Base.OneTo{Int64}}}, StaticArraysCore.SVector{2, Float64}, StaticArraysCore.SVector{2, Float64}, StaticArraysCore.SVector{2, Float64}, StaticArraysCore.SVector{1, Float64}, Vector{Float64}, Matrix{Float64}, LowLevelParticleFilters.NullParameters}
  dynamics: dynamics (function of type typeof(Main.dynamics))
  measurement: measurement (function of type typeof(Main.measurement))
  R1: Array{Float64}((2, 2)) [0.010000000000000002 0.0; 0.0 0.010000000000000002]
  R2: Array{Float64}((1, 1)) [0.04000000000000001;;]
  d0: Distributions.ZeroMeanDiagNormal{Tuple{Base.OneTo{Int64}}}
  xsd: Array{StaticArraysCore.SVector{2, Float64}}((5,))
  xsd0: Array{StaticArraysCore.SVector{2, Float64}}((5,))
  xsm: Array{StaticArraysCore.SVector{2, Float64}}((5,))
  ys: Array{StaticArraysCore.SVector{1, Float64}}((5,))
  x: Array{Float64}((2,)) [0.0, 0.0]
  R: Array{Float64}((2, 2)) [1.0 0.0; 0.0 1.0]
  t: Int64 1
  ny: Int64 1
  nu: Int64 1
  p: LowLevelParticleFilters.NullParameters LowLevelParticleFilters.NullParameters()
Info

If your function dynamics describes a continuous-time ODE, do not forget to discretize it before passing it to the UKF. See Discretization for more information.

Extended Kalman Filter

The ExtendedKalmanFilter (EKF) is similar to the UKF, but propagates Gaussian distributions by linearizing the dynamics and using the formulas for linear systems similar to the standard Kalman filter. This can be slightly faster than the UKF (not always), but also less accurate for strongly nonlinear systems. The linearization is performed automatically using ForwardDiff.jl. In general, the UKF is recommended over the EKF unless the EKF is faster and computational performance is the top priority.

The EKF constructor has the following two signatures

ExtendedKalmanFilter(dynamics, measurement, R1,R2,d0=MvNormal(Matrix(R1)); nu::Int, p = LowLevelParticleFilters.NullParameters(), α = 1.0, check = true)
ExtendedKalmanFilter(kf, dynamics, measurement)

The first constructor takes all the arguments required to initialize the extended Kalman filter, while the second one takes an already defined standard Kalman filter. using the first constructor, the user must provide the number of inputs to the system, nu.

where kf is a standard KalmanFilter from which the covariance properties are taken.

Info

If your function dynamics describes a continuous-time ODE, do not forget to discretize it before passing it to the UKF. See Discretization for more information.

AdvancedParticleFilter

The AdvancedParticleFilter works very much like the ParticleFilter, but admits more flexibility in its noise models.

The AdvancedParticleFilter type requires you to implement the same functions as the regular ParticleFilter, but in this case you also need to handle sampling from the noise distributions yourself. The function dynamics must have a method signature like below. It must provide one method that accepts state vector, control vector, parameter, time and noise::Bool that indicates whether or not to add noise to the state. If noise should be added, this should be done inside dynamics An example is given below

using Random
const rng = Random.Xoshiro()
function dynamics(x, u, p, t, noise=false) # It's important that this defaults to false
    x = A*x .+ B*u # A simple linear dynamics model in discrete time
    if noise
        x += rand(rng, df) # it's faster to supply your own rng
    end
    x
end

The measurement_likelihood function must have a method accepting state, input, measurement, parameter and time, and returning the log-likelihood of the measurement given the state, a simple example below:

function measurement_likelihood(x, u, y, p, t)
    logpdf(dg, C*x-y) # A simple linear measurement model with normal additive noise
end

This gives you very high flexibility. The noise model in either function can, for instance, be a function of the state, something that is not possible for the simple ParticleFilter. To be able to simulate the AdvancedParticleFilter like we did with the simple filter above, the measurement method with the signature measurement(x,u,p,t,noise=false) must be available and return a sample measurement given state (and possibly time). For our example measurement model above, this would look like this

measurement(x, u, p, t, noise=false) = C*x + noise*rand(rng, dg)

We now create the AdvancedParticleFilter and use it in the same way as the other filters:

apf = AdvancedParticleFilter(N, dynamics, measurement, measurement_likelihood, df, d0)
sol = forward_trajectory(apf, u, y, p)
LowLevelParticleFilters.ParticleFilteringSolution{AdvancedParticleFilter{PFstate{StaticArraysCore.SVector{2, Float64}, Float64}, typeof(Main.dynamics), typeof(Main.measurement), typeof(Main.measurement_likelihood), Distributions.ZeroMeanIsoNormal{Tuple{Base.OneTo{Int64}}}, Distributions.IsoNormal, DataType, Random.Xoshiro, LowLevelParticleFilters.NullParameters}, Vector{StaticArraysCore.SVector{1, Float64}}, Vector{StaticArraysCore.SVector{1, Float64}}, Matrix{StaticArraysCore.SVector{2, Float64}}, Matrix{Float64}, Matrix{Float64}, Float64}(AdvancedParticleFilter{PFstate{StaticArraysCore.SVector{2, Float64}, Float64}, typeof(Main.dynamics), typeof(Main.measurement), typeof(Main.measurement_likelihood), Distributions.ZeroMeanIsoNormal{Tuple{Base.OneTo{Int64}}}, Distributions.IsoNormal, DataType, Random.Xoshiro, LowLevelParticleFilters.NullParameters}
  state: PFstate{StaticArraysCore.SVector{2, Float64}, Float64}
  dynamics: dynamics (function of type typeof(Main.dynamics))
  measurement: measurement (function of type typeof(Main.measurement))
  measurement_likelihood: measurement_likelihood (function of type typeof(Main.measurement_likelihood))
  dynamics_density: Distributions.ZeroMeanIsoNormal{Tuple{Base.OneTo{Int64}}}
  initial_density: Distributions.IsoNormal
  resample_threshold: Float64 0.5
  resampling_strategy: LowLevelParticleFilters.ResampleSystematic <: LowLevelParticleFilters.ResamplingStrategy
  rng: Random.Xoshiro
  p: LowLevelParticleFilters.NullParameters LowLevelParticleFilters.NullParameters()
  threads: Bool false
, StaticArraysCore.SVector{1, Float64}[[-0.4144125163835883], [-0.5219730559584325], [0.3145538830906717], [0.7017601582293528], [1.2330730786655693], [-0.4652025217561179], [-0.06367199850875631], [-1.087198352992061], [0.7185778077598954], [-0.4553335541536205]  …  [0.44760791644705783], [-0.5988074353123369], [1.7614595328999059], [0.633399651677678], [-0.9256458645259944], [-0.7850598947629804], [-1.771746987335207], [-1.253865180905387], [1.7249812873924983], [-1.6914858470663048]], StaticArraysCore.SVector{1, Float64}[[0.7885253254081591], [0.46430336505439146], [0.5579303483941609], [0.1395615762310376], [0.39092427573527], [0.22829563313644843], [0.27460562723618537], [0.5377294762248039], [0.17955753804798208], [-0.18291476606810303]  …  [-1.0865024063992805], [-1.253592026847598], [-0.8722430764172682], [-0.7105987251708414], [-1.1147694206730256], [-1.3387787423396305], [-0.588409330633208], [-0.5849277330207583], [-0.674711372864728], [-0.40824711590179713]], StaticArraysCore.SVector{2, Float64}[[2.495562222872565, -0.17113541759492434] [1.5182039976717046, 0.6124927464240043] … [0.10717593591434801, -0.7381694305133293] [0.37150747307309967, -0.6035231211277261]; [-0.29338046342319385, -3.562822976765544] [1.4944485182562355, 0.6678312411771183] … [0.6929978941842497, -0.7021452671756101] [0.8444778472990342, -0.5074058442013731]; … ; [0.2731624568620541, 2.6354621423905686] [-0.24009883045003746, 1.0556011648388859] … [0.1272110435429829, -0.7079238955208565] [0.3958442663545983, -0.6760584444927212]; [-1.8072532576163132, -1.7134277189114044] [-0.32344815884482, 1.0349914418345978] … [-0.03569637066131373, -0.8771975221814242] [0.22902361830737258, -1.0199169812654527]], [-16.85397946299996 -7.017951025204151 … -6.910039427718965 -6.791553815912184; -242.02002077760918 -7.2612448203081446 … -7.1591961233839765 -6.686957201339597; … ; -47.98181526629851 -11.113863466195093 … -7.062999808181977 -7.364391565183182; -83.58923314198371 -10.814510878058925 … -9.181063365056794 -13.262669077962457], [4.7908185349760554e-8 0.0008956587963105248 … 0.0009977184579458365 0.0011232221371204527; 7.799027759898238e-106 0.0007022333369791987 … 0.0007776794603427484 0.0012470715935030722; … ; 1.451317386797544e-21 1.4904260203306658e-5 … 0.0008562057845679173 0.0006334106783757395; 4.984909749297099e-37 2.0105626070523347e-5 … 0.00010297097917097698 1.7381850254377724e-6], -11.211910878853612)
plot(sol, xreal=x)
Example block output

We can even use this type as an AuxiliaryParticleFilter

apfa = AuxiliaryParticleFilter(apf)
sol = forward_trajectory(apfa, u, y, p)
plot(sol, dim=1, xreal=x) # Same as above, but only plots a single dimension
Example block output

See the tutorials section for more advanced examples, including state estimation for DAE (Differential-Algebraic Equation) systems.

Troubleshooting and tuning

Tuning a particle filter can be quite the challenge. To assist with this, we provide som visualization tools

debugplot(pf,u[1:20],y[1:20], runall=true, xreal=x[1:20])
Time     Surviving    Effective nbr of particles
--------------------------------------------------------------
t:     1   1.000    2000.0
t:     2   1.000     290.9
t:     3   0.147    2000.0
t:     4   1.000    1454.1
t:     5   1.000     915.9
t:     6   1.000     599.3
t:     7   1.000     419.8
t:     8   1.000     294.4
t:     9   0.181    2000.0
t:    10   1.000    1638.4
t:    11   1.000     817.6
t:    12   1.000     676.2
t:    13   1.000     588.5
t:    14   1.000     358.5
t:    15   1.000     322.7
t:    16   1.000     256.2
t:    17   1.000     226.1
t:    18   0.225    2000.0
t:    19   1.000     747.5
t:    20   1.000     429.9

The plot displays all states and all measurements. The heatmap in the background represents the weighted particle distributions per time step. For the measurement sequences, the heatmap represent the distributions of predicted measurements. The blue dots corresponds to measured values. In this case, we simulated the data and we had access to states as well, if we do not have that, just omit xreal. You can also manually step through the time-series using

  • commandplot(pf,u,y; kwargs...)

For options to the debug plots, see ?pplot.

Something seems to be off with this figure as the hottest spot is not really where we would expect it

Optimization of the log likelihood can be done by, e.g., global/black box methods, see BlackBoxOptim.jl, see examples in Parameter Estimation. Standard tricks apply, such as performing the parameter search in log-space etc.