Identification of nonlinear models

This package supports two forms of nonlinear system identification.

  • Parameter estimation in a known model structure (linear or nonlinear) $x⁺ = f(x, u, p)$ where p is a vector of parameters to be estimated.
  • Estimation of Hammerstein-Wiener models, i.e., linear systems with static nonlinear functions on the input and/or output.

Parameter estimation in a known model structure

Parameter estimation in differential equations can be performed by forming a one-step ahead predictor of the output, and minimizing the prediction error. This procedure is packaged in the function ControlSystemIdentification.nonlinear_pem which is available as a package extension, available if the user manually installs and loads LeastSquaresOptim.jl.

The procedure to use this function is as follows

  1. The dynamics is specified on the form $x_{k+1} = f(x_k, u_k, p, t)$ or $ẋ = f(x, u, p, t)$ where $x$ is the state, $u$ the input, p is a vector of parameters to be estimated and $t$ is time.
  2. If the dynamics is in continuous time (a differential equation or differential-algebraic equation), use the package SeeToDee.jl to discretize it. If the dynamics is already in discrete time, skip this step.
  3. Define the measurement function $y = h(x, u, p, t)$ that maps the state and input to the measured output.
  4. Specify covariance properties of the dynamics noise and measurement noise, similar to how one would do when building a Kalman filter for a linear system.
  5. Perform the estimation using ControlSystemIdentification.nonlinear_pem.

Internally, ControlSystemIdentification.nonlinear_pem constructs an Unscented Kalman filter (UKF) from the package LowLevelParticleFilters.jl in order to perform state estimation along the provided data trajectory. An optimization problem is then solved in order to find the parameters (and optionally initial condition) that minimizes the prediction errors. This procedure is somewhat different from simply finding the parameters that make a pure simulation of the system match the data, notably, the prediction-error approach can usually handle very poor initial guesses, unstable systems and even chaotic systems. To learn more about the prediction-error method, see the tutorial Properties of the Prediction-Error Method.

ControlSystemIdentification.nonlinear_pemFunction
nonlinear_pem(
    d::IdData,
    discrete_dynamics,
    measurement,
    p0,
    x0,
    R1,
    R2,
    nu;
    optimizer = LevenbergMarquardt(),
    λ = 1.0,
    optimize_x0 = true,
    kwargs...,
)

Nonlinear Prediction-Error Method (PEM).

This method attempts to find the optimal vector of parameters, $p$, and the initial condition $x_0$, that minimizes the sum of squared one-step prediction errors. The prediction is performed using an Unscented Kalman Filter (UKF) and the optimization is performed using a Gauss-Newton method.

Requires LeastSquaresOptim.jl

This function is available only if LeastSquaresOptim.jl is manually installed and loaded by the user.

Arguments:

  • d: Identification data
  • discrete_dynamics: A dynamics function (xₖ, uₖ, p, t) -> x(k+1) that takes the current state x, input u, parameters p, and time t and returns the next state x(k+1).
  • measurement: The measurement / output function of the nonlinear system (xₖ, uₖ, p, t) -> yₖ
  • p0: The initial guess for the parameter vector
  • x0: The initial guess for the initial condition
  • R1: Dynamics noise covariance matrix (increasing this makes the algorithm trust the model less)
  • R2: Measurement noise covariance matrix (increasing this makes the algorithm trust the measurements less)
  • nu: Number of inputs to the system
  • optimizer: Any optimizer from LeastSquaresOptim
  • λ: A weighting factor to minimize dot(e, λ, e). A commonly used metric is λ = Diagonal(1 ./ (mag.^2)), where mag is a vector of the "typical magnitude" of each output. Internally, the square root of W = sqrt(λ) is calculated so that the residuals stored in res are W*e.
  • optimize_x0: Whether to optimize the initial condition x0 or not. If false, the initial condition is fixed to the value of x0 and the optimization is performed only on the parameters p.

The inner optimizer accepts a number of keyword arguments:

  • lower: Lower bounds for the parameters and initial condition (if optimized). If x0 is optimized, this is a vector with layout [lower_p; lower_x0].
  • upper: Upper bounds for the parameters and initial condition (if optimized). If x0 is optimized, this is a vector with layout [upper_p; upper_x0].
  • x_tol = 1e-8
  • f_tol = 1e-8
  • g_tol = 1e-8
  • iterations = 1_000
  • Δ = 10.0
  • store_trace = false

See Identification of nonlinear models for more details.

Experimental

This function is considered experimental and may change in the future without respecting semantic versioning. This implementation also lacks a number of features associated with good nonlinear PEM implementations, such as regularization and support for multiple datasets.

source

Example: Quad tank

This example considers a quadruple tank, where two upper tanks feed liquid into two lower tanks, depicted in the schematics below. The quad-tank process is a well-studied example in many multivariable control courses, this particular instance of the process is borrowed from the Lund University introductory course on automatic control.

process

The process has a cross coupling between the tanks, governed by a parameters $\gamma_i$: The flows from the pumps are divided according to the two parameters $γ_1 , γ_2 ∈ [0, 1]$. The flow to tank 1 is $γ_1 k_1u_1$ and the flow to tank 4 is $(1 - γ_1 )k_1u_1$. Tanks 2 and 3 behave symmetrically.

The dynamics are given by

\[\begin{aligned} \dot{h}_1 &= \dfrac{-a_1}{A_1} \sqrt{2g h_1} + \dfrac{a_3}{A_1} \sqrt{2g h_3} + \dfrac{γ_1 k_1}{A_1} u_1 \\ \dot{h}_2 &= \dfrac{-a_2}{A_2} \sqrt{2g h_2} + \dfrac{a_4}{A_2} \sqrt{2g h_4} + \dfrac{γ_2 k_2}{A_2} u_2 \\ \dot{h}_3 &= \dfrac{-a_3}{A_3} \sqrt{2g h_3} + \dfrac{(1-γ_2) k_2}{A_3} u_2 \\ \dot{h}_4 &= \dfrac{-a_4}{A_4} \sqrt{2g h_4} + \dfrac{(1-γ_1) k_1}{A_4} u_1 \end{aligned}\]

where $h_i$ are the tank levels and $a_i, A_i$ are the cross-sectional areas of outlets and tanks respectively.

We start by defining the dynamics in continuous time, and discretize them using the integrator SeeToDee.Rk4 with a sample time of $T_s = 1$s.

using StaticArrays, SeeToDee

function quadtank(h, u, p, t)
    k1, k2, g = p[1], p[2], 9.81
    A1 = A3 = A2 = A4 = p[3]
    a1 = a3 = a2 = a4 = 0.03
    γ1 = γ2 = p[4]

    ssqrt(x) = √(max(x, zero(x)) + 1e-3) # For numerical robustness at x = 0

    SA[
        -a1/A1 * ssqrt(2g*h[1]) + a3/A1*ssqrt(2g*h[3]) +     γ1*k1/A1 * u[1]
        -a2/A2 * ssqrt(2g*h[2]) + a4/A2*ssqrt(2g*h[4]) +     γ2*k2/A2 * u[2]
        -a3/A3*ssqrt(2g*h[3])                          + (1-γ2)*k2/A3 * u[2]
        -a4/A4*ssqrt(2g*h[4])                          + (1-γ1)*k1/A4 * u[1]
    ]
end
measurement(x,u,p,t) = SA[x[1], x[2]]

Ts = 1.0
discrete_dynamics = SeeToDee.ForwardEuler(quadtank, Ts) # Use Heun, Rk3 or Rk4 if dynamics is more complex or sample rate is lower, ForwardEuler is fastest and higher-order methods do not provide any noticeable improvement in this particular case
SeeToDee.ForwardEuler{typeof(Main.quadtank), Float64}(Main.quadtank, 1.0, 1)

The output from this system is the water level in the first two tanks, i.e., $y = [x_1, x_2]$.

We also specify the number of state variables, outputs and inputs as well as a vector of "true" parameters, the ones we will try to estimate.

nx = 4
ny = 2
nu = 2
p_true = [1.6, 1.6, 4.9, 0.2]
4-element Vector{Float64}:
 1.6
 1.6
 4.9
 0.2

We then simulate some data from the system to use for identification:

using ControlSystemIdentification, ControlSystemsBase
using ControlSystemsBase.DemoSystems: resonant
using LowLevelParticleFilters
using LeastSquaresOptim
using Random, Plots, LinearAlgebra

# Generate some data from the system
Random.seed!(1)
Tperiod = 200
t = 0:Ts:1000
u1 = vcat.(0.25 .* sign.(sin.(2pi/Tperiod .* (t ./ 40).^2)) .+ 0.25)
u2 = vcat.(0.25 .* sign.(sin.(2pi/Tperiod .* (t ./ 40).^2 .+ pi/2)) .+ 0.25)
u  = vcat.(u1,u2)
u = [u; 0 .* u[1:100]]
x0 = [2.5, 1.5, 3.2, 2.8]
x = LowLevelParticleFilters.rollout(discrete_dynamics, x0, u, p_true)[1:end-1]
y = measurement.(x, u, 0, 0)
y = [y .+ 0.5randn(ny) for y in y] # Add some measurement noise
Y = reduce(hcat, y) # Go from vector of vectors to a matrix
U = reduce(hcat, u) # Go from vector of vectors to a matrix
plot(
    plot(reduce(hcat, x)', title="States", lab=["h1" "h2" "h3" "h4"]),
    plot(U', title="Inputs", lab=["u1" "u2"]),
    plot(Y', title="Outputs", lab=["y1" "y2"]),
)
Example block output

We package the experimental data into an iddata object as usual. Finally, we specify the covariance matrices for the dynamics noise and measurement noise as well as a guess for the initial condition. Since we can measure the level in the first two tanks, we use the true initial condition for those tanks, but we pretend that we are quite off when guessing the initial condition for the last two tanks.

Choosing the covariance matrices can be non-trivial, see the blog post How to tune a Kalman filter for some background. Here, we pick some value for $R_1$ that seems reasonable, and pick a deliberately large value for $R_2$. Choosing a large covariance of the measurement noise will lead to the state estimator trusting the measurements less, which in turns leads to a smaller feedback correction. This will make the algorithm favor a model that is good at simulation, rather than focusing exclusively on one-step prediction.

Finally, we call the function ControlSystemIdentification.nonlinear_pem to perform the estimation.

d = iddata(Y, U, Ts)
x0_guess = [2.5, 1.5, 1, 2] # Guess for the initial condition
p_guess = [1.4, 1.4, 5.1, 0.25] # Initial guess for the parameters

R1 = Diagonal([0.1, 0.1, 0.1, 0.1])
R2 = 100*Diagonal(0.5^2 * ones(ny))

model = ControlSystemIdentification.nonlinear_pem(d, discrete_dynamics, measurement, p_guess, x0_guess, R1, R2, nu)
NonlinearPredictionErrorModel
  p: [1.6128542071159255, 1.599892556793914, 4.886859888419816, 0.20431954274871023]
  x0: [2.559653959422618, 1.6758856247603875, 2.8867807519132636, 2.1062164316435323]
  Ts: 1.0
  ny = 2, nu = 2, nx = 4

We can then test how the model performs on the data, and compare with the model corresponding to our initial guess

simplot(model, d, layout=2)

x_guess = LowLevelParticleFilters.rollout(discrete_dynamics, x0_guess, u, p_guess)[1:end-1]
y_guess = measurement.(x_guess, u, 0, 0)
plot!(reduce(hcat, y_guess)', lab="Initial guess")
Example block output

We can also perform a residual analysis to see if the model is able to capture the dynamics of the system

residualplot(model, d)
Example block output

since we are using simulated data here, the residuals are white and there's nothing to worry about. In practice, one should always inspect the residuals to see if there are any systematic errors in the model.

Internally, the returned model object contains the estimated parameters, let's see if they are any good

[p_true p_guess model.p]
4×3 Matrix{Float64}:
 1.6  1.4   1.61285
 1.6  1.4   1.59989
 4.9  5.1   4.88686
 0.2  0.25  0.20432

hopefully, the estimated parameters are close to the true ones.

To customize the implementation of the nonlinear prediction-error method, see a lower-level interface being used in the tutorial in the documentation of LowLevelParticleFilters.jl which also provides the UKF.

Identifying parameters in a ModelingToolkit model

The following shows how to use nonlinear_pem to estimate the parameters of an ModelingToolkit model. This example is a continuation of the quadruple-tank example above, and we'll reuse the same data generated previously.

Warning

ModelingToolkit is a fast moving target that breaks frequently. The example below was tested with ModelingToolkit v9.23, but is not run as part of the build process for this documentation and is not to be considered a supported interface between ControlSystemIdentification and ModelingToolkit.

When we define the MTK model, we give defaults for all parameters. We then specify the inputs and outputs of this model, since they are arrays in this example, we call collect to turn them into scalars. To obtain a dynamics function on the form $\dot x = f(x, u, p, t)$ from MTK, we call ModelingToolkit.generate_control_function. This returns two versions of this function, where the other one operates in place. This function also returns the state variables chosen, the parameters of the model as well as a simplified system with inputs and outputs. The function returned from ModelingToolkit.generate_control_function expects all the parameters of the system to be provided, but we only want to optimize a few of them. We thus wrap this function in discrete_dynamics_mtk in order to insert the optimized parameters into a parameter array that contains also the non-optimized parameters. We make use of the function similar to ensure that the final parameter array has the correct type (Dual numbers for AD will be used).

For good performance, we wrap all the glue code in a function estimate_model_mtk so that we avoid the use of too many global variables.

using ModelingToolkit
using ModelingToolkit: D_nounits as D
t = ModelingToolkit.t_nounits
ssqrt(x) = √(max(x, zero(x)) + 1e-3) # For numerical robustness at x = 0
@register_symbolic ssqrt(x)

@mtkmodel QuadtankModel begin
    @parameters begin
        k1 = 1.4
        k2 = 1.4
        g = 9.81
        A = 5.1
        a = 0.03
        γ = 0.25
    end
    begin
        A1 = A2 = A3 = A4 = A
        a1 = a3 = a2 = a4 = a
        γ1 = γ2 = γ
    end
    @variables begin
        h(t)[1:4] = 0
        u(t)[1:2] = 0
    end
    @equations begin
        D(h[1]) ~ -a1/A1 * ssqrt(2g*h[1]) + a3/A1*ssqrt(2g*h[3]) +     γ1*k1/A1 * u[1]
        D(h[2]) ~ -a2/A2 * ssqrt(2g*h[2]) + a4/A2*ssqrt(2g*h[4]) +     γ2*k2/A2 * u[2]
        D(h[3]) ~ -a3/A3*ssqrt(2g*h[3])                          + (1-γ2)*k2/A3 * u[2]
        D(h[4]) ~ -a4/A4*ssqrt(2g*h[4])                          + (1-γ1)*k1/A4 * u[1]
    end
end

@named mtkmodel = QuadtankModel()
mtkmodel = complete(mtkmodel)
inputs = [collect(mtkmodel.u);]
outputs = [collect(mtkmodel.h[1:2]);]

function estimate_model_mtk(mtkmodel, inputs, outputs) # A wrapper function to avoid using global variables
    (f_oop, f_ip), statevars, p, io_sys = ModelingToolkit.generate_control_function(mtkmodel, inputs; outputs)
    continuous_dynamics = f_oop
    inner_discrete_dynamics = SeeToDee.Rk4(continuous_dynamics, Ts)

    tunable_p = [mtkmodel.k1, mtkmodel.k2, mtkmodel.A, mtkmodel.γ] # Provided in the same order as p_guess
    tunable_indices = [findfirst(isequal(pi), p) for pi in tunable_p] # Figure out what indices of the parameter array correspond to our tunable parameters
    p0 = [ModelingToolkit.defaults(io_sys)[pi] for pi in p]
    full_p = deepcopy(p0)
    output_indices = [findfirst(isequal(yi), statevars) for yi in outputs] # Figure out what indices of the state array correspond to our outputs

    function discrete_dynamics_wrapper(x, u, p, t)
        opt_p = similar(p, length(full_p)) # Create an array of correct length and element type to host the full parameter vector
        opt_p .= full_p # Write all the initial parameters into this new array
        opt_p[tunable_indices] .= p # Overwrite the tunable parameters with the optimization variable
        inner_discrete_dynamics(x, u, opt_p, t) # Call the discretized dynamics function from MTK
    end

    mtk_measurement(x,u,p,t) = x[output_indices]

    model = ControlSystemIdentification.nonlinear_pem(d, discrete_dynamics_wrapper, mtk_measurement, p_guess, x0_guess, R1, R2, nu)
end

estimate_model_mtk(mtkmodel, inputs, outputs)

This should result in a similar final cost value as when the dynamics was implemented manually above.

Hammerstein-Wiener estimation

This package provides elementary identification of nonlinear systems on Hammerstein-Wiener form, i.e., systems with a static input nonlinearity and a static output nonlinearity with a linear system in-between, where the nonlinearities are known. The only aspect of the nonlinearities that are optionally estimated are parameters. To formalize this, the estimation method newpem allows for estimation of a model of the form

\[\begin{aligned} x^+ &= Ax + B u_{nl} \\ y &= Cx + D u_{nl} \\ u_{nl} &= g_i(u, p) \\ y_{nl} &= g_o(y, p) \end{aligned}\]

   ┌─────┐   ┌─────┐   ┌─────┐
 u │     │uₙₗ│     │ y │     │ yₙₗ
──►│  gᵢ ├──►│  P  ├──►│  gₒ ├─►
   │     │   │     │   │     │
   └─────┘   └─────┘   └─────┘

where g_i and g_o are static, nonlinear functions that may depend on some parameter vector $p$ which is optimized together with the matrices $A,B,C,D$. The procedure to estimate such a model is detailed in the docstring for newpem.

The result of this estimation is the linear system without the nonlinearities applied, those must be handled manually by the user.

The default optimizer BFGS may struggle with problems including nonlinearities. If you do not get good results, try a different optimizer, e.g., optimizer = Optim.NelderMead().

Example with simulated data:

The example below identifies a model of a resonant system where the sign of the output is unknown, i.e., the output nonlinearity is given by $y_{nl} = |y|$. To make the example a bit more realistic, we also simulate colored measurement and input noise, yn and un. For an example with real data, see Hammerstein-Wiener estimation of nonlinear belt-drive system.

using ControlSystemIdentification, ControlSystemsBase
using ControlSystemsBase.DemoSystems: resonant
using Random, Plots

# Generate some data from the system
Random.seed!(1)
T = 200
sys = c2d(resonant(ω0 = 0.1) * tf(1, [0.1, 1]), 1)# generate_system(nx, nu, ny)
nx = sys.nx
nu = 1
ny = 1
x0 = zeros(nx)
sim(sys, u, x0 = x0) = lsim(sys, u, 1:T, x0 = x0)[1]
sysn = c2d(resonant(ω0 = 1) * tf(1, [0.1, 1]), 1)

σu = 1e-2 # Input noise standard deviation
σy = 1e-3 # Output noise standard deviation

u  = randn(nu, T)
un = u + sim(sysn, σu * randn(size(u)), 0 * x0)
y  = sim(sys, un, x0)
yn = y + sim(sysn, σy * randn(size(u)), 0 * x0)

# Nonlinear output transformation
ynn = abs.(yn)
d  = iddata(ynn, un, 1)
output_nonlinearity = (y, p) -> y .= abs.(y)

# Estimate 10 models with different random initialization and pick the best one
# If results are poor, try `optimizer = Optim.NelderMead()` instead
results = map(1:10) do _
    sysh, x0h, opt = newpem(d, nx; output_nonlinearity, show_trace=false, focus = :simulation)
    (; sysh, x0h, opt)
end;

(; sysh, x0h, opt) = argmin(r->r.opt.minimum, results) # Find the model with the smallest cost

yh = simulate(sysh, d, x0h)
output_nonlinearity(yh, nothing) # We need to manually apply the output nonlinearity to the prediction
plot(d.t, [abs.(y); u]', lab=["True nonlinear output" "Input"], seriestype = [:line :steps], layout=(2,1), xlabel="Time")
scatter!(d.t, ynn', lab="Measured nonlinear output", sp=1)
plot!(d.t, yh', lab="Simulation", sp=1, l=:dash)
Example block output

Video tutorials

Relevant video tutorials are available here: