Learning a disturbance model using SciML

In this example we will attempt to learn how an unknown and stochastic input, sun shining in through a window, influences a dynamical system, the temperature in a house. How the sun is shining on a house on a cloud-free day, absent any surrounding trees or buildings can be readily simulated. However, the real world offers a number of challenges that influence the effect this has on the inside temperature

Surrounding trees and buildings may cast shadows on the house at certain parts of the day.
The sun shining in through a window has much greater effect than if it's shining on a wall.
Cloud cover modulates the effect of the sun.
As a vendor of, e.g., HVAC equipment with interesting control systems, you may not want to model each individual site in detail, including the location and size of windows and surrounding shading elements. Even if these are static, they are thus to be considered unknown.

We can model this as some deterministic parts and some stochastic parts, some known and some unknown. The path of the sun across the sky is deterministic and periodic, with one daily and one yearly component. The surroundings, like trees and buildings, are for the most part static, but the influence these have on the insolation is unknown, and so is the exact location of windows on the house. However, the cloud cover is stochastic. We can thus model insolation by

Treating the current cloud cover as a stochastic variable $C_{cloud} \in [0, 1]$ to be estimated continuously. We achieve this by including the cloud cover as a state variable in our system.
Treating the insolation when there is no cloud cover as a deterministic function of the time of day (we ignore the yearly component here for simplicity). This function will be modeled as a basis-function expansion that will be learned from data.
The effective insolation at any point in time is thus $I_{solar} = (1 - C_{cloud}) I_{solar, clear}$, that is, the cloud-free insolation is modulated by the current cloud cover.

System Description

We consider a simplified thermal model of a single-room house:

State variable: room temperature T
Control input: heater power P_heater
Disturbances: external temperature T_ext and solar insolation through windows

The heat transfer dynamics follow Newton's law of cooling with additional terms for heating and solar gains:

\[C_{thermal} Ṫ = -k_{loss} (T - T_{ext}) + η P_{heater} + A_{window} I_{solar}\]

where:

C_thermal: thermal capacity of the room
k_loss: heat loss coefficient
η: heater efficiency
A_window: effective window area
I_solar: solar insolation (W/m²)

Data Generation

First, let's generate realistic thermal data with time-varying external conditions:

using LowLevelParticleFilters, Random, SeeToDee, StaticArrays, Plots, LinearAlgebra, Statistics
using LowLevelParticleFilters: SimpleMvNormal
using Optim

# System parameters
const C_thermal = 10.0f0      # Thermal capacity (kWh/°C)
const k_loss = 0.5f0          # Heat loss coefficient (kW/°C)
const η = 0.95f0              # Heater efficiency
const A_window = 20.0f0       # Effective window area factor

# Time constants
const hours_per_day = 24.0f0
const Ts = 0.25f0            # Sample time (15 minutes)

# Helper function for time of day (0-24 hours)
time_of_day(t) = mod(t, hours_per_day)

# External temperature (sinusoidal daily variation)
function external_temp(t)
    tod = time_of_day(t)
    T_mean = 10.0f0  # Mean temperature (°C)
    T_amplitude = 5.0f0  # Daily variation amplitude
    T_mean + T_amplitude * sin(2π * (tod - 6) / hours_per_day)  # Peak at 12:00
end

# True solar insolation pattern (W/m²)
function true_insolation(t, cloud_cover)
    tod = time_of_day(t)
    # Cropped sinusoid
    base_insolation = max(500.0f0 * (0.2 + sin(π * (tod - 6) / 12)), 0)
    return base_insolation * (1 - cloud_cover)
end

# Plot the daily patterns
t_plot = 0:0.1:48  # Two days for visualization
plot(t_plot, external_temp.(t_plot), label="External Temperature (°C)", lw=2)
plot!(t_plot, true_insolation.(t_plot, 0) ./ 100, label="Clear Sky Insolation (×100 W/m²)", lw=2)
plot!(t_plot, true_insolation.(t_plot, 0.5) ./ 100, label="50% Cloud Cover (×100 W/m²)", lw=2, ls=:dash)
xlabel!("Time (hours)")
title!("Daily Environmental Patterns")

Now let's define the true system dynamics and generate training data:

# True system dynamics with time-varying cloud cover
function thermal_dynamics_true(x, u, p, t)
    T_room, cloud_cover = x
    P_heater = u[1]

    # External conditions
    T_ext = external_temp(t)
    I_solar = true_insolation(t, cloud_cover)

    # Heat balance
    dT_dt = (-k_loss * (T_room - T_ext) + η * P_heater + A_window * I_solar / 1000) / C_thermal

    # Cloud cover changes slowly (random walk)
    dcloud_dt = 0.0f0  # Driven by process noise, zero deterministic dynamics

    SA[dT_dt, dcloud_dt]
end

# Discretize the dynamics
discrete_dynamics_true = SeeToDee.Rk4(thermal_dynamics_true, Ts)

# Generate training data
function generate_thermal_data(; days=7, measure_cloud_cover=true)
    rng = Random.default_rng()
    Random.seed!(rng, 123)

    t = 0:Ts:(days * hours_per_day)
    N = length(t)

    # Generate control inputs (varied heating patterns)
    u = Vector{SVector{1, Float32}}(undef, N)
    for (i, t) in enumerate(t)
        tod = time_of_day(t)
        # Different heating strategies throughout the day
        if 6 <= tod < 8 || 17 <= tod < 22  # Morning and evening comfort
            u[i] = SA[3.0f0 + 0.5f0 * randn(rng)]  # Higher heating
        elseif 22 <= tod || tod < 6  # Night setback
            u[i] = SA[1.0f0 + 0.2f0 * randn(rng)]  # Lower heating
        else  # Day time
            u[i] = SA[2.0f0 + 0.3f0 * randn(rng)]  # Moderate heating
        end
        u[i] = SA[clamp(u[i][1], 0.0f0, 5.0f0)]  # Heater power limits
    end

    # Initial conditions
    x0 = SA[20.0f0, 0.3f0]  # Initial room temp and cloud cover

    # Simulate with slowly varying cloud cover
    x = Vector{SVector{2, Float32}}(undef, N)
    x[1] = x0
    for i in 2:N
        # Process noise (small for temperature, larger for cloud cover)
        w = SA[0.01f0 * randn(rng), 0.06f0 * randn(rng)]
        x_next = discrete_dynamics_true(x[i-1], u[i-1], nothing, t[i-1])
        x[i] = x_next + w
        # Keep cloud cover in [0, 1]
        x[i] = SA[x[i][1], clamp(x[i][2]*0.999, 0.0f0, 1.0f0)]
    end

    # Measurements - optionally include cloud cover
    if measure_cloud_cover
        y = [SA[x_i[1] + 0.1f0 * randn(rng), x_i[2] + 0.01f0 * randn(rng)] for x_i in x]
        ny = 2
    else
        y = [SA[x_i[1] + 0.1f0 * randn(rng)] for x_i in x]
        ny = 1
    end

    (; x, u, y, t, N, Ts, ny, measure_cloud_cover)
end

# Generate data (with cloud cover measurements by default)
measure_cloud_cover = false  # Set to false to exclude cloud cover measurements
data = generate_thermal_data(; days=14, measure_cloud_cover)

# Visualize the generated data
p1 = plot(data.t, [x[1] for x in data.x], label="Room Temperature", ylabel="Temperature (°C)")
plot!(data.t, [external_temp(t) for t in data.t], label="External Temperature", ls=:dash, alpha=0.7)

p2 = plot(data.t, [x[2] for x in data.x], label="Cloud Cover", ylabel="Cloud Cover (0-1)", color=:orange)
p3 = plot(data.t, [u[1] for u in data.u], label="Heater Power", ylabel="Power (kW)", color=:red)

plot(p1, p2, p3, layout=(3,1), size=(900, 600), xlabel="Time (hours)")

Radial Basis Function Model

We'll use a radial basis function expansion to learn the solar insolation pattern. This provides a more interpretable model than a neural network, with basis functions centered during daylight hours:

# Initialize RBF weights (parameters to be learned)
rng = Random.default_rng()
Random.seed!(rng, 456)
const n_basis = 8  # Number of basis functions
# Initialize with positive weights since insolation is always positive ("negative insolation" could model things like someone always opening a window in the morning letting cold air in)
rbf_weights = 100.0f0 * rand(Float32, n_basis)  # Random positive initialization

function basis_functions(t)
    tod = time_of_day(t)
    centers = LinRange(7.0f0, 17.0f0, n_basis) # Centers spread from 9 AM to 5 PM
    width = 1.5f0  # Width of each Gaussian basis function (in hours)
    @. exp(-((tod - centers) / width)^2)
end

# RBF evaluation function
function compute_nn_insolation(t, weights)
    return weights'basis_functions(t) # Linear combination of basis functions
end

Hybrid Dynamics for Estimation

Define the dynamics model that uses the neural network for insolation estimation. Since our measurement is linear (we directly observe parts of the state), we can use a LinearMeasurementModel for improved efficiency:

# Linear measurement model - more efficient than nonlinear for direct state measurements
if data.measure_cloud_cover
    C = SA[1.0f0 0.0f0; 0.0f0 1.0f0]  # Measure both temperature and cloud cover
else
    C = SA[1.0f0 0.0f0]  # Only measure temperature
end
measurement_model = LinearMeasurementModel(C, 0, data.measure_cloud_cover ? Diagonal([0.1f0^2, 0.01f0^2]) : Diagonal([0.1f0^2]); ny=data.ny)

# Hybrid dynamics with neural network and cloud cover state
function thermal_dynamics_hybrid(x, u, p, t)
    T_room, cloud_cover = x
    P_heater = u[1]

    # External temperature (known)
    T_ext = external_temp(t)

    # Solar insolation from neural network
    I_base = compute_nn_insolation(t, p)
    I_solar = I_base * (1 - cloud_cover)

    # Heat balance
    dT_dt = (-k_loss * (T_room - T_ext) + η * P_heater + A_window * I_solar / 1000) / C_thermal

    # Cloud cover changes slowly
    dcloud_dt = 0.0001f0*(0.5f0 - cloud_cover)  # Driven by process noise, assume we know mean cloud cover over time

    SA[dT_dt, dcloud_dt]
end

# Discretize hybrid dynamics
const discrete_dynamics_hybrid = SeeToDee.ForwardEuler(thermal_dynamics_hybrid, Ts)

function clamped_dynamics(x,u,p,t)
    xp = discrete_dynamics_hybrid(x,u,p,t)
    SA[xp[1], clamp(xp[2], 0.0f0, 1.0f0)]
end

# System dimensions for the filter
nx = 2  # State: [temperature, cloud_cover]
nu = 1  # Input: heater power
ny = data.ny  # Output dimension depends on whether cloud cover is measured

Parameter Estimation

Now we'll set up the state estimator and the optimization problem using a quasi-Newton method:

# Process and measurement noise for the filter
R1 = SMatrix{nx, nx}(Diagonal([0.01f0, 0.06f0]))  # Process noise
if data.measure_cloud_cover
    R2 = SMatrix{ny, ny}(Diagonal([0.1f0^2, 0.01f0^2]))  # Temperature and cloud cover noise
else
    R2 = SMatrix{ny, ny}(Diagonal([0.1f0^2]))  # Only temperature noise
end

# Initial state estimate
x0 = SA[20.0f0, 0.5f0]  # Initial temperature and cloud cover guess

# Cost function for optimization (sum of squared errors)
function cost(θ)
    T = eltype(θ)

    # Create filter with current parameters and linear measurement model
    kf = UnscentedKalmanFilter(
        clamped_dynamics,
        measurement_model,  # Use the linear measurement model
        R1,
        SimpleMvNormal(T.(x0), T.(2*R1));
        ny, nu, Ts,
    )

    # Compute sum of squared prediction errors
    LowLevelParticleFilters.sse(kf, data.u, data.y, θ)
end

# Initial parameters from the neural network initialization
θ_init = copy(rbf_weights)

# Define optimization options once for reuse
opt_options = Optim.Options(
    show_trace = false,
    store_trace = true,
    iterations = 200,
    g_tol = 1e-12,
)

result = Optim.optimize(
    cost,
    θ_init,
    BFGS(),
    opt_options;
    autodiff = :forward,  # Use forward-mode AD for gradients
)

params_opt = result.minimizer

@info "Optimization complete. Converged: $(Optim.converged(result)), Iterations: $(Optim.iterations(result))"
@info "Final cost: $(Optim.minimum(result))"

# Plot convergence history
# plot(getfield.(result.trace, :value), #yscale=:log10,
#      xlabel="Iteration", ylabel="Cost (SSE)",
#      title="Convergence", lw=2)

[ Info: Optimization complete. Converged: true, Iterations: 67
[ Info: Final cost: 23.895609

Results Analysis

Let's analyze the results by running the filter with optimized parameters:

# Run filter with optimized parameters and linear measurement model
kf_final = UnscentedKalmanFilter(
    clamped_dynamics,
    measurement_model,  # Use the linear measurement model
    R1,
    SimpleMvNormal(x0, R1);
    p = params_opt,
    ny, nu, Ts
)

sol = forward_trajectory(kf_final, data.u, data.y)

# Extract estimated states
T_est = [sol.xt[i][1] for i in 1:length(sol.xt)]
cloud_est = [sol.xt[i][2] for i in 1:length(sol.xt)]
T_true = [data.x[i][1] for i in 1:length(data.x)]
cloud_true = [data.x[i][2] for i in 1:length(data.x)]

# Only compute cloud RMSE when sun is above horizon
sun_up_mask = [true_insolation(data.t[i], 0.0f0) > 0 for i in 1:length(data.t)]
cloud_error = sqrt(mean(abs2, cloud_true[sun_up_mask] .- cloud_est[sun_up_mask]))

# Plot temperature estimation
p1 = plot(data.t, T_true, label="True Temperature", lw=2, color=:blue)
plot!(data.t, T_est, label="Estimated Temperature", lw=2, ls=:dash, color=:red)
plot!(data.t, [y[1] for y in data.y], label="Measurements", alpha=0.3, seriestype=:scatter, ms=1, color=:gray)
ylabel!("Temperature (°C)")
title!("Temperature Estimation")

# Plot cloud cover estimation
p2 = plot(data.t, cloud_true, label="True Cloud Cover", lw=2, color=:blue)
plot!(data.t, cloud_est, label="Estimated Cloud Cover", lw=2, ls=:dash, color=:red)
ylabel!("Cloud Cover")
xlabel!("Time (hours)")
title!("Cloud Cover Estimation")

plot(p1, p2, layout=(2,1), size=(1200, 800))

As we can see, it's easy to estimate the internal temperature, after all, we measure this directly. Estimating the cloud cover is significantly harder, notice in particular how the estimation drifts to 0.5 each night when there is no sun. This is expected since it is impossible to observe (in the estimation-theoretical sense) the cloud cover when there is no sun, since when there is no sun there is no effect of the cloud cover on the variable we do measure, the temperature. The fact that it drifts to 0.5 in particular can be explained by the growing estimated covariance during night combined with the clamping of the estimated cloud cover variable between 0 and 1.

Learned vs True Insolation Pattern

We now have a look at the function we learned for the effect of insolation on the internal temperature, absent of clouds. Since this is a simulated example, we have access to the true function to compare with:

# Generate time points for one day
tod_test = LinRange(0.0f0, 24.0f0, 100)

# Compute true insolation (without clouds)
I_true = [true_insolation(t, 0.0f0) for t in tod_test]

# Compute learned insolation
I_learned = [compute_nn_insolation(t, params_opt) for t in tod_test]

# Plot comparison
plot(tod_test, I_true, label="True Insolation", lw=3, color=:blue)
plot!(tod_test, I_learned, label="Learned Insolation", lw=2, ls=:dash, color=:red)
xlabel!("Time of Day (hours)")
ylabel!("Insolation (W/m²)")
title!("Learned Solar Insolation Pattern")

Hopefully, we see that the estimation has captured the general shape of the true insolation pattern, but perhaps not perfectly, since this function is "hidden" behind an unknown and noisy estimate of the cloud cover.

Discussion

This example demonstrates a classical SciML workflow, the combination of physics-based thermal dynamics with a data-driven model to capture unknown solar patterns. During the day, we were able to roughly estimate the cloud cover despite not being directly measured, by leveraging its effect on the temperature dynamics, but during night our estimator has no fighting chance of doing a good job here, a limitation inherent to the unobservability of the cloud cover in the absence of sunlight.

Diving deeper: How to handle constraints

The variable cloud cover is constrained to be between 0 and 1. The Kalman-filtering framework does not handle such a constraint natively, but there are several different more or less heuristic methods available to handle it. Above, we simply clamped the estimated value to be between 0 and 1, simple but effective. Can we do any better? This section compares a number of different methods

The clamping method
Reformulating the dynamics to use an unconstrained variable that is projected onto the constraint set using a sigmoid function
Projection implemented as a "perfect measurement": We may treat the projection as a fictitious measurement update, imagining that we have obtained a zero-variance measurement that the constrained variable is at the constraint boundary. This is similar to the naive clamping above, but also updates the covariance. We perform this projection using the function LowLevelParticleFilters.project_bound and make use of a callback in order to apply it during the estimation.
Sigma-point rejection, when we use an UnscentedKalmanFilter, we may reject sigma points that fall outside of the feasible region.

# Evaluation function to compare methods
function evaluate_solution(sol, data, params)
    # Extract estimated states
    T_est = [sol.xt[i][1] for i in 1:length(sol.xt)]
    cloud_est = [sol.xt[i][2] for i in 1:length(sol.xt)]
    T_true = [data.x[i][1] for i in 1:length(data.x)]
    cloud_true = [data.x[i][2] for i in 1:length(data.x)]

    # Compute errors
    temp_rmse = sqrt(mean(abs2, T_true .- T_est))
    # Only compute cloud RMSE when sun is above horizon (true insolation > 0)
    sun_up_mask = [true_insolation(data.t[i], 0.0f0) > 0 for i in 1:length(data.t)]
    cloud_rmse = sqrt(mean(abs2, cloud_true[sun_up_mask] .- cloud_est[sun_up_mask]))

    # Compute learned insolation vs true
    tod_test = LinRange(0.0f0, 24.0f0, 100)
    I_true = [true_insolation(t, 0.0f0) for t in tod_test]
    I_learned = [compute_nn_insolation(t, params) for t in tod_test]
    insolation_rmse = sqrt(mean(abs2, I_true .- I_learned))

    return (;
        temp_rmse,
        cloud_rmse,
        insolation_rmse,
        T_est,
        cloud_est,
        I_learned
    )
end

# Evaluate the clamping method (already optimized)
eval_clamping = evaluate_solution(sol, data, params_opt)
@info "Clamping method - Temperature RMSE: $(round(eval_clamping.temp_rmse, digits=3))°C, Cloud RMSE: $(round(eval_clamping.cloud_rmse, digits=3))"

[ Info: Clamping method - Temperature RMSE: 0.075°C, Cloud RMSE: 0.215

Comparison of Constraint Handling Methods

The cloud cover state must be constrained to lie in the interval [0, 1]. We have explored several different methods to handle this constraint:

Clamping: Directly clamp the cloud cover after each dynamics update
Sigmoid transformation: Transform the state through a sigmoid function
Projection: Project the state back to the constraint set after filter updates
Sigma-point rejection: Reject sigma points that violate constraints
Truncated moment matching: Use truncated normal distribution moments

Let's compare these approaches:

Method 1: Clamping (Already Implemented)

The clamping method was used in the main tutorial above. It simply clips the cloud cover to [0, 1] after each dynamics step:

function clamped_dynamics(x,u,p,t)
    xp = discrete_dynamics_hybrid(x,u,p,t)
    SA[xp[1], clamp(xp[2], 0.0f0, 1.0f0)]
end

Method 2: Sigmoid Transformation

# Sigmoid transformation method
sigmoid(x) = 1 / (1 + exp(-x))
sigmoid_inv(y) = log(y / (1 - y))

function thermal_dynamics_sigmoid(x, u, p, t)
    T_room, log_cloud_cover = x
    cloud_cover = sigmoid(log_cloud_cover)
    P_heater = u[1]

    # External temperature (known)
    T_ext = external_temp(t)

    # Solar insolation from RBF model
    I_base = compute_nn_insolation(t, p)
    I_solar = I_base * (1 - cloud_cover)

    # Heat balance
    dT_dt = (-k_loss * (T_room - T_ext) + η * P_heater + A_window * I_solar / 1000) / C_thermal

    # Cloud cover changes slowly (in transformed space)
    dlogcloud_dt = 0.0001f0*(sigmoid_inv(0.5f0) - log_cloud_cover)

    SA[dT_dt, dlogcloud_dt]
end

# Discretize sigmoid dynamics
discrete_dynamics_sigmoid = SeeToDee.ForwardEuler(thermal_dynamics_sigmoid, Ts)

# Measurement model for sigmoid method
measurement_sigmoid = if data.measure_cloud_cover
    (x, u, p, t) -> SA[x[1], sigmoid(x[2])]
else
    (x, u, p, t) -> SA[x[1]]
end

# Optimize sigmoid method
function cost_sigmoid(θ)
    T = eltype(θ)
    x0_sigmoid = SA[20.0f0, sigmoid_inv(0.5f0)]

    kf = UnscentedKalmanFilter(
        discrete_dynamics_sigmoid,
        measurement_sigmoid,
        R1,
        R2,
        SimpleMvNormal(T.(x0_sigmoid), T.(2*R1));
        ny, nu, Ts,
    )

    LowLevelParticleFilters.sse(kf, data.u, data.y, θ)
end

# Run optimization for sigmoid method
@info "Optimizing sigmoid method..."
result_sigmoid = Optim.optimize(
    cost_sigmoid,
    θ_init,
    BFGS(),
    opt_options;
    autodiff = :forward,
)

params_sigmoid = result_sigmoid.minimizer

# Evaluate sigmoid method
x0_sigmoid = SA[20.0f0, sigmoid_inv(0.5f0)]
kf_sigmoid = UnscentedKalmanFilter(
    discrete_dynamics_sigmoid,
    measurement_sigmoid,
    R1,
    R2,
    SimpleMvNormal(x0_sigmoid, R1);
    p = params_sigmoid,
    ny, nu, Ts
)

sol_sigmoid = forward_trajectory(kf_sigmoid, data.u, data.y)

# Transform cloud estimates back from log space
sol_sigmoid_transformed = deepcopy(sol_sigmoid)
for i in 1:length(sol_sigmoid_transformed.xt)
    x = sol_sigmoid_transformed.xt[i]
    sol_sigmoid_transformed.xt[i] = SA[x[1], sigmoid(x[2])]
end

eval_sigmoid = evaluate_solution(sol_sigmoid_transformed, data, params_sigmoid)
@info "Sigmoid method - Temperature RMSE: $(round(eval_sigmoid.temp_rmse, digits=3))°C, Cloud RMSE: $(round(eval_sigmoid.cloud_rmse, digits=3))"

[ Info: Optimizing sigmoid method...
[ Info: Sigmoid method - Temperature RMSE: 0.073°C, Cloud RMSE: 0.207

Method 3: Projection

# Projection method - uses standard dynamics but projects after updates

function post_update_cb(kf, u, y, p, ll, e)
    if !(0 <= kf.x[2] <= 1)
        xn, Rn = LowLevelParticleFilters.project_bound(kf.x, kf.R, 2; lower=0, upper=1, tol=1e-9)
        kf.x = xn
        kf.R = Rn
    end
    nothing
end

# Use original unclamped dynamics for projection method
discrete_dynamics_unclamped = discrete_dynamics_hybrid

# Optimize projection method
function cost_projection(θ)
    T = eltype(θ)

    kf = UnscentedKalmanFilter(
        discrete_dynamics_unclamped,
        measurement_model,  # Use the linear measurement model
        R1,
        SimpleMvNormal(T.(x0), T.(2*R1));
        ny, nu, Ts,
    )

    LowLevelParticleFilters.sse(kf, data.u, data.y, θ; post_update_cb)
end

# Run optimization for projection method
@info "Optimizing projection method..."
result_projection = Optim.optimize(
    cost_projection,
    θ_init,
    BFGS(),
    opt_options;
    autodiff = :forward,
)

params_projection = result_projection.minimizer

# Evaluate projection method
kf_projection = UnscentedKalmanFilter(
    discrete_dynamics_unclamped,
    measurement_model,  # Use the linear measurement model
    R1,
    SimpleMvNormal(x0, R1);
    p = params_projection,
    ny, nu, Ts
)

post_predict_cb(kf, p, args...) = post_update_cb(kf, 0, 0, p, 0, 0)
sol_projection = forward_trajectory(kf_projection, data.u, data.y; post_predict_cb, post_correct_cb=post_predict_cb)

eval_projection = evaluate_solution(sol_projection, data, params_projection)
@info "Projection method - Temperature RMSE: $(round(eval_projection.temp_rmse, digits=3))°C, Cloud RMSE: $(round(eval_projection.cloud_rmse, digits=3))"

[ Info: Optimizing projection method...
[ Info: Projection method - Temperature RMSE: 0.075°C, Cloud RMSE: 0.255

Method 4: Sigma-point rejection

The UKF provides a mechanism to reject sigma points that violate constraints. When a sigma point falls outside [0,1] for cloud cover, we can reject it and replace it with the mean point:

# Sigma-point rejection method
function reject_sigma_point(x)
    # Reject if cloud cover is outside [0, 1]
    return !(0.0f0 <= x[2] <= 1.0f0)
end

# Use unclamped dynamics for rejection method
discrete_dynamics_rejection = discrete_dynamics_hybrid

# Optimize with sigma-point rejection
function cost_rejection(θ)
    T = eltype(θ)

    kf = UnscentedKalmanFilter(
        discrete_dynamics_rejection,
        measurement_model,  # Use the linear measurement model
        R1,
        SimpleMvNormal(T.(x0), T.(2*R1));
        ny, nu, Ts,
        reject = reject_sigma_point  # Add rejection function
    )

    LowLevelParticleFilters.sse(kf, data.u, data.y, θ)
end

# Run optimization for rejection method
@info "Optimizing sigma-point rejection method..."
result_rejection = Optim.optimize(
    cost_rejection,
    θ_init,
    BFGS(),
    opt_options;
    autodiff = :forward,
)

params_rejection = result_rejection.minimizer

# Evaluate rejection method
kf_rejection = UnscentedKalmanFilter(
    discrete_dynamics_rejection,
    measurement_model,  # Use the linear measurement model
    R1,
    SimpleMvNormal(x0, R1);
    p = params_rejection,
    ny, nu, Ts,
    reject = reject_sigma_point
)

sol_rejection = forward_trajectory(kf_rejection, data.u, data.y)

eval_rejection = evaluate_solution(sol_rejection, data, params_rejection)
@info "Rejection method - Temperature RMSE: $(round(eval_rejection.temp_rmse, digits=3))°C, Cloud RMSE: $(round(eval_rejection.cloud_rmse, digits=3))"

[ Info: Optimizing sigma-point rejection method...
[ Info: Rejection method - Temperature RMSE: 0.077°C, Cloud RMSE: 0.281

Method 5: Truncated Moment Matching

This method uses truncated normal distribution moments to handle the constraint, providing a statistically principled approach:

# Truncated moment matching method - uses standard dynamics but applies moment matching after updates
function post_update_tmm(kf, u, y, p, ll, e)
    if !(0 <= kf.x[2] <= 1)
        xn, Rn = LowLevelParticleFilters.truncated_moment_match(kf.x, kf.R, 2; lower=0, upper=1, tol=1e-9)
        kf.x = xn
        kf.R = Rn
    end
    nothing
end

# Use original unclamped dynamics for truncated moment matching
discrete_dynamics_tmm = discrete_dynamics_hybrid

# Optimize truncated moment matching method
function cost_tmm(θ)
    T = eltype(θ)

    kf = UnscentedKalmanFilter(
        discrete_dynamics_tmm,
        measurement_model,  # Use the linear measurement model
        R1,
        SimpleMvNormal(T.(x0), T.(2*R1));
        ny, nu, Ts,
    )

    LowLevelParticleFilters.sse(kf, data.u, data.y, θ; post_update_cb=post_update_tmm)
end

# Run optimization for truncated moment matching
@info "Optimizing truncated moment matching method..."
result_tmm = Optim.optimize(
    cost_tmm,
    θ_init,
    BFGS(),
    opt_options;
    autodiff = :forward,
)

params_tmm = result_tmm.minimizer

# Evaluate truncated moment matching method
kf_tmm = UnscentedKalmanFilter(
    discrete_dynamics_tmm,
    measurement_model,  # Use the linear measurement model
    R1,
    SimpleMvNormal(x0, R1);
    p = params_tmm,
    ny, nu, Ts
)

post_predict_tmm(kf, p, args...) = post_update_tmm(kf, 0, 0, p, 0, 0)
sol_tmm = forward_trajectory(kf_tmm, data.u, data.y; post_predict_cb=post_predict_tmm, post_correct_cb=post_predict_tmm)

eval_tmm = evaluate_solution(sol_tmm, data, params_tmm)
@info "Truncated moment matching - Temperature RMSE: $(round(eval_tmm.temp_rmse, digits=3))°C, Cloud RMSE: $(round(eval_tmm.cloud_rmse, digits=3))"

[ Info: Optimizing truncated moment matching method...
[ Info: Truncated moment matching - Temperature RMSE: 0.075°C, Cloud RMSE: 0.21

Comparison Results

# Plot comparison of all methods
p1 = plot(data.t, [data.x[i][2] for i in 1:length(data.x)],
    label="True Cloud Cover", lw=3, color=:black, alpha=0.7)
plot!(data.t, eval_clamping.cloud_est, label="Clamping", lw=2, color=:blue)
plot!(data.t, eval_sigmoid.cloud_est, label="Sigmoid", lw=2, color=:red, ls=:dash)
plot!(data.t, eval_projection.cloud_est, label="Projection", lw=2, color=:green, ls=:dot)
plot!(data.t, eval_rejection.cloud_est, label="Rejection", lw=2, color=:purple, ls=:dashdot)
plot!(data.t, eval_tmm.cloud_est, label="Truncated MM", lw=2, color=:orange, ls=:dashdotdot)
ylabel!("Cloud Cover")
xlabel!("Time (hours)")
title!("Cloud Cover Estimation - Method Comparison")

# Compare learned insolation patterns
tod_test = LinRange(0.0f0, 24.0f0, 100)
I_true_plot = [true_insolation(t, 0.0f0) for t in tod_test]

p2 = plot(tod_test, I_true_plot, label="True", lw=3, color=:black, alpha=0.7)
plot!(tod_test, eval_clamping.I_learned, label="Clamping", lw=2, color=:blue)
plot!(tod_test, eval_sigmoid.I_learned, label="Sigmoid", lw=2, color=:red, ls=:dash)
plot!(tod_test, eval_projection.I_learned, label="Projection", lw=2, color=:green, ls=:dot)
plot!(tod_test, eval_rejection.I_learned, label="Rejection", lw=2, color=:purple, ls=:dashdot)
plot!(tod_test, eval_tmm.I_learned, label="Truncated MM", lw=2, color=:orange, ls=:dashdotdot)
xlabel!("Time of Day (hours)")
ylabel!("Insolation (W/m²)")
title!("Learned Insolation Patterns")

plot(p1, p2, layout=(2,1), size=(1200, 800))

Summary table

println("\n=== Method Comparison Summary ===")
println("Method        | Temp RMSE | Cloud RMSE | Insolation RMSE")
println("------------- | --------- | ---------- | ---------------")
println("Clamping      | $(round(eval_clamping.temp_rmse, digits=3))     | $(round(eval_clamping.cloud_rmse, digits=3))      | $(round(eval_clamping.insolation_rmse, digits=1))")
println("Sigmoid       | $(round(eval_sigmoid.temp_rmse, digits=3))     | $(round(eval_sigmoid.cloud_rmse, digits=3))      | $(round(eval_sigmoid.insolation_rmse, digits=1))")
println("Projection    | $(round(eval_projection.temp_rmse, digits=3))     | $(round(eval_projection.cloud_rmse, digits=3))      | $(round(eval_projection.insolation_rmse, digits=1))")
println("Rejection     | $(round(eval_rejection.temp_rmse, digits=3))     | $(round(eval_rejection.cloud_rmse, digits=3))      | $(round(eval_rejection.insolation_rmse, digits=1))")
println("Truncated MM  | $(round(eval_tmm.temp_rmse, digits=3))     | $(round(eval_tmm.cloud_rmse, digits=3))      | $(round(eval_tmm.insolation_rmse, digits=1))")


=== Method Comparison Summary ===
Method        | Temp RMSE | Cloud RMSE | Insolation RMSE
------------- | --------- | ---------- | ---------------
Clamping      | 0.075     | 0.215      | 43.0
Sigmoid       | 0.073     | 0.207      | 32.7
Projection    | 0.075     | 0.255      | 60.5
Rejection     | 0.077     | 0.281      | 182.2
Truncated MM  | 0.075     | 0.21      | 31.2

Discussion of Constraint Methods

Each method has different trade-offs:

Clamping: Simple and computationally efficient, but creates discontinuities in the dynamics that some filters may not like.
Sigmoid: Smooth transformation that naturally keeps variables bounded, but changes the noise characteristics and may introduce "stickyness" at the boundaries.
Projection: Similar to clamping, but takes correlation between variables into account, updating also non-clamped variables and covariance.
Sigma-point Rejection: Simple method that often introduces large bias.
Truncated Moment Matching: Provides a statistically principled approach by computing the exact moments of the truncated normal distribution. May be slightly computationally expensive.

The results show that the sigmoidal transformation and the truncated moment matching appear to perform best on this particular problem. We didn't use all that much data, so there is bound to be some randomness in these findings. I did try with 10x more data and the findings were quite similar.