An introduction to the theory and methods of inverse problems and data assimilation
Chapter 6

Programming of numerical algorithms and useful tools


© IOP Publishing Ltd 2015

ISBN 978-0-7503-1218-9

Abstract

A key part of the area of inverse problems and data assimilation is the development of algorithms which solve inverse problems and assimilate data into dynamical models. Algorithms need to be implemented and tested, both in generic situations set up to investigate their behavior and in real-world environments.


Media content for this book is available from http://iopscience.iop.org/book/978-0-7503-1218-9/page/about-the-book.


Of course, a wide variety of programming languages and tools is available for implementation today. Often, the choice of tools is influenced by the institutional environment of researchers. For example, many operational centers for weather prediction run their code on supercomputers, where FORTRAN compilers have been used for decades. Programming languages such as C++ and Java are also very popular among researchers today.

For developing new algorithms and for learning about inversion and assimilation, fast implementation and visualization are needed. To fully understand algorithms, you need to gain some experience with simple examples as well as with more complex problems. To this end, we provide generic codes for many of the algorithms in an environment which can easily be used by everyone (available for download at the book website). The codes can be run with either MATLAB or the free alternative OCTAVE [1], which is available for download on the Internet.

Here, we assume that the reader is familiar with elementary programming in OCTAVE, such as variables, loops, scripts, functions and elementary plotting. These aspects can be learned within hours from simple online tutorials. We do not assume much further knowledge.

6.1. MATLAB or OCTAVE programming: the butterfly

Let us start with a simple OCTAVE code to generate the butterfly shown in figure 5.1 and figure 6.1, the classical example of a dynamical system introduced by Lorenz in 1963. It is very popular as a test case for data assimilation algorithms and consists of the three coupled ordinary differential equations

$\frac{{\rm{d}}x}{{\rm{d}}t}=\sigma (y-x)$   (6.1.1)

$\frac{{\rm{d}}y}{{\rm{d}}t}=x(\rho -z)-y$   (6.1.2)

$\frac{{\rm{d}}z}{{\rm{d}}t}=xy-\beta z$   (6.1.3)

with constants $\sigma ,\rho ,\beta ,$ known as the Prandtl number, the Rayleigh number and a non-dimensional wave number, respectively. The classical values $\sigma =10,$ $\beta =8/3$ and $\rho =28$ have been suggested by Lorenz.

The state of the system is given by $\varphi ={(x,y,z)}^{T}\in {{\mathbb{R}}}^{3}$ and the above system can be summarized into the form

$\frac{{\rm{d}}\varphi }{{\rm{d}}t}=F(\varphi )$   (6.1.4)

with

$F(\varphi ):=\left(\begin{array}{c}\sigma (y-x)\\ x(\rho -z)-y\\ xy-\beta z\end{array}\right).$   (6.1.5)

The Runge–Kutta scheme of 4th order (see for example [2]) calculates a numerical solution to the system (6.1.1)–(6.1.3) with initial values x0, y0 and z0 by iterating, with time step $h\gt 0$ and ${\varphi }_{k}$ denoting the approximate state at time ${t}_{k}=kh,$

${k}_{1}=F({\varphi }_{k})$   (6.1.6)

${k}_{2}=F\left({\varphi }_{k}+\frac{h}{2}{k}_{1}\right)$   (6.1.7)

${k}_{3}=F\left({\varphi }_{k}+\frac{h}{2}{k}_{2}\right)$   (6.1.8)

${k}_{4}=F({\varphi }_{k}+h{k}_{3})$   (6.1.9)

${\varphi }_{k+1}={\varphi }_{k}+\frac{h}{6}({k}_{1}+2{k}_{2}+2{k}_{3}+{k}_{4}).$   (6.1.10)

Code 6.1.1.   The following OCTAVE or MATLAB function simulates the solution to the ordinary differential equation (6.1.1)–(6.1.3) by a 4th-order Runge–Kutta scheme.
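The printed listing is not reproduced here; the following is a minimal sketch of such a function, where the argument list and variable names are our assumptions, not necessarily those of the book.

% sim_06_1_1_Lorenz63.m -- sketch; signature and names are assumptions
function Phi = sim_06_1_1_Lorenz63(phi0, h, NTime, sigma, rho, beta)
% 4th-order Runge-Kutta integration (6.1.6)-(6.1.10) of the Lorenz 1963
% system; phi0 = [x0; y0; z0], h = time step, NTime = number of steps.
% Phi collects the full trajectory as a 3 x (NTime+1) matrix.
Phi = zeros(3, NTime+1);
Phi(:,1) = phi0(:);
for k = 1:NTime
  phi = Phi(:,k);
  k1 = sim_06_1_2_F(phi,          sigma, rho, beta);
  k2 = sim_06_1_2_F(phi + h/2*k1, sigma, rho, beta);
  k3 = sim_06_1_2_F(phi + h/2*k2, sigma, rho, beta);
  k4 = sim_06_1_2_F(phi + h*k3,   sigma, rho, beta);
  Phi(:,k+1) = phi + h/6*(k1 + 2*k2 + 2*k3 + k4);
end
end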

Figure 6.1. The result of code 6.1.3 which displays the solution to the Lorenz 1963 system (6.1.1)–(6.1.3).

Here, we use the function sim_06_1_2_F as given by the following code.

Code 6.1.2.   Forcing term for the Lorenz 1963 dynamical system.
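Again the printed listing is not shown; a minimal sketch consistent with (6.1.5) reads as follows.

% sim_06_1_2_F.m -- sketch of the forcing term (6.1.5)
function F = sim_06_1_2_F(phi, sigma, rho, beta)
% Right-hand side F(phi) of the Lorenz 1963 system with phi = [x; y; z].
x = phi(1); y = phi(2); z = phi(3);
F = [sigma*(y - x);
     x*(rho - z) - y;
     x*y - beta*z];
end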

MATLAB or OCTAVE allows both functions and scripts. Functions hide the values of their local variables and allow clean calls of functionality; for code development, simple scripting is often helpful at the beginning. In any case, parameters need to be set and functions need to be called. We recommend carrying this out in well-defined scripts as in the following code 6.1.3, since this allows clear debugging of how values are set.

Code 6.1.3.   A simple script sim_06_1_3_butterfly.m to call the Runge–Kutta solution of the Lorenz 1963 system to generate the 'butterfly' shown in figure 5.1.
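A sketch of such a driver script; the step size, number of steps and initial state are assumptions chosen for illustration.

% sim_06_1_3_butterfly.m -- sketch of the driver script
sigma = 10; rho = 28; beta = 8/3;   % classical Lorenz parameters
h = 0.01; NTime = 10000;            % step size and number of steps (assumed)
phi0 = [1; 1; 1];                   % initial state (assumed)
Phi = sim_06_1_1_Lorenz63(phi0, h, NTime, sigma, rho, beta);
plot3(Phi(1,:), Phi(2,:), Phi(3,:), 'k');
grid on; xlabel('x'); ylabel('y'); zlabel('z');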

In the next section we will show how data assimilation methods can easily be applied to the Lorenz 1963 model.

6.2. Data assimilation made simple

For data assimilation algorithms, we need to be able to apply the model to some initial state and run it some time T into the future. Here, we realize this model by a routine called Lorenz63. It is based on the Runge–Kutta solution of the Lorenz 1963 model described in section 6.1. We have set up this model routine such that we can generate an ensemble of $L\in {\mathbb{N}}$ states by running the model for an ensemble of L initial conditions.

Code 6.2.1.   The dynamics is generated by the model routine sim_06_2_1_Lorenz63.m.
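The book's listing is not reproduced; a minimal sketch, under the assumption that the L ensemble states are stored as the columns of a 3 × L matrix, is the following.

% sim_06_2_1_Lorenz63.m -- sketch of the ensemble model routine
function Xout = sim_06_2_1_Lorenz63(Xin, T, h, sigma, rho, beta)
% Propagate each of the L columns (states) of Xin from t = 0 to t = T
% with the Runge-Kutta routine of section 6.1; only the states at
% time T are returned.
NTime = round(T/h);
L = size(Xin, 2);
Xout = zeros(3, L);
for ell = 1:L
  Phi = sim_06_1_1_Lorenz63(Xin(:,ell), h, NTime, sigma, rho, beta);
  Xout(:,ell) = Phi(:,end);   % keep only the final state
end
end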

The code here calls the routine sim_06_1_1_Lorenz63.m, which generates the full trajectory between time t = 0 and time t = T with a spacing h of the time steps and ${N}_{\text{Time}}$ such steps to reach T. This has been chosen just for simplicity; of course, for other applications one would modify the original routine so as not to store all intermediate states.

Now, we first run the model to generate some kind of truth, usually called a nature run. In practical applications the nature run is different from the model, just as nature is different from our approximations to it. It is therefore important to choose the parameters and set-up of the nature run carefully, to reflect this difference between the truth and the model which is to be used for the data assimilation. Here, we choose to modify the Lorenz 1963 parameter σ for the nature run.

Observations are modeled by the observation operator H. Here, we choose a linear operator, which observes either one component of the system, or the sum of two components, or the full state.

Code 6.2.2.   Code sim_06_2_2_Generate_Original_and_Data.m to generate the nature run and the measurement data.
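The printed listing is not reproduced here; the following minimal sketch is consistent with the description, where the perturbed value of σ, the noise level and the cycle parameters are our assumptions.

% sim_06_2_2_Generate_Original_and_Data.m -- sketch
sigma_nat = 11; rho = 28; beta = 8/3;  % nature run with perturbed sigma (assumed)
h = 0.01; T = 0.1; Nnat = 200;         % step size, cycle length, no. of cycles (assumed)
H = [1 1 0];                           % observation operator: observes x + y
x0 = [1; 1; 1];                        % initial true state (assumed)
x_true = zeros(3, Nnat);
y_obs  = zeros(size(H,1), Nnat);
x = x0;
for j = 1:Nnat
  x = sim_06_2_1_Lorenz63(x, T, h, sigma_nat, rho, beta);  % nature run step
  x_true(:,j) = x;
  y_obs(:,j) = H*x + 0.1*randn(size(H,1), 1);  % noisy measurements (level assumed)
end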

The next step is running the data assimilation algorithm. Here, we start with three-dimensional variational assimilation (3D-VAR) or Tikhonov regularization. We need to define the covariance matrices, which here we choose by hand as diagonal matrices.

A key point here is to make sure that the model is different from the nature run model. Here, we modify the parameter σ. The initial state for our assimilation is chosen to be identical to the initial state of the nature run.

The assimilation cycle takes model runs and assimilation steps in turn. We carry out Nnat such cycling steps for all available data yj at times tj , $j=1,\ldots ,\;{\mathtt{\text{Nnat}}}.$ From the analysis xa at time step ${t}_{j-1}$ we run the model forward and calculate the background xb at time tj . Then the analysis step following equation (5.2.14) with $\alpha =1$ is carried out, which for $\alpha =1$ is the classical update ${x}_{a}={x}_{b}+{\bf{B}}{H}^{{\rm{T}}}{(R+H{\bf{B}}{H}^{{\rm{T}}})}^{-1}({y}_{j}-H{x}_{b}),$ and xa is calculated at time tj . We store it in some vector and also calculate the error ej at time tj , i.e. the norm of the difference between the analysis and the truth. The whole analysis cycle can now be summarized in a short script.

Code 6.2.3.   Code sim_06_2_3_Carry_out_Assimilation.m to carry out the data assimilation by a 3D-VAR algorithm.
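A sketch of the 3D-VAR cycle; it assumes the variables of code 6.2.2 are still in the workspace, and the covariance values are our assumptions.

% sim_06_2_3_Carry_out_Assimilation.m -- sketch of the 3D-VAR cycle
sigma_mod = 10;                          % model sigma, different from the nature run
B = 0.1*eye(3);                          % background covariance, chosen by hand
R = 0.1*eye(size(H,1));                  % observation covariance, chosen by hand
K = B*H'/(R + H*B*H');                   % constant gain of (5.2.14) with alpha = 1
xa = x0;                                 % start from the nature-run initial state
X_b = zeros(3, Nnat); X_a = zeros(3, Nnat); e = zeros(1, Nnat);
for j = 1:Nnat
  xb = sim_06_2_1_Lorenz63(xa, T, h, sigma_mod, rho, beta);  % background
  xa = xb + K*(y_obs(:,j) - H*xb);                           % analysis
  X_b(:,j) = xb; X_a(:,j) = xa;
  e(j) = norm(xa - x_true(:,j));                             % analysis error
end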

Finally, we display the nature run, the background states and the analysis states in typical three-dimensional graphics as shown in figure 6.2. The following code also generates a figure with the error curve ej , $j=1,\ldots ,\;{\mathtt{\text{Nnat}}}.$

Figure 6.2. The result of code 6.2.3 as generated by code 6.2.4, i.e. the assimilation with observation operator $H=[1\ 1\ 0]$ and the 3D-VAR algorithm. In (a) the original is shown in black, the background in blue and the analysis in red. The error evolution is shown in (b). The assimilation leads to a stable synchronization between the truth and the assimilation cycle.

Code 6.2.4.   Code sim_06_2_4_Plot_Original_and_Analysis.m to display the nature run, the background states and the analysis states.
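A minimal plotting sketch, assuming the variables of codes 6.2.2 and 6.2.3 in the workspace.

% sim_06_2_4_Plot_Original_and_Analysis.m -- sketch
figure(1); clf; hold on;
plot3(x_true(1,:), x_true(2,:), x_true(3,:), 'k');  % nature run in black
plot3(X_b(1,:), X_b(2,:), X_b(3,:), 'b');           % background in blue
plot3(X_a(1,:), X_a(2,:), X_a(3,:), 'r');           % analysis in red
grid on; view(3); hold off;
figure(2); clf;
plot(1:Nnat, e, 'r');                               % error curve e_j
xlabel('assimilation step j'); ylabel('e_j');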

Finally, we note that these scripts can be run one by one by a control script; just call sim_06_2_5_Control.m.

Code 6.2.5.   Script to control the nature run, the data assimilation cycle and visualization.
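A sketch; since the scripts above communicate through the shared workspace, the control script simply calls them in order.

% sim_06_2_5_Control.m -- sketch of the control script
sim_06_2_2_Generate_Original_and_Data   % nature run and measurements
sim_06_2_3_Carry_out_Assimilation       % 3D-VAR cycle
sim_06_2_4_Plot_Original_and_Analysis   % graphics

Sharing the workspace keeps the example short; for larger experiments one would pass parameters explicitly through function interfaces.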

6.3. Ensemble data assimilation in a nutshell

The goal of this section is to show how the ensemble Kalman square root filter (SRF) can be set up in a simple way for the Lorenz 1963 system to carry out the assimilation. We will also show that for some observation operators this ensemble data assimilation system is better than 3D-VAR.

Figure 6.3. The result of code 6.2.3 with the 3D-VAR algorithm; in contrast to figure 6.2 we employed the observation operator $H=[1\ 0\ 0].$ In (a) and (c) the original is shown in black, the background in blue and the analysis in red. The error evolution is shown in (b) and (d). The assimilation does not lead to a stable synchronization between the truth and the assimilation cycle. In (c) and (d) we show the results of code 6.3.2, where the ensemble Kalman SRF is used. Here, with the ensemble Kalman filter using the dynamic covariances between the different variables, the assimilation is stable.

Figure 6.4. The true function ${x}_{\text{true}}$ (black dots), the reconstruction ${\bf{x}}$ without regularization (red) and the regularized solution ${{\bf{x}}}_{\alpha }$ in blue. (a) is carried out with $n={2}^{4}$, (b) with $n={2}^{6}$, where the ill-posedness strongly increases, as seen by the oscillations which take over in the reconstruction (red curve). The images are generated using code 6.4.1.

The generation of the nature run and the simulation of measurements are described in code 6.2.2. We then need to generate an initial ensemble and carry out the data assimilation step in a cycled way: the analysis step is given by equation (5.2.14), with the covariance matrix B approximated according to equation (5.5.15), and the analysis ensemble is calculated by the SRF as described in (5.5.22). The control code is given as follows, generating figure 6.3.

Code 6.3.1.   Script sim_06_3_1_control_EnKF.m to carry out the ensemble Kalman SRF assimilation cycle for the Lorenz 1963 system.
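A sketch; the ensemble size and the initial spread are our assumptions.

% sim_06_3_1_control_EnKF.m -- sketch of the EnKF control script
sim_06_2_2_Generate_Original_and_Data        % nature run and data (code 6.2.2)
L = 20;                                      % ensemble size (assumed)
Xens = repmat(x0, 1, L) + 0.5*randn(3, L);   % initial ensemble around x0 (spread assumed)
sim_06_3_2_EnKF_square_root_filter           % assimilation cycle (code 6.3.2)
sim_06_2_4_Plot_Original_and_Analysis        % graphics (code 6.2.4)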

The ensemble Kalman filter itself is carried out by code 6.3.2.

Code 6.3.2.   Script sim_06_3_2_EnKF_square_root_filter.m to carry out the ensemble Kalman SRF assimilation cycle for the Lorenz 1963 system.
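The book's listing is not reproduced; the following sketch implements one common variant of the square root filter, an ensemble transform in the spirit of (5.5.22). The concrete transform chosen here is our assumption, as are the covariance values.

% sim_06_3_2_EnKF_square_root_filter.m -- sketch of an ensemble SRF cycle
sigma_mod = 10;                        % model parameter, differs from the nature run
R = 0.1*eye(size(H,1));                % observation covariance (assumed)
gamma = 1.2;                           % multiplicative inflation factor
X_b = zeros(3, Nnat); X_a = zeros(3, Nnat); e = zeros(1, Nnat);
for j = 1:Nnat
  Xens = sim_06_2_1_Lorenz63(Xens, T, h, sigma_mod, rho, beta); % forecast ensemble
  xb = mean(Xens, 2);
  Xp = Xens - repmat(xb, 1, L);        % background perturbations
  B  = Xp*Xp'/(L-1);                   % ensemble covariance, cf. (5.5.15)
  K  = B*H'/(H*B*H' + R);              % Kalman gain
  xa = xb + K*(y_obs(:,j) - H*xb);     % analysis mean, cf. (5.2.14)
  S  = H*Xp;
  A  = inv(eye(L) + S'*(R\S)/(L-1));   % transform of an ETKF-type SRF (assumed variant)
  Xp = gamma * Xp * real(sqrtm(A));    % analysis perturbations with inflation
  Xens = repmat(xa, 1, L) + Xp;
  X_b(:,j) = xb; X_a(:,j) = xa;
  e(j) = norm(xa - x_true(:,j));       % analysis error
end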

We note that for a three-dimensional system such as Lorenz 1963 we do not need techniques such as localization. We also remark that we employed inflation by a factor $\gamma =1.2$ in code 6.3.2, i.e. we pushed the spread of the analysis ensemble up by a multiplicative factor.

If we do not use some kind of inflation to take care of the model error, the spread of the background ensemble is increased only by the divergence of the model dynamics for the different ensemble initial conditions. In this case it becomes too small and does not properly take into account the difference between the truth (the nature run) and the model ensemble.

6.4. An integral equation of the first kind, regularization and atmospheric radiance retrievals

To learn more about regularization methods let us solve an integral equation of the first kind with some Gaussian kernel. Such equations can be employed as models for atmospheric temperature or humidity reconstruction in an atmospheric column from measured infrared or microwave radiation.

Let $k :[0,H]\times [0,H]\to {\mathbb{R}}$ for some maximal height $H\gt 0$ be a continuous kernel function, $f\in C([0,H])$ be some function representing observations and $\varphi \in C([0,H])$ be an unknown profile function. In atmospheric radiance retrievals, φ could be the temperature or humidity in an atmospheric column and $f(\nu )$ could be measurements of brightness temperature at particular frequencies ν. Here we consider f as a function of the location of the maximum of its kernel $k(\nu ,\cdot )$ in $[0,H],$ such that ν becomes a height variable and is no longer a frequency or some channel number. We search for a solution to

${\int }_{0}^{H}k(\nu ,s)\varphi (s)\,{\rm{d}}s=f(\nu ),\quad \nu \in [0,H],$   (6.4.1)

by a collocation method as follows. Let ${\bf{x}}\in {{\mathbb{R}}}^{n}$ be a discretization of the function φ at the quadrature points

${s}_{j}:=\frac{j}{n}H,\quad j=1,\ldots ,n.$   (6.4.2)

We replace the integral by a quadrature formula, in the simplest case the rectangular rule, evaluate the formula at ${\nu }_{\xi } := \xi /n\cdot H$ for $\xi =1,\ldots ,n,$ and employ the definitions

${y}_{\xi }:=f({\nu }_{\xi }),\quad \xi =1,\ldots ,n,$   (6.4.3)

for the right-hand side values ${y}_{\xi },$ $\xi =1,\ldots ,n,$ and quadrature weights ${\gamma }_{j},$ $j=1,\ldots ,n,$ to obtain the finite-dimensional discrete linear system

$\displaystyle \sum _{j=1}^{n}{\gamma }_{j}k({\nu }_{\xi },{s}_{j}){x}_{j}={y}_{\xi },\quad \xi =1,\ldots ,n.$   (6.4.4)

The integral operator $A\;:C([0,H])\to C([0,H])$ defined by

$(A\varphi )(\nu ):={\int }_{0}^{H}k(\nu ,s)\varphi (s)\,{\rm{d}}s,\quad \nu \in [0,H],$   (6.4.5)

is approximated by the matrix

${\bf{A}}:={\left({\gamma }_{j}k({\nu }_{\xi },{s}_{j})\right)}_{\xi ,j=1,\ldots ,n},$   (6.4.6)

such that the operator equation (6.4.1) is approximated by

${\bf{Ax}}={\bf{y}}.$   (6.4.7)

Let k be a Gaussian kernel, i.e.

$k(\nu ,s):={{\rm{e}}}^{-c{(\nu -s)}^{2}},\quad \nu ,s\in [0,H],$   (6.4.8)

with some width constant $c\gt 0.$

We define a true solution ${{\bf{x}}}_{\text{true}} :=\;{(\mathrm{cos}({s}_{\xi }))}_{\xi =1,\ldots ,n}$ and data ${\bf{y}}:={{\bf{Ax}}}_{\text{true}}+\delta ,$ where δ is some random error generated by a uniform distribution between ±0.01. The unregularized solution is calculated by ${\bf{x}}={{\bf{A}}}^{-1}{\bf{y}}$ and the regularized solution via Tikhonov regularization (3.1.24), ${{\bf{x}}}_{\alpha }:=\;{(\alpha I+{{\bf{A}}}^{*}{\bf{A}})}^{-1}{{\bf{A}}}^{*}{\bf{y}}.$ The OCTAVE code is given by code 6.4.1 and the results are displayed in figure 6.4.

Code 6.4.1.   Script sim_06_4_1_radiance_igl.m to demonstrate the implicit ill-posedness of radiance data retrieval and radiance assimilation.
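A sketch; the height H, the kernel constant and the plotting details are our assumptions.

% sim_06_4_1_radiance_igl.m -- sketch
n = 2^4;                               % number of quadrature points; try 2^6 as well
Hmax = 1;                              % height of the atmospheric column (assumed)
s  = (1:n)'/n * Hmax;                  % quadrature points (6.4.2)
nu = s;                                % evaluation points nu_xi = xi/n * H
gam = Hmax/n;                          % rectangular-rule weights
c = 10;                                % width constant of the Gaussian kernel (assumed)
[Sg, NUg] = meshgrid(s, nu);           % Sg(xi,j) = s_j, NUg(xi,j) = nu_xi
A = gam * exp(-c*(NUg - Sg).^2);       % matrix (6.4.6) with kernel (6.4.8)
x_true = cos(s);                       % true profile
delta = 0.02*rand(n,1) - 0.01;         % uniform noise in [-0.01, 0.01]
y = A*x_true + delta;                  % simulated data
x_unreg = A\y;                         % unregularized solution
alpha = 1e-3;                          % regularization parameter
x_alpha = (alpha*eye(n) + A'*A)\(A'*y);  % Tikhonov regularization (3.1.24)
plot(s, x_true, 'k.', s, x_unreg, 'r-', s, x_alpha, 'b-');
legend('x_{true}', 'unregularized', 'regularized');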

Figure 6.4 shows the unregularized and regularized solutions to the integral equation (6.4.1), where we chose the regularization parameter $\alpha ={10}^{-3}.$ The behavior of the solutions depends on the number of quadrature points chosen. The more quadrature points we choose, the stronger the ill-posedness of the discrete inverse problem, reflected in figure 6.4(a) with $n={2}^{4}$ quadrature points and (b) with $n={2}^{6}$ quadrature points.

6.5. Integro-differential equations and neural fields

The following chapter 7 will introduce inverse theory for neural fields. Here, we want to carry out the basic first steps to simulate such fields. The neural field equation in its simplest form is an integro-differential equation of the form

$\tau \frac{\partial u(x,t)}{\partial t}=-u(x,t)+{\int }_{{\rm{\Omega }}}w(x,y)f(u(y,t))\,{\rm{d}}y$   (6.5.1)

for $x\in {\rm{\Omega }},t\in [0,T]$ on some domain Ω in ${{\mathbb{R}}}^{m}$ for $m=1,2,3$ and time interval $[0,T],$ where τ is some parameter, w is a neural kernel and f is a sigmoidal function

$f(s)=\frac{1}{1+{{\rm{e}}}^{-\eta (s-\theta )}},\quad s\in {\mathbb{R}},$ with steepness $\eta \gt 0$ and threshold $\theta .$   (6.5.2)

As is well known, such an equation can be solved numerically by an explicit time iteration known as the Euler scheme, defined by

$u(x,{t}_{k+1})=u(x,{t}_{k})+\frac{{h}_{t}}{\tau }\left(-u(x,{t}_{k})+{\int }_{{\rm{\Omega }}}w(x,y)f(u(y,{t}_{k}))\,{\rm{d}}y\right)$   (6.5.3)

for the state vectors ${{\bf{u}}}_{k}\in {{\mathbb{R}}}^{n}$ at times ${t}_{k},$ where we have chosen the discretization

${{\bf{u}}}_{k}:={(u({x}_{\xi },{t}_{k}))}_{\xi =1,\ldots ,n}\in {{\mathbb{R}}}^{n},$   (6.5.4)

where the points ${x}_{\xi },$ $\xi =1,\ldots ,n$ with $n\in {\mathbb{N}},$ form a discretization of the domain Ω and ${t}_{k}:={h}_{t}k,$ $k=0,1,2,\ldots ,$ with time grid spacing ${h}_{t}\gt 0.$ The kernel ${\bf{W}}$ based on $w({x}_{\xi },{x}_{j})$ for $\xi ,j=1,\ldots ,n$ is defined as the matrix

${\bf{W}}:={(w({x}_{\xi },{x}_{j}){s}_{j})}_{\xi ,j=1,\ldots ,n}$   (6.5.5)

with integration weight sj at point xj , $j=1,\ldots ,\;n.$ From (6.5.3) we obtain

${{\bf{u}}}_{k+1}={{\bf{u}}}_{k}+\frac{{h}_{t}}{\tau }\left(-{{\bf{u}}}_{k}+{\bf{W}}f({{\bf{u}}}_{k})\right),$   (6.5.6)

where f is applied componentwise.

For our example, we choose a kernel defined by

$w(x,y):=c\,({x}_{1}-{y}_{1})\,{{\rm{e}}}^{-| x-y{| }^{2}/\sigma },\quad x,y\in {\rm{\Omega }},$ with constants $c,\sigma \gt 0,$   (6.5.7)

which we denote as a directed Gaussian in direction x1. We start some simulation in a rectangular neural patch Ω with the initial state given by a Gaussian

$u(x,0)={{\rm{e}}}^{-| x-{x}_{0}{| }^{2}}$   (6.5.8)

with some ${x}_{0}\in {\rm{\Omega }}.$

For many examples we need to quickly generate points on a regular grid. This is carried out in an elementary way in the first block of code 6.5.1, where p1v and p2v contain the x1 and x2 coordinates of the calculation nodes on a regular grid on ${\rm{\Omega }}=[0,{a}_{1}]\times [0,{a}_{2}].$ This is explained in detail in figure 8.2 and section 8.1.

Figure 6.5. One column of the kernel (6.5.7), showing that a point (shown as *) excites points in the x1-direction and inhibits nodes in the negative x1-direction; the figure is generated by sim_06_5_0_directed_Gaussian.m.

Based on these points, the n × n matrix representing the neural kernel on the grid with n points is defined in the second block of code 6.5.1, compare figure 6.5. The initialization of the neural field equation is carried out in the third block, and the Euler scheme (6.5.6) is performed by the time loop in the final block.

Code 6.5.1.   Script sim_06_5_1_neural_simulation.m to calculate the temporal evolution of some neural activity function according to the neural field equation (6.5.1). Further, we show the code for the sigmoidal function f.m.
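The printed listing is not reproduced; a sketch organized in the four blocks referred to above follows. The patch size, grid resolution, kernel constants and time stepping are our assumptions.

% sim_06_5_1_neural_simulation.m -- sketch in four blocks
% Block 1: regular grid on Omega = [0,a1] x [0,a2].
a1 = 10; a2 = 10; N1 = 30; N2 = 30;      % patch size and resolution (assumed)
h1 = a1/N1; h2 = a2/N2;
[P1, P2] = meshgrid(h1/2:h1:a1, h2/2:h2:a2);
p1v = P1(:); p2v = P2(:);                % node coordinates as column vectors
n = length(p1v);
% Block 2: directed Gaussian kernel (6.5.7), including the weights s_j.
cw = 2; sw = 2;                          % kernel constants (assumed)
D2 = (p1v - p1v').^2 + (p2v - p2v').^2;  % squared node distances (broadcasting)
W = cw * (p1v - p1v') .* exp(-D2/sw) * (h1*h2);
% Block 3: initialization with a Gaussian activity bump (6.5.8).
x0 = [a1/2; a2/2];
u = exp(-((p1v - x0(1)).^2 + (p2v - x0(2)).^2));
tau = 1; ht = 0.1; NT = 100;             % time constant and stepping (assumed)
U = zeros(n, NT+1); U(:,1) = u;
% Block 4: Euler scheme (6.5.6).
for k = 1:NT
  u = u + ht/tau * (-u + W*f(u));
  U(:,k+1) = u;
end

% f.m -- sigmoidal function (6.5.2); steepness and threshold are assumed
function y = f(u)
eta = 5; theta = 0.5;
y = 1./(1 + exp(-eta*(u - theta)));
end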

Figure 6.6. We display the activity distribution as described by some neural field $u(x,t)$ at three different time steps, when the kernel ${\bf{W}}$ is given by (6.5.7) and some Gaussian excitation is used as initial state $u(\cdot ,0).$

We display the state evolution of the neural activity function $u(x,t)$ in figure 6.6 by showing the activity at three different time steps. The OCTAVE code to calculate these images is shown in code 6.5.2.

Code 6.5.2.   Script sim_06_5_1_show_graphics.m to display the temporal evolution (figure 6.6) of some neural activity function according to the neural field equation (6.5.1).
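A sketch; the three time indices and the layout are our assumptions.

% sim_06_5_1_show_graphics.m -- sketch
% Display u(.,t) at three time steps by reshaping the node vector
% back onto the N2 x N1 grid of code 6.5.1.
steps = [1, round(NT/2), NT+1];          % three time indices (assumed)
for i = 1:3
  subplot(1, 3, i);
  imagesc(reshape(U(:,steps(i)), N2, N1));
  axis equal tight; colorbar;
  title(sprintf('t = %g', (steps(i)-1)*ht));
end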

6.6. Image processing operators

For many inversion tasks, working with masks and operators defined on two- or three-dimensional images is very advantageous. Here, we briefly provide a survey of such operators.

Figure 6.7. A test domain and the corresponding mask mG for two grids with different grid parameters ${h}_{{\rm{grid}}}.$ All cells which come close to the domain are shown in gray here.

Definition of masks. A straightforward realization of the masks is carried out as follows. We define

${m}_{G}(x):=\left\{\begin{array}{ll}1, & {\rm{dist}}(x,\bar{G})\leq {h}_{{\rm{grid}}},\\ 0, & {\rm{otherwise}},\end{array}\right.\quad x\in { \mathcal G }.$   (6.6.1)

The discretized version of the mask in the form of a column vector is defined by

${{\bf{m}}}_{G}:={({m}_{G}({x}_{j}))}_{j=1,\ldots ,{M}_{1}{M}_{2}}$ for the grid points ${x}_{j}\in { \mathcal G }.$   (6.6.2)

Examples are given in figure 6.7. We reorder the column vector according to (8.1.17) to obtain a matrix version of the mask ${{\bf{m}}}_{G}$ of dimension ${M}_{1}\times {M}_{2}.$

In the terminology of image processing the mask ${{\bf{m}}}_{G}$ is a binary digital image of dimension ${M}_{1}\times {M}_{2}.$ There is a large range of binary operations which can be employed for image processing and which are very useful for the construction of scattering domains. Before we introduce several of these operations, we need to consider the construction of binary masks from boundary representations of domains.

Construction of masks. The construction of the masks can be a non-trivial task, depending on the particular form of the test domain G. If G is a ball with center z and radius R, then the mask can be easily constructed by testing the distance of a point $x\in { \mathcal G }$ to z. The same procedure can be applied to all star-like domains which are given in parametrized form.
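For the case of a ball this distance test is essentially a one-liner; a hypothetical sketch, where the grid and the ball parameters are our assumptions, reads as follows.

% Hypothetical sketch: mask of a ball with center z and radius R on a
% regular M1 x M2 grid covering [0,1] x [0,1].
M1 = 50; M2 = 50; z = [0.5; 0.5]; R = 0.3;               % values assumed
[X1, X2] = meshgrid(linspace(0,1,M1), linspace(0,1,M2));
mG = double(sqrt((X1 - z(1)).^2 + (X2 - z(2)).^2) <= R); % 1 inside, 0 outside
imagesc(mG); axis equal tight;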

For more general domains it is more difficult to decide whether a point $x\in { \mathcal G }$ is inside or outside G. We briefly mention two possibilities.

  • (a)  
    It is possible to restrict one's attention to a simple class of domains for which every point x outside G can be connected with infinity by a straight line. In this case we can test lines passing through a point x as to whether they intersect $\partial G.$ If we find one which does not intersect $\partial G,$ we know that x is outside; otherwise it is in $\bar{G}.$
  • (b)  
    In two dimensions, using the winding number of a curve, we can calculate a mask for some domain G with given boundary parametrization. In complex coordinates $z={z}_{1}+{\rm{i}}{z}_{2}$ with ${z}_{1},{z}_{2}\in {\mathbb{R}},$ this is carried out via the Cauchy integral
    $\frac{1}{2\pi {\rm{i}}}{\displaystyle \int }_{\partial G}\frac{{\rm{d}}\zeta }{\zeta -z}=\left\{\begin{array}{ll}1, & z\in G,\\ 0, & z\ {\rm{outside}}\ \bar{G}.\end{array}\right.$   (6.6.3)
    For our purposes we transform (6.6.3) into the form
    $\frac{1}{2\pi {\rm{i}}}{\displaystyle \int }_{0}^{2\pi }\frac{{\rm{\Gamma }}^{\prime} (t)}{{\rm{\Gamma }}(t)-z}\,{\rm{d}}t$   (6.6.4)
    with the tangential vector ${\rm{\Gamma }}^{\prime} (t)$ of the curve Γ parametrized over $[0,2\pi ].$ Figure 6.8 demonstrates the generation of masks via the Cauchy integral (6.6.4); a sketch of this construction is given after the list.
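The following is a hypothetical OCTAVE sketch of the winding-number construction; the boundary curve, the grid extent and the threshold are our assumptions.

% Hypothetical sketch: mask from the winding number (6.6.4) for a domain
% whose boundary is a trigonometric polynomial (curve and grid assumed).
nq = 200;                                  % quadrature points on the boundary
t  = 2*pi*(0:nq-1)/nq;
r  = 1 + 0.3*cos(3*t); dr = -0.9*sin(3*t); % radial parametrization and derivative
Gam  = r.*exp(1i*t);                       % boundary curve Gamma(t)
dGam = (dr + 1i*r).*exp(1i*t);             % tangent Gamma'(t)
M = 50;                                    % evaluation grid, M points per direction
[X1, X2] = meshgrid(linspace(-2,2,M));
Z = X1 + 1i*X2;
mG = zeros(M, M);
for j = 1:numel(Z)
  w = sum(dGam./(Gam - Z(j)))/(1i*nq);     % rectangular rule for (6.6.4)
  mG(j) = real(w) > 0.5;                   % winding number 1 inside, 0 outside
end
imagesc(mG); axis equal tight;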

Figure 6.8. The generation of masks via the Cauchy integral for two different domains whose boundary is described by some trigonometric polynomial. For the Cauchy integral we used n = 200 quadrature points and evaluated the integral on a regular grid with M = 50 points in each direction.

Figure 6.9. A mask for a kite-shaped domain (a) on a 50 × 50 grid. Random error is added to the mask in (b). An application of the shrinking process (6.6.9) generates (c). We apply (6.6.9) a second time to obtain a fully cleaned mask (d). Clearly, the mask in (d) is slightly smaller than the original mask, since two layers of boundary points have been removed. We then apply the smoothing operator L3 and L4 to generate (e). Finally, a thickening operation leads to (f).

Masking operations. The elementary set-theoretic operations such as intersections and unions of sets can easily be implemented via masking operations. Let G and H denote two domains in ${{\mathbb{R}}}^{2}$ with masks ${{\bf{m}}}_{G}$ and ${{\bf{m}}}_{H}.$ Then the intersection is carried out by

${{\bf{m}}}_{G\cap H}={{\bf{m}}}_{G}\odot {{\bf{m}}}_{H},$   (6.6.5)

where the symbol $\odot $ denotes pointwise (element-wise) multiplication of matrices or vectors. Unions of G and H are calculated by

${{\bf{m}}}_{G\cup H}={\rm{sign}}({{\bf{m}}}_{G}+{{\bf{m}}}_{H}),$   (6.6.6)

which means that we add the masks of G and H by regular matrix addition and then pointwise take the signum function. Both operations (6.6.5) and (6.6.6) can be generalized to products or sums with n terms, i.e. we have

${{\bf{m}}}_{{G}_{1}\cap \cdots \cap {G}_{n}}={{\bf{m}}}_{{G}_{1}}\odot \cdots \odot {{\bf{m}}}_{{G}_{n}},$   (6.6.7)

${{\bf{m}}}_{{G}_{1}\cup \cdots \cup {G}_{n}}={\rm{sign}}({{\bf{m}}}_{{G}_{1}}+\cdots +{{\bf{m}}}_{{G}_{n}}).$   (6.6.8)

Another useful operation is the following shrinking process. Let $N(i,j)$ denote the index set of the (up to eight) matrix elements adjacent to $(i,j).$ On a binary matrix ${\bf{m}}$ we define

$(S{\bf{m}})_{i,j}:={{\bf{m}}}_{i,j}\displaystyle \prod _{(k,l)\in N(i,j)}{{\bf{m}}}_{k,l},$   (6.6.9)

i.e. the matrix element $S{{\bf{m}}}_{i,j}$ is one if and only if all the corresponding adjacent matrix elements of ${\bf{m}}$ are one. The application of the shrinking operator S removes all boundary elements of a domain G given by the binary matrix ${\bf{m}}.$ Analogously, we can define the thickening operator

$(T{\bf{m}})_{i,j}:={\rm{sign}}\left({{\bf{m}}}_{i,j}+\displaystyle \sum _{(k,l)\in N(i,j)}{{\bf{m}}}_{k,l}\right).$   (6.6.10)

A smoothing operator of level ${\ell }$ can be defined by

$({L}_{\ell }{\bf{m}})_{i,j}:=\left\{\begin{array}{ll}{{\bf{m}}}_{i,j}, & \displaystyle \sum _{(k,l)\in N(i,j)}{{\bf{m}}}_{k,l}\gt \ell ,\\ 0, & {\rm{otherwise}}.\end{array}\right.$   (6.6.11)

This operator removes pixels which have ℓ or fewer neighbors of value 1.

The operators of shrinking, thickening and smoothing are usually employed for noise removal and smoothing. We employ them for our probe and sampling implementations. The smoothing effect is demonstrated in figure 6.9, where we apply the shrinking operation twice to a noisy mask and then thicken it by an application of the thickening operator. A sketch of how these operators can be implemented is shown below.
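The book's listings for these operators are not shown here; the following hypothetical sketch realizes S, T and ${L}_{\ell }$ via a 3 × 3 convolution (the function names are ours). Intersections and unions (6.6.5)–(6.6.8) are the one-liners mG.*mH and sign(mG + mH).

% Hypothetical sketches of the operators (6.6.9)-(6.6.11) on a binary
% matrix m; neighborhood sums are obtained by a 3x3 convolution.
function ms = shrink(m)
% Shrinking operator S (6.6.9): keep a pixel only if it and all eight
% neighbors are 1; removes one boundary layer of the domain.
nb = conv2(m, ones(3), 'same');      % 3x3 block sums with zero padding
ms = double(nb == 9);
end

function mt = thicken(m)
% Thickening operator T (6.6.10): a pixel becomes 1 if any pixel of its
% 3x3 neighborhood is 1; adds one boundary layer.
nb = conv2(m, ones(3), 'same');
mt = double(nb > 0);
end

function ms = smooth_level(m, ell)
% Smoothing operator L_ell (6.6.11): remove pixels with ell or fewer
% neighbors of value 1 among the eight adjacent pixels.
nb = conv2(m, ones(3), 'same') - m;  % neighbor count excluding the pixel
ms = m .* double(nb > ell);
end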

Multiplication operators. Multiplication operators will be helpful for several inversion schemes. Given a function $f\in {L}^{2}(M)$ with some set M and a function $g\in C(M),$ let Mg be the operator which maps ${L}^{2}(M)$ into itself by multiplication of every function f by g, i.e.

$({M}_{g}f)(x):=g(x)f(x),\quad x\in M.$   (6.6.12)

References

  • [1]Eaton J W, Bateman D, Hauberg S and Wehbring R 2015 GNU Octave Version 4.0.0 Manual: a High-Level Interactive Language for Numerical Computations  (Boston, MA: Free Software Foundation) 
  • [2]Kress R 1998 Numerical Analysis (Graduate Texts in Mathematics vol 181)  (Berlin: Springer) 
