# Optimization

by Nikolai Shokhirev


### Introduction

A common problem in many areas (science, technology, economics, etc.) is to determine the maximum or minimum (and the corresponding arguments) of a real-valued function

F(x1, x2, ..., xN)

of N real variables (or parameters). The word optimization denotes either minimization or maximization. Note that the search for a maximum of some function F is equivalent to the search for a minimum of -F.

Very often optimization problems arise from fitting problems. A broad class of real systems can be described by their response to some input parameters: y = f(p), where p = (p1, ..., pN). In general, both the response function and the parameters can be multi-dimensional. The picture below illustrates the one-dimensional case:

The problem is to put the system into some desired state (the target value of response) by changing (tuning) the parameters. This can be expressed as a mathematical equation (or set of equations):

f(p) = y0,     (1)

where y0 is the target value of the response.

The solution p* of this equation is the set of optimal parameters (the exact fit).

The above equation can be reformulated in the following way: "reduce the distance between f(p) and y0":

F(p) = D(f(p), y0) → min     (2)

This is a typical optimization problem. The function F is called the objective function or fitness function. The exact fit is reached if the distance tends to zero.

Note that equation (1) does not necessarily have a solution (i.e. the minimum in (2) can be greater than zero, for example due to experimental errors). However, problem (2) always has a minimum. This solution gives the best possible fit.
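As a minimal numerical sketch (the response function f and the target y0 below are hypothetical examples, not taken from the text), the best possible fit of problem (2) can be located even when equation (1) has no exact solution:

```python
# Minimal illustration of problem (2): minimize the distance between
# a response f(p) and a target y0 by a coarse grid scan.
# Both f and y0 are hypothetical examples.

def f(p):
    return p * p          # hypothetical response function

y0 = -1.0                 # target below min(f): equation (1) has no solution

def F(p):                 # fitness: squared distance between f(p) and y0
    return (f(p) - y0) ** 2

# coarse grid scan over the parameter range [-3, 3]
grid = [-3.0 + 6.0 * i / 600 for i in range(601)]
p_best = min(grid, key=F)

print(p_best, F(p_best))  # best fit at p = 0, residual F = 1 > 0
```

The residual stays positive at the minimum, which is exactly the "approximate fit" situation described above.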

Note that such an approximate solution depends on the definition of the distance D (especially in the multi-dimensional case, when the parameters are of a different nature).

The figures below show a case with one exact solution (at ~ 2.2), one approximate fit (at ~ -0.7), and the corresponding fitness function:

(Figures: approximate and exact fit; the corresponding fitness function.)

This fitness function has two local minima and only one global minimum, which corresponds to the exact solution.

### Linear vs. Non-linear Optimization

If the response function is linear:

f(p) = B p + c

or, in component form,

fi(p) = Bi1 p1 + ... + BiN pN + ci,  i = 1, ..., M,

then equation (1) reduces to the system of linear equations:

B p = y0 - c

that can be solved if

1. M = N and
2. the matrix B is not degenerate (det B ≠ 0).

According to Eq. (2), with D chosen as the squared Euclidean distance, the fitness function is an N-dimensional paraboloid (a second-order function of the parameters):

F(p) = |B p + c - y0|^2     (3)

As an example, the two-dimensional paraboloid is shown in the picture below:

Equation (3) can also be solved by applying standard methods of linear algebra.
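For instance (a hypothetical 2×2 example, assuming the squared Euclidean distance for D), the linear system B p = y0 - c can be solved directly, and the fitness of Eq. (3) then vanishes at the solution:

```python
# Hypothetical 2x2 linear response f(p) = B p + c with target y0.
# Equation (1) becomes B p = y0 - c, solved here by Cramer's rule.

B = [[2.0, 1.0],
     [1.0, 3.0]]
c = [0.5, -0.5]
y0 = [3.5, 4.5]

det = B[0][0] * B[1][1] - B[0][1] * B[1][0]   # non-degenerate: det != 0
r = [y0[0] - c[0], y0[1] - c[1]]              # right-hand side y0 - c
p = [(r[0] * B[1][1] - B[0][1] * r[1]) / det,
     (B[0][0] * r[1] - r[0] * B[1][0]) / det]

# fitness (3): squared Euclidean distance |B p + c - y0|^2
f = [B[0][0] * p[0] + B[0][1] * p[1] + c[0],
     B[1][0] * p[0] + B[1][1] * p[1] + c[1]]
F = (f[0] - y0[0]) ** 2 + (f[1] - y0[1]) ** 2

print(p, F)   # exact fit: F = 0
```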

Of course, the case of a purely linear response is rare. However, in a sufficiently small region of the parameters, the response function can be considered linear and the fitness function quadratic. This is the most developed area of optimization methods.

The drawback of this simplification is that only a local minimum can be found. The search for all minima, or for the global minimum, cannot be carried out using the linear approach.

In the case of essentially non-linear functions, both Eqs. (1) and (2) can give several sets of optimal parameters (sometimes equally important).

On the contrary, the linear approximation always gives only one solution (the red parabola in the picture below):
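A short sketch of this limitation (the double-well fitness below is a hypothetical example): a local method such as plain gradient descent converges only to whichever minimum is nearest to its starting point:

```python
# Hypothetical fitness with two minima, at p = -1 and p = +1:
# F(p) = (p^2 - 1)^2.  Plain gradient descent finds only the
# local minimum closest to the starting point.

def dF(p):                        # derivative of (p^2 - 1)^2
    return 4.0 * p * (p * p - 1.0)

def descend(p, step=0.01, iters=2000):
    for _ in range(iters):        # fixed-step gradient descent
        p -= step * dF(p)
    return p

print(descend(-2.0))   # converges to the minimum near -1
print(descend(+0.5))   # converges to the minimum near +1
```

Finding both minima requires restarting the local search from several points, or using a global method.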

### One dimensional optimization

#### Golden section search

Suppose we search for the minimum of a given function F(x). The initial interval [x1, x4] (the middle one in the figure below) is divided symmetrically into three subintervals so that

(x2 - x1) = (x4 - x3) = g (x4 - x1)

and (x3 - x1) = (x4 - x2) = (1 - g) (x4 - x1).

Suppose we know that the minimum is somewhere in the interval. If F(x2) < F(x3), then we can bracket the minimum by the interval [x1, x3] (the lower picture), so that

x1new = x1old, x3new = x2old, x4new = x3old.

Only the function value at the new interior point x2new needs to be calculated. If we require that the new subintervals have the same relative lengths, then we come to the equation

(x4new - x3new) = g (x4new - x1new),  i.e.  (x3old - x2old) = g (x3old - x1old),

or  g (1 - g) = (1 - 2 g).

The solution of this quadratic equation is

g = (3 - sqrt(5))/2 = 0.3819660112501051518...

This value is called the golden mean or golden section (sometimes 1 - g = 0.6180339887498948482 is called the golden section as well).

The case F(x2) > F(x3) leads to bracketing by [x2, x4] (the upper picture), and the golden section can be used in the same way.
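The procedure above can be sketched as a short routine (a minimal implementation; the quadratic test function at the end is a hypothetical example):

```python
# Golden section search for the minimum of F on [a, b].
# Each step shrinks the bracket by the factor 1 - g = 0.618...,
# reusing one interior point so only one new F evaluation is needed.

G = (3.0 - 5.0 ** 0.5) / 2.0          # g = 0.3819660...

def golden_min(F, a, b, tol=1e-8):
    x2 = a + G * (b - a)              # interior points at relative
    x3 = b - G * (b - a)              # positions g and 1 - g
    f2, f3 = F(x2), F(x3)
    while b - a > tol:
        if f2 < f3:                   # minimum bracketed in [a, x3]
            b, x3, f3 = x3, x2, f2    # old x2 becomes the new x3
            x2 = a + G * (b - a)
            f2 = F(x2)
        else:                         # minimum bracketed in [x2, b]
            a, x2, f2 = x2, x3, f3    # old x3 becomes the new x2
            x3 = b - G * (b - a)
            f3 = F(x3)
    return (a + b) / 2.0

# hypothetical test function with its minimum at x = 2
print(golden_min(lambda x: (x - 2.0) ** 2, 0.0, 5.0))
```

Because of the golden-section ratio, the surviving interior point always lands exactly at one of the two required relative positions in the new bracket, which is what makes a single new function evaluation per step sufficient.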
