Since I had recently a request (from Benoît) about optimal control, I will start here a series of posts on that topic. But first, let us start with a simple problem, with discrete time, no randomness, and with a finite horizon. This might be a too simple framework to model complex problem, but that should be interesting to derive heuristic intuitions (I will skip here the mathematical problems, that can be found in very good books… references will come soon).

**An introduction to backward induction**

Before starting seriously, let us consider the following example: we want to reach the red city on the right from the red city on the left, as fast as possible. There are some roads, and the number is the number of hours it takes. Let us *prove *that the optimal way is the red one,

The first idea can be to calculate *all *possible trajectories, but with a large number of roads, the number of possible ways can soon be extremely large. An alternative can be to look backwards (like any students facing a question where the answer is given: start from the end, and try to find a possible way to reach it).

Numbers are greens are the number of hours that we still need one we’ve reached that point. Let us move again one step backward, and consider the orange points,

In the middle, we had to chose either to go up (it will still take 9 hours) or go down (and then 14 hours are necessary). Thus, the *optimal* strategy, once we’ve reach that point, it to take the 9 hour road. This will be idea the idea proposed by Bellman. Let us go backward again, to the purple cities. Again, we have to chose the shorter way,

From the top, the fastest road will take 13 hours, and from the bottom, it will take 16 hours. Thus, since it takes 10 hours to reach the top city, this road is necessarily faster than taking the one below (since it takes 8 hours to reach the nearest city).

Any this is it. We now have an intuitive idea of what should be done to find optimal strategies.

**The optimization problem, discrete with finite horizon**

Let us consider the following optimization problem: we want to find the optimal strategy that maximizes the following function

with simple constraints, such as (a state space) and (i.e. some dynamic constraints). Assume further that the starting point is given, i.e.

. In economic application, note that frequently i.e. we consider a discounted version of the value.

**The idea of dynamic programming**

The intuitive idea of dynamic programming is that if the optimal path from A to C goes throught B, then the path is optimal from B to C. Thus, it will be natural to consider backward induction techniques.

Thus, define

…

Then Bellman’s principle can be used to link those problems: if is a solution of the problem then, for all

, is a solution of problem .

Note that, so far, we assume that such an optimal sequence does exist. Thus, we get that for all x

and more generally,

He

nce, from a practical point of view, we solve those equation using a backward approch. I.e. first,

and then

and so on

… etc. It can be proved that the sequence is solution of if and only if for all ,

is solution of

So far, it does not look so difficult….

Let denote the consumption at period *t*, and assume consumption yields utility

as long as the consumer lives. Assume the consumer is impatient, or has a stronger preference for present, so that he discounts future utility by a factor . Let be capital he got at time t. Assume that his initial capital is a given amount , and suppose that this period’s capital and consumption determine next period’s capital as

where is a positive constant and . Assume further that capital cannot be negative. Then the consumer’s problem is simply

given . Bellman’s equation is then

whih leads us to a simplier problem than the initial one, since only two variables are involved here and . And to solve that problem, we use backwards induction techniques.

Since is known, we can derive easily , and so on until . More precisely, given , we can get which is the maximum of function

with . One can see that the following function is a possible solution

where each is a constant. Further, the optimal amount to consume at time is

i.e., if we explicit those expressions

…etc.

**A reformulation of the optimisation problem**

Another way of expressing the optimization problem is the following: was the variable of interest, was the control variable, and . Thus, the programm

can be expressed as

Several extension can be considered,

- consider a random component

- consider a continuous version

But those items will be for posts that I still have to write down….