The optimal control of Markovian processes
Date
1977
Publisher
Te Herenga Waka—Victoria University of Wellington
Abstract
With the development of the theory of dynamic programming, considerable interest was generated in Markovian decision processes, and since 1960 several research papers have been published on this topic. Although solutions were often difficult to obtain via the dynamic programming method, it was found that certain problems arising in the fields of systems engineering, operations research and optimal control could be formulated as dynamic programming problems.
This thesis gives an expository account of the mathematical theory and algorithms involved in the optimal control of Markovian dynamic systems. Such a system is observed periodically and classified into one of a number of possible states. After each classification one of a number of possible actions is taken. The sequence of implemented actions (called a policy) interacts in a probabilistic manner to affect the evolution of the system. The mathematical abstraction of the process is called a Markovian decision process.
Throughout the thesis optimization is considered relative to two objective functions: the total expected discounted return and the average return per unit time. Emphasis is placed on the existence of optimal policies and the methods of determining such policies. Some illustrative examples are included.
Chapter I begins with a definition of the discrete time problem over a finite planning horizon. From then on the problem is considered over an infinite planning horizon. The state and action spaces are finite. A policy iteration technique, the Howard policy improvement routine, is described, and it is shown that these problems can be formulated and solved as linear programming problems.
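For the finite-state discounted problem, the policy improvement routine alternates value determination (solving a linear system for the current policy) with policy improvement. The sketch below illustrates the idea; the transition matrices, rewards and discount factor are invented for illustration and do not come from the thesis:

```python
import numpy as np

# Illustrative two-state, two-action data (all numbers are made up):
# P[a][s, s'] = transition probability under action a, r[a][s] = one-step reward.
P = {
    0: np.array([[0.9, 0.1], [0.4, 0.6]]),
    1: np.array([[0.2, 0.8], [0.7, 0.3]]),
}
r = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 2.0])}
beta = 0.9  # discount factor, 0 < beta < 1

def policy_iteration(P, r, beta, n_states=2, actions=(0, 1)):
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Value determination: solve (I - beta * P_f) v = r_f for the current policy f.
        P_f = np.array([P[policy[s]][s] for s in range(n_states)])
        r_f = np.array([r[policy[s]][s] for s in range(n_states)])
        v = np.linalg.solve(np.eye(n_states) - beta * P_f, r_f)
        # Policy improvement: in each state choose the action maximising the
        # one-step reward plus discounted expected future value.
        new_policy = np.array([
            max(actions, key=lambda a: r[a][s] + beta * P[a][s] @ v)
            for s in range(n_states)
        ])
        if np.array_equal(new_policy, policy):
            return policy, v  # no state changed its action: the policy is optimal
        policy = new_policy
```

Since there are only finitely many policies and each iteration strictly improves the value (or terminates), the routine stops after finitely many steps with an optimal policy.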
In Chapter II we consider the discrete time problem when the state space is a denumerable set. There it is pointed out that under the average return criterion an optimal policy may not exist. Sufficient conditions for the existence of optimal policies are given and use is made of the principle of contraction mapping in obtaining solutions.
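The contraction-mapping principle used there can be illustrated by successive approximation under the discounted criterion: the optimality operator is a contraction in the sup norm, so its iterates converge to a unique fixed point. The sketch below runs on a finite state set (think of it as a truncation of the denumerable space; all data are invented):

```python
import numpy as np

# Invented finite data standing in for a truncation of a denumerable state space.
P = {
    0: np.array([[0.9, 0.1], [0.4, 0.6]]),
    1: np.array([[0.2, 0.8], [0.7, 0.3]]),
}
r = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 2.0])}
beta = 0.9

def value_iteration(P, r, beta, tol=1e-10):
    # Iterate the operator (Tv)(s) = max_a [ r_a(s) + beta * (P_a v)(s) ].
    # T is a beta-contraction in the sup norm, so ||T v - T w|| <= beta ||v - w||
    # and the iterates converge geometrically to the optimal discounted return.
    v = np.zeros(len(next(iter(r.values()))))
    while True:
        Tv = np.max([r[a] + beta * P[a] @ v for a in P], axis=0)
        if np.max(np.abs(Tv - v)) < tol:
            return Tv
        v = Tv
```

The same fixed-point argument does not carry over to the average return criterion, which is one reason an optimal policy may fail to exist in that setting.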
In Chapter III we discuss semi-Markovian decision processes. The continuous time Markovian decision process is considered as a special case of the semi-Markovian process. It is shown that the policy iteration technique can be applied to the continuous problem. It is also shown how the continuous problem may be formulated as a linear programming problem: the continuous problem is first transformed into a discrete one, and the resulting discrete problem is then given a linear programming formulation.
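The thesis's particular continuous-to-discrete transformation is not reproduced here; a standard device of the same kind is uniformization, which converts the generator matrices of a continuous-time chain into one-step transition matrices of an equivalent discrete-time chain. The generators below are invented for illustration:

```python
import numpy as np

# Invented generator matrices Q[a] for a two-state continuous-time chain:
# off-diagonal entries are transition rates, rows sum to zero.
Q = {
    0: np.array([[-1.0, 1.0], [2.0, -2.0]]),
    1: np.array([[-3.0, 3.0], [0.5, -0.5]]),
}

def uniformize(Q, Lam):
    # P_a = I + Q_a / Lam is a proper stochastic matrix whenever Lam bounds
    # every exit rate (the largest diagonal magnitude across all Q_a).
    n = next(iter(Q.values())).shape[0]
    return {a: np.eye(n) + Qa / Lam for a, Qa in Q.items()}

P = uniformize(Q, Lam=4.0)  # here the maximum exit rate is 3.0, so Lam = 4.0 suffices
```

After this step the discrete-time machinery of Chapter I, including the linear programming formulation, applies to the transformed problem.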
Finally, in Chapter IV we consider the continuous time problem with a denumerable state space. We discuss results analogous to those for the discrete case of Chapter II.
Keywords
Decision making, Mathematical models, Markov processes, Mathematical optimization