
Orbital library  
public interface MarkovDecisionProblem

Type Parameters:
A - the type of transition actions.
S - the type of transition states.
M - the class representing transitions.
Hook interface for the MarkovDecisionProcess algorithm. Objects implementing this interface represent a Markov decision process model.
A Markov decision problem (MDP) for a Markov decision process is a mathematical model for a class of sequential decision problems. As long as the state space is discrete, it is a combinatorial optimization problem. An MDP is characterized by
transition model
P : S×A(s)→[0,1] with a transition relation, where the transition relation satisfies the Markov property for states:
P(S_t+1=s′ | S_t=s_t, A_t=a_t, …, S_0=s_0, A_0=a_0) = P(S_t+1=s′ | S_t=s_t, A_t=a_t)
i.e. the distribution of the successor state depends only on the current state and action, not on the earlier history.
action costs
c(s,a)>0 for taking the action a∈A(s) in the state s∈S.
If the immediate costs are bounded random numbers depending on state and action, then
c(s,a) denotes the expected immediate cost, instead.
Also note that the requirement c(s,a)>0 is not strict. Instead there are several
criteria that ensure convergence even for c(s,a)∈[0,∞).
However, they have even more restrictive constraints, like finite state sets (see Barto et al, 1995).
These state- and action-dependent costs ensure that the utility function is separable.
goal states
G⊆S. More often, the goal states are given implicitly by an accessible description of the goal states.

A Partially Observable Markov decision problem (POMDP) is a kind of Markov decision problem with incomplete information; POMDPs are an extension of MDPs. POMDPs do not rely on an accessible environment, but work on an inaccessible environment where the percepts may not suffice to determine the state, so we have incomplete information about the state. Utilitarian ideas have been around in artificial intelligence for a long time. For the idea of a nonobserving agent see (Lem 1971).
For POMDPs, the Markov property does not hold for percepts, but only for states. In the absence of feedback, a POMDP is a deterministic search in belief state space. In the presence of feedback, a POMDP is an MDP over belief space (Astrom, 1965); a single belief state is then a probability distribution over physical states. In fact, solving a POMDP can be quite complex.
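The MDP ingredients listed above (a transition model with the Markov property, positive action costs, and goal states) can be sketched in plain Java. This is an illustrative toy model, not part of the Orbital API; all class and method names here are assumptions chosen for the example.

```java
// Minimal sketch of an MDP on the states 0..3 of a line, with a single
// action "move". Illustrative only; names are not Orbital classes.
public class SimpleMdp {
    // Transition model P(next | state, action): "move" reaches state+1 with
    // probability 0.8 and stays put with probability 0.2. The distribution
    // depends only on the current state and action (Markov property).
    static double transition(int next, int state, String action) {
        if (!"move".equals(action)) return 0.0;
        if (next == state + 1) return 0.8;
        if (next == state) return 0.2;
        return 0.0;
    }

    // Immediate action cost c(s,a) > 0 for taking action a in state s.
    static double cost(int state, String action) {
        return 1.0;
    }

    // Goal states G = {3}.
    static boolean isGoal(int state) {
        return state == 3;
    }

    public static void main(String[] args) {
        // The transition relation is a probability distribution over successors:
        double total = transition(1, 0, "move") + transition(0, 0, "move");
        System.out.println("P sums to " + total);  // prints: P sums to 1.0
    }
}
```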
See Also:
MarkovDecisionProcess

References:
Hector Geffner. Modelling and Problem Solving.
D. P. Bertsekas. Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, Englewood Cliffs, NJ, 1989.
S. Ross. Introduction to Stochastic Dynamic Programming. Academic Press, New York, 1983.
A. Barto, S. Bradtke, and S. Singh. Learning to act using real-time dynamic programming. Artificial Intelligence, 72:81-138, 1995.
Stanisław Lem. Doktor Diagoras. In: Sterntagebücher, suhrkamp p491, 1978. (original edition 1971)

Nested Class Summary

static class 
MarkovDecisionProblem.DefaultTransition
Default implementation of transition options for Markov decision processes. 
static interface 
MarkovDecisionProblem.Transition
Represents a transition option during a Markov decision process. 
Method Summary  

boolean 
isSolution(java.lang.Object state)
Check whether the given state is a goal state (a valid solution to the problem). 
Methods inherited from interface orbital.algorithm.template.TransitionModel 

actions, states, transition 
Method Detail 

boolean isSolution(java.lang.Object state)
Optional variation: Search algorithms generally seek out a single solution. In order to find all solutions to a problem, simply let this method store the solutions found and return false, until enough solutions have been found. However, expanding solution nodes must eventually result in an empty list in order to ensure termination.
Parameters:
state - the state s∈S to check for being a goal state.
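The "find all solutions" variation described above can be sketched as follows. This is a hypothetical standalone class, not Orbital code; the goal test and the stopping criterion are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the optional variation of isSolution: record each goal state
// and keep returning false, so the search continues until enough
// solutions have been collected. Names are illustrative.
public class AllSolutionsCollector {
    private final List<Object> solutions = new ArrayList<>();
    private final int wanted;

    public AllSolutionsCollector(int wanted) {
        this.wanted = wanted;
    }

    /** The true goal test of the underlying problem (here: the string "goal"). */
    private boolean isGoalState(Object state) {
        return "goal".equals(state);
    }

    /** Variation of isSolution: store goal states, claim failure until enough found. */
    public boolean isSolution(Object state) {
        if (isGoalState(state)) {
            solutions.add(state);
            return solutions.size() >= wanted;  // stop only once enough solutions occurred
        }
        return false;
    }

    public List<Object> getSolutions() {
        return new ArrayList<>(solutions);
    }
}
```

Note that with this variation the search algorithm still terminates only if expanding the stored solution nodes eventually yields no further successors, as the method detail above points out.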

Orbital library 1.3.0: 11 Apr 2009 
