Generates a random MDP problem.
[P, R] = mdp_example_rand (S, A)
[P, R] = mdp_example_rand (S, A, is_sparse)
[P, R] = mdp_example_rand (S, A, is_sparse, mask)
mdp_example_rand generates a transition probability array P and a reward array R.
Optional arguments make it possible to generate sparse matrices and to specify pairs of states between which transitions are impossible.
S: number of states.
S is an integer greater than 0.
A: number of actions.
A is an integer greater than 0.
is_sparse (optional): used to generate sparse matrices.
is_sparse is a boolean. If it is set to %T, sparse matrices are generated.
By default, it is set to %F.
mask (optional): indicates the possible transitions between states.
mask is an (SxS) matrix composed of 0 and 1 elements (a 0 indicates a transition whose probability is always zero).
By default, mask is composed entirely of ones.
P: transition probability array.
P is a 3-dimensional array (SxSxA) or a list (1xA), each list element containing a sparse matrix (SxS).
R: reward array.
R is a 3-dimensional array (SxSxA) or a list (1xA), each list element containing a sparse matrix (SxS). Elements of R lie in the open interval ]-1; 1[.
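The properties described above can be checked with a short script. The following is only a minimal sketch, assuming the MDP toolbox is loaded in the Scilab session so that mdp_example_rand is available. The mask values and the names Pa, Ra, Ps and Rs are illustrative, and the row-sum check assumes that each P(:,:,a) is row-stochastic, as is usual for transition matrices.

```scilab
// Assumption: the MDP toolbox is loaded, so mdp_example_rand is defined.
S = 3; A = 2;
mask = [1 1 0; 0 1 1; 1 0 1];   // illustrative mask: a 0 forbids that transition

// Dense case: P and R are (SxSxA) arrays.
[P, R] = mdp_example_rand(S, A, %F, mask);

for a = 1:A
    Pa = P(:,:,a);
    Ra = R(:,:,a);
    // largest deviation of a row sum from 1 (assumed row-stochastic)
    disp(max(abs(sum(Pa, 'c') - 1)));
    // %T if every masked transition has zero probability
    disp(and(Pa(mask == 0) == 0));
    // %T if every reward lies strictly between -1 and 1
    disp(and(abs(Ra) < 1));
end

// Sparse case: P and R are lists with one (SxS) sparse matrix per action.
[Ps, Rs] = mdp_example_rand(S, A, %T);
disp(typeof(Ps));      // type of the container
disp(typeof(Ps(1)));   // type of the matrix stored for the first action
disp(length(Ps));      // number of list elements
```

In the dense case the matrices for action a are read with P(:,:,a) and R(:,:,a); in the sparse case they are read with Ps(a) and Rs(a).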
-> // to reproduce the following example, it is necessary to init the pseudorandom number generator
-> grand('setsd',ones(625,1))
-> [P, R] = mdp_example_rand (2, 2, %F, [1 1; 0 1])
 R  =
(:,:,1)
  - 0.9980468  - 0.9980468
    0.         - 0.9980468
(:,:,2)
  - 0.9980468  - 0.9980468
    0.         - 0.9980468
 P  =
(:,:,1)
    0.5    0.5
    0.     1.
(:,:,2)
    0.5    0.5
    0.     1.