@inproceedings{DBLP:conf/qest/HenriquesMZPC12,
  author        = {Henriques, David and Martins, Jo{\~a}o G. and Zuliani, Paolo
                   and Platzer, Andr{\'e} and Clarke, Edmund M.},
  title         = {Statistical Model Checking for {Markov} Decision Processes},
  booktitle     = {QEST},
  longbooktitle = {Ninth International Conference on Quantitative Evaluation
                   of Systems, QEST 2012, London, UK, 17-20 September, 2012},
  year          = {2012},
  pages         = {84--93},
  doi           = {10.1109/QEST.2012.19},
  publisher     = {IEEE Computer Society},
  pdf           = {pub/smcmdp.pdf},
  slides        = {pub/smcmdp-slides.pdf},
  study         = {https://www.cs.cmu.edu/~jmartins/QEST12.zip},
  keywords      = {statistical model checking,
                   Markov decision processes,
                   reinforcement learning},
  abstract      = {
    Statistical Model Checking (SMC) is a computationally
    very efficient verification technique based on
    selective system sampling. One well identified
    shortcoming of SMC is that, unlike probabilistic model
    checking, it cannot be applied to systems featuring
    nondeterminism, such as Markov Decision Processes
    (MDP). We address this limitation by developing an
    algorithm that resolves nondeterminism
    probabilistically, and then uses multiple rounds of
    sampling and Reinforcement Learning to provably
    improve resolutions of nondeterminism with respect to
    satisfying a Bounded Linear Temporal Logic (BLTL)
    property. Our algorithm thus reduces an MDP to a fully
    probabilistic Markov chain on which SMC may be applied
    to give an approximate solution to the problem of
    checking the probabilistic BLTL property. We integrate
    our algorithm in a parallelised modification of the
    PRISM simulation framework. Extensive validation with
    both new and PRISM benchmarks demonstrates that the
    approach scales very well in scenarios where symbolic
    algorithms fail to do so.}
}